Data engineer, musician, and gamer. Editor for Low Code for Advanced Data Science, writer for TDS and The Startup.

If you think others in the KNIME community could learn or improve their technical skills thanks to your article, if you have shared a useful component on the KNIME Hub, or if you have produced a great solution to a common data science task, then we want to hear from you!

The journal's articles will fall into three main categories:

  • Getting Started

If you are a newbie with KNIME software, your progress story will be hosted in the “Getting Started” category; if you are an educator, your lecture summary will…


PYTHON DEVELOPMENT | VISUAL STUDIO CODE

The latest language server for Python (from Microsoft) is a massive productivity enhancer.

If you work with Python and Visual Studio Code, go ahead and do yourself a favor: download the Pylance extension (preview) and try it out for yourself.

What is Pylance?

Pylance is an extension for Visual Studio Code. More specifically, Pylance is a Python language server, which means it enhances IntelliSense, syntax highlighting, and package import resolution, along with a myriad of other features for an improved development experience in the Python language.

You can find the full list of features here. …


AGILE DEVELOPMENT

Collaboration for developers has never been easier.

Coding via teleconference is similar to the situation above, but nobody has to shower!

My challenge to you…

If you’re a leading member of a development team that typically works separately, I triple-dog-dare you to:

  1. Pick a task that you think will take two weeks for your development team to complete (working separately). Perhaps an upcoming backlog item.


POLITICS | TECHNOLOGY

A simple and inevitable step to improve the lawmaking process.

True patriots use GitHub.

Introduction

Version control is the practice of digitally tracking changes to a document or set of documents over time. Git, for example, is one of the most widely used version control systems in the world. Developers use it to manage code and, specifically, to coordinate hundreds of individuals making tens of thousands of changes to a relatively small set of documents.
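
To make "tracking changes over time" concrete, here is a minimal sketch that compares two versions of a short document using Python's standard difflib module. The document text and the describe_changes helper are hypothetical, invented for illustration. This is not how git stores history internally; it only shows the kind of line-by-line change record that a version control tool presents to its users.

```python
import difflib

# Two hypothetical versions of the same short document.
version_1 = [
    "Section 1. The fee is 10 dollars.",
    "Section 2. Payment is due monthly.",
]
version_2 = [
    "Section 1. The fee is 12 dollars.",
    "Section 2. Payment is due monthly.",
    "Section 3. Late payments incur a penalty.",
]


def describe_changes(old_lines, new_lines):
    """Return a unified diff (a list of strings) between two versions."""
    return list(
        difflib.unified_diff(
            old_lines,
            new_lines,
            fromfile="version_1",
            tofile="version_2",
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Lines starting with "-" were removed; lines starting with "+" were added.
    for line in describe_changes(version_1, version_2):
        print(line)
```

Each removed line is prefixed with "-" and each added line with "+", so a reader can see exactly what changed between the two versions.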

And, on that note, what is code? It’s a set of instructions. The rules for how a system should behave. What is allowed, and what is not. …


DATA SCIENCE | MACHINE LEARNING

Output from a Recurrent Neural Network (RNN) trained on the collective works of Emily Dickinson

It’s not quite as bad as Vogon poetry, but it’s bad.

I recently trained a Recurrent Neural Network on the collective works of Emily Dickinson in order to generate new poetry.

To make this happen, I repurposed a demonstration available here on Google Colaboratory. If you check out the notebook, it documents a fairly standard TensorFlow prediction model.

I used The Complete Project Gutenberg’s Poems as the model’s training data. I fed the model a single large file, but a much better approach would be to break each poem out separately before training.
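
Breaking the poems out could look something like the sketch below. It assumes that poems in the raw text are separated by two or more blank lines, while stanzas within a poem are separated by a single blank line. That separator rule is an assumption made for illustration and would need to be checked against the actual file. The split_poems helper and the sample text are hypothetical.

```python
import re


def split_poems(raw_text):
    """Split one large text blob into a list of individual poems.

    Assumption: poems are separated by two or more blank lines,
    and a single blank line only marks a stanza break.
    """
    # One newline followed by at least two blank (or whitespace-only) lines.
    chunks = re.split(r"\n(?:[ \t]*\n){2,}", raw_text.strip())
    return [chunk.strip() for chunk in chunks if chunk.strip()]


if __name__ == "__main__":
    # Placeholder text standing in for the real training file.
    sample = (
        "First poem, line one\n"
        "First poem, line two\n"
        "\n\n"
        "Second poem, line one\n"
        "\n"
        "Second poem, second stanza\n"
    )
    poems = split_poems(sample)
    print(len(poems))  # 2
```

Each item in the returned list can then be treated as its own training example instead of one continuous stream of text.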

Sample input poem

Below is a real poem written by Emily Dickinson (one of many used for training…


DATA SCIENCE | MACHINE LEARNING | DATA VISUALIZATION

A free, easy, and open-source tool for all things data? Yes, please!

If you work with data in any capacity, go ahead and do yourself a favor: download KNIME Analytics Platform right here.

More data science, less slamming of the mouse and keyboard.

What is KNIME Analytics Platform?

KNIME Analytics Platform is the strongest and most comprehensive free platform for drag-and-drop analytics, machine learning, statistics, and ETL that I’ve found to date. The fact that there’s neither a paywall nor locked features means the barrier to entry is nonexistent.

Connectors to data sources (both on-premises and in the cloud) are available for all major providers, making it easy to move data between environments. SQL Server to Azure? No problem. Google Sheets to Amazon Redshift? Sure, why not…


DEVELOPER ESSENTIALS

Get up and running with Python, Git Bash, and VS Code in ~30 minutes.

Programming is hard enough on its own, but getting started can be even harder. This article is intended to help you install the tools required for modern Python development. If you are following along, I’d recommend using Windows 10 or Ubuntu 18+.

Accessing the terminal

For those who are already familiar, feel free to skip this section.

However, if you are new to development, it’s essential for you to understand how to access the Command-Line Interface (CLI) of your operating system.

The terms “CLI”, “command-line interface”, “command-line”, and “terminal” are used interchangeably by developers. A CLI is a…


Combine spelling correction, term filtering, and the “Banana Test” to build bullet-proof text classification models

Source: https://pixabay.com/photos/bananas-fruits-food-grocery-store-698608/

Introduction

If your goal is maximizing accuracy for a new text classification model, you should consider using the Banana Test.

There are millions of instances in which businesses have collected free-form text in their systems. In order to automate business processes that utilize this data, the free-form text often needs to be bucketed into higher-level categories. Text classification models are capable of classifying such free-form text quickly and effectively… for the most part.

Regardless of the reported validation/test accuracy, there are a number of gotchas that can cause even the most well-trained text classification model to fail miserably at making an…


DEVELOPER ESSENTIALS

Code Confidently and Break Stuff on Purpose

Introduction

For anyone unfamiliar with the concept, unit testing is the practice of writing a series of tests (or “assertions”) regarding the behavior of your code to ensure that everything works as expected. Unit tests can be run at any point, over and over again, to reinforce your confidence in what you’ve written and allow you to understand (and often redefine) how your code handles various scenarios. By comprehensively testing both the “happy path” and “edge cases”, you can protect your code against breaking changes in the future.
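
As a minimal sketch of what this looks like in practice, the example below uses Python's built-in unittest module. The safe_divide function is a hypothetical piece of code under test, invented for illustration. One test covers the "happy path" and two cover "edge cases".

```python
import unittest


def safe_divide(numerator, denominator):
    """Hypothetical function under test: returns None instead of raising on a zero denominator."""
    if denominator == 0:
        return None
    return numerator / denominator


class SafeDivideTests(unittest.TestCase):
    def test_happy_path(self):
        # Ordinary input should produce the ordinary result.
        self.assertEqual(safe_divide(10, 2), 5)

    def test_edge_case_zero_denominator(self):
        # Dividing by zero should return None rather than raise.
        self.assertIsNone(safe_divide(1, 0))

    def test_edge_case_negative_numerator(self):
        self.assertEqual(safe_divide(-9, 3), -3)


def run_tests():
    """Run the test case programmatically and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SafeDivideTests)
    return unittest.TextTestRunner(verbosity=2).run(suite)


if __name__ == "__main__":
    run_tests()
```

If someone later changes safe_divide so that a zero denominator raises an exception, the second test fails immediately, which is exactly the kind of breaking change these tests are meant to catch.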

Why unit testing is essential

Without unit tests, it may be very intimidating for you to make a…

SJ Porter
