Sign in

Data wrangler, musician, and gamer. Writer for Towards Data Science, The Startup, and Digital Diplomacy.


The latest language server for Python (from Microsoft) is a massive productivity enhancer.

What is Pylance?

Pylance is an extension for Visual Studio Code. More specifically, Pylance is a Python language server — this means it offers enhancements to IntelliSense, syntax highlighting, package import resolution, and a myriad of other features for an improved development experience in the Python language.

You can find the full list of features here. …


Collaboration for developers has never been easier.

Coding via teleconference is similar to the situation above, but nobody has to shower!

My challenge to you…

If you’re a leading member of a development team that typically works separately, I triple-dog-dare you to:

  1. Pick a task that you think will take two weeks for your development team to complete (working separately). Perhaps an upcoming backlog item.
  2. Invite your entire team to a call on MS Teams, Zoom, or whichever teleconferencing tool you use. Have them drop as many meetings as possible so that they’re free for all or most of the day.
  3. Start sharing your screen, talk about what you’re trying to do, and start coding while the rest of the team is watching. Start asking…


A simple and inevitable step to improve the lawmaking process.

True patriots use Github.


Version control is the concept of digitally tracking changes in a document or set of documents over time. For example, a framework called git is one of the most widely used options for version control in the world. It’s used by developers across the world for managing code. Specifically, developers use it to coordinate hundreds of individuals making tens of thousands of changes to a relatively small set of documents.

And, on that note, what is code? It’s a set of instructions. The rules for how a system should behave. What is allowed, and what is not. …


Output from a Recurrent Neural Network (RNN) trained on the collective works of Emily Dickinson

It’s not quite as bad as Vogon poetry, but it’s bad.

I recently trained a Recurrent Neural Network on the collective works of Emily Dickinson in order to generate new poetry.

In order to make this happen, I repurposed a demonstration available here on Google Colaboratory. If you check out the notebook, it documents a fairly standard Tensorflow prediction model.

I used The Complete Project Gutenberg’s Poems to fuel the model’s training data. I fed the model a large single file, but a much better approach would be to break each poem out separately before training.

Sample input poem

Below is a real poem written by Emily Dickinson (one of many used for training…


A free, easy, and open-source tool for all things data? Yes, please!

More data science, less slamming of the mouse and keyboard.

What is KNIME Analytics Platform?

KNIME Analytics Platform is the strongest and most comprehensive free platform for drag-and-drop analytics, machine learning, statistics, and ETL that I’ve found to date. The fact that there’s neither a paywall nor locked features means the barrier to entry is nonexistent.

Connectors to data sources (both on-premise and on the cloud) are available for all major providers, making it easy to move data between environments. SQL Server to Azure? No problem. Google Sheets to Amazon Redshift? Sure, why not…


Get up and running with Python, git Bash, and VS Code in ~30 minutes.

Accessing the terminal

For those who are already familiar, feel free to skip this section.

However, if you are new to development, it’s essential for you to understand how to access the Command-Line Interface (CLI) of your operating system.

The terms “CLI”, “command-line interface”, “command-line”, and “terminal” are used interchangeably by developers. A CLI is a…

Combine spelling correction, term filtering, and the “Banana Test” to build bullet-proof text classification models



If your goal is maximizing accuracy for a new text classification model, you should consider using the Banana Test.

There are millions of instances in which businesses have collected free-form text in their systems. In order to automate business processes that utilize this data, the free-form text often needs to be bucketed into higher-level categories. Text classification models are capable of classifying such free-form text quickly and effectively… for the most part.

Regardless of the reported validation/test accuracy, there are a number of gotchas that can cause even the most well-trained text classification model to fail miserably at making an…


Code Confidently and Break Stuff on Purpose


For anyone unfamiliar with the concept, unit testing is the practice of writing a series of tests (or “assertions”) regarding the behavior of your code to ensure that everything works as expected. Unit tests can be run at any point, over and over again, to reinforce your confidence in what you’ve written and allow you to understand (and often redefine) how your code handles various scenarios. By comprehensively testing both the “happy path” and “edge cases”, you can protect your code against breaking changes in the future.

Why unit testing is essential

Without unit tests, it may be very intimidating for you to make a…

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store