post

DDL Data Science Project Pitchfest 3

DDL Incubator

The Spring cohort of the District Data Labs Data Science Project Incubator is coming to a conclusion with Pitchfest this Friday evening. I'm inviting you to join us and check out the projects that the teams will be presenting! [Read more…]

post

Application Skeleton for Flask and AngularJS

Flask and AngularJS

A constant challenge we face at IST Research is ensuring we build all of our applications in a way that makes them easy to scale. While practicing deep work this week and thinking about that challenge, I decided that every application I build needs the following three things:

  1. Logging
  2. Statistics
  3. API (Application Programming Interface)

All three of these are very important when building and scaling fully distributed applications. [Read more…]

post

Incorporating Human In The Loop Processes into Data Pipelines

Human Robot

Even if you're working with 100% machine-created data, more than likely you're manually inspecting both your data at different points in the analysis process and the output of your machine learning models.

Many companies, including Google, GoDaddy, Yahoo! and LinkedIn, use what's known as HITL, or Human-In-The-Loop, to improve the accuracy of everything from maps and business listing matches to search result rankings and job posting recommendations.

Why are we still at this point? Because the promise of fully automated end-to-end flows is a false one. So if we have to have a human involved at some point, what’s the best way to go about it?

Join me for a complimentary webinar on Thursday April 14th at 7PM EST where I'll show you multiple ways to implement and leverage HITL processes as part of your data pipelines.

Reserve your seat today >>

post

How to Build a Data Pipeline in Data Science Studio

Join me Thursday, March 24th at 7PM EST for a complimentary webinar where you'll learn how to build a data pipeline for cleaning and standardizing data using Data Science Studio (DSS). We all deal with dirty, messy data. I'll show you how to use DSS to clean it up and get it ready for analysis using the easy-to-use drag-and-drop interface DSS provides.

Sign up for the webinar today >>

post

The Next Four Months at Data Wranglers DC

Data Wranglers DC Logo

Following the Black Hat Data Wrangling talk that Travis Hoppe and I did to kick off the 2016 year of Data Wranglers DC, the next four months are going to be awesome. Here's the lineup: [Read more…]

post

Announcing the Data Science Studio User Group

Data Science Studio User Group

On November 13, 2013 I founded Data Wranglers DC (DWDC). The focus of Data Wranglers has been and remains data engineering, the 80-90% of time spent on data projects that most people don't like to talk about. It includes everything from gathering and cleaning data to engineering the IT systems to gather, store and process all of the data. [Read more…]

post

District Data Labs Incubator Now Accepting Applications

Incubator

During the dotcom boom of 2000 I found myself in a catch-22: I couldn't get a tech job without experience, but I couldn't get experience without a job. Fast forward to 2016. Companies are scrambling to build data teams and can't find enough experienced people. But how can you, someone new to data science, get experience to get one of these jobs? The District Data Labs incubator can help with that. [Read more…]

post

Recruiters: How to Find Good Software and Data People

Search

If you're a software developer, data engineer or data scientist with a LinkedIn profile or website, you're probably hearing from recruiters on a daily basis. No longer are these folks dressing in suits and attending meetups – they're getting a bit more savvy.

If you're a recruiter, I've got two tips for finding awesome software and data peeps without pissing anyone off. Read on! [Read more…]

post

Black Hat Data Wrangling

Black Hat Data Wrangler
Simply put, the goal of any good data wrangler is to make data more accessible.

Consider the antithesis: the idea of hiding and obfuscating your data but still publishing it on the web. Let's learn from our anti-hero, the black hat data wrangler. [Read more…]

post

Data Science in Five Steps

Special Agent Oso

When my daughter Palamee was younger she watched a cartoon with a character named Special Agent Oso. Oso would complete his missions using three simple steps. In a recent conversation I was asked to provide my definition of data science. Today I'm going to provide that definition in not three easy steps, but five, and show a real-world implementation. [Read more…]