Six Types of Data; Are You Still Stuck on the First Two?

Running numeric and categorical data through glm and random forest is so 2014. Are you ready for the techniques required to process real-world data?

Techniques for text analytics, image understanding, full motion video, and digital signal processing have been around for decades. But now, in the era of Big Data, the purple unicorn (or at least a data science team) is going to need to bring all of them to bear on the rich data being collected, stored and streamed.

(Of course, there are R packages for analyzing most or all of these rich data types, despite my glib caption above, which was aimed at the "I've taken an R course" data scientist.)

With the Internet of Things, rich data is going to explode. You think we have Big Data now with clickstreams? Rich data is vastly more space-consuming. Consider automobiles alone: video, engine sensor readings, tire pressure, outdoor temperature, and temperatures from various points around the car. And that's not even considering the other major sources of IoT data: personal (body) area networks, home automation, retail and entertainment venues, the military, warehouses, etc.

Data science teams are going to need to augment their skills to handle these data types in the coming months and years.
