Spark Summit 2017 Review

This morning at Spark Summit Europe 2016, Ion Stoica announced the Drizzle project during his keynote. Drizzle promises to bring streaming data latency in Spark below that of Flink and Storm. Ion made the announcement in the context of the new RISE Lab at UC Berkeley.
For years now, companies have been hiding behind the moniker "machine learning" as a way to avoid the stigma associated with "artificial intelligence" (or at least the stigma it carried after the AI winters). Well, no more. This week Google announced Google Neural Machine Translation (GNMT), which gets close to human performance.
The free 2016 O'Reilly Data Science Salary Survey has been released. This year, they applied a handy linear regression and spelled out the entire model on page 42. A couple of things jumped out at me. In thousands of dollars:
What better way to start getting into TensorFlow than with a notebook technology like Jupyter, the successor to IPython Notebook? There are two little hurdles to clear first:
Spark Summit (West) 2016 took place this past week in San Francisco. The big news, of course, was Spark 2.0, which among other things ushers in yet another 10x performance improvement through whole-stage code generation.
Image is from Evan Sparks et al., "TuPAQ: An Efficient Planner for Large-scale Predictive Analytic Queries"