Globally Scalable Web Document Classification Using Word2Vec


Monday, April 27, 2015 - 6:30 PM to 9:00 PM

Digital Garage Development LLC
717 Market Street, San Francisco, CA

Main Talk: Globally Scalable Web Document Classification Using Word2Vec

Abstract

Extracting information from unstructured web documents is a common problem for many applications, and determining which category each document belongs to can be especially challenging at planetary scale.

In this talk, we will show how SmartNews achieves globally scalable, real-time web document classification using new machine learning techniques, especially Word2Vec's extended distributed representation model. We will also discuss the pros and cons of using distributed representations from a real-world, operational standpoint, as well as new classification approaches being used in Japan.

Bio

Kohei Nakaji is a software engineer at SmartNews, one of Japan’s hottest startups with 10M+ users worldwide. The SmartNews news discovery platform uniquely uses machine learning to extract, categorize, target, rank, and deliver culturally relevant news to 150+ countries. Kohei’s research and engineering focus is machine learning and natural language processing.

Tentative Schedule: 

6:30pm - 7:00pm -- socializing

7:00pm - 7:15pm -- lightning talk (tbd)

7:20pm - 8:20pm -- main talk

8:20pm - 9:00pm -- socializing
