My Takes from KDD 2013, Chicago

In August 2013, I attended the KDD conference in Chicago. In this post I document the highlights from a few interesting talks that I went to.

  1. Outlier Detection and Description
  2. Data Mining for Online Advertising
  3. Deep Learning
  4. To Buy or Not to Buy - "That is the Question"
  5. Optimization in Learning and Data Analysis by Stephen Wright
  6. Targeting and influencing at scale: from presidential elections to social good
  7. Hadoop industry session
  8. A Data Scientist's Guide to Making Money from Start-ups
  9. Death of the expert? The rise of algorithms and decline of domain experts
  10. Predicting the Present with Search Engine Data
  11. Ads Industry track

A) Outlier Detection and Description

Outlier detection is an interesting area of machine learning where the main task is to define what counts as an outlier for your application and then build machine learning models to detect those outliers. Below I list my takes from two talks at the Outlier Detection and Description workshop. Anomaly detection is related to what I'm working on these days at my current company, Trulioo. One of my main tasks there is to detect fake or fraudulent users in cyberspace. Thus, I apply anomaly detection techniques to identify abnormal users and separate them from normal ones!

(I) Outlier Detection in Personalized Medicine by Raymond Ng

(II) Enhancing One-class Support Vector Machines for Unsupervised Anomaly Detection

B) Data Mining for Online Advertising

One of the areas that I've been heavily involved in over the last three years is designing predictive models that analyze ad impressions to predict ad click-through rates. These algorithms are very useful for predicting the performance of display ads. KDD 2013 also hosted the ADKDD workshop, which focuses solely on the advertising space. Below I document some of my takes from two keynotes that I attended.

(I) Machine Learning Challenges in Targeted Advertising: Transfer Learning in Action: Claudia Perlich

(II) Advertising - Why Human Intuition Still Exceeds Our Best Technology: Brian Burdick

(III) More on Ad talks

One ad talk focused on building a model to predict whether a person will make a purchase when visiting a website. One problem is the large number of features. The authors recast the task from estimating the posterior probability P(conversion|features) to a ranking problem, because what we care about is which ad should be shown on the website given the user's features collected from their cookies, not the probability itself.
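To illustrate why ranking can be enough, here is a minimal sketch (my own toy example, not the authors' method). The ad ids and scores are hypothetical. It shows that the order of ads for a user does not change when uncalibrated scores are passed through a monotone function such as a sigmoid, so picking the ad to show does not require a well-calibrated P(conversion|features).

```python
import math


def rank_ads(scores):
    """Return ad ids ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical uncalibrated model scores for one user (not from the talk).
raw_scores = {"ad_a": 2.1, "ad_b": -0.4, "ad_c": 0.7}

# Squash the scores through a sigmoid to get probability-like values.
squashed = {ad: 1.0 / (1.0 + math.exp(-s)) for ad, s in raw_scores.items()}

ranking_raw = rank_ads(raw_scores)
ranking_squashed = rank_ads(squashed)

# Both rankings agree, so the ad shown to the user is the same either way.
print(ranking_raw, ranking_squashed)
```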

Another ad talk was about building a K-means model and then using the distance of a new user from the centroid of a cluster in that model to rank users and determine whether the user is going to buy a car or not.
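The K-means idea can be sketched roughly as follows. This is my own toy reconstruction, not the speakers' code: I assume the clusters are fit on feature vectors of known car buyers and that a smaller distance to the nearest centroid means a more likely buyer. The feature values, user ids, and helper names are all made up for illustration.

```python
import math


def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [mean(c) if c else centroids[j]
                     for j, c in enumerate(clusters)]
    return centroids


def score(user, centroids):
    """Distance to the nearest centroid (assumption: lower = more likely buyer)."""
    return min(dist(user, c) for c in centroids)


# Hypothetical 2-D features of known car buyers.
buyers = [(1.0, 1.0), (9.0, 9.0), (1.2, 0.8), (8.8, 9.2), (0.9, 1.1), (9.1, 8.9)]
centroids = kmeans(buyers, k=2)

# Hypothetical new users to rank.
new_users = {"u1": (1.0, 1.1), "u2": (5.0, 5.0), "u3": (8.0, 9.0)}
ranked = sorted(new_users, key=lambda u: score(new_users[u], centroids))
print(ranked)
```

In practice one would probably use a library implementation of K-means and pick a distance cutoff on held-out data; the sketch only shows the ranking step.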

C) Deep Learning

At KDD 2013, I also learned about a relatively new machine learning technique called "Deep Learning", pioneered by Geoffrey Hinton at the University of Toronto. Hinton started a company based on deep learning which was acquired by Google! It seems that several big companies and startups have started using this technique for analyzing big data. Google has started using it for indexing web images. Microsoft (Richard F. Rashid's team) has also used deep learning for speech recognition and language translation.

D) To Buy or Not to Buy - "That is the Question" by Oren Etzioni

Decide.com is an interesting company in the data mining space. Its core business is predicting price trends for electronic devices. I met Oren Etzioni, the CTO of the company, who gave an interesting talk about the company's technology. Decide.com was acquired by eBay in September 2013. Below you can find some highlights from Oren's talk.

E) Optimization in Learning and Data Analysis by Stephen Wright

Optimization plays a crucial role in machine learning. Stephen Wright gave a talk in which he summarized several important optimization techniques. Below are my takes from his talk.

F) Targeting and influencing at scale: from presidential elections to social good by Rayid Ghani (Univ of Chicago)

In 2012, I collected 40 million tweets in order to predict the US presidential election! At KDD 2013, I met Rayid Ghani, who worked for the Obama campaign in 2012. He gave an insightful talk about how the Obama campaign used marketing strategies to persuade likely voters to go and vote for Obama!

G) Hadoop industry session by Milind Bhandarkar

H) A Data Scientist's Guide to Making Money from Start-ups

This was a very insightful panel that I attended at KDD 2013. Several entrepreneurs discussed how to start a company in the data science space. Below you can find my summary of that panel.

I) Death of the expert? The rise of algorithms and decline of domain experts

J) Predicting the Present with Search Engine Data

Google Trends is a very interesting service that lets you analyze how people have searched for different keywords on Google since 2004. In this talk, Hal Varian, Chief Economist at Google, showed how one can use Google Trends to predict the present!

K) Ads Industry track

The Ads Industry track was another interesting track, where several big companies like Google and Yahoo presented their in-house predictive models for the ad click prediction problem.

(I) A Unified Search Federation System Based on Online User Feedback

(II) Scalable Supervised Dimensionality Reduction Using Clustering

(III) Ad Click Prediction: a View from the Trenches