Entrepreneur, Data Scientist, Software Developer
I'm a Technology Entrepreneur and Data Scientist. I completed my PhD in Computer Science in 2012. I have an academic background in Machine Learning, Data Mining, Algorithm Design, Social Network Analysis, and Natural Language Processing. My research interest is in extracting interesting patterns and signals from big data that can be turned into valuable business and marketing actions. I was lucky to have Dr. Valerie King & Dr. Ali Shoja as my PhD supervisors.
I have 3 years of industry experience building advanced Machine Learning algorithms for predicting the Click-Through Rate of online display ads. This was done through real-time analysis of a very large volume of performance data for ads shown on websites.
I have more than 3 years of industry and research experience building complex NLP models for Topic Modeling, Named Entity Recognition, Sentiment Analysis, & Spam Detection. I also have more than 7 years of experience designing and building predictive models to forecast the results of political elections, marketing campaigns, & flu outbreaks by mining and analyzing unstructured data from online social networks such as Twitter.
I have one year of experience in Online Fraud Detection, where I was responsible for designing and implementing Machine Learning algorithms that analyze data from Facebook, Twitter, Google+, LinkedIn & other social networks in order to detect fake or fraudulent digital identities.
You can view the latest version of my CV here: Kazem's Resume.
Here is the word cloud of my research projects in the last four years:
Link to: Contact information.
In July 2014, we made our Java ML/NLP library open source! We used this library for analyzing tweets and building predictive models. The predictive models can be used to analyze election campaigns: they dig into social media conversations (public opinion) and provide insights for making intelligent decisions. You can pull the code from its GitHub repository: Twitter Mining.
In July 2014, we submitted a paper to the ICDM 2014 conference on analysing tweets to predict the 2012 US presidential election. You can read the paper here: The Predictive Power of Social Media: On the Predictability of U.S. Presidential Elections using Twitter.
I wrote an article on Deep Learning in which I highlighted the challenges and benefits that DL brings to the ML community. Read the article here: Deep Learning: Challenges and Excitements!
I finally documented my notes from the 2013 Knowledge Discovery and Data Mining Conference in Chicago! Most of my notes are about talks in the advertising and outlier detection spaces. I also posted notes about some interesting talks I attended on optimization and its importance in machine learning, Google Trends and the possibility of predicting the present, how to start a data company, and so on. Read the article here: My Takes from KDD 2013!
On Dec 1 2013, I started benchmarking different languages from a performance point of view. You can find the details here: Benchmarks!
On Nov 26 2013, I gave a talk on Using Machine Learning & Statistics To Predict The US Presidential Election at machine learning and data science meetups in Vancouver.
From August 10 to 15 I attended the KDD 2013 (Knowledge Discovery and Data Mining) conference in Chicago.
I wrote a short article: "Who's a Data Scientist?" which describes my academic and industry experience in the data analysis area.
In July 2013, I attended the hackwithus event @Victoria with three other hackers. We built a simple AI for the Snake game!
In July 2013 @Seeker we submitted a paper to EMNLP 2013: Conference on Empirical Methods in Natural Language Processing. The focus of our paper was on analyzing the performance of generative models (e.g. HMMs) and discriminative models (e.g. CRFs) for extracting biomedical entities (i.e. diseases and treatments) in the presence of rarity.
In March 2013, I attended the Morbify hack event. We built a fun game with a purpose (GWAP) to hunt images! Read more about our game from Image Hunt.
In March 2013, we wrote an article: "Tracking Social Media Trends and Their Influence on E-Commerce Markets" where we showed a correlation between eBay consumers and social media trends (e.g. on Twitter).
In Feb 2013, I attended the Open Data Day Hackathon. We consumed Vancouver open data and built a web app which computes/shows a score for quality of life in different regions of Vancouver city. Read more about this project from Vancouver City Talks!
In January 2013, I attended the Firefox OS App Day in Vancouver. We built a web application which computes the likelihood that a person catches the flu by collecting/analyzing data from Twitter. You can read more about this project from Predicting Flu!
In January 2013, I joined Seeker Solutions: a company whose focus is on using natural language processing and machine learning technologies for the health informatics domain. At Seeker I'm involved in building software technologies that use ML algorithms to solve NLP problems.
In Fall 2012, I did extensive research in the area of computational advertising for Red Brick Media. My focus was on designing and developing strategies to compute & show the ads with the highest conversion rates to web users. We formulated the ad-selection problem as a Multi-Armed Bandit problem, which is a classical paradigm in Machine Learning. I used machine learning, data mining, probability and statistics to analyze big advertisement data and devise efficient ad selection strategies.
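To make the Multi-Armed Bandit formulation concrete, here is a minimal Python sketch in which each ad is an arm and a conversion is a reward of 1. It uses an epsilon-greedy strategy, which is only one common bandit strategy and is an assumption for illustration; it is not necessarily the strategy devised in this project. The class name, the `simulate` helper, and the conversion rates in the usage note are all hypothetical.

```python
import random


class EpsilonGreedyAdSelector:
    """Epsilon-greedy bandit (illustrative assumption): one arm per ad."""

    def __init__(self, num_ads, epsilon=0.1, seed=None):
        self.epsilon = epsilon
        self.shows = [0] * num_ads        # times each ad was displayed
        self.conversions = [0] * num_ads  # conversions observed per ad
        self.rng = random.Random(seed)

    def estimated_rate(self, ad):
        # Empirical conversion rate; 0.0 for an ad that has never been shown.
        return self.conversions[ad] / self.shows[ad] if self.shows[ad] else 0.0

    def select_ad(self):
        # Explore a random ad with probability epsilon,
        # otherwise exploit the ad with the best estimate so far.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.shows))
        return max(range(len(self.shows)), key=self.estimated_rate)

    def record(self, ad, converted):
        # Update the statistics after showing `ad` to one user.
        self.shows[ad] += 1
        self.conversions[ad] += int(converted)


def simulate(true_rates, rounds=20000, epsilon=0.1, seed=0):
    """Run the selector against hypothetical, fixed conversion rates."""
    selector = EpsilonGreedyAdSelector(len(true_rates), epsilon, seed)
    env = random.Random(seed + 1)  # separate stream for simulated users
    for _ in range(rounds):
        ad = selector.select_ad()
        selector.record(ad, env.random() < true_rates[ad])
    return selector
```

For example, `simulate([0.01, 0.05, 0.30])` returns a selector whose `shows` list records how often each ad was displayed; over many rounds the ad with the highest true rate should receive most of the traffic, while the small exploration budget keeps the other estimates from going stale.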
In December 2012 I attended the AngelHack Fall 2012 event in Seattle. The conference was a mentorship and fundraising program for entrepreneurs. We designed and implemented a mobile crowdsensing application (for iPhone/Android) to detect, track, and display real-time events. The mobile application had four major components: a sensing component (reading from GPS, audio, accelerometer, Bluetooth, etc.), a machine learning component, a real-time data sharing component, and a data visualization component.
I attended a 24-hour AbeBooks Hackathon event on Friday Sep 28, 2012. There, we built a web application (using Amazon Web Services) which uses machine learning & data mining for analyzing big data.
- I defended my PhD in August 2012. My thesis topic was on "Contact Prediction, Routing and Fast Information Spreading in Social Networks". You can download a pdf of my thesis from Jahanbakhsh_Kazem_PhD. You can also download my defence slides from phd slides.
- During summer 2011, I did an internship at Proven.com in San Francisco. At Proven.com, our goal was to help tradespeople find jobs. We implemented ideas from the social networking area to efficiently connect employers to workers using CakePHP, MySQL, jQuery, and Ajax technologies. I also built a Facebook application called HireProven in order to integrate Facebook social features with the Proven website.
- Real-Time Bus Tracking System: This was a project that we designed, implemented, and demoed at the AngelHack Hackathon in Seattle. It was a mobile crowdsensing system (iPhone/Android) for tracking bus locations in real time by using machine learning algorithms. Click RTBTS to find more about this project.
- Predicting US 2012 Presidency Election using Twitter: This is an ongoing research project for analyzing and mining 2012 US election conversations on Twitter. The main goal is to test whether election results can be predicted from political tweets. Read more about this project from Predicting US 2012 Election Results.
- Geo Crawler: This is a project for crawling and indexing places that are hard to find using the Google Maps service. Click Geo Crawler to read more about this project.
- Twheat Map: A web application for showing a real-time map of geo-tagged tweets with their labels (positive/negative) computed by a sentiment analysis algorithm. This application was implemented at the AbeBooks Hackathon 2012 event in Victoria. Click here to find more about this application.
- Mobile Social Trivia Game: a Twilio SMS-powered trivia application developed at the HackVan 2012 event in Vancouver. Enter a code and join a multi-player trivia SMS game. Click Trivia to find more about this project.
- K-means Clustering: a Python implementation of the k-means algorithm. Click k-means to find more about the algorithm and download the code.
- Drinking-Fountain Finder App.: a web application which shows the closest drinking fountain to your current location. This application was developed at the Open Data Hackathon event in Vancouver. Click Fountains to find more about this application.
- Social Community Detection: an implementation of the Girvan-Newman community detection algorithm for weighted graphs in Python. You can find more about this code and download its source code from Cmty link.
- Flickr Crawler & Hometown Predictor: a two-layer crawler for collecting the friendship graph of people and the attributes of their uploaded photos from the Flickr website. The main goal of this project was to predict Flickr users' hometowns by exploiting the geotag information of their uploaded photos. You can download the source code and find more about the crawler from Flickr link.
- Reliable Datagram Protocol: a multi-threaded reliable transport layer implemented in C. It is an application-level layer that runs on top of UDP in order to make UDP as reliable as TCP. You can read more about this project and download its source code from RDP link.
- Language Detection: a Java applet for recognizing the language of an input sentence by using a Naive Bayes classifier. Enter a sentence and find out its natural language. You can read more about this project and download its source code from Language Recognition link.
- Soma-Cube Puzzle Solver: a Java program for solving the 7-piece Soma Cube puzzle by using a recursive backtracking search. You can read more about the puzzle and download the puzzle solver's source code from Soma Cube link.
- Autonomous Flying Blimp: an embedded system developed for controlling an autonomous blimp. We developed both the hardware and software to control our blimp. I did this project with two colleagues in 2008 for a course called "Software for Embedded & Mechatronics Systems". You can find the design and source code for our flying blimp at Super Blimp! You can also click Flying Blimp to watch one of our demos.
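As a rough illustration of the k-means entry in the list above, the following is a minimal, standard-library-only Python sketch of Lloyd's algorithm. It is not the code behind the k-means link; the function name, the random initialisation from input points, and the stopping rule are assumptions made for this example.

```python
import math
import random


def kmeans(points, k, max_iters=100, seed=0):
    """Lloyd's algorithm on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k input points
    assignments = None
    for _ in range(max_iters):
        # Assignment step: attach every point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(xs) / len(members) for xs in zip(*members))
    return centroids, assignments
```

Calling `kmeans(points, 2)` on a list of 2-D tuples returns the two centroids and one cluster index per input point. The result depends on the initial centroids, so practical implementations often run several restarts and keep the clustering with the lowest within-cluster distance.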
Software Research Projects
- Information Spreading/Advertising in Online Social Networks: an efficient and scalable program implemented in C for analyzing running times of rumor spreading algorithms in online social networks. Click Spread to find more about this project.
- Social Networks Connectivity: a C program for analyzing the detailed connectivity of online social networks such as Facebook. Click Connectivity to find more about this project.
- Social-Sim Simulator: a comprehensive simulator written in C++ for studying the underlying properties of mobile social networks as well as evaluation of our proposed Social-Greedy routing algorithm. You can find more technical details about this project and download its source code from Social-Sim link.
- Human Contact Predictor: a Python program for inferring people's movements and contact patterns in real scenarios such as conference or campus environments by exploiting statistical properties of contact graphs. Visit Prediction for more information.
- Diffusion of Virus in Social Networks: an efficient C program for simulating how a virus or disease diffuses in social networks. You can find more about this code at Diffusion.
- Distributed Computing (Parallel SIQS): a parallel and optimized program written in C using the Message Passing Interface library for cracking large RSA keys. This project was part of my master's thesis. In this project, I also built & configured a "Linux Cluster" of 17 nodes to crack RSA keys. You can find more about my thesis and its code at PSIQS. You can also download my master's thesis presentation from master slides.
- K. Jahanbakhsh and Y. Moon, The Predictive Power of Social Media: On the Predictability of U.S. Presidential Elections using Twitter, submitted to ICDM 2014.
- K. Jahanbakhsh, V. King, G.C. Shoja, Predicting Missing Contacts in Mobile Social Networks, Pervasive and Mobile Computing Journal (PMC), 2012.
- K. Jahanbakhsh, V. King, G.C. Shoja, Predicting Human Contacts in Mobile Social Networks using Supervised Learning, Simplex 2012 (in conjunction with WWW 2012), Lyon, France.
- K. Jahanbakhsh, V. King, G.C. Shoja, Empirical Comparison of Information Spreading Algorithms in the Presence of 1-Whiskers, Social Computing 2011, MIT, Boston, USA (Read More).
- K. Jahanbakhsh, V. King, G.C. Shoja, Predicting Missing Contacts in Mobile Social Networks, World of Wireless Mobile and Multimedia Networks (WoWMoM) 2011, Lucca, Italy. [Slides]
- K. Jahanbakhsh, V. King, G.C. Shoja, They Know Where You Live!, posted on arXiv, 2010.
- K. Jahanbakhsh, G.C. Shoja, V. King, Human Contact Prediction Using Contact Graph Inference, 2010 International Symposium on Social Computing and Networking (SocialNet-2010), Hangzhou, China. [Slides]
- K. Jahanbakhsh, G.C. Shoja, V. King, Social-Greedy: A Socially-Based Greedy Routing Algorithm for Delay Tolerant Networks, ACM/SIGMOBILE MobiOpp, February 2010, Pisa, Italy.
- Y.O. Yazir, K. Jahanbakhsh, S. Ganti, G.C. Shoja, Y. Coady, A low-cost realistic testbed for mobile ad hoc networks, PACRIM, 2009, Victoria, Canada.
- M. Ghelichi, K. Jahanbakhsh, E. Sanaei, RCCT: Robust Clustering with Cooperative Transmission for Energy Efficient Wireless Sensor Networks, 7th International Conference on Information Technology : New Generations, 2008.
- K. Jahanbakhsh, M. Hajhosseini, Improving Performance of Cluster Based Routing Protocol using Cross-Layer Design, 2008. [You can find more details about this paper here.]
- K. Jahanbakhsh, J. Papadopoulos, An efficient Parallel Implementation of Self Initialization Quadratic Sieve for Integer Factorizations Using Message Passing Interface (MPI), Proceedings of 14th Iranian Conference on Electrical Engineering, Tehran (IRAN), May 2006.
- N. Jahangiri, K. Jahanbakhsh, M. Yaghubi, B. V. Vahdat, Device Drivers Skelton in Windows 98, Proceedings of 12th Iranian Conference on Electrical Engineering, Mashhad (IRAN), May 2004.
TA for Randomized Algorithms (CSC 423 : Spring 2012)
TA for Algorithms and Data Structures I (CSC 225 : Spring 2011)
TA for Introduction to Operating Systems (CSC 360 : Fall 2008, Summer and Fall 2010)
Lab Instructor for Computer Communication and Networks (CSC 361: 2008 - 2010, 2011)
TA for Operations Research: Simulation (CSC 546: Fall 2008)
PC member of Social Computing 2013, Washington, D.C., 2013.
Reviewer of SODA 2013, New Orleans, Louisiana USA, 2013.
PC member of Social Computing 2012, Amsterdam, The Netherlands, 2012
Reviewer of SocialCom 2011, MIT, Boston, USA, 2011
Reviewer of Pervasive and Mobile Computing Journal, 2011
I'm interested in playing chess and solving puzzles. Recently I have been introduced to an exciting variant of chess called Hostage by John Leslie. You can take a look at this game @
Link to My GitHub