Online Movie Recommender System
Independent project, written in Python with TensorFlow.
  • Preprocessed the raw MovieLens data into a Spark Dataset by implementing MapReduce operations.
  • Retrieval: Built candidate generators using content-based filtering and collaborative filtering via matrix factorization with Spark MLlib; visualized item and query embeddings with PCA; experimented with different similarity measures and regularization methods.
  • Ranking: Ranked candidates using Factorization Machines and a Gradient Boosting Decision Tree + Logistic Regression model with Spark MLlib.
  • Re-ranking: Experimented with re-ranking factors such as warm-start and fairness.
  • Built a web application exposing a RESTful-like API with Flask, served by an HTTP/1.1-compliant, thread-pooled WSGI web server running on a Spark cluster with CherryPy.
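
The retrieval step above (matrix factorization with regularization, then scoring candidates under different similarity measures) can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the project's Spark MLlib code: the toy ratings, the hyperparameters, and the helper names factorize, dot, cosine, and top_k are all assumptions made for the example.

```python
import math
import random


def factorize(ratings, n_users, n_items, dim=2, lr=0.05, reg=0.02,
              epochs=2000, seed=0):
    """Fit user/item embeddings with SGD on (user, item, rating) triples."""
    rng = random.Random(seed)
    U = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n_users)]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(a * b for a, b in zip(U[u], V[i]))
            for k in range(dim):
                uk, vk = U[u][k], V[i][k]
                # Gradient step on squared error with an L2 penalty (reg).
                U[u][k] += lr * (err * vk - reg * uk)
                V[i][k] += lr * (err * uk - reg * vk)
    return U, V


def dot(a, b):
    """Dot-product similarity: sensitive to embedding norm."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity: compares direction only."""
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / norm if norm else 0.0


def top_k(user_vec, item_vecs, seen, k=2, score=dot):
    """Return up to k highest-scoring item ids the user has not rated."""
    candidates = [i for i in range(len(item_vecs)) if i not in seen]
    candidates.sort(key=lambda i: score(user_vec, item_vecs[i]), reverse=True)
    return candidates[:k]


# Hypothetical toy data: (user_id, item_id, rating), 3 users and 5 items.
ratings = [
    (0, 0, 5), (0, 1, 4), (0, 2, 1),
    (1, 0, 4), (1, 1, 5), (1, 3, 1),
    (2, 2, 5), (2, 3, 4), (2, 0, 1), (2, 4, 5),
]
U, V = factorize(ratings, n_users=3, n_items=5)

# Root-mean-square error on the observed ratings after training.
rmse = math.sqrt(
    sum((r - dot(U[u], V[i])) ** 2 for u, i, r in ratings) / len(ratings)
)

seen_by_user0 = {i for u, i, _ in ratings if u == 0}
recs_dot = top_k(U[0], V, seen_by_user0, k=2, score=dot)
recs_cos = top_k(U[0], V, seen_by_user0, k=2, score=cosine)
```

Passing the scoring function as an argument makes it easy to compare how dot product and cosine order the same candidate set, and the reg argument plays the role of the regularization strength being tuned.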