How to Build Multimodal Recommender Systems with TensorFlow
Recommender systems are everywhere! On TikTok, they suggest the next viral clip; on YouTube, the next interesting video; on Twitter, they compile the list of tweets we see on our feed. In many ways, these algorithms determine what we get to see or not see! But how do they work, and how can we build these sorts of systems? While recommender systems are complex and can be implemented in many different ways, each approach must do a few things. First, it must derive some representation of the problem context (e.g., who is the user and what are their preferences?). Next, it must compute some measure of relevance (i.e., how can I tell if an item is suitable for the given context?). Finally, it must provide results that maximize relevance.
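The three ingredients above can be sketched in a few lines of code. This is an illustrative sketch, not the TweetMaker implementation: the embeddings are random stand-ins for real model outputs, and names like `cosine_scores` are hypothetical.

```python
import numpy as np

# 1. Representation: embed the context (e.g., a tweet) and the candidate
#    items (e.g., images) into a shared vector space. Random vectors here
#    stand in for real model embeddings.
rng = np.random.default_rng(0)
context_embedding = rng.normal(size=64)        # embedding of the user's tweet
item_embeddings = rng.normal(size=(100, 64))   # embeddings of 100 candidate images

# 2. Relevance: score each item against the context with cosine similarity.
def cosine_scores(context, items):
    context = context / np.linalg.norm(context)
    items = items / np.linalg.norm(items, axis=1, keepdims=True)
    return items @ context

scores = cosine_scores(context_embedding, item_embeddings)

# 3. Ranking: return the items that maximize relevance.
top_k = np.argsort(-scores)[:5]
print(top_k)  # indices of the 5 highest-scoring images
```

In a real system, the embeddings would come from trained text and image encoders, and the ranking step would typically use an approximate nearest-neighbor index rather than a full sort.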
In this post, we will walk through the hypothetical use case of building TweetMaker - a tool that recommends images you can add to your tweet to make it more engaging! We will discuss the process in the following steps:
- Problem framing. How do we translate the task into a machine learning problem?
- Data collection. What data pairs and data fields do we need? A walkthrough of collecting sample data from Twitter and ingesting it into BigQuery.
- Model training. How do we train the underlying ML models, e.g., a multimodal similarity model built with TensorFlow Similarity?
- Model evaluation. How do we evaluate the model to ensure it is fit for purpose?
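To preview the evaluation step: retrieval models like the one we will train are commonly scored with recall@k - the fraction of queries for which the known-relevant item appears among the top k results. A minimal sketch, assuming each query (tweet) has exactly one relevant item (image) and using synthetic embeddings in place of real model outputs:

```python
import numpy as np

def recall_at_k(query_emb, item_emb, true_item_ids, k=5):
    """Fraction of queries whose true item ranks in the top k by cosine similarity."""
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    it = item_emb / np.linalg.norm(item_emb, axis=1, keepdims=True)
    scores = q @ it.T                               # (num_queries, num_items)
    top_k = np.argsort(-scores, axis=1)[:, :k]      # top-k item ids per query
    hits = (top_k == np.asarray(true_item_ids)[:, None]).any(axis=1)
    return hits.mean()

# Synthetic check: each query is a noisy copy of its true item,
# so a good fraction of true items should be retrieved in the top 5.
rng = np.random.default_rng(42)
items = rng.normal(size=(50, 32))
queries = items + 0.1 * rng.normal(size=items.shape)
recall = recall_at_k(queries, items, true_item_ids=np.arange(50), k=5)
print(recall)
```

The same metric works unchanged on embeddings produced by a trained multimodal model; only the inputs to `recall_at_k` change.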
The full notebook and code are available on Kaggle: https://www.kaggle.com/code/victordibia/multimodal-metric-learning-tensorflow-similarity