I've recently been thinking a lot about recommendations and about building the book recommendation portal I have had in mind since 2013.

However, as with any branch of machine learning, it is hard to find a good overview of recommendation techniques, their respective strengths and drawbacks, and hard performance numbers.

So let's get started.

## The Data

The MovieLens 20M dataset contains 20 million movie ratings. They were created by 138,000 users for 27,000 movies.

The data looks like this:

```
    userId  movieId  rating   timestamp
0        1        2     3.5  1112486027
1        1       29     3.5  1112484676
2        1       32     3.5  1112484819
3        1       47     3.5  1112484727
4        1       50     3.5  1112484580
5        1      112     3.5  1094785740
6        1      151     4.0  1094785734
7        1      223     4.0  1112485573
8        1      253     4.0  1112484940
9        1      260     4.0  1112484826
10       1      293     4.0  1112484703
```

There are genres and tags as well.

## The Evaluation

The task is to predict the ratings. To do so, the data is sorted by timestamp and split into 50% training data and 50% test data. On the test data, the mean absolute error (MAE) is calculated. Lower is better. Results are reported with exactly three decimal places.
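
The protocol described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the evaluation script itself: the helper names `chronological_split` and `mean_absolute_error` and the toy rows are made up for this example.

```python
def chronological_split(ratings, train_fraction=0.5):
    """Sort rating rows by timestamp and cut them into train and test parts."""
    ordered = sorted(ratings, key=lambda row: row["timestamp"])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


def mean_absolute_error(true_ratings, predicted_ratings):
    """Average of |true - predicted| over all ratings."""
    if len(true_ratings) != len(predicted_ratings):
        raise ValueError("Both sequences need the same length.")
    total = sum(abs(t - p) for t, p in zip(true_ratings, predicted_ratings))
    return total / len(true_ratings)


# Toy rows with the same columns as ratings.csv (all values are made up).
toy_ratings = [
    {"userId": 1, "movieId": 2, "rating": 3.5, "timestamp": 40},
    {"userId": 2, "movieId": 2, "rating": 4.0, "timestamp": 10},
    {"userId": 1, "movieId": 29, "rating": 2.0, "timestamp": 30},
    {"userId": 2, "movieId": 29, "rating": 5.0, "timestamp": 20},
]

train, test = chronological_split(toy_ratings)
y_true = [row["rating"] for row in test]
y_pred = [2.5] * len(test)  # constant baseline

# Three decimal places, as required for reported results.
print("MAE: {:0.3f}".format(mean_absolute_error(y_true, y_pred)))
```

The older half of the rows ends up in `train` and the newer half in `test`, so the model never sees ratings from the future.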

## Baselines

Each of the following evaluations took roughly 43s on my ThinkPad T460p. Memory consumption was not an issue for any of them.

Name | MAE | MSE | Comment |
---|---|---|---|
Constant 1 | 2.422 | 6.939 | I don't expect this to be awesome, but it should be better than an MAE of 5. |
Constant 5 | 1.603 | 3.761 | Together with Constant 1, this gives the range in which all recommenders will be. |
Constant 2.5 | 1.217 | 1.996 | Predicting the middle is the best choice if you have absolutely no prior knowledge and are evaluated by MAE. |
Median User Rating | 0.733 | 1.112 | Every user in the test set was also in the training set! |
Median Movie Rating | 0.723 | 1.061 | For known movies, predict their median rating. For unknown ones, predict the mean of all per-movie medians. |
User-adjusted movie rating | 0.825 | 1.042 | Use the median movie rating, but add the user bias (the user's median minus the mean of all user medians). |
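
The baselines use medians because, for a fixed set of ratings, the constant that minimizes the MAE is the median, while the constant that minimizes the MSE is the mean. The following sketch checks this on a handful of made-up ratings. The candidate grid and the helper names are assumptions for illustration only.

```python
from statistics import mean, median

ratings = [1.0, 4.0, 4.0, 4.5, 5.0]  # made-up training ratings


def mae(constant):
    """MAE when predicting the same constant for every rating."""
    return sum(abs(r - constant) for r in ratings) / len(ratings)


def mse(constant):
    """MSE when predicting the same constant for every rating."""
    return sum((r - constant) ** 2 for r in ratings) / len(ratings)


# All values on the half-star grid: 0.5, 1.0, ..., 5.0
candidates = [c / 2 for c in range(1, 11)]

best_for_mae = min(candidates, key=mae)
best_for_mse = min(candidates, key=mse)

print("median:", median(ratings), "best constant for MAE:", best_for_mae)
print("mean:", mean(ratings), "best constant for MSE:", best_for_mse)
```

Here the best constant for MAE coincides with the median, and the best constant for MSE is the grid value closest to the mean.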

## Code

```python
#!/usr/bin/env python

"""Analyze the quality of recommendations."""

# 3rd party modules
import click
import pandas as pd
from sklearn.base import BaseEstimator
from sklearn.model_selection import train_test_split


def load_data(rating_filepath='ratings.csv'):
    """Load the extracted MovieLens data."""
    nrows = None  # set to an integer to read only the first rows
    df = pd.read_csv(rating_filepath, nrows=nrows)
    # NOTE: casting to int16 truncates half-star ratings (3.5 becomes 3).
    df['rating'] = df['rating'].astype('int16')
    df = df.sort_values(by='timestamp')
    df_x = df[['timestamp', 'userId', 'movieId']]
    df_y = df[['rating']]
    # NOTE: train_test_split shuffles and holds out 25% by default. Pass
    # test_size=0.5, shuffle=False for a chronological 50/50 split.
    df_train_x, df_test_x, df_train_y, df_test_y = train_test_split(df_x, df_y)
    return {'train': {'x': df_train_x, 'y': df_train_y},
            'test': {'x': df_test_x, 'y': df_test_y}}


class BaselineRecommender(BaseEstimator):
    """Create a baseline recommender."""

    def __init__(self, strategy='constant', constant=None):
        if constant is not None and strategy != 'constant':
            raise RuntimeError('constant is only meaningful in the constant '
                               'strategy.')
        self.strategy = strategy
        self.constant = constant

    def fit(self, df_x, df_y):
        """Fit the recommender on MovieLens data."""
        df = df_x.join(df_y)
        # Median rating per user and per movie, as plain dicts for fast lookup
        self.median_by_user = df.groupby(by='userId') \
                                .aggregate({'rating': 'median'})['rating'] \
                                .to_dict()
        self.median_by_movie = df.groupby(by='movieId') \
                                 .aggregate({'rating': 'median'})['rating'] \
                                 .to_dict()
        # Mean of the medians: fallback for unknown movies, reference for bias
        self.avg_movie = (sum(self.median_by_movie.values()) /
                          len(self.median_by_movie))
        self.avg_user = (sum(self.median_by_user.values()) /
                         len(self.median_by_user))
        return self

    def predict(self, df_x):
        """Predict ratings for user/movie combinations."""
        results = []
        for entry in df_x.to_dict('records'):
            if self.strategy == 'constant':
                # Fall back to 2.5 if no constant was given
                prediction = 2.5 if self.constant is None else self.constant
            elif self.strategy == 'movie_median':
                movie = entry['movieId']
                prediction = self.median_by_movie.get(movie, self.avg_movie)
            elif self.strategy == 'user_median':
                user = entry['userId']
                # Raises KeyError for users that were not in the training set
                prediction = self.median_by_user[user]
            elif self.strategy == 'user_adjust_movie_median':
                movie = entry['movieId']
                movie_median = self.median_by_movie.get(movie, self.avg_movie)
                user = entry['userId']
                user_bias = self.median_by_user[user] - self.avg_user
                prediction = movie_median + user_bias
            else:
                raise NotImplementedError(
                    'Unknown strategy: {}'.format(self.strategy))
            results.append(prediction)
        return results


def evaluate(true_ratings, predicted_ratings, func='mae'):
    """Evaluate the results of a rating prediction."""
    assert len(true_ratings) == len(predicted_ratings)
    if func == 'mae':
        absolute_errors = sum(abs(a - b)
                              for a, b in zip(true_ratings, predicted_ratings))
        val = absolute_errors / len(true_ratings)
    elif func == 'mse':
        sq_errors = sum((a - b)**2
                        for a, b in zip(true_ratings, predicted_ratings))
        val = sq_errors / len(true_ratings)
    else:
        raise ValueError('Unknown evaluation function: {}'.format(func))
    return val


@click.command()
@click.option('--strategy',
              default='constant',
              type=click.Choice(['constant', 'movie_median', 'user_median',
                                 'user_adjust_movie_median']))
@click.option('--constant', default=None,
              type=float)
def main(strategy, constant):
    """Analyze recommenders on the MovieLens 20M dataset."""
    data = load_data()
    m = BaselineRecommender(strategy=strategy, constant=constant)
    m.fit(data['train']['x'], data['train']['y'])
    y_pred = m.predict(data['test']['x'])
    mae = evaluate(data['test']['y']['rating'], y_pred, func='mae')
    mse = evaluate(data['test']['y']['rating'], y_pred, func='mse')
    print('MAE of baseline: {:0.3f}'.format(mae))
    print('MSE of baseline: {:0.3f}'.format(mse))


if __name__ == '__main__':
    main()
```

## Problems

**Ratings instead of Order**: In applications, we are not interested in predicting the exact rating but in getting the order right. So a constant bias for a user is fine. MAE does not capture that fact.
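
To make this concrete, here is a small sketch that compares MAE with a simple order-based score. The score, `pairwise_order_accuracy`, is an assumption chosen for illustration (the fraction of item pairs that the prediction ranks in the same order as the true ratings), and all ratings are made up.

```python
from itertools import combinations


def mean_absolute_error(true_ratings, predicted_ratings):
    """Average of |true - predicted| over all ratings."""
    total = sum(abs(t - p) for t, p in zip(true_ratings, predicted_ratings))
    return total / len(true_ratings)


def pairwise_order_accuracy(true_ratings, predicted_ratings):
    """Fraction of item pairs that the prediction orders like the truth."""
    agree = 0
    total = 0
    for i, j in combinations(range(len(true_ratings)), 2):
        true_diff = true_ratings[i] - true_ratings[j]
        if true_diff == 0:
            continue  # ties in the true ratings carry no order information
        pred_diff = predicted_ratings[i] - predicted_ratings[j]
        total += 1
        if true_diff * pred_diff > 0:
            agree += 1
    return agree / total if total else float("nan")


true_ratings = [5.0, 3.0, 4.0, 1.0]
# Perfect order, but every rating is one star too low (constant user bias)
biased = [r - 1.0 for r in true_ratings]
# Closer to the true values on average, but two items are swapped
closer_but_swapped = [4.0, 3.5, 3.0, 1.5]

for name, pred in [("biased", biased),
                   ("closer_but_swapped", closer_but_swapped)]:
    print("{}: MAE={:0.3f}, order accuracy={:0.3f}".format(
        name,
        mean_absolute_error(true_ratings, pred),
        pairwise_order_accuracy(true_ratings, pred)))
```

The biased prediction has the worse MAE although it ranks every pair correctly, while the prediction with the better MAE gets one pair wrong.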

## Publications

- Prateek Sappadla, Yash Sadhwani, Pranit Arora: Movie Recommender System: They claim to have reached MSE=0.65 with matrix factorization and 0.70 with k-nearest users.
- Shuyu Luo: Introduction to Recommender System, 2018.