Tutorial#

This tutorial will walk you through the complete process of training a model in LibRecommender, i.e. data processing -> feature engineering -> training -> evaluation -> saving/loading -> retraining. We will use Wide & Deep as the example algorithm.

First, make sure the latest version of LibRecommender is installed:

$ pip install -U LibRecommender

Serving

For how to deploy a trained model in LibRecommender, see Serving Guide.

TensorFlow1 issue

If you encounter errors like Variables already exist, disallowed... in this tutorial, just call tf.compat.v1.reset_default_graph() first. This is one of the inconveniences of TensorFlow 1.
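
For example, the workaround is nothing more than the call mentioned above:

import tensorflow as tf

# clear the existing TF1 graph so the model variables can be created again
tf.compat.v1.reset_default_graph()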

Load Data#

In this tutorial we will use the MovieLens 1M dataset. The following code loads the data into pandas.DataFrame format. If the data does not exist locally, it will be downloaded first.

import random
import warnings
import zipfile
from pathlib import Path

import pandas as pd
import tensorflow as tf
import tqdm

warnings.filterwarnings("ignore")


def load_ml_1m():
    # download and extract zip file
    tf.keras.utils.get_file(
        "ml-1m.zip",
        "http://files.grouplens.org/datasets/movielens/ml-1m.zip",
        cache_dir=".",
        cache_subdir=".",
        extract=True,
    )
    # read and merge data into same table
    cur_path = Path(".").absolute()
    ratings = pd.read_csv(
        cur_path / "ml-1m" / "ratings.dat",
        sep="::",
        usecols=[0, 1, 2, 3],
        names=["user", "item", "rating", "time"],
    )
    users = pd.read_csv(
        cur_path / "ml-1m" / "users.dat",
        sep="::",
        usecols=[0, 1, 2, 3],
        names=["user", "sex", "age", "occupation"],
    )
    items = pd.read_csv(
        cur_path / "ml-1m" / "movies.dat",
        sep="::",
        usecols=[0, 2],
        names=["item", "genre"],
        encoding="iso-8859-1",
    )
    items[["genre1", "genre2", "genre3"]] = (
        items["genre"].str.split(r"|", expand=True).fillna("missing").iloc[:, :3]
    )
    items.drop("genre", axis=1, inplace=True)
    data = ratings.merge(users, on="user").merge(items, on="item")
    data.rename(columns={"rating": "label"}, inplace=True)
    # random shuffle data
    data = data.sample(frac=1, random_state=42).reset_index(drop=True)
    return data
>>> data = load_ml_1m()
>>> data.shape
(1000209, 10)
>>> data.iloc[random.choices(range(len(data)), k=10)]  # randomly select 10 rows
user item label time sex age occupation genre1 genre2 genre3
951319 4913 3538 3 962677962 F 25 1 Comedy missing missing
969300 3246 2977 5 968309625 F 35 1 Comedy Drama missing
914441 1181 3015 2 976142934 M 35 7 Thriller missing missing
905593 2063 695 2 974665086 M 25 4 Mystery Thriller missing
512570 4867 1200 4 962817971 M 25 16 missing missing missing
524227 4684 3174 2 963667810 F 25 0 Comedy Drama missing
801408 3792 1224 4 966360592 M 25 6 Drama War missing
117662 2270 480 5 974574449 M 18 1 Action Adventure Sci-Fi
935170 1088 3825 1 1037975844 F 1 10 Drama missing missing
309994 4808 3051 3 962934115 M 35 0 Drama missing missing

Now we have about 1 million rows of data. In order to perform evaluation after training, we first need to split the data into train, eval and test sets. In this tutorial we will simply use random_split(). For other ways of splitting data, see Data Processing.

Note

For now, we will only use the first half of the data for training. Later, we will use the rest of the data to retrain the model.

Process Data & Features#

>>> from libreco.data import random_split

# split data into three folds for training, evaluating and testing
>>> first_half_data = data[: (len(data) // 2)]
>>> train_data, eval_data, test_data = random_split(first_half_data, multi_ratios=[0.8, 0.1, 0.1], seed=42)
>>> print("first half data shape:", first_half_data.shape)
first half data shape: (500104, 10)

The data contains some categorical features such as “sex” and “genre”, as well as a numerical feature “age”. In LibRecommender we use sparse_col to represent categorical features and dense_col to represent numerical features. So one should specify the column information and then use DatasetFeat.build_* functions to process the data.

>>> from libreco.data import DatasetFeat

>>> sparse_col = ["sex", "occupation", "genre1", "genre2", "genre3"]
>>> dense_col = ["age"]
>>> user_col = ["sex", "age", "occupation"]
>>> item_col = ["genre1", "genre2", "genre3"]

>>> train_data, data_info = DatasetFeat.build_trainset(train_data, user_col, item_col, sparse_col, dense_col)
>>> eval_data = DatasetFeat.build_evalset(eval_data)
>>> test_data = DatasetFeat.build_testset(test_data)

user_col means the features that belong to users, and item_col means the features that belong to items. Note that the column counts should match, i.e. len(sparse_col) + len(dense_col) == len(user_col) + len(item_col).
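
As a quick sanity check (purely illustrative), you can assert this constraint on the columns defined above:

>>> assert len(sparse_col) + len(dense_col) == len(user_col) + len(item_col)  # 5 + 1 == 3 + 3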

>>> print(data_info)
n_users: 6040, n_items: 3576, data density: 1.8523 %
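
The reported density presumably refers to the fraction of all possible (user, item) pairs that appear in the training data. A rough back-of-the-envelope check under that assumption (exact split sizes depend on random_split):

>>> n_interactions = int(0.8 * len(first_half_data))  # training portion of the first half
>>> round(100 * n_interactions / (6040 * 3576), 4)    # n_users * n_items
1.8523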

Training the Model#

Now with all the data and features prepared, we can start training the model!

As its name suggests, the Wide & Deep algorithm has a wide part and a deep part, which use different optimizers. So we should specify the learning rate for each part separately by passing a dict, e.g. {"wide": 0.01, "deep": 3e-4}. For other model hyper-parameters, see the API reference of WideDeep.

In this example we treat all the samples in the data as positive samples and perform negative sampling. This is a standard procedure for “implicit data”.
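
To make the idea concrete, here is a rough sketch of what negative sampling for implicit feedback looks like. It is only an illustration of the general technique, not LibRecommender's internal implementation, which is handled for you via the neg_sampling argument below:

import numpy as np

def sample_negatives(user_pos_items, n_items, num_neg=1, seed=42):
    """For every observed (user, item) pair (label 1), draw `num_neg`
    items the user has not interacted with and label them 0."""
    rng = np.random.default_rng(seed)
    pairs, labels = [], []
    for user, pos_items in user_pos_items.items():
        pos_set = set(pos_items)
        for item in pos_items:
            pairs.append((user, item))
            labels.append(1)
            for _ in range(num_neg):
                neg = int(rng.integers(n_items))
                while neg in pos_set:
                    neg = int(rng.integers(n_items))
                pairs.append((user, neg))
                labels.append(0)
    return pairs, labels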

from libreco.algorithms import WideDeep
model = WideDeep(
    task="ranking",
    data_info=data_info,
    embed_size=16,
    n_epochs=2,
    loss_type="cross_entropy",
    lr={"wide": 0.05, "deep": 7e-4},
    batch_size=2048,
    use_bn=True,
    hidden_units=(128, 64, 32),
)

model.fit(
    train_data,
    neg_sampling=True,  # perform negative sampling on training and eval data
    verbose=2,
    shuffle=True,
    eval_data=eval_data,
    metrics=["loss", "roc_auc", "precision", "recall", "ndcg"],
)
Epoch 1 elapsed: 2.905s
    train_loss: 0.959
    eval log_loss: 0.5823
    eval roc_auc: 0.8032
    eval precision@10: 0.0236
    eval recall@10: 0.0339
    eval ndcg@10: 0.1001

Epoch 2 elapsed: 2.508s
    train_loss: 0.499
    eval log_loss: 0.4769
    eval roc_auc: 0.8488
    eval precision@10: 0.0332
    eval recall@10: 0.0523
    eval ndcg@10: 0.1376

We’ve trained the model for 2 epochs and evaluated the performance on the eval data during training. Next we can evaluate on the independent test data.

>>> from libreco.evaluation import evaluate
>>> evaluate(
...     model=model,
...     data=test_data,
...     neg_sampling=True,  # perform negative sampling on test data
...     metrics=["loss", "roc_auc", "precision", "recall", "ndcg"],
... )
{'loss': 0.4782908669403157,
 'roc_auc': 0.8483713737644527,
 'precision': 0.031268748897123694,
 'recall': 0.04829594849021039,
 'ndcg': 0.12866793895121623}
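
The metrics precision@10, recall@10 and ndcg@10 are typically computed per user over the top-K recommended items and then averaged. The snippet below shows the standard per-user formulation, purely as an illustration; it is not taken from LibRecommender's evaluation code:

def precision_at_k(recommended, relevant, k=10):
    # fraction of the top-k recommended items that are relevant
    top_k = recommended[:k]
    return len(set(top_k) & set(relevant)) / k

def recall_at_k(recommended, relevant, k=10):
    # fraction of the relevant items that appear in the top-k recommendations
    top_k = recommended[:k]
    return len(set(top_k) & set(relevant)) / max(len(relevant), 1)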

Make Recommendation#

Making recommendations is pretty straightforward. You can make recommendations for a single user or a batch of users.

>>> model.recommend_user(user=1, n_rec=3)
{1: array([ 364, 3751, 2858])}
>>> model.recommend_user(user=[1, 2, 3], n_rec=3)
{1: array([ 364, 3751, 2858]),
 2: array([1617,  608,  912]),
 3: array([ 589, 2571, 1200])}

You can also make recommendations based on specific user features.

>>> model.recommend_user(user=1, n_rec=3, user_feats={"sex": "M", "age": 33})
{1: array([2716,  589, 2571])}
>>> model.recommend_user(user=1, n_rec=3, user_feats={"occupation": 17})
{1: array([2858, 1210, 1580])}

Save, Load and Inference#

When saving the model, we should also save the data_info for feature information.

>>> data_info.save("model_path", model_name="wide_deep")
>>> model.save("model_path", model_name="wide_deep")

Then we can load the model and make recommendation again.

>>> tf.compat.v1.reset_default_graph()  # need to reset graph in TensorFlow1
>>> from libreco.data import DataInfo

>>> loaded_data_info = DataInfo.load("model_path", model_name="wide_deep")
>>> loaded_model = WideDeep.load("model_path", model_name="wide_deep", data_info=loaded_data_info)
>>> loaded_model.recommend_user(user=1, n_rec=3)

Retrain the Model with New Data#

Remember that we split the original MovieLens 1M data into two parts in the first place? We will treat the second half of the data as our new data and retrain the saved model with it. In real-world recommender systems, data may be generated every day, so it is inefficient to train the model from scratch every time we get some new data.

>>> second_half_data = data[(len(data) // 2) :]
>>> train_data, eval_data = random_split(second_half_data, multi_ratios=[0.8, 0.2])
>>> print("second half data shape:", second_half_data.shape)
second half data shape: (500105, 10)

The data processing is similar, except that we should use merge_trainset() and merge_evalset() in DatasetFeat.

The purpose of these functions is to combine the information from the old data with that from the new data, especially any new users/items that appear in the new data. For more details, see Model Retrain.

>>> # pass `loaded_data_info` and get `new_data_info`
>>> train_data, new_data_info = DatasetFeat.merge_trainset(train_data, loaded_data_info, merge_behavior=True)
>>> eval_data = DatasetFeat.merge_evalset(eval_data, new_data_info)  # use new_data_info
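
If you want to confirm that users and items appearing only in the second half have been incorporated, you can print the merged data_info just as we did before (output omitted here):

>>> print(new_data_info)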

Then we construct a new model and call its rebuild_model() method to assign the old trained variables to the new model.

>>> tf.compat.v1.reset_default_graph()  # need to reset graph in TensorFlow1
new_model = WideDeep(
    task="ranking",
    data_info=new_data_info,  # pass new_data_info
    embed_size=16,
    n_epochs=2,
    loss_type="cross_entropy",
    lr={"wide": 0.01, "deep": 1e-4},
    batch_size=2048,
    use_bn=True,
    hidden_units=(128, 64, 32),
)

new_model.rebuild_model(path="model_path", model_name="wide_deep", full_assign=True)

Finally, the training and recommendation parts are the same as before.

new_model.fit(
    train_data,
    neg_sampling=True,
    verbose=2,
    shuffle=True,
    eval_data=eval_data,
    metrics=["loss", "roc_auc", "precision", "recall", "ndcg"],
)
Epoch 1 elapsed: 2.867s
    train_loss: 0.4867
    eval log_loss: 0.4482
    eval roc_auc: 0.8708
    eval precision@10: 0.0985
    eval recall@10: 0.0710
    eval ndcg@10: 0.3062

Epoch 2 elapsed: 2.770s
    train_loss: 0.472
    eval log_loss: 0.4416
    eval roc_auc: 0.8741
    eval precision@10: 0.1031
    eval recall@10: 0.0738
    eval ndcg@10: 0.3168
>>> new_model.recommend_user(user=1, n_rec=3)
{1: array([ 364, 2858, 1210])}
>>> new_model.recommend_user(user=[1, 2, 3], n_rec=3)
{1: array([ 364, 2858, 1210]),
 2: array([ 608, 1617, 1233]),
 3: array([ 589, 2571, 1387])}

This completes our tutorial!

Where to go from here

For more examples, see the examples/ folder on GitHub.

For more usage details, please head to the User Guide.

For serving a trained model, please head to the Python Serving Guide.