Data Info#
The DataInfo
object stores almost all the useful information from the original data.
We admit there may be too much information in this object, but for the ease of use of the library,
we’ve decided not to split it.
So almost every model has a data_info
attribute that is used to make recommendations.
Additionally, when saving and loading a model, the corresponding DataInfo should also be saved and loaded.
When using a feat
model, the DataInfo
object stores the unique features of
all users/items in the training data. However, if a user/item has different categories or values
in the training data (which may be unlikely if the data is clean :)), only the last one will be stored.
For example, if in one sample a user’s age is 20, and in another sample this user’s age becomes 25,
then only 25 will be kept. So here we basically assume the data is always sorted by time,
and you should do so if it doesn’t.
Therefore, when you call model.predict(user=..., item=...)
or model.recommend_user(user=...)
for a feat model, the model will use the stored feature information in DataInfo.
The DataInfo
object also stores users’ consumed items, which can be useful in sequence models
and unconsumed
sampler.
Changing User/Item Features#
It is also possible to change the unique user/item feature values stored in DataInfo, then the new features would be used in prediction and recommendation.
>>> data_info.assign_user_features(user_data=data)
>>> data_info.assign_item_features(item_data=data)
The passed data
argument is a pandas.DataFrame
that contains the user/item information.
Be careful with this assign operation if you are not sure if the features in data
are useful.
See also