poprox_recommender.training.preprocess

Functions

parse_behaviors(source, target, ...)

Parse behavior.tsv into behaviors_parsed.tsv after negative sampling. Training data only. Output columns: user (int), clicked_news [N... N...], candidate_news [N... N...], clicked [1 0 0 0 0].

parse_news(source, target, ...)

Parse news.tsv into news_parsed.tsv. Applied to training, validation, and testing data. Output columns: id (N...), title [list of tokens].

poprox_recommender.training.preprocess.parse_behaviors(source, target, user2int_path, negative_sampling_ratio)

parse_behaviors is only for training data.

Input: behavior.tsv
Output: behaviors_parsed.tsv, written after negative sampling, with these columns:

user: int
clicked_news: [N… N…]
candidate_news: [N… N…]
clicked: [1 0 0 0 0]
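The negative sampling step can be illustrated with a minimal sketch. It is not the library's implementation. It assumes each impression is encoded as a space-separated string of `newsid-label` pairs (for example `N1-1 N2-0`), which is not stated on this page, and the helper name `build_training_rows` is hypothetical. How the real function handles impressions with too few negatives is also unknown; this sketch samples with replacement in that case.

```python
import random


def build_training_rows(impressions, negative_sampling_ratio, rng):
    """Turn one impression string into rows of 1 positive + K sampled negatives.

    Assumption: `impressions` looks like "N1-1 N2-0 N3-0", where the suffix
    after the last "-" is the click label. This format is illustrative only.
    """
    positives, negatives = [], []
    for item in impressions.split():
        news_id, label = item.rsplit("-", 1)
        (positives if label == "1" else negatives).append(news_id)

    rows = []
    if not negatives:
        # Nothing to sample from; this sketch simply emits no rows.
        return rows

    for pos in positives:
        if len(negatives) >= negative_sampling_ratio:
            # Enough negatives: sample without replacement.
            sampled = rng.sample(negatives, negative_sampling_ratio)
        else:
            # Too few negatives: fall back to sampling with replacement
            # (an assumption made for this sketch).
            sampled = rng.choices(negatives, k=negative_sampling_ratio)
        rows.append(
            {
                "candidate_news": [pos] + sampled,
                "clicked": [1] + [0] * negative_sampling_ratio,
            }
        )
    return rows


rng = random.Random(0)
rows = build_training_rows("N1-1 N2-0 N3-0 N4-0 N5-0 N6-0", 4, rng)
print(rows[0]["clicked"])
```

With a ratio of 4, each row carries one clicked item followed by four non-clicked items, which matches the shape of the `clicked` column shown above.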

poprox_recommender.training.preprocess.parse_news(source, target, pretrained_tokenizer, token_length)

parse_news is applied to training, validation, and testing data.

Input: news.tsv
Output: news_parsed.tsv, with these columns:

id: (N…)
title: [list of tokens]
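As a rough illustration of how a title could become a fixed-size token list, the sketch below truncates or pads token ids to `token_length`. It rests on assumptions: this page does not say whether the real function pads, what padding id it uses, or what type `pretrained_tokenizer` is. `ToyTokenizer`, `encode_title`, and `pad_id` are hypothetical stand-ins, not part of poprox_recommender.

```python
class ToyTokenizer:
    """Stand-in for a pretrained tokenizer: maps lowercase words to integer ids.

    Ids start at 1 so that 0 can serve as the padding id in this sketch.
    """

    def __init__(self):
        self.vocab = {}

    def __call__(self, text):
        return [
            self.vocab.setdefault(word, len(self.vocab) + 1)
            for word in text.lower().split()
        ]


def encode_title(title, tokenize, token_length, pad_id=0):
    """Tokenize a title, then truncate or right-pad it to exactly token_length ids."""
    ids = tokenize(title)[:token_length]
    return ids + [pad_id] * (token_length - len(ids))


tokenizer = ToyTokenizer()
print(encode_title("Breaking news today", tokenizer, 5))
```

A fixed length keeps every row of the title column the same size, which is convenient when the parsed file is later batched for training.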