Predicting the Probability of Purchasing Recommended Items



This paper examines methods for improving recommendation systems. We compare two classification models, random forest and CatBoostClassifier, on purchase history data of Ozon customers. To build features, we apply techniques that are standard in recommendation systems: collaborative filtering, cosine similarity between products computed from customer views within a single site visit, and similarity of text data. The results are evaluated with metrics that measure the quality of the first K recommended items: mean average precision at K (MAP@K) and recall at K (Recall@K). Adding features derived from these item-similarity methods improves the quality of the model predictions. The CatBoostClassifier model showed the best results.

Keywords: recommendation systems, machine learning, binary classification, collaborative filtering methods, cosine similarity, MAP@K, Recall@K

For citation: Parfenov P.A., Timofeeva A.A., Sologub G.B., Alekseychuk A.S. Prediction the Probability of Purchases Recommended Items. Modelirovanie i analiz dannikh = Modelling and Data Analysis, 2020. Vol. 10, no. 4, pp. 17–30. DOI: 10.17759/mda.2020100402. (In Russ., abstr. in Engl.)
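
The two ranking metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, item identifiers are assumed to be integers, each recommendation list is assumed to contain no duplicates, and average precision is normalized by min(number of purchased items, K), which is one common convention among several.

```python
from typing import Sequence, Set


def recall_at_k(recommended: Sequence[int], purchased: Set[int], k: int) -> float:
    """Share of purchased items that appear among the first k recommendations."""
    if not purchased:
        return 0.0
    hits = len(set(recommended[:k]) & purchased)
    return hits / len(purchased)


def average_precision_at_k(recommended: Sequence[int], purchased: Set[int], k: int) -> float:
    """Average of precision values taken at each rank where a purchased item occurs.

    Assumption: normalized by min(len(purchased), k); other conventions exist.
    """
    if not purchased:
        return 0.0
    hits = 0
    score = 0.0
    for rank, item in enumerate(recommended[:k], start=1):
        if item in purchased:
            hits += 1
            score += hits / rank  # precision at this rank
    return score / min(len(purchased), k)


def map_at_k(all_recommended: Sequence[Sequence[int]],
             all_purchased: Sequence[Set[int]], k: int) -> float:
    """Mean of average_precision_at_k over all customers."""
    scores = [average_precision_at_k(r, p, k)
              for r, p in zip(all_recommended, all_purchased)]
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    # Hypothetical data: two customers, their ranked recommendations and purchases.
    recs = [[10, 20, 30, 40], [1, 2]]
    bought = [{20, 40, 50}, {1}]
    print(recall_at_k(recs[0], bought[0], k=3))
    print(map_at_k(recs, bought, k=3))
```

Both metrics look only at the first K positions of the ranked list. Recall@K ignores the order within those positions, while MAP@K rewards placing purchased items closer to the top.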


