Cold Start Problems in Machine Learning

The cold start problem is a typical issue for systems built on machine learning (ML), and it is particularly relevant for recommendation systems. A neural network trained on a dataset has to predict what will best meet the user's needs, but it does not have enough data to do so. Our consultant in preparing this article was Olga Kovaleva, a senior analyst-developer at Yandex.

Image source provided by the press service of Yandex

The problem for ML systems is not always that a new user appears about whom there is no information. Sometimes there is no data about a new item added to the platform: a movie, if the recommendation system serves a streaming service, or a product, if it serves an e-commerce platform.

Similarly, the problem arises if the service itself or a new feature is just being launched and information about users has not yet been accumulated. Finally, there is the so-called "contextual" type of cold start problem. Take a recommendation system that suggests movies: without more general information about the user, such as their demographic characteristics, mood, or preferences, its recommendations may miss the mark.

The profit and survival of technology companies and startups depend on solving the cold start problem, because user satisfaction with IT services depends on it. There are a number of typical strategies for dealing with a cold start:

● data augmentation or the process of artificially increasing the volume and diversity of a training dataset by creating modified versions of existing examples without changing their key characteristics;

● hybrid approaches in the field of recommendation systems;

● transfer learning and active machine learning.

Let's talk about all this in more detail.

Factors that lead to the cold start problem

Several factors contribute to the cold start problem in ML systems. Together or individually, they can make predictive models and recommendation systems ineffective.

New users and new items on the platform

When a recommendation system encounters new users who have recently signed up and have not accumulated a history of interaction with the platform, there is practically no data from which to infer their preferences. Similarly, if a new product, news item, movie, or section appears in the system, there is no user reaction to it yet. Without historical data, algorithms cannot assess its relevance to any category of users.
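A system first has to notice which users and items are "cold" before it can treat them differently. The following is a minimal sketch of that check, assuming interactions are available as a list of (user, item) pairs; the function name `find_cold_entities`, the `min_events` threshold, and the sample data are hypothetical and chosen only for illustration.

```python
from collections import Counter


def find_cold_entities(interactions, all_users, all_items, min_events=1):
    """Return (cold_users, cold_items): entities with fewer than
    `min_events` recorded interactions."""
    user_counts = Counter(user for user, _ in interactions)
    item_counts = Counter(item for _, item in interactions)
    # Counter returns 0 for keys it has never seen, so users and items
    # that are absent from the log are treated as cold.
    cold_users = sorted(u for u in all_users if user_counts[u] < min_events)
    cold_items = sorted(i for i in all_items if item_counts[i] < min_events)
    return cold_users, cold_items


# Hypothetical interaction log: (user, item) pairs.
interactions = [("alice", "movie_a"), ("alice", "movie_b"), ("bob", "movie_a")]
users = ["alice", "bob", "carol"]
items = ["movie_a", "movie_b", "movie_c"]

cold_users, cold_items = find_cold_entities(interactions, users, items)
print(cold_users)  # ['carol']
print(cold_items)  # ['movie_c']
```

In practice the threshold would likely be tuned per platform: a user with one or two interactions is often still too "cold" for collaborative methods to work well.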

Lack of data on user interaction with certain entities on the platform

A similar problem often arises on niche e-commerce platforms. There is so little information about how a user relates to certain types of goods, or to other entities the platform works with, that the visitor's interests cannot be predicted. In some industries, users by the nature of the business interact with the platform rarely or not at all.

Data sparsity in the feature space

This situation occurs when certain attributes or features have limited coverage in the dataset. This creates difficulties when including new features or contextual information in the model, as the system may have difficulty generalizing effectively without sufficient data.
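Limited feature coverage can be measured directly. The sketch below is one simple way to do it, assuming each record is a dictionary in which a missing feature is either absent or set to `None`; the function names, the 0.5 threshold, and the sample records are illustrative assumptions, not part of any particular platform.

```python
def feature_coverage(rows):
    """Fraction of rows in which each feature has a non-missing value."""
    features = sorted({name for row in rows for name in row})
    total = len(rows)
    return {
        name: sum(1 for row in rows if row.get(name) is not None) / total
        for name in features
    }


def sparse_features(rows, min_coverage=0.5):
    """Names of features whose coverage falls below `min_coverage`."""
    return [name for name, cov in feature_coverage(rows).items()
            if cov < min_coverage]


# Hypothetical item metadata with gaps.
rows = [
    {"genre": "drama", "year": 2019, "director": None},
    {"genre": "comedy", "year": None, "director": None},
    {"genre": "drama", "year": 2021, "director": "A. Example"},
    {"genre": "thriller", "year": 2020},
]

coverage = feature_coverage(rows)
print(coverage)               # {'director': 0.25, 'genre': 1.0, 'year': 0.75}
print(sparse_features(rows))  # ['director']
```

A feature flagged this way is a candidate for imputation, for collecting more data, or for exclusion from the model until coverage improves.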

If the cold start problem is that a new feature has been added to the IT service or a new service is being launched for the first time, then there is no information about previous user actions. The effectiveness of recommendations or other predictive mechanisms will be low, because the machine learning algorithms only have the initial dataset, and there is no user feedback yet with which to retrain the model or adjust the initial training.

This leads to situations similar to the 2016 scandal surrounding Tay, Microsoft's neural-network chatbot. Trained to converse with users and launched on Twitter, it performed well at the very beginning of its work with real users. Unfortunately, it was given the ability to learn from dialogues with human users and adopt their manner of speech without sufficient filtering. As a result, the chatbot began to insult users and had to be turned off.

Cold start problem: real cases

Real examples of how ML platforms and neural networks face the cold start problem make it easier to understand how strongly it affects the effectiveness of ML models and the user experience, for better or for worse. Here are some scenarios that machine learning specialists face in practice:

● new subscriptions to streaming platforms: some streaming services specialize in personalized music and film recommendations using neural networks. A new subscriber does not have a history of using the service, so the services initially give uncertain and very general recommendations until the user shows what they like with their actions;

● a newly created product card on an e-commerce platform: a new seller on the marketplace puts up a product card, but the platform's ML systems do not yet have a purchase history or a search history for the new card. This makes it hard to form recommendations that include the new product. Worse still, the card's initial uncertain status may bias the system against it, so that users are shown only the old products the system already knows how to handle;

● niche markets with limited user data: niche e-commerce platforms that sell goods related to a specific hobby or interest suffer from rare and relatively passive user interaction with their services. Visitors are interested in buying and nothing more. Many platforms cannot cope with the problem, and their ML tools work poorly. However, there are many approaches that can make recommendations much more effective;

● selling goods with a long life cycle over the Internet: neural networks do not find enough history of user searches and purchases for such goods. A cold start problem arises. As with niche goods, there are opportunities to improve the situation, even though the problem is caused by the specifics of the products themselves.

These examples show how seriously the cold start problem can affect the profits of IT services. What solutions exist?

How to solve the cold start problem in ML algorithms?

It is extremely difficult to improve user experience and increase profits if you do not find a solution to the cold start problem. There are several ways to do this:

● data augmentation: you can synthesize missing data and perform data imputation, that is, teach the neural network to infer missing values from the available ones. In this way, you can create artificial training examples to stand in for data that users have not provided;

● hybrid approach to recommendations: combining several ML algorithms, including collaborative filtering, content-based filtering, and other methods, allows the ML system to form a more complete and reliable picture of the user. Where one method lacks information, another can fill the gaps;

● transfer learning and pre-training: in transfer learning, the knowledge a neural network gained on one task is reused for similar tasks. If you train the model in advance on huge datasets and teach it to find analogies between new and familiar situations, it will be able to work correctly under cold start conditions. Use a lot of data for training and label it well;

● active machine learning: a way to train neural networks when labeled data is scarce. The model itself selects the unlabeled examples it is least certain about and requests labels for them, for example by asking a new user to rate a few items. Because the most informative data is collected first, this can in many cases help compensate for the cold start problem.
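The hybrid approach from the list above can be sketched as a weighted blend of two score sources. This is only an illustration under stated assumptions: the collaborative and content-based scores are taken as given dictionaries, and the weighting rule `n / (n + half_point)` with `half_point=10` is a hypothetical choice, not a standard formula.

```python
def blend_weight(n_interactions, half_point=10):
    """Weight of the collaborative score: 0 with no history,
    approaching 1 as the user's history grows."""
    return n_interactions / (n_interactions + half_point)


def hybrid_scores(collab, content, n_interactions, half_point=10):
    """Blend collaborative and content-based scores for every item
    that appears in either source; a missing score counts as 0."""
    w = blend_weight(n_interactions, half_point)
    items = set(collab) | set(content)
    return {
        item: w * collab.get(item, 0.0) + (1 - w) * content.get(item, 0.0)
        for item in items
    }


def top_n(scores, n=3):
    """Items ordered by descending score; ties broken alphabetically."""
    return sorted(scores, key=lambda item: (-scores[item], item))[:n]


# Hypothetical scores for three items.
content = {"a": 0.9, "b": 0.2, "c": 0.5}  # from tags and descriptions
collab = {"a": 0.1, "b": 0.8}             # from similar users' behaviour

print(top_n(hybrid_scores(collab, content, n_interactions=0)))   # ['a', 'c', 'b']
print(top_n(hybrid_scores(collab, content, n_interactions=90)))  # ['b', 'a', 'c']
```

For a brand-new user the ranking follows the content-based scores alone; as interactions accumulate, the collaborative signal gradually takes over.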

If the problem is related to the operation of recommendation systems as such, then there are a number of approaches that can minimize the consequences of the cold start problem for the platform:

● you can make recommendations based on content: use as many meaningful features as possible in ML models, such as tags, descriptions, and metadata. Train neural networks to work with them efficiently when other information is lacking. Unlike other approaches, content-based recommendations can draw on the search the user is doing here and now;

● you can make recommendations based on popularity: if there is little information about the user, you can obtain data on the general popularity of certain goods from external integrations, for example an API with ratings, and use that background information about a product to make better recommendations;

● you can use contextual information and user feedback: to build ML models that stay effective under cold start conditions, you can use a wider range of data, such as the user's social contacts, time spent on pages, clicks, device type, location, and much more;

● you can use knowledge-based and example-based ML: methods that rely on directly observed data can be supplemented with heuristics and with a body of data from the relevant field of knowledge that the neural network can use in making decisions. You can also train the system effectively on examples of how it should act.
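The content-based and popularity-based approaches from the list above can be combined in a few lines. The following sketch assumes items are described by tag sets and uses Jaccard similarity, which is one common choice among several; the function `recommend`, the tag data, and the popularity counts are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two tag collections (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(item_tags, popularity, liked=(), n=2):
    """Recommend up to `n` items the user has not interacted with.

    With no history, fall back to the most popular items. Otherwise rank
    candidates by their best tag similarity to any item in `liked`."""
    candidates = [i for i in item_tags if i not in liked]
    if not liked:
        return sorted(candidates, key=lambda i: (-popularity.get(i, 0), i))[:n]

    def score(i):
        return max(jaccard(item_tags[i], item_tags.get(j, ())) for j in liked)

    return sorted(candidates, key=lambda i: (-score(i), i))[:n]


# Hypothetical catalogue: the two "new_*" items have no interactions yet.
item_tags = {
    "old_drama": {"drama", "history"},
    "old_comedy": {"comedy", "family"},
    "new_drama": {"drama", "history", "war"},
    "new_cartoon": {"family", "animation"},
}
popularity = {"old_drama": 120, "old_comedy": 300}

print(recommend(item_tags, popularity))                        # ['old_comedy', 'old_drama']
print(recommend(item_tags, popularity, liked=("old_drama",)))  # ['new_drama', 'new_cartoon']
```

Note that the new item "new_drama" is recommended purely on the strength of its tags, even though nobody has interacted with it yet. The popularity fallback, by contrast, favours established items, so relying on it alone would keep new items hidden.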

By choosing one method or a combination of them, you can substantially increase the effectiveness of ML models and of the platform services built on machine learning.

Problems and challenges in solving the cold start problem

In practice, there is no single correct scenario for overcoming the problem. If specialists choose the wrong way to solve it, the platform fails to build effective content personalization and loses out to competitors. Ethical conflicts also arise. Experts have long noted that incorrectly configured recommendation algorithms act as amplifiers of prejudice. For example, on e-commerce platforms they can discriminate against new products, resulting in an unfair restriction of competition. Even if biased ML algorithms were configured that way to increase the platform's profitability here and now, the contradictions built into them eventually come to the surface. Interest in the platform falls, and it loses its users.

Conclusions

The cold start problem is becoming a constant companion of ML platforms. There are a number of effective methods for solving it, and data scientists use them. Data scientists are specialists who create tools for solving business problems. A data scientist works at the intersection of three areas of knowledge: statistics, machine learning, and programming.

The effectiveness of their efforts depends on the right mix of methods and their correct application. The more professional the machine learning expert, the greater the likelihood of success. Woe to technology companies and startups where recommendation systems and other mechanisms based on neural networks have been trained incorrectly. As real cases of the cold start problem show, leaving it unsolved can be devastating for an application's client base and profits.

Nikolay Vavilov