Engineers from Yandex, the Skoltech Center for Artificial Intelligence, and the St. Petersburg State University of Aerospace Instrumentation have presented PackEat, the world's largest open dataset for computer vision systems in retail.
The dataset consists of photographs of fruits and vegetables that retailers can use to train algorithms for smart cash registers and inventory systems. It covers 34 types and 65 varieties of products, photographed in real stores from different angles. In total, more than 100,000 images have been collected, capturing over 370,000 objects. About 9,000 of the images are annotated at the level of individual objects, with the weight and number of units of goods indicated.
The developers intend PackEat to improve the accuracy of product recognition in supermarkets, since it includes images in which objects are in packaging, overlap one another, or appear against "noisy" backgrounds. This should help solve key computer vision tasks in retail: distinguishing types and varieties of products, isolating each individual object, and automatically counting the number of units of goods.
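To make the counting task concrete, here is a minimal sketch of how per-object annotations could be turned into unit counts by type and by variety. The record layout and the field names `type` and `variety` are assumptions made for illustration; they are not PackEat's actual annotation schema, and the sample values are invented.

```python
from collections import Counter

# Hypothetical per-object annotations for a single image.
# The "type" and "variety" field names are illustrative assumptions,
# not the real PackEat schema.
sample = [
    {"type": "apple", "variety": "Granny Smith"},
    {"type": "apple", "variety": "Granny Smith"},
    {"type": "apple", "variety": "Gala"},
    {"type": "tomato", "variety": "Cherry"},
]


def count_units(objects):
    """Return (units per type, units per (type, variety)) for one image."""
    by_type = Counter(obj["type"] for obj in objects)
    by_variety = Counter((obj["type"], obj["variety"]) for obj in objects)
    return by_type, by_variety


if __name__ == "__main__":
    by_type, by_variety = count_units(sample)
    for name, units in sorted(by_type.items()):
        print(f"{name}: {units}")
    for (name, variety), units in sorted(by_variety.items()):
        print(f"{name} / {variety}: {units}")
```

In a real pipeline the list of objects would come from a detection or segmentation model rather than from hand-written records; the counting step itself would stay the same.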
The dataset is hosted on Zenodo, and the code and example models are available on Kaggle; researchers and developers can use both in their own projects.