ICT-Online.ru has studied the Russian market for speech synthesis solutions against criteria relevant to contact centers. Seven products from major market players were analyzed using a purpose-built methodology. The same methodology can be applied when choosing a solution for deploying voice assistants in various business areas.

Speech synthesis, along with speech recognition, speech analytics, and voice biometrics, belongs to the class of software products built on speech technologies. Their main users have traditionally been the contact centers of telecom operators, financial organizations, online stores, IT companies, medical centers, and other suppliers of goods and services.

Potential customers of such services often lack the experience and expertise to compare the available speech synthesis tools objectively, so a product is chosen intuitively or on the basis of indirect factors, such as the developer's market reputation or advertising. The methodology used by ICT-Online.ru in this project shows which scientifically sound metrics can be used for evaluation.

The study covered seven business solutions: Yandex SpeechKit, SaluteSpeech from Sber, Audiogram from MTS AI, "Speech Synthesis" from the Cloud Platform CRT, Tinkoff VoiceKit, Text-to-speech from "Nanosemantika", and Aimyvoice. Male and female voices were compared separately, using the ready-made voices in each supplier's product line. A focus group of 500 users of the Toloka service took part.

For each stage of the study, phrases typical of contact center work were written for the synthesized voices to speak. Respondents' ratings showed which services handled each task best. The study also identified the main factors affecting the quality of the generated voice and the specific features to pay attention to when comparing products.
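
As an illustration of how respondent ratings of this kind might be aggregated, the sketch below averages scores per service and ranks the services separately for male and female voices. It is only a minimal sketch, not the study's actual methodology: the 1-to-5 rating scale, the service names `service_a` and `service_b`, and the sample scores are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def mean_scores(ratings):
    """Average the scores for each (service, voice_gender) pair.

    `ratings` is an iterable of (service, voice_gender, score) tuples.
    The tuple layout and the rating scale are assumptions for this sketch.
    """
    buckets = defaultdict(list)
    for service, gender, score in ratings:
        buckets[(service, gender)].append(score)
    return {key: mean(scores) for key, scores in buckets.items()}


def rank(ratings, gender):
    """Return (service, mean score) pairs for one voice gender, best first."""
    scores = mean_scores(ratings)
    selected = [(svc, s) for (svc, g), s in scores.items() if g == gender]
    return sorted(selected, key=lambda pair: pair[1], reverse=True)


# Hypothetical ratings, invented purely to show the data shape.
sample = [
    ("service_a", "female", 4), ("service_a", "female", 5),
    ("service_b", "female", 3), ("service_b", "female", 4),
    ("service_a", "male", 3), ("service_b", "male", 5),
]

if __name__ == "__main__":
    for voice in ("female", "male"):
        print(voice, rank(sample, voice))
```

Keeping male and female voices in separate rankings mirrors the way the comparison is described above; a real evaluation would also need enough ratings per voice for the averages to be meaningful.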

The study is available at: "Speech synthesis systems for contact centers".
