ICT-Online.ru publishes a study of speech synthesis systems for contact centers

ICT-Online.ru has studied the Russian market for speech synthesis solutions against criteria that matter to contact centers. Seven products from significant market players were analyzed using a specially developed methodology. A similar methodology can be used to choose the best solution for deploying voice assistants in other business areas.

Speech synthesis, along with speech recognition, speech analytics, and voice biometrics, belongs to the class of software products that use speech technologies. Their main consumers are traditionally contact centers of telecom operators, financial organizations, online stores, IT companies, medical centers, and any other suppliers of goods and services.

Potential customers of such services often lack the experience and expertise to compare the available speech synthesis tools objectively, so they choose a product intuitively or on indirect grounds, such as the developer's market reputation or advertising. The methodology ICT-Online.ru used in this project shows which scientifically sound metrics can be used for evaluation.

The study covered seven business solutions: Yandex SpeechKit, SaluteSpeech from Sber, Audiogram from MTS AI, "Speech Synthesis" from the Cloud Platform of CRT, Tinkoff VoiceKit, Text-to-speech from "Nanosemantika", and Aimyvoice. Male and female voices were compared separately, using the ready-made voices in each supplier's product line. A focus group of 500 users of the Toloka service took part.

For each stage of the study, phrases typical of contact center work were prepared for the synthesized voices to speak. The respondents' ratings made it possible to identify the services that handled each task best. The study also identified the main factors that influence the quality of the generated voice, along with specific features worth attention when comparing products.

The study is available at: «Speech synthesis systems for contact centers».
