Cantonese speech synthesis engine for offline embedded systems
Accuracy, high precision. Chinese word segmentation is based on natural language understanding, and characters with variant pronunciations are pronounced correctly in each word through analysis of the text context.
Naturalness. Tones change with prosody conversion, long sentences are broken at appropriate word boundaries, and emphasis falls on the correct words. Synthesised speech stays natural for mixed text input, which can contain Chinese, English, numbers, symbols and any other character defined in Unicode 8.0. The engine is compatible with HKSCS-2004 (Big5, ISO 10646). The default text encoding is UTF-8.
Intelligent. Detects unknown words (unknown names of things, places, people, organisations, etc.) intelligently and handles all kinds of Chinese punctuation. Chinese number strings are converted intelligently to the appropriate Chinese readings, for example dates and times, prices, units of measure, phone numbers, postcodes, currency symbols, licence plates, brand names and product models.
Tiny. Small memory and storage footprint, with very little computing resource required. The precompiled binary (library, .dll, .so, .a) is about 200 KB; together with the compressed data, the program binary plus data is under 4 MB.
Offline, platform independent. PCM audio output does not require a server or network connection. By default the engine produces in-memory output or a WAV format file.
Self-contained, instant, high speed. Handles text of any length, in any field or industry. The engine is based on natural language processing and does not require any external dictionary or corpus; the compressed database can process Chinese text from any industry. It reads news and novels smoothly, handles text context intelligently, and produces and plays back voice continuously without a time limit.
Automatic translation, true words and pure pronunciation of the language. Speech is synthesised for an entire file or paragraph of text, and the corresponding Yale or Jyutping romanisation text is output. Mandarin words are automatically translated into Cantonese by default; see some examples on that page.
Both female and male voices are supported. A child voice is also supported, generated algorithmically from the female voice.
Portable, cross-platform. Written from scratch in pure ANSI C without external dependencies.
Anywhere. Runs on AVR, ARM, PIC and MIPS embedded systems such as toys, watches and robots; on mobile platforms such as iPhone and Android; and of course on any ordinary desktop. It can be embedded in products such as newspaper or ebook readers, storytellers, language learning tools, chat, help desk and game agents, translation assistants and so on.
Embeddable. Can be loaded into memory and embedded in other programs.
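To make the PCM and WAV output mentioned in the feature list concrete, here is a minimal sketch in ANSI C of wrapping an in-memory PCM buffer in a WAV file. It is not taken from Yuet: the function name write_wav_pcm16 is hypothetical, and the sample format (16-bit signed, interleaved) and any sample rate passed in are assumptions, since this page does not state the engine's actual output format.

```c
#include <stdio.h>
#include <string.h>

/* Store values in little-endian order, independent of host byte order. */
static void put_le16(unsigned char *p, unsigned int v)
{
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)((v >> 8) & 0xFF);
}

static void put_le32(unsigned char *p, unsigned long v)
{
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)((v >> 8) & 0xFF);
    p[2] = (unsigned char)((v >> 16) & 0xFF);
    p[3] = (unsigned char)((v >> 24) & 0xFF);
}

/*
 * Hypothetical helper (not part of Yuet): write n_samples 16-bit PCM
 * samples (total across all channels, interleaved) as a canonical
 * 44-byte-header WAV file. Returns 0 on success, -1 on I/O error.
 */
int write_wav_pcm16(const char *path, const short *samples,
                    unsigned long n_samples, unsigned long sample_rate,
                    unsigned int channels)
{
    unsigned char hdr[44];
    unsigned char buf[2];
    unsigned long data_bytes = n_samples * 2UL;
    unsigned long i;
    FILE *f = fopen(path, "wb");

    if (f == NULL)
        return -1;

    memcpy(hdr, "RIFF", 4);
    put_le32(hdr + 4, 36UL + data_bytes);             /* RIFF chunk size  */
    memcpy(hdr + 8, "WAVE", 4);
    memcpy(hdr + 12, "fmt ", 4);
    put_le32(hdr + 16, 16UL);                         /* fmt chunk size   */
    put_le16(hdr + 20, 1U);                           /* format 1 = PCM   */
    put_le16(hdr + 22, channels);
    put_le32(hdr + 24, sample_rate);
    put_le32(hdr + 28, sample_rate * channels * 2UL); /* bytes per second */
    put_le16(hdr + 32, channels * 2U);                /* block align      */
    put_le16(hdr + 34, 16U);                          /* bits per sample  */
    memcpy(hdr + 36, "data", 4);
    put_le32(hdr + 40, data_bytes);

    if (fwrite(hdr, 1, sizeof hdr, f) != sizeof hdr) {
        fclose(f);
        return -1;
    }
    for (i = 0; i < n_samples; i++) {
        /* Conversion to unsigned yields the two's-complement bit pattern. */
        put_le16(buf, (unsigned int)samples[i] & 0xFFFFU);
        if (fwrite(buf, 1, 2, f) != 2) {
            fclose(f);
            return -1;
        }
    }
    return fclose(f) == 0 ? 0 : -1;
}
```

A caller holding synthesised samples in memory could, for example, call write_wav_pcm16("out.wav", samples, count, 16000UL, 1U) for 16 kHz mono audio. Writing the header byte by byte avoids any dependence on struct padding or host endianness, which matters on the mixed embedded targets listed above.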
Why Another TTS Engine?
The features of Yuet are the reason for developing another Chinese TTS engine.
Although many TTS engines can process Chinese text, none of them satisfies these requirements and generates speech with accurate pronunciation, clear word boundaries and emphasis on the appropriate words. This includes the engines from Google, Apple (Nuance), Microsoft, NeoSpeech, iFLYTEK, PiTL, etc. There are also some toy libraries such as eSpeak and Ekho, which are in fact of little use because they lack reliably correct pronunciation. Unlike all of these, the Yuet engine is based on Chinese word segmentation and natural language understanding.
Minimising the size of Yuet so that it fits embedded systems is the design goal, and also the design philosophy of the software. The all-in-one standalone Cantonese engine is only 4 MB, whereas the common solutions from the well-known companies mentioned above require well over 500 to 600 MB of storage. For embedded systems, an ultra-small footprint has many advantages, e.g. greater programming flexibility, better performance and dramatically reduced hardware costs.
An offline, embedded TTS engine avoids the trouble caused by unstable network connections, as well as the high cost of developing and maintaining backend servers.
Independent, complete. Yuet can handle text of any content and detects unknown words intelligently, without requiring any extra dictionary. The content can come from any field or industry, e.g. architecture, agriculture, biology, business, commerce, communications, computer, electronics, environment, education, finance, industry, mechanical, oil, mining, law & politics, government, military, transport, engineering, medicine, physics & chemistry, sports, news, etc.
And other reasons to develop yet another Chinese TTS engine.
Yuet is a serious, tiny, offline, high-speed embedded speech synthesis engine for Chinese text. Because of its extremely small size, it is well suited to resource-constrained embedded and microelectronic systems with little memory, and to both board-level and C-level integration and porting, for example MCU, FPGA, DSP, SoC and embedded RTOS, Android/Java VM, iPhone, Flash Player and more. A chip-level implementation can be developed with new donations or investment.
Yuet can also run as a standalone Chinese text segmenter (morphological and semantic analyser), a standalone translator between Cantonese and Mandarin, a standalone generator of Chinese text romanisation, and for other kinds of Chinese text processing, e.g. natural language processing, Chinese text information extraction, retrieval, machine learning and data mining, etc.
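As a small illustration of what romanisation output looks like, the sketch below reads a digit string one digit at a time, as is usual for phone numbers, and emits the Jyutping syllables. It is a toy under stated assumptions, not Yuet's method or API: the function name digits_to_jyutping is hypothetical, and the real engine is described as choosing among readings (dates, prices, phone numbers and so on) by context, which this sketch does not attempt.

```c
#include <stddef.h>
#include <string.h>

/* Jyutping readings of the digits 0-9. */
static const char *const JYUTPING_DIGITS[10] = {
    "ling4", "jat1", "ji6", "saam1", "sei3",
    "ng5", "luk6", "cat1", "baat3", "gau2"
};

/*
 * Hypothetical helper (not part of Yuet): write the digit-by-digit
 * Jyutping reading of `digits` into `out`, syllables separated by one
 * space. Non-digit characters such as spaces or hyphens are skipped.
 * Returns the number of bytes written (excluding the terminating NUL),
 * or -1 if `out` is too small.
 */
int digits_to_jyutping(const char *digits, char *out, size_t out_size)
{
    size_t used = 0;
    const char *p;

    if (out_size == 0)
        return -1;
    out[0] = '\0';

    for (p = digits; *p != '\0'; p++) {
        const char *syl;
        size_t len;
        size_t sep;

        if (*p < '0' || *p > '9')
            continue;                 /* skip separators */

        syl = JYUTPING_DIGITS[*p - '0'];
        len = strlen(syl);
        sep = (used > 0) ? 1 : 0;     /* space before all but the first */

        /* Need room for separator, syllable and terminating NUL. */
        if (used + sep + len + 1 > out_size)
            return -1;

        if (sep)
            out[used++] = ' ';
        memcpy(out + used, syl, len);
        used += len;
        out[used] = '\0';
    }
    return (int)used;
}
```

For the input "2345 6789" and a sufficiently large buffer, this produces "ji6 saam1 sei3 ng5 luk6 cat1 baat3 gau2". The lookup table is static and the function allocates nothing, which is in keeping with the small-footprint targets described above.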
A tiny embedded Chinese speech synthesis engine, suitable for: intelligent Chinese text processing, natural language understanding, language teaching, language learning, education for children, screen readers, speech translation, watches, toys, books, robots, finance news, book readers, help desk and translation assistants, games, animation, multimedia publishing, mobile phone manufacturers and other products that need to convert Chinese text to speech in real time, including software and computers, consumer devices, electronic equipment and so on.
A very useful self-study tool on the App Store demonstrates the features of the engine. The app was developed to help people learn Business Chinese efficiently, as a speaking assistant with correct native pronunciation. Download the app on iTunes:
Standalone executable binary. The program is a command line tool that can segment Chinese text into words, generate the Yale romanisation of the input text, and synthesise Cantonese speech from it. Try it, and you will be amazed at how accurate the engine is. It can segment any sentence of the attached material (2000 business sentences) into words with 100% correct output.
Translation between Cantonese and Mandarin, see some samples.
In addition, as a bonus, the dictionary app includes all items from the book A Dictionary of Cantonese Slang (俗語字典): the language of Hong Kong movies, street gangs and city life. See also some funny Cantonese idioms and proverbs.