MP3Gain is an MP3 volume normalizer that analyzes the target MP3 file and adjusts each fragment to a similar volume without degrading the audio quality. Steps: download and install MP3Gain, drag the target MP3 file into the software, click Track Analysis, type a value in Target "Normal" Volume (up to 99 dB), then click Track. The target volume is in accordance with the EBU R128 loudness standardization specification.

This project sets up a ready-to-use TN module for Chinese. Since my background is speech processing, it should be able to handle the most common TN tasks in Chinese ASR text processing pipelines.

Supported NSW (Non-Standard-Word) Normalization NSW type

Acknowledgement: the NSW normalization code is based on Zhiyang Zhou's work.

For Chinese, it removes the punctuation in the list collected in the Zhon project.

Make sure you have python3; python2.X won't work correctly. Run sh run.sh in the TN dir and compare the raw text with the normalized text.

Note: All input text should be UTF-8 encoded. If you pass the -format tsv option, normalization applies to the TEXT field only.

Make sure you have thrax installed, and that your PATH can find the thrax binaries.

There are indeed some things that need to be improved:

For TN, the NSW normalizers in the TN dir are based on regular expressions. I've found some unintended matches, so those regex patterns need to be refined for more precise TN coverage.

For ITN, extend the thrax rewriting grammars to cover more scenarios.

Furthermore, commercial systems nowadays are starting to introduce RNN-like models into TN, and a mix of rule-based and model-based components is the state of the art.

Since TN is a typical "done is better than perfect" module in the context of ASR, and the current state is sufficient for my purpose, I probably won't update this repo frequently.

For more reading on this topic, look for Richard Sproat and Kyle Gorman's work at Google.
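To make the "unintended matches" point about regex-based NSW normalizers concrete, here is a minimal sketch in Python. It is not the project's code: the digit table, the two patterns, and the names `read_digits`, `LOOSE_YEAR`, `STRICT_YEAR`, and `normalize_year` are all hypothetical, chosen only to show how a loose pattern can fire inside a longer number and how a tighter pattern avoids it.

```python
import re

# Hypothetical digit table for illustration; not taken from the project.
DIGITS = "零一二三四五六七八九"


def read_digits(s: str) -> str:
    """Spell a digit string one digit at a time, e.g. '2019' -> '二零一九'."""
    return "".join(DIGITS[int(ch)] for ch in s)


# Loose pattern: any 4 digits followed by 年 are treated as a year.
LOOSE_YEAR = re.compile(r"(\d{4})年")

# Tighter pattern: the 4 digits must not be preceded by another digit,
# so the tail of a longer number such as 12000 is left alone.
STRICT_YEAR = re.compile(r"(?<!\d)(\d{4})年")


def normalize_year(text: str, pattern: re.Pattern) -> str:
    """Rewrite every year matched by `pattern` into spoken-form digits."""
    return pattern.sub(lambda m: read_digits(m.group(1)) + "年", text)


if __name__ == "__main__":
    print(normalize_year("2019年", LOOSE_YEAR))        # intended match
    print(normalize_year("历时12000年", LOOSE_YEAR))    # unintended match on "2000年"
    print(normalize_year("历时12000年", STRICT_YEAR))   # left untouched
```

The lookbehind is only one way to tighten a pattern. A real normalizer would also need a rule for quantities like 12000年, which this sketch deliberately leaves unhandled.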
MP3 Gain for Mac is a simple tool designed specifically to increase and normalize the volume of audio files (mp3, aac, wma, etc.). It automatically normalizes the volume of audio files and raises it to a uniform target volume without degrading quality.

Chinese Text Normalization for Speech Processing

Problem

Search for "Text Normalization" (TN) on Google and GitHub, and you can hardly find open-source projects that are "ready-to-use" for text normalization tasks. Instead, you find a bunch of NLP toolkits or frameworks that support TN functionality. There is quite some work between "supporting text normalization" and "doing text normalization".

Some TN processing methods are shared across languages, but a good TN module always involves language-specific knowledge and treatment, more or less.

Even for the same language, different applications require quite different TN.

Constructing and maintaining a set of TN rewrite rules is painful, whatever toolkits and frameworks you choose. Subtle and intrinsic complexities hide inside the TN task itself, not in the tools or frameworks.

Since constructing and maintaining TN is hard, a mature TN module is actually an asset for commercial companies, so you are unlikely to find a product-level TN in the open-source community (correct me if you find any).

TN is a less important topic for both academia and industry.