An evaluation of sentence selection methods on the different phone-sized units for constructing Indonesian speech corpus

Muljono, Muljono and Harjoko, Agus and Winarsih, Nurul Anisa Sri and Supriyanto, Catur (2020) An evaluation of sentence selection methods on the different phone-sized units for constructing Indonesian speech corpus. International Journal of Speech Technology, 23 (1). 141 - 147. ISSN 15728110; 13812416

Abstract

Collecting a phonetically balanced text corpus is an important step in developing automatic speech recognition and text-to-speech systems. Such a corpus should have a small number of sentences yet contain all phonetic units, such as monophone, triphone, and pentaphone units. The least-to-most greedy algorithm (LTM + Greedy) and its variants exist for selecting the minimum sentence set. The variants differ in the sentence scoring method, which affects the number of selected sentences. In this paper, we evaluate the sentence scoring methods of Zhang and Suyanto within the LTM + Greedy algorithm. The scoring methods are applied to triphone and pentaphone units over the sentence collection. Triphone and pentaphone units offer higher-quality synthesized speech than monophone units. The dataset consists of Indonesian sentences collected from holy book translations, news, novels, dialogs, monologues, and question sentences. In total, 115,489 sentences are used in the experiments. Based on the experiments, LTM + Greedy by Suyanto produces a smaller number of sentences that contain a large number of phone units. © 2020 Elsevier B.V., All rights reserved.
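
The selection problem described in the abstract can be illustrated with a minimal sketch of generic greedy coverage over triphone units. This is not the scoring method of Zhang or of Suyanto, and it omits the least-to-most ordering step, since the abstract does not specify either. The function names, the sentence IDs, and the toy phone sequences are hypothetical, and the score used here (the number of not-yet-covered triphones a sentence adds) is an assumption chosen for simplicity.

```python
from typing import Dict, List, Set, Tuple


def triphones(phones: List[str]) -> Set[Tuple[str, ...]]:
    """Return the set of triphones (3-phone windows) in a phone sequence."""
    return {tuple(phones[i:i + 3]) for i in range(len(phones) - 2)}


def greedy_select(sentences: Dict[str, List[str]]) -> List[str]:
    """Pick sentence IDs until every triphone in the collection is covered.

    Assumed score: how many still-uncovered triphones a sentence contributes.
    Ties are broken by sentence ID so the result is deterministic.
    """
    units = {sid: triphones(phones) for sid, phones in sentences.items()}
    uncovered: Set[Tuple[str, ...]] = set().union(*units.values())
    selected: List[str] = []
    while uncovered:
        # max() returns the first maximal item, so sorting fixes the tie-break.
        best = max(sorted(units), key=lambda sid: len(units[sid] & uncovered))
        selected.append(best)
        uncovered -= units[best]
    return selected


# Hypothetical toy collection: sentence ID -> phone sequence.
corpus = {
    "s1": ["s", "a", "t", "u"],
    "s2": ["a", "t", "u"],
    "s3": ["t", "u", "a", "s"],
}
chosen = greedy_select(corpus)
```

On this toy collection the sketch selects two of the three sentences, because the only triphone in "s2" already occurs in "s1". Swapping in a different scoring function changes which sentences are chosen and how many are needed, which is the effect the paper evaluates.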

Item Type: Article
Additional Information: Cited by: 2
Uncontrolled Keywords: Character recognition; Genetic algorithms; Speech; Telephone sets; Automatic speech recognition; Greedy algorithms; Indonesian minimum sentence set; Phonetically balanced; Sentence selection; Speech corpora; Synthesized speech; Text-to-speech system; Speech recognition
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Depositing User: Sri JUNANDI
Date Deposited: 26 Sep 2025 06:58
Last Modified: 26 Sep 2025 06:58
URI: https://ir.lib.ugm.ac.id/id/eprint/20823
