Abstract | The problem is formulated in terms of obtaining the minimum description length of a text, and the proposed solution finds the segments and their languages through dynamic programming. |
Conclusion | An actual procedure for obtaining an optimal result through dynamic programming was proposed. |
In the experiments reported here, n is set to 5 throughout. | We can straightforwardly implement this recursive computation through dynamic programming, by managing a table of size |X| × |L|. To fill a cell of this table, formula (4) suggests referring to t × |L| cells and calculating the description length of the rest of the text for O(|X| - t) cells for each language. |
Problem Formulation | Nevertheless, since we use a uniform amount of training data for every language, and since varying γ would prevent us from improving the efficiency of dynamic programming, as explained in §4, in this article we set γ to a constant obtained empirically. |
Segmentation by Dynamic Programming | By applying the above methods, we propose a solution to formula (1) through dynamic programming . |
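The segmentation rows above describe minimizing total description length over segment boundaries and language labels. A minimal sketch of that idea follows, under loudly stated assumptions: the per-language model is a smoothed character unigram model (swapped in for illustration; the source's own description-length estimator is not reproduced here), the per-segment `overhead` in bits is a hypothetical constant, and the names `train_unigram` and `segment` are invented for this example. It uses a simple table over text positions and runs in O(|X|^2 × |L|) time, so it does not include the efficiency improvements the source refers to.

```python
import math
from collections import Counter


def train_unigram(sample, alphabet):
    """Return per-character costs in bits under an add-one smoothed unigram model."""
    counts = Counter(sample)
    total = len(sample) + len(alphabet)
    return {c: -math.log2((counts[c] + 1) / total) for c in alphabet}


def segment(text, models, overhead):
    """Split text into labeled segments with minimum total cost in bits.

    models maps a language name to its per-character cost table.
    overhead is a hypothetical fixed cost charged once per segment.
    """
    n = len(text)
    # prefix[lang][k] is the cost of text[:k] under lang, so any
    # segment cost is a difference of two prefix sums.
    prefix = {}
    for lang, costs in models.items():
        acc = [0.0]
        for ch in text:
            acc.append(acc[-1] + costs[ch])
        prefix[lang] = acc

    best = [0.0] + [math.inf] * n   # best[j]: minimum cost of text[:j]
    back = [None] * (n + 1)         # back[j]: (start, language) of last segment
    for j in range(1, n + 1):
        for i in range(j):
            for lang in models:
                cost = best[i] + prefix[lang][j] - prefix[lang][i] + overhead
                if cost < best[j]:
                    best[j] = cost
                    back[j] = (i, lang)

    # Follow the back-pointers from the end of the text to recover segments.
    segments = []
    j = n
    while j > 0:
        i, lang = back[j]
        segments.append((text[i:j], lang))
        j = i
    return segments[::-1], best[n]
```

Every character of `text` must appear in the alphabet used for training; a fuller version would need an explicit treatment of unseen characters.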
Discussion | In the figure, phone bigram TF-IDF is labeled p2; phonetic alignment with dynamic programming is labeled DP. |
Experiments | The feature selection experiments in Figure 2 show that the TF-IDF features alone are quite weak, while the dynamic programming alignment features alone are quite good. |
Feature functions | Given (p, w), we use dynamic programming to align the surface form p with all of the baseforms of w. Following (Riley et al., 1999), we encode a phoneme/phone with a 4-tuple: consonant manner, consonant place, vowel manner, and vowel place. |
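The alignment step in the last row can be illustrated with a standard edit-distance style dynamic program over phones encoded as 4-tuples. This is only a sketch under assumptions: the substitution cost (fraction of the four features that differ), the gap cost of 1.0, and the names `sub_cost` and `align` are all hypothetical choices for illustration, not the cost scheme used in the source.

```python
def sub_cost(a, b):
    """Hypothetical substitution cost: fraction of the 4 features that differ."""
    return sum(x != y for x, y in zip(a, b)) / 4


def align(surface, base, gap=1.0):
    """Align two phone sequences; return (total cost, list of aligned pairs).

    Each phone is a 4-tuple (consonant manner, consonant place,
    vowel manner, vowel place). None in a pair marks a gap.
    """
    m, n = len(surface), len(base)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + gap
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + gap
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(
                d[i - 1][j - 1] + sub_cost(surface[i - 1], base[j - 1]),
                d[i - 1][j] + gap,   # surface phone with no counterpart
                d[i][j - 1] + gap,   # baseform phone with no counterpart
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    pairs = []
    i, j = m, n
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and d[i][j] == d[i - 1][j - 1] + sub_cost(surface[i - 1], base[j - 1])):
            pairs.append((surface[i - 1], base[j - 1]))
            i, j = i - 1, j - 1
        elif j == 0 or (i > 0 and d[i][j] == d[i - 1][j] + gap):
            pairs.append((surface[i - 1], None))
            i -= 1
        else:
            pairs.append((None, base[j - 1]))
            j -= 1
    return d[m][n], pairs[::-1]
```

To compare a surface form against every baseform of a word, one could call `align` once per baseform and keep the lowest-cost result; how the resulting alignment is turned into features is left to the source's own definitions.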