Sentences, Words: 14, 163
Words per sentence: 11.64
Avg #Complex NPs/Sentence: 0.43 (max expectable: 2.91)
Avg #Proper Modifiers/Sentence: 0.07 (max expectable: 3.88)
Avg #Function Words/Sentence: 4.00 (max expectable: 5.82)
Avg #Coordinating Words/Sentence: 0.43 (max expectable: 34.93)
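
The figures above can be checked with a little arithmetic. The sketch below recomputes words per sentence from the two raw counts. It also shows that the printed "max expectable" values happen to match the average sentence length divided by 4, 3 and 2, and multiplied by 3 for coordinating words. That relationship is only inferred from the numbers; it is not a statement of how the tool actually computes them, and the name summarize is made up for illustration.

```python
def summarize(sentences, words):
    """Recompute the headline figures from the two raw counts."""
    wps = words / sentences
    return {
        "words_per_sentence": round(wps, 2),
        # The divisors and multiplier below are GUESSES that happen to
        # reproduce the printed "max expectable" values; they are not
        # documented anywhere in the text.
        "max_complex_np": round(wps / 4, 2),
        "max_proper_modifiers": round(wps / 3, 2),
        "max_function_words": round(wps / 2, 2),
        "max_coordinating_words": round(wps * 3, 2),
    }


stats = summarize(sentences=14, words=163)
print(stats)
```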

Readability statistics are weighted sums of the following: words per sentence, percent of complex words, and percent of multisyllabic words. In toki pona (tp), nothing inflects, so nothing is complex in the Gunning fog sense. But we do have complex noun phrases, which are clearly marked by having "pi" in them. In tp all words are three syllables or fewer, except proper modifiers. Long words are probably over-penalized by Flesch-Kincaid scoring: kepeken isn't any more complicated than ma, and Italija (4 syllables) is just as hard to recognize as Losi (2 syllables). Proper modifiers should still be penalized as harder, though, because they are infrequently used and are often a long way from their original form. Also, all base content words are roughly as frequent as one another, and even the rarest words (like open, or obsolete words like pata) aren't especially difficult.
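
For reference, here is a minimal sketch of the two standard formulas mentioned above, using their usual published coefficients. The syllable counter is an assumption of mine rather than anything this site's tool is known to do: it relies on every toki pona syllable having exactly one vowel, so counting vowels counts syllables. All function names are hypothetical.

```python
VOWELS = set("aeiou")


def count_syllables(word):
    """Count syllables by counting vowels (assumes one vowel per syllable)."""
    return sum(1 for ch in word.lower() if ch in VOWELS)


def flesch_kincaid_grade(words, sentences, syllables):
    """Standard Flesch-Kincaid grade level from raw counts."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def gunning_fog(words, sentences, complex_words):
    """Standard Gunning fog index from raw counts."""
    return 0.4 * ((words / sentences) + 100.0 * (complex_words / words))


# The four words compared in the text.
for w in ["kepeken", "ma", "Italija", "Losi"]:
    print(w, count_syllables(w))
```

If nothing counts as a complex word, the fog index collapses to 0.4 times the words per sentence, so it only measures sentence length. The Flesch-Kincaid syllable term, on the other hand, charges kepeken three times as much as ma, which is the over-penalizing described above.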

A toki pona-specific measure might be the count of particles (function words) vs. other words: la, o, e, li, pi, plus the six prepositions. Something else tp-specific that really bumps up reading difficulty is discourse, i.e. long chains of sentences interlocked by "e ni", "ni li", and "ona". mi and sina are less problematic because their referent is always clear: it is never something spoken earlier or later, or something not even specifically mentioned.
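
A rough sketch of both ideas, counting function words and counting the back-referring words ni and ona. The five particles come from the text, but the list of six prepositions is my assumption, since the text doesn't name them. The counting is also naive: a word like tawa is counted as a function word even where it is used as a verb. The function names are hypothetical.

```python
# Particles named in the text.
PARTICLES = {"la", "o", "e", "li", "pi"}
# ASSUMED list of the "six prepositions"; the text does not enumerate them.
ASSUMED_PREPOSITIONS = {"kepeken", "lon", "poka", "sama", "tan", "tawa"}
FUNCTION_WORDS = PARTICLES | ASSUMED_PREPOSITIONS

# Words that point back (or forward) to something else in the discourse.
ANAPHORS = {"ni", "ona"}


def tokenize(text):
    """Split on whitespace and strip surrounding punctuation."""
    words = [w.strip('.,:;!?"()').lower() for w in text.split()]
    return [w for w in words if w]


def function_word_ratio(text):
    """Share of tokens that are particles or (assumed) prepositions."""
    words = tokenize(text)
    if not words:
        return 0.0
    return sum(1 for w in words if w in FUNCTION_WORDS) / len(words)


def anaphor_count(text):
    """Number of ni/ona tokens, a crude proxy for discourse chaining."""
    return sum(1 for w in tokenize(text) if w in ANAPHORS)


print(function_word_ratio("mi moku e kili lon tomo."))
print(anaphor_count("ona li pali e ni."))
```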

Note on maximums: To calculate a true maximum, you'd have to consider all the possible sentences of length X and see which sentence type has the most function words, proper modifiers, etc. For example, in a sentence of 3 words, no proper modifiers are possible. In a sentence of 4 words, 1 proper modifier is possible. In a sentence of 15 words, you could have ni li ijo followed by 12 proper modifiers, but that isn't really something you ever expect to see.
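
The three examples above all follow one simple rule, sketched here: the frame "ni li ijo" uses up three words, and every word after it can be a proper modifier. This assumes that frame is the shortest one that can carry a proper modifier, which is how the examples read. The function name is made up for illustration.

```python
def max_proper_modifiers(n_words):
    """Theoretical maximum of proper modifiers in a sentence of n_words.

    Assumes the shortest carrier is a three-word frame such as
    "ni li ijo", with every remaining word a proper modifier.
    """
    return max(0, n_words - 3)


for n in (3, 4, 15):
    print(n, max_proper_modifiers(n))
```

For a 15-word sentence this true maximum is 12, far above anything real text would show, which is why a looser "expectable" figure is more useful than the strict ceiling.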









This is a fan site. The creator of toki pona is jan Sonja, which isn't me. All content created by me is Creative Commons, by Attribution. Feel free to make derivatives to the extent you can or want to. The toki pona corpus texts come from a variety of locations, and I believe their use here is acceptable, noncommercial fair use. If you don't think so, email me at tokipona@suburbandestiny.com, tell me which document is yours, and I will remove it.
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.