toki pona shorthand

Signs and symbols: Writing systems (hieroglyphs, nail writing) and Signed Toki Pona; unofficial scripts too
janMato
Posts: 1545
Joined: Wed Dec 02, 2009 12:21 pm
Location: Takoma Park, MD

Re: toki pona shorthand

Post by janMato »

As a matter of fact yes. And if I remember correctly, this is based on corpus text, not on just the word list. (I actually can't remember exactly how I came up with this list, for all I know it is the same as Henry's list. It's the one I use for cuss word generation.)

This chart says 3% of the toki pona corpus text is j, etc.

j=3
k=5
l=10.2 <-- also unusually common due to "li"
m=4.4
n=11.6 <-- lots of n, due to "ni"
p=3.7
s=4.1
t=4.6
w=2.8

a=17.2
e=7.4
i=14.8 <-- extremely common due to "li"
o=7.7
u=3.2

Most of the shape of the chart is driven by the high frequency of particles, pi, ni, li, e, etc.
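
For anyone who wants to reproduce a chart like this, here is a minimal sketch of a per-letter count over a body of text. It is not necessarily how the list above was produced (the original method is not recalled in this thread); the function name `letter_percentages` and the sample sentence are only illustrative.

```python
from collections import Counter

# The 14 letters of toki pona: 9 consonants and 5 vowels.
LETTERS = "jklmnpstwaeiou"

def letter_percentages(text):
    """Return each letter's share of all toki pona letters in text, in percent."""
    # Anything outside the 14 letters (spaces, punctuation, digits) is ignored.
    counts = Counter(ch for ch in text.lower() if ch in LETTERS)
    total = sum(counts.values())
    if total == 0:
        return {letter: 0.0 for letter in LETTERS}
    return {letter: round(100 * counts[letter] / total, 1) for letter in LETTERS}

# Illustrative input only; a real chart needs a large corpus.
sample = "kili li jo e wawa. ni li pona tawa mi."
shares = letter_percentages(sample)
```

Because this counts raw letters, coda-n is lumped together with onset n, and any English mixed into the text is counted too.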
jan Misite
Posts: 42
Joined: Tue Dec 28, 2010 6:42 pm

Re: toki pona shorthand

Post by jan Misite »

Thanks, that's really great. Do you remember if coda-n is included with n? Adding the numbers up gives me 99.7% so I would wager yes.
janMato

Re: toki pona shorthand

Post by janMato »

It doesn't add up to 100% because of rounding errors.
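
A quick check of the arithmetic, using the figures exactly as posted in the chart above. If each value was rounded to one decimal place (an assumption), each can be off by up to 0.05, so 14 values can drift by as much as 0.7 in total; a shortfall of 0.3 is well within that.

```python
# Percentages copied from the chart posted earlier in the thread.
consonants = {"j": 3, "k": 5, "l": 10.2, "m": 4.4, "n": 11.6,
              "p": 3.7, "s": 4.1, "t": 4.6, "w": 2.8}
vowels = {"a": 17.2, "e": 7.4, "i": 14.8, "o": 7.7, "u": 3.2}

# Round the sums to one decimal to hide floating-point noise.
consonant_total = round(sum(consonants.values()), 1)
vowel_total = round(sum(vowels.values()), 1)
grand_total = round(consonant_total + vowel_total, 1)

print(consonant_total, vowel_total, grand_total)  # 49.4 50.3 99.7
```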

I haven't put together the frequency chart by syllable yet. It's also on my to-do list to create a transition matrix, but that would require working out 14 x 14 probabilities and I haven't had a chance yet. The English Wikipedia article has a section on the sound distribution that I believe is based solely on the occurrence rates in the ~120 base words.

The tricky part about getting good statistics is that many tp texts are mixed English/toki pona. The words "toki pona" themselves also appear extremely frequently, so t, o, k, i, p, n, a are all overrepresented. Proper modifiers are a pretty small percentage of the total text, but they probably behave very differently from the base words, phoneme-distribution-wise.
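
The 14 x 14 transition matrix mentioned above could be sketched roughly as follows. This is only an illustration, not existing code from this thread: the name `transition_matrix` is made up, pairs are counted within words only, and the English filter is a crude assumption (it skips any token containing a character outside the 14 letters).

```python
from collections import defaultdict

LETTERS = "jklmnpstwaeiou"  # 14 letters, so the matrix is 14 x 14

def transition_matrix(text):
    """Estimate P(next letter | current letter) from within-word letter pairs."""
    counts = {a: defaultdict(int) for a in LETTERS}
    for word in text.lower().split():
        word = word.strip(".,:;!?\"'()")
        # Crude filter (an assumption): skip tokens with any other character.
        # This drops much English text, but not English words that happen to
        # be spelled with only these 14 letters.
        if not word or any(ch not in LETTERS for ch in word):
            continue
        for current, following in zip(word, word[1:]):
            counts[current][following] += 1
    matrix = {}
    for a in LETTERS:
        row_total = sum(counts[a].values())
        # Rows with no observations are left as all zeros.
        matrix[a] = {b: (counts[a][b] / row_total if row_total else 0.0)
                     for b in LETTERS}
    return matrix
```

Each row `matrix[a]` sums to 1 when the letter `a` was seen with a following letter at least once. It does nothing about the overrepresentation of the words "toki pona" or about proper modifiers; those would need separate handling.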
jan Misite

Re: toki pona shorthand

Post by jan Misite »

Well I think I am pretty much done tinkering. All that's left is to try writing out the corpus and see if all the words come together nicely, which I will try to do before next week, God willing.

The frequency list was important so that I could match the descenders and ascenders to letters in proportions that would more or less balance each other (y'know, so the writing would more or less stay on the line). Because I wasn't entirely sure what percentage of words begin with certain vowels (u- or o-, as in uta or open), I had to guess a little. I would say there are probably more descending than ascending strokes in practice, which is why I suggest lengthening the ascending strokes to balance out the overall spread of the writing.
http://img826.imageshack.us/i/tokiponashorthand.gif/

EDIT:
On the punctuation, I have included colons and periods, which I know are necessary. Are semicolons, etc. important? And is capitalization necessary? I have seen people not capitalize proper modifiers and I wondered if that were official.

I realize the GIF is horrific, I will write it by hand and scan it when I get the chance. And I forgot to include the solution to ken/kin. It is my goal to write and demonstrate all these things to people in this thread eventually. Make some little lectures on theory and application that could be compiled as lessons.
janMato

Re: toki pona shorthand

Post by janMato »

jan Misite wrote:
EDIT:
On the punctuation, I have included colons and periods, which I know are necessary. Are semicolons, etc. important? And is capitalization necessary? I have seen people not capitalize proper modifiers and I wondered if that were official.

I realize the GIF is horrific, I will write it by hand and scan it when I get the chance. And I forgot to include the solution to ken/kin. It is my goal to write and demonstrate all these things to people in this thread eventually. Make some little lectures on theory and application that could be compiled as lessons.

pona a! This is a much more enjoyable way to learn shorthand. Gregg and the like always seemed to me to demand effort on the scale of learning Chinese.

Colons are sort of important: they let you know that an upcoming sentence is somehow linked via a 'ni'. In my writing, if the ni refers backwards, I end in a period. (And sometimes I just plain forget to follow any pattern.)

kili li jo e wawa. ni li pona tawa mi. kala li kon ike. Fruit has calories. I like that fruit has calories. Fish (on the other hand) stinks.
kili li jo e wawa. ni li pona tawa mi: kala li kon ike. Fruit has calories. I like that fish stinks.

There is nothing official about the above convention; it just happened, or it is entirely my own fabrication, as some might say.

Capitalization really matters mostly when a proper name also happens to be a toki pona word. The rules are at best fuzzy-- many options, not a lot of canon.

meli Utasuli li jo e tomo lon tomo tawa pi mun lili. Miss Trench has a room on the rocket ship.
This could be written meli Uta Suli or meli Utasuli, but meli uta suli would be misleading, implying a girl described as being large and a hole.

If proper modifiers are left uncapitalized in the roman script, it is a mistake. If jan Kipo capitalizes a toki pona base word, then he's writing on his iPad.
jan Misite

Re: toki pona shorthand

Post by jan Misite »

Thanks! Yeah learning Gregg was pretty difficult. The idea behind this project was to include all of the good and none of the bad so that it would be fun to use.

OK, that means I need to include capitalization. Gregg shorthand has diacritics for that, and that might be OK, but there are other ways to show it. I was thinking of marking either the first or last letter of a word somehow, or maybe modifying the space diacritic, or disconnecting and crossing certain letters but I will have to think about it. I don't really like using a larger size letter to indicate capitalization but that would be intuitive for other people. If it were crossing letters I would have to find a way to represent a single syllable ending in u and I don't think I can without distorting the rules I have laid down. I guess I could cross the preceding word with the 2nd (i.e. cross 'jan' with 'mato' to get jan Mato).

The meli uta suli example is funny. I would've thought she would be called jan uta suli but that's still pretty hard to see as a name. I take it these distinctions are not really something you can hear in speech though.
janKipo
Posts: 3064
Joined: Fri Oct 09, 2009 2:20 pm

Re: toki pona shorthand

Post by janKipo »

Time will tell. We have so little spoken tp we don't know how anything will work out yet.
jan Misite

Re: toki pona shorthand

Post by jan Misite »

Corpus

I think this is pretty much what I want it to be. I am a little disappointed in how the words come together with spaces written out. That's not really something I feel I can improve upon because I don't use toki pona as much as the rest of you. I don't think fully cursive sentences would be desirable anyways because it would probably take too much time to think it through, remember the sentence as a whole, and the challenge would just slow down your writing. That was mostly a calligraphic thing I was trying out. Written spaces should still be useful for getting down stock phrases and compounds and attaching particles such as mi_wile_e and jan_lili, and those seem OK.

Several words have multiple forms next to them, and several can have additional forms. This is something you can see if you try writing them out. You will notice that sometimes two consonants will come together smoothly and some will come together in a loop-de-loop. Smooth connections have been avoided when there is no intervening a or e vowel in favor of a point because I wanted to make certain that consonants looked distinct.

I am going to run with my previous idea for indicating capitalization -- write the to-be-capitalized letter through the preceding consonant of the preceding word. This will always work because proper modifiers must always follow another word.

I look forward to seeing your handwriting!
jan Misite

Re: toki pona shorthand

Post by jan Misite »

@jan Mato, do you want me to stick with what I have already done as a baseline? :S I don't know if you've started on the shorthand transliterator or would mind if I tinkered some more.

I am considering shifting most of the lines clockwise 15 degrees (a 1 o'clock-to-9 o'clock stroke and a 1 o'clock-to-7 o'clock stroke); this is because I have found the straight vertical stroke to be difficult to write with what I already know how to, and I have better control in the 3rd quadrant than I anticipated, and I have seen the angle of some of the strokes in third quadrant vary quite a lot in some Gregg writers' handwriting so that I know it would probably work to have several strokes in this quadrant. So it seems like a steep and a somewhat less steep stroke are more manageable than the ideal straight up-and-down vertical stroke in my opinion. In addition the 9 o'clock-to-3 o'clock stroke and the 10 o'clock-to-4 o'clock stroke will be shifted to become the 10 o'clock-to-4 o'clock stroke and the 11 o'clock-to-5 o'clock stroke respectively. The straight ascender will more or less stay the same, by which I mean it will not become the 9 o'clock-to-3 o'clock stroke, it will still ascend; its angle can be underspecified because there is only one straight stroke in the 1st quadrant. The curves in the 1st and 3rd quadrants will stay the same but there will be 2 sets of curves in the 4th quadrant.

I also looked at how it affects the blends, and I think it makes them more distinguishable and thus more acceptable, which can only help when writing. This is the primary advantage of this proposal.

I have also considered changing the entire orientation of the writing to be primarily up-to-down, and secondarily left-to-right. This would make it more akin to the ancient Chinese grass script shorthand than the modern shorthands, which are disproportionately influenced by European writing conventions. It also seems reasonable to do something different than the Gregg script, which I have noted has more descenders than ascenders; why not mix it up and make the writing flow in the same direction as the strokes? In addition, when I used the word connectors the writing was still quite strongly downward-shifting.

The secondary designation is based on the ascenders being written in a rightward motion, so that if the writing should drift, it will drift in a rightward direction. (In other words, I am saying that it's OK to let connected writing drift into another column; Teeline shorthand allows this, for instance. The overall proportions of left- and right-flowing symbols will be matched to letters in the proportion that lets the writing pretty closely follow the line of writing.) However, if the ascenders were written towards the left, the skew would be to the left, and the Asian order of primarily up-to-down, and secondarily right-to-left, would be favored.

In addition, I could use a straight vertical in place of the word-space; this would be intuitive because it would be the same as jumping to a new word, but dragging the pencil on the paper instead of lifting it. Overall, I think this will have better results because the symbol inventory for mapping letters is skewed towards descenders; it still has to be borne out with more practice writing. It could work a lot better than the current layout. I may or may not reassign the symbols to other letters.
jan Misite

Re: toki pona shorthand

Post by jan Misite »

I have grown aggravated with making the separate connector work; I have decided that it is OK to merge the function of such a space-ligature with the 'ju' syllable symbol. It is underused anyway. The catch is that you cannot connect a word that begins or ends with the syllable to the preceding or following word. I don't know what to do when jo/'ju' is in the middle of a word. :lol: I'm taking suggestions.

I probably shouldn't worry about it; it's not like tp shorthand could deal with much expansion of the language, i.e. more contrastive pairs a la ken/kin.

The practical consequence of this is that jo is never connected to anything. Nothing can connect to the rear of ijo either.
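
The connection rule above can be written down as a small check, which might be handy for a transliterator. This is only a sketch of one reading of the rule: the function name `can_join` is made up, and treating both spellings "jo" and "ju" as the reserved syllable is an assumption based on how the post refers to it.

```python
# Syllable whose symbol doubles as the word connector.
# Assumption: both spellings used in the post are treated as reserved.
RESERVED_SYLLABLES = ("jo", "ju")

def can_join(left, right):
    """Return True if `left` may be written connected to the following word `right`."""
    left, right = left.lower(), right.lower()
    # A word ending in the reserved syllable cannot connect to what follows,
    # and a word beginning with it cannot connect to what precedes.
    return (not left.endswith(RESERVED_SYLLABLES)
            and not right.startswith(RESERVED_SYLLABLES))

print(can_join("mi", "wile"))  # True
print(can_join("li", "jo"))    # False: nothing joins onto jo
print(can_join("ijo", "li"))   # False: nothing joins the rear of ijo
print(can_join("li", "ijo"))   # True: the front of ijo is unaffected
```

The open question of the syllable appearing in the middle of a word is not handled here.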