Zipf
Not clear how it can apply to biology, since it is about frequency of use and length. But I suppose there are analogous results everywhere. The interesting question is always about the extent to which the law is prescriptive rather than just descriptive.
A ranged possibility: two-letter codes
Re: A ranged possibility: two-letter codes
thank you and yes, it is a step toward a monosyllabic dialect of tp, so much disliked by jKipo
raspakant wrote: Hi; the two-letter code sounds like a great idea to me; should someone want to do this, and although it is not very much in line with general TP philosophy, it would allow creating long words using the 123 stems; perhaps these cluster words might be made of a series of two-letter codes separated by slashes, or whatever;
maybe spaces could work?
1. you may add
raspakant wrote: if I may add, single capital letters might also be used, sometimes, to make even shorter clusters or shorter written phrases;
2. IMO capitals would be better used for proper names. Consider the suggestion of jJosan (earlier in this topic) to use non-tp letters: b for pi, f for li, etc.
3. this will not make the text much shorter, but it could be useful for some other reasons
Re: A ranged possibility: two-letter codes
maybe spaces could work?
Perhaps spaces are better;
I think creating a topic of experiments might be useful, more like a window of texts in (two-letter) tp, with explanation, beside the discussion topic, so all can see the result of the experiments;
As an example: kp/kn/lo/pi/ma/tm/tw/kn or (better looking) : kp kn lo pi ma tm tw kn
to mean kulupu kon loje pi ma tomo tawa kon
"fire brigade of the airport" (supposing kon loje = "fire" and ma tomo tawa kon = "airport")
It could be shortened even further using the one-letter option for some stems
Toki
Pa
Re: A ranged possibility: two-letter codes
Missing 'pi's. 'kulupu pi kon loje' 'ma pi tomo tawa kon'
I've forgotten the point of all this except to save on telegrams (can you still send telegrams?)
Re: A ranged possibility: two-letter codes
and on toki lili, and for naming files on a computer. What we need is for some of the tech-savvy tp users to write a program to compress and decompress toki pona in this fashion, and save me from pulling out my decoder ring.
janKipo wrote: Missing 'pi's. 'kulupu pi kon loje' 'ma pi tomo tawa kon' I've forgotten the point of all this except to save on telegrams (can you still send telegrams?)
Re: A ranged possibility: two-letter codes
po. ts sa*kekk~tmtoni twpasa
raspakant wrote: I think creating a topic of experiments might be useful, more like a window of texts in (two-letter) tp, with explanation, beside the discussion topic, so all can see the result of the experiments;
po.tssafkekkytmtonitwpasa
it could be shorter, but not much. but you can try
raspakant wrote: It could be even shortened using the one letter option for some stems
Re: A ranged possibility: two-letter codes
as well as for different applications related to mobile phones (sms, mobile chat)
jan Josan wrote: and on toki lili, and for naming files on a computer. What we need is some of the tech savvy tp users to write a program to compress and decompress toki pona in this fashion, and save me from pulling out my decoder ring.
Re: A ranged possibility: two-letter codes
Ask and you'll receive. Oh, you mean executable. Well, I'll get my toki pona site up again soon.
jan Josan wrote: What we need is some of the tech savvy tp users to write a program to compress and decompress toki pona in this fashion, and save me from pulling out my decoder ring.
This algorithm is extremely sensitive to errors. For example:
jan mute pi ma ante li ike lukin taso sina pona lukin ==> jnmt@maat*iklutssapolu
(one letter replaced) jnmt@maat*iklatssapolu ==> jan mute pi ma ante li ike lape taso sina pona lukin
(transpose on word break) jmnt@maat*iklutssapolu ==> jm nt pi ma ante li ike lukin taso sina pona lukin
(dropping first letter of each phrase) nmt@aat*klutssapolu==> nimi t| pali ante li kala uta ss anpa olin
Letter drops are the worst because they leave an unbalanced phrase and shift everything in the whole sentence. I already suffer from a toki pona speech impediment (confusing minimal pairs); this would give me a writing impediment, too. That said, I really like the idea of a typeable shorthand for toki pona. Leaving spaces between the roman letters would fix the existing system's sensitivity to letter dropping, because a letter drop's effect would be over by the next space. As it stands, a letter drop throws things off until the next punctuation mark.
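Code:
// The methods below are assumed to sit inside a class in a file with these usings.
using System;
using System.Collections.Generic;
using System.Text;

// Compress toki pona text into two-letter codes. Returns HTML: the spaced
// codes, a line break, then the same codes with all spaces removed.
public string Shorten(string input)
{
    Dictionary<string, string> dictionary = GetDictionary();
    StringBuilder sb = new StringBuilder();
    // Drop empty entries so punctuation followed by a space adds no blank word.
    string[] plaintext = input.Split(new char[] { ' ', ',', ':', '.' }, StringSplitOptions.RemoveEmptyEntries);
    // Visit every word, including the last one.
    for (int i = 0; i < plaintext.Length; i++)
    {
        if (dictionary.ContainsKey(plaintext[i]))
            sb.Append(dictionary[plaintext[i]]);
        else
            sb.Append("<b> " + plaintext[i] + " </b>"); // unknown word: pass through in bold
        sb.Append(" ");
    }
    return sb.ToString() + "<br/>" + sb.ToString().Replace(" ", "");
}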
// Expand a space-free string of two-letter codes back into toki pona words.
public string Expand(string input)
{
    Dictionary<string, string> dictionary = ReverseDictionary(GetDictionary());
    StringBuilder sb = new StringBuilder();
    // Widen each one-character particle code to a two-character placeholder
    // so that every token in the string is exactly two characters long.
    input = input.Replace("^", "|l"); // la
    input = input.Replace("*", "|i"); // li
    input = input.Replace("!", "|o"); // o
    input = input.Replace("~", "|e"); // e
    input = input.Replace("@", "|p"); // pi
    // Read the string two characters at a time; a trailing unpaired
    // character is ignored.
    for (int i = 0; i < input.Length - 1; i = i + 2)
    {
        string pair = input.Substring(i, 2);
        if (dictionary.ContainsKey(pair))
        {
            sb.Append(dictionary[pair]);
        }
        else
        {
            sb.Append(pair); // unknown pair: pass through unchanged
        }
        sb.Append(" ");
    }
    // Turn the particle placeholders back into words.
    return sb.ToString().Replace("|l", "la").Replace("|i", "li").Replace("|e", "e").Replace("|p", "pi").Replace("|o", "o");
}
// Swap keys and values: word -> code becomes code -> word.
private static Dictionary<string, string> ReverseDictionary(Dictionary<string, string> toReverse)
{
    Dictionary<string, string> dictionary = new Dictionary<string, string>();
    foreach (KeyValuePair<string, string> pair in toReverse)
    {
        dictionary.Add(pair.Value, pair.Key);
    }
    return dictionary;
}
// Word -> code table. Content words get two letters; particles get one symbol.
private static Dictionary<string, string> GetDictionary()
{
    Dictionary<string, string> dictionary = new Dictionary<string, string>();
    //top first (col, row)
    dictionary.Add("ala", "aa");
    dictionary.Add("kalama", "ka");
    dictionary.Add("lape", "la");
    dictionary.Add("ma", "ma");
    dictionary.Add("poka", "oa");
    dictionary.Add("pali", "pa");
    dictionary.Add("sina", "sa");
    dictionary.Add("tan", "ta");
    dictionary.Add("ken", "ke");
    dictionary.Add("len", "le");
    dictionary.Add("nena", "ne");
    dictionary.Add("selo", "se");
    dictionary.Add("kin", "ki");
    dictionary.Add("lipu", "li");
    dictionary.Add("mi", "mi");
    dictionary.Add("ni", "ni");
    dictionary.Add("poki", "oi");
    dictionary.Add("pini", "pi");
    dictionary.Add("sin", "si");
    dictionary.Add("wile", "wi");
    dictionary.Add("ijo", "ij");
    dictionary.Add("kili", "kj");
    dictionary.Add("linja", "lj");
    dictionary.Add("mije", "mj");
    dictionary.Add("pimeja", "pj");
    dictionary.Add("sijelo", "sj");
    dictionary.Add("akesi", "ak");
    dictionary.Add("ike", "ik");
    dictionary.Add("jaki", "jk");
    dictionary.Add("kepeken", "kk");
    dictionary.Add("luka", "lk");
    dictionary.Add("moku", "mk");
    dictionary.Add("namako", "nk");
    dictionary.Add("pakala", "pk");
    dictionary.Add("sike", "sk");
    dictionary.Add("toki", "tk");
    dictionary.Add("weka", "wk");
    dictionary.Add("ale", "al");
    dictionary.Add("jelo", "jl");
    dictionary.Add("kala", "kl");
    dictionary.Add("lili", "ll");
    dictionary.Add("meli", "ml");
    dictionary.Add("olin", "ol");
    dictionary.Add("pilin", "pl");
    dictionary.Add("seli", "sl");
    dictionary.Add("telo", "tl");
    dictionary.Add("utala", "ul");
    dictionary.Add("walo", "wl");
    dictionary.Add("seme", "em");
    dictionary.Add("kama", "km");
    dictionary.Add("mama", "mm");
    dictionary.Add("nimi", "nm");
    dictionary.Add("pan", "pm");
    dictionary.Add("sama", "sm");
    dictionary.Add("tomo", "tm");
    dictionary.Add("en", "en");
    dictionary.Add("jan", "jn");
    dictionary.Add("kon", "kn");
    dictionary.Add("lon", "ln");
    dictionary.Add("mani", "mn");
    dictionary.Add("nasin", "nn");
    dictionary.Add("ona", "on");
    dictionary.Add("pana", "pn");
    dictionary.Add("suno", "sn");
    dictionary.Add("tenpo", "tn");
    dictionary.Add("unpa", "un");
    dictionary.Add("wan", "wn");
    dictionary.Add("ilo", "io");
    dictionary.Add("jo", "jo");
    dictionary.Add("ko", "ko");
    dictionary.Add("loje", "lo");
    dictionary.Add("moli", "mo");
    dictionary.Add("noka", "no");
    dictionary.Add("oko", "oo");
    dictionary.Add("pona", "po");
    dictionary.Add("sona", "so");
    dictionary.Add("anpa", "ap");
    dictionary.Add("kipisi", "ip");
    dictionary.Add("kulupu", "kp");
    dictionary.Add("lupa", "lp");
    dictionary.Add("nanpa", "np");
    dictionary.Add("open", "op");
    dictionary.Add("pipi", "pp");
    dictionary.Add("sinpin", "sp");
    dictionary.Add("supa", "up");
    dictionary.Add("alasa", "as");
    dictionary.Add("esun", "es");
    dictionary.Add("insa", "is");
    dictionary.Add("kasi", "ks");
    dictionary.Add("laso", "ls");
    dictionary.Add("monsi", "ms");
    dictionary.Add("nasa", "ns");
    dictionary.Add("palisa", "ps");
    dictionary.Add("taso", "ts");
    dictionary.Add("musi", "us");
    dictionary.Add("waso", "ws");
    dictionary.Add("ante", "at");
    dictionary.Add("kute", "kt");
    dictionary.Add("lete", "lt");
    dictionary.Add("mute", "mt");
    dictionary.Add("sitelen", "st");
    dictionary.Add("uta", "ut");
    dictionary.Add("anu", "au");
    dictionary.Add("kule", "ku");
    dictionary.Add("lukin", "lu");
    dictionary.Add("mun", "mu");
    dictionary.Add("pu", "pu");
    dictionary.Add("suli", "su");
    dictionary.Add("tu", "tu");
    dictionary.Add("awen", "aw");
    dictionary.Add("kiwen", "kw");
    dictionary.Add("lawa", "lw");
    dictionary.Add("soweli", "ow");
    dictionary.Add("sewi", "sw");
    dictionary.Add("tawa", "tw");
    dictionary.Add("suwi", "uw");
    dictionary.Add("wawa", "ww");
    // Particles: one-character codes.
    dictionary.Add("la", "^");
    dictionary.Add("li", "*");
    dictionary.Add("o", "!");
    dictionary.Add("e", "~");
    dictionary.Add("pi", "@");
    return dictionary;
}
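The space-separated variant suggested above could be sketched roughly as follows. This is only an illustration, not part of the posted program: the class name SpacedShorthand, the method ExpandSpaced and the small Words table are made up for the example, and the table is just a handful of entries copied from the full one. Because each token is looked up on its own, a dropped letter damages one token and nothing after it.

```csharp
using System;
using System.Collections.Generic;

public static class SpacedShorthand
{
    // Hypothetical subset of the code table (code -> word), enough for the
    // example sentence. A full version would reverse the whole table instead.
    private static readonly Dictionary<string, string> Words = new Dictionary<string, string>
    {
        { "jn", "jan" }, { "mt", "mute" }, { "ma", "ma" }, { "at", "ante" },
        { "ik", "ike" }, { "lu", "lukin" }, { "ts", "taso" }, { "sa", "sina" },
        { "po", "pona" }, { "@", "pi" }, { "*", "li" }
    };

    // Expand space-separated codes. Unknown tokens are kept in square
    // brackets so the damage stays visible and local.
    public static string ExpandSpaced(string input)
    {
        string[] tokens = input.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        List<string> words = new List<string>();
        foreach (string token in tokens)
        {
            string word;
            words.Add(Words.TryGetValue(token, out word) ? word : "[" + token + "]");
        }
        return string.Join(" ", words);
    }
}
```

With "jn mt @ ma at * ik lu ts sa po lu" this gives back the original sentence. With the first letter dropped ("n mt @ ma at * ik lu ts sa po lu") only the first word is lost: "[n] mute pi ma ante li ike lukin taso sina pona lukin".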
Re: A ranged possibility: two-letter codes
Interesting. I think I would learn to read it much faster if it had spaces, but it seems that is true for the computer as well. I know I've read through plenty of text in toki pona where a letter or two is transposed, but I knew from the context what was meant. Seems there is always a balance between efficiency and ease of comprehension. What language are you writing your programs in?
Re: A ranged possibility: two-letter codes
Yes, a human can easily tell he's getting a low hit rate. My algorithm would only sometimes realize it's getting a low hit rate, because when a letter drops and all the two-letter codes shift across a pair boundary, you sometimes still get words, just not ones that make sense.
jan Josan wrote: Interesting. I think I would learn to read it much faster if it had spaces, but it seems that is true for the computer as well. I know I've read through plenty of text in toki pona where a letter or two is transposed, but I knew from the context what was meant. Seems there is always a balance between efficiency and ease of comprehension. What language are you writing your programs in?
C#, which ports easily to Java.
Which reminds me, what's the license on your font? I was going to do a web page for turning tp Latin text into your glyphs, but only linearly. The dynamic rerouting, resizing and overlapping is beyond my programming talent, except maybe for the boxes (like 0).
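As for the hit rate mentioned at the top of this post, a rough sketch of how a program might measure it is below. It is an illustration under assumptions, not the posted code: the class HitRate, the method Measure and the KnownPairs set are invented names, the set holds only a few codes from the table, and the input is assumed to have its particle symbols already widened to two characters ("|p", "|i") the way Expand does.

```csharp
using System;
using System.Collections.Generic;

public static class HitRate
{
    // Hypothetical subset of valid two-character tokens, taken from the
    // code table plus two widened particle placeholders.
    private static readonly HashSet<string> KnownPairs = new HashSet<string>
    {
        "jn", "mt", "ma", "at", "ik", "lu", "ts", "sa", "po",
        "nm", "pm", "aa", "kl", "ut", "ap", "ol", "|p", "|i"
    };

    // Fraction of two-character tokens that are known codes. A trailing
    // unpaired character is ignored, as in Expand.
    public static double Measure(string widened)
    {
        int total = 0;
        int hits = 0;
        for (int i = 0; i + 1 < widened.Length; i += 2)
        {
            total++;
            if (KnownPairs.Contains(widened.Substring(i, 2)))
            {
                hits++;
            }
        }
        return total == 0 ? 0.0 : (double)hits / total;
    }
}
```

On the intact string "jnmt|pmaat|iiklutssapolu" every pair is a known code. Drop the leading "j" and 7 of the 11 shifted pairs still happen to be real codes (nm, pm, aa, kl, ut, ap, ol), so the rate falls but stays far from zero, which is why a fixed threshold would only sometimes catch the error.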