Toki pona corpus project needs your help!

janKipo
Posts: 2861
Joined: Fri Oct 09, 2009 2:20 pm

Re: Toki pona corpus project needs your help!

Postby janKipo » Mon Apr 16, 2018 7:38 pm

This looks like it will eventually be a valuable aid. Right now it has a lot of problems. A quick glance shows several bad definitions right away, enough to raise worries about the whole. For starters, the correct translation of “a” is nothing at all; ‘wan’ would only be used to contrast with some other number or with ‘ante’ (and even then, the bare form is preferred). Listing ‘anpa’ as a preposition also merely furthers a common problem.
Secondly, starting with an independent list encourages “howdjasays”: complexes built up in isolation from actual situations and thus unlikely ever to be useful (“axolotl” springs to mind, even though I know it was used once). But if you can get a team of serious researchers who will work through the corpus, there is a good foundation here.
Oh yes, the fact that you have a tp-En dictionary in the same format and on the same page, more or less, is a definite plus.

Thanks!

linguafrakka
Posts: 2
Joined: Mon Apr 16, 2018 1:53 pm

Re: Toki pona corpus project needs your help!

Postby linguafrakka » Wed Apr 25, 2018 9:37 am

Thanks for taking a look at the project. It's definitely a work in progress. It can be really challenging to combine info from somewhat contradictory yet equally credible sources into a single format. Unlike sites such as Google Translate, Glosbe is not the best equipped when it comes to translating full sentences with proper grammar; it makes up for this huge shortcoming by focusing on the sheer quantity of languages and words it can translate. Therefore, my goal has been to use the site for its main purpose and focus on cataloging Toki Pona vocabulary there. With very few exceptions, I wouldn't add a multi-word Toki Pona phrase to Glosbe unless it corresponds to a single English word (a bit Anglocentric, I know). It's less about helping people speak the language properly and understand the principles behind it, and more about helping people who already know the basic mechanics of the language find a good Toki Pona translation for that one English word that's giving them trouble.

If you wish to address the lack of instruction on how each Toki Pona word behaves in context, I would suggest contributing to the site's "Translation Memory". It's pretty simple to do. You just type a sentence in English, then type your Toki Pona translation (as directly or loosely as you see fit). The site will be able to extract individual words and phrases from the sentences in both languages and display your sentence whenever people look up one of the individual words I've cataloged. It seems to recognize inflected forms of English words, so if you translated a sentence like "They are loving it." then people who search for the Toki Pona translation of "love" will see "olin", several English definitions of "love" as a noun or verb which I've added that generally correspond with its Toki Pona usage, as well as your translated sentence with the words "loving" and "olin" highlighted for clarity.
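
To make the lookup behavior described above concrete, here is a rough Python sketch. It is only an illustration of the general idea, not how Glosbe actually works: the `MEMORY` list, the `crude_stem` and `lookup` helpers, and the suffix-stripping rule are all made up for this example, and the sample translation is just the one from my example sentence.

```python
import re

# Hypothetical translation memory: (English sentence, Toki Pona sentence) pairs.
# These entries are illustrative only, not taken from the real site.
MEMORY = [
    ("They are loving it.", "ona li olin e ni."),
    ("The cat sleeps.", "soweli li lape."),
]


def crude_stem(word):
    """Very naive English stemmer (an assumption, not the site's method)."""
    word = word.lower()
    for suffix in ("ing", "ed", "es", "s"):
        # Strip one common ending, but keep at least three letters.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    # Drop a final "e" so that "love" and "loving" reduce to the same stem.
    if word.endswith("e") and len(word) > 3:
        word = word[:-1]
    return word


def highlight(sentence, is_hit):
    """Wrap every word for which is_hit(word) is true in asterisks."""
    return re.sub(
        r"[A-Za-z]+",
        lambda m: f"*{m.group(0)}*" if is_hit(m.group(0)) else m.group(0),
        sentence,
    )


def lookup(english_word, tp_word, memory=MEMORY):
    """Return memory sentences containing a form of english_word, highlighted."""
    target = crude_stem(english_word)
    results = []
    for en, tp in memory:
        if any(crude_stem(w) == target for w in re.findall(r"[A-Za-z]+", en)):
            results.append(
                (
                    highlight(en, lambda w: crude_stem(w) == target),
                    highlight(tp, lambda w: w.lower() == tp_word),
                )
            )
    return results


print(lookup("love", "olin"))
```

Searching for "love" finds the sentence containing "loving" because both reduce to the same crude stem. Note that the matching is purely word-based: it only highlights a Toki Pona word if that exact word appears in the translation.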

janKipo
Posts: 2861
Joined: Fri Oct 09, 2009 2:20 pm

Re: Toki pona corpus project needs your help!

Postby janKipo » Wed Apr 25, 2018 1:43 pm

Sorry you have such low expectations for a tp Glosbe site. The samples from other languages make it look ideal for catching nuance and context, which play a central role in tp. Admittedly, taking advantage of this possibility calls for more work than just typing in a couple of sentences, but the result is so much more rewarding.
I was somewhat appalled by your sample case of “love”. “They are loving it” would apparently be matched with ‘ona (mute) li olin e ni (or ona)’. But ‘olin’ is only for relations between persons, and persons are beings that can express and receive affection, so rarely ‘it’s. What would be correct is ‘ona li pona (mute) tawa ona (mute)’. I wonder what the algorithm will extract for “love” from that.
So the project needs supervision at every step and clearly trained supervision at that. I hope you will be able to assemble a cadre of people to work on this.

