Toki pona corpus project needs your help!

janKipo
Posts: 2931
Joined: Fri Oct 09, 2009 2:20 pm

Re: Toki pona corpus project needs your help!

Postby janKipo » Mon Apr 16, 2018 7:38 pm

This looks like a valuable aid eventually. Right now it has a lot of problems. A quick glance shows several bad definitions right away, enough to worry about the whole. For basic starters, the correct version of “a” is nothing; ‘wan’ would only be used to contrast with some other number or with ‘ante’ (and even then, the bare form is preferred). Listing ‘anpa’ as a preposition is also merely furthering a common problem. Secondly, starting with an independent list encourages howdjasays, complexes built up in isolation from actual situations and thus unlikely to ever be useful (“axolotl” springs to mind, even though I know it was used once). But, if you can get a team of serious researchers who will work through the corpus, there is a good foundation here.
Oh yes, the fact that you have a tp-En dictionary in the same format and on the same page, more or less, is a definite plus.

Thanks!

linguafrakka
Posts: 3
Joined: Mon Apr 16, 2018 1:53 pm

Re: Toki pona corpus project needs your help!

Postby linguafrakka » Wed Apr 25, 2018 9:37 am

Thanks for taking a look at the project. It's definitely a work in progress. It can be really challenging trying to combine info from somewhat contradictory yet equally credible sources into a single format. Unlike sites such as Google Translate, Glosbe is not well equipped for translating full sentences with proper grammar; it makes up for this huge shortcoming by focusing more on the sheer quantity of languages and words it can translate. Therefore, my goal has been to use the site for its main intent and focus more on cataloging Toki Pona vocabulary on the site. With very few exceptions, I wouldn't add a multi-word Toki Pona phrase to Glosbe unless it corresponds to a single English word (a bit Anglocentric, I know). It's less about helping people speak the language properly and understand the principles behind it, and more about helping people who already know the basic mechanics of the language to find a good Toki Pona translation for that one English word that's giving them trouble.

If you wish to address the lack of instruction on how each Toki Pona word behaves in context, I would suggest contributing to the site's "Translation Memory". It's pretty simple to do. You just type a sentence in English, then type your Toki Pona translation (as directly or loosely as you see fit). The site will be able to extract individual words and phrases from the sentences in both languages and display your sentence whenever people look up one of the individual words I've cataloged. It seems to recognize English grammatical conjugations, so if you translated a sentence like "They are loving it." then people who search for the Toki Pona translation of "love" will see "olin", several English definitions of "love" as a noun or verb which I've added that generally correspond with its Toki Pona usage, as well as your translated sentence with the words "loving" and "olin" highlighted for clarity.
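To make the lookup idea concrete, here is a rough sketch in Python. It is not how Glosbe actually works internally (I don't know that); the sentence pairs, the `stem` helper, and the `lookup` function are all made up for illustration, and the suffix stripping is a crude stand-in for real handling of English conjugations. It only marks the matching word on the English side and does not attempt to align it with a Toki Pona word.

```python
import re

# Hypothetical sample data: (English sentence, Toki Pona sentence) pairs.
MEMORY = [
    ("They are loving it.", "ona li pona mute tawa ona."),
    ("I love my mother.", "mi olin e mama mi."),
]

# Toy list of English endings; an assumption, not a real morphology model.
SUFFIXES = ("ing", "ed", "es", "s")


def stem(word):
    """Reduce an English word to a crude stem so 'love' and 'loving' match."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Only strip a suffix if at least three letters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    # Drop a trailing 'e' so 'love' and 'lov(ing)' share the stem 'lov'.
    if word.endswith("e") and len(word) > 3:
        word = word[:-1]
    return word


def lookup(query, memory=MEMORY):
    """Return the sentence pairs whose English side contains a form of query."""
    target = stem(query)
    hits = []
    for english, toki_pona in memory:
        words = re.findall(r"[A-Za-z']+", english)
        if any(stem(w) == target for w in words):
            # Surround each matching English word with asterisks.
            marked = re.sub(
                r"[A-Za-z']+",
                lambda m: f"*{m.group(0)}*" if stem(m.group(0)) == target else m.group(0),
                english,
            )
            hits.append((marked, toki_pona))
    return hits


for english, toki_pona in lookup("love"):
    print(english, "|", toki_pona)
```

Searching for "love" this way finds both sample sentences, including the one that only contains "loving".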

janKipo
Posts: 2931
Joined: Fri Oct 09, 2009 2:20 pm

Re: Toki pona corpus project needs your help!

Postby janKipo » Wed Apr 25, 2018 1:43 pm

Sorry you have such low expectations for a tp Glosbe site. The samples from other languages look like it is ideal for catching nuance and context, which play a central role in tp. Admittedly, taking advantage of this possibility calls for more work than just typing in a couple of sentences, but the result is so much more rewarding.
I was somewhat appalled by your sample case of “love”. “They are loving it” would apparently be matched with ‘ona (mute) li olin e ni (or ona)’. But ‘olin’ is only for relations between persons, and persons are beings that can express and receive affection, so rarely ‘it’s. What would be correct is ‘ona li pona (mute) tawa ona (mute)’. I wonder what the algorithm will extract for “love” from that.
So the project needs supervision at every step and clearly trained supervision at that. I hope you will be able to assemble a cadre of people to work on this.

linguafrakka
Posts: 3
Joined: Mon Apr 16, 2018 1:53 pm

Re: Toki pona corpus project needs your help!

Postby linguafrakka » Mon Apr 30, 2018 1:32 am

Sorry my example-sentence with olin was ike mute. This is why I don't contribute my own sentences to the translation memory; I'm just not good enough yet. I always seek out valid sources and simply copy the sentences over.

Regarding the definitions of the Toki Pona words, it's important to note that the definitions shown below the Toki Pona output-translations aren't meant to apply to the Toki Pona word/phrase, but to the English word that it was referencing (that's just how the site wants it done; they demand the English definition of the English word and/or the definition of the Toki Pona word in Toki Pona). So if you look up "dog" and see the output "soweli", you will be shown the English definition describing "dog" and not "land-mammal" below soweli in that particular case. And yes, other phrases will also be presented such as "soweli tomo" when looking up "dog", and the latter is likely to have an English definition of dog more geared toward its domestic qualities. To understand the variety of the word soweli's meaning beyond "dog" at this point, you're best off clicking the output-word and being shown all of the back-translations that have been entered for the word. It's this "back-translation" system that I feel really makes Glosbe handy when testing the ambiguity of your Toki Pona utterances.
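For anyone who prefers to see the back-translation idea as data, here is a small Python sketch. The entries and the `back_translations` helper are invented for illustration and say nothing about how Glosbe stores things; the point is only that inverting an English-to-Toki-Pona word list shows every English word that has been mapped onto one Toki Pona output.

```python
from collections import defaultdict

# Hypothetical English -> Toki Pona entries, in the spirit of the "dog" example.
FORWARD = {
    "dog": ["soweli", "soweli tomo"],
    "animal": ["soweli"],
    "pet": ["soweli tomo"],
}


def back_translations(forward):
    """Invert the word list: Toki Pona output -> sorted English inputs."""
    inverted = defaultdict(set)
    for english, outputs in forward.items():
        for toki_pona in outputs:
            inverted[toki_pona].add(english)
    return {tp: sorted(words) for tp, words in inverted.items()}


BACK = back_translations(FORWARD)
print(BACK["soweli"])       # every English word entered for 'soweli'
print(BACK["soweli tomo"])  # every English word entered for 'soweli tomo'
```

The longer the list that comes back for a word, the more ambiguous that word is likely to be out of context.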

The system works very well for nouns, verbs, and adjectives. It gets a bit weird for preposition-like words, but that's where we just need to build a rich library of translated sentences to allow learners to draw their own intuitive parallels between the functions of English prepositions and Toki Pona particles and quasi-prepositions.

