Small Language Design- Choosing Meanings

Tinkerers Anonymous: Some people can't help making changes to "fix" Toki Pona. This is a playground for their ideas.
janMato
Posts: 1545
Joined: Wed Dec 02, 2009 12:21 pm
Location: Takoma Park, MD
Contact:

Small Language Design- Choosing Meanings

Post by janMato »

This isn't a toki pona improvement question, just a methodology question for picking words for a small conlang. I'm not interested in the method of leaning back in a chair, staring thoughtfully at the ceiling and saying "....it should have a word for 'glue'...", although I suppose that is somewhat unavoidable.

I've got a conlang of 500 or so words that I've been working on, and I'm kind of stuck on assigning the words suitable meanings. I'm hoping to fix most of the defects before I attempt to memorize the words. Here is a link to the word list draft in case you're interested: http://www.suburbandestiny.com/conlang/?p=128

These are the "meaning lists" I've thought up so far for cross checking my existing list to see if I've left anything out that I think I'll regret.

The 20 odd semantic primes <-- which appear to be peculiarly shaped bricks, imho

Swadesh and glottochronology lists. These are words that are expected to be stable, not necessarily the words that are necessary for a small language. Still, this "worked" for Láadan.

Existing conlang word lists, or small dictionaries of natlangs. In my opinion almost all conlangs are in a sense small, and often closed-vocabulary, because the designers are short on time and sometimes don't really provide good guidance on how to expand the vocab. So the vocabulary of Klingon represents a sort of semantic prime set, albeit a good chunk of it is fantastically specific. Basic English is another of this sort, except that Basic English was intended to be a list of "most useful" words. And I guess any 1000-word vest-pocket dictionary of French or Hawaiian or whatever would be like this too.

http://wold.livingsources.org/meaning <-- I found this one by chance

Is there anything else I should check, or any other technique I should be using?
janKipo
Posts: 3064
Joined: Fri Oct 09, 2009 2:20 pm

Re: Small Language Design- Choosing Meanings

Post by janKipo »

I always point to aUI, since it has a pretty short list. The NSM (natural semantic metalanguage) list, at 60-70 members, is also a reasonable start (and the notion of definition is much more appropriate). The glottochronology/missionary starter list is a very different thing and as suspect as the two enterprises it was designed for. Auxlangs, by and large, are not after basics but after enough to get folks into the wide world fast, so they are not very useful for minimalists. Artlangs, at least of the Pooh-Bah sort, rarely have much more vocab than is needed for the story(/ies) they go into, which is likely to cover not basics but very selected environments. And artlangs of the pretty-pretty sort tend to choose their vocabs for what neat features they can display, rather than for meaning, while the diary sort only meet the writer's quotidian life. Basic English marks, as far as I can tell, the maximum size for a semantic basics list; 20 would be a little on the low side (where did you get that, by the way?).
Lojban at one point used the subject list from Roget and Loglan used Eaton's statistics for English, though neither was used in a search for minimality or even basics.
As I said several years ago (and believed long before -- following Aristotle), the quest for primes or even adequate basics is a mug's game (will o' the wisp, whatever); it will always be inadequate and likely to be circular, and certainly redundant. (That paper, by the way, has three or so lists in an appendix.)
janMato

Re: Small Language Design- Choosing Meanings

Post by janMato »

The more I think about it, the more I'm not even sure I'm posing the question right. "Given a list of 500 words without assigned meaning, what is the best way to assign meanings so that one is unlikely to regret the choices?"

re: primes
Yeah, for my particular project, I'm not at all interested in the idea that there is (or isn't) a list of "ideas" that can additively be combined to create all other ideas as if there was a periodic table for linguistics.

re: aUI and semantic primes
These lists have something in common-- there is so little concrete stuff going on. Couldn't these lists just as easily have been driven by concrete concepts that are generalizable to abstractions? E.g. hard-things-that-exist vs like-a-rock. Snake eyes is more concrete than two, and imho, easier. Tp seems to lean slightly in the concrete direction, too-- lawa for leader instead of "body part that acts with the qualities of leadership".

re: 20 words.
I swear I heard you say it first at one of your LCC speeches, but I'm too lazy to track it down. I'm certain you said TP could have gotten by with (many) fewer than 118 words. If I'm completely mis-remembering, then I got the 20 number from the 20-word languages that jan Arpe wrote.

re: lower and upper bounds for # of words in a small language
So I guess the lower bound is somewhere in the 20 to 70 range. I have the suspicion that as the number of words goes down, the remaining words begin to behave as phonemes, i.e. meaningless sounds combined to make a meaningful atomic utterance. It's as if there were a hypothetical language with only 3 words, d, o and g, meaning time, space and nature, and it just so happens that d-o-g, for lack of a better compound word, means "furry domesticated canine".

Ogden wrote several word lists for Basic English (according to Wikipedia) of 850, 1000, 1350, 1500 and 2000 words. I assume you mean the 850 number as the upper bound for what counts as a small language. Do you figure there is a more persuasive criterion for an upper bound on lexemes in a small language, other than an arbitrary cutoff?
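The "words start to behave like phonemes" suspicion can be put in rough numbers. Here is a minimal sketch, assuming (my assumption, not anything from a real language) that a compound is simply an ordered sequence of words and that every sequence is allowed; real grammars forbid many sequences, so these figures are upper bounds. The function names are made up for illustration.

```python
def compound_capacity(vocab_size, max_len):
    """Upper bound on distinct compounds of 1..max_len words,
    treating every ordered sequence of words as a valid compound."""
    return sum(vocab_size ** n for n in range(1, max_len + 1))


def min_compound_length(vocab_size, target_meanings):
    """Smallest maximum compound length whose capacity reaches
    target_meanings distinct meanings."""
    if vocab_size < 1:
        raise ValueError("vocab_size must be at least 1")
    length = 1
    while compound_capacity(vocab_size, length) < target_meanings:
        length += 1
    return length


# The hypothetical 3-word language (d, o, g) aiming at 500 meanings:
print(min_compound_length(3, 500))    # 6
# A roughly tp-sized vocabulary aiming at the same 500 meanings:
print(min_compound_length(120, 500))  # 2
```

With 3 words, most everyday meanings would need compounds five or six words long, and at that point the individual words contribute about as much meaning as letters do. With around 120 words, pairs already suffice in principle.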

Roget stole from Wilkins, but I recently googled to find Wilkins' word list and couldn't find it.

Eaton was just a word frequency guy or was there more to it than that? I looked at word frequency lists and often they have way too many function words. http://en.wiktionary.org/wiki/Wiktionar ... ency_lists
Also, I wonder if the word frequency lists are being affected by language type, e.g. a hypothetical language with lots of cases might have few common words indicating spatial relationships, while a more analytic language might have several spatial relationship lexemes. Many languages put "to have" in the top 1000; I suspect that in a language like Russian, "to have" falls low on the list, since they use other mechanisms to express "to have". Anyhow, it does make me think a bit more about how the grammar I write will make words necessary or unnecessary.

Smaller languages can be learned in a shorter period of time than larger languages. Maybe the objective criterion that I'm looking for is the word list and grammar that maximizes the part of the learning curve where words and concepts are being rapidly learnt, before you hit that sharp rise in difficulty in progressing further. But even if this is a concern, once a language is learnt, it probably isn't going to continue to be interesting if its word list's only advantage was that it was quick to learn.

I guess now that I have a variety of lists to work with, I should take a look at the words that are in these other languages and make a choice, for each one that isn't on the list in my language, about whether I can really live without it.
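For what it's worth, that cross-check is easy to mechanize. Below is a minimal sketch, assuming each list has already been reduced to plain English glosses; the list names and contents are toy data made up for illustration, not excerpts from Swadesh, NSM, WOLD or any real list. It reports the meanings that show up in several reference lists but are missing from my own, most widely shared first.

```python
from collections import Counter


def missing_meanings(my_glosses, reference_lists, min_lists=2):
    """Return (meaning, count) pairs for meanings found in at least
    min_lists reference lists but absent from my_glosses, sorted by
    how many lists contain them (then alphabetically)."""
    mine = {g.strip().lower() for g in my_glosses}
    counts = Counter()
    for ref in reference_lists.values():
        # Use a set so a meaning repeated within one list counts once.
        for meaning in {m.strip().lower() for m in ref}:
            counts[meaning] += 1
    gaps = [(m, n) for m, n in counts.items()
            if n >= min_lists and m not in mine]
    return sorted(gaps, key=lambda pair: (-pair[1], pair[0]))


# Toy data (hypothetical): my draft lexicon and three reference lists.
my_glosses = ["water", "eat", "person", "big"]
reference_lists = {
    "list_a": ["water", "fire", "eat", "hand"],
    "list_b": ["big", "person", "say", "fire"],
    "list_c": ["hand", "say", "fire", "water"],
}

gaps = missing_meanings(my_glosses, reference_lists)
print(gaps)  # [('fire', 3), ('hand', 2), ('say', 2)]
```

Exact string matching is the weak point: "speak" and "say" would count as different meanings, so the glosses need normalizing by hand first, and the output is only a list of candidates to think about, not a verdict.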
janKipo

Re: Small Language Design- Choosing Meanings

Post by janKipo »

I don't remember 20 primes, but I haven't looked at that stuff in a while.
If you're not interested in prime sorts of things, what is the point of your language? Maybe, if you get clearer on that, some of the selection process will be taken care of (probably not all, of course).
The tendency of philosophical languages is to go for abstractions (duh!). Something more based in a conculture would probably go more toward concretes. But the concretes have to be readily generalizable: I don't think "lobster" would be a good place to start for talking about animals, for example. The balance is tricky, and writing a dictionary for tp seems to involve doing a lot of the generalizing -- from "say" to "related to communication", for example, but that is after a lot of usage, so can be passed off as descriptive (in some cases, not others). Still, I think there was some forethought put into the choices.
Yes, Eaton and the like are language specific and yield very different results for different languages. LeChevalier once planned to find a Russian Eaton for comparison purposes, but never did.
I suppose the bounds on small languages come down to purpose again. Generally, the trick is to balance size and expressibility (however that might be judged), i.e., ease of learning the vocab against ease of saying the useful things to say. tp does fairly well, but only on the saying, not the understanding, side of communication. But even Lojban, with all its rules, has an understanding problem -- and not just across various L1s. And, of course, the more complex the things you want to say (which ought to come late in a language but seem always to turn up immediately -- DDJ for example, or Alice), the better you need to know how to manipulate the language.
janMato

Re: Small Language Design- Choosing Meanings

Post by janMato »

Here's my elevator speech.
http://www.suburbandestiny.com/conlang/?p=134
OK, maybe an Empire State Building elevator speech on the language's goals.

small- because large conlangs won't attract fans willing to invest that much time
anti-Alzheimer's- this goal probably doesn't imply any structural characteristics. The research supporting bilingualism as an Alzheimer's preventative studied people who knew 2 natlangs, and probably any 2 would have the same effect.
signable- a language that can be signed as well as spoken. This would probably call for grammar tweaking, but not vocab tweaking.
home & internet use- These are the plausible domains of a conlang, so any conlang that wants to get off the reference grammar shelf and make a community needs to address these use cases. Vocab would probably be the easiest way to make a conlang work well in these domains.

Here's the series of posts I've made on my experiences writing the conlang so far:

http://www.suburbandestiny.com/conlang/?cat=10

What is DDJ?
janKipo

Re: Small Language Design- Choosing Meanings

Post by janKipo »

Dao De Jing, the classic of the way and its virtue, by Laozi. Over thirty people I know of have made it the first thing they translated into a conlang, in spite of its remarkable opacity, even in Chinese.

Small. But Lojban and Loglan are pretty large, well over 2000 words in the base. Esperanto probably doesn't count, since so many know so much already. Both of these have relatively large groups. A catchy gimmick may be more important than a small vocabulary.

Any language can be signed as well as spoken, so that isn't much of a criterion. The experience of Signed English, however, suggests that sign languages are best if allowed to float free from spoken languages.

Home and Internet, especially Internet, since home use requires the cooperation of spouse, parents, siblings or children -- who rarely, if ever, are inclined to cooperate. But a home vocab is handy even for the Internet. On the other hand, an Internet-centered language is likely to become dated fairly rapidly (Lojban and Loglan experience), so maybe not so worthwhile in details.
janMato

Re: Small Language Design- Choosing Meanings

Post by janMato »

janKipo wrote:Dao De Jing, the classic of the way and its virtue, by Laozi. Over thirty people I know of have made it the first thing they translated into a conlang, in spite of its remarkable opacity, even in Chinese.
What would really make a language match a philosophy is some well chosen grammaticalizations. For example, a Buddhist conlang that required different subject/object markers depending on the category of Dukkha it falls into. (ordinary pain, pain due to things being temporary, pain due to desire and the less often mentioned pain of performing or listening to a high school chorus)
janKipo wrote:Small. But Lojban and Loglan are pretty large, well over 2000 words in the base. Esperanto probably doesn't count, since so many know so much already. Both of these have relatively large groups. A catchy gimmick may be more important than a small vocabulary.
Esperanto started small, but from the outset allowed the free borrowing of roots, so it isn't small anymore-- the largest EO dictionaries are as big as a natlang's. Lojban, imho, with a 2000-word vocab (and it's closed, right?) is small, but it doesn't have a grammar that is small in any sense. Even if a language were incredibly catchy, if the vocab is, say, 20,000 or even 5,000 words, the popularity wouldn't translate into a large early adopter community.
janKipo wrote:Any language can be signed as well as spoken, so that isn't much of a criterion. The experience of Signed English, however, suggests that sign languages are best if allowed to float free from spoken languages.
Not all languages can be signed efficiently; ASL has many adaptations specific to sign language. Well, in my particular scenario, I had in mind a sign language that would be used between the hearing and the hard of hearing/deaf. So in the transition period, it would be useful to be able to do both, with something not so inefficient as finger spelling each word. I suspect that sign languages lend themselves more to analytic languages than to ones that have any sort of inflections.
janKipo wrote:Home and Internet, especially Internet, since home use requires the cooperation of spouse, parents, siblings or children -- who rarely, if ever, are inclined to cooperate. But a home vocab is handy even for the Internet. On the other hand, an Internet-centered language is likely to become dated fairly rapidly (Lojban and Loglan experience), so maybe not so worthwhile in details.
EO denasko exist, so it doesn't seem impossible. Is the whole point of eo to find girlfriends at eo conventions? I mean, it sure isn't an international diplomacy language. Getting a language spoken at home is hard in the US. It seems like it's easier in places like India, so using the US experience as a standard is a bit unfair. The US is a cultural gulag/death camp for 2nd languages. Anyhow, if a language is never used at home, imho, it has poor prospects for ever becoming a living language. Imho, a well designed language is one that *could* come alive. Toki pona lacks the official machinery necessary for day to day communication at home, at a toki pona convention, for wooing and wowing, for getting kids to brush their teeth. One could easily coin the phrases, but so much would be created that such a hypothetical 2nd generation tokiponist wouldn't be able to find a girlfriend at the next toki pona convention, because the living home-language would be incompatible with the online one.

Good point, the churn in technology would make basic vocabulary for things like gopher and myspace rather short lived. Maybe the official rule could be to recycle old words to have new meanings, like if the word for gramophone became the word for mp3 player and the word for myspace became the word for facebook.
janKipo

Re: Small Language Design- Choosing Meanings

Post by janKipo »

If you really wanted a language to match Buddhism, for example, you would need one in which, first, everything is a verb, and second, all verbs refer to instantaneous events. As it is, Buddhism has gotten along fine with a highly inflected Indo-European language and a proclivity for making compounds that would astonish even a German (an argument against yet another form of SWH). I can't think why a Buddhist would want to distinguish between various kinds of duhkha, since they are all alike in needing to be escaped and that is all that counts. Categorization smacks of metaphysics, a Bauddha no-no.

Lojban vocab is not closed: beyond the basic 2-3k words there are an indefinite number of compounds based on these words and a further possibility of borrowing words from other languages (though at considerable cost). I would have thought that learning 2000+ words would be enough to dissuade most people from learning a language, especially since the Lojban teaching system basically forces you to get them all in memory before you go on to anything else. Happily, most people ignore the teaching devices, get a small slice and build on that as the need arises -- perhaps never learning some of the words at all. I take this as a case of a fairly large vocab that people join a language in spite of, and also as a display of how even the largest vocab can be learned, if needed. After all, people learn English all the time, and it has a vocab of Lord knows how many words, and even a moderately competent speaker probably has 10k at hand. Of course, people are motivated to learn English, which takes world hegemony as a gimmick. Lojban suggests that other gimmicks might also work.

People who know both ASL and English know that, among ordinary deaf users of ASL, the relation to English is hard to discover and that teaching reading, for example, involves falling back to Signed English, a much less efficient system. ASL can even be used to teach reading French, etc., though the slowed signing will, of course, be different.

Sure, getting home language learning going is difficult anywhere. It is easier in a country where there are a lot of languages around (my I-E professor was competent in 13 languages by the time he was 6 or so, because, as an Estonian, he encountered them all every day on the street). And then the motivation is to be able to deal with the butcher and the policeman, etc. For a conlang -- even Eo -- there is no such motivation, except in a very few places where Eo has been nativized in a community. So, getting a home conlang going requires, ideally, that at least two people use it (and finding a female conlanger is really difficult -- I know maybe two dozen) as a secret language which others will then learn to get in on what's going on. And so on outward, to the larger family and the neighbors, etc. -- in your dreams. Whether tp could be a home language has not been tested, I think (though Sonja suggested that she had used it in pillow talk, which is a step). But I don't see any real reasons why it could not be: all the normal speech acts are covered, and the necessary vocabulary just takes a bit of creative work, mainly recompressing the expanded concepts that go with various words. It would be interesting for someone to actually undertake to translate a bit of domestic dialog into tp (more interesting than the DDJ again).
janMato

Re: Small Language Design- Choosing Meanings

Post by janMato »

janKipo wrote:If you really wanted a language to match Buddhism, for example, you would need one in which, first, everything is a verb, and second, all verbs refer to instantaneous events.
I like it! Kind of like a mirror of Kelen. There are lots of varieties of Buddhism. I'm a cafeteria-night-stand Buddhist. The later academic approaches to Buddhism are more to my tastes than the hard-core ascetic historical Siddhārtha Gautama's likely opinions.

This reminds me of the earlier discussion on exactly what it means to be a conlang of/for philosophy X. Is it optimized to illustrate the philosophy, make it easier to discuss or think about it, induce a Zen/Taoist/Positivist state of mind etc, maybe "English Shrugged". I tend to take the view that a natlang and conlang will have & need the cultural vocabulary and maybe the grammar to make it easy to discuss philosophy X, but won't necessarily illustrate it in the structure of the language.
janKipo wrote:After all, people learn English all the time, and it has a vocab of Lord knows how many words, and even a moderately competent speaker probably has 10k at hand. Of course, people are motivated to learn English, which takes world hegemony as a gimmick.
Dunno how productive it is to try to predict the gimmick that will work in advance. English is worth real $ in the job market. Conlangs, and natlangs (like Icelandic or Ute), can't compete on those grounds. For the moment, I'm sticking with my long-term-cerebral-health gimmick. I've met people who've come to my linguistics book club that attempt to teach essentially conlangs to children with profound disabilities who've come to some of the realizations that the conlang community has: learning a conlang without a community is hard, but if it could be done, it would really help these people and kids who are locked up in their own skulls. As an adult w/o any particular disabilities, a medical conlang just needs to be practical enough to learn, i.e. small, useful at home and on the internet.
janKipo wrote:So, getting a home conlang going requires, ideally, that at least two people use it (and finding a female conlanger is really difficult -- I know maybe two dozen) as a secret language which others will then learn to get in on what's going on.
I've read a book or two on how languages die, and if you read them in reverse, you can get some hints on how to bring a language to life. The anthropologists (and maybe they were sexist bastards, but it sounds plausible) noticed that women tend to adopt the language and customs of the husband, so especially in a patrilocal society, it doesn't pay for women to learn a language that they don't think their future husband will speak. But if their future husband speaks another language, then they would be more likely to learn that language, provided the language was practical and had some sort of prestige (or gimmick) to it. I'm too lazy to track down statistics on what language people use post-marriage, but I'd bet that women are more likely than men to switch to the language of their spouse regardless of the language of the greater community. The killer factoid would be the rate of adoption by gender when the wife and the husband speak different L1s and are in a non-L1 community, e.g. a Khmer/Chinese couple in Los Angeles: what would they speak at home?
janKipo wrote:And so on outward, to the larger family and the neighbors, etc. -- in your dreams.
Yeah, no one can hope to engineer success on that scale on purpose. Eo was an enormous fluke, the right conlang at the right time, but not necessarily a result of being an especially well designed or marketed one. I probably could make a decision to guarantee a conlang's early death, but not decisions to help the odds of it catching on like wildfire.
janKipo wrote:Whether tp could be a home language has not been tested, but I don't see any real reasons why it could not be
1) Compounds in tp become somewhat transparent in the triplets and quadruplets -- n m pi n m and n pi n m -- and that's too long.
As a living language, the # of non-transparent compounds would explode.
And that would lead to the domestic home language being incompatible with the online language.
2) Maybe if there was a mythical, tireless leader to dictate the conventions, or a rapid-acting, capable-of-reaching-a-conclusion committee, tp could address the above, but without one, it's kind of slow to evolve.
And living languages don't wait for the committees. They pick a convention and move on. The online community holds back, creating a 2nd mechanism for divergence.