
Category talk:Turkish terms with homophones

From Wiktionary, the free dictionary
Latest comment: 11 years ago by ElisaVan

From Special:PermanentLink/24941665#A_Bunch_of_Turkish_Inflected-Form_Categories on WT:RFDO:

The following information passed a request for deletion.

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


A Bunch of Turkish Inflected-Form Categories

I noticed these on the Special:UncategorizedCategories page, and was going to add the appropriate categories and move the misspelled ones, but then I noticed who created many, if not all, of these, and thought I should check whether this kind of categorization is appropriate. If the consensus is that they're ok, I'll happily withdraw the nominations and fix the problems. Chuck Entz (talk) 02:43, 21 July 2012 (UTC)

We don't do this in any language I'm familiar with, except Latvian and Icelandic, and I'm pretty sure we stopped doing it there. Delete --Μετάknowledge discuss/deeds 02:49, 21 July 2012 (UTC)
I templatised Category:Turkish terms with homophones. — Ungoliant (Falai) 03:19, 21 July 2012 (UTC)
These categories are very specific, but languages are all unique so they have to be judged as such. I think asking User:Sinek would be a good idea. User:George Animal is also listed as a native Turkish speaker; I think the other users in Category:User tr-N aren't currently active. Mglovesfun (talk) 14:22, 21 July 2012 (UTC)
I just realized: these seem to each have their own template that produces them, and the templates aren't language-specific; one could just as easily find Category:French possesive singular forms someday. Chuck Entz (talk) 14:41, 21 July 2012 (UTC)
Not all of them do, but the more specific ones seem to. Chuck Entz (talk) 14:54, 21 July 2012 (UTC)
Now that I know what was populating them, I've orphaned and speedied the misspelled ones. Chuck Entz (talk) 16:21, 21 July 2012 (UTC)
I think that a generally useful template form for all languages is difficult, if not impossible, but I propose to follow the way it is done in Hungarian, as this Finno-Ugric language is somewhat similar to the Altaic Turkish. In Hungarian, words are also formed agglutinatively, and I know that a word can have several cases at the same time. Sae1962 (talk) 07:43, 22 July 2012 (UTC)
I think all those categories (most of them, like Turkish third-person singular possesive dative forms) are redundant, because those templates do not exist even at the Turkish Wiktionary. It is nonsense to create the pages (e.g. my auto, my page, my father, of the father). It would be enough if the templates exist; the pages don't have to be created. The creation of entries like arabam (my auto) is also redundant, because all those things are formed the same way: arabam (my auto), evim (my house), etc. It is better to create a page for the grammar of possessive nouns in the Turkish language. I'm in favour of this idea. GeorgeAnimal. 13:04, 22 July 2012 (UTC)
PS: I am for the deletion of the pages. --GeorgeAnimal. 13:05, 22 July 2012 (UTC). But the templates shall remain in the entries. ---GeorgeAnimal. 13:07, 22 July 2012 (UTC)
These Turkish inflection forms can all be generated from simple grammar rules. Including them in the Wiktionary is totally useless. They don't even have exceptions like in English, where you have dog -> dogs but mouse -> mice. Because Turkish is an agglutinative language, if you started including all possible Turkish constructs you would have no rational way to reject something like Çekoslovakyalılaştıramadıklarımızdanmışsınız, which means "reportedly you are one of those whom we could not make Czechoslovakian". Terms with definitions that are Sum-of-Parts are not included in the Wiktionary, and by similar reasoning I believe grammatical constructs in agglutinative languages that follow simple rules should not be included either. For this reason the categories listed above and the words contained within them should be deleted. The same argument applies to the templates mentioned below. --İnfoCan (talk) 16:58, 22 July 2012 (UTC)
Off-topic comments about whether there should be entries for all possible word-suffix combinations
Actually we do have a way of rejecting something like Çekoslovakyalılaştıramadıklarımızdanmışsınız, namely the requirement that words actually be attested in use (and not merely as mentions). I don't think Turkish is considered one of the limited-documentation languages, so Çekoslovakyalılaştıramadıklarımızdanmışsınız would have to be attested three times in published literature to be included. If not, it isn't included. If it is attested, however, there is actually no reason we shouldn't include it. —Angr 20:38, 22 July 2012 (UTC)
OK, the country of Czechoslovakia does not exist any more, and the above is just an old joke that elementary school kids tell each other. But seriously, if you want attestation, no problem: Avrupalılaşmamıza is a down-to-earth example, it translates as "to our Europeanization" (as in "Islam is shown as an obstacle to our Europeanization", as used in a newspaper editorial). Google search gives me three independent attestations from newspapers [1]. Or, yapamamamızın ("of our inability to do"), 7 attestations just in Google Books [2]. So, is Wiktionary going to include such words now? If so, then probably 90% of the words in Wiktionary can potentially belong to Turkish or some other agglutinative language. This is ridiculous and calls for a proper definition of what a "word" is, to set an inclusion policy. I believe that the attestation part should come in only after you have "peeled off" all the generic inflections. --İnfoCan (talk) 03:07, 23 July 2012 (UTC)
Why shouldn't Wiktionary include such words? We're not paper, we're not going to run out of space. On the contrary, the ability to include such things is one of our major selling points, what sets us apart from conventional dictionaries. I see no harm at all in including such forms, and they may be very helpful for people learning Turkish who haven't yet quite mastered all of the suffixes and so don't know exactly what to shave off to get to the lemma. —Angr 21:53, 23 July 2012 (UTC)
Your argument is equivalent to saying that the Wiktionary should have translations of every four-word English phrase into Turkish. I could say there is no space limitation, and it will certainly be useful to Turkish speakers who don't know English. Yes, it is doable, but it is not efficient. MediaWiki is great for writing an encyclopedia and perhaps a traditional dictionary, but I don't think it is the right tool for doing what you propose. Rather than entering a translation of everything, you need a rule-based parsing system. I know such Turkish language parsers exist; I have seen them at academic Web sites in Turkey. It would be far less work to write software that parses Turkish than to write all these translations! --İnfoCan (talk) 22:23, 23 July 2012 (UTC)
It's not quite equivalent, because Wiktionary's goal is "all words in all languages", and yapamamamızın is a word, while a four-word English phrase isn't a word. Now if we had a parsing system like you're talking about, where someone could type in yapamamamızın and be given the information about its root and affixes, then I'd agree we wouldn't need it as an entry as well. But until we have such a parsing system here, if someone creates an entry for yapamamamızın, and someone else nominates it for deletion, I will vote keep. —Angr 22:36, 23 July 2012 (UTC)
Someone may create an entry for yapamamamızın but will soon tire of creating such entries. This discussion reminds me of the joke about the person with the hammer seeing everything as a nail ;-). You have to admit that the current MediaWiki software is not designed to deal with this kind of data. That's why the Foundation has started to move toward new concepts like Wikidata [3].
To think that a word is a bunch of characters with a space character on either side is an English- (or rather, a non-agglutinative language) speaker's world view :-). For yapamamamızın and "of our inability to do", the important parts are yap and "do", and the rest is grammatical detail. English uses spaces to separate most morphemes; Turkish relies a lot more on grammar rules. If four words separated by spaces don't deserve to have an entry in the dictionary, neither does a lemma with three suffixes. --İnfoCan (talk) 02:11, 24 July 2012 (UTC)
Let me elaborate. The grammatical rules I mentioned can be summarized in a few paragraphs on a special page in the Appendix: namespace, and this would be far more efficient than having a template for each inflection form of each Turkish word. On the other hand, it did occur to me that there are a few exceptions to these rules, mainly because of exceptions to the Turkish vowel harmony rules for certain words of foreign origin (for example, the dative case of sol, the musical note, is sole, but the dative case of sol, meaning "left", is sola). For such exceptions, and only for them, a specialized template would be useful to indicate that the usual rules do not apply. This minor point aside, I still stand by my view above. --İnfoCan (talk) 18:26, 22 July 2012 (UTC)
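As a rough illustration of the point above, the dative rule and its exception handling could be sketched in Python along the following lines. This is only a minimal sketch under stated assumptions, not anything that exists on Wiktionary: the function name `dative`, the `DATIVE_EXCEPTIONS` table and its sense labels are hypothetical, the input is assumed to be lowercase, and consonant changes before the suffix are ignored.

```python
# Minimal sketch of the regular Turkish dative suffix (-a / -e) chosen by
# vowel harmony, plus a small table for words that break the rule.
# All names here are hypothetical and for illustration only.

BACK_VOWELS = set("aıou")
FRONT_VOWELS = set("eiöü")
VOWELS = BACK_VOWELS | FRONT_VOWELS

# Exceptions are keyed by (word, sense) because the same spelling can
# behave differently depending on its meaning, as with "sol".
DATIVE_EXCEPTIONS = {
    ("sol", "musical note"): "sole",
}


def dative(word, sense=None):
    """Return the dative form of a lowercase noun (simplified rule)."""
    if (word, sense) in DATIVE_EXCEPTIONS:
        return DATIVE_EXCEPTIONS[(word, sense)]

    # The last vowel of the word decides between -a and -e.
    last_vowel = next((ch for ch in reversed(word) if ch in VOWELS), None)
    if last_vowel is None:
        raise ValueError(f"no vowel found in {word!r}")
    suffix = "a" if last_vowel in BACK_VOWELS else "e"

    # A buffer consonant y is inserted after a word-final vowel.
    buffer = "y" if word[-1] in VOWELS else ""
    return word + buffer + suffix


print(dative("sol"))                  # sola
print(dative("sol", "musical note"))  # sole
print(dative("ev"))                   # eve
```

Only the exceptional entries need to be listed individually; every other noun goes through the general rule, which is the division of labour proposed above.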
Per İnfoCan--Sabri76'talk 17:45, 22 July 2012 (UTC)
Actually I find it unnecessary to create articles for every declined noun form and conjugated verb form, as it would mean countless forms. Turkish is quite different from English: there are so many possible situations that are expressed by a suffix. But I also think of the users who have no knowledge of Turkish. We have to take into account that even someone without basic knowledge should be able to find the definitions they need. So if someone sees a word like evimizdeyiz ("we are at our house") and tries to look it up, they'll probably find nothing, as it has 3 different suffixes. And searching for words like this is useless, as the "Did you mean ...?" part can't always lead you in the right direction, because the forms in the declension templates are not shown in the searches.
Rather than creating each form, I guess we could edit the templates in order to give links to each suffix, something like this: evimizdeyiz. I guess this is more practical, as there are so many (and I mean it) suffixes, and there would be millions of different combinations with all Turkish nouns. But I still have no idea about the search part: is there a way to make it possible to show the info from the declension templates? Sinek (talk) 12:43, 23 July 2012 (UTC)
More off-topic comments about whether there should be entries for all possible word-suffix combinations
Creating millions of pages for all possible combinations of Turkish words with their single, double and triple suffix combinations is very inefficient. It is like creating separate pages for every possible two-, three- and four-word phrase in English. If somebody needs to look up a word like evimizdeyiz ("we are at our house"), you need a parsing-based solution, not the catalog-based solution that MediaWiki provides. It is possible to write scripts that take a word like evimizdeyiz and split it into ev (house) + -imiz (possessive, 1PP) + -de (locative case) + y (vowel-vowel connector) + -iz (copula, 1PP). The output of such a script can then give the necessary links to such a user. --İnfoCan (talk) 17:53, 23 July 2012 (UTC)
That's exactly what I said. I just don't know how to link the user (who searches evimizdeyiz) to the bare noun, ev. Sinek (talk) 18:32, 23 July 2012 (UTC)
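To make the parsing idea discussed above more concrete, here is a toy sketch in Python of how a script might split evimizdeyiz into its stem and suffixes. Everything in it is an assumption made for illustration: the `STEMS` lexicon, the `SLOTS` table and the `parse` function are hypothetical, the suffix order is hard-coded, only front-vowel variants are listed, and vowel harmony and real-world ambiguity are not handled. An actual Turkish morphological parser would be considerably more involved.

```python
# Toy morphological splitter: match a known stem, then walk through a fixed
# sequence of optional suffix slots. Hypothetical data, illustration only.

STEMS = {"ev": "house"}

# Suffix slots in the order they may appear after the stem.
SLOTS = [
    {"imiz": "possessive, 1PP", "im": "possessive, 1PS"},
    {"de": "locative case"},
    {"y": "vowel-vowel connector"},
    {"iz": "copula, 1PP"},
]


def _parse_suffixes(rest, slot):
    """Return a list of (suffix, label) pairs covering `rest`, or None."""
    if not rest:
        return []
    if slot == len(SLOTS):
        return None  # leftover characters that no slot can explain
    for suffix, label in SLOTS[slot].items():
        if rest.startswith(suffix):
            tail = _parse_suffixes(rest[len(suffix):], slot + 1)
            if tail is not None:
                return [("-" + suffix, label)] + tail
    # Every slot is optional, so try skipping this one.
    return _parse_suffixes(rest, slot + 1)


def parse(word):
    """Split `word` into stem and suffixes, or return None if unknown."""
    for stem in sorted(STEMS, key=len, reverse=True):
        if word.startswith(stem):
            suffixes = _parse_suffixes(word[len(stem):], 0)
            if suffixes is not None:
                return [(stem, STEMS[stem])] + suffixes
    return None


for piece, label in parse("evimizdeyiz"):
    print(piece, "=", label)
```

The first element of the result is the bare stem, so a lookup tool could use it to point a reader who searched for evimizdeyiz to the entry for ev, and use the remaining elements to link to entries for the individual suffixes.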
Any chance of moving all the off-topic stuff to the Beer parlour? The deletion debate has got lost among it. Mglovesfun (talk) 09:16, 24 July 2012 (UTC)
Sorry. I collapsed my comments above. --İnfoCan (talk) 14:15, 24 July 2012 (UTC)

I would favour deleting these in favour of an approach like that used in Hungarian noun form categorisation. To elaborate: categorise possessive noun forms in one possessive noun form subcategory, then, for any further inflected forms based on a possessive form (should we include them), perhaps categorise them as basic noun forms; for example, evimizde could be included as "locative singular(?) of evimiz". I don't know how accepting of this people would be, but I think it would be reasonable to give something like a free pass to the basic inflections, except perhaps in the case of odd or rare words. We would then generally have no qualms about the addition of the simple case forms, and probably nominative possessive forms too, but would perhaps be stricter or more watchful about additions of less basic forms such as non-nominative possessives. User: PalkiaX50 talk to meh 14:43, 24 July 2012 (UTC)