Wiktionary:Beer parlour/2011/July

This is an archive page that has been kept for historical purposes. The conversations on this page are no longer live.

Redirecting single-character digraphs

I suggest redirecting ǲ to Dz, and doing the same from all single-character digraphs to their attestable two-character versions. --Daniel 15:31, 1 July 2011 (UTC)[reply]

Support. —RuakhTALK 17:12, 1 July 2011 (UTC)[reply]
Why? A reason to redirect would be nice, and none has yet been supplied (here). Moreover, (at least in absence of such reason) I oppose: A reason to have them separate is to show etymology sections stressing different aspects of their development. (The digraph's etymology section can stress how long the digraph's been in use, and the other version's can stress how long it's been in use. Etc.) Another reason is that the digraph (or two-character version) might be a letter in some language that the other version is not a letter in, so we'd need a language section for that language on the one page and not the other. These and similar reasons are why we keep alternative spellings on separate pages.​—msh210 (talk) 19:06, 1 July 2011 (UTC)[reply]
I don't think ǲ and Dz are alternative spellings of each other, or that the two-character version can't be considered a letter in some language. They are the same set of characters: "D" followed by "z".
Moreover, the Unicode FAQ says[1] some interesting things:
  • A digraph, for example “xy”, looks just like two ordinary letters in a row (in this example “x” and “y”), and there is already a way to represent it in Unicode: <U+0078, U+0079>.
  • [...] the UTC has taken the position that no new digraphs should be encoded, and that their special support should be handled by having implementations recognize the character sequence and treat it like a digraph.
--Daniel 19:19, 1 July 2011 (UTC)[reply]
I didn't say the two-character version can't be considered a letter. In any event, exposed to the light of your quotes from Unicode, my objections wither and die. I support redirection.​—msh210 (talk) 19:30, 1 July 2011 (UTC)[reply]
I take back my support. (Am wearing flip-flops, too.) See my comment on the vote's talkpage for more.​—msh210 (talk) 17:17, 6 July 2011 (UTC)[reply]
If this is supposed to be handled in the implementation like Unicode says, then shouldn't the Mediawiki software itself perform the redirect? —CodeCat 19:44, 1 July 2011 (UTC)[reply]
MediaWiki doesn't do that, and I don't know why. --Daniel 19:49, 1 July 2011 (UTC)[reply]
I don't understand your question. Unicode says that implementations should recognize that <ng> (two characters) in a Tagalog context is a single digraph; it doesn't say that implementations should canonicalize <ʣ> (one character) to <dz> (two). —RuakhTALK 13:49, 2 July 2011 (UTC)[reply]
Oh, I thought that it was that way. Nevertheless, could something like that be done for Wiktionary? —CodeCat 13:51, 2 July 2011 (UTC)[reply]
Redirecting individual entries manually can be done. I think that automatic redirects can be done, too, with JavaScript. --Daniel 10:55, 4 July 2011 (UTC)[reply]
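As a rough illustration of the JavaScript approach mentioned above, the sketch below uses the standard `String.prototype.normalize` method with the NFKC form to find the multi-character equivalent of a single-character digraph title. The function name `digraphRedirectTarget` is hypothetical, and this is not how MediaWiki or Wiktionary handles such titles; it covers only characters that carry a Unicode compatibility decomposition.

```javascript
// Hypothetical helper: returns the title that a single-character digraph
// page could redirect to, or null when normalization changes nothing.
function digraphRedirectTarget(title) {
  // NFKC replaces compatibility characters such as U+01F2 with their
  // ordinary-letter sequences ("D" followed by "z").
  const normalized = title.normalize("NFKC");
  return normalized === title ? null : normalized;
}

console.log(digraphRedirectTarget("\u01F2")); // U+01F2 -> "Dz"
console.log(digraphRedirectTarget("Dz"));     // already two characters -> null
```

NFKC also rewrites many characters that are not digraphs (for example, U+20A8 RUPEE SIGN becomes "Rs"), so a real script would presumably need an explicit list of titles to include or exclude.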

I created Wiktionary:Votes/2011-07/Redirecting single-character digraphs. --Daniel 10:55, 4 July 2011 (UTC)[reply]

I don't think we're ready for that. I'd like to know more generally if this should always be done, and what the language implications are for using ǲ -> D + z and similar rules in any spellings in which they appear. For that it may be useful to have a list of such digraphs. DAVilla 16:57, 4 July 2011 (UTC)[reply]
See Wiktionary talk:Votes/2011-07/Redirecting single-character digraphs for a start . . . —RuakhTALK 00:17, 5 July 2011 (UTC)[reply]
Several of those I would cross out, like ₨. Affected languages seem to be Arabic, Armenian, and Latin. DAVilla 16:29, 6 July 2011 (UTC)[reply]
And Hebrew, and Lao. --Daniel 16:58, 6 July 2011 (UTC)[reply]

3Ps has two letters

Should Category:English two-letter words contain 3Ps? --Daniel 16:56, 1 July 2011 (UTC)[reply]

Apparently not, because the category description (which you wrote!) is "English individual words comprised of exactly two letters". "3Ps" is not composed of exactly two letters: it's composed of two letters and a digit. I don't think the description needs to be changed. (Technically I suppose a word isn't really composed of letters — a word's spelling is composed of letters — but the meaning of the description seems clear. Oh, but if you do change the description, you might want to change "comprised of" to either "composed of" or "comprising"; this use of "comprised of" is common, but is frequently considered incorrect.) —RuakhTALK 17:10, 1 July 2011 (UTC)[reply]
OK, I'm going to change the description of that category to allow "3Ps", unless someone objects in the near future.
If a word's spelling, rather than the word, is comprised of letters, perhaps the name of the category should be changed as well. Since we have Category:Japanese terms written with four Han script characters, we could have Category:English terms written with two letters. --Daniel 18:36, 1 July 2011 (UTC)[reply]
I'm objecting in the near future for some value of near. I don't see the purpose in having a category for two-letter words including also words that include digits. OTOH, two-letter words sans digits form a category useful to people doing crosswords, cryptograms, and other word puzzles. Re the category name, "English two-letter words" is fine: that's what everyone calls them (at least here in Leftpondia). Moreover, "English terms written with two letters" might be read as "...with two different letters" (e.g. sass). —msh210 (talk) 18:54, 1 July 2011 (UTC)[reply]
3Ps is not a two-letter word. Category:English terms written with numbers (and also letters by presumption) might be more interesting. DAVilla 18:34, 4 July 2011 (UTC)[reply]
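The distinction drawn in this thread can be made mechanical. Here is a minimal sketch, assuming that "letter" means any character in the Unicode Letter category and that a digit disqualifies a title; the function name `isTwoLetterWord` is made up for illustration and is not part of any Wiktionary tooling.

```javascript
// Hypothetical check for membership in a strict "two-letter words" category.
function isTwoLetterWord(title) {
  // \p{L} matches any Unicode letter; the "u" flag enables property escapes.
  return /^\p{L}{2}$/u.test(title);
}

console.log(isTwoLetterWord("of"));  // true
console.log(isTwoLetterWord("3Ps")); // false: contains a digit
```

Under this reading "sass" is also rejected, since it is written with four letters even though only two are distinct.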

Letters and typography

My own idea expressed in a recent discussion made me curious.

Should A, a, B, b, etc. be members of Category:en:Typography? --Daniel 18:20, 1 July 2011 (UTC)[reply]

No, use a subcategory, Category:en:Letter names or the like. DAVilla 18:30, 4 July 2011 (UTC)[reply]

English names of stars, etc.

Today I feel like creating this streak of new and relevant discussions. Feel free to ask me to take it easy in the future if you want fewer discussions, though I'm finished for today anyway.

Now to the proposal. Renaming categories, this way:

The proposed names are in line with the idea of deprecating language codes piecemeal from category names, which some people seem to approve, and also fit the existence of Category:English surnames, which is codeless and contains proper nouns as well. --Daniel 18:46, 1 July 2011 (UTC)[reply]

Would these categories go under Category:English names? If so, then maybe an intermediate category like Category:English topographical names and Category:English astronomical names would be good too, so that they don't all go into the main category. —CodeCat 19:21, 1 July 2011 (UTC)[reply]
Yes, that's a good idea, too. --Daniel 19:38, 1 July 2011 (UTC)[reply]

I created Wiktionary:Votes/2011-07/Categories of names. --Daniel 05:28, 2 July 2011 (UTC)[reply]

I oppose, while I am willing to yield to a significant majority. The proposed renaming is a first step in making the names of hyponymic topical categories needlessly long; "mammals", and "animals" are likely to follow if this naming scheme is to be applied throughout Wiktionary. Furthermore, there was a recent poll from which it was not obvious that a large majority of editors prefers to get rid of language codes: Wiktionary:Beer parlour archive/2011/May#Straw poll: Topical category languages. --Dan Polansky 10:26, 2 July 2011 (UTC)[reply]

That poll ended in a very approximate draw between Category:de:Mountains and Category:de:Physics and Category:German terms relating to mountains and Category:German terms relating to physics, one of these options, the slightly more voted one, being long and written in plain English rather than making use of codes.
There was some disagreement, however, between the voters of long names, about what would be the exact longer names: people proposed "regarding mountains", "involving mountains", "relating to mountains" and even "Mountain terminology".
Concerning this relatively small but cumbersome disagreement of wording: "Category:English names of mammals" would not be an accurate option of name of a hyponymic category; "Category:English names of species of mammals" or "Category:English hyponyms of mammal" would be better.
On the other hand, "Category:English names of stars" is just accurate enough for its contents. This category should contain Sun and Aldebaran, but not red giant or supernova. In fact, the title is so specific I know it can't contain starlight as well. --Daniel 13:05, 2 July 2011 (UTC)[reply]
  • These new category names look very cumbersome and needlessly worsen the usability of categories. As with the category names for etymologies, this should be handled at the presentation level with two lines of Javascript code doing the necessary substitution/prefixation. When Wiktionary finally switches to language-specific presentation (tabbed view or whatever), these "XXX names of" will become redundant. --Ivan Štambuk 16:33, 2 July 2011 (UTC)[reply]
    No, not actually, they would not become redundant. The vote I created (Wiktionary:Votes/2011-07/Categories of names) says why not.
    Actually, the part "names of" would not be redundant, because "Mountains" and "Names of mountains" are different things. The part "German" would be redundant if only German categories are shown, just like how "de:" would be redundant in that case, but perhaps in a more readable manner. --Daniel 16:52, 2 July 2011 (UTC)[reply]
    Users who see category names such as "Mountains" or "Continents" would expect them to contain names of mountains and continents. In fact, AFAICS, the above-listed categories don't contain anything other than proper nouns. The new category names are more precise, but non-intuitive and cumbersome IMHO. --Ivan Štambuk 17:23, 2 July 2011 (UTC)[reply]
    Allow me to disagree with you again: No, I don't think so. Some categories could be populated only with proper nouns but contain terms of other parts of speech, too: I only remember Category:Planets and Category:Gods. (and Category:Stars contains star, but that can be ignored) --Daniel 17:43, 2 July 2011 (UTC)[reply]
    I don't really have an opinion about the naming, but I do like the idea of having separate categories for names of certain things as opposed to words about those things. —CodeCat 17:46, 2 July 2011 (UTC)[reply]
  • For information, fr.wiktionary uses names of the kind Countries in English and Lexicon in English of (a domain) (for words used in the domain). The only drawback is that not only proper nouns, but also some common nouns, may be considered as relevant to the first case. This proposal is about this issue. Any other better proposal about it? Lmaltier 17:33, 2 July 2011 (UTC)[reply]
    I learned on this page that "Mountains in Tonga" would be very ambiguous, because there is a language and a place named Tonga. That should be considered when analyzing the idea of copying the system of the French Wiktionary. --Daniel 17:47, 2 July 2011 (UTC)[reply]
    This is probably the only language for which this ambiguity arises. DAVilla 18:28, 4 July 2011 (UTC)[reply]
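The presentation-level substitution suggested earlier in this thread might look roughly like the sketch below. Everything in it is an assumption made for illustration: the lookup table, the function name `displayCategoryName`, and the output format are hypothetical, and no such script is known to exist on Wiktionary.

```javascript
// Hypothetical lookup table; a real script would need the full list of codes.
const LANGUAGE_NAMES = new Map([
  ["de", "German"],
  ["en", "English"],
  ["fr", "French"],
]);

// Turns a code-prefixed category name into a plain-English display name.
// Names without a known code prefix are returned unchanged.
function displayCategoryName(category) {
  const match = /^([a-z-]+):(.+)$/.exec(category);
  if (!match || !LANGUAGE_NAMES.has(match[1])) {
    return category;
  }
  return `${LANGUAGE_NAMES.get(match[1])}: ${match[2]}`;
}

console.log(displayCategoryName("de:Mountains"));     // "German: Mountains"
console.log(displayCategoryName("English surnames")); // unchanged
```

A substitution like this changes only what readers see. It does not settle whether "Mountains" and "Names of mountains" should be separate categories.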

For the record, these are the current entries of Category:Planets: carbon planet, double planet, Earth, exoplanet, exosolar planet, extrasolar planet, gas giant, giant planet, Herschel, hot Jupiter, ice giant, inner planet, Jupiter, Le Verrier, major planet, Mars, Mercury, mesoplanet, minor planet, Neptune, outer planet, planet, Planet X, protoplanet, Saturn, silicate planet, sub-brown dwarf, superplanet, Teegeeack, terrestrial planet, Uranus, Venus and Vulcan.

And these are the ones of Category:Gods: Allah, Amaterasu, Discordia, Flying Spaghetti Monster, FSM, Galaxia, goddess, Haumea, Huitzilopochtli, Invisible Pink Unicorn, IPU, Izanagi, Izanami, Jah, Jehovah, Keb, Makemake, momentary god, momentary gods, Nike, Pele, Tezcatlipoca, Tyr, Wenis and Yahweh. --Daniel 05:55, 3 July 2011 (UTC)[reply]

For the sake of brevity in titles, what we might need is the bi- or trifurcation of the Category: namespace. Topic:Mountains in English could allow any terms related to mountains, Names:Mountains in English just the proper nouns, leaving Category:English idioms and the like. DAVilla 18:28, 4 July 2011 (UTC)[reply]

In this case, Everest would be a member of both "Topic:Mountains in English" and "Name:Mountains in English"? In my opinion, "Category:English names of mountains" and possibly "Category:English terms relating to mountains" are better, even simpler. --Daniel 18:04, 7 July 2011 (UTC)[reply]

100 is a number

I propose using the POS header "Number" (and not "Numeral", "Cardinal number", "Cardinal numeral" or "Symbol") for all definitions that meet these requirements:

  • Is Translingual.
  • Is defined as a number.
  • Is written with the digits 0, 1, 2, 3, 4, 5, 6, 7, 8 and/or 9.

Examples of affected entries: 1, 8, 33, 100, 101 and 420. Other standardizations of numbers can come some other day. Today I want to standardize these. --Daniel 01:15, 2 July 2011 (UTC)[reply]
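Only the third requirement can be checked from the page title alone. A minimal sketch of that check follows; the function name `isDecimalDigitTitle` is hypothetical, and whether an entry is Translingual and defined as a number would still have to be read from the entry itself.

```javascript
// Hypothetical check: is the title written only with the digits 0-9?
function isDecimalDigitTitle(title) {
  return /^[0-9]+$/.test(title);
}

console.log(isDecimalDigitTitle("100"));  // true
console.log(isDecimalDigitTitle("four")); // false
```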

If we standardise this to be called "Number", then the category for words would automatically become "Numeral" unless we agree to put them both together. —CodeCat 10:33, 2 July 2011 (UTC)[reply]
I oppose this piecemeal approach to treatment of number words. We had discussions and votes about "number" vs "numeral" that were done in a wrong order. Instead of asking the questions in the right order, you just put forward one question without explaining the implications for the overall treatment of number words. Like, if your proposal is accepted, an implication of the proposal is that "four" will get the part of speech "Number" rather than "Numeral", "Cardinal number", "Cardinal numeral", or "Adjective". If "four" should have the part of speech "Number", that should be decided explicitly rather than by first drawing attention only to entries for sequences of decimal digits.

On a process note, please wait at least a week before you start acting on this Beer parlour discussion. --Dan Polansky 10:44, 2 July 2011 (UTC)[reply]

About your last sentence: Yes, I typically wait a week, sometimes much more, to start acting on a BP discussion; as an exception, I don't see creating votes as acting soon per se, so sometimes I create votes quicker than that.
About the proposal: I did not propose any change to the entries one, two, three, four, etc. I just want to edit 1, 2, 3, 4, etc. I think that is clear. Other languages may have different approaches.
If you want some reasonings, here they are. These are my points of view, of course, so feel free to prove me wrong:
  • "Numeral" very often implies "single numeric symbol" or "single digit". "Number" is better than "Numeral" for being more generic, since we have numbers with two or more digits as well.
  • The POS header "Cardinal number" does not fit all the number senses of 1. The definition "A digit in decimal and every other base numbering system, including binary, octal, and hexadecimal." is not of a cardinal number, and I would like to keep all number senses of that entry together within one comprehensive enough POS header.
--Daniel 12:23, 2 July 2011 (UTC)[reply]
I think there's an argument that if we don't use Transitive verb and Intransitive verb as headers, then simply 'Number' is sufficient for cardinal and ordinal numbers too. Mglovesfun (talk) 12:38, 2 July 2011 (UTC)[reply]
But cardinal and ordinal numbers often differ in their inflection and part of speech. Ordinal numbers are usually adjective-like while cardinal numbers are more often uninflected and behave as a class of their own. They also behave differently in a sentence, because cardinals can generally not be used predicatively while ordinals usually can. Transitive and intransitive verbs are much more alike, they only differ in whether they can have an object. —CodeCat 12:51, 2 July 2011 (UTC)[reply]
"Numbers" are by definition a made-up lexical category, comprised of all kinds of words relating to quantity. We should simply follow the respective language's grammar tradition into what classifies as a number/numeral, and what's just simply an adjective/noun specifying order, distribution, multiplication/partialness etc. because it is impossible to devise a policy that would apply to all. AFAICS, the main issue here is terminological, which of the two terms (number/numeral) should be used and where (separate header or a label). --Ivan Štambuk 16:27, 2 July 2011 (UTC)[reply]
I agree with Ivan: "We should simply follow the respective language's grammar tradition into what classifies as a number/numeral, and what's just simply an adjective/noun specifying order, distribution, multiplication/partialness etc. because it is impossible to devise a policy that would apply to all."
I think we don't need to discuss about all languages together anymore, if we can agree on analyzing "Number/Numeral/Adjective/etc." headers on a case-by-case basis. I opened a discussion about certain Translingual terms, and that's them I want to edit. --Daniel 16:40, 2 July 2011 (UTC)[reply]
I'd put number/numeral alongside abbreviation, acronym, initialism, phrase (etc.) as headers to be avoided wherever possible. For example NATO should have a proper noun header, not an acronym header. Similarly cinq in French is a masculine invariable noun. So noun is the correct header, but Category:French cardinal numbers is a correct category too, just like woof can be a noun, a verb and an onomatopoeia. Mglovesfun (talk) 12:09, 4 July 2011 (UTC)[reply]
Abbreviation, Acronym and Initialism should be avoided for English and other languages, because these headers don't accurately display the grammatical, morphological and syntactical characteristics of something that would rather fit a Noun, Verb, Proper noun or other header.
I already explained above why "Number" is the best option for certain Translingual numbers. As of now, the only argument opposing my conclusion would be the implication that words of other languages, such as the English four, would have that header as well. However, this argument subsequently has been disproved: 4 can have a Number header while four can have other headers. --Daniel 00:59, 5 July 2011 (UTC)[reply]

A week passed. The proposal passed, too. --Daniel 14:52, 9 July 2011 (UTC)[reply]

Call for image filter referendum

The Wikimedia Foundation, at the direction of the Board of Trustees, will be holding a vote to determine whether members of the community support the creation and usage of an opt-in personal image filter, which would allow readers to voluntarily screen particular types of images strictly for their own account.

Further details and educational materials will be available shortly. The referendum is scheduled for 12-27 August, 2011, and will be conducted on servers hosted by a neutral third party. Referendum details, officials, voting requirements, and supporting materials will be posted at m:Image filter referendum shortly.

For the coordinating committee,
Philippe (WMF)
Cbrown1023
Risker
Mardetanha
PeterSymonds
Robert Harris

— This unsigned comment was added by EdwardsBot (talk • contribs) at 06:00, 3 July 2011.

Reminder

Hi. As the dates are approaching for this, I just wanted to drop a reminder and let you know that more information is now available at m:Image filter referendum/en and m:Image filter referendum/FAQ/en. Since images are not in heavy local use, I realize this may not seem directly relevant to your project, but it is a cross-Wiki issue intended more for readers than editors, and your input could be very valuable! Thanks. :) --Mdennis (WMF) 13:56, 4 August 2011 (UTC)[reply]

CT: → Citations:

I propose implementing "CT:" as an alias for the namespace "Citations:"

As a result, CT:Egyptic would automatically be a shortcut to Citations:Egyptic, CT:hydrogen to Citations:hydrogen, and so on. --Daniel 06:31, 3 July 2011 (UTC)[reply]

Yes, why not. Mglovesfun (talk) 10:06, 3 July 2011 (UTC)[reply]
I use citations a lot. I might be in favor of this. Still, it doesn't seem entirely essential, and there's some ambiguity with categories. DAVilla 16:50, 4 July 2011 (UTC)[reply]
Would you prefer a shortcut "CI:"? (CI:water → Citations:water). --Daniel 23:29, 4 July 2011 (UTC)[reply]
No, and if there isn't a good abbreviation to use then I'd just as well not have one. Speaking of which, why do we even have a shortcut for Wikisaurus? I don't remember a discussion on it. The vote that created Wikisaurus, by the way, had overwhelmingly suggested Thesaurus: as the name. DAVilla 16:17, 5 July 2011 (UTC)[reply]
Re WS shortcut: WT:Votes/2009-12/WT: redirect to Wiktionary:, WS: redirect to Wikisaurus:. --Yair rand 16:43, 5 July 2011 (UTC)[reply]
Cool. 100% approval, no less. --Daniel 16:52, 5 July 2011 (UTC)[reply]
Huh! Missed that one. DAVilla 04:35, 13 July 2011 (UTC)[reply]

I created Wiktionary:Votes/2011-07/CT: → Citations:. --Daniel 02:47, 4 July 2011 (UTC)[reply]

This doesn't seem useful to me. The WT alias is useful, because we have short names for various pages in the project namespace; [[Wiktionary:CFI]] would not be so useful a shortcut. But with the citations namespace, we're not going to shorten the pagename, so what's the point of shortening the namespace-name? Personally, I'd rather we didn't bug developers to do things that we don't actually benefit from. —RuakhTALK 17:08, 4 July 2011 (UTC)[reply]
We have Wikisaurus:woman and WS:woman as a precedent. I like and use the shortcuts, so I guess "the point" is catering at least to my personal taste, unless more people would like them and use them too. Naturally, with the discussion and the vote I created, I expect to know more opinions. --Daniel 23:23, 4 July 2011 (UTC)[reply]
For your own use, you could just use $(function(){$("#searchInput").keyup(function(){var q=/^ct:/i,w=this,e=w.value;if(q.exec(e)){w.value=e.replace(q,'Citations:')}})}); in your personal JS. --Yair rand 00:55, 5 July 2011 (UTC)[reply]
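Unpacked for readability, the one-liner above does roughly the following. This is a sketch rather than a drop-in replacement: it uses plain DOM calls instead of jQuery, the function name `expandCitationsShortcut` is made up here, and it assumes the script runs after a search box with the id "searchInput" already exists on the page.

```javascript
// Pure function: expands a leading "ct:" (in any case) to "Citations:".
function expandCitationsShortcut(text) {
  return text.replace(/^ct:/i, "Citations:");
}

// Browser-only wiring; skipped when no document is available.
if (typeof document !== "undefined") {
  const searchBox = document.getElementById("searchInput");
  if (searchBox) {
    // Rewrite the prefix as the user types, as the one-liner does.
    searchBox.addEventListener("keyup", () => {
      searchBox.value = expandCitationsShortcut(searchBox.value);
    });
  }
}

console.log(expandCitationsShortcut("CT:hydrogen")); // "Citations:hydrogen"
```

Because the rewrite happens only in one user's browser, it gives that user the shortcut without requiring a site-wide namespace alias.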

Pinyin entries (do we want them and how should they look?)

In light of the recent edits by the strong-headed Engirst, I think it'd be useful to clarify our policy on pinyin entries.

Do we want them? First of all, I don't think it's possible to attest (more than a few) pinyin entries. Assuming this is true, and going only by WT:CFI, they should all be deleted. So I don't think attestability is a useful criterion for inclusion for pinyin entries.

Instead, I think they can serve:

  1. as an index (the page "lǐcài" works like "Mandarin words pronounced lǐcài")
  2. as a help for learners/users

So, assuming we want to keep them, what should they look like? I think they should include links to the character entries (the "real" entries), and only that. I.e. no long, multiline definitions, citations/examples, etymology, pronunciation etc. Only links to entries, with a short translation. Much like the policy for romaji for Japanese: WT:AJ#Romaji_entries.

(About Chinese written in pinyin: you'll find almost nothing but a few bibles, some middle school "China is a glorious country" readers and some "Why Chinese could and should be written with pinyin" articles.) Vaste 09:22, 3 July 2011 (UTC)[reply]

It's the pinyin version of a map. That work would be part of what I called middle school readers. (Or maybe it's for foreigners?)
Can you find citations for common words such as wúliáo or (dài) lǜmàozi? How about politically sensitive words such as liùsìshìjiàn? Vaste 00:23, 4 July 2011 (UTC)[reply]
The Pinyin Atlas is usually used in business organizations. Engirst 01:22, 4 July 2011 (UTC)[reply]

(moved unrelated post)

If it is possible to attest a few pinyin entries, then we should have those few. As a dictionary, we would not be doing our job if we kicked out attestable terms merely because other terms in the same class were unattestable. Keep what is attestable and throw out what is not. bd2412 T 02:06, 4 July 2011 (UTC)[reply]

I disagree. Any pinyin entry is fully described by its character equivalent(s). After all, they describe the same word. Furthermore, these character versions will (probably) always be more mature, detailed and correct. I say, let's not stop at adding a few hundred or thousand attestable pinyin entries, let's add tens of thousands or hundreds of thousands non-attestable pinyin entries that are simply links to the character versions. This would be much more useful for the typical user, easier to maintain, and also consistent with what's done in e.g. Japanese. It would also better reflect how modern Chinese is actually used.
It would be easier to maintain because there is less duplication of effort, and fewer entries to attempt to keep in sync. E.g., part of what belongs in would also be in jiào and jiāo. If is updated/corrected, then jiào and jiāo would be out of date, and possibly wrong/misleading. Then if a user goes to jiào, and since there is already a definition right there, he/she might not see the updated entry in . Vaste 04:28, 4 July 2011 (UTC)[reply]
In that case, are you willing to either attest or delete the hundreds of pinyin entries User:123abc/Engirst has created over the past year? Note that this user has evaded blocking about a dozen times by changing his IP continuously. ---> Tooironic 04:19, 4 July 2011 (UTC)[reply]
If we created all remaining pinyin entries with a bot, he no longer would have anything to do and he would probably go away on his own. :p —CodeCat 09:31, 4 July 2011 (UTC)[reply]
Actually, though he's being a total ass about it, some of what he does is useful. I just wish he would stop caring so much about pinyin entries and put all that energy into improving our (often quite lacking) Chinese entries instead. Also, I'm a bit worried that he might just be copying copyrighted definitions (maybe from that Wenlin software he cites all over the place?). Vaste 09:39, 4 July 2011 (UTC)[reply]
"though he's being a total ass about it, some of what he does is useful" I couldn't put it better myself, thank you! Mglovesfun (talk) 12:07, 4 July 2011 (UTC)[reply]
Firstly, Pinyin entries are in fact useful. The Wenlin Pinyin dictionary is a good example: its entries, sorted by Pinyin, are beneficial to users, especially learners of the Chinese language.
Secondly, the example sentences from Pinyin Bible are useful and good for references, and Bible is a well-known work. Engirst 15:56, 4 July 2011 (UTC)[reply]
There have been many "Bibles" throughout history; some are well-known works (the Hebrew Tanakh, the Septuagint, the Greek and Latin New Testaments, the Peshitta, the Authorized King James Version, etc.), while others are not. Maybe the specific Pinyin edition you cite is a well-known work, but I doubt it, and I do not trust you to judge. —RuakhTALK 21:44, 4 July 2011 (UTC)[reply]
It is the Authorized King James Version, and The King James Bible is Not Copyrighted. Engirst 22:15, 4 July 2011 (UTC)[reply]
Nonsense. The well-known, public-domain Authorized King James Version is in English. You're talking about some sort of Chinese Bible translation in Pinyin. —RuakhTALK 23:31, 4 July 2011 (UTC)[reply]
The example sentences are in Pinyin and English as well (Please see here). Engirst 00:13, 5 July 2011 (UTC)[reply]
Do you have a point, or are you just trying to waste people's time? —RuakhTALK 00:16, 5 July 2011 (UTC)[reply]
The English example sentences are from King James Version, and the Chinese example sentences are from Chinese Union Version. Both are royalty free and well-known. Engirst 00:28, 5 July 2011 (UTC)[reply]
Your example sentences are adapted from the Chinese Union Version by transposing the Traditional characters into pinyin. The result is not well-known. —RuakhTALK 00:45, 5 July 2011 (UTC)[reply]
Whether in English, Hanzi, or Pinyin, they are the Word of God (Please see here). Engirst 01:11, 5 July 2011 (UTC)[reply]
I'm sorry, but that is irrelevant. Wiktionary is not a soapbox for spreading the Word of G-d. —RuakhTALK 01:14, 5 July 2011 (UTC)[reply]
The example sentences are not for spreading the Word of God, but for learning Mandarin Chinese. Engirst 01:20, 5 July 2011 (UTC)[reply]
Surely the CUV is not a very good source for learning modern Mandarin Chinese? Just like the KJV is not very good for English. Both are quite dated, right?
Wikipedia says:

The vernacular Chinese language has changed a lot since 1919. Indeed, CUV’s language sounds stilted to modern readers. Furthermore, a lot of Chinese characters used in the CUV have fallen into disuse and cannot be found in commonly-available dictionaries today.

It seems that the original CUV is now in PD, but it's dated. There are also slightly modernized versions, but they would be copyrighted. Exactly what version does Engirst use? Vaste 01:43, 5 July 2011 (UTC)[reply]
I'm not saying the example sentences need to be removed, or anything like that; I'm just saying that they don't count toward attestation per WT:CFI. Above, you wrote, "Bible is a well-known work"; I thought you were trying to say that these example sentences satisfy the "well-known work" requirement in WT:CFI. Did I misunderstand your purpose? —RuakhTALK 02:48, 5 July 2011 (UTC)[reply]

About emoticons

I've added a "Punctuation mark" header to all entries of emoticons, because that's what they are.

I orphaned and deleted Category:Emoticons, in favor of Category:Translingual emoticons, because I could populate separate categories for Japanese emoticons and Korean emoticons. --Daniel 11:27, 3 July 2011 (UTC)[reply]

Emoticons are not punctuation marks; ";)" is not a punctuation mark. --Dan Polansky 08:10, 4 July 2011 (UTC)[reply]
Do you think ";)" is something else? What is that, if not a punctuation mark? --Daniel 08:32, 4 July 2011 (UTC)[reply]
It is not a punctuation mark because it does not serve to punctuate but to convey an emotion. Equinox 09:02, 4 July 2011 (UTC)[reply]
OK; I think that is reasonable, yet false. However, I'm not in the mood to defend the hypothesis of "emoticons as punctuation marks" against its simple negation. I'm just asking what emoticons are, if not punctuation marks. That would help. --Daniel 09:09, 4 July 2011 (UTC)[reply]
Symbols? Mglovesfun (talk) 09:05, 4 July 2011 (UTC)[reply]
Does anyone have a better answer? If they really are symbols rather than punctuation marks, I'd be happy to undo the change, adding a "Symbol" header to every sense of emoticon. --Daniel 09:10, 4 July 2011 (UTC)[reply]
Symbols, yes. —RuakhTALK 17:10, 4 July 2011 (UTC)[reply]

OK... Sometimes, I try to prove my points of view through long streaks of arguments, but this time I'll begin by just trying to disprove what you guys said up until now.

  • Emoticons are punctuation marks; they punctuate.
  • Emoticons are punctuation marks, punctuation marks are symbols, and emoticons are symbols. Both "Symbol" and "Punctuation mark" would be accurate headers. The latter is just more strict, thus better to my taste. (Alternatively, the header "Emoticon" would be very very strict and accurate, even natural, but too strict to my taste.)
  • "I'm happy." is an example of something that is written, not a punctuation mark and conveys an emotion. However, there are punctuation marks that convey emotions, notably "!", "??", "..." and scare quotes.

--Daniel 18:19, 5 July 2011 (UTC)[reply]

IMO, emoticons do not punctuate. The only reason they are typically placed between clauses or sentences is because placing them in mid-clause or mid-sentence would disrupt the reader's flow. They do not, themselves, indicate a specific kind of grammatical break as e.g. a comma does. Yes, you could separate two sentences with an emoticon in lieu of a full stop, but that will work with any visual break (e.g. a vertical line on a poster or greeting card); it does not have a punctuational meaning and you would not know whether it was meant to be a comma, full stop, semicolon, etc. except by working it out from existing knowledge of grammar. Equinox 19:04, 5 July 2011 (UTC)[reply]
I agree with Equinox that emoticons are not punctuation. The primary purpose of punctuation is to indicate a mixture of prosody or grammar (with different languages, writing systems, time-periods, and individual writers tending to put greater emphasis on one or on the other). Even something like the exclamation point, which tends to express surprise, has I think the primary purpose of marking the end of a sentence; the surprise is merely what distinguishes it from certain other punctuation marks that have the same primary purpose. (There's a spectrum of uses, of course: the exclamation point ranges from purely grammatical uses, as in "What a lovely home!", where a period would be incorrect, to purely expressive uses, as in "She ordered (!) him to [] ", where it's really acting exactly like an emoticon. So I don't think there's a bright-line test. But so far, emoticons are really only in the purely-expressive-uses part of the spectrum, so I wouldn't consider them punctuation.) —RuakhTALK 19:52, 5 July 2011 (UTC)[reply]
Maybe we should just use ===Emoticon=== as the header? They don't really seem to resemble anything else in usage, so comparing them seems a bit... fruitless. —CodeCat 19:58, 5 July 2011 (UTC)[reply]
My personal preference is "symbol", because (i) I doubt anyone would dispute that they are symbols (while the idea that they are punctuation is very dubious; see above), and (ii) "emoticon" seems to be getting a bit specific, as with the "mathematical symbol" (or whatever it was) that Daniel proposed. Headers are ultimately supposed to indicate the part of speech, not the category, and we don't want to turn into some kind of character-centric Unicode consortium. Of course we can use the "emoticon" gloss and category, just as we do with "math", "typography", etc. Equinox 20:00, 5 July 2011 (UTC)[reply]
If we are going to treat emoticons as parts of speech, I can't really think of anything beyond 'phrase' or 'interjection' that fits. —CodeCat 20:08, 5 July 2011 (UTC)[reply]

I created Wiktionary:Votes/2011-07/External links, which stems from old discussions. --Daniel 14:03, 3 July 2011 (UTC)[reply]

WT:CFI question

What do we really mean by "Usage in permanently recorded media"? User:Engirst is trying to argue that this site is durably archived because it claims to have CD versions of its Bible content available - even though all that I can find on the webpage is a collection of mp3s. Surely this is not permanently recorded media, nor durably archived. See also our discussion User_talk:Engirst#copyrighted_material. ---> Tooironic 00:31, 4 July 2011 (UTC)[reply]

"Usage in permanently recorded media" essentially is just Google Books and Usenet. --Daniel 02:50, 4 July 2011 (UTC)[reply]
Note: Last year, I asked "What are the durably archived sources?" and got some replies. --Daniel 09:02, 4 July 2011 (UTC)[reply]
What about the Internet Archive? Aren't internet pages that have been archived there also permanent? Matthias Buchmeier 12:08, 4 July 2011 (UTC)[reply]
No, they are not permanent. The archive is subject to the whim of the copyright holders. Per the Terms of Use, "if the author or publisher of some part of the Archive does not want his or her work in our Collections, then we may remove that portion of the Collections without notice." DAVilla 17:24, 4 July 2011 (UTC)[reply]
The King James Bible is Not Copyrighted Engirst 19:40, 4 July 2011 (UTC)[reply]
"Wordproject is an open, royalty free web page, online and on CD, which aims to make the Word of God - the Bible - available to as many people as possible, through a means that is simple, up-to-date and cheap to reproduce and use." CD is durably archived as well. Please see here. Engirst 12:17, 4 July 2011 (UTC)[reply]
Show me where the CDs are. All I can see is a series of mp3s. In any case it's not a legitimate publication. If it were, that would mean that anyone could just create a site with some mp3s and call it a durably archived source! ---> Tooironic 22:16, 4 July 2011 (UTC)[reply]
Note that we consider movies "durably archived" as well. There have been cases where movie quotes were enough to verify an entry in terms of CFI. -- Prince Kassad 13:30, 4 July 2011 (UTC)[reply]
How do spoken quotes work if there is no known written representation? —CodeCat 14:54, 4 July 2011 (UTC)[reply]
How do written quotes work if there is no known oral pronunciation? DAVilla 17:20, 4 July 2011 (UTC)[reply]
I agree that spoken word only doesn't work, we are a written dictionary only (apart from audio files). Mglovesfun (talk) 17:25, 4 July 2011 (UTC)[reply]
I understand if there's some inherent ambiguity in trying to spell something oral, but who ever said we're a written dictionary only? DAVilla 16:29, 5 July 2011 (UTC)[reply]
I would rule it as durably archived and therefore citable, definitely quotable even if not. DAVilla 17:20, 4 July 2011 (UTC)[reply]
Not all CDs are durably archived, since you can burn something to CD without archiving it durably. (Similarly, not all books are durably archived, since you can write something in your personal diary without archiving it durably. Regardless of the medium, common sense is required.) —RuakhTALK 17:26, 4 July 2011 (UTC)[reply]
I don't remember anyone calling something from Google Books not durably archived. In my experience, people seem to think that everything that comes from that website is durably archived by definition, because, when an entry is attested through it, nobody questions it. I'm ready to be proven wrong, if I am. --Daniel 22:53, 4 July 2011 (UTC)[reply]
Well, I meant physical books — but no, not everything from Google Books is durably archived, either. Some of the "books" on there are print-on-demand, and there's not even any guarantee that any hard-copies exist, let alone archived anywhere durable. (I also mentioned this in the discussion you linked to above.) That said, Google Books always indicates where the book came from (e.g., if they got it from a certain library), and in cases where it was supplied directly to them by a publisher in digital form, you can usually tell from the editing quality whether it was really published or not. As long as we're aware that presence on Google Books is not necessarily sufficient, we can generally apply common sense in individual cases. —RuakhTALK 23:29, 4 July 2011 (UTC)[reply]
So physical form is sufficient if it's a mass produced book, but not clear if it's a mass produced CD? I interpret the scenario as the latter, but now I'm wondering if that assumption is incorrect. DAVilla 16:34, 5 July 2011 (UTC)[reply]
One more Pinyin Bible with CD for your reference. Engirst 21:59, 5 July 2011 (UTC)[reply]

Control characters

I created this as a simple entry for a control character. Feel free to improve the idea of defining control characters somehow. Perhaps they should be in appendices instead; I don't know. --Daniel 17:50, 5 July 2011 (UTC)[reply]

First bad result: The new entry appears among the recent changes, but can't be clicked on from there. --Daniel 17:59, 5 July 2011 (UTC)[reply]
I don't think adding control characters is a good idea. But we could add their names instead, like NUL or STX. —CodeCat 21:24, 5 July 2011 (UTC)[reply]
Names of control characters would include ^G and COMBINING GRAPHEME JOINER. An appendix might list them all, and their Unicode codepoints. --Daniel 21:52, 5 July 2011 (UTC)[reply]

Please help clean up the topical categories!

Since the vote that created subcategories for English topical categories, a lot of entries have been left behind, still in the 'main' category. The main categories should now be empty, but there are many that aren't yet. So I'd like to ask everyone who can and is willing to help fix this, by adding the prefix en: to those categories in each entry. There is now a list of topics, which shows how many entries each category still has. Once they all show no entries, we can be satisfied. :) —CodeCat 19:41, 5 July 2011 (UTC)[reply]

Is this what we want? If yes, I can contribute with my bot. --flyax 21:09, 5 July 2011 (UTC)[reply]
Yes, but you have to be very careful not to add it to categories where it shouldn't be added. Category:British English and Category:German verbs should stay as they are, for example. —CodeCat 21:23, 5 July 2011 (UTC)[reply]
OK. With this regex (User:Flubot/en to topical categories) there won't be any problem I guess. --flyax 08:47, 6 July 2011 (UTC)[reply]
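A minimal Python sketch of the kind of rewrite discussed in this thread. It is not flyax's actual regex (that lives on the linked bot page and is not reproduced here). The `TOPICS` allow-list and the function name `add_en_prefix` are hypothetical; the point is only that the prefix is added to known topical categories, while names such as "British English" or "German verbs" are left alone.

```python
import re

# Hypothetical allow-list of topical category names (assumption: the real
# list would come from the on-wiki list of topics mentioned above).
TOPICS = {"Animals", "Chemistry", "Music"}

# Matches [[Category:Name]] or [[Category:Name|sort key]].
CATEGORY_LINK = re.compile(r"\[\[Category:([^\]|]+)(\|[^\]]*)?\]\]")


def add_en_prefix(wikitext: str) -> str:
    """Prefix 'en:' to category links whose name is a known topic."""

    def fix(match: re.Match) -> str:
        name = match.group(1)
        sort_key = match.group(2) or ""
        if name in TOPICS:
            return f"[[Category:en:{name}{sort_key}]]"
        # Not a topical category (or already prefixed): leave untouched.
        return match.group(0)

    return CATEGORY_LINK.sub(fix, wikitext)
```

Using an explicit allow-list rather than a pattern on the name is the conservative choice here: a category that is not listed is never changed, so a mistake shows up as a missed entry rather than a wrongly renamed one.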

User:123abc's sockpuppets User:Ddpy and User:Engirst

I propose that we delete all pinyin entries created by both these user accounts since the vast majority of these hundreds of entries cannot be attested. ---> Tooironic 00:32, 6 July 2011 (UTC)[reply]

Does "cannot be attested" mean "not a real word"? I would think we'd have trouble attesting almost all pinyin entries, since the large majority of Chinese literature prefers to use characters instead. Tempodivalse [talk] 01:05, 6 July 2011 (UTC)[reply]
"Cannot be attested" means "not a real word yet as far as Wiktionary is concerned". - [The]DaveRoss 01:11, 6 July 2011 (UTC)[reply]
Even if anyone with a solid grasp of the language would be able to tell you that the pinyin is an accurate transliteration of the word? I see some trouble down the road if we decide to be extremely stringent about "attestation". For instance, many obscurely inflected Russian, Esperanto, and (especially) Latin words we have might not be fully "attestable", even though a fluent speaker of the language will tell you it is a perfectly valid word. (That's probably something to be discussed in another topic, however, and I don't want to distract from the initial purpose of this thread.) Tempodivalse [talk] 01:20, 6 July 2011 (UTC)[reply]
Tempodivalse, all valid Russian words can be attested, including slang. Russian is not a purely spoken language but a well-described and widely used one. I have yet to find a Russian word that doesn't exist on the internet or in dictionaries, except perhaps standard transliterations of foreign concepts that are rarely discussed by Russians. (end of distraction)
As for pinyin entries, although there are rules about spacing, capitalisation, spelling of erhua, even the tone marks and tone numbers and absence of them, pinyin is only used in learning materials, dictionaries, books for students or when hanzi can't be entered for technical reasons. Same can be said about bopomofo - it's a tool, not the proper script. Almost invariably, pinyin follows the proper hanzi (simplified or traditional) text to help with pronunciation, the primary script for Mandarin Chinese, pinyin on its own is otherwise useless. Pinyin alone is used by people who have the agenda to convince people that Mandarin can be written in pinyin, like Pinyininfo web-site and our ill-famed User:123abc and his various incarnations. I second deletion of his entries, even if they don't break rules. Keeping pinyin entries in sync with hanzi entries is made impossible due to his utter lack of cooperation with other Wiktionarians. --Anatoli 04:08, 6 July 2011 (UTC)[reply]
I agree. As it is now, he does little/nothing to help us improve Wiktionary in a way that *I* (we?) care about. Vaste 04:25, 6 July 2011 (UTC)[reply]
I doubt all valid Russian inflections can be attested, but that's no reason to exclude them. If words written using pinyin are not counted as "real words", then all pinyin should be excluded regardless of attestation. If that's not the case, then I don't see why they should be deleted. Considering words of different writing systems to require attestation separately doesn't make sense to me. --Yair rand 04:27, 6 July 2011 (UTC)[reply]

Pinyin entries

Could we keep only the pinyin section of the entries? This way we handle it like romaji in Japanese instead. I.e. a list with short definitions and links to entries (in characters). See example for shū:

Pinyin

shū (with tone numbers: shu1)

  1. 书, 書: book, letter, document; writings
  2. 叔: father's younger brother
  3. 梳: comb; brush
  4. 舒: open up, unfold, stretch out; comfortable, easy
  5. 疏: neglect; careless, lax
  6. 输, 輸: transport, carry, haul

Useful, to the point, and zero problems with attestation. Vaste 04:25, 6 July 2011 (UTC)[reply]
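The example gives both a tone-marked form (shū) and a tone-numbered form (shu1). As an illustration only, here is a small Python sketch that turns a single tone-numbered syllable into its tone-marked form. The function name is hypothetical, ü must be typed as "ü" (not "v"), and the placement rule used is the common convention: the mark goes on a or e if present, on the o of "ou", and otherwise on the last vowel.

```python
import unicodedata

# Combining marks for tones 1-4: macron, acute, caron, grave.
TONE_COMBINING = {1: "\u0304", 2: "\u0301", 3: "\u030c", 4: "\u0300"}


def numbered_to_marked(syllable: str) -> str:
    """Convert one syllable such as 'shu1' to 'shū'."""
    if not syllable or not syllable[-1].isdigit():
        return syllable  # no tone number: return unchanged
    base, tone = syllable[:-1], int(syllable[-1])
    mark = TONE_COMBINING.get(tone)
    if mark is None:
        return base  # neutral tone (0 or 5): no mark
    if "a" in base:
        idx = base.index("a")
    elif "e" in base:
        idx = base.index("e")
    elif "ou" in base:
        idx = base.index("o")
    else:
        vowels = [i for i, ch in enumerate(base) if ch in "aeiouü"]
        if not vowels:
            return base
        idx = vowels[-1]
    # Insert the combining mark after the chosen vowel, then compose.
    marked = base[: idx + 1] + mark + base[idx + 1 :]
    return unicodedata.normalize("NFC", marked)
```

For instance, `numbered_to_marked("shu1")` gives the headword form used in the example entry.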

It would be perfect if the entries were maintained the way you suggested but we have a case where a wayward editor does with pinyin entries what he thinks is right, not what is being discussed and agreed on. Do you see the difference? --Anatoli 05:42, 6 July 2011 (UTC)[reply]
I think we could clarify the guidelines we have. Right now, about pinyin entries it says:

Pinyin entries: The entire simplified phrase and the entire traditional phrase should be hyperlinked to allow for easy navigation to the simplified and traditional entries (which often contain additional information that is lacking in the Pinyin entry).

This to me implies that pinyin entries such as the ones Engirst is creating are perfectly okay (if they fulfill WT:CFI). (I.e. "additional information" doesn't have to be "lacking" in the Pinyin entry.) I would like this changed to limit the scope of pinyin entries to something like the example above. Do you agree? Vaste 07:07, 6 July 2011 (UTC)[reply]
Yes, I agree. The scope of pinyin entries should be limited. All examples, etymology, usage notes, etc. should go into the main entry but pinyin entries should list possible hanzi (simp. and trad.) that could have the given reading. Engirst (being so keen to write pinyin) could fix some hanzi entries where pinyin is missing. --Anatoli 07:46, 6 July 2011 (UTC)[reply]
I just noticed that WT:About Sinitic languages also says this:

Headwords that are romanizations point to both the traditional and simplified forms, but do not duplicate all entries with that pronunciation, instead having a “Pinyin” L3 heading (likewise for other romanizations), linking to characters with that reading; see ài.

(Note: when added the article looked like this: ài)
I've always felt that this ultimately should depend on a technical solution. It was attempted several years back, it was called WiktionaryZ, but it ended up going nowhere. In a nutshell, I should be able to fill out a single form for any word or phrase, including all orthographies and meanings on that form, and be able to search for that term using any of the orthographies that were entered. We are so far from that at Wiktionary, it's not even funny. It is absolutely insane for me to be creating duplicate entries for each and every word (simplified and traditional). But, that's currently the only way available to me if I want to ensure a consistent user experience. Now, we're debating the wisdom of creating a third version (the pinyin). There should be a button on the screen that I can press to toggle between traditional, simplified and pinyin. If you have an iPhone, iPod Touch or iPad, check out Pleco. Now that's what we should be working toward here. -- A-cai 00:05, 7 July 2011 (UTC)[reply]
P.S. I stand corrected. WiktionaryZ eventually became OmegaWiki. -- A-cai 00:09, 7 July 2011 (UTC)[reply]
So what would we need in order to simplify the process? Some kind of automatic synchronization system tied to whenever anyone clicks the save button? That wouldn't be all that hard to produce with javascript, but with pinyin it appears that it's not one-to-one correspondence between entries. --Yair rand 07:58, 7 July 2011 (UTC)[reply]
One solution to keep the information in one place only could be: create the "real" entry as a subpage somewhere (e.g. of the traditional entry), and then include that as a template in both the traditional and the simplified page. A special link to edit the subpage could be included. Quotations etc (given in trad and/or simp) could then be conditionally included, depending on if it's the simplified or traditional page importing it. Perhaps a bit complicated though.
It isn't one-to-one with trad/simp either. trad -> simp is almost many-to-one (the only exception I know of is trad 著 matching simp 着 and 著). Vaste 02:55, 8 July 2011 (UTC)[reply]

The one-to-one problem is a legitimate hurdle. There are a number of ways one could tackle this. Ideally, the user should only have to enter the term in one orthography. The computer should do the rest. However, it is not a straightforward process. In a nutshell, the computer needs access to two key/value lists (or one big list):

  1. simplified/traditional
  2. Pinyin/character

Because it is not a one-to-one correlation, a partial list of key/value pairs would look like:

  • 书 書
  • 字 字
  • 云 云雲
  • 发 發髮
  • ken3 肯啃恳垦懇墾
  • ken4 掯

In cases where there is no one-to-one correlation, the computer would need to prompt the user for the correct choice. A similar technique is used by most modern Pinyin input method editors.

Anyway, the first step is to create the key/value lists from some source on the Internet (CEDICT might be an option as a source). Next, a process would need to be put in place whereby the lists are made available to some kind of JavaScript or Python call etc. The rest of the process would involve a bot auto-creating the missing components from the seed entry. Please let me know if you're having trouble seeing where I'm going with this. Thanks. -- A-cai 23:19, 7 July 2011 (UTC)[reply]
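The key/value lookup described above can be sketched in a few lines of Python. This is an illustration, not an existing tool: the two tables contain only the sample pairs listed above (a real table would have to be built from a source such as CEDICT, which is not done here), and the names `candidates`, `needs_prompt`, `convert` and the `choose` callback are hypothetical. The callback stands in for "prompt the user for the correct choice".

```python
# Sample data copied from the list above. Each value is a string of all
# candidate characters for the key.
SIMP_TO_TRAD = {"书": "書", "字": "字", "云": "云雲", "发": "發髮"}
PINYIN_TO_CHARS = {"ken3": "肯啃恳垦懇墾", "ken4": "掯"}


def candidates(table, key):
    """Return the list of possible values for key (empty if unknown)."""
    return list(table.get(key, ""))


def needs_prompt(table, key):
    """True when the mapping is not one-to-one and a human must choose."""
    return len(candidates(table, key)) > 1


def convert(text, table, choose):
    """Convert text character by character.

    choose(char, options) is called only for ambiguous characters and
    must return one of the options.
    """
    out = []
    for ch in text:
        options = candidates(table, ch)
        if not options:
            out.append(ch)          # unknown character: keep as-is
        elif len(options) == 1:
            out.append(options[0])  # one-to-one: no human needed
        else:
            out.append(choose(ch, options))
    return "".join(out)
```

A bot could run `convert` with a `choose` function that records the ambiguous cases for later review instead of asking interactively; either way, only the entries where `needs_prompt` is true require human attention.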

Modern Pinyin input method editors work on the assumption that their users know characters; but I don't think we can assume that an editor entering a word in Simplified characters knows which of the corresponding Traditional characters is correct. Can we? —RuakhTALK 00:42, 8 July 2011 (UTC)[reply]
Correct, you would have to know which one to choose. I don't see how else it could be done. Automation only gets you so far. Language expertise would have to carry you to the finish line, I think. BTW, we have the same problem with static entries. At the end of the day, we need language experts to do the heavy lifting. -- A-cai 01:33, 8 July 2011 (UTC)[reply]
I'd say we cannot assume that. Though perhaps not the best example, a simplified editor might know that "里头" -> "裡頭" (or "裏頭"), and assume that "千里" -> "千裡" (which is wrong, should remain "千里") or "乡里" -> "鄉裡" (this is typically "鄉里").
The most serious attempt to tackle this problem that I know of can be seen at Chinese Wikipedia. Vaste 02:54, 8 July 2011 (UTC)[reply]
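The 里 examples show why a character-by-character table is not enough. One common workaround, sketched below in Python, is to check a word-level table first (longest match) and fall back to the character table only when no word matches. This is an assumption-laden illustration, not a description of how Chinese Wikipedia's converter works; both tables hold only the examples from this thread, and the function name is hypothetical.

```python
# Naive per-character defaults (illustrative only).
CHAR_TABLE = {"里": "裡", "头": "頭", "乡": "鄉"}

# Word-level exceptions taken from the examples above.
WORD_TABLE = {"千里": "千里", "乡里": "鄉里"}


def to_traditional(text: str) -> str:
    """Convert simplified text, preferring whole-word matches."""
    max_len = max(len(word) for word in WORD_TABLE)
    out = []
    i = 0
    while i < len(text):
        # Try the longest possible word first, down to two characters.
        for n in range(max_len, 1, -1):
            chunk = text[i : i + n]
            if chunk in WORD_TABLE:
                out.append(WORD_TABLE[chunk])
                i += n
                break
        else:
            # No word matched: fall back to the single-character table.
            out.append(CHAR_TABLE.get(text[i], text[i]))
            i += 1
    return "".join(out)
```

With these tables, "里头" is converted through the character table, while "千里" and "乡里" are caught by the word table and keep 里.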
I think you're right about Chinese Wikipedia. The only one that doesn't seem to require any human intervention whatsoever is when you are going from traditional to simplified. However, even there, you still have the problem of multiple pinyin readings (ex. 好 hao3,hao4). On the other hand, if an interface could be developed whereby the user is prompted to select the correct reading or character in such cases, it would at least reduce the amount of work that would need to be done by a human language expert (although it probably wouldn't eliminate it). In any case, the process that I described above is rather basic compared to what they do at Chinese Wikipedia, so I'm pretty sure that the process of conversion can be optimized over time, thereby minimizing the number of choices that would need to be made by humans. Furthermore, you do actually get one-to-one correlations most of the time, so for the vast majority of entries, the computer could do the whole thing without breaking a sweat. -- A-cai 10:01, 8 July 2011 (UTC)[reply]

"yuan is a nonstandard spelling of yuán"?

Previous vote: Wiktionary:Votes/pl-2009-12/Treatment of toneless pinyin syllables

It is not a fact. yuan is written in Renminbi, it is a standard spelling. It is a toneless spelling of yuán, but not a nonstandard spelling of yuán. Please see the banknote of Renminbi for your reference. Engirst 09:54, 6 July 2011 (UTC)[reply]

You mean how it says "20 yuan" on the banknote, right? Isn't that in English? One side has English, one side has Chinese, right? Vaste 10:04, 6 July 2011 (UTC)[reply]
There is no English on the Renminbi banknote. Please see the banknote of Renminbi for your reference. Engirst 10:16, 6 July 2011 (UTC)[reply]
But the pinyin appears in the same place as the other languages (zhuang etc). Isn't it just a way to write something meaningful for non-Chinese to read?
Then again, toneless pinyin is much more commonly seen in Chinese than toned pinyin in general. I wouldn't call it a "standard" way to write Mandarin though. Maybe "toneless variant of yuán" would be more appropriate? Vaste 10:24, 6 July 2011 (UTC)[reply]
So, the wording of "toneless Pinyin is a nonstandard spelling of toned Pinyin" is inappropriate. Engirst 10:56, 6 July 2011 (UTC)[reply]
I disagree, and so far as I can recall we had a vote to settle this question. Toneless pinyin (like accentless Hebrew) is sometimes used because the target audience is expected to know the meanings intended even without the tones, or because the authors do not appreciate the significance of tones. It is not a misspelling, per se, but is not the ideal presentation either. bd2412 T 17:02, 11 July 2011 (UTC)[reply]
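For reference, the relationship between a toned form such as yuán and its toneless variant yuan is mechanical. A minimal Python sketch, assuming standard Unicode normalization and a hypothetical function name: decompose the string, drop only the four tone diacritics, and recompose. The diaeresis of ü is deliberately kept, since it is part of the letter rather than a tone mark.

```python
import unicodedata

# Combining macron, acute, caron and grave: the four pinyin tone marks.
TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}


def strip_tones(pinyin: str) -> str:
    """Return the toneless variant of a tone-marked pinyin string."""
    decomposed = unicodedata.normalize("NFD", pinyin)
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    # Recompose so that e.g. u + diaeresis becomes a single ü again.
    return unicodedata.normalize("NFC", kept)
```

The mapping only goes one way: several toned syllables collapse onto the same toneless string, which is why the toneless form cannot stand in for the toned one without context.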

Linking words in definitions

Hi, I'd like to suggest that instead of piecemeal linking of "significant" words in definitions, Wiktionary silently links every word automatically. OK, there would be no blue cue, but people would get the idea fairly quickly I think if the cursor changed on mouseover? OTOH, I suppose this must have been suggested before....

There is no reason to link a, an, the, and other words that will not assist in understanding the definition. The blue links are more than clues to linking; they are also emphasis on important aspects of the definition. --EncycloPetey 18:07, 6 July 2011 (UTC)[reply]
I see no downside at all (unless it's a performance one, which seems unlikely) to linking words that people are unlikely to click on. The point is that they can click on any word they want more information about, irrespective of whether someone's pre-decided that they might want to. 86.181.204.160 19:08, 6 July 2011 (UTC)[reply]

Wikilook might help you. Lmaltier 18:22, 6 July 2011 (UTC)[reply]

This is an interesting idea with merit. I've thought about it in passing a few times. This might be useful for Simple English Wiktionary, which is intended primarily for learners. (One Esperanto site I know employs a similar feature: each word in an article or message can be clicked on to reveal a pop-up. It really helped me boost my vocabulary and is more efficient than searching through a dictionary all on your own.) Tempodivalse [talk] 19:41, 6 July 2011 (UTC)[reply]
I dislike this idea. Every word being a link makes it much more difficult to copy and paste things. Number of times I've wanted to copy and paste something from a definition: +∞. Number of times I've come across the word "the" in a definition and decided to look it up: 0. —RuakhTALK 20:13, 6 July 2011 (UTC)[reply]
Wikilook does not have this drawback. The main advantage of Wikilook is that it works on all sites, not only here (but you need Firefox). Here is a link for more information: https://addons.mozilla.org/en-US/firefox/addon/wikilook/ Lmaltier 20:15, 6 July 2011 (UTC)[reply]
That looks pretty cool — and much more sophisticated than what the anon is suggesting. —RuakhTALK 20:33, 6 July 2011 (UTC)[reply]
Maybe we could do the same thing with a preference setting and Javascript? —CodeCat 20:36, 6 July 2011 (UTC)[reply]
Wikilook is available in WT:PREFS, but afaict it only works on Firefox and Opera. --Yair rand 01:12, 7 July 2011 (UTC)[reply]

five quondam Jōyō kanji, 196 recently added Jōyō kanji

In 2010 five kanji lost their Jōyō kanji status and 196 Jinmeiyō kanji and Hyōgaiji were recognised as Jōyō kanji. I have updated the information concerning the five former Jōyō kanji (勺, 錘, 銑, 脹, 匁) in their respective articles (however, the Jōyō tag remained, because I am not aware of their current status - are they now Jinmeiyō kanji or Hyōgaiji?), but how are the 196 newly added Jōyō kanji to be dealt with? Some of them are not tagged at all, others are tagged as Hyōgaiji, which is now obsolete. I suggest adding a tag (Common Jōyō kanji since 2010), but they should probably not be updated manually given the considerable number of entries concerned. The uſer hight Bogorm converſation 07:32, 7 July 2011 (UTC)[reply]

While browsing wikipedia I discovered that all 5 quondam Jōyō kanji were added to the Jinmeiyō kanji list. How about referencing the Jōyō tag with the usage notes like that in order to raise the reader's awareness of the alteration of their status? There is a source added by a user which facilitates and justifies referencing. The uſer hight Bogorm converſation 07:45, 7 July 2011 (UTC)[reply]

Altaic languages

I noticed that several etymologies refer to the Altaic languages or to Proto-Altaic. As far as I know, the existence of that family is disputed, and so is the question of which languages belong to it, so should we really allow it on Wiktionary? —CodeCat 17:25, 7 July 2011 (UTC)[reply]

No. Category:Altaic languages failed RFD. -- Liliana 17:28, 7 July 2011 (UTC)[reply]
It looks like it failed for the same reasons. Does that mean any Proto-Altaic etymologies should be removed as well? —CodeCat 17:31, 7 July 2011 (UTC)[reply]
I suppose yes, since the Altaic proposal is, as you said, not universally accepted, therefore any etymologies involving Altaic are nothing but spurious theory. -- Liliana 17:34, 7 July 2011 (UTC)[reply]

I'm sorry, but that's ridiculous! Have you ever heard of the comparative method? Don't you know that all linguistic reconstruction cannot be proven? Even Indo-European cannot be. It is not spurious theory, Altaic is based on the comparative method like Indo-European, Uralic, Dravidian, Sino-Tibetan, et al. There is merely some kind of bizarre prejudice against Altaic. Have you ever read any literature on Altaic or the comparative method in general? Why should Altaic etymologies be ignored? How are they any less valid than any others? Again, no linguistic proposal with regard to a proto-language is ever universally accepted because none of them can ever be proven, as all proto-languages are, by definition, from times when there are no written attestations -- And we drown

Yes, all proto-languages are theoretical, but some are more widely accepted than others. Proto-Indo-European is very widely accepted; no serious linguist disputes it. Proto-Altaic, on the other hand, is rather controversial; it's almost as controversial as "Proto-Nostratic". Most terms should probably not be taken back further than Proto-Turkic, Proto-Mongolian, etc., though I suppose we can mention Proto-Altaic when the term is specifically mentioned in the literature as a term used to argue in favor of the Altaic hypothesis. Even then we should couch it correctly: don't just say "Proto-Turkic *XYZ < Proto-Altaic *UVW", but rather "Proto-Turkic *XYZ. So-and-so (1957) compares this form to Proto-Mongolian *RST and derives both from Proto-Altaic *UVW." We can do that for Nostratic too: if a form is "notable" for its use in the reconstruction of Nostratic, we can mention that, but we shouldn't just slap Nostratic etymologies down as if they were as noncontroversial as PIE etymologies. —Angr 17:58, 15 July 2011 (UTC)[reply]
Should we have categories such as Category:Turkish terms derived from Proto-Altaic? Category:Altaic languages failed RFD some time ago. —CodeCat 18:28, 15 July 2011 (UTC)[reply]
No, we shouldn't. Mentioning the fact that a term has been used in the reconstruction of Proto-Altaic doesn't mean we need to categorize the terms. —Angr 06:55, 16 July 2011 (UTC)[reply]
And that creates a problem because people will use {{proto}}, like And we drown did, which automatically categorizes. In effect, while we are able to control language templates and their uses to some extent, proto-language etymologies have free rein and can be called anything at all, because the template copies the name provided. —CodeCat 13:24, 16 July 2011 (UTC)[reply]

"A long rode to ho" or is it "A long road to hoe"????????

A long rode to ho. It is not mispronounced; it is misspelled. Rode is a length of chain and rope that is put out from the ship to the anchor. A long rode is required when it is windy or stormy. To pull a rope on a ship is to ho, from the term “Heave Ho”. The group will advance on the rope on the command Heave and sailors expressed in unison "Ho" as they pulled. If it is stormy and windy or the current is strong, the long rode to ho is hard work that takes a long time.

I would be ever so pleased if people would stop misspelling the phrase or replacing the word rode with row, as if that would make more sense. The term has been around since before we were colonies. If you look this idiom up on Google you will find that 200,000 web pages CAN BE WRONG! Sports writers are the worst. Journalists will put it in quotations but it is not a correct quote. Imagine a journalist getting a quote wrong; it gives me shivers.

My 3rd grade teacher, Mrs. Samuels, told me in 1969. There are 2 versions. “The long rows are not hoed; they are plowed. Now road and rode sound alike but they are not the same, as in ‘the long rode to ho’, which means it's hard work and takes a long time, like on a boat anchor. See why spelling is so important? Many adults spell that word wrong.” We looked the words up in the dictionary and the phrase made perfect sense to an 8 year old. Mrs. Samuels was a great teacher, and she is still teaching from my heart, this very moment.

This phrase has lost its way from its nautical roots.

— This unsigned comment was added by Rekamlias (talk • contribs) at 00:47, 8 July 2011 (UTC).[reply]

This is a policy discussion page. You probably meant to post this somewhere else. —RuakhTALK 00:52, 8 July 2011 (UTC)[reply]
By the way, for the record — the versions in “row to hoe” are original (attested since 1835), and still wildly more popular than the versions in "road to hoe" that started to arise after a few decades. With all due respect to Mrs. Samuels, no "rode to ho" version exists at all. —RuakhTALK 01:05, 8 July 2011 (UTC)[reply]

«Derivations» in topical categories

Do we really want categories like Category:Biblical derivations to follow the en:categoryname scheme? Wouldn't a name like "English terms derived from the Bible" be more appropriate? Does the use of "derived from" apply to languages only? --flyax 06:19, 8 July 2011 (UTC)[reply]

The vote that changed the derivations categories only affected derivations from languages. So there are still some categories in Category:Etymology that haven't been changed. —CodeCat 10:35, 8 July 2011 (UTC)[reply]

Quotation index

I've been thinking about writing an indexer for {{quote-book}} that would generate an index arranged by author. This would make it easier to find inconsistencies in our entries, as well as hopefully pushing people to use a more standardized format for quotations. Some potential issues / questions:

  • Need to sort by first name, or by the name given in the author field. I could try to extract the last name but it would be inconsistent at best.
  • I'm only planning on doing English sections to start, due to the small number of non-English entries that use the template.
  • Is sorting by author the most convenient format to use?

So, would this be useful to anyone? Nadando 06:23, 8 July 2011 (UTC)[reply]
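A rough Python sketch of what such an indexer could look like, for illustration only. It assumes that the template is called with named parameters such as `author=` and `title=` (an assumption about usage, not a specification of {{quote-book}}), that the page texts are already available as a mapping from entry title to wikitext, and it sorts by the author field exactly as given, as suggested in the first bullet. The names `parse_params` and `build_index` are hypothetical. Template calls containing nested templates or piped links would need a real wikitext parser and are skipped or misread by this simple regex.

```python
import re
from collections import defaultdict

# Matches a {{quote-book|...}} call with no nested braces inside it.
QUOTE_BOOK = re.compile(r"\{\{quote-book\|([^{}]*)\}\}")


def parse_params(body: str) -> dict:
    """Split 'a=1|b=2' into {'a': '1', 'b': '2'}; ignore unnamed parts."""
    params = {}
    for part in body.split("|"):
        if "=" in part:
            key, value = part.split("=", 1)
            params[key.strip()] = value.strip()
    return params


def build_index(pages: dict) -> dict:
    """Map each author to a sorted list of (book title, entry title)."""
    index = defaultdict(list)
    for entry_title, wikitext in pages.items():
        for match in QUOTE_BOOK.finditer(wikitext):
            params = parse_params(match.group(1))
            author = params.get("author", "(no author)")
            index[author].append((params.get("title", ""), entry_title))
    # Sort authors by the name as given in the author field.
    return {author: sorted(items) for author, items in sorted(index.items())}
```

Grouping under the author string as written also makes inconsistencies visible: two spellings of the same author's name show up as two separate index keys.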

Personally, I don't use {{quote-book}}, and don't plan to start; and my general impression is that the majority of well-formatted quotations are not using it, either. So the index would be rather permanently incomplete. —RuakhTALK 10:04, 8 July 2011 (UTC)[reply]
I disagree, I think quote-book does a good job. --Mglovesfun (talk) 12:25, 8 July 2011 (UTC)[reply]
What do you disagree with? —RuakhTALK 13:19, 8 July 2011 (UTC)[reply]
I've never used {{quote-book}}. I tend to look for Latin quotes or 18th-century English quotes through Wikisource, and use templates set up for particular oft-used sources only. These typically lack ISBNs and have peculiar formatting or linking that can't be tied in through the {{quote-book}} template. --EncycloPetey 14:56, 8 July 2011 (UTC)[reply]

Bosnian, Croatian and Serbian translations

I've been thinking about this issue for a while. The merge debate on WT:RFM was about Serbo-Croatian categories, while translation templates don't categorize anything. Also there are Bosnian, Croatian and Serbian Wiktionaries, so occasionally converting a translation to Serbo-Croatian will remove a valid link. Thoughts? Mglovesfun (talk) 11:24, 8 July 2011 (UTC)[reply]

Maybe the translation template could have three links instead of one for Serbo-Croatian? —CodeCat 11:26, 8 July 2011 (UTC)[reply]
There is a Serbo-Croatian Wiktionary, so {{t|sh}} is also correct. Mglovesfun (talk) 11:36, 8 July 2011 (UTC)[reply]
I meant that when you type {{t|sh}} the result shows three (or four) links to languages: (bs) (hr) (sh) (sr) —CodeCat 12:23, 8 July 2011 (UTC)[reply]

Slang senses that are not in widespread use

We have an appendix for protologisms, but those only really consider new words that are created in the hopes that they will be used. But there are also quite a few cases where an existing word is used in a sense that isn't widely known outside a certain group of people. In other words, it's not the word that's new, it's the meaning. In many cases these terms are in widespread use, but only within that community or context, which makes them hard to attest elsewhere or even at all. So I am wondering if there is a way to define such senses at all on Wiktionary? Is there an appendix for such slang senses? —CodeCat 12:52, 8 July 2011 (UTC)[reply]

Why not use LOP?​—msh210 (talk) 15:23, 8 July 2011 (UTC)[reply]
If a word already exists, can it still be a protologism? —CodeCat 15:28, 8 July 2011 (UTC)[reply]
I dunno, but a bunch are already there. Starting at the beginning, you'll soon find "aardvark" and "abdicate". (Also "a" and "aa", though those are basically separate words that happen to be spelled the same as existing ones, so maybe they don't count.) —RuakhTALK 15:32, 8 July 2011 (UTC)[reply]
But there is still a difference. Does a term really belong there if it couldn't be meaningful outside of the context of a certain community of speakers? For example, if someone coined a word that was simply not useful outside of Facebook, could it still be listed on that page? —CodeCat 17:18, 8 July 2011 (UTC)[reply]

These two categories, and also their subcategories, are the only categories that are left from the 'old' set of derivations categories. They were not affected by the recent vote because they are not languages, so they would not belong under Category:English terms derived from other languages. Now that things have settled a little I think we can try to move these categories as well. I would like to propose the following names:

CodeCat 20:07, 8 July 2011 (UTC)[reply]

Category:English terms coined by J. R. R. Tolkien would be more informative for the last one. "Tolkien's legendarium" is too fanspeak-ish. --Daniel 20:22, 8 July 2011 (UTC)[reply]
There are also:
--flyax 20:56, 8 July 2011 (UTC)[reply]
Category:en:Australian Aboriginal derivations should really be emptied and deleted, and there is only one entry in it. Does anyone know from what Australian language family it derives? —CodeCat 21:04, 8 July 2011 (UTC)[reply]
Anyhow, it should use {{etyl|aus}}, and that's what it does now. -- Liliana 21:17, 8 July 2011 (UTC)[reply]
Category:eo:Fictional locations seems to be ok, the word 'derivation' does not appear in the title. Mglovesfun (talk) 12:38, 9 July 2011 (UTC)[reply]
But it is inside Category:eo:Fictional derivations, so that can't be deleted until Category:eo:Fictional locations is deleted or removed from it. —CodeCat 12:43, 9 July 2011 (UTC)[reply]

This is just a small question but... what exactly is the usual practice on the 'see also' links at the top of the page? I noticed it is used to redirect between spellings that look the same. But can it also be used between words that may sound the same? For example, would it be useful to link between dança and dansa, given that these two words are pronounced identically in several languages and one could easily be mistaken for the other? I'm thinking of cases where a learner of Catalan hears dansa but assumes it is spelled dança based on English spelling. —CodeCat 22:23, 8 July 2011 (UTC)[reply]

No, it's not used for homophones or near homophones. Pronunciation varies so much between languages that we'd have a serious headache if we attempted that. Someone searching a particular language by spelling can use that language's Index of entries to search alphabetically. --EncycloPetey 22:26, 8 July 2011 (UTC)[reply]
It would be interesting to have a lookup based on the pronunciation of a word instead of the spelling, e.g. IPA:dansa. One major problem is that inflections important in one language may not matter in another. At least for non-tonal languages, it might make sense to omit these in the title, only indicating stress in the language section where it applies. I don't know if there is a minimal set for tones. Aside from that, the same sound can have different interpretations in different languages. The transcriptions would have to be pretty narrow, distinguishing for instance the aspiration on b and p in English. But the narrower the transcriptions, the less likely they will overlap. There would have to be see alsos for similar pronunciations. Because of the way phonemes can group many adjacent phones for arbitrary languages, this would make for some long lists. DAVilla 04:57, 13 July 2011 (UTC)[reply]

Inferring structure of entries

Currently, it is almost always the case that left-aligned emboldened text is a header (the main exception being inflection lines). I find this to be a helpful clue to inferring the structure of entries, especially when the ToC isn't visible (e.g. when explicitly hidden or when too far above). Do others find this useful? If so, are there easy layout tweaks we can do to make this more standard? Inflection lines could be adorned somehow, which has the ancillary benefit of drawing attention closer to the definitions. I think most other left-aligned emboldened text is used to segment sections such as "Derived terms", where we could just as easily use italics. Is there some way we can increase usability here? --Bequw τ 05:08, 10 July 2011 (UTC)[reply]

Maybe indentation would be useful as well? If everything within ==English== were indented a bit (including the L3 headers), everything within ===Verb=== were indented a bit further (including the L4 headers), and so on, that might help clarify the structure as well. Unfortunately, I don't think that would be achievable with just CSS; I think we'd have to use some DOM-inspecting JavaScript. So maybe it's not worth it. —RuakhTALK 12:24, 11 July 2011 (UTC)[reply]
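To make the indentation idea concrete, here is a rough Python sketch that reads the header levels out of an entry's wikitext and assigns each header an indent proportional to its depth below the L2 language header. It only illustrates the nesting rule described above; it is not the CSS or JavaScript that would actually be needed on the site, and the function name and the step size are assumptions.

```python
import re

# Matches wikitext headers such as ==English== or ===Verb=== on their own line.
HEADER = re.compile(r"^(={2,6})[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)

def header_indents(wikitext, step_em=1.0):
    """Return (level, title, indent_in_em) for each header.
    L2 headers get no indent, L3 headers one step, L4 headers two steps, and so on."""
    result = []
    for match in HEADER.finditer(wikitext):
        level = len(match.group(1))
        result.append((level, match.group(2), (level - 2) * step_em))
    return result
```

Note that this rule inherits the inconsistency raised in this thread: a POS header is indented differently depending on whether it sits at L3 or at L4 under an "Etymology 1" header.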
I like indentation for this purpose. It would enable us to consider smaller fonts for headings which would be more economical of vertical screen space.
Also, there are a number of entries that have bold headings not sanctioned by WT:ELE that are created by starting the line with ";". They would seem to interfere with human users' ability to make structure inferences reliably. DCDuring TALK 13:03, 11 July 2011 (UTC)[reply]
It'd be doable with just CSS if we had consistent header levels. That is, if the POS were always L3 (not sometimes L4 because "Etymology 1" is L3) and "See also" were always L3 (not sometimes L4 under a POS and sometimes L5 under an L4 POS and sometimes L3 under the language), etc., then we should be able to use CSS and templates to indent all definitions a certain amount, all derived/related/'nymous/translation terms a certain amount more, and so on. But the way we do things now, I agree with Ruakh that doing it with CSS is far more trouble than it's worth (though it's still not impossible).​—msh210 (talk) 16:14, 11 July 2011 (UTC)[reply]
I despair of our ability to make headers other than L2 consistent. In English it is not too far-fetched to imagine having some kind of etymology, possibly with trivial content, for every single-word lemma entry. But, even in English, misspellings, multiword entries, and inflected forms and other form-of entries would be a challenge. Foregoing the cognitive advantages of grouping etymologically related PoSes under a shared Etymology heading seems a bad exchange even for the layout improvements under discussion. DCDuring TALK 20:10, 11 July 2011 (UTC)[reply]
Maybe we could try putting the part of speech on the headword line and remove the header altogether. In most cases, the headword line is the most important defining element of an entry, not the headers. So it makes sense to let it stand out as much as possible. —CodeCat 20:16, 11 July 2011 (UTC)[reply]
I don't think we can just remove the POS header, since then it would no longer be evident that we're breaking the word up by part of speech; and in the MediaWiki and HTML structure of the page, we'd then have all the verb definitions inside the noun's ====Translations==== section. But it could work if we merged the POS sections — putting all definitions in a single list regardless of part of speech — with the headword lines just kind of interspersed. That's what Dictionary.com does. —RuakhTALK 21:34, 11 July 2011 (UTC)[reply]
The problem seems to be that we want the table of contents to behave as if the POS header exists, but we really want the headword line to take its place and show no header. Is there a way to do that? Aside from that, I don't think the current structure is very useful. We organise terms by etymology, but that is counterintuitive for most people who are just looking up a word. —CodeCat 21:39, 11 July 2011 (UTC)[reply]
Maybe that's what you want, but it's not what I want. :-/   But yes, it could be more-or-less achieved by using CSS, and 100% achieved by using JavaScript. —RuakhTALK 22:02, 11 July 2011 (UTC)[reply]
Grouping all the parts of speech together has additional structural problems besides just the translations. In inflected languages, the inflection/conjugation then is not separated by headers that will indicate which definitions go with which inflection pattern. It could/should be possible to set a preference that does something like that, but it would mean that the POS headers would not show up, however all the subheaders under that POS would still show up. --EncycloPetey 18:16, 12 July 2011 (UTC)[reply]
Also possible would be to keep the header and remove the following line break (e.g. the 4th option at WT:Beer parlour archive/2011/May#Part of speech headers and headword lines). That would I think require JS. --Bequw τ 22:05, 29 July 2011 (UTC)[reply]

{{l}}

I've noticed that BigDom uses the template {{l}} in his Luxembourgish entries to link to the English translations. While this is not officially mandated by ELE, I do think it is a nice idea, especially for pages starting with a Translingual entry, or terms which are the same in two languages (like water). I've been wondering if we should introduce that practice. -- Liliana 23:14, 11 July 2011 (UTC)[reply]

So, for example, we might define important as # Having {{l|en|relevant}} or {{l|en|crucial}} value.? I'm not a big fan of that idea. I like that {{l}} is reserved for mentions of a term, as distinct from mere linkified uses. It's an important distinction for us, and I think it makes sense to promote it even in our wikitext. (And the generated HTML is different, in such a way that readers can customize the display of {{l}} if they like.) Also, I'm guessing that such an approach would add needless headache for downstream entities that use our definitions. —RuakhTALK 01:45, 12 July 2011 (UTC)[reply]
I only started yesterday and it just so happened that when I looked at an entry to find out how to create an entry, it used the {{l}} template, so I just assumed that all the entries used it. If it's not common practice, I can stop using it if people want me to. BigDom 13:03, 12 July 2011 (UTC)[reply]
I've only ever used the {{l}} template in lists of terms, which is where I understand the name "L" came from. It's most useful for lists of non-English terms in a non-English section where there are to be linked Derived terms, Related terms, Synonyms, and so forth, and you want to link to the correct language section. --EncycloPetey 18:19, 12 July 2011 (UTC)[reply]
I always understood it to mean 'link'. So I've used it in any situations that call for linking to a specific language. —CodeCat 18:51, 12 July 2011 (UTC)[reply]
Ditto, I always thought it was for 'link'. --Mglovesfun (talk) 18:01, 15 July 2011 (UTC)[reply]
Huh, I always thought it was just for "language"/"language section". --Yair rand 17:08, 25 August 2011 (UTC)[reply]

When the template was created, it did not mean "language", and did not mean "list"; it meant "link". I am certain of this and my word is absolute, because the original name of the template was Template:link. Alternative interpretations eventually appeared, though. --Daniel 17:13, 25 August 2011 (UTC)[reply]

FWIW, I've used {{l}} as BigDom and Liliana describe -- but then, I'm mostly dealing with Japanese, where there can be a lot of overlap with Chinese entries on the one hand, making the language specification in the link useful, and where the {{l}} template increases the font size slightly, which is very useful for making sure that big ugly kanji are legible. I find it a very helpful tool for link disambiguation. -- Eiríkr Útlendi | Tala við mig 04:35, 27 August 2011 (UTC)[reply]

Prepositional phrase as a POS

Some pages have Prepositional phrase as their POS (e.g. like a lamb to the slaughter). But I think that this is not a POS at all. Fortunately, in most cases, prepositional phrase pages use Adverb, etc. (and they are categorized as prepositional phrases). I propose to change all Prepositional phrase POS to the appropriate POS (most often, it should be Adverb). Lmaltier 07:27, 16 July 2011 (UTC)[reply]

sorry, but it was voted on: Wiktionary:Votes/pl-2010-01/Allow "Prepositional phrase" as a POS header. -- Liliana 12:03, 16 July 2011 (UTC)[reply]
Thanks, I was not aware of this vote. I don't understand this decision: to me, it's like adopting Phrase instead of Noun or Verb for both red fox and fill the bill (Noun phrase and Verb phrase would be acceptable too, but not Phrase alone). I understand it better after reading the discussion page, but I still fully agree with DAVilla. Note that, to take the example I gave, I find uses of very like ..., such as But then, in his circle, an innocent would have been very like a lamb to the slaughter. (samanthalucas.com/books.php?title=bodyheartandsoul). Lmaltier 13:43, 16 July 2011 (UTC)[reply]
"Like" is only sometimes a preposition. In the case you cite, it functions as an adjective. Modification by "very" is an indication, as is its use as a predicate. Just like "worth" (adjective), it takes an NP complement. It is specifically analyzed as such in CGEL. "Like" is (almost?) always a preposition when it has an NP complement and is an adjunct: "Like his brother, John writes left-handed." DCDuring TALK 14:41, 16 July 2011 (UTC)[reply]
I think in that example, 'very' modifies the entire clause that follows it. 'like a lamb to the slaughter' seems to me like an adjectival phrase that 'very' modifies in its entirety. —CodeCat 14:53, 16 July 2011 (UTC)[reply]
You may well be right — I don't know whether the modifier attaches above the complement or below it — but I don't think that changes anything. The term "adjectival phrase" gets applied by traditional grammar both to phrases that are actually headed by adjectives (which very can modify) and to other phrases that modify nouns (which very cannot easily modify). For example, you can say "the restaurant is slightly outside the town", but not normally *"the restaurant is very outside the town". True adjective phrases, however, don't have this restriction: "the restaurant is very far outside the town" is fine. Whether that's because very is just modifying the adjective that heads the phrase (like or far), or because the phrase as a whole retains its head's ability to be modified by very, I don't know, but the diagnostic works either way. —RuakhTALK 18:05, 18 July 2011 (UTC)[reply]
You could say that "slightly outside" is a prepositional phrase as well, so that instead of saying that "slightly" modifies "outside the town" you could also read it so that "slightly outside" has "the town" as its antecedent. The second example is a little more awkward, but I think most people would read it as "very much outside the town". But what happens when you leave the antecedent out? "slightly outside" as an adverb is more usual than "very outside", so this may hold for its use as a preposition as well, in which case it's not really a good example. —CodeCat 18:19, 18 July 2011 (UTC)[reply]
When you leave the antecedent out, it's an intransitive preposition; hence *"very outside" is ungrammatical even then. It's actually a perfect example. (Traditional grammar calls it an "adverb" in that case — and that's how nearly all dictionaries handle it — but as you've observed, it behaves similarly whether or not it has an antecedent, so modern linguists recognize it as a preposition.) Re: how most people would understand it: that's exactly the point. Most people would read *"he stupid" as "he is stupid", because only the latter is grammatical Standard English. —RuakhTALK 18:47, 18 July 2011 (UTC)[reply]
@Lmaltier: You advocate ===Adverb===, but then you give an example where it's modifying a noun! Syntactically, prepositional phrases don't behave quite like adjectives or adverbs, and I think ===Prepositional phrase=== is exactly the POS. (I, too, agree with DAVilla's comment that we should use real POSes rather than stuff like ===Abbreviation===; but ===Prepositional phrase=== is the real POS.) —RuakhTALK 18:05, 18 July 2011 (UTC)[reply]
I think this specific case can't be an adverb because 'like' can actually mean two things. It can mean either 'resembling' or 'similar to', or it can mean 'in the manner of'. But such phrases behave very differently, compare 'he remained like a deer in headlights' and 'he stared like a deer in headlights'. The first case describes a property of something and so it's more adjective-like, while the second case describes a manner so it's more like an adverb. You can say 'he remained green' but not 'he stared green', while you can't say 'he remained quickly' while 'he stared quickly' is ok. —CodeCat 18:27, 18 July 2011 (UTC)[reply]
Nearly all prepositional phrases have both adjective-like uses and adverb-like uses; that's not particular to phrases headed by like. (Like is unusual, though, in actually being an adjective sometimes, rather than merely heading an adjective-like phrase. On this, traditional grammar and modern linguists agree; for example, the OED gives the quotation “The fixed stars are like our sun in every point in which it is possible to compare them” under one of its adjective senses, not one of its preposition senses.) —RuakhTALK 18:47, 18 July 2011 (UTC)[reply]
I'll stop commenting on what I don't understand at all. Lmaltier 05:40, 19 July 2011 (UTC)[reply]
(I understand that like is an adjective in some cases, but I feel that it's a preposition in the sentence mentioned above (a preposition with a possible comparative form, maybe, but still a preposition). Here is a paper studying this subject: http://people.brandeis.edu/~maling/Maling1983_adjectives.pdf)
I now understand the rationale for using prepositional phrase as a POS. But I'm still convinced that it would be much more useful to readers to include 2 sections, one for the adjectival phrase and one for the adverbial phrase. Are there references mentioning prepositional phrase as a part of speech similar to adjective or adverb? Here is a reference I agree with: http://www.infoplease.com/cig/grammar-style/prepositional-phrases-big-daddy-phrases.html. Lmaltier 06:00, 22 July 2011 (UTC)[reply]

Rollback

Hi, I was wondering whether someone could give me the rollback tool. I've been fighting the vandalism for a little while now and wanted a review into it and maybe some advice on how to improve please. Thanks a lot.
Here are some useful links:

  • My contributions: 1
  • My reports of vandalism: 2

I really like this place, it is very community based and I hope to help out and contribute here for a long time to come.
-- PoliMaster talk/spy 13:19, 16 July 2011 (UTC)[reply]

First tell us, are you a clone of User:Razorflame? --Vahag 13:23, 16 July 2011 (UTC)[reply]
No I am not a clone user of Razorflame. Now can you give me some advice please? -- PoliMaster talk/spy 13:29, 16 July 2011 (UTC)[reply]
I find it a bit odd, a user who fights vandalism, but doesn't do anything else. It seems your second ever edit was on User talk:SemperBlotto to ask how to fight vandalism. Not 'bad' or 'negative', just a bit 'odd'. --Mglovesfun (talk) 13:36, 16 July 2011 (UTC)[reply]
Well he's been talking to me in the IRC chat too, and it seems he really wants to do something. -- Liliana 13:37, 16 July 2011 (UTC)[reply]
@Mglovesfun Different users help the project in different ways. My language skills are not fantastic, though still satisfactory, so my way of contributing back to Wiktionary is to help with vandalism fighting. Hope this is OK? - PoliMaster talk/spy 13:41, 16 July 2011 (UTC)[reply]

Getting a rollback right changes nothing about what you can do, and almost nothing about how you can do it. You are welcome to fight vandalism, and you don't have to do anything else if you don't want to. However, if somebody participates only to get special rights (as this sometimes happens), he should not get them, in my opinion. Lmaltier 17:15, 16 July 2011 (UTC)[reply]

Hi Lmaltier, no that wasn't the case, the tool will make it easier for me to revert the vandalism meaning that I can revert quicker and get it reported quicker. Here is an updated list of the vandals I have had blocked. Also they've not re-vandalised after their block. I will help either way the decision goes. :-D Merci beaucoup. -- PoliMaster talk/spy 17:04, 18 July 2011 (UTC)[reply]
What I wrote is the result of experience. Yes, some people participate only to get and use special rights. This is not a good thing. Lmaltier 20:18, 18 July 2011 (UTC)[reply]
What you seem to be saying is that the tool won't give him the ability to do anything he hasn't been able to do before, although it will make what he's been doing (without the objection of anyone here) easier for him to do. So am I correct in saying that this is not a question of new privileges or trust, that it's simply a matter of turning on a feature that could only be abused by people who don't know how to use it? If so, then who really cares if he gets it or not? We would still have the ability to patrol the edits, right? DAVilla 20:26, 18 July 2011 (UTC)[reply]
Privileges have a psychological dimension as well as a practical one. (Which is mostly unfortunate, but it's not entirely a bad thing; for example, new Wiktionarians who have experience with other WMF projects are likely to know that they can seek out an administrator if they need help from a trusted editor who's knowledgeable about the project.) A user whose only goal is to obtain a privilege, however minor, may well be a user who will abuse the psychological dimension of that privilege. Re: "We would still have the ability to patrol the edits, right?": Right! I didn't think that was true, but I just tested, and yeah, if you're in the "rollbacker" group but not the "autopatroller" group, then clicking "rollback" will create an unpatrolled edit. —RuakhTALK 21:25, 18 July 2011 (UTC)[reply]
So this would have some sort of status associated, rather than merely the practical utility? Then let's make it meaningless. Let's just automatically give rollback to anyone who has made at least one edit in the published namespaces (including the thesaurus and appendices, but not wiktionary or talk) on any 7 days. I guess PoliMaster isn't quite there yet though. DAVilla 06:12, 21 July 2011 (UTC)[reply]
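The proposed automatic rule could be sketched as follows in Python. This assumes "on any 7 days" means edits on at least seven distinct calendar days (not necessarily consecutive), and it uses hypothetical namespace labels rather than real MediaWiki namespace names or IDs; the function name is likewise made up for illustration.

```python
from datetime import datetime

# Hypothetical labels for the "published" namespaces named in the proposal.
# Project (Wiktionary:) and talk namespaces are deliberately left out.
ELIGIBLE_NAMESPACES = {"main", "thesaurus", "appendix"}
REQUIRED_DAYS = 7

def qualifies_for_rollback(edits):
    """edits: iterable of (namespace_label, datetime) pairs for one user.
    True if the user edited an eligible namespace on at least REQUIRED_DAYS distinct days."""
    active_days = {
        when.date()
        for namespace, when in edits
        if namespace in ELIGIBLE_NAMESPACES
    }
    return len(active_days) >= REQUIRED_DAYS
```

Several edits on the same day count once, so a burst of reverts in a single session would not be enough under this reading.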
@Lmaltier I understand what you're saying. This is a tool to be used in order to make tasks easier and quicker to perform and to increase performance quality when going about “vandalism fighting”. It isn't a right, it is a privilege and a tool to be used properly.
@DAVilla The tool can be so easily switched on and off. If the tool is misused, which I do not intend to do, then surely it should be removed. The tool gives the ability to make my tasks quicker and much, much easier though. My intention never has been to patrol my edits, I can't do that now and didn't put in a request for that. My intention is to fight vandalism and unconstructive, disruptive edits which is why rollback is being requested.
@Ruakh As I said above I never had an intention for patrolling my edits and that can be left to other users or admins. My request was to get rid of the mess and disruption that gets left by vandalisers.
Thanks. :-) - PoliMaster talk/spy 22:02, 18 July 2011 (UTC)[reply]

The Wiktionary community may find it useful to note that User:Thepoliticalmaster was granted rollback on Wikipedia, and had it promptly removed after misusing the tool and being unable to distinguish vandalism from non-vandalism. See this old version of his user page. He was blocked shortly afterwards. He caused many experienced Wikipedians to have strong concerns about his maturity and ability to learn how to operate at Wikipedia. I urge the Wiktionary community to exercise caution and take the user's previous editing history on Wikipedia into account before granting Thepoliticalmaster rollback or other rights. Tom Morris 13:22, 21 July 2011 (UTC)[reply]

Yes, I had already seen that. Basically, he's a troll and might yet be blocked here as well. SemperBlotto 14:05, 21 July 2011 (UTC)[reply]
That of course is interesting information to note. I'd like to state that at this point, I am unwilling to grant him any sorts of rights. The fact he just bugged me on the IRC chat pretty much every 30 minutes in the early morning to write my opinion in here does not contribute to my view on him, either. -- Liliana 14:16, 21 July 2011 (UTC)[reply]

Just as followup: there is now a community ban proposal being made at English Wikipedia over Thepoliticalmaster - see here. He's admitted to being the latest in a long line of socks. Tom Morris 13:46, 28 July 2011 (UTC)[reply]

Definition editing options trial

There wasn't much response to the suggestion in the earlier discussion on enabling the definition editing tool for a trial period, so I'm going to assume that there aren't any objections to temporarily enabling the script for one month. If anyone objects to the trial, the trial can be ended right away. To disable it for personal use, click the button below.

...

--Yair rand 18:27, 17 July 2011 (UTC)[reply]

I object to it being available by default without more explanation. How many folks have used it and not disabled it? That option, inviting people to use it voluntarily, seems like the first deployment step. If no one keeps using it or accepts that invitation, that seems to be a reliable indication that it is not a good solution to a meaningful problem. DCDuring TALK 19:47, 17 July 2011 (UTC)[reply]
It's been available opt-in for months, and I've lost count of how many times people have been invited to try it out. I've stopped the trial. Since it's targeted at users who never see the beer parlour, knowing how many Wiktionary regulars use it is irrelevant. --Yair rand 20:09, 17 July 2011 (UTC)[reply]
Does it address any need stated by actual users? Does it bring us up to some industry standard? Do our veteran users want it and keep it? DCDuring TALK 22:23, 17 July 2011 (UTC)[reply]
That depends on what you mean by "actual users", no, and I have no idea. The point of the tool is to make it easy for people to edit. A simple way to figure out whether it will be successful at doing that would be to trial it. --Yair rand 19:09, 18 July 2011 (UTC)[reply]
I think that such trials should first be made opt-out for all administrators, and only later (if ever) rolled out to everyone. In the specific case of definition editing, the lack of objections in the above-linked section may not be meaningful, given that (1) most of that discussion was about a different feature entirely and (2) I'm betting that most active editors never actually bothered to try it out. People are lazy. Also, it might help if you explained what exactly this feature is supposed to do. I find that even when I turn it off at Special:Preferences, I still get the little pencils to the left of definitions that let me edit them; the main differences I actually see when I turn the gadget on are that I get a little "Add language" box in the language tabs, and I get many more "(−)(±)" things in the list of categories (rather than just a "(+)" at the end). Either this is the least-accurately-named feature ever, or I'm seeing some sort of bug, possibly relating to different preferences I have set in different places (?), but since it's not clear what the intended behaviors are, it's impossible to report deviations from them. —RuakhTALK 19:53, 18 July 2011 (UTC)[reply]
The "(−)(±)" category editing buttons are built into tabbed languages, not at all connected to definition editing options. I don't know why they might sometimes not appear. The "Add language" box is also part of tabbed languages, but since it's dependent on the definition editing tool to allow the user to edit the definitions in the new section, the button doesn't appear unless the tool loads before tabbed languages, which will always happen if the gadget is on (gadgets load in order before anything else), but only sometimes happen if they're loaded through prefs or the button. If turning it off through Special:Preferences still leaves the edit definition buttons there, you probably also have it enabled through WT:PREFS or the button. Re an opt-out trial for admins, that wouldn't really give information on whether newbies will be able to edit more easily. (And I really don't see how a ten-pixel icon next to definitions could be all that harmful...) --Yair rand 20:22, 18 July 2011 (UTC)[reply]
There are too many ways to turn these things on and off! I looked at WT:PREFS and saw that it wasn't checked; I didn't look at the button, because I didn't remember that a single button controlled both features (and I had tabbed-languages turned on in other ways, so it wasn't obvious that I had that button pressed as well). In the past I thought that giving people more ways to turn something on was a good thing, but my experience with these two features of yours has taught me that it's not. (By the way, isn't your proposed way of turning it on for all users equivalent to the button, rather than to the Gadget? Doesn't that imply that the behavior of tabbed-languages+definition-editing will be non-deterministic if and when they're both turned on for all users?) Re: opt-out trial for admins: It would tell you whether other admins think it will make newbies able to edit more easily, which is a start. And I doubt you'll get admins to support the tool until they've tried it out and decided for themselves that it seems likely to be useful for newbies. —RuakhTALK 21:38, 18 July 2011 (UTC)[reply]
If tabbed languages and the definition editing tool are enabled by default, then the check for the availability of the definition editing function could be removed, as there won't be situations where it doesn't get loaded. --Yair rand 22:09, 18 July 2011 (UTC)[reply]
I would hope that in the final version it doesn't apply one style sheet and then overwrite it seconds later with another. In other words, if this is to be changed for everyone, make it a planned core change at that time, and thus only apply it with the certainty that it works. You have my support for broad testing, meaning it's forced but only on those of us who would know or bother finding how to turn it off if we needed to. Maybe include the link to opt out in the news heading. DAVilla 20:11, 18 July 2011 (UTC)[reply]
In other words, testing is more than fine, just do it in grades. If anything, we should be giving you more support for a project like this. DAVilla 17:16, 21 July 2011 (UTC)[reply]

Mandarin pinyin entries — problems and proposal

Background: Recently — as in, over the past six months or so — one person has been creating huge numbers of Mandarin entries under pinyin spellings, regardless of attestation. He or she (henceforth "he") has several accounts, including at least 123abc (talkcontribs), Ddpy (talkcontribs), and Engirst (talkcontribs), and he also has ready access to many and diverse IP addresses; experience has therefore shown that blocks are almost useless, because he has no difficulty evading them. They do not even slow him down. In addition, he does not seem to be interested in following community consensus; he does insist that the pinyin spellings he adds are attested, but this is generally backed more by repeated assertion than by evidence. (For example, he frequently links to a web-site that has a pinyin edition of a certain Bible translation; the Web-site says that its text is available on CD, which is good enough for him. He's even, on several occasions, described the Bible translation as a "well-known work", as though it satisfied the well-known work rule; perhaps that would be true of the translation itself, but certainly does not seem to be true of the pinyin edition.) And because of the sheer volume of his edits, individual RFVs are out of the question.

A related problem is that we don't have clear policies on pinyin entries — I suppose we haven't had much need for them until now — which I think makes it hard to deal with him. I for one don't feel comfortable deleting hundreds and hundreds of possibly-valid entries just because the guy who created them "does not play well with others".

So I think we need to create a clear policy about Mandarin pinyin entries; and unfortunately, it needs to be one that even non-Mandarin-speaking editors can help enforce, since our Mandarin-speaking editors' ability and willingness to patrol bad edits is dwarfed by this one person's ability and willingness to create them.

Proposal: My proposal has two components, one relating to criteria for inclusion, one relating to entry layout.

Proposed CFI: That a pinyin entry, using the tone-marking diacritics, be allowed whenever we have an entry for a traditional-characters or simplified-characters spelling. I realize this may be a bit controversial, so here's my rationale for this:

  1. Despite what some people seem to think when we discuss these things in the abstract, in practice we always allow some flexibility and common sense. If our three cites for the (hypothetical) English noun tellaximination were one that used it capitalized at the start of a sentence ("Tellaximination"), one that used it hyphenated across a line break ("tellax-/imination"), and one that used it in the plural ("tellaximinations"), we wouldn't say, "oh, well, the lowercase unhyphenated singular form tellaximination doesn't have any cites, so I guess we have to delete that entry!"
  2. Although pinyin is not the normal writing system for Mandarin, there are actual books written in it (readers for children and whatnot), dictionaries for it, and so on. So it's not like, say, including romanized Greek.
  3. Pinyin is widely used as a text-input method for characters, so it makes sense for us to support that use.
  4. It doesn't make sense to distinguish between pinyin spellings that have three durably archived cites and pinyin spellings that do not, because that doesn't reflect any distinction between "real" spellings and non-real spellings: in some sense every word that is written in characters is also written in pinyin, and in some sense no word is. Granted, attestation of the pinyin spelling is something of an approximation for overall word frequency, but it doesn't have any real lexicographic value.

Proposed ELE: That a pinyin entry have only the modicum of information needed to allow readers to get to a traditional-characters or simplified-characters entry; see [[yánlì]] for an example. In particular, note:

  1. No POS information, and no L3 or L4 headers other than ===Pinyin=== and potentially ===Anagrams===. (Should ===Pronunciation=== be allowed as well? It seems bottable.)
  2. No sense-line information other than {{pinyin reading of}}.
  3. No Wikipedia boxes or example sentences or quotations or references or external links or whatnot.
  4. No categories other than Category:Mandarin pinyin.

In a way, these are two separate proposals, but they're intimately linked, in that together they form a compromise between pro-pinyinists (who presumably would not want such a restrictive entry layout) and anti-pinyinists (who presumably would not want such permissive criteria for inclusion); so I plan to put them forth as a single vote.

Does this seem like a good idea?

RuakhTALK 15:29, 19 July 2011 (UTC)[reply]

I like this idea, it's a very good compromise. The only change I would like to make is instead of using 'Pinyin' as the header, use 'Romanization'. This doesn't change the fact that only Pinyin romanizations are allowed for Mandarin, but it would allow the same header to be used in broader terms. For example for Gothic or other languages that are regularly published in transliterated form. We should make clear rules on the languages that can have such a header and the romanization scheme that it uses. —CodeCat 17:45, 19 July 2011 (UTC)[reply]
That makes sense. The definition-line already makes very clear what type of romanization it is; and if, hypothetically, we allowed two different romanization schemes for a given language, there's no reason to put them in separate POS sections. But we'd still use Category:Mandarin pinyin, right? Not Category:Mandarin romanizations. —RuakhTALK 18:28, 19 July 2011 (UTC)[reply]
==Transliteration== is even broader. I agree with using "pinyin" in the category name because there are other romanizations of Mandarin. In fact there are other forms of pinyin, but that shouldn't be a sticking point.
I'm fine with this as long as quotations and references are still allowed on the Citations: page and with the explicit understanding that citations are still a valid form of attestation for (at least other, if not all) transliterated terms, in which case these restrictions do not apply. (Although I am more than happy to remove the well-known work exception.) DAVilla 17:10, 21 July 2011 (UTC)[reply]
About other pinyins, the official and most correct name is actually "Hanyu Pinyin". I think we could either use this name, or if it's too long maybe just link to the Wikipedia article (with this name). Vaste 02:52, 24 July 2011 (UTC)[reply]
I like "Hanyu Pinyin", but should the name indicate that it's specific to Mandarin? DAVilla 03:22, 24 July 2011 (UTC)[reply]
We would need to make a list of the languages that these rules apply to and the transliteration schemes they use. I don't think making a policy about 'pinyin' is necessary when we can also make a policy about 'everything on this list'. —CodeCat 17:15, 21 July 2011 (UTC)[reply]
Hear, hear. Even for Chinese, we have to consider the Palladius system and Xiao'erjing, transcribing Chinese to Cyrillic and Arabic scripts. DAVilla 03:29, 24 July 2011 (UTC)[reply]
Brief (concise) English definitions are necessary and very beneficial to users especially for beginners, such as this example. Please see here for your reference. Engirst 10:41, 21 July 2011 (UTC)[reply]
It may sound non-constructive but here's my opinion: All proposals to accommodate 123abc's (and all his other aka's) edits will not work because he is a vandal by nature. He is back to working on toneless pinyin. He creates new headers and categories. He just doesn't care about the rest of editors. His whole activity seems like a revenge for blocking him. Rather than trying to please him, we should find a way to permanently block him. Unfortunately, I don't know what to do with the myriad anons he is creating. We should not sympathise with a person creating frustration and additional work. No matter what decision is made here how can you make sure he will stick to this decision? He has his own plans and he will do what he wants to do. Patrolling may not cope with his prolific edits. --Anatoli 01:15, 22 July 2011 (UTC)[reply]
On a more constructive note, I support the decision for toned pinyin entries to only serve as a link to hanzi entries, both traditional and simplified if they exist. The definitions, samples and the rest should go into hanzi entries. --Anatoli 01:20, 22 July 2011 (UTC)[reply]
Thanks for your feedback! FWIW, I'm actually assuming that 123abc will not abide by the community decision; but the above proposal is one that I believe can be enforced pretty effectively with technical assistance (bots, patrolling enhancements, etc.). If the proposal is accepted, we can set about arranging things such that it really doesn't matter whether or not he abides by it. Now, if he engages in real vandalism — not merely rejecting community consensus about what Wiktionary should be like, but actually inserting wrong information, removing valid information, and so on — then we're pretty much screwed. But so far he doesn't seem to have done anything like that (please correct me if I'm wrong); he seems to genuinely want Wiktionary to be a good resource for pinyin, and the above proposal still allows it to be. —RuakhTALK 01:35, 22 July 2011 (UTC)[reply]
Thanks, Ruakh. I'm very interested to see what the vote will look like. I'm currently job-hunting, so not able to add much thought. --Anatoli 03:04, 22 July 2011 (UTC)[reply]
Excellent proposal, Ruakh. It seems Engirst himself agrees to some extent too. Vaste 07:33, 22 July 2011 (UTC)[reply]
It should be like the example ài, with a brief English definition, because beginners don't read Hanzi; otherwise it is useless for beginners. If users understand Hanzi, Pinyin entries are unnecessary for them. Engirst 15:43, 22 July 2011 (UTC)[reply]
That whole entry is a mess IMHO. Vaste 14:51, 23 July 2011 (UTC)[reply]
Regarding the list of links to hanzi-entries, I think it'd be useful to separate the common meanings from the obscure ones.
Showing these as equal is pretty ridiculous: 伌, 僾, 叆, 唆, 嗋, 噯/嗳, 堥, 塧, 壒, 嬡/嫒, 愛
Maybe it'd be enough to sort them by frequency (approximately)? Vaste 14:51, 23 July 2011 (UTC)[reply]
It has been improved like this Engirst 15:57, 23 July 2011 (UTC)[reply]

Wiktionary:Votes/2011-07/Pinyin entries.RuakhTALK 17:17, 27 July 2011 (UTC)[reply]

Derived terms

Our use of this term as a header feels misleading to me as applied in English. We use it to refer to morphological or synchronic derivation, whether or not that corresponds to historical or diachronic derivation. This may be wholly satisfactory for languages without much written record and adequate for languages in which this fits traditional ways of addressing etymology, even with an ample historical record. In the case of English it puts Wiktionary needlessly in a realm where we have either, 1., a proliferation of compound affixes any one or all of which can claim to be a morphological ancestor of a given term ending in each of the terminal affix groupings or, 2., a single resolution that is based on subjective feel, usually of a single person's expertise. The result is that the nature of what appears under the Derived terms header is not consistent. It is one thing for inconsistency to result from uneven progress toward a well-defined goal. It is another when it results from a lack of a well-defined goal.

I have always thought, yea, assumed, that the Related terms header was a wonderful home for terms with significant sharing of ancestry with the headword. This would have freed up the Derived terms section for historical derivations. As we now have a large and increasing number of English entries with acceptable or better etymologies, it seems that we have the potential to make good presentation of historical derivations (under Derived terms) while preserving other kinds of derivation information (under Related terms). DCDuring TALK 18:53, 19 July 2011 (UTC)[reply]

For my part I have almost entirely stopped using Derived terms and now almost always use Related terms, because I don't have to worry so much about which order the words were coined in. Just an observation. Equinox 18:58, 19 July 2011 (UTC)[reply]
I agree. My own understanding of derivation was historical derivation (if this phrase really means what I understand, i.e. that the derived term has been coined from the other term). The Related terms section should be used for other terms of the same family we want to mention in the page, and when the precise historical derivation is unclear. In pages describing affixes, a better header could be Terms using this suffix, etc. (only to be used when their number is reasonable: 10000 would be too many). Lmaltier 20:03, 19 July 2011 (UTC)[reply]
Oh yeah, a second observation (sorry to hijack your "thread", DCD!): we have a huge number of entries where people have added thematically related terms under the Related terms header, even though they have no etymological relation (e.g. "river" under "water"). This has made me wonder more than once whether we could pick a better subtitle. Equinox 20:06, 19 July 2011 (UTC)[reply]
The problem with DCDuring's suggestion is that, while it may sound uncontroversial, it conceals his very different idea of "historical or diachronic derivation" from most other people's. For example, as I understand it, he would not have us list happily at [[happy#Derived terms]], since that derivation happened in the 1300s, and therefore (in his mind) happily is actually a loanword from Middle English. (If I'm wrong, DCDuring, please correct me. I'm going by what-all you wrote here.) I actually think, for various reasons, that "morphological or synchronic derivation" is a fine use of the ====Derived terms==== section; but even if we do decide to restrict it to "historical or diachronic derivation", I would not want DCDuring to wield the scepter. —RuakhTALK 04:20, 20 July 2011 (UTC)[reply]
I am open to reason on that point, especially in view of my belated realization that WT:ELE specifies that Derived terms means morphologically, not historically derived. That issue strikes me as much more fundamental than my quibbles about whether 1500/1470 or 1100/1066 is a meaningful dividing line for how English language derivations are classified. They certainly seem completely separable issues. Further, there isn't much chance that I will be doing any scepter-wielding on matters not completely beneath the notice of others, both by my preference and by the way this wiki works. DCDuring TALK 04:38, 20 July 2011 (UTC)[reply]
Isn't happily derived from happy (whatever the date of this derivation)? Lmaltier 06:02, 20 July 2011 (UTC)[reply]
Not if you believe that "English" inherited both these words as it were separately. (This is one of the many problems that follow from taking the stance that Middle English is a separate language from modern English. Ƿidsiþ 12:43, 25 July 2011 (UTC))[reply]
We could revisit the names of the headings, but I doubt there's much that can be done. ==Compound derivatives== and ==Morphologically related terms== don't feel right. DAVilla 16:59, 21 July 2011 (UTC)[reply]

"Pinyin word" part of speech header

Is "Pinyin word" a correct part of speech header? If so, why has blocked editor User:123abc used one of his/her sockpuppet accounts to add this header to dozens if not hundreds of entries (as in this edit), and no one has said or done anything about it? Are there no admins left at this project? 71.66.97.228 08:11, 22 July 2011 (UTC)[reply]

It is just an experimental example for the above proposal. Engirst 09:24, 22 July 2011 (UTC)[reply]
In my view, no. I wouldn't expect Serbo-Croatian entries to have ===Roman spelling=== and ===Cyrillic spelling=== headers, so I wouldn't accept this either. It should just be phrase, noun, verb, etc. --Mglovesfun (talk) 09:42, 22 July 2011 (UTC)[reply]
Yes, you are right, Pinyin entries should be phrase, noun, verb, etc., as in this example created by A-cai. The "Category:Mandarin Pinyin words" is just an experimental example in response to the above proposal. Engirst 10:09, 22 July 2011 (UTC)[reply]

Whatever kind of experiment it is, it is wrong, so please promptly undo every instance of adding "Pinyin word" as a part of speech at this project. 71.66.97.228 07:31, 23 July 2011 (UTC)[reply]

But "part of speech" is not allowed by the proposal. Engirst 10:25, 23 July 2011 (UTC)[reply]

Listing recent discussions in alphabetical order

I don't know about you, but I'm having trouble searching for discussions in the Tea Room. The particular one I'm looking for is on the word **something**, but I can't seem to find it, as nothing is listed in alphabetical order. Would it be too much of a trouble if all active discussions were listed in alphabetical order?

Thanks!

I think that this would not be an improvement: the chronological order is the best way to detect new discussions. In most cases, the most active discussions are at the end of the page. Anyway, an alphabetical order requires that you know the title and, if you know the title, it's easy to find it (use the search command of your browser). Lmaltier 06:13, 23 July 2011 (UTC)[reply]

I would like to do the following: take a dictionary (Arabic --> English) and manually (as opposed to an automatic scan etc.) copy many or most of its entries into wiktionary. The dictionary in question is not in public domain, AFAIK. The process would add more content than the mere dictionary entries, so I think doing this is somehow still an intellectual achievement which is more than a stupid copy and paste. Of course, I would also cite the dictionary.

More precisely, it seems commonplace that copying single items from a dictionary is OK. Is there a point at which the sheer number of entries copied becomes a violation of copyright laws? Jakob.scholbach 09:19, 25 July 2011 (UTC)[reply]

We need more Arabic content. Just do it, we'll sort out the problems later :). A word is a word. So, if you translate washing machine as غسالة (ghassaala) into Arabic, which dictionary can claim they said it first? The exact phrasing of examples could be suspicious, of course but I doubt there is a problem there. Correct me if I'm wrong here. --Anatoli 11:57, 25 July 2011 (UTC)[reply]
Citing dictionaries is commonplace in entries too. I've seen it many times. Learners also use their textbooks, including me. I'm 99% sure it's OK to do what you're planning, get a 2nd opinion though. --Anatoli 11:59, 25 July 2011 (UTC)[reply]
As far as I'm aware, you can't copyright the definition of a word or phrase. I think it's the precise wording and layout of a dictionary that's subject to copyright rather than the individual words and/or translations. BigDom 12:04, 25 July 2011 (UTC)[reply]
IANAL, but I think that in U.S. copyright law (which is what Wiktionary is bound by, since its servers are in Florida), copyright violation begins almost immediately. Why do you translate غسالة (ghassaala) as “washing machine” rather than as “laundry machine” or “washer”? If it's because another dictionary translated it that way, then you're violating their copyright. As BigDom (talkcontribs) almost says, a dictionary doesn't own the facts it contains, such as the meaning of an Arabic word, only its expression of those facts, such as the English translations it chooses to express those meanings; so in cases where there's really only one reasonable English translation, they don't have much claim to ownership of that translation, but very, very often — more often than you'd think — it happens that there are actually multiple reasonable English translations. (Note: you write that "it seems commonplace that copying single items from a dictionary is OK", but I don't think that's true. You're probably thinking of "fair use", which is an aspect of U.S. copyright law that allows small amounts of content to be taken for purposes of commentary, parody, or the like; but fair use does not allow one dictionary to take a definition from another.)
That said, you can certainly yoink gender, plural, part of speech, exact spelling, vowel signs, and other sorts of information where the dictionary can't claim to have exercised creative expression (and where we tend to express the same fact in a slightly different way, anyway).
RuakhTALK 12:29, 25 July 2011 (UTC)[reply]
Maybe it's best to invite User:BD2412 here, he seems to be some sort of expert in legal matters. But I doubt that it's a copyright violation to translate a word as washing machine rather than washer. -- Liliana 12:34, 25 July 2011 (UTC)[reply]
If entries/translations are made manually, then the contributor applies his/her knowledge of the vocabulary, at least for one language, in case of Wiktionary, we need to follow transliteration rules and language specific policies. I'd say - words yes, phrases, clauses, multipart words - no or only with great care. --Anatoli 12:48, 25 July 2011 (UTC)[reply]
I'd bet that any systematic pattern of copying definitions from one or more copyrighted sources could bring trouble. Even copying other information might lead to trouble if all of one's facts are identical to those of a copyrighted source, including their idiosyncratic views and mistakes. Checking multiple sources is a good idea anyway. DCDuring TALK 13:26, 25 July 2011 (UTC)[reply]
(Intellectual property lawyer hat on) A dictionary, like most other works, is subject to copyright protection, but this protection is very thin for reference works that purport to present collections of facts. The standard is set forth in Feist v. Rural, a 1991 U.S. Supreme Court case holding that there could be no copyright infringement from copying the list of names and numbers in the phone book. There is no strict line delineating infringement, but for reference works the right generally adheres only to selection and presentation. For a dictionary that includes "all words in all languages", selection should not be an issue, so long as we make sure to include words above and beyond what is included in the source dictionary. Presentation is an issue, but there is also a merger doctrine that allows copying if the expression being copied is so basic that there is no less complex way to convey factual information. For example, one could not infringe by copying the statement that "roses are red", because there is no simpler way to convey that roses are, in fact, red. With respect to choices between meanings like washing machine, laundry machine, and washer, we should use whichever translation is the most accurate, unambiguous, and common (washer is clearly ambiguous, as it could mean a person who washes, or the ring that goes between a nut and a bolt). If this choice happens to coincide with what is in the dictionary being referenced, this is probably because that dictionary has successfully determined the uncopyrightable simplest expression. Cheers! bd2412 T 22:42, 25 July 2011 (UTC)[reply]

(unindent) OK, let me make this question more precise, maybe we get a clearer picture then. A typical entry in an Arabic-English dictionary looks like this:

لبس (a) labisa to wear [...]

I want to create three (!) pages out of this, namely one for the Category:Arabic_roots root ل ب س, one for لبس (which is the past tense) and a redirect page for يلبس (which is the present tense, which is spelled out in the dictionary only in an abbreviated form [the "a"]). Doing that does require some knowledge of the grammar, which makes it more than merely copying things. (As a parenthesis let me point out a key advantage of wiktionary over any printed dictionary: we don't need to decide whether we organize words by root or by alphabet, we can do both at a time).

As you know, this is time-consuming when done by hand. On the other hand, Arabic is highly structured, so much can be done quite algorithmically, i.e., by a computer. For example, deriving يلبس from لبس follows a certain rule. Therefore, I'm thinking of writing a computer program that facilitates the (otherwise manual) copying of the data, i.e., creating a database first, then creating the three pages (in the above example) out of it, and finally uploading them to wiktionary (in case no previous pages exist). Writing a program is obviously time-consuming and only makes sense if it is used later on a larger scale; so if it becomes clear that this will result in a blatant copy vio it doesn't make sense to start off. Jakob.scholbach 22:31, 25 July 2011 (UTC)[reply]
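
The rule-based step described above could be sketched roughly as follows. This is only an illustration under strong assumptions: it handles unvocalized, sound three-letter verbs such as لبس, ignores weak and doubled roots entirely, and the function names (`split_root`, `present_tense`, `pages_for`) and the page bodies are hypothetical, not an existing bot or Wiktionary's actual entry layout. The gloss is passed in by the editor rather than taken from any particular source.

```python
# Hypothetical sketch: derive the three page titles discussed above
# from an unvocalized three-letter past-tense form.

def split_root(past_tense):
    """Return the root letters separated by spaces, e.g. 'لبس' -> 'ل ب س'."""
    return " ".join(past_tense)


def present_tense(past_tense):
    """Unvocalized 3rd-person masculine singular present tense.

    Assumption: a sound triliteral verb, where the unvocalized present
    is simply the prefix ي followed by the three root letters.
    """
    if len(past_tense) != 3:
        raise ValueError("this sketch only handles three-letter verbs")
    return "ي" + past_tense


def pages_for(past_tense, gloss):
    """Map each page title to a placeholder page body."""
    root = split_root(past_tense)
    present = present_tense(past_tense)
    return {
        # Root page: lists the verb derived from this root.
        root: "==Arabic==\nRoot of [[" + past_tense + "]]",
        # Main entry for the past-tense form, with the editor's gloss.
        past_tense: "==Arabic==\n# " + gloss,
        # The present-tense form only redirects to the main entry.
        present: "#REDIRECT [[" + past_tense + "]]",
    }


if __name__ == "__main__":
    for title, body in pages_for("لبس", "to wear").items():
        print(title, "=>", body.replace("\n", " / "))
```

A real tool would also need to check whether each title already exists before uploading, which this sketch leaves out.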

Should one do it?

Aside from the copyright question, what makes you think the dictionary you have is reliable? I have seen (and even own) recently-published translating dictionaries (by a major dictionary publisher) that are so full of errors that I would sooner trust a personal guess than rely on that dictionary. This is one of the big problems we've seen over the years from editors who relied heavily on published dictionaries without an adequate knowledge of the language; they didn't know when the dictionary was wrong. --EncycloPetey 17:55, 25 July 2011 (UTC)[reply]
That's a very good point actually. I've recently been adding a lot of Luxembourgish vocabulary and I have used a couple of dictionaries to help confirm my translations, and noticed that about 5-10% of the entries are inaccurate. Luckily I can tell when a translation isn't quite right but you could easily go wrong. BigDom 20:45, 25 July 2011 (UTC)[reply]
I'm thinking of Hans Wehr's dictionary Arabic-English, which has had 6 editions and is, according to a number of sources (and personal experience) the most widely used dictionary for non (Arabic) native speakers. Of course, every book has its shortcomings, but I'd rather stick with the mistakes of some book than adding my own ones. My wikisocialization is mostly due to wikipedia: there, it is good standard to finally rely on published sources (as opposed to one's own opinion/knowledge). Is this different in wiktionary? Of course, consulting multiple sources is always good, but there are not so many up to date Arabic--English dictionaries at all. Jakob.scholbach 22:31, 25 July 2011 (UTC)[reply]
Wiktionary is not Wikipedia; our goals, methods, and norms are VERY different from Wikipedia. The preferred way to document a definition is through published quotations that use the word, rather than dictionaries that define/translate the word. See biceps (Latin section) for the preferred method of supporting the translation of a non-English word. Even respected dictionaries have been known to contain whopping errors, often through copying and preserving out-of-date English translations from Victorian English (or earlier). It's a bit like a complaint that Stephen J. Gould once made about how paleontologists always described a particular fossil horse as being "the size of a [obscure dog breed]". They used the description because they copied from someone who had used that comparison, even though the vast majority of people would never have heard of that breed or know how big it was.
Every time in the past that someone has copied en masse translations from a bilingual dictionary, we've found large numbers of errors. My comments above treat only one of the problematic issues. There is also the very large problem of matching senses of translations with senses of English words. So, a word is translated as "leaf", but does that mean the organ from a plant, the thin layer of material in a book, or something else, or both? Most translating dictionaries do not match their translations with English senses, just with English words. --EncycloPetey 22:37, 25 July 2011 (UTC)[reply]
Still, better to have the definition with a comment that we are relying on foo dictionary, which should be taken with a grain of salt, than not to have it at all. bd2412 T 22:44, 25 July 2011 (UTC)[reply]
Based on my experiences with some dictionaries, I'd say that isn't necessarily true. As I say above, some dictionaries contain too high an error rate, and we'd still be missing the sense-connection information. --EncycloPetey 22:49, 25 July 2011 (UTC)[reply]

Welcome template

Wouldn't it be nice if our {{welcome}} template took a lang= parameter? Ideally it would produce the relevant translation as well as the English text. SemperBlotto 11:13, 26 July 2011 (UTC)[reply]

I think it's a nice idea, but surely on the English Wiktionary users should be welcomed in English? BigDom 22:11, 26 July 2011 (UTC)[reply]
Most new users haven't set up their babel either, so we can't even know for sure what language to use. —CodeCat 22:39, 26 July 2011 (UTC)[reply]
· I agree. I mean, we don't generally welcome new users until and unless they've made some edits, so we can sometimes make a good guess, but we really can't know for sure. (I can think of one editor who had no interest in expanding our coverage of his native Albanian, preferring instead to add horribly mistaken information about Hebrew and Ancient Greek. Though on the other hand, it would have been really funny to leave him a welcome message in Hebrew and see if he would admit that he couldn't understand a word of it, so maybe that's actually a perk!)
· Also, including a translation, even alongside the English as SemperBlotto suggests, risks giving offense: "what, do you not think my English is good enough?" So, maybe it would work better for {{welcome}} to just include links to translations; that way readers can decide for themselves if they want a translation, and if so which one. At most, {{welcome}} might take parameters indicating the admin's best guess as to what translation(s) is/are likely to be helpful, so the links to those translations can be given top billing.
RuakhTALK 23:28, 26 July 2011 (UTC)[reply]

"English non-idiomatic translation targets"

The entry paternal uncle is an English sum of parts kept to be translated, so it fits Category:English non-idiomatic translation targets.

Naturally, some of its translations are SOP too, like the Portuguese tio paterno. Should it be a member of Category:Portuguese non-idiomatic translation sources? --Daniel 11:39, 26 July 2011 (UTC)[reply]

That is of decidedly secondary importance. The English category might have the ability to discourage contributors from adding NISoP translations, which should be a lesser priority for contributor effort. As the "source" language sections don't have translation tables, there is not much effort-directing function to the category. Though there is the all-important symmetry. DCDuring TALK 12:39, 26 July 2011 (UTC)[reply]
I think [[tio paterno]], and similar entries, should simply be deleted. —RuakhTALK 13:40, 26 July 2011 (UTC)[reply]
How similar? Just translations like the Spanish tío paterno and the Italian zio paterno, or the English paternal uncle, too? --Daniel 16:25, 26 July 2011 (UTC)[reply]
The rationale for having paternal uncle is partly that it might be opaque to some English language learners and partly that some languages have single words for the concept. These seem to me, respectively, lame and irrelevant for a comprehensive monolingual dictionary (whose complexity would tend to overwhelm a learner). From a phrasebook perspective, a better entry might be for "on (some)one's father's side", which might be considered idiomatic if one ignores the usual context.
I wouldn't mind seeing all of them go, but the discussion advanced rationales for paternal uncle that don't apply to its translations. DCDuring TALK 16:45, 26 July 2011 (UTC)[reply]
Just to note, in the relevant discussion (Talk:maternal uncle) many thought it wasn't straight-forward SOP. Then again, I don't follow those discussions much. --Bequw τ 17:05, 26 July 2011 (UTC)[reply]
It increasingly looks to me like a wise choice.
Many clearly place little importance on even immediate context to distinguish the "having to do with one's genetic father" and "behaving like a father, fatherly" senses of paternal. No real difference for maternal. DCDuring TALK 17:15, 26 July 2011 (UTC)[reply]
I would delete the Category, as the remaining entries can be presumed, by their existence if not relevant debate, to be idiomatic. We don't have a translations target exception. DAVilla 05:25, 31 July 2011 (UTC)[reply]

Numbered translations eg flap

I know there have been discussions in the past about relating translations to definition by "numbering" - I notice this entry was numbered back in Jan 2010 - should the numbers be removed? (see also WT:TR re defn No 1) —Saltmarshtalk-συζήτηση 19:28, 26 July 2011 (UTC)[reply]

sorted - thanks —Saltmarshtalk-συζήτηση 16:13, 27 July 2011 (UTC)[reply]
Numbered senses are discouraged, but haven't been rooted out. At [[flap#Noun]] at least there is also a gloss sufficient to match with the definitions. DCDuring TALK 16:29, 27 July 2011 (UTC)[reply]

Picture dictionary - see mammal, dinosaur

A picture of a tiger might have a place in the tiger entry. If we must have "picture dictionary" images (and I don't think we do), shouldn't they be elsewhere in the page? Their current placement makes a mess (on my screen anyway) of the translation boxes. Do others have an opinion? —Saltmarshtalk-συζήτηση 16:13, 27 July 2011 (UTC)[reply]

As the Wiktionary:Picture dictionary project no longer has any participants, rearranging the content on the few pages that have it should not be controversial. As it is the germ of an idea that might belong at Wiktionary, I personally would prefer it not be deleted, just moved out of the way on the page, after translations, for example, but not after, say, External links, References, Anagrams, and See also. DCDuring TALK 16:41, 27 July 2011 (UTC)[reply]

Redirected combining characters

Wiktionary:Votes/2011-06/Redirecting combining characters passed.

Here's how I implemented it in two entries.

--Daniel 06:58, 28 July 2011 (UTC)[reply]

Wiktionary's rules

Wiktionary should have rules to follow but not by somebody's thinking. Please see here. 2.25.213.130 09:45, 28 July 2011 (UTC)[reply]

You don't know anything about following rules. Stop trolling. —CodeCat 09:54, 28 July 2011 (UTC)[reply]
Please show me the rules for following. 2.25.213.130 10:06, 28 July 2011 (UTC)[reply]
The last line of the discussion linked at Template talk:derv#Deletion debate says:
"No consensus. For this very controversial subject, especially, this probably means don't delete the template and don't use it either. (or don't use it very much)"
I wrote it myself. --Daniel 10:10, 28 July 2011 (UTC)[reply]
This is only your "probably" opinion, and what does "very much" mean? 2.25.213.130 10:24, 28 July 2011 (UTC)[reply]
You already know what "very much" means. When people started reverting your edits en masse, you reached that threshold long ago. That sentence says the template is "very controversial", therefore even 1 use is likely to be very much.
That's the conclusion of a big discussion. You should read it. --Daniel 10:36, 28 July 2011 (UTC)[reply]
You said "even 1 use is likely to be very much", did you mean {{derv}} is banned? 2.25.213.130 10:42, 28 July 2011 (UTC)[reply]
No. I meant what I said. --Daniel 10:43, 28 July 2011 (UTC)[reply]
So, it is your opinion but not a rule. 2.25.213.130 10:47, 28 July 2011 (UTC)[reply]
Obvious troll is obvious. Just don't use the template. BigDom 10:53, 28 July 2011 (UTC)[reply]
It is your opinion. Anyway, is {{derv}} banned by rule? 2.25.213.130
It is now. Stop using it. —CodeCat 11:23, 28 July 2011 (UTC)[reply]
Do you mean it is by rule? If so, you too. But you cannot close my mouth, then do anything as you like. 2.27.73.38 13:17, 28 July 2011 (UTC)[reply]
If there is no rule about something do you just keep doing it indiscriminately? There are more possibilities than 'required', 'allowed' and 'forbidden'. If something is disputed you should realise that it might be against agreed practice. If someone disputes a certain action it means 'stop what you're doing and discuss it'. Going against that means going against community consensus. Rules have nothing to do with that. Community consensus and common practice are above rules; in fact, consensus is what creates the rules in the first place. And yes, it's possible to go against consensus even if there is no consensus yet. If no agreement has been reached, the normal practice is to keep things as they are until a decision is made. That means you don't keep doing what you're doing because then you're not keeping things as they are, you're making your own decision outside of the community and acting on your own. That's why people disagree with your work and I really hope you can understand this. —CodeCat 13:39, 28 July 2011 (UTC)[reply]
Excellent summary. We should have something like this recorded somewhere to refer to when these situations arise. What's a good place for it so we could find it again? DCDuring TALK 13:46, 28 July 2011 (UTC)[reply]
Help:Interacting with humans? —RuakhTALK 18:52, 28 July 2011 (UTC)[reply]

"Character" row

[Example character box 1: ! · Basic Latin ! (U+0021) EXCLAMATION MARK]

[Example character box 2: ! · Basic Latin · ! (U+0021) · EXCLAMATION MARK]

I added a row "Character" to the character box, so I won't have to go all the way to the left to the H1 title if I want to copy a character. It also makes clearer that all the information of the box, such as the codepoint and the code block, concern that specific character, even if there is not any image at the moment. Feel free to revise my decision.

  • Examples of affected entries: Ç, and ´. The last one contains two boxes, each with a different character to be seen and copied. The middle one contains an image.

--Daniel 15:14, 28 July 2011 (UTC)[reply]

As I mentioned on Daniel's talk page, I think the {{character info}} boxes should be minimal. They should be enough to help a person see the character (thus an image instead of sometimes an unrecognized square) and distinguish it from similar ones (thus the codepoint & name) especially for users with varying font installations. Since they are in the header, they often push down into the top sections, or worse, with TabbedLanguages pushing all content down. I therefore think they should be shorter (some ideas already discussed) not longer. I don't think the extra row adds much, as saying both LATIN CAPITAL LETTER A and Character A seems overly redundant. And the H1 title isn't that far away for your copying needs:) We should focus on driving people to the actual language sections. --Bequw τ 21:30, 28 July 2011 (UTC)[reply]
The extreme redundancy between LATIN CAPITAL LETTER A and Character A decidedly is an exception, not a rule. Names of characters in Unicode use capital Latin letters, so naturally the capital Latin letters themselves are described with that kind of conspicuous repetition.
As different examples, ! is EXCLAMATION MARK and ゝ is HIRAGANA ITERATION MARK. The former may sound redundant, too, to English speakers who know that punctuation mark very well, while the latter is likely to be more obscure to them and may need the image of the character as an additional clue. Even the exclamation mark benefits from being shown somewhere easy to be seen. Relatedly, we have the tradition of displaying titles of entries at headword lines, which I believe is for the same reason. (in addition to other reasons, such as displaying macrons, inflections, etc.)
Anyway, since you mentioned you want shorter boxes, I added two basic ideas to the right side of this conversation, based on the one of !. --Daniel 22:07, 28 July 2011 (UTC)[reply]
I like the new designs. I've taken off UCS too as most people don't know what that means (at least some recognize Unicode). Some of the character images might be wider than ! so we'd just have to make sure it wouldn't hit the ToC horizontally for other characters with the side-by-side layout. I agree that showing the character is important even when we don't have images. So how about we have character info show an enlarged text character when no image is given. Otherwise, with both, we can have incongruity. When I go to , I don't have Braille fonts so I see the unrecognized box character along with an appropriate image, and that could be confusing to some. --Bequw τ 23:23, 28 July 2011 (UTC)[reply]
I think character width is not a problem with these two designs in particular, because they still have plenty of unused space. Even a big "W" could fit any of them very well; however, notably, the entry W has an unnecessarily small image of a W instead, which is much shorter but a little wider than the "!" of the examples above.
If a reader does not have the right fonts, then they will see many boxes of unrecognized pieces of text. I suppose many characters written in the above-linked entry are currently unable to be properly displayed on your screen. That problem is not specific to the Unicode box; it is not even specific to individual characters. Someone without Khmer fonts is unable to read ចំណងឈ្នាប់, for example.
One possible, natural way to try to deal with that is placing a (hopefully small and relatively inconspicuous) warning on every entry spelled with characters unexpected to be found in English texts. The wording would be like "This entry is written with the Gothic script. Lack of appropriate fonts results in seeing boxes, interrogation points or mojibake in place of the actual characters." --Daniel 00:04, 29 July 2011 (UTC)[reply]
Yes, something along the lines of (but smaller than) w:Category:Foreign character warning boxes. --Bequw τ 02:46, 29 July 2011 (UTC)[reply]
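As a rough illustration of the data the character box carries (the character itself, its codepoint and its Unicode name), here is a minimal Python sketch built on the standard unicodedata module. The function name character_row and the dictionary layout are hypothetical; this is not how {{character info}} is implemented, and the code block name (e.g. Basic Latin) is left out because unicodedata does not provide it.

```python
import unicodedata


def character_row(ch: str) -> dict:
    """Return the fields a character box might show for one character."""
    if len(ch) != 1:
        raise ValueError("expected exactly one code point")
    return {
        "character": ch,
        # Codepoint in the usual U+XXXX notation.
        "codepoint": f"U+{ord(ch):04X}",
        # Empty string if this Python build's Unicode data has no name.
        "name": unicodedata.name(ch, ""),
    }


for ch in "!Aゝ":
    row = character_row(ch)
    print(row["character"], row["codepoint"], row["name"])
```

The three characters are the ones mentioned in the thread: for "A" the name largely repeats the character, while for "ゝ" the name alone is probably not enough for most English-speaking readers.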

Non-official "official" designations

In the page Appendix:Unicode/Combining Diacritical Marks, in the column of official designations, there is this:

  • "COMBINING GRAVE ACCENT (VARIA)"

Which in this page, is simply called:

  • "COMBINING GRAVE ACCENT"

I propose simply deleting that parenthesized word from that appendix, and doing the same for other, similar, examples such as "COMBINING ACUTE ACCENT (OXIA, TONOS)". If a character has multiple names, they can be mentioned in the entry. --Daniel 17:00, 28 July 2011 (UTC)[reply]

ftp://unicode.org/Public/3.0-Update/NamesList-3.0.0.txt has them; the parenthetical information has apparently been taken out somewhere between there and the most recent names list.--Prosfilaes 19:43, 28 July 2011 (UTC)[reply]
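The proposed cleanup could be sketched roughly as follows in Python: drop a trailing parenthesized annotation and compare the result with the name that the standard unicodedata module reports. The helper strip_annotation and the small appendix_names mapping are hypothetical, and the code points U+0300 and U+0301 are filled in here for the two designations quoted above; nothing of the sort was actually run against the appendix.

```python
import re
import unicodedata

# Matches one trailing "(...)" group, e.g. " (VARIA)" or " (OXIA, TONOS)".
_TRAILING_PARENS = re.compile(r"\s*\([^()]*\)\s*$")


def strip_annotation(designation: str) -> str:
    """Remove a trailing parenthesized annotation from a designation."""
    return _TRAILING_PARENS.sub("", designation)


# The two designations quoted in this thread, keyed by character.
appendix_names = {
    "\u0300": "COMBINING GRAVE ACCENT (VARIA)",
    "\u0301": "COMBINING ACUTE ACCENT (OXIA, TONOS)",
}

for ch, listed in appendix_names.items():
    cleaned = strip_annotation(listed)
    # The final column says whether the cleaned name matches unicodedata.
    print(f"U+{ord(ch):04X}", cleaned, cleaned == unicodedata.name(ch))
```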

"official" country names

I just put Republic of Armenia and Commonwealth of Australia in en:Countries, but now I'm not so sure. Perhaps we should have a sub-category, something like "Official country names"? ---> Tooironic 01:42, 29 July 2011 (UTC)[reply]

But the official names of countries change, not naturally over time but by fiat, and even what is "official" depends on which groups are recognized by whom. I don't think that's something a language dictionary should concern itself with. DAVilla 05:16, 31 July 2011 (UTC)[reply]
Should we even have those entries? —RuakhTALK 02:24, 29 July 2011 (UTC)[reply]
Well I don't think anyone would think about deleting United States of America, would they? ---> Tooironic 02:34, 29 July 2011 (UTC)[reply]
I would. I'm not saying, right here and now, that it should be deleted; but it's certainly not obvious to me that it should be kept. —RuakhTALK 12:26, 29 July 2011 (UTC)[reply]
I would consider country names to be common collocations, and they are also idiomatic. The Republic of Ireland isn't necessarily the same as Ireland. And the Republic of China? You get the idea. :) —CodeCat 14:43, 29 July 2011 (UTC)[reply]
I agree with Ruakh that the existence of an entry, even one that has strong emotional support (ie, not based on policy or mission or reason) from some or intuitive support, should not per se prevent us from considering policies that would lead to its deletion. DCDuring TALK 14:50, 29 July 2011 (UTC)[reply]
We already had votes on place names, we keep them. Keeping the official names separately seems like a bit of a waste - some will overlap with colloquial or idiomatic or may have variants. Russia and the Russian Federation are both official names for Russia. Ukraine has no other official form. --Anatoli 04:18, 1 August 2011 (UTC)[reply]
Official (as well as colloquial) country names sometimes change, but in most cases they will remain in the history books. In another decade or two, Americans will probably no longer be familiar with the word Soviet and won’t know what Union of Soviet Socialist Republics refers to. We never used that name much, preferring the shorter versions, but it will always be found in history books and old documents. —Stephen (Talk) 08:06, 1 August 2011 (UTC)[reply]

Partially italicized category name

I italicized the last part of "Category:English words suffixed with -ness". See its last revision.

Rationale: That's how text is formatted by {{term}} and {{suffix}}. Italic.

What do you think? Should this be done for all categories of English morphemes? --Daniel 03:50, 26 July 2011 (UTC)[reply]

To me, it seems like a better look, more consistent with our best practice elsewhere. DCDuring TALK 14:35, 29 July 2011 (UTC)[reply]
What has to happen so the text matches the pagename? DCDuring TALK 14:37, 29 July 2011 (UTC)[reply]
I updated {{suffixcat}} to try to make all titles of categories of suffixes in Latin script be italicized automatically.
This includes Category:English words suffixed with -hood and Category:Spanish words suffixed with -ez, and does not include non-Latin suffixes, such as Category:Japanese words suffixed with -員.
One possible and annoying side effect would be not recognizing the right script, and trying to italicize (or not italicize) some languages incorrectly. Please let me know if that happens. --Daniel 15:48, 29 July 2011 (UTC)[reply]
Can the template not just use the script template itself with face=ital, like {{term}} does? —CodeCat 16:21, 29 July 2011 (UTC)[reply]
But on the page it says "Pages in category "English words suffixed with -hood"", without "-hood" being in italics. Why do this if it isn't even consistent on the page? DCDuring TALK 16:24, 29 July 2011 (UTC)[reply]
That seems more like an oversight in the MediaWiki software itself. I agree though it is a bit inconsistent. —CodeCat 16:28, 29 July 2011 (UTC)[reply]
Yes, apparently the software itself is being inconsistent. --Daniel 16:38, 29 July 2011 (UTC)[reply]
I implemented the idea of face=ital, and it works. --Daniel 16:38, 29 July 2011 (UTC)[reply]
I've removed the if and allowed it to call the script template directly, I hope it works this way too. —CodeCat 16:40, 29 July 2011 (UTC)[reply]
Doesn't the current implementation break for the language-codes with the ugly magic prefixes? —RuakhTALK 19:28, 29 July 2011 (UTC)[reply]
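For readers who want to see the italicization rule spelled out, here is a minimal Python sketch of the behaviour described in this thread: italicize the suffix in the category title only when it is written in Latin script. It is an assumption-laden illustration, not the code of {{suffixcat}}, which relies on script templates; the functions is_latin and display_title and the name-prefix heuristic are hypothetical.

```python
import unicodedata

MARKER = " suffixed with "


def is_latin(text: str) -> bool:
    """Heuristic: every letter's Unicode name starts with 'LATIN'."""
    letters = [c for c in text if c.isalpha()]
    return bool(letters) and all(
        unicodedata.name(c, "").startswith("LATIN") for c in letters
    )


def display_title(category: str) -> str:
    """Wrap a Latin-script suffix in wiki italics (''...'')."""
    head, sep, suffix = category.rpartition(MARKER)
    if not sep or not is_latin(suffix):
        # Not a suffix category, or a non-Latin suffix: leave it alone.
        return category
    return f"{head}{sep}''{suffix}''"


for title in (
    "English words suffixed with -hood",
    "Spanish words suffixed with -ez",
    "Japanese words suffixed with -員",
):
    print(display_title(title))
```

A name-based check like this is exactly the kind of script detection that can misfire, which is the side effect Daniel warns about above.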

Variations: namespace

I created Wiktionary:Votes/2011-07/Variations: namespace, as a follow-up to Wiktionary:Votes/2011-06/Disambiguation: namespace.

Feel free to discuss this idea, amend it, support it, etc. --Daniel 12:39, 29 July 2011 (UTC)[reply]

ISO 639-3 updates

A new batch of changes for ISO 639-3 (3-letter language codes) was released by SIL. Most of the changes didn't affect languages in which we had any entries. I've implemented all the renamings except for those with new names with complicated characters ({{nee}}, {{gel}}, and {{bzx}}), though others should feel free to. I've deleted all the retired codes except for {{noo}} (Nootka or Nuu-chah-nulth). ISO split it into two and we should decide if we should follow that or not. There are only a few entries, translations, and categories. Does anyone have enough expertise to weigh in and potentially make the linguistic changes? --Bequw τ 17:02, 29 July 2011 (UTC)[reply]

stupid thing

Tooironic said this is a "stupid thing" (Please see here). If as Tooironic said that is true, should we get rid of these "stupid things"? 2.25.191.61 20:12, 29 July 2011 (UTC)[reply]

I think there are too many of those categories, but I don't think there is anything wrong with them in itself. I can imagine someone might find it useful to find other words written with the same character. On the other hand they could probably find those terms by looking in the index. So I would say a very weak delete... —CodeCat 20:51, 29 July 2011 (UTC)[reply]
Index:Japanese has some useful contents, such as an ordered list of words, much like Index:English. However, as of today, the Japanese index does not do the work of Category:Japanese terms by their individual characters. --Daniel 21:15, 29 July 2011 (UTC)[reply]
One of the main functions is for learning purposes, as well as for the Chinese language (please see an example of a dictionary here). 2.25.191.61 22:12, 29 July 2011 (UTC)[reply]
Just a note — this is a perfectly valid discussion to have, but I think your tone is needlessly combative. Also, I'm inclined to say that this decision should be made by the editors who are actually working on Japanese. If they consider it useful to have separate derivations category for each kanji that appears in (say) more than twenty words, then I don't see a need for the rest of the community to interfere with that. (Some decisions are worth making on a project-wide basis — or even a WMF-wide basis — but this doesn't appear to be one of them.) —RuakhTALK 21:04, 29 July 2011 (UTC)[reply]
We should consider the matter itself (ignoring personalities). Is "more than twenty words" based on rules? How about {{derv}}? 2.25.191.61 21:34, 29 July 2011 (UTC)[reply]
Moreover, category of Derived terms are useful indeed, please see an example of a dictionary here. 2.25.191.61 22:32, 29 July 2011 (UTC)[reply]
You haven't actually given an argument to support your views. All you've done is say 'this is how it is' and then link to a site that does the same. Imagine you were arguing for or against the death penalty or something like it. If you were in favour, you could say 'look at the USA, they have the death penalty so it's right'. If you were against, you could say 'look at Germany they don't have the death penalty so it's wrong'. It's not an argument, it just doesn't work that way. You need to explain why we, at Wiktionary, should do it. Why do you think it's a good thing? Why do you think it improves things and makes Wiktionary better? —CodeCat 22:40, 29 July 2011 (UTC)[reply]
Please go to the point. The point is somebody said that category of derived terms is a stupid thing. I have linked to an example of a dictionary to show that it is useful. 2.25.191.61 23:00, 29 July 2011 (UTC)[reply]
But why is it useful? And above all, why is it useful to Wiktionary? You haven't answered that. —CodeCat 23:04, 29 July 2011 (UTC)[reply]
Can you please stop bringing 'rules' into every discussion? It's getting a bit annoying. I've already explained to you how it works. Wiktionary discussions are not court rooms, we're not here to see whether our practice conforms to the law. Think of discussions as parliament if that makes it easier to understand. You wouldn't say to a parliament 'you can't make that legal, it's against the law!' We're discussing the law itself here, which makes appeal to rules completely pointless. —CodeCat 21:42, 29 July 2011 (UTC)[reply]
Of course we can make a rule and change a rule as well. Anyhow, we should have rules to follow, otherwise things will be out of order. 2.25.191.61 22:33, 29 July 2011 (UTC)[reply]
I still don't see what your point is. We are discussing the rules right now, right here. Why are you arguing about rules instead of making suggestions on how to change the rules? —CodeCat 22:40, 29 July 2011 (UTC)[reply]
This is just a response to your words. So, please go to the point. 2.25.191.61 22:49, 29 July 2011 (UTC)[reply]
I'm not even sure what the point is right now. I've already made mine, above. Right now I'm waiting for others to join the discussion, because I don't think two people can make a consensus. —CodeCat 22:57, 29 July 2011 (UTC)[reply]
If the point is just that somebody said that some project is a stupid thing, then this discussion should be over already. The answer to "If as Tooironic said that is true, should we get rid of these "stupid things"?" is no, because Tooironic's comment quoted by the anon is a poor argument when devoid of context. Either you elaborate it, or there is no point. If further arguments are provided, (or old ones get revisited) then a number of categories may or may not be deleted, as a result.
Granted, many subprojects of the big project of having categories for individual derivations are controversial, and may benefit from some constructive discussions, but I barely see one here. --Daniel 23:33, 29 July 2011 (UTC)[reply]
I don't think they're stupid, but they do seem fairly useless and so trivial as to be better performed in more mechanical ways. DAVilla 05:08, 31 July 2011 (UTC)[reply]

Why is zhìyuànzhě deleted? 2.25.193.78 12:07, 30 July 2011 (UTC)[reply]

This is not the appropriate forum for that question — and I'm quite convinced that you already know that. —RuakhTALK 13:39, 30 July 2011 (UTC)[reply]
I have found the link now, please see here. 2.25.193.78 17:44, 30 July 2011 (UTC)[reply]

revision of context label to better support Mandarin entries

The {{context}} label has been modified so that one can now include script information. Everything seems to be working fine with the exception of a special case involving Mandarin words that don't have a simplified form. For this situation, the behavior should be such that the word gets placed in both simplified and traditional categories by the template. This would be done by designating script=traditional and script2=simplified. Furthermore, the traditional form should be sorted according to radical/stroke order, while the simplified should be sorted according to pinyin order. For example, the template in the entry for 哀求 would look like:

# {{Advanced Mandarin|lang=cmn|script=traditional|script2=simplified|skey=口06|skey2=ai1qiu2}} to [[implore]]; to [[entreat]]

Everything seems to be working just fine with the script/skey portion, but nothing seems to be happening with the script2/skey2 portion. I should be seeing something at the bottom of the entry that looks like: Category:cmn:Advanced Mandarin in simplified script that is sorted according to pinyin/tone order (ai1qiu2), but nothing is there. Help :) -- A-cai 12:20, 30 July 2011 (UTC)[reply]

Fixed (there was a problem in {{context}}). Let me know if there's any other related ones. --Bequw τ 20:34, 29 August 2011 (UTC)[reply]
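To make the intended behaviour concrete, here is a small Python sketch of the logic A-cai describes: each script/skey pair should yield its own category with its own sort key. It is only a model of the expected result, not the wikitext of {{context}}; the function script_categories is hypothetical, and the traditional-script category name is assumed by analogy with the simplified one quoted above.

```python
def script_categories(label: str, lang: str, params: dict) -> list:
    """Return (category, sort key) pairs, one per script parameter given."""
    pairs = [("script", "skey"), ("script2", "skey2")]
    result = []
    for script_param, key_param in pairs:
        script = params.get(script_param)
        if script is None:
            continue  # this script slot was not used in the template call
        category = f"Category:{lang}:{label} in {script} script"
        result.append((category, params.get(key_param, "")))
    return result


# Parameters from the 哀求 example above.
params = {
    "script": "traditional",
    "script2": "simplified",
    "skey": "口06",
    "skey2": "ai1qiu2",
}
for category, sort_key in script_categories("Advanced Mandarin", "cmn", params):
    print(f"[[{category}|{sort_key}]]")
```

The bug reported above corresponds to the second pair being silently skipped, so only the traditional-script category appeared.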


How to treat participles on Wiktionary

I'd like to continue the discussion started above at Inflected German participles, but this time not only for German, but cross-linguistically since it turned out to be a problem that concerns many languages. Ok, so the basic question is how participles are to be treated best on Wiktionary. An example for participles in English would be playing as the present participle and played as the past participle of play. Let me sum up the previous discussion. Traditionally, participles are treated as verb forms, so they normally appear in inflection tables of verbal infinitives (see here for German spielen), in Wiktionary too (German, Dutch, French...). The tricky point is: Often participles are used as adjectives in sentences (and can then be declined like normal adjectives). This goes as far as that, for example, the German present participle cannot be used as a verb, only as an adjective (or adverb). This might apply to other languages and really questions whether such participles should be put under "Verb" headers (as is currently done in German and probably most other languages), and even whether they should appear in verb inflection tables.

All in all it seems that the current "German" way of treating participles is rather bad. I know of two other possible solutions. One was proposed by Dan Polansky above. When participles only appear as adjectives (such as German present participles) they don't get a Verb but an Adjective header. When participles are used both as verbs and as adjectives (such as most German past participles), they get both headers. Personally, I think this solution makes sense except for two problems: First, for almost any verb we'd have a verb as well as an adjective section for its past participle. To me this seems redundant, but I also understand the contrary attitude that it's more clear-cut. A more serious problem would be that there appear to be cases where participles are used ambiguously so one cannot tell for sure whether they are verbs or adjectives -- e.g. German das Haus ist gebaut, Dutch het huis is gebouwd (thanks to CodeCat), French Il a sucré son café, puis a bu le café sucré (thanks to Lmaltier). If it's true that participles are something in between verbs and adjectives here, another solution might be appropriate, and that solution is already being used in Latin. For this language, there's a separate Participle header which subsumes the different Latin participles. See āctus for an example. What's the downside of such an approach? As I said, participles can be inflected, and such inflected participle forms (such as āctī) are also under a Participle header. This misses the fact that those forms are completely unambiguously used as adjectives (ambiguous cases can still be inflected, as in Spanish la casa está construida, thanks CodeCat), and "participle" is probably not a proper part of speech either.

That's quite complicated, and if anything's unclear or if I put something wrongly, I'm looking forward to your comments. So, how do participles behave in other languages? How are they treated on Wiktionary, and do you think it makes sense? What do you think about the Latin way? Is there possibly a uniform way to represent participles on Wiktionary independent of language, or should we continue to have language-dependent ways of treating them? But, as I said, all the current ways I know of have flaws. At the end of the discussion, of course I'd like to have a good solution for German, but if other languages benefit, so much the better. Longtrend 10:37, 9 July 2011 (UTC)[reply]

The inflected forms are not always unambiguously adjectives either. In French, for example, when a past participle is used to form the perfect tense, it still inflects based on the gender and number of its direct object. So they could arguably be considered 'declined verb forms'. —CodeCat 11:40, 9 July 2011 (UTC)[reply]
Thanks for the notice, I missed that. In languages that inflect for case, it would be more appropriate to say "non-nominative past participle forms are used unambiguously as adjectives". Longtrend 12:11, 9 July 2011 (UTC)[reply]
That may not be accurate either. In early Old Norse, the agreement in the perfect tense was actually the accusative, which later became specifically neuter accusative, but still agreed in gender in earlier texts. This example is found in Völuspá (with the agreement in bold): hverir hafði lopt alt lævi blandit , eða ætt iotuns Óðs mey gefna , with the first agreement being neuter nominative/accusative, but the second is feminine accusative. This is because the combination of participle and object was still considered an object of 'to have' in that language, and was therefore placed in the accusative case. That is, a sentence like 'I have painted a door' was not distinguished grammatically from 'I have a painted door' or 'I have a door painted'. —CodeCat 12:33, 9 July 2011 (UTC)[reply]
That's interesting. I think I'd better not try another generalization :) But your example seems to be a strong argument in favor of the thesis that participles are (or can be) something in between verbs and adjectives -- that is, if we are going to treat participles uniformly across languages; otherwise it's at best an argument for Old Norse. Longtrend 12:57, 9 July 2011 (UTC)[reply]
Hungarian has present, past, future, and adverbial participles. The Etymology section contains the information that this entry is the participle of a verb. There can be adjective and noun sections to illustrate the appropriate usage and declension. See for example nevelő, the present participle of nevel (to educate). --Panda10 13:11, 9 July 2011 (UTC)[reply]
Latin participles also come in past, present, and future, and have voice (active or passive) as well. There are some Latin participles that were used as adjectives, but since Classical Latin did not always clearly distinguish between adjectives and nouns (they had the same inflectional endings), this means that some participles were used as substantive nouns. In fact, the future passive participle eventually came to replace gerunds and infinitives to function as a noun. However, it still had a verb function in the passive periphrastic conjugation, and was never used in the nominative (you had to use a verbal infinitive for that). In other words, the situation was rather complicated as to what part of speech these things were. For Latin, we've chosen simply to recognize "Participle" as a separate part of speech because it simplifies everything. Other languages are free to make similar choices in how they handle their parts of speech, but I don't think there's a single way to handle everything that will work across all languages. --EncycloPetey 14:12, 9 July 2011 (UTC)[reply]
We should not invent anything: words should be addressed according to traditions of each language. In French, it's clear that participles are verb forms, not adjectives, and that adjectives are not participles, are not verb forms. I provided an example of a sentence with an ambiguous meaning. This sentence shows that this is not always an easy distinction, and this is a good reason to make it as clear as possible here, this is not a reason to blur the difference. Lmaltier 15:22, 9 July 2011 (UTC)[reply]
"words should be addressed according to traditions of each language" -- so in your opinion, we should treat German present participles as verb (form)s, even though they are never used as such, just because they are traditionally regarded as verb forms? "In French, it's clear that participles are verb forms, not adjectives" -- how come past participles inflect for gender in predicative use then, a behavior you can only find in adjectives otherwise? Longtrend 12:03, 10 July 2011 (UTC)[reply]
French past participles are inflected in some cases, yes, but this does not make them adjectives. Actually, I think that the distinction between participles and adjectives is exactly the same in English and in French. I also think that all German verbs have compound tenses, and, therefore, that all German participles are actually used as verb forms. Am I wrong? Lmaltier 13:56, 10 July 2011 (UTC)[reply]
Something that's still not clear to me is just when something is a verb and when it's an adjective. I can understand that finite verb forms are verb forms... but what about non-finite forms? Why are they verb forms? Etymologically they are often not verb forms at all (like in the Old Norse example; Romance participles have a similar history), so why do we call them verb forms now? —CodeCat 14:11, 10 July 2011 (UTC)[reply]
Because (1) we speak English, (2) English has become less inflected and so its grammar has changed, (3) the original categories for parts of speech were set up by the Romans and Greeks, and (4) we have a better understanding of grammar in the 21st century. --EncycloPetey 14:29, 10 July 2011 (UTC)[reply]
That still doesn't answer my question though. Why are they verb forms now when they were not originally? What about them makes us consider them verb forms? —CodeCat 14:35, 10 July 2011 (UTC)[reply]
In English, our classification of -ing forms and -ed forms and specific senses thereof depends on such things as whether there is a corresponding base form, and whether the forms behave like adjectives or nouns. The verb form is assumed to exist because it is hardly ever possible to find such forms never modified by any adverb. If derived from transitive verbs, they usually take complements just like other forms of the verb.
The conversion process of denominal verbs seems to sometimes begin with -ing and -ed forms. For example, one can be coffeed out or coffeed up, but instances like "He coffees himself up every morning" are more rare.
The answer to the question seems to be simple: when you think of the verb when using the word (when you think of the action expressed by the verb), it's a participle, a verb form (even when an ellipsis blurs this fact); when you don't think of any action, only of a characteristic of the thing (not of how the thing got this characteristic), then it's an adjective. See adjective and verb for definitions. Lmaltier 16:09, 10 July 2011 (UTC)[reply]
@Lmaltier: I still don't quite understand your analysis of French participles. You probably know what I was talking about, but to be sure here's an example: Le café est sucré vs. La sauce est sucrée (excuse me if those sentences are wrong -- I just have some very basic knowledge of French, but you get what I mean). Correct me if I'm wrong, but sucré(e) behaves just like an adjective and nothing like a verb here -- you can perfectly replace it by a "proper" adjective but not by a "proper" verb. So what makes you think it's a verb other than 1) tradition and 2) the fact that it's obviously derived from a verb (which is not sufficient, as Dan Polansky convincingly demonstrated above -- the fact that in English almost every verb can be "agentivized" (for lack of a better word) by -er doesn't make the new forms verbs)? And as for German: Yes, you are wrong in your assumption that all participles are used in compound tenses. Present participles are never used in such constructions. In English, I perfectly agree with the analysis that present participles are (or can be) verbs, since there are such cases as I am playing -- however, there is no equivalent form *Ich bin spielend in German or any other complex verbal constructions with a present participle. Longtrend 17:00, 10 July 2011 (UTC)[reply]
You misunderstand me. In your examples, used alone, they are not verbs: very clearly, both sucré(e) are adjectives. They refer to a characteristic of the thing. In Il a sucré le café or (passive form) La sauce a été sucrée avant d'être servie, it's also very clear that they are not adjectives, they are verb forms. The same applies to present participles (this is an easy case, as present participles are never inflected in French: when they can be inflected, then the words are not present participles, they are adjectives). For German, I was thinking of past participles. But, for German too, I think that the criterion should be: do you think of the action expressed by the verb or not? The difference between an adjective and a verb is not related to a suffix or anything of the kind, it's related to how it is used and what is meant by people using it; do people want to use the verb (to refer to an action), or do they want to use an adjective (to refer to a characteristic)? Lmaltier 18:43, 10 July 2011 (UTC)[reply]
Your criterion is semantics, which is not valid. Expressing an action is neither a necessary nor a sufficient condition for being a verb. Verbs can also express characteristics ("shine") and nouns can express actions (just take the word "action") -- whether in some language there are adjectives that express actions I can't tell, but probably there are. We define parts of speech not semantically, but syntactically. Back to Le café est sucré, couldn't that also be a passive sentence (perhaps continued by "par...")? In this case the participle could be analysed as a verb, couldn't it? Longtrend 19:14, 10 July 2011 (UTC)[reply]
But the definition of verbs and adjectives includes important semantic considerations! If you forget them, you won't be able to make the distinction in difficult cases. Of course, some verbs are not action verbs, but they probably don't cause problems. You are right: in Le café est sucré, sucré is an adjective, but in Le café est sucré par mes soins., it's a verb. It's exactly like sugared in English. Lmaltier 19:30, 10 July 2011 (UTC)[reply]
Many east Asian languages have verbs that express states or properties rather than actions, as does Esperanto ("mi estas blua" and "mi bluas" both mean 'I am blue', "mi estas bluinta" means 'I have been blue'). Several old Indo-European languages also have stative verbs, which are semantically very much like a copula and a participle in English. —CodeCat 19:40, 10 July 2011 (UTC)[reply]
English and French too have verbs that express states. But are there examples of an unclear status (verb(participle) or adjective?) for these verbs? Lmaltier 19:53, 10 July 2011 (UTC)[reply]
There is when dealing with Latin. There is a whole set of Latin deponent verbs whose meaning can only be conveyed in English using adjectives. A Latin scholar would identify the Latin translation as a verb, but only because it has verb endings and not because of any functional or semantic distinction. Latin participles are likewise not always verbs but primarily for the reason that they take the endings of an adjective, inflecting for gender which Latin verbs don't do. And yet the "participial form" is listed as a verb form in most texts and conjugation tables, and forms part of certain compound conjugations. So, in Latin the "verbness" of a participle comes from its tense and context, but its "adjectiveness" comes from its gender and inflectional endings. --EncycloPetey 20:21, 10 July 2011 (UTC)[reply]
How to treat participles on Wiktionary — AEL
· [de-indenting] I agree with Lmaltier about sucré. Let me give a similar example, but in English. Take the sentence “At 3:00 PM, the window was closed”: it can mean either “At 3:00 PM, someone closed the window”, or else “At 3:00 PM, the window was not open”. When it has the former sense, it's a use of the participle: “was closed” is just “closed” cast into the passive voice. When it has the latter sense, it's a use of the adjective: “was closed” means “was a closed window”. The important point is that this ambiguity is specific to the word closed. English has a lot of participial adjectives, but it also has a lot of participles that do not double as adjectives. “At 3:00 PM, the window was opened” has only one meaning. (The analogous alternative meaning would be expressed as “At 3:00 PM, the window was open.”) So it's hard to imagine a solution that uses just a single POS header for words like closed: even though participles are often called "verbal adjectives", we still must distinguish between those that double as real adjectives and those that do not. The former clearly need an ===Adjective=== POS header in addition to whatever POS header the latter have; and I think it's clearly a bad idea to use ===Adjective=== for words like "opened".
· I agree also with Lmaltier that we should generally follow language-specific traditions. That doesn't necessarily mean following two-hundred-year-old theories of grammar; there are current active linguistic traditions for all of these languages. If all of the linguists working on German describe the present participle as a verb form, then we should at least figure out why that is, before just deciding that we know better!
RuakhTALK 20:46, 10 July 2011 (UTC)[reply]
Thanks for your input. Actually, I agree with you on almost all points. sucré was probably a very bad example to argue for my position, since it has developed a new adjective meaning and usage independent of the participle. Just like closed, it falls under the category of what I dubbed "lexicalized participles" in the initial discussion above (which, surprisingly for me, seemed to be rather unintuitive to many). I absolutely agree with you that such lexicalized participles indeed need two sections -- one Adjective section for the lexicalized usage, and one for the participle, and I think we only need to discuss the latter, since (from my point of view) in many cases it's really unclear whether we are dealing with verbs or with adjectives here (or perhaps even with ===Participle===s?). As an example, imagine English had gender, and in the sentence At 3:00 PM, the window was opened the word opened agreed in gender with the subject window. Would we still be so sure that opened was a verb, it would be declined for gender, after all? It's more than a hypothetical situation, this is exactly what we find in Spanish and French and probably many other languages: La respuesta está obviada "The reply is avoided" -- obviada has feminine gender here which comes from the feminine respuesta, and as far as I know it is not the case that "obviado" has developed adjectival meaning and usage. So what about cases like that?
As for German current linguistic tradition, it's certainly not the case that all linguists describe the present participle as a verb form. It's what you learn at school, and in many cases present participles are listed in verb conjugation tables. For example, the Institut für Deutsche Sprache describes past participles as inflected forms of elements of the word class verb and present participles as adjectives formed from verbs by word formation. canoo.net, on the other hand, lists present participles in its grammar as infinite verb forms, but then says that "all present participles have the form and the function of adjectives" and also lists them as adjectives in its dictionary (e.g. spielend). I'll see if I can consult some printed grammars. Longtrend 09:16, 11 July 2011 (UTC)[reply]
@Longtrend, re: verb forms and gender: A word form's having a gender that matches the subject of the sentence does not speak against the form's being a verb form. Czech simple past tenses of verbs show the gender of the subject of the sentence, as in the verb dělat (to do) with its masculine simple past tense dělal, its feminine simple past tense dělala, and its neuter simple past tense dělalo. The same thing is seen in Russian, in its делать, де́лал, де́лала, and де́лало. Unlike these languages, German simple past tense machte does not show gender. --Dan Polansky 10:06, 11 July 2011 (UTC)[reply]
Sorry, I was unclear here. Of course claiming that verbs cannot inflect for gender would be wrong. My point is that in the languages under consideration, inflection for gender does not happen (I hope this is correct), except for the dubious cases of participles, so we'd have to assume that for some reason verbs inflect for gender in that kind of construction and only there. But maybe that's not too good an argument, since in Czech gender agreement on verbs only seems to happen in simple past forms, too. Still: even if there is no strong evidence that we are dealing with adjectives here, is there any evidence that they are verbs? Or is it possibly adequate to say that participles in such positions are something "in between"? Longtrend 10:24, 11 July 2011 (UTC)[reply]
Czech forms that have a similar function as English past participles (called Czech "passive participles" per W:Czech conjugation, whyever) show gender equally well as Czech simple past tense forms: dělán m, dělána f, děláno n, of dělat. They resemble their corresponding adjectival forms: dělaný m, dělaná f, dělané n. For example, "je dělán" corresponds to German "wird gemacht" and English "is made" or "is being made". --Dan Polansky 11:33, 11 July 2011 (UTC)[reply]
@Longtrend, re gender: I see no contradiction whatsoever in saying that a participle (a "verbal adjective", as they're often called) is a non-finite verb form that (often) has various adjective-like properties, including (often) agreeing in gender/number/case/definiteness/etc. with a modified noun. And there's no need to imagine a hypothetical English-With-Gender; in actual English, verbs do not agree with their subject at all ("I/we/you/he/she/it/they went") — except for present-tense verbs, which display a bit of agreement, and be, which displays a bit more agreement. Do we therefore say that be is a different part of speech — say, ===Copula=== rather than ===Verb=== — and that present-tense verbs are a weird in-between form that has properties both of a ===Verb=== and of a ===Copula===? —RuakhTALK 12:20, 11 July 2011 (UTC)[reply]
As you already said, participles are sometimes called "verbal adjectives", and some experts don't even give a POS for them but say simply that they are "lexical items" that have "characteristics and functions of both verbs and adjectives" (see here). So discussing a ===Participle=== header is not as absurd as your analogy with English present tense verbs suggests (nobody doubts their verbal status). Of course inflection is only one criterion, there are other criteria that solidly confirm that English present tense verbs are verbs, such as position in the sentence. But I still miss any such criteria for past participles, let alone for present participles. I could agree very well with the approach to treat participles as verbs if they are used to form complex tense or voice constructions. This is the case in English with both present and past participles, so personally I would not change anything about the "English way" (unless we are going to find one solution for all languages). But this doesn't help for German present participles, since they are neither used as stand-alone verbs nor to form complex constructions. So what are they? Longtrend 16:57, 11 July 2011 (UTC)[reply]
If, when you use them, you think of the verb, of the meaning of the verb, and you feel that you are using the verb, then they are verb forms. In French too, the phrase adjectif verbal is used by some authors, but it's misleading, because they are not verbs at all, their only relationship with verbs is etymological. And these authors don't use this phrase for participles... Lmaltier 18:15, 11 July 2011 (UTC)[reply]
That doesn't always work either. When I think of verwarring in Dutch, I definitely think of verwarren. The form with -ing is very predictable like this in Dutch. But it's not a present participle like in English, it's a verbal noun. I've never heard of this form being considered a verb form any time, but I still think of the verb when the word is mentioned. —CodeCat 18:23, 11 July 2011 (UTC)[reply]
Well, in some languages (Bulgarian...), such forms, even nouns, are traditionally mentioned in conjugations. This is why traditions of the language are important. Your reference is right when stating that participles share characteristics of verbs and adjectives. Actually, they are verb forms with some characteristics of adjectives. But it's wrong when stating "In English, participles may be used as adjectives" (cf. opened, see above). Lmaltier 18:29, 11 July 2011 (UTC)[reply]
Are there participles that can not be used as adjectives? Or can all participles behave as an adjective in all languages that have them? It seems more economic to me to say 'participles are adjectives that may sometimes be used as verb forms' than 'participles are verb forms that can always be used as adjectives'. —CodeCat 18:52, 11 July 2011 (UTC)[reply]
I just answered: opened is a participle, and is not an adjective. And, in French too, corresponding adjectives don't exist for all participles: they're rather common, but not systematic at all, for past participles, and much less common for present participles (note that, for present participles, derived adjectives often have the same pronunciation as the participle, but not the same spelling, e.g. intriguant is a participle, intrigant is the adjective derived from the participle). Lmaltier 19:07, 11 July 2011 (UTC)[reply]
I'm sorry, that's not what I meant. By 'adjective' I meant 'showing adjective-like behaviour', not necessarily having 'adjective' as its part of speech. Opened can be used like an adjective: the opened door. So my question is, are all participles able to be used as adjectives? Are they all able to be used in non-adjectival ways (which apparently implies 'as a verb form')? —CodeCat 19:10, 11 July 2011 (UTC)[reply]
By that approach, we might as well list all words as ===Adjective===, since all words show adjective-like behavior. —RuakhTALK 19:20, 11 July 2011 (UTC)[reply]
@Lmaltier: You honestly think that English participles cannot be used as adjectives? So all these are wrong? Do you have any reason for asserting that apart from your "emotional" analysis? If not, what is that analysis you're proposing based on? Even if we accept a semantic analysis, it's really fuzzy. When I say watcher as a nominalization of watch, I certainly think of the action expressed by the verb. So, is watcher a verb in your opinion? All the syntactic evidence suggests it's a noun, and we treat it as a noun. Longtrend 19:24, 11 July 2011 (UTC)[reply]
In my opinion, in the opened door, opened is used as a participle, not as an adjective. I think that it can be considered as an ellipsis for the door which has been opened. But you probably know better than me.
@ CodeCat: I already answered your first question just above. I add that, in French, past participles of 100 % intransitive verbs are never inflected, it would be quite absurd to consider that they behave as adjectives. Second question: yes, in French, participles can always be used as verb forms (as they are verb forms). In English too. Most typical uses in French (not the only ones) are in compound tenses for past participles, and in the "en + participle" form for present participles. These forms are clearly verb forms.
About watcher: of course, you don't feel that you use a verb when you use watcher, you feel that you use a noun derived from the verb. Of course, it's not a verb form. Lmaltier 19:34, 11 July 2011 (UTC)[reply]
I just fixed intrigant: I removed the verb form section for French (it was a Tbot mistake). As you can see, considering that participles = verbal adjectives leads to serious mistakes. Lmaltier 19:34, 11 July 2011 (UTC)[reply]
Do you care to explain why "of course" watcher is not a verb form but participles undoubtedly are? I'm sorry, but your criterion just seems to be circular and fuzzy. Why does a word belong to a certain POS? Because you feel it. Why do you feel that it belongs to the POS? Because it does. Longtrend 19:59, 11 July 2011 (UTC)[reply]
I never explained that all words directly derived from verbs are verb forms. I even explained that adjectives derived from participles are not verb forms, and that verb forms are not adjectives, even if they share some characteristics. Lmaltier 21:19, 11 July 2011 (UTC)[reply]
@ Longtrend: I don't speak German, so it's impossible for me to judge; but there are other things you can look for. For example, in English, a transitive verb's present participle can take a direct object even outside of explicit progressive/continuous constructions: “while heating the milk, continue checking the temperature and consistency”. (There are a few adjectives that take directly construed complements, as in “it was worth every penny”, but that's very unusual among adjectives, but absolutely universal among transitive verbs' present participles.) —RuakhTALK 19:20, 11 July 2011 (UTC)[reply]
Yes, this is possible in German as well. The Institut für Deutsche Sprache already quoted above states, in my translation: "The present participle -- unlike the past participle -- is never used as a part of analytical verb forms but only in contexts where adjectives occur otherwise. However, present participles show a verbal 'heritage' through their valency". So on the one hand, valency is an argument for verb status of present participles, but on the other hand, both inflection and distribution are arguments for adjective status. (Besides, I'm not sure why you accept valency as an argument for verb status of present participles [there are only few other adjectives taking direct objects] but at the same time reject gender agreement as an argument for adjective status of past participles [there are no other verbs inflecting for gender in French or Spanish].) Longtrend 19:46, 11 July 2011 (UTC)[reply]
There are also many languages in which past participles can inflect as adjectives even if they are from an intransitive verb. I think Latin is an example, and so is modern Icelandic: hann er kominn (he has come) but hún er komin (she has come), the endings of 'come' differ based on the gender of the subject. This is apparently unlike French (it would literally translate as il est venu and elle est venue), but it just shows how much variation there is in each language. —CodeCat 20:00, 11 July 2011 (UTC)[reply]
Sorry if I'm misunderstanding you, but « il est venu » and « elle est venue » are exactly how you say it in French. I guess you're thinking that in French it would be *« il/elle a venu »? Most French verbs form the perfect by using avoir (to have) and an uninflected past participle, and that's the case we were talking about above, but a bunch of common ones, including venir, form it using être (to be) and an inflected one. (Lmaltier erred when he wrote that "past participles of 100 % intransitive verbs are never inflected", unless he was rounding to the nearest percent. :-) ) Some verbs, by the way, can go either way, depending on syntax or semantics or speaker preference. And some use être and an uninflected past participle, for reasons that make sense if you know French but aren't worth going into if you don't. —RuakhTALK 20:46, 11 July 2011 (UTC)[reply]
If that's the case, then it seems to me that such a sentence is just a subject, copula and an adjective, much like 'elle est verte'. Venu is simply an adjective that means 'in a state of having come' (also etymologically), parallel to 'in a state of being green'. —CodeCat 20:49, 11 July 2011 (UTC)[reply]
Yes, I was wrong, venir is an intransitive verb with an inflected past participle (but I was meaning always intransitive verbs, not 100% of intransitive verbs). What I was having in mind was only verbs using avoir, the common case. And, no, in this sentence, venue is not an adjective, no Francophone would consider it as an adjective, it's part of the "passé composé" of the verb. Lmaltier 21:10, 11 July 2011 (UTC)[reply]
[after e/c] @CodeCat: No, sorry. I see why you would say that, and that may well be the origin of the construction; but in everyday Modern French « elle est venue » can simply mean "she came", without any implication about present circumstances. (And even in literary French, which retains a separate preterite construction « elle vint » for that sense, one can write something like « elle est venue trois fois », meaning "she has come three times", where I think it's a bit farfetched to posit a state of "having come three times". Certainly in English you can't say "the window is open three times".) —RuakhTALK 21:20, 11 July 2011 (UTC)[reply]
@Longtrend: I'm not rejecting gender agreement as an argument for adjective status, I just don't see it as conclusive. In French and Spanish, it is not only adjectives and sometimes past participles that show gender agreement, but also determiners (la femme, la mujer) and many pronouns (elle, ella; la tienne, la tuya); and many animate nouns come in masculine–feminine pairs that resemble gender agreement (japonais(e) [ADJ] vs. un(e) Japonais(e) [N]; japonés/esa [ADJ] vs. un(a) japonés/esa [N]). And of course, many Slavic, Afro-Asiatic, and other languages have gender agreement even in finite verb forms, so it's not like it's unheard-of. —RuakhTALK 21:57, 11 July 2011 (UTC)[reply]
How to treat participles on Wiktionary — AEL 2

I'm not sure about languages other than English, but in English, there are some simple syntactical clues to tell whether a participle form has split off and become a full adjective. If it can be modified by very, it certainly exists as an adjective (and continues to exist as a participle). You can't say, for example, that the sandwich was *very eaten, that the letter was *very typed, or that the world was *very created. I suspect a similar test would work in French. Would très créé, très dactylographié, or très mangé be acceptable? Of course, this doesn't work all the time because not all adjectives are gradable. Another test is to see whether it can be the complement of certain linking verbs other than be (particularly become), for example he became closed, the movie became interesting, and the muscles became bruised, but not *the letter became typed, *the sandwich became eaten, or *the world became created.--Brett 01:45, 12 July 2011 (UTC)[reply]

Yes, the sense of adjective is exactly the same in English and in French. Lmaltier
The test with 'became' only works for English, because in Dutch de boterham werd gegeten (the sandwich became/was eaten) is not just valid, it's very common. The test with 'very' doesn't always work either, because there are certain verbs that indicate a progressive action. These are especially common in Dutch, where they begin with ver- (although not all verbs in ver- have this progressive aspect). In these verbs, very would simply indicate that the progress had continued to an exceptional degree. decomposed is a good example: it was very decomposed. This does not necessarily indicate an adjective, since you could easily imagine that the decomposition process had progressed to a significant degree. There are probably a lot of other verbs like this. I'm not arguing that this means decomposed is a verb form in such cases, I'm just saying that the test is ambiguous. —CodeCat 10:26, 12 July 2011 (UTC)[reply]
As I said, I was making the specific point for English, but it seems likely that, in Dutch or other languages, there would be certain modifiers that will modify verbs and not adjectives or adjectives and not verbs. It might not be the equivalent of very, but there may be something. Similarly, while the Dutch word for become may take both verbs and adjectives as complements, there is likely some verb that will take only adjectives (or AdjPs) as complements.--Brett 11:09, 12 July 2011 (UTC)[reply]
I know nothing about Dutch, but would de boterham schijnt/lijkt gegeten be grammatical?--Brett 12:33, 12 July 2011 (UTC)[reply]
It would be grammatical even though it sounds a little strange, mostly because people would not say it that way. Dutch has a separate verb opeten which is used when something is eaten completely. It's also more usual to add te zijn after 'schijnen' or 'lijken' and an adjective: de boterham schijnt/lijkt opgegeten te zijn (the sandwich seems to be eaten up), just like de boterham schijnt/lijkt rood te zijn (the sandwich seems to be red). But de boterham schijnt/lijkt gegeten is not really wrong, because people will understand 'gegeten' as an adjective. —CodeCat 12:52, 12 July 2011 (UTC)[reply]
That's true in English as well; participles can productively be turned into adjectives. (Just as you can reply to "Are you inside yet?" with "Very inside", even though "inside" is a preposition rather than an adjective and "very inside" doesn't have a single specific meaning, you can reply to "Is it eaten yet?" with "Very eaten", even though "eaten" is a participle rather than an adjective and "very eaten" doesn't have a single specific meaning. For example, it could mean that even the crumbs got eaten; or it could just mean that it was eaten a long time ago: "Am I too late? Is the cake eaten yet?" "Very eaten. You're about a week too late." That doesn't mean that eaten is normally an adjective, only that participles can be stretched into use as adjectives.) —RuakhTALK 13:37, 12 July 2011 (UTC)[reply]
The discussion is going a bit in circles right now. If they can be used as adjectives in all cases (not including cases that some 'known' adjectives lack, such as comparison), then why are they not adjectives after all? It doesn't really matter if they have extra properties that most other adjectives don't. Do they meet all the minimum requirements to qualify as adjectives? —CodeCat 14:45, 12 July 2011 (UTC)[reply]
All words can be used as adjectives. The point of parts of speech is not "is it remotely possible to use this word in this way?", but rather, "is this how this word is normally used?" It is possible to press participles into service as adjectives, and this is a fairly productive process: plenty of normal adjectives (tired, interesting, closed) began life as participles. But most participles are not normally used this way. —RuakhTALK 14:58, 12 July 2011 (UTC)[reply]
Historically it's actually the opposite. The oldest participles in English actually began life as adjectives and only later became used as verb forms. Proto-Indo-European had no periphrastic tenses (or even tenses at all!), and even in Proto-Germanic participles were still mostly adjectival (compare the Old Norse and Icelandic examples above, which closely reflect the PG situation). I realise this doesn't really change the situation for English as it is currently spoken, but it does point out that the question of 'which was first' is definitely 'adjective'. The productive process eventually came to be reversed, but it was not always so. I think if you go back far enough in history, you'll find that many old English participles were originally adjectives, then became participles, and (maybe?) had adjectives formed from them again. —CodeCat 15:05, 12 July 2011 (UTC)[reply]
You'll forgive me for not just taking your word for that, given that you also think that participles today are definitely adjectives. Just because they're not used in any periphrastic verb constructions, doesn't mean they're not verb forms. (I'm certainly not saying you're wrong. I'm just not confident that, if I knew more about those languages, I would agree with you.) —RuakhTALK 15:55, 12 July 2011 (UTC)[reply]
In PIE, the distinguishing feature between verb forms and verbal adjectives is that the former are based on aspect stems (stative, perfective and imperfective) while the latter are based directly on roots. Strictly, only aspect stems form verbs in PIE, since they are conjugated while roots are not (unless it's an athematic root verb such as *h₁es-, but those are rare). The English weak past participle and the Latin perfect passive participle both derive from a verbal adjective in *-tos which was attached directly to the root and had no aspect-forming infix originally. Irregular weak participles like brought are still remnants of that. —CodeCat 16:11, 12 July 2011 (UTC)[reply]
@Ruakh: Isn't there a third group of "original" participles between those that you just mentioned (participles that cannot be used as adjectives or just in such a way that all words can, and lexicalized participles -- tired etc. -- that are now true adjectives independent of the original verb): participles that are regularly used as adjectives and are not in any way peculiar in such constructions. I'm thinking of such cases as the opened window (I'm not even sure whether this is grammatical -- please correct me if it's not!). It is not lexicalized as an adjective here (compare open), but it's not just a weird way to use an adjective either (compare *the cried child). Longtrend 15:11, 12 July 2011 (UTC)[reply]
I wonder why 'cried child' is strange but 'fallen child' is fine, especially since both cry and fall are intransitive. There must be something inherent in the meanings of these participles that makes them different somehow. Maybe some participles like fall are active by nature while cried is passive? —CodeCat 15:14, 12 July 2011 (UTC)[reply]
Is fallen child really acceptable in the sense "child that fell"? Or is it rather only acceptable under a lexicalized interpretation of fallen? Longtrend 15:24, 12 July 2011 (UTC)[reply]
@Longtrend: I believe "the opened window" is a reduced passive; you can also say "the just-opened window", for example, meaning "the window that had just been opened", or "the next-opened window", meaning "the window that had been opened next". It's not really an adjective; you can't say *"the very opened window", even though semantically that would make sense. —RuakhTALK 15:55, 12 July 2011 (UTC)[reply]
Okay, I think this makes sense for English. In German there is the exact same kind of construction (das geöffnete Fenster) and you can also say das gerade (just) geöffnete Fenster but not *das sehr (very) geöffnete Fenster. Here, however, the participle inflects just like an adjective. That is, unlike in the discussion we led above, it doesn't just take one category typical of adjectives (gender), but inflects according to a whole adjective paradigm. Would you still say the participle is a verb there, given that info? Longtrend 16:35, 12 July 2011 (UTC)[reply]
Yes, that's what I'd say: lexically speaking, it's a non-finite verb form, and grammatically speaking, it differs in consistent ways from true lexical adjectives, so it's best thought of as a ===Verb=== rather than as an ===Adjective===. But I'd say it very cautiously, doing my best to make very clear that (1) this is my tentative opinion based on almost no knowledge of the language at all and (2) I mean, I'm not a linguist or anything. I'm just doing my best to understand what linguists have figured out. —RuakhTALK 17:28, 12 July 2011 (UTC)[reply]
Okay, I appreciate your assessment anyway. What I don't like about that solution is that we'd weirdly have an adjective declension table under a Verb header. I wouldn't even know how to handle this. Longtrend 14:04, 14 July 2011 (UTC)[reply]
When the word is not an adjective, it's not an adjective declension table, it's a verb form declension table... This may be included in the conjugation table. Lmaltier 19:46, 15 July 2011 (UTC)[reply]

In Greek, μετοχή (metokhḗ, participle) is one of the ten parts of speech, at least according to school grammars. Its special character of being something that shares (μετέχει (metékhei, participates in)) qualities of both verb and adjective makes it worth distinguishing it from other POS. On el.wiktionary we follow this distinction and use μετοχή as an L3 header for Greek words. I see that there is in use a Participle L3 header for "some Russian, Lithuanian, and many Latin entries" (Wiktionary:Entry_layout_explained/POS_headers). So I think that we could also discuss the possibility of a more extended use of this header. --flyax 15:29, 12 July 2011 (UTC)[reply]

That's what I originally considered the best possibility (or rather after Prince Kassad's comment in the initial discussion) since there appear to be cross-linguistic problems of assigning participle forms to parts of speech, but at the moment I lean toward a language-specific approach (I'll give my arguments later). That doesn't mean, though, that it's impossible that more languages use a Participle header, let alone that the header is wrong for the languages that already use it. Longtrend 15:39, 12 July 2011 (UTC)[reply]

Since this discussion is currently inactive (thank you all for your contributions!), I'll try to sum it up and draw my personal conclusions from it. If there is one thing that we all agree on, I think it's the fact that the matter is very complicated and not easy to handle. Put more concretely, it is not desirable to simply have a linguistically universal Participle header for everything that is traditionally called a participle. Even if there seem to be cross-linguistic problems of assigning participles to a POS, each language should be considered separately and carefully.
For German, after this discussion and checking out some grammars, my personal impression is that the introduction of a Participle POS header should be taken into consideration. I'll give my arguments for that impression, which might also be relevant for other languages.

  • First of all, it should be questioned whether different kinds of participle in one language even form a more or less homogeneous class, or if they should be treated separately: e.g. for German, should present (pr.p.) and past participles (pa.p.) be treated the same or differently? Opinions differ slightly here: Peter Eisenberg's grammar Grundriss der deutschen Grammatik treats only pa.p. as non-finite verb forms, but pr.p. as adjectives. But most grammars agree in putting pr.p. as well as pa.p. into the same class (mostly non-finite verb forms). There is an interesting article by Heinrich Weber (unfortunately in German) discussing the classification of German participles on the basis of twelve criteria that help distinguish verbs from adjectives (such as including a verbal lexeme, governing accusative and/or oblique cases, usability as an adverbial, gradability). He comes to the conclusion that of those, pr.p. and pa.p. have eight characteristics in common, pr.p. and the infinitive six characteristics, pa.p. and infinitive also six, but only five common characteristics for pr.p. + adjectives / pr.p. + finite verbs and four common characteristics for pa.p. + adj. / pa.p. + finite v. So present and past participles have more characteristics in common both with each other and with the infinitive than with either finite verbs or adjectives. This is an argument in favour of treating German pr.p. and pa.p. basically the same, whatever that solution may look like.
  • So what header should we use for German participles: Verb, Adjective or what? All grammars I checked out agree that pa.p. are to be treated as a verb form, but that most can also be used as an adjective. For pr.p., there is less of a consensus: For Peter Eisenberg and the Institut für deutsche Sprache (IDS), pr.p. are not (non-finite) verb forms but adjectives that are merely formed from verbs. All other grammars I know of classify them roughly as verb forms, but some then weirdly say that they are used only as adjectives (such as canoo.net or the Duden-Grammatik which states that pr.p. aren't conjugational forms of verbs). Since pa.p. are also used to form complex tenses, I think we can agree that putting both pr.p. and pa.p. solely under an Adjective header makes no sense.
    What solid arguments are there against using a Participle header for German (for both pr.p. and pa.p.)? Traditionally, "participle" was considered a separate part of speech. This has changed, now they are often regarded either as verbs or as adjectives, so this might be an argument against the Participle header. But I believe this to be simply due to basic differences between grammars and such dictionaries as the Wiktionary. We here at Wiktionary are forced to assign each word form to a POS. This is not the case for grammars. If we can't decide for a POS after considering all relevant aspects, why not recognize that what we need may be a separate POS? It might seem that pr.p. in German can be perfectly treated as adjectives, according to syntactic distribution and morphological inflection. But then they govern arguments like verbs, are generally not prefixable by un- or gradable, etc. They simply don't fit either category. And the same is true for pa.p., which might seem to be clearly verbs. But then they can be used attributively, decline like adjectives, are sometimes governed by other verbs (unlike finite verbs, but like adjectives), etc. Let's assume we use a Verb header for German participles despite the adjectival characteristics. How would we solve the dilemma of needing to have an adjective declension table under a Verb header?

For those reasons, it seems to me that introducing a Participle header would be the best option for German. We could put declension tables there without a contradiction (as there would occur for declined "verbs") and at the same time link to the verbal origin. Just for clarification, lexicalized participles such as wütend or verrückt that are now true adjectives would of course be unaffected. All those who disagree with me: in which points exactly do you think I'm wrong or I drew wrong conclusions? I'd be very glad to hear your comments, especially since I really want to reach a consensus. I'm well aware that introducing a new POS to a language needs more justification than keeping the status quo -- but the status quo in this case is not an option, since currently we have no way at all to treat declined participles (AFAIK there is not a single such entry on Wiktionary yet). Longtrend 18:54, 15 July 2011 (UTC)[reply]

I imagine Dutch will be treated the same, because its participles are more or less identical to German ones. Is the situation for the Romance languages much like German as well (apart from the fact that they show gender agreement in predicates, which German doesn't)? —CodeCat 19:04, 15 July 2011 (UTC)[reply]
French and Spanish (the only Romance languages I speak) differ from German in important ways: (1) French distinguishes blatantly and obviously between present participles, which are very restricted in their uses and which do not inflect for gender or number, and adjectives derived therefrom, which are normal adjectives and often spelled differently from their participles; (2) Spanish has two different constructions that could be called "present participles", of which one (the gerundio; here we call it the "adverbial present participle") is considered to be a verbal adverb and does not inflect for gender or number, and the other (the participio presente; dunno if we have a name for it here) is no longer productive, but rather survives only as various nouns and adjectives; (3) neither French nor Spanish requires the declension tables that have Longtrend so bothered, since their adjectives and past participles inflect only for gender (masc/fem) and number (sing/pl), not for definiteness or position or case. The closest thing to that is Spanish forms like dándo-, which we're currently not worrying about SFAICT, and which anyway are further evidence for ===Verb===ness. Personally I still suspect that ===Verb=== is the way to go for German as well, but many of Longtrend's reasons for using ===Participle=== for German don't apply to French and Spanish anyway. —RuakhTALK 20:17, 15 July 2011 (UTC)[reply]
Thanks for the analysis and information you provide. It clarifies things much for German. My conclusion is that a Participle POS is no more justified in German than in English or in French. Why? Because specialists call them either verb forms or adjectives. It's possible to treat the declension of adjectives in an adjective declension table, and the declension of verb forms in the conjugation table. Lmaltier 19:57, 15 July 2011 (UTC)[reply]
But that would mean that some verb forms can have an adjective declension section. Do we want that? —CodeCat 20:05, 15 July 2011 (UTC)[reply]
@Lmaltier: Since when do you listen to specialists' analyses rather than the speakers' emotions? Or is it just because it's a convenient way to prove my point wrong? I already said why I think it is that participles are often treated either as verb forms or as adjectives. Until you respond to my arguments, I see no reason to adopt your point of view instead. While responding, keep in mind that experts by no means agree on whether participles are verbs or adjectives. Longtrend 20:12, 15 July 2011 (UTC)[reply]
I always said that we should not invent anything (and this is one of the basic principles of the Foundation), and that we should follow specialists and the traditions of the language. And verb forms cannot have an adjective declension section, as they are not adjectives. The best place for the declension of these forms is the conjugation table. Also note that I don't propose anything on how to deal with the question in German (this is not easy if opinions differ among specialists, and it's true that a decision should be taken). I only think that, in German, we can do with the verb and adjective POS, according to what you explain. Lmaltier 06:30, 16 July 2011 (UTC)[reply]
You still haven't responded to my argument about the difference between grammars and Wiktionary. What linguists do agree in is that German participles have characteristics of both verbs and adjectives, so I have a bad feeling about just squeezing them in one of these groups. (And putting them in both groups would suggest that in one usage they are clearly verbs, while in the other they are clearly adjectives, which does not seem to be the case either.) I don't really see a problem about a Participle header, which to the contrary would solve those problems. This is my impression specifically for German, my proposal would not affect any other languages, since I know too little about them -- we should refer to linguists' analyses there as well. If you worry that Participle is not a proper POS, well, "proper noun", "prefix" and "symbol" aren't either, as is even in the ELE. Do you think the Participle POS is inappropriate in Latin, too?
You probably know what I meant by "verbs would have adjective declension sections": this was short for "verbs would have a declension section that would include exactly the same forms as adjective declension templates". And this would be a problem in my opinion. You propose to include those forms in conjugation tables. Just to make sure I understand you correctly: you want to change the verb conjugation template so it includes all the declined forms? But that's declension, not conjugation as the header would suggest. The contradiction remains. Participles inflect for completely different categories than normal verbs. Longtrend 09:56, 16 July 2011 (UTC)[reply]
For Latin, I don't know: when I was learning Latin, participles were considered as verb forms, but this tradition may be different in different countries, and may change with time. The tradition to be adopted is the one currently used for Latin in English-speaking countries. In French, nobody considers that it's a problem to consider aimée, aimées, aimés as conjugated forms of aimer. I don't see why it is a problem to decline a verb form. Lmaltier 11:19, 16 July 2011 (UTC)[reply]
Discussing this topic would be a lot easier if you replied to my arguments and all my questions... Longtrend 12:21, 16 July 2011 (UTC)[reply]
Maybe someone else wants to answer my questions and concerns then. To be honest, it's not that important for me to have a Participle header for German. I don't think a Verb header would be really wrong or anything. I just want a solution that works for German (and of course I want that solution to be as good as possible, so I still like the Participle solution best), and I can't imagine how a Verb header could work for declined participles. Any concrete suggestions? Even if so, why bother when "Participle" could do the job effortlessly and is obviously not really wrong (to say the least)? Longtrend 17:12, 19 July 2011 (UTC)[reply]
It seems strange to have certain verb forms listed under ===Participle=== rather than ===Verb===, given that we don't generally use different POSes for different inflected forms. I mean, assuming you're still planning on definitions like “Past participle of spielen.”? And I really don't see why the verb's ====Conjugation==== section, at the lemma (infinitive) entry, can't provide all forms of the participle. It just doesn't seem like ===Participle=== buys us anything. —RuakhTALK 17:35, 19 July 2011 (UTC)[reply]
Thanks for your reply. Well, IMO the advantage of ===Participle=== over ===Verb=== is that with the latter, we would say "this word is a verb" despite all its adjectival characteristics, while with the former we would admit that the issue is more complicated than that. AFAICT, it simply reflects the linguistic facts better.
My problem with listing all participle forms under the infinitive entry is the following: a form like spielendem (dative, masculine or neuter, singular) is clearly declined, not conjugated, it can clearly be traced back to a base form spielend. I'm not aware of any other case where there is a word form which on the one hand is an inflected form of some lemma, but simultaneously serves as the base form for a group of differently-inflected items. The latter in this case is declension, the former (allegedly) conjugation. Or is it just a terminological issue?
I also don't think it's so clear we're talking about "verb forms" here as you seem to assume tacitly. Saying that they are not verb forms but, well, participles, seems to work just fine for Latin, see amāns: the "definition" is a translation, while the "present participle" part is in the Etymology section. The only problem I see is the following: German past participles, unlike present participles, all appear as part of complex tenses (which arguably makes them verbs, at least in this usage). And some intransitive ones cannot even appear in non-verbal positions or be declined, i.e. they show no adjectival characteristics. Of course, we might also use ===Participle=== in such cases and just omit the declension part, but this might fail to capture the fact that what we find here are quite unambiguously verbs.
Oh well. I certainly learned a lot about participles during this discussion, but regarding my initial question "How to list declined participles on Wiktionary?" I'm as perplexed as before. Longtrend 20:17, 19 July 2011 (UTC)[reply]
"Tacitly", my foot; I explicitly said I was assuming it, and added a question mark for good measure! Regardless, from everything you've said, it seems clear that German past participles, at least, are certainly verb forms, even if they are also adjective-like. Re: "I'm not aware of any other case where there is a word form which on the one hand is an inflected form of some lemma, but simultaneously serves as the base form for a group of differently-inflected items": It happens. For example, in Hebrew, especially Classical Hebrew, if a verb-form has a personal pronoun as a direct or indirect object, then that pronoun can be incorporated into the verb-form as an additional nominal inflection; tishkakhénu, for example (in Lamentations 5:20; KJV "dost thou forget us"), is tishkákh ("thou dost forget", verb) + -énu ("us", object pronoun), where tishkákh is the second-person masculine singular imperfect/future/prefix-conjugation of shakhákh ("forget", verb). —RuakhTALK 04:07, 20 July 2011 (UTC)[reply]
Okay, but if Wiktionary's basic policy indeed is "Don't invent anything" (as Lmaltier claims, and as you might assume tacitly? SCNR...), then that's no option for us. Saying that spielendem is an inflected form of spielen (verbal infinitive) rather than spielend (present participle) is something I've never heard before. Compared to that, putting ===Participle=== as the POS is, if at all, just a ridiculously tiny "invention", and definitely not wrong (since they're definitely participles, we just don't agree if it's a proper POS). Longtrend 17:46, 20 July 2011 (UTC)[reply]
How to treat participles on Wiktionary — AEL 3

Sorry for not going through all this tl;dr discussion - has any kind of agreement been reached by now, or are people still arguing about tiny details? -- Liliana 15:29, 29 July 2011 (UTC)[reply]

There's no agreement, but at the moment there's no discussion either. I also don't think we ever argued about "tiny details". There is no established way to treat German inflected participles on Wiktionary, and no imaginable way seems perfect. If you have any input, please feel free to contribute. Longtrend 22:37, 30 July 2011 (UTC)[reply]

Just throwing in that on German Wiktionary, they do use POS headers called "Partizip I" (which is German for "present participle") and "Partizip II" (which is "past participle"). However, I couldn't find any discussion that led to the introduction of those headers (didn't search too long, though), and they don't seem to have any entry for an inflected participle, either. Note that inflected participle forms don't appear in verb conjugation tables. Longtrend 16:16, 1 August 2011 (UTC)[reply]

I'm not a linguist, but I added some 10,000 Swedish entries to the English Wiktionary, including many of the most commonly used words. In order to get things done in a limited time, I systematically treated past participles as adjectives, giving their role as a verb form in the Etymology section. See for example arresterad, bekräftad, debatterad. This works fine with the existing templates used for Swedish adjectives. So far, this has not been controversial at all. Some future linguist may perhaps argue that these are not actually adjectives, but if they have the time to change my edits, they will find the job easy to automate by the fact that I followed a single pattern. --LA2 10:26, 11 August 2011 (UTC)[reply]
Thanks for your input. Sounds like a workable solution, but the problem is that in many languages, including German, (past) participles are often clearly used as verbs, i.e. in compound tenses. Maybe that's not true for Swedish. Longtrend 22:31, 11 August 2011 (UTC)[reply]
For Swedish verbs, the form that is used with verbs is the supine, which was originally the neuter form of the past participle. But they are not always identical anymore, since verbs with participles in -en have a neuter form -et but the supine has -it (this distinction is not original, though). —CodeCat 22:47, 11 August 2011 (UTC)[reply]
I've just seen this discussion so thought I'd chip in with a note about Luxembourgish. It's pretty much the same as German; the Luxembourgish participle (only one, rather than two in German) can be used as an adjective (either attributive or predicative), but it is also used in many compound tenses. Most verbs in Luxembourgish only have conjugations for the present tense, so for those every other tense (past, future, conditional, etc.) is formed using the participle. Therefore just having the entry as either an adjective or a verb form would be inaccurate. BigDom (tc) 08:42, 5 September 2011 (UTC)[reply]