Wiktionary:Beer parlour/2008/April
This is an archive page that has been kept for historical purposes. The conversations on this page are no longer live.
Wholesale conversion to "Determiner"
A user Contribution of BrettR_aka_Mr._Determiner has changed the entries for a dozen frequently used words to eliminate all PoS headings except "Determiner". Does such a drastic step have community approval? It seems simply undiscussed. It seems to me to be going in the direction against intelligibility for ordinary users and favoring some current fashion in linguistics and language education. DCDuring TALK 20:11, 1 April 2008 (UTC)
Wiktionary:Beer_parlour/2007/April#Determiner_vs_Determinative is the last major discussion and seems hardly conclusive. Has there been a policy vote? Or is this an April Fools thing? DCDuring TALK 20:36, 1 April 2008 (UTC)
- I don't think that's O.K. It might be O.K. to unify ===Adjective=== and ===Pronoun===, and maybe even ===Noun===, under ===Determiner===; but eliminating ===Adverb=== sections? —RuakhTALK 22:28, 1 April 2008 (UTC)
- Is "determiner" really widely accepted in books for those outside of the language and linguistics community? If it is leading (bleeding?) edge, then perhaps we could find a way of transitioning to its use that allowed for folks like me to get behind it. I really don't see why this header ought to be in use at all until it has been well discussed. Because we can't do any real user research, we have to pay close attention to the practice of other publishers. Their practice does not support this. If we think that we can achieve a competitive advantage by being cutting edge, free of some unhelpful traditional categories, then we should go ahead and implement the change. I haven't heard the rationale for the superiority of the unknown-to-users category "Determiner" to the widely known old-fashioned categories. Maybe it doesn't really matter much because hardly anyone will use a dictionary for these words anyway and we don't have all that big a user base so we can just do what pleases us. DCDuring TALK 23:39, 1 April 2008 (UTC)
- I'm not certain whether we should use "Determiner" or not, but I am certain we should not use it until we've discussed it and come to a conclusion to use it. The previous discussions are neither recent enough nor conclusive enough imho. Thryduulf 00:01, 2 April 2008 (UTC)
- That wish is not stopping the process which continues as we wring our hands. Judging from the lack of action I would say that this is something most don't care about or support. DCDuring TALK 01:26, 2 April 2008 (UTC)
- We just don't particularly have time or whatever. What "BrettR" doesn't realize is that he is likely to cause a vote on using "determiner" for English (see below), and if it is barred, we will calmly revert all of his edits and remove the Determiner heading from the entries it was already used in. (I recall reading something on the 'pedia about someone (ahem) determined to show the wikt people who don't even know what a determiner is; I suppose I should have seen this coming? ;-). It is not a classic English POS, and there is little reason to allow it. "BrettR" is likely just wasting his time. But note, not much of ours; it is easy to rip out. OTOH, maybe it is just an April Fool's joke, albeit not a very amusing one. Robert Ullmann 01:39, 2 April 2008 (UTC)
No, not April fools. There are a small number of English determiners and half of them were already listed as determiner. The others have been categorized that way for a long time. I just thought I'd be bold. I'm certainly willing to discuss. I've stopped the process for now.--BrettR 01:45, 2 April 2008 (UTC)
By the way, the vast majority of my edits last night were simply adding or fixing up the {{en-det}} template to sections with existing L3/POS Determiner headers.--BrettR 13:51, 2 April 2008 (UTC)
- I really don't understand all of the hullabaloo here. "Determiner" (or "demonstrative determiner") has been in use as an English POS header here since at least May 2004, without causing much of a fuss, and it seems to me that all BrettR has done is introduce a bit of much-needed consistency. -- Visviva 06:42, 3 April 2008 (UTC)
- I'm asking what it's for. If it's good for users, then it deserves to be documented as policy. If it's not, then it deserves to be eliminated. The other dictionary operations seem divided on its goodness. Should we stay with the set of categories most widely known to users and contributors or should we push them in a progressive direction (if it really is progressive)?
- As wiktionary matures some of the issues that have been swept under the rug ought to be addressed to allow progress toward consistent practice to help our users get more out of us than they would get out of a list of definitions. Our entries are far from uniform in quality and extremely inconsistent in the use of fundamental categories. The appearance of uniformity forced by the software and the heading structure belies the depth of the inconsistency. The level three headers are of fundamental importance in structuring entries. Questionable areas include the treatment of abbreviations (written vs. non-written, actual PoS for abbrevs. other than nouns); phrases, idioms, and proverbs; interjections; numbers; and other symbols. Do we need adjective headings for attributive use of nouns? When do participles become adjectives or nouns? When are related etymologies worth splitting? Codifying much of this now might be premature, but it would help Wiktionary improve to convert our experience and beliefs concerning what works from a user perspective to policies and guidelines. DCDuring TALK 10:14, 3 April 2008 (UTC)
I believe that all the English determiners are now identified with an L3 Determiner heading and {{en-det}}.--Brett 12:09, 26 May 2008 (UTC)
Proposal to use "Determiner" as an L3/POS header.
Per the discussion immediately preceding this one, I'd like to propose that we begin using "Determiner" as an L3/POS header for the unambiguous English determiners (CGEL determinatives). It's a small enough group of words — larger than you might expect, but still pretty small — that it shouldn't be too bad to undo if we later decide it was a bad idea for whatever reason. Conversely, if we decide that it was a good move, we can open the door to determiners in other languages, and loosen the "unambiguous" criterion.
Advantages to doing this:
- It's more accurate than trying to force determiners into the traditional lexical categories (parts of speech).
- It's more concise than giving determiners several different POS sections with the same definitions over and over again.
- It's in better keeping with our sister project Wikipedia's (accurate, or trying-to-be-accurate) descriptions of the various parts of speech.
Disadvantages:
- The term "Determiner" will be less familiar to many readers than more traditional terms. (Of course, most of our readers probably have a fairly unclear notion of the traditional terms' meanings, anyway.)
- From what I understand, there's not total consensus in the linguistic community about exactly which words are determiners. (Of course, "we can't do this perfectly" is a far cry from "we shouldn't bother trying" — and it's not like there's total consensus in the traditional-grammarian community about exactly which words belong to which traditional parts of speech, either.)
- It might be a slippery slope from this to less obviously positive adoptions of modern linguistic theory — will we next adopt the notion of intransitive prepositions? Will we create an entry at Ø with screenful after screenful of definitions of null determiners and whatnot in every language known to man? (Of course, knowingly accepting inaccuracies is also a slippery slope. As a wiki, we really have no choice but to trust our future selves and future members of the community.)
Thoughts? (I'd like to bring this to a vote within a week or two, if there seems to be agreement.)
—RuakhTALK 01:23, 2 April 2008 (UTC)
- Why should we buy what the linguists are selling? Why should we buy it before OED, MW, Collins, AHD, Longmans, Random House, et al.? What are the actual advantages of this to our anonymous users? What evidence supports our beliefs about these possible advantages? What are the words that would be affected? It would be handy if we had one or more categories for them. If this vote is to be meaningful, would it not be useful to rein in Mr. Determiner and get him involved in the discussion? What else has to be done for this change to actually have a good effect on the experience of our supposed user base? DCDuring TALK 01:41, 2 April 2008 (UTC)
- I'm in for a vote. Obviously I think determiner is a very useful category. Why should we buy what the linguists are selling? Can you imagine editors over at wikipedia (I know this isn't wikipedia) asking "Why should we buy what the physicists are selling? Newton was good enough for me." We should buy it because they're the experts. Why before OED? Because OED only comes out every few decades. Why before Longman? Longman has used determiner for years. For a list, see here Category:English_determiners (that doesn't include cardinal numbers which are all determiners as well as being nouns). I hope that helps.--BrettR 01:57, 2 April 2008 (UTC)
- Longman's competitive strategy seems to be to try to take market share among those who are dissatisfied with existing dictionaries. They seem to try novel approaches in many ways: typesetting, layout, font selection. So it is hardly a shock that they would be an early adopter. I wonder what percentage of the things that they have been early adopters of have been taken up by the others, that is, been successful. And what percentage they have abandoned, that is, failed. OED and MW also have online dictionaries which could readily adopt these innovations if they thought there were an advantage. — This unsigned comment was added by DCDuring (talk • contribs) at 02:52, 2 April 2008 (UTC).
- Here on Earth there is very little reason for an engineer to care about theories of everything, general relativity, gauge theory, string theory, etc. Truth is truth for a purpose. Newtonian physics is very practical for many purposes. How will it help an anonymous user to use an unfamiliar category such as "Determiner"? How will it help me? It is not compelling and more than a little troubling that the rationale is: "Trust us, we're the experts." I certainly have no objection to having entries for words like "determiner" or using it for categorizing. It is for more fundamental purposes that I am concerned. You are imposing a change in the structure of what we are doing. It reminds me more of a Websterian or Shavian spelling reform proposal than something productive. DCDuring TALK 02:21, 2 April 2008 (UTC)
- How will it help an anonymous user to use an inaccurate category such as "Adjective" for a determiner like both? Determiners are some of the most common words in the language; if someone's looking one up, I don't think we can assume that they'll be terribly familiar with "Adjective", either — and it certainly won't help them understand how to use the word. Incidentally, I really don't understand your comparison of this to Webster's and Shaw's proposals: we most certainly are not trying to change how people use these words. We're not trying to turn these words into determiners, only to accurately identify the words that already are determiners. —RuakhTALK 02:39, 2 April 2008 (UTC)
- Accuracy-shmaccuracy. They are no more Determiners than they are Adjectives. These are just categories, tools of thought. Planets don't have elliptical orbits any more than they have circular ones. Elliptical orbits are more useful tools for thinking about their orbits, not ultimate truth. What are the constructive benefits of using this category? Will users be happier? Will they leave the site knowing more? Will we gain users? Will we gain funding? Will linguists cheer us? DCDuring TALK 02:52, 2 April 2008 (UTC)
- I don't think it would be constructive to get rid of the category -- there are English determiners (or words which act like determiners), just as there are determiners (or words which act like them) in many other languages, and it makes sense to have a category for such words. The real question is whether we should use this as a header for English, as we have already been doing for many years. If we reject it for English entries, we have the odd phenomenon of a category, accepted for English entries, which is (and must be) a POS header for other languages such as Korean, but which is not accepted as a POS header for English entries. That would be inelegant at least, and elegance does have a certain value of its own. -- Visviva 04:41, 2 April 2008 (UTC)
- I notice that above "determiner" is called "leading (bleeding?) edge". That's an odd way to refer to a concept that is at least 75 years old. Leonard Bloomfield wrote in 1933 (likely echoing earlier writings) that "The determiners are defined by the fact that certain types of noun expressions (such as house or big house) are always accompanied by a determiner (as, this house, a big house)." Since the 1960s, the concept of English determiners has been positively mainstream within linguistics, admittedly with some disagreement around the edges (do words like my, his, etc. belong?), but with broad agreement that there exists this class of words that are not properly described as adjectives or pronouns.
- "How will it help an anonymous user to use an unfamiliar category such as 'Determiner'?" How will it help them to find things called "demonstrative adjectives"? And maybe they'll click on the word and learn something.
- Nobody is asking anyone to take this on faith. The topic is clearly laid out in many books and articles with data and argument to back it up. Anyone who hasn't read much about the topic could do worse than look at the wikipedia entry. From there I would suggest moving on to A Student's Introduction to English Grammar or, for the more ambitious, The Cambridge Grammar of the English Language. You could also check out John Payne's 1993 chapter in Heads in Grammatical Theory by Corbett et al (eds.). Hope that's useful.--BrettR 13:17, 2 April 2008 (UTC)
- If it has been around then it has had ample opportunity to have proven its practical utility to OED, MW, et al. The article of faith is whether it will prove useful to the practical understanding of users. This appears to be a fashion of interest to linguists with no positive practical consequences that anyone seems to be aware of. I was hoping that there would turn out to be some positive consequences that could emerge from the discussion. I remain hopeful that someone will be able to articulate an advantage to the population at large from making this change. If it has value in making linguists want to participate in Wiktionary, that might count for something. DCDuring TALK 14:11, 2 April 2008 (UTC)
- If we can agree that there is a practical utility to having POS at all (and there seems to be consensus that there is), then presumably there is utility in making them perform consistently. The OED defines adjective as "a word standing for the name of an attribute, which being added to the name of a thing describes the thing more fully or definitely, as a black coat, a body politic." It further defines an attribute as "A quality or character ascribed to any person or thing." The determiners don't look at the qualities or characters of things. Instead, they have a pointing function that tells us which things are being discussed. Unlike adjectives, they cannot typically be graded or modified by too, so, very, or other adverbs that typically modify adjectives. They cannot typically appear predicatively as most adjectives can (which also relates back to the fact that they are not qualities). That is, you can't say "The people are some." They are usually mandatory where adjectives are (always?) optional. You can't typically use them together (where you can string adjectives together to your heart's content). Adjectives are independent of number where determiners typically must match the number of the noun. In fact, about the only similarity is that they both appear before nouns. So one practical utility of calling these determiners is that you convey all this information in a single word.
- What, may I ask, is the benefit of calling these adjectives apart from tradition?--BrettR 14:37, 2 April 2008 (UTC)
- The principal advantages are that:
- users believe they know the implications of something being labelled an adjective and
- there is some validity to that belief.
- I don't see any reason to cause them to investigate the meaning of determiner if doing so does not offer some real benefits to those not on a career path in language. That the concept has appeal and value to those in the field I do not doubt. I am a little disappointed that there seems to be so little of practical value in the concept. When writers about language are writing a book to sell they don't seem to find the word "determiner" of great value in explaining things. It doesn't appear in the index or glossary of many books on language (Chomsky's "Aspects", Pinker's "The Language Instinct", 3 Safire works, 1 Crystal, 1 Fischer), though it does appear in the index to Pinker's "Words and Rules" and Crystal's "Encyclopedia of Language".
- In short, "determiner" has been around, but has not swept the field among the producers of on-line and print dictionaries or language authors. Its merits do not speak for themselves. The stated benefits seem limited to "elegance". DCDuring TALK 15:38, 2 April 2008 (UTC)
You seem to be implicitly assuming
- said implications apply to determiners, such that this belief is a good thing here.
which is not an assumption I'm ready to make with you. That said, if we had some sort of (determiner) context tag, I'd feel more comfortable than I currently do with labeling these "adjectives". (That might also help streamline the entry for זֶה (ze, “this”), which can serve either as an adjective or as a determiner, the latter being more formal/poetic/archaic, but the meaning being the same either way.)
- I await an itemization of the erroneous conclusions that users draw from the old-style PoS headers that will be corrected by the use of Determiner for the words that really are Determiners as opposed to the pretenders that have been nominated Determiners by some linguists for their own nefarious ends. I don't see why the Hebrew need for the determiner category has any particular implication for how we treat English. Or is there a procrustean imperative in Wiktionary's constitution that I missed? I don't see how we can have "proof" of benefits of one kind of categorization over another, but I would like to see some presentation of the possible benefits that offset the "cost" of introducing a term that Longman's identifies as "technical" in the two of its dictionaries I have looked at. This seems of a piece with "plurale tantum". If Wiktionary is "by linguists, for linguists", then we ought to reconsider the logo design to make that clear. DCDuring TALK 19:47, 2 April 2008 (UTC)
- I thought I had done that. I guess we're thinking about different kinds of conclusions. Could you give me an example of the type of thing you mean, say with 'verb' in the role of determiner (nefarious interloper/deprecated POS header)?--BrettR 20:47, 2 April 2008 (UTC)
- I have no idea how to put 'verb' in the role of determiner or how that connects to what I think I've said or was trying to say. I'll look at this again tomorrow with some sleep. My underlying goal is to make sure that we have our poor occasional users in mind with any change because they are the source of the growth of Wiktionary's impact on the world. It seems moderately useful to me personally, but I don't believe myself to be representative, nor do I believe the participants in this forum to be representative, of our target users. DCDuring TALK 01:01, 3 April 2008 (UTC)
- Erm, sorry, but I think you misread my comment? In no way was I suggesting that "the Hebrew need for the determiner category ha[d] any particular implication for how we treat English." What I was suggesting is almost the opposite, actually: some of our Hebrew entries currently make use of ===Determiner=== even though that category doesn't apply as perfectly in Hebrew as it does in English, and I was suggesting that if we figure out a decent way to handle English determiners without using ===Determiner===, we can apply that same method to Hebrew. (Obviously ignoring the distinction altogether is not a decent way in either language.) —RuakhTALK 23:16, 2 April 2008 (UTC)
- "I have no idea how to put 'verb' in the role of determiner" Sorry, what I meant was, if the 'verb' heading didn't exist, and all these verb things were being called 'nouns', what arguments would you use to argue that we should split them out into a new category? Anyhow, have a good sleep.--BrettR 01:40, 3 April 2008 (UTC)
- As a general comment: the additional heading is a bad idea. Adding the categories seems fine. Don't those categories already exist anyhow? The heading might simplify typing an entry in once, instead of listing the POS headings that apply. But a heading like 'determiner' does nothing to clarify the word for typical readers. --Connel MacKenzie 06:28, 3 April 2008 (UTC)
- Yes, the categories already exist. The discussion is about the heading.
- I have trouble imagining what kind of typical readers are being imagined here. What typical reader looks up words like many, each, this, etc. in a dictionary? Presumably we have someone curious who has noticed something about the word and wants to understand it better. Such a person is much more likely to meet their goal (i.e., learn something) if they find the heading Determiner. If they don't know what it means, the answer is a click away. And if they've already gone to the trouble of looking up one of these words, then it seems a good bet that they'd go that extra step.--BrettR 11:41, 3 April 2008 (UTC)
- I agree with this. If a person is looking up a word like this, then they should be told about what it is. To call this and a word like nice by the same label will probably lead to an erroneous conclusion that they have the same syntactic behavior. The reason for using two labels is that they occur in different positions in grammatical constructions and have different functions. And that's why they are used in detailed descriptions of English grammar. It is probably just resistance to change (prescriptivists don't usually like change) that other dictionaries have not followed Longman. I suggest you add the new label. Ishwar 19:54, 15 April 2008 (UTC)
Demonstrative adjective
I seem to have started this. Demonstrative adjective is an old established term for this/that, but certainly not for both. I again remind you all that only this/that, these/those inflect for number, unlike any other English adjectives. A fun bit is contracting "is" while putting these adjectives into the possessive: "this's this's", "that's that's". — This unsigned comment was added by Allamakee Democrat (talk • contribs) at 16:27, 2 April 2008 (UTC).
- Note: it is an error in English to contract this' with is. Regarding "demonstrative adjective", that term is possibly established for linguists, but not used in any recognized general use dictionary, to describe them. Making up headings certainly is not useful. A category with two members is a fine bit of overkill. A simple usage note on those two entries, would be better. --Connel MacKenzie 06:24, 3 April 2008 (UTC)
- Your pronouncement is a bit misleading. Webster's New 20th Century Dictionary includes demonstrative with the following as a definition: "in grammar, pointing out; as that is a demonstrative pronoun." It does not use demonstrative to identify parts of speech of entries, but the concept is in the dictionary as it is in most dictionaries.
- The problem with this approach can be seen (for example) in that same dictionary's entries for this. They give both a listing marked as pronoun followed by half a dozen definitions, then a listing marked as adjective followed by the same half a dozen definitions with the wording altered only slightly for the use as an adjective.
- You see, demonstrative adjective/pronoun words are one class of Determiners. One of the great advantages of the Determiner header in English is that we can combine and simplify many repeated senses. This has much potential for reducing confusion on the part of the casual user, since they won't have to look through two sets of definitions that look almost identical. The translations will also be greatly simplified. In most languages I've studied, the translation for adjectival use of this and pronoun use of this are the same or nearly so. Thus, separate Adjective and Pronoun sections (in addition to other uses) doubles the necessary length of all words that are Determiners. Consolidating under a single header will simplify and clean up these entries enormously. --EncycloPetey 18:06, 3 April 2008 (UTC)
- Demonstratives this, that, etc. can function as determiners and as pronouns. But, not all determiners have the same syntactic behavior as this, that. For example, no, every, the cannot function as pronouns. So, you should use both labels for this reason. It, of course, doesn't necessarily follow that you need to repeat the definition multiple times. Ishwar 19:35, 15 April 2008 (UTC)
- I'm not quite sure I understand why the matter needs to be debated at all. Determiner is not an accepted part of speech among legitimate grammarians; if you look in one of good old Webster's dictionaries, you'll never find a word classified as a "determiner." If it is not accepted by the experts, why should it be on Wiktionary? Are we striving for incorrectness? Elfred 03:51, 20 April 2008 (UTC)
- That's a funny thing to say. Picking up the first three relevant books that come to hand on my bookshelf -- Longman's Dictionary of Contemporary English, Leech & Svartvik's Communicative Grammar of English, and van Ek & Robat's Student's Grammar of English -- I find that all three use "determiner" quite routinely. The last two have a fairly canonical status in the field of English language teaching.
- There are at least two things that distinguish these works from many others: they are written for an international audience, and they are based on methods and theories that are reasonably up to date. I would hope that we would try to emulate them in both regards. Dictionaries written for monoglot audiences, based on obsolete theories and methods, are not a particularly good role model. -- Visviva 04:56, 20 April 2008 (UTC)
- It would be much easier to make a decision about this if we actually knew something about who our anon users are and who they should be. Longman's DCE (definitely a learner's dictionary) is a good model for a dictionary, but it seems to be the only dictionary that uses "determiner" as a PoS. If someone could articulate the benefits of the "determiner" concept .... DCDuring TALK 09:47, 20 April 2008 (UTC)
As of the March, 2008 revisions, the OED uses Determiner. See, for example, the entry for many.--Brett 12:11, 26 May 2008 (UTC)
Internet slang
We currently have {{Internet}}, {{slang}}, and {{Internet slang}}, all of which are context tags categorizing eponymously. This means that some Internet slang is categorized under Internet and slang, whereas other Internet slang is categorized under category:Internet slang. This is far from ideal. I can think of two (mutually exclusive) solutions:
- Get rid of category:Internet slang and force template:Internet slang to categorize into Internet and Slang. This seems reasonable to me.
- Have AutoFormat change every context tag that includes both Internet and slang to include Internet slang instead (and remove spaces, or whatever, of course). This seems like a bad idea to me, as there could be an entry that is both Internet (not necessarily slang) and slang (outside of the Internet), so would need both tags.
Thoughts?—msh210℠ 21:18, 2 April 2008 (UTC)
- Seems to me that the vast majority of these should be in Category:Internet slang. You are probably right that the conversion cannot be automated, but perhaps we could get a cleanup list? -- Visviva 06:33, 3 April 2008 (UTC)
- It would be nice to redesign the context labels to carry information such as formality and region. Yes, {{internet slang}} could expand, as you say, but why shouldn't {{context|internet|slang}} classify it under all three categories? DAVilla 06:53, 7 April 2008 (UTC)
- Is that doable with template:context?—msh210℠ 19:00, 9 April 2008 (UTC)
- See my response here. DAVilla 17:05, 1 June 2008 (UTC)
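Editorial illustration: the two categorization behaviours discussed in this thread can be sketched roughly as below. This is a toy model only, not the actual code of {{context}} or of any Wiktionary template; the function names and the lowercase category strings are assumptions made up for the example.

```python
# Toy model of context-tag categorization; all names here are hypothetical.

def expand(tags):
    """Split the combined 'internet slang' tag into its two component tags."""
    result = []
    for tag in tags:
        tag = tag.lower()
        if tag == "internet slang":
            result.extend(["internet", "slang"])
        else:
            result.append(tag)
    return result


def categories_split(tags):
    """msh210's first option: only the component categories are used."""
    return sorted(set(expand(tags)))


def categories_all_three(tags):
    """DAVilla's suggestion: both components plus the combined category."""
    cats = set(expand(tags))
    if {"internet", "slang"} <= cats:
        cats.add("internet slang")
    return sorted(cats)


print(categories_split(["Internet slang"]))
print(categories_all_three(["internet", "slang"]))
```

Note that the second function cannot tell an entry that is Internet slang from one that is Internet jargon in one respect and offline slang in another, which is the objection msh210 raises to the automated conversion.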
full-width characters
I think we should reprogram wiktionary so that when people look up something with full-width characters, it treats it as a lookup using half-width characters. Full-width characters are non-ASCII variations of half-width characters, made wider to match the width of other languages like Japanese. So words in them would show up now and then in foreign text, for example ＣＤ in Japanese. If someone looks up ＣＤ on Wiktionary, we will either...
1) Tell them the word doesn't exist,
2) Give them a definition like "full-width version of CD",
or 3) Just automatically redirect them to CD (hardcoded, not a manually-added redirect, because it'd be too much work to manually add redirects for all English words).
I suggest option 3. Right now we do option 1. Language Lover 22:23, 2 April 2008 (UTC)
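Editorial illustration of option 3: a minimal sketch of the folding step, assuming the lookup title is available as a plain string. It relies on the fact that the full-width forms U+FF01 to U+FF5E mirror ASCII U+0021 to U+007E at a fixed offset. The function names are hypothetical, and nothing here is MediaWiki code.

```python
# Sketch of folding full-width characters to their ASCII counterparts.
# Function names are hypothetical; this is not part of MediaWiki.

FULLWIDTH_START = 0xFF01   # FULLWIDTH EXCLAMATION MARK
FULLWIDTH_END = 0xFF5E     # FULLWIDTH TILDE
OFFSET = FULLWIDTH_START - 0x21   # distance back to ASCII '!'
IDEOGRAPHIC_SPACE = 0x3000        # the full-width space


def to_halfwidth(title):
    """Replace each full-width character with its half-width equivalent."""
    out = []
    for ch in title:
        cp = ord(ch)
        if FULLWIDTH_START <= cp <= FULLWIDTH_END:
            out.append(chr(cp - OFFSET))
        elif cp == IDEOGRAPHIC_SPACE:
            out.append(" ")
        else:
            out.append(ch)
    return "".join(out)


def redirect_target(title):
    """Return the half-width title to redirect to, or None if nothing changes."""
    folded = to_halfwidth(title)
    return folded if folded != title else None


# "\uff23\uff24" is the full-width spelling of CD.
print(redirect_target("\uff23\uff24"))  # prints CD
```

Unicode NFKC normalization (available in Python as `unicodedata.normalize("NFKC", text)`) also folds these characters, but it rewrites many other compatibility characters as well, so it may be broader than what is wanted for a redirect.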
- How does one come to look up something with full-width characters? By copying text from another website, and pasting it in Google or Wiktionary's search field?
- If full-width characters are non-ASCII, do they carry other meaning? Or are they just a presentation form which always represents the same as their Latin equivalents?
- Sounds like it might be a problem to be solved by text encoding standards bodies, and the makers of system software and web browsers, rather than by the maintainers of each website. —Michael Z. 23:13, 2 April 2008 (UTC)
- Copy-and-paste, yes. Google is smart enough to fix them; MediaWiki (the software that Wiktionary runs on) is not — which is just as well, because we probably do want entries for Ｃ and Ｄ, if only to answer the questions in your second paragraph. However, ＣＤ should probably JavaScript-redirect to CD. Shall we discuss this at Wiktionary:Grease pit? —RuakhTALK 01:09, 3 April 2008 (UTC)
- Okay, perhaps it is a feature we could use, but it should be pretty well thought out before implementing.
- But this brings up another question, about which symbols are appropriate Wiktionary entries. WT:CFI doesn't really cover this adequately. The technical implementation is different, but whether in the full-width or ASCII code range, "CD" means "CD". As far as I know, Wiktionary entries are about words, and maybe some symbols or the concepts they represent, but they are not about code points. The full-width code point Ｃ is a symbol representing the exact same concept as the ASCII code point C, and I don't think it should have a separate dictionary entry. —Michael Z. 21:30, 3 April 2008 (UTC)
- Yes, hallelujah! And combine A with Cyrillic А and Greek Α etc. Differentiating script is a great feature of Unicode if you're interested in automated text manipulation, but when it comes to defining symbols, these are indistinguishable glyphs. In fact, I would say that if anything deserves another page it's the italic text, the script text, etc. even if it may be the same code point. DAVilla 06:48, 7 April 2008 (UTC)
- I believe this is yet another issue that Hippietrail's Extension:DidYouMean would deal with for us; let's hope we can get it tested and implemented soon. Conrad.Irwin 11:36, 3 April 2008 (UTC)
- If you have the right keyboard setup, you would come across it just by typing. And don't think for a second that just because some Asian language Wiktionary exists that a single translation of ko:CD is going to keep a non-native English speaker from coming here. Even with as little knowledge of other languages as I have, I know that to get the best explanation of a word you have to look on the foreign language dictionary. A survey of foreign language terms on this English Wiktionary and the all-too-often need for a {{gloss}} drive in the point. So yes, there is every reason to have this incorporated into DidYouMean or some other solution. DAVilla 06:48, 7 April 2008 (UTC)
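To make the code-point side of this thread concrete, here is a small Python sketch using only the standard `unicodedata` module. It shows that the look-alike capitals mentioned above are distinct code points that compatibility normalization (NFKC) does not merge, whereas the full-width letter does fold to ASCII. The helper name `describe` is made up for illustration, and the sketch takes no position on which of these deserve entries.

```python
import unicodedata

# Three visually similar capitals: Latin A, Cyrillic A, Greek Alpha.
lookalikes = ["A", "\u0410", "\u0391"]


def describe(ch: str) -> str:
    """Format a character as 'U+XXXX OFFICIAL NAME'."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


names = [describe(ch) for ch in lookalikes]

# NFKC leaves each of the three untouched: Unicode treats them as
# different letters that happen to share a glyph shape.
unchanged = [unicodedata.normalize("NFKC", ch) == ch for ch in lookalikes]

# The full-width capital A (U+FF21), by contrast, is a compatibility
# variant and folds to the ASCII letter.
folded_fullwidth = unicodedata.normalize("NFKC", "\uff21")
```

So any merging of Latin, Cyrillic and Greek look-alikes would need its own hand-maintained mapping, while the full-width case could lean on existing normalization.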
Colloquial and slang: a sensible combination?
I'm in a disagreement with User:Amgine over at doggie as to whether having both {{colloquial}} and {{slang}} simultaneously is appropriate. I'm firmly in the camp that only "slang" is needed and if the fact nobody commented on that above is any indication, other people seem to agree. Circeus 01:05, 3 April 2008 (UTC)
- I agree with you. As Amgine says, "[t]he terms are not synonymous", but as you say, "slang is by its very definition 'colloquial'". "Colloquial" does not imply "slang", but "slang" does imply "colloquial", so there's never any need to list both. (That's going by what I currently know. If someone can give a decent rationale for ever including both, I'm open to the possibility.) —RuakhTALK 01:25, 3 April 2008 (UTC)
- Colloquial, by our glossary, indicates a term in common, often informal, parlance, as opposed to jargon usage. Slang, in contrast, is characterized by its limited use, unconventionally or as informal jargon; e.g. "anon" on en.wiktionary is slang, but "bike" is colloquial. (Incidentally, I thought informal was defined in the glossary at one point as indicating a term for which there is a more-formal synonym term used in formal circumstances, such as the bike/bicycle dichotomy. Currently there is no definition in the glossary for informal, so should we be using it at all?) - Amgine/talk 14:06, 3 April 2008 (UTC)
- Given that the distinction you refer to was added a few days ago (after your revert, I think, though I would need to check; I've started a discussion about it below), I definitely don't consider it to have any bearing on the discussion. Circeus 17:23, 3 April 2008 (UTC)
- Even given your definitions of informal (the last of which I dispute, btw) the primary definitions are to contrast with formal uses. And, by implication, must clearly not be considered to have any bearing on the discussion.</irony> I was unaware of the edits to the glossary, assuming your good faith in the matter. - Amgine/talk 20:13, 3 April 2008 (UTC)
- Re: the edits, I was just pointing it out before someone looked there and noticed it bore on the discussion.
- As for "my definitions" (which, despite pointed discussions on the subject, have not been altered since), they are directly in line with those of, e.g. Merriam-Webster. In fact, one could argue that the replacement definition is entirely superfluous since meaning 1 arguably covers it (and the entire entry could use a better swipe: it's considered bad form to define a word by saying what it doesn't mean).
- I've made the point in the past, with several references to scholarly sources, that no modern dictionary makes a meaning distinction between the "colloquial" and "informal" labels (which is why they always use only one). Furthermore, "informal" is clearly never used in the very restricted meaning of "spoken" (which is what "colloquial" was originally explicitly coined to cover). Circeus 21:24, 3 April 2008 (UTC)
- Now that you mention it, I don't recall any dictionary using the informal label except the US ones. However, I have not often examined dictionaries specifically for this use. I should examine those near to hand before I comment further:
- Oxford Dictionary of Current English: informal
- M-W Pocket, Online: neither
- Various nautical, rare: colloq and formal (presumably because all else is informal?) But these are not valid as they are, by definition, non-standard.
- I believe this does not indicate a consensus amongst modern publishers, but it may raise the question of the value of *either* label. - Amgine/talk 03:55, 4 April 2008 (UTC)
- There is disagreement over the philosophical application of tags between lexicographers, and although some general language dictionaries do use colloquial, they use it instead of informal, not as a separate category. When M-W published (in the 50s) a dictionary that dropped the colloquial and (IIRC) slang tags, amongst others, they got an incredible amount of flak, but that dictionary is considered one of the most progressive of its time.
- To give another example of peculiar tag use, my Harraps-Chambers bilingual is mildly idiosyncratic itself: it uses neither formal nor informal, but has "ironic" and "humorous" (arguably bilingual dictionaries have different needs and must carry more connotative information than the average monolingual). It uses "familiar" and adds "colloquial" next to the explanation of the abbreviation (Fam). It does use "formal" though. This is likely because "informal" is not used of language in French, and they strived to use a single tag for both languages whenever possible. Circeus 06:02, 4 April 2008 (UTC)
- Longman's Dictionary of Contemporary English 3rd ed (1987) uses: formal, informal, literary, pompous, poetical, slang, dialect, technical, old use, old-fashioned, appreciative, derogatory, taboo, trademark, slang, humorous, and appreciative, but not "colloquial". I included more than have a direct bearing to illustrate that they seem to go out of their way to select straight-forward terms whose ordinary meaning is very close to their intended meaning instead of relying on jargon. Their definition of colloquial is that it indicates usage "suitable for ordinary, informal, or familiar conversation, not formal or special to literature." They are rather progressive in their handling of such matters. (These are the fellows who use "determiner" as a PoS.)
- MW3 (principal copyright 1961) must be the M-W dictionary referred to above. The 1993 ed. retains "slang", but not "colloquial", "formal", or "informal". It does have "standard", "substandard", and "nonstandard". DCDuring TALK 12:03, 4 April 2008 (UTC)
- From this I conclude that "colloquial" is less precise. I wonder if it wouldn't be desirable to allow all of these tags and indicate which ones are more and which ones less precise. That way, even less precise knowledge that users might have would be included. Categories would make the less precise items reviewable if someone actually believed that they had knowledge or belief that they should be more precise. We can take advantage of our non-print wiki nature to be more dynamic and evolutionary than print dictionaries. DCDuring TALK 12:03, 4 April 2008 (UTC)
I find humorous the claim that all slang is colloquial just below another topic called "Internet slang". Certainly Internet slang or other types of esoteric jargon did not necessarily originate in speech. The categories slang, colloquial, and informal all overlap greatly, but there are clearly some terms that will be only one or the other. This is another big reason to maintain the distinctions. -- Thisis0 23:50, 5 April 2008 (UTC)
- Well, obviously the reason I do not normally accept the combination is properly that I consider that "colloquial" and "informal" are one and the same for all practical purposes, so obviously "informal" and "slang" make no sense to use together. Circeus 00:26, 6 April 2008 (UTC)
- Likewise, I assume all informal falls within the purview of colloquial, but not always the reverse; that is, I feel colloquial is more inclusive as some colloquial speech is also formal. However, these discussions suggest the labels are ill-defined and, possibly, driven by editors' opinions more than by objective measure. As such it seems likely they are inappropriate labels to be used at this time. - Amgine/talk 01:58, 6 April 2008 (UTC)
I see a misspelled entry
Just passing by, but I noticed that paranomasia and paranomasias are misspelled: the correct spelling is paronomasia, with an o as the second vowel. --124.178.50.148 01:05, 3 April 2008 (UTC)
- Actually, both spellings seem to be fairly common; the a spellings should probably be marked as alternative spellings of the o spellings. —RuakhTALK 01:13, 3 April 2008 (UTC)
Recent edits to Appendix:Glossary needs community discussion
I think these (especially those to "Colloquial" and "Informal") really ought to be discussed. While I appreciate the addition of a {{familiar}} tag and entry, I'm really dubious of the need to add entries for informal and formal, unless we want to throw meanings on them that they don't have (not to mention that the information given is sometimes contradictory with that found in Wiktionary: Glossary). The links in {{informal}} and {{slang}} are at best superfluous, at worst an insult to our readers. Also, the attempt to specifically distinguish "slang" and "colloquial" (although it does formally establish that the tags are inappropriate in combination, see related discussion above) is fairly ridiculous, as slang is colloquial (if we take the "originating in speech" definition as a basis) by its very definition.
I'll again request a thorough discussion about the basis of distinguishing informal from a purportedly distinct and easy to attribute (because if we have to argue over the tagging of every single word, the tag is pointless) "colloquial" category, as it is indirectly related to the other discussion above. Circeus 01:21, 3 April 2008 (UTC)
- I think of "informal" as being used mostly to create the contrast with "formal", rather than being used in the same context as "colloquial" and "slang". There are certainly words that are mostly used in "formal" contexts, Madame Chairman. "Formal" is not a synonym for standard. It refers to words used on ceremonial and official occasions and contexts, but not limited to any one of them (like courts of law). DCDuring TALK 11:24, 3 April 2008 (UTC)
- I don't really dispute the usefulness of the template, but I do strongly doubt the necessity of defining it into the user glossary. Circeus 21:27, 3 April 2008 (UTC)
- I don't wish to cause trouble, but if we have a term we use in these tags, I would argue that we need to have a link from the tag to either an entry or, if we are using the word in an even slightly idiosyncratic way, to the glossary. DCDuring TALK 21:49, 3 April 2008 (UTC)
- Given the number of various terms we use like this, I agree. If we're going to make distinctions between "informal, colloquial, slang, vulgar, etc" then it seems reasonable to explain the distinctions to our users by means of a link to some kind of explanation. --EncycloPetey 22:21, 3 April 2008 (UTC)
- Of course, it would be a good idea if we could actually agree about the meanings first. I think the previous discussion firmly established there is at best disagreement as to what should be done about colloquial. Circeus 23:58, 3 April 2008 (UTC)
- The tracks are parallel. The only technical decision is whether to use the normal entry definition, a category page, or the glossary. I would argue that the best technical solution would be to use the Wiktionary Glossary definition unless there wasn't one, in which case the normal entry would suffice. Then we can hash this out without further tech involvement. Or perhaps we could simply insist that the Glossary be the sole source for such tag definitions. The category pages would be inconvenient for maintaining the sets of related terms, I suppose. DCDuring TALK 00:40, 4 April 2008 (UTC)
Citations_talk: is deprecated
There has been scattered discussion in various places about the [[Citations_talk:]] namespace with the general feeling being that, given the phenomenal amount of use that our Talk: pages get, we may as well have one talk page to discuss the word - rather than two talk pages to discuss two intimately related pages. The Citations_talk: namespace should now be empty, and the site javascript conspires heavily against users finding themselves there. Conrad.Irwin 11:33, 3 April 2008 (UTC)
- That's nice. Now, given that one of the primary purposes of the Citations namespace is to collect citations for words that we don't yet have in NS:0, and may or may not have in the future, where do you suggest putting any resulting discussion? An NS:1 Talk page will look orphaned, and such are routinely deleted. Eh? Robert Ullmann 11:58, 3 April 2008 (UTC)
- They should still be in the Talk: page, as - should the entry ever be created - previous discussion about the citations is likely to have an impact on it. It should be trivial to ask whatever (or whoever) deletes orphaned talk pages to check for Citations pages of the same name. (off topic) I feel that the obsession with deleting orphaned talk pages is unnecessary and indeed harmful - it is possible for the community to talk about entries that will never exist or have been deleted. It makes sense to store all this talk in a central place where it can be found instantly should the need arise - the talk page is the ideal page for this. Take for example the {{rfvfailed}} template - which archives deleted information easily accessible on the talk page, yet if a whole entry gets deleted the archive is somewhere in a page history accessible from some index that I can't remember the name of at the moment. Conrad.Irwin 12:36, 3 April 2008 (UTC)
- Perhaps a standard link (tab?) from Citations to the associated talk page would facilitate finding it. And if it were red, it would convey non-existence. I don't see any reason to delete our previous work, either citations or discussion. Page history is not readily searchable, AFAIK. DCDuring TALK 16:53, 3 April 2008 (UTC)
- Re: Ullmann's comment: why are any discussion pages with valuable content being deleted at all? Orphaned talk pages should only be deleted if they contain vandalism, discussions about terms which may not presently exist are valid if only to show that someone has discussed them and a decision was made that the word didn't merit inclusion. Orphaned talk pages should not be deleted by rote.
- Re: the general discussion: I am opposed to splitting discussions between the cites and talk page, since the cites page is simply another type of "discussion" about the NS:0 term. It isn't as if the talk pages are swamped, the average defined term has no extant talk page and the average talk page is tiny, even compared to the entry. Best to keep all meta information in one place. - [The]DaveRoss 20:27, 3 April 2008 (UTC)
- So apparently there is this fancy thing in MW now called "namespace alias" where we can make all "Citations_talk" pages point directly at "Talk" if we wish (also "WT" at "Wiktionary" if we want that). - [The]DaveRoss 20:54, 3 April 2008 (UTC)
- Oooh. That sounds a lot like what I would want. Any drawbacks? Presumably it would need a vote to be implemented. DCDuring TALK 17:36, 4 April 2008 (UTC)
- Yeah, on wiki consensus would need to be displayed, and the namespace which was made into an alias would never-ever be editable, at all. - [The]DaveRoss 20:30, 4 April 2008 (UTC)
- I think it would be more useful to wait and see how Citations talk might be used than to declare this early that it could not be used. Primarily I am concerned that there is not the 1-to-1 correspondence between entries and Citations pages that people wish there to be. I could be proven wrong, but I would want time itself to do that.
- However, I am certainly interested by this ability to create namespaces that are not the usual duals. I never did understand, for instance, how WT talk: could ever be of use. WT: isn't really a "full" namespace anyway. At the same time, it would be incredibly useful to have the Template: space for documentation and another space that actually holds the code, sharing a single talk page. But you've heard me say that before. DAVilla 06:12, 7 April 2008 (UTC)
- Well <noinclude> allows documentation on the Template page, it is just more common to see it on the talk page, no reason really to do it that way. I agree that WT_talk is useless, I personally think we ought to make it an NS alias, but I wasn't greeted with full support when I voiced that opinion... - [The]DaveRoss 20:55, 9 April 2008 (UTC)
The definition is embedded in {1} code and references contain {2} but they don't seem to do anything. Is this something I have to watch for or can I delete it when I see it? --Panda10 20:34, 3 April 2008 (UTC)
- You can delete that on sight, it is meant to be used when a definition is subst:ed in from a template, but it ends up as a relic when people who don't know how to use the template do. - [The]DaveRoss 20:56, 3 April 2008 (UTC)
Formatting of Idioms, Proverbs in non-English entries
I haven't found any precise information on how to format Idioms or Proverbs in non-English entries. The main problem we have is where to put the literal translation. Here is a summary of the different elements to put in a standard entry.
- Wiki links on the words or groups of words of the Czech proverb: yes, in lemma form
- Translation with the equivalent proverb in English: with a # after the entry, if not just omit
- Literal translation of the Czech proverb: on the same line as the entry, at the end, in brackets, using the tr= parameter in the template {{infl}} (but that should only be used for transliteration), or at the end of the definition in brackets, or in the Etymology section?
- Explanation of the idiomatic meaning of the Czech proverb: after the translation with #:, in italics?
See About Czech for more information and also Translation of idioms. --Thomas was here ☺ 12:13, 4 April 2008 (UTC)
- I usually put the literal translation as part of the etymology section. There may be times where explaining the literal translation further can help, and sometimes that can go under "Usage notes". I usually wouldn't put the literal translation as a "definition" unless there was no English expression that came close to the meaning. --EncycloPetey 18:00, 4 April 2008 (UTC)
- I also use the etymology section for this. I can't think of a specific case where a usage note would be helpful, but I can abstractly imagine that sometimes an idiom's literal meaning could have implications a user should know about. I agree that the literal translation shouldn't be given a sense line unless the idiom is sometimes used literally, and even then, caution is warranted. —RuakhTALK 22:39, 4 April 2008 (UTC)
- Me too. -- Visviva 02:51, 5 April 2008 (UTC)
- Fine, so there seems to be a consensus on putting the literal translation in the Etymology section. Thanks for your answers. Another and I hope last question is: should we copy the idiomatic meaning from the English entry in the non-English entries ? I can also add the final version of this formatting to a Wiktionary:Proverbs page and put this page into the Category:Proverbs. It also is maybe time to add a section Non-English entries in the Entry layout explained. --Thomas was here ☺ 16:32, 6 April 2008 (UTC)
- We already have Wiktionary:Language considerations for that; the page is just severely underused. --EncycloPetey 21:35, 6 April 2008 (UTC)
Illustrations
I suggest a new guideline for illustrations: they must be helpful when trying to understand the meaning of the word (if not, they belong to Wikipedia). As an example, I think that the picture in maritime pine does not belong here, but a picture of the tree would help. Lmaltier 16:47, 4 April 2008 (UTC)
- Not a bad idea. But a little visual interest and color is better than an absence of any illustrations. At least the picture is (purportedly) actually of a maritime pine, albeit a rather young one. DCDuring TALK 17:30, 4 April 2008 (UTC)
- Perhaps whoever took that picture is planning on returning in 30 years to get us a followup...assume good faith :) - [The]DaveRoss 20:29, 4 April 2008 (UTC)
- Changed that pic, hope it's more satisfactory (others at Commons:Pinus pinaster). In general, I agree; we should generally have no more than one image per sense, and the image should be chosen to illustrate that sense as clearly as possible -- which in the case of an organism, means preferably a picture of the whole, mature organism. We should link to Commons or Wikipedia for additional visuals. However, if the previous picture had been the only one available on Commons, I think it would have been better than nothing. -- Visviva 02:50, 5 April 2008 (UTC)
- It's a fine pic and probably better than the other, but from a distance many pines look alike. If I wanted to know how a maritime was different from the pines I know, probably neither picture would help. I wonder whether we should make all the wikicommons pictures on the subject just one click away. DCDuring TALK 04:00, 5 April 2008 (UTC)
- All 3 sister project links in lite form seem useful in this case. Lite form makes it all appear on a single screen. I suppose there might be other cases where the bigger links would be OK. And yes that probably is the best single picture, whatever its shortcomings might be for specific purposes. And with tabbed browsing it is so easy to compare images of two or more kinds of pines. DCDuring TALK 04:10, 5 April 2008 (UTC)
- My point was that, sometimes, no picture is better: everything that blurs the limit with Wikipedia should be avoided, including pictures when their only interest is encyclopedic. Pictures should illustrate definitions, they should be visual definitions. A good example, for a country name, is a map showing where the country is. This is not original, I think that this principle is applied for picture selection in most dictionaries (except encyclopedic dictionaries). Lmaltier 21:01, 5 April 2008 (UTC)
I'd like to find a category to put these words in. I think it should be a subcategory of People, but what would be a good name? Thanks. --Panda10 22:04, 4 April 2008 (UTC)
- Category:Titles is related, but not quite the same. I had been thinking a while ago that it would be good to have a category for ranks of nobility like duke and baronet. I think whatever is used for "rulers" should probably encompass "nobility" as well. Mike Dillon 02:37, 5 April 2008 (UTC)
- hereditary heads of state seems reasonable. --Allamakee Democrat 03:32, 5 April 2008 (UTC)
- Wouldn't that exclude many founders of dynasties? DCDuring TALK 18:34, 5 April 2008 (UTC)
- I'm not sure we need or want categories that are that granular. We already have too many categories that will never have more than a dozen entries; it's not like we're going to be adding individual heads of state a la Wikipedia. Mike Dillon 03:59, 5 April 2008 (UTC)
- What is the minimum number of entries to create a category? How about Leaders for category name with this list as a starter: autocrat, crown prince, czar, czarina, despot, dictator, dynast, emperor, empress, generalissimo, governor, head of state, imperator, kaiser, khan, king, leader, magistrate, magnate, maharajah, maharani, mikado, mogul, monarch, Negus, pharaoh, potentate, prince, princess, queen, rajah, rani, regent, ruler, shah, shogun, sovereign, sultan, tsar, tsarina, tycoon, tyrant, viceroy
- Or I could add these words as a See also to king without creating a category. Panda10 13:04, 6 April 2008 (UTC)
- I'd go with Category:Rulers personally; "leaders" is awfully vague and doesn't describe a very clearly-delineated semantic field; it could just as easily include things like pastor and principal. But a category should (IMO) definitely exist which includes most if not all of the terms you have listed above. -- Visviva 14:08, 6 April 2008 (UTC)
- I'd avoid Category:Rulers personally. The category name is ambiguous. Does it refer to people who rule, or to instruments used to measure distance? Such ambiguous names should be avoided. --EncycloPetey 16:24, 6 April 2008 (UTC)
- How about Category:Monarchs? See Wikipedia w:Monarch listing a lot more words. This categorization would exclude words for modern leaders, though. Or better: Category:Positions of authority? This is a wider category and could include a lot more. --Panda10 17:53, 6 April 2008 (UTC)
- "Monarchs" is too narrow a description. I fear that "positions of authority" may be too broad, unless we want to include abbot, schoolmaster, manager, etc. I haven't thought of a good name, but I have come up with many bad ones. --EncycloPetey 21:33, 6 April 2008 (UTC)
- "Heads of state" would probably work. "Nobility" should be a separate category, although obviously some entries would be in both. Thryduulf 21:57, 6 April 2008 (UTC)
- "Heads of state" doesn't really work for princess, queen, or prince (at least not in general). There seem to be a few cross-cutting things going on here, mainly heads of state and royalty. Mike Dillon 22:13, 6 April 2008 (UTC)
- I am not certain that there is a need for one category to contain all these entries, categories "Heads of state", "Royalty" and "Nobility" linked by see alsos would seem to be the best solution to me. "Marquess", "King" and "Prime Minister" don't seem to belong to a single class of positions to me. Thryduulf 22:31, 6 April 2008 (UTC)
- I concur. That looks like a pretty good breakdown. Now the question is where to put them. "Heads of state" probably makes sense under Category:Government and/or Category:Titles. Royalty and Nobility could make sense under Category:Society and/or Category:Titles. Mike Dillon 01:31, 7 April 2008 (UTC)
I created Category:Heads of state and set up the parents and a customized description for using {{topic cat}}. We can adjust if necessary. Mike Dillon 01:44, 7 April 2008 (UTC)
- P.S. We already have Category:Monarchy too under Category:Forms of government. Mike Dillon 01:46, 7 April 2008 (UTC)
Dating
Would anyone object to categorizing words by the year/decade/century of their introduction to the language/earliest attestation, where such information is known? --Ptcamn 08:02, 5 April 2008 (UTC)
- Not really, but it seems problematic. What really counts as "knowing" that information? To prove that a word was introduced in the year/decade/century X involves proving that was not in use at any time before X -- and proving a negative is a difficult task under the best of circumstances. Third-party sources aren't necessarily a reliable fallback, either -- I think there have been a few recent cases where we've found citations for terms which substantially predate the dates of introduction given in reputable sources. -- Visviva 16:32, 5 April 2008 (UTC)
- You want to introduce this concept for all languages? or limit it to English? or to Modern English? Even limiting to modern English, you'd be talking about adding hundreds of categories. Personally, I'm not sure they would be very useful. Why would anyone want to look up "English words first attested in 1712"? --EncycloPetey 18:12, 5 April 2008 (UTC)
- Folks might be more interested in words first attested after a certain date so categories as we now know them might not be the right approach. We already have, in principle, the concept of definitions that are "dated", which implies that we know when they died or, at least, retired from active service. A bit of unstructured user Feedback suggested the notion. Perhaps that will be something we will further develop in the second decade of Wiktionary. DCDuring TALK 18:29, 5 April 2008 (UTC)
- I can't imagine a category more useless than Category:English nouns, so if you want to create these, I'd have no problem with them. But, as Visviva notes, caution is warranted. —RuakhTALK 19:40, 5 April 2008 (UTC)
- Not the topic, but I find the Wiktionnaire category for French words useful, despite its 350 000 entries. I already used it several times. Lmaltier 21:14, 5 April 2008 (UTC)
- Categorize after earliest and newest added quotation? Nah, I don't know... Still, it would be a hassle, if not outright confusing, for words of several meanings. \Mike 12:26, 7 April 2008 (UTC)
Keenebot2 to branch out?
This thread is to inquire about User:Keenebot2's generation of conjugated forms of verbs, nouns, adjectives and other parts of speech in foreign languages other than French. Hopefully I've proved myself capable of running a bot - around 50000 entries created in the last couple of months for French stuff. I'd like to branch out into other languages now, only ones which have logical conjugations. Examples of Keenebot2's non-French FL entries are for inflections of dificultar (Catalan), precisar (Catalan), soma (Romanian) and lyste (Norwegian). Essentially, I'm asking to be allowed to use the bot for whatever I feel might be useful, as long as it's within the "get data from Wiktionary, process offline and spout out tonnes of declension/conjugation forms in WT format" form. I'd hate to have to go through an uber-bureaucratic 2-week vote for every language I use, and also am not keen on abusing the bot status to pick things willy-nilly and add unfamiliar languages so nobody notices, so can I get "free rein" for my bot? Keene2 13:50, 5 April 2008 (UTC)
- The bot authorization is only for French, so you'll need a vote. A very important consideration is the involvement of a native speaker in the process; otherwise serious errors will occur. A native speaker needs only to look at the entries being made and say wait a sec, that's wrong!. The vote process is very simple if not controversial, and if you are too impatient for the time you have to wait, you probably should wait a bit anyway ;-). For now: got a native speaker of Romanian? Robert Ullmann 14:41, 5 April 2008 (UTC)
- "Native speaker" is pushing it, but yes, there needs to be someone who knows the language well enough to vouch for the resultant entries. —RuakhTALK 16:09, 5 April 2008 (UTC)
- 2 week vote started. Any questions are best posed there. Keene2 14:48, 5 April 2008 (UTC)
- I will point out here that prior to running TheDaveBot for Spanish verbs I looked over it, two native speakers looked over it, the pages sat for several months and THEN two errors were discovered (two per verb, not two total). I think Keene will act in the best interests of Wiktionary and I don't expect perfection, even though I am certain Keene will aim for it. - [The]DaveRoss 16:25, 5 April 2008 (UTC)
Unicode 5.1
Is finally officially out:
There is a lot of cool new stuff inside, so check it out. Unfortunately still no Avestan and Egyptian hieroglyphs :/ --Ivan Štambuk 21:15, 5 April 2008 (UTC)
Early Cyrillic
Unicode 5.1 includes some revisions and very significant additions for the range of Cyrillic characters used for Old Church Slavonic (cu, chu), Old East Slavic (orv) – used in etymologies, e.g. горілка#Etymology – and modern Church Slavonic languages (and probably many others).[1] A small selection is already available in the Dilyana font,[2] and undoubtedly there is more font support to come.
It's likely that only obscure Slavistics fonts will support this range, at least at first. Will we have to fork the current Cyrillic style (.RU) into a second version with its own list of fonts for this purpose? When the fonts do become available, should the preference be to imitate traditional typography, which uses old-fashioned manuscript-style typefaces for these languages? —Michael Z. 01:30, 6 April 2008 (UTC)
- Such significant additions merit their own brand new ISO 15924 code, so they created Cyrs - Cyrillic (Old Church Slavonic variant). So far OCS entries (2158 of them according to WT:STATS) are using exclusively {{Cyrl}}, which should be changed to {{Cyrs}} once the particular font issues are inspected and settled. Dilyana is so far used for Glagolitic ({{Glag}}). --Ivan Štambuk 02:06, 6 April 2008 (UTC)
- I've created a draft {{Cyrs}}, based on the existing pattern, specifying Dilyana font followed by other Cyrillic ones, and applying class="Cyrs".
- Is there any reason not to add a class="Glag" to {{Glag}}, for future use and user customization? —Michael Z. 04:46, 6 April 2008 (UTC)
- We should have Glagolitic. It was in use in the Balkans long after it disappeared elsewhere, so there are documented forms for relatively modern words in Glagolitic spellings. --EncycloPetey 04:57, 6 April 2008 (UTC)
- The template is already there, and seems to be in use on 226 pages. I'll just go ahead and add the class, so it will be there if anyone needs it. It seems clear that there will be no conflict. —Michael Z. 05:06, 6 April 2008 (UTC)
Hm, it seems that applying "font-family:Dilyana;" breaks glagolitic text in my browser (Safari/Mac, but it doesn't affect the display in Firefox 2/Mac). Evidence for "don't mess with it if it's not broken". —Michael Z. 05:19, 6 April 2008 (UTC) NM; restarting the browser fixed it. —Michael Z. 05:26, 6 April 2008 (UTC)
Still lacking credibility as a decent dictionary
I wrote the following in the Requests for Cleanup, but thought it worth copying here.--Richardb 00:24, 6 April 2008 (UTC)
It is still too easy to find basic words, such as head, which have far fewer meanings listed in Wiktionary than in many a concise dictionary. I pointed this out about head a couple of years ago. Yet it is still missing some simple definitions:-
- head of steam, head of pressure.
- head of a door frame
- it cost him his head (it cost him his life, but his head may still be in place!)
- $10 per head
- side of a coin
- part of a tape or disc player, printer etc
- promontory
- events come to a head; a climax
- the top of a pimple, spot, or boil
- out of one's head; off one's head
etc etc.
some parts are confused:-
- (countable) The topmost, foremost, leading or principal operative part of anything.
What does it say on the head of the page?
Principal operative part of a machine has nothing in common with head of the page
I previously tried to get some sort of Quality Control Project going on the top 1000 words, but was defeated by apathy (mine and everyone else's). It has to be a team effort, but team efforts never seem to succeed here. Everyone seems to want to do their own thing. So Wiktionary still seriously lacks credibility in its most basic function - as an English Dictionary.
I'm no longer interested in trying to take this on. But unless quite a decent group takes it on, the dictionary is still going to be lacking credibility, despite all the other wondrous stuff which people spend time adding.--Richardb 00:24, 6 April 2008 (UTC)
- So what you are saying is - you can't be bothered to fix it yourself, but are complaining that others aren't fixing it for you? Hardly the spirit of a Wiki. SemperBlotto 07:12, 6 April 2008 (UTC)
- Nonetheless I think the basic point is valid -- we are unlikely ever to be taken seriously as a dictionary unless we have exemplary coverage of core English vocabulary. Part of the problem here is that writing a reasonably comprehensive entry for a common word like head is easily a day's work; personally, on the rare occasions when I have that kind of time available, I find it difficult to justify spending the day improving one entry rather than creating 50 or 100 entries for words that we don't have yet. But I do agree that this is our single most significant failing at present. The next time I actually have an 8-hour block available, I'm bringin' it to the GSL. -- Visviva 13:01, 6 April 2008 (UTC)
moo goo gai pan: What if you don't speak the language?
Copied from: Talk:蘑菇鸡片
start ...omitted... The Cantonese definition seems to have been removed, although this is fairly clearly a dish of Cantonese origin. 24.29.228.33 02:59, 6 April 2008 (UTC)
- I agree. The problem that I'm running into is that I can't find any corroborating material. I don't think citing the Wikipedia article is appropriate at this point, since it also lacks proper citations. I don't speak Cantonese myself, so I don't know if I can trust the accuracy of the scant materials that I've found online. Already, I've found one discrepancy. Wikipedia says that 鸡片 is "gai1 pin3" but Cantodict says that it's gai1 pin3*2.[3] I think I'll post this to WT:BP, and find out what others want to do. I'd honestly rather have nothing than include potentially inaccurate information. My reason is that I've seen how errors and inaccuracies are quite easy to perpetuate online, once they are out there. -- A-cai 05:44, 6 April 2008 (UTC)
end
First of all, are there any Cantonese speakers that could help out with these two entries (蘑菇雞片 and 蘑菇鸡片)? If not, what do we want to do about such entries (words in languages which are difficult to verify, especially because we lack the appropriate native speakers at Wiktionary)? Opinions? -- A-cai 05:44, 6 April 2008 (UTC)
Wiktionary:Transliteration
The guidelines at Wiktionary:Transliteration and the contents of Category:Wiktionary:Transliteration need some attention. There are a few independent issues I'd like to address, so I'll place them under separate subheadings here. —Michael Z. 16:45, 6 April 2008 (UTC)
Forked guideline
I'd like to merge Wiktionary:Transliteration with Wiktionary:Transliteration and romanization. There doesn't seem to be any reason for two essentially redundant guidelines. Any objections? —Michael Z. 16:45, 6 April 2008 (UTC)
- Sounds good to me. :-) —RuakhTALK 19:36, 6 April 2008 (UTC)
- Since there's no objection, I'll go ahead and merge these shortly. I'll add merge notices to the pages immediately, in case someone missed this discussion. —Michael Z. 18:33, 10 April 2008 (UTC)
- Done merging, and then made some additions and reorganization on the page. Please look over Wiktionary:Transliteration and romanization. —Michael Z. 21:51, 10 April 2008 (UTC)
Nomenclature
Romanization is the more general category, with transliteration being more limited in scope. In one case (Wiktionary:About Japanese/Transliteration, dealing with w:Hepburn romanization) a guideline seems to be incorrectly named. Since we're dealing with less than a dozen guidelines so far, I've proposed moving the guideline to Wiktionary:Romanization and re-categorizing it under Category:Wiktionary:Romanization. Comments or objections? —Michael Z. 16:45, 6 April 2008 (UTC)
- That makes sense. It's not like we'll ever be transliterating into any non-roman script. —RuakhTALK 19:36, 6 April 2008 (UTC)
- Incorrect. Transliteration has the broader scope, since any script may theoretically be translated into any other script. Romanization is a specific subset of transliteration in which the result is written with Roman letters. There are some languages included here (like Serbian or Crimean Tatar) which in fact are written in multiple scripts. --EncycloPetey 21:29, 6 April 2008 (UTC)
- Eh, that's iffy. I'd argue that neither has broader scope, since "transliteration" typically implies a character-by-character mapping scheme (so, you can transliterate Greek writing, but not Hanzi writing, into Latin script), which "romanization" does not; but "romanization" necessarily implies mapping into Latin script, which "transliteration" does not. But for Wiktionary purposes, where we only ever map text into Latin script, "romanization" has the broader scope. (Your comment about languages in multiple scripts strikes me as a red herring, since in that case we're not transliterating Serbian+Latin into Serbian+Cyrillic, but rather including the already-existent Serbian+Cyrillic alongside the already-existent Serbian+Latin. If there's a Serbian+Cyrillic word that's spelled funkily for whatever reason, we'd use that funky spelling, rather than providing a straightforward Cyrillic transliteration of its Serbian+Latin counterpart.) —RuakhTALK 21:55, 6 April 2008 (UTC)
- You may be right about Serbian, but I maintain that Transliteration has the broader scope. Strictly speaking, Romanization implies the use of only Roman letters, but Pinyin "Romanization" includes Arabic numerals to transcribe tone. On reflection, there are aspects to Romanization that are not covered under the term Transliteration, just as there are aspects of Transliteration not covered under the term Romanization. So, I retract what I said about one being the subset of the other; they are two items which have significant overlap on Wiktionary, but neither is wholly included within the other. --EncycloPetey 22:05, 6 April 2008 (UTC)
- It's true that I only considered transliteration into Latin when I suggested that romanization is the more encompassing concept, but that is what we're concerned with. Romanization from another alphabet is also transliteration, while romanization from a logographic system is not (the addition of numbers to pinyin is just a detail of a romanization system). In en.Wiktionary, "romanization" covers the whole topic more precisely than "transliteration" does.
- The broad standards bodies have run into these constraints too, so while Slavicists usually say "transliteration", the BGN/PCGN refers to all of their standards as "romanization" systems. —Michael Z. 22:30, 6 April 2008 (UTC)
I'll leave this alone for now, since there doesn't seem to be active support for changing the names. —18:37, 10 April 2008 (UTC)
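The distinction drawn in this thread, that "transliteration" typically implies a character-by-character mapping while "romanization" in general does not, can be illustrated with a minimal Python sketch. The table below is a small hypothetical fragment for Greek, chosen only for illustration; it is not Wiktionary's adopted scheme or any published standard, and the function name is likewise an assumption.

```python
# Hypothetical, partial Greek-to-Latin table, for illustration only.
GREEK_TO_LATIN = {
    "α": "a", "β": "b", "γ": "g", "δ": "d", "ε": "e",
    "λ": "l", "ο": "o", "σ": "s", "ς": "s",
}

def transliterate(text, table=GREEK_TO_LATIN):
    """Map each character independently; characters not in the table pass through unchanged."""
    return "".join(table.get(ch, ch) for ch in text)

print(transliterate("λογος"))  # logos
```

A logographic character such as 中 has no entry in a table like this and simply passes through, which is one way to see why romanizing Hanzi needs something other than per-character transliteration.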
Standards
For romanization and transliteration we are using a mix of established standards, slightly modified standards, and systems created specifically for Wiktionary. Some of our romanization guides emphatically state that romanization is distinct from pronunciation, while at least one novel phonetic system is under development with the explicit assumption that the need for transliteration "is now suddenly past."
I think we need to develop some basic guidance for the use of romanization and transliteration in Wiktionary.
- Briefly, what is the purpose of romanization in Wiktionary? Is it distinct from pronunciation, and if not then should it be merged with the latter or deleted altogether?
- What circumstances justify developing our own novel system instead of adopting an established standard, created and used by professionals?
—Michael Z. 16:45, 6 April 2008 (UTC)
- Briefly, the purpose is to enable the casual reader to look at a string of characters they don't know, ignore that string, and look at the string right next to it of characters that they do know, so they'll have some idea of the word in question, will easily be able to tell if the same word is mentioned in more than one place in an entry (assuming they can distinguish between the various scripts it might be written in that could all produce the same romanization), and so on. It is definitely distinct from pronunciation, because many languages are like English in that a single word can have vastly different pronunciations over space and time, but we don't want to have to provide all those pronunciations every single time we mention a word in any entry. Also, because we typically aim to provide pronunciations in a fairly technical form (IPA, SAMPA, etc.) that are hard to guess at if you're not familiar with them; romanizations, by contrast, should be easily (if sometimes ambiguously) intelligible to the casual reader.
- I think for most languages, there exist various co-existing de facto standards, and I'd almost say that in most cases it's better to form our own balance than to try to impose some de jure standard that's not representative of our needs and those of our readership. (This is also complicated by the fact that the de jure standards are often tied to specific organizations and specific kinds of goals, and therefore are potentially POV; and even when this isn't the case, they're frequently way too technical for our purposes.)
- —RuakhTALK 19:36, 6 April 2008 (UTC)
- I don't think it's a good idea to just dump standard transliteration schemes used by thousands of publications (including most of the real-world dictionaries) in favour of some ad-hoc designed ones that are Wiktionary-specific and which should somehow approximate the phonetic value of a word to a clueless reader who just happens to randomly open some FL entry. If someone is supposed to actually learn a FL word using Wiktionary (assuming that that is the primary purpose of FL entries), he is expected to be familiar with some basic properties of it, like phonology and transliteration system. --Ivan Štambuk 20:28, 6 April 2008 (UTC)
- In the case of “standard transliteration schemes used by […] most […] real-world dictionaries”, I agree with you; but I think that for most languages, the so-called "standard" transliteration schemes are really not the most widely used. (Perhaps I'm simply mistaken; perhaps Hebrew, the only non-Latin-alphabet-using language I know well enough to really form my own opinion about, is simply an exception in this regard.) Also, transliterations are not just for foreign-language entries, but also for English-language etymology sections and so on. And even foreign-language entries are not just for people actually learning the foreign language, but also for people who encounter a foreign-language word in some context that makes them want to know more about it. (And I don't think that for most languages there's any transliteration scheme that could be considered a "basic property" that a language learner is expected to know.) —RuakhTALK 21:09, 6 April 2008 (UTC)
- Transliteration also helps non-readers of foreign scripts discern and compare the structure and possibly the phonemics (not the phonetics) of words of various languages.
- I would suggest that for some of these reasons it would be best to use a system in use in dictionaries or in linguistics. I think it is generally better to use an established system than to invent our own—even if it is a rare one, then it would be used in at least two places, not just one. Some systems may have variations or not be well defined, in which case we may choose to nail down the fuzzy details.
- I also believe that the wiki principle of reliance on documented knowledge strongly discourages us from presuming to have the expertise to develop or modify a better romanization method than those that have been developed or used by professionals or academics.
- But in the cases of some languages, the choice of a best system may be debatable, or there may be no good candidate (e.g., Wiktionary:About Thai#Transliteration). —Michael Z. 22:55, 6 April 2008 (UTC)
- The primary purpose of FL entries on Wiktionary is to help English-speaking users (not necessarily native English speakers; English being de facto the world's only lingua franca, and the defining vocabulary much easier to acquire than any specific terminology, cross-language learning opportunities are much bigger here than in e.g. Wikipedia) learn what FL lexemes mean, with as much additional data as could enhance the learning experience. Experiences of others (those who happen to randomly open a FL entry or navigate to it via ===Etymology===) that have absolutely no interest in the FL entry itself, nor wish to spend a reasonably small amount of time acquainting themselves with the transcription/transliteration system usually used for it, should be of little or no concern. Transliterations can sometimes convey much important data - stress/pitch/tone via diacritics that could sometimes be phonemic but not marked in usual orthography, or hyphenation for separating clitics and compounds (which are due to various peculiarities sometimes very difficult for beginners to distinguish).
- Most "important" scripts have some sort of standard transliteration system (in lots of cases an ISO standard), or usually a half a dozen of them (The great thing about standards is that there are so many to choose from) that are widely used, and Wiktionary should follow the most common practice employed by real-world FL-English and English-FL dictionaries. Significant deviations should be thoroughly discussed and voted on (like when the community decided to dump /r/ and use /ɹ/ despite the fact that most (>90%) English-English, FL-English and English-FL dictionaries use /r/ and that almost no one except trained linguists and knowledgeable enthusiasts knows wtf "alveolar trill" means ^_^).
- It might make some sense to account for those who want to see "ch" instead of "č", "sh" instead of "š", "ś", /ts/ instead of /c/ etc. - but not at the expense of all the others who could use the Wiktionary to learn the language, and would expect it to follow the scheme used by most of the other FL-English dictionaries. Maybe in simple.wiktionary.org, or some "dumb-mode" WT:PREFS option ^_^ --Ivan Štambuk 19:51, 7 April 2008 (UTC)
In a while, I will try to incorporate some of these thoughts into the transliteration guidelines. When/if I formulate some concrete wording, I'll introduce it here before changing the guideline. —Michael Z. 18:43, 10 April 2008 (UTC)
Organization
Transliteration guides are spread out under different namespaces and categories, and inconsistently titled. Some merely refer to standards outlined in Wikipedia articles. [please add any omissions]
- Wiktionary:About Greek#Romanisation
- Wiktionary:About Greek/Transliteration
- Wiktionary:About Hebrew
- Wiktionary:About Korean#Romanization
- Wiktionary:About Russian
- Wiktionary:About Thai#Transliteration
- Wiktionary:Ancient Greek Romanization and Pronunciation
- Appendix:Kurdish transliteration
- Appendix:Mongolian transliteration
- Appendix:Persian transliteration
- Appendix:Russian transliteration
- Appendix:Ukrainian transliteration
- Wiktionary:About Japanese/Transliteration
- Wiktionary:About Sanskrit#Transliterated entries
Where does all of this belong?
- It seems to me that any "wiki-romanization" originated by this project belongs in the Wiktionary: namespace, and not in an Appendix:.
- Is it better to point to Wikipedia, or to duplicate that material here, in cases where only standardized systems are used?
- Should we present alternative standards, or only include Wiktionary's selected or created romanization systems?
—Michael Z. 16:45, 6 April 2008 (UTC)
- I think for some languages, such as Han-using languages, it does make sense for the romanizations to be described in appendices, since a reader might find it useful to learn the details of our system. But for other languages, such as Greek or Hebrew, an interested reader would find it much more useful to simply learn the script for his or herself, and the romanizations are probably only needed in the Wiktionary namespace. (Even when we do have an appendix, it might be best to have both an appendix and a project page, aimed at different audiences. Keeping them in sync would be a bit annoying, but when you consider that we also have to keep all main-namespace romanizations in sync, it's really nothing by comparison. :-P) —RuakhTALK 19:36, 6 April 2008 (UTC)
- These pages do NOT necessarily all belong in the Wiktionary namespace. If a page is about standards of transliteration used specifically for Wiktionary, then it belongs in the Wiktionary namespace, either within an "About Language" page, or as a page or subpage of its own linked from that "About" page. On the other hand, if the page is about a variety of transliteration schemes, for the benefit of users who may have a work with an unusual transliteration scheme, then it should be an Appendix. The Wiktionary namespace is set aside for information about practice on Wiktionary, and should include only the standard selected for Wiktionary. The Appendix namespace covers supplementary material not specific to Wiktionary, and should include any major system likely to be encountered. --EncycloPetey 21:22, 6 April 2008 (UTC)
- Sensible, but it results in the guides for Wiktionary's romanization/transliteration being split between two different namespaces, or having some Appendix information repeated in the Wiktionary: namespace. I guess this could be ameliorated using categories, and by adding a definitive list to the main romanization/transliteration guide. Which is the tidiest solution? —Michael Z. 21:04, 7 April 2008 (UTC)
- That will depend on what information currently exists on Wiktionary for a given language. I would think that having an "About:Language" page would be an important first step, since there is the possibility of listing and linking such key pages and sections from the bottom of the page. --EncycloPetey 22:52, 7 April 2008 (UTC)
- There shouldn't be any significant duplication. The project page should describe what is required/recommended for entries, and the appendix should explain how a given system works. So if it is the consensus that for language A, romanization X should be used, the "Wiktionary:About A" page should say "Entries in language A should use romanization system X," and link to "Appendix:X Romanization." Beyond this, considerations that affect how a romanization system is used in entries (layout, templates, etc.) go in project space; but the description of the system (insofar as it is not unique to Wiktionary) goes in appendix space. -- Visviva 09:17, 9 April 2008 (UTC)
- That's a good summary, Visviva. I will review the relevant guidelines and appendices, and perhaps shuffle things around a bit to fit this picture. —Michael Z. 18:45, 10 April 2008 (UTC)
Have a look at the pages in Category:Transliteration appendices. Most of them are simply labelled "Wiktionary standard translation", with no explanation or citation. I'll move these from the Appendix namespace into Wiktionary, and post a note on each requesting a reference. —Michael Z. 16:59, 11 April 2008 (UTC)
Does anyone still object to AutoFormat just fixing these? I don't think I've ever seen a case where its explanation of what it would do was wrong; granted, in plenty of cases it was incomplete, but I think that's because it only adds one {{rfc-*}} tag at a time, so if it actually just fixed things, I think it would have done a complete job. It's annoying that we have to do these manually, and frankly, I'm not convinced that manual intervention is any less error-prone than AutoFormat would be. —RuakhTALK 19:20, 6 April 2008 (UTC)
- Agreed. I can't think of an instance where AF would have done something I disagreed with. However, it might be nice to have an official proposal of new things we're giving AF license to do, so we can specifically agree to them (if Robert's willing to throw together such a list). -Atelaes λάλει ἐμοί 21:27, 6 April 2008 (UTC)
- Likewise. I can imagine that some of the more complicated pages could present a problem (multiple POS sections with a single Translation section at the bottom), but these are very rare and are problematic anyway. --EncycloPetey 22:07, 6 April 2008 (UTC)
- I've asked the same question here myself a while back, and since then I think I've seen an error, but just one, which I can't believe I didn't report. Suggestions aren't something people jump on, but if the bot does something wrong then we can yell at Robert to tweak it. DAVilla 05:02, 7 April 2008 (UTC)
- Yes. Go for it. SemperBlotto 07:26, 7 April 2008 (UTC)
Appendix:Old Cyrillic alphabet
I've created a new Appendix:Old Cyrillic alphabet, including transliteration. Please review and correct any mistakes. —Michael Z. 02:58, 7 April 2008 (UTC)
Placement of terms consisting of multiple words
I am by convention placing terms consisting of multiple words such as complex analysis under the Derived terms header of the article, in this case analysis, as it is my understanding of WT:ELE that they belong there. Is my understanding shared by the community?
My placement of these under the Derived terms header in the article analysis has been reversed. Before launching an edit war, I'd like to be sure I am on the right side. --Daniel Polansky 09:38, 7 April 2008 (UTC)
- I always put such terms in the "Derived terms" section (see sulfate as an example). I don't see the problem. SemperBlotto 09:45, 7 April 2008 (UTC)
- Me too. I always figured this was the reason for using "terms" rather than "words" in the headings of these sections. -- Visviva 09:52, 7 April 2008 (UTC)
- This has always been my understanding as well. Thryduulf 11:35, 7 April 2008 (UTC)
- The edit summary that moved these to "See also" claimed that compounds should get different treatment. Was there a discussion to that effect before the creation of {{rel-top}}? Without the template, any justification to push some of this perhaps lower-value material lower on the page is understandable. But with collapsible tables I can't see any justification at all for separating them. Sometimes I wonder about the point of having big tables of such derived and related terms at all, collapsed or not. DCDuring TALK 11:47, 7 April 2008 (UTC)
- Not to my knowledge; I think the editor was just confused. -- Visviva 12:12, 7 April 2008 (UTC)
- Okay, thank you all. As an aside, I am very fond of these big tables, although not yet sure why. --Daniel Polansky 12:26, 7 April 2008 (UTC)
- Even if your fondness were very neurotic (;-}), it would almost certainly be shared by some meaningful fraction of our users. I'd be interested in why you like them or how you might use them. DCDuring TALK 13:07, 7 April 2008 (UTC)
- So (a) one reason I have discovered is that when looking for a compound term, I like typing only one word of the several and then navigating to the term with the mouse. That is on days on which I type a lot and am glad to get a relief from typing. Another one (b) is that some substantives get extended by adjectives (e.g. philosophy, analytical philosophy, continental philosophy, pain, physical pain, emotional pain), and when these multi-term extensions are listed, the page of the substantive kind of documents its subclasses, or attributes. I admit that the latter could be partially served by the Hyponyms header.--Daniel Polansky 17:52, 7 April 2008 (UTC)
- And (c) is the reason (or use case?) given by Mike below: I know or assume that the phrase contains a specific word, but am uncertain about the exact reading of the phrase. --Daniel Polansky 18:05, 7 April 2008 (UTC)
- Adding after the discussion: (d) specifically for adjectives, derived multi-word terms tell me on what classes the adjective is defined as a value of an attribute, so to speak. Phrased differently and modeled differently, it tells me what types the predicate of the adjective is ready to accept as its parameter. --Daniel Polansky 13:45, 9 April 2008 (UTC)
- Likewise, I like the tables of related and derived terms. I always try to add them to Latin entries because I find it helps enormously with learning the vocabulary. Being able to see a host of related terms, and click on each to get the specifics, really is enlightening in terms of understanding Latin word relationships. The commonalities among the various words allow insight into the scope of the root word, and provide a survey of what ending created words in other parts of speech from that root. They're also really handy in the case of verbs for finding (and learning) all the compounded verbs that come from a particular root, and which differ in the addition of a prepositional prefix. --EncycloPetey 01:15, 8 April 2008 (UTC)
- Thanks for the explanations. Though this seems like something only a veteran would use, Daniel has articulated how the tables might help an ordinary user who had come to Wiktionary to look up a complex concept. It's similar to having a lot of usage examples and citations in principal name-space, enabling certain (correctly spelled) searches to find useful entries. That kind of use would not put any limits on how phrases or compound words appeared, so that esthetics and the needs of etymologically or morphologically oriented users could legitimately govern. Would subject matter grouping help in the case of long lists? I would have thought that time-zone names would have been a helpful categorization in the useful extreme case of time. DCDuring TALK 02:46, 8 April 2008 (UTC)
Wow! So everyone here is happy with the fact that timely is buried deep within time? DAVilla 16:04, 7 April 2008 (UTC)
- I kinda would prefer to split it into several tables, say one for "Derived terms" (which would include e.g. timely), one for "compounds" and one for "phrases", though I understand that such division is not popular here, and a distinction between "compound" and "phrase" is perhaps more difficult to keep up in English than in other languages (like Swedish). But for the information those lists contain: yes, I like them as they allow me to scan the list to find an expression I know contains a given word, but am uncertain how it would be written in the "lemma form"; or I may see which options there are to add a particle or a preposition to get to the appropriate expression, even if I don't remember which one should be used. (Trying to keep track of English prepositions in general, and prepositions used in various fixed expressions in particular, is nothing but Sisyphus work... ;) \Mike 16:36, 7 April 2008 (UTC)
- Perhaps other grammatical forms, or transformations of a word deserve a special status. The plural times appears right next to time, so maybe the adjective timed, adverb (and adj.) timely, etc, belong closer to the top than, say time-honoured or Australian Eastern Daylight Time.
- Is it possible to describe a logical, but fairly limited list of such forms? —Michael Z. 16:46, 7 April 2008 (UTC)
- I would then say that anything which is not a compound/phrase, that is, anything which is not possible to split into more than one independent 'proper' word, would qualify. Thence timely would qualify, but not time-honored (= time + honored). Would there be any ambiguity in such a split? \Mike 17:42, 7 April 2008 (UTC)
- I'm happy with timely's being s.v. time's "Derived terms" section if it's a derived term. If it's in fact descended from an older version of timely then I'm not sure. In the case of complex analysis, I suspect strongly that it is derived from analysis and so belongs in its "Derived terms" list.—msh210℠ 16:55, 7 April 2008 (UTC)
- If it is descended from an older version of timely then I believe it should be in a related terms section. Thryduulf 17:04, 7 April 2008 (UTC)
- Ah, yes, agreed.—msh210℠ 18:39, 7 April 2008 (UTC)
- Interesting ... the discussion at #Ambiguous etymologies (above) seemed to reach the opposite conclusion. It remains my opinion that both forms of etymology need to be presented on Wiktionary; "timely" is formed from time + -ly in contemporary English, but it is also a linear descendant of OE tīmlīce. We would be doing an unforgivable disservice to our readers if we discounted either of these facts... and in this case it seems like "Derived terms" is the more transparent choice. -- Visviva 09:25, 9 April 2008 (UTC)
- Hm, I'm thinking in terms of grammatical morphology rather than etymology. Timely is an adv/adj sense of the lemma time, regardless of whether it sprang from time or has always been used alongside it. Maybe I'm being too ambitious, as this may require a separate section, or something like a declension or conjugation block, rather than being sorted at the top of "derived terms". —Michael Z. 17:08, 7 April 2008 (UTC)
- Do ordinary passive users actually use derived terms and related terms? How do we use it? I use it as a kind of memory exercise when working on an entry sometimes, but rarely follow the links just to get information.
- Are "Derived terms" and "Related terms" ever split by sense? I assume that we wouldn't want them to be.
- It doesn't seem silly to divide the contents of these into single words, compounds, and phrases/idioms/proverbs if the single block is "too big" as time's certainly is. DCDuring TALK 17:21, 7 April 2008 (UTC)
- Separating (in lion) the lioness from the lion cub would be cruel. It is painful to have to look in different sections depending on the precise spelling of words (space inside or not?) when there is no other reason to separate them. But I agree that proverbs should be put in a different section. Lmaltier 17:48, 7 April 2008 (UTC)
- 2. Splitting by sense seems like a good idea in many cases. For example, why would I want pressure head or Korboggen head to be in the same table as head lettuce at head#Derived terms? This goes several times over for Chinese characters in the East Asian languages. But I suspect there are many other cases where splitting by sense would cause all hell to break loose.-- Visviva 15:15, 8 April 2008 (UTC)
- The time page is indeed an extreme example, showing the downsides of what I so often like. As regards my preference, the main point is that the compounds are listed somewhere, not that they are listed under Derived terms heading. For me, it would be perfectly okay to have Compound terms heading, or whatever is considered appropriate. --Daniel Polansky 17:58, 7 April 2008 (UTC)
Inflections
We conventionally list certain inflections by the headword: plurals of nouns and pronouns, comparatives and superlatives of adjectives and adverbs, other cases of pronouns (he > him, himself, his), key inflections of verbs.
My paper dictionary (Canadian Oxford) also lists such inflections when they are irregular or "may cause difficulty". But it goes beyond Wiktionary by adding the simple past tense, present and past participles, adjectives in -able formed from transitive verbs, e.g., achieve (achievable), exchange (exchangeable). It may include versions restricted to U.S., British, or Canadian English, etc, e.g. "car·olled, car·ol·ling; US car·oled, car·ol·ing".
Regardless of their shared or separate etymologies, timed, timely, and timeful are inflections of time. It makes sense that we would make this intimate relationship clear somehow. Perhaps we should consider expanding the inflection templates like {{en-noun}}, or adding an "Inflections" section before "Derived terms". —Michael Z. 19:48, 7 April 2008 (UTC)
- We already have an "Inflections" section before "Derived terms".—msh210℠ 20:29, 7 April 2008 (UTC)
- By my reading of WT:ELE, in English words only some inflections belong next to the headwords, but an inflection heading is only to be included in non-English words.
- Perhaps the latter restriction should be relaxed. —Michael Z. 21:11, 7 April 2008 (UTC)
- You have read ELE correctly; we do not use the Inflections section in English entries, and we do not need to. Adjectives formed in "-able" are separate words with separate entries and etymologies. They are listed in the "Derived terms" section. English does not treat timely as an inflection of time, nor do I know of any European language where the adverbs are considered inflections of nouns or verbs. Adverbs are typically regarded as a separate part of speech, though they are derived from nouns, verbs, or adjectives. --EncycloPetey 22:47, 7 April 2008 (UTC)
- Agreed 100%. —RuakhTALK 22:59, 7 April 2008 (UTC)
- Okay, I see that the verb forms can be handled as in carol#Verb.
- So inflection isn't the correct term, but this still means that some of a word's closely-related cognates can get lost in a sea of compound words, idiomatic phrases, and relative neologisms. In this way, Wiktionary's presentation suffers in a few cases, compared to a paper dictionary. —Michael Z. 00:45, 8 April 2008 (UTC)
- I agree that it would be nice to have separate sections for (on the one hand) words derived by the addition of affixes and (on the other) phrases “derived” by the addition of words. —RuakhTALK 01:00, 8 April 2008 (UTC)
- Yes, they can get lost, but this only happens on a very small number of pages. I suspect there are fewer than 50 such pages on all of Wiktionary. --EncycloPetey 01:11, 8 April 2008 (UTC)
- But our hope is for them to get lost on many words. We might as well formulate some thoughts about how to separate them now, when it's still quite rare. Would anyone object to my splitting the section at time#Noun into two tables, one glossed as "words derived from the noun time", one as "idioms and set phrases using the noun time", just to see if we like the result? —RuakhTALK 01:38, 8 April 2008 (UTC)
- I'd like to see how it looks. I can't see any serious objections to trying it, as long as none of the information is removed. Idioms, especially, seems to be a different thing from derived terms
- My paper dictionary groups these as "idioms and phrasal verbs", and groups them after the main definitions. It also differentiates "derivatives" (formed with suffixes and are appended to an entry unless further definition is required) from compound words (which are always main entries, whether they are formed as one word or not, e.g., bathroom, serial number, and mega-musical). —Michael Z. 02:49, 8 April 2008 (UTC)
- With collapsible tables, we don't even need an additional section. We could have multiple tables just as we do for Translations, as long as each table is appropriately labelled. --EncycloPetey 02:58, 8 April 2008 (UTC)
{{lookfrom}}
Something useful I've recently found is {{lookfrom}}, which directs the user to a Special:Prefix index page. It isn't perfect, but could be used to make the expansion of the ====Derived terms==== section redundant. Keene 14:22, 8 April 2008 (UTC)
- I don't think that would do much to help things, as it only can find pages starting with the selected word. Things like "in time" or "on time" would still need a manually made list in time. On a related note: would it be possible to tweak the search function so that it only looks for pages which include the search string in the title of the page, and doesn't care if it occurs in the body? That would IMO make more sense as a replacement for ====Derived terms==== (except of course such derived terms which are based on some kind of mutation or stem change...) \Mike 14:42, 8 April 2008 (UTC)
- Additionally, that template function doesn't distinguish languages. Nor does it restrict the listing to words etymologically derived from the start term; it simply lists entries that start with the specified set of characters. For example, boggy is derived from bog, but boggle is not (even though it shares the same start letters). It also is case-sensitive, so it wouldn't list New York if you were looking at "n". We don't have anything that would make Derived terms redundant. --EncycloPetey 21:43, 8 April 2008 (UTC)
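The limits described in this thread can be illustrated with a minimal sketch. It assumes that a prefix index is nothing more than a case-sensitive "starts with" filter over page titles; the function name `prefix_index` and the sample title list are hypothetical and are not taken from MediaWiki or from {{lookfrom}} itself.

```python
def prefix_index(titles, prefix):
    """Return the titles that start with `prefix`, compared case-sensitively.

    Hypothetical stand-in for a prefix listing; it knows nothing about
    languages or etymology, only about the characters in each title.
    """
    return sorted(t for t in titles if t.startswith(prefix))


# Hypothetical sample of page titles, using words mentioned in the thread.
titles = [
    "bog", "boggy", "boggle",
    "time", "timely", "time-honoured", "in time", "on time",
    "New York", "new",
]

time_hits = prefix_index(titles, "time")  # misses "in time" and "on time"
bog_hits = prefix_index(titles, "bog")    # includes the unrelated "boggle"
n_hits = prefix_index(titles, "n")        # misses "New York" (capital N)
```

Under these assumptions the listing both over-includes (boggle under bog) and under-includes (in time, on time, New York), which is why a hand-maintained Derived terms list would still be needed.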
Treatment of certain types of compound terms
I asked a question about formatting Spanish entries that applies to many other languages, so I brought it here. Compound words can often be formed by affixing pronouns (or participles particles) to verbs. In Spanish, for instance, practically any combination of one or two reflexive/direct/indirect pronouns can be attached to infinitive verbs, present participles, or affirmative commands. So I have a couple questions.
1) How should these compound words be treated? Should they be listed as "Conjugations" or under "Derived Terms"? Should one place (for instance the infinitive entry) list all the permutations or should they be scattered on the pages of each stem? See redactar for an example of putting them all under "Derived Terms".
2) The list was made by blindly following the rules of grammar, so many of the permutations are rare, possibly unattestable. Should we include links for as yet unattested terms (does the CFI apply to links)? I'd love ideas. --Bequw → ¢ • τ 19:34, 8 April 2008 (UTC)
- Note: that should say "pronouns (or particles)". Some languages have particles (small words) that can be attached to the verb. I know Hungarian does this, and IIRC German and Dutch do as well. Indonesian will have problems with this that I don't fully understand.
- To elaborate for those who aren't familiar with Spanish, many verbs can have a pronoun (indirect object or direct object) affixed to the end of a verb. So, "kiss me" would be bésame (besa + me). "Give it to me" would be dámelo (using da + me + lo). How should these affixed forms be treated, and where/how should they be linked or listed? The issue is compounded in Spanish by the fact that the meaning of some verbs changes depending on the presence or absence of these pronouns. --EncycloPetey 21:34, 8 April 2008 (UTC)
- Hebrew does this also, although it is far more common in older texts than it is in current speech. I say have all these attested forms as entries, and list all of them that are definitely the correct form, even if unattested. I'm not sure where to list them, though: probably under Conjugation.—msh210℠ 22:07, 8 April 2008 (UTC)
- I think Hebrew's a bit different from Spanish in this regard: in Hebrew I think it's actually part of the verb's morphology; for example, in a form like עשני (asáni), I couldn't say where the verb ended and the direct object began. So in Hebrew, I think these forms definitely warrant their own entries, and in fact some of them (such as קדשנו (kidshánu)) already have them. By contrast, in Spanish I think there's a separate verb and enclitic pronoun, and the spacelessness and accent are strictly written phenomena. (The -monos thing also affects pronunciation, but still I think falls into the same general category.) They may or may not warrant their own entries (my sense is not, but I see it both ways), but I don't think that bears on handling of Hebrew. —RuakhTALK 04:32, 9 April 2008 (UTC)
- Even if Ruakh's correct that Hebrew is different from the Romance languages in this regard, I still maintain that both should have such entries and lists per my comment just above.—msh210℠ 16:46, 9 April 2008 (UTC)
- As I've always seen "Sum of Parts" reasoning used to remove phrases, and not applied at the sub-word level, I'd suppose that all Romance language compound terms that are attestable would merit their own entries. This is especially the case in Spanish because some of the affixed pronouns could be ambiguous (whether direct or indirect objects) and meanings can change slightly. --Bequw → ¢ • τ 15:50, 9 April 2008 (UTC)
- I'm not suggesting we apply it at the sub-word level, only that we adopt a more useful definition of word. If me hablaste is two words, then I think so is háblame. That said, it would be nice to have an entry for hábla- defined as “Form of habla used with following clitic pronouns; see hablar.”, and perhaps háblame should have an entry that says simply “habla (see hablar) + me.” I don't know what POS to use, though; in many cases it's a verb phrase (which we'd call a “verb”), but in other cases it's an odd verb-phrase fragment, like in either dalo al profesor or dame el libro. (Perhaps either V+DO or V+IO can be considered a constituent, I don't know, but certainly it doesn't flip back and forth whenever we change which object is a clitic.) And IMHO it is in no case a good idea for dar to link to dame, etc., though it should perhaps give the relevant imperative as “da/da-/dá-” (and likewise for other affected forms). —RuakhTALK 17:10, 10 April 2008 (UTC)
Treatment of other types of compound terms
Hebrew has a number of terms that translate into English as prepositions (from, to, others) and conjunctions (and, that), but which are attached to the fronts of words in the Hebrew. These are ב-, ו-, כ-, ל-, מ-, and ש-. Words formed of these, like בארץ (b'eretz) (equals ב- (b) plus ארץ (eretz)), are written without space in the middle, and are, I think, recognized as one word by, for example, schoolchildren. Linguists consider them two words each, with the prefix counting separately from the rest. (Ruakh informs me it's actually a clitic rather than a prefix.) Certainly anyone who knows Hebrew can figure out the meaning if he can figure out where the prefix ends: if it's two words, then it's a sum of its parts. On the other hand, someone who doesn't know where the prefix ends will likely look up the whole thing. Ruakh says these are not entry-worthy; I say they are. I decided to take this issue here to the BP because it may well be relevant to other languages (Finnish, Hungarian, others) as well. What do you all think?—msh210℠ 16:41, 9 April 2008 (UTC)
- If they're written without a space in the middle, and words in Hebrew are otherwise spaced, then I say we might as well have them. What harm does it do after all? (A bigger problem is with languages like Thai or (what I've been learning recently) Lao, which are not written with spaces between words at all. It is often impossible to judge what is a compound noun and what is just sum-of-parts.) Widsith 16:50, 9 April 2008 (UTC)
- The problem is that there's no limit; you can (and frequently do) put more than one of them together, as in [v'][she][k'][she][mi][b'][tokh], "and that when from within" — which contains the clitic [she] twice, four other clitics once each, and the preposition [tokh]. (Note: the brackets here are just for ease of reading; this isn't IPA or anything.) In this example, I think [b'][tokh] "inside, within" warrants inclusion as a fixed expression, as does [k'][she] "when"; but certainly the whole thing together doesn't. —RuakhTALK 17:06, 9 April 2008 (UTC)
- I'm not sure what the problem is with adding many such entries. Not all will be attested, and those probably are the only ones we should add, but among those that are attested, I'll grant that there will still be quite a few. But this is a wiki: we've got time.—msh210℠ 17:28, 9 April 2008 (UTC)
- I think the same rule as in English should be followed, if possible: if the compound term refers to a specific term, then it should be included (like "motorcar", "railroad car", "carwash"), but not if it is a simple sum of parts ("red car", "Japanese car", "dad's car"). Of course, it is often not clear if the term is specific or not. In that case I don't see any problem including the word.--Jyril 17:41, 9 April 2008 (UTC)
- I agree. —RuakhTALK 17:55, 9 April 2008 (UTC)
- Sorry, but I don't think that makes sense. The CFI are designed to use the words found in permanently accessible works (books, journal articles, etc.) as a proxy for the words that a typical reader might encounter and want to look up. That works fine (not perfectly, but fine) for individual words; but I don't think it'll work at all for a stray series of prepositions and conjunctions that all happen to appear together at the start of a phrase, plus whatever word happens to follow them. There are virtually unlimited real combinations, and the CFI's standards of attestation won't reflect which ones are worth including and which aren't. (Partly because in some sense none of them are worth including, partly because in some sense they all would be if that were even remotely possible, just as it would be great to include every possible Lao sentence.) —RuakhTALK 17:55, 9 April 2008 (UTC)
I think a dictionary is intended to be (among other things) an aid for learning a language, not a substitute, or the only source for understanding. Therefore it is not necessary to include every possible form of every word in every language. It would probably be impossible as well. Just as an example, every Finnish verb has five infinitives and six participles, some of which can be inflected in fourteen cases in singular and plural, and combined with six possessive suffixes and a host of clitics. This adds up to dozens, possibly hundreds of forms derived from each verb. The numbers are smaller with nouns, but they are still large. Check one composition at: järjestelmällistyttämättömyydellänsäkään, which I added for fun, and because I have learned that it is the longest "word" in Finnish that is not a compound term. Nykysuomen Sanakirja (the Dictionary of Modern Finnish) has some 200,000 entries. In order to list all forms that may exist in Finnish alone, one would probably need tens of millions of entries. As a matter of fact I think we have too many forms already. As a simple example, most of the English plurals are plain old SoP's and completely unnecessary for anyone who knows even the basics of the language. Finnish plurals are a bit more complicated because the stem often changes, but for the most part they are as useless (or to be exact, the stem does not change, but the nominative form and stem are very often not the same thing). Only irregular plurals and those which have an independent meaning, or are "pluralia tantum" would suffice, IMHO. Having said this I do not know where to draw the line. Hekaheka 18:54, 9 April 2008 (UTC)
- Why did you add järjestelmällistyttämättömyydellänsäkään when the word usually used is epäjärjestelmällistyttämättömyydellänsäkään? ;) In the case of words with enclitic suffixes or other "obvious" cases, I would accept those which are very common in that language or otherwise special. That is not the case of "proper" inflections, which should be included. --Jyril 19:18, 9 April 2008 (UTC)
- This is out of the point of the discussion, but epäjärjestelmällistyttämättömyys includes a double negation and is therefore not a meaningful word - rather a collection of clitics. Hekaheka 21:50, 9 April 2008 (UTC)
A dictionary is not used only by people learning a language. You might want to try to understand a message you received by e-mail in a language you don't know, and cut and paste every word. Another use of Wiktionary might be by a browser allowing a simple search by double-clicking on any word of any website. Such uses require that as many forms as possible are included. Lmaltier 21:05, 9 April 2008 (UTC)
- Translation robots do that job better. Besides, how much does it help to know that presupuestares is "the second-person singular of presupuestar in the future subjunctive", if you don't know what a subjunctive is? It takes quite a bit of language-specific knowledge to understand the glosses. Hekaheka 21:50, 9 April 2008 (UTC)
- There are no translation robots for some languages. Of course, you cannot translate a text well just by searching each word. But you can know what the lemma form is, and what it means, and this is important, even if you don't know what subjunctive means. Paper dictionaries don't allow that (I already searched for a word in a paper dictionary, and concluded it was absent; actually, it was present, but the lemma form was not obvious). Lmaltier 06:08, 10 April 2008 (UTC)
- Well I guess there is some precedent for excluding single words that are Sum of Parts, as in English we now exclude the possessive case. Per-language policies could be written to exclude Sum of Parts words according to the rules of that language, but I don't think it should happen until our wiki technical abilities mature. For example, finally when dealing with the case of the first letter of a word, our technical ability (auto-redirecting Omphaloskeptic -> omphaloskeptic) matches our policy (not allowing multiple entries for different cases of the same word/lexeme). (That was such a great enhancement, by the way.) It'd be great if, when a user's search came up empty, we could ask them for the language and check to see if the word is decomposable by that language's wiki-programmed rules. But until then, let's leave it open. --Bequw → ¢ • τ 21:36, 9 April 2008 (UTC)
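The first-letter redirect mentioned in the comment above can be sketched roughly as follows. This is only an illustration under the assumption that the lookup tries the exact title first and then retries with the first letter lowercased; the function name `resolve_title` and the sample entry set are hypothetical, and the sketch does not describe how the wiki software actually implements the redirect.

```python
def resolve_title(query, entries):
    """Return the entry title that `query` should land on, or None.

    Assumed behaviour (not the real implementation): try the title exactly
    as typed, then retry with only the first letter lowercased.
    """
    if query in entries:
        return query
    if not query:
        return None
    lowered = query[0].lower() + query[1:]
    if lowered in entries:
        return lowered
    return None


# Hypothetical set of existing entry titles.
entries = {"omphaloskeptic", "Berlin", "time"}

redirected = resolve_title("Omphaloskeptic", entries)  # falls back to lowercase
exact = resolve_title("Berlin", entries)               # exact match wins
missing = resolve_title("Timely", entries)             # no such entry either way
```

A per-language decomposition check of the kind wished for above would slot in at the point where this sketch returns None.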
Wandering [edit] links
These should now appear on the correct line; also trans and rel tables play nicely with images. See WT:GP for more info. If you see anything odd, tell me or us there. Some float boxes still need some extra code removed for IE. Robert Ullmann 18:10, 9 April 2008 (UTC)
Demonyms
A demonym is the name for a person from a place: European, Basotho, Iowan, Winnipegger, Haligonian, Smithereen. It's a specific, and sometimes interesting, kind of word.
I'd like to create a context template {{demonym}} applying a new category:Demonyms, which would in turn fall into category:People and either category:Geography or category:Place names.
(See also category:Exonyms, category:Endonyms, category:Xenonyms.)
Is this a sensible idea? —Michael Z. 07:01, 10 April 2008 (UTC)
- It sounds like a sensible idea to me. Thryduulf 09:09, 10 April 2008 (UTC)
- The category sounds good, but why would we need the context label? The sense would not be used in the context of demonyms: it'd be used in a general context.—msh210℠ 16:38, 10 April 2008 (UTC)
- I see, demonym doesn't describe where the word is used, rather it's a sub-category of nouns (and come to think of it, they also belong in category:English nouns). I guess it's just as easy to add [[category:demonyms]] as it is to use a template. —Michael Z. 17:37, 10 April 2008 (UTC)
- OTOH, demonym is specific to a sense, not an entry, not a Language, not an Etymology, not a PoS. (Not that that problem is in any way limited to demonyms.) DCDuring TALK 17:48, 10 April 2008 (UTC)
- So a Berliner is both a native and a pastry.
- General question: is it a good idea to place the category tag in its context in an entry (e.g., in the same line as definition no. 1), or should they all remain at the bottom of the page? —Michael Z. 18:03, 10 April 2008 (UTC)
- End of the language section; see Wiktionary:Votes/2007-05/Categories at end of language section. —RuakhTALK 20:07, 10 April 2008 (UTC)
- This would be a topical category, and should be a subcategory of Category:People, and might also be listed as a subcategory of Category:Etymology for each language. I wouldn't use a context template, because "Demonym" is not a context; it is a class of words. That is, (astronomy) and (sports) say something about the context in which a sense is used, but demonym describes the kind of word. --EncycloPetey 23:25, 10 April 2008 (UTC)
- Er. Is this really what demonym means? The only place I've seen it used that way is on Wikipedia, and I always assumed someone there just invented it cos it sounds important. Obviously I get the formation, but I am just a bit cautious about our adopting something if it is really only a protologism. A look at books.google shows few hits, and a lot of those seem to be using it with the sense of "name used by the people", i.e. "colloquial pseudonym". Widsith 18:49, 11 April 2008 (UTC)
- 2007 dictionary where it is used as defined, 1870 dictionary where it is defined differently, 2003 book where the word is used and mentioned as defined, an 1895 dictionary where it apparently means 'name based on what one does' (I think, not too great with Greek), 2005 textbook used as defined. I think it is used somewhat, if not widely, but its present definition is very recent, late 1990s or early 2000s. The old definition seems to have fallen off at the onset of the 20th century. - [The]DaveRoss 20:35, 11 April 2008 (UTC)
Thanks for the input, and thanks for checking the attributions, TRD. I've created Category:Demonyms, and a couple of language subcategories, under People, Geography, Etymology, and Names. Please review. —Michael Z. 2008-06-26 21:36 z
Guidelines to correct structures with multiple ety and pron
I've been going through Category:Entries with level or structure problems and found several entries where I did not know how to correct the structure. These are mostly the ones with multiple etymology and pronunciation in different variations. One example is Sofia. Is there a guideline I could look at? Thanks. --Panda10 21:52, 10 April 2008 (UTC)
==Italian==
===Pronunciation 1===
{{IPA|/soˈfia/}}
====Proper noun====
{{rfc-level|Proper noun at L4+ not in L3 Ety section}}
'''Sofia''' ''f''
# {{given name|female||it:|eq=Sophia}}.
- WT:ELE is your best bet, normally it is broken down by etymologies, then parts of speech. I am sure that somewhere there are a pair of lexemes derived from the same etymology with different pronunciations...I think that the best idea in this case would be to list both pronunciations in the pronunciation section and then note the pronunciation differences, rather than break the page into yet more sections. - [The]DaveRoss 22:21, 10 April 2008 (UTC)
- I've also got a "Model Pages" project. For a word with a single pronunciation and multiple etymologies, refer to round. For a word with 2 etymologies, each with its own pronunciation, see hinder. For a word with a single etymology and multiple pronunciations, see predicate (though you'll have to look at my last edit, since Widsith disagrees and thinks this is two separate etymologies). --EncycloPetey 23:15, 10 April 2008 (UTC)
- Thanks for the model entries. What are the headers that can be numbered? Only etymology? It seems that AutoFormat will add rfc-level to entries with numbered pronunciations as above. So even if I remove rfc-level because I think the structure is correct, it will be added back next time. --Panda10 23:56, 10 April 2008 (UTC)
- There is debate about which headers may be numbered. Personally, I believe that etymology and pronunciation headers should be numbered iff they are parallel headers under the same over-header. So, I would only number pronunciation sections if (1) there were more than one pronunciation under a single etymology and the pronunciations were tied to the particular POS sections underneath them, or (2) there were multiple pronunciations tied to particular POS sections and the etymologies had not yet been put in. But, in the latter case, the addition of etymologies might eliminate the need for numbering the pronunciations, if they were located under different etymology headers.
- There has been discussion off and on about numbering Verb or Noun sections in certain languages. However, there are several approaches to how this gets handled and there has never been a focussed discussion or conclusive decision. Some regulars are strictly opposed to the idea, while others think it is useful. But, it is almost never needed in English, so it isn't usually a concern to the community. It's more a problem in languages where the gender, inflection, or other aspects of a word are tied to specific senses, so that there must be separate inflection lines and separate inflection sections for the different definitions. --EncycloPetey 00:11, 11 April 2008 (UTC)
Sorry, I am getting a little confused. I've just got a message from Hikui87 that I edited some of the Japanese entries incorrectly. They were marked with rfc-level and I moved the Alternate forms section from below the POS above it and probably renamed them Alternative spellings because I thought all languages followed the same basic layout. It seems that Japanese entries follow different layout rules. So are we discussing only English entries here? --Panda10 11:47, 11 April 2008 (UTC)
- All languages do follow the same basic layout, but Alternative forms can appear under the POS when the forms are specific to a particular POS. This sometimes happens in English entries, but not so often as in some other languages. In the case of Japanese, they've chosen to make that placement of the section all the time because it's a more common problem than in other languages. The kanji used to write Japanese have more than one reading, and a particular romaji may come from more than one set of characters too. So, there are so many cases where the alternative form depends on POS, or even on sense, that Alternative forms is placed under the POS every time in Japanese entries. This is one reason I don't try to clean up Japanese (or Chinese, Korean) entries myself. There are a number of special considerations. --EncycloPetey 12:29, 11 April 2008 (UTC)
Glossaries on Wiktionary
Are there planned any glossaries on Wiktionary other than Wiktionary:Glossary? What is planned to happen with Transwiki:Glossary of library and information science? Does anyone know whether Wikipedia plans to keep its glossaries? I am asking because I find the glossaries useful, simplifying the work of extracting all the definitions of a given domain from Wiktionary, which is something that can in principle be tediously done using categories. Thanks for any hints. --Daniel Polansky 17:22, 11 April 2008 (UTC)
- Yes, there are others. They are mostly in the Appendix: or Transwiki namespace because they are not Wiktionary-specific. You can find them by sifting through Category:Appendices. --EncycloPetey 17:45, 11 April 2008 (UTC)
- So is it that the glossaries in Transwiki: namespace are planned to be moved to Appendix: namespace? I mean, I thought that Transwiki means that these things are yet to be processed. Is there any policy, even if in the making, on how to deal with glossaries, like how to format the entries? --Daniel Polansky 18:20, 11 April 2008 (UTC)
- As far as I know, there's not really any policy or guideline on transwikis. In my eyes, generally the transwiki namespace is full of the unformatted crap that Wikipedia didn't want, in essence in limbo between the 2 projects. Keene 21:44, 11 April 2008 (UTC)
- Items in the Transwiki: namespace have been moved here, and may be cleaned up and moved to the appropriate location. However, some of these items are duplicates of what we have, or are non-Wiktionary items, and will be deleted. --EncycloPetey 21:52, 11 April 2008 (UTC)
Inflection line for nouns used only in the plural.
Encyclopetey and I have come to a disagreement at template talk:en-noun#For a plural regarding how we should note pluralia tantum on the inflection line.
My position is that we should use the format:
- '''noun''' {{pluralonly}}
- (or '''noun''' {{plurale tantum}} depending on the outcome of the discussion further up this page).
This categorises words into category:English pluralia tantum, a sub-category of category:English plurals, which is a sub-category of category:English nouns
EncycloPetey is advocating the alternative:
- {{en-noun|''[[plurale tantum]]''}}
This categorises words into category:English nouns
Before this discussion descends (further?) into acrimony I feel that more opinions are needed. I suggest that the discussion take place here rather than there. Thryduulf 18:43, 11 April 2008 (UTC)
- It seems like a good idea that it be regularized. I would expect that it would need a vote if it is actually to become mandatory. I certainly hope it isn't going to be backdoored. Taking the point of view of an ordinary user would suggest that it should be intelligible. Learning from what other dictionaries (with vastly more resources and a pecuniary interest in trying to make their product useful) do seems useful even if we reject their choices. MW Collegiate shows:
- no plural and no notation when the singular and plural are the same, but also for regular plurals;
- "pl" when the noun is plural in form and is usually used in a plural sense;
- "pl but sing in constr" when the noun looks like a plural but takes a verb in a singular inflection;
- "pl but sing or pl in constr" when the noun looks like a plural but may take a verb either singular or plural.
- MW3 (unabridged) does the same, but also always shows a plural form if there is one, reducing some ambiguity in "singular only" cases, and adds the qualifier "usu" (usually) for more common plural forms and sing vs pl 'construction'.
- MW Online seems to be the same as MW Collegiate.
- Longmans DCE, an ESL/learner's dictionary, dispenses with the idea of "construction" and has just "P" for plural and "U" for uncountable, slightly restricting the acceptable choices to simplify matters for their users.
- I'd be interested in what OED, AHD, Collins, Random House, and Chambers do.
- I think we've already established that nobody with normal users uses Latin. DCDuring TALK 19:52, 11 April 2008 (UTC)
- Both the OED and MW3 use pl. for plurale tantum nouns. The AHD is inconsistent, sometimes using the text "Often used in the plural" (cf. pant) and other times putting the plural form in bold at the head of the numbered sense (cf. color). Random House doesn't bother to mark these at all. --EncycloPetey 21:58, 11 April 2008 (UTC)
- Hm, that may not be inconsistent. "Pant leg" sometimes appears in the singular, but as far as I know, the "regimental colours" never does. —Michael Z. 08:22, 12 April 2008 (UTC)
Sorry, I missed the point of the discussion which relates to the structure of categories and the use of template. Does this affect non-editing users? Not initially, as I understand it.
- When using the category intersection tools, what would happen? I hope that such tools will become available to regular users. If a p.t. noun is treated no differently than a normal noun most of the time, then everything should be fine. If not, then we will have a problem.
- Making something templated means its appearance can be changed by changing the template only. The EP approach would seem to give more scope to change things because there would be a template for the noun itself instead of the noun appearing in a "hard-coded" way. My 2 cents. DCDuring TALK 20:09, 11 April 2008 (UTC)
- This came up because I couldn't figure out how to do it with {{en-noun}}, then went searching for the right template, and finally asked for help.
- Doing this with a template would be advantageous, because it's already expected by semi-newbs like myself. All the better if it's an option on en-noun rather than a separate template. It would also guarantee consistent formatting and categorization, and allow us to decide on the exact categories and wording independently (the template can always be updated).
- I think the wording is a separate discussion, but FYI, the Canadian Oxford saves space by saying plural noun or (pl. same or -es) in many cases, and letting you figure out the details—it's obviously aimed primarily at native English speakers. —Michael Z. 20:33, 11 April 2008 (UTC)
- We could adjust the {{en-noun}} template to accept something along the lines of "plural=only" / "plural=tantum" that would add the expected formatting and category. However, I'm not sure how best this could be done. --EncycloPetey 21:55, 11 April 2008 (UTC)
- If we are adjusting the {{en-noun}} template (which I assume would be no more difficult than the existing way we get {{en-noun|-}} to categorise into category:English uncountable nouns), then I'd hope we'd stick with the "pl=" format already used. I don't think "pl=tantum" would be possible as the plural of "plurale tantum" is apparently "pluralia tantum", so logically the/a plural of "tantum" is "tantum". "pl=pluralonly" or "pl=plurale tantum" would be possible, both displaying whichever form of words we agree on (which I agree with Michael is a separate issue). I think however the best form might be {{en-noun|sg=-}}, which to me implies there is no singular form in the same way {{en-noun|-}} signifies there is no plural form, and has the benefit (I presume it's a benefit anyway) of using the existing "sg" parameter. This might fall down though for two-word pluralia tantum, e.g. glad rags and checks and balances where we use the sg parameter to link to the individual words. Perhaps then use either the "sg" or "pl" parameters with some other symbol to denote this status, ! perhaps?
- If we do this, then I think the {{pluralonly}} and {{plurale tantum}} templates should be deprecated in the inflection line, but remain for use in the sense line.
- Whichever solution we have, I think it is useful to retain categorisation of the pluralia tantum, either there solely or there and in category:English nouns or category:English plurals. The existing templates could of course very easily be modified to do this as well. Thryduulf 23:34, 11 April 2008 (UTC)
- [I adjusted some text above which didn't display correctly —Michael Z. 00:28, 12 April 2008 (UTC)]
- My preference would be for dual categorization in those cases. --EncycloPetey 00:17, 12 April 2008 (UTC)
- Dual categorisation in which two categories? Thryduulf 00:59, 12 April 2008 (UTC)
Glossary - formatting
Is there any consensus on how to format glossaries? I have put up for myself a provisional policy at User:Daniel Polansky#Glossary, still wondering whether I should use (a) bullets, boldface, and "-" separator or (b) definition lists with ";" and ":". Today, I have formatted two glossaries, using the option (a). The option (a) is used in Wiktionary:Glossary and is more compact than definition lists. Still, definition lists are a standard HTML means of entering terms and their definitions. --Daniel Polansky 10:32, 12 April 2008 (UTC)
- I am definitely in favour of (b), see the utterly hated Appendix:List of Harry Potter terms, it adds proper structure and (as a result) looks neater. Conrad.Irwin 18:18, 12 April 2008 (UTC)
- That is a woefully incomplete list...if you are going to do it at least do it right :p - [The]DaveRoss 20:07, 12 April 2008 (UTC)
- Having a look through the various glossaries, I agree that a guideline would be helpful.
- {{compactTOS}} doesn't need rules to be added, because it already stands out on the page. If they are desirable, then we can add them as CSS border-top and border-bottom in the template, instead of tossing in more wikitext.
- I would suggest that bulleted glossaries don't need bold formatting, especially if the terms are linked. A colon may be a less obtrusive and more natural separator. The terms don't get lost in Appendix:Bagpipe terms, and I think it is more readable than many of the others.
- The guideline should mention consistent copywriting, too. Does each term begin a sentence, or is it followed by one? Does the definition begin with a capital letter? I think the answers are different for each glossary, but the definitions in a glossary should be consistently written.
- I am very much in favour of using structural HTML, but unfortunately Wikipedia's semicolon-colon wiki lists are styled for discussion. Definition lists are well structured, but not particularly attractive. The lists also have the advantage that one term can be associated with several definitions (but unfortunately there's no way to put more than one paragraph or another list into a single definition). —Michael Z. 03:08, 13 April 2008 (UTC)
- IMHO the defined term should better appear in boldface. It does so in Wiktionary entries of terms, it is so formatted when the HTML definition list is used, and it is a Wikipedia convention to have newly defined terms in boldface. --Daniel Polansky 06:17, 13 April 2008 (UTC)
- It's good typographic practice to use the formatting appropriate for the context.
- Headwords in entries are the most important thing on the page, and have to visually compete with the headings. Each entry has one or only a few of them. Boldface is appropriate here, and it is a convention inherited from many paper dictionaries. But dictionary terms also appear in etymologies, where they are italicized, and in lists of related terms, etc, where they are in roman font and linked. We wouldn't boldface any of these instances.
- In contrast to Wiktionary entries, glossaries have dozens or even hundreds of terms, and are made of many blocks of running text, rather than a collection of headings and bulleted lists. They are more similar to the lists of terms appearing in entries than to the headwords. Glossary entries don't have to compete with other bold elements, they just have to be found by the reader, and then not distract her from reading the definitions. A term here is already flagged by coming at the beginning of the line, by being marked with a bullet, by being linked, and is set off with prominent punctuation after. Boldfacing every one is just adding icing to the gravy. —Michael Z. 17:40, 13 April 2008 (UTC)
- Okay. I will format poorly formatted glossaries using the option (a), as I prefer it and I can see no clear consensus against it, but I will refrain from turning well-formatted glossaries formatted using the option (b) into the formatting (a). Please, let me know if you think it a poor personal policy. --Daniel Polansky 06:47, 13 April 2008 (UTC)
- A and B are both much better than any of the other formats which appear in some of those glossaries. —Michael Z. 17:25, 13 April 2008 (UTC)
a bot to capture Wikipedia on Wiktionary
I was just now editing 狐獴 (the Mandarin entry for meerkat), and thought of something (maybe someone else has already thought of it, but anyway). Wikipedia now has thousands of articles with versions in multiple languages. The titles of most of these articles are either nouns or proper nouns. Perhaps a bot could be written to create Wiktionary formatted entries (similar to Tbot) for these words. For example, if the bot noticed that the English Wikipedia article for meerkat had a Mandarin equivalent article called 狐獴, the bot would create a formatted entry on Wiktionary that would look something like what you now see at 狐獴. A category could be slapped onto such entries (similar to the way Tbot tags things), so that a human editor could verify the contents, and add extra things (ex. Pinyin romanization for Mandarin entries etc.). Thoughts? -- A-cai 13:31, 12 April 2008 (UTC)
- This would be fine if the interwiki links between the various Wikipedias always used the same conventions. Take your example of w:meerkat - the Italian interwiki link points to "Suricata suricatta", which is the translingual (or modern Latin) name, not the Italian word suricato. So your bot would generate incorrect entries. SemperBlotto 14:28, 12 April 2008 (UTC)
- That's a bad idea. Many of the organism articles on most Wikipedias are based on the scientific name, and most of the plant articles (over 15,000 and growing) use the Latin binomial for the name. Some other Wikipedias do the same. When an organism belongs to a higher taxon, and is the only organism in that taxon, the two articles may be the same, but not all Wikipedias divide them the same way. As a result, the article on the Ginkgo genus on one Wikipedia may be titled for the Ginkgoaceae family on another Wikipedia.
- There are also many cases where the bots propagate incorrect links. I have had an ongoing tussle with bot operators over w:Monoicous because they don't understand that w:es:Monoica and w:fr:Monoécie are not about the same topic (they should link to an article about monoecious, not monoicous). The plant editors keep removing the incorrect links; the bots keep adding back the links; and the bot operators believe they are absolved of any fault. There are also many cases where the article titles don't even remotely mean the same thing, even when the topic is the same. English wikipedia has an article on Plant sexuality (which we would delete as sum of parts), but it covers the same subject as w:es:Monoica does. In short, we're seeing links between articles that do not have titles of the same meaning.
- There are also many, many articles with titles that do not merit an entry because the entry would not meet our CFI. I can't see any reasonable way to get a bot to distinguish between cases or use appropriate selectivity in choosing which articles to create entries for. --EncycloPetey 14:29, 12 April 2008 (UTC)
- I agree with EncycloPetey. It's a great thought, but I think there are too many problems with it. In addition to the ones he mentions, there are also cases like wikipedia:Fixed-wing aircraft where Wikipedia's noble quest for NPOV has led it linguistically astray (the normal words being airplane and aeroplane, depending on dialect). (There are also issues with figuring out whether a Wikipedia article corresponds to a lowercase Wiktionary entry or an uppercase one, but an intelligent bot might be able to handle those by searching the article for non-sentence-initial uses.) —RuakhTALK 18:09, 12 April 2008 (UTC)
- Do note that while "airplane" and "aeroplane" may be the "normal" layman's terms, engineers (aircraft engineers) generally use the technical term aircraft. When Boeing talks to the public and to the Street, they use "airplane" (Boeing Commercial Airplanes Division), when their engineers and pilots talk, it is "aircraft". "Fixed-wing aircraft" is correct. (Besides evading the Pondian Problem ;-) Robert Ullmann 12:10, 20 April 2008 (UTC)
- I agree that aircraft is the more formal/technical term, and that fixed-wing aircraft is correct and precise, but even a Boeing engineer or pilot would presumably choose airplane over fixed-wing aircraft in a context where aircraft alone didn't suffice, right? (And remember that Wikipedia's main naming convention is that articles should be named using the most common term for their referents.) —RuakhTALK 16:09, 20 April 2008 (UTC)
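- As a rough illustration of the "non-sentence-initial uses" check suggested above, here is a minimal Python sketch. The function name `guess_entry_case` and the crude sentence-boundary test (previous non-space character is ".", "!" or "?") are assumptions made for this example only; they are not part of Tbot or any existing bot, and the heuristic only makes sense for scripts that distinguish upper and lower case.

```python
import re


def guess_entry_case(title: str, article_text: str) -> str:
    """Guess whether an entry for `title` should be lowercase or capitalised.

    Only occurrences that are NOT at the start of a sentence are counted,
    because sentence-initial capitals carry no information.
    Returns "lowercase", "capitalised", or "unknown".
    """
    lower = upper = 0
    pattern = re.compile(r"\b" + re.escape(title) + r"\b", re.IGNORECASE)
    for match in pattern.finditer(article_text):
        before = article_text[:match.start()].rstrip()
        # Crude sentence-start test: nothing before, or closing punctuation.
        if not before or before[-1] in ".!?":
            continue
        if match.group(0)[0].isupper():
            upper += 1
        else:
            lower += 1
    if lower == 0 and upper == 0:
        return "unknown"
    # Ties are resolved in favour of lowercase (an arbitrary choice).
    return "lowercase" if lower >= upper else "capitalised"
```

- A human editor would still need to check the result, since abbreviations, headings and quoted titles can all fool a test this simple.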
- I think this would work. Ullmann has already put significant effort into Tbot's checking mechanism and the quality of the Wikipedia data would seem not too much lower than the quality of some of our translations. I think that (if this is possible) a modified version of Tbot that accepted input from Wikipedia interwikis instead of Wiktionary's translation tables would be very good. (And as an added bonus it could add the word to the translation tables at the same time ;). For more information on exactly what checks Tbot does you'd have to ask Ullmann, but I believe they require a foreign Wikt entry to exist and contain a translation in common with the English Wiktionary entry. Certainly this bot shouldn't create translations of articles that don't have entries in Wiktionary. I don't know how many of the Wikipedia interwikis would pass Tbot's checker, but I would think a significant enough number to want to give this a go. Conrad.Irwin 20:21, 12 April 2008 (UTC)
- I think it is a bad idea to try and get data from information which isn't designed to be a direct translation. Even if one article corresponds to another that does not mean that the titles are translations of one another. - [The]DaveRoss 20:43, 12 April 2008 (UTC)
- You make some good points, but the thing is, when Tbot creates a bad entry based on one of our translations tables, that's still useful: it calls attention to a problem in that translations table. When Tbot creates a bad entry based on Wikipedia interwikis, that's just annoying. (And, would it re-create the entry every time we deleted it?) —RuakhTALK 20:45, 12 April 2008 (UTC)
- Based on recent discussion here, I think it would be trivially easy to stop a bot creating an entry when a previous page with that title had been deleted. When a section on a page was deleted, but the entry still exists (e.g. the Dutch section was deleted but the English section remains) I think it would be harder (disclaimer: I am not a programmer). Thryduulf 22:08, 12 April 2008 (UTC)
- Besides the points raised above, I think that Wiktionary should try to create its own content, and not rely on other projects and any mistakes they might make. Nadando 07:29, 13 April 2008 (UTC)
I should have pointed out one of my justifications for such a bot. A lot of contributors are already introducing such entries ... by hand! Many such entries are poorly formatted, and rarely tagged with any kind of "blindly copied from Wikipedia" tag. The entry for 狐獴 was one such example. It originally looked like this. I'm proposing to standardize the process with a bot. Such a bot (if done correctly) would give me the means to efficiently verify such entries, without a lot of additional formatting work. -- A-cai 07:46, 13 April 2008 (UTC)
- One option that we might consider is writing a bot to add the interwikis as ttbc's to the English entry. The beauty of this option is that we get a whole bunch of data, it's pretagged to be looked at by human editors, and can then be fed into Tbot in the normal fashion. Additionally, our readers will be forewarned that it's questionable data and so, hopefully, won't be led astray. This does, however, have the downside that it would flood ttbc categories, which would, admittedly, be irritating. -Atelaes λάλει ἐμοί 07:39, 13 April 2008 (UTC)
A starting point would be to try to extract some entries automatically, and create a report or list of what the automation thinks it would do. As noted above, Tbot's primary check uses the translation table in the FL.wikt entry; this won't work that way. A serious technical issue is that Tbot works by digesting the entire en.wikt XML, and then looking for specific FL entries; what method would be used to extract the 'pedia data? The en.wp XML by itself is not manageable. (one could get "langlinks.sql.gz" for a given set of wps, and then do some analysis) I myself have not tried to parse anything out of wp entries; they look superficially consistent in many ways, but that may not make them tractable. Robert Ullmann 12:10, 20 April 2008 (UTC)
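To make the "langlinks.sql.gz" idea a little more concrete, here is a minimal Python sketch of how the rows of such a dump might be pulled out for a report. It assumes each row has the form (page id, 'language code', 'title'), as in the MediaWiki langlinks table (ll_from, ll_lang, ll_title), and that the dump is UTF-8; the names `iter_langlinks` and `read_dump` are made up for this example, and none of this is how Tbot actually works.

```python
import gzip
import re
from typing import Iterator, Tuple

# One row of an INSERT statement: (123,'fr','Some title').
# Quoted fields may contain backslash-escaped characters such as \'.
ROW = re.compile(r"\((\d+),'((?:[^'\\]|\\.)*)','((?:[^'\\]|\\.)*)'\)")


def iter_langlinks(sql_line: str) -> Iterator[Tuple[int, str, str]]:
    """Yield (page_id, language_code, title) for every row found in one line."""
    for page_id, lang, title in ROW.findall(sql_line):
        # Undo simple backslash escapes (\' and \\) in the title.
        yield int(page_id), lang, re.sub(r"\\(.)", r"\1", title)


def read_dump(path: str) -> Iterator[Tuple[int, str, str]]:
    """Stream rows from a gzipped dump without loading it all into memory."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as dump:
        for line in dump:
            if line.startswith("INSERT INTO"):
                yield from iter_langlinks(line)
```

The output is only (page id, language, title) triples; mapping a page id back to an English article title, and deciding whether the foreign title is really a translation, would still need separate data and the kind of human checking discussed above.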
Transliteration appendices
The various transliteration systems in category:Transliteration appendices need references attesting to their wider usage or indicating their source. If they are systems modified or created specifically for Wiktionary, then they ought to be moved to the Wiktionary: namespace, per earlier discussion (Wiktionary:Beer parlour#Organization).
I added Wikipedia links to all of these appendices, and here is a link to Thomas T. Pedersen's reference to many transliteration systems.
Please add a reference to any of these appendices you are familiar with. —Michael Z. 20:07, 12 April 2008 (UTC)
It also looks like Wiktionary:About Greek/Transliteration may be a candidate to become an appendix. —Michael Z. 20:53, 12 April 2008 (UTC)
- Most of those were created specifically for Wiktionary, and for some (New Persian) editors seem to be using multiple transliteration systems simultaneously. If you have specific objections to any of those, I'm sure people will be happy to discuss on respective talk pages. --Ivan Štambuk 21:39, 12 April 2008 (UTC)
- The transliteration Appendices should have references, yes, but when the Transliteration system is part of a Wiktionary: namespace page, then it's an internal transcription system that may or may not be used elsewhere. --EncycloPetey 03:19, 13 April 2008 (UTC)
- Right, but I'd like to just figure out which is which, and put each one in the right place.
- A reader should be able to look at each table, and:
- Know if they can expect to see the system used in other publications, and if so then in which field (linguistics, publishing, other dictionaries, etc).
- Know if they shouldn't use it in e.g. an academic paper and expect their peers to be familiar with it.
- Have confidence that what they are reading observes the Wiki principle of verifiable accuracy, and clearly identifies any original research.
- From browsing through them and having a look at the relevant Wikipedia articles, it does look like a significant proportion of them are systems used in academia, or possibly dangerously close to such systems. —Michael Z. 03:30, 13 April 2008 (UTC)
Proposal: a template for linking prominently to foreign-language Wiktionaries.
When I add a foreign-language word that has a that-language Wikipedia article, I typically add a prominent link to that article, using {{projectlink|pedia|lang=fr}} or whathaveyou. However, if the word has a that-language Wiktionary entry, the link to it only shows up in the sidebar, which is fairly useless unless the reader knows to look there. I think it makes sense for the that-language Wiktionary entry to have a more prominent link, if only because the that-language entry is generally more complete (if only because it will have translations to languages other than English). So, I've created {{PL:wt}}, such that this:
* {{projectlink|wt|español|lang=es}}
will produce this:
It's pretty much like all the other {{PL:*}} templates, except that it doesn't create a sidebar link (since the normal bot-managed interwiki link serves that purpose).
Before I start using it widely, mentioning it at Wiktionary:Links, etc.: does anyone object to this? (And, does everyone agree that this should only be used for linking to the that-language Wiktionary entry, never to other foreign-language Wiktionary entries?)
—RuakhTALK 00:36, 13 April 2008 (UTC)
- The icon is pretty pointless. It's just a vague smudge, in my browser. Better to use nothing at all.
- Or perhaps the favourite icon, which was designed to display at 16px size. But then it should be the bullet, created using the list-style-image CSS property, not an icon placed next to a bullet. —Michael Z. 02:30, 13 April 2008 (UTC)
- But that icon is also used by Wikipedia. How about expanding the function of {{infl}} (and similar templates) to include a link to the appropriate wiktionary, if it exists? How feasible is this? Would this look OK in the inflection line? --EncycloPetey 03:16, 13 April 2008 (UTC)
- Yeah, Wikt: and W: are the only two projects that share an icon, but that's what we have (see the selection at commons:Wikimedia#Favicon). But we do use the globe for Wikipedia, so links with the W would still be distinctive.
- But that's also why I think it may be better to use nothing. The full logo at 16px size is not attractive or even identifiable. It doesn't even serve as an eye-catching bullet, especially if it is right next to a standard bullet. A graphical element that serves no function simply makes things worse than they would be without it. —Michael Z. 03:46, 13 April 2008 (UTC)
- Some other-language Wiktionaries use the scrabble tiles for their logo. The W tile clipped out of this image might make a usable 16-pixel bullet (and favicon). —Michael Z. 04:01, 13 April 2008 (UTC)
- (edit conflict) O.K., I've removed the logo. I don't like the favicon idea, for the same reason that EP gives; and even if I did like that idea, I wouldn't like the idea of not including a bullet from just one element of a bulleted list. (I mean, technically it would be multiple unordered lists in the HTML, but to our readers it would look like one bulleted list with just one non-bulleted element, so, same effect.) And anyway, on reflection it doesn't make much sense for us to use some form of our own logo as a means of identifying a foreign-language Wiktionary. However, it might be nice to use a bit of markup, something like (es), as the “logo”. —RuakhTALK 04:02, 13 April 2008 (UTC)
- The puzzle piece is typically used for stubs, not for projects. I wasn't clear what I meant; I was thinking of adding functionality to the {{infl}} template along the lines of what we do for {{t}}. So, an inflection line might look like: hablar (es) so that the interwiki link appears in the inflection line. The {{infl}} template already includes the language code, which simplifies the process a bit, should we decide to do this. --EncycloPetey 04:03, 13 April 2008 (UTC)
- I'm not opposed to {{infl}} having such a link, but I don't think it's enough: I don't think most readers will understand it. To be honest, I'm not sure most readers recognize the significance of those translation-table links, either, but at least there we provide only two links, so most readers who are interested in a translation will probably try them out once to see what they are. That's not true of an interwiki link representing the entire language section. —RuakhTALK 04:02, 13 April 2008 (UTC)
- Here's an idea: take the actual wikitext one would use to create an interwiki link, and realize it on the page. Its meaning actually is vaguely self-evident, even for non-wiki editors. All the better if there was a way to add a tooltip reading ‘hablar’ in Spanish Wiktionary. —Michael Z. 04:27, 13 April 2008 (UTC)
- [[es:hablar]]
- Why shouldn't a self-referential (inter-wiki) link make use of the vernacular?
- Here's what it could look like, with the es: link after the definitions: User:Mzajac/hablar. It also looks fine just below the headword, but interrupts the flow. I think it's a bit cluttered if placed at the end of the headword line. —Michael Z. 05:00, 13 April 2008 (UTC)
Another option altogether would be to leave the link in the left sidebar, but make it more prominent (bold font?). Wikipedia uses a javascript trick to make other-language featured articles have a star for a bullet (w:Template:Link FA), so it would be possible to manipulate other attributes of the link. —Michael Z. 05:08, 13 April 2008 (UTC)
This idea has been suggested before, I think that javascript is the best solution, as the interwiki link would already be there and we can then just add a more prominent link to a suitable place in the entry. Obviously the formatting and position of these links has yet to be determined, but I feel that they should be part of, or near to, the language heading. Here are a few of my ideas for layout:
As above I think that adding these with Javascript is better than adding these to all the entries, though I suppose it could be added to Interwicket. Conrad.Irwin 11:24, 13 April 2008 (UTC)
- Well, I'd still like to be able to list them in "see also" sections like with other sister projects, but if people want your approach: I like option #2 (though "has an entry for" might sound better than "contains"). Option #1 is the most prominent, but again, I'm not sure it's obvious to most people what (es) means: if I saw it, I think I'd interpret it as "Hey, we know that the name Spanish is slightly controversial, so we're also including this unambiguous language code for clarity's sake." I'm not sure if I'd bother to click the link, since I'd assume that the link was to inform readers of what the language code meant (since not everyone is familiar with language codes). Option #3 is kind of cool, but less prominent, and it's not instantly obvious what it means. There are fairly few cases where we just include a link that would be absolutely meaningless in a print edition (aside from the edit-links, sidebar links, etc.), and I don't think this link needs to be an exception. But option #2 is great; I'd be very happy with it, especially if it were in concert with "see also"-s rather than instead of them. In anticipation of it, and of other scripts that might need such a thing, I've created MediaWiki:langcode2name.js, which offers functions for handling the language code, English name, and FL name of each Wiktionary language. —RuakhTALK 14:12, 13 April 2008 (UTC)
- Inspired by your quick response, and finding myself at a loose end I have implemented #2. It can be trialled at the bottom of WT:PREFS "Trial the javascript prominent interwiki links." (try a hard refresh if the option doesn't appear straight away). Any thoughts would be appreciated. Conrad.Irwin 16:58, 13 April 2008 (UTC)
- No. 2 is my favourite too. No. 1 destroys the graphic effect of the title, so I wouldn't want to see it implemented. —Michael Z. 17:01, 13 April 2008 (UTC)
- It looks pretty good. I'm going to try to get used to it a bit.
- Would it be acceptable to remove the "the" and the period at the end? It might look cleaner without the accoutrements, and if the beginning mirrored the language heading. I don't think the boldface is necessary to draw the eye to the link, and normal-weight text will probably be more readable at such small size, especially when it appears in certain foreign-language scripts.
- Since the note is already written out in full, the tooltip may be an opportunity to include the destination language, but this would require compiling many translations of "in Wiktionary".
- Starting with the term helps emphasize the link, and reduces the verbiage. —Michael Z. 17:53, 13 April 2008 (UTC)
- Hmm, thinking about it - do we need "Wiktionary" in there, seeing as we are all one big project? Could we get away with "Spanish entry for hablar"? I like the idea of having the foreign language in the title, or even in the heading, as this makes it clear what to expect from the link. I think it would make it hard to include "In Wiktionary" in the foreign language, as this would need different layout for each language. Perhaps the title could just be "español: hablar" or something that requires little effort ;). Incidentally please feel free to bugfix/experiment with the javascript so long as you bear in mind that it is being used by an unknown number of other people. I'll give your second idea a go now. Conrad.Irwin 18:06, 13 April 2008 (UTC)
- I think there needs to be a reference to the other project. We are already looking at a Spanish word's entry in English Wiktionary, and "Spanish entry" doesn't make it perfectly clear how the context will change when we click.
- "Español: hablar" looks fine to me, but does this construction work in every language? Probably, but some of them may have to be bundled with whatever passes for a colon. —Michael Z. 18:15, 13 April 2008 (UTC)
- I have no idea, I think we can leave it as a colon unless anyone else has an opinion. Conrad.Irwin 20:10, 13 April 2008 (UTC)
- I prefer having just the term linked, which becomes the self-evident subject of the note. When the whole note is linked, there is no differentiation, and it looks like a subtitle for the heading. On the other hand, this may be a problem for short words like i. —Michael Z. 19:55, 13 April 2008 (UTC)
- Yes, I was looking at one, and it didn't seem to be prominent enough, though I agree that it is better with only the word linked for anything longer. Conrad.Irwin 20:10, 13 April 2008 (UTC)
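The "español: hablar" form of the note discussed above could be built roughly as follows. This is only a sketch: the table and the function names `interwikiLabel` and `interwikiUrl` are hypothetical and are not the interface of MediaWiki:langcode2name.js, and the colon separator is assumed to work for every language, which the thread leaves open.

```javascript
// Hypothetical lookup table: language code -> the language's name for itself.
// A real script would presumably take these names from a shared source
// rather than hard-coding them here.
const NATIVE_NAMES = {
  es: "español",
  fr: "français",
  de: "Deutsch",
};

// Build the short note text, e.g. "español: hablar".
// Falls back to the bare language code when the code is not in the table.
function interwikiLabel(langCode, term) {
  const known = Object.prototype.hasOwnProperty.call(NATIVE_NAMES, langCode);
  const name = known ? NATIVE_NAMES[langCode] : langCode;
  return name + ": " + term;
}

// Build the address of the same entry on the other-language Wiktionary.
function interwikiUrl(langCode, term) {
  return "https://" + langCode + ".wiktionary.org/wiki/" + encodeURIComponent(term);
}
```

Keeping the label and the link target as separate functions makes it easy to link only the term, or the whole note, whichever turns out to read better for short words.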
Weird: the link appears, but is gone if I refresh a page in my browser (Safari 3.1/Mac). —Michael Z. 19:48, 13 April 2008 (UTC)
- Yes, this is partly caused by bugzilla:12773, and partly because it is including an external dependency that may or may not have downloaded before the script runs. I'll have a think about the best way to fix this. Conrad.Irwin 20:10, 13 April 2008 (UTC)
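One common way around a dependency that may or may not have loaded yet is to queue the work and run it only once the dependency reports that it is ready. The sketch below illustrates that general idea; it is not the fix that was actually applied, and `createReadyGate`, `onReady` and `markLoaded` are hypothetical names.

```javascript
// Create a small "ready gate" for a single dependency.
function createReadyGate() {
  let loaded = false;
  const waiting = [];
  return {
    // Run fn immediately if the dependency has loaded, otherwise queue it.
    onReady(fn) {
      if (loaded) {
        fn();
      } else {
        waiting.push(fn);
      }
    },
    // Call this from the dependency's load handler.
    // Queued callbacks run once, in the order they were registered.
    markLoaded() {
      if (loaded) return;
      loaded = true;
      while (waiting.length > 0) {
        waiting.shift()();
      }
    },
  };
}
```

With this shape the link-adding code behaves the same whether the dependency arrives before or after it, which is the situation a page refresh with a warm cache tends to expose.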
The action word for a link is a good idea, but the verb is look up, not lookup.
Not sure if I like the indentation breaking up the left margin, though. —Michael Z. 16:02, 14 April 2008 (UTC)
Compounds and grammar
1. The article forgo describes this English verb's grammar as {{en-verb|forgoes|forgoing|forwent|forgone}}. Very similar patterns appear in the articles go and forego, and likely a few dozen other compounds ending in -go. Isn't it a waste of energy to repeat the pattern go/goes/going/went/gone in so many places? Shouldn't a reference to go be enough for the grammar?
2. In the many years of Wiktionary, many people must already have asked this question and so I would have thought that there should be a page about this question and its answer somewhere in the Wiktionary: or Help: namespaces. Is there? I can't seem to find one.
3. It can be argued that this is a minor problem for the English language, but in German and Swedish where compounds are so much more common and inflexion patterns are more complicated, the question takes on a completely different dimension. Still, new methods are seldom introduced in sv.wikt or de.wikt unless they already exist in en.wikt. It appears sv:föregå and de:vergehen use the same methods as forego, the full grammar pattern is repeated in every article for every compound. --LA2 15:46, 13 April 2008 (UTC)
- Well if it's the wikitext duplication that bothers you, we could create a new special template for this group, allowing go to use {{en-verb-go}}, undergo to use {{en-verb-go|under}}, etc.; but I don't see why we wouldn't want the displayed version to show all the forms. —RuakhTALK 15:54, 13 April 2008 (UTC)
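What such a template would do can be sketched outside wikitext. The proposed {{en-verb-go}} does not exist here, so the functions below are purely illustrative: they prepend a prefix to the principal parts of go and emit the equivalent {{en-verb}} call, under the assumption that the compound inflects exactly like go.

```javascript
// Principal parts of "go" in the order {{en-verb}} takes them in the
// forgo example: third-person singular, present participle, past, past participle.
const GO_FORMS = ["goes", "going", "went", "gone"];

// Hypothetical equivalent of {{en-verb-go|prefix}}: the forms of a -go compound.
function goCompoundForms(prefix = "") {
  return GO_FORMS.map((form) => prefix + form);
}

// Render the fully written-out {{en-verb}} call for the compound.
function enVerbCall(prefix = "") {
  return "{{en-verb|" + goCompoundForms(prefix).join("|") + "}}";
}
```

For example, `enVerbCall("for")` yields the same text that the forgo entry currently spells out by hand, so the displayed inflection line would not need to change.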
- Personally, I think that we can be of most use to our audience if we show the inflected forms for each compound. Unlike print dictionaries we are not limited by space. Thryduulf 15:56, 13 April 2008 (UTC)
- My primary concern is with question 2: Where has this been discussed before? A special template for "go" could be one solution, but for German and Swedish it would mean hundreds of thousands of templates. Saving space is not one of our needs, of course, but adding new compounds is a problem if you have to repeat the grammar pattern each time, without the ability to use a template. I don't expect a solution and consensus to appear in ten minutes, but I was expecting this question to have been discussed before. --LA2 16:23, 13 April 2008 (UTC)
- For compounds with regular inflection, then perhaps a compound noun template(s) would be possible, depending on the grammar of the language in question of course. I can't see a way of automating irregular conjugation unfortunately. If the person defining the compound terms doesn't want to enter the inflections then they can use perhaps the {{infl}} template and categorise it somewhere where others can find the entry to add them later. Thryduulf 16:40, 13 April 2008 (UTC)
- A reference back to a main entry isn't always satisfactory. Often, compounds follow the inflection of the parent verb, but sometimes they do not. For example, Latin faciō has an irregular passive voice conjugation. Some of the compounds from faciō have this same irregularity in the passive (e.g. (deprecated template usage) patefaciō), but others do not (e.g. (deprecated template usage) cōnficiō). It's therefore better if each entry contains full information of its own. --EncycloPetey 16:52, 13 April 2008 (UTC)
- Well, the inflection line doesn't normally show all the inflected forms of a word, just the principal parts (whatever those are considered to be for a given language). In highly inflected languages, the other noteworthy forms are placed in a separate section (Conjugation/Declension/Inflection). But the idea of shunting this information off to a single "core" entry for each group of words is a non-starter, for reasons that are pretty basic to the philosophy of Wiktionary. Entries should stand on their own as comprehensive treatments of the word or form in question; the user should never be required to go to another page in order to get basic information on inflectional (or any other) properties of a word. Even for languages with quite regular inflectional patterns, like Latin, the principal parts for each verb are given in the inflection line (e.g. video#Latin). Users could be directed to an appendix to figure out this information for themselves, but that just isn't the way we do things here. I'm not sure if your specific proposal has been discussed before, but I don't think it would ever fly. -- Visviva 16:41, 13 April 2008 (UTC)
- When I search the Wiktionary: and Help: namespaces for "basic to the philosophy of Wiktionary. Entries should stand on their own", I get lots of hits in the Beer parlour, but no obvious policy page. Where should I look? --LA2 16:48, 13 April 2008 (UTC)
edit conflict:
- FWIW, Longman's DCE shows the past as "rare".
- As to question 1, in the case of compounds not separated by a space or hyphen, I would think that we would want to show the inflection in each entry because it is not necessarily obvious that one could click on part of the blue headword in the inflection line to determine inflection. I use the infl template to suppress inflections that would result from using {{en-verb}} for compound words separated by a space or hyphen, phrasal verbs, idioms, and other phrases that I put under the verb PoS header. I inflect compounds that are not separated. I justify the different treatment by saying that it ought to be obvious to even a casual user that, for such entries, if the inflection is needed, one would click on the word involved. "Obviousness" is in the eye of the beholder, of course, so this is not entirely satisfactory. HTH.
- As to question 2, unfortunately you have to use Google-type search skills in the various spaces (talk, wiktionary, WT, Appendix) to try to exhume old discussions of this and sometimes even a description of current practice. We don't seem to have that many policy systematizers active here and many practices have not gotten beyond disagreements. For example, there is a school of thought that would object to my suppression of inflection of phrases and others who would disagree with the very idea of trying to assign "real" parts of speech to idioms. DCDuring TALK 16:59, 13 April 2008 (UTC)
I think there is some confusion here. The inflection line contains a few key forms of a verb; for very regular forms we do have templates (like the infamous {{en-verb}}, which can handle far more than just vanilla regular English verbs), but for the less regular forms the effort of creating, cataloging and looking up these templates takes much longer than typing 30 characters. For the full conjugation of inflected languages we do have templates (see the list of French conjugation tables that use the standard pattern) because the effort saved is very large and overcomes the maintenance costs. I don't think anyone has asked about this before because what we do does actually make sense. As for point three... the whole point of having separate Wiktionaries is that different solutions work better for different languages' readers and editors. I think it would be much easier for a newbie (and experienced editors) to find and use (or read and understand) {{el-verb|egkataleípo|εγκατέλειψα|egkatéleipsa}} than, for example, {{el-verb-λειπω|εγκατ|α|έ|egkat|a|é}}, even though the latter is shorter (see εγκαταλείπω) [I admit this is somewhat contrived, but the same principle holds for smaller idiosyncrasies in many places]. Conrad.Irwin 21:49, 13 April 2008 (UTC)
Scientific names and Taxonomy headers
Are these standard headers? If not, how should I correct them? They usually contain Latin taxonomy specifications for the entry. Example: bar-winged rail. --Panda10 17:15, 13 April 2008 (UTC)
- I prefer to incorporate the taxonomic name in the definition. That is what I have done with the example (as well as giving it a proper definition, and linking to the correct Wikipedia article). SemperBlotto 17:23, 13 April 2008 (UTC)
- I would do as SB does and:
- add a link to wikispecies ({{wikispecies}} or {{specieslite}}). Though they have 130K+ articles, they might not have the one you want.
- look for picture in wikicommons. If there are multiple ones, I usually put in a commonslite link as well as the most helpful or interesting image.
- make these links external
- make in-line links to the individual words of the two-part species name, on the theory that we should be happy if we have the numerous component words for these names in Latin or Translingual and not try to keep up with the complexities and changes of taxonomic classifications and offer at best terse uninformative entries.
- I think it should be clear that we do not have the ability to provide comprehensive coverage of compound names in taxonomy (2- or 3-part) and chemistry (n-part). If we could provide a reasonable mapping from vernacular names to taxonomic names as well as definitions of the parts of taxonomic names, we would be doing things that WP and Wikispecies do not and are not likely to unless we fail. DCDuring TALK 17:49, 13 April 2008 (UTC)
- People have been using L4 headers Scientific names and Taxonomic names. While it may be useful to incorporate one or two into the definition, this pattern doesn't hold up well when there are several, or dozens. AF has been treating "Scientific names" as recognized (understand this is not an application of policy, of which there is none!), and "Taxonomic names" as unknown; I think we should treat "Taxonomic names" as a recognized L4 header, and convert "Scientific names" (which is sort of ambiguous, could be any kind of "scientific" naming ;-) to "Taxonomic names" Robert Ullmann 22:30, 14 April 2008 (UTC)
- It would be very desirable for us to have maps from vernacular names (in, say, English) to the one or more species or genera that may be appropriate. These names are the key to a vast amount of good information that is not necessarily accessible from the vernacular name. Providing this key is something useful to users and not done well by Wikispecies or even Wikipedia. It would pay to make it as complete as we can. But if the list of species would become overwhelming then we can limit ourselves to the genus names.
- In the case of dozens of scientific names you must be referring to cases where there are numerous derived or related species names appearing under a genus name that ought to be Translingual; that is an illustration of the problem. These lists are often not complete, use obsolete or disputed names, or are wrong in other ways. I doubt that we will succeed in recruiting many taxonomists to maintain these listings. We also lack the specific structure that Wikispecies has for this kind of information and the breadth of info that WP offers.
- I would think that we would want to discourage the creation of new entries with those headings and with the extensive derived and related terms lists and exploit Wikispecies' and WP's work. Linguistically, the language of the species names includes vast numbers that cannot be said to have been adopted into English, but are essentially Translingual, with components that are Latin (or latinized Ancient Greek). We can provide a useful service to WikiSpecies and to WP by handling the linguistic aspects of these names (etymology, morphology, inflection) as well as association with vernacular names. DCDuring TALK 23:30, 14 April 2008 (UTC)
(after edit conflict)
- I agree that "Taxonomic names" is better than "Scientific names", but I'm not certain about either of them. Are these not Translingual proper nouns? e.g currently we have
- Homo sapiens: Translingual proper noun; English noun
- Hominidae: Translingual proper noun
- Homininae: Translingual proper noun
- Ponginae: English proper noun
- Pongo: Translingual proper noun
- Agaricales: Translingual proper noun
- Lycopodiopsida: Translingual proper noun
- Lynx: Translingual proper noun
- cycad: English noun
- chironomidae: English noun
- Insecta: English proper noun
- platanifolia: Translingual proper noun
- abutilon: English noun
- accipitres: English noun
- mongolica: Translingual noun
Categories also seem all over the place, with entries in at least: category:Taxonomic names, Category:Taxonomy, category:Zoology, Category:Botany, category:Entomology and Category:Biology. Thryduulf 23:44, 14 April 2008 (UTC)
- Lynx ought not be Translingual and should be lower case. I believe that the taxonomic names that appear as Translingual should all have at least the first letter of the first word be capitalized and should be proper nouns. The second part of species names should be entered as Latin, usually/always(?) an adjective, usually/always(?) uncapitalized. Some Latin-derived species and genus names have become part of English and often follow English rather than Latin pluralization. I rely on Stearns' Botanical Latin as a basic source, but haven't finished reading it yet. DCDuring TALK 00:02, 15 April 2008 (UTC)
- Note that lynx exists as an English common noun for the wild cats, while Lynx is a translingual proper noun entry for the taxonomic genus.
- I don't understand why the second part of a taxonomic name should be a different language to the first part? Just because it has a Latin etymology, and has a Latin homograph doesn't mean it isn't also translingual - particularly if it doesn't follow the Latin pluralisation rules. Thryduulf 00:10, 15 April 2008 (UTC)
- I was deferring to my understanding of EP thoughts on the subject. The theory might be that it doesn't become Translingual until it is an officially recognized name. Until that time it is New Latin, a variety of Latin.
- You are so right about Lynx. Sorry. "cycad" seems right as English, derived from Cycas, a genus. Each item would have to be checked for correctness.
- If it follows English pluralization, then it would certainly warrant an English entry. If it appears in non-technical English documents, it might warrant an English entry, but the Translingual really should be sufficient.
- As to categories: "Taxonomic names" serves to distinguish these Translingual from others; the discipline contexts/categories seem to be a shortcut for animal/plant/bacteria/mold/fungus/virus distinctions. cat:Taxonomy might be useful for New Latin terms used in taxonomy. That's my take, but based only on limited ill-remembered anecdotal experience and not systematic analysis. DCDuring TALK 00:43, 15 April 2008 (UTC)
My reading of the convenience samples of entries:
- OK, I think
- Homo sapiens: Translingual proper noun; English noun
- Hominidae: Translingual proper noun
- Homininae: Translingual proper noun
- Pongo: Translingual proper noun
- Agaricales: Translingual proper noun
- Lycopodiopsida: Translingual proper noun
- Lynx: Translingual proper noun
- cycad: English noun
- abutilon: English noun - Abutilon genus name
- Abutilon is a redirect to abutilon, rather than being a Translingual proper noun, reducing the ratio of L2/L3-correct entries to 8 / 15. DCDuring TALK 12:36, 15 April 2008 (UTC)
- Not OK
- Ponginae: English proper noun => Translingual proper noun ("TPN")
- chironomidae: English noun => u.c. TPN
- Insecta: English proper noun => TPN
- platanifolia: Translingual proper noun => Latin adj
- accipitres: English noun => Latin obsolete taxonomic name
- mongolica: Translingual noun => Latin adjective
All of them could stand a link to Wikispecies. Some don't even have WP links. Etymology would be fairly straightforward using a Classical or New Latin suffix and a Latin or latinized Greek head. DCDuring TALK 01:16, 15 April 2008 (UTC)
- Just noting that the conversation so far looks all good to me (speaking as a trained botanist with a specialty in systematics). However, we might want to consider subdividing the Category:Taxonomic names (or whatever we call it). We probably ought to subdivide out (1) the binomials used for species, (2) names of genera, and (3) higher-level taxa. Putting the whole shebang into a single category seems, well... unhelpful. Particularly so since the lexical use and structure will differ. Species names are binomials, including a functional noun and descriptor. Genera are singular nouns. Higher-level taxa are often constructed as plural nouns, descriptions, or substantive adjectives derived from the names of genera or from characteristics of the group. --EncycloPetey 02:16, 15 April 2008 (UTC)
Regarding the categories, my initial thoughts are
- category:Taxonomy should be used for taxonomic terms, e.g. genera, not individual species, etc names (there is already a note to this effect at category:Taxonomy)
- category:Taxonomic names should be a sub-cat of category:Taxonomy (and probably others as well) and either contain all the taxonomic names or be a parent cat to more specific categories (EP please could you suggest appropriate category and context label names).
- Taxonomic name entries should also be in categories such as category:Entomology where appropriate but should also be categorised above.
- Individual species, genera, family and order names, etc should not appear in Category:Biology. Thryduulf 11:25, 15 April 2008 (UTC)
- The last suggestion might give us trouble in the cases where we do not have a category for the class of life or near-life (viruses?) under discussion or where the person making the entry does not have specific-enough knowledge. The less precise tag would help users in the meantime by providing a clue as where to look for more information.
- I think that there is a strong case for a specialized rfc tag ({{rfc-taxon}}?) for this kind of entry. The text field can convey information about the issues that previous editor had not yet resolved, with more detail always possible in the Talk page.
- I still feel that we are wasting our time insofar as we are duplicating work being done by WikiSpecies. The entries already are misusing the Related terms heading. We should have the linguistic relationships (etymological, morphological, "Derived terms", "Descendants"). Maintaining the hierarchy is for Wikispecies. Perhaps we need templates that read from Wikispecies and provide their best information on next higher element and next lower elements in the taxonomic tree.
- To me the vernacular name to taxonomic name mapping is a matter of great importance and value both to normal users and students of biological fields and one only partially addressed even by, say, the USDA plant database.
- The use of the "Scientific name" heading is particularly troubling to me because it is so ambiguous. Is it supposed to be a synonym? A hypernym? A hyponym? A translation into Translingual ? DCDuring TALK 12:36, 15 April 2008 (UTC)
- I disagree with point 3 above. We should not add all the scientific names of insects to Category:Entomology because they will overwhelm the other terms in the list. Besides, scientific names of insects are not used solely in an entomological context; they may be used when discussing evolution, ecology, botany, agriculture, strict taxonomy, etc. A Category:Taxonomic names of insects is possible, but that opens up the possibility of thousands of other similar categories that I certainly wouldn't want to have to maintain.
- I agree somewhat with DCDuring. We don't want to be duplicating work that is covered on Wikispecies. However, Wikispecies does not cover the etymological origin of names and name components. That information falls under our mandate, and it is useful to be able to look up such things. Also, I would rather see the scientific name included in a definition, when it is added to a common name entry, rather than under a separate section header. --EncycloPetey 18:03, 15 April 2008 (UTC)
- I absolutely agree that we need to do what Wikispecies does not do: coverage of the language that they use, Etymology, etc. I had tried to say that somewhere above. The only taxonomic entries that I don't think are worthwhile for us are the two-part (or 3-part) species names except in cases where the usage is common (Homo sapiens being the most common of these).
- I also think we should try to figure out how to get the latest information on taxonomic-tree navigation (one level up or down) from Wikispecies at run-time. It might be that we can take advantage of the work Wikispecies has done to obviate the need for any kind of category structure for taxonomic names at all. Perhaps we could read from them what kingdom (?) a given taxon is in. To me the question is whether our integration with Wikispecies is at run-time (ie, on demand) or periodically. If run-time integration can give adequate performance and transparency from a user perspective and is not too difficult technically, it would be highly desirable. But periodic (weekly, monthly, quarterly) updates (from Wikispecies (and to ???)) could be good enough. DCDuring TALK 19:30, 15 April 2008 (UTC)
- Run-time integration would just substitute one problem for another. Our definitions include all major uses of a term, and not just the single most current one. So, while Wikispecies uses the current APGII circumscription of Liliaceae, our entry should have at least three definitions, each based on one of the common major senses meant in scientific literature. We also experience the problem of more than one taxon sharing a name. Some of Wikispecies' pages have a parenthetical inclusion to disambiguate names assigned to zoological and botanical groups. So, a plant and animal, or an animal and alga, may share the same name. Each will have its own kingdom, included taxa, etc. We also include obsolete terms among our entries that have no corresponding page on Wikispecies because the name isn't used anymore. We also allow for any taxonomic name at any rank, whereas Wikispecies sometimes skips levels that aren't used often (like infraorder). Wikispecies also has huge holes in its coverage. Run-time integration is not a pipe dream, but at the present we have nowhere near the readiness to implement something like that. --EncycloPetey 21:47, 16 April 2008 (UTC)
I use {{British}}, especially when a reference says so. This renders as (UK), and adds category:UK, but the United Kingdom isn't Britain. It would seem more natural to refer to the language as used by people from a place, rather than within the borders of a polity. —Michael Z. 18:38, 14 April 2008 (UTC)
- One major use of the UK tag is to discriminate UK-only from all-English or US usage. And linguistic place in this case is roughly equal to polity. As I understand it, UK = Great Britain = Northern Ireland + Britain; Britain = England + Wales + Scotland. I'm not too sure about Cornwall and Channel Islands. The UK tag is intended to include those places covered by the school system and English-language media located there and relates to contemporary usage that more or less covers the whole place. The languages and dialects that exist there are supposed to be covered by other tags, some of them somewhat controversial. I'm not sure how this set of tags can be improved. I'm also not sure how the English spoken in Ireland insofar as it differs from UK English fits into the tag system. DCDuring TALK 20:35, 14 April 2008 (UTC)
- Actually, Great Britain refers to England, Scotland and Wales as a unit, and corresponds to the island of Britain (Cornwall is part of England, and I'm not sure about all of the little islands either). The UK is the United Kingdom of Great Britain and Northern Ireland, comprising Britain plus one sixth of Ireland. —Michael Z. 22:01, 14 April 2008 (UTC)
- It is overly specific. The designation "UK" means that Northern Ireland is included, but the Republic of Ireland is not, and implies that the English of the former is closer to the language of the rest of the UK than to the English of the latter. But if an editor really means to be this specific, he will never use this tag, but a more specific one, e.g. {{Ireland}}, {{Northern Ireland}}, {{Ulster}}, etc.
- It doesn't appear to correspond to any variety of the language, specific or general. We have British English and Wikipedia has an article on w:British English. My paper dictionary uses Brit. for "chiefly in British English..." It appears that dictionary.com, the online etymology dictionary, AHD and M-W also use "British". Is there any precedent in published dictionaries or in linguistics for a "United Kingdom English" or "UK English?"
- As far as I can tell, the phrase "UK English" only appears when the country is juxtaposed with English, for example English lessons for students visiting the UK.
- In the UK could be useful as a context label (not a language label) for e.g., institutions in the United Kingdom. For example, SAS means Special Air Service (in the UK) and Scandinavian Airlines System (in Scandinavia), but the abbreviation is used for both these institutions in British, American, and every other variety of English.
- I think the tag text should probably be changed to Britain or British. —Michael Z. 21:24, 14 April 2008 (UTC)
And Commonwealth English (most of the world) goes where? Robert Ullmann 22:32, 14 April 2008 (UTC)
Why should it go somewhere?
- I hadn't given it any thought, but there isn't really a language variety such as "Commonwealth English", especially from the point of view of labelling individual words. Wikipedia's w:English in the Commonwealth of Nations is a list of local varieties, and see also w:Regional accents of English.
- Much of the language and usage is the same as its source, British English. Individual regions have developed their own features, but most of these will fall under one or two of {{Australia}}, {{New Zealand}}, {{Hong Kong}}, {{India}} etc.
- {{Canada}} stands out, because it is considered a category of "American English" or General American, has inherited many "Britishisms", and generally accepts much language from either. {{American English}} renders as (US), so this just means that maybe 4/5 of category:US will have to be moved to {{North America}}.
- {{South Africa}} and {{Philippines}} may be special cases of their own. —Michael Z. 23:53, 14 April 2008 (UTC)
- The idea of "Commonwealth English" only applies to spellings. When we mark pronunciations, we must identify particular accents or dialects: AusE, RP, etc. --EncycloPetey 02:04, 15 April 2008 (UTC)
- I hadn't even thought of pronunciations, only spelling and vocabulary. I suspect that (Commonwealth) is a synonym for (British, Canadian), but is in danger of being used to mark up terms which are really British and not Canadian.
- [A review shows that Commonwealth has some issues. I'll start a new topic at #Commonwealth, below.]
- Anyway, what do you think of changing the text from UK to British? —Michael Z. 02:27, 15 April 2008 (UTC)
- For spelling, I think it's fine, but for pronunciations it would be inappropriate. --EncycloPetey 03:35, 15 April 2008 (UTC)
- Oh, man, right. I never do listen to the pronunciations. Is it a problem because there are transcribed or recorded pronunciations with Irish accents marked up as UK? —Michael Z. 03:42, 15 April 2008 (UTC)
- Yes, that counts as an error. Any pronunciation specifically transcribed or recorded for an Irish English accent should be labeled with (Ireland), or something more specific like (Ulster), where appropriate. We only use UK when (1) the IPA/enPR is general to most areas of the UK, or (2) the editor isn't sure which UK accent is represented. I tend to stick with (Received Pronunciation) because I know what it sounds like (from years of watching BBC programs), but other UK regional accents are possible. --EncycloPetey 17:56, 15 April 2008 (UTC)
- Then it sounds like labelling the UK recordings as British would actually be more accurate, since Irish has its own label. —Michael Z. 18:09, 15 April 2008 (UTC)
- No, because "British" is not a current geographic location, unless you are assuming that the people of northern Ireland sound more like Londoners than like the rest of English-speakers in Ireland. We prefer geographic labels when a specific accent is not used, and the highest geographic label I've seen anyone use is national. --EncycloPetey 21:37, 16 April 2008 (UTC)
- I'm confused. Am I getting something wrong?
- UK includes Northern Ireland, so it includes English, Scottish, and Irish accents.
- Britain or Great Britain are both current geographic locations which correspond to the big island, so either one includes English and Scottish. Since Britain includes many diverse accents, we prefer RP or mainstream English, and use other labels for regional speech including Scottish, Cockney, Cornish, etc.
- Sorry, I should have said "Britain is not currently a nation". We tend to prefer country names as the broadest item in {{a}}, so UK would be preferred if a specific regional accent cannot be given. "British" doesn't really improve specificity, since the island of Britain includes myriad English dialects and accents. The only way that "British" could ever be truly useful is if it is known that the pronunciation is identical throughout England, Scotland, and Wales, but is different in northern Ireland. There is little likelihood of that, and less that it would be known for certain. Even then, the question would be open in the mind of the reader about what was intended. --EncycloPetey 02:58, 17 April 2008 (UTC)
Am I the only one...
...who hates the [show]/[hide] buttons being on the left side of inflection tables? I miss the old ways! — [ ric ] opiaterein — 21:38, 14 April 2008 (UTC)
- No, I also prefer them on the right. (Thus starts a quick opinion poll) Conrad.Irwin 21:41, 14 April 2008 (UTC)
- Right, please. "show ▼" should not look like a subheading. —Michael Z. 22:05, 14 April 2008 (UTC)
- Both, with ability to choose whichever you like via prefs. (left by default as I have heard several complaints about people not even knowing they could expand them, it would be nice if the entire div were clickable for expansion...) - [The]DaveRoss 22:07, 14 April 2008 (UTC)
- I too prefer it on the right, as my mouse is normally on the right hand side of the page for edit links / scroll bars (when using a trackpad it is even more annoying than a proper mouse). I do like the "▼" symbol though. Based on TDR's comment above, I think a WT:PREFS setting for left or right would be ideal. Thryduulf 23:24, 14 April 2008 (UTC)
- Right. In principle I prefer the left, but "show" and "hide" come out to slightly different widths, which means that the header moves a bit in a way I don't like. (Yes, I'm picky. What else is new?) —RuakhTALK 23:29, 14 April 2008 (UTC)
- Right as well. (Functionality to right, layout to left) - Amgine/talk 23:33, 14 April 2008 (UTC)
- Could not a "left" option place the link just to the right of the subheading, instead of next to the right margin? —Michael Z. 23:36, 14 April 2008 (UTC)
- I would expect that non-editing users would be better off with the show/hide more visible (almost certainly on the left) and editing users might well prefer them on the right, where their cursor often is. I don't see why the preferences of those of us here (almost by definition editing users) should determine what non-editing users see and use. The solution of having editing users be able to select the look in WT:PREFS or, better, "my preferences" would seem to be the best of both worlds. DCDuring TALK 23:46, 14 April 2008 (UTC)
- The preferences are pretty much unanimous, so wondering what hypothetical non-voters might find useful should not be an issue. A lack of data should not be used to support the alternative. I too prefer them on the right. That is where the scroll bar usually is for a browser window, so that is where I want the collapsing/expanding arrows. Putting the arrows all the way on the left side means more unnecessary hand and mouse movement. That will be true regardless of whether the person is editing or simply reading. --EncycloPetey 03:47, 15 April 2008 (UTC)
Reveal arrows are a common interface element in operating systems. On the Mac, they are a grey triangle facing right, just to the left of the heading/text, which rotates to point down when opened. They don't require brackets or "show/hide" text. Doesn't Windows use something like an elbow-down arrow the same way? I have no idea about Linuxes.
Why not just copy what readers already expect to see, instead of inventing a new interface? It may even be possible to use MSIE conditionals to give a separate presentation for many Windows users. —Michael Z. 00:15, 15 April 2008 (UTC)
- That is a good point, I think it was mentioned in the previous discussion about these - but not as forcefully. Windows, iirc, uses a [+] symbol. This code is of course copied from Wikipedia, and so it might be better to match what they do rather than try and get back to operating system level (which would not be how people expect websites to act). Maybe the sideways arrow would be better if the link is on the left, though it doesn't make sense if the link is on the right. Conrad.Irwin 01:02, 15 April 2008 (UTC)
- I have implemented the idea for making the whole NavHead (grey bit) clickable, and so this is yet another option to be considered. I am fairly ambivalent as to whether it is used: on the one side it is a bigger target to hit, on the other it is slightly less obvious what is going on. It does of course solve the problem of experienced contributors having the mouse on the wrong side of the screen ;). (Note that this method removes the functionality from the [show] button, making it just a status indicator.) Conrad.Irwin 01:02, 15 April 2008 (UTC)
- Forgot to mention... The idea of having the link after the text was trialled and dismissed - though that was before the whole NavHead could be clicked. <techy note>Also, there is a yucky hack repeated all over the place - "&nbsp;" should NOT appear at the start of <div class="NavHead">, the necessary space is now added by CSS. See this edit for what needs fixing</techy note>
- Additional notes: hard refresh if it isn't working for you yet. Also, you can put your own "show/hide" anywhere you want by setting .NavToggle { float: [left|right]; } in your monobook. You can hide it altogether with "display:none;". - [The]DaveRoss 01:20, 15 April 2008 (UTC)
- Having the whole bar clickable is slightly weird; we are not used to things popping open unless we click on a widget. (The exception is labels for form widgets, but they only change cursor focus.) Unexpected behaviour is not normally desirable, especially if it can change the user's context, say by unexpectedly opening a large translations section, and pushing the text they were reading down out of the window.
- Frankly, we already have a very prominent and self-explanatory click target ([show▼]), so I don't think it improves things to make an invisible hot-zone about 15 times bigger.
- A quibble: it also activates if I click and drag to select text in the title, and behaves a bit strangely if I double-click or double-click and drag. —Michael Z. 03:37, 15 April 2008 (UTC)
All changes have been reverted due to lack of support here. I am happy to play around with other ideas, but the current situation seems to have been annoying a lot of people ;) Conrad.Irwin 11:17, 15 April 2008 (UTC)
- Thank you for your edits Conrad Irwin. I'll continue on Wiktionary:Beer parlour#"Show" tags. Best regards Rhanyeia♥♫ 14:21, 15 April 2008 (UTC)
- Please make it a PREF for those of us who <s>aren't dumb</s> like it that way. - [The]DaveRoss 20:13, 15 April 2008 (UTC)
First of all, thank you to the people who put in a lot of time consuming and not always interesting leg work to question, list, verify and clean up entries.
Second, I have read something like 200 RFV and RFD conversations over the past few days, and many more than that in the past, and I have a wishlist or set of tips that I think will make following discussions and cleaning up after discussions a lot easier. This will hopefully cut down on the time it takes to archive, and make less tedious the archiving tasks causing more people to be willing to do them.
- Please make your "vote" clear. Right at the top of whatever statement you are making is best, bold "keep", "delete", "merge", "cited" statements make it much easier to see quickly, without rereading a 30 statement discussion what the gist of the content was. Make any qualifying statements, blanket statements, policy statements or general statements after the vote, so those archiving later can skip the information not directly relating to the status of the page.
- Please close discussions which are completed. The best way to do this involves <s>striking through</s> the title, noting the changes which were made to the page/sense/entry at the bottom of the discussion, and removing the tags from the target page. If this process is followed it is much easier to archive the page, it doesn't even have to be done by hand.
- Please format your comments so they are easily delineated from others' comments. If everyone's comments are at the same level it is much harder to figure out who said what, meaning we have to reread the entire discussion. If comments are on different levels it is much easier to weigh the different opinions.
Some of these requests may seem like laziness, not wanting to have to read the entire discussion, but I think if we all try to be more clear in our intentions on these discussion pages it will be less painful for people to archive old discussions and they will be archived more often. The second benefit for clarity in discussions is that they are much more useful down the road, we archive most substantial discussions so that people can read them in the future to understand why decisions were made, and the clearer our statements the more readily future readers can utilize them. Thanks. - [The]DaveRoss 23:19, 14 April 2008 (UTC)
- +1. —RuakhTALK 23:29, 14 April 2008 (UTC)
- +1 and thanks to TDR for the colossal effort to close and archive. DCDuring TALK 23:46, 14 April 2008 (UTC)
- I appreciate these thoughts not only to aid helpful archivers like DaveRoss, but to promote more closure on the RFD and RFV pages. Many of the discussions there are left ambiguous and unanswered, leaving a difficult and potentially controversial task in the hands of the archiver. The burden of accomplishing consensus and actually altering/deleting articles in the RFV/RFD process should fall to those active participants in the discussion and not be left to the interpretation of someone archiving 100 entries. -- Thisis0 23:40, 14 April 2008 (UTC)
- Suggestion: Can we create a brightly-colored icon that can be placed on discussions which ought to have been closed, but for which the outcome is still unclear? Such a bright icon, added to discussions that have languished for a month (or more) might help draw attention to users who can help. --EncycloPetey 01:59, 15 April 2008 (UTC)
- something like:
{{look|nocat=1}}
- (perhaps with less obnoxious colors) - [The]DaveRoss 02:16, 15 April 2008 (UTC)
- I think the obnoxious colors will achieve the desired effect. In essence it says, "You must attend to this issue to make the ugly banner go away." --EncycloPetey 03:34, 15 April 2008 (UTC)
- I think that's a good idea. How about setting it up as template:more input needed or template:unfinished discussion? Thryduulf 11:12, 15 April 2008 (UTC)
- I've gone for the slightly snappier, imperative {{look}}, which adds pages to Category:Input needed (though I suspect this feature is fairly useless) Conrad.Irwin 11:37, 15 April 2008 (UTC)
- I was under the impression that these would go in the discussions on RFV/D, which would list RFV/D in that category but not much else :). - [The]DaveRoss 20:09, 15 April 2008 (UTC)
- That was also my understanding. We wanted to draw attention to neglected discussion within the RFD/RFV pages with an attention-grabbing template that lets folks know the discussion needs resolution. --EncycloPetey 21:33, 16 April 2008 (UTC)
- As a trial I am going to start plunking this into old discussions without resolution, we will see how it goes. - [The]DaveRoss 13:59, 19 April 2008 (UTC)
IPA
Isn't the IPA for Wiktionary wrong? (the last letter) — This unsigned comment was added by Fshfsh (talk • contribs) at 22:50, 14 April 2008.
- No, not for some pronunciations. --EncycloPetey 03:56, 15 April 2008 (UTC)
- The IPA is godly. I might start praying to it.
- But if you're talking about the thing in the upper-left corner of every page, yes. It is. lol — [ ric ] opiaterein — 12:05, 15 April 2008 (UTC)
Commonwealth
{{Commonwealth}} includes Canada, but much of Canadian English spelling, vocabulary, and pronunciation differs.
Specifically, the following entries have senses or spellings categorized as {{Commonwealth}}, but are little-used in Canada: alphabetise, alphabetised, appal, archæological, archæologist, brew, discretise, editorialising, fanny, first floor, heads of agreement, homoeomorphic, homoeomorphism, hospitalisation, phonograph, point, pretence, seagulling, superfund, tea cosy.
I notice some are marked up (Commonwealth, except Canada), but they still end up incorrectly included in category:Commonwealth English. Is it okay to mark these as British (UK) instead? (Does category:UK represent UK dialect of English, or UK regional context, or both?) Is there a better way to mark them up? —Michael Z. 04:07, 15 April 2008 (UTC)
- If Canada is the only exception, then I think it makes sense for these to remain in the Commonwealth category. Otherwise they would have to be tagged {{UK|Australia|New Zealand|South Africa|India}} at the very least (and that would still leave out a bunch). It might be interesting to have a separate template and subcategory for Commonwealth spellings/words which are not used in Canada. -- Visviva 05:09, 20 April 2008 (UTC)
- Perhaps Canada is the most common exception, since its language is in many ways a part of American English, but a regionalism from any of the Commonwealth countries is an exception.
(Roman) numerals: which ones to include?
Recently 70.55.85.225 (talk) has been adding content to a lot of non-standard roman numerals such as mmmmm and LLM. Some of these edits are not good, since inconsistent and wrong, but some of it is usable as well, and he gave a reference on my talk page which defends them, although nothing about double subtractions can be found on w:Roman numerals.
However, I am thinking that most if not all of these entries are sum of parts and therefore do not deserve a lemma. Of course we want C, M, and the basic ones, just like we have entries for digits, but not for numbers (or are supposed to). But as I look at it, these are a mess as well. 0–9 exist (though with very different levels of detail), but 10–19 as well, whereas 20 and on are redirects to the corresponding spelt-out pages. Someone really needs to do some cleanup here! See Category:Arabic numerals. (Note that entries such as 180 and 1337 have other reasons of existence, though there also, I’d propose to remove the ‘Translingual’ section.)
What do others think? H. (talk) 14:05, 15 April 2008 (UTC)
- I don't mind including SoP entries - particularly in cases like this where it isn't necessarily obvious what parts it is made up of, however I wouldn't ask anyone to go round and create these. It seems to me that it is better to return something rather than nothing for cases like this, and if these words are deemed not necessary for inclusion I would like to replace them with {{only in|{{pedia|Roman Numerals}}}} or something useful (see XVII) perhaps linking to an appendix. As I said though, I would prefer them to have full entries if someone wants to create them. Conrad.Irwin 15:29, 15 April 2008 (UTC)
"Show" tags
This conversation has started on Wiktionary:Beer parlour#Translation bars and continued on Wiktionary:Beer parlour#Am I the only one.... The "show" tags of translation bars are now far on the right, and they were tested on the left. This led to a lot of comments and I'll continue here by commenting on some of them:
"'show ▼' should not look like a subheading" This may be true, and what I'd maybe be interested in trying next would be the "show" under the title text, but something else too as I'll write at the end of this message.
"Functionality to right, layout to left" I think this is part of the "layout" since it includes important mainspace content. Those who don't edit are not likely to pay attention to a small "show" on the right and they don't even know they are supposed to be looking for something.
"...right. That is where the scroll bar usually is for a browser window, so that is where I want the collapsing/expanding arrows. Putting the arrows all the way on the left side means more unnecessary hand and mouse movement. That will be true regardless of whether the person is editing or simply reading." I hadn't even thought about the scroll bar. Most software I use has most of the things on the left side (except the scroll bar), and because English (and many other languages') text naturally starts from the left, that's where people are looking. And there are things here also on the left side, like "save page" or "search". How about if there was a clickable area all the way under the title text, with ▼ both on the left and on the right, and "show" on the left? Best regards Rhanyeia♥♫ 14:18, 15 April 2008 (UTC)
- The fundamental issue is that we (those participating in community forums) are not a good model for "normal" non-editing users. As long as this is a construction site organized for the convenience of the craftsmen rather than as a building to be used by normals while being renovated and maintained, it will not be possible to build any enthusiasm for efforts focused on the needs of normals. Even the limited amount of free-text Feedback seems to have been considered an annoyance. The lack of interest in our page-view statistics and in steps to improve our Google visibility is troubling to me. DCDuring TALK 14:57, 15 April 2008 (UTC)
- Well, any wiki which tries to orient itself to the needs of non-editing users at the expense of the needs of editors is not going to get far. That's not because editors are somehow better or more important, but because ultimately any work that gets done is going to be done by editors, simply because they feel like doing it. People who want to work on usability and feedback response are welcome to, and I for one applaud the efforts that have been made so far; but I've never had much interest in presentation (on-wiki or off) and personally prefer to spend my limited time working on content. Anyway, in the present case, IMO the more critical concern is that a poor interface may cost us editors (and therefore content) in the long run, when people who want to add translations (etc.) can't figure out where they are or how to do so properly. -- Visviva 01:46, 16 April 2008 (UTC)
- I like this last proposal (a bar beneath the gloss with down-arrows at either end); AFAICS it would solve the opacity issue without creating any new problems. Can we get a demo? -- Visviva 01:46, 16 April 2008 (UTC)
- I wouldn't mind seeing a demo either, but if one control is wanting, then I suppose duplicating it would be twice so.
- Another model is the OS X dictionary app's display of multiple references. It uses a reveal arrow at the left-hand side of a divider, but the whole divider line and label act as a control. The best thing about it is the simplicity. One widget, one rule, and a simple label, all in the same low-key colour. The content relies solely on typography to reveal its structure, with no bullets, double rules, background colours, or unnecessary labels or punctuation.
- I'm not suggesting copying it verbatim, but keep removing elements until there's nothing left to remove. —Michael Z. 05:53, 16 April 2008 (UTC)
Misspelled words
I was told that it would not be appropriate to use a redirect to send people who spell words incorrectly to the correct dictionary page. Could anyone tell me if Wiktionary allows us to help people searching for a definition get to the right page? I'm not really looking for Template:misspelling of but just a general spellchecker to aid people looking for the definitions of words whose spelling they do not know. If a redirect is not how to do it, then how? -- penubag (talk) 09:19, 16 April 2008 (UTC)
- I have done some experimenting with using aspell to spell-check my searches (<tech>this currently works by using javascript to open an iframe to a python script on localhost which in turn uses the api to colour links correctly - so it is very inefficient</tech>). This seems to be very effective, though there are issues - particularly in that you need to guess what language was being searched for (<tech>currently guesses english + the ACCEPT_LANGUAGE header of the browser - but should be possible to get the guess to change for other scripts</tech>). There are plans for us to get mediawiki:Extension:DidYouMean which deals with diacritics etc. and so it may be possible to integrate aspell into an extension like it at a similar time. I think that this is very important, as a large proportion of our feedback has said "I can't find what I'm looking for". Conrad.Irwin 10:35, 16 April 2008 (UTC)
- You are doing what used to be called God's work, Conrad. DCDuring TALK 11:23, 16 April 2008 (UTC)
- I just looked at usage of "God's work". Let me make clear I meant that in a good, non-ironic way. DCDuring TALK 13:48, 16 April 2008 (UTC)
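The spell-checked search idea described above can be sketched in a few lines. This is only an illustration, not Conrad's script: it swaps aspell for Python's standard difflib module, and the function name suggest_titles, the TITLES sample list and the cutoff value are all assumptions made up for the example.

```python
import difflib

# Hypothetical sample of existing entry titles; a real tool would
# query the wiki for these rather than hard-coding them.
TITLES = ["definition", "definite", "dictionary", "pantograph", "determiner"]


def suggest_titles(query, titles, limit=3, cutoff=0.75):
    """Return up to `limit` existing titles that closely resemble `query`.

    An exact match is returned on its own; otherwise the candidates are
    ranked by difflib's similarity ratio, best match first.
    """
    if query in titles:
        return [query]
    return difflib.get_close_matches(query, titles, n=limit, cutoff=cutoff)


if __name__ == "__main__":
    # A misspelled search term yields suggestions instead of a dead end.
    print(suggest_titles("definiton", TITLES))
```

A similarity ratio knows nothing about the language being searched, which is the same weakness noted above for the aspell experiment; a dictionary-based checker would need a language guess as well.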
context...sense...qualifier...italbrac...a...
Are all of these really necessary? Do we really need more than 2, maybe 3 of these? — [ ric ] opiaterein — 12:19, 16 April 2008 (UTC)
- Italbrac does less than the others, I think. I think they are intended to have mnemonic names suggestive of their application, even if they don't do things very differently. But, as I understand it, context, for example, puts things in categories and also creates a list of candidate context categories. To me italbrac is the one that has been superseded. But I suspect it would not be wise to automatically replace it with one of the newer more specific tags. It is tedious to hand-check each one, so it doesn't seem to be a high priority to replace it so that the apparently redundant template can be deleted.
- Maybe we need to have template categories, like:
Encouraged, Standard, Permitted, Deprecated, To Be Removed, Experimental. Encouraged might be rendered ultra-accessible. Deprecated would be removed opportunistically and earn the scorn of an editor's peers if used. "To Be Removed" could have lists and a project page to encourage removal. Maybe we already have or have had such a system. Perhaps it has been found wanting. DCDuring TALK 13:46, 16 April 2008 (UTC)
- This seems plausible, though it might lead to some unnecessary political drama if imposed too rigorously. But let's not forget that a) this is an open wiki, where overt structure is frequently harmful; b) in the absence of real policies, templates often fill this role de facto. Nobody seems to mind when I and others use Template:quote-book in entries, but I imagine EP (and perhaps others) would have a fit if I tried to mark it "encouraged." Likewise I would have a fit if someone marked it "deprecated" (without a better replacement) or "experimental."
- I think we do have a deprecated template category somewhere, and possibly one for "subst-only" (which is a very important category, since templates of this type often to be orphaned). If not, those should certainly be created; "experimental" too. For "encouraged" one would want to see some sort of non-bureaucratic approval process; a flash poll on the BP/GP, maybe? -- Visviva 02:53, 17 April 2008 (UTC)
{{context}} is used to provide labels before the definition, and if the appropriate template exists it will categorise etc.
{{qualifier}} is used to qualify items in lists; it is often used next to links to alternative spellings, in translation tables etc.
{{sense}} is used under the Synonyms and Antonyms sections to refer to the sense/definition to which the listed terms apply.
{{italbrac}}, as mentioned, is mainly redundant.
Hope this helps--Williamsayers79 18:15, 16 April 2008 (UTC)
- The talk pages of these templates are usually a good place to start if you want to know what they do. Some of them, however, may need their documentation updated.--Williamsayers79 18:17, 16 April 2008 (UTC)
I don't use {{qualifier}} myself, but all of the others are definitely necessary (though {{italbrac}} may not be used much anymore). Although the output will look very similar for most of these, the function and location for their use are very different. Each exists so that, in the event we decide to format a particular section differently, we need only adjust the appropriate template. They also exist so that users can customize display for certain sections. Using the separate templates for the different functions keeps format and customization of different sections separate, rather than forcing a single format for all of those locations. --EncycloPetey 21:28, 16 April 2008 (UTC)
Checking my own understanding: {{italbrac}} is designed solely to give users who are "in the know" the option of viewing given text one way or another based on personal preference, instead of having the text "hard-coded". Normal users would see it in the default setting. The same capability is a by-product of the other templates (but I hope the primary justification is what EP suggests: flexibility in altering presentation for the benefit of actual end users). There are analogous templates for controlling appearance and functionality in etymologies: {{etyl}} and {{term}}. {{term}} is actually supposed to be used when a term is referred to within text like usage notes, excluding definitions. It is particularly useful for non-English terms, especially non-Roman scripts. I would interpret all this as meaning that the more specific templates are to be preferred over both hard-coding and {{italbrac}}, and "italbrac" is preferred over hard-coding. It would seem to suggest that we will have "italbrac" with us for a while. DCDuring TALK 22:02, 16 April 2008 (UTC)
- That seems about right. Some background reading: 2007: discussion leading to qualifier, 2007: qualifier is born, 2006: italbrac discussion. Frankly I'm still not sure why we are still using {{italbrac}}, except for inertia; all the standard cases where italics and parentheses would normally be required are already covered by specialized templates. The problem being, as mentioned above, that each case needs to be hand-checked. -- Visviva 02:53, 17 April 2008 (UTC)
List of descendents
What if we added a parameter to the Term template like |desc=true en to populate a list of descendants of a word? For example, on sandal the word σανδάλιον would make a list (somewhere) and include English: sandal. I have no idea if this is even possible or how it would be done, just putting it out there as something that could be pretty interesting. Nadando 23:38, 16 April 2008 (UTC)
- We already have a Descendants section. I'm not sure how such a parameter would work, since it would have to specify a language, yes? --EncycloPetey 02:50, 17 April 2008 (UTC)
- Yeah, that's why I put |desc=true|en or something like that. Nadando 02:52, 17 April 2008 (UTC)
- That sounds difficult or impossible to work. There would have to be a way that the descendant words were all marked (and many entries currently have no etymology at all!). Nothing we currently have does that. I think this would more easily be coded into the kinds of tables we currently have. --EncycloPetey 03:06, 17 April 2008 (UTC)
- To be done natively within MediaWiki this would require some sort of fancy-shmancy (and unapproved) extension. But I suppose some sort of automated script (that would sift Special:Whatlinkshere for links from Etymology sections preceded by an appropriate etymology template) could do a good deal of this work. Visviva 13:22, 17 April 2008 (UTC)
- That would certainly be worth a shot. I'd suggest, though, that instead of simply generating Descendants, it should also tackle Derived terms. After all, Derived terms are simply Descendants, but in the same language (at least the way we have the sections defined). So, the bot would need to know both the source and target language, and compare them. --EncycloPetey 21:40, 17 April 2008 (UTC)
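The bot idea in this thread (collect etymology links, then compare source and target language) can be sketched as a reverse index. This is a rough illustration under stated assumptions, not an existing tool: the function name build_reverse_index, the tuple layout and the sample data are all hypothetical, and a real bot would first have to extract these links from Etymology sections.

```python
from collections import defaultdict

# Hypothetical input: one tuple per etymology link, in the form
# (entry language, entry term, source language, source term).
SAMPLE_LINKS = [
    ("en", "sandal", "grc", "σανδάλιον"),
    ("en", "frogspawn", "en", "frog"),
]


def build_reverse_index(etymology_links):
    """Group entries under the term they come from.

    Same-language links are filed under "Derived terms", cross-language
    links under "Descendants", following the distinction drawn above.
    """
    index = defaultdict(lambda: {"Derived terms": [], "Descendants": []})
    for entry_lang, entry_term, source_lang, source_term in etymology_links:
        section = "Derived terms" if entry_lang == source_lang else "Descendants"
        index[(source_lang, source_term)][section].append((entry_lang, entry_term))
    # Sort each list so the generated sections are stable between runs.
    for sections in index.values():
        for items in sections.values():
            items.sort()
    return dict(index)


if __name__ == "__main__":
    for source, sections in build_reverse_index(SAMPLE_LINKS).items():
        print(source, sections)
```

As noted in the thread, the result is only as complete as the etymologies themselves: entries with no etymology section contribute nothing to the index.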
en-adj (not comparable) link
Can the wikilink from the (not comparable) option in the en-adj template link to Appendix:Glossary#comparable instead of the more general wiktionary entry? -- Thisis0 07:01, 17 April 2008 (UTC)
- I've been bold and changed it per your suggestion. I've also changed the target of the "comparative" and "superlative" links to the same glossary entry. Thryduulf 11:56, 17 April 2008 (UTC)
- I've made the same edits to the {{en-adv}} template; the {{en-noun}} template already links to the glossary. Are there any others that would benefit from this? Thryduulf 14:24, 17 April 2008 (UTC)
- Good. Now all we need to do is work on the Appendix language about "the controversy", that is, vestigial prescriptivism. DCDuring TALK 14:38, 17 April 2008 (UTC)
- What do you mean? What do you want addressed? -- Thisis0 18:58, 17 April 2008 (UTC)
- The {{comparable}}, the sense-line tag, important for long entries, would benefit from the same link. I believe that {{not comparable}}, {{countable}}, {{uncountable}} (Should that be displayed as "not countable"?) have links. I forgot to check whether all the links are to the appendix glossary. DCDuring TALK 14:50, 17 April 2008 (UTC)
Other project sidebar links
Where the target page in another project has the same title as the Wiktionary page the sidebar displays just the name of the other project (e.g. at frog). Where the target page has a different name, the box-style templates (e.g. {{wikipedia}}) display just the project name, the lite templates (e.g. {{pedialite}}) display the project name and the title.
Where more than one page is linked to on another project this can be very confusing, for example:
I can see two ways around this - the first is to display the target page name, even if it is the same as the Wiktionary page title. The second is to allow a parameter that contains custom text to display as the name. Thryduulf 15:24, 17 April 2008 (UTC)
- Are you talking about "in-text" (aka "in-line") or "in-list" links? Have you looked at WT:LINKS? DCDuring TALK 16:02, 17 April 2008 (UTC)
- I'm talking about the "in other projects" links to Wikipedia, Commons, Wikispecies, etc in the sidebar. The lists above are copies of what appears in these boxes in the pantograph, go and Lynx entries. What I'm saying is that we need to change how these links are displayed so they are less confusing. Thryduulf 22:34, 17 April 2008 (UTC)
- The problem is that there is not really enough space for any more information, see Template_talk:wikipedia2 where I have included some longer iwiki links. I'm really not sure how to make this less confusing. Conrad.Irwin 00:14, 18 April 2008 (UTC)
- Look at Afar. For this page, the main link is the disambiguation; other links have specific names. Is that what you mean? If so, I think it best to link to the disambiguation in such cases, and only link to more specific pages via pedialite if there are specific pages with direct connection to specific definition senses. --EncycloPetey 00:49, 18 April 2008 (UTC)
- I agree. —RuakhTALK 02:03, 18 April 2008 (UTC)
- That sounds good, but for entries like pantograph there are two Wikipedia articles, one at w:Pantograph, the other at w:Pantograph (rail). What I'm saying is that the link to w:Pantograph currently appears in the sidebar as "Wikipedia" and the link to w:Pantograph (rail) appears as "Wikipedia: Pantograph (rail)", but that they should appear as "Wikipedia: Pantograph" and "Wikipedia: Pantograph (rail)".
- At Lynx the links appear as "Wikipedia", "Wikipedia", "Wikipedia: Lynx (disambiguation)" and "Wikispecies". They would be much better as "Wikipedia: Lynx (cat)", "Wikipedia: Lynx (constellation)", "Wikipedia: Lynx (disambiguation)" and "Wikispecies: Lynx". Thryduulf 10:02, 18 April 2008 (UTC)
- I'm not so sure. I think it would be better to have only the sidebar links for wikipedia:Pantograph ("Wikipedia"), wikipedia:Lynx (disambiguation) ("Wikipedia"), and wikispecies:Lynx ("Wikispecies"). I think the sidebar links should just point to the project with more information, and that project can help its reader navigate around. (I'm not sure if it's best to try to implement this in the wiki-code for the templates, or in the JavaScript that produces the sidebar links.) —RuakhTALK 12:24, 18 April 2008 (UTC)
- I'm inclined to agree with both of you. That is, it would be nice to be able to suppress sidebar placement, but when present, sidebar links should clearly identify their destination (at least when != w:PAGENAME). As seen in the Lynx example, {{PL:pedia}} seems to behave in the desired way while {{wikipedia}} does not; this is governed by the stuff in the "interProject" span at the end of the template. I can't see any reason why the two templates should behave differently in this regard. -- Visviva 13:46, 19 April 2008 (UTC)
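For reference, the labelling options discussed in this thread can be sketched roughly as follows. This is only an illustration in Python, not the actual template wiki-code or the sidebar JavaScript; the function name, its parameters, and the rule that titles "match" when they differ only in the case of the first letter are all assumptions made for the example.

```python
from typing import Optional


def _same_title(target: str, page: str) -> bool:
    # Assumption: "pantograph" and "Pantograph" count as the same title,
    # i.e. only the case of the first letter is ignored.
    return target[:1].lower() + target[1:] == page[:1].lower() + page[1:]


def sidebar_label(project: str, target: str, page: str,
                  always_show_target: bool = False,
                  custom_text: Optional[str] = None) -> str:
    """Return the text shown in the "in other projects" sidebar box.

    always_show_target models the first proposal (always display the
    target page name); custom_text models the second (a parameter with
    custom display text). Both names are hypothetical.
    """
    if custom_text is not None:
        return f"{project}: {custom_text}"
    if always_show_target or not _same_title(target, page):
        return f"{project}: {target}"
    # Behaviour described at the top of the thread: same title, so only
    # the project name is shown.
    return project


print(sidebar_label("Wikipedia", "Pantograph", "pantograph"))
print(sidebar_label("Wikipedia", "Pantograph", "pantograph", always_show_target=True))
print(sidebar_label("Wikipedia", "Pantograph (rail)", "pantograph"))
```

With always_show_target left off, the first call gives the ambiguous bare "Wikipedia" label complained about above; switching it on gives "Wikipedia: Pantograph", which sits unambiguously next to "Wikipedia: Pantograph (rail)".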
What happened to the main page?
What happened to the main page? RJFJR 15:51, 17 April 2008 (UTC)
- You need to delete Wiktionary:Main Page (added by "Bbnbcm") - but there don't seem to be any sysops when you need them.
- When I clicked history there were only three entries. (It's since been fixed) RJFJR 16:09, 17 April 2008 (UTC)
Has been restored. The renegade admin has been de-sysopped by a steward (thanks Spacebirdy!) Robert Ullmann 16:27, 17 April 2008 (UTC)
- I suspect that the loss of images may be related. DCDuring TALK 13:19, 18 April 2008 (UTC)
- His deletion log doesn't show any recent image deletions that were unreasonable. What surprises me is that we are not using local protected copies of the Commons images on the main page. It doesn't look like they're protected on Commons either... Mike Dillon 15:07, 18 April 2008 (UTC)
- It might have been a temporary problem at Commons or even more specific to me. It would only have taken a rename/move at Commons, which could be under another name. An incident always makes one a little skittish. But it is a vulnerability, not that a wiki won't always have plenty of them. I don't have the mindset for security. I appreciate whatever protection can be afforded our efforts, as long as there isn't excessive inconsistency with the fundamental philosophy. DCDuring TALK 16:12, 18 April 2008 (UTC)
FL links in translations
I've updated the CSS and the {t} templates for these. The appearance should be improved on most browsers (but I can only test a few); the links will have less effect (if any) on line spacing, and on IE's irritating habit of vertically centering a line with a superscript on the bullet. I may have made them just a little too small, tell me? (It has to do with how many pixels, so on my screen 70% and 75% are one pixel different, others may see 0 pixels difference.)
The result is that you can now do a whole series of customizations. You can change colours, font size and font, adjust the degree of super-scripting (including none), leave off the parentheses, or suppress the links entirely.
On all current browsers except Internet Explorer, you can also replace the parentheses with (e.g.) brackets, add symbols, or replace the language code with symbol(s).
See Customization at template {{t}}. Robert Ullmann 15:51, 18 April 2008 (UTC)
Inactive Sysops
I just did a quick, informal audit of Wiktionary sysops. I was looking at the total number we had (75) and it seemed high. The reason that it seemed high is that 11 (~15%) of our sysops are relatively (or completely) inactive. We have, in the past, removed sysops who were no longer active on the project, and I was wondering if the time had come to do so again. If not removing the sysop flag, perhaps removing them from the list on WT:A, which gives the impression that we have a lot more help than we actually do. Here are the sysops which I found to be "inactive" by my own impromptu standards:
- User:Ortonmc - diff declaring he no longer wanted to be an admin.
- User:Jun-Dai - 4 edits since 2006, not sure about interest in the project any longer.
- User:Tawker - 1 edit in past year, not sure about interest in the project any longer.
- User:Kipmaster - Not much use of the tools, we can ask easily enough about interest.
- User:Psy guy - no edits since 2006
- User:Aulis Eskola - ~12 edits in the past year, not sure about interest in the project any longer.
- User:Andrew massyn - inactive since May 5 2007.
- User:Pathoschild - not terribly active here anymore, easy to ask about interest.
- User:Alhen - not very active here, still active on es.wikt, easy to ask about interest.
- User:Tohru - 5 months inactive, not sure about interest.
- User:Enginear - ~1 year inactive, not sure about interest.
Now, lest anyone say so, I have nothing against any of these people, I quite like all of them who I have interacted with. My primary concern is that we actually do need more sysops, and right now it looks like we have a lot more than we really do. I consider the sysop flag something which indicates a participation and commitment to the project (I know this isn't shared universally) and would like to see more active contributors flagged to help out, but when folks are done I think it is also a good idea to remove the flag. This is certainly not an all or nothing thing, there are varying degrees of inactivity amongst the people I have listed, but I am interested to hear thoughts on what is in the best interest of Wiktionary. I think the ideal here would be to come out with a clear idea of what we consider inactivity, what we consider a standard practice when a sysop is inactive, and then apply it now and in the future. - [The]DaveRoss 22:03, 18 April 2008 (UTC)
- Perhaps, instead of removing the sysop flag - which I see no reason to do if we want more sysops ;), we could just remove them from the main list at WT:A and use that as our count instead - there is no need to rely on the software's counter. For me, I would say that inactivity sets in after 6 months of no edits. When people declare that they no longer wish to be sysops they should have the flag removed. Conrad.Irwin 22:17, 18 April 2008 (UTC)
- Agree with Conrad.Irwin. I see little harm in allowing inactive sysops to retain their flags, at least until we see evidence that this is dangerous. However, it is nice to have an accurate list of active sysops. What might also be nice is if one of our technical folks could write a dynamic list of sysops active within the previous five minutes, so that folks looking for an active admin at the moment (to block a rampaging vandal, ask a question and get an immediate response, etc.) could find one. -Atelaes λάλει ἐμοί 22:29, 18 April 2008 (UTC)
- I am curious as to why inactive sysops ought to retain the flag? It is my understanding that the sysop flag is not a merit badge, it is a set of tools entrusted to certain editors in order to help the project proceed. People who don't edit Wiktionary anymore don't need the tools. - [The]DaveRoss 22:51, 18 April 2008 (UTC)
- Simply because removing them seems like a waste of effort - and, should they pop back one day, then we'd suddenly have more sysops without having to wait for whatever procedure to give them the tools back. The tools should be given to anyone we can trust to use them properly, not just to those who can demonstrate that they are using them. As a period of inactivity does not change a user's trustworthiness (we've been through the compromised account arguments before, it is exceedingly unlikely) it should have no impact on the tools. Conrad.Irwin 22:58, 18 April 2008 (UTC)
- The "trustworthiness" metric is an interesting one...some of the older sysops were simply appointed, others got 3-4 votes...this is neither here nor there, but it isn't as if there were sweeping mandates, just a confirmation that folks knew and trusted them. As it is, people who have been inactive for more than a year can hardly be expected to step in and know all the current policy (especially since we don't really write it down). Moreover, I think "sysop for life" adds to the notion that being a sysop is some kind of honor, it should be a set of tools that people have when they need them and don't when they don't. Several retirees have voluntarily dropped the flag when their work was done, others seem to just have disappeared. - [The]DaveRoss 23:12, 18 April 2008 (UTC)
- I agree with both points of view. I think it logically makes sense to remove an unused sysop flag after a while, especially since (pace Conrad) there is a bit of a risk of compromise now with the advent of unified accounts; but it just doesn't seem worth the effort to develop formal criteria (or vote on each individual admin) and petition stewards accordingly. —RuakhTALK 04:31, 19 April 2008 (UTC)
- Actually avoiding a bunch of votes is what I had in mind, I figured if we can just say "one year without using any sysop tools indicates inactivity" or some similar basic criteria then we wouldn't have to vote individually. That was basically the criteria we have used in the past. It is also of note that none of the people who have left for that length of time have ever come back, Kevin Rector was pretty close but never actually had his tools removed. - [The]DaveRoss 13:55, 19 April 2008 (UTC)
- <detab> At the heart this is a suggestion for a change of community trust metric. Some projects have chosen to address this with success by either an inactivity limit (if you are inactive x amount of time, the bits are removed) or your position as admin is scheduled for a community reconfirm vote x amount of time after your successful RfA. There may be other solutions as well. - Amgine/talk 14:15, 24 April 2008 (UTC)
- I think what would be good would be something like what Commons has introduced recently, whereby if a user doesn't use any admin tools for X amount of time (6 months I think there), then they are given a note on their talk page that their adminship is under review. If after a further month they still haven't become active again (or given a good reason why they still need the tools), then they are de-sysopped. If they become active again, they can reapply for adminship with a much lower threshold - I think the intention is that they get it back if nobody objects within a few days. If someone does object then the application reverts to a standard period and threshold RFA. Thryduulf 15:37, 24 April 2008 (UTC)
- I have divided the admin list by activity, feel free to modify as needed: [4]. Dmcdevit·t 23:02, 18 April 2008 (UTC)
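The Commons-style review that Thryduulf outlines above can be written down as a small decision procedure. The Python below is a sketch only: the six-month and one-month figures are the ones recalled in the thread ("6 months I think"), not verified policy, and the function name, status strings, and day counts (183 and 30) are assumptions chosen for the illustration.

```python
from datetime import date, timedelta
from typing import Optional

# Assumed thresholds, taken from the recollection in the thread.
INACTIVITY = timedelta(days=183)    # roughly six months without tool use
GRACE = timedelta(days=30)          # roughly one further month after the notice


def review_status(last_tool_use: date, today: date,
                  notice_sent: Optional[date] = None) -> str:
    """Classify an admin under the inactivity review described above."""
    if today - last_tool_use < INACTIVITY:
        return "active"            # tools used recently enough
    if notice_sent is None:
        return "send_notice"       # leave a note on the talk page
    if today - notice_sent >= GRACE:
        return "desysop"           # still inactive after the further month
    return "under_review"          # notice given, grace period running


today = date(2008, 4, 24)
print(review_status(date(2008, 3, 1), today))
print(review_status(date(2007, 5, 5), today))
print(review_status(date(2007, 5, 5), today, notice_sent=date(2008, 3, 1)))
```

The lower-threshold re-application for returning admins is not modelled here, since it depends on whether anyone objects rather than on dates.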
Display of attributive use of nouns
We often have nouns that are used attributively, but which do not seem to warrant an entry as an adjective because they do not have enough attributes of an adjective. amazon is an illustration with an unresolved RfV. We need, IMHO, some way of indicating that a noun can be used as an adjective. I would have in mind its use not necessarily for all nouns, but at least for those that have gone through an RfV process that has determined that the adjectival use of the noun does not warrant an adjective PoS. This might discourage the reentry of the adjective PoS and provide helpful information to users.
MW3 indicates such usage by : "often attrib".
Some options I can imagine are:
- No Adjective PoS header
- inflection line label along the lines "(often|sometimes|rarely) used as attributive adjective"
- a "Level 4" heading under a Noun heading: "Adjective use", with text as above and usage examples
- a link to citations page with a standard heading on the citations page for attributive use of a noun.
- a label on the definition line (per Visviva)
- a templated usage note (per Visviva)
- Adjective PoS
- a standard bit of explanatory text on attributive use of nouns and a link to an Appendix or WP article with more.
Has this been addressed before? Was it resolved in the negative or left open? DCDuring TALK 17:58, 19 April 2008 (UTC)
- And also possibly for those that are often used attributively (or better - for all nouns), when there is no corresponding English adjective (in -ic, -an...) meaning "of or pertaining to <noun sense>", to have an optional adjectival translation in noun's ====Translations==== section. Quite a lot of FL adjectives don't link properly to base English forms because of that, and have "of or pertaining to" stubs. It would be much easier to use [[noun]] (''attributively'') instead (or standardise this typical usage via some template). --Ivan Štambuk 18:12, 19 April 2008 (UTC)
- At the moment I would tend to favor option 1.4, a label on the definition line. A label on the inflection line is problematic, since one sense may have a much stronger attributive tendency than another. Actually I prefer option 1.5, a templated usage note in the noun section, but this has met with opposition, apparently on the grounds that too many electrons would be consumed. ;-) -- Visviva 05:04, 20 April 2008 (UTC)
- I have added the suggestions to the list above. DCDuring TALK 09:53, 20 April 2008 (UTC)
- I like the idea of a templated usage note, particularly since attributive use may apply to more than one sense of the noun. The templated note could include a link to an Appendix:English nouns, specifically to a section on attributive use. I would also include a link to a special section of the Citations namespace page associated with the entry, demonstrating attributive use. --EncycloPetey 15:25, 20 April 2008 (UTC)
- I like 1.4 and 1.5, and see no need to choose between them. The more, the merrier. :-) —RuakhTALK 16:13, 20 April 2008 (UTC)
- To clarify and develop 1.5 a little further: Under the heading "Usage notes", under the "Noun" PoS, we would have a template available (not mandatory) for insertion containing a link to the Appendix section referred to by EP, with text that said "Used attributively as an adjective", with attributively being the link word to the Appendix section.
- Please feel free to suggest modifications, radical or minor.
- One of the advantages of keeping the adjective PoS header is that a user who was looking for a usage that seemed adjectival for a word that had many Noun definitions would be able to click on Adjective in the ToC and go right to an abbreviated section that referred the user to the noun section with the templated explanatory text. Putting something on the Noun inflection line is better than forcing the user to page down for a usage note that the user didn't know s/he needed, but doesn't appear on the first screen. DCDuring TALK 16:46, 20 April 2008 (UTC)
I'm glad to see this brought up again. Previous attempts at a solution here, here, and here. It's clear to me that these nouns are not in any way adjectives, and referring to them as such doesn't cut the mustard. To me the test is if they are exchangeable with other adjectives. Her "voluptuous, attractive, amazon physique" could not be rendered "her attractive, amazon, voluptuous physique." If it could, and make sense, it's made the crossover to adjective. You read "amazon physique" as a unified noun phrase. That's what it is. Not an adjective. The Germans usually make one word out of it -- that's another test. Other obvious tests that go along with this are the predicative: ("Her physique was amazon.") and comparative forms ("more amazon") that aren't figurative/humorous/a colloquial slip of the tongue for "amazonian". As for a solution, see what I did at satellite, senses 5 & 6. This thing has a name, Noun adjunct. I once thought a templated usage note was the best idea, but the Noun Adjunct tag (preferably blue-linked to its own appendix with thorough explanation) is the most appropriate to our format while getting to the most truth. -- Thisis0 16:53, 20 April 2008 (UTC)
- I don't think of a PoS header as an assertion of what something "is" (although that might depend on.... Oh, never mind.). An Adjective PoS header that was immediately followed by an explicit assertion that the word should not be considered a true adjective would probably serve to prevent users from going the wrong way with it.
- The term "noun adjunct" does not have the advantage of being widely understood without clicking on a link (which link is not yet present at satellite).
- It really depends on whether we are trying to create a dictionary that is primarily intended to be a map of linguists' current understanding of language or something helpful to learners and non-linguists, antiquated concepts and all. I think we have to build more bridges to the benighted minds of the mythical anon users, about whom we know so little, but who are the source of future registered users and contributors and the ability to win funds from users and grant-givers. DCDuring TALK 17:24, 20 April 2008 (UTC)
- An erroneous adjective section only reinforces the idea that these are, or might be, adjectives. There is clearly confusion surrounding this issue. It's our job to properly categorize and define language so it can be properly understood. "Dumbing it down" for the mythical anon is not at all productive, accurate, or desirable. We aim to have clear, usable definitions that inform the most casual user, and also more informative categories and tools for those who care to make one click. The limits of this dictionary won't be set by the least concerned user. If you fear offending him, why are the esoteric multi-Latin Etymologies at the top of every entry? Certainly that is more oblique at first glance than a Noun Adjunct tag or, on another topic, plurale tantum. -- Thisis0 17:43, 20 April 2008 (UTC)
- I would not object to hiding the Etymology and Pronunciation sections under show/hide bars or moving them out of precious first-screen space. It is hardly a question of "dumbing it down" to treat the mind of our archetypical user as something other than a tabula rasa. I don't think of users who haven't spent much of their life on grammar as dumb even when they don't speak or write to my taste. We need to accept the realities of their prior education and other experience.
- The vast majority of users think of "Adjective" (when forced to think of it at all) as meaning modifier or describer of a noun. They do not differentiate attributive vs. predicative usage, comparability/gradability, let alone other more subtle attributes. I don't see what beneficial goals we achieve by adding the additional conditions if by so doing we limit a user's access to the most basic information that might be sought. It should be clear from Feedback that users don't find all that we do helpful. DCDuring TALK 18:29, 20 April 2008 (UTC)
- Can I tell you what doesn't make any sense in what you just said? First you say, (unafraid to use an esoteric term, I should point out) that we would do best to approach our users as a tabula rasa ("blank slate"), but then you are wanting to operate on a premise that they do have preconceived notions that 'noun modifiers must be adjectives', etc. Which is it? I agree with 'blank slate', believing we should impart complete and correct information. Second, you twisted my intent for "dumbing it down", assuming I somehow said non-grammar-philes are "dumb." No way. On the contrary, my point is that average users are intelligent enough to digest an accurate grammar tag, and we should not ever assume they are "too dumb to get it." This seems to be your assumption.
- Problems with your solution (Adjective POS with note saying "it's not really an adjective"): 1) That's dumb. 2) If a casual user has any defining characteristic, it's a tendency to glance or skim; a misleading adjective POS for a non-adjective is wrong. 3) It takes up a lot more room. 4) It separates definable noun senses from the Noun POS. 5) It doesn't make any more sense to a non-interested user than an appropriate, succinct tag. Yes, there is no Appendix:Noun Adjuncts currently, but there will be. This approach (as in satellite senses 5 & 6) is accurate, succinct, non-intrusive to the uninterested user, and educational to the interested user. Please, please, can the conversation be about this, and not about how we should make this place to cater to the least user. That's what happened discussing plurale tantum, and I do not want this one to fizzle out 'cause it just turns into you and me bantering about how you want to appease the least user. I really want to hear what others think about the proposed solutions. -- Thisis0 19:37, 20 April 2008 (UTC)
- I believe that it is our job to cater to "ignorant", impatient users first and other users (or the same users when they have more time) later. Esoteric terms seem fine for this forum, not for the target or our basic entries' first screens. If we are any good at language, we should be able to figure out how to "dumb it down" without doing violence to a deeper and more subtle understanding.
- I think our users' pre-existing understandings are the facts of life that we must accommodate to serve a non-elitist version of the mission of WMF. The first skim of the ToC is the first place that we can lose users of our longer entries. If they are looking for something that behaves a lot like an Adjective in the most central way (modifying nouns) and don't find Adjective, they will most likely go to another dictionary and be annoyed at us. Neither outcome will increase the chances that they will click on Wiktionary again. Like you, I had thought we ought to be able to count on our users to know enough about the language that we could completely dispense with an Adjective PoS for Nouns where the only adjectival use was as an attributive. But, 1., seeing that contributors often insert Adjective PoS sections after Noun sections and, 2., examining dictionary definitions of adjective have led me to question my own beliefs and preferences. My thought about using an Adjective PoS was that we could direct users from an Adjective PoS heading to both the Noun (for definitions) and to a helpful explanation of attributive use of nouns. I simply don't see how that inherently constitutes a problem. It might not be the best solution, of course. DCDuring TALK 20:15, 20 April 2008 (UTC)
- I think a separate 'Adjective' PoS would be unnecessary. As long as we mention somewhere (definition line, or usage notes) that it can act similarly to an adjective our entry should jibe with what the user was maybe expecting. As for preempting users from creating an 'Adjective' PoS, well as long as we use standard template(s) we should be able to flag entries that have an attributive noun sense, and a separate Adjective PoS, and someone can clean up afterwards. That being said, I like the combo of a definition-line 'context'-like tag/template and a templated usage note. To make sure users understand the entry, we can worry about the exact wording later. --Bequw → ¢ • τ 20:10, 20 April 2008 (UTC)
- I don't think that contributor creation of an Adjective PoS is a problem that has to be controlled and corrected as much as it is a concrete demonstration of how non-expert users look at PoS. It seems that if they know a word to be used in an adjectival way to modify a noun, then they think that a dictionary ought to show it as an adjective. If someone has facts that say, for example, that non-expert users have a category called two-word nouns and do not expect that the first of the two words is likely to be in a dictionary under Adjective, then I could put my concern to rest. I would even settle for a good sample of what ESL and grammar books would say about attributive use of nouns.
- My Longman's DCE (for learners) doesn't note attributive use in individual entries at all. My MW3 (unabridged, US) has a generous number of nouns marked often attrib immediately after n as well as having separate entries for near-SoP phrases like "beer hall". DCDuring TALK 00:26, 21 April 2008 (UTC)
- I agree that the creation of Adjective POS headers is a sign of a problem with our current approach (as are some of the messages received on WT:FEED), and will be a useful metric for any solution. But note that the current approach is to have no special notice of attributive use. If we start using usage notes and/or labels, particularly ones that contain the word "adjective" somewhere, I expect that user confusion (and the ensuing creation of spurious Adjective sections) will drop substantially. The proof of the pudding will be in the eating. -- Visviva 06:06, 21 April 2008 (UTC)
- So, a usage note in the noun PoS is one thing that we might be able to agree on. It doesn't seem to require a vote, AFAICT. To generate a real test, we would probably need to find numerous entries of the following classes:
- Noun PoSs that have Adjective PoSs under the same heading (to prevent senses being added in addition to the existing presumably appropriate Adjective senses).
- Noun PoSs that have had Adjective PoSs added under the same etymology, where that Adjective PoS has since been removed.
- Is the version of 1.5 laid out above after Ruakh's comment the best we can do? I wish I felt that we had a real metric: an actual share of additions of new Adjective PoS sections to English Nouns as a percent of total new PoS creations as well as a listing of the entries involved so we could make sure there wasn't too much large-scale irrelevancy. Can we flag entries that are having new Adjective PoSs added to existing noun PoSs (same English Etymology)? DCDuring TALK 10:36, 21 April 2008 (UTC)
You are all aware that the translations of these "noun adjunct" senses of nouns will actually be adjectives in most languages, even in Old English (the ones that have usually retained genders, have distinctive adjectival inflection etc.) ? --Ivan Štambuk 07:20, 21 April 2008 (UTC)
- Actually, look at the current translations at satellite. Other languages have different inflections for the compound-forming nouns, but they don't usually become adjectives. That's actually part of the reason I favor calling them what they are. Other languages know they are nouns, and actually have an inflection case for compound-forming nouns. Yes, some languages will have these as adjectives, but words that are nouns in English should be called that, and you'll find many other languages agree. -- Thisis0 07:30, 21 April 2008 (UTC)
- Well, I can tell you that every single English "noun adjunct" translated in Slavic languages (usually with -ni/-ski suffix) would be a classifier-type adjective, that Czech translation included. Lexical content of a first noun is used as a qualifier for the second noun, and every language that 1) has the above-mentioned properties 2) favours adjective-noun vs noun-noun constructs (that is, not like modern German) would pretty much always use adjectival translation. Tbot-generation of entries from translation tables would have to be disabled for these "noun adjunct" senses. --Ivan Štambuk 07:49, 21 April 2008 (UTC)
- Why? Because the part of speech doesn't match between languages? If we did that, then we wouldn't be translating the names of languages. Translations are about translating, and understanding that the grammar in the target language may very well be different. Besides, in most cases there will not be a separate "noun adjunct" sense. A separate sense for "noun adjuncts" is only useful when the sense (when used as an attributive) is more specific or limited than the noun in general. --EncycloPetey 12:32, 21 April 2008 (UTC)
- yes, and also when it is used frequently in noun compounds (dairy, chicken), and when there would be any potential confusion or desire for an Adjective section. -- Thisis0 17:16, 21 April 2008 (UTC)
- Yes, and Tbot can't differentiate between those. I remember correcting about a dozen Croatian nouns into adjectives generated by Tbot that were incorrectly placed in the translation tables of English language names (whose noun senses are mostly fossilized adjectives anyway ^_^). Mismatching between the basic PoS categories such as nouns/adjectives that almost all the (relevant) languages of the world have is not a good choice IMHO. --Ivan Štambuk 14:42, 21 April 2008 (UTC)
- On the other hand, Spanish (and probably other romance languages) most often translate "<noun1> <noun2>" into "<noun2> de <noun1>". Spanish does have separate adjectives sometimes, but the rearrangement with de (“of”) is more common. --Bequw → ¢ • τ 21:31, 21 April 2008 (UTC)
If you label such things adjective, learners of English, who have studied many grammar rules but don't really know the language, will assume that you can do adjectivy things to them: modify them by very, too, so, use them attributively and predicatively, grade them, etc. The other thing is, it's pretty hard to think of a noun that CAN'T be used attributively. I mean, maybe there are some words that only appear in certain constructions, such as dint in by dint of, or sake, but apart from that... You can even do it with proper nouns. Why would this need to be mentioned at all? (I'm speaking of English nouns here).--Brett 01:41, 24 April 2008 (UTC)
- You're right. You can do it with all nouns. It's a regular property of nouns in modern English. We're just talking about those that are used most commonly in an attributive sense (dairy, satellite, chicken, etc.), those that have an attributive sense with a slightly different meaning (amazon), or anywhere there might be confusion or a desire for an Adjective header. (Unless of course they've made the full crossover to adjective, then that's what they are.) -- Thisis0 02:58, 24 April 2008 (UTC)
- Because our contributors regularly attempt to add adjective PoSs to nouns because they feel that the adjective sense is missing. The proposal at hand is to come up with some way of preventing that and to also direct users to the noun PoS definitions to find the meaning of the adjectival use they might be interested in and to a helpful note explaining attributive use of nouns (and what shouldn't normally be done to such nouns) so they don't waste time looking in the wrong place in the future. I believe there are many users who do not remember this kind of thing or were never taught it. I was one of them, though blessed with the tendency to use nouns attributively without support from any rule. I expect speakers and writers of English to "adjectify" almost any noun they can in all the ways that you seem negatively disposed toward. Sometimes I think that this censoriousness must be much more UK than US (;-)). Wasn't that last just so George W. Bush of me? DCDuring TALK 02:34, 24 April 2008 (UTC)
- I don't think we need to prevent people from creating such adjective sections, nor am I advocating some such mechanism that will prevent or flag such contributions. No. All we are doing is making them more correct and imparting a little educational info. Like Brett said, as long as people think these are adjectives, they "will assume that you can do adjectivy things to them". They don't behave like adjectives because they aren't. Let's just start fixing the most common ones in a simple straightforward manner (Noun sense with tag), and get a good Appendix:Noun Adjuncts or somesuch going. We don't need a software flag or anything. -- Thisis0 02:58, 24 April 2008 (UTC)
- Thanks, I think I understand the situation better now. And, yes, we've run into the same issue at the Simple English wiktionary. Currently, it seems to be under control, but we have only a very small number of editors.
- By the way, I wasn't stating a preference but rather a fact about English. It is ungrammatical to say this is a very faculty office or ask how soccer is your ball? Yes, you can playfully force nouns to be adjectives, but this anthimeria is at a rather different level from what we were discussing.--Brett 12:01, 24 April 2008 (UTC)
The necessity of a new etymology header
Should the verb form entry be under a new etymology header like I have done with mast or is that unnecessary? __meco 08:21, 20 April 2008 (UTC)
- Yes, I think it is necessary. If the verb form were placed in parallel with the noun, this would imply that they share the same etymology, which would be misleading. The extra header is somewhat annoying, but I don't see any way out of it while maintaining a sound ontology. -- Visviva 09:37, 20 April 2008 (UTC)
- I agree that a second etymology header is appropriate in cases like this, and is necessary to avoid misleading users. --EncycloPetey 23:22, 20 April 2008 (UTC)
- I think there is a page somewhere that actively recommends it. Circeus 23:08, 21 April 2008 (UTC)
- To push the matter closer to a conclusion, the entry might have "See [[mase#Norwegian|mase]]" under the etymology. Another possibility is to use the template {{term|mase||lang=no|insert gloss here}}. I also noted that mast's Norwegian section headings were not all at the right level after the insertion of the etymology. If "mase" does not actually have an etymology shown, then the Etymology heading at "mase" should have {{rfe|lang=no}}. Finally, I noted that mase, the lemma entry for the verb, as I understand it, did not have a Norwegian section. Following such trails can lead to valuable new entries when you have the energy and knowledge or reference materials needed. DCDuring TALK 01:31, 22 April 2008 (UTC)
- Pages like this I've reorganized before so that the definitions that don't have etymologies go above all the ones that do. That avoids the problem of an empty etymology section, but it often puts much less important definitions first, so honestly I don't think it would be an improvement over what you have. What we definitely don't want to do is write "Unknown" or the like as the etymology, unless the origin of the word had been thoroughly researched with no conclusion reached. DAVilla 18:59, 23 April 2008 (UTC)
- The priority would be to get the lemma form of the Norwegian verb entered, I would think. DCDuring TALK 19:47, 23 April 2008 (UTC)
Dutch gender
At long last we have a policy on this at the Dutch wikti, or at least I have proposed one and nobody objected. With as few as we are that is pretty much law. I have tried to explain the situation and its most reasonable remedy at Wiktionary:About Dutch#Gender and had a bit of a discussion with Visviva. I encourage the anglophone community (including its Dutch speakers, mothertongue or no) to support us in the chosen solution. It is admittedly a compromise, not of my making but that of the Taalunie. I must say though that the latter body has done a pretty good job imho. Jcwf 21:39, 20 April 2008 (UTC) nl:Gebruiker:Jcwf
- Thanks Jcwf for taking this on. I know absolutely nothing about the background issues here, but deferring to the Taalunie seems like the most sensible option. (This would, I guess, mean barring "common" from inflection lines.) Perhaps {{nl-noun}} could also link to an appendix where these issues are discussed? Word-specific details could be discussed in Usage notes (or Etymology), as and if appropriate. -- Visviva 23:38, 20 April 2008 (UTC)
- I'm a Flemish speaker and I didn't know the northern Dutch situation well.. In the entries I made, I looked to the gender Van Dale uses, and if there wasn't any I used {c} or {m|f}, but I support this proposal and I'll now use {f|m}. SPQRobin 15:20, 21 April 2008 (UTC)
- This seems like an excellent solution, giving information rather than dictating how an individual speaker should speak their own language. I think we also need an appendix where this is explained, as many (if not most) Dutch courses for English-speakers use northern Dutch and the terminology of "common gender". Physchim62 17:10, 21 April 2008 (UTC) (non-native speaker, level nl-2 on a very good day!)
Category for agent nouns?
Would anyone mind terribly if I created a Category:English agent nouns, and categorized accordingly? bd2412 T 05:12, 21 April 2008 (UTC)
- Not I. Seems a meritorious act. -- Visviva 05:53, 21 April 2008 (UTC)
- So, the category is for Bond, M, and Q? ;) --EncycloPetey 12:28, 21 April 2008 (UTC)
- Do we have an entry on too cute by half? bd2412 T 14:16, 21 April 2008 (UTC)
American
Folks who also play on Wikipedia may be interested in commenting on w:Wikipedia_talk:Manual_of_Style#American, regarding use of the term American to mean "United States". --EncycloPetey 12:27, 21 April 2008 (UTC)
Gaps in entry titles.
Do we have a good way of representing gaps in entry titles? Like, too … by half probably warrants an entry, which too and half should link to; but what should its title be? —RuakhTALK 17:18, 21 April 2008 (UTC)
- I think we just hope like crazy that we can always split it into connected parts. "[too clever] [ by half ]" works for me, but I appreciate this ignores the main issue. Conrad.Irwin 17:21, 21 April 2008 (UTC)
- Well, it was just a few months ago that we finished deleting all of the "X the Y"-type entries, so I'm guessing that wouldn't be the preferred approach this time (though it does seem logical). As I recall, a primary justification for deleting those was that no one would ever look them up -- something which I'm afraid would apply to pretty much any other way of representing these. This is part of our larger difficulty in handling collocational information, I'm afraid. An interim step would perhaps be to have an Appendix: page detailing the behavior of the given frame (Appendix:Too X by half?), housing various and sundry usage examples. - Visviva 09:06, 22 April 2008 (UTC)
- Formulas for constructions could be permitted in any space outside of principal namespace where mostly more experienced users roamed. It would help if we had some agreement on which space had which kind of content. Appendix space would seem like a good place, but perhaps a more entry-like space that allowed constructions that used a Wiktionary-standard notation would be useful. Perhaps there is a suitable commonly used notation that we could appropriate. Such "entries" might be useful link targets from principal namespace. DCDuring TALK 10:58, 22 April 2008 (UTC)
- I doubt not looking them up that way is a good enough reason to delete scare the X out of, which would be found in searches or as a derived term. Sure, scare/frighten/knock the living daylights out of/the wits out of/... could be reduced to living daylights out of, wits out of, etc., which is a better way to handle those. And sure, no one would ever look up X like Y. However, there are already tens of hundreds of entries with "one" or "someone" as placeholders that would be found the same way as scare the X out of. No one would ever look up one and one's either.
- The question here I think is what to use as a general placeholder for an adjective. I would propose "thus" or "such" as options, but I'm not sure even "do" is used as a placeholder, and it seems like there's a bit of stigma against creativity, which would be very unfortunate. Nonetheless, we already have placeholders, exactly as do traditional dictionaries, and there's a lingering broader question of how to demarcate them. It's not apparent in the title, but I've generally made an indication in the entry itself by italicizing the marker in the heading instead of bolding it. For instance, compare take someone's point with someone else, up one's alley with '''one's self'''. But I seem to be in the minority, since some people like to not only bold the word, but also uselessly link it.
- On the other hand, Hippietrail pointed out that italicization is ambiguous in contexts like nth where the italicization is normal for part of the term. So... maybe there is a better solution for the heading, or maybe the solution requires rethinking the titles. While mind one's p's and q's would imply that "his" or "her" could be substituted, mind one's p's and q's is actually used that way very commonly, so to some extent, both entries are needed. Probably a usage note, or even just a couple of examples, one with "mind one's p's and q's" and one with "mind your p's and q's", are enough. Otherwise, how would you know?
- We don't generally use "..." in titles even for unclosed fragments, I think primarily at the insistence of Connel, who has argued against even such punctuation as (s)he and s/he. While I strongly disagree with the principle that including punctuation is always incorrect, and in fact find the wiki software too limiting, I'm fine with eliminating it when it's superfluous. However, I'm not sure that "..." is always superfluous. It isn't polite to say, by itself, "I'd like to know." It's only polite if it precedes something else. And anyways, what happens when we take an expression like that and translate it into another language where the "..." goes in the middle?
- "Too by half" needs "..." or something else in there. Never having heard the expression before today, I definitely think it warrants an entry somewhere that can be searched from "by half". However, I don't think that saying by half is good enough is good enough in the general case. This is obviously a much more general problem. DAVilla 21:45, 23 April 2008 (UTC)
- Appendices and similar places outside namespace 0 are useful for information not suitable for entries for various reasons, but are not very useful for inexperienced normal users using our search box. Notation algebra isn't going to work for them either. The best we can do for them with current software without silly proliferation of phrase entries is to have good usage examples and default-searchable citations that contain the search words they are looking for in such a way as to bring the best entry to the top of the search. Without better search, this kind of entry won't be found very often. I wonder how many actually look up this kind of article -- and how many find it. DCDuring TALK 08:43, 24 April 2008 (UTC)
- We could take 4 or 5 examples representing typical problems and try to analyse our way to a/some solution(s). I was thinking about don't come the X with me. bgc gives the first 5 options for X as "acid", "raw prawn", "cowboy", "tin soldier", and "orator". It will be impossible to give a typical X for this phrase, as it takes almost any noun phrase you can think of. What would a schoolchild enter if s/he came across don't come the tin soldier with me and wanted to understand it? If we can analyse that to a solution, and do the same with too clever by half, and some other specific examples, we will be well on the way to finding an answer. Regarding the too X by half, I must admit that I lean towards an entry at by half. It seems to be the logical first search, and should come up in the search list for approximate entries. By the same reasoning, perhaps an entry at don't come would also work? I find it useful to analyse to find the smallest "chunk" of meaning. -- Algrif 13:02, 24 April 2008 (UTC)
- If the schoolchild knew anything about the internet, s/he would enter the phrase in Google rather than Wiktionary's search box. (and if the schoolchild didn't know anything about the internet, s/he probably wouldn't know about Wiktionary either). If we had a Concordance:Don't come the N with me or similar, including a "tin soldier" use among others, that would presumably appear somewhere in the results (though not prominently, at least not until our content improves to the point where people actually start linking to us). On reflection I think Concordance: makes more sense than Appendix: for phrasal template entries, at least in most cases; of course that will require allowing a bit more content in concordance pages than we have done heretofore. -- Visviva 13:14, 24 April 2008 (UTC)
- I like both of the above.
- The chunking approach is immediately feasible and may help some users right after implementation.
- Concordance space would be good for this if it were part of our default full-text search or of a fall-back if the namespace-0 didn't have results.
- Usage examples, usage notes, and namespace-0 citations give more searchable material both for Google (???) and our own full-text search.
- I wonder if these would also increase the number of hits we would get from Google. The more entries only we cover, the more often we are on the top of their search results, the more click-throughs we get, the better we do in their algorithm (a virtuous cycle). Google drops very common stopwords unless linked by hyphen to non-stopwords. Phrases/chunks/formula(e/s) that have only stopwords are not going to be found via Google. If we could figure out a way to get people to come to Wiktionary for constructions involving mostly stop words, we would be offering something that might win us a certain type of user who would, of course, become a loyal fan because of our superior content. DCDuring TALK 17:35, 24 April 2008 (UTC)
- I note that some phrasal templates -- or at least snowclones of a sort -- have been put on Wikiquote by our friend BD2412, using the X-Y notation. The man can cite! See for example wikiquote:An X among Ys, a Y among Xs. These do seem to get good Googles, for what it's worth; the Wikiquote page for X me no Xs was #4 in a search for "but me no buts," just above the first actual scholarly treatment. Personally I would prefer, assuming we are going to have these on Wiktionary in some form, that we use a more linguistically-aware notation such as NP/VP or simply N/V/etc., as "N me no Ns." But first, perhaps we should consider what value we can provide that Wikiquote cannot. -- Visviva 11:00, 27 April 2008 (UTC)
- Neither the X/Y type nor the NP/VP type notations are going to serve inexperienced users well. How someone might learn of the possibility of such a search is not at all clear to me. Everything we put outside main namespace has second-class citizenship and will not be found by those who are not adepts. I suppose that there is value in having such rewards for becoming an adept.
- The value we create would simply be that we helped meet an expectation that someone had about what should be in a dictionary. I believe that dictionary users want help in understanding odd constructions. Certainly most dictionaries have some kind of grammar and usage content. Neither WP nor Wikiquote would be my go-to Wikis for grammar and usage information. DCDuring TALK 11:45, 27 April 2008 (UTC)
- Well, insofar as we are an online reference work (and that has to be how at least 99.9999% of our current users use us), most people are going to come to any given page through a search engine, portal, or direct link, not by going to the main page and typing in the search box. So in the case of a snowclone, they will probably find the page by searching the web for information on a particular instance of that snowclone, in the same way that I found BD's pages on Wikiquote. It would never have occurred to me that Wikiquote might have such a page, but in the glorious world of Web 2.0, that was irrelevant; Google did my thinking for me. Appendix pages are indexed by Google et al., so that shouldn't be a concern for us. Likewise people will be able to find the content regardless of the page title. -- Visviva 12:04, 27 April 2008 (UTC)
- It would be nice to know how users actually get here. How many are just from sister project links? I don't think we can rely solely on individual-page attractiveness. We deliver branded information to a certain extent, so that users may select us from a search result page because of the good things that have happened to them on our site in the past. I would think that we would want to offer some kinds of search that Google doesn't (and can't) offer. Their orthography limitations are an opportunity. And so too might be some kind of grammar-restricted searches with "variables". "NP1 NP2 no NP1s"? Could we use categories (visible or invisible) to go in this direction for idioms and constructions? DCDuring TALK 12:26, 27 April 2008 (UTC)
Have we come to a decision about how we treat these sorts of entries? When trying to find if we have any sort of entry for the "s/x/y/" type of self-correction notation used frequently by those familiar with regular expressions (if we do, I haven't been able to find it.), I stumbled upon X one's Y off. Thryduulf 01:13, 2 May 2008 (UTC)
- Certainly not yet. I would favor Visviva's NP, VP, N, V formulation for anything that didn't fit the one('s)/someone('s)/something('s) approach. That won't cover animate/inanimate and other more semantic categories without modification, but linguists must have suitable vocabularies for such distinctions that we could try out. DCDuring TALK 01:33, 2 May 2008 (UTC)
- If the way of filling the blanks is kept simple, then users will be able to form them easily once they see one example, which they will eventually see somewhere in Wiktionary (synonyms, see alsos, translations, search results, maybe redirects for common searches for particular terms). They may also see it on the information desk. "User: How is (something with gaps) used? Wiktionarian: Look at (link). User: Oh, they are formatted like that, neat." I doubt that NP, VP, etc. are simple enough. -- Coffee2theorems 12:20, 3 May 2008 (UTC)
- I agree. —RuakhTALK 13:05, 3 May 2008 (UTC)
The simplest possible thing would be to pick a sequence of characters that is used for every gap, i.e. instead of "X one's Y off" you'd have e.g. "... one's ... off", "* one's * off", "? one's ? off", or some such. Using non-letters would be best (if the software allows it..?), because the choice is more clearly unique. With letters there are choices such as upper/lower case and choice of letter ("X one's X off" and "X one's Y off" would seem equally plausible ways of generalizing "too X by half" to me), whereas e.g. "..." doesn't suggest any alternatives. It would also be a plus if the characters can be easily typed ("… " can't, "..." can). -- Coffee2theorems 12:23, 4 May 2008 (UTC)
- I agree, though I think … (horizontal ellipsis) is fine provided ... (three periods) is a redirect. —RuakhTALK 15:37, 4 May 2008 (UTC)
- What expressions are there that have the same variable in two positions? I can think of a few in English, but there must be more. "X me no Xs" is almost an unfair example, having no restrictions on X other than it being mainly a noun. The switch in PoS defeats even Visviva's approach. "X after X" (periods of time > hour; distance; repetitive task; or object of repetitive task; indeed, anything repetitive) and "X in, X out" (day, week, month, year) are the two cases that first came to mind. "X upon X" and "X by X" are similar.
- I personally would prefer an approach that could handle these cases and that made clear PoS restrictions, which are common, but not always obvious from the ellipsis approach. That is a significant advantage of what Visviva had offered. We may need a more flexible framework to reflect the particular restriction on the variables such as (animate, human or near-human, mass noun, countable noun, etc.) DCDuring TALK 16:29, 4 May 2008 (UTC)
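To make the single-gap-marker idea concrete, here is a minimal sketch in Python of how a title such as "too ... by half" could be matched against a phrase a reader actually met. Nothing here is existing Wiktionary or MediaWiki functionality: the marker "..." follows Coffee2theorems' suggestion, and the names GAP, frame_to_regex and match_frame are hypothetical.

```python
import re

GAP = "..."  # assumed gap marker; "…" or "*" would work the same way


def frame_to_regex(title):
    """Turn a title with gaps into a regex with one capture group per gap."""
    literal_parts = [re.escape(part) for part in title.split(GAP)]
    # Each gap must be filled by at least one character.
    return re.compile("(.+?)".join(literal_parts), re.IGNORECASE)


def match_frame(title, phrase):
    """Return the gap fillers as a tuple, or None if the phrase does not fit."""
    found = frame_to_regex(title).fullmatch(phrase)
    return found.groups() if found else None
```

With these assumptions, match_frame("too ... by half", "too clever by half") yields ("clever",) and match_frame("... one's ... off", "laugh one's head off") yields ("laugh", "head"). As DCDuring notes above, a bare marker cannot say that two gaps must hold the same word, nor state PoS restrictions; that would need named variables rather than an anonymous ellipsis.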
Pending an eventual resolution to this, I've started a list of entries that we should add when we decide how. The list is at User:Thryduulf/phrasal entries with variables, feel free to add any others you think of. Thryduulf 10:27, 12 May 2008 (UTC)
Search enhancement
Typing in the search box now shows you the words we have that match your typing. I like it! However, I notice that newly added words don't show up - is it running off of a preproduced list? SemperBlotto 08:27, 22 April 2008 (UTC)
To answer my own question - no, there is just a short time delay - excellent. SemperBlotto 08:37, 22 April 2008 (UTC)
- Marvelous. I'd always wondered if that sort of functionality would ever come to us. The responsible dev(s) deserve a signed thank-you note. -- Visviva 09:08, 22 April 2008 (UTC)
- That's awesome! (Though sadly, it gives our "misspelling of" entries greater potential to be detrimental. :-/ ) —RuakhTALK 14:31, 22 April 2008 (UTC)
- Excellent. It's a very good step, keeping us competitive with the well-funded sites. Something like "soundex" search or a list of aliases would be a wonderful next step for the many instances where a user doesn't enter a spelling we have. It might be more important for us than for others in the WMF ambit. DCDuring TALK 14:41, 22 April 2008 (UTC)
- Yeah, pretty neat. I'm not sure how soundex works, but definitely it would be good to allow people like Hippietrail to tamper with this, letting it search the "did you mean" results instead of just the page titles.
- One problem though, it's rather a pain to do a search when the list drops because it covers the search button. It only applies if what you've typed is a prefix to something else, but that condition is pretty easy to meet. Maybe it could "drop" up instead of down? DAVilla 21:55, 23 April 2008 (UTC)
- Is this CSS-adjustable? -- Visviva 13:26, 24 April 2008 (UTC)
- It's miraculous enough that this much has happened; I wouldn't put too much hope in the possibility of future improvements of the same kind. (For example, I wouldn't assume that the DidYouMean extension will be approved this year or this decade.) But in any case I don't think that w:soundex would work for the default search box, since the soundex algorithm is limited to English. On a hypothetical future version of Special:Search, with language selection etc., it would be a great addition. But if we want something like that, which involves front-end functionality rather than anything at the content end, it probably makes the most sense to set up a demo mirror of our own. Got cash? -- Visviva 13:26, 24 April 2008 (UTC)
- If we could get our usage up, perhaps we would be more useful to WMF for fund-raising and more "deserving" of technical attention. WMF did get $500K from the Sloan Foundation recently. I wonder what we could do that would help in that regard in terms of identifying funders whose interests coincide with where we might want Wiktionary to go. UK Prime Minister on recent visit to US spoke about the English language as a tool of joint national interest with US. Maybe govt. money has too many rules for WMF and is seen as tainted and insufficiently international, but there should be suitable funders somewhere. DCDuring TALK 17:16, 24 April 2008 (UTC)
Well, it seems to me silly to re-invent wheels (however fun it is ;) so I have wrapped aspell in some python and added a callback to WT:PREFS. If you want to test this feature then go to WT:PREFS and enable "aspell on http://devtionary.org (WARNING...". This is not a feasible long term solution, and if aspell turns out to do the right thing then I will implement it as a proper extension for MediaWiki. The javascript code is User:Conrad.Irwin/aspell.js. Known problems with the current implementation: it is very slow (this is because devtionary has to query the wiktionary API to provide colourful links, aspell is plenty fast enough ;), and it only supports English (this is a limitation in the current installation, not aspell or the python script in general). I would appreciate comments on how well aspell performs, and any other ideas people have for doing this kind of thing. Conrad.Irwin 10:53, 25 April 2008 (UTC)
- To test what this does, visit a misspelled page (i.e. http://en.wiktionary.org/wiki/alhpabet ) or do a full text search for a word (i.e. Special:Search/hunderd ). Conrad.Irwin 12:26, 25 April 2008 (UTC)
"misspelling of" template
The {misspelling of|} template does not allow wikilinks within the template. Entries with this template are usually simple and do not contain other wikilinks, therefore they are not included in the page count. Is this by design? --Panda10 00:00, 23 April 2008 (UTC)
- Yes, this is by design. A number of editors here feel that misspellings aren't really words anyway, so they ought not to count towards our total number of entries. --EncycloPetey 00:02, 23 April 2008 (UTC)
alternative spellings of only some sense
nonpartisan had an alternative spellings section for non-partisan and an adjective section. Then I added a noun section. But the alternative spelling only goes with the adjective (I think). How do we indicate an alternative spelling for only one POS? RJFJR 16:35, 24 April 2008 (UTC)
- I'm pretty sure that there was recent discussion about this. BP, TR? DCDuring TALK 18:26, 24 April 2008 (UTC)
- IIRC, when we voted on the order of L4 headers, this case was considered. The Alternative spellings header may occur at L4 when it is specific to only one part of speech. --EncycloPetey 21:40, 24 April 2008 (UTC)
Election Notice
The 2008 Board election committee announces the 2008 election process. Wikimedians will have the opportunity to elect one candidate from the Wikimedia community to serve as a representative on the Board of Trustees. The successful candidate will serve a one-year term, ending in July 2009.
Candidates may nominate themselves for election between May 8 and May 22, and the voting will occur between 1 June and 21 June. For more information on the voting and candidate requirements, see <http://meta.wikimedia.org/wiki/Board_elections/2008>.
The voting system to be used in this election has not yet been confirmed; however, voting will be by secret ballot, and confidentiality will be strictly maintained.
Votes will again be cast and counted on a server owned by an independent, neutral third party, Software in the Public Interest (SPI). SPI will hold cryptographic keys and be responsible for tallying the votes and providing final vote counts to the Election Committee. SPI provided excellent help during the 2007 elections.
Further information can be found at <http://meta.wikimedia.org/wiki/Board_elections/2008/en>. Questions may be directed to the Election Committee at <http://meta.wikimedia.org/wiki/Talk:Board_elections/2008/en>. If you are interested in translating official election pages into your own language, please see <http://meta.wikimedia.org/wiki/Board_elections/2008/Translation>.
For the election committee,
Philippe Beaudette
trans gloss in morna
This word is listed in Category:Translation table header lacks gloss even though there is a gloss. The structure looks fine to me. Can you take a look? What is it that I don't see? Thanks. --Panda10 11:00, 27 April 2008 (UTC)
- Fixed; just a top where a middle should have been. -- Visviva 11:02, 27 April 2008 (UTC)
- Thanks! --Panda10 11:32, 27 April 2008 (UTC)
I recently made {{compound}}, modeled after {{suffix}}. I think we ought to promote these templates, since they offer a possibility to keep etymology sections consistent and uniform. However, maybe a little more information than just a ‘+’ would be good. Therefore here a call for better wordings for those templates, maybe in the style of belofteploeg, where I didn’t replace the etymology by the template (yet).
Hoping for your input (but feel free to implement it yourself, I’m not frequenting this page anymore)! H. (talk) 17:17, 27 April 2008 (UTC)
- Comparable to {{blend}}, which has a more specialized role. Might allow more automatization of derived terms and, with lang parameters, enable identification of macaronics. DCDuring TALK 16:43, 4 May 2008 (UTC)
Petition on Meta
Hello,
I would like to notify you of a petition against the recent decision by the board to reduce community representation. Please find it here. I am sending this message to most English Wikimedia projects as I think it is important the community is informed. If you have any questions please ask me at my Wikinews talk page.
Thanks,
Anon101 (on Wikinews) 20:23, 28 April 2008 (UTC)
(Note- I did not create the petition)
- That petition gives half of one point of view and no place to voice opposition... Where do the people go who are pleased that the board is looking to add professional voices to the discourse in order to make the most of the contributions that the community generates? I don't like one sided politics. - [The]DaveRoss 20:30, 28 April 2008 (UTC)
- I suppose that periodically or as the occasion warrants, we might remind people that Wikimedia Foundation ("WMF") provides the umbrella for us and all our sister projects.
- For those interested in the governance of WMF and the issues that it deals with, here is the contact information for the mailing list:
- foundation-l mailing list
- foundation-l@lists.wikimedia.org
- List-Archive: <http://lists.wikimedia.org/pipermail/foundation-l>
- List-Subscribe: <https://lists.wikimedia.org/mailman/listinfo/foundation-l>,
- Unless there is an issue that is specific to en.Wiktionary, or wiktionaries in general or all of WMF's en sites, the discussion is best carried out there. DCDuring TALK 20:55, 28 April 2008 (UTC)
Rohingya (cit) split
News flash! SIL has split cit into rhg (Rohingya) and ctg (Chittagonian) [5]. We'll need to update the appropriate templates and Category:Rohingya language. --EncycloPetey 01:10, 29 April 2008 (UTC)
- Now that is unusual, (a split, usually it is additions) apparently because "cit" was an error to begin with? You changed {{cit}} to "Chittagonian", which puts things in a non-existent cat; all of the existing entries label themselves as "Rohingya" (although most are User:Drago ...). I've fixed it to redirect to {{rhg}}, on the way to orphaning it. (This will show up in my language templates table as something to be fixed.) Robert Ullmann 12:06, 1 May 2008 (UTC)
Sorting clicks
- yet another trivial issue on which much heat can be expended ...
I've noticed in working out the implementation of sorting translations tables in AutoFormat that humans have put !Xũ under X, rather than at the top of the table, where the simple code order would place it. This seems reasonable. Do note that the "!" is a click, not punctuation. This would also apply to ǂHõã, ǀXam, etc which would otherwise sort at the end (IPA characters, "ǀXam" starts with an IPA dental click, not a vbar/pipe). And doing the same sort for language headers in an entry. Robert Ullmann 09:55, 30 April 2008 (UTC)
- Makes sense. Dictionary sorting ought to consider letters only, ignoring case, punctuation, and spaces. I realize these clicks are more significant than most punctuation in their native language, but to English-language readers they are not letters. —Michael Z. 2008-04-30 18:51 Z
- On second thought, we are alphabetizing text in many languages and foreign scripts. Is there a native sort order for these symbols? Is there any reason not to use the default Unicode collation algorithm for all places where we have mixed languages? —Michael Z. 2008-04-30 18:55 Z
- That's not exactly true. There are places where we alphabetize text across language and script (e.g. at category pages), but the language names in the translation tables are supposed to be English names in the Latin script. I think we should probably use the Ethnologue name (Kung-Ekoka), but if there's a good reason to use !Xu as our name for it, then we should do so, and IMHO we should ignore the ! in collating, just as we do spaces and punctuation and whatnot. (Likewise if there's a good reason to use !Xũ, but that strikes me as unlikely.) —RuakhTALK 20:09, 30 April 2008 (UTC)
- Quite right. (I have since had my morning cup, and see that collation depends on the context)
- Yes, among others, and we use "!Kung" in the language template. There are others besides clicks, such as 'Auhelawa. I've added a line to the collation order lambda in AF. Robert Ullmann 11:56, 1 May 2008 (UTC)
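The ordering discussed here can be sketched as a Python sort key. This is not the actual AutoFormat code, which is not shown in this discussion; the names IGNORED and collation_key and the exact set of ignored characters are assumptions made for illustration. The key drops click letters, apostrophes, spaces and hyphens, removes diacritics, and ignores case.

```python
import unicodedata

# Assumed set of characters to skip when collating language names:
# ASCII "!" plus the click letters U+01C0..U+01C3, apostrophes, space, hyphen.
IGNORED = set("!\u01c0\u01c1\u01c2\u01c3'\u2019 -")


def collation_key(name):
    """Sort key that ignores clicks, punctuation, diacritics and case."""
    decomposed = unicodedata.normalize("NFD", name)
    kept = [c for c in decomposed
            if c not in IGNORED and not unicodedata.combining(c)]
    # The original name is a tie-breaker so the order is deterministic.
    return ("".join(kept).casefold(), name)
```

Used as sorted(names, key=collation_key), this places !Xũ between Xhosa and Zulu, ǂHõã before Hungarian, and 'Auhelawa next to Arabic, instead of pushing them to the top or bottom of the table as plain code-point order would.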
People might also be interested in the description of what AF has been taught to recognize in tables at Category:Entries with translation table format problems, specifically the handling of grouped languages and subsidiary lines with qualifiers, e.g. doing things like:
(at butterfly) where either * or ** can be used with a language name, this makes it easy for applications parsing wikitext as they can treat * and ** identically, and expect the full language name. (And we don't have to have "Greek, Modern" ;-) Subsidiary notes that are not languages use *: as with
(although the Serbian things really ought to just be on one line ;-) The stuff described at the cat page is not policy, just what I've found that seems reasonably structured and useful. AF has been tagging things for a few days to see what will be found. Robert Ullmann 11:56, 1 May 2008 (UTC)
- On the last example, I agree that it should be one line. It's not a transliteration, so it shouldn't be parenthesized, but it could more easily be listed as we do with simple and traditional Chinese:
- Is it really necessary to inform the world that the first are Cyrillic characters and the latter Roman? DAVilla 20:53, 19 May 2008 (UTC)
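The convention Robert Ullmann describes above, where * and ** both introduce a full language name and *: introduces a subsidiary note, is easy for a parser to handle. The Python sketch below is a hypothetical illustration of that, not AutoFormat's code; the names LINE and classify are invented here, and the translation examples from the original post are not reproduced.

```python
import re

# One or two asterisks, an optional colon marking a note, then the rest.
LINE = re.compile(r"^(\*{1,2})(:?)\s*(.*)$")


def classify(line):
    """Return (kind, name, text) for one line of a translation table."""
    found = LINE.match(line)
    if not found:
        return ("other", "", line)
    _stars, colon, rest = found.groups()
    if colon:
        return ("note", "", rest)  # "*:" lines are notes, not languages
    name, separator, text = rest.partition(":")
    if not separator:
        return ("other", "", line)
    # "*" and "**" are treated identically: both carry a full language name.
    return ("language", name.strip(), text.strip())
```

Because both list levels yield the same kind of result, a consumer of the wikitext never has to reassemble a name like "Greek, Modern" from a parent line and a child line.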