Wiktionary:Beer parlour/2007/June
This is an archive page that has been kept for historical purposes. The conversations on this page are no longer live.
Wikisaurus headers
(commenting on previous section)
Um, isn't the idea of a Wikisaurus page to define just one concept, however that might be best titled, and give all the synonyms of that, rather than trying to give all the synonyms of every meaning of any given word? In other words, list color under Wikisaurus:paint (draw) and Wikisaurus:color (hue), but leave out "Wikisaurus:color" as too ambiguous? DAVilla 22:11, 14 May 2007 (UTC)
- Not quite (according to Help:Creating a Wikisaurus entry). Any page will be for a given word form, including all possible parts of speech and meanings. I haven't seen any indication that page names should have parenthetic glosses. However, the idea you're getting at is present in the selection of which title to use for a page. For instance lean would be a poor choice for a Wikisaurus entry page name, since it has more than one meaning and the meanings are not related. Instead, you might choose thin or tilt, depending on your desired meaning. At least, that's the way I interpret the directions left behind by the Editors-of-Yore on the Help page. --EncycloPetey 22:21, 14 May 2007 (UTC)
- Ugh, count me out. Or maybe portapotty, Allen wrench, and automated teller machine then. DAVilla 22:42, 14 May 2007 (UTC)
- The reason for starting WikiSaurus - sorry, Wikisaurus - way back was that we spent ages theorising about how to do a thesaurus. But so many of the theories were based on some very, very simple examples, and in practice were pretty hopeless. So I took the plunge and started Wikisaurus in a simple way, just as a sort of pilot scheme, to get a feel for what we need to tackle; discover the real world problems. So, my view is Wikisaurus is far from set in concrete. In the end we might need to wait for Ultimate Wiktionary (or whatever it is called) to be able to do a really good job, where the atoms are smaller, down at the word & meaning level, in order to do the best job. But we can continue to experiment, understand the problems, refine as we go until then; developing "policy", "how to", "criteria" and "wish list" as we go, as well as the format, scope and content --Richardb 03:15, 15 May 2007 (UTC)
Meanwhile, what I would recommend is to try to find a suitable headword from an existing thesaurus. The headword will presumably have been chosen for popularity/commonality of use, and lack of ambiguity, as briefly explained in Help:Creating a Wikisaurus entry. And don't forget to link the Wiktionary entries to the Wikisaurus, using either {{Wikisaurus-link}} or "see Wikisaurus:xxxword". --Richardb 03:15, 15 May 2007 (UTC)
Couple of things that might help.
- on User talk:Vildricianus he has a scrolling contents window-pane. This would be very useful to tidy up some of the really big Wikisaurus entries - eg: Wikisaurus:penis (sorry to use the sex stuff, but those entries certainly have shown some of the challenges we face.) How can we implement this scrolling contents pane ?
- Ooh, ooh, I like it!! I like it!! Can we have that on the RFV page, which I always have to scroll down several panes just to get to the oldest entries? And maybe on a few other pages as well? Or everywhere, why not? DAVilla 06:15, 15 May 2007 (UTC)
- That was his monobook.css - yes, you can turn that on for yourself, IIRC. --Connel MacKenzie 13:30, 15 May 2007 (UTC)
- Since it is so useful, why not make it the default for everyone of the "average" users who don't go diving into .css at all. --Richardb 15:50, 15 May 2007 (UTC)
- I created a little template, {{Template:Wikisaurus-more}}. See it in use in Wikisaurus:prostitute, which I started to do a bit of clean-up on, separating the "good" from the downright disgusting. (By the way, I think having the page semi-protected is a good idea, with the /more page left for the dross to accumulate in. Or maybe we should use a talk page to accumulate stuff, and promote words to the main page once they meet some sort of criteria.) Oh, and is there any way to protect all Wikisaurus pages from CM? Just kidding. But you did get pretty vicious with Wikisaurus:prostitute, I hope you'll admit! By the way, I'm sure Wikisaurus-more can be improved on. Go for it. And something similar to put in the /more page header would be useful. --Richardb 03:37, 15 May 2007 (UTC)
- No, I cannot admit that. The senses I moved were all either already present on the other pages or one of the following:
- *sperm sponge - One who soaks up Sperm.
- *Stop sign-shifter-referring to the the way a prostitute tends to stay to a stop sign while working
- *cum-guzzler - one whom guzzles male seminal fluids (as opposed to sipping lightly)
- *Sharmota
- Now besides none of those being idioms in the English language, and besides those being trite nonsense, and besides those being pointedly and pointlessly vulgar, they are not real. If anything is to be said of them at all, it is that someone looking for a reasonable synonym for "prostitute" would not do well using one of those terms.
- I do demand, you remove your false and libelous statement about me, vis-a-vis "streetwalker."
- --Connel MacKenzie 14:21, 15 May 2007 (UTC)
- Oops! Very sorry Connel. My mistake. I looked up the history wrong. You are right, it was not you. I've changed the offending references to say "Some over enthusiastic vigilante" instead, but I can't be bothered to trawl through the history to find out who it was. Sorry again !!--Richardb 15:47, 15 May 2007 (UTC)
- It was not "enthusiastic vigilante" it was a responsible sysop acting on the outcome of the relevant community WT:VOTE! You restoring any of that, is what could be considered to be an "enthusiastic vigilante" action, right? --Connel MacKenzie 15:28, 17 May 2007 (UTC)
- Strictly speaking, a statement is not libelous unless it presents a possibility of pecuniary damage to the subject. Damage is an element of libel. So, unless you can prove how you would be injured by that statement in some material fashion (loss of income, medical bills for counseling to overcome your mental pain, etc.) I wouldn't be calling it libel. Cheers! bd2412 T 00:48, 4 July 2007 (UTC)
- Spoken like a true lawyer (no offense intended.) But, besides the fact that IANAL, I said "libelous", not "libel." Furthermore, your pedantry is an interesting expose of the deep-rooted corruption of the American legal system; lacking a discrete monetary measure (therefore lacking an estimate of the proper price-gouging the courts should inflict through fines or kickbacks) our courts would simply throw out such a statement? What about my lost time, battling nonsense resulting from a campaign of vitriolic verbal abuse? What about my decreased effectiveness here (indeed, on this "free" endeavor) due to people coming in from left field, assuming there is some substance to these baseless attacks? (A quick glance at my user_talk page will show plenty of effort diverted, directly as a result.) Well, completely corrupt legal system or not, if I were to play that game, I'd simply tally up the time lost, that I could have spent as billable hours... But then, having lost all faith in our legal system, I don't think I'm about to go that route, unless I really, really have to. But damn, how does our legal system view "lost time" on volunteer efforts? Or is that just another piece of armor for a kickbacks-driven, money-owned legal system? --Connel MacKenzie 07:46, 4 July 2007 (UTC)
- Furthermore, the page Wikisaurus:prostitute should not exist. There are separate pages for male and female. If the page exists at all, it should only link to those two other pages. --Connel MacKenzie 14:32, 15 May 2007 (UTC)
- Not with you on that one CM. don't understand. --Richardb 15:47, 15 May 2007 (UTC)
- The entries for Wikisaurus:promiscuous man and Wikisaurus:promiscuous woman cover all the senses of Wikisaurus:prostitute, subdivided correctly. --Connel MacKenzie 15:26, 17 May 2007 (UTC)
- So, CM, are you saying that "promiscuous woman" and "promiscuous man" mean the same thing as "prostitute" to you? Really? --Richardb 13:45, 6 June 2007 (UTC)
- Um, what? No, maybe I meant what I said. Take a look at those entries. --Connel MacKenzie 13:55, 6 June 2007 (UTC)
What namespaces to include by default in search
In one way Wikisaurus has become less useful now it has its own namespace. Take the word pilsener for example.
If you do a "Go" for "pilsener" you get nothing. If you do a search for "pilsener", you just get one entry in some user's page! (Please leave pilsener empty for a while, otherwise I might have to use a vulgar example!)
But in the days when Wikisaurus was in the main namespace, you would have found pilsener in Wikisaurus:beer. As you do now IF you know to tick the Wikisaurus namespace for your search. I think it was more useful to have Wikisaurus entries included in the main namespace, or at least in the default search.
(That way when someone looks for some vulgar word, they will find it. Whereas now they don't, so they create it!)
So why is the Wikisaurus namespace excluded from the default search, yet so many other (non-Main) namespaces are included? How many searches give huge long answers because they are full of references to stuff in ancient User Talk pages, Beer Parlour archives, etc.?
My view is that the default search should include only the Main namespace, Help, Category, Appendix, Concordance, (Index ???), Rhymes, and Wikisaurus namespaces. The average user does not want to see all the background and technical stuff that goes on in Wiktionary, User, Template, MediaWiki, Image (?), Transwiki namespaces, and none of the Talk pages.
What do you reckon ??--Richardb 04:14, 15 May 2007 (UTC)
- Absolutely agree. DAVilla 06:04, 15 May 2007 (UTC)
- I don't think you are using the "default" settings for searching - check your preferences. --Connel MacKenzie 13:28, 15 May 2007 (UTC)
- Thanks Connel. I've never changed my search preferences before. Never knew there were any to change. As will no doubt be the case for a lot of users, especially the non-logged-in users. Do new users (and not-logged-in users) get everything included in a search? Wouldn't it be better to make the more restrictive set listed above the default preferences, to avoid all the junk and dirty laundry appearing in searches? --Richardb 15:59, 15 May 2007 (UTC)
- Just checked. The default for a not-logged in user is just the Main namespace. I'd like to see that expanded to at least include the Wikisaurus namespace, so people can find words like pilsener that are as yet only in wikisaurus.
- Can we do that ?
- I'm fine with the idea that the "more conservative" amongst us might want to hold off on that till Wikisaurus is cleaned up a bit more. Though personally I believe that if a fine upstanding young man wants to see if his favourite new dirty word is in Wiktionary, we'd be better off if he found it was already there, rather than thinking he has found an omission, and wasting his time, and our administrators' time, by adding the word in (for probably the hundredth time!)
--Richardb 16:12, 15 May 2007 (UTC)
- No, do it all as you originally said: in addition to Wikisaurus, regardless what kind of state it might be in, add the Appendix, Index, Category, Concordance??, and Rhymes spaces, basically anything that is part of the dictionary as a "final" product rather than part of the community, and probably also the Help space if that includes instructions on using the dictionary (e.g. about ogg files) although unfortunately that has community stuff mixed in as well.
- I've just checked, and the default search preferences for new users is likewise the Main namespace only. I would think that anyone logged in should also be searching the User spaces and all related talk spaces. DAVilla 17:27, 15 May 2007 (UTC)
- Developer requests generally require a clear community consensus. For en.wiktionary, that means, in essence, a WT:VOTE. Please start one. (Is two weeks enough for such a clarification?) When the vote is done, someone then files a bugzilla request for the change of the default parameters, at which point the devs can be pestered on IRC to actually do it. --Connel MacKenzie 15:36, 17 May 2007 (UTC)
- I've run short votes and somewhat regretted it afterwards.
Now see Wiktionary:Votes/2007-05/Expand namespaces for default searches. DAVilla 18:05, 17 May 2007 (UTC)
- I don't know how I missed this vote earlier. What Brion, Tim, et al. have responded to in the past was a VOTE for specific namespaces. Has a bugzilla for the feature request you propose in the vote been filed yet? From what I understand, adding a single namespace, or changing the search defaults, is a two-minute task for them. Writing a few screens to allow b'crats to edit the site configuration file is obviously going to take a bit more effort. (They've always wanted VOTEs on these sorts of things, so they can wash their hands of the political situations. I agree that making that 100% under the power of the b'crats would alleviate that...but unless someone here is good at PHP, I don't think they'll do it, this year or next. We still don't even have single-user-logon!) --Connel MacKenzie 09:09, 8 June 2007 (UTC)
- How did I miss your response here? I shouldn't have started the new vote, Wiktionary:Votes/2007-06/Expand default namespaces for searches. On the other hand, the issue is important enough to get developer attention, especially since it would make votes of expansion on other projects easier to support. What project shouldn't include Help: in its default search for new contributors? DAVilla 01:51, 9 June 2007 (UTC)
We need more people voting on this, please.--Richardb 13:37, 6 June 2007 (UTC)
- The previous vote was postponed, explaining the poor turnout, and incidentally, you voted on the wrong page. This is the new vote. DAVilla 01:51, 9 June 2007 (UTC)
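For readers who want to see what "which namespaces a search covers" means in practice, here is a minimal sketch that builds a MediaWiki search API query restricted to a chosen set of namespaces. The namespace numbers for Main (0), Help (12) and Category (14) are MediaWiki's standard ones; the numbers given for Appendix, Rhymes and Wikisaurus are assumptions for illustration only, and the helper name `build_search_url` is hypothetical. It only constructs the URL and makes no network request.

```python
from urllib.parse import urlencode

# Standard MediaWiki namespace numbers, plus ASSUMED numbers for the
# project-specific namespaces discussed in this thread. Check the wiki's
# own site info before relying on the assumed values.
NAMESPACES = {
    "Main": 0,
    "Help": 12,
    "Category": 14,
    "Appendix": 100,    # assumption
    "Rhymes": 106,      # assumption
    "Wikisaurus": 110,  # assumption
}


def build_search_url(term, namespaces, api="https://en.wiktionary.org/w/api.php"):
    """Return a search API URL limited to the given namespace numbers."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": term,
        # The API expects namespace numbers separated by "|".
        "srnamespace": "|".join(str(n) for n in namespaces),
        "format": "json",
    }
    return api + "?" + urlencode(params)


# Search the Main and (assumed) Wikisaurus namespaces together.
url = build_search_url("pilsener", [NAMESPACES["Main"], NAMESPACES["Wikisaurus"]])
print(url)
```

This mirrors the per-user preference discussed above: the site-wide default is a configuration matter for the developers, while an individual query can always name its namespaces explicitly.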
Purpose of Wiktionary:Requests for deletion
This has been brought up for a few specific entries at Wiktionary:Requests for deletion, which isn't a convenient place to have such a discussion, so I'm bringing it here.
Currently, the introductory text of Wiktionary:Requests for deletion says, "This page is where users can propose and discuss the deletion of pages in the main namespace (see the nomination category). Requests are archived when a decision has been reached (be it deleted, kept, or transwikied); the deleting administrator should remember to sign." In other words, that page is for proposing, discussing, and recording the deletion of entire entries.
However, there exists the template {{rfd-redundant}}, which attaches to a given sense and adds it to the same category (as of this instant, that is: there's a small revert war going on right now, and it might not do that by the time you read this), and people have for a while been adding pages with so-tagged senses to Wiktionary:Requests for deletion (in the same way that pages with senses tagged with {{rfv-sense}} get brought to Wiktionary:Requests for verification).
Obviously one of these needs to be changed, and there doesn't seem to be consensus which one.
So, please discuss.
—RuakhTALK 19:53, 15 May 2007 (UTC)
- My own opinion, by the way, is that as long as RFD is a forum for discussing deletions as well as just requesting them (which it certainly is right now, as you can tell from the fact that administrators often post entries there — and which is certainly the intent, because straightforward, no-discussion-needed requests are made using {{delete}} instead of the RFD process), it certainly makes sense for RFD to cover deletions of parts of pages; certainly removing the English section from a multilingual page is as drastic a change as removing an entire English-only page, even if MediaWiki software doesn't allow us to restrict the former to administrators. (Well, it's a bit less drastic in that a non-administrator can look in the history and restore the previous version, but still. Pretty close.) I'm not sure whether deletion of a seemingly redundant sense can fall into the same category; if not, though, I think it would be better suited to Wiktionary:Tea room than to Wiktionary:Requests for cleanup, since the former seems to be a better forum for bringing out the finer points of the various senses. Or maybe we should have a {{rfv-redundant}}, requesting cites that are clearly specific to the seemingly-redundant sense? —RuakhTALK 20:06, 15 May 2007 (UTC)
- We could have the template automatically cross-post the word to each and every one of the different pages... </joke>
- Personally I feel it belongs on either RFD or RFC (not RFV or TR), and as the same action must be taken in either case, which term to use is only an aesthetic issue. Right now, I'd lean in favour of putting them on RFC. — Beobach972 20:39, 15 May 2007 (UTC)
- Several things:
- rfd-redundant/rfd-sense have been around for a while and are in use.
- To my knowledge, you are the only one who uses rfd-redundant, and quite liberally at times. {{rfd-sense}} has not been in use since its first appearance in mathematics, per Vildricianus. DAVilla
- Non-sysops have historically edited the preamble; it is meant to be descriptive, not definitive.
- Agreed. I mean, look at the preamble of this page. How confusing! Well, it wasn't too confusing, just misleading in the first part. But, get this, protected! DAVilla 17:40, 6 June 2007 (UTC)
- Yes, I really should have updated the preamble as the template started getting used regularly, but I forgot.
- Changing the target page is a Bad Thing; it leaves orphaned discussions on one page, and dead links from the entries themselves (despite active conversations.)
- The distinction between an "rfv-sense" and an "rfd-redundant" is more than a template name; it immediately conveys more information about the type of complaint/question.
- As all of our pages grow, the techniques for narrowing in on precisely what is questioned will continue to gain importance. A few more languages with bot runs like Spanish had, and I won't be surprised if in a couple years, we retire "rfd", "delete", "rfv" and "rfc" entirely. (Well, yeah, I guess I will be surprised when it happens. But eventually, I expect to see that milestone in Wiktionary's development.)
- That's a very interesting point. We have already had cases where an rfd applied only to a specific language section, so maybe it would grow into that. DAVilla 17:30, 6 June 2007 (UTC)
- --Connel MacKenzie 03:54, 16 May 2007 (UTC)
I've written up a vote at Wiktionary:Votes/pl-2007-06/Requests for partial deletion. The vote isn't actually begun yet, so please don't vote (I'll post here again once it is begun); right now I'd just like input on whether this is an acceptable form for the vote to take; I don't want people pouncing on me for starting a vote without enough discussion beforehand. :-) —RuakhTALK 23:17, 6 June 2007 (UTC)
- Thanks for checking first. :-) In seriousness, there is currently an option of nominating an entry/sense on either RFD or RFV, depending on what is being requested. That is, if something is nominated because it clearly does not meet WT:CFI (e.g. Star Wars fictional character names) then it goes on WT:RFD. If something is nominated because it doesn't appear in other dictionaries or it doesn't seem to be attested, it goes on WT:RFV.
- So, I think reducing the scope to one or the other would help. All of Uncle G's complaints seem to have been about senses being nominated on RFD...that seems to be the only sticking point that needs clarification. Adding all those other options into the mix implies that vote would end inconclusively.
- Runoff votes are to be avoided (as they tend to be divisive to the community...some large segment will always get the shaft.) Starting a vote with the assumption that it will decay into a runoff, is a really Bad Idea. --Connel MacKenzie 23:48, 6 June 2007 (UTC)
- P.S. You seem to be missing several of the other minor WT:RFD discussion links. :-) --Connel MacKenzie 23:50, 6 June 2007 (UTC)
- Thanks for looking it over. :-)
- Re: "So, I think reducing the scope to one [RFD] or the other [RFV] would help.": The vote already states explicitly that it doesn't affect RFV. Should I make that more salient somehow? Or am I misunderstanding your concern?
- Re: "All of Uncle G's complaints seem to have been about senses being nominated on RFD": True, but I think it's clear from his line of argument that he'd feel the same way about POS sections or even entire language sections. I take your point about breadth of scope, but short of making this vote even more complicated, I don't see how to address that. Do you have any thoughts?
- Re: "Starting a vote with the assumption that it will decay into a runoff, is a really Bad Idea.": O.K., I'll fix that.
- Re: "You seem to be missing several of the other minor WT:RFD discussion links.": Bah. I got all the ones I saw. Feel free to add any others that have actual discussion.
- Anyway, thanks again. :-)
- —RuakhTALK 01:34, 7 June 2007 (UTC)
- I like the idea of proposing votes by writing them up this way before they're active, especially for approval voting where all options have to exist before the vote has begun. Whoever it is that voted is going to just have to be blanked, and there should probably be a big colorful warning on the page that such action will be taken to keep people from doing so.
- As to the proposal itself, I'm not entirely sure as to what is being proposed. I had tried implementing a "request for review" category a while back, with {{rfr}}, for anyone to explain in words what they saw wrong with an entry, and then have it correctly placed. But most everybody who takes such actions knows what they're doing anyway.
- If you want to supersede {{rfd-redundant}} and the conceptual {{rfd-sense}} with something similar, it might be a valuable alternative to the Tea Room. Where the Tea Room is for questions about almost any aspect of an entry, RFR would be a place of action on definitions when the meanings themselves are not disputed, only their organization. It could be thought of as a branch-off from the Tea Room, where some of us think {{rfd-redundant}} belongs, much as {{rfv}} was functionally, if not originally, a branch from {{rfd}}.
- (I would not, however, advise a different process for {{rfv-sense}}.) DAVilla 17:09, 7 June 2007 (UTC)
- Hmm. Is it clearer/better now? Have I addressed all your points? —RuakhTALK 06:47, 8 June 2007 (UTC)
- Looking simply at the format/structure of the vote, it seems very redundant. It should have only one voting section, in essence, the "Most favored" section. The preceding 13 redundant subsections (1.1 - 1.1.3.3) just obscure the intent, right? --Connel MacKenzie 18:42, 7 June 2007 (UTC)
- I added a "do not vote yet" thing, but it could use a box and some color. --Connel MacKenzie 18:42, 7 June 2007 (UTC)
- Well, I rather preferred the full structure — the new, reduced structure is imposing the opinion that all such discussions should go in one place, with no flexibility. (After all, my goal in putting forth this vote is primarily to get person X to stop harassing editors using page Y to propose partial deletion; I see no reason to force person X to use page Y that way himself if he doesn't want to.) But, I do understand the value of simplicity, and it probably is a bad sign if a vote's internal structure involves four different header levels. (If anyone has input on this, please give it!) —RuakhTALK 06:34, 8 June 2007 (UTC)
I've started the vote now: Wiktionary:Votes/pl-2007-06/Requests for partial deletion. I've also gone ahead and voted imperfectly; a vote here isn't truly underway until a voter has felt the need to qualify his vote. :-) —RuakhTALK 04:06, 9 June 2007 (UTC)
- Well, that is only so funny, because it is true. :-( Perhaps our votes should have "Pros" and "Cons" in their preambles. Trying to represent what the opposing viewpoint is (or is believed to be) may give a better sense of just what the vote is trying to accomplish? --Connel MacKenzie 09:13, 11 June 2007 (UTC)
- Perhaps our "votes" could have additional "minimum votes" required of participants in the preliminary conversations and/or preceding disputes. (E.g., in this case, Uncle G.) --Connel MacKenzie 09:16, 11 June 2007 (UTC)
Request for bot flag - User:SemperBlottoBot
I would like the community to agree that User:SemperBlottoBot should be given bot status.
It is intended to run under the direct supervision of User:SemperBlotto to load the conjugated forms of Italian verbs (of which there are very many).
Details are on its User page.
Feel free to discuss the format and content of entries, either here or on my talk page. SemperBlotto 22:00, 18 May 2007 (UTC)
- As I've mentioned to you before, it doesn't make sense to use a non-standard format — all the more so for bot-generated entries. Is it really such a problem to use {{gerund of|[[---]]|lang=Italian}}, etc.? (You've said that it's "more important" to have accurate content than consistent presentation, which is true, but I don't understand your objection to having a consistent presentation.) —RuakhTALK 02:11, 19 May 2007 (UTC)
- I have no objection to this in the bot-generated forms (I didn't know it existed) as the extra typing is not a problem for my ancient fingers. Which other such templates exist that I should use? SemperBlotto 07:14, 19 May 2007 (UTC)
- I'd really like to see {{form of|...|lang=Italian}} used for all the auto-generated forms (in deference to a more specific tag like gerund of, where applicable.) If possible, the one-word English translation (of the stem) on each entry would be helpful too. --Connel MacKenzie 13:36, 19 May 2007 (UTC)
- ALL entries now use {{form of}} if no specific template is available. SemperBlotto 13:45, 19 May 2007 (UTC)
- Great, thank you so much. :-) When you bring this to WT:VOTE, I'll support. —RuakhTALK 17:43, 19 May 2007 (UTC)
- My understanding (from Wiktionary:Bots#Process) is that there is no formal vote. The community just reaches a consensus and a bureaucrat is nudged to set the bot flag. SemperBlotto 17:58, 19 May 2007 (UTC)
- Sigh. Another thing to update ... we've been running votes (it is a good way to record consensus, and make sure a reasonable number of people are paying attention ;-) (Connel, you want to edit the Bot page?) Robert Ullmann 18:03, 19 May 2007 (UTC)
- GAH! A catch-22! That page now says it is policy and can only be changed (corrected, in this case) by a vote. With so many people so touchy these days, no, I will leave that correction to someone else. (Yes, that is a throwback to bot policy of several years ago; long before WT:VOTE existed. Yes, the idea now is to have it on WT:VOTE for one week, instead.) --Connel MacKenzie 05:48, 23 May 2007 (UTC)
- I don't remember any vote being put forward on a policy page, at least not as recently as the bot policy was worked out, so claiming that a vote is required is just an unsupported claim that does not carry the weight of the community. I could have missed something here in the Beer Parlour before the voting procedure was formalized, but how could a page claim that a vote is required if the voting procedure hadn't been formalized? DAVilla 17:52, 6 June 2007 (UTC)
See Wiktionary:Votes/bt-2007-05/User:SemperBlottoBot.
Entries for romanisations
What is the status of the Romanised forms of non-Latin-character words? Someone recently created an entry for laós the romanisation of λαός. I have the idea (but cannot cite sources) that opposite opinions exist, but that current policy is against such entries - although we state that entries will exist for all words, and romanisations will be found in many books about, say, Greece or Russia. —Saltmarsh 09:37, 19 May 2007 (UTC)
- If it can meet the general threshold for attestation in that romanized form, then it's hard to see why it couldn't be included. On the other hand, just being a romanization of an attested word is plainly not enough. -- Visviva 16:56, 20 May 2007 (UTC)
- Agreed generally, though we do include Pinyin. —RuakhTALK 18:57, 20 May 2007 (UTC)
- We have entries for romanizations when they are used. I.e. not just mention in a textbook, dictionary, etc. Pinyin is not an exception. We have entries for Pinyin because Mandarin is routinely written in Pinyin, with and without diacritics. Likewise for Japanese rōmaji; we have entries because it is used. If it was just a helpful dictionary phonetic transliteration (like we have for many scripts), we would not have entries.
- In general, all wikt entries are the correct spellings in the scripts that are used, and can be attested.
- The only variance for Japanese and Chinese is that we don't ask for attestation for each script form for modern words; if a word is attested in kanji, it isn't necessary to go chase usenet postings writing the word in rōmaji ;-) See WT:AJA. (This is the case Visviva mentions.) For languages not conventionally written in romanizations (e.g. Greek, Russian, Arabic, Sanskrit, etc.) this doesn't apply. Robert Ullmann 19:12, 20 May 2007 (UTC)
- The only variance for Japanese and Chinese is that we don't ask for attestation for each script form for modern words […] Yeah, that makes it an exception. :-) —RuakhTALK 20:35, 20 May 2007 (UTC)
- I suppose that one could make the argument that Greeklish would qualify as attestation of such Romanizations of Greek words, but my understanding is that the system is fading from use as Greek characters become more common in cyberspace. In any case, the fact remains that such things were merely a contrivance used to represent a Greek word in characters which were more available at the time. I vote laós be deleted. Atelaes 05:22, 22 May 2007 (UTC)
- Greeklish gets used still, but it is not very common. More troubling is that there is no one standard form, so we would have to include many such entries for one original word. I don't see it being useful, so I vote to delete. ArielGlenn 23:55, 7 June 2007 (UTC)
Handwriting and old fonts idea
I've got an idea about adding a selection of images showing the handwritten appearance of each entry and also its appearance in different styles of printing. Potentially the examples could include traditional cursive handwriting, plain print, old style fonts which show "s" as a letter like an elongated "f", and so on. I think that this information would be useful for young children who are still practising their handwriting at school, for anyone who is trying to read older printed texts and also for students of English. Would this be a good idea? Pistachio 14:19, 27 May 2007 (UTC)
- The so-called long ess can be handled by Unicode: ſ (ſ). —Stephen 20:37, 29 May 2007 (UTC)
- The long s has also been added to the edit tools box. See the Latin/Roman category, in the antepenultimate grouping of characters, immediately to the right of the eszett (ß). † Raifʻhār Doremítzwr 13:46, 9 June 2007 (UTC)
- That's a very big project. I would suggest first writing an Appendix:English typography with scanned examples of text and the English alphabet. Then, maybe prepare a couple of example pages of what you think entries with this information should look like. If it gets other people excited and involved, then it could be a useful addition. If no one else gets involved, it could end up dying quietly like the shorthand project did. --EncycloPetey 07:14, 28 May 2007 (UTC)
- I think this is an excellent idea, but as EncycloPetey notes, it is an immense project. I would like to, one day, have examples of Ancient Greek words from engravings and old texts, etc. I have a feeling that this is something that will sit on the backburner for quite a while, but I feel it would, if ever done, add a great deal to the entries. Atelaes 07:30, 28 May 2007 (UTC)
- I like the premise of the project. A lot. But where would you get the images for the initial 52 letters, cursive and block? Your own handwriting, scanned in? (From there, it would be a matter of simply bot-adding something, right?) --Connel MacKenzie 09:22, 28 May 2007 (UTC)
- My handwriting would be no use for this O_O' I'm not sure whether composite images could be used to spell words, or whether each word would require a separate image. Also, it would be interesting to know whether it would be allowed to use cut out words from scanned-in images of old books. It seems quite a good idea first to begin a trial project, and then see what happens. Pistachio 00:03, 29 May 2007 (UTC)
- If the books are reasonably old (pre-1923, in the U.S., which is generally farther back than the rest of the world goes) then they are in the public domain, and can be used by anyone for any purpose, forever. Cheers! bd2412 T 00:41, 29 May 2007 (UTC)
- ...unless the copyright was renewed. --EncycloPetey 13:13, 29 May 2007 (UTC)
- If the book was published before 1923, it is in the public domain, period. No copyright renewal. Copyrights are not indefinitely renewable - they have definite dates of final and irrevocable termination. Cheers! bd2412 T 02:55, 7 June 2007 (UTC)
- Yes, great idea! Appendix:English typography and a few examples are the place to start. DAVilla 01:39, 9 June 2007 (UTC)
Isn't this a poor choice? The term idiom is pretty unambiguous whereas idiomatic’s first and foremost sense is contranymical to the sense of idiom. __meco 07:47, 29 May 2007 (UTC)
- This was one of the earlier templates, prior to {{context}} gaining widespread acceptance here. Many items were tagged with {{idiom}} that truly were only idiomatic (my guess: 30%) and with no clear-cut way of distinguishing idioms from idiomatic phrases, the lowest-common-denominator was idiomatic. At this point in time, I don't see a simple, clear way of splitting them apart. --Connel MacKenzie 07:57, 29 May 2007 (UTC)
- My point is that this sentence is written in idiomatic English. And that certainly is not what that label wants to point out. __meco 08:06, 29 May 2007 (UTC)
- And I notice that there exists some confusion about this distinction in senses also on the WT:CFI page where it reads: "An expression is “idiomatic” if its full meaning cannot be easily derived from the meaning of its separate components. For example, this is a door is not idiomatic, but shut up and red herring are."
- This points also to the practical impossibility of using the term idiomatic without reservations to reflect on something being of the nature of an idiom, seeing as the first definition of idiomatic is "Pertaining or conforming to the mode of expression characteristic of a language." __meco 14:41, 29 May 2007 (UTC)
- You can override the redirect, which would be a simple way to start sorting the two out, but please make the distinction very clear in the respective categories and in Appendix:Glossary. DAVilla 01:37, 9 June 2007 (UTC)
Languages without literature
Wiktionary is intended to include all words in all natural languages. In order to attest contested words, we require examples of use in print, not merely references to other dictionaries, or appearances in word lists. So how would we document and attest, say, Zuruahá, which was first discovered in 1980 in the Amazon with no writing, or other less extreme examples of illiterate languages? There are likely to be wordlists somewhere transcribed by scholars, but no printed literature in the language. It would seem to me that our two possible sources of these could be submissions of transcriptions from speakers of the language (unlikely, and we probably don't need to worry about this) or secondhand wordlists and grammars from anthropologists and other field workers. Do we make an exception for such languages and accept references in scholarly works that define but don't use words in context? I can't think of any other way to fulfill our mission if, for example, someone were to send something in Category:Shabo language to WT:RFV, but this would require a modification of our current WT:CFI. Thoughts? Dmcdevit·t 10:45, 29 May 2007 (UTC)
- Why wouldn't criterion 3 - "Appearance in a refereed academic journal" cover such cases? --EncycloPetey 17:56, 29 May 2007 (UTC)
- Am I wrong in thinking the use-mention distinction ("This filters out appearance in raw word lists, commentary on the form of a word," etc.) means that criterion is intended to indicate usage in context, not mentions in foreign language articles discussing grammar? I think it means this citation for hablamos, from Pedagogy: Colleges and Universities, would probably not be accepted, "They usually are able to pronounce hablamos correctly; the place the stress on the -bla-, but they often mispronounce hablan and habla." (Gerald S. Giauque, Correcting Misplaced Stress in Conjugated Spanish Verbs). Dmcdevit·t 19:39, 29 May 2007 (UTC)
- IMHO, if a language isn't written then there's no need and no point in including it in our project - one of our "features" (or possibly our limitations?) is that we are a written dictionary providing written words. If nobody writes the language, there's not a lot we can do for it (apart from maybe collecting audio samples of rare languages, but that is currently out of our scope). --Keene 00:54, 30 May 2007 (UTC)
- I must disagree with Keene in this instance. Wiktionary is a free and open piece of software which is being built by volunteers. The only limitations are ones we impose on ourselves. Granted, it will be difficult to accommodate rare languages such as the one mentioned above. However, it is not impossible. If some field linguistics anthropologist (is this the correct profession?) has managed to document a language and come up with an orthography, we should allow the results of that research to be included in Wiktionary (provided that there are no copyright issues). I recognize the philosophical objections to including original research. However, what if someone recorded samples of the language, and uploaded the audio files to wikimedia commons? Couldn't someone theoretically create an entry for one or more of the words in the recording, and then "cite" the commons audio file?
- I also disagree that Wiktionary is for written languages only. First of all, any written form of a language, regardless of language, ultimately derives from some spoken form. Of course, there are many cases in which the spoken form may have fallen into disuse (ex. Latin). My point is that written language is generally an attempt to capture the spoken language. Some languages have better success at creating writing systems than others, but that shouldn't mean that we ignore such languages. Even if the language lacks a standard orthography, I have seen several dictionaries that use IPA to spell the words (not all that uncommon, especially for less common languages).
- In short, as long as someone is knowledgeable and crazy enough to try and single handedly document a language spoken by only 130 people, why not allow it? Finally, the point to including such words is that it adds to world knowledge. Why is including a word from an obscure language any less legitimate than documenting every single usage of the "F" word? -- A-cai 01:57, 30 May 2007 (UTC)
- Indeed, I didn't think the fact that all natural languages are within our scope was in dispute. I just want us to figure out a way to cite such additions that isn't unnecessarily burdensome or unreliable. Dmcdevit·t 06:50, 30 May 2007 (UTC)
- Going by the recent (similar) "fictional languages" vote, I'm inclined to think the Appendix: namespace would be the best interim compromise for spoken languages. I think the vote indicated there is a general desire from this community to have that information here, somewhere. But page-moves (as presumed spellings are corrected) will be a greater concern as we continue adding languages. If it is truly original research, it probably should be published in book form on Wikisource first. From there it could go either to the Index: or Appendix: namespace here (or both) then from there into the main namespace (presumably via bot.) --Connel MacKenzie 02:19, 30 May 2007 (UTC)
- No, it wouldn't be published on Wikisource -- that's a site for previously published material. However, it could be published on Wikibooks, with sound files uploaded to Commons. --EncycloPetey 02:48, 30 May 2007 (UTC)
- Erm, yes, I guess I meant Wikibooks, then. (I have no idea why, but I still get those two backwards.) --Connel MacKenzie 03:09, 30 May 2007 (UTC)
- I didn't really have original research in mind so much as previously published scholarly work. What about something like this? It's from the nineteenth century, but assuming there was nothing suspect about the scholarship, we should be able to enter articles for these with that as a citation. Or other academic references. I think CFI would need to be changed for that to happen, though. Dmcdevit·t 06:50, 30 May 2007 (UTC)
I agree with A-cai. I also agree with Connel MacKenzie that it makes sense to put such languages' lexicons in appendices or indices, but for a different reason than his: our organization scheme is all based on spelling; what puts French pain and English pain on the same page is that they're written the same way. For unspelled languages, it makes no sense to choose a spelling just for the sake of forcing it into our mold; but if we organize these languages into indices or appendices, then we can be more flexible — more wiki — in our transcriptions and standards. —RuakhTALK 02:51, 30 May 2007 (UTC)
- It's not as if we're expecting (I think) Wiktionarians to write down unwritten languages as they hear them. Most unwritten languages have spelled words in that qualified scholars have transcribed them, usually according to a phonetic scheme. I don't see any reason to exclude them from the main namespace. Dmcdevit·t 06:50, 30 May 2007 (UTC)
- I second the suggestion that such information should be placed in an appendix. I do feel that Wiktionary is a perfect medium for preserving languages (frankly, if the community voted against appendices and main namespace entries for such languages, I'd advocate hosting them all in the User: namespace!). That said, here is my opinion: if the language has been written to any degree (with presumably varying spellings), but some work has since emerged as the standard for orthography (as, eg, the Duden is for German) or simply the reference for the language (as, eg, the WNT for Dutch), then we ought to enter it with those spellings. If, in contrast, the language has no native literary corpus at all, but has been transcribed in multiple academic works, then we ought to include them all (obviously, don't violate copyrights, though), perhaps ‘backwards-dictionary’ style, eg:
- tree - abau (Mendelsoehne, UDoX), apou (Chengdu, TSXG)
- where Mendelsoehne wrote the Unabridged Dictionary of Xyzlian (abbreviated à la OED2), Chengdu wrote a Tentative study of Xyzlian grammar, and so forth.
- — Beobach972 02:18, 3 June 2007 (UTC)
Layout - nouns with 2 genders
θέρος has 2 genders with different, but related, meanings but (I guess) the same etymology. Two suggested layouts:
a Noun 1
θέρος m (théros)
a Noun 2
θέρος n (théros)
b Noun
θέρος m (théros)
θέρος n (théros)
c Noun
θέρος (théros)
Please can someone suggest the best format or point me to a similar word in another. —Saltmarsh 14:21, 30 May 2007 (UTC)
- Just a side note to begin with: that's rather odd that "harvest" became masculine as I'm fairly sure that both meanings were neuter in Ancient Greek. (On that note, maybe there's some reason for the change? It would simplify things greatly if they somehow had different etymologies, but here I am looking for the easy way out!) It figures that someone would complicate something along the way! The second option resembles what is done with Latin adjectives, though there isn't an issue of a difference in meaning (e.g. ārida). Because of declension templates (if you're still using those), it seems to me more logical to separate the two as in the first option. Medellia 16:06, 30 May 2007 (UTC)
- Interesting that this issue should turn up just now. We've been discussing a similar situation for a Russian word (in RfC?). The first option seems best in this case if there will be two different inflection tables. --EncycloPetey 20:01, 30 May 2007 (UTC)
- I will follow the line taken at правило, which is Option (a) above (and discussed at Wiktionary:Requests for cleanup#правило) until I hear otherwise. I have checked out the differing genders which are confirmed in 3 sources. —Saltmarsh 05:11, 31 May 2007 (UTC)
- Please use (b); it is used in various other places (Arabic, Japanese, etc) and is not hard to deal with. The last thing we need is yet more headers! (It isn't just noun, after all; we end up with hundreds of header combinations). Please? Robert Ullmann 05:54, 31 May 2007 (UTC)
- Tx! In the case EP is referring to, if there are subheadings, then it is necessary to have another POS header; in that case I would strongly advise just using Noun again, not adding numbers which don't mean anything. Robert Ullmann 09:31, 31 May 2007 (UTC)
- Any time an inflection line is used, we have to have a POS header. This wasn't something I used to know, because it's not normally an issue, but it's true. So, option (b) isn't really an option. Likewise, if there might ever be subheaders, then we need to have a section header for them to be subheaders of!
- "it's true" ? stated exactly where? Robert Ullmann 06:12, 1 June 2007 (UTC)
- Often there are other considerations, such as different declensions. Most dictionaries handle this sort of thing with superscript numbers. In any case, if you have ===Noun===, then one or several definitions, then a declension table, it is likely that someone would not think to look further down the page to a homophone with its different gender and declension. If we use (b), it should at least include a line in the definitions that other nouns with other meanings and different grammar are to be found further down the page. —Stephen 19:16, 31 May 2007 (UTC)
- I agree. It may not be pretty, but we have to seriously consider the use of numbered POS headers. --EncycloPetey 21:37, 31 May 2007 (UTC)
- If you want to repeat the POS header, just don't use the numbers. They are useless. Let me say that in stronger terms: the numbers are utterly meaningless cruft. The only possible use they could serve is so that other things could refer by number, which is the utterly last thing we would want! (We are still going to be cleaning up the use of "sense numbers" in translations sections for a long time now.) Do not number the headers! Robert Ullmann 06:30, 1 June 2007 (UTC)
- I wouldn’t say that numbered headings are useless. A heading marked ===Noun 1=== or ===Etymology 1=== tells you that there is a ===Noun 2=== or ===Etymology 2=== to follow, which is precisely what it’s supposed to convey. No other things need to refer to the numbers, and as far as I know they do not. I think it’s extremely useful. But if we do not use numbers (even though most good dictionaries do use them), then we need to add something in the definition line that tells you that another instance of the word follows (I have not been able to think of an acceptable note that will do the same job as numbered headings). —Stephen 16:31, 2 June 2007 (UTC)
- On this, I agree completely with Robert Ullmann. Stephen: other dictionaries do not do that; they use the numbers to indicate separate etymologies for the same spelling. They use a small number; we use a gigantic "Etymology 2" heading with the etymology spelled out. --Connel MacKenzie 02:24, 3 June 2007 (UTC)
- Uh, Connel, my copy of Lewis' Elementary Latin Dictionary uses big bold numbers (the same size as the entry word) placed smack in front of each headword. And it does it for words with different inflections and gives etymologies only occasionally. You can't limit this discussion to English dictionaries and their conventions because this discussion affects all the languages we include. --EncycloPetey 02:45, 3 June 2007 (UTC)
- I will note that we use numbers if there are multiple etymologies (see ben), but we do not number the different noun senses or whatnot. My own inclination is assuming that they have identical etymologies to duplicate the ===Noun=== header (without numbers).
- PS- if there is a table of contents at the top of the entry, will that not alert readers that there are multiple noun sections on the page? Beobach972 01:43, 3 June 2007 (UTC)
- The TOC is frequently too complex. If it has more than three or four lines, I just scan it quickly for the POS I’m looking for, and if there is a second or third one, I am not aware of it. The only way I know that there are more than one instance of a word is if the page is very simple and short so that I can see it immediately, or if the instances are numbered. —Stephen 01:50, 3 June 2007 (UTC)
- Additionally, editors are likely to try to merge two sections both labelled Noun, whereas they will take a closer look before doing so if the sections are labelled Noun 1 and Noun 2. Stephen posted an example of a Russian noun лебедь "swan", where there are two nearly identical senses and two nearly identical (but subtly different) declension tables. If I hadn't been told in advance that the declension tables were different, I might have assumed that it was an accidental duplication. --EncycloPetey 01:54, 3 June 2007 (UTC)
- (part of this argument is also given at Wiktionary:Requests for cleanup#правило) Can I fulfil Medellia's initial wish above - if the words are different (even if only in gender) they (by definition) have etymologies which vary. If θέρος has 2 genders, it is 2 words not one, they both evolved from Ancient Greek when both meanings (harvest and summer) had the same n gender - ie they were one word. At some point the word for harvest changed gender so θέρος (n) θέρος (m) have different etymologies - the masculine version has a step beyond the neuter version - we have two words? —Saltmarsh 14:51, 6 June 2007 (UTC)
- as long as we have two genders *and* different meanings then we have two words, and not one word with two meanings. I care less about the numbers, but something needs to catch the reader's eye to alert them of multiple entries on one page, and I have no better alternative. (Just for diversity, the dictionary I am looking at right now has two full-on entries for these words, as opposed to numbering the etymologies etc.) ArielGlenn 03:05, 8 June 2007 (UTC)
- Thanks for your help. θέρος now has 2 (numbered) etymologies - the words differ (if only in gender) and therefore have different (if only in the last step) etymologies. —Saltmarsh 06:30, 15 June 2007 (UTC)
Entries with level or structure problems
Coming back to this ...
At Connel's vehement insistence, since 18 April Autoformat has been tagging entries into Category:Entries with level or structure problems instead of correcting the header levels. The category has now accumulated over 1000 entries, almost all correctable by the bot. For example, there is an entry with structure:
1 English
  1.1 Pronunciation
  1.2 Noun
    1.2.1 Usage notes
  1.3 Derived terms
    1.3.1 Hypernyms
    1.3.2 See also
    1.3.3 Translations
      1.3.3.1 Translations to be checked
  1.4 Verb
    1.4.1 Translations
    1.4.2 References
2 Dutch
  2.1 Noun
(at this version) which you (and the bot) can see clearly how to fix without having to know anything about the content. Most cases are much simpler, and the code looks at the whole structure when deciding if a header belongs at L4. It knows how many POS sections there are, the level of the header(s) preceding and following, and the proper levels (Related terms is usually at 4, but sometimes at 3, Translations always at 4, ttbc always at 5; all +1 if inside a numbered ety section, and so on).
The corrections are disabled by a flag in the code (named Connel ;-). If left disabled, these will continue to accumulate until someone manually does the fixes, and I for one have no interest in manually doing 1000's of fixes I have working automation for.
If the flag is enabled, the bot will automatically "eat" the category over a period of time, leaving the entries that it knows have problems, but can't un-ambiguously correct. (As well as unusual cases, like chocolate it doesn't have rules for; these raise questions that are worthy of attention.) The category will then be worth human attention.
Please look at some of the entries, in particular the ones that say "would have corrected level of ..." in the edit summary. And tell me what we should do. Robert Ullmann 06:18, 31 May 2007 (UTC)
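The header-level rules described in this post can be sketched roughly as follows. This is only an illustration under stated assumptions, not AutoFormat's actual code: the function name `fix_levels`, the two rule tables, and the choice to trust L2 language headers are all hypothetical, and the sketch ignores the "sometimes at 3" exception for Related terms and Derived terms.

```python
import re

# Hypothetical rule tables for illustration; not AutoFormat's actual code.
POS_HEADERS = {"Noun", "Verb", "Adjective", "Adverb"}
# Sections assumed to sit one level below their POS header.
UNDER_POS = {"Usage notes", "Derived terms", "Related terms",
             "Hypernyms", "Translations"}
HEADER = re.compile(r"^(=+)\s*(.*?)\s*=+\s*$")
NUMBERED_ETY = re.compile(r"Etymology \d+")


def fix_levels(wikitext):
    """Re-level headers from their order, trusting only the L2 language lines."""
    out = []
    in_numbered_ety = False  # inside "Etymology N": everything shifts by +1
    seen_pos = False         # a POS header has appeared in this section
    for line in wikitext.splitlines():
        m = HEADER.match(line)
        if not m:
            out.append(line)  # not a header: copy through unchanged
            continue
        level, title = len(m.group(1)), m.group(2)
        base = 4 if in_numbered_ety else 3  # level of a POS header here
        if level == 2:
            in_numbered_ety = seen_pos = False
            new = 2
        elif NUMBERED_ETY.fullmatch(title):
            in_numbered_ety, seen_pos = True, False
            new = 3
        elif title in POS_HEADERS:
            seen_pos = True
            new = base
        elif title == "Translations to be checked":
            new = base + 2
        elif title in UNDER_POS and seen_pos:
            new = base + 1
        else:
            new = level  # no rule: leave it for a human
        out.append("=" * new + title + "=" * new)
    return "\n".join(out)
```

Headers with no rule keep their level, which mirrors the idea of leaving entries the bot cannot unambiguously correct in the category for human attention.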
- I'm not convinced the bot would always do the right thing (e.g., in your example, it's not necessarily obvious whether the "Derived terms" section should be at L3 after the "Verb" section, or whether it should be at L4 where it is, though with this specific ordering the latter seems more likely), but I have faith that it would in most cases leave entries better off and in the rest leave them no worse off, so think that you should have your bot correct the ones it can (or thinks it can). —RuakhTALK 06:58, 31 May 2007 (UTC)
- I've found that the sequence of headers is extremely reliable, while the levels are sometimes all over the place, I think because they aren't so noticeable looking at the entry. So in a case like this, it is quite certain that the derived terms belong to the noun. (and in fact they do ;-). Robert Ullmann 07:07, 31 May 2007 (UTC)
- Question: How often will AutoFormat "re-correct" an entry that a human editor fixes (if that human's correction is at odds with AutoFormat's rules, that is.) --Connel MacKenzie 07:20, 31 May 2007 (UTC)
- Never. Assuming that the human edit conforms to WT:ELE policy (whence the AF rule). If the human insists on violating WT:ELE, then there will be an issue. (For example, if the human insists on putting "Derived terms" at L3, when it applies to only one POS; where ELE says explicitly that this is only to be done in the exceptional case where it must apply to more than one. If it is not known from which part of speech a certain derivative was formed it is necessary to have a "Derived terms" header on the same level as the part of speech headings.) Robert Ullmann 07:26, 31 May 2007 (UTC)
- Earth to AutoFormatBot: "If it is not known from which part of speech a certain derivative was formed it is necessary to have a "Derived terms" header on the same level as the part of speech headings." What part of that don't you understand? Derived Terms goes at L3! --Connel MacKenzie 07:01, 1 June 2007 (UTC)
- AF to Connel: which part of "If it is not known ..." don't you get? Derived terms goes at level 3 IF AND ONLY IF it is NOT KNOWN what POS it is derived from. Eh? L3 is the rare exception as the sentence you quote makes entirely clear.
- Here on Wiktionary, all POS headings for an entry are rarely entered on the first go-around. Most are added later, yet share the subordinate headings. Without some keen knowledge of the English language, AF is almost certain to get it wrong. --Connel MacKenzie 22:59, 2 June 2007 (UTC)
Come on, I know you are pining for the time when entries were just an unstructured (and uselessly unparseable) list of L3 headers, but we've moved way past that, to where someone/something reading the wikt database can expect to be able to tell which POS things belong to. And when someone adds another POS (say a Noun in front of the Verb), the attributes of the verb don't come to be erroneously associated with the noun. If an entry is permitted to be:
==English==
===Verb===
===Derived terms===
(where the derived terms are derived from the verb of course) and someone adds a noun:
==English==
===Noun===
===Verb===
===Derived terms===
now the Derived terms can only be interpreted as from the noun and the verb or not known, which is wrong. Derived terms must be at L4 unless it must apply to more than one POS. If the entry is formatted as ELE requires:
- <Connel's Comment>Completely untrue! How is AF supposed to know the subtleties of the English language? Who is AF to second guess the human editor putting the Noun sense there? Do the derived terms belong subordinate to the verb only? Not likely. More likely is that the noun and verb share those derived terms, which is exactly what the human editor implied. --Connel MacKenzie 00:28, 2 June 2007 (UTC)
==English==
===Verb===
====Derived terms====
and someone adds a noun:
==English==
===Noun===
===Verb===
====Derived terms====
the derived terms properly stay associated with the verb, instead of introducing a serious semantic error.
No-one else has any trouble with this. I am, frankly, incredulous that someone who is so concerned with the parseability of entries would be so utterly determined to prevent programs reading the en.wikt from being able to associate attributes with POS. You, of all people? Robert Ullmann 07:26, 1 June 2007 (UTC)
- See, that is just it. I see your "corrections" as the introduction of error! The subordinate headings should only be subordinate when applicable. --Connel MacKenzie 00:30, 2 June 2007 (UTC)
- Put another way, AF is (would be) fixing entries where the structure is broken. If a human insists on re-breaking the structure, it will probably fix it again. OTOH, if the human insists on dumbing-down the entry to a string of L3 headers that no automation reading the en.wikt can parse with significance (thereby making most of the information in the entry unusable), then all AF can do is try to prod some human to restore a meaningful structure. Robert Ullmann 08:24, 31 May 2007 (UTC)
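The parsing point argued in this thread can be illustrated with a small sketch of how a program reading the database might decide which POS a section belongs to. It is a simplified assumption about such a reader, not any existing tool: the function `owners` and the short `POS_HEADERS` set are hypothetical. An L4 section is attributed to the nearest preceding POS header, while an L3 section is attributed to every POS header seen so far in the language section.

```python
import re

HEADER = re.compile(r"^(=+)\s*(.*?)\s*=+\s*$")
POS_HEADERS = {"Noun", "Verb", "Adjective", "Adverb"}  # hypothetical, incomplete


def owners(wikitext, section="Derived terms"):
    """For each occurrence of `section`, list the POS headers it can be read as belonging to."""
    pos_seen = []       # all POS headers so far in the current language section
    current_pos = None  # nearest preceding POS header
    result = []
    for line in wikitext.splitlines():
        m = HEADER.match(line)
        if not m:
            continue
        level, title = len(m.group(1)), m.group(2)
        if level == 2:                      # new language section
            pos_seen, current_pos = [], None
        elif title in POS_HEADERS:
            pos_seen.append(title)
            current_pos = title
        elif title == section:
            if level >= 4 and current_pos:  # nested: belongs to one POS
                result.append([current_pos])
            else:                           # same level as POS: shared or unknown
                result.append(list(pos_seen))
    return result
```

With the section at L3, adding a Noun in front of the Verb changes the reading from `[["Verb"]]` to `[["Noun", "Verb"]]`; with the section at L4, the reading stays `[["Verb"]]`. Whether the shared reading is an error or the editor's intent is exactly what the two sides here disagree about.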
Given that it is Madaraka Day and I am just going to mostly be watching tennis, I'm flipping the flag over for a while to collect some examples. Robert Ullmann 06:15, 1 June 2007 (UTC)
Troublesome sentence
"If it is not known from which part of speech a certain derivative was formed it is necessary to have a "Derived terms" header on the same level as the part of speech headings."
Connel, I have been wracking my brain trying to figure out how you can be so obtuse. I think, after much noodling, that you imagine that sentence means:
- "Because we can't tell all the time which POS terms are derived from, we have to just put Derived terms at level 3 all the time."
But that isn't what the sentence says or means. It means:
- "In the exceptional cases where a certain derivative can't be associated with a particular POS, in that particular entry that derivative must be under an L3 header at the end (of the lang/ety sect), instead of in a level 4 section under the POS as is usual, as shown in the example above."
And, specifically, if there is only one POS, the exception described in the sentence can never apply. Robert Ullmann 08:20, 1 June 2007 (UTC)
- message to everyone (else): I am saying all this because I have the utmost respect for Connel. It would perhaps be easy to set up some vote and in some way just override, but I do not want to do this. 23:08, 1 June 2007 (UTC)
Robert,
I wish you wouldn't rush this portion of the process. I would prefer to give responses like this more time to rewrite (for politeness' sake) but at this point in time you seem to be charging ahead.
- Connel, I know you dislike intercut comments, perhaps even more than I (;-) so only one thing: I am not trying to charge ahead, just come back to the issue and discuss it; I am not in any hurry at all; but it is worthy of very serious discussion; now we have ~400K entries, very soon we will have ~5-10M ... Robert Ullmann 23:10, 2 June 2007 (UTC)
It is a fallacy to think that all POS headers are always included. It is a fallacy for you (and AF) to therefore assume that a single POS header is therefore the only possible one. It is also a fallacy to assume that each of those headings is necessarily subordinate to that one POS heading.
The wording of that sentence is implicitly relying on human editor's knowledge of the word, not what appears in the entry when they start editing it. The wording of that sentence further relies on the human editor's ability to comprehend precisely which senses and POS are distinct and which are not for any subordinate headings.
It has taken me a while, but I've finally figured out that your first pass was to subordinate the translations sections incorrectly, using that as the claim for entries being structured "incorrectly." So while I didn't notice for translations, I am only noticing now, when there are intervening sections (that is; the translations, if properly at L3, would be listed last correctly.)
- no, the "first pass" was to assume that the L4 sections in ELE were correct; e.g. "see also" was an L4 header use only when the others (including "coordinate terms" ;-) could not be made to fit ;-) Robert Ullmann 23:16, 2 June 2007 (UTC)
It is a fallacy to assert that "we have moved far beyond that." It is demonstrably untrue, even by your own statistics. You have the notion that they should be very deeply nested. You've made compelling arguments for it that have won a couple people over to your side. However, the single biggest complaint I have about AF is the complete disregard for WT:BOT.
At no time, should the bot be "assuming" it is smarter than the human editors. At no time should it be undoing a human edit that has corrected it. If it makes a level change, and a human undoes that edit, it is beyond audacious that AF would then edit it again. It must keep a list of entries it has edited, and leave them alone for some time. Adding cleanup tags is one thing (there is only a finite number of entries for it to tag.) But revert-warring is quite another. The bot needs to control itself and not re-edit things.
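(Illustration of the "keep a list and leave them alone for some time" suggestion. This is a sketch only, not how AutoFormat works; the class name, method names and the 30-day cooldown are assumptions made for the example.)

```python
from datetime import datetime, timedelta


class EditLog:
    """Remembers when the bot last edited each entry (hypothetical helper)."""

    def __init__(self, cooldown=timedelta(days=30)):
        # The 30-day default is an arbitrary value chosen for illustration.
        self.cooldown = cooldown
        self._last_edit = {}

    def may_edit(self, title, now):
        """True if the bot has never edited the entry or the cooldown has passed."""
        last = self._last_edit.get(title)
        return last is None or now - last >= self.cooldown

    def record(self, title, now):
        """Note that the bot edited the entry at time 'now'."""
        self._last_edit[title] = now


log = EditLog()
t0 = datetime(2007, 6, 3)
if log.may_edit("summit", t0):
    log.record("summit", t0)
print(log.may_edit("summit", t0 + timedelta(days=1)))
```

A bot consulting such a log before each change would not touch an entry again the day after a human had reverted it, though it could still add a cleanup tag once.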
Now, given the "troublesome sentence," you wish to now reword it to your liking. Even though it is in direct conflict with several years of edits. Even though your rewording is in direct conflict with the original intent, from when it was written. I'm sorry, but I can't go along with this.
The only advantage to your system is that "subordinate" headings will be hidden under the incorrect POS, or later duplicated, further clogging up entries. If accurate, that might be acceptable (with a vote, of course) but does not seem to be true, most of the time.
--Connel MacKenzie 00:23, 2 June 2007 (UTC)
- I think Connel is pointing out that some entries may have structures like this:
==English==
===Noun===
word (words)
# noun def
to word (words, wording, worded)
# verb def
===Derived terms===
* [[reword]]
- In such a scenario, the editor may have forgotten the L3 verb heading and meant to associate the derived term with both the noun and the verb senses. I'd be surprised if that combination of editor error happens in any significant numbers, and I'd be comfortable with demoting such L3 sections, but Connel is being cautious, à la BOT policies. Rod (A. Smith) 00:48, 2 June 2007 (UTC)
- I think you're mistaken; I think Connel MacKenzie has in mind some sort of entry that is correct by his reading of WT:ELE, but not by other people's. Your example can't be what he has in mind, because a human editor wouldn't need to engage AutoFormat in a revert war; (s)he could simply add the missing POS header, and set the Derived terms header to whatever level is appropriate. While I appreciate Robert Ullmann's desire not to tread on Connel's toes, I think the only solution might well be a vote to clarify WT:ELE in this regard. (Unfortunately, to do so we'd need to figure out unambiguous wordings for each viewpoint, and since it seems that each viewpoint's adherents already think it's unambiguous, that might be a complicated affair.) —RuakhTALK 01:21, 2 June 2007 (UTC)
- I agree that ELE is ambiguous. It currently allows the various "other" headings to be at L3, L4, L5 or level 6, either as peers of POS headings, or subordinate. Using AF to consistently force them to L4, L5 or L6 is what I am concerned about here; I do not think it is linguistically correct to do so. Humans can make that judgement call; an automated function cannot. That's why my primary concern is that AF doesn't make changes only once, but is currently set up to repeat its assertions indefinitely. --Connel MacKenzie 04:30, 2 June 2007 (UTC)
Heading levels once more
The automation methods of User:AutoFormat have caused a very old issue to resurface. The inherent question is: are "Synonyms", "Antonyms", "Related terms", "Derived terms", "Translations" (and several other similar headings) always applicable only as subordinate headings. For this conversation, I will call those headings "other" headings.
In English, they are not. On Wiktionary, it has been (for a very long time) quite convenient to enter those "other" headings at the same level as the part-of-speech ("POS") headings. The alternative is to duplicate those "other" headings under each POS, particularly when new POS headings are added.
On Wiktionary, it was never dictated that such "other" headings must be subordinate to POS headings. When Ncik edited WT:ELE, he added those examples and changed the text; Eclecticology reverted the (incorrect) changes to the text, but not the examples. Recently, AutoFormat has been "enforcing" this structure to my dismay.
Where Wiktionary is now, is quite different from where it was when that issue came up. To my mind, the same issue is central to the debate: most of those "other" headings apply to all POS headings, with the rare exception being that it applies to only one POS heading.
Where Wiktionary is today
- There is significantly more automation in play on en.wiktionary.org, than when the debate last arose. It is possible now, to systematically restructure all entries to either conform with a new layout, or be tagged as exceptions. For various reasons, this was not remotely possible two years ago.
- What to do now
If the community feels strongly against my perception, then there should be a WT:VOTE that changes the structure dictated in WT:ELE. The new format would be that only language headings go at level two, only etymology & POS headings go at level three, and all "other" headings go at level four (or five if there are multiple etymology headings.) The only exception might then be "References."
To me, that would be a radical change, dictating that any "other" headings currently at L3 would then need to be duplicated at L4 for each POS, and reviewed by humans to whittle out any that are specific to only one POS.
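(Illustration of the duplication step described above. A sketch under stated assumptions, not an existing tool: the function name is hypothetical, the POS set is a small illustrative subset, and the "other" set is just the five headings named at the start of this section. The human review step is not modelled.)

```python
# Illustrative subsets only; real entries use many more headings.
POS = {"Noun", "Verb", "Adjective", "Adverb"}
OTHER = {"Synonyms", "Antonyms", "Related terms", "Derived terms", "Translations"}


def duplicate_under_each_pos(headings):
    """Copy every L3 "other" heading to L4 under each L3 POS heading.

    headings is a list of (level, title) pairs for one language section.
    The shared L3 copies are dropped; each POS section receives its own L4
    copy after any L4 headings it already has.
    """
    shared = [title for level, title in headings if level == 3 and title in OTHER]
    result = []
    pos_open = False  # True while we are inside an L3 POS section
    for level, title in headings:
        if level == 3 and title in OTHER:
            continue  # replaced by the per-POS copies
        if level <= 3 and pos_open:
            # The POS section is ending: append its copies first.
            result.extend((4, name) for name in shared)
            pos_open = False
        result.append((level, title))
        if level == 3 and title in POS:
            pos_open = True
    if pos_open:
        result.extend((4, name) for name in shared)
    return result


before = [(2, "English"), (3, "Noun"), (3, "Verb"), (3, "Derived terms")]
for level, title in duplicate_under_each_pos(before):
    print("=" * level + title + "=" * level)
```

Every copy produced this way would still need a human to whittle out the terms that belong to only one POS, which is the cost being pointed out.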
If we are going to consider a radical change to the general layout, I would prefer that discussion resume, for eliminating the POS headings as subordinate to etymology. I would also prefer that discussion resume on completely eliminating the POS headings, listing a "Definitions" section instead (with individual lines using the Special:Prefixindex/Template:pos tags at the start of the line). (The etymology sections would then be subordinate to the definitions section, disambiguated in a similar manner as our translations currently are. The "Inflections" heading would also be at a subordinate level, generalizing which parts of speech applied to which inflection line...as those are identified at the start of each definition.)
- Summary
If the community really feels that the benefit of micro-specifying the "other" headings outweighs the notion of grouping them together in consistent locations, then a vote will certainly show that. If the community feels that micro-specifying the "other" headings is a good stepping stone, for technical considerations, to get from point A to point C, then a vote will also show that.
I would appreciate the community's comments on which approach would suit the English Wiktionary best: 1) Forcing "other" headings to be subordinate to POS in all circumstances, 2) Continuing to allow "other" headings at the same level as POS (and telling AutoFormat to stop sublimating them) or 3) Flesh out my other proposal above.
Thanks in advance. --Connel MacKenzie 01:31, 2 June 2007 (UTC)
- I'm an examples person, so allow me to show a non-English example:
- Option 1:
- ===Noun===
- 代表 (traditional and simplified, Pinyin dàibiǎo)
- representative
- ===Verb===
- 代表 (traditional and simplified, Pinyin dàibiǎo)
- to represent
- ===Derived terms===
- 三个代表 Three Represents
- 全国人民代表大会 National People's Congress (lit. Pan-National Congress of the People's Representatives)
- Option 2:
- ===Noun===
- 代表 (traditional and simplified, Pinyin dàibiǎo)
- representative
- ====Derived terms====
- 全国人民代表大会 National People's Congress (lit. Pan-National Congress of the People's Representatives)
- ===Verb===
- 代表 (traditional and simplified, Pinyin dàibiǎo)
- to represent
- ====Derived terms====
- I usually go with option one because it is easier, and Mandarin Grammar doesn't require putting such a fine point on things. This debate has me curious. Question:
- Does WT:ELE have a preference for option 1 or 2?
- If so, what is the preference?
- If not, should WT:ELE be that explicit (it would have to be if Robert wants consistent headers for his bot)?
- My personal take is that option two might technically be more accurate, but is more trouble than it's worth in most cases. IMO, we shouldn't be laundry listing synonyms, derived terms to the point where it just becomes another unruly list of words. A few appropriate words to illustrate relationships etc is much more helpful than a half a page of random words. -- A-cai 08:36, 2 June 2007 (UTC)
- Wow, I didn't think I was so unclear. You list "Option 1" as what I called "2)", and you list "Option 2" as what I called "1)". :-( AFAIK, your "Option 1" is what WT:ELE "says." As far as Robert knows, your "Option 2" is what WT:ELE "says." To take your example a tiny bit further, here is how I think "Option 3" might look:
- ==Mandarin==
- ===Definitions===
- # {{pos_n}} [[representative]]
- # {{pos_v}} [[represent]]
- ====Inflection====
- :'''代表''' (''traditional and simplified, Pinyin'' '''[[dàibiǎo]]''')
- ====Derived terms====
- * (''representative, noun'') [[全国人民代表大会]] [[w:National People's Congress|National People's Congress]] (''literally'' Pan-National Congress of the People's Representatives)
- * (''represent, verb'') [[三个代表]] [[w:Three Represents|Three Represents]]
- With all sorts of other headings tossed in at L4. That would allow for "parts of speech" to grow in the template namespace, rather than being constricted by the headings. (E.g. "Pinyin syllable", or "Letter" or "Symbol" or "Determiner".) That would allow similar definitions to be intertwined. That would allow human judgement to better determine the order definitions are given. That would allow multiple lines of inflections, probably to absorb what we currently use "Derived terms" for now. The remaining problem with such a scheme would be determining an acceptable way to subdivide separate etymology portions (the way normal dictionaries list 1, 2, etc.)
- --Connel MacKenzie 14:06, 2 June 2007 (UTC)
- Part of the problem here is that for entries that are already in the form of Option 1, the "Derived terms" may not relate only to the heading just above them; they may include both noun and verb derivatives, or terms for which the derivation is not entirely clear. This means that any conversion would have to be done by hand (or if automated, subject to careful review) -- furthermore, there may be cases where the affiliation of a particular term cannot be clearly decided based on available information. There's a lot to be said for Option 2, but I don't think its retroactive application to existing entries would be wise. -- Visviva 09:15, 2 June 2007 (UTC)
- Well, AF has been retroactively applying that rule, which I obviously don't agree with. That is the premise of this debate. --Connel MacKenzie 14:06, 2 June 2007 (UTC)
- Connel is flat wrong in this case (and if that is in fact the entire premise of this debate, we are done!): AF does not move L3/4 class headers to L4 when there is more than one POS. (If someone adds a POS, they need to be looked at by the human, which I'm sure Connel will agree with.) Robert Ullmann 00:06, 3 June 2007 (UTC)
- But how on earth can AF know there isn't another POS that hasn't been entered yet? --Connel MacKenzie 02:08, 3 June 2007 (UTC)
- Of course it doesn't. But if there is another POS added, it is absolutely critical that derived/related terms not be accidentally associated with the other POS, as they would be if they are at L3. As you point out, only a human adding the POS can decide that those terms do in fact apply, or that it is ambiguous. A very important case of this EP mentions below (but gets backward ...) any past verb or noun can be an adjective, if someone adds the adjective, a derived term at L3 will be associated with the adjective as well; but are essentially never derived from the adjective. (summit can be an adjective too, but summiteer is not in any possible way derived from the adjective!) Robert Ullmann 16:41, 3 June 2007 (UTC)
- My personal opinion is that Synonyms, Antonyms, Quotations, Usage notes, and Translations should always be at L4 (or deeper) under a specific part of speech header. These are always tied to particular meanings, definitions, and sense that will be specific to a part of speech. While it would be nice to have Derived terms, Related terms, and Descendants placed this way, it takes a lot of work to figure out where just one word in such a list should go. Terms such as poor boy and craptastic are derived from more than one independent word, so they can't be assigned to derive from just one definition by the very structure we have in the entries. Worse, it is often completely unknown which of several parts of speech was the immediate progenitor of a derived term. We're too fledgling a dictionary to try to tackle the millions of debates that could arise from such a policy, and much of the resolution would depend on original research, which we don't include on principle. --EncycloPetey 17:51, 2 June 2007 (UTC)
- The derived term "home run" is clearly derived from "run", right? Which sense? The verb or the noun? Isn't it clearly both? And which noun senses would it be limited to, in conjunction with the verb? --Connel MacKenzie 23:11, 2 June 2007 (UTC)
- It is either adjective/noun or adverb/verb. Since home run is a noun ("and he home runs"? Not! Whereas a form from adverb/verb safely run: "and he safely runs" is in fact a verb), it is adjective/noun, from the noun, not the verb. But, as you say, there are cases. Rarely. Most, like this, are perfectly clear. Robert Ullmann 01:19, 3 June 2007 (UTC)
- You are suggesting that the noun home run is not derived from the noun run at all? I don't think you'll score many runs with that assertion. When you say this is perfectly clear, I beg to differ. --Connel MacKenzie 02:08, 3 June 2007 (UTC)
- That is a bold and unsubstantiated claim. Consider the entry I created today for gorged. Are the related terms there associated with the Verb past tense form or the Adjective use of this word? Consider that we run into a similar problem for every single participle in English. --EncycloPetey 01:26, 3 June 2007 (UTC)
What AutoFormat is and is not
I'm going to make this as clear as crystal or as mud.
From the documentation:
- AutoFormat is not a policy instrument
- It fixes things that are errors, common mis-understandings of standard format; it is not intended to enforce policy. Anything controversial or being debated is outside of scope.
The structure code of AF (now, for the most part, conditional on the Connel flag) was written to fix entries to conform to WT:ELE as it presently exists.
It is absolutely, utterly, NOT intended or supposed to impose or "push" some particular policy. (see, for example, the infix stuff going on infra, or noun with numbers, which are covered by the headers table for now). The purpose is to catch lots of basic picky formatting sh*t that users new and old do, without constantly snapping at them.
If we want to change WT:ELE, go for it!
But if policy is to be whatever doesn't get your head bitten off, no.
Once more: AF is not an instrument of policy. It does tend to expose problems, which is useful. When there are issues, it may result in a vote or whatever. Robert Ullmann 23:52, 2 June 2007 (UTC)
- Another note on that: a lot of programmers (and such in other fields) get their egos attached to various things; and they have and cause serious trouble when that bit or piece isn't a good idea. I have been programming for 35+ years; I invented things that you use every single fracking day. I don't have my ego attached to the 6 lines of code under "if Connel" in AF. I do have my ego attached to making the en (and sw and rw) wikts the best damned resource and tool possible. Robert Ullmann 00:24, 3 June 2007 (UTC)
That said, maybe, possible vote, or whatever?
Connel: If the community feels strongly against my perception, then there should be a WT:VOTE that changes the structure dictated in WT:ELE. The new format would be that only language headings go at level two, only etymology & POS headings go at level three, and all "other" headings go at level four (or five if there are multiple etymology headings.) The only exception might then be "References."
- I would think that the vote would be to confirm WT:ELE, this is not a change: lang at L2, POS at L3, other at L4 is what it says?
- Everyone else seems to me to be fine with ELE? (mod as Connel says References and See also need to be L3/4)
- But please note that it doesn't matter whether it is a change or not: if we can confirm where we want to be/go that is fine. Robert Ullmann 23:52, 2 June 2007 (UTC)
Actually, no such structure is dictated in WT:ELE that I can see regarding whether the headers of contention are to be L3 or L4. The only place that suggests they are L4 is in the example structures given. While some sections are specified for L3 or L4, Related terms, Derived terms, and Descendants have no such specification given. So, specifying them as L4 would be a change to WT:ELE. And BTW, I don't think Anagrams should ever be anything but L3. --EncycloPetey 01:30, 3 June 2007 (UTC)
- Thank you. Yes, that is the "misinterpretation" that AF is doing, that I see. I thought Anagrams were supposed to be L4 underneath the L3 heading "Trivia"? Along with {{rank}} and Shorthand and a few other things...did Hippietrail never formalize that suggestion? --Connel MacKenzie 01:56, 3 June 2007 (UTC)
- Of course Anagrams is L3. So is Trivia (which used to be shown in the example); if either is nested somehow, AF just leaves it alone.
- As to Derived terms, that troublesome sentence states clearly that it is usually at L4, else there would be no need for the exception. (Regardless of how frequent or infrequent the exception is in practice). It says unequivocally:
IF (more than one POS AND not known) THEN header is at level 3, ELSE header is at L4 (the normal case without the exception)
There is no other possible "ELSE" clause. (level 5? 2?)
- This conveniently ignores the fact that when that text was mistakenly added, and for a very long time afterward (until AF?) 100% of "Related" and "Derived" (etc.) terms were at level 3. The wording being ambiguous in that light was understandable, or perhaps simply not noticed and corrected. Obviously, forcing them always to L4 is incorrect. (Think "fuzzy logic" for just a moment; the English language borrows and shades and commingles meanings back and forth a lot; why should the format incorrectly say that one sense is all that ever was and ever could be the origin of a flavor of a derivative? Especially when more than one sense/POS is directly responsible for a derivative's formation!) --Connel MacKenzie 16:17, 3 June 2007 (UTC)
And this is of course the way it should be; putting Derived terms at L3 in all cases would be a severe dumbing-down of the format. (What would we do? Have endless notes pointing out that such-and-such is known to be derived from the noun (see summit) but we had to put it ambiguously at L3 because Connel won't let us put it under the POS it belongs to?)
- Excellent example of what I mean: how could summiteer possibly not derive from the verb/participle, as well as the noun? --Connel MacKenzie 16:17, 3 June 2007 (UTC)
- But in fact it does not, and it is known not to. (Why I picked it of course!) Summiteer is easy to attest back into at least the 19th century. The verbification of the noun is a relatively recent phenomenon. And -teer words don't come from the verb forms in any case: a charioteer is not one who chariots, a rocketeer isn't one who rockets, and a musketeer isn't one who muskets. They are derived from the noun. Robert Ullmann 16:33, 3 June 2007 (UTC)
- Excuse me? Which noun, summit or summiting? I disagree with your assessment, here. --Connel MacKenzie 17:42, 3 June 2007 (UTC)
AF can't tell whether it ought to be "known" which POS, but it can tell when the exceptional condition cannot be met because there is only one POS, so, coded directly from ELE:
    if inPos and header in L43:
        if npos < 2 and level < 4 + ety:
            if not Connel:
                level = 4 + ety
                act += ', header ' + header + ' to L' + str(level)
            else:
                levelact = ' (AutoFormat would have corrected level of ' + header + ')'
(L43 is the set of headers that humans can put at L3 after multiple POS or L4 in a POS section as appropriate, ety is 0 if there is one or no ety section, 1 if there is more than one, and nesting is one level deeper) There are still cases that have to be flagged out because a L4 header (often "Translations") is nested under something AF can't change (with or without "Connel" set).
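(For readers who want to try the rule outside the bot: a self-contained sketch of the condition quoted above. It is not AutoFormat's actual code. The function name is hypothetical, the contents of the header set are an assumption, since the real L43 table is not shown here, and the "Connel" flag is left out.)

```python
# Assumed contents; the real L43 table in AutoFormat is not quoted in this discussion.
L43_EXAMPLE = frozenset({"Derived terms", "Related terms", "Descendants"})


def corrected_level(header, level, npos, ety, in_pos=True, l43=L43_EXAMPLE):
    """Return the level the quoted rule would give a header.

    header  title of the header being examined
    level   its current level (3 for ===...===, 4 for ====...====)
    npos    number of POS sections in the language (or etymology) section
    ety     0 if there is one or no etymology section, 1 if more than one
    in_pos  True if the header follows a POS header
    """
    if in_pos and header in l43 and npos < 2 and level < 4 + ety:
        # Only one POS, so the "not known which POS" exception cannot apply.
        return 4 + ety
    # More than one POS, or already deep enough: leave it for a human.
    return level


print(corrected_level("Derived terms", 3, npos=1, ety=0))
print(corrected_level("Derived terms", 3, npos=2, ety=0))
```

The second call leaves the header at L3, which is the behaviour claimed in this thread: with two or more POS sections the rule does not move anything.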
It is extraordinarily frustrating to me that Connel keeps screaming about AF second-guessing humans when it is being extremely careful not to. Robert Ullmann 09:53, 3 June 2007 (UTC)
(repeating Connel's text again:) If the community feels strongly against my perception, then there should be a WT:VOTE that changes the structure dictated in WT:ELE. The new format would be that only language headings go at level two, only etymology & POS headings go at level three, and all "other" headings go at level four (or five if there are multiple etymology headings.) The only exception might then be "References."
(editing from that, I think Connel would want to allow Derived terms at L3 when needed?:)
- No, I would want "Derived terms" at L4, duplicated whenever necessary, to avoid this issue ever resurfacing! (That is, IF the community disagrees with me.) --Connel MacKenzie 16:21, 3 June 2007 (UTC)
The new format would be that only language headings go at level two, only etymology & POS headings go at level three, and all "other" headings go at level four (or five if there are multiple etymology headings.) When necessary because they refer to more than one POS, Derived terms, Related Terms, and Descendants go at L3 after the POS sections. The only exception might then be References, See also, and External links headers go at L3 at the end. Usage notes goes anywhere that nests properly.
Good?
Of course that is exactly what ELE says now (except for See also, and Descendants isn't clear) Robert Ullmann 10:07, 3 June 2007 (UTC)
- If it "says" that, then please quote the text that says so because I couldn't find it. This doesn't mean that Related terms and Derived terms can't go at L4, but there isn't anything at all in the ELE that says they must or should. --EncycloPetey 15:21, 3 June 2007 (UTC)
- Of course it says that: it gives a big huge utterly clear example of what an entry should look like, with Derived and Related terms shown very, very, clearly at L4. And shows them thus three times, with no variation. And then it carefully describes the exception when Derived terms must be at L3 instead. Robert Ullmann
- No, those examples are in conflict with the rest of the document, added by Ncik pushing his POV! --Connel MacKenzie 16:22, 3 June 2007 (UTC)
- Again, could you please quote the passage that "says" that? --EncycloPetey 16:13, 3 June 2007 (UTC)
Okay: direct quote, copy and fracken paste from WT:ELE:
==English==
===Alternative spellings===
===Etymology===
===Pronunciation===
*Hyphenation
*Rhymes
*Homophones
*Audio files in any relevant dialects
===Noun===
Declension
#Meaning 1
#*Quotations
#Meaning 2
#*Quotations
etc.
====Usage notes====
====Synonyms====
====Antonyms====
====Derived terms====
====Related terms====
====Translations====
====References====
====Further reading====
===Verb===
Conjugation
#Meaning 1
#*Quotations
etc.
====Usage notes====
====Synonyms====
====Antonyms====
====Derived terms====
====Related terms====
====Translations====
====Descendants====
====References====
====Further reading====
===Anagrams===
----
(Dividing line between languages)
==Finnish==
===Etymology===
===Pronunciation===
===Noun===
'''Inflections'''
#Meaning 1 in English
#*Quotation in Finnish
#**Quotation translated into English
#Meaning 2 in English
#*Quotation in Finnish
#**Quotation translated into English
====Synonyms====
====Example sentences====
Generally, every definition should be accompanied by a quotation illustrating the definition. If no quotation can be found, it is strongly encouraged to create an example sentence. Example sentences should:
:*be grammatically complete sentences, beginning with a capital letter and ending with a period, question mark, or exclamation point.
:*be placed immediately <em>after</em> the applicable numbered definition, and <em>before</em> any quotations associated with that specific definition.
:*be ''italicized'', with the defined term '''boldfaced'''.
:*be as brief as possible while still clarifying the sense of the term. (In rare cases, examples consisting of two brief sentences may work best.)
:*be indented using the "#:" command placed at the start of the line.
:*for languages in non-Latin scripts, a transcription is to be given in the line below, with the same indentation.
:*for languages other than English, a translation is to be given in the line below (i.e. below the sentence or below the transcription), with an additional level of indentation: "#::".
The goal of the example sentences is the following, which is to be kept in mind when making one up:
:# To place the term in a context in which it is likely to appear, addressing level of formality, dialect, etc.
:# To provide notable collocations, particularly those that are not idiomatic.
:# To select scenarios in which the meaning of the example itself is clear.
:# To illustrate the meaning of the term to the extent that a definition is obtuse.
:# To exemplify varying grammatical frames that are well understood, especially those that may not be obvious, for instance relying on collocation with a preposition.
--End--
I’ll edit the example lay-out accordingly. (Note that I made the indent be #:, since it is not to break up the numbering, of course.) I also added a few lines about languages other than English. I’dd also add the following sentence to the header ‘Quotations’: ‘Quotations are prefered over example sentences. However, nothing stops you from providing both. In some cases, it might be reasonable to remove the example sentence, if the quotation exemplifies the same use. Quotation are generally put under the definition which they illustrate. If there is both an example sentence and a quotation, the quotation follows the example sentence.' Any objections/changes? Suggestions for better wordings are wholeheartedly welcomed, I still feel constrained when having to write in English, not finding the words with the connotation I want… [[User:Hamaryns|H.]] <small>([[User talk:Hamaryns|talk]])</small> 16:41, 25 June 2007 (UTC)
:Looks good, although it is a bit wordy. That might be helped by a couple of clean examples, perhaps one without a direct citation following, and one example with a citation following in order to make it clear that the examples and citations can work together. --[[User:EncycloPetey|EncycloPetey]] 16:45, 25 June 2007 (UTC)
:I'd put a vote up anyway -- people are a bit [[tetchy]] about changing ELE, though [[w:WP:SNOW]] may apply. Also, I'd like to suggest putting transliteration at the same added indentation as translation:
'''その''' (''romaji'' '''sono''')
# that
#:'''その'''人はモンキーです。
#::'''sono''' hito wa monkii desu.
#::'''That''' person is a monkey. <!-- first thing to pop into my head, sorry -->
:... but I guess it's fine either way. Looks good! [[User:Cynewulf|Cynewulf]] 19:28, 25 June 2007 (UTC)
:: It would also be nice to italicize transliterations consistently:
::# that
::#:'''その'''人はモンキーです。
::#::'''''sono''' hito wa monkii desu.''
::#::'''That''' person is a monkey.
:: [[User:Rodasmith|Rod]] <small>([[User talk:Rodasmith|A. Smith]])</small> 20:41, 25 June 2007 (UTC)
:::Er -- right. Oops. [[User:Cynewulf|Cynewulf]] 20:43, 25 June 2007 (UTC)
*If it's such a shoo-in, why bother skipping the vote? Just to give naysayers ammunition? It is rare to have a completely non-controversial vote - frankly I think I'd enjoy that change from tradition. :-) --~~~~
====Related terms====
Or is that not supposed to be part of the document? If so I missed the part that says "ignore the example above, it was just put there to look pretty, and is actually utterly meaningless" Robert Ullmann 16:22, 3 June 2007 (UTC)
- Yes, that is meaningless. Can we please move forward, and put this to a "frackin" vote, so we can get a sense of what the community actually feels on this issue? --Connel MacKenzie 16:24, 3 June 2007 (UTC)
- Sigh. The example that everyone follows means nothing?
- Will take a bit of care to write the vote. Robert Ullmann 16:46, 3 June 2007 (UTC)
- Question: what did AF just do to live? How is live down derived only from the verb? --Connel MacKenzie 17:44, 3 June 2007 (UTC)
- It shifted the POS sections down a level within the numbered ety sections. Derived terms (live down) was already only under the verb in ety one; AF didn't change that. Please look at the previous version? Robert Ullmann 18:34, 3 June 2007 (UTC)
Proposed edit of WT:CFI
Does the community approve of these two revisions which I recently made to WT:CFI? If there are no objections, I'll take the proposed revisions to WT:VOTE. † Raifʻhār Doremítzwr 15:27, 2 June 2007 (UTC)
- I have already stated my opposition, have I not? To repeat: it is incorrect to assert typographical quotation marks as preferable to ASCII quotation marks. It is incorrect to assert that the English language has two new parts of speech, which it does not. --Connel MacKenzie 15:39, 2 June 2007 (UTC)
- Both typographical and ASCII quotation marks and apostrophes are used on the WT:CFI. Whilst I prefer the former, I believe that to have one consistent style, irrespective of which one, is better than having a mix of the two, wouldn't you agree? As for the affix question, that is presently being discussed on WT:RFD (here and here), so the approval or disapproval of my revisions will have to wait until that discussion is concluded. † Raifʻhār Doremítzwr 15:49, 2 June 2007 (UTC)
- Obviously, I don't agree that one consistent style is wanted, there. If one were to be chosen, it would certainly be the convention of using only ASCII quotes.
- The mere existence of the RFD discussions is an indication that the assertion of new parts-of-speech within the English language does not have widespread support. Retention of either of those entries (in some acceptable form) does not merit the assertion you are making: presumably, that numerous language authorities (of which officially there are none) have agreed that English, as a language, has new parts of speech. That is an incredulous claim, on the face of it. On closer inspection, it is nearly impossible.
- --Connel MacKenzie 16:01, 2 June 2007 (UTC)
- The Concise Oxford English Dictionary (11th Ed.) lists both -i- and -o-. † Raifʻhār Doremítzwr 16:06, 2 June 2007 (UTC)
- ORO doesn't. Cambridge doesn't. Bartleby doesn't. ARTFL doesn't. Wordnet doesn't. Urbandictionary does. --Connel MacKenzie 16:19, 2 June 2007 (UTC)
- Is their omission a failure of ORO's, Cambridge's, Bartleby's, ARTFL's, and Wordnet's or is their inclusion a failure of the COED's and Urbandictionary's? And to what extent ought we to rely upon argumenta ad verecundiam? By the way, Wiktionary lists two more infixes: -bloody- and -fucking-; if you deny infixes' existence in English, shouldn't these also be added to the WT:RFD list? † Raifʻhār Doremítzwr 17:03, 2 June 2007 (UTC)
- The infix -o- is also listed in The American Heritage Dictionary of the English Language (4th Ed.), The American Heritage Stedman's Medical Dictionary, and Dictionary.com Unabridged (v 1.1), as seen here. † Raifʻhār Doremítzwr 17:30, 2 June 2007 (UTC)
- Thanks for finding the two other incorrect entries. As they are marked slang, (obviously) there is less to fix there; certainly "Infix" is not an appropriate heading, even if it is a decent explanation of those joke terms.
- You are implicitly suggesting that the English language widely recognizes infixes. That obviously is not true, so it is up to you to provide ample evidence that it does. --Connel MacKenzie 18:24, 2 June 2007 (UTC)
- Well, what heading would you prefer? Would you like expletive infixes to be marked "Expletive that gets inserted into the middle of words", and infixes like -o- to be marked "Vowel used to connect certain cran-morphed wordstems"? Because I think it's a lot more practical to just call them both "Infix", so we don't need to add a new POS header every time someone adds an infix that doesn't fall into one of these categories. (Or are you arguing that because the word "infix" is uncommon, we should delete-on-sight the entry for any infix?) —RuakhTALK 19:57, 2 June 2007 (UTC)
- Me personally? I'd like it labelled ===Garbage===. :-) The heading "Infix" is valid in other languages, but not in English. User:Doremitzwr is (on another front) audaciously suggesting that -ology and -logy should be merged, based on the incorrect premise that English has an -o- infix. (While -ology and -logy are related, they are not the same.) Suffixes are mostly additive in English; we don't have stuff in the middle (outside of those two humorous slang interjections/intensifiers). Instead one suffix is added to the next. (Confer: prioritization.) The -o- isn't an infix, it is simply a joining rule, probably mentioned in an obscure grammar tome somewhere, or listed as an exception. While calling it a "Suffix" would be wrong, I don't see the need to suddenly claim that English has a new part of speech. It would be greatly beneficial to not mislead people into thinking that -o- is anything in English at all. The same, obviously, cannot be said for Greek. The English section of -o- should simply be put through RFD and eliminated. Delete-on-sight? Perhaps. Maybe someone can rework it with the ===Letter=== heading. --Connel MacKenzie 22:53, 2 June 2007 (UTC)
- Part of the reason that WT:CFI currently says "Prefixes and Suffixes" is that Prefix and Suffix are accepted as standard POS headers. By contrast, Affix is not a standard header, and neither is Infix. Before editing WT:CFI, you'd have to have a successful vote to change/extend current header use. In any case, the document should not say "Affixes" because that would lead to people using that as a header; they should not. --EncycloPetey 17:38, 2 June 2007 (UTC)
COW reminder
Just a reminder that we have a Collaboration of the Week project. Each week, three words are selected for improvement edits (see project page for details). This week's words have been:
...yet very little editing has happened. These words are certainly used often here because of their grammatical definitions, and it would be nice to see them in good shape. You don't have to make the articles perfect, just find a bit you can improve, whether it be formatting, etymology, example sentences, quotations, synonyms, and so forth. --EncycloPetey 02:59, 3 June 2007 (UTC)
- OK, I've given "laugh" a good going over. — Paul G 16:15, 5 June 2007 (UTC)
New vote on placenames
Started a new vote: Wiktionary:Votes/pl-2007-05/Placenames 2. Cheers! bd2412 T 03:12, 3 June 2007 (UTC)
- Thank you. Are you going to vote on it, also? :-) --Connel MacKenzie 03:58, 3 June 2007 (UTC)
- And are you planning to list it at WT:VOTE? --EncycloPetey 04:01, 3 June 2007 (UTC)
- I did that. Obviously a minor error. --Connel MacKenzie 04:24, 3 June 2007 (UTC)
Headers in ELE set vote
Okay, especially since Connel has told me to get on with it ;-) We'll see how it goes ...
Vote set at Wiktionary:Votes/pl-2007-06/Headers in ELE.
Everyone please look at this; the changes and clarifications are small but significant. Robert Ullmann 18:27, 3 June 2007 (UTC)
- I think you're trying to do too much with a single vote. There are at least three or four separate votes combined. --EncycloPetey 18:39, 3 June 2007 (UTC)
Timeline (times UTC, in Nairobi +3, 9:20 PM): (while my SO has just come back from visiting relatives in Kitale for 3 days, and wants to tell me about eating samaki and ugali and asking me what I've eaten and wants me to look at a bill for her hair salon that wasn't budgeted and a letter from her sister and watch two different stories on KTN and Citizen and ...)
- 18:23 I save the vote page
- 18:25 EP votes to oppose
- 18:27 I write this section announcing it
EP complains I was "springing it on the community in the vote page" but he voted two minutes before the community even got to see it!
sigh
I would have thought that with a 30 day policy vote, people might think about it for a while? I mean, there is no hurry? We could have maybe discussed it a bit? Instead, EP etches the vote in stone two minutes before it is announced, without even knowing if I had finished editing it and then complains it isn't structured correctly?
sigh
it could have been edited
sigh
please look at the vote, which is now immutable, except that we can of course re-vote and change anything we want, and vote and comment, here or there or talk pages as you please
it will accomplish what Connel was asking for, which is collecting some community opinion Robert Ullmann 22:48, 3 June 2007 (UTC)
- Let me lay this out briefly:
- Did you post your vote concept for discussion before starting the vote? No.
- Therefore, you sprung the idea on the community in the vote, rather than in advance.
- Discussions should happen before votes because it always gets too messy to have them during the vote. I noted other objections as well. If you disagree with my first point above, then please provide a link to the discussion on your vote that happened before the vote began. --EncycloPetey 16:22, 4 June 2007 (UTC)
Is the name Irene "English"?
This name occurs in a great many countries (rather than languages) with the same exact spelling. Nevertheless it is listed as an English proper noun only. Shouldn't there be made some allowance for such names to be listed as proper names also in other languages? Is translingual the correct language code to use here?
Having written the above I come to think that proper names, hadn't they better be categorized by country instead of by language? At least for some categories of proper names. Although in some (not very many, mind you) this scheme won't work (Munich – München; Gothenburg – Göteborg). __meco 09:05, 4 June 2007 (UTC)
- I would advocate "Translingual" in that case, although given that the canonical pronunciations vary, perhaps there should be separate headings for each language. I don't think, however, that categorizing by country would be helpful here; Wiktionary is a linguistic resource rather than a geographic one. -- Visviva 13:18, 4 June 2007 (UTC)
- As I haven't yet learned how to understand the pronunciation stuff, I didn't think about that. That of course merits individual entries for all sorts of languages no matter what else. __meco 14:53, 4 June 2007 (UTC)
- Why not just repeat it for the languages it applies to, with the etymology sections pointing to the first/oldest one? "Irene" is a female given name in English... --Connel MacKenzie 16:39, 4 June 2007 (UTC)
- Agreed, Translingual is not appropriate at all here. The entry should start with English and then have a section for every language the name is commonly used in, ideally with some good etymology. Atelaes 16:56, 4 June 2007 (UTC)
New words in the Collins English Dictionary 9th edition
According to this, the 9th edition of the Collins English Dictionary published today recognizes the following new words and idioms: muffin top, Tamiflu, brainfood, Gitmo, man-bag, season creep, man-flu, Londonistan, hoodie, plasma screen, carbon footprint, celebutante, carbon offset, 7/7, wiki, pro-ana, WAGs, size zero, rendition, and McMansion. Uncle G 14:54, 4 June 2007 (UTC)
- Wow, did someone go on a page-creation spree after you listed all those, or is Collins just super slithering snail slow? DAVilla 18:24, 6 June 2007 (UTC)
- Page creation spree. About 2/3rds were red when he posted it. --Connel MacKenzie 00:15, 8 June 2007 (UTC)
Category English irregular verbs.
There seems to be some confusion as to whether irregular phrasal verbs should be included in this category. My feeling is that this category would be a useful list of Eng irregular verbs if only simple forms were allowed. There are about (very very roughly) 150 irregs. But if we add the 1,000+ phrasal verbs that are formed from irregular verbs already in this list, then it becomes cluttered, cumbersome, and far from useful. Opinions? Algrif 13:42, 4 June 2007 (UTC) The same goes for similar categories, eg English irregular simple past forms, and English irregular past participles. (by the way, I am just as guilty as anyone else for placing some phrasal verbs into these categories, which is why I would like some policy clarification for the sake of harmony). Algrif 14:16, 4 June 2007 (UTC)
- Agreed; phrasal verbs ought to be omitted. † Raifʻhār Doremítzwr 14:53, 4 June 2007 (UTC)
OK. I'll start cleaning up a bit. Algrif 16:47, 4 June 2007 (UTC)
- I agree. Go for it. --Connel MacKenzie 17:10, 4 June 2007 (UTC)
- FWIW: I thought that all multi-word entry titles are not supposed to use the full inflection templates (at least for verbs) but rather are supposed to link each component headword, so a reader can look up the proper inflection from the component words. Common idiomatic variants redirect to the main entry of the multi-word form, so someone looking up a form of a "phrasal verb" should end up at the root of that form, which links (on the inflection line) to the component words which do expand the inflections. Clear as mud? --Connel MacKenzie 17:13, 4 June 2007 (UTC)
- My working assumption is that if it's a lemma form, it has the full set of sections and an inflection line. Only a non-lemma wouldn't have the full inflection line. --EncycloPetey 19:01, 5 June 2007 (UTC)
- That is not my understanding, for multi-word terms. Rather than expanding them all out to ridiculous lengths, the root form (only) is given, with each component word "wikified". --Connel MacKenzie 08:55, 11 June 2007 (UTC)
As mud. Thanks. Can you direct me to a good example so I can follow the accepted format, please? Algrif 15:11, 5 June 2007 (UTC)
Auto hiding inflection templates
There is a divide on Wiktionary between inflection templates that hide themselves and those that don't. See {{it-conj-are}} and {{fr-conj-er}}. Has a consensus been reached as to which of these is preferable? Conrad.Irwin 21:35, 4 June 2007 (UTC)
- I don't know if there's a consensus, but my personal preference is for ones that hide themselves, and I tend to assume that non-self-hiding ones were created by editors who didn't know how to do the hiding thing (or perhaps were created before it was possible). That's not counting the small, right-aligned templates like {{sv-noun}}, though, which seem fine as they are. —RuakhTALK 23:12, 4 June 2007 (UTC)
- The "hiding" conjugation tables is a very new phenomenon (in Wiktionary time) and as far as I knew, still experimental. I like them a lot. But it certainly hasn't been voted on (nor discussed enough) yet. --Connel MacKenzie 08:04, 5 June 2007 (UTC)
- I would prefer them to be seen rather than hidden. Could they be hideable - ie like hidden things but with the default the other way round? SemperBlotto 08:09, 5 June 2007 (UTC)
- I prefer them to be hidden, as when using Wiktionary I look for the meaning of words, not the linguistics of them. Something that should probably be mentioned is that the border-collapse: collapse; in the NavFrame element of the Mediawiki:monobook.css stylesheet causes the tables to be rendered with no borders in Opera 9 (XP) when they are hidden using a NavFrame. This is not a major problem, but a noticeable inconsistency. Conrad.Irwin 09:32, 5 June 2007 (UTC)
- Because they are using the same wrapper class as Translations, the WT:PREF should show/hide them identically. I agree with Conrad.Irwin regarding the definition vs. linguistics distinction. I do not know my CSS well enough to fix the problem described, for Opera9. (That should be mentioned on WT:GP, not here.) --Connel MacKenzie 15:21, 5 June 2007 (UTC)
- Some inflection tables are quite large and overwhelm the page; these at least should be hidden by default. If we need consistency, then I would vote for hiding them all by default. ArielGlenn 02:37, 8 June 2007 (UTC)
Proposed addition to CFI
Two users attempted to argue for the retention of Arsenal (an English football club) using the argument that translations into languages that use other scripts (such as Russian and Chinese) will have different forms. We don't accept this argument, as far as I am aware, but we don't say so in the section on names in WT:CFI.
I think we need to, as part of the ongoing review of which proper nouns are acceptable. Is everyone happy for me to add the following phrase to that section?
"The fact that names have translations or transliterations into languages that do not use the Latin alphabet is not, in itself, a valid argument for the inclusion of a name in Wiktionary. Such translations and transliterations can be (or will eventually be) found in Wikipedia."
And I've said this several times, but can we discuss and finish the review of what types of names we are allowing? (If we are already doing this somewhere else, please direct me to it.) — Paul G 15:58, 5 June 2007 (UTC)
- That proposed statement is wrong, for the simple reason that Wikipedia leaves the job of translations to us. Wiktionary is, according to the main page, a translating dictionary and the "lexical companion to Wikipedia". We have all of the infrastructure for articles that contain translations, and our articles are geared towards such things; Wikipedia does not, and its articles are not. We've long been the lexical companion to Wikipedia in regard of providing translations, and should continue to be. We are quite happy to translate Rome into 18 languages, for example.
The question that you should be asking is whether there actually are translations for Arsenal, rather than trying to add a new rule that contradicts long-standing practice and one of our fundamental goals as expressed right at the top of the main page. If the word does not actually have translations into other languages, then you can rebut the argument being presented without having to change the Criteria at all. Uncle G 17:51, 5 June 2007 (UTC)
- Seconded. Your text sounds like a good clarification, to me. I don't see a problem with starting a WT:VOTE (one month) on that wording. Perhaps Uncle G's concerns need to be addressed first. --Connel MacKenzie 17:56, 5 June 2007 (UTC)
- The "placenames/landmarks/geographic features" vote is currently in progress. I'm waiting to see how that turns out before broaching the wider subject of proper nouns again. I think others have stated similar hesitation. --Connel MacKenzie 17:55, 5 June 2007 (UTC)
- Uncle G, I think you are mistaken. Including "Arsenal" probably (it is currently unclear) goes against WT:CFI. Yes, we certainly should provide translations of "Rome", but not for every proper noun, as this requires us to include every proper noun, and, as WT:CFI shows, we do not do this, and will not even after we have resolved the current debate on what names we should include. There are already many translations of "Arsenal" linked to from the Wikipedia article, should anyone need them. — Paul G 18:02, 5 June 2007 (UTC)
- What makes you think "every proper noun" has established, unpredictable translations in other languages? Kappa 14:54, 6 June 2007 (UTC)
- Not that this is universally applicable to the question, but would the translation of "Arsenal" in most any language that uses a different word to identify the team simply be the translation of arsenal? bd2412 T 03:17, 7 June 2007 (UTC)
- Besides going against CFI, the general proper noun is excluded de facto, so any arguments at Wikipedia about us translating them are simply uninformed. For instance, how many, or are there even any, full names of people here? In my mind it isn't even feasible at this point. If the general consensus at Wikipedia is that the translations belong here, then nothing is going to be resolved until that's formalized, and only then begun to be addressed. And thereafter we're going to have to see a lot more cross-project coordination. Wiktionary does not have any concept of notability. DAVilla 18:19, 6 June 2007 (UTC)
- I will repeat my occasionally made argument that full names are non-idiomatic combinations of their components - given names and surnames. The components belong in Wiktionary. Full names should be excluded as non-idiomatic, not simply because they are full names. bd2412 T 03:15, 7 June 2007 (UTC)
- Interesting. I've wondered if a few full names I've heard match my own proposed criteria, that being out of context, given that I didn't get the point when they were used, mainly in comedy. But what you're leaning towards is a statement something like, proper nouns can be included so long as they're notable? That would be just that much broader. I think you'd have the support of an entire sister project then. DAVilla 15:40, 7 June 2007 (UTC)
Misspellings (technical improvement)
Due to the recent Wiki-software improvement of displaying the deletion log for deleted entries to all users, I think we should reconsider how we handle misspellings.
Simply deleting the entry now, with a link to the correct spelling, should accomodate accommodate everything we need to say about misspellings. Anyone doing a lookup on the incorrect spelling will be one click away from the correct spelling (without misleading them into thinking they spelled it right) while at the same time, allowing red-links to appear red.
Furthermore, Protected titles could have a section just for misspellings.
Sadly, CFI-passing misspellings will be a thornier issue and for now, will have to continue to be handled as we currently do. But if the deletion method is shown to work well for common terms, we can revisit what our criteria for misspellings should be (obviously, I feel they need to meet a much higher standard than we currently require.)
I propose changing WT:DELETE and WT:CFI to reflect the new technical ability of the MW software. Further clarification on misspellings would be dealt with sometime in the future. Comments are appreciated.
--Connel MacKenzie 17:41, 5 June 2007 (UTC)
- Was your misspelling of "accommodate" deliberate? ;) — Paul G 18:05, 5 June 2007 (UTC)
- Sadly, no. It does reinforce my statement, though. --Connel MacKenzie 18:14, 5 June 2007 (UTC)
- That's a good idea. I don't think we currently have any policies that encourage us to create entries with the intent of immediately deleting them, but if doing so doesn't bother you, it certainly doesn't bother me. —RuakhTALK 20:43, 5 June 2007 (UTC)
- The only problem that I can see with this is that only administrators can delete pages, meaning that most users will be unable to help with this. Conrad.Irwin 21:08, 5 June 2007 (UTC)
- Agree with the idea. And users can help. They just RFD the page to draw it to the attention of the administrators.--Richardb 11:08, 6 June 2007 (UTC)
- This will work particularly well if you create a specific template ({{RFD-misspelling}} perhaps) that categorises the pages into a subcat of the regular RFDs, akin to how speedy deletion templates work on en.wp. Thryduulf 18:39, 6 June 2007 (UTC)
- Sounds good. Should I set up {{delete-misspelling}} similar to {{delete}} then, for an experimental phase? --Connel MacKenzie 08:43, 8 June 2007 (UTC)
sisterlinks
Hello, I would like input on placing a Wiktionary-friendly sisterlinks template on each word, or at least on main words. This could become a new and encompassing policy. This would make it easier for users to navigate the numerous available links to more information on words. The template could be added to existing articles using a bot account. WritersCramp 20:40, 5 June 2007 (UTC)
- I dislike this template here. It works fine on Wikipedia, where the typical article length is much greater, but here the articles would be dwarfed by it. It's also unlikely that the majority of our words could benefit from this template. How many Commons images will there be for like? How many Wikiquotes on the? What useful Wikisource links could there possibly be? No, this template should not appear on entries. --EncycloPetey 21:45, 5 June 2007 (UTC)
- I am also against this proposal. It is a generic template that is too easy to add. While the individual templates like {{wikipedia}} are fine, because they are only added when there is definitely something at the end of the link, adding a boxful of unresearched links to any page merely clutters it up and provides frustration for those who, assuming blue links go somewhere, find that few relevant search results are returned. Although I assume that these are intended only to be added to articles where all of the links return useful information, it then becomes too great a temptation to add it where some or even most of the links are useless. Certainly on Wikipedia I have found that I ignore the {{sisterlinks}} because, from experience, I have found them to be unhelpful most of the time. Conrad.Irwin 22:17, 5 June 2007 (UTC)
- I am opposed not only to the use of this template, but to the current use of {{wikipedia}}, which should correspond solely to disambiguation pages. Specific Wikipedia titles correspond to specific definitions, and we need to work out a way to link page to several w:page (area) listings, especially for nouns with several definitions, common and technical. A separate solution would need to be worked out for each of the sister projects. DAVilla 18:02, 6 June 2007 (UTC)
- I agree in principle. The Wikipedia link should be to whatever page on Wikipedia bears the unadorned identical page name (unless we have a proper noun carrying only one possible sense, but Wikipedia has gone crazy with spinoffs, perhaps). One solution exists for specific senses in the form of the {{pedialite}} template. See Afar for a page that uses this template as I imagine it should be used: in the See also or External links section. --EncycloPetey 18:13, 6 June 2007 (UTC)
- I've extended that with a new template {{pedia}} as illustrated at English. DAVilla 19:59, 6 June 2007 (UTC)
- Wow, it seems to have broken the "never more than three lines total" rule for that type of box-thingy. Shouldn't that section be "===Further reading===" instead of "See also"? --Connel MacKenzie 03:36, 7 June 2007 (UTC)
- I consider External to mean outside of Wikimedia. After all, we link our etymology templates to Wikipedia without noting it's external, and some definitions of proper nouns do the same. I see no problem with setting {{pedialite}} links under See also based on this. --EncycloPetey 16:24, 7 June 2007 (UTC)
- I think {{wikipediapl}} is better; it has less random whitespace, and it's more obvious where the link is. (My initial assumption when I saw {{pedia}} was that "Wikipedia" linked to information about Wikipedia and "disambiguation page" linked to information about disambiguation pages.) —RuakhTALK 14:50, 7 June 2007 (UTC)
- Don't like the links? Okay, thanks for the feedback. I'll change them back to the standard. As for the look, okay, so it's not polished. Give me a chance to clean it up. My motivation was not the look, it was the function. See, for example, fish, which I don't really know how to fix actually. DAVilla 15:31, 7 June 2007 (UTC)
- For another way to do this, how about adding links on the left margin, where the interwiki links are for other languages? They could even be coded similarly. DAVilla 15:07, 7 June 2007 (UTC)
- To elaborate on the opposition, it would make sense to link between projects correctly, just as links on Wikipedia itself are disambiguated to point to the correct, relevant topic, for the convenience of users. Otherwise, you're asking every user to rediscover every search for him- or herself. In that sense, I don't think the sisterlinks template works well even on Wikipedia.
- If you want people to be able to search one project from another, and if the ability to just go to that site isn't enough, then how about adding it as a search option? I had long ago made a similar proposal for the other language projects by using a [drop bar]:
- SEARCH FOR
- _________
- IN ANY LANGUAGE
- WITH DEFINITIONS IN
- [English]
- Another drop bar could select the project, although it would have to reword the text somewhat. DAVilla 15:07, 7 June 2007 (UTC)
- On the other hand, w:Template:sisterlinks could be very useful on a failed search page. DAVilla 15:31, 7 June 2007 (UTC)
- Agree with the negatives. Doesn't look useful to me. --Richardb 11:04, 6 June 2007 (UTC)
- I also agree with the rejection of the use of this template, it just seems to be linking for the sake of it.--Williamsayers79 13:02, 6 June 2007 (UTC)
Inconsistency between processing of {{idiom}} and {{context|idiom}}
The discussions of idioms initiated above, on 21 May and 29 May, took lots of twists and turns but I don't think they really addressed something which is starting to rub me the wrong way. I hate to make a big deal out of this or risk starting a pissing contest, but I have just realized that there seems to be a fly in the ointment in the way English idioms are being auto-categorized. If an editor uses the template {{idiom}} then a parenthetical (idiomatic) is inserted at the start of the definition and the entry is automatically categorized under Category:Idioms. But if the editor uses, say, {{context|US|idiom}}--or even just {{context|idiom}}--then a parenthetical (idiomatic) is still inserted at the start of the definition but the entry is now automatically categorized under Category:English Idioms. The result is a dreaded thing in a dictionary--inconsistency, with entries (including my own) that rightfully belong in a single category spread randomly over two. All of this leaves me a bit green about the gills. Would it really be such a tall order for one of these categories to be emptied into the other and for the functioning of the two templates to be rendered consistent? -- WikiPedant 03:50, 6 June 2007 (UTC)
- The two uses are supposed to be identical. There is a problem with {{language}}, which is called from {{idiom}}, that might be to blame. I've been meaning to look into the former for some time, and I'll do so tonight. DAVilla 04:55, 6 June 2007 (UTC)
- Yes I think it is a good idea to rationalise these templates and have a single category. It had put me off tidying up idioms in the past.--Williamsayers79 13:00, 6 June 2007 (UTC)
- Thank you, DAVilla, for looking into this. I hope it's readily fixable. -- WikiPedant 16:30, 6 June 2007 (UTC)
- Okay, {{language}} is fixed. Most everything should be under Category:English idioms shortly. The remainder are hard-coded cats or cases where the language code cannot be deciphered. DAVilla 02:32, 29 June 2007 (UTC)
Idea about "Random Page"
- Large portions of discussion here moved to Wiktionary:Grease pit#Random page DAVilla 23:11, 8 June 2007 (UTC)
It seems somewhat pointless to have a random page section which can link to words from any language. The only function for a random page capability in Wiktionary would be to learn new words from the language which you speak, or desire to speak, and not from other languages. I think it would be a good idea to have the ability to view random pages from one's desired language only, English, Spanish, etc. But as I am browsing for words I don't know, I find that I must first go through many words in languages I don't quite care to see.
Would this kind of idea be possible to put in to use?
--68.41.43.166 01:46, 7 June 2007 (UTC)
- Yup. http://tools.wikimedia.de/~cmackenzie/rnd-en-wikt.html
- Or http://tools.wikimedia.de/~cmackenzie/rnd-wikt.html (to pick a language other than English.)
- --Connel MacKenzie 03:21, 7 June 2007 (UTC)
- To anyone using the old interface: the language parameter changed recently, from the 639-2 code, to the full language name instead. For example, instead of "lang=ru" use "lang=Russian"; instead of "lang=gae" use "lang=Scottish_Gaelic", etc. --Connel MacKenzie 06:04, 7 June 2007 (UTC)
- Further discussion moved to the Grease Pit. Please see Wiktionary:Random page for some basic information on how this will be set up. DAVilla 23:01, 8 June 2007 (UTC)
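As a rough illustration of the parameter change described above, the following Python sketch rewrites an old-style language code into the new full-name form. Only the two pairs mentioned in the thread are included. The helper name `new_lang_query` and the idea that the value travels as a `lang` query parameter are assumptions made for illustration, not a description of how the tool itself works.

```python
from urllib.parse import urlencode

# Only the two examples given in the discussion; any further entries
# would have to come from the tool itself.
OLD_CODE_TO_NAME = {
    "ru": "Russian",
    "gae": "Scottish_Gaelic",
}


def new_lang_query(old_code):
    """Return a new-style query string for an old-style language code.

    Hypothetical helper: raises KeyError for codes not listed above.
    """
    return urlencode({"lang": OLD_CODE_TO_NAME[old_code]})


if __name__ == "__main__":
    print(new_lang_query("ru"))
    print(new_lang_query("gae"))
```

A lookup table is used rather than any automatic conversion because the thread gives no rule for deriving the full name from the old code.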
Make or Do
Collocations with the verbs make and do cause a lot of problems for non-native English speakers. I'm not sure if this is within the scope of this project, but I feel that it would be useful if there was a category(?) or other means of separating the expressions / collocations that use these two verbs. Make the bed. Make a phone call. Do the shopping. Do your homework. etc etc etc. Ideas or comments, anyone? Algrif 16:07, 7 June 2007 (UTC)
- I think an Appendix article might be more useful (e.g. Appendix:make). --EncycloPetey 16:18, 7 June 2007 (UTC)
- My initial reaction, too, is for an appendix entry. On the other hand, Paul G has done some really good stuff with "hidable" 'Related terms' sections. By merit of being in the entries in question, the 'Related terms' approach might be much better. --Connel MacKenzie 16:40, 7 June 2007 (UTC)
- There's no reason we couldn't do both. A list of Related terms certainly has uses, but an Appendix can provide a comparison and summary more easily than a list of terms can. --EncycloPetey 17:03, 7 June 2007 (UTC)
If it is possible to do both, that sounds good to me. If we can give the go-ahead, I would be happy to do the donkey work. Algrif 15:04, 8 June 2007 (UTC)
- I don't see any problem with either or both approach. --Connel MacKenzie 19:37, 8 June 2007 (UTC)
Tell me when it is set up and I'll get cracking. Algrif 12:28, 9 June 2007 (UTC)
The Vietnamese Wiktionary has a few tables like this: be, for instance. I included archaic forms because Shakespeare is quoted so often. – Minh Nguyễn (talk, contribs) 07:26, 11 June 2007 (UTC)
- Could an admin. set this appendix / hideable related terms up so I can start filling them in, please? Algrif 11:13, 12 June 2007 (UTC)
Do we have a template for foreign language words lacking English translation?
The German word None doesn't have an English translation. When originally posted the edit summary read: I have no idea what the proper English word for this is. If someone knows it, please insert. I figure there ought to be a template for this type of situation. __meco 18:20, 7 June 2007 (UTC)
- Maybe {{rftrans}}? That seems to be mostly unused. --Connel MacKenzie 19:35, 8 June 2007 (UTC)
- Perhaps {{substub}} is better for that example. --Connel MacKenzie 19:35, 8 June 2007 (UTC)
- Category:Requests for language cleanup June is what I check on. I don’t believe I’ve encountered {{rftrans}} or {{substub}} before. —Stephen 01:50, 9 June 2007 (UTC)
- Yes, entries tagged with {{subst:nolanguage}} are obviously more quickly eliminated. --Connel MacKenzie 08:51, 11 June 2007 (UTC)
An idea for automatic information gathering from users
Dear Wiktionary community,
I tried to participate in development of few articles in English and in Russian. The problem is that I'm not a professional linguist and cannot contribute with high level material but, at the same time, willing to do so.
Creation of good level articles takes much longer time for me and costs much more efforts, than similar contribution would take to a professional linguist. I think I (and many other people like me) can bring some valuable information into the project.
THEREFORE!!! I suggest a new way of gathering information from regular users, which can be later edited by professionals.
The information can be collected by asking users to answer questions like "can you translate this word/phrase/sentence?" or "do you think this *** is a correct translation of ***?" or "do you know a plural/singular/... form of *** word?", "can you give a sample of phrase/sentence with this *** word?" and many others.
This way some elementary information can be gathered and checked (by the users as well), which can be later sorted by professional linguists and editors.
It would need a button like "do you want to contribute to the project by answering few questions?" to appear here and there leading to a page asking on which languages the user can contribute (translations and single language information). Then it should be possible to "answer", "not answer", "stop answering"....
What do you think about it? Regards Leonid Paramonov
- I think that is the best, novel suggestion I've heard in a long time. I shall give an implementation (of my interpretation of what you said) some thought. P.S. Please create an account! --Connel MacKenzie 23:11, 7 June 2007 (UTC)
Thanks! Who knows - this might become a self maintaining&expanding system of its own interest. I was thinking about creation of information rating by users votes (answering yes/no to the question "do you think it is a correct translation/spelling/*?"). This can be applied to information fragments or whole pages. Low rating would indicate necessity for more attention/edition AND as a result more "checking questions" given to users from low rating fragment.
Leonid
P.S. I do have an account but was not using it for a while (don't remember the user name)
P.P.S. my new user name is --Leonid 16:19, 8 June 2007 (UTC)
I think it might be a good way to start filling in blank pages which are dominating in most of the languages. And it is not because someone is not working well enough. It seems to be a natural way of building up this information. Very simple calculation shows, that (at least on the initial stage) number of new pages should grow at least in geometric progression (if not exponentially) before the saturation can be reached (if it can be reached at all), while the linguists, who are working on the project have only finite time (voluntary working on it) to fill in the gaps.
Leonid 09:34, 13 June 2007 (UTC)
Verbix
Does anyone know if we may borrow conjugations from Verbix? I've asked them through email, and I haven't got any reply yet. Do you think we can? Smiddle / TC@ 19:34, 8 June 2007 (UTC)
- I see "© Verbix 1995-2006." at the bottom. So, no. No way. --Connel MacKenzie 20:07, 8 June 2007 (UTC)
- I sometimes consult sites like Verbix when I'm adding a conjugation table for Spanish, just to make sure I haven't made any mistakes, but only after I've written out all the conjugations. I don't believe this would constitute a copyright violation; I just see Verbix as a replacement of sorts for the conjugation tables in the back of my English→Spanish dictionary that I used to consult occasionally. Just make sure not to rely on those sites if you don't already know how to conjugate in that language. – Minh Nguyễn (talk, contribs) 07:21, 11 June 2007 (UTC)
Time to phase out {{wikipedia}} and friends
As mentioned on WT:GP, I've written a new template to replace the functionality of {{wikipedia}} and his floating box friends. The box 1) screws with page formatting, especially images and section editing, 2) is ugly, overly intrusive, and incongruous with the rest of the page, especially when more than one (or three), 3) as a consequence, has no consistent placement, ending up wherever it fits best. However, problems with other solutions, like {{pedialite}} are the opposite: it's invisible, especially if you are only viewing a single language section in a long page.
This last problem is, I think, solved with the new "interProject" I imported from Spanish Wikipedia a few days ago. It puts links in the sidebar like interlanguage links: I put it in {{wikipedia}}, so you may have noticed it already. Putting that into {{pedialite}}-version templates both solves the intrusiveness, ugliness, and formatting problems of {{wikipedia}} while offering the noticeability of the box. I've created a single template that can incorporate any of the projects with parameters at {{projectlinks}}. It calls for a separate section at the bottom of the page like external links. See United States for an example of how this can be used. The mock-up at Template:projectlinks shows all the parameters. It is also easier to use, consolidating all the projects and languages into one template as well as the parameters to make piped links, and make it link to a different page than the page name. The code looks simple, like:
== Sister projects ==
{{projectlinks|pedia|lang1=es|source|commons|page3=foo|label3=bar}}
The documentation of the parameters is at Template:projectlinks. Perhaps I'm getting ahead of myself, but I would like to propose that we migrate all of our crossproject links to this unified system. (Also, thank you to Pathoschild for help on the template coding.) Dmcdevit·t 00:54, 9 June 2007 (UTC)
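One way to read the sample call above is that each unnamed parameter names a project, and each numbered parameter (`lang1`, `page3`, `label3`) attaches to the project in that position. The Python sketch below groups the parameters on that reading. It is an interpretation of the example only, not the template's actual logic; the function name `group_projectlinks` is hypothetical, and the parsing deliberately ignores nested templates and piped links.

```python
import re


def group_projectlinks(call):
    """Group a {{projectlinks|...}} call into one dict per project.

    Assumption: a named parameter such as "lang1" or "page3" applies
    to the unnamed (positional) project with that 1-based index.
    """
    inner = call.strip()[2:-2]          # drop the surrounding {{ and }}
    _name, *args = inner.split("|")     # first field is the template name

    projects = []                       # positional parameters, in order
    numbered = {}                       # index -> {"lang": ..., "page": ...}
    for arg in args:
        if "=" in arg:
            key, value = arg.split("=", 1)
            match = re.fullmatch(r"([a-z]+)(\d+)", key)
            if match:
                index = int(match.group(2))
                numbered.setdefault(index, {})[match.group(1)] = value
        else:
            projects.append({"project": arg})

    # Attach each numbered option to the project in the same position.
    for index, project in enumerate(projects, start=1):
        project.update(numbered.get(index, {}))
    return projects


if __name__ == "__main__":
    sample = "{{projectlinks|pedia|lang1=es|source|commons|page3=foo|label3=bar}}"
    for entry in group_projectlinks(sample):
        print(entry)
```

On this reading the sample call links to the Spanish-language Wikipedia, to Wikisource under the page's own name, and to a Commons page named "foo" displayed as "bar".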
- Note that I've edited your change to {{wikipedia}} to remove two extra blank lines it introduced into the text. People should be careful not to leave a blank line at the bottom of a template (e.g. the last position the cursor can be moved to in the edit window is the end of the last line of text) unless it is desired. Robert Ullmann 15:44, 9 June 2007 (UTC)
- I do not agree with certain aspects of your solution. We need to really think through how to link to each sister project, especially Wikipedia for which there's a distinction between disambiguation pages, paralleling our own entries in some ways, and pages that refer to specific meanings of a word that may have many definitions. {{pedialite}} or an alternative needs to be included several times, and having a single call to do this is not elegant. I have an example at fish that's a little ugly visually (I don't know how to fix the CSS stuff) but makes that nice distinction in code, using a new template {{pedia}} which calls either {{pedialite}} or the new {{sister}} depending on the parameter. The one thing I fully agree with you on is that the {{wikipedia}} template should be phased out in the near future. DAVilla 01:15, 9 June 2007 (UTC)
- I agree also that the {{wikipedia}} should go (even though I've added a lot of them) - seems like I never see them in the same place twice, although I always try and stick 'em at the top of the page. What about links to foreign-language Wikipedia projects (I've put in a number of those as well, see e.g. deism and pandeism)? bd2412 T 01:26, 9 June 2007 (UTC)
- The template can easily link to disambiguation pages, or any page with a different name, where needed, with the "pageX" parameter, and to any language for any of the projects (with "langX=Y"). Take a look at the documentation. I'm not sure what specifically you don't like about how it would look at fish. I've changed fish to this template (revert back after you've seen it if it's terrible), so you can see what I'm getting at. We can work on the details. Dmcdevit·t 01:28, 9 June 2007 (UTC)
- The rest of this conversation was moved to Wiktionary:Grease pit#Template:projectlinks. DAVilla 11:24, 9 June 2007 (UTC)
I have created a vote at Wiktionary:Votes/2007-06/Wikipedia box template but I do not intend to start it unless it is unclear where the community stands. As it pertains to Wikipedia, I'm in favor of keeping the boxes for disambiguation pages only. Does anyone object? DAVilla 12:15, 9 June 2007 (UTC)
- I'm not sure that recommending 'disambiguation's is the best way to go. We really have approached this problem piecemeal throughout en.wiktionary's history. I'm arriving at the conclusion now, that the only external links we should have, are to other WMF sister projects (outside of quotations, of course.) Anything that needs to reference a more distant web-reference probably belongs in an appendix, not the main namespace. --Connel MacKenzie 09:06, 11 June 2007 (UTC)
- DAVilla, I think that would be a disservice. I commented on this issue somewhere, but my comments seem to have vanished into the ether. If we have a policy of linking to disambiguation-only, we are taking a very POV approach. What happens when we link to w:America with a box because it has a disambiguation page, but we don't link to w:Venezuela because it doesn't? I don't think that's a fair or reasonable approach.
- I would rather see a policy of (1) No more than one box per entry. (2) Use the box to link only to the most generally relevant page on Wikipedia (disambiguation when appropriate). (3) Use in-line links only in a section at the end of the article (though that is a separate issue from what we're discussing). The box is a useful visual clue that a more detailed article exists. For lengthy pages (and we want all our articles to grow to full size), the box makes the link much easier to find, especially since it usually resides in the upper right corner of the page. For major proper nouns like countries, languages, deities, religions, etc, an encyclopedia article is going to be much more enlightening than a simple definition. The boxes assist with that. --EncycloPetey 20:11, 29 June 2007 (UTC)
Request for layout
I spread the gospel of Wiki a lot, but I wish that the homepage more prominently displayed the other uses and benefits of the wiki network. In other words instead of the familiar globe and language page, or in addition, please show wiktionary etc. — This unsigned comment was added by Nathanday01 (talk • contribs) at 02:04, 9 June 2007 (UTC).
- I've read your comment several times trying to understand what you are saying, but I still can't figure it out. Are you talking about wiktionary.org, wikimedia.org, or wikipedia.org? And which wiki network are you talking about? The Wikimedia Foundation? What is the "familiar globe and language page"? Sorry, I can't figure out what you're saying. ~MDD4696 20:01, 9 June 2007 (UTC)
- I think they're suggesting that we put a "Features" section on wikipedia.org, since that page is currently just a big logo and an enormous list of languages. But that's intentional: people who speak only French wouldn't understand a "Features" section in English, and neither would speakers of only Spanish, Russian, Vietnamese, Turkish, and Greek. Since it's such a multilingual website, we can't really show the user anything until we know what language they use – that is, once they click on a language at wikipedia.org. – Minh Nguyễn (talk, contribs) 07:17, 11 June 2007 (UTC)
Why the two categories? Aren't they the same thing? Looking at the verbs entered there, it seems that no-one knows why one or why the other. While I'm on it, can we please standardise the usage notes on modal verbs in some way? Algrif 16:39, 10 June 2007 (UTC)
- The first category is for the lemma (i.e. the infinitive form); the latter category is for non-lemma (i.e. inflected) entries. --EncycloPetey 16:55, 10 June 2007 (UTC)
So, we shouldn't have any modal verbs in the first one then? Algrif 17:33, 10 June 2007 (UTC)
- Eh? Not at all. If anything, I think there are certain modal verbs that should be in both — could, for example, is an inflected form of the modal auxiliary can, but is also really a modal auxiliary in its own right. (It's been argued that it's a "remote" form of can in all its uses, whether remoteness of time — "Yesterday we couldn't find it" — or of possibility — "Do you think a boy could ever swim faster than a shark?" — or of relationship, i.e. politeness — "Could you do me a favor?" — but I don't think we help our users at all by labeling it so.) —RuakhTALK 18:24, 10 June 2007 (UTC)
The problem is that none of the following are infinitives in any sense of the word:- can, could, will, would, shall, should, may, might, must, ought to.
- So what you are saying, as an example, is that both can and could should be in the first lemma category, but only could in the second category due to its use as the past of can (usually in reported speech type clauses).
- Please correct me if I am wrong. Thanks.
- One further question. Would it be a good idea to open an appendix for some basic usage notes (read: grammar) for the English modal verbs? It would certainly help in tidying up the pages for them. Algrif 11:20, 11 June 2007 (UTC)
- Re: "The problem is […] must, ought to.": Yeah, they're not infinitives, but they're our lemmata, if only because we don't have an alternative: those verbs are defective, and don't have infinitives. (Every word needs to have a lemma form, so if the normal lemma form doesn't exist — say, a noun that exists only in the plural — we have to make do.)
- Re: "So what you are saying, […] if I am wrong. Thanks.": Yes, except that that's just my opinion — I'm not aware of any previous discussions on the topic — so if you or anyone else disagrees, then we should discuss it.
- Re: "Would it be a good idea […] modal verbs?": I think so, yes. Keep in mind, though, that we are descriptive, not prescriptive; we describe the grammar that people use, not the grammar that people perhaps should use. (At least, we can describe both, but the former is of primary importance, and descriptions of the latter should take the form "Such-and-such reference claims that blah should only blah blah blah.")
Thanks. I wouldn't disagree. I'm just trying to find some sort of consensus to follow because, at the moment, this particular group of very important verbs look very messy. If I were a student of English looking for clarification of some kind, I would have to conclude that even the English don't know what to do with these verbs. :)) Algrif 16:14, 11 June 2007 (UTC)
- Could an admin. set up an appendix for this then, please? I'll happily fill it in once it is there. Algrif 11:10, 12 June 2007 (UTC)
- I'll do it myself, now I've worked out how it's done. Algrif 10:44, 20 June 2007 (UTC)
Small Improvement re Wikisaurus Citations
I added the following to the Wiktionary:Wikisaurus/criteria page. Hopefully it's not controversial. But, if you feel we do need a vote on this addition, by all means say so here. --Richardb 07:40, 11 June 2007 (UTC)
- With the idea of citations becoming adopted more, consider putting the supporting citations, google count etc in the .../citations sub-page. And include a link in the main entry, See Cites, linked to the citations page.
- Absolutely not. WTF is the namespace for? --Connel MacKenzie 08:40, 11 June 2007 (UTC)
- What namespace is that you are referring to ?
- When did you create this page? It is not true; WS pages and their contents must meet WT:CFI. And why did you add the banner atop it, without anyone's input at all? --Connel MacKenzie 08:43, 11 June 2007 (UTC)
- Page now listed on WT:RFDO. --Connel MacKenzie 08:48, 11 June 2007 (UTC)
- Connel's RFDO entry immediately deleted. I'm not going to put up with that Jack-boot approach any more. I'm more than willing to be corrected, and to debate issues, But to be stomped like that is outrageous!!!--Richardb 09:56, 11 June 2007 (UTC)
- Did you seriously just delete the entire request for deletion page because you disagreed with someone's nomination? And with a personal attack in the deletion summary and another one here? Please get a grip. Dmcdevit·t 10:08, 11 June 2007 (UTC)
- For my full personal explanation, please go to User:Richardb/explanation-12-Jun-2007
Page Wiktionary:Quotations, should it be renamed "Wiktionary:Citations" ?
I was going to put a link to Wiktionary:Citations in reference to the above discussion, but when I went to look at that page, I was redirected to Wiktionary:Quotations. We seem to have changed our terminology in recent times, so I believe this page Wiktionary:Quotations should be renamed to "Wiktionary:Citations". Do we need to vote such a change ? Or do we really need two separate pages "Wiktionary:Quotations" and "Wiktionary:Citations" ? --Richardb 07:40, 11 June 2007 (UTC)
- Calling them "citations" is a misuse. We act as a secondary source, listing how the word is used. Those are not "citations", they are "quotations." Why would you want to rename it to the wrong thing? --Connel MacKenzie 08:40, 11 June 2007 (UTC)
- Connel. You seem to be so blinded with rage with anything to do with Wikisaurus that you fail to even read stuff. I was pointing out only that the Wiktionary:Citations page presently just redirects to the Wiktionary:Quotations page. There is currently no difference ! I'm suggesting that either the redirection is wrong, and we need two separate pages (which maybe you support), or maybe we really only need a page about Citations.
- To quote one part of the page "The appropriate section title is "Quotations", a lever four heading. You may also find "Citations", but it has not been determined which one is best. "Quotations" is in the majority, though". In other words, this page is way out of date. So don't go stomping on me because I'm not quite clear on what is happening with "Citations" or "Quotations". The pages I naturally turned to - Wiktionary:Citations and Wiktionary:Quotations are one and the same, and way out of date.--Richardb 10:04, 11 June 2007 (UTC)
- The wild misrepresentations you are firing off, are amusing. No, you added a policy banner to that page, which did not have one.
- Call me a liar if you want to, but the page history is clear that the Policy Header was put on that page back in May2006. --Richardb 13:13, 12 June 2007 (UTC)
- Because it is not policy. It is not policy, because it says the exact opposite of what the WT:VOTE on the topic concluded. The only thing out of date, about it, was that it was never cleaned up (deleted) when it should have been!
- There is no such VOTE. As is often the case, CM asserts something is common practice, or has been VOTED on, when there is no truth in his assertion, and the evidence to counter him can be shown.
- Contrary to CM's assertions, Wiktionary:Wikisaurus/criteria is a valid policy page. It has been a policy page since May 2006. Furthermore, Wiktionary:Wikisaurus/criteria is referred to by a paragraph in WT:CFI [[1]] that has been there since May 2006. And WT:CFI has been VOTED on as policy since then, I believe. If WT:CFI is policy, then Wiktionary:Wikisaurus/criteria is policy too, by virtue of this clear, long standing exception reference. --Richardb 13:13, 12 June 2007 (UTC)
- But your misplaced anger at me (it was just cleanup!) turned quite ugly, when you deleted WT:RFDO. It is as if you've been in the WM-tech channel recently (or reading offline logs of it) searching for possible catastrophic actions (such as deleting and restoring a page with a very high number of revisions...something that has been a problem the past couple days.) If we are all lucky, the devs have already fixed it. If not...well, I guess we'll see en.wiktionary when it is restored from backups (I think they did one within the last month.) --Connel MacKenzie 10:13, 11 June 2007 (UTC)
- Assume bad faith if you want to. It was an innocent mistake. See User:Richardb/explanation-12-Jun-2007--Richardb 13:13, 12 June 2007 (UTC)
- Apparently, that bug has been fixed. --Connel MacKenzie 10:22, 11 June 2007 (UTC)
- You know, that was my thought, too, but apparently (according to our own definition, as well as the OED's and those at Dictionary.com) the word "citation" can mean simply "quotation". I'd still rather we just used the word "quotations" — call a spade a spade, and all that — but it doesn't seem to be a huge deal. (?) —RuakhTALK 16:06, 11 June 2007 (UTC)
- The practice up until this point has been to call the section on an entry page Quotations, but to call a subpage listing such quotations Citations. This disparity was in existence before I began editing here so I know neither the history nor the rationale. I suspect it had at least something to do with distinguishing an in-page section from an off-entry subpage, but that's just a guess. In any case, most of our ELE policy elaboration pages are named for the sections they discuss, so if we continue to use the section header Quotations, then the information about formatting that section should be at Wiktionary:Quotations. If we do vote for a Citations namespace, then I assume information about that namespace would reside at a new page called Wiktionary:Citations (replacing the redirect). If that should happen, each page (both Q & C) should have a clear statement at the top explaining what it does and does not cover, and link to the other page. That's my opinion anyway. --EncycloPetey 04:09, 12 June 2007 (UTC)
- Don't we need the explanation Wiktionary:Citations even now, if we are already using /citations sub-pages? All I was trying to point out is that there is confusion - the documentation seems to confuse Quotations and Citations. I'd volunteer to have a stab at clearing it up, but given the current mood of CM towards me, I won't bother any time soon. Nothing I could do would satisfy him ! Personally I don't have much passion about which way it goes, just as long as it is clear and documented, not just left as "common practice" for everyone to guess at, and CM to rule on.--Richardb 13:13, 12 June 2007 (UTC)
- The /Citations sub-pages probably won't exist for much longer, so it would be appropriate to put off changes to the description until the practice (or at least the goal) is more firmly established. 203.154.48.179 19:58, 28 June 2007 (UTC)
- Couldn't we just always call them quotations? Despite what the dictionary says, citing usually means giving the name of a source material and quoting means giving an excerpt from a source material. wikipedia:Quotation and wikipedia:Citation also make this distinction, shouldn't we too? Of course on Wiktionary when we quote we also cite, but the main content is the quotation (used for illustrative purposes), not the citation (used to make it easy to verify the quotation). -- Coffee2theorems 16:57, 14 June 2007 (UTC)
- I agree with EncycloPetey: Quotations should be documented on WT:", and that page should be left as it is and where it is. The Citations namespace will probably be documented on WT:Citations, although I agree with Coffee2theorems that these are all Quotations and should, ideally, be called Quotations. — Beobach972 18:07, 14 June 2007 (UTC)
- It would be a friendly amendment, but it does need some discussion. Does Quotations: cover the case of references? 203.154.48.179 19:58, 28 June 2007 (UTC)
Richardb
I don't like having to bring this up here, but if for no other reason than that I just blocked an admin, I figure there ought to be a discussion. A very large proportion of Richardb's edits are hysterical personal attacks, mostly (all of them?) against Connel. A few minutes ago he just took it one step further, by deleting WT:RFDO [2] when Connel nominated a page that he had himself created. Reverting the nomination would have been a conflict of interest, and childish enough, but using his adminship to delete the page is downright disruptive. I gave him a one day block, either so that he can cool down, or, if not, so that we won't have to have a wheel war over this tonight. Dmcdevit·t 10:25, 11 June 2007 (UTC)
- I agree that deleting WT:RFDO is unacceptable behavior for a sysadmin. Connel's reaction was to start a vote to desysop Richardb (WT:VOTE#User:Richardb). Seems like a logical thing to do, although I'm not sure that we have explicit policies for such an eventuality. That is probably because nobody expects a sysop to act in such an irresponsible manner.
- As to the original beef between Connel and Richardb, why exactly do we need a separate Wikisaurus? Why can't you just list synonyms and antonyms in a collapsible section of the wiktionary entry and call it a day? I, for one, would never think to look for synonyms in something called Wikisaurus:some word. -- A-cai 21:00, 11 June 2007 (UTC)
- Yes, it appears I was writing this at the same time as Connel's proposal, but I think the community should have a discussion before any vote. I'll put up a link here from the vote. Dmcdevit·t 22:00, 11 June 2007 (UTC)
Personal Explanation by Richardb
(After 24 hour block has finally expired)
For my full personal explanation, please go to User:Richardb/explanation-12-Jun-2007. In brief, I felt I was being unjustly and outrageously attacked by Connel, who was wrongly quoting policy etc. I decided to delete his RFD entry. I was viewing the RFD entry, and made the innocent mistake of using the delete tab. I was wrongly thinking that the delete tab would delete only the entry I was viewing, not the whole RFD page. I am not a deletionist, and rarely use the delete tab for anything other than cleaning up my own mistakes. I was unaware I had mistakenly deleted the whole RFD page until dmcdevit had already blocked me.
You will also see some other points of view I have in User:Richardb/explanation-12-Jun-2007 about CM's behaviour, totally contradicting policies, and the nature of his constant attacks on me. I would welcome other people letting me know if I am right or wrong about these.--Richardb 12:33, 12 June 2007 (UTC)
Outside view
I'm a fairly new contributor here (but an experienced sysop at en.wikipedia and commons) and while I have been learning the ropes of Wiktionary, I have been struck by how few user and editing disputes there are here in general compared with en.wp.
The only exceptions to this that I have experienced all seem to involve Connel MacKenzie, who comes across as being particularly bad tempered and ready to assume bad faith at the drop of a hat. Examples of this are the phrasing of his comments at WT:RFD#do_exactly_what_it_says_on_the_X - particularly with reference to the further nomination of what it says on the tin as "not idiomatic" without apparently reading the entry here, the linked Wikipedia article or searching for any references. I may have gone overboard in my citing, but as I said I am new here and even if I wasn't I don't think it deserved the tone of response it got.
WT:TR#usurer is another example where I feel Connel has been neglecting to act in a manner appropriate to collaborative editing. I agree that he is not the only one there who has been uncivil, but he is also being unreasonable in failing to read and respond to the points of others and dogmatically espousing his point of view without providing evidence when requested. It seems common that Connel can drive other users to personal attacks against him when they seem to be able to control themselves in interactions with others.
I am not familiar with the ins and outs of this whole situation (including almost no knowledge of the Wikisaurus issue), and I have not had any interaction with Richard B that I can remember off-hand, so my opinions are probably not as neutral as they might be. However, my impression is that although Richard has not acted in a saintly fashion, in the events of the past day or so he is the greater wronged.
All of this will be just more hot air unless a way forward can be found. As a start I propose a two-part set of actions to move us beyond where we are:
- I suggest both Richard and Connel are placed, by the community, on a civility parole.
- In spirit this is like the English law practice of being bound over to keep the peace. What this entails is that if either of them is uncivil (based on the principles of the proposed Wiktionary:Civility) or engages in personal attacks, an uninvolved admin can block them for up to 24 hours, depending on the severity. This would be accompanied by a full note by the blocking admin on their talk page detailing what precisely they were blocked for. This is similar to many remedies imposed by the Arbitration Committee on the English Wikipedia.
- Connel and Richard both agree not to delete, mark with {{delete}} or list on WT:RFD any entry or other page that the other either started or has made significant edits to.
- If they feel that anything does need any of these actions taking, then they should ask an uninvolved administrator to take a look and act as they feel appropriate.
The aim of these is not to punish, but to enable rational discourse to take place, where a permanent solution can be found. I would suggest that both of these last initially for 3 months, if during that time the community feels that they aren't working or have worked to such an extent that they are no longer needed (this is obviously my hope) then this can be revised. Thryduulf 14:30, 12 June 2007 (UTC)
- ps. I should have made clear that sober discussion on this is what I'm after initially. I don't know the ins and outs, and this is not an attempt to impose my view from on-high but as an idea from which the community can build if it collectively wants. Thryduulf 14:32, 12 June 2007 (UTC)
- It is my sincere belief that all other sysops are too tainted by Wikipedia assault tactics such as the notion of even having an "arbcom", to even touch things that they know should be fixed (immediately, in some cases.)
- The suggestion of not tagging things for discussion is ludichrist.
- Regarding usuress, the suggestion that arguments (in direct conflict to references and sources provided) is absurd. A known "problematic" contributor (confer coordinate) is pushing an invalid POV without any evidence whatsoever. That is the same "problematic" contributor that reacted to actual references with accusations of insanity.
- Regarding what it says on the tin; the obvious mistake was that it is a British idiom, completely unheard of, over here. Saying "not idiomatic" is exactly what the complaint was. So, expressing myself in a concise manner, direct and to the point, is "in-civil?" That, good sir, does not follow. The statement "<sarcasm>for the overkill</snarky>" was an obvious attempt to infuse some levity and humor into a discussion that had an unpleasant tone! Did you really take offense at that?
- I had blocked myself for yesterday as a display of understanding and good faith. Since that did not help in any way, your proposed remedy will likely continue only to reduce productive conversation (as yesterday's block did) rather than "provide a way forward."
- --Connel MacKenzie 19:55, 12 June 2007 (UTC)
- I agree with you that not RFD-ing articles is no solution; discussion is an essential part of Wiktionary. Rather, the solution is to undertake a modicum of civility in RFD and other discussions. (I've seen you do it before, so I know you can.) —RuakhTALK 22:48, 12 June 2007 (UTC)
Maybe we just need a modified Godwin's Law approach — the first person to accuse someone else of bad faith automatically loses the argument. :-) —RuakhTALK 16:28, 12 June 2007 (UTC)
- Ah, but it's a simple matter to accuse someone of assuming bad faith instead, to win the game. ;-) Dmcdevit·t 20:00, 12 June 2007 (UTC)
Workshop on Computational Lexicography
Recently, a workshop on Computational Lexicography was announced on linguist list, taking place in Vienna in October. People here might be interested… See Linguist list. H. (talk) 16:23, 11 June 2007 (UTC)
What is "Common practice" ???
I've put up this page for discussion: User:Richardb/common practice. At this stage, this page is just here for discussion. If we can get some consensus, perhaps we could later promote this to be a POLICY page.--Richardb 13:33, 12 June 2007 (UTC)
- While the tone of that page is overly-hostile, and counter to "moving forward," the intent of it seems good. From the inception of WT:VOTE, the intent to have "confirmation votes" has been clear. Obviously you think that is not clear enough, so you wish to start this page? To me, it seems a little redundant. Shouldn't this be summarized in a paragraph or two of WT:VOTE itself? --Connel MacKenzie 21:16, 12 June 2007 (UTC)
- I truly fail to see any hostility in that page. Perhaps you are reading something personal into it that just isn't there.
- Reason for having it as a separate page is for simple clarity, and ease of referring to it from other places. It is a suggested simple policy page.--Richardb 00:24, 13 June 2007 (UTC)
- I don't see any hostility in the page myself. It seems to be rather carefully neutral, in fact. I'm not sure that this page needs to be a policy, though it is useful information. --EncycloPetey 17:25, 13 June 2007 (UTC)
- EP, I think that was a result of this. --Connel MacKenzie 18:58, 13 June 2007 (UTC)
- Hmmm, I see what you mean. The original version is a bit raw. --EncycloPetey 22:21, 13 June 2007 (UTC)
- I like the improvements, thanks Ruakh. EP, what is the objection to "policy". In my life in business, and in sports admin, it is "common practice" to put all established "Common practice", "Definitions" etc into a policy manual, so there is no room for confusion. It is better to have such common practice and policy clearly identified for all to see. Instead of just "understood" (which is then so often not understood!). So, if it reflects common practice and there is a consensus, WHY NOT label it as policy ?--Richardb 02:20, 14 June 2007 (UTC)
- Clutter. As I suggested (and you rudely ignored,) this seems like it could be a rational piece of VOTE in one of the hidden-heading sections. Ruakh, nicely done, by the way. --Connel MacKenzie 02:48, 14 June 2007 (UTC)
- Please re-read the entry above at 00:24, 13 June 2007 (UTC), where I replied to your comment. - Reason for having it as a separate page is for simple clarity, and ease of referring to it from other places. --Richardb 13:01, 18 June 2007 (UTC)
Size comparison relative to the 1st edition of the OED
I read on this week's Wikipedia Signpost that the en.wiktionary now has 400,000 entries. As far as the wikipedia page goes, the first edition of the Oxford English dictionary comprised 414,825 words and that took many decades to produce.
Is there some sort of press release or celebration planned for this milestone? Considering that the OED first edition is the high-water mark in lexicography, the fact that the wiktionary community has achieved an equivalent size in such a short time (though doubtless it is not yet to the same consistent quality) is some achievement.
faithfully, Witty lama (from the en.wp.)
- Just check first if we have 400,000 English words. We have rather 400,000 words across all languages - I think!--Richardb 15:25, 12 June 2007 (UTC)
- We have 134,542 English entries (including entries for plurals and verb forms) and 400,000 entries across all languages. This number includes inflected forms of Spanish and Italian verbs. We are in no danger of outdistancing the OED yet. --EncycloPetey 15:46, 12 June 2007 (UTC)
- Stats page shows English Real Entries= 89,816, English Real Definitions = 168,917 --Richardb 03:18, 13 June 2007 (UTC)
- But for some reason "real entries" do not include anything labelled australia, slang, archaic, rare, obsolete, dialect and a host of others...?! Widsith 08:29, 13 June 2007 (UTC)
- Australia?! Why are Australian words excluded?! — Beobach972 16:02, 13 June 2007 (UTC)
- Perhaps we need to distinguish between real words that are simply dialectal and not universal and entries that are universal? — Beobach972 16:02, 13 June 2007 (UTC)
- Ah. Drat. I was all excited there for a minute. Oh well, we'll get there eventually. Wittylama (en.wp.)
- I see 134,919 listed there under the heading "Total language sections", as the number of "==English==" headings present on en.wiktionary.org. Because the OED includes slang and forms in their various counts, that would be the column to use for comparison. Beobach972, I agree that the heading "Universal English entries" would be a better heading for that first column. (Australian terms are excluded for the same reason US terms are...some dialects don't recognize them as "words.") --Connel MacKenzie 18:34, 13 June 2007 (UTC)
- More accurately, perhaps "General English entries"? --Connel MacKenzie 18:35, 13 June 2007 (UTC)
- Ah, that's an even better title!
- To my thinking, we ought to have three counts (at least) of English entries by the time we're finished: (a) general English entries (with a brief footnote explaining what that term means — English terms that are not restricted to a few dialects; and not including inflexions of verbs, plurals, etc), (b) English entries (including dialects, but not including inflexions of verbs, etc), and (c) all English entries. What do you think? Is that feasible, technically? — Beobach972 19:02, 13 June 2007 (UTC)
- Well, counting entries instead of definitions gets tricky, but I'll see what I can do. --Connel MacKenzie 19:14, 13 June 2007 (UTC)
- But only if you help me with a good title for that second column. (You have the 1st and 3rd already, as described above.) "English base form entries"? (Would I need to exclude adjectives and adverbs from that count?) --Connel MacKenzie 19:23, 13 June 2007 (UTC)
- Ah, yes, I apologise; we ought to be counting the definitions and not the pages, I suppose. No, I don't think adverbs or adjectives should be excluded — but do you think that the comparative and superlative forms of adjectives ought to be included or excluded? — Beobach972 20:03, 13 June 2007 (UTC)
- Oh look, we have Inflected Forms listed (I'm just blind — Beobach972 16:36, 14 June 2007 (UTC)), I suppose we could just do the arithmetic. So, let's see, we have :
- Real Entries - the number of pages that have content that is not tagged as obsolete, etc
- This is a count of entries
- Real Definitions - the number of definitions not so tagged (general English, in other words)
- This is a count of "#" definition lines that are not form of, nor slang nor obsolete
- Inflected Forms - the number of definitions that are plurals, etc
- This is a count of "#" definition lines
- Slang - slang definitions (I assume — or do two slang senses on one page count as one?)
- This is a count of "#" definition lines
- Obsolete - as with slang
- This is a count of "#" definition lines
- Total Language Sections - does this tell us how many pages there are in a given language?
- This is a count of entries that seem to have at least one "#" definition line somewhere in the language section (e.g. ==English==)
- Total Definitions - this tells us how many definitions a language has (all English entries)
- This is a grand-total count of "#" definition lines
- So, yes, if we could count how many English definitions there are (that's Total definitions) but then exclude Inflected forms and (I suppose) Obsolete terms (but include slang and Australian English, etc), we could arrive at a count of Real entries (incl dialectal) / English (incl dialectal)... which brings up an interesting thought: will this column be added for all languages (which would probably be a waste of time, unless it's ridiculously easy for you to compile counts) or just for English? Anyway, back to naming it... other than English (incl dialectal), hmm... English lemmata? (Footnotes will help us explain the precise criteria for each column.) — Beobach972 20:03, 13 June 2007 (UTC)
- Well, not quite. The "Inflected forms" is a count of definitions not entries. I do the same checks for all languages (since they are supposed to all be using "form of" or similar, anyhow.) Spanish has an extra check for "esbot" but otherwise, yes, those columns are summed for each language. I don't "like" lemmata as that tends to be meaningless for English, and ambiguous for all other languages (whether or not there is a footnote.) Hmmm...but that is shorter... --Connel MacKenzie 23:29, 13 June 2007 (UTC)
- Well, English base forms (or Base forms, for other languages) would work fine. — Beobach972 16:36, 14 June 2007 (UTC)
- What is the difference between Inflected forms being a definition count and not an entry count? If Total definitions is a definition count, couldn't we subtract Inflected forms (and Obsolete?) from it to arrive at the number of non-inflected (and non-obsolete?), but including dialectal, definitions? (I apologise if this is lingering confusion over my mistaken reference above to entries and not definitions.) — Beobach972 16:36, 14 June 2007 (UTC)
- Well, a definition "counting" as an inflected form does not prevent it from "counting" as slang or obsolete. So simply subtracting would be incorrect. I've replaced "Real" with "Base form" for those columns, and indicated on each, which they are (entries vs. definitions). Since my routine auto-generates the footnotes at the bottom, please let me know (here or on my talk page) what should/could be reworded there.
- Before I completely lose myself in code, I'd like to confirm: You are asking me to ADD one column, that displays the number of entries that are (base form OR slang) right? Or should I add that "intersection count" in parenthesis for form of, slang and obsolete? Or do you also need intersection(base + formof + slang) & intersection(base + formof + obsolete) & intersection(base + slang + obsolete) (for a total of six new columns,) as well?
- --Connel MacKenzie 00:43, 15 June 2007 (UTC)
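(Editorial illustration, not part of the original thread.) The point that simple subtraction goes wrong when one definition line can be counted under several columns can be shown with a small sketch. The sample data and tag names below are hypothetical; this is not the routine that generates the statistics page.

```python
# Hypothetical definition lines, each tagged with the columns it is counted under.
definitions = [
    {"id": 1, "tags": set()},                     # plain base-form definition
    {"id": 2, "tags": {"form_of"}},
    {"id": 3, "tags": {"slang"}},
    {"id": 4, "tags": {"form_of", "slang"}},      # counted in two columns
    {"id": 5, "tags": {"obsolete"}},
    {"id": 6, "tags": {"form_of", "obsolete"}},   # counted in two columns
]

total = len(definitions)
inflected = sum("form_of" in d["tags"] for d in definitions)
obsolete = sum("obsolete" in d["tags"] for d in definitions)

# Naive arithmetic on the column totals: definition 6 is subtracted twice.
naive = total - inflected - obsolete

# Direct count: test each definition line once against both exclusions.
direct = sum(not (d["tags"] & {"form_of", "obsolete"}) for d in definitions)

print(naive, direct)
```

With this sample the naive figure comes out one lower than the direct count, because the line that is both an inflected form and obsolete is removed twice. Counting per line, or publishing the intersection counts as well, avoids that.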
- I'm asking you to add one column, which would contain all of the definition-lines currently counted in the Real Definitions column, plus (ie, without subtracting) all of the definition-lines marked as slang, dialect, vulgar, pejorative, offensive, derogatory, colloquial, colloq, informal, dialectal, dielect, (possibly sex, too- what do you think- are sex words real? I've never encountered the template), entertainment, usslang, ukslang, cockney, cockneyrs, rs (I'm just going through the list on WT:STATS - but does this template even exist?), aave, london, england, au, australia, canada, india, jamaica, patois (existent?). This will give us an idea of how many words we have that are real English words, whether 'universal' or Australian (etc). Other than including the templates I just listed in bold, you should maintain whatever restrictions you have placed on the Real Entries section (so if you don't include form-of there, don't include it here, etc). — Beobach972 04:47, 19 June 2007 (UTC)
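(Editorial illustration, not part of the original thread.) The column definitions discussed in this section could be sketched roughly as follows. Everything here is an assumption made for illustration: the label sets are abbreviated stand-ins for the lists on WT:STATS, the function names are invented, and the real statistics routine may work quite differently.

```python
import re

# Assumed, abbreviated label sets; the real lists on WT:STATS may differ.
EXCLUDED_FROM_BASE = {"slang", "obsolete", "archaic", "rare", "dialect", "australia"}
DIALECT_OR_REGISTER = {"slang", "dialect", "australia", "colloquial", "informal"}
FORM_OF = {"plural of", "form of"}

TEMPLATE_RE = re.compile(r"\{\{([^|}]+)")  # captures the name of each template


def english_definition_lines(wikitext):
    """Yield the '#' definition lines inside the ==English== section."""
    in_english = False
    for line in wikitext.splitlines():
        heading = re.fullmatch(r"==([^=]+)==", line.strip())
        if heading:
            in_english = heading.group(1).strip() == "English"
            continue
        # '#:' and '#*' lines are examples or quotations, not definitions.
        if in_english and re.match(r"#(?![:*#])", line):
            yield line


def count_columns(wikitext):
    counts = {"base": 0, "base_plus_dialect": 0, "inflected": 0, "total": 0}
    for line in english_definition_lines(wikitext):
        names = {n.strip().lower() for n in TEMPLATE_RE.findall(line)}
        counts["total"] += 1
        is_form = bool(names & FORM_OF)
        if is_form:
            counts["inflected"] += 1
        if not is_form and not (names & EXCLUDED_FROM_BASE):
            counts["base"] += 1
        # Proposed extra column: base forms plus slang/dialect, still
        # leaving out obsolete, archaic and rare definitions.
        if not is_form and not (names & (EXCLUDED_FROM_BASE - DIALECT_OR_REGISTER)):
            counts["base_plus_dialect"] += 1
    return counts


SAMPLE = """==English==
===Noun===
# A domestic feline.
# {{slang}} A cool person.
#: ''He's a real cat.''
# {{obsolete}} A kind of ship.
==French==
===Noun===
# {{plural of|chat}}
"""

print(count_columns(SAMPLE))
```

Each definition line is classified once, so the extra column is computed directly rather than by adding or subtracting the existing column totals.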
Can we see which words are most looked for by our readers
I believe I recall that at one time we had a tool where we could look up the most looked for words (GO or SEARCH). But I can't find it now. Would have expected it in special pages. Anyone know if we have such a tool, and where it might be ?--Richardb 02:40, 13 June 2007 (UTC)
- I think Wiktionary used to have Special:Popularpages, but it appears to have been removed. Rod (A. Smith) 03:17, 13 June 2007 (UTC)
- This page? (Perhaps someone is interested in creating a work-safe sub-list from that list :D ) \Mike 18:50, 14 June 2007 (UTC)
- Surely, you mean this, right? Or are you trying to emphasize the influence Wikisaurus has had? --Connel MacKenzie 01:05, 15 June 2007 (UTC)
- What? No, I didn't spend even half a thought on the issue of certain namespaces containing objectionable material/being inherently objectionable. \Mike 12:16, 17 June 2007 (UTC)
- Not quite. I was trying to find which words people typed in and go/searched for most, not just which ones were viewed most. But yes, I was trying to find out what the pattern of usage was. What percentage of our traffic is people looking up "dirty" words. I just remembered there was such a page in the past. Strange it should have been removed. Since another use was for us to audit the words most searched for (and those most viewed) for accuracy and completeness. Would also be good to have a longer time period. And also to find those words which were searched for but not found. But, no great reasons to invent such a tool, just asking if someone knew where the tool was, had got to.--Richardb 12:53, 18 June 2007 (UTC)
- The original statistics pages were turned off for performance reasons on all WMF sites in 2004, IIRC. --Connel MacKenzie 22:29, 28 June 2007 (UTC)
Hiragana Indexing
I don't understand the point of the "hidx=" parameter on Japanese parts of speech. The only thing it does is place a word beginning with a voiced kana under its unvoiced form of the same kana, instead of under its own heading of the voiced kana (i.e. with the "hidx=" parameter used, が ga goes under か ka when both が and か headers are provided). Do we want it this way? The only good thing I can see about it is that it eliminates about a third of the headers on pages such as Category:Japanese adverbs, but it seems to be a little less practical. Why have automatically indexed headers if they're not going to be used?--Hikui87 20:30, 13 June 2007 (UTC)
- That's done so our categories match the standard Japanese 五十音 (gojūon) collation. In that 五十音 collation, 濁点 (dakuten), 促音 (sokuon), and 拗音 (yōon) lack indices of their own. Rod (A. Smith) 21:12, 13 June 2007 (UTC)
Does this apply to katakana as well? Should "ア" have its own header, or should all words starting with "ア" fall under the "あ" header? —This unsigned comment was added by Hikui87 (talk • contribs) 2007-06-27T17:32:06.
- 五十音 (gojūon) collation is phonemic, so words starting with "ア" (katakana a) and kanji words whose first sound is a should be indexed under "あ" (hiragana a). Rod (A. Smith) 17:44, 27 June 2007 (UTC)
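(Editorial illustration, not part of the original thread.) The folding described above, where voiced kana are filed under their unvoiced counterparts and katakana under hiragana, can be sketched as below. The function name `gojuon_index` is hypothetical, and this is not how the "hidx=" parameter is actually implemented; it only shows the collation idea, and it assumes the word's kana reading is supplied (kanji cannot be indexed without one).

```python
import unicodedata

# Small kana (used for sokuon and yōon) folded to their full-size forms.
SMALL_TO_FULL = str.maketrans("ぁぃぅぇぉっゃゅょゎ", "あいうえおつやゆよわ")


def gojuon_index(kana_reading):
    """Return the hiragana heading under which a kana reading would be filed."""
    first = kana_reading[0]
    # Katakana U+30A1..U+30F6 sit a fixed 0x60 above the matching hiragana.
    if "\u30a1" <= first <= "\u30f6":
        first = chr(ord(first) - 0x60)
    # NFD splits a voiced kana into base kana + combining (han)dakuten;
    # keeping only the first code point drops the voicing mark.
    base = unicodedata.normalize("NFD", first)[0]
    return base.translate(SMALL_TO_FULL)


for reading in ["がっこう", "アメリカ", "パン", "かさ"]:
    print(reading, "->", gojuon_index(reading))
```

Relying on Unicode canonical decomposition keeps the sketch short, at the cost of treating every kana with a (han)dakuten uniformly; a production index would more likely use an explicit table.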
ELE Alternative forms header
I'm breaking down the issues in the previous vote, as the majority of responses request; DAVilla has also created a useful template to keep people from voting before anyone gets a chance to look at something ... (thank you!)
See Wiktionary:Votes/pl-2007-06/ELE Alternative forms header
There is significant usage of Alternative forms as a semantic equivalent of Alternative spellings, in cases where "spellings" just doesn't make much sense. (For example, an entry for a Symbol, a single Han character, etc.)
Present usage: as of 25 May 2007
Header | Level 3 | Level 4 | Level 5 |
Alternate spellings | 7 | 1 | 0 |
Alternative forms | 742 | 79 | 1 |
Alternative spellings | 6771 | 868 | 15 |
Forms and variants | 5 | 2 | 0 |
"Alternate spellings" has been routinely corrected to "Alternative spellings", and "Forms and variants" to "Alternative forms".
This does not affect the placement of this header (usually first at L3, but sometimes nested under ety, and at L4 to apply to one POS, or just misplaced). Not in the scope of this vote! This is only about accepting Alternative forms as an alternative form of Alternative spellings (no, I couldn't resist saying that ;-) Robert Ullmann 13:42, 14 June 2007 (UTC)
- I support the existence of an "Alternative forms" header, but don't support describing it as "a semantic equivalent" of the already-standard "Alternative spellings"; after all, if they have the exact same meaning, then why allow both? In particular, I think any vote on this topic should provide guidelines for when each is appropriate. Further, in cases where a word has some alternative spellings, and some alternative forms that can't really be described as alternative spellings, would we have two separate sections, or would the section have to be called "Alternative forms"? —RuakhTALK 16:00, 14 June 2007 (UTC)
- Yes, some guidelines would be helpful. I've used "alternative forms" for cases where the word changes form based on context, but I haven't been sure that my application was in the spirit of the heading. The classic English example is "a"/"an". Is "an" an alternative form of "a"? Rod (A. Smith) 16:29, 14 June 2007 (UTC)
- I agree with Ruakh, I support allowing the header (to add to your examples, another good use is Mull of Kintyre test vs Mull of Kintyre rule, where test and rule are not alternative spellings), but it isn't quite equivalent to Alternative spellings, hence the need for it. — Beobach972 16:44, 14 June 2007 (UTC)
- Interesting issue. I've come across this when writing some Japanese entries. To me the clearest rule for "alternative spellings" would be to use it, and only it, if you're talking about different ways to write a word that have the same pronunciations and the same meanings (except for nuances that are lost when read aloud).
- When to use "alternative forms" is less clear. s is one example of its use with symbols. Han character entries, like 学 and 學, tend to use "see also" links -- should these use "alternative forms" instead? Do we really want to say that š is an alternative form of s, as they aren't really interchangeable? Should "derived terms" etc. be used for symbols, too? For non-symbols, won't "synonyms" do instead of "alternative forms"? -- Coffee2theorems 18:57, 14 June 2007 (UTC)
- I do not want "alternative spellings" restricted to identical pronunciations. I think that would be an unfortunate over-extension of what it currently encompasses (spellings.)
- "Alternative forms" and "alternative spellings" should never (IMHO) be used instead of {{see}} links, but rather, always in addition to them. I agree that for symbols, "derived terms" or "synonyms" make more sense.
- --Connel MacKenzie 20:07, 14 June 2007 (UTC)
- Note that the discussion further down on this page goes into quite a bit more detail on this topic. --Connel MacKenzie 01:07, 15 June 2007 (UTC)
(start of next section moved down, so the discussion on Alternative spellings and Alternative forms continues here ...)
- Is the topic here "where to place 'alternative spellings'" or is it to revamp the whole sequence? AFAIK, the "alternative spellings" heading belongs only as a navigation aid at the top of a language section, therefore always at L3. (Either it is an alternative spelling, or it is not; if it is an alternative spelling, then it will be used for all senses, whether prescribed against or not. Such prescriptions should be expounded upon in usage notes, not confounding site-navigation for newcomers.) --Connel MacKenzie 20:12, 14 June 2007 (UTC)
- Regarding "if it is an alternative spelling, then it will be used for all senses", that's true for all senses of a given "word" (i.e. a sequence of characters with a single etymology), but there can be multiple "words" in a given language entry (i.e. multiple terms each with its own etymology), and an alternative spelling might apply to only one of those "words". E.g. the English word "spelt" has an alternative spelling of "spelled" only in the verb form sense, not in the noun sense. Rod (A. Smith) 20:28, 14 June 2007 (UTC)
- Clearly, that isn't an alternative spelling though. --Connel MacKenzie 20:37, 14 June 2007 (UTC)
- ("Clearly"?) OK, then, how about "a.m." as an alternative spelling of only one sense of "am"? Rod (A. Smith) 20:52, 14 June 2007 (UTC)
- As complicated as it may make things, I would very much like to have "Alternative spellings" as a nestable header, as there are a great many words in A. Greek in which a particular alt. spelling applies only to a specific POS. Atelaes 21:14, 14 June 2007 (UTC)
- Or draught and draft; I believe even the British draft proposals. Or, one I recently learned, ghey and gay, where the spelling ghey was created to disambiguate. Or any of approximately fifty-seven gazillion Hebrew words (rounding to the nearest gazillion), where Biblical Hebrew would spell two words the same way (due to its relatively limited use of matres lectionis) and Modern Hebrew usually distinguishes them by inserting a superfluous י or ו. —RuakhTALK 21:19, 14 June 2007 (UTC)
- Alright, I thought of one : paw has an alternative spelling of pa only in the sense of father, not an animal's hand. — Beobach972 20:28, 16 June 2007 (UTC)
- I had no idea people could construe "Clearly" as in some way pejorative. It was meant as a "fluff-word" kind of mild intensifier, nothing more. If my mannerism caused offense, I apologize.
- All the examples being given in the last three or four responses here are precisely "alternative forms" not "alternative spellings." These are all excellent examples of why the new heading should not have the same restrictions as "alternative spellings." But then, if I felt like nitpicking, I'd press the point that each of the "alternative forms" examples above could possibly find a better heading such as "related terms" or "synonyms", etc. But I won't. "Alternative forms" (at any heading level) seems like a workable solution. --Connel MacKenzie 21:43, 14 June 2007 (UTC)
- I could make the case that a.m. and am are alternative spellings, and I could also make the case that they are alternative forms, so I agree with you (Connel MacKenzie) on that one. In contrast, Ruakh's example of draft is excellent — draught is not, as far as I can tell, a form (like spelt) that is different, but a pure spelling difference. — Beobach972 04:10, 15 June 2007 (UTC)
I think that I would benefit greatly from someone explaining what the distinction is between "Alternative spellings" & "Alternative forms". Atelaes 21:59, 14 June 2007 (UTC)
- Someone mentioned using "identical pronunciation" as being definitive for "alternative spellings." I don't think that exactly identical pronunciation is correct, but I do think approximately the same pronunciation, with the same meaning is. While fish may be synonymous with ghoti, I don't think it would be fair to call it a valid alternative spelling (both because ghoti can also be pronounced as goatee and because it isn't an accepted word in general use.)
- For "alternative forms" I consider anything with identical meaning, transformed for inflection/ conjugation/ mutation/ pronoun substitution reasons, to be an alternative form. I would not use the "alternative forms" heading, if the other form is already specified on the inflection line; instead I think in practice, it is reserved only for one-off cases. For example, feel his oats is a good example of an alternative form, of feel one's oats.
- Oh, I see someone took some (good) initiative with Wiktionary:Alternative spellings. Does that help? --Connel MacKenzie 23:07, 14 June 2007 (UTC)
- Alternative spellings are like colour vs color : both have the same origin, both have the same pronunciation (the UK pronunciation of a word — in this case, colour — might differ slightly from the US pronunciation of color, but the British would pronounce colour and color identically, and the same for the Americans), both inflect in the same way (plural : colo(u)r + -s; past tense : colo(u)r + -ed), and both have identical definitions (in this case, exactly identical, but in some cases only nearly identical, as different spellings may carry different connotations — like theatre in the US). — Beobach972 04:10, 15 June 2007 (UTC)
- As an American, I would never in a million years pronounce "colour" the same as "color." --Connel MacKenzie 00:58, 16 June 2007 (UTC)
- Hmm? Could you add the IPA (or SAMPA or enPR) to the articles to show how you would distinguish them, please? The only difference I've noted is that Britons pronounce both as /ˈkʌlə/ (without the 'r') whereas Americans pronounce both as /ˈkʌlɚ/ (with the 'r'). — Beobach972 02:19, 17 June 2007 (UTC)
- Wouldn't that be /ˈkəlɚ/? I don't think we Americans say /ʌ/. But this is way off-topic. 203.154.48.179 19:25, 28 June 2007 (UTC)
- Alternative forms are like proven vs proved : often (but not always), some part of each term (particularly if it is a multi-word term) will have the same origin or related origins, but the pronunciation will often be different, and the difference is more than spelling (in this case it reflects different methods of forming the past participle: one, like he was clean-shaven (ie, verb + en), the other, like he was inspired (ie, verb + ed)). The meaning will nevertheless be identical or very similar. — Beobach972 04:28, 15 June 2007 (UTC)
- Further examples :
- Alternative spellings are like draft vs draught (correct me if I am wrong, but my understanding is that there were no differing grammatical approaches to each one's formation, only that some writers interpreted Old English dræht as draught, others simplified the spelling to draft),
- Alternative forms are like worked vs wrought (identical meaning, but very distinct methods lead to each one's formation — wrought is from Middle English wroght, from Old English geworht, a 'strong' conjugation of sorts, whereas worked is work + ed). — Beobach972 04:39, 15 June 2007 (UTC)
- So how does one decide whether a term is an alternative form, synonym, related term or a derived term? Why is an overlapping header needed? This is currently clear as mud to me. -- Coffee2theorems 05:46, 15 June 2007 (UTC)
- I thought I understood this also, Coffee2theorems, but now it seems quite apparent that I am not clear on it. To me, intuitively, draught and draft can't possibly be related, let alone alternate spellings or synonymous (are they really?) Also, while color/colour seems to be an excellent example of an alternate spelling, I don't see how one can say that they are identical in meaning; they aren't (and their definitions here should diverge, not be artificially merged by well-intentioned contributors.) Please note that I did not bring up color/colour...in fact, I avoid that example in unrelated discussions like this (as it is so volatile.) Likewise, I don't see worked and wrought as being more than "similar." One is a past participle, the other an adjective. Yet I can readily accept that they are alternative forms. --Connel MacKenzie 00:55, 16 June 2007 (UTC)
- Oh, let me make two more notes, for clarity : my explanation was intended to supplement Connel MacKenzie's, not be separate (I want to explicitly specify this because I didn't make any mention of items such as feel his oats = feel one's oats). — Beobach972 20:28, 16 June 2007 (UTC)
- Well, draught and draft are (in some senses — not all — as Ruakh pointed out) synonymous, they are alternative spellings. As for colour and color : I think that the differences (the different senses that one or the other has acquired) are not related to the spelling — rather, the word acquired a meaning of 'In corporate finance, details on sales, profit margins, or other financial figures, especially while reviewing quarterly results when an officer of a company is speaking to investment analysts.' in American English which it did not acquire in British English. Now, because in American English the word is spelt color, whereas in British English it is spelt colour, the corporate finance sense is not given on the colour page, but it isn't a sense of the spelling 'color', it's an American English sense of the word that is represented in the IPA as /ˈkʌl(ə\ɚ)/. I don't intend to spark any fights over the spelling of colo(u)r, by the way, it just made a good example. Also, I suppose it should be noted that I am all along disregarding the archaicness/rarity of any sense : so, for wrought vs worked, wrought has since restricted itself to being primarily an adjective, but can still be a verb (what hath God wrought?). Similarly, one could theoretically say that is a fence of worked iron. — Beobach972 20:28, 16 June 2007 (UTC)
- I checked the AHD just to be sure, and draft is indeed a descendant of the Middle English draught (along with, obviously, draught), re-spelt from that draught. — Beobach972 02:30, 17 June 2007 (UTC)
- Coffee2theorems: A derived term is a term derived from the word; for example, flashy is derived from flash. A synonym is a word (often etymologically unrelated) for the same entity : electric torch = flashlight. — Beobach972 20:28, 16 June 2007 (UTC)
- Take a look at flash and paw, where I have tried to place all of the alternative spellings, -onyms, derived terms, etc. in the correct places (so you can see the distinctions) (and please make corrections if I have made mistakes!). — Beobach972 21:20, 16 June 2007 (UTC)
Sigh... the overwhelming response to the previous vote proposal was that the issues needed to be separated. Clearly people aren't capable of that? The previous section (and proposed vote) is about alt-spell v alt-form, this section and vote is about the order of headers if they are at L4 (and not whether they should be). How do we make progress on separate issues and votes if you won't keep them separate?
If you can't, we either have to handle each issue serially, which will take a year (10-odd issues times a month+ for discussion and vote on each), or I will have to go back to try to find some overall composition that is acceptable, and you'll have to suppress your complaints that there are too many things in one vote? (eh? ;-)
Maybe I (or someone) should just re-factor the discussion in this section that belongs only in the previous section to put it there? Dunno. Robert Ullmann 15:19, 15 June 2007 (UTC)
- Oh come now. This is a productive discussion. Your choice of how to split up the discussions didn't suffice in this case; how can you accuse people of foul play when the prerequisite issues are contested?
- I do think serial discussion of each heading order change is reasonable, but I don't think a full month is needed between each one starting. You just happened to pick a very thorny one to start with. (Wait a sec...they are all thorny. Hmmm.) I think with one week per heading, with perhaps a few similar ones combined, the goal of a standard format can be reached. But rushing through any step isn't going to help it to pass a vote. For each header, if you stagger them one week apart, there conceivably could be one week of discussion to refine each proposed vote's verbiage.
- I'm not saying it isn't a productive discussion, it is as you say. And I am certainly not "accus[ing] people of foul play". Just a bit exasperated at the difficulty of keeping issues separated. I moved the order discussion down to separate it. Robert Ullmann 14:19, 16 June 2007 (UTC)
I've recalled what bothers me most about all of this. The "alternative spellings" heading predates the template {{see}}. For that reason, (as Ec argued) an "alternative spellings" heading was supposed to be used for a navigation aid; that is, if someone types in a word and arrives at an "obsolete" or "archaic" term that they obviously did not intend, there would be a link near the top of the page, so they could figure out their error quickly. Navigation being the keyword. While "paw" might be an alternative spelling of only one sense of "pa", it is a disservice to readers to make them hunt all around to find out the spelling (entry) they were looking for in the first place.
But, with the advent of {{see}}, I see the "alternative spellings" heading as entirely useless. I don't know that I've seen any really compelling arguments for having "alternative forms" either. Inflections, derived terms, related terms, synonyms, antonyms etc., should cover all things that are linguistically related. Meanwhile, {{see}} should be the only thing we use for a navigation aid.
I think it would be much more consistent and comprehensible, if "alternative spellings" and "alternative forms" were deprecated. (The heading "See also" as well.)
--Connel MacKenzie 21:45, 16 June 2007 (UTC)
- The "alternative" sections seem very useful. "related terms" seems too vague for words that have the identical semantics but differ in orthography or where context somehow dictates the word choice (e.g. "a"/"an", "buses"/"busses", "through"/"thru", etc.). Rod (A. Smith) 22:32, 16 June 2007 (UTC)
- I think this discussion would benefit greatly if we took a look at it the other way round. That is, instead of discussing what a feature was intended to do, let's look at what it is currently doing, with examples for reference. Often new needs and functions are discovered once a new approach is added, and often these needs and functions were not originally anticipated. Once we know what functions are needed, we may better determine which functions might be merged and which ought to be separated.
- What is {{see}} used for? Well, my experience suggests that the majority of uses for this template are for spellings which use the same basic letters in the same sequence, but with variation in diacritics and capitalization. One such function is exemplified by the entry for comanche (a Spanish word), which links to Comanche (an English word), so that a person who did not know about case-sensitivity would be alerted to the fact that there are separate entries. A second function is that the entry for canon links to cañón (and vice versa). Not all diacriticals can be typed in for a search, so this kind of cross-linking is very, very useful.
- The {{see}} listing has also included double-letter variants such as taal and tall on tal, although this hasn't gotten much feedback. All of these so far are language independent, listed at the very top of the page. To make things more confusing, {{see}} is also used immediately under the language header for language-dependent see-alsos, such as wheels and the wheel on wheel, in cases where standard dictionaries would simply list the simple senses at the root, where users would be most likely to look it up. DAVilla 21:54, 28 June 2007 (UTC)
- What is Alternative spellings used for? The primary use I have seen is for variant spellings of the same word in the same language. For example, color links to colour. Often, these forms are labelled according to current use (dated, archaic, etc) and regionality (UK, US, Commonwealth, etc).
- What is See also used for? This is a big grab bag, but most often I have seen it successfully used to link to words whose spelling and meaning differ from the current entry, but which are useful for interpreting the current entry, such as other symbols in a set (see ☿). The best use I have seen is for linking to an appropriate Appendix page, such as the appendix to the Countries of the World or to a usage appendix, such as Appendix:Brief amounts of time.
- The various functions listed above do not overlap, so trying to merge them all into a single {{see}} template function is probably not a good idea. In particular, merging the Alternative spellings into the {{see}} template would lead to a list at the head of each page composed partly of similarly spelled words in other languages and partly of variant spellings of the same word. This makes such a list useless. A user would not know without following and evaluating each link what the link is for. It also removes our ability to label certain spellings as "archaic", "obsolete", "UK", etc in a single convenient location. --EncycloPetey 01:02, 17 June 2007 (UTC)
- OK, sorry for the going off on that tangent; your points are well-taken, and obvious. --Connel MacKenzie 17:33, 19 June 2007 (UTC)
- Yes, I was under the impression that 'See also' sections were for dissimilar (visibly dissimilar) terms, eg male could link to gender (just to give an example), (and other stuff — it's not restricted to dissimilar terms), whereas the 'see' template was for very similar (visibly-similar, I mean) terms only (canon to cañon, as said, because the two look similar). — Beobach972 02:19, 17 June 2007 (UTC)
- Wow, I think you've hit the nail on the head. "Looks similar" seems to be the most important aspect of an "alternative spellings" item. Can we describe that a little more explicitly? Spaces, hyphens, "ou" <--> "o", "ll" <--> "l", "ise" <--> "ize", diacritics (including pointing?) Did I miss anything? I would think that "af" <--> "ough" and "gh" <--> "g" would not be expected as "looks similar", but that is certainly debatable. --Connel MacKenzie 17:33, 19 June 2007 (UTC)
- Eh, I think similarity is only a ‘coincidental correlation’ for alternative spellings. It is the correlation for the {{see}} template, but alternative spellings must have some similarity of meaning (and be in the same language), and they aren't required to bear the utmost visible similarity (gue vs g, for example, or que vs ck, w vs ugh, k vs que). — Beobach972 19:15, 20 June 2007 (UTC)
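For illustration only, the "looks similar" relation discussed above for {{see}} (same letters, differing in capitalization or diacritics) can be sketched as a comparison key. This is a minimal sketch, not the logic of any actual Wiktionary template or bot; the function name `see_key` is hypothetical, and the rule set (case folding plus removal of combining marks) is an assumption that deliberately leaves out the double-letter and "ou"/"o" cases mentioned in the thread.

```python
import unicodedata


def see_key(title):
    """Return a comparison key that ignores case and diacritics.

    Hypothetical helper: two titles with the same key are candidates
    for cross-linking at the top of a page.
    """
    # NFD splits precomposed letters into base letter + combining marks.
    decomposed = unicodedata.normalize("NFD", title)
    # Drop the combining marks (accents, tildes and so on).
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    # casefold() is a more aggressive lower() meant for caseless matching.
    return stripped.casefold()
```

Under these assumptions comanche and Comanche share a key, as do canon and cañón, while tal and taal do not; spelling variants such as colour and color are also left distinct, which matches the point that alternative spellings are not defined by visual similarity alone.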
I don't have any strong objections, but given the confusion, I'm not convinced we need "Alternative forms" for English entries. An example that is clearly not an alternative spelling but which would need to be placed at the same location would be quite illustrative. I remember that being the case for an idiom recently, having something to do with pants. Even those cases will be subject to opinion though, as there are always Synonyms and weaker locations for such equivalents.
It would, of course, be necessary to use "Alternative forms" in languages where "spelling" doesn't make sense. DAVilla 22:12, 28 June 2007 (UTC)
- English examples where I think "Alternative spellings" isn't perfectly accurate, but where it would be nice to have a stronger header than just "Synonyms", include:
- no ifs, ands, or buts ↔ no ifs, ands or buts
- all bark, no bite ↔ all bark and no bite
- blue collar ↔ blue-collar (N.B.: some people use one or the other consistently, but many people use the hyphenated variant attributively and the spaced variant otherwise)
- cooperate ↔ co-operate
- Dumpster ↔ dumpster
- More generally put, I guess I don't think that punctuation, spacing, and capitalization variants really count as alternative spellings, but to call them "synonyms" seems ridiculously weak, seeing as they're just slightly different forms of the same word. —RuakhTALK 23:18, 28 June 2007 (UTC)
- I agree with including these in some way, and that "Alternative spellings" may not be strictly accurate in these situations. However, I think we can describe the function of the header in terms just broad enough to allow for single-character changes, even if that character is not a letter or character in the language. Likewise, we bend our own rules a little on a regular basis when it comes to phrases and idioms like all bark, no bite. I don't think a little more bending would hurt in this case. I guess the question we have is whether to allow "Alternative forms" in addition to "Alternative spellings", or to make a wholesale switch. The problem with replacing the current header is that "forms" is not as precise when it comes to English (the language of this project) and it could lead to all sorts of things appearing in that section that really shouldn't be there. Would it not just be better to simply use an in-house definition of "spelling" that isn't restricted to letters, and to have an explanation of this fact in the ELE and on the page describing the policy for the section? --EncycloPetey 23:40, 28 June 2007 (UTC)
ELE level 4 header sequence
(this section moved down from above, to discuss the order separately ... )
Issue is specifying the preferred order of L4 headers. This is not about whether they should be L4, but what the sequence is when they are.
Present usage, in alphabetical order: as of 29 June 2007
Header | Level 3 | Level 4 | Level 5 | Includes |
Alternative forms | 1005 | 119 | 1 | Alternate forms (160), Alternative terms (41), Alternative Form (1), Alternative Forms (1) |
Alternative spellings | 7347 | 889 | 16 | Alternative spelling (11), Alternative spelings (1), alternative spellings (1), Alternative sppellings (3), Alternative spelligns (1), Alternate spellings (13), Alternative spellingss (1), Altnernative spellings (1), Alternative Spellings (28) |
Antonyms | 799 | 3789 | 92 | Sinonyms (1), Antonynms (1), Antononyms (1), antonym (1), Antonymes (2), Antonym (395) |
Conjugation | 804 | 4871 | 12 | Conjugataion (2), conjugation (1), Conjugations (1626), Conjugation 1 (1), Conjugation 2 (1), Conjugaton (1), Conugation (1), Conjugtaion (1) |
Coordinate terms | 2 | 10 | 6 | |
Declension | 1047 | 4753 | 38 | Declensions (6) |
Derived terms | 5196 | 12106 | 491 | Derivd terms (1), Derivated terms (17), Derived termss (1), Dervied terms (2), Derivered terms (1), Derivate term (4), Derrived terms (1), Deived terms (1), Derived Terms (59), Derive terms (1), Deriverd terms (1), derived terms (5), Derived term (33), Derived trems (1), Derived forms (20), Derived verbs (1), Derived tords (1) |
Descendants | 395 | 436 | 27 | |
External links | 2822 | 219 | 3 | External lonks (1), External Links (11), External Link (4), External link (16) |
Holonyms | 1 | 5 | 4 | |
Hypernyms | 1 | 45 | 9 | |
Hyponyms | 0 | 38 | 6 | |
Inflection | 253 | 1605 | 17 | Inflections (55) |
Meronyms | 0 | 4 | 4 | |
Quotations | 1031 | 1607 | 45 | Quotation (18) |
References | 7211 | 23229 | 37 | Refences (1), Refereces (2), references (1), Reference (96), Refererences (1) |
Related terms | 10839 | 17649 | 238 | Related items (1), Related forms (10), Realted terms (6), Relative terms (5), Related term (72), Related torms (1), Related Forms (2), Related verbs (7), Related Sites (1), Related Term (2), Releated terms (1), Related termws (1), related terms (9), Related Terms (355) |
See also | 18048 | 7125 | 105 | See əlso (1), See also: (1), See slso (1), See also (1), See Aslo (1), Sell also (1), See aso (1), See Also (250), see also (6), see also: (1) |
Synonyms | 4863 | 18114 | 528 | Synonoms (2), Synoynms (6), Synonims (5), Synonymbs (1), Synonym (508), Sybonyms (7), Synonmyms (1), Synyonyms (1), Synonyme (1), Synonymous (1), Synonymns (5), Synonysm (1), Synonymss (1), Synoyms (1), Synonyymi (1), synonym (1), Synonys (1), Synonyns (1), Synonmym (1), Synonynms (1), Synonymes (11) |
Translations | 2997 | 29837 | 1212 | ?Translations (1), Translations (1), translations (4), Tranlations (2), Transltions (2), Transaltions (1), Translations: (1), Translation (341), translation (1), Translatons (4), Traslation (1) |
Troponyms | 0 | 8 | 3 | |
Usage notes | 1444 | 3115 | 91 | Usage notet (1), Usage Notes (20), Usage noted (1), Usage noes (1), Usage Note (17), Usage note (536) |
(table updated June 30, 2007; the previous table, May 25, is here. The analysis code is from AutoFormat, it will—sooner or later—fix all these.)
Notes:
- except for Alternative spellings (normally first at L3), and See also and External links (normally last at L3), all the others occur significantly more often at L4 than L3
- Aren't "===References===" also supposed to go only as L3 at the very end? --Connel MacKenzie 20:14, 14 June 2007 (UTC)
- The only mention of it in ELE is the example, and it is shown at L4, under a POS. References is then described after the definitions but before other L4 stuff like Translations. (!) The vast majority of the existing ones are at L4 as you can see from the above ;-) Robert Ullmann 14:35, 16 June 2007 (UTC)
- Coordinate terms is mentioned in ELE, but as you can see, not used
- Conjugation, Declension and Inflection are not in ELE now, but used extensively in non-English entries
See Wiktionary:Votes/pl-2007-06/ELE level 4 header sequence for a proposed order. (Based on EP's suggested order.) I'm not sure where to put Alt-spell when it is under an L4 header, just before Synonyms seemed reasonable. The logic is/was that the things referenced should move from most closely associated (inflections), to synonyms, to things related, to translations in other languages, to other things (see also, external links). Robert Ullmann 14:28, 14 June 2007 (UTC)
- How about putting the alternative spellings just before quotations? It makes sense to have a wide range of quotations in main entries, and that means there will (eventually) be various spellings in them, so mentioning the spellings before their appearance in the quotations would make sense. -- Coffee2theorems 19:59, 14 June 2007 (UTC)
- Does the current discussion matter to the order of the headers though? I think the proposal is fine, although I'd move alternative spellings one place up in the list, but that's really minor. -- Coffee2theorems 08:47, 16 June 2007 (UTC)
I'm just taking Alt-spell out of the L4 ordering list. We could place it later if wanted. As shown in the stats above, this isn't a very big issue anyway. It is almost always at L3. (Note that the stats include L3 at the end, which happens too.) draft vote page edited Robert Ullmann 14:41, 16 June 2007 (UTC)
- The order proposed on the vote page seems reasonable to me. I believe I'd vote for it. — Beobach972 20:35, 16 June 2007 (UTC)
- Likewise, I'm OK with the proposed sequence; disregarding the question of whether or not some of the headers should be at L3 or L4. (FWIW, the last three items in particular - those following Translations - are the ones I would prefer most to see at L3) I think it's best for now to set aside the whole issue of Alternative spellings and/or forms as that seems to be a hot topic. --EncycloPetey 21:15, 16 June 2007 (UTC)
For reference (if any of you ever wondered what WT:ELE would look like with as many of its headers implemented as possible, or if you wanted an understanding of the status quo re: header sequence), I have formatted flash and paw according to [my understanding of] WT:ELE and made use of as many headers as I could possibly... — Beobach972 21:28, 16 June 2007 (UTC)
(moved "alternative spellings/forms" comment to #ELE Alternative forms header) Rod (A. Smith) 22:32, 16 June 2007 (UTC)
Thanks, Beobach972. Those two entries make things a lot easier and clearer. It also led me to a point that I wanted to check: in no time is listed as an adverb and in the category idiom. I would like to see a category "prepositional phrases". A lot of these idioms are in fact prepositional phrases. I would like to add a number of them, such as in a flash, to the wikt. What do you think? Algrif 14:43, 18 June 2007 (UTC)
Thanks also. Your example does point out one problem I have with the current ELE, namely, the needless repetition of pronunciation information. In both flash and paw, there is a single pronunciation but this information is repeated in the entry because we subsume it under the Etymology header. This is not part of the current discussion, so I won't do more than raise the issue now so that people may recall it at a later date. --EncycloPetey 21:44, 18 June 2007 (UTC)
- In neither of these cases is that needed, the Pronunciation can just be left at L3 at the top when it is not specific to the Etymology; lots of existing entries do that. As you say, a separate topic Robert Ullmann 13:49, 19 June 2007 (UTC)
- Well, I disagree; I think it very much needs to be listed with every etymological section. Paw is pronounced /pɔː/ whether it's ‘that man is my paw’ or ‘that limping dog has injured his paw by stepping on a nail’, but not all words are so neat. It's very important to be consistent, so if we were to list pronunciation information at the top of paw, we'd list it at the top of resume, too, ‘/rɪˈzjuːm/ or /ˈɹɛzuːˌmeɪ/’ — which would be unhelpful. Giving it with every sense is redundant, but harmless; not doing so, however, could lead to confusion. — Beobach972 21:35, 19 June 2007 (UTC)
- The key is that it's irrelevant to this discussion, but I would like people to consider later that putting Pronunciation ahead of Etymology consistently would eliminate the need to repeat pronunciation sections. --EncycloPetey 21:39, 19 June 2007 (UTC)
You've used Coordinate terms which (as noted in the table above) isn't used anywhere else. Do you think it should be retained? I was just going to drop it. Robert Ullmann 13:49, 19 June 2007 (UTC)
- My opinion is ‘why not?’ : if you can think of some reason why it would be bad to have it, well, we can drop it. If it's harmless, though, I'd say keep it; I'll make use of it. I can see a rationale for keeping it, too : a number of the references I checked in my quest to identify what ‘co-ordinate terms’ were bemoaned their absence from dictionaries; it is, conceivably, useful information. — Beobach972 21:35, 19 June 2007 (UTC)
- Therein lies the key problem. No one knows what a coordinate term is, so the header is never used. --EncycloPetey 21:41, 19 June 2007 (UTC)
- To be honest, I'm not sure a separate coordinate terms section is terribly useful; in cases where coordinate terms aren't obvious, I think it's usually best to include them right in the definition line. (That said, I wouldn't mind allowing for the possibility of such a section, provided ELE explained how it was to be used.) —RuakhTALK 22:34, 19 June 2007 (UTC)
- I'm not sure why/how we'd include them in the definition line (take a look at iron!). Anyway, the best reason I see for not using the header very often (and this is different from removing it from CFI and not allowing it!) is that the goal is already accomplished by categories. Say you look at the entry for iron, a chemical element, and you want to know other elements. You could look in the co-ordinate terms section ... or you could look in Category:Elements, linked at the bottom of the page. I'd still vote ‘weak keep’ on the header, as many words do not have categories for their co-ordinate terms (for flash, for example, there is no Category:Verbs that mean 'emit light'). — Beobach972 02:25, 20 June 2007 (UTC)
- But that function can be filled by Wikisaurus. --EncycloPetey 01:41, 21 June 2007 (UTC)
- No, because if the words were synonymous, we'd list them in the Synonyms section, and not even resort to Wikisaurus. That function could perhaps be fulfilled by See also, though. — Beobach972 02:50, 21 June 2007 (UTC)
- I think you should take another look at Wikisaurus. It is not a strict list of synonyms, or there would be little point in having it. Look particularly at Wikisaurus:laugh and Wikisaurus:old for example. There are synonyms, antonyms, and near synonyms, as well as "see also" terms. The value of accumulating terms in Wikisaurus is that we don't have to maintain a complete list on each and every page that needs the list. I think some form of the Coordinate terms idea would serve better on a Wikisaurus page. --EncycloPetey 18:06, 21 June 2007 (UTC)
- We're keeping all the -onyms, right? — Beobach972 02:50, 21 June 2007 (UTC)
- Well, I think the point right now is just to decide that "allowable" -onyms will follow Synonyms and Antonyms, and to figure out at a later time exactly which ones we want to keep. We're just trying to determine sequence right now without worrying too much about what gets kept or removed. --EncycloPetey 18:06, 21 June 2007 (UTC)
Can we have a bot fix all the obvious spelling and capitalization errors reported in the table above? bd2412 T 02:06, 20 June 2007 (UTC)
- I don't think that's a very good idea. The cleanup lists I generate provide a list of atrocious entries; it is very rare that one header is misspelled, while the rest of the entry is perfect. Usually, the heading misspelling is indicative of an entry that is formatted completely wrong. E.g. User:Connel MacKenzie/todo2. Using a bot to correct only the headings would simply mask those problem entries that (usually) need to be overhauled. --Connel MacKenzie 16:59, 21 June 2007 (UTC) NOTE: the XML dumps have not been running this month, so this is getting out of date. The earlier sections of that page have been removed as they have been cleaned up, leaving "new/experimental" headings at the start of that list. --Connel MacKenzie 17:09, 21 June 2007 (UTC)
- Are the above covered in your cleanup lists? If not, can we have a bot make a list of the words which have each of the above errors, for human perusal? bd2412 T 22:35, 22 June 2007 (UTC)
- The headers above and Connel's todo list(s) are different, overlapping, sets. These are all typos, almost all capitalization or plural/non-plural variations. They do not indicate that there is anything else wrong with the entry. See for example 索取 where A-cai managed to write "External lonks" but the entry is fine. AF will (as noted above) sooner or later fix them all, whilst also running all of its other checks and tagging problems. (you can look at some examples of level 3 and level 4 headers that are in the list above, the ones that are not typos are tagged by AF) Robert Ullmann 12:32, 30 June 2007 (UTC)
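As a rough illustration of the kind of header fix being discussed, the following sketch maps a handful of the variants reported in the table to their canonical spellings. It is not AutoFormat's code, which is not shown in this discussion; the names `VARIANTS` and `fix_headers` are hypothetical, and the mapping contains only a few sample rows from the table rather than the full list.

```python
import re

# Sample of misspelled or miscapitalized headers taken from the table above,
# mapped to the canonical form. A real list would be much longer.
VARIANTS = {
    "External lonks": "External links",
    "External Links": "External links",
    "See Also": "See also",
    "Synonym": "Synonyms",
    "Derived Terms": "Derived terms",
}

# A wikitext header line: equal signs, the title, then equal signs again.
HEADER_RE = re.compile(r"^(=+)\s*(.*?)\s*(=+)\s*$")


def fix_headers(wikitext):
    """Replace known header variants; leave every other line untouched."""
    fixed = []
    for line in wikitext.splitlines():
        match = HEADER_RE.match(line)
        # Only touch balanced headers whose title is a known variant.
        if match and match.group(1) == match.group(3):
            title = match.group(2)
            if title in VARIANTS:
                line = f"{match.group(1)}{VARIANTS[title]}{match.group(3)}"
        fixed.append(line)
    return "\n".join(fixed)
```

A sketch like this only rewrites the heading text, so it does not address the concern raised above that a misspelled heading may signal an entry needing a fuller overhaul; flagging the entry for review alongside the fix would be a separate step.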
One thing that bothers me about the table presented at the start of this section: for L4 headings, it does not indicate which ones are properly nested under etymologies, vs. which ones are simply mistakes. (As I've argued elsewhere, with a sane reorganization of headings, they'd all be at L3, but never mind that for now.) I think the designations listed in the table above are even more misleading than I initially assumed. We should be discussing only where they should be. --Connel MacKenzie 17:23, 21 June 2007 (UTC)
- That's a separate issue. Right now we're deciding whether they should be in this sequence assuming for the moment that we put them at L4 nested under a POS section. The issue of whether they should be at L4 or L3 is a separate one (albeit tangled in this issue a bit). In other words, we'll probably discuss and vote on level of placement next, but deal for now with the sequence on the assumption that they will be used as currently demonstrated in the ELE examples. Trying to tackle both issues at once makes the discussion too big for one vote. --EncycloPetey 18:06, 21 June 2007 (UTC)
- To resolve this "separate issue", how about Wiktionary:Votes/2007-06/Level of basic headings? DAVilla 23:05, 28 June 2007 (UTC)
Language ordering
I don't know if this has already been discussed elsewhere, but: currently, languages with "Old" or "Middle" etc. in their names are sorted under O and M and so forth. Wouldn't it make more sense to sort them by the language name first and the qualifier(s) second (i.e. as if they were "French, Old"; "Dutch, Middle", etc.) so that they appear next to the modern forms? (This would only affect ordering; they'd still be written ==Old French==, ==Middle Dutch==.) --Ptcamn 08:35, 16 June 2007 (UTC)
- See Wiktionary:Beer_parlour_archive/2007/April#Language_sort_order. Robert Ullmann 14:29, 16 June 2007 (UTC)
- Well, as that discussion never really concluded anything, I'd like to say that I agree with DAVilla. Language ordering should be kept as simple as possible, allowing for software enhancements to deal with "grouping" and "sub-grouping" issues. That said, I think this is a good reminder that we still need a vote on using HT's extension. --Connel MacKenzie 16:44, 16 June 2007 (UTC)
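The ordering proposed at the top of this section (sort by language name first, qualifier second, while leaving the displayed header unchanged) can be sketched as a sort key. This is only an illustration under stated assumptions: the qualifier list `QUALIFIERS` and the function name `language_sort_key` are hypothetical, and no Wiktionary software is known to work this way.

```python
# Assumed set of leading qualifiers; the proposal mentions "Old" and "Middle".
QUALIFIERS = ("Old", "Middle")


def language_sort_key(name):
    """Sort "Old French" as if it were written "French, Old"."""
    words = name.split()
    leading = []
    # Peel qualifiers off the front, but never consume the whole name.
    while len(words) > 1 and words[0] in QUALIFIERS:
        leading.append(words.pop(0))
    return (" ".join(words), " ".join(leading))


languages = ["Old French", "Dutch", "French", "Middle Dutch", "Old English", "English"]
ordered = sorted(languages, key=language_sort_key)
```

With this key the modern language sorts immediately before its older stages, because an empty qualifier compares lower than "Middle" or "Old". The simpler alternative favoured in the reply above is plain alphabetical order with any grouping left to software.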
Endorsements now open for Wikimedia Foundation Board
The Wikimedia Board Election Steering Committee invites all community members to endorse candidates they support. Endorsements may be submitted on meta now till next Saturday, 23:59 June 23, 2007.
Each qualified community member can submit up to three endorsements.
Please note several things:
- Only confirmed candidates are listed, so the list can be updated during the endorsements phase.
- You need an account on meta, not just the project that you are qualified to vote under, unless you meet the criteria on meta too.
- Please link your meta user page and your home wiki page. Detailed procedure can be found on the meta endorsement page.
All information is available on meta at:
On endorsements:
http://meta.wikimedia.org/wiki/Board_elections/2007/Endorsements/en
On each candidate:
http://meta.wikimedia.org/wiki/Board_elections/2007/Candidates/en
Election general: http://meta.wikimedia.org/wiki/Board_elections/2007/en
FAQ: http://meta.wikimedia.org/wiki/Board_elections/2007/FAQ/en
Questions about the election are welcome at:
http://meta.wikimedia.org/wiki/Talk:Board_elections/2007/FAQ
Thanks to devoted volunteering translators, those pages are also available in some languages other than English.
Thank you for your attention, we look forward to your participation.
For the election committee,
- Philippe | Talk 00:35, 17 June 2007 (UTC)
A category for Metonym and/or Synecdoche
I found a word that was a metonym, and created a category:metonym, meaning to find a load of metonym words and include them in the category. Then I realised that a lot of the words were already labelled synecdoche. eg: wheels. On further investigation, there is no clear boundary between these two "classes", there is a lot of overlap, or confusion. Metonym is arguably a superset of Synecdoche. So, perhaps a category for both would be preferable/sufficient. But then, what to call such a category ? Or maybe not to bother ? Ideas ?--Richardb 01:41, 17 June 2007 (UTC)
- Almost any word can be used as a metonym, so having a category would be pointless. It's a figure of speech construction, not a grammatical function or a particular definition. A writer can say "A wagging tail greeted me," when in fact it would be understood that it was not a disembodied tail that did the greeting, but a dog. I think this could be better handled with an Appendix explaining the phenomenon and some of the common examples. --EncycloPetey 02:27, 17 June 2007 (UTC)
- Yes, but perhaps we could give constructions that are so notable/idiomatic/whatever-it-is-on-which-we're-basing-our-definitions that we define them : we have no entry for wagging tail, but we do have Kremlin. — Beobach972 02:39, 17 June 2007 (UTC)
- How do we then get people to limit additions to the category? I could say "a face loomed in the darkness" (and we have an entry for face). I could say "She's the brains of the operation." This sense of brain/brains would currently be labelled (idiomatic) in the definition rather than metonymy, because that sense is so common we no longer think of it as metonymy but a distinct definition sense. We would be hard pressed to set up clear boundaries between "too rare to be listed", "obvious metonymy", and "idiomatic sense in its own right". An Appendix would be less likely to grow out of control and easier to deal with. --EncycloPetey 02:47, 17 June 2007 (UTC)
- Ah, good point. I suppose an appendix would be best. — Beobach972 02:47, 21 June 2007 (UTC)
New feature on development branch of Lucene search
I just got a note from the developer of http://ls2.wikimedia.org/ who mentioned that he added one of the features I requested. (The "exact case" matching. Compare a search here for "cliche" vs. a search there...the #1 hit is correct in the new search.) Please try it out. I'm sure he is eager for more feedback (not necessarily from me - but if we coordinate what we as a community want to request...) --Connel MacKenzie 03:38, 17 June 2007 (UTC)
P.S. Anyone know how to describe how Hebrew "pointing" should be treated by the search engine? --Connel MacKenzie 03:38, 17 June 2007 (UTC)
- Please also note the importance of timely feedback, as he is working closely with the core MediaWiki developers on this, primarily for Wikipedia. Once it is part of the main SVN branch, there will only be bugfixes, not new features. --Connel MacKenzie 03:44, 17 June 2007 (UTC)
- Hebrew "pointing" (vowels) should be ignored; I don't believe any Wikimedia project uses them more than half the time. Pointing should be stripped both from the search indices (e.g., an entry that contains "רוּחַ" should be treated as though it contained "רוח") and hence, obviously, from searches (e.g., a search for "רוּחַ" should be treated as a search for "רוח"). —RuakhTALK 17:42, 17 June 2007 (UTC)
- I guess my question is more along the lines of "what is pointing" from the technical encoding perspective. That said, does the "ls2" do that searching correctly? (Looking at your example, it looks like both "ls2" and the current search engine work?) --Connel MacKenzie 17:20, 19 June 2007 (UTC)
- From the technical encoding perspective, "pointing" are the Hebrew-script characters of type "Non-Spacing Mark" (Mn) — viz, U+05B0–05B9, U+05BB–05BD, U+05BF, U+05C1–05C2 (sixteen characters in all, all in the Hebrew block). Unicode also has a number of pre-composed letter-plus-pointing characters, which make up the majority of the Hebrew-script characters in the Alphabetic Presentation Forms block; the MediaWiki software already handles these properly by splitting them up into the individual characters in article titles and content, so I imagine the search software would need to do the same.
- Neither "ls2" nor the current search engine works properly with regards to pointing; they both treat pointing as word separators, when they should actually treat them as though they weren't there at all. (For a concrete example: a search for "תפר" (to sew) should pull up תּפר (to sew), while a search for "פר" (bull) should not. Currently, however, the reverse is true, because both search engines treat "תּפר" the same as they would "ת פר". Is that clear at all?) —RuakhTALK 19:46, 19 June 2007 (UTC)
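- The stripping rule described above can be sketched in a few lines of Python. This is only an illustration, not the search engine's actual code: the function name strip_pointing is hypothetical, and it assumes the sixteen code points listed above are the full set to drop. Canonical decomposition (NFD) is used to split the pre-composed letter-plus-pointing characters first.

```python
import unicodedata

# Code points named in the discussion as Hebrew pointing (assumed complete).
HEBREW_POINTS = (
    set(range(0x05B0, 0x05BA))    # U+05B0 to U+05B9
    | set(range(0x05BB, 0x05BE))  # U+05BB to U+05BD
    | {0x05BF, 0x05C1, 0x05C2}
)


def strip_pointing(text: str) -> str:
    """Return text with Hebrew pointing removed (hypothetical helper)."""
    # NFD splits pre-composed presentation forms into letter + point.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if ord(ch) not in HEBREW_POINTS)


# Index terms and search queries would both go through the same function,
# so a pointed query and an unpointed entry end up comparing equal.
pointed = "\u05E8\u05D5\u05BC\u05D7\u05B7"   # resh, vav + dagesh, het + patah
unpointed = "\u05E8\u05D5\u05D7"             # resh, vav, het
same = strip_pointing(pointed) == unpointed
```

Applying the same function on both the indexing side and the query side is what keeps the points from acting as word separators.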
- I suspect the same issue will apply to Devanagari and other Indic scripts. Dijan might be able to provide some input. --EncycloPetey 20:09, 19 June 2007 (UTC)
- By the way, does this give us the opportunity to ask that ligatures like "ĳ" (which apparently is common in Dutch) be treated like their components, in this case "ij"? —RuakhTALK 17:47, 17 June 2007 (UTC)
- Absolutely. Is there a list of these you can point to, or send directly to Robert Stojnić (with explanation)? --Connel MacKenzie 17:20, 19 June 2007 (UTC)
- Unfortunately, I don't know the relevant languages. I'll see if I can bug someone who can be more helpful. :-) —RuakhTALK 19:46, 19 June 2007 (UTC)
- At least Dutch, Croatian (& Bosnian), Slovene, and Czech. These all have digraphic characters, specifically Dž (ǅ), dž (ǆ), d' (ď), l' (ľ), t' (ť), IJ (Ĳ), ij (ĳ), LJ (Ǉ), Lj (ǈ), lj (ǉ), NJ (Ǌ), Nj (ǋ), nj (ǌ). All characters in parentheses are the single-unicode form; those not in parentheses are component character forms. There's also the issue of searching for AE (Æ), ae (æ), OE (Œ), & oe (œ). --EncycloPetey 20:07, 19 June 2007 (UTC)
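- For what it's worth, the folding requested here could look roughly like the Python sketch below. It is an assumption-laden illustration, not what the search engine does: fold_digraphs and EXTRA_FOLDS are hypothetical names. Unicode compatibility normalization (NFKC) already splits the single-code-point digraphs such as ĳ and ǆ, but Æ and Œ have no Unicode decomposition, so they need an explicit table. Note that NFKC also folds other compatibility characters (full-width letters, for instance), which may or may not be wanted.

```python
import unicodedata

# Letters with no Unicode decomposition; mapped by hand (assumed list).
EXTRA_FOLDS = {"Æ": "AE", "æ": "ae", "Œ": "OE", "œ": "oe"}


def fold_digraphs(text: str) -> str:
    """Replace single-code-point digraphs with their component letters."""
    # NFKC turns U+0133 into "ij", U+01C6 into "d" + "ž", and so on,
    # while keeping accented letters such as "ž" composed.
    composed = unicodedata.normalize("NFKC", text)
    return "".join(EXTRA_FOLDS.get(ch, ch) for ch in composed)


# The ligature form and the two-letter form fold to the same string.
matches = fold_digraphs("\u0133sberg") == fold_digraphs("ijsberg")
```

As with the pointing case, the folding would have to be applied to both the index and the query for the two spellings to find each other.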
Fixed. Verify at http://ls2.wikimedia.org/
It's been fixed for enwiktionary at the moment; others will come a bit later. --Rainman 10:02, 22 June 2007 (UTC)
Project Phrase
I think it would be a good thing if we made something (a project) that lets you place a phrase, no matter how long, in the project and translates every word. We can either make something to place on each page saying what the word means, which doesn't show on the page; it links back here, where the word can be translated in the phrase or sentence (Example: **(Meaning)**). Or we could make a list for the project that has every word and its meaning, or a simpler meaning if it's already in English. It could be in any form, like a different page.
Box for phrase or sentence.
box down here or up there that lists every language we have here.
Box for results.
The language box could be for the language you want it to be translated into. (This can be useful to tell the project whether it should find the word a simpler meaning if the original word is already in the same language they want to translate it into.) If the word has several meanings, in the same language or in several languages, the results should have several results to cover the word. The results should also have all the languages the word comes in when it's used. —This unsigned comment was added by 70.130.186.37 (talk • contribs) 2007-06-18T01:26:01.
- If I understand your suggestion, you are requesting a tool to automate translations into English. Note that machine translation is much more difficult than it might appear at first blush. Rod (A. Smith) 02:34, 18 June 2007 (UTC)
- Creating a "Babel-fish"-like translator derived from Wiktionary data would be a neat side-project, but I don't think it would integrate very well into the MediaWiki software directly. Creating a dictionary of un-idiomatic phrases (even if translations were limited to, say 18 languages) would be quite enormous. The number of entries would be quite staggering, (both in terms of volunteers to enter it, and computing power to support it.) --Connel MacKenzie 17:10, 19 June 2007 (UTC)
Linking directly to specific meanings
I am an infrequent Wiktionary participant, so please forgive me (and direct me to the appropriate pages) if this question has already been addressed. But I haven't noticed it discussed in the current version of this page, and am not sure how to phrase search terms to find it among the vast discussion archives.
I have had many situations in which I wanted to link either within WT or from another Wikimedia project not just to an entry here, but to a specific meaning. Most recently, I wanted to specify current meaning #7 of cool when updating ginchy. Not finding an obvious mechanism in use, I settled for a version of what many dictionaries use:
This isn't especially satisfactory for two reasons:
- It doesn't take advantage of wiki software to take the user directly to the relevant definition. (This is even more important when the link comes from outside WT, like from Wikipedia or Wikiquote, where there is less familiarity with this project's formatting.)
- Any renumbering of the list in the target page will render this definition inaccurate, even misleading. For interwiki users, one cannot even use "What links here" to find potential inaccuracies to fix them.
I know of at least one way to provide a specific link that avoids these problems, but I'd like to know what the community has discussed first. Thank you for your assistance. ~ Jeff Q 03:05, 18 June 2007 (UTC)
- Since the number associated with senses is susceptible to change, we usually indicate a particular sense by enclosing the first gloss of that sense in parentheses, but I don't know of a good way to apply that to your example. Rod (A. Smith) 04:44, 18 June 2007 (UTC)
- That does make sense for Wiktionary definitions, since it provides an immediate specific sense. On the other hand, it doesn't help the interwiki linking for term definition. An example I run into periodically is linking a word or phrase in a Wikiquote quotation to the relevant Wiktionary entry, especially for colloquialisms that one cannot expect most English speakers to understand. In such a context, it is impractical to insert a definition into the originating project's material. ~ Jeff Q 04:50, 18 June 2007 (UTC)
- Rod, I'm a little confused. You revised your response while I was responding, so my response is actually to your specific example which I repeat here:
- Am I right in thinking that this is the preferred method (over the unreliable superscript number I tried) to give the specific sense meant in Wiktionary definitions? If so, it would seem to resolve my urgent WT-only issue, although a link solution might also be useful here as well as for interwiki.
- Also, I didn't mean to be mysterious before. My idea is one that has been in use at Wikipedia and Wikiquote for a while — internal non-heading anchors using an empty XHTML "span" tag as a prefix. For example, one could include the following for the above "cool" definition like so:
# <span id="popular"/> Being considered as ''"popular"'' by others.
- The display would be identical to the current line in cool, but any project could link to it using the syntax
[[wikt:cool#popular]]
- For a live interwiki example, try this anchored link to Wikiquote:
[[q:en:Latin proverbs#Kill_them_all|''Kill them all. Let God sort them out.'']]
- which becomes:
- At Wikiquote, we use this extremely sparingly, because it isn't very wiki-editor-friendly, and we aren't sure how well it will scale. But it can be used selectively for cross-project links. Thoughts? ~ Jeff Q 05:10, 18 June 2007 (UTC)
- It is a nice idea regardless of how it works out. Maybe the current context tag could be updated to handle the ID. Something like:
{{context|slang|id=pop}}
- --Halliburton Shill 06:12, 18 June 2007 (UTC)
- One problem is that our senses are typically a single line; they're not like whole sections of Wikipedia articles. Linking to a specific line doesn't work very well, because it's not generally obvious to readers exactly what line is the one they're supposed to be focused on. (That said, we could add some JavaScript to MediaWiki:Common.js that would highlight whatever sense were thus linked to.) —RuakhTALK 06:16, 18 June 2007 (UTC)
- Well, browser users might reasonably be expected to read the material that their browser jumps to, which is how anchors work. I agree that highlighting through JavaScript would be useful, but I try to avoid complications that intefere with and can delay simple and ready implementations of practical ideas. As far as the ID names, I would highly recommend avoiding the tendency to unnecessarily abbreviate clarifying information. (For instance, "pop" is ambiguous in the above example, possibly making one wonder whether the "id" parameter is some kind of pop-up mechanism or other impenetrable concept for non-geeks. "Popular", when part of a link for "cool", is much more intuitive and less likely to be misunderstood or to overlap another anchor on the page.) We seasoned wiki editors seem to have a tendency to try to abbreviate things unnecessarily to save a few characters. We must remember that our primary editing audience may not be tech-savvy, most wiki editors learn significantly by osmosis (i.e., examining existing text to deduce its function), and a few extra characters are a small price to pay for clarity (especially for a non-paper work). ~ Jeff Q 02:56, 19 June 2007 (UTC)
- Re: "browser users might reasonably be expected to read the material that their browser jumps to": Not at all. Not when the material is a single line. Either their browsers implement fragment identifiers by scrolling such that the element starts at the very top of the viewport, in which case the one line is well above the center of the users' attention, or they scroll such that the element starts somewhere within the viewport (which IIRC is how older browsers did it, and which still happens when there's not a full viewportful of content after the start of the element), in which case it's not so easy to see exactly what line is being linked to. Re: "I try to avoid complications that inte[r]fere with and can delay simple and ready implementations of practical ideas": Well, since the JavaScript isn't necessary for such links to work (it just makes them more likely to be useful), there's no reason we can't simply and readily implement the practical idea if we want, and later add the JavaScript. :-) —RuakhTALK 05:05, 19 June 2007 (UTC)
- Don't get me wrong — I'd love to see a non-flashy but noticeable highlighting, perhaps like this, to avoid the short-content problem you describe for some pages — surely quite a few in a web dictionary. (I disagree with the implication that most users look at the center of a screen first. I believe it's a universal expectation [within the English-speaking world, at least] that one starts reading a page from the top, so a new page should be expected to draw one's attention to its top, not its center.) I've just seen too many great ideas fall apart over technical complications, either failing to be implemented, or being done in such a way that creates a huge barrier to participation for ordinary people. That's a major reason I haven't tried to push span-tagging as a general solution. I find most wiki editors (on Wikiquote at least) have enough of a challenge just following basic wiki formatting, without adding XHTML stuff, easy as it might be for us geeks to understand. But given such span-tagging or other direct linking to specific senses, a editor-transparent implementation of context/sense highlighting would be very useful. ~ Jeff Q 06:02, 19 June 2007 (UTC)
- Re: "I disagree with the implication that most users look at the center of a screen first. I believe it's a universal expectation [within the English-speaking world, at least] that one starts reading a page from the top": I certainly don't think they look at the center, and I'm sorry if I implied that they do. I think they look near the top. But the very top of every page is reserved for navigation links or a site banner or a little friendly whitespace. Re: "But given such span-tagging or other direct linking to specific senses, a[n] editor-transparent implementation of context/sense highlighting would be very useful.": Yes, sorry, I didn't mean to suggest that editors should have to do anything to support the highlighting. I think it would be pretty simple to add a window.onload handler to MediaWiki:Common.js that would do any necessary highlighting, without any work on an editor's part (aside perhaps from making sure that the identified element is directly a child of the sense's <li> element). —RuakhTALK 06:18, 19 June 2007 (UTC)
- Someone once told me (mid-2005) not to get my hopes too high for Wiktionary formatting; one can expect a complete reformatting every three months (at that time, that was true) as contributors come and go. While the format has stabilized somewhat since then, it still is not set in stone (although it takes much longer to change now.)
- Using "span"s to work-around Wiki design problems doesn't sit well with me. I'd rather see creative re-nesting of headings. But such an experiment would depend on a complete re-arrangement of our current headings. I could see, for example, the entry lead/experiment2 (confer lead/experiment) having L4 headings underneath "Definitions" that have the particular gloss as the actual L4 heading.
- Using such a technique eliminates the visual numbering within the content of the page, for numbered definitions. But the numbers do still appear in the TOC. Additionally, duplicate senses tend to be exposed much more readily (duplicate headings.) One nasty problem is finding what pages might link to a particular sub-heading.
- I'll finish the other 30 senses at lead/experiment2 later. I'm sure you can see what I mean from the first 10 senses.
- I've also noticed how convenient it is to have the definitions listed in the TOC (to the reader.) Such an arrangement eliminates essentially all complaints I could ever have, regarding the overloading of entries with "linguist-only" information...the information that is relevant to casual readers is presented first, in theory, allowing for a great deal of expansion nested below.
- Linking to sub-senses would look something like this: lead/experiment2#black lead used in pencils (without "/experiment2", of course.)
- I think the "span" syntax proposed above is far beyond any newcomer's ability. Therefore, it would be "just asking" for trouble.
- --Connel MacKenzie 16:55, 19 June 2007 (UTC)
- You folks obviously know better than I, but it seems to me that the number of different critical Wiktionary needs that might be easily addressed by nested headings is so large that it cannot be done usefully and universally due to the underlying XHTML heading limitations. Also, Connel, I think your "experiment2" is quite logical, but is also a significant departure from typical dictionary practice, and in my mind calls too much attention to the sense headings over the content. On the other hand, I agree that "span id" is unsatisfactory for general use. Had I the time and temerity, I might try to come up with a wiki-markup mechanism for non-heading anchors, especially given the many uses I've found for their use. Oh, well. ~ Jeff Q 00:43, 24 June 2007 (UTC)
- This is something that we really need to address. Some kind of solution is necessary, in my estimation, for Wiktionary to be as open and usable as it should be (and for us to accomplish our short-term mission of leaving every other web dictionary in the dust). Wikis are excellent tools for building semantic networks, but when we start having to use arbitrary numerical labels ("sense #4 under English etymology #3") instead of intuitive semantic labels ("English sense meaning 'pencil lead'"), we're just creating a mess which is a) unsustainable, b) inaccessible to the general public, and c) a horrendous pain in the butt for all concerned. Basically there are two issues here:
- 1. The distinctions between homonyms are unclear. The semantic structure of an entry needs to be somehow accessible from the TOC; an opaque list of "Etymologies 1-5" followed by parts of speech doesn't cut it. Whenever I come to an entry like that (and I've made quite a few myself) I find myself reaching for a print dictionary... For most words with multiple etymologies, each etymology has a fairly distinct core sense; given that, I'm not sure why we can't insert labels parenthetically into the Etymology headings (e.g. "Etymology 1 (metal)"). At the moment, however, User:AutoFormat goes around "fixing" those whenever I try to add them.
- 2. The distinctions between definitions are unclear. Linking to specific definitions using semantic labels, preferably with some kind of automated highlighting, would be a major plus.
- Now, I don't really understand what's wrong with using span, provided it's embedded in a standard template (or set of standard templates). The more relevant issue in my estimation is "how do we go about formulating consistent rules for choosing labels?" Of course I may be missing something (or several things)... At any rate, this subject merits further in-depth discussion and experimentation; thinktank page, anyone? -- Visviva 08:13, 24 June 2007 (UTC)
Policy on mentioning titles of TV shows
I new here and I'm wondering if it's OK to have definition of a phrase, mention a TV show be the same name because that is the case with arrested development. 80.109.79.136 16:11, 18 June 2007 (UTC)
- Very few television show titles meet WT:CFI. "Arrested Development" certainly doesn't, so I removed that sense of the phrase. Thanks for pointing it out. Rod (A. Smith) 16:28, 18 June 2007 (UTC)
- Actually, User:Ruakh beat me to the edit, but the entry is fixed nevertheless. Rod (A. Smith) 16:41, 18 June 2007 (UTC)
- Not the definition, no, but the fact that it is the title of a television series can serve as a citation attesting to the term's use. --EncycloPetey 21:36, 18 June 2007 (UTC)
- I just reread my entry above, and I can't read it (i.e. I have no clue, just by reading it, what it says). To me, it seems like horrible grammar. It could just be a psychological problem (like not being able to recognize faces, or maybe I was, or am now, just tired) but can someone write a corrected version, please? 80.109.79.136 13:27, 19 June 2007 (UTC)
Appendix name: make do take have
Hi. I've just created a new Appendix:MakeDoTakeHave. I'm not sure about the page name, though. Has anyone got a better suggestion? Algrif 17:02, 19 June 2007 (UTC)
- Not sure I understand how useful this is - so you "make" a telephone call, but you also "take" a telephone call. This doesn't explain to a non-English speaker why these are different things. bd2412 T 01:57, 20 June 2007 (UTC)
Hi. It explains in the notes that make is when you call out and take when you receive the call. But its main function is to clear up the doubts. So many en.L2 speakers spend a lot of time trying to remember which of these four verbs is correct, or possibly more importantly, incorrect. E.g. not do or have a phone call. Simply that. And as such, I see this appendix as a useful resource. I just wish I could come up with a better name before I start filling in all the collocate entries with links. Algrif 08:42, 20 June 2007 (UTC)
- I like the concept of this page, but I'm worried about where it is heading. "Take" a phone call is used, but is rare in comparison to "make." For that reason there probably is better notation that could be used to list odd-varieties. (E.g. "I will have a phone call with the company VP at 3:00PM" or talk-show slang "Let's do a phone call, shall we?") You might need to hammer out some more definitive distinction, perhaps statistical relativity of the individual collocations? --Connel MacKenzie 17:54, 21 June 2007 (UTC)
- Sorry about the frequency suggestion. Re-reading the proposal, bd2412's comments above and everyone's comments below, statistics is not the way to approach this. If I understand correctly, the problem being addressed is that these collocations are not particularly idiomatic on their own. But looking at a collection of collocations of break, for example, would yield the dissimilar derived terms: do a break (split a piece of wood), make a break (run across a dangerous open field), take a break (get a cup of coffee), have a break (sabbatical), give me a break (incredulous interjection) etc. Of those, only two of those are currently listed as even being likely Wiktionary entries (on break#Derived terms.) I still think the appendix idea has merit. Perhaps listing one table-row for each collocation, with the "idiomatic" explanation off to the right? --Connel MacKenzie 22:13, 22 June 2007 (UTC)
- "Take" a phone call isn't uncommon in British English, typically used when you have a choice to receive a call or not - e.g. for a reverse charge call ("I'll accept it" also possible), or when there are several people who could answer it, although "I'll get it" is probably more common here. Thryduulf 19:19, 21 June 2007 (UTC)
- I hear it often in the US in the form "I have to take this call," as an apology before answering one's cell phone. --EncycloPetey 19:28, 21 June 2007 (UTC)
My reason for starting this appendix is that it is a resource that is almost impossible to find anywhere else. One of my occupations is teaching EFL as part of my penance for living in Spain. This is one of those questions I hear several times daily in one form or another: I have to do a phone call. Er. ¿Es correcto eso? The list of nouns and phrases that take as collocations only one or two of these four verbs is quite long. In many languages there is no distinction between make, do, have, and take as in English. In Spanish it is almost always hacer, for example. Voy a hacer una foto (or maybe sacar) is translated NOT as I'm going to do a photo. Voy a hacer la cama is translated NOT as I'm going to do the bed. You know that. I know that. But the en.L2 person only knows it doesn't sound right, and doesn't know what is correct either. I would just like to make clear that this is just to be orientative rather than definitive. Collocations are a statistical phenomenon, so of course you will find exceptions. But collocation information is very useful for en.L2's (or any other language L2's come to that). It helps them to sound less foreign, not to mention more confident. Algrif 11:37, 22 June 2007 (UTC)
- As another EFL teacher, I have to agree that this is a very useful type of content for us to include. It would of course be ideal for such appendices to include information about relative frequency/significance in various corpora, general usage notes, etc. -- Visviva 13:09, 22 June 2007 (UTC)
- I'm sure everyone will agree that this sort of information is useful to include, but I'm not sure if exactly this kind of appendix is the best way. The problem is that (to take an extreme example) the word dump has many meanings, but only one of them has a collocation that fits into your rubric, and I'd rather not advise EFL learners to use to take a dump (=defecate) when what they really mean is to dump out, or (similarly) to use to make a scene (=draw attention to oneself) when what they really mean is to perform/act/act out a scene. Also, the most problematic feature of these collocations is that many or most nouns take more than one. Is this appendix really going to explain the difference between to make a baby (=have sex) and to have a baby? Or between to do time, to have time, to make time, and to take time? I think usage notes are best for all of these. What I think this appendix can be useful for is giving general trends — what the different verbs are typically used for — if that sort of information exists. (I'm really not sure if there are general trends, but if so, that's what this appendix would be really good for). —RuakhTALK 17:32, 22 June 2007 (UTC)
Aye! There's the rub! General trends are what cause mistakes. Half the time the correct verb is not the expected verb. But I take your point. The idea is good, but is there a better way? Certainly usage notes are a first step. This appendix could be for the most common expressions, perhaps with a note to the reader to check the exact meaning in the main entry. Algrif 17:53, 22 June 2007 (UTC)
I've made a large number of entries to the appendix, adding also the note at the top explaining that the meaning of the 4 verbs is always that of doing the activity, which should more or less eliminate mis-use, mis-interpretation. Further comments? BTW I have not added usage notes to any of the main page entries as yet. Algrif 13:34, 23 June 2007 (UTC)
- Regarding possible names for the appendix, what about Appendix:Collocations of common English verbs or Appendix:Collocations of do, have, make, and take? — Beobach972 03:32, 25 June 2007 (UTC)
- As you can now see, I've gone for the most obvious solution. Appendix:Collocations of do, have, make, and take Algrif 18:17, 27 June 2007 (UTC)
Acquiring Words and their Parts of Speech
I was wondering how I can go about getting each word in the Wiktionary and their part(s) of speech. I saw that they have made available for download a list of each entry without any other content, or the whole Wiktionary. Is there a way to just get the words and parts of speech without having to download and parse the whole Wiktionary? Thanks. —This unsigned comment was added by Vhtci (talk • contribs) 2007-06-20T17:28:22.
- At this point in time, no, there is no such list generated. It does sound immensely useful though. I assume you'd want that limited to English entries (since foreign languages often have different terminology that is simplified here to something comprehensible for an English reader.) --Connel MacKenzie 17:34, 21 June 2007 (UTC)
- http://tools.wikimedia.de/~cmackenzie/pos.txt (From last XML dump.)
- You may wish to do your own parsing, of course. For English, I normalized the headings down to the "big eight": Noun, Verb, Adjective, Adverb, Preposition, Pronoun, Conjunction, Interjection. In hindsight, I probably missed "Transitive verb" and "Intransitive verb" which should be OK, as those aren't valid headings anyhow. I probably should have removed "form-of" stuff (etc.) as well. I may play with this some more before the next XML dump. --Connel MacKenzie 01:54, 22 June 2007 (UTC)
- Updated with today's dump... --Connel MacKenzie 07:36, 29 June 2007 (UTC)
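- For anyone who does want to do their own parsing, the heading normalization described above might be sketched like this in Python. It is not the script that produced pos.txt. The function name english_parts_of_speech, the ALIASES table, and the assumption that parts of speech appear as level-3-or-deeper headings under a level-2 "English" heading are all illustrative guesses about the entry layout. Extracting each page's wikitext from the XML dump is left out.

```python
import re

# The "big eight" headings named in the discussion.
BIG_EIGHT = {
    "Noun", "Verb", "Adjective", "Adverb",
    "Preposition", "Pronoun", "Conjunction", "Interjection",
}

# Hypothetical folding of the headings mentioned as missed.
ALIASES = {"Transitive verb": "Verb", "Intransitive verb": "Verb"}

# A wiki heading line: the same run of "=" on both sides of the title.
HEADING = re.compile(r"^(={2,})\s*(.+?)\s*\1\s*$")


def english_parts_of_speech(wikitext: str) -> list:
    """List the normalized POS headings in the English section of an entry."""
    found = []
    in_english = False
    for line in wikitext.splitlines():
        match = HEADING.match(line)
        if not match:
            continue
        level, title = len(match.group(1)), match.group(2)
        if level == 2:
            # Level-2 headings are language names; track whether we are in English.
            in_english = title == "English"
            continue
        if in_english:
            pos = ALIASES.get(title, title)
            if pos in BIG_EIGHT and pos not in found:
                found.append(pos)
    return found
```

Running this over each page's text and writing out the title alongside the returned list would give a word-and-part-of-speech listing of the kind requested.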