Wiktionary:Beer parlour/2012/February

From Wiktionary, the free dictionary
This is an archive page that has been kept for historical purposes. The conversations on this page are no longer live.

Interwikis

What is going on with the iw's? Jcwf 00:15, 1 February 2012 (UTC)[reply]

Very weird, isn't it? The translations are not linked to other wikis either (if that's not the same issue). --Anatoli (обсудить) 00:31, 1 February 2012 (UTC)[reply]
[[half]] seems to be fine (both interwiki-wise and translation-link-wise); and I just tried making a null edit, to see if it might be a parser issue, and the null edit did not break it. So I don't know why some pages would be affected and some not. —RuakhTALK 01:17, 1 February 2012 (UTC)[reply]
Oh, but now WT:BP is fine. So maybe it was a parser issue, but has been fixed? If you see any other pages with this problem, maybe try making a null edit? (That is, going to the "Edit" tab and clicking "Save page" without making any changes. That won't show up in the edit-history, but it will cause the page to be re-parsed.) —RuakhTALK 01:20, 1 February 2012 (UTC)[reply]

Definition of article

I heard namespace 0 pages no longer require an internal link to be counted as article. Is this true for all wiktionaries? If so I'll update wikistats accordingly. Thanks, Erik Zachte 05:16, 1 February 2012 (UTC)[reply]

I believe it's true for all Wikimedia wikis; can anyone confirm this? Mglovesfun (talk) 13:53, 4 February 2012 (UTC)[reply]

Words requiring another word

I am thinking about words requiring another word, such as unrained (on, upon) or undwelt (in). Is there any term for these? Should we treat them in any special way, e.g. categories? Equinox 01:11, 3 February 2012 (UTC)[reply]

Hmm:
  • 1996, Herbert M. Collins with Franklin R. Hall and Michael Hopkinson, Pesticide formulations and application systems, volume 15, page 187:
    A relative value (the visual rating) was thus obtained for the test formulation. The "unrained" surfaces were not rated visually in this study as the final aim of the method evaluation was to compare the values of the "rained" surfaces of the test formulations to the "rained" surfacs of the reference formulations.
  • 2002, Demografie, volumes 43-44:
    Development of the number of dwellings registered also a considerably faster rate of growth as regarded undwelt flats ... For the first time since 1970 even the absolute growth of the number of undwelt flats was higher than of those....
But both are formed in parallel to phrasal verbs. DCDuring TALK 16:40, 3 February 2012 (UTC)[reply]
All kinds of words require other words, for example, rational number requires integer, circle requires point, etc. —AugPi 19:09, 3 February 2012 (UTC)[reply]
I think Equinox means that class of words that must be construed with a preposition. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 19:18, 3 February 2012 (UTC)[reply]

Deleting empty categories; yea or nay?

I'd like to think it's ok (but not mandatory) to delete any empty category which isn't meant to be empty most of the time. To qualify 'meant to be empty most of the time', I mean like Category:French nouns lacking gender or Category:Spanish plurals, where ideally they would never be used, but are there to catch entries with problems.

Argument for deleting empty categories: it's rather irritating to click via a link or by typing into the search box and find an empty category, such as "This category is for the [foo] names of various languages." and then having zero entries. I'd prefer a red link to a blue linked category with nothing in it. NB when the category is valid but empty, such as Category:Old Provençal terms derived from Persian (example), it can be restored immediately when used. I say this specifically in relation to Special:UnusedCategories where there are around 2000 at the moment. I think it's okay to delete the majority of these. Mglovesfun (talk) 11:03, 4 February 2012 (UTC)[reply]

I think that it is OK to delete these. People can always add such categories to their watchlist (I watch the deleted Category:Tbot entries (Italian) for example) in order to catch problems. SemperBlotto 11:13, 4 February 2012 (UTC)[reply]
It's OK with me if you delete those categories, but could you add their preambles to their respective talk pages before you do so, please? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 14:11, 4 February 2012 (UTC)[reply]
Yes but I wouldn't, it would be a waste of time. If the category is used again it can be restored. Too many more important things to be done round here. Mglovesfun (talk) 14:12, 4 February 2012 (UTC)[reply]
Obviously, there's no point in copying preambles that are merely generated by templates like {{prefixcat}}, but I think you should copy preambles that are manually inputted. Wouldn't you agree? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 14:35, 4 February 2012 (UTC)[reply]
Until we have a way to automatically generate categories as soon as they are needed, I'd really rather these kinds of categories be kept. When these are needed again, they are likely to stay redlinked for quite a while. --Yair rand 09:48, 5 February 2012 (UTC)[reply]
Yes, I agree. —RuakhTALK 23:08, 5 February 2012 (UTC)[reply]
Keep them IMO. Categories meant to be empty and there to catch mistakes are useful for (a) preambles and (b) __HIDDENCAT__. I don't understand the argument "it's rather irritating to click via a link or by typing into the search box and find an empty category": where does one see such a link (except of course on the bottom of a page, where it exists even if it's red)? How often does one search for categories by name in the search box?​—msh210 (talk) 18:30, 5 February 2012 (UTC)[reply]
I think it's more generally a case by case basis, but I'd have to lean towards Msh210 (talkcontribs) and say to keep them. -- Cirt (talk) 23:58, 13 February 2012 (UTC)[reply]
@msh210 and indeed everyone, because of our topical category system where categories are imbedded within categories en masse. As of right now, Category:History contains three empty categories. Mglovesfun (talk) 13:51, 1 March 2012 (UTC)[reply]
I agree with Mglovesfun. I'm for deleting empty categories as well. Empty categories are useless and mislead users who are looking for something. It's not a big problem to create a category if new entries appear in it. By an empty category I mean only topical and "main" categories such as XX nouns, XX verbs etc., not cleanup categories such as "entries which need X script", "translation requests" which should be empty. Maro 15:47, 7 April 2012 (UTC)[reply]
Very late input but meh, I oppose this in cases of maintenance categories like the French one Mg cited, but for non-maintenance ones knock yourself out, I guess. 50 Xylophone Players talk 01:11, 10 June 2012 (UTC)[reply]

Bot edits to fix Double redirects

Per request I am posting this here.

  • Double redirects are redirect pages that link to redirects
  • There are five types of double redirects, only one of them (first type) is typically fixable by bot
    1. ordinary double redirects: Redirects that link to other redirects that eventually lead to an article
    2. protected double redirects: Redirects that are protected from edits that link to other redirects
    3. external double redirects: Redirects that link to other redirects that eventually lead to a wiki page on another wiki
    4. self redirects: Redirects that link to themselves
    5. redirect loops: Redirects that link to other redirects that do not lead to an actual article
  • Double redirects are a navigational hazard for the reader as they will not re-redirect the user.
  • Pywikipediabot has redirect.py which can be used to handle ordinary double redirects (type 1 in the list above) when used with "double -always" parameters. (intended code here)
  • En.wiktionary gets a few double redirects once in a blue moon, so a bot flag may be unnecessary; en.wiktionary currently has no double redirects. That said, if many double redirects were created, such as by username renames or mass moves of articles, there would be a flood of recent changes, so a bot flag could be a good idea.
  • Human edits are unnecessary as the task is mundane and routine; it would be a waste of human time to keep watching the special page and to carry out edits that can be delegated to bots.
  • The bot currently operates on practically every Wikimedia wiki.
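
The five types above can be told apart mechanically. The following is a minimal Python sketch, not the logic of redirect.py itself: it assumes a hypothetical in-memory table mapping each redirect title to its target, a hypothetical set of protected titles, and a hypothetical "w:" prefix marking a target on another wiki.

```python
EXTERNAL_PREFIX = "w:"  # hypothetical marker for a target on another wiki


def classify_redirect(title, redirects, protected=frozenset()):
    """Classify the redirect `title` using the five types listed above.

    redirects: dict mapping each redirect title to its target title.
    Titles that are not keys of `redirects` are treated as real pages.
    Returns (kind, final_target); final_target is None when there is none.
    """
    target = redirects[title]
    if target == title:
        return "self redirect", None
    if target not in redirects:
        return "single redirect", target  # not a double redirect at all

    # Follow the chain until it leaves the redirect table or repeats.
    seen = {title}
    current = target
    while current in redirects:
        if current in seen:
            return "redirect loop", None
        seen.add(current)
        current = redirects[current]

    if title in protected:
        return "protected double redirect", current
    if current.startswith(EXTERNAL_PREFIX):
        return "external double redirect", current
    return "ordinary double redirect", current


def fix_ordinary(redirects, protected=frozenset()):
    """Return a copy of `redirects` with only type-1 entries retargeted."""
    fixed = dict(redirects)
    for title in redirects:
        kind, final = classify_redirect(title, redirects, protected)
        if kind == "ordinary double redirect":
            fixed[title] = final
    return fixed
```

In this sketch only entries classified as ordinary are rewritten to point at their final target; the other four types are left untouched for a human to review, in line with the point that only the first type is typically fixable by bot.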

I hope this gives a good general idea about the problem and how bot edits can help. -- Cat chi? 02:50, 5 February 2012 (UTC)[reply]

Does KassadBot or another already do these? (I assume not, but figured I should check.)​—msh210 (talk) 18:33, 5 February 2012 (UTC)[reply]
It doesn't touch redirects -- Liliana 18:38, 5 February 2012 (UTC)[reply]
Anyone have an idea how often this is an issue? Roughly how many edits this bot will make per, say, year?​—msh210 (talk) 17:36, 7 February 2012 (UTC)[reply]
It depends entirely on user activity: on how often pages are moved and on how many redirects point to the moved pages. If no one moves pages, no edits would be made. Page moves do happen, though. Currently the bot would make edits once in a blue moon, probably a few per year. However, do consider a scenario where:
  • Page A with hundreds of redirects linking to it
  • Page A is moved to page B
  • Bot would make hundreds of corrections flooding the RC feed.
I however noticed a pattern of the deletion of older redirected pages. Examples include renames of:
I do not know if such deletions are based on past consensus, but it is my belief that deleting redirects is a bad method. It makes the entire site difficult to cite; for instance, someone citing legerrio would not be able to retrieve the information again. Furthermore, with such deletions, the signatures in all discussions that renamed accounts previously participated in will become redlinks, removing proper attribution of comments. This is probably a separate discussion, so I do not want to dwell on it too much, but it is something to consider.
-- Cat chi? 09:16, 12 February 2012 (UTC)[reply]
To give an idea this is how unrestricted french edition looks: fr:Spécial:Contributions/タチコマ_robot. It depends on local activity. -- Cat chi? 12:03, 7 April 2012 (UTC)[reply]
Can my bot be unblocked if there are no objections? -- Cat chi? 20:11, 3 June 2012 (UTC)[reply]
We are not very fond of redirects here, and the problems you cite are actually not as bad as they seem (information is always retrievable, for example). In any case, has your bot been voted in? That is standard policy around here. See WT:V for more. --Μετάknowledgediscuss/deeds 20:46, 3 June 2012 (UTC)[reply]
Yes. Wiktionary talk:Votes/bt-2011-12/User:タチコマ robot. I was asked to post it here. The bot will not create any new redirects. It will only update existing ones. -- Cat chi? 21:02, 3 June 2012 (UTC)[reply]
Obviously, that vote failed. You will need to start a new vote (at WT:V). --Μετάknowledgediscuss/deeds 21:41, 3 June 2012 (UTC)[reply]
On the vote I was told to start this thread though. I do not want to go around in circles. -- Cat chi? 20:50, 4 June 2012 (UTC)[reply]
Well, if you give enough information, this time it will go through and you won't have any more circles to deal with. I'm not overfond of bureaucracy, mind you, but I think that the requirement that bots be voted in is a sound one. --Μετάknowledgediscuss/deeds 20:55, 4 June 2012 (UTC)[reply]
That is fine. Alright, I will file a new vote then. -- Cat chi? 21:57, 4 June 2012 (UTC)[reply]
There: Wiktionary:Votes/bt-2012-06/User:タチコマ robot -- Cat chi? 09:29, 5 June 2012 (UTC)[reply]

Misuse of "uncountable"

I've just noticed that North Pole is marked as uncountable. This is incorrect. "Uncountable" refers to the non-existence of a plural form of a word, not the uniqueness of the thing the word refers to. It is perfectly possible to form the phrase "North Poles" even though the Earth has only one. (In any case, other planets have a North Pole, and so we could say "the North Poles of the planets in the Solar System".)

Would someone like to volunteer to replace "uncountable" with plurals in entries that are actually countable nouns? — Paul G 16:12, 5 February 2012 (UTC)[reply]

Unfortunately I used to get this wrong quite a lot, usually where I really meant {{en-noun|!}} i.e. no plural attested (without being a mass noun). Equinox 16:17, 5 February 2012 (UTC)[reply]
I don't think one actually could say "North Poles"; it would be north poles, uncapitalized. North Pole is a proper noun, so it should use the en-proper noun template, not the en-noun template. --Yair rand 16:17, 5 February 2012 (UTC)[reply]
  • 1822, The gentleman's magazine, and historical chronicle, volume 92, Part 2, page 212:
    There is a satisfactory proof that the conjoint action of the two North Poles occasions the line of no variation.
  • 1947 October 20, “Three Magnetic Poles In Arctic”, in Milwaukee Sentinel:
    Army aviators have established a year 'round defense against Russian attack across the Arctic and have added the discovery of two magnetic North Poles to one previously known.
  • 2005, James Maxlow, Terra Non Firma Earth:
    Figure 39 Recent geomagnetic North Poles plotted as small circle arcs.
    Counterexamples. DCDuring TALK 17:20, 5 February 2012 (UTC)[reply]
    I stand corrected. Are those noun senses or proper noun senses, though? --Yair rand 08:56, 6 February 2012 (UTC)[reply]
    As a quick answer: I don't know. I would think that the magnetic pole and the rotational pole would each be a proper noun. But similarly, the Durings would seem to be a proper name as well, perhaps short for a list of full names or referring to a complete lineage without anyone being able to identify all the members of the group. DCDuring TALK 19:15, 6 February 2012 (UTC)[reply]
I've gone through and got several that I think might have been mismarked, but since I'm new to Wiktionary (well, I registered in 2004, but only to correct a prescriptivist who was being a prick about the singular they), I won't do more until someone checks my recent contribs to that effect. —Quintucket 18:25, 5 February 2012 (UTC)[reply]
In the case of Laserdisc I have now split it into Proper Noun and (countable) Noun sections. The two use different templates. Equinox 18:29, 5 February 2012 (UTC)[reply]
There is also {{singulare tantum}}. Mglovesfun (talk) 10:59, 6 February 2012 (UTC)[reply]

Latin -que compound words

Following up on the little discussion there was last year in RFV (when archived: Talk:fasque), I've opened what I hope can be a bigger discussion about our policy towards -que on the WT:ALA talk page (Wiktionary talk:About Latin#Latin -que compound words). - -sche (discuss) 20:07, 7 February 2012 (UTC)[reply]

Citations from online sources

CFI says: "As Wiktionary is an online dictionary, this naturally favors media such as Usenet groups, which are durably archived by Google." Until recently, I have understood this to mean that online sources are acceptable as long as they meet a certain "durability" threshold, one presumably lacked by forums or everyday people's blogs and journals. Thus, I've been culling citations from content found on sites like CNN.com, The Huffington Post, Gamespot, etc. for a while now, and have seen the same done by others.

But I've been left with questions about the acceptability of drawing citations from online sources after this discussion on RFV. If it's really the case that online sources are generally considered unacceptable, it doesn't make sense to me why Usenet would be the exception to the rule, because I can't see any special quality that sets Usenet apart. I'm puzzled that it would be considered acceptable to draw citations from Usenet, but not from content on any other online source, no matter its stature, and am concerned about how this seemingly arbitrary limitation would adversely affect my ability to attest words and phrases.

Can I get some clarification? I'm honestly confused here. Astral 23:16, 7 February 2012 (UTC)[reply]

I think the main difference is that usenet isn't owned by a single entity but can be mirrored by anyone on the internet. That means that no single entity can take the sources offline either, which is what gives them their durability. —CodeCat 23:45, 7 February 2012 (UTC)[reply]
Does this mean that non-Usenet online sources like CNN.com should not be used for citations? Astral 01:41, 8 February 2012 (UTC)[reply]
I think it does mean exactly that. If a given entity has a policy that says that it will archive all articles as originally written, it might be worth considering, but such policies can change. As long as the material is copyrighted, even the legality of archival copying is at issue.
On the general question of archiving digital information, consider the following:
  • 2001, Bruce Sterling, Digital Decay[1], retrieved February 7, 2012:
    Originally delivered as the keynote address for Preserving the Immaterial: A Conference on Variable Media at the Solomon R. Guggenheim Museum on March 30, 2001
    Bits have no archival medium. We haven't invented one yet. If you print something on acid-free paper with stable ink, and you put it in a dry dark closet, you can read it in two hundred years. We have no way to archive bits that we know will be readable in even fifty years. Tape demagnetizes. CDs delaminate. Networks go down.
DCDuring TALK 02:11, 8 February 2012 (UTC)[reply]
I'm not sure he's comparing like to like here. You can print bits on acid-free paper with stable ink. Printing a DVD on acid-free paper with stable ink would take a lot of space, but you can fit 17,000 books on there[2], and many an organization that has tried to store that quantity of paper has lost it to fire or water. Is the ongoing maintenance required to keep 17,000 books safe cheaper or easier than making an annual copy of a DVD? Or if you trust film stock (and they swear that it will last hundreds of years), even if you only stuff 640 x 480 b/w bits per frame, an hour and a half of film will hold as much as a DVD. We can't permanently archive bits in the quantities we're used to slinging around, but bits and the information stored in them haven't got harder to store.--Prosfilaes 03:05, 8 February 2012 (UTC)[reply]
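
The film-stock figure can be checked with a little arithmetic. The sketch below assumes 24 frames per second and a single-layer DVD of 4.7 GB (4.7 × 10^9 bytes); neither number is stated in the comment above, so both are labeled as assumptions in the code.

```python
def film_capacity_bytes(width=640, height=480, fps=24, minutes=90):
    """Bytes stored by film holding one black/white bit per pixel per frame.

    fps=24 is an assumed frame rate, not a figure from the discussion.
    """
    bits_per_frame = width * height
    frames = fps * minutes * 60
    return bits_per_frame * frames // 8


DVD_BYTES = 4_700_000_000  # assumed single-layer DVD capacity

capacity = film_capacity_bytes()
print(capacity, capacity >= DVD_BYTES)  # prints: 4976640000 True
```

Under those assumptions an hour and a half of film holds roughly 4.98 GB, slightly more than the assumed DVD capacity, so the comparison holds.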
(edit conflict) Isn't it counterproductive for an online dictionary that bills itself as such ("As Wiktionary is an online dictionary...") to avoid citing online sources on the principle that digital media doesn't last as long as paper? Digital media is cheaper and consumes a lot less space than paper media, meaning that, in the 21st century, there's more incentive to build and maintain digital archives. But it would seem more beneficial to have citation standards based on concrete criteria — like Wikipedia's RS — rather than abstract ideas about the relative permanency of various media formats. Astral 03:13, 8 February 2012 (UTC)[reply]
It's not an abstract idea; it's a concrete practical solution to the idea of being able to check a citation in a decade or two. Why is it counterproductive for an online dictionary to avoid citing online sources? I don't see the connection there.--Prosfilaes 03:21, 8 February 2012 (UTC)[reply]
If it was concrete, when I asked what the citation standards were, I would have been directed to a policy page with clearly defined and outlined criteria. The information I've found or been given has been contradictory. CFI says, "As Wiktionary is an online dictionary, this naturally favors media such as Usenet groups," but users are telling me that online media isn't appropriate for citation because it isn't as lasting as paper media (which is debatable). Except Usenet. Astral 04:01, 8 February 2012 (UTC)[reply]
There's a difference between ill-defined and abstract. It's not debatable that online media isn't as lasting as paper media; most books are owned by several libraries in permanent collections in formats that have a life expectancy of centuries, as well as being held by Google and UMich in online formats.--Prosfilaes 04:47, 8 February 2012 (UTC)[reply]
I don't see your "durability" threshold. I just don't see any evidence that CNN and friends tend to stick around longer than anyone else. It doesn't take a lot of money to stay online; I bet a small website could stay online in perpetuity for $10,000. But it does take a will to do so, and I see no evidence that any of them have claimed that as a goal of theirs. Moreover, if we plan on sticking around for another decade or two, I'm not sure that we can trust even those claims.--Prosfilaes 02:47, 8 February 2012 (UTC)[reply]
I still don't get how Usenet is somehow the exception to the "digital media is not durable" rule. Copyright argument aside, Usenet archives are just as prone to the whims of fate as any other online source, i.e. just as likely to be rendered inaccessible through the shut down of a site or succumb to storage medium decay or destruction. It's not really feasible to base citation standards around personal suppositions about what media formats or sources are the most "durable," because there's no way to conclusively know how technology is going to progress. Astral 03:43, 8 February 2012 (UTC)[reply]
In reality, the inclusion of Usenet has more of a practical purpose; it allows us to include relatively recent slang words that would otherwise be unattestable. -- Liliana 04:11, 8 February 2012 (UTC)[reply]
But there's no one source of Usenet, and Usenet gives an implied license to archive to basically anyone. (I believe there's an X-Archive: No header or something that can be used to rebut that presumption, but most Usenet posts don't have that.) CNN can and does unilaterally take down posts. It's obviously feasible to base citation standards around suppositions of what media formats are most durable, because we've done it. A lack of conclusive knowledge has nothing to do with the feasibility, merely the wisdom. While we don't know conclusively anything, I think our choices have a good chance of being correct; libraries, particularly academic libraries, aren't going anywhere quickly, and Google and UMich are working on making paper sources also online ones.
You want to attack Usenet? Okay, but I don't think it will win you what you want. Usenet is an exception to our general rules because it's such a convenient corpus. I'm guessing that arguing that it's no more durable than other online materials, if it provoked a change, would be more likely to exclude Usenet as a citation source than add arbitrary online sources.--Prosfilaes 04:47, 8 February 2012 (UTC)[reply]
  • Comment: Keep in mind please, just because a source goes offline, does not mean it is not durable. It can still be accessible in news archive sources like Newsbank, or Lexis Nexis, or Westlaw. Cheers, -- Cirt (talk) 05:14, 8 February 2012 (UTC)[reply]
    What are the inclusion policies of those organizations? DCDuring TALK 08:57, 8 February 2012 (UTC)[reply]
    They're durably archived, digitally, microfiche, the works. -- Cirt (talk) 18:52, 8 February 2012 (UTC)[reply]
    I meant: what content do they include from, say, CNN? Do they include user comments, CNN replies? Do they include all original postings or just final corrected versions? Their content is behind a paywall, isn't it?
    It goes without saying that what they have has the same copyright restrictions as the original, possibly extended by the addition of access aids, such as keywords. DCDuring TALK 19:25, 8 February 2012 (UTC)[reply]
    • Really I want to remove that "durable" part. Nothing in the world is durable, apart from stone tablets. -- Liliana 05:46, 8 February 2012 (UTC)[reply]
      Durable doesn't mean infinitely durable. We can be reasonably sure that print works on paper won't survive more than a few hundred years. I actually feel a little better knowing that print works are also archived digitally. Print works that exist only on high-acid paper, introduced about 150 years ago, are unlikely to last in that form for three hundred years from the printing. Apparently the problem is particularly serious for works printed in Russia and eastern Europe.
      If there were several multiple paper copies of the Usenet archives using acid-free paper, I would feel better than depending solely on the multiple electronic copies that I am told exist. Perhaps the site of the Norwegian seed and DNA repository could be used as one site for such storage. Perhaps copies of annual editions of the WMF projects could also be so archived. Perhaps some funding could be found for such a noble purpose. DCDuring TALK 08:57, 8 February 2012 (UTC)[reply]
I'll maintain my usual line that "durably archived" is bollocks and needs to go completely. Nobody can know which resources will last and which will not. It would violate WP:CRYSTAL (Wikipedia is not a crystal ball) but doesn't since we're not Wikipedia. But anyway, I would very happily dump it completely. Our current solution is just to totally ignore the meaning of "durably archived" and interpret it as meaning "published works and Usenet", which isn't a meaning but rather a description. Mglovesfun (talk) 16:52, 8 February 2012 (UTC)[reply]
I agree it's not really very clear, and I think your definition is actually clearer. I would support modifying CFI so that it defines appropriate sources as such, instead of calling them just 'durably archived'. —CodeCat 18:13, 8 February 2012 (UTC)[reply]
Er, a lot of online works are published in some sense. "Printed works and Usenet", perhaps.--Prosfilaes 20:41, 8 February 2012 (UTC)[reply]

I'm starting to agree with most of the other folks in this thread above: "durable" is kinda silly wording and should just be trimmed out. Newsbank, Lexis Nexis, and Westlaw are all perfectly fine as sources, and are archived, on microfiche and digitally, and have survived for a long time and will continue to be archived successfully and available very easily to any researcher, and should be weighted equally to online sources. -- Cirt (talk) 18:52, 8 February 2012 (UTC)[reply]

The problem with the word durable might be that it might be interpreted as permanent, as the synonyms section of its entry suggests. But if the word has a comparative and a superlative (as its entry suggests), then it can't be equated with the word permanent so easily... AFAICT. And if durable then ain't synonymous with permanent, I'd say it could serve the purpose for CFI. At least if it is reworded to "archived in an extensively durable manner such as Usenet..." or s.t. --BiblbroX дискашн 20:33, 8 February 2012 (UTC)[reply]
Actually, if the word permanent has its comparative and superlative then maybe I am completely wrong about its meaning. --BiblbroX дискашн 20:35, 8 February 2012 (UTC)[reply]
All (almost all?) adjectives that have an absolute sense (not gradable or comparable) also are used otherwise. See unique, for example. I find it hard to take absolute meanings seriously except in mathematics. Astronomy, geology, and history all favor non-absolute meanings, IMO. The field of archiving and storage is the realm of man-made artifacts, which seems a particularly poor realm for absolute meanings. DCDuring TALK 22:12, 8 February 2012 (UTC)[reply]
If they're printed on microfiche, then they are printed sources and already clearly usable under CFI. Moreover, my problem is with "CNN.com, The Huffington Post, Gamespot, etc.", and the theory that any and all text on those sites (including the "etc.") can be trusted to be durable.--Prosfilaes 20:41, 8 February 2012 (UTC)[reply]

CNN.com gets archived to news archive sources like Newsbank, Lexis Nexis, Westlaw. Those news archives are stored on microfiche. Therefore, CNN.com is durable. -- Cirt (talk) 23:50, 8 February 2012 (UTC)[reply]

I don't see how they get the videos, and they certainly don't archive the comments, and I would be surprised if now and forever there was no corporate blog or other informal stuff that didn't get so archived. But those are largely quibbles. If we want to make a list of those sites that are archived by such processes, that would be cool and useful. I see that as an affirmation of our current (somewhat de facto) policy, and not an encouraging of arbitrary websites.--Prosfilaes 01:49, 9 February 2012 (UTC)[reply]
Oh, agreed, of course. -- Cirt (talk) 03:48, 9 February 2012 (UTC)[reply]

Don't forget that durability is mentioned in CFI for verifiability purposes. Stating that a word does not exist or is not worth an entry only because citations are from media not considered durable enough would be absurd. And, again, Internet pages can be durably archived by our software when needed. Lmaltier 21:47, 10 February 2012 (UTC)[reply]

Surely not unless we get permission from the copyright holder. Equinox 21:50, 10 February 2012 (UTC)[reply]
Well, there's also Internet Archive. -- Cirt (talk) 23:57, 13 February 2012 (UTC)[reply]
We are in no way capable of tracking any significant segment of English outside what is durably recorded. Such is better left to dedicated dictionaries with dedicated scholars authoring them. The value over the long term of non-durably recorded terms is zero, as nobody will be looking them up.--Prosfilaes 22:48, 10 February 2012 (UTC)[reply]

This is currently the category for such terms as lion, tiger and jaguar. The idea is presumably that a "panther" is any species of Panthera, but I have never in my life heard panther used this way, it's not in the OED, and even if some citations can be found to support it, it's a very misleading name for a category of this sort. If we want to be that specific we should just go ahead and call it Category:en:Species of Panthera, otherwise why not just use Category:en:Big cats like everyone in the real world actually does. Ƿidsiþ 07:21, 8 February 2012 (UTC)[reply]

Big cats doesn't have an exact definition (for example, pumas and cheetahs are sometimes considered and sometimes not considered big cats). I don't think using Species of Panthera is ideal either; words like Latin pantherinus, which is related to panthers, but not a species, would be excluded. If we are to change, I suggest just Category:en:Pantherinae (includes Panthera, snow-leopard and clouded leopard), but I don't think it's ideal either.Ungoliant MMDCCLXIV 13:36, 8 February 2012 (UTC)[reply]
Categories are used to make searches easier, so they should be designed for readers. Therefore: 1. They don't have to have a precise scientific definition. 2. Their names should be clear (Pantherinae is OK in Wikispecies, but not in a language dictionary; furthermore, the precise scientific classification changes over time, sometimes often, e.g. for fish, and these changes are irrelevant here).
I think that Big cats is an ideal name for this category. Lmaltier 21:37, 10 February 2012 (UTC)[reply]
Big cats works for me, though that term has a fuzzy boundary. ~ Robin 12:11, 16 February 2012 (UTC)[reply]
The simplest fix I can think of is delete Category:en:Panthers and have all the terms in Category:en:Felids or Category:en:Felines. There are not all that many of these terms. --Dan Polansky 12:23, 16 February 2012 (UTC)[reply]

Numbers or Numerals

I am kind of confused. [[Category:Latin numerals]] and [[Category:Latin numbers]]. Two very same categories. --KoreanQuoter 15:27, 8 February 2012 (UTC)[reply]

It's a long dispute that has, to my knowledge, never been resolved. You can find a lot of it by searching in the archives. -- Liliana 15:51, 8 February 2012 (UTC)[reply]
What Liliana-60 said (to put it mildly). Mglovesfun (talk) 16:47, 8 February 2012 (UTC)[reply]

Definitions as sentences

I am once again reminded of Wiktionary:Votes/pl-2009-03/ELE Amendment 1. On the French Wiktionary we treat all languages the same, and all definitions are treated as sentences, even when it's a one word translation. On User talk:Mglovesfun/Archives/1#formatting, User:Widsith said "Just a small point, but glosses from foreign languages into English shouldn't end in full stops. Just the translation(s) alone is fine. Thanks!" This is absolutely the most common practice, but WT:ELE actually says "Each definition may be treated as a sentence: beginning with a capital letter and ending with a full stop." The formatting for non-English languages is pretty consistent; for English it's anything but. Some start with capital letters, some don't. Some finish with full stops (i.e. periods), some don't. Any chance of implementing Visviva's suggestion in Wiktionary:Votes/pl-2009-03/ELE Amendment 1 from 2009? That is, treating all definitions as sentences. If nothing else, it would enforce consistency. Mglovesfun (talk) 12:15, 9 February 2012 (UTC)[reply]

Are you saying that the definition of "xyz" should be "An xyz is a whatever." or that it should be "A whatever." ? SemperBlotto 12:20, 9 February 2012 (UTC)[reply]
Sentence format, sorry: initial capital letter, final full stop, even when it's a single word. So Spanish fuego is defined as "Fire." Mglovesfun (talk) 12:24, 9 February 2012 (UTC)[reply]
OK. If it ever comes to a vote - I'm in favour of free format (whatever the original editor thinks is best at the time). SemperBlotto 12:26, 9 February 2012 (UTC)[reply]
That's the status quo, AFAICT. Mglovesfun (talk) 12:30, 9 February 2012 (UTC)[reply]
When it comes to definitions, I imagine two different kinds:
  1. Simple "equational" definitions, where you get "definiendum = definiens", which are the norm for foreign-language definitions which give one-word translations or a list of largely synonymous one-word translations punctuated by commata. When I use this form of definition for English terms, I follow the OED in using the 〈=〉 symbol, as in the two senses of inverted hat.
  2. "Full-sentence" definitions, where there is an implied form of "definiendum [means / is / &c.] definiens", which are more-or-less the norm for English definitions which give descriptive glosses (that are usually semantically substitutable for the definiendum) or a number of equivalent descriptive glosses punctuated by semi-cola, and sometimes ending in one or more one-word synonyms (which are doing essentially the same thing as one-word–translation foreign-language definitions). Despite being "full-sentence" definitions, these can be very short, as in the case of senses 1 and 3 of inverted circumflex.
If any form of practice were to be formalised, I'd hope it would be the practice I describe above. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 16:06, 9 February 2012 (UTC)[reply]
My practice is fairly similar to yours, but I mostly only use the equals-sign notation for foreign terms, when I'm defining one as basically, "equal to such-and-such other foreign term". (For example, I defined קמ״ש (K.M.Sh., kph) as
# ={{term||[[קילומטר|קִילוֹמֶטֶר\־רִים]] [[ל־|לְ־]]\[[ב־|בְּ]][[שעה|שָׁעָה]]|kilometer(s) per hour|lang=he|tr=kilométer(im) l'-/b'sha'á}}: [[kph]]
  1. =קִילוֹמֶטֶר\־רִים לְ־\בְּשָׁעָה (kilométer(im) l'-/b'sha'á, kilometer(s) per hour): kph
.) And EncycloPetey has objected to my doing even that.
RuakhTALK 16:55, 9 February 2012 (UTC)[reply]
That's interesting. Without knowing anything about Hebrew, I'd tend not to support that practice. My reasoning is this: English entries and non-English entries have, AFAICT, slightly different purposes. English entries are meant to explain what a word means; in the case of true synonyms, it is therefore appropriate to define one as "= [the other word]" to save unnecessary duplication. In the case of non-English entries, they're meant to give translations; accordingly, any non-English lemma ought to link directly to an English translation, saving any equivalent terms for a Synonyms section. That's my rationale, anyhow. I admit, however, that I mostly work with English terms, and have not thought through all the implications of my stance; in no way do I mean to be dogmatic. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 02:27, 11 February 2012 (UTC)[reply]
No, the purpose is exactly the same: describing a word, including its sense(s). The difference is that, for non-English words, it may be easier to provide a definition, because a translation may be sufficient to explain the meaning of the word. But this translation is a definition. Lmaltier 08:55, 11 February 2012 (UTC)[reply]
Yes, upon further (less fatigued) reflexion, you're right. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 08:16, 12 February 2012 (UTC)[reply]
@Doremítzwr: But it's not an "equivalent term", it's not a "synonym": it's the same term. It's the pronunciation, it's the etymology, it's everything. קמ״ש simply is קִילוֹמֶטֶרִים בְּשָׁעָה. —RuakhTALK 14:21, 11 February 2012 (UTC)[reply]
OK, then; shouldn't they be listed in Alternative forms sections, rather than in Synonyms sections? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 08:16, 12 February 2012 (UTC)[reply]
I'm not the one who suggested it should be in a Synonyms section. ;-)   But anyway, no: someone looking up קמ״ש will want to see קִילוֹמֶטֶרִים בְּשָׁעָה. If I had to remove one part of the definition or the other, I'd rather remove the "kph" part, because it's easier to figure out "kph" from קִילוֹמֶטֶרִים בְּשָׁעָה than the reverse. —RuakhTALK 14:52, 12 February 2012 (UTC)[reply]
Forgive my fuzzy thinking. I'm with you on this one. If קִילוֹמֶטֶרִים בְּשָׁעָה had an entry, I wouldn't support that, but as it doesn't, I think it's a good way to do things. Alternatives could include having that information in an Etymology section or changing the definition to "initialism of קִילוֹמֶטֶר\־רִים לְ־\בְּשָׁעָה (kilométer(im) l'-/b'sha'á, kilometer(s) per hour): kph", but I shan't pettifog. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 18:27, 12 February 2012 (UTC)[reply]

I don't agree with the use of initial capital letters and full stops in definitions. In most cases definitions do not have a main clause verb, thus they cannot be treated as ordinary sentences. This is more clear in foreign languages entries, where the "definition" is very frequently a single word or a set of words separated by commas. Moreover, I personally find it ugly and annoying, I mean being obligated to use something like [[word|Word]] instead of a plain [[word]]. What if there are two entries, one with a capital initial and one with a lower-case? The reader wouldn't know which one is the correct translation until they click on the link. I don't like the equation symbols either. I could accept them in a glossary, where the gloss comes right after the headword, but here it seems to me ugly and unjustified. --flyax 21:12, 9 February 2012 (UTC)[reply]

I've always found current practice very inconsistent. All dictionaries have a consistent presentation for definitions, capitalized or not, with a full stop or not, but they are consistent in the whole dictionary. Don't forget that, even for non-English words, what is provided is a definition, even when this definition is a single word (a definition is an explanation of what the word means, e.g. psychanalyst is a good and sufficient definition for psychanalyste).
fr.wikt uses capitalized definitions with full stops, for all words (except where the convention is not applied). On the other hand, nl.wikt does not use full stops, nor capitals. This second option has two advantages:
  • in some cases, the absence of a capital makes the definition clearer, less ambiguous, as mentioned above.
  • the absence of a full stop discourages the addition of encyclopedic details.
A change is really needed, for consistency, and I would favor this second option. Lmaltier 21:39, 9 February 2012 (UTC)[reply]
I also strongly favor the second option (no capital and no full stops). Matthias Buchmeier 10:59, 10 February 2012 (UTC)[reply]
Me too. --JorisvS 11:07, 10 February 2012 (UTC)[reply]
Not ever? Mglovesfun (talk) 11:42, 10 February 2012 (UTC)[reply]
In long definitions a punctuation mark somewhere in the middle might be necessary. In these cases we could agree to always use the semi-colon. --flyax 12:02, 10 February 2012 (UTC)[reply]
Maybe only allow capital letters and full stops for multi-sentence definitions. And in partial reply to DCDuring below, not all multi-sentence definitions will be bad ones. Mglovesfun (talk) 18:51, 10 February 2012 (UTC)[reply]
Definitions may be very long (e.g. for mathematical terms), but I don't think that multi-sentence definitions are needed. I can't find any example. This is a strong clue that unneeded encyclopedic details have been included. Lmaltier 21:28, 10 February 2012 (UTC)[reply]
Some definitions in English sections are in the form of clauses with a main verb. Some examples can be found among senses using {{non-gloss definition}}, especially those beginning with "Used". These can be viewed as sentences for which the headword is the subject of the sentence. There are also others with a clause as the main element of structure. Some definitions have other punctuation, such as semi-colons and commas separating main parts.
I don't think that such definitions are as intelligible without initial caps and final period. (I have no more evidence for my opinion than has been advanced for other claims about appearance and intelligibility in this discussion.)
Uniformity of appearance among definitions has been acknowledged by several in this discussion as a desideratum.
The consequence of accepting these propositions is that, if there is to be a single standard appearance for English, it must have initial caps and final period.
It might be nice to enforce a rule of only-one-period-per definition, which might be highly effective for identifying potentially encyclopedic entries, at least until semicolons replace periods among those trying to conceal their encyclopedic works. DCDuring TALK 11:59, 10 February 2012 (UTC)[reply]
I agree for only-one-period-per definition (if there is a period). About non-gloss definitions: yes, they are very rare in paper dictionaries, but they are very common here, as they are used for inflected forms. But, as we want to use a different format for them anyway, there may be an exception for them. Lmaltier 09:07, 11 February 2012 (UTC)[reply]
I don't mind either about an exception for non-gloss-definitions. However I don't think all of these are ordinary sentences. Statements beginning with a "used to .." are participle clauses the way I see it and inflected form definitions have no verb at all. These definitions should begin with a lower-case letter as well. --flyax 12:33, 11 February 2012 (UTC)[reply]

Here are two useful links: (a) Terminology, p.31-35 (b) ISO/IEC Directives Supplement, p 35. I am not implying that we are obligated to follow these instructions just because they've become an ISO standard, I am just giving them for further reading. --flyax 09:33, 11 February 2012 (UTC)[reply]

I understand that ISO wants to standardize the use of words, with precise meanings, in their documents, and they are right. We describe the languages as they are used, this is a very different objective. Anyway, I don't see how these documents relate to this discussion. Lmaltier 10:46, 11 February 2012 (UTC)[reply]
My intention was to draw our attention to the way ISO wants to format definitions. (a) See page 31: Definitions shall not: be given in full-sentence form ...; and p. 35: Definitions shall be lower case, including the first letter, except for any upper-case letters required by the normal spelling of a word in running text. (b) See I.2.2.4.6: ... letters normally appearing in lower case shall remain in lower case (this applies in particular to the first letter of the definition). The definition shall not end with a full stop .... --flyax 12:01, 11 February 2012 (UTC)[reply]
I now understand, but they don't want to standardize dictionaries, they want to standardize definitions in their own documents. They are right, it's important. But different dictionaries make different decisions. The decision should be based on arguments. Lmaltier 13:24, 11 February 2012 (UTC)[reply]
We all think the same way, I think. Reason, arguments, dialectic, personal preferences, stuff to study, all these are necessary. --flyax 14:17, 11 February 2012 (UTC)[reply]

I completely support the status quo, that is, I support having full sentences for English definitions and glosses for FL-to-English definitions. The needs of a single-language dictionary are very different from those of a translating dictionary and it doesn't seem strange or inconsistent to me to have a different style for the two cases. Ƿidsiþ 10:01, 15 February 2012 (UTC)[reply]

Translations are provided in the Translation section, and definitions in definition lines (# lines). Definitions make senses clear, and translations provide words of the same sense in other languages. I don't see any reason not to apply these principles systematically (keeping in mind that, for foreign words, a translation may be a good, sufficient, definition, but not always). Simple principles make everything simpler. Lmaltier 18:43, 15 February 2012 (UTC)[reply]
Have just found a lovely example of why I dislike the 'free format' SemperBlotto advocates, this, where the first two definitions have no initial capital and no full stop, but the third definition has both. Mglovesfun (talk) 11:48, 21 April 2012 (UTC)[reply]

Indicating nasalisation in Proto-Germanic entry names

There is a discussion on this right now but I think it needs a bit more input. Please look and contribute if you can? Wiktionary talk:About Proto-Germanic#Indicating nasalisation in entry names —CodeCat 13:27, 10 February 2012 (UTC)[reply]

Diitidaht (Nitinaht - Southern Nootkan)

How can I become a contributor? I would like to enter my Diitidaht dictionary (I have thousands of words) and the language has fewer than 10 (5) speakers. I also speak Romany (Kalderash Gypsy); Danish and English; some Lushootseed (Straits Salish), some Nootkan, and some Makah (also southern Nootkan). — This comment was unsigned. User:Pakkichipps 02:21, 11 February 2012 (UTC)[reply]

Welcome. Read the following pages carefully and you'll be fine. Help:How to edit a page, Wiktionary:Tutorial, Wiktionary:What Wiktionary is not, WT:ELE, WT:CFI. Also, remember to sign your edits in discussion pages (just type ~~~~ and it will be converted into a signature). Ungoliant MMDCCLXIV 02:46, 11 February 2012 (UTC)[reply]
You will need the language code for Diitidaht, which is dtd. —Stephen (Talk) 06:30, 11 February 2012 (UTC)[reply]
... which doesn't exist? —CodeCat 12:17, 11 February 2012 (UTC)[reply]
It now does. -- Liliana 13:15, 11 February 2012 (UTC)[reply]
It would be a good start to have some agreement about the English name for this language. Neither Wikipedia at w:Ditidaht language nor SIL International use the double "i" in the name. Eclecticology 07:00, 12 February 2012 (UTC)[reply]

User modified this to change from Lower Silesian to Silesian German and added two interwikis. Are we happy about this? Mglovesfun (talk) 13:25, 12 February 2012 (UTC)[reply]

Not happy. Revert. -- Liliana 13:36, 12 February 2012 (UTC)[reply]
Ethnologue [3] calls it Upper Silesian, and mentions it's "Different from Lower Silesian, a dialect of Polish". WP redirects w:Lower Silesian language to w:Silesian German. I don't see any reason to be unhappy about it. Ungoliant MMDCCLXIV 13:38, 12 February 2012 (UTC)[reply]
WP also redirects w:Upper Silesian language to the Slavic w:Silesian language, so Ethnologue and WP don't seem to agree on which language is Upper Silesian and which language is Lower Silesian. —Angr 14:04, 13 February 2012 (UTC)[reply]
So calling it "Silesian German" is justified, as it avoids confusion. Ungoliant MMDCCLXIV 14:32, 13 February 2012 (UTC)[reply]
I liked the pair Upper Silesian vs. Lower Silesian better. -- Liliana 00:33, 14 February 2012 (UTC)[reply]
If only it were that simple. But both languages were spoken in Upper Silesia at some point, and since the annexation of Silesia to Poland after WWII there's now a Polish dialect in Lower Silesia as well. It's probably best if we use less ambiguous terms for both {{sli}} and {{szl}}. —Angr 11:49, 14 February 2012 (UTC)[reply]
Which ones specifically? I'm open to suggestions. -- Liliana 23:01, 18 February 2012 (UTC)[reply]

MediaWiki 1.19

(Apologies if this message isn't in your language.) The Wikimedia Foundation is planning to upgrade MediaWiki (the software powering this wiki) to its latest version this month. You can help to test it before it is enabled, to avoid disruption and breakage. More information is available in the full announcement. Thank you for your understanding.

Guillaume Paumier, via the Global message delivery system (wrong page? You can fix it.). 14:57, 12 February 2012 (UTC)[reply]

-ty and -ity in European languages

In an annoyingly nonstandard manner, this suffix is represented with or without the "i". Which is etymologically more correct?

Here's the part that needs cleanup once we decide which is to be the form-of and which the real entry:

- Metaknowledge 05:54, 13 February 2012 (UTC)[reply]

The OED [2ⁿᵈ ed., 1989] has entries for both “-ty, suffix¹” and “-ity”, which I shall quote in full:
  1. “-ty, suffix¹”: “denoting quality or condition, representing ME. -tie, -tee, -te (early ME. -teð), from OF. -te (mod.F. -té), earlier -tet (-ted): — L. -itātem, nom. -itās. Such Latin types as bonitātem, feritātem, were in OF. normally reduced to two syllables (bontet, fertet) by elision of the -i- between the two stresses, so that -tet, later -te, became the regular form of the suffix. The final dental still appears in some early adoptions in ME., as plenteð, plenteth plenty (c 1250, in use till c 1600), and is characteristic of the Scottish forms bountith, daintith, and poortith (q.v.). The reduced form -te, however, is found in words recorded from shortly before or after 1200, such as bonte bounty, cruelte cruelty, debonerte debonairness, deinte dainty (n.), plente plenty, poverte poverty, purte purity, and vilte vileness. Among others which appear somewhat later are certeynte certainty, Cristente Christenty, freelte frailty, novelte novelty, and sotelte subtlety. Varying forms of the stem are found in the words now or formerly represented by beauty, fealty, lealty, †lewty, loyalty, †realty, †rialty, and royalty. From the types lealte, realte, the ending -alte (mod.F. -auté) was in OF. extended to formations from different stems, and many words of this form (ultimately written with -alty) established themselves in English, as admiralty, casualty, commonalty, †generalty, mayoralty, †principalty, †regalty, severalty, specialty, spiritualty, temporalty. Most of these date from the 14th or early 15th century; penalty appears to be of later introduction (1512). An obsolete type of formation is exhibited by curiouste, hid(e)ouste, and joyouste. In OF. certain analogies led to the frequent substitution of -ete for -te, but this form of the suffix is only occasionally adopted in English, as in the obsolete noblete, purete, and simplete; the early sauvete is now represented by safety. Under Latin influence many words in OF. also appear with -ite (mod.F. -ité) in place of -(e)te; hence English forms in -ity, which in many cases (as in F.) have supplanted those in -ty. [¶] Although occurring in a large number of words the suffix has shown little productive power in English; evelte, everlastingte, and overte occur in the 14–15th cent., and shrievalty, sheriffalty, have had currency from the beginning of the 16th cent., but such formations are very rare. [¶] Such words as faculty, difficulty, honesty, modesty, puberty, represent Latin formations in which the suffix -tās is directly added to a consonantal stem. The number of these in English, as in French, is very small. [¶] The early form of the suffix (-te, or -tee) remained in use down to the 16th cent., but from the 15th was gradually supplanted by -tie, -tye, and the surviving -ty.”
  2. “-ity”: “[ME. -ite, a. F. -ité, L. -itāt-em] [¶] the usual form in which the suffix (L. -tās, -tātem, expressing state or condition) appears, the i- being orig. either the stem vowel of the radical (e.g. L. suāvi-tās suavity), or its weakened repr. (e.g. L. puro-, pūri-tās purity), rarely a mere connective (e.g. L. auctōr-i-tās authority; so ME. emperorite, in Vernon MS., St. Ambrose 886). The last became more frequent in med. and mod.L., and the mod. langs., in abstracts from comparatives, as majority, minority, superiority, inferiority, interiority. Hence such formations as egoity, with playful or pedantic nonce-words of Eng. formation, as between-ity, coxcomb-ity, cuppe-ity, table-ity, threadbar-ity, woman-ity (after humani-ty), youthfull-ity. [¶] After i, -ity becomes -ety, as in pie-ty, varie-ty (L. pietātem, varie-tātem). The termination was in L. often added to another adj. suffix, e.g. -āci-, -āli-, -āno-, -āri-, -ārio-, -bili-, -eo-, -idi-, -ido-, -ili-, -īli-, -ino-, -īno-, -io-, -īvo-, -ōci-, -ōso-, -ui-, -uo-, etc., whence the Eng. endings -acity, -ality, -anity, -arity, -ariety, -bility, -eity, -idity, -ility, -inity, -iety, -ivity, -ocity, -osity, -uity, some of which, as -bility (-ability, -ibility) attain almost to the rank of independent suffixes. The earlier popular Fr. form was -eté, in Eng. -ety and -ty, as in safety, bounty, plenty: see -ty.”
They seem to treat -ity as merely a concatenation of -i- + -ty, albeit a concatenation far more common than -ty without -i- before it. Might it be worth doing as they seem to do, lemmatising the -ty forms and including redirects defined as “-i- + -ty” (or similar) thereto, with usage notes explaining the relation at the lemma? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 09:35, 13 February 2012 (UTC)[reply]
Just to note we allow the acute accent in Old French to represent /e/ at the end of a word, so our Old French entry is bonté not bonte. I seem to think the reasons for this are at Wiktionary talk:About Old French. I don't want to say anymore because I don't want to unwillingly hijack this thread. Mglovesfun (talk) 11:58, 13 February 2012 (UTC)[reply]

Mandarin pinyin with numbers

On User talk:Atitarev#Mandarin with numbers I brought up the issue of whether to keep Mandarin pinyin with numbers as opposed to diacritics. Wiktionary:Votes/2011-07/Pinyin entries says "That a pinyin entry, using the tone-marking diacritics, be allowed whenever we have an entry for a traditional-characters or simplified-characters spelling." No mention of numbers, so they're not protected by the vote. But {{cmn-alt-pinyin}} requires both forms and some of these numbered entries go back years, at least as far back as 2006, so I don't think we should start deleting them outright with no prior discussion. While Wiktionary:Votes/2011-07/Pinyin entries doesn't protect these entries, it doesn't mention them in any way, so it's not the case that the vote is pronouncing these invalid. Mglovesfun (talk) 12:01, 13 February 2012 (UTC)[reply]

Just in case it's not clear, no objection from me to delete all these. I only oppose deleting these with no prior discussion. This is that discussion. Mglovesfun (talk) 12:08, 13 February 2012 (UTC)[reply]
I have checked a few pages using {{cmn-alt-pinyin}} and saw only one syllable pinyin with numbers - mai4, kan4. After a second thought, perhaps it's OK to keep one syllable entries with tone numbers (if there are serious objections) but not entries like "dong4wu4". Books which do use tone numbers (increasingly rare) have spaces between syllables, e.g. "dong4 wu4", anyway. --Anatoli (обсудить) 12:16, 13 February 2012 (UTC)[reply]
For reference, this discussion concerns entries in Category:Mandarin pinyin with tone numbers, which has 1,473 entries. It seems that great many or all of the entries were created by BD2412 (talkcontribs) in 2006. --Dan Polansky 13:07, 13 February 2012 (UTC)[reply]
There's no need for discussion, because Wiktionary:About Sinitic languages#Mandarin addresses this explicitly:
For individual syllables, we have entries in each of these systems, as well as in pinyin with no tones marked at all. For words with multiple syllables, we only have entries for the pinyin romanizations, with tones marked using diacritics.
(citations omitted). If you're aware of any multi-syllable pinyin-with-numerals entries, please list them at RFD so they can be dealt with properly (e.g., moved to the pinyin-with-diacritics title). But single-syllable pinyin-with-numerals entries are absolutely 100% vote-approved, and must be kept.
RuakhTALK 13:24, 13 February 2012 (UTC)[reply]
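For anyone moving a pinyin-with-numerals title to its pinyin-with-diacritics counterpart, here is a minimal Python sketch of the usual tone-mark placement convention (mark a or e if present, the o in ou, otherwise the last vowel). It is only an illustration: the function names are made up for this example, it is not any existing Wiktionary tool, and it assumes lowercase input with ü typed directly.

```python
import re

# Tone-marked forms of each vowel, indexed by tone 1-4 (assumed input: lowercase pinyin).
TONE_MARKS = {
    "a": "āáǎà", "e": "ēéěè", "i": "īíǐì",
    "o": "ōóǒò", "u": "ūúǔù", "ü": "ǖǘǚǜ",
}

def mark_syllable(syllable: str) -> str:
    """Convert one numbered syllable such as 'dong4' to 'dòng' (hypothetical helper)."""
    match = re.fullmatch(r"([a-zü]+)([1-5])", syllable)
    if not match:
        return syllable                      # not a numbered syllable; leave untouched
    letters, tone = match.group(1), int(match.group(2))
    if tone == 5:
        return letters                       # neutral tone: just drop the digit
    if "a" in letters:
        idx = letters.index("a")
    elif "e" in letters:
        idx = letters.index("e")
    elif "ou" in letters:
        idx = letters.index("o")
    else:
        vowels = [i for i, ch in enumerate(letters) if ch in "iouü"]
        if not vowels:
            return letters                   # no vowel to carry the mark
        idx = vowels[-1]
    marked = TONE_MARKS[letters[idx]][tone - 1]
    return letters[:idx] + marked + letters[idx + 1:]

def numbered_to_diacritic(text: str) -> str:
    """Replace every numbered syllable in text; spacing between syllables is preserved."""
    return re.sub(r"[a-zü]+[1-5]", lambda m: mark_syllable(m.group(0)), text)

print(numbered_to_diacritic("dong4 wu4"))
print(numbered_to_diacritic("mai4"), numbered_to_diacritic("kan4"))
```

The sketch treats the spaced and unspaced spellings (dong4 wu4, dong4wu4) the same way, since each syllable ends at its tone digit.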
Mglovesfun has restored the monosyllabic entries I deleted (thanks). The polysyllabic ones usually duplicate the existing toned pinyin entries, which we are reformatting according to the vote, so there's no need to rename or fix them, sorry, they just go straight to the bin. If it's not the case, they are renamed and reformatted. We don't support Wade-Giles, Tongyong Pinyin, Yale, Zhuyin Fuhao (Bopomofo) and any other romanisation/transliteration of Mandarin apart from Hanyu Pinyin with tone marks. The language-specific policy ( Wiktionary:About Sinitic languages#Mandarin) is created and maintained by Mandarin speaking editors and there is no need to keep entries, which are not in the proper script and unattestable. Perhaps, the policy on monosyllabic entries should be reviewed but other Sinitic editors should be involved in the discussion. In my opinion, those entries could be converted to soft or hard redirects to toned pinyin entries with all the information. --Anatoli (обсудить) 22:48, 13 February 2012 (UTC)[reply]
It's at least worth discussing. It does seem to me even if the versions of the polysyllabic words with numbers shouldn't be speedily deleted, the vote offers no protection for them, so they would have to meet CFI by being attested and idiomatic. So anything that doesn't get any Google Books, Groups or Scholar hits should go. Mglovesfun (talk) 11:01, 14 February 2012 (UTC)[reply]
The polysyllabic ones definitely have to go. As for the monosyllabic ones, I am inclined towards deleting them. There is another solution. Either redirect the entire page or if we are not comfortable with this, then make it an alternative form of its diacritic counterpart. I really don't see the point of duplicating the effort. Unlike the tug-of-war between whether to prefer simplified script over traditional (or vice versa), this one is quite clearcut as to which one we prefer, so alt form makes sense in this case. JamesjiaoTC 01:41, 17 February 2012 (UTC)[reply]

Using modifier letters for superscript

A bunch of modifier letters that look like superscript letters were encoded into Unicode for use in various languages and particularly phonetic systems. They were not meant for "generic styling mechanisms for superscripting of text, as for footnotes, mathematical and chemical expressions, and the like." (See http://www.unicode.org/versions/Unicode6.0.0/ch07.pdf ) User:Doremítzwr insists on using them for ordinals, like [4] and for general superscripting like majᵗʸ. Note there's no way to automatically uppercase that text, there's no way to automatically search for it unless you know the idiosyncratic means of encoding it, and there's a limited set of characters; I'm not sure if Basic Latin is now covered, but I know most Latin characters outside the basic 26 of English aren't, and only a handful of Cyrillic or Greek. We should be using superscripts for the ordinals (if it's really thought necessary) and we can spell majᵗʸ as majty or maj'ty, or put it on a page of superscripted abbreviations.--Prosfilaes 12:10, 13 February 2012 (UTC)[reply]
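The uppercasing and searching points can be checked with a short Python sketch using only the standard unicodedata module. The helper name fold_superscripts is made up for this illustration, and NFKC folding is just one possible way a search could match these spellings; nothing here describes how Wiktionary's own search actually behaves.

```python
import unicodedata

def fold_superscripts(text: str) -> str:
    """Fold compatibility characters, including superscript-like modifier letters, to plain letters."""
    return unicodedata.normalize("NFKC", text)

word = "maj\u1d57\u02b8"     # "majᵗʸ" written with modifier letters
ordinal = "1\u02e2\u1d57"    # "1ˢᵗ" written with modifier letters

# The last two characters are distinct code points, not styled copies of "t" and "y".
for ch in word[3:]:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# Uppercasing changes only the plain letters; the modifier letters have no uppercase mapping.
print(word.upper())

# A plain substring search misses the word unless the text is folded first.
print("majty" in word, "majty" in fold_superscripts(word))
print(fold_superscripts(ordinal))
```

Folding with NFKC discards the superscript distinction entirely, so it suits search matching but not display.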

Those partially superscript contractions are extremely numerous in older texts, and whilst some of them will occur in both forms (e.g., both majᵗʸ and majty occur), other contractions only occur partially superscript (such as principˡ). Majᵗʸ, majty, and maj’ty all occur; how would you present the first? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 15:11, 13 February 2012 (UTC)[reply]
No, we shouldn't be using any kind of superscripts for ordinals, whether "pre-composed" or created by means of html tags. It looks ridiculously old-fashioned. And for dates we shouldn't be using any kind of ordinals. We should be writing "February 10" and "August 14". —Angr 14:21, 13 February 2012 (UTC)[reply]
I must take exception to your edit comment "we don't live in the 19th century". In what world do you live? If superscript ordinals are a typographical feature restricted to the nineteenth century, why the hell would Microsoft Word — probably the most popular word processor in the world — autocorrect "1st", "2nd", "3rd", "4th", etc. to their superscripted forms by default? And why shouldn't we be using ordinals for dates? With years, "February 10 2012" and "2011 August 14" look wrong. Indeed, most people use ordinals when writing dates. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 15:11, 13 February 2012 (UTC)[reply]
I can't see what that last link is to (Google doesn't let me), but in my limited experience most people do not use ordinals (written as such) when writing dates. They write "February 13, 2012" (as the case may be). Is this perhaps a pondian difference?​—msh210 (talk) 19:47, 13 February 2012 (UTC)[reply]
Here's the relevant bit, in our citation format:
  • 2012 February, Andrea Jones, All about Level 3 ITQ QCF: Using Microsoft Word 2010 (All About Resources, →ISBN), page 23
    Ordinals (1st) with superscript [¶] Most people probably do find this feature useful as they may use ordinals when typing dates (like 1ˢᵗ January 2012).
The author's from Lydbury North and the book was printed in the UK, so that much, at least, is consistent with your hypothesis that the use of ordinal suffixes is a Cisatlantic thing. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 22:45, 13 February 2012 (UTC)[reply]
I live in the UK and read a great deal and the superscripts in dates look comically antiquated to me. Equinox 22:51, 13 February 2012 (UTC)[reply]
Then we disagree. Clearly, we need the input of style guides on this issue. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 00:15, 14 February 2012 (UTC)[reply]
Per two of Prosfilaes's points — one, that Unicode explicitly notes that these characters are not meant as superscripted standard letters for style purposes and, two, that they are hard to search for — I'll have to agree we should not use them for dates in citations or in page titles. (For page titles, we can use the unsuperscripted versions. The headword line can include the superscripted version (or both, as appropriate); or, if the superscripted version is vanishingly rare as compared to the other, then its existence can be relegated to a usage note.)​—msh210 (talk) 19:52, 13 February 2012 (UTC)[reply]
Isn't that a problem if we have entries for both majᵗʸ and majty? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 22:45, 13 February 2012 (UTC)[reply]
Should we? We don't include the, THE, The, Tʜᴇ, and ᴛʜᴇ: the differences are in style not the word proper.​—msh210 (talk) 23:56, 13 February 2012 (UTC)[reply]
I don't think the ᵗʸ in majᵗʸ is merely stylistic — it remains superscript often irrespective of context (such as if everything around it is in all caps). — Raifʻhār Doremítzwr ~ (U · T · C) ~ 00:15, 14 February 2012 (UTC)[reply]
I agree wholeheartedly with Prosfilaes and msh210 that these modifier letters should not be used to write superscripts, because they are not intended or suited for that purpose (they are apparently not found by searches for the non-superscript letters); only <sup> and such things should be used on regular characters when it is necessary to write something superscript. - -sche (discuss) 00:56, 14 February 2012 (UTC)[reply]
They are found in searches; for example, 1ˢᵗ is the second search result that appears when one searches for 1st. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 01:24, 14 February 2012 (UTC)[reply]
It is neat to learn that final letters were often superscripted, though — even superscripted in cases like Principl where almost no space is saved! I saw honour (superscript) in a recaptcha image (i.e. taken from some old book) just yesterday and was confused until now. I would never have searched Wiktionary for honouʳ (modifier), mind you... - -sche (discuss) 01:02, 14 February 2012 (UTC)[reply]
Oh, and display as superscript (in headwords, in citations, in {{term}}, even in pagetitles by means of DISPLAYTITLE) can be by means of the HTML sup element.​—msh210 (talk) 19:57, 13 February 2012 (UTC)[reply]
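The disagreement above about whether searches for plain letters find the modifier-letter spellings can be illustrated with Unicode compatibility normalization. The following is only a minimal sketch: it assumes a search index that folds text with NFKC, which is not confirmed anywhere in this discussion as what the site's search actually does, and the function name fold_compat is hypothetical.

```python
import unicodedata


def fold_compat(text: str) -> str:
    """Fold compatibility characters (superscript and subscript forms
    included) to their plain equivalents using Unicode NFKC."""
    return unicodedata.normalize("NFKC", text)


# Escapes are used so the exact code points are unambiguous.
samples = [
    "1\u02e2\u1d57",      # 1 + MODIFIER LETTER SMALL S + MODIFIER LETTER SMALL T
    "maj\u1d57\u02b8",    # maj + MODIFIER LETTER SMALL T + MODIFIER LETTER SMALL Y
    "H\u2082SO\u2084",    # H + SUBSCRIPT TWO + SO + SUBSCRIPT FOUR
]

for original in samples:
    print(f"{original!r} folds to {fold_compat(original)!r}")
```

A search that compares folded forms would match 1ˢᵗ against 1st; a search that compares raw code points would not, which may explain why the two sides report different experiences.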

How many, and which entries use superscript characters? -- Liliana 00:29, 14 February 2012 (UTC)[reply]

There are potentially thousands of entries for obsolete spellings of this kind. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 00:51, 14 February 2012 (UTC)[reply]
I'm asking because in chemistry subscript letters are commonly used, like in H₂SO₄. -- Liliana 00:54, 14 February 2012 (UTC)[reply]
Well, those are subscript numerals, but they are another example of the legitimate (and irreplaceable) use of these characters. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 00:57, 14 February 2012 (UTC)[reply]
We have a not-yet-standardised mix of hard and soft redirects pointing to/from H2O, H2SO4 etc from/to the subscript versions so they can be found. Also, the subscript numbers were probably intended to be used in place of <sub>, unlike modifiers like ʳ, which were explicitly not intended to be used in place of <sup>. - -sche (discuss) 01:06, 14 February 2012 (UTC)[reply]

Question: how were things like "majty" and "4h" originally put onto paper? Did book presses and typewriters use dedicated distinct characters, or did they move regular characters around? Obviously, even if they used dedicated separate characters, those characters do not correspond to Unicode's modifier letters, and so we should not misrepresent them by Unicode's modifier letters, but if they just moved regular characters around, there really would seem to be no argument for using dedicated characters here. - -sche (discuss) 05:11, 14 February 2012 (UTC)[reply]

God knows. The superscripts are consistently smaller than the regular characters in whose context they appear. Maybe they just used type pieces for smaller font sizes, but I can't tell you with any authority. Whatever the case, physical type pieces and digital characters are disanalogous. With physical type pieces, one must use different bits of metal every time he wishes to change font sizes; the same digital characters are used irrespective of what font size is selected, and each is kept in the same relation of scale to every other. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 23:21, 14 February 2012 (UTC)[reply]
Well put, and this is exactly what I thought when I read -sche's comment. Sizes of traditional type don't have a bearing on digital characters. The things we are more interested in are stylised forms like & for et. Equinox 23:27, 14 February 2012 (UTC)[reply]
I should probably clarify: I am opposed to using modifier letters for things like majty; I consider the question of whether or not to use ordinals like 14th a separate question; I would prefer not to use ordinals, but I am not as opposed to ordinals as to modifiers. - -sche (discuss) 23:33, 15 February 2012 (UTC)[reply]

NB: we currently have some entries which are exclusively modifier-characters, like . - -sche (discuss) 05:11, 14 February 2012 (UTC)[reply]

It's clear that special characters should be used only for what they are designed for. Otherwise, it would be like using the Roman letter A in Bulgarian or Russian words because the appearance is exactly the same. Lmaltier 22:29, 15 February 2012 (UTC)[reply]
Good point! - -sche (discuss) 23:33, 15 February 2012 (UTC)[reply]
Not really. Obviously, it's better to use something tailor made if it's available (in the case of the Cyrillic А vs. the Roman A, it's better to use the former in words otherwise written in Cyrillic, because it causes the word in question to be sorted properly (i.e., alphabetically)), but in the case of these superscript forms, there is nothing tailor made that's available, so we have to make do with something that was designed for another purpose, but which nevertheless does the job just fine. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 04:14, 16 February 2012 (UTC)[reply]
Except that it's much more problematic with browsers, systems, and users than st, which is a real issue for the ordinals since there's no functional loss with using st. We should try for consistency, and none of our non-Doremítzwr users have any intention of using these characters in our dates.--Prosfilaes 10:43, 16 February 2012 (UTC)[reply]
There is something made to allow the representation of superscripts: the <sup> tags and other things msh210 describes. - -sche (discuss) 20:25, 16 February 2012 (UTC)[reply]
<sup> tags cannot be used in page titles. In the main text, <sup> tags cause line-spacing problems. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 03:37, 17 February 2012 (UTC)[reply]
They can't be used in the title as displayed in the browser's tab or what-have-you, but they can be used in the top-level header (even though we don't edit that one in the wiki source of the page). (I'm not sure which you meant.)​—msh210 (talk) 01:05, 21 February 2012 (UTC)[reply]
By "page title", I mean the text that appears atop a given page (before section zero and the table of contents), e.g., the “homoglyph” in large text atop our page for (deprecated template usage) homoglyph. What do you mean? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 10:58, 21 February 2012 (UTC)[reply]
That thing _can_ have superscripts and subscripts.​—msh210 (talk) 18:48, 21 February 2012 (UTC)[reply]
Yes, fr:Mme suggests that. Could you show me how, using a page of your choosing as an example? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 19:00, 21 February 2012 (UTC)[reply]
See [[User:Msh210 on a public computer]].​—msh210 (talk) 19:19, 21 February 2012 (UTC)[reply]
Hmm. Why hasn't this worked? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 21:18, 21 February 2012 (UTC)[reply]
Because the thing in the {{DISPLAYTITLE}} and the actual title must be equivalent in the sense that the former (once internal HTML tags are removed) can be used in a URL (or [[link]]) to yield the latter. In the linked-to case, majty as a pagetitle is inequivalent (in that sense) to majᵗʸ.​—msh210 (talk) 23:59, 21 February 2012 (UTC)[reply]
OK, thanks; noted. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 13:54, 22 February 2012 (UTC)[reply]
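The rule msh210 describes (the display title, once its HTML tags are removed, must yield the actual page name) can be sketched as a small check. This is a simplified model of that description only, not the wiki software's real logic, which applies further normalization; the function name display_title_allowed is hypothetical.

```python
import re

# Matches any HTML-style tag such as <sup> or </sup>.
TAG = re.compile(r"<[^>]+>")


def display_title_allowed(display: str, page_name: str) -> bool:
    """Return True if the display title, with tags stripped,
    is character-for-character the same as the page name."""
    return TAG.sub("", display) == page_name


# A <sup>-tagged title reduces to plain "majty", so it fits a page named "majty"...
print(display_title_allowed("maj<sup>ty</sup>", "majty"))
# ...but not a page whose name uses the modifier letters U+1D57 and U+02B8.
print(display_title_allowed("maj<sup>ty</sup>", "maj\u1d57\u02b8"))
```

Under this model the superscript display works only when the underlying page name is spelled with ordinary letters, which is the case msh210 says failed above.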

The superscripted abbreviations are left-over typographical conventions from the days before Gutenberg. Fortunately, they mostly died out by the end of the 17th century. Paper was expensive in those days, and these abbreviations allowed more text to be put on a page. Entire books have been devoted to the peculiarities of Latin and Greek pæleography. For dates in the ISO format I would use "2011-08-14" and not "2011 August 14" since these were intended to be computer sortable. Putting an ordinal into these looks bizarre. Eclecticology 10:14, 16 February 2012 (UTC)[reply]

The ISO format you advocate has problems of potential ambiguity; just as some people write "7ᵗʰ of August 2011" (7-8-2011) and others "August 7ᵗʰ 2011" (8-7-2011), so some people write "2011, August 7ᵗʰ" (2011-8-7) whilst others write "2011, 7ᵗʰ of August" (2011-7-8). We don't need our citations to be computer-sortable, because they're already listed from oldest to most recent, as standard. BTW, it's palæography. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 10:24, 16 February 2012 (UTC)[reply]
Typo gratefully acknowledged. -- Ec
Citation needed. As far as I know, every single person who uses 2011-8-7 format uses it in year-month-day format. That's part of why it was chosen as ISO standard format, because it didn't have a conflicting body of usage. Your format has problems, too, as some people will see it as 7?? of August 2011 or 7▉▉ of August 2011.--Prosfilaes 10:43, 16 February 2012 (UTC)[reply]
There are many available examples of YYYY DD MM date formatting: [5], [6], [7], [8], [9], [10], [11], [12]. Take especial note of this one which explains the rationale behind the YYYY DD MM order as:
  • 1999, Twin Plant News: TP. (Nibbe, Hernandez and Associates), volume 14, issues 7–12, page unknown
    YYYY-DD-MM or the year followed by day followed by month separated either by a dash or a slash. The logic for this standard is very simple…start with the largest number and then write the next largest number and so on. The year is the largest number after which a day which can be up to 31, after which the month which can be up to 12.
Encoding problems are never long-term problems, and in the meantime, boxes and such will not introduce ambiguity. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 03:37, 17 February 2012 (UTC)[reply]
You can develop a rationale for anything, including the one quoted, but that doesn't change the international standard. Eclecticology 08:51, 17 February 2012 (UTC)[reply]
No, certainly, but that wasn't my point. Prosfilaes didn't believe me that some people use YYYY DD MM date formatting, so I provided evidence that people do; the quoted rationale was just to show why some people would consider such a format to be intuitive. I agree that either YYYY MM DD or DD MM YYYY makes most sense, but that doesn't stop people misinterpreting the month number for the day number and vice versa when the date is anywhen between the 1ˢᵗ and the 12ᵗʰ of a given month (which is the case for approximately ⅖ of all dates). — Raifʻhār Doremítzwr ~ (U · T · C) ~ 09:37, 17 February 2012 (UTC)[reply]
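The "approximately ⅖" figure and the day/month swap can be checked with a short calculation. This is a rough sketch: it counts every date whose day-of-month is 12 or less, as the comment above does, even though dates such as 7 July read the same either way; the function name ambiguous_fraction is hypothetical.

```python
from datetime import date, datetime, timedelta


def ambiguous_fraction(year: int) -> float:
    """Fraction of days in `year` whose day-of-month is 1..12,
    so that the day number could be mistaken for a month number."""
    current = date(year, 1, 1)
    total = 0
    ambiguous = 0
    while current.year == year:
        total += 1
        if current.day <= 12:
            ambiguous += 1
        current += timedelta(days=1)
    return ambiguous / total


# 12 days in each of 12 months: 144 of the 365 days in 2011.
fraction_2011 = ambiguous_fraction(2011)
print(f"{fraction_2011:.3f}")

# The same string read day-first and month-first gives two different dates.
day_first = datetime.strptime("7-8-2011", "%d-%m-%Y").date()
month_first = datetime.strptime("7-8-2011", "%m-%d-%Y").date()
print(day_first.isoformat(), month_first.isoformat())
```

The fraction comes to 144/365, a little under two fifths, consistent with the estimate given.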

Note that, for page titles, this is the same kind of issue as italics (e.g. in animal scientific names). There is a solution used by fr.wikt (e.g. see fr:Mme: the title is Mme without using special letters). However, this solution cannot work if we want to create both Mme and Mme, or Canis and Canis, in different pages. The solution is to consider that, for technical reasons, page titles don't take superscripts, italics, etc. into account, and that all such variations are addressed in the same page. This is a perfectly reasonable and sound solution, and it's easy to understand it. Lmaltier 07:01, 17 February 2012 (UTC)[reply]

But why, when there's no need for us to be limited like that with our page titles? And by the same logic, why don't we strip all our page titles of diacritics and non-ASCII characters? That would make them a whole lot easier to search for using an ordinary keyboard. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 07:26, 17 February 2012 (UTC)[reply]
Hold on, since when did we use italics in page titles? It's possible (cf. canis and 𝑐𝑎𝑛𝑖𝑠), but why would you do this? -- Liliana 07:40, 17 February 2012 (UTC)[reply]
I think he's thinking of Wikipedia, where they italicise the page titles for species names and such. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 07:46, 17 February 2012 (UTC)[reply]
Not Wikipedia, but the international convention that a genus (or a species, or any taxon below the genus) must be written in italics.
About special letters: they must be used in titles if (and only if) they are used in the language, it's very simple. And these letters are not used in English. In majty, the t is a normal t, the y is a normal y, they just happen to be smaller and written higher. If we don't use the Roman letter A in Bulgarian words, it's not because of the alphabetical order, it's because it would be wrong: the Roman letter, the Cyrillic letter and the Greek letter are three different letters despite their common appearance. It's exactly the same here. Lmaltier 07:18, 18 February 2012 (UTC)[reply]
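The claim that the Roman, Greek and Cyrillic capital A are three different letters despite their shared appearance can be verified from their code points. The sketch below is illustrative only; the helper name describe is hypothetical, and the comparison at the end uses plain code-point order, which is an assumption about how a naive sort would behave rather than a description of the site's collation.

```python
import unicodedata

# Three look-alike capitals, written as escapes so they cannot be confused.
letters = {
    "Latin": "\u0041",
    "Greek": "\u0391",
    "Cyrillic": "\u0410",
}


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of a character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for script, ch in letters.items():
    print(f"{script}: {describe(ch)}")

# In code-point order the Latin A sorts before every Cyrillic letter,
# so a Cyrillic word typed with a Latin A would be misplaced in a naive sort.
print("\u0041" < "\u0410")
```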
I just created an entry for the French contraction Mʳ (monsieur), which is unambiguously attested in Usenet sources in employing the MODIFIER LETTER SMALL R. Should French print sources published before June 1993 (the date of the introduction of U+02B3) count towards its antedating? Or further, should French print sources published prior to the invention of digital computers count towards its antedating? Examples of a contraction taking the form of a majuscule em followed by a superscript minuscule ar certainly exist in such print sources. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 09:03, 18 February 2012 (UTC)[reply]
The normal abbreviation is M. but, you are right, Mr is attested. However, the entry you created, Mʳ, is not attested, as the ʳ letter does not exist in French; it is never used in French.
And Unicode is very clear (see document mentioned above): these letters are modifier letters, and they cannot be used for normal subscripted letters. Lmaltier 11:34, 18 February 2012 (UTC)[reply]
Did you even look at the entry? It has five citations (two are by the same guy, but that's still four independent ones), which disproves your assertion that "ʳ…is never used in French." — Raifʻhār Doremítzwr ~ (U · T · C) ~ 12:52, 18 February 2012 (UTC)[reply]
Lmaltier, despite your point being right, we aren't much better sometimes. Many of the minority languages of Russia use capital I instead of the palochka Ӏ, technically Unicode considers this practice illegal, and by your logic we should move all these entries to the spellings with palochka. -- Liliana 13:47, 18 February 2012 (UTC)[reply]
The [[Ӏ]] page states that the Roman I is in standard use (despite Unicode) in some languages for technical reasons (keyboards). In such a case, both pages are probably acceptable (I myself created pages for town names with bad typography for the capital (E instead of capital é) because the bad typography is very common, probably more common than the right one). But, of course, it's not the case for modifier letters such as ʳ (using the right r is much easier). Lmaltier 17:57, 18 February 2012 (UTC)[reply]
If the town names you're talking about are French, you should note that French orthography traditionally omits diacritics from atop letters when they are capitalised (though such omission is non-standard in Québecois French).
I don't think ease of entry is a valid criterion here. The examples of Mʳ I cited are in a medium that does not permit superscribing by any other method than by the use of characters like 〈ʳ〉. Given a more flexible medium, such as Microsoft Word, most people will use such a program's superscript function (equivalent to using <sup> tags here); but we don't have that flexibility in our page titles and the use of <sup> generally is problematic, which makes our medium more similar to Usenet than to Word. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 21:11, 19 February 2012 (UTC)[reply]
I think there is no difference between countries. If you look at town halls in France, you'll read LIBERTÉ, ÉGALITÉ, FRATERNITÉ, and this has always been the normal typography. But this character É is absent from typewriter and computer keyboards. Lmaltier 21:22, 19 February 2012 (UTC)[reply]
I'd read that diacritics are omitted from atop majuscules because otherwise maximal letter height would be exceeded. Perhaps my source and I are wrong, however. Still, your explanation of such commonplace omission as being caused by the "character É [being] absent from typewriter and computer keyboards" is implausible, because 〈É〉's absence would also lead to the omission of the acute accent from the minuscule 〈é〉, which I assume does not occur with anywhere near the same frequency; furthermore, whereas 〈é〉 can be generated by a simple shortcut like Alt Gr + E, 〈É〉 can be generated by a comparably simple shortcut, namely Alt Gr + Shift + E. Unequal ease of entry using typewriters and/or computer keyboards seems not to explain this phenomenon. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 10:56, 20 February 2012 (UTC)[reply]
The letters é, è, à, ù, ç are present on all AZERTY keyboards (including mine), of course... You could not do without them. But not the capitalized versions. Lmaltier 22:04, 20 February 2012 (UTC)[reply]
Aah, how interesting! I was not aware of AZERTY keyboards. Yes, that would probably explain the frequency of omission. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 10:58, 21 February 2012 (UTC)[reply]
Of course it's true that "Examples of a contraction taking the form of a majuscule em followed by a superscript minuscule ar certainly exist in such print sources." That is, a superscript r, not a modifier letter r.--Prosfilaes 14:32, 18 February 2012 (UTC)[reply]
Of course, this is what I mean. I repeat that the modifier letter ʳ does not exist in French, it's never used in French. The character representing it might have been used in a few cases, and you found a few examples, but certainly not the modifier letter (most probably the authors don't know what "modifier letter" means, they used the character because it looked more or less right, although not quite). A few years ago, I created many Bulgarian first names by bot (on fr.wikt), and I used a Roman a instead of a Cyrillic a in a number of cases. The mistake has been fixed, but would you have used such mistakes as a rationale for creating here these first names with a Roman letter a? Lmaltier 17:49, 18 February 2012 (UTC)[reply]
Right, so some people have used these superscripts for what they look like, namely superscripts. Consider the perspective of a typesetter working before digitisation. Perhaps he needs to print some Russian words in an otherwise-English context. Do you think he'd bother to have two different bits of metal — one for the Roman A and another for the Cyrillic А? It would surely be cheaper just to use the Roman A in all cases. Or what if he mixed up the Roman A with the Cyrillic А — Would that mean that every word in Roman type that seemed to use a Roman A actually misused a Cyrillic А? Even if you answer "yes" to the second question, how can you possibly know, if the two look identical? It would surely be a fetishisation of the intended use of whatever bit of metal was used to print the letter. In the case of superscripts, the fact that the bits of metal that were used to print them could also have been used to print ordinary letters in smaller font sizes is as inconsequential as whether a Roman A and a Cyrillic А were in fact printed using the same bit of metal. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 21:11, 19 February 2012 (UTC)[reply]
Of course, on paper, there is no difference, and which character has been used is irrelevant. But not here, we are not paper. Furthermore, in the present case, they don't look exactly the same. The page titles you propose are wrong. Lmaltier 21:22, 19 February 2012 (UTC)[reply]
Conversely, “majᵗʸ” has a more correct appearance than “majty”. In “majty”, the superscripts are too big, too high, and cause line spacing problems, whereas in “majᵗʸ” they are the right size, are at the right level, and have no effect on line spacing. Furthermore, “majᵗʸ” italicised as majᵗʸ has a correct appearance, whereas italicising “majty” as majty causes the 〈t〉 to appear on top of the 〈j〉. In terms of functional fit (i.e., using characters for their appearance), these hard-coded superscripts do a better job of representing superscribed characters than using <sup> tags does. I maintain that such functional fit matters more than Unicode-intended purpose. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 10:56, 20 February 2012 (UTC)[reply]
The modifier letters don't appear at all in some fonts/browsers, except as boxes. The relentless march of progress is resulting in both display problems being fixed for more and more people, but I don't think we can tell which problem will be fixed first. So, those two arguments ("modifiers are bad because they're boxes for some people" and "sup is bad because it breaks in italics") may cancel out, IMO. - -sche (discuss) 01:17, 21 February 2012 (UTC)[reply]
Whereas boxes are unequivocally seen as a display problem to be fixed, I don't think that the problems with <sup> tags are even recognised. Howbeit, I have just discovered that combining <small> tags with <sup> tags generates superscripts of the correct size and height; for example, "1<small><sup>st</sup></small>", "2<small><sup>nd</sup></small>", "3<small><sup>rd</sup></small>", "4<small><sup>th</sup></small>" generates: "1st", "2nd", "3rd", "4th". They still cause line-spacing problems and are positioned too far to the left when italicised, but this new-found functionality is enough to make me drop my insistence that we use the hard-coded superscripts. I now advocate only that we use those hard-coded superscripts to allow us to distinguish page titles à la [[majty]] vs. [[majᵗʸ]]. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 10:58, 21 February 2012 (UTC)[reply]
──────────────────────────────────────────────────────────────────────────────────────────────────── With regard to pagenames — pagenames = the things that exist in place of xz in http://en.wiktionary.org/wiki/xz and [[xz]] — I'm not convinced we should distinguish "majty" and "majty". I agree with msh210's point, above, that this is like "THE", "THE" etc. I'm generally in favor of including as much information as possible on a page, so if "a" is usually italicized in mathematical equations (which it may not be, I'm just making up an example) or "ty" is usually superscript in "majty", I strongly agree that we should convey this on the page. I just now added a usage note to "LORD" to explain that it is commonly written "LORD". I'm not as insistent as you (Raifʻhār) that we convey such typographical features in the headword line, but I definitely want them mentioned in usage notes or sense-line qualifiers. I think the pagenames should be "LORD", "majty" etc, however. (I accept pagenames like "H₂O" because we redirect to them, but my favoured solution for that, too, would be "H20" as the pagename/URL and "H₂O" as the thing displayed everywhere on the page. But I'm not going to press for that.) In part this is to combat Wiktionary's proliferation of content onto multiple pages; surely "majty" is the same word when typed "majty" on Usenet and when written with superscript letters in an old book, so I don't think we need separate entries for the typographical variation. 
Having the same pagename may, in the event one language has a word "majty" that's written with superscript letters and another has a word "majty" that is never written with superscript letters, also mean we can't have superscript pagetitles (pagetitle = the part of [[User:Msh210 on a public computer]] that currently displays "user: msh210 public"), but because I expect most "majty"-words are also written "majty" sometimes, I don't see it as a problem to use "majty" as the pagetitle/header (and pagename/URL) and only have a usage note mention "majty". - -sche (discuss) 22:03, 21 February 2012 (UTC)[reply]
By the way, even when the Unicode characters display, they sometimes display in an unschön way (no better or worse than italicized <sup>-letters). Note the "i" raised above the "t" and "es" in the image to the right. - -sche (discuss) 22:53, 21 February 2012 (UTC)[reply]
Hmm. What do you think of principl? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 13:54, 22 February 2012 (UTC)[reply]
Looks good! Sorry I missed your reply. (Specifically, unlike in the image, all the letters of the <sup>ped "ties" are at the same height.) - -sche (discuss) 04:21, 4 March 2012 (UTC)[reply]
I created Wiktionary:Votes/pl-2012-02/Using modifier letters for superscript as a possible vote on the subject. Let's all discuss and boldly modify it. As it's set up now, if we cannot get consensus for one option or the other, the unregulated status quo continues. (I feel strongly that whether or not to use ordinals — whether "14th" or some kind of superscript — needs to be a separate vote, although if this vote determines that one or the other method of effecting superscript should be used, that will be binding also on any superscript ordinals.) - -sche (discuss) 22:29, 19 February 2012 (UTC)[reply]
Do we even need to have a formal vote now, or have we sorted out how to handle this? - -sche (discuss) 04:21, 4 March 2012 (UTC)[reply]
Well, Ruakh and I are working on {{SUP}} and {{SUB}} so that they render correct super- and subscripts in normal text, but the one issue that remains is whether to use these modifier letters for page titles. BTW, please continue this discussion on the vote's talk page; the Beer Parlour is no longer on my watchlist. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 00:46, 7 March 2012 (UTC)[reply]

CFI and company names

I have created Wiktionary:Votes/pl-2012-02/CFI_and_company_names, which proposes removing the section dedicated to company names from WT:CFI.

If any discussion that results lasts longer than to the beginning of the vote (which is 20 February 2012), feel free to postpone the vote.

A poll relevant to the vote: Wiktionary:Beer_parlour_archive/2011/April#Poll:_Including_company_names.

I emphasize that removing the section does not lead to inclusion of any and all company names. Rather, after removing, the inclusion of company names would be governed by the section on the names of specific entities, just like names of literary works such as Much Ado About Nothing. --Dan Polansky 15:20, 13 February 2012 (UTC)[reply]

Great idea, thanks for having the initiative to start this. :) -- Cirt (talk) 23:55, 13 February 2012 (UTC)[reply]
As a practical matter, how has the specific-entities rule been applied so far? I assume that some editors have been adding them like crazy, while other editors slowly (or not-so-slowly) list them at RFD? With a specific consensus being required for deletion, but not for creation? —RuakhTALK 18:06, 14 February 2012 (UTC)[reply]
After the removal of attributive-use rule ("A name should be included if it is used attributively, with a widely understood meaning"), which took place in Wiktionary:Votes/pl-2010-05/Names of specific entities, I have seen no editors add names of specific entities like crazy, but I'll stand corrected. Daniel Carrero was adding some names of dubious lexicographical value (IMHO anyway) some time ago, but these were no company names, and he has already stopped. I have recently added a fairly small batch of Czech geographic names, ones that topped a frequency list. Specifically, I have seen no flood of geographic names that was feared by some of the opposers of broad inclusion of geographic names.
In RFD, consensus is required for deletion; that's right. I admit that this creates a pro-keeping bias, as consensus is required for deletion rather than for creation. Wikipedia's w:WP:AfD has the same pro-keeping bias, it seems. The same pro-keeping bias pertains to discussions of idiomaticity in RFD; the bias is specific to RFD rather than to company names. --Dan Polansky 07:58, 15 February 2012 (UTC)[reply]

DICTIONARY FOR BRAZILIAN INDIGENOUS LANGUAGES

Hello, my name is Rodrigo Cotrim. I'm a linguistics professor in Brazil and I've been working with indigenous languages spoken nowadays in Brazil (13 Brazilian languages out of the 180 existing ones). I would like to make a request to create a dictionary for at least one of the languages I'm working with. It would help me and my indigenous students to make a word list/glossary/vocabulary/dictionary/thesaurus of their mother tongue (L1) (and of their second language (L2), Brazilian Portuguese). This dictionary would help to expand the scientific knowledge of an endangered language spoken in Brazil. It would also help my indigenous students (many of whom are also indigenous teachers at their villages) in their schools, since the Brazilian government has been installing computers and Internet access at public schools located in indigenous villages. Could someone help us? My students and I will be really thankful and we are looking forward to an answer. Sincerely, Rodrigo Cotrim (Professor at Federal University of Goiás, Goiânia, Brazil)— This unsigned comment was added by Rodrigo Smisuite (talk • contribs).

Such words are certainly welcome here as entries (though the "definitions" are English translations); see template:welcome for basic information about how things work around here, and feel free to ask here (or, better, at WT:ID) any further questions you have.​—msh210 (talk) 02:39, 15 February 2012 (UTC)[reply]
We have some that you can look at. This should act as a guide for you: Category:Guaraní language. —Stephen (Talk) 03:49, 15 February 2012 (UTC)[reply]
Also note that there is a Portuguese-language Wiktionary, where the glosses are written in Portuguese and the administration of the Wiktionary is discussed in Portuguese. You and your students may prefer to create your dictionary there, so that glosses and communication can be conducted in that language rather than English. Of course your entries are welcome at English Wiktionary too! But if you prefer using Portuguese, you should be aware that there is that option. —Angr 10:27, 15 February 2012 (UTC)[reply]

I'm not 100% happy with this proposal, but I think it's an improvement over the status quo.

Things I'm not so happy with:

  • What about multiple quotations from a small group, such as a single Usenet group? Should they be counted as independent?
  • I don't like the broadness of "anything like the following", but I also didn't want to try to microscopically define all corner cases.

Input or improvements on these points, or on any other, would be welcome.

RuakhTALK 20:22, 15 February 2012 (UTC)[reply]

Hm, I don't know if this is a good idea. I like requiring independence of citations as a general principle for what makes something a real word, but it doesn't translate well into an actual usable firm rule that doesn't break certain things. I'm not sure if the proposed replacement is an improvement. --Yair rand 20:33, 15 February 2012 (UTC)[reply]
So, what would you suggest instead? —RuakhTALK 20:54, 15 February 2012 (UTC)[reply]
Well, taking into account that a proposal needs community consensus, I would just leave the section the way it is. In a situation where I find myself appointed Supreme Dictator of Wiktionary, I would probably change it to something horribly ambiguous, and leave relevant decisions to whoever happens across the relevant RFV or RFD and can get enough people to agree that "that is/isn't really independent...", and win the inevitable new argument about what independence means, which can be repeated every time the situation pops up (thus producing all sorts of interesting examples and arguments which might be useful in drafting a potential new policy), sort of like what we do with noun/proper noun designations. :P --Yair rand 21:07, 15 February 2012 (UTC)[reply]
Ah. The current section is so bad that I guess I just don't see leaving-it-the-way-it-is as an option. :-P   The key problem, by the way, isn't that it's vague (which I assume is what you mean by "ambiguous"), but that it's contradictory: it proposes a specific rule, giving non-durably-archived examples, and then explains that the rationale is something completely unrelated. Obviously I'd prefer a guideline that's actually usable, but failing that, we need to fix the current text somehow. (You complain about the difficulty of getting a rule "that doesn't break certain things", but the current text already is broken . . .) —RuakhTALK 21:30, 15 February 2012 (UTC)[reply]
The current text certainly has significant problems, but it has the advantage of being very open to community interpretation. The only real statement in that section (excluding the last sentence, which we generally just don't listen to) is that we want to exclude multiple references/uses that draw on each other. The proposed version actually gives specific points about what that means. A famous quote that becomes an idiom (ex. et tu, Brute) could have an issue with this, as every use of it technically is a verbatim quotation. --Yair rand 22:36, 15 February 2012 (UTC)[reply]
For the most part, I like your (Ruakh's) proposal. Where I see a problem is in the italicized part (by me) of "This serves to exclude uses that draw from each other, or that draw from a common source": "draw" is so broad that specialist uses of a shared term that can be traced to a common source might be considered dependent; an example would be speciesism, I think, which can be traced to Richard D. Ryder from 1973 if one believes Wikipedia. -Dan Polansky 21:34, 15 February 2012 (UTC)[reply]
Most uses of a word (at least of an invented word) ultimately originate from a common source. The text should make clear that uses of a word in different sentences written by different people always are independent citations, whatever this word is. Lmaltier 21:48, 15 February 2012 (UTC)[reply]
A proposed edit: "In particular, two uses are non-independent if (but not only if) anything like the following is true:". This may be what was intended. As a consequence, the rule would be more explicitly open-ended. --Dan Polansky 21:38, 15 February 2012 (UTC)[reply]
What are the other possibilities? How about instead of "(but not only if)" we add another bullet point with "if consensus of the Wiktionary community finds it to be non-independent" or some such? Pengo 22:38, 15 February 2012 (UTC)[reply]
The word "if" is often read as "if and only if". This was the reading many editors applied to "if" in "A name should be included if it is used attributively, with a widely understood meaning". The same reading is usually applied to "This in turn leads to the somewhat more formal guideline of including a term if it is attested and idiomatic": a term should be included <=> the term is attested and idiomatic. --Dan Polansky 22:49, 15 February 2012 (UTC)[reply]
When someone misreads "if" as "if and only if" their error should not be treated as correct. No syllogism is bidirectional unless that is clearly specified. Eclecticology 09:40, 16 February 2012 (UTC)[reply]
  • I've updated the proposed text to steal Lmaltier's explanation of "independent", almost verbatim; to eliminate the vagueness that Dan Polansky points out in "draw"; and to plop in a "roughly speaking" and "generally" in acknowledgement of Yair rand's point (though I'm sure he won't consider it nearly enough). The "roughly speaking" and "generally" hopefully also address the point that Dan Polansky was making about how "if" is often taken to mean "if and only if". (Another possibility is to insert a "say" or "for example". I'm not a fan of the "if (but not only if)" wording, though; for some reason, even though it gets several thousand b.g.c. hits, it sounds very strange to me.) —RuakhTALK 00:54, 16 February 2012 (UTC)[reply]
    I can't find any problem with this revision, the latest one. (I have made a small fix to the vote.) Because the second sentence and the bullet points are introduced by "Roughly speaking", this gives some flexibility. Great job! --Dan Polansky 06:51, 16 February 2012 (UTC)[reply]

Votes to change CFI

In addition to the vote Ruakh has set up (on Independence, see the section just above this) and the vote Dan has set up (on company names, two sections up), Liliana has set up a vote for Removing "Vandalism" and "Protologisms" sections of CFI pursuant to October's straw poll, and I have set up one vote to make small changes concerning Patronymics and stylistic edits of CFI and another to remove the section on Attestation vs the slippery slope, both also inspired by the results of the straw poll and other past discussions. Woo, voting. (Other bits of CFI the community expressed an interest in re-examining, but concerning which no vote has yet been set up, include Idiomaticity, Natural Languages, Constructed Languages, Brand Names, Names of Specific Entities.) - -sche (discuss) 23:26, 15 February 2012 (UTC)[reply]

Dan Polansky suggests on the talk page that we could link the key words in our general rule to the sections of CFI that define them (like <tt>[[#Attestation|attested]]</tt>) rather than putting them in bold and linking to the main namespace. Please comment here or on the talk page if you have a preference for one idea or the other. Also, WT:CFI currently uses a mix of curly (“”’) and straight (""') quotation marks and apostrophes; please also comment if you have a preference for one of those or the other. :) - -sche (discuss) 19:27, 16 February 2012 (UTC)[reply]

Being paid to write Wiktionary entries

I have been told by email that one of our contributors (User:Boundlesslearning) is being paid (by an e-learning company) to write articles for us. Notwithstanding that his contributions have been of poor quality (where they weren't just sum-of-parts), is this acceptable? SemperBlotto 09:07, 16 February 2012 (UTC)[reply]

As long as it is understood by both the company doing the paying and the user doing the editing that they both waive any property claims over the latter's contributions hereto, I don't suppose it really makes any difference to us. That being said, Boundlesslearning's wage gives him an ulterior motive for editing here; consequently, we are thereby justified in assuming bad faith on his part if the quality of his contributions does not improve rapidly. To put it bluntly, if he's getting paid to edit here, he'd better make sure his contributions are worth having, and that he isn't just adding mess that has to be cleaned up by the unpaid volunteers who make up the vast majority of the editing community here. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 09:49, 16 February 2012 (UTC)[reply]
Agreed. I would like to know why they were having him edit. The essence of the problem on Wikipedia is that those paid to edit are dismotivated to follow NPOV. If he actually has reasons to improve the dictionary, then it's a good thing; if he's here to spam, it's not.--Prosfilaes 10:55, 16 February 2012 (UTC)[reply]
Very worrying. They will most certainly have an ulterior motive of spamming their techniques, technologies, etc. even if they aren't so direct as to do it with hyperlinks. Wikipedia has policies about this, as it has been far more of a problem there; does anyone know what they are? Equinox 09:58, 16 February 2012 (UTC)[reply]
If party A wants to pay party B, it's beyond our remit to interfere. What we can 'interfere' with is the contributions of individual editors. If an editor is vandalistic or consistently makes bad but non-vandalistic edits, we should block them. Having witnessed Boundlesslearning's edits a block seems very much appropriate. Mglovesfun (talk) 11:17, 16 February 2012 (UTC)[reply]
Are you sure? He doesn't seem to edit frequently enough to be paid. We should watch carefully the external links he adds, but I haven't noticed any POV yet. Ungoliant MMDCCLXIV 14:16, 16 February 2012 (UTC)[reply]
I think it would be somewhat harder to insert POV into dictionary entries than in encyclopedia entries. My concern with a paid editor here would only be with the quality of their work and their conformance to the CFI, not with their bias for or against a particular viewpoint. bd2412 T 14:24, 16 February 2012 (UTC)[reply]
I think the burning question here should be "how can we get paid to write Wiktionary entries?" --Itkilledthecat 14:19, 16 February 2012 (UTC)[reply]
WF - You'll get your reward in Heaven (or possibly the other place). SemperBlotto 15:18, 16 February 2012 (UTC)[reply]
I wouldn't mind getting paid to do legitimate Wiktionary work. My biggest question about someone getting paid would be: "Why not me?"
I would wonder about the credibility of charges that someone was being paid, as such charges could be leveled at anyone. We are not really in a position to investigate such charges.
Such paid work could be both legitimate for Wiktionary and in a payer's interest under various circumstances.
  1. If an industry association or trade union wanted to make available the technical terms of its industry in hopes of getting someone to translate them into other languages, we might object to flooding by {{trreq}}, but we should welcome the addition of perhaps obscure entries, subject to our usual standards for inclusion, such as they are.
  2. If some national government paid for the entry of words in a recognized language, would we object? Should we?
  3. If some tourist board employee entered all the locations within its remit, could we object? DCDuring TALK 19:10, 16 February 2012 (UTC)[reply]
That's what I should have said. What matters here is the entries, not who created them or why. Mglovesfun (talk) 19:21, 16 February 2012 (UTC)[reply]
Exactly. Motivations are not relevant (after all, everyone here must have one's own motivations), provided that what is done improves the Wiktionary. Lmaltier 20:10, 16 February 2012 (UTC)[reply]
A less legitimate rationale: it might happen that people get paid for introducing a very large number of (not too obvious) copyright violations with the end of the project as the ultimate objective. Lmaltier 20:23, 16 February 2012 (UTC)[reply]
I agree with DCDuring and Lmaltier, if someone is paid to edit Wiktionary that can be OK, as long as their edits are good. (Re Lmaltier's second comment: even a volunteer could introduce a large number of copyvios, as User:Primetime did.) - -sche (discuss) 08:17, 17 February 2012 (UTC)[reply]
The edits seem fine to me. Whatever happened to "assume good faith"? I studied biotechnology and many of the contributions are common terms I recognise, and have perfectly reasonable definitions. Pengo 01:54, 17 February 2012 (UTC)[reply]
That is because every single one of them has been cleaned up by another user. SemperBlotto 07:59, 17 February 2012 (UTC)[reply]
If someone wants to pay me for editing Wiktionary, please let me know :). On the relevant note, I see no problem with being paid per se; the contributions should be judged on their own. --Dan Polansky 08:27, 17 February 2012 (UTC)[reply]
Note. This user is now operating under the name of User:Scienceexplorer (confirmation via email). SemperBlotto 11:17, 22 February 2012 (UTC)[reply]
The issue I see here is that the decision (of paying these contributors) was made unilaterally by an external business who has no direct influence over any Wikimedia site without consulting with anyone from Wikimedia. So there is next to no understanding or communication of their (ulterior) motive in making this initiative. They also made no effort in understanding the existing standards and conventions used in any given Wikimedia site, before devoting their money to inexperienced editors. Besides, I have seen these people's (yep, I suspect there is more than one person involved) edits and their quality is nowhere near as good as the quality of the contributions of the amateur (or professional in some cases) lexicographers on this dictionary website. JamesjiaoTC 11:40, 22 February 2012 (UTC)[reply]
  • I looked at 10 or so of today's contributions from User:Scienceexplorer. They seemed reasonably well formatted and well worded. I have challenged three that seemed SoP to me, but not everyone agrees with my nominations to RfD. The contributor may not be sensitive to matters like whether an NP headed by a word (protein) that is both countable and uncountable isn't also both uncountable and countable. It would be nice to see at least one citation. Even for the SoP terms, I see no reason for them not to be in a glossary-type appendix and/or redirects either to another headword or to the appendix. IOW, this seems like better than average specialized content. If the person is getting paid, s/he has plenty of incentive to learn our approach and apparently has. DCDuring TALK 17:28, 22 February 2012 (UTC)[reply]

Created category, Freedom of speech and en:Freedom of speech

Created new category, for Freedom of speech. This is in conjunction with crosswiki sister project coordination at Commons:Category:Freedom of speech. Please feel free to help populate it, that'd be most appreciated. ;) Cheers, -- Cirt (talk) 06:13, 17 February 2012 (UTC)[reply]

I find it an excessively narrow topical category. DCDuring TALK 12:04, 17 February 2012 (UTC)[reply]
As do I.​—msh210 (talk) 19:27, 20 February 2012 (UTC)[reply]

Not sure if this is a purely technical issue and belongs to Wiktionary:Grease pit but I have created three entries for Arabic diacritics but the next/previous buttons show something else and the red links suggest unsupported titles. Does Wiktionary fully support Arabic diacritics? As you see the headers for the entries are better used in combination with ـ (taṭwīl/kashida - the elongation symbol). What's the best way to create these entries? Do they belong to unsupported titles? Trying to show links to the new entries here: َ (a)‎, ِ (i)‎ and ُ (u) --Anatoli (обсудить) 04:11, 20 February 2012 (UTC)[reply]

Hmm, I can't get to the link to these three entries on my contributions list (currently using Windows XP, Firefox browser). Can see the symbols but no link. --Anatoli (обсудить) 04:26, 20 February 2012 (UTC)[reply]
I don’t have any difficulty opening َ (a) or ِ (i) or ُ (u). —Stephen (Talk) 10:06, 20 February 2012 (UTC)[reply]
Thanks, Stephen. Now using Windows 7 - my home computer, which also has Arabic support installed. I don't see the links to the entries at all. If I open Category:Arabic diacritical marks, I only see three bullet points. I can only see the symbols (over or under |) in the edit mode while typing this reply. I don't understand what's going on. --Anatoli (обсудить) 11:09, 20 February 2012 (UTC)[reply]
I don’t know, either. I am using WinXP Pro and Firefox 10, and for me it’s no problem. I can open the entries and I can see them in the Category page. —Stephen (Talk) 11:21, 20 February 2012 (UTC)[reply]
I can see them just fine on Windows XP and Opera 11. They display the dotted circle similar to other scripts. -- Liliana 13:57, 20 February 2012 (UTC)[reply]

For reference: one‎, two, and three.​—msh210 (talk) 19:22, 20 February 2012 (UTC)[reply]

I don't see the links on Firefox 10 in Linux Mint 11 but I do see msh210's links. —CodeCat 19:37, 20 February 2012 (UTC)[reply]

I don't see Anatoli's links (Firefox 10, Windows 7), except when editing the page to write this, where they appear over lines, as he describes. I don't see links in the category, either, only bullet points. I do see the characters once I reach the page via msh210's links. Perhaps this is another good example of the need for combining characters to be combined with something. (The combining-character-only pagetitles could certainly redirect to composed forms, or the composed forms could redirect to combining forms.) - -sche (discuss) 20:45, 20 February 2012 (UTC)[reply]
For now, I'll make redirects - ـَ (-a)‎, ـُ (-u)‎ and ـِ (-i)‎ and others later. I can see the links on my work computer (Windows XP, Firefox 5) but not on my home laptop (Windows 7, Firefox 5). The results with other browsers and systems may be unexpected. Perhaps we need to check some other similar examples where a diacritic can only work in combination with something, like -sche suggested. --Anatoli (обсудить) 22:14, 20 February 2012 (UTC)[reply]
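The behaviour described in this thread (a bare diacritic rendering as nothing, or as a mark over a dotted circle, until it is attached to a carrier such as the taṭwīl) follows from these characters being Unicode combining marks. A minimal sketch, assuming the three entries are the fatḥa (U+064E), ḍamma (U+064F) and kasra (U+0650); the helper names `is_combining` and `with_carrier` are made up for illustration:

```python
import unicodedata

TATWEEL = "\u0640"  # ARABIC TATWEEL (kashida), used here as a carrier
DIACRITICS = {"a": "\u064E", "u": "\u064F", "i": "\u0650"}  # fatha, damma, kasra


def is_combining(ch: str) -> bool:
    """True if the character has a non-zero canonical combining class."""
    return unicodedata.combining(ch) != 0


def with_carrier(mark: str) -> str:
    """Attach a combining mark to the tatweel so it has a base to sit on."""
    return TATWEEL + mark


for translit, mark in DIACRITICS.items():
    print(translit, unicodedata.name(mark), is_combining(mark), with_carrier(mark))
```

A title made only of a combining mark has no base character, so how it is drawn (and whether a link to it is clickable) is left to the browser and font, which would be consistent with the differing reports above.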

Flags

Is there a page where I can see all flags? And where is the correct place to discuss about them (inclusion, change, etc.)? Ungoliant MMDCCLXIV 20:15, 20 February 2012 (UTC)[reply]

*see*? The code for the flags is stored in MediaWiki:Gadget-WiktCountryFlags.css. If you want to discuss anything, do so here I guess. -- Liliana 20:25, 20 February 2012 (UTC)[reply]
Thank you. Ungoliant MMDCCLXIV 20:49, 20 February 2012 (UTC)[reply]
By the way, the flag for !Xóõ isn't working. Bloody encoding. Ungoliant MMDCCLXIV 02:15, 27 February 2012 (UTC)[reply]
Bleh. No idea how to get that to work. -- Liliana 02:38, 27 February 2012 (UTC)[reply]
I think this MediaWiki behavior is a bug. Neither HTML nor XML allows attributes of type ID to start with ., so the encoding of !Xóõ as .21X.C3.B3.C3.B5 is invalid. —RuakhTALK 03:34, 27 February 2012 (UTC)[reply]
In that case, shouldn't it be reported to mediazilla:? -- Liliana 03:42, 27 February 2012 (UTC)[reply]
Yes, I think so. —RuakhTALK 21:20, 27 February 2012 (UTC)[reply]
Adding a \ before each . should work, I think. --Yair rand (talk) 03:45, 27 February 2012 (UTC)[reply]
Indeed. Done. —RuakhTALK 21:20, 27 February 2012 (UTC)[reply]
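For anyone following along, the encoding and the backslash fix discussed above can be sketched roughly as follows. This is only an illustration: the function names are hypothetical, the encoding rule (percent-encode the UTF-8 bytes, then replace each `%` with `.`) is inferred from the `.21X.C3.B3.C3.B5` example in this thread, and it is an assumption that the gadget stylesheet targets the flag by an ID selector.

```python
from urllib.parse import quote


def legacy_anchor(title: str) -> str:
    """Percent-encode the UTF-8 bytes of a title, then swap '%' for '.'."""
    return quote(title.replace(" ", "_"), safe="").replace("%", ".")


def css_id_selector(anchor: str) -> str:
    """Backslash-escape each '.' so CSS does not parse it as a class selector."""
    return "#" + anchor.replace(".", "\\.")


anchor = legacy_anchor("!Xóõ")
print(anchor)                   # the encoded identifier
print(css_id_selector(anchor))  # the same identifier, escaped for a stylesheet
```

Without the backslashes, a selector such as `#.21X.C3...` would be read as an ID followed by class names, which is presumably why the unescaped rule never matched.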

Rhymes by dialect in Catalan (but possibly other languages too)

The current way of categorising rhymes in Catalan is by using the standard Central Catalan dialect of Catalonia, which is the best-known standard for Catalan. However, there are other dialects, some with their own standard, notably Valencian and Balearic. The problem is that these dialects distinguish certain phonemes that Central Catalan doesn't, especially in unstressed syllables. In Central Catalan, unstressed a and e are pronounced the same (as schwa), as are unstressed o and u (as u), so words ending with those vowels (optionally followed by more sounds) rhyme in Central Catalan whereas they don't rhyme in Valencian. But in Central Catalan words containing stressed ɔ, this is often merged with o in Valencian, so that for example dónes and dones sound alike in Valencian but not in Central Catalan. The same situation occurs with ɛ and e, but Balearic has a third e-like phoneme, stressed ə. I'm wondering how this situation can be solved, seeing as currently certain rhymes are thrown together for the sake of Central Catalan while such mergers are inappropriate for Valencian speakers. Should the categories be split so that both dialects are represented, with a footnote that for example words ending in -os rhyme with those in -us in Central Catalan? And what about Balearic, a dialect that has fairly few speakers and even fewer contributors... —CodeCat 00:49, 23 February 2012 (UTC)[reply]

See Rhymes:English:-ɛri for how we handled one case where some dialects of English exhibit rhymes that others do not. I don't know if that's the only approach we're using for English (we're not famously consistent about these sorts of things), and maybe it's not the best approach for Catalan; but it's probably a decent starting-point. —RuakhTALK 03:51, 27 February 2012 (UTC)[reply]
That approach is used for some Catalan rhyme pages as well, but the issue is that currently our Catalan rhyme pages use the schwa phoneme (in the title), which exists in Central Catalan but corresponds to two different phonemes in Valencian. This means that the words on for example Rhymes:Catalan:-onə might rhyme in Central Catalan but not in Valencian, where they would be differentiated into -ona and -one. So the question is whether there should be Rhymes:Catalan:-ona and Rhymes:Catalan:-one with a notice like the one you mentioned, even though Central Catalan doesn't have unstressed -a or -e. —CodeCat 12:51, 27 February 2012 (UTC)[reply]

Proposal - complete unified login for all eligible accounts

I have created a proposal at Meta, to complete unified login for all eligible accounts. Unified login is a relatively new feature to the WMF wikis, allowing each user to have a single combined account in every project. Users that only have an account on one wiki would extend that to all wikis, and users that already have accounts on multiple wikis would have them combined. It was initially an opt-in for existing users, but it is now done by default for all new users. This leaves us with three groups of users: those with UL, those that cannot complete UL because of a naming conflict on another wiki, and those with no conflict that have simply not completed the process. I am proposing that account unification be completed for all eligible accounts without requiring the user to take any additional steps. This would make UL the rule rather than the exception that it currently is, and bring us closer to the goals of universal watchlists, recent changes, interwiki page moves, etc. This would be especially helpful on Commons, which has so many images that were originally uploaded at another WMF wiki, enabling better attribution without interwiki links. I propose that it be carried out as a one-time process rather than a continuous automatic software process, allowing users to still adjust ULs as they see fit.

If you have any opinion one way or the other, please reply at the proposal at Meta. JohnnyMrNinja 01:13, 23 February 2012 (UTC)[reply]

Misuse of rollback by SemperBlotto

Here, SemperBlotto used rollback to revert a perfectly good-faith edit without any explanation given for the revert. This is not the first time he has done this to me; nor am I the only person who he has misused rollback on. I hereby request that his rollback privileges be suspended owing to continual abuse. Purplebackpack89 (Notes Taken) (Locker) 01:20, 26 February 2012 (UTC)[reply]

A cursory examination of his talk page reveals numerous complaints about hasty deletions or reverts. This has got to stop Purplebackpack89 (Notes Taken) (Locker) 01:28, 26 February 2012 (UTC)[reply]
Good faith isn't good enough. In the edit under discussion you seem to have confused an etymology with a definition. Metro clearly functions as a word in its own right, having meaning that is not identical to either metropolitan or metropolitan area. DCDuring TALK 02:30, 26 February 2012 (UTC)[reply]
"Good faith isn't good enough". If an edit was made in good faith, it can't be rolled back, even if it's wrong. It can be fixed or undone, but not rolled back. Rollback is for bad-faith edits only. The issue here is that Semper makes reverts and deletions too quickly to be anywhere near 100% accurate about being vandalism or not (This is hardly the first time he's been inaccurate with rollback). Because of that, he should forfeit his tools. And FYI, it is a definition; in many cases "metro" is used as a synonym for the adjective use of "metropolitan", not just as a noun regarding transit. Purplebackpack89 (Notes Taken) (Locker) 04:10, 26 February 2012 (UTC)[reply]
Wrong good-faith edits can be rolled back. "Rolling back" is just a one-click version of "undoing". Admins are busy people, so if an edit is wrong enough to merit undoing, it will often be rolled back. - -sche (discuss) 05:10, 26 February 2012 (UTC)[reply]
That's a misuse of rollback to do that, -sche. SemperBlotto serially misuses it, and the deletion tool as well, bites newcomers, and doesn't assume good faith. Frankly, I cannot understand how he is still an admin Purplebackpack89 (Notes Taken) (Locker) 05:39, 26 February 2012 (UTC)[reply]
DCDuring and -sche both clearly feel that it can be O.K. to roll back good-faith edits, and I'll add my voice to their chorus. Do you have any evidence for your contrary claim? For example, can you link to a Wiktionary policy or guideline on the subject? —RuakhTALK 05:51, 26 February 2012 (UTC)[reply]
Lemme turn the tables on you...on any other WikiMedia project, rollback can't be used for good-faith edits. Where's the policy or guideline that says we can or should here? Purplebackpack89 (Notes Taken) (Locker) 06:01, 26 February 2012 (UTC)[reply]
We conveniently don't have rollback policy, so I go to Meta.

Rolling back a good-faith edit, without explanation, may be misinterpreted as "I think your edit was no better than vandalism and reverting it doesn't need an explanation". Some editors are sensitive to such perceived slights; if you use the rollback feature other than for vandalism (for example, because undo is impractical due to the large page size), it is courteous to leave an explanation on the article's talk page or on the talk page of the user, whose edit(s) you have reverted.

So, at the very least, SemperBlotto is being discourteous and BITEy. I think it's time we got rollback policy of our own, and I propose that we follow the lead of EN and most other WikiMedia projects and state that rollback is for vandalism only Purplebackpack89 (Notes Taken) (Locker) 06:34, 26 February 2012 (UTC)[reply]
I repeat what Mglovesfun said in WT:FEED: "Something doesn't have to be vandalism to be removed, it just has to be bad. If the version rolled back to is better than the previous version I support it. Wikipedia seems to have a habit of prioritizing contributors over its articles, I'd be delighted if we didn't do the same here." - -sche (discuss) 07:43, 26 February 2012 (UTC)[reply]
@Purplebackpack89, rubbish, anything can be rolled back. I've rolled back my own good faith edits before, therefore should I lose my admin privileges?! You're making the classic mistake of assuming that we're Wikipedia, and we're not. I hate the idea that someone who makes a good faith bad edit is immune to having that edit removed; we might as well say we welcome bad edits. Mglovesfun (talk) 12:13, 26 February 2012 (UTC)[reply]
Um, there's still the undo button, and regular editing to get rid of good faith bad edits. The point is it ain't right for Semper to remove something like that without bothering to explain why Purplebackpack89 (Notes Taken) (Locker) 17:02, 26 February 2012 (UTC)[reply]
@Purplebackpack89
  1. meta:Rollback says "Rollback works much quicker than undo" and explains further. That's a great reason to use it.
  2. About "without bothering to explain why", see my message below, signed and dated "13:03, 26 February 2012 (UTC)"
--Daniel 10:19, 28 February 2012 (UTC)[reply]

That Meta page seems to be just a help page. So it would not be a policy page on Meta; and, either way, it's definitely not a policy on Wiktionary. Even if it were a policy, it does not say "Rollbacking one good-faith edit is grounds for revoking rollback rights." The section you (Purplebackpack89) copy-pasted here is worded as an essay, rather than a rule. And the whole page focuses on Wikipedia, with jargon like "article" (we say "entry"), "encyclopedic" and "the processes in dispute resolution".

In particular, the idea of always explaining about reverts on users' talk pages looks somewhat good on paper, but:

  • It would be very cumbersome to implement: sometimes we do that, but there are so many edits to be reverted and so few people to do the work (mostly Semper alone).
  • Here it would be useless most of the time. Wikipedia has long articles, with their wordings, coverage, "notability", extensive sections and so on. When an edit (particularly a big edit) of Wikipedia is reverted, it can be difficult to determine why, unless someone explains. When an edit in Wiktionary that fits the standardized system (is formatted with the right sections, lines and is not gibberish like "glrbglblggbrlb" or "LOL FAG") is reverted, the obvious justification commonly is "This entry would be better without these new five or twelve words that you added." If you defined metro as # Abbreviation of [[metropolitan]]., then the obvious "explanation" implied in SemperBlotto's action is "I believe 'metro' is not an abbreviation of metropolitan." (or this variation: "I believe you should not say that metro is an abbreviation of metropolitan.") Do you really need more than that?

You already came here and got your explanation. Your edit is gone, as it should be. It doesn't matter whether the rollback function did it, or it was the "undo" button, or that a meteor crashed on the servers and flipped a few bytes. If it was really in good faith, I suppose you can accept its short life and move on.

P.S.: meta:Rollback says "If your material is reverted, don't take it personally." --Daniel 13:03, 26 February 2012 (UTC)[reply]

WT is unlike WP in that it's fairly liberal about references. Being more lax with references means being quicker to revert edits. The only support for contributions without references comes from the approval of other editors, and the edit in question failed that test, so if it's really a good edit, someone must provide some form of verification. --Haplology (talk) 13:48, 26 February 2012 (UTC)[reply]

We do have Help:Reverting, which mentions that "Reverting vandalism is obviously acceptable, as is reverting copyright violation and edits that do not conform to our Criteria for inclusion." (italics mine). Maybe SemperBlotto removed it because you placed that definition in the Noun section. Ungoliant MMDCCLXIV 14:19, 26 February 2012 (UTC)[reply]

I wholly support what SB did. If such an edit of yours is contested in future, add cites or lump it. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 16:10, 26 February 2012 (UTC)[reply]

Quoting Purplebackpack89 "That's a misuse of rollback to do that". No it isn't. In your opinion, I'm sure it's a misuse, but on this wiki there's no rule about it and as this discussion has shown, there's no consensus to consider it a misuse, the opposite in fact, so may I suggest now is a good time to drop the matter entirely. As I like to say, if you don't want your bad edits reverted, don't make any bad edits. Mglovesfun (talk) 19:53, 26 February 2012 (UTC)[reply]
@Purplebackpack89 Troublemakers should not be welcome. When a person spends more time arguing and proving his/her point than learning how to make good edits, then it takes the precious time off editors who know how to edit well. Rolling back is not blocking, anyway. Please learn to deal with it. --Anatoli (обсудить) 01:09, 7 March 2012 (UTC)[reply]

Brand names and physical product

I have created Wiktionary:Votes/pl-2012-02/Brand names and physical product, as several people act in RFV as if the wording of "physical product" were not part of WT:BRAND. Thus, there is some support for getting the wording of "physical product" removed, and let us see how big that support really is. I am going to oppose.

I have left the rationale empty. Those who support the proposal have to come up with a rationale, or leave it empty if they oppose rationales in votes.

My rationale for opposing the removal is that the wording makes the already needlessly exclusionist WT:BRAND even more exclusionist, disregarding lexicographical merit of entries. If I could decide, I would drop WT:BRAND rather than making it stronger. Unfortunately, WT:BRAND has been voted on.

Feel free to postpone the vote should the discussion last until the planned start of the vote, which is 4 March 2012. --Dan Polansky (talk) 17:05, 26 February 2012 (UTC)[reply]

Interesting idea, and I agree that a product does have to be physical in some way. Though physical doesn't need to mean tangible, electricity could be physical for example. But something which is an idea can not on its own be physical. Bugs Bunny isn't physical, though manifestations of it can be physical, such as a toy. Mglovesfun (talk) 19:40, 26 February 2012 (UTC)[reply]
Would a book title, exclusively distributed by electronic means, be a branded product by that reasoning? What about the physical representation of an idea in the brain? What about the physical representation of a brand as a sequence of letters on a piece of paper or a storefront? DCDuring TALK 20:08, 26 February 2012 (UTC)[reply]
That's what I mean, anything can be represented physically, but the representation is not the thing itself. If I write the word chair on a piece of paper, it's not a chair. But if I hold a can of Lynx deodorant in my hand it is a can of Lynx deodorant. Geddit? Mglovesfun (talk) 23:14, 26 February 2012 (UTC)[reply]
I think I geddit. I also think brands are different from physical entities, but they are embodied in physical objects. "Tony the Tiger" is a trademark associated with Kellogg's Frosted Flakes. What about a "Bugs Bunny" doll? What about "Warner Brothers" or "WB"? What about a patch with "John Deere" or "Citibank" on it? What about an envelope or letterhead stationery with a brand name and logo? DCDuring TALK 23:58, 26 February 2012 (UTC)[reply]