
Wiktionary:Beer parlour/2004/April-June

From Wiktionary, the free dictionary
This is an archive page that has been kept for historical purposes. The conversations on this page are no longer live.

Concordances

Could someone in authority rename the Wiktionary:Concordances page (and the Sherlock Holmes list it points to) to "Concordances" and "Concordance of Sherlock Holmes"? Also, how about adding the Concordances page to the list of appendices on the main page? It appears that words added through this list are all "orphans" and not indexed by Google etc. —Długosz

Taken care of. I hope no pages were linking to these pages. Keep up the good work! Polyglot 22:06, 6 Apr 2004 (UTC)

Statistics

I visited 100 random pages using the random page link to the left. This gives me an estimate as to how many definitions are really present. That is, the main page currently states "already have 35811 entries in the English version."

  • 59% of them, well over half, are actually Chinese. The person who thought this was not an English edition has good reason!
  • 15% of them are other non-English entries.
  • 3% are proper names and scientific terms.
  • 1% are not definitions at all.
  • 22%, which comes to about 7880 of them, are actually definitions for English words.

So what? Perhaps effort is best spent on translation and other things not found easily in existing dictionaries. But it would make sense to have filters for "random page" for someone just browsing who likes to read dictionaries.

Update, with 40104 entries in the English version.

  • 50% Chinese
  • 13% other non-English
  • 5% are proper names
  • 5% are non-definitions
  • 27%, or about 10828 of them, are definitions for English words.

Or, of 4293 new entries, 2948 of them, or 69% are English words; 1076 of them, or 25%, are more Chinese entries. This implies that the Chinese bot is still at work.


Długosz
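
The estimates above are simple scaling: each percentage seen in the 100-page sample is multiplied by the entry count shown on the main page. A minimal Python sketch of that arithmetic follows. The function name `estimate` is made up for illustration, and with only 100 random pages per sample the results are rough. Note that with these rounded percentages the Chinese estimate works out lower for the second sample than for the first, so the per-category differences between the two samples are best read with caution.

```python
def estimate(total_entries: int, sample_percent: float) -> int:
    """Scale a percentage observed in a random sample up to the full entry count."""
    return round(total_entries * sample_percent / 100)


# Entry counts and sample percentages quoted in the posts above.
first_total, second_total = 35811, 40104

english_first = estimate(first_total, 22)
english_second = estimate(second_total, 27)
chinese_first = estimate(first_total, 59)
chinese_second = estimate(second_total, 50)

print("new entries:", second_total - first_total)
print("English estimate:", english_first, "->", english_second)
print("Chinese estimate:", chinese_first, "->", chinese_second)
```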

The first thing that this shows is that the Chinese did a very good job of entering their material. It's reminiscent of the points that were raised in Wikipedia when the various articles were added about small towns in the United States. My own feeling is that more people should be setting up the English dictionary entries for English words. Before we can really offer something that is not found in other dictionaries we need to have a base of material to work from; that won't be there until the English corpus has reached a critical mass. Eclecticology 01:55, 13 Apr 2004 (UTC)


So how to hold a membership drive? —Długosz
Slashdot is a great medium, but something newsworthy needs to happen then. (like servers that crash all the time...) I'm convinced that having appeared on Slashdot around Christmas has given the project more recognition. Maybe when we reach 50000 entries, we can throw a party... You can also tell about it on newsgroups dedicated to languages or linguistics. (Probably a better idea).
As far as this distribution goes, those Chinese entries were entered with a bot. If you really want to, I'm sure you can find some source material as well. Create a bot for entering it and create a lot of stub articles. I don't know if this would do much good, but at least then there will be more English content.
Also take into consideration that the English Wiktionary was the first Wiktionary. The other language Wiktionaries will have far less foreign language content and also the English Wiktionary will start attracting less foreign language content as more other language Wiktionaries get started.
If I feel like describing my own language, I won't hesitate to do so. I will probably also add the English counterpart article to add translations to other languages, but I might also not do so. I don't have to.
I don't worry about the pace at which this project evolves. It will probably keep accelerating as more and more people join in. Those English words will get there. Rome wasn't built in a day either and it's better to have a sure but steady pace than to try to do all at once and then collapse (or have people give up) because it's not manageable anymore.
Keep up the good work! Polyglot 08:28, 15 Apr 2004 (UTC)
PS: I use Mozilla and whenever I use the random page button, I click it 20 or 30 times with the middle mouse button. Then I simply discard the Chinese character entries, since they are of no use to me either. I'm glad we have them though.
Spoken like a true veteran!
I generally agree. I prefer the smaller community where we can still treat each other with respect, and without being in a big hurry to have everything done by yesterday. I don't think we need a recruiting drive; it's much better when people come one at a time and can be integrated in the culture. I really don't like a lot of what's been happening at Wikipedia -- too many control freaks. Although NPOV applies just as much here as there, I'm very glad that it's never been a big problem. Eclecticology 21:34, 16 Apr 2004 (UTC)

How to show non-English inflections, conjugations, distinctions, etc?

I am quite new to this, so if this comment is placed in an inappropriate spot, I sincerely apologize in advance.

I love the idea of Wiktionary, and since I'm quite new here I am quite certain that I do not as of yet have a complete grasp of what it is all about.

However, I am wondering why free-form entries similar to those of Wikipedia are the default method of data entry. With a dictionary, it is very easy to have a core subset of fields (e.g., part of speech, noun gender and animacy, verb aspect, et cetera) that are required for a basic entry and can easily be prompted for. If a language doesn't support certain fields (as is the case with animacy in English) or a user doesn't know what to enter, the fields can always be left blank. There can also still be the ability for users to enter free-form content (as is done now).

Would not having a standardized "stub-creation" interface help out a great deal in the creation and upkeep of Wiktionary entries? It would also go hand-in-hand with the idea of storing entries with XML and then formatting them as XHTML for browsers, as has been suggested here already. Where can I go for discussion about this sort of thing to try to help out with it?

Also, as I initially meant to ask, how can one link together disparate words as all being conjugations of one verb or inflected forms of one noun? I am thinking primarily of Russian, as someone trying to look up исчез won't necessarily know to search for исчезать/исчезнуть (both verbs mean "to disappear", but they are of different aspect, and исчез is a past tense form of only one of them). The same goes for его/ему/им versus он - all are inflected forms of "he", and I do not believe that they should get entries independently of он. -Nollidj

Something should be said about the conjugation on the main page for a verb. In some cases full details should be given, but in a regular verb it should be enough to make a link to the verb that serves as a template. You have chosen a good example for the argument that separate pages are needed for the third person pronouns. A person with very limited understanding of Russian may be unable to recognize that его is the genitive of он. They don't look the same at all. The его page should explain that. It should probably be a fairly short page that links back to он where the pronoun could be discussed at length. Eclecticology 07:22, 22 Apr 2004 (UTC)
Well, in the headword, if it is declined or conjugated regularly, ideally you link to a page explaining the declension. (Not too many such pages are up: but Wiktionary Appendix:Declensions is the index of declensions, while Wiktionary Appendix:Conjugations doesn't exist yet.) If a word is irregular, you put the full set of forms on the page itself (if they are not many, on one line like Ich, and if they are, in a table like the ones in Wiktionary:Inflection Templates). ... As for putting inflected forms on their own pages, usually I only see this being done if the word is irregular (e.g. I, me, my) or already means something else (e.g. the French entry in lente)—usually such entries do just link back to the dictionary form of the word with a note saying what part they are. —Muke Tever 07:43, 22 Apr 2004 (UTC)
Would it be possible to get some sort of example entry that includes the various conjugations of a verb (probably an irregular one) like the Spanish ser? I am looking for a sample page like that of ich.
See essere for an example of a page with an irregular verb's conjugation table. —Muke Tever 23:44, 23 Apr 2004 (UTC)
Should one be noting that verbs are irregular and, if so, in which tenses they are? How should this be done in a uniform, standardized fashion? I have similar questions regarding how to show verb aspect and animacy, which I mentioned before. Is there some way to see an uber-template that will act as a guideline for Wiktionary entries?
Incidentally, for Spanish verbs compjugador [1] would be an excellent resource, provided that it is permissible to include the material it contains and can generate in Wiktionary. -Nollidj
These templates need to be set up separately for each language. Work is being done on the French Wiktionary for that language. The idea of an "über-template" can only exist in dreams. The translations of verbs between languages are not a matter of one-to-one correspondences. Go ahead. Begin by setting up a template yourself in a language that you know. Start with a regular verb that can be a pattern for others. The irregular verbs, notably to be and its translations, are too different to be a useful starting point. Eclecticology 23:37, 23 Apr 2004 (UTC)

Concern over "potential words"

I'm coining the term "potential words" here to refer to words that can be formed by the rules of word formation but that do not exist or are not used in English. An example would be a plural of an uncountable word (eg, "happinesses"). This word follows the rules of pluralisation of nouns, but not the rules about uncountable nouns (namely, that they do not have plural forms).

I raise this because a user has created all the pages (other than those already existing) of the form <SI_prefix> + "second". If you refer to the appendix on SI units ([2]) you see that I make the point that not all combinations of <SI_prefix> + <SI_unit> are used, for various reasons. This is particularly true of "second", for which the pre-existing non-decimal multiples "minute", "hour", etc. are used instead. I dispute the existence of "hectosecond", "megasecond", etc and claim that any scientist would use "100 seconds" and "11 days, 13 hours, 46 minutes and 40 seconds" instead. I suggest that there is little or no evidence for the existence of these terms, and have contacted the user asking him if he has any quotations. A brief Google search suggests that these words exist only in word lists rather than having any real currency.

What do other users think would be the appropriate approach to deal with the proliferation of "potential words" (whether or not those that I mention here are in fact non-words) in Wiktionary? -- Paul G 07:00, 25 Apr 2004 (UTC)
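
The figure quoted above for a megasecond can be checked with a few lines of arithmetic. This is only an illustrative Python sketch; the helper name `breakdown` is made up here.

```python
def breakdown(seconds: int) -> tuple[int, int, int, int]:
    """Split a number of seconds into (days, hours, minutes, seconds)."""
    minutes, secs = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    days, hours = divmod(hours, 24)
    return days, hours, minutes, secs


print(breakdown(100))        # a "hectosecond"
print(breakdown(1_000_000))  # a "megasecond"
```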

These "potential words" are all theoretically correct, and about as useless. They are also harmless. The contributors are engaging in the passion of stating the obvious, perhaps tinged with the delusion that the infinite is an attainable place. You are fighting a losing battle. Human nature is such that continued complaining about such matters will only encourage them to do more of it. C. Northcote Parkinson is famous for saying that work expands to fill the amount of time available for doing it; thus, if one expands that to matters of infinitesimal importance the work thereby created will be infinite.

Eclecticology 18:39, 25 Apr 2004 (UTC)

Thanks, Eclecticology. This is what I have come to think since posting on this subject. The page on SI units makes it clear what is used and what is not (although it might not be up to date or complete). I suppose the other words could be marked as "rare". -- Paul G 08:11, 26 Apr 2004 (UTC)

Single-letter entries

I've just noticed again how much I hate the alphabet thingies at the top of articles whose titles are a single letter, outside the language sections. Each single letter should have its own entry under each language which uses it. It should have its own part-of-speech, "Letter", and should be shown in both upper and lower case. Greek and Russian homographs could be shown when they exist, and etymologies showing where each letter evolved from and when it was included in each language. (Apologies for sounding like a rant - I guess I am ranting...) — Hippietrail 00:15, 26 Apr 2004 (UTC)


I know how to link to wikipedia from wiktionary, but how about the other way around? For example, w:Great_Red_Spot has an undefined link to rotates, and that doesn't seem like something that needs to be defined in an encyclopedia, so I thought it might link to wiktionary instead. Of course, it might be better not to link to a common word, but that is the example that got me thinking about it.

You can use something like [[Wiktionary:rotates]] to link to the Wiktionary page on rotates from Wikipedia. (On Wiktionary itself, this creates a link into the Wiktionary: namespace.) But note that the link in "Great Red Spot" is actually to "rotation". Eric119 23:21, 26 Apr 2004 (UTC)

Inserting New Definitions Without Renumbering?

When someone inserts a new definition (not appends, but inserts it in front of an existing one), shouldn't he also renumber the respective translations? It has just happened that someone without a user name inserted two new definitions to football and left the translations out of sync. Suddenly the Slovak word lopta appeared to mean Australian football while in reality it means the ball used to play the game of football. Luckily, I noticed it and fixed it. But other languages apparently have the same problem; alas, not speaking them, I have no way of knowing whether someone else has fixed them already, so I did not touch them.

If we insert new definitions and leave the translations out of sync with them, we render the entire dictionary unreliable. Having seen this, I can no longer trust that any translation of any word in this dictionary is correct.

Red Prince 13:56, 27 Apr 2004 (UTC)

I think you're fighting a losing battle; the problem has been there from the beginning and can only get worse. The definitions are soft numbered to allow new definitions to be added in an orderly fashion. Adding a new definition at the end of the list, when in reality it is much closer to an earlier one on the list, would make the whole definition section difficult to follow. It is important to remember that the English definitions in this dictionary are the core of any article.
A person who wants to expand the definition of a word probably knows nothing about these translations, and is not about to sort these translations out. The problem is with having hard numbered references in the translations. One possible solution is to have a separate list of translations apply to each definition. The other might be to use soft indirect numbering in the translations; these would be changed automatically if the definitions are renumbered. Unfortunately, I am in no position to know whether the second option is technically feasible. Eclecticology 17:21, 27 Apr 2004 (UTC)
I understand that the person inserting does not know all the translations. But the numbers already refer to the pre-existing definitions. As long as those definitions do not change, then their new numbers will be the correct new numbers for the existing translations, so it is a strictly mechanical matter to change them.
As for the technical feasibility, it is certainly possible to program it that way. But that does not necessarily mean that someone will make it his or her programming priority. It would require adding short codes, such as a.b.c. which would not appear in the final HTML and which would not necessarily require an alphabetical order, but they would be replaced with the correct numbers. However, I was not really trying to make a software change request, only that people who insert new definitions do the math and redo the numbers. Or else, they should make a very clear note in the summary that they inserted a definition and got the system out of sync, so the rest of us can resync it when we are around.—Red Prince 17:34, 27 Apr 2004 (UTC)
Making those changes is not that easy. Soft numbering only shows up as a "#" on the edit page. Adding a new second definition where there are already 8, means that all the numbering gets shifted down. Putting the numbering back in sync across translations in 20 or 50 languages is no easy task for a person who was only coming to add one definition. A newbie probably doesn't even realize that he has put the system out of sync. It all comes down to designing the system so that putting it out of sync would be impossible. Eclecticology 21:02, 27 Apr 2004 (UTC)
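
The "strictly mechanical" renumbering discussed above can be sketched in Python. This is only an illustration under stated assumptions: the data layout (a language name mapped to a list of sense-number and word pairs), the function name `shift_sense_numbers`, and the sample sense numbers are all hypothetical and do not reflect how Wiktionary stores entries.

```python
def shift_sense_numbers(translations, inserted_at):
    """Return translations with every sense number >= inserted_at moved up by one.

    translations maps a language name to a list of (sense_number, word) pairs.
    """
    return {
        language: [(n + 1 if n >= inserted_at else n, word) for n, word in pairs]
        for language, pairs in translations.items()
    }


# Hypothetical state of an entry before a new definition is inserted as number 2.
before = {"Slovak": [(2, "lopta")], "Swedish": [(1, "fotboll")]}
after = shift_sense_numbers(before, inserted_at=2)
print(after)
```

The shift itself is trivial; the difficulty raised in the thread is that an editor has to apply it by hand across every language listed on the page.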


Somewhat unrelated; probably already suggested, and a source of loads of programming work I'm afraid, but nonetheless I suggest:

  1. That every translation shows up immediately beneath the corresponding English definition. (no need of referring to different numbers).
  2. That a row of language links is added to the top of every page. When a user clicks on any of those links, then the page shows the translations into that particular language. If multiple links are clicked, multiple languages are shown. (The pages won't be as long to scroll.)
  3. Logged in users might have a preference set so that one (or more?) specified language(s) shows up by default.

Then all those who only want to use wiktionary as a purely English dictionary don't get confused about long lists of words in everything from Swahili to Norwegian to French, it will be easier to find the wanted definition, and we get rid of the confusion of whether to wikify the language links. \Mike 21:05, 27 Apr 2004 (UTC) (Oh, and since I shouldn't put this suggestion here, where should I put it instead? I don't want to put it on meta until I get some comments...:)

Would work for me.—Red Prince 21:27, 27 Apr 2004 (UTC)
Your first point essentially reflects what I said above. It would result in some duplication, but that is definitely a lesser evil.
The "row of language links" has a potential ambiguity. Do we mean a link to the word in this Wiktionary, or in the separate Wiktionary for that language?
I'll leave it for the more technically minded to address your third point. Eclecticology 21:55, 27 Apr 2004 (UTC)
To clarify: the 'language links' as I called them in lack of better words, would work somewhat similar to the TOC, which you can hide/show by clicking on a link. Imagine X TOC:s (one for every language), which -when shown- displays its information split up in Y parts, one under each english definition.
Hope this clarified what I was trying to explain... \Mike 07:54, 28 Apr 2004 (UTC)

Firstly my apologies. That was me on the football article last night. I was planning to do the renumbering but I was interrupted by instant messenger and forgot to do it. Sorry about that. I have seen it quite a few times before I did it myself so it is a problem.

This is one of the problems of the lack of structure on Wiktionary. I think we need to specify numbers, letters, or codes of some kind for each definition and for each translation. These codes could be rendered as numbers or even be renumbered themselves, and the article's source automatically rearranged in the correct order.

But yes, any way to fix it would be significant work on the Wiki code. I'll try not to make such messes again but really somebody always is going to come along and do it. In the meantime we can use the diffs to see when the change occurred and which numbers need to move. Apologies once again. — Hippietrail 00:37, 28 Apr 2004 (UTC)
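
The idea of giving each definition a stable code, with the visible numbers computed only at display time, might look roughly like this in Python. It is a sketch under stated assumptions, not a description of the wiki software: the function `render`, the codes `game`, `ball` and `aus`, and the definition wording are invented for illustration.

```python
def render(definitions, translations):
    """Number definitions by position and label translations by their sense code.

    definitions: ordered list of (code, text) pairs.
    translations: {language: {code: word}}.
    """
    number_of = {code: i for i, (code, _) in enumerate(definitions, start=1)}
    lines = [f"{number_of[code]}. {text}" for code, text in definitions]
    for language in sorted(translations):
        by_number = sorted(
            translations[language].items(), key=lambda item: number_of[item[0]]
        )
        parts = [f"{word} ({number_of[code]})" for code, word in by_number]
        lines.append(f"{language}: " + ", ".join(parts))
    return lines


# Hypothetical entry: the translation refers to the code "ball", not to a number.
definitions = [("game", "a game played with a ball"), ("ball", "the ball used in that game")]
translations = {"Slovak": {"ball": "lopta"}}
print(render(definitions, translations)[-1])

# Inserting a new definition in the middle leaves the translation data untouched.
definitions.insert(1, ("aus", "Australian football"))
print(render(definitions, translations)[-1])
```

Because the translation points at a code rather than a position, inserting a definition cannot put it out of sync; only the displayed number changes.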

Yet another page that has been edited so that the translations no longer fit the definitions is "stem", which has five definitions but Swedish translations numbered from 1 to 7. (By the way, I have just modified the definitions on this page, but the Swedish translations didn't fit even before my changes.)
It seems a shame that people (myself included) put a great deal of effort into writing translations only for these to fall out of synch the moment anyone adds, removes or reorders the definitions.
I propose an alternative, which has the advantage of not requiring any programming. It might be used as an interim measure until such time as one of the suggestions above can be implemented.
My alternative suggestion is the combination of definitions and translations in a table, with a column for each sense and a row for each language (there usually being fewer senses for a word than languages into which it might be translated).
The clear advantage is that translations remain synchronised with definitions when new ones are added. It is also clearer which senses are still translations.
One disadvantage to this scheme is that pages might become rather wide, and so the user would have to scroll across to view everything. A solution would be to restrict the width of the columns so that the definitions were displayed over several rows of text with a few words in each row.
Another disadvantage is that users modifying the tables would need to manage the tables somehow, either via HTML or typing in the Wiki equivalent to draw boxes around entries. Either way would require some learning and would make it less easy to enter data. Users could, however, just enter new content as plain text and allow someone else to insert it into the table.
Short of reprogramming the way data is entered into and displayed in Wiktionary, I don't see another workable solution.
What do others think?
Paul G 13:30, 13 Jul 2004 (UTC)

I've implemented my proposal. See the new discussion thread below. — Paul G 18:16, 15 Jul 2004 (UTC)


More Options in Preferences

I've been using Wiktionary for several months now, and although most of it is going great, I really hope that there will be more options in Preferences. One thing is this: When I search for a Chinese character, I get all this:

   * Radical Number 96: 玉+8 strokes 
   * Stroke number: 12 
   * Stroke order: 
   * Four-Corner System: 14181 
   * Cangjie input: 一土廿一金 (MGTMC) 
   * Graphical Significance and Origin: 
   * Common Meaning: type of jade 

The thing is that I don't need all that, as I am just looking for the Chinese Hanzi section. Maybe you could add a way for me to either take out the entire first part of the search result (kind of like how in Preferences I can uncheck the box that says "Show table of contents") or maybe just have an option where I can move the sections around (example: if I want the Japanese Kanji first, then the Chinese Hanzi, I can just move it around to fit my needs). Thanks.

As much as I can sympathize with your problem, I can't see the solution coming very soon. I'm not a software programmer, so I wouldn't know how to do this. You could raise the issue on the Wikitech-l mailing list. Eclecticology 16:54, 27 Apr 2004 (UTC)

Simple English

There is now a Wiktionary for every language that has a Wikipedia, including a Simple English Wiktionary. It's not entirely clear how this will fit in with the Simple English Wikipedia. Up until now, that Wikipedia has taken a different approach to the main English one, and the policy on dictionary definitions has been far more relaxed. Some feedback from Wiktionarians on whether that Wikipedia's definitions should be moved to the new Wiktionary or not would be welcomed. Another question is whether a simple English Wiktionary would be significantly different from the main English one. Is there a need to separate them?

I've copied this to Simple Wiktionary's "simple talk" page and invited those on the Simple English Wikipedia to comment there as well. Please add replies there so this can be kept in one place. Thanks. Angela (from the Simple English Wikipedia) 14:18, 1 May 2004 (UTC)


Non-English Interwikis

Does anyone know if there is a way of using the wiki syntax to link directly from the Swedish Wikipedia to the newly created Swedish Wiktionary? The [[Wiktionary:Article]] links to this wiki. / Mats Halldin 16:05, 3 May 2004 (UTC)

right now manual link via [http://en.wiktionary.org English] seems to be it... -- EmperorBMA|話す 05:00, 5 May 2004 (UTC)

I managed to link to the Swedish WT:RC from my WP user page by using [[wiktionary:sv:Special:Recentchanges]]! But now there is a problem with the Wiktionary-namespace on sv:WT and those pages can't even be opened within our WT. Do you, or anyone else, know how to deal with this? / Mats Halldin 19:27, 5 May 2004 (UTC)

Normal interwiki works now... -- EmperorBMA|話す 23:29, 5 May 2004 (UTC)

Thank you... The wiktionary namespace works now, I just created [[wiktionary:sv:Wiktionary:Bybrunnen]] on my WP user page - but all the earlier content in those wiktionary articles on sv:WT is gone. In our case it's probably not a big issue, though! (I hope I manage to make myself understood in English.) / Mats Halldin 10:51, 6 May 2004 (UTC)

I see the same behaviour after our wiktionary namespace on NL was enabled. All I put in there is gone now. Not the end of the world, but not very funny either. Polyglot 07:04, 7 May 2004 (UTC)
[[:en:Word]] (note the leading ':') links to Word in http://en.wiktionary.org. Similar for other languages. [[wikt:]] goes from wikipedia to the wiktionary of the same language, w: goes from wiktionary to the wikipedia of the same language. I've queried TimStarling to fix the pipe trick problem that [[:en:Word|]] does not work; it obviously should map to [[:en:Word|Word]] for workable transwiktionary action --Juxho 06:59, 10 May 2004 (UTC)

Interwiki (how will we use it)

What should we do when using interwiki? Should we link the foreign word to the English article for the foreign word or to the English word itself? -- EmperorBMA|話す 23:27, 5 May 2004 (UTC)

I think we should just link exact wordforms. If there is an article called "foop" on the English wiktionary with entries in however many languages, the interwikis can link to whichever other wiktionaries also have an article called "foop" with entries in whichever languages. Linking translations should not be done with interwiki. — Hippietrail 01:39, 6 May 2004 (UTC)


The problems will be when we have articles like this:

letter

  1. a glyph representing a sound
  2. a written paper sent via the mail
How do we link to both foreign meanings which could have entirely different ways of being written? -- EmperorBMA|話す 21:47, 6 May 2004 (UTC)
The translations would go into the =Translations= section. You won't use interwiki links for translations, you'll use the interwiki link to link to the definition of the word spelled "letter"—e.g. es:letter would have:
letter (inglés)
  1. letra
  2. carta
To put it another way: an interwiki link is to find a definition in another language, of the same word, not of the word that means the same in the other language. The reason this is important is that otherwise you come to an article like 五 in the English wiktionary... how are you going to link that to Spanish, for example? es:cinco is clearly not the way... the only way that makes sense is es:五. —Muke Tever 22:17, 6 May 2004 (UTC)
Makes sense to me... (I think we should codify the "interwiki rules" so newcomers will know how to do them)... -- EmperorBMA|話す 22:32, 6 May 2004 (UTC)


This is why I had proposed to make those interwiki links de facto. Built into the software and providing links automatically to all existing Wiktionaries (which are just a bit more than I expected, but OK). If they link between words of the same spelling, it doesn't make a lot of sense to have to code them by hand into each and every page. If it doesn't get built into the software I will create a python script to build up those lists automatically. So we can generate them and copy paste them more easily. Polyglot 07:04, 7 May 2004 (UTC)
I absolutely agree. I must have missed this discussion because I certainly would have voted the same way. When I saw that it was already done I thought "oh well they've already decided so let's just do it that way" - but now if people are going to be confused and start linking translations instead maybe it should be put into the software. — Hippietrail 02:17, 8 May 2004 (UTC)
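
A script of the kind mentioned above could be very small, since same-spelling interwiki links differ only in the language prefix. The following Python sketch is an assumption about what such a script might do, not the script Polyglot had in mind; the function name `interwiki_links` and the list of language codes are illustrative.

```python
def interwiki_links(title: str, lang_codes: list[str]) -> str:
    """Build one [[code:title]] interwiki link per language code, sorted by code."""
    return "\n".join(f"[[{code}:{title}]]" for code in sorted(lang_codes))


# Illustrative language codes; a real list would come from the existing Wiktionaries.
print(interwiki_links("foo", ["sv", "de", "nl"]))
```

Note that this emits a link for every code regardless of whether the target page exists, which is the concern raised about linking to pages that are not there.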

There seems to be a consensus that the interwiki from the English word en:foo should link to de:foo, sv:foo and so on. To simplify this, I created {{interwiki}}, which (by now) provides interwiki links to the 10 largest wiktionaries (after en: :). I chose to make this limit for now, as I don't want to clutter up with loads of unused interwiki links, but OTOH it is easy to add other lang's... \Mike 09:00, 24 Jun 2004 (UTC)

But then we'll have lots of links shown for pages which aren't really there. It seems kind of silly to make a need for editors to add these when we're just going to link every wiki on every page whether an article exists or not. — Hippietrail 13:14, 24 Jun 2004 (UTC)
I second Hippietrail. The use of the interwiki links is to signal that there are other places to seek information. Andres 13:35, 24 Jun 2004 (UTC)
I just thought that it would reduce the amount of work needed when writing the articles. Somehow, I thought (and still think) that might be a good idea, but if you think these links are more annoying than helpful, then please delete the template and I won't remake it. I only put it in the article pear to see what it would look like. \Mike 13:57, 24 Jun 2004 (UTC)
<RANT MODE>And on a side note, if I may express an incoherent train of thought, I think we could benefit from interlang' links in red if the target doesn't exist, just as with ordinary links. Then we could use such a template as I created, and in my case I would use it while browsing en: to see whether any given Swedish or English word exists in sv; and if not: quickly get there to add it. But, as said, it is just my rambling thoughts not to be taken seriously since I suppose that coloring the interwikis requires too much code to be written, so it won't happen within the foreseeable future...(please delete this ranting if you think it is necessary.) </RANT MODE>
I think the idea is not bad. I guess this is not at all difficult to code. The need for this feature simply didn't exist before. Andres 14:10, 24 Jun 2004 (UTC)
Maybe or maybe not so difficult to code, but it would be quite a load on the already strained servers. This has been enough for us to reduce the number of links on the English Wiktionary alone. — 203.108.239.12

There has GOT to be an easier way...

Behold modem and its definition repeated 5 times. Is there not a simple way we could symbolize that a definition is the same as the English one instead of endlessly repeating the same definition? Considering that modem is the same in about 30 more languages, we have a long way to go... But I digress... Couldn't we just say "ditto" or "Same as English" if the definition is repeated? -- EmperorBMA|話す 22:11, 6 May 2004 (UTC)

The easier way is to put just modem in the definition, just like any other English-Anything dictionary. (Some pages do this, e.g. vijf has a short definition, not the full one five has.) But some people have apparently found it more useful to copy out the definition in full... admittedly this is more useful when the English definition is not on the same page, or when the correspondence is not exact. —Muke Tever 22:19, 6 May 2004 (UTC)
Words with the same spelling tend to be pronounced differently in different languages. In German, if it is a noun, it is also capitalized. Most probably the plural and diminutives are formed in a different way. Sometimes there will be a different extra meaning in some languages. Sometimes there will be synonyms. Believe me, this is the most sound/logical way of doing this. At least if you want it to be exportable/interpretable by a script of a reasonable size (i.e. without having to code too many exceptions). Of course, this is my agenda, I plan to do this one day in the far future. Maybe you think this is not important, but it is what makes it worthwhile for me to contribute to this project. One day I will be able to take the content and package it differently (in a relational database, I have the database schema, but I still need to code an interface for it). Of course it will be free in that form as well.

About words like modem: if you think about it, there aren't 500 words like that, so it's really more of an exception. The names of the more obscure elements come to mind, and the names of the planets. I think we should keep encoding it this way. Polyglot 06:23, 7 May 2004 (UTC)


Romanian Wiktionary problems

Hi. The definitions counter in the Romanian Wiktionary does not work - we already have around 10-15 entries and the counter shows up as '-1 articles'. Why is this so? Also, would it be possible to change the Wiktionary: namespace on RO: to "Wikţionar:", and to change the caption "Wiktionary, the free dictionary" to "Wikţionar, dicţionarul liber"? Thanks, Ronline 08:32, 7 May 2004 (UTC)

I don't know about the articles counter, it's doing the same thing in the Latin Wiktionary. The same for the Wiktionary namespace and the caption... that at least depends on the wiktionary sysops... if there aren't any on ro: yet, they have to be requested to be made on meta.wikipedia.org somewhere. —Muke Tever 15:58, 7 May 2004 (UTC)

Japanese

Excuse me, but where is the Japanese Kanji index? --Samuel 04:04, 12 May 2004 (UTC)

Well, we have a "Chinese" radical index which works for Japanese Kanji and Korean Hanja too. Maybe it should have a less biased name. We do have Japanese indexed by pronunciation too. Start browsing here: Wiktionary:Chinese radical index and Wiktionary:Japanese index -- Hippietrail 04:29, 12 May 2004 (UTC)
For some time now I've suggested migrating all language based indexes to the pseudo-namespace Index:. Would something like [[Index:CJK radical 玉]] be an acceptable format for migrating this batch of indexes? Eclecticology 04:58, 12 May 2004 (UTC)
I agree that all such things should be under Index:. For CJK radicals I have no idea if Chinese and Japanese dictionaries use identical sets of radicals or not. I know some stroke-counts differ. If the sets of radicals are the same then one such index should do. If they differ then one per language is better. Before we move them we should await comments from our CJK experts and also make sure that other indexes fit into the namespace in analogous ways. — Hippietrail 05:17, 12 May 2004 (UTC)
I see nothing in my Japanese dictionary to suggest that Kanji radicals would be any different from the Chinese ones. I'm not in a position to comment about Korean. I do agree that we should wait for comments from our CJK experts before acting hurriedly. I've already looked at the general situation for other languages, and can see no significant problems in making the transfer. For some it will even provide an opportunity to clean up a few irregularities. Please note that Index: is intended to be used only for language based indexes. Lists developed on other bases should become listed under Appendix:. Eclecticology 09:04, 12 May 2004 (UTC)

Translingual homophones

A user (I won't give their name) has been including English homophones in foreign-language entries. I think this is misleading and unlikely to be accurate: the phonemes of individual languages, even when represented by the same phonetic transcriptions, are rarely identical. In any case, if English homophones are to be included, then why not homophones from every other language? I have asked the user to refrain from doing this. Any thoughts? -- Paul G 08:46, 14 May 2004 (UTC)

I'm often tempted to do the same but as you say the pronunciation is never identical so I don't do it and I believe this should be our policy. We could come up with a separate heading for such things but I think that would truly be pointless. Homophones within the same language only please. — Hippietrail 10:02, 14 May 2004 (UTC)
I think I agree. :-) It's really hard to see where it would lead us. Perhaps a listing of what it means when you pronounce "fuck" in a long list of languages. There's a certain crowd that would just love that. :-) Eclecticology 20:42, 14 May 2004 (UTC)
I like the idea of translingual homophones. We have covered the words that are spelled the same across different languages, but we don't have words that sound the same in different languages. I'm thinking of put (nl) and pute (fr), both /pyt/, or roet (nl) and root (en) (OK, the r is different). It is true that English phonetics are a bit different from those of other languages, but among a lot of other languages, words that sound the same but have totally different meanings are rather common. We could also add them with the false friends, though. I don't mind if we don't add them. 134.58.253.130 06:58, 15 May 2004 (UTC)
Well, it's an interesting idea, but as Hippietrail and I have pointed out, even the Dutch "put" and the French "pute", while both being transcribed in IPA as /pyt/, are inevitably pronounced differently because they are spoken by two separate groups of people. One example I am familiar with is the pair French "cane"/"canne" ("female duck", "cane/walking stick") and Italian "can" (literary form of "cane", "dog"). Now, my dictionaries show identical IPA transcriptions, but I pronounce them distinctly - the French /a/ is slightly more open than the Italian one, at least in my pronunciation. If I asked one of my French friends and one of my Italian friends to read these words from their respective languages, I am sure they would sound marginally different. On the other hand, the English words "can" and "Cannes" (the place in the south of France) are pronounced identically by the same speaker of English. I imagine that analysing these nuances of pronunciation would be a nightmare and highly subjective. I say we should definitely avoid them. — Paul G 09:26, 17 May 2004 (UTC)
Sorry to follow a tangent but I can't resist pointing out that in my variety of English I do pronounce "can" and "Cannes" differently: "can" = /kæn/, "Cannes" = /kɑːn/. Ain't language endlessly interesting? (-:
By the way I'm not opposed to the idea of the "sounds like" dictionary in general - I think it would be loads of fun. But I don't think the English Wiktionary is the place for it. Maybe the Wikipedia folks can be convinced to create a special Wiktionary just for this project or maybe some website somewhere. But I feel it would be clutter to have it right here. — Hippietrail 12:27, 17 May 2004 (UTC)
The variants are endless. Some pronounce "Cannes" like "cans" rather than "can". Even just in English not everybody hears a difference between the personal names "Don" and "Dawn". Dealing reasonably with phonetics within a language requires a considerable amount of sophistication; between languages it's nearly impossible. When we listen to a foreign language speaker we can understand what they're saying perfectly well, but they still sound different. Most of us are so accustomed to the way we speak that we would not know how to explain to another how the same sound is articulated differently. What is important in the pronunciation of a language is the phonemic differences rather than the phonetic ones. If the phonetic differences of a sound in two different environments are not meaningful they will not be distinguished; thus that difference is not ohonemic. Eclecticology 17:35, 17 May 2004 (UTC)
I just had to wikify Eclecticology's "ohonemic" - nice word! Could you define it for us? -- Paul G 10:49, 19 May 2004 (UTC)
I think it's a typo; there should be an initial "h" (British spelling: "hohonaemic"), in which case it would mean tending to generate bloody laughter. Cf. Eric Partridge, Origins, A Short Etymological Dictionary of Modern English, New York, Macmillan, 1959, at p. 900, sub "haema-". :-) Eclecticology 19:21, 19 May 2004 (UTC)

Astronomy resource

I've just found the link [3], which includes a glossary of astronomy terms. Would someone be good enough to contact the author to ask whether we might take definitions from there? Thanks. -- Paul G 10:45, 19 May 2004 (UTC)

I couldn't get the link to work. Still, there's nothing to prevent you from contacting them. :-) Eclecticology 19:05, 19 May 2004 (UTC)
Of course :) I have mailed the following via the comments page:
"I am interested in your glossary of astronomical terms. I am a major contributor to Wiktionary, the free online dictionary (see en.wiktionary.org) which aims to define all words in all languages. With your permission, we would like to be able to use the definitions from your glossary (either as they stand, or in modified form) in Wiktionary. Would you be happy for us to do this?
Thank you - Paul Giaccone, Wiktionary sysop"

Changing the subtitle

On every page of Wiktionary there is the subtitle "From Wiktionary, the free dictionary." Until not that long ago, this said "From Wikipedia, the free encyclopedia.", until someone pointed it out and it was changed. In the Italian "Wikizionario" the banner reads "Da Wiktionary, l'enciclopedia libera.", which is an (incorrect) translation of the old English banner.

Could someone please tell me how this subtitle can be changed so that I can correct it? Is it something that only a sysop can do? Thanks. -- Paul G 10:36, 20 May 2004 (UTC)

The template is under MediaWiki:Fromwikipedia. However, it appears to be protected so only a sysop can edit it. Ortonmc 15:56, 20 May 2004 (UTC)

Morphemes: parts of words

When doing an etymology, is there any standard way of handling the morphemes that make up the current word?

Is there a way or place to list the morphemes themselves, so that people can look them up? For example, "-ject" is used in many English words like eject. Thanks. Newbie RSvK 21:17, 20 May 2004 (UTC)

Well... For eject I would write something like 'Middle English ejecten, borrowed from Latin eicere "to throw out" (via past participle eiectus), from e "out of" + iacere "throw".' You might put cognate terms in the =Related terms= section, thus inject, reject, project, subject, trajectory etc. "-ject" may be a bad example in this case because it is not a particularly productive morpheme (and its meaning is not always directly related to its etymology: the modern senses of "inject" and "subject" do not immediately suggest throwing).
If a word is composed of more or less transparent morphemes you might put in a =Derivation= section (as in, e.g. Malian), consisting of 'e- + -ject', but technically, this is not how the word is actually put together—it is merely how it is taken apart. —Muke Tever 01:28, 21 May 2004 (UTC)
I'm not against the idea of a morpheme index similar to the rhyme index but I think an entry for each morpheme is probably too much. — Hippietrail 02:52, 21 May 2004 (UTC)

Order and Subordination, spec. abbreviations

I am currently dealing with [aa], which cannot be ordered in the way that is recommended in the template because it is an abbreviation in addition to being a word. So, it cannot be organized under separate etymologies because abbreviations don't all have the same etymology and for almost all abbreviations the derivation is clear. In other dictionaries, abbreviations, trademarks, letters, cardinals, etc. are considered a different type than regular words, warranting a separate major entry. This does not seem to be how it is here. A disambiguation page might be appropriate, instead of a section in the same page, for such wholly different items. In sum, it looks like the order of entries may need to be changed throughout. Headings are useless if they're not clear. And the current template/policy/practice would seem to indicate that multiple sections entitled "Etymology" should be made for each major class of meaning of a word. Without splitting definitions of words of the same spelling into different articles, what would be appropriate is a numerical list (#) being the topmost classification, and all else remain under it, with a numerical list for each meaning of the word type, as it is now. - Centrx 22:51, 29 May 2004 (UTC)[reply]

Hi Centrx. Currently if we have two words spelled identically (or with just capitalization of the first letter different) we have a separate part-of-speech heading for each one. But as you mention, when we have separate etymologies for each entry, the current practice is far from ideal. I've commented on this elsewhere but haven't come up with a solution myself yet. The problem seems to be that when several senses or parts-of-speech share an etymology, the etym acts as the major heading, but on the same level as the p.o.s. When senses have different etyms, some people have made a separate etym with a hard-coded etym number for each sense - usually still on the same level. I think we need to look at some large print dictionaries and see how they handle the various cases. We don't want to duplicate information when it's shared between senses but we also want useful and flexible headings and heading levels. Small dictionaries have the information compressed into a small area so they solve the problem in a way which might not work so well for Wiktionary. — Hippietrail 23:25, 29 May 2004 (UTC)
A way analogous to the way a print dictionary would do it is to have a disambiguation page for spellings with different major meanings. Abbreviations, trademarks, etc. and different words with the same spelling are different words that should be in different pages. It is only by chance that they have the same spelling. A change in the software might also be appropriate to have navigation more like that in a dictionary or in the online OED. - Centrx 00:27, 31 May 2004 (UTC)
Actually I think I agree. There has been pressure not to use disambiguation pages in Wiktionary so far. But I think you're right. Another thing I've been thinking about a lot is how to link to specific senses from other pages. Having senses named (not numbered - would get messy when renumbering is needed) would solve both problems.
Of course, sometimes we don't know enough about a word's or phrase's etymology to know if two different senses have different origins or not - but they may still be different enough to warrant separate pages. Then also we're going to have some wars where some people take one side, and some another. And some people will overextend the idea and separate out stuff which needn't be... — Hippietrail 03:02, 31 May 2004 (UTC)

I tend toward a very conservative approach when it comes to splitting up articles. In some cases this may be necessary at some time, as could be disambiguation pages. But I don't think that we're there yet. The question of capitalization in titles is an important one here, and I'm glad that our German colleagues have raised it. We need case sensitive titles, even though this need was never there for Wikipedia. Please participate in the discussion for the topic immediately above this one. We would like aa, Aa, AA and Aa. to be treated differently. Note that the period in an abbreviation or an apostrophe in a contraction makes it a different word for our purposes.

It's important to maintain flexibility with heading and indent levels. Only the H2 level for language names remains fairly inflexible. If a word has two distinct etymologies they can both be given an H3 heading and the parts of speech that come under them can have an H4 heading. Headings which derive from the meaning (including translations) can have the same heading level as the part of speech if there is no ambiguity, but can as easily go to an H5 level. The fact that certain heading levels appear on the template should not imply that rigid adherence is necessary.

Abbreviations have their own character, and often their etymology is identical with their meaning. In such cases the term "Abbreviation" can have the same heading level as "Etymology".

I am just as concerned as Hippietrail about the potential for argument. Some of these arguments can be headed off before they become real. Eclecticology 05:00, 31 May 2004 (UTC)

So, we want that (different pages for different capitalization) and we don't mind the temporary mess it's going to cause? Oops, then I sent the wrong signal on the mailing list. Sorry about that. I had already gotten used to seeing these words on one page. Of course the definitions for words that are spelled the same but capitalized differently can also be indicated by linking them to one another. Polyglot 07:13, 31 May 2004 (UTC)
Some people seem to be talking like we've already decided one way or the other or that we definitely need this or that. I was under the impression that we were talking about the pros and cons still.
One thing we shouldn't forget is that there are many languages with several writing systems, and a Latin-centric solution may not work across the board. Polyglot's solution is nice in that the same system works very nicely for Arabic and Hebrew, which don't have case but have other features which allow variation.
Also, the software seems to have improved in the past few weeks to properly map lower to upper case in all scripts — it won't handle Turkish though which is a special case we can handle manually.
Originally I wanted case sensitivity but now I'm unsure. I've been thinking for a while that we should be able to specify the title that shows the most prominently in the page, rather than just showing the page's name or the first heading which is often something like "English" or "Noun" — not very helpful.
I do think it would be good for each wiki to be able to default to upper or lower case. Uppercase makes good sense for an encyclopedia but lowercase makes better sense for dictionaries. — Hippietrail 08:37, 31 May 2004 (UTC)
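To illustrate the Turkish special case mentioned above, here is a minimal Python sketch. It is not how the wiki software maps case; `upper_first` and its `turkish` flag are hypothetical names introduced only for this example. It uppercases the first character of a title with the default Unicode mapping, and optionally applies the Turkish pairing of dotted and dotless i.

```python
def upper_first(title: str, turkish: bool = False) -> str:
    """Uppercase only the first character of a page title (hypothetical helper)."""
    if not title:
        return title
    first, rest = title[0], title[1:]
    if turkish:
        # Turkish pairs dotted i with İ (U+0130) and dotless ı (U+0131) with I.
        if first == "i":
            return "\u0130" + rest
        if first == "\u0131":
            return "I" + rest
    # str.upper() applies the default Unicode case mapping for every script.
    return first.upper() + rest
```

With the default mapping a Turkish title beginning with "i" gets a plain "I", which is the wrong capital for that language; that is why it would have to be handled separately.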
Sorry for not always expressing myself in the most fortunate way... I didn't mean to imply a decision was already made. In fact I wanted to incite some discussion about the subject. I sent a mail to the mailing list, stating that we didn't really want case sensitivity (anymore), but when reading Eclecticology's remark, I realized I might have been wrong. Of course all this is because when we started, we were told: OK, you can start a dictionary project on the condition you can get by with the software that was really made to support an encyclopedia. We accepted that and we made do with what we were given, working out solutions as we went on. Not daring to ask to change too much to the software or at least considering that whatever we were going to ask, was going to take a long time to be implemented. Now a change could be considered, apparently, but it would cause us some inconvenience and the solution we adopted seems to have some other advantages. (I wasn't aware of this, but I seem to be less polyglot than I thought...:-) Hippietrail, are you sure those variations in Arabic and Hebrew shouldn't be on their own pages?
A directive to change the title shown at the top of the page sounds like a good alternative. Maybe that possibility should also be explored.
I don't think we are afraid of the inconvenience the change would cause. In a way we always hoped it would happen. The question is: would it really help the project(s)? I guess it would, since case sensitivity matters for a dictionary. So I think I would like to change my opinion and ask for case sensitivity after all, if there is support for it from the other contributors. Polyglot 20:31, 31 May 2004 (UTC)


As for disambiguation pages, we are there now. If they are appropriate, then only one word is necessary for them to start being in use. Although the spelling may be the same, the words themselves are entirely different and should perhaps be on different pages. It will be a grave waste of effort to wait on resolving this issue, as correcting it in the future will require a greater and greater effort as the size of the Wiktionary increases. As it stands now, I can't really bring myself to contribute well because there are so many words that simply can't be added properly. If something as basic as form isn't properly set yet, it's almost a waste of time to add or modify entries. Formatting, and the correction of formatting that will be required once this is resolved, takes far more time than adding the text of a definition. Unlike the Wikipedia, it will not work to simply evolve as we reach critical numbers of articles and users. Having a rigid template and policy is necessary for a dictionary, where nearly every entry is going to follow the exact same format, and every single entry is going to follow the same template. - Centrx 20:09, 31 May 2004 (UTC)

Here is a way that would be similar to the way the online OED does it and analogous to the way print dictionaries do it. Every page has a unique type and number combination. The type indicates whether it is a noun, verb, abbreviation, etc. and the number (subordinate to the type) indicates which of the many major meanings the word has. If there are multiple pages with the same spelling when one does a search, the list of all the pages comes up, printed with the type and number. This would require software changes, but a dictionary is different from the other types of wikis and would benefit from strict typecasting and specific functionality.

To keep with the same software, we could simply make disambiguation pages with the relevant data. This would work very well. One thing is clear though, words with the same meaning should not be on the same page, as a matter of good form and to ensure that we can include all information appropriate to a dictionary.

- Centrx 20:27, 31 May 2004 (UTC)
I don't accept that form is more important than content. I strongly oppose a rigid template, because I think it's important to vary the template with circumstances. If Centrx feels that he cannot contribute without that, I can't help him much. However, I will point out that most of the words that have not yet been written up can fit into the currently suggested template without much difficulty. I don't dismiss the possibility that we may eventually need disambiguation pages, but the pages that need this will be relatively few.

The "type and number" idea seems obscure. It suggests that "type" follows a rigid interpretation, and makes no allowance for words with ambiguous type. Similarly numbering by major meanings fails to respect that meanings are often a part of a continuum. Eclecticology 23:41, 31 May 2004 (UTC)

I'm also against the "type and number" suggestion. I specifically talked against number earlier but I know my prose style tends toward scruffy ranting.
"Type" is no good because of ambiguous types, types which may be contested (adjective vs noun modifier, possessive adjective vs determiner), and languages which don't have rigid type systems like Chinese and Polynesian languages.
"Number" is no good because who decides the number? Popular usage? History or usage? First contributor? Correctionist contributors in the future?
I suggest using a word to disambiguate - just like Wikipedia does. Preferably from a smallish, mostly standardized set, but a set which can grow as we evolve. Using a word does away with fragile, hard-to-remember, potentially-fought-over numbers. And when we do away with fragility we can (more) easily link to specific senses of words without worrying the links will break.
However obscure you may think it, it's the way the online OED is set up, the way Webster 1913 has it, and analogous to the way many dictionaries have it. Anyway, if abbreviations and trademarks are where this sort of thing comes up, then I think disambiguation pages are appropriate for that, especially for words of such short length. Disambiguation pages seem especially appropriate for abbreviations, as one can follow the link to the real definition. This is what I will do for abbreviations, if there are no objections. - Centrx 03:29, 1 Jun 2004 (UTC)
Not obscure at all. Absolutely the right way to go for a big dictionary with thorough research and editing, formatting set in stone, the possibility of making all the sense numbers match before going to print, etc.
I'm just warning us now that it's going to be very, very brittle on an open content dictionary with many contributors and no final editing phase before "going to print". Numbers will change and they'll therefore be worthless for use on other pages. And the real dictionaries you cite do cross reference sense numbers extensively.
I also have been doing abbreviations and acronyms as simple links to their expanded forms. That counts as a definition. But I think we're talking about something much closer to Wikipedia's way of doing disambiguation pages aren't we? — Hippietrail 15:43, 1 Jun 2004 (UTC)
Yes, but as I think about it, these disambiguation pages would only be for special definitions, like abbreviations and trademarks, where the "definition" is merely a reference to some other definition. In other words, the only "definition" that can be made of these is a link, possibly with the rare short etymology.
How do you think they should be done? I think simply putting an appropriate link and brief explanation at the top or bottom of some other definition muddles definitions, as the section for that cannot be equal to any of the other sections in the definition. Disambiguation pages in the vein of Wikipedia seem most appropriate to me, as disambiguation is indeed what one needs when they look for a particular word that has two wholly different meanings, that is a regular word-meaning and an abbreviation. The searcher will discover what the abbreviation represents and possibly proceed to the appropriate definition for the expanded word. If this is done, it also obviates the need for case-sensitive titles, aside from proper nouns, in English at least. - Centrx 21:44, 1 Jun 2004 (UTC)
You're still ignoring the fact that most abbreviations have periods in them. How do we handle an abbreviation like "cm." for centimetre when the software forces us to have a capital "C"? And we do still need to consider those proper nouns that are spelled the same as ordinary words. Eclecticology 03:10, 2 Jun 2004 (UTC)
Make the default be lowercase--most dictionaries have all lowercase too. Allow the functionality to edit page titles so that the capitalization of letters in it can be changed, somewhat like how AIM allows one to change the capitalization of screennames. - Centrx 03:53, 2 Jun 2004 (UTC)
I agree. But the past few hours I've been wondering. Currently only the first letter is case insensitive - all the rest are case sensitive. This was probably a good idea for Wikipedia too but for Wiktionary it seems like the worst of both worlds. If two words/phrases differing only in case are always mapped to the same page title, we can either disambiguate from there, or add the various correctly-cased headwords at each sense.
Also I think we have to make everybody aware that on Wiktionary, no matter how this decision goes, "Page Title" and "Headword" are different concepts. Currently one page can have several headwords; in the future this may continue, or headwords will only equal page titles on disambiguated pages. — Hippietrail 04:11, 2 Jun 2004 (UTC)
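The two title behaviours contrasted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the wiki software's code: `first_letter_key`, `caseless_key` and `group_titles` are hypothetical helper names. The first key treats only the first letter as case insensitive (the behaviour described as current), while the second maps every title differing only in case to one page.

```python
from collections import defaultdict

def first_letter_key(title: str) -> str:
    """Only the first letter is case insensitive; the rest is kept as typed."""
    return title[:1].upper() + title[1:]

def caseless_key(title: str) -> str:
    """Every letter is case insensitive, so case variants share one page."""
    return title.casefold()

def group_titles(titles, key):
    """Group the given titles by the page key they would land on."""
    pages = defaultdict(list)
    for title in titles:
        pages[key(title)].append(title)
    return dict(pages)

titles = ["aa", "Aa", "AA", "Aa."]
print(group_titles(titles, first_letter_key))
print(group_titles(titles, caseless_key))
```

Under the first key "aa" and "Aa" collide while "AA" stays separate; under the second all three share a page and only "Aa." (with its period) remains distinct, so the correctly cased headwords would have to be given on the page itself.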
On format vs content, I'm an idealist. In an ideal world we would have a perfect format in XML which could be parsed by computers and displayed on the screen, would work perfectly for a wide range of wildly different languages, permit ambiguities and subtleties, etc.
I'm also a realist. Such a format will take years to evolve or forever to plan in advance. Nobody has ever created a dictionary of every word in every language before so there will be problems we will have to solve along the way. Nobody likes editing format when they could be editing content, but most of us do it anyway because we're proud of Wiktionary and want to make it better - we can keep doing this.
So I'm in favour of a "firm" format. Not a soft one which is different on every page, not a rigid one that won't bend to the will of unexpected words or languages, but something in the middle. It might mean a bit of scruffiness but it means the content comes first - and that's what a dictionary should mostly be about. — Hippietrail 00:26, 1 Jun 2004 (UTC)

I think that Hippietrail's approach is a pragmatic one. In particular it allows for the possibility that over the next few years we will encounter many problems that cannot be foreseen at this time. Yeah pragmatism!!!

As for the disambiguation pages, I still don't see any real need for these yet. What I would suggest at this stage is to go ahead with pages for abbreviations containing all the appropriate periods and apostrophes. Any links to these pages should show the correct capitalization, even if the software forces capital first letters. These are already distinct from "real" words. Going beyond this should wait until the capitalization issue is resolved. Eclecticology 17:32, 1 Jun 2004 (UTC)

This is unclear. If disambiguation pages are not used, where will abbreviations go if they have the same spelling as a word?

I don't think creating different entries for different capitalization and punctuation will work for abbreviations. There are very many abbreviations which are acceptable in many different formats. For instance, for the contraction of ante meridiem are we going to have duplicate pages AM, am, A.M., a.m.? And if there are duplicate pages why not just have them all point to the same page, which invalidates the need for case sensitivity in the first place, although I suppose nothing could be done about added punctuation. What about the form "am", which is the same as a word? Should the information on the abbreviation be in the page for the word? And if it is decided that "am" should simply be redirected to "be", that would certainly require a disambiguation page, with a line pointing to the abbreviation and one to the verb. There would be no other way to do it. The same problem arises with the abbreviation for antiaircraft (AA, A.A., A-A) and countless others. Capitalization is the most egregious and I think it's a waste of time to create two or three different pages for a single abbreviation. Abbreviations and trademarks are uniquely suited to disambiguation pages, as their very existence is as reference to other terms. - Centrx 20:38, 2 Jun 2004 (UTC)
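As a rough illustration of having all the variants point to the same page, the following Python sketch folds capitalization and punctuation into a single lookup key. It is an assumption-laden example, not an existing feature: `abbreviation_key` is a hypothetical helper, and the choice to strip periods, hyphens and spaces is made only for the sake of the example.

```python
import re

def abbreviation_key(form: str) -> str:
    """Fold case and drop periods, hyphens and spaces (hypothetical rule)."""
    return re.sub(r"[.\-\s]", "", form).casefold()

for form in ["AM", "am", "A.M.", "a.m.", "AA", "A.A.", "A-A"]:
    print(form, "->", abbreviation_key(form))
```

Note that the key for "a.m." is the same as the key for the verb form "am", so a shared page reached this way would still need a line for the abbreviation and a line for the verb.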

I think you're making this much more complicated than it actually is. am should be about the verb, and will probably need etymological notes to show how it became different from be. The page a. m. will still be useful for "amplitude modulation", and A. A. for "Alcoholics Anonymous". In some of these a "See Also" page could be more productive. Eclecticology 08:31, 3 Jun 2004 (UTC)

What if a person is searching for the abbreviation am? They will get a page with no information about what they're searching for, despite the fact that am is a normal abbreviation that should be included in a dictionary. This will become even more problematic if there is case sensitivity. All of AM, am, a.m., and A.M. are commonly used and require redirection or disambiguation. What happens if an abbreviation has only one common spelling, which is the same spelling as a word? Your conclusion would exclude many words appropriate for inclusion in a comprehensive dictionary.
What is wrong with a disambiguation page? An abbreviation is concision at the expense of introducing ambiguity. A disambiguation page resolves this ambiguity, reflecting the status of abbreviations as pointers to words.
Anyway, I do believe I shall start putting Abbreviation for xxx, xxx, xxx. at the top of appropriate pages in small print and italicized. - Centrx 22:14, 3 Jun 2004 (UTC)

List of SAT words

A list of SAT words was uploaded to Wikipedia. After being listed at w:MediaWiki:VfD-List of SAT words and w:Wikipedia:Copyright problems, the consensus seems to be to move them here. However, it would be a waste to move them here and then find Wiktionary did not feel they were appropriate. Therefore, it has been left for Wiktionarians to decide whether they want them. There are around 26 pages with these lists. They are linked to from this version. If you think this is suitable material for Wiktionary, please go to Wikipedia and fetch it. If no one does, it will be assumed that Wiktionary does not want it and it will be deleted one week from now. Thank you. Angela 10:38, 2 Jun 2004 (UTC)

Hi Angela. I for one am in favour of including all sorts of word lists - on the condition that they are in the public domain and of some kind of genuine interest or relevance. I think this one more than qualifies. — Hippietrail 11:05, 2 Jun 2004 (UTC)
I'm sure we can use this list as a base for automatically adding entries, that need to be fleshed out manually afterwards. I wouldn't copy the definitions though. It's probably safer to make those up ourselves. The list in itself can hardly be copyrighted, but together with the definitions it could be a copyvio.
I don't know if it's necessary to copy these lists over here though. We can just keep a pointer to where the lists can be found on line, edit them off line and then have a script insert the words one by one, at a rate of maybe 5 or 10 per day, to give people the opportunity to edit them. Polyglot 12:03, 2 Jun 2004 (UTC)
I agree about not importing the defs. But using the list of words as an index would be very useful, especially to people using Wiktionary to learn about English. The benefit of having them on site is that they'd all be clickable in one place.
Other word indices I think would be cool would be frequency lists of the top ~100 words in various languages; in CJK languages, both characters and words; the official Touyou and Jouyou kanji lists; lists of all the words used in the Bible and the Qur'an in their original languages.
Not only do such lists assist learners and students to know which words are most useful to know, but they also assist dictionary compilers to know which defs would be most useful to add. (-: — Hippietrail 12:41, 2 Jun 2004 (UTC)

I agree with Angela that these are better suited to Wiktionary than Wikipedia, and giving us a week to work on this is fair enough. If people show that they are working on it I don't think there will be any problem extending that for a few more days if needed. I think the copyright issue may be a red herring, especially if we integrate the content with other material. I don't attach much importance to the purpose of these words, but others may strongly disagree. I would be inclined to append this material to the existing English index pages where the words could be wikified. I absolutely don't support the idea of using a bot to transfer these into new individual articles. That would just create a lot of stubs with bare definitions, and destroy the red links that let us know that the words still need work. Eclecticology 08:49, 3 Jun 2004 (UTC)


By topic - too long

The "By topic" page is getting too long and unwieldy (and messy). I've made a start at addressing this by moving the mathematics section (one of the longest, and subdivided further by branch of mathematics) to Wiktionary:By_topic:Mathematics. I think the page could benefit from the other sections being given similar treatment. — Paul G 09:12, 5 Jun 2004 (UTC)

I essentially support your initiative, though I confess that I seldom use this feature. Still there are two points that I would raise before this goes irretrievably far.
  1. Some time ago I suggested the pseudo-namespaces "Index:" and "Appendix:" as ways of organizing knowledge. One of the advantages would be to break up the already long list of entries in the "Wiktionary:" namespace, which User:Eric119 has been so admirably maintaining. These two pseudo-namespaces would be distinguished by having "Index:" limited to indexes by language, while other finding lists would be characterized with "Appendix:".
  2. Mediawiki 1.3 introduced the "Category" concept to all the sister projects, including this one. Has anyone had any thought about how that system could be used here?
Just a couple points to ponder. Eclecticology 17:48, 5 Jun 2004 (UTC)

Tim's observations

Tim Starling has made the following comments on the mailing list:

I'm amazed at the poor quality of the English Wiktionary, it seems to miss so many important English words. Most new pages seem to be slang, jargon, and people adding a few dozen words from their native tongue. Plans to import a public domain dictionary were abandoned, and now there seems to be little organisation or direction. Perhaps Wiktionary can be revitalised with extra features, but I doubt stylesheet changes will be enough. It needs a different look and a whole raft of features. It needs methods for easily adding new words, and for categorisation and listing. But I'm neither excited by the project nor optimistic about its future. So most of all, it needs people who want to work on it.

I have already replied there; perhaps others would like to comment. Eclecticology 07:52, 11 Jun 2004 (UTC)


Section ordering

I have been noticing more and more articles appearing with the Etymology below the defs (maybe below other sections too). Has this been discussed anywhere? I've moved a few to agree with the way we've traditionally done things. But then I've never been entirely happy with the placing of this section. Some of the newest entries have it at the bottom and indented within the section for each part-of-speech/sense. Traditionally we've had it at the top, 2nd only to pronunciation - and not indented. My opinion is that the more essential/basic sections should be at the top and the more scholarly/specialized/pedantic at the bottom. Maybe:

Language,Pronunciation,POS,definition,Alt. spellings,Synonyms,Antonyms,Derived/Related,Etymology,Translations

Maybe etym before der/rel or even before syn/ant. Opinions? — Hippietrail 04:02, 15 Jun 2004 (UTC)

Myself would prefer to approach it a little more laxly; frex when I first made centum I put the etymology below the definition because the etymology only really makes sense after the definition — this is a rare case, but an example of how sometimes order can need to be flexible. Personally I prefer to see the etymologies first, but then that may be just because they're my hobby... What I think might make sense would be to lay it out chronologically: the prehistory of the word first (etymology, with related terms as a subsection or integrated), then its historical and current uses in the language (senses, ideally ordered by age), and then how it is integrated into the language (derived terms and collocations)... the bits that aren't really time-bound would want special placement: pronunciation first before anything, altspellings next; synonyms and antonyms probably after the def (if not integrated), and translations last, thus overall:
Language, Pronunciation (w/hyphenation, and rhymes link), Alternative spellings, Etymology (w/related terms), Part of Speech heading (with principal parts, definitions, synonyms and antonyms, and quotations), Derived Terms (somehow divided into morphologically derived words, collocations, and inheritance/borrowing into other languages), Translations.
I would integrate a lot of these because personally I think a lot of these don't need to be headings on their own (other dictionaries make do with inline references like SYN: huge, great, grand or opp. small) — a 'heading' that comprises less than a line seems a little gratuitous. —Muke Tever 05:48, 15 Jun 2004 (UTC)
My position is closer to Muke's on this one, especially on the need to be flexible. I've consistently supported a temporal flow within each article. Thus etymology helps us to understand why the word came to mean what it does, but derived terms exist because of the word. Eclecticology 08:39, 15 Jun 2004 (UTC)
It just doesn't work to put the etymology above the definition when the page has words with different etymologies. It is not superordinate, and it is totally pointless and cryptic if it's a section just called "==Etymology==" or having the title being the Latin word for it, for example with section headings like "==Alleviare==" for the word alleviate. Etymologies aren't good to be a section anyway as it's usually only a line. I think as things progress you're going to find that it just doesn't work to have sections for anything but the languages and the type of speech, i.e. "==Noun==", "==Adverb==". If we're going to put completely different words on the same page because they have the same spelling, Pronunciation, Etymology, etc., and ultimately maybe the same of everything, we cannot use sections. This is also true for when there are variations on words, which are currently delineated in a numbered list (so putting subordinate sections within those lists wouldn't work). Having all this information separated for each different word/meaning is appropriate for the purposes of our dictionary, and it's the way lexicographers have done it for ages. - Centrx 02:21, 17 Jun 2004 (UTC)
I'm having difficulty following your comments. We don't have an article on Alleviare. Can you give a link to an article with the problem that you describe. Eclecticology 04:31, 17 Jun 2004 (UTC)
I'm sorry, it was a poor example anyway. Alleviare is the Latin root of the word alleviate, so what I meant was that, if there were multiple distinct meanings associated with the word alleviate, the current template prescribes, or what I've been told, that the topmost section below the language for each meaning would be the etymology. It wouldn't be good to have multiple meanings all using the same section name "==Etymology==", that would be pointless, and to have section names according to etymology would mean the titles would be like "==Alleviare==" for the word alleviate. It worked out because, as you have found, it's not a good section name that means anything to people. It was a bad example, though, because the word/spelling alleviate doesn't have meanings with different etymologies. Because of how the wikimedia software is, we have to be careful with our formatting and so forth, without the luxury of automated sections that make sense for a dictionary. I do think it would be appropriate to have every new page have default sections and formatting set up, at least in the edit page with embedded comments that can be uncommented when new information is added.
Also, if you haven't ever seen a page with this problem, it's most likely due to people following the restricted template closely even when it conflicts with the true information about a word. If we're putting different meanings under the same spelling, this is going to come up a lot as we become more complete and include more than just plain definitions. - Centrx 04:44, 17 Jun 2004 (UTC)
We can do what "real" dictionaries do and put something like =word 1= or =word²= as subheadings between =English= and the standard subheadings ... and if that's not visually distinctive enough, we can separate the homographs with a horizontal rule. I've implemented an example under I. —Muke Tever 14:58, 17 Jun 2004 (UTC)
I is a much better example than alleviate. It doesn't do to discuss multiple etymologies for a word where it doesn't apply. What I've done at "I" is upgrade the heading level for the Etymology headings; this seems to give a better visual impression. I do have other questions about that article, but I wouldn't want them to detract from the broader issue in this discussion.
Instead of just saying "Etymology X" it would be better to have a very brief categorization, so in this case instead of "Etymology 1" there would be "Letter" and instead of the second there would be "First person pronoun". There should never be a heading "Etymology" because it doesn't provide any information. All words have etymologies and it will ultimately be in every page. - Centrx 20:16, 25 Jun 2004 (UTC)
Etymology is high up in the order because of the significant information that it gives in helping to understand the word. Parts of speech are a completely different issue. A single etymology can give rise to several different parts of speech. At the other end different etymologies produce true homographs. "Calf" as young cattle and "calf" as part of the leg have different etymologies. It is useful to note that the word "homosexual" has nothing to do with a Latin origin that suggests liking men, but with a Greek origin that refers to the same sex. Eclecticology 00:31, 26 Jun 2004 (UTC)

Monobook Tabs

When we got Monobook we also got a nice row of tabs along the top. Unlike the other Wikis, however, ours has a "Speedy Deletions" tab which has a hideous bug for me. It contains an enormous string of text which causes the whole row of tabs to spontaneously rearrange themselves and obscure the name of the article. Does this happen for anybody else? The full text which I see is: ''This page is a [[Wiktionary:Candidates for speedy deletion|candidate for speedy deletion]][[MediaWiki:Delete|.]]'' <br><br> ''If you disagree that the page should be speedily deleted, please explain this on the talk page, or at [[Wiktionary:Speedy deletions]].'' — Hippietrail 01:34, 18 Jun 2004 (UTC)

Interesting. Is everybody getting this or just sysops? Eclecticology 04:45, 18 Jun 2004 (UTC)
I don't see any tab to "speedy deletions", and neither do I get such a tab on sv: (where I am a sysop...). But on the other hand we don't have a page for "speedy deletions"... \Mike 11:59, 18 Jun 2004 (UTC)
Edit the text of MediaWiki:Delete. Presumably it's supposed to be something short like "delete", which is what it says in w:MediaWiki:Delete. —Muke Tever 14:16, 18 Jun 2004 (UTC)
Done and it's fixed. My guess is it was serving two purposes, one of which may be imaginary. Here's the place to put any negative repercussions. Thanks very much! — Hippietrail 14:31, 18 Jun 2004 (UTC)

Interwikis in =Translations=

I notice some folk have been coming in and replacing lines like *Lingua: [[verbum]] with *Lingua: [[:lg:verbum|verbum]]. Clearly this isn't quite what we want (the word should link to its own entry here, not in another language on another wiki) — but the link to the foreign wiki isn't exactly inappropriate either. I've been rewriting these lines to e.g. *Lingua: [[verbum]] ([[:lg:verbum|lg]]) but this looks kludgey (especially when it has to go next to a gender marker). I know this has been discussed elsewhere (someone in Wiktionary talk:Template proposed *[[:lg:verbum|Lingua]]: [[verbum]], which is fine, but non-intuitive, and doesn't work when there's more than one word on the line) but have we/can we actually settle on a format for this? —Muke Tever 15:34, 18 Jun 2004 (UTC)

I don't like this stuff at all. I really don't think those links are going to be useful to people browsing the dictionary except to say "neat!" or "wtf is this!". With some languages already needing one or two alternative writing systems, and sometimes some other kind of brief note, adding this extra thing, which almost nobody needs, is very confusing clutter. And all to save just one click!
And what text goes in the link? Some have the word spelled out again, some have a language code. But the word is already there once, and most dictionary browsers aren't going to know or care about language codes.
I warned that people would do this if we made the interwiki links manual instead of automatic. Wah! )-: — Hippietrail 16:02, 18 Jun 2004 (UTC)
I'm not really a big fan of the idea either — but apparently some people thought they should be added, and I didn't quite feel it'd be my place to unilaterally decide it should be otherwise (as apparently some wiktionaries like hu: do it). I can move to begin removing them, if that's what we want to prefer, though. —Muke Tever 18:36, 18 Jun 2004 (UTC)
This is also related to the discussion at Wiktionary talk:MediaWiki custom messages. I too find these messages strange and counter-intuitive. That's mostly based on what's going on at nl:. I too don't think that the interwiki links should normally belong on the translation section of a page. The translation should link to the word on this Wiktionary, and the interwiki link should follow from there, probably without piping. I would support removing them from the places where they don't belong. If the people on hu: or nl: want to take a different approach they can do it over there. Eclecticology 20:03, 18 Jun 2004 (UTC)
With the messages as used in nl: the text shown in the article is "exactly" as required locally. The thing with translations and the use of these messages is that it allows for creating translations quickly _once_ and being able to export it to other wiktionaries. I recently added _loads_ of translations to Thai. It is of benefit to the user; more information. The only persons who will encounter it are editors.
The source of many of these translations is the (private) website of Timwi, who asked for all the information to be moved to wiktionary. This is part of what I am doing, and the messages can (relatively) easily be changed to something more traditional with a bot. And as it is not enough to only have Timwi's data on nl:, this is the best way of working it.
You may have noticed that I have added some new words recently; I do only use the messages for translations; this is the bulk of the work for me. GerardM 08:32, 19 Jun 2004 (UTC)
As to interwikis, the interwikis to nl: go to the word exactly written as in the en:wiktionary. For example, the word English refers to nl:English. One purpose is that when someone changes a word (for instance adding the pronunciation), it can be copied across to the other wiktionary. GerardM 08:37, 19 Jun 2004 (UTC)
Clearly the links to foreign words must go to this dictionary and not any other. If someone wants to know more about the foreign word but does not speak the language to which the word belongs, a link to another Wiktionary is not going to be very helpful. — Paul G 11:25, 21 Jun 2004 (UTC)
Links, yes, they are the ones in the body of the text. Interwikis (found under the toolbox) do link to other wiktionaries and they are valid. One service they may provide is technical; changes in one might result in changes in the other. GerardM 16:04, 22 Jun 2004 (UTC)

Hmm, you seem to just plain assume that Interwiki links aren't of any usefulness. Have you ever thought of questioning that assumption? Maybe you should...

Let me give a few examples:

  • A user interested in the target language might want to get fast access to more information in the given language (i.e. hypero-/hyponyms, derived and related words/terms;)
  • That user is using Wiktionary not just for fun but for some actual work they're doing; where having a couple of synonyms - in the target language - at hand might be quite helpful;
  • The same user might be interested in the foreign language's definition for the respective term;
  • A user might simply not be aware that a Wiktionary exists in the target language. Possibly they aren't aware that there is more than the English Wiktionary at all;
  • This user might even come from a foreign country and would really appreciate this site to direct them to the term's definition in their own language;
  • A user might look up an English term to be translated into their native language, doing it on en.wiktionary just because it is the only one with a respectable content base so far, since all the others have just started - and the respective Wiktionaries of course focus on native -> foreign translations instead of foreign -> native in the first place;

The proposal to place an Interwiki link on the translation's definition inside en.wiktionary seems a good one to me, but since the English Wiktionary of course mainly provides English -> foreign translations and just a small percentage of actually created definitions for foreign terms, the given translation links are just plain dead. red. not written yet. You've followed a link to a page that doesn't exist yet.

I don't have the solution to come up with yet, but you might want to rethink your position on relevance and usefulness of fast (or at least at all) accessible links to other Wiktionaries. It's not just native English speaking people out there (I guess you're aware of this, otherwise you most probably wouldn't work on a dictionary), but also many en.wiktionary users come from outside the US, Canada or UK.

Just assuming (and not questioning) that links to foreign definitions of foreign terms aren't useful, and stating that you don't care about how other Wiktionaries handle the task, sounds much like, well, ignorance to me... and I'm afraid this type of ignorance, a type that lots of so-called "unamerican" people associate with some inhabitants and political leaders especially of the US, doesn't help this project any further...

So my proposition would be to facilitate Interwiki links near the actual translation, possibly like this

... until someone comes up with a better idea. And not to "outsource" them to articles which in more than 99% of the cases don't exist at all ...

Just my 2% --Xenosophy 17:06, 15 Jul 2004 (UTC)


Stimulating Wiktionary's growth

I think it would be beneficial if we find some free dictionary for us to work out of. I'm going to be automatically generating some stub pages (taking care not to overwrite existing pages) from the entries in the CMU Pronouncing Dictionary. Hopefully, this will increase the number of available English words to edit, and thus stimulate growth. What do you think? Poccil 17:36, 22 Jun 2004 (UTC)

I really don't like this, and I'd rather you'd waited before pouring stub entries in. My main objection is that without stub entries, I can easily tell whether a given related word has been defined. With stub entries, I have to chase the link. I can visually examine about as many entries as I like in under a second to see if any of them are red. Chasing the links, while it may seem trivial, takes a lot longer.
That said, I realize stubs are used extensively in Wikipedia, and I've seen Hippietrail use them legitimately as a place to hang a non-English translation. It's not that stubs are bad per se. I'd just really rather not see massive amounts of them until we have some easy way to tell a link is a link to a stub and not a real entry.
I believe stubs are already excluded from the total word count and from the "random entry" link, but if not, that's another reason not to put them in. -dmh 20:26, 22 Jun 2004 (UTC)

Oh yes now I understand why I shouldn't put "stub" entries in. Now any new entries I will generate will not have the stub placed on them. Does anyone object?

As an alternative to stubs, you might consider adding to Wiktionary:Requested articles or its counterparts in other languages.
If you are aware of interesting senses, etymology, translations, usages, etc that you think may be overlooked if you don't enter a stub with just the interesting bits, then enter it. If a stub is interesting it usually gets added to pretty quickly. — Hippietrail 01:48, 23 Jun 2004 (UTC)

Languages in Indonesian

Am I correct in thinking that all languages in Indonesian start with Bahasa followed with a word indicating the specific language ??

Thanks, GerardM 22:09, 23 Jun 2004 (UTC)

I think so. Malaysian also. "bahasa" means "language". Also "person from countryx" is "orang countryx". Unfortunately, the people entering most of our Indonesian and Malay words have seen fit to wikify each word of set phrases and idioms separately, which is almost definitely wrong. — Hippietrail 02:54, 24 Jun 2004 (UTC)

Korean

The translations for the English word Korean (the noun) are wrong; the translation to the correct meaning is indicated by numbers. The nl translations are reversed. I have serious doubts about the other translations. I will have loads of translations for Korean (the language) at nl:Koreaans by tomorrow. I do want to post these translations on en: but in doing so I will remove all current translations. Please have a look at Korean. GerardM 18:49, 28 Jun 2004 (UTC)


Spelling variants and orthographic variants (moved from Talk:Её)

I've read that "ё" in Russian is only used in Dictionaries etc, much like the acute accent to show where the stress is, and that it should not be used in regular writing. Is this true? If so we should rename this article so that searching will work properly, the "ё" can be used in the display forms though just as a print dictionary.

Advice hereby sought... — Hippietrail 13:30, 29 Jun 2004 (UTC)

I have heard that ё is "more correct" but not always in use. w:Reforms of Russian orthography says "used regularly for a brief period following WWII, today the ё is still seen in books for children, but is usually absent in regular print." (I wonder about the scope of that "usually". It also says this is de facto, not an official "reform".) w:Russian language says "The letter Ё/ё is "optional": it is formally correct to consistently use E/e to represent both /je/ and /jo/," so I suppose that must relegate ё to the display form [and perhaps another use for the proposed display-title directive] though perhaps ё-pages should exist and link to е-pages, as it was once standard spelling (unlike the acute accent, apparently). —Muke Tever 15:30, 29 Jun 2004 (UTC)
I remember several times entering stuff in Cyrillic which I'd found on the net on either w:Everything2 or Wikipedia, and having people "correct" me. What I wrote above was my best attempt to understand them - though I never quite grokked what they were trying to say.
My feeling if it is optional is to leave it in for total accuracy in the display form and people copying and pasting it somewhere should know what is optional. Though that means we should tell them somewhere in a "how to use this dictionary" section.
I am in favour of including all historically correct spellings as an ideal. This means the same word at each stage of spelling reform in which it has been used. — 138.130.33.197 21:52, 29 Jun 2004 (UTC)
It seems to me that in Wiktionary the principle that only regular written forms are qualified to be titles already has been abandoned. For example, the Chinese pinyin forms with diacritical signs are not actually used as far as I know (the forms without diacritics are used) but serve just as transcriptions. Nevertheless they occur as titles.
Pinyin with diacritics is used almost exclusively in the domain of bilingual dictionaries (not just English ones), and teaching aids for foreigners learning Chinese including Chinese readers. Only a small percentage of foreign students ever learn a useful number of characters.
The other reasons I've decided to make Pinyin page titles are because it is 100% standardized (unlike Arabic and Thai), and multiple polysyllabic words can have the same pronunciation and tones. — Hippietrail 08:25, 30 Jun 2004 (UTC)
Yes, I think your decision is good. I am going to proceed the same way on the Estonian Wiktionary. And I also think it is good to have pinyin titles both with diacritics and without diacritics (as yi and for longer words). Andres 09:29, 30 Jun 2004 (UTC)
In regular writing, "ё" is used when it helps to distinguish homographs.
Also, in regular writing the accent sign is used when it helps to distinguish homographs. It is used in the rarer form.
I think that the page "Её" should not simply be redirected to the regular title. I think it should state that we are dealing with the "dictionary spelling" (to be elaborated) of the Russian word ее meaning this and this. I think we should largely avoid redirecting 1) because it is good to treat each form as a separate unit and 2) because in general, we cannot know that no other language uses the same graphical form for some word.
Your arguments are convincing. Do you think we should redirect the regular title to the "Её" one in the case of Russian?
Actually I am not sure which direction of redirecting is the right one. Though the lexicographic tradition has it the other way around I am inclined to think that the main entry should be Ее. The spelling variants should be cited there, and a short treatment should be on the page Её. ё should be in the main entry only if it is used regularly to avoid homographs. Analogously, in Arabic, the main entries should be without diacritics except when they are regularly used to distinguish homographs. Of course, it might be really difficult to judge when the diacritic is optional and when it is not, so that I am not sure. Andres 09:29, 30 Jun 2004 (UTC)
The problem with making each orthographic variant its own page is that other languages can have quite a lot of variants. Especially languages which use the Arabic writing system. They have at least 4 or 5 layers of increasingly optional diacritics! Orthographic variants are a larger set than spelling variants.
My opinion: it's a lot of work but it's worth doing.
I don't know what is the difference between spelling variants and orthographic variants. Andres 09:31, 30 Jun 2004 (UTC)
It is good to avoid redirects. The cases where I think they're OK currently are orthographic rather than just spelling variants, and phrases containing words which have spelling variants.
OK. The redirects are reversible, so there is no big danger. When you are justified to think both forms occur in just one language and there are no homographs/homonyms problem, it's OK. But I still think it's better to avoid redirecting. Andres 09:29, 30 Jun 2004 (UTC)
A redirect can be turned into an article with a "See also" link when a word is found with the same title as the redirect page. — Hippietrail 08:25, 30 Jun 2004 (UTC)
I think that the minimum we should have in such a case is the language statement ("Russian", "Arabic"). And then, I think, it's not a big deal to state that it's an orthographic variant of this and this meaning this or this (short description of meaning to distinguish between homographs or, possibly, homonyms). Andres
I think the Russian accentuated forms have the same status as the Latin forms with diacritics for long and short vowels, and partly the same status as the pinyin forms with diacritics. I think that all of them should be page titles since they occur in texts. The user is bound to have the possibility to find the word in the form she found it in some text. This makes Wiktionary really useful. Andres 07:53, 30 Jun 2004 (UTC)
I think allowing redirects from spellings with optional characters/diacritics to the usual spelling solves the search problem.
The problem with allowing separate pages for Russian's "ё" and its accents is that the accents are more optional than the "ё", from what I have read. So an entry whose dictionary form contained both would need three whole articles: "Xex xox", "Xёx xox", and "Xёx xóx"! More points of view appreciated. — Hippietrail 08:25, 30 Jun 2004 (UTC)
First, if there are real variants, it's worth doing. Second, even if the variants are not redirects they need not be "whole" articles but minimized short variants. In some cases they may be even redirects. Andres 09:29, 30 Jun 2004 (UTC)
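As a rough illustration of the redirect idea discussed above, the following is a minimal sketch of how the plain title could be derived from a title with optional marks. It assumes Python's standard unicodedata module; the function name plain_title and the two sets of "optional" marks are hypothetical choices made for this example, not anything Wiktionary or MediaWiki actually uses. Only the listed marks are removed, so that letters such as Russian й or pinyin ü, whose marks are not optional, are left alone.

```python
import unicodedata

# Hypothetical sets of combining marks treated as optional in this sketch.
RUSSIAN_OPTIONAL = {"\u0308", "\u0301"}                 # diaeresis (ё), acute stress mark
PINYIN_TONE_MARKS = {"\u0304", "\u0301", "\u030C", "\u0300"}  # macron, acute, caron, grave

def plain_title(title: str, optional_marks: set) -> str:
    """Return the title with the given optional combining marks removed."""
    # Decompose so that each base letter is followed by its combining marks.
    decomposed = unicodedata.normalize("NFD", title)
    kept = "".join(ch for ch in decomposed if ch not in optional_marks)
    # Recompose so that non-optional marks (e.g. the breve of й) are restored.
    return unicodedata.normalize("NFC", kept)

def redirect_targets(titles, optional_marks):
    """Map each marked title to the plain title it could redirect to."""
    targets = {}
    for title in titles:
        plain = plain_title(title, optional_marks)
        if plain != title:
            targets[title] = plain
    return targets
```

For example, redirect_targets(["её", "ее"], RUSSIAN_OPTIONAL) maps only "её" to "ее". Whether the plain form or the marked form should be the main entry is exactly the open question in this thread; the sketch only shows that one form can be computed from the other.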

It seems clear to me that if the diacritics are optional the principal entry for the word should be the one with diacritics. It is much easier for the reader to know what to omit than what to add and where to add it. How is the её situation being handled on the ru:wiktionary? The acute would be used in Russian to show stress on a syllable where it might not be expected. It is somewhat more optional, but should still be used in the page title. Someone said above that these forms are only used in dictionaries; I was under the impression that Wiktionary is a dictionary. Saying that the diacritics are used to distinguish homographs presumes that you know the language well enough to be familiar with its homographs. Without that familiarity, it's safer to include the diacritics when you find them.

Imagine that you find a word in a text in a language you don't know well enough. Then you don't know that you should look up another variant, with diacritics. Anyway, if we link in both directions, then there is no trouble. To be safe, we should include all variants as titles.
Though Wiktionary is a dictionary, there is something we can't do here: the reader cannot compare the entry title with the neighbouring ones to check whether there are close enough entries there. And, though Wiktionary is a dictionary, it can afford much more room than the paper dictionaries. Andres 17:31, 12 Jul 2004 (UTC)

Comments have also been made about Arabic and about Chinese pinyin. IIRC the "ü" is phonemic and not optional in pinyin. Were you talking about tone marks? Pinyin is a romanization that (along with other romanizations) was invented for the benefit of foreigners. Pinyin title entries should include tonality, but it is still an open question as to whether that should be done with diacritics or numerals.

Yes, we were talking about tone marks.
If I am right, pinyin without tone marks is used at Chinese schools. So it is not only for foreigners, and we should include pinyin without tone marks. Andres 17:31, 12 Jul 2004 (UTC)
Schools in China could very well proceed without the tone marks because native Chinese speakers know them already, and could supply them at will. Outsiders need to be reminded that a change in tone means a completely different word. Eclecticology 19:04, 12 Jul 2004 (UTC)
Imagine you are reading a Chinese schoolbook with pinyin without tones, and you come across a word you don't know. Then it's good if you have a Wiktionary entry which provides you with possible readings of that word (because it might be a homograph). This does not exclude entries with tone marks. Andres 01:47, 13 Jul 2004 (UTC)
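
For illustration only, the three candidate title forms for a pinyin syllable (tone diacritic, tone numeral, no tone) can be derived from one another mechanically. The sketch below assumes the usual numbering of macron = 1, acute = 2, caron = 3, grave = 4, treats an unmarked syllable as having no numeral, and keeps the "ü" because it is phonemic rather than a tone mark. The function names are made up for this example.

```python
import unicodedata

# Combining tone marks and the tone numbers assumed for them.
TONE_MARKS = {"\u0304": 1, "\u0301": 2, "\u030C": 3, "\u0300": 4}


def split_tone(syllable: str):
    """Return (toneless syllable, tone number or None) for one syllable."""
    tone = None
    kept = []
    for char in unicodedata.normalize("NFD", syllable):
        if char in TONE_MARKS:
            tone = TONE_MARKS[char]  # drop the tone mark, remember the tone
        else:
            kept.append(char)  # the diaeresis of ü is kept
    return unicodedata.normalize("NFC", "".join(kept)), tone


def numbered(syllable: str) -> str:
    """Rewrite a tone-marked syllable with a trailing tone numeral."""
    base, tone = split_tone(syllable)
    return base if tone is None else f"{base}{tone}"


if __name__ == "__main__":
    for marked in ("mā", "má", "mǎ", "mà", "lǜ"):
        print(marked, numbered(marked), split_tone(marked)[0])
```

The toneless form is what a reader of an unmarked schoolbook would search for; an entry under that title could then list the marked readings.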

For Arabic, I again wonder what is intended. The "diacritics" here are really vocalizations, because Arabic is commonly written without vowels, except that the Qur'an must always be printed with full vocalization. The absence of vowels creates great difficulty for those who are not Arabic speakers, and the inclusion of Arabic words on the English Wiktionary is for the benefit of English rather than Arabic speakers. Arabic words start from a triliteral root, and plurals may be indicated by changing the vowel pattern without it being indicated in ordinary writing. Eclecticology 00:02, 2 Jul 2004 (UTC)

For Arabic, both fully vocalized forms and forms used in common texts should be included. In Arabic, there are many homographs that are often distinguished by means of some vocalization mark or consonant lengthening mark.
On the other hand, long vowels are indirectly written in Arabic, though again homographically with diphthongs. Arabic words need not start from a triliteral root; letters indicating long vowels are inserted into roots; if singular and plural should graphically coincide, then some vocalization might be useful; in most cases singular and plural don't graphically coincide. Andres 17:31, 12 Jul 2004 (UTC)
We also need to allow for the fact that Arabic is not the only language that uses Arabic script. I confess that I have no idea how Pashto or Urdu handles this sort of thing. Eclecticology 19:04, 12 Jul 2004 (UTC)
We handle this for each language separately and independently, under different subheadings. I don't know these languages either. Andres 01:47, 13 Jul 2004 (UTC)