Wiktionary talk:Capitalization
Archive of Discussion moved from Beer Parlour (22-May-2005)
April 1 2005 d day
As there is no discussion going on about the technical way in which we will convert en:wiktionary after the change to first character capitalisation, I intend to ask the developers to turn off first character capitalisation on the first of April. There is no point in prolonging the waiting period when there is no practical technical discussion going on. GerardM 21:22, 30 Mar 2005 (UTC)
- You speak for yourself, I think. I have serious reservations about the purported magnitude of the switch. Knowing now that all German nouns are capitalized is a very salient reason to reconsider this switch. Particularly when it is being discussed, and apparently will therefore break the current functionality of the "Go" button. --Connel MacKenzie 04:11, 31 Mar 2005 (UTC)
- Furthermore, the lack of discussion about how to convert after the change is due to the ambiguity of what the change will actually do.
- The supposed lack of discussion is a reason to delay the switch, not a reason to rush it. --Connel MacKenzie 04:15, 31 Mar 2005 (UTC)
- There is ample experience with non-capitalised wiktionaries. There is also ample experience with the work generated by it. I have converted the nl:wiktionary and have helped many others convert. It is just a lot of WORK. With an overwhelming majority in favour of moving away from capitalised articles, there is no reason not to. GerardM 07:21, 31 Mar 2005 (UTC)
- Gerard, can you please let things go their merry way. Change takes time. The fact that you, as an outsider to this Wiktionary, are going to talk to developers to make the switch for us, is NOT helping. We are perfectly able to talk to the developers ourselves, if and when the time comes to do so. It's probably going to be before the 1st of May, but it's certainly not going to be before the 1st of April.
- If this switch is going to be made, it will be because the English Wiktionary community is convinced that it is a good idea. Making the switch in a rushed way and having contributors disgruntled is not an option. It is a pity that the entire Wiktionary community is not behind it. I deplore for instance that Hippietrail opposes it so much. I should probably try to find out why this is the case, but I don't seem to find the time to do so. Just like I don't find the time to do so many other things that would be useful. Polyglot 07:45, 31 Mar 2005 (UTC)
- I do not mind waiting. BUT there is no discussion on how this change is to be done. If there WAS discussion you would have a point. The community voted decisively several weeks ago. So your argument that the whole community needs to be convinced is irrelevant because the opposing people have been outvoted. That is the sorry way of voting. The other side of this coin is that it makes change possible. As to being an outsider, I have a sufficient number of edits on the en:wiktionary to make that a silly notion.
- The fact that some people are opposed can be found in their voting against. We passed that stage. There is no technical discussion. So there is no point in waiting. GerardM 09:14, 31 Mar 2005 (UTC)
- Just out of interest - I don't remember seeing a fallback plan that we can take if things go pear-shaped. For instance, if we split an entry into capitalized and non-capitalized forms and then have to revert to the current system, will those two pages both be available? SemperBlotto 09:27, 31 Mar 2005 (UTC)
- There is no fall back plan. The mess just needs to be cleaned up. That is all there is to it. GerardM 19:04, 31 Mar 2005 (UTC)
- I don't understand why one would want to stick to first letter capitalization. Capitalization is an integral part of the German language for instance, and hence should be respected. The earlier we switch, the better and the less work there is to do. GerardM, please ask the developers to turn off first letter capitalization. Ncik 31 Mar 2005
- Ncik, that is a psychotic suggestion. As Polyglot pointed out earlier, this notion does not currently have full backing from the community. I'd like a reason to support this switch, but as of right now, I have enough information at my disposal to object strongly to this.
- Please do not do any such thing, until you can accurately describe what happens when thousands of German nouns become the preferred lookup match for what would otherwise be lowercase English words...and 99.99% of the time the "user" is trying to find the English word. There are thousands of German nouns. Having a notion that Wiktionary lookups will work fine for all but a couple thousand words is NOT a good idea.
- Do not just turn this on. It can't be turned off. It is likely to break thousands of normal lookups. And the "disambiguation page" work-around has not been accurately described. There are several variations I can think of off the top of my head - which one is going to be the recommended one?
- Or is the intent to have Wikipedia style disambiguation pages, or just cross references at the top of each page? If the latter, then lookups will certainly be broken.
- When I voted Yes to this, it seemed clear there would be a hundred or so pages that would need immediate attention, and at most several hundred other pages that could be affected. It is clear now that the magnitude of the "trivial" change is many thousands of disambiguation pages (therefore at least twice as many other page modifications) with no one volunteering to help automate the task.
- At least four sysops would be REALLY pissed off by such an action.
- --Connel MacKenzie 00:19, 2 Apr 2005 (UTC)
- I am a sysop in many projects. It does not make me special. It does not give me extra rights. It gives me a few extra options that are handy. I can change articles to undercase like in the Kanji stuff. Am happy to do those. I can get all articles changed in many ways if we decide to do it. What pisses me off is that this decision is being stonewalled. GerardM 21:32, 11 Apr 2005 (UTC)
- This change will mean that almost all pages will have to be moved and quite a few will have to be split. The only pages that will stay as they were will be the pages describing proper nouns. I'm sorry this wasn't clear when I relaunched the vote. I had taken it for granted and thought it was obvious. German nouns are always capitalized. The problem for German is with all the other parts of speech. Adjectives, adverbs etc are never capitalized. In French adjectives aren't capitalized either. So Français is a person and français is the adjective and (maybe oddly) the name for the language. Each language has its own rules, which is why I believe capitalization is important and should be represented in the titles. Polyglot 08:12, 2 Apr 2005 (UTC)
- Polyglot, I had that basic understanding yes. I think that we should be headed toward a more technically accurate Wiktionary. I had thought a one time MySQL conversion would convert all to start with lower case, and "we" would manually have to move pages to their capitalized meaning. Barring that, it could be done much slower by bots. BUT, I still do not see how having a separate entry for the German noun Kind would not break a "normal" lookup for the English word kind. If the "Go" button really does always search the upper-case variant first, it is a massive long-term problem. Would we then have Wikipedia style disambiguation pages at kind that link to Kind (German noun), kind (English noun) and kind (English adjective)? That quickly leads to madness. (Some of us don't need any help, there!) --Connel MacKenzie 17:30, 2 Apr 2005 (UTC)
- Actually I think German is a spurious example anyway--the German rules say nouns are _always_ capitalized, but titles are already capitalized here. There's no German rule that says all non-nouns are decapitalized in all contexts; in fact they do capitalize in titles, which is why the German Wikipedia lives with capitalized titles, and the only wikipedias that differ are artificial languages that do have such rules about decapitalization, e.g. tlh and tokipona. (Nevertheless, the communis opinio seems to be that decapitalization is somehow essential.) —Muke Tever 00:27, 2 Apr 2005 (UTC)
- What? All words will have their initial letter Capitalised? I assumed that they would all be made lowercase, and we would only have to change Proper (and German) nouns and the like. I now formally change my mind and vote No. SemperBlotto 09:31, 2 Apr 2005 (UTC)
- In reply to Muke: Entries in dictionaries are not titles. Dictionary entries always distinguish between upper and lower case. In reply to Connel MacKenzie: That's why it won't be necessary to have disambiguation pages. In reply to SemperBlotto: Whatever people assume or suggest here, it will be technically possible to change all entries and links to lower case so that one only has to capitalize proper nouns, German substantives, and a very few more. Ncik 08 Apr 2005
- Yes, entries in dictionaries are not titles. But the part of a wiktionary article that corresponds to the "entry" or headword is that line that reads: "word (plural: words)" (or whatever). The title of a Wiktionary article is something that corresponds to no part of a conventional dictionary entry. (For an example, compare the title מאדים with its headword מַאַדִים—the headword is displayed as it would be in a print Hebrew-English dictionary, and the title is displayed as an ordinary title, which you likely wouldn't see in a print dictionary's format.) —Muke Tever 00:28, 12 Apr 2005 (UTC)
- Ncik, my point is that if Wiktionary turns the switch on and does not have (detestable) disambiguation pages, the Go button will cease to function, because it will select the German Noun instead of the desired main English article page. This "glitch" was not apparent (or perhaps downplayed) when the issue was brought up for re-vote. If the English Wiktionary was like print English Dictionaries, we could follow their rules, but a very important difference (right or wrong) is that it includes all words in all languages. --Connel MacKenzie 14:12, 9 Apr 2005 (UTC)
- At this moment we are still not honoring the overwhelming decision to end first character capitalisation. I think it should be relatively easy to change all words to undercase. The idea of what turning off does could be easily discovered in other projects like the nl:wiktionary among many others. The idea of how the "go" button works is of marginal importance in this whole question because people do not enter "Word" when they search for something but they type in "word". So this is a non issue. GerardM 09:20, 11 Apr 2005 (UTC)
- A delay is not a case of "not honoring" the (questionable) decision. Even more relevant is that in the initial vote, no timeline was given: that issue was specifically postponed to be discussed after implementation details were worked out.
- I'm sorry Gerard, but your assertion about the "Go" button simply is not true. An automatic spell check that uses Wiktionary as its reference will submit capitalized first letters whenever they start a sentence, or appear in a title, or start a list (etc.)
- Another example: For doing concordances, sometimes capitalization matters, other times it does not. Right now, making the assumption that all terms are lower case "works" - in that entering a term will get you to the page that has the information one desires somewhere on it (on the first hit, no less.) With the switch, that is no longer the case. Kind at the start of the sentence will not get you to the English word kind even if that was what was intended, and even if you do follow the link to the German word.
- Thank you for the suggestion to (again) check a foreign language Wiktionary. Searching the "All Pages" special page, I found nl:búlgaro and nl:Búlgaro. This looks like a good example of what is intended (by myself as well, by the way.) That is NOT the case I am saying is broken. There are thousands of German nouns. And for all of them, some of the time, not only will en.Wiktionary display the "wrong" (undesired) page, but there would be no cross link between the two titles! And worse still, there would be no outward indication that the lower case page even exists. --Connel MacKenzie 14:08, 11 Apr 2005 (UTC)
- First of all, calling a decision questionable shows you to be a bad loser. I take it that you do not mean it like that. Deciding to not implement because there should be first endless discussions about how to do it gives an appearance that you want to wiggle out from under the decision to implement what has been decided upon. There is ample experience with implementation. There are several suggestions how it can be done with less effort. It has even been suggested that doing it all by hand would be best because it provides for an opportunity to do some much needed quality control.
- That is a needless personal attack, particularly when (if you check the history) you'd see that I have been more of an advocate of the case change than an opponent! The vote was questionable; it was questioned and proven to be skewed, misleading and overall wildly misrepresentative. (Not by me!)
- Why you are trying to rush the implementation is not apparent. The issue is still being debated and you wish to make it a fait accompli. That speaks very poorly of you, and your tactics, sir.
- --Connel MacKenzie 13:56, 13 Apr 2005 (UTC)
- We are a dictionary; we are not useful for automatic spell check. Given the lack of structure of the English wiktionary it will never be useful for this purpose. With proper software you can allow for a word to be capitalised under certain conditions. German nouns are ALWAYS capitalised while English nouns only under specific conditions. By having words in undercase you allow for a rule to capitalise. By having them all in uppercase it is not evident that the word may need to be NOT capitalised.
- Given the definition of concordance, I do not understand what you mean.
- I was referring to Wiktionary:Concordances. To give perhaps a better, more tangible example; ro vs. RO. One link is red. Entering "ro" then pressing "Go" you are directed to the "wrong" page. Given that the current proposal is to add thousands of "wrong" capitalization entries, this problem is likely to expand as a result of the proposed capitalization change. --Connel MacKenzie 13:56, 13 Apr 2005 (UTC)
- NOTE: at 05:04, 15 Apr 2005 User:Mike added the lower case entry. This example now better exposes the concept of articles being related only by capitalization, not having any cross reference. --Connel MacKenzie 01:37, 17 Apr 2005 (UTC)
- The words búlgaro and Búlgaro are two different occurrences; they are distinct words. They are not related. There is no cross link as there should not be. You either look for the one or for the other. When I look for the undercase word and press OK I get the word, with uppercase ditto... (I do not understand what I am missing here) GerardM 21:27, 11 Apr 2005 (UTC)
- What you missed is that I do not know the correct spelling of the word on nl:. If I am looking up a term, the two variants should be reachable. Right now, only one is (from the "Go" button.) And the one is not cross referenced from the other, so I have no indication that I found the completely wrong definition! --Connel MacKenzie 13:56, 13 Apr 2005 (UTC)
- None of the points that Connel is trying to make is a point: Automated spellchecks (whatever he has in mind) won't work if the dictionary used contains misspelled words. As said before, the spellchecking program will have to make sure that words following full stops are looked up in upper and lower case, and that they are looked up in the right language (so German Kind and English kind don't constitute a problem). They are problematic of course if mathematical or other scientific texts are checked, or text in multiple languages. Abbreviations are another problem since they are normally followed by a dot, which a program generally can't distinguish from a full stop, etc. Implementing a case insensitive search function might be a good idea anyway: Not only the first letter is concerned but others as well: Suppose I'm looking up the word UNO in an Italian text which, for some reason (maybe because it's a warning), is capitalized throughout. One will assume it's "uno", maybe try "Uno" just in case it's a name, but completely forget that it might just be the UN Organisation (slightly bad example since everyone knows the UNO, but there will be better ones). "búlgaro" would actually have a crosslink to "Búlgaro" if anyone could be bothered to add the translations. But there are even better examples: E.g. take the German pronoun "mein" and the noun "Mein". These words have nothing in common except their pronunciation (just realize that there will be a link in the "Homophones" section. Even better. If there really are words that have nothing in common except that one is written capitalized and the other isn't, why should there be a crosslink??). It is common sense that a dictionary distinguishes between upper and lower case. Hence it is the user's responsibility to enter words correctly. If the user doesn't know the correct spelling he should be able to use a powerful search function.
This is not only about first letter capitalization, but also about diacritics, which are omitted in many texts, the German letter ß and the umlauts, which are often rendered (especially in the world of the computer) as "ss", "ae", "oe", and "ue" respectively, and probably many, many more examples. First letter decapitalisation must not be delayed any longer though. We had better trouble users that don't know the exact spelling of a word till a better search is available than have to change many tens of thousands more words later on. Ncik 11 Apr 2005
- I am sorry you have the opinion that you do. The simple fact that a person looking up a word may or may not know the correct capitalization or spelling should be enough for you to realize that forcing the search to default to the wrong capitalization for thousands of terms is the wrong approach. --Connel MacKenzie 13:56, 13 Apr 2005 (UTC)
- I get the feeling you don't have a problem with first letter decapitalization but with the current search functionality. If that's the case, I agree with you. Ncik 13 Apr 2005
- The current search functionality only works with all-capitalized first letters. In concept, I'd like to see separate entries for Catholic vs. catholic. You are proposing a database design change that the current search capability clearly does not handle well. --Connel MacKenzie 01:37, 17 Apr 2005 (UTC)
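Editorial sketch: the "powerful search function" asked for above could, under some assumptions, be a lookup keyed on a folded form of each title, so that Kind/kind, búlgaro/Búlgaro and Straße/strasse land on the same key and all spellings are offered. The names `fold_key`, `build_index` and `lookup` are hypothetical and this is not how the MediaWiki search works; the "ae", "oe", "ue" renderings of umlauts are deliberately not handled here.

```python
import unicodedata
from collections import defaultdict


def fold_key(title: str) -> str:
    """Reduce a title to a case- and diacritic-insensitive lookup key."""
    # casefold() lowercases and also maps the German ß to "ss".
    folded = title.casefold()
    # NFKD splits an accented letter into base letter + combining mark,
    # so dropping the combining marks strips the diacritics.
    decomposed = unicodedata.normalize("NFKD", folded)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_index(titles):
    """Map each folded key to every real title that folds to it."""
    index = defaultdict(list)
    for title in titles:
        index[fold_key(title)].append(title)
    return index


def lookup(index, query):
    """Return all titles matching the query regardless of case or accents."""
    return sorted(index.get(fold_key(query), []))


index = build_index(["Kind", "kind", "búlgaro", "Búlgaro", "Straße", "UNO", "uno"])
print(lookup(index, "KIND"))
print(lookup(index, "bulgaro"))
```

Because every matching spelling is returned, the reader who typed the "wrong" capitalization still sees that the other page exists.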
It seems to me that more people are interested in making debating points about this than in trying to find a solution to the problems. The vote was about the principle of having the first letter case sensitive. I certainly made that point when the vote began. Keeping the operation of the search function, and transition issues out of the vote was intentional. The results of the vote were overwhelmingly clear. Any suggestion that the results were somehow "skewed", or "misrepresentative", or "misleading", or whatever else of the ilk is an insult to the intelligence of everybody that voted "yes". We voted for the principle of having case sensitive first letters, nothing more, nothing less.
- It is misrepresentative to suggest, as you do now, that I am insulting myself! --Connel MacKenzie 01:37, 17 Apr 2005 (UTC)
- When you make unfounded claims about improprieties in the voting process you can expect such reactions. Eclecticology 08:30, 17 Apr 2005 (UTC)
- Having read the historically relevant pages, I can't imagine how you say my claims were unfounded. --Connel MacKenzie 02:36, 29 Apr 2005 (UTC)
Most of the difficulties that have been raised are about problems to be solved, not about insurmountable roadblocks that would defeat the principal objective. I have been critical of Gerard's impatience, and he has at least recognized that it takes time to get things done right. I am far more critical of the whiners who keep finding excuses for why the change can't be made. This has nothing to do with other people's spell checkers; if they work or don't work it's their problem, not ours. This has nothing to do with concordances, or trying to define what we mean by "title". It has nothing to do with the "Go" button or trying to guess whether users search with an upper or lower case first letter. It probably will need some fixing, but that can come later when a developer has a chance to look at it. I don't hear any complaints from the other language Wiktionaries that the "Go" button is a big problem.
- I revoked my "Yes" vote because this does seem to be an insurmountable problem. Comparing this Wiktionary to others is not fair; each language has its own capitalization requirements. Since we include all words in all languages, and we enter all our words decapitalized, the normal search mode after the change will be to find the wrong word. That has nothing to do with imaginary spell checkers. That has nothing to do with my example from the concordances. It is confounded by ambiguous terminology, where "title" can refer to a database entry key, an automated heading or a free text portion of an entry.
- What truly irks me, is that no one has described how they think a "split" article should be split. Is your proposal to have no links between articles (that are only related by their coincidental different capitalization)? Or are you proposing that "split" articles be given a disambiguation page? --Connel MacKenzie 01:37, 17 Apr 2005 (UTC)
Cross references will be needed when articles need to be split, but it's pointless to complain about their failure to magically appear. In the present setup they serve no useful purposes. They will all need to be written manually when each article is split.
- What Gerard pointed to as an example did not have any cross references between articles. Are you saying that the German noun Kind will have a "See also" link to the English word kind and vice-versa? Or by "Cross references" do you mean Kind will be a disambiguation page that points to Kind (German noun), kind (English noun) and kind (English adjective)? --Connel MacKenzie 01:37, 17 Apr 2005 (UTC)
- I favour cross references like "see kind" and "see Kind" with the actual wording and placement to be determined later. For most of these articles the split will be into only two pages and no more. The disambiguation pages, as you describe them, seem to produce unduly awkward page titles. Eclecticology 08:30, 17 Apr 2005 (UTC)
- Thank you. Sorry I missed this particular post for as long as I did. I agree that the disambiguation approach is unwieldy. If I get a chunk of time soon, I'll see what I can parse out of the sql dump to get an idea of the magnitude of all this. As you say, the wording and placement will come later - when we see it in practice. Unfortunately, I would think that "botting" the x-refs at the top and bottom of the page might be reasonable as a first pass (so the excess can be bot-removed later.) I guess we'll have to see how it turns out. (You all are still planning on edging forward with this idea eventually, despite my and others' objections?)
- Also, is there any sort of consensus on whether to link articles (whose only relation is upper-case/lower-case) as a matter of informal policy? --Connel MacKenzie 02:36, 29 Apr 2005 (UTC)
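Editorial sketch: the two ideas in this exchange (gauging the magnitude from a list of titles, and "botting" the cross references) could be prototyped roughly as follows. It assumes the titles have already been extracted from the dump into a plain list; `case_groups` and `see_lines` are hypothetical names, and the "see [[...]]" wording is only a stand-in, since the actual wording and placement were left to be determined later.

```python
from collections import defaultdict


def case_groups(titles):
    """Group titles that differ only by capitalization (e.g. Kind / kind)."""
    groups = defaultdict(set)
    for title in titles:
        groups[title.casefold()].add(title)
    # Only groups with at least two spellings need cross references.
    return {key: sorted(variants) for key, variants in groups.items()
            if len(variants) > 1}


def see_lines(titles):
    """For each affected page, build a line pointing at its other spellings."""
    lines = {}
    for variants in case_groups(titles).values():
        for title in variants:
            others = [v for v in variants if v != title]
            lines[title] = "see " + ", ".join(f"[[{o}]]" for o in others)
    return lines


titles = ["Kind", "kind", "Polish", "polish", "candela", "RO", "ro"]
print(len(case_groups(titles)), "groups need cross references")
for page, line in sorted(see_lines(titles).items()):
    print(page, "->", line)
```

Counting the groups gives a first estimate of how many pages a split would touch; pages with a single spelling, such as candela here, are left alone.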
I do not agree with the solution that the first letter of all article titles be changed to lower case. That would improve nothing in any meaningful way. The links that now begin with a lower case letter are probably mostly correct. Because of the general unwillingness to seek solutions I find myself getting closer to Gerard's view that the change should be sooner rather than later. When the switch is made active a bot should be started that will change the title for those articles where all links to it begin with a lower case letter. Where this single criterion is not met something like Category:Please check should be added to the page. After that we can visit each so categorized article, and either move the page manually, split the page, or simply remove the category tag if no change is needed. If someone wants to write the bot, the changeover can be made shortly after it is ready. Let's get past all this unseemly bickering. Eclecticology 07:32, 16 Apr 2005 (UTC)
- I volunteer for writing a bot. Haven't done this before but can't be that difficult, and I'm willing to put some work in. Someone will have to tell me where I can find information on how to write a Wiktionary bot. There should also be some people involved who know what kind of problems may arise. I think I roughly know what Wiktionary currently looks like with respect to European languages, but haven't got a clue when it comes to Arabic, Chinese, Japanese, etc. Ncik 17 Apr 2005
- Thank you. I am clueless about writing bots, so I can't help there. I may be a bit more helpful with the second question. Arabic, Chinese and Japanese do not present any problems because capitalization does not exist in those languages. The idea of capitalization is limited to languages with Roman and Cyrillic scripts, plus Greek and Armenian. Thus far we only have a handful of Armenian words, so it's not a big worry. Keep the bot as simple as possible. If there is any doubt about whether an article should be changed, don't, and let it be tagged for manual checking. The bot can probably be written to ignore articles with a title other than in the target scripts. Eclecticology 08:30, 17 Apr 2005 (UTC)
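Editorial sketch: the bot rule proposed above (move a page only when every link to it starts with a lower case letter, otherwise tag it for manual checking, and ignore scripts without capitalization) might look like this. The function name `decide` is hypothetical, the link texts are assumed to be supplied by some other step, and two choices are assumptions rather than anything agreed in the discussion: a page with no incoming links is treated as doubtful, and "has capitalization" is approximated by testing whether the first character has distinct upper and lower case forms.

```python
def decide(title, incoming_link_texts):
    """Return (action, new_title) for one page after the switch.

    action is "skip"  - title is in a script without upper/lower case,
              "move"  - every incoming link starts with a lower case letter,
              "check" - anything doubtful; tag with Category:Please check.
    """
    # Assumption: a character whose lower and upper forms are identical
    # belongs to a script without capitalization (Chinese, Hebrew, ...).
    if not title or title[0].lower() == title[0].upper():
        return ("skip", title)
    # Assumption: no incoming links counts as doubt, so it is not moved.
    if incoming_link_texts and all(link[:1].islower() for link in incoming_link_texts):
        return ("move", title[0].lower() + title[1:])
    return ("check", title)


print(decide("Candela", ["candela", "candela"]))
print(decide("Polish", ["polish", "Polish"]))
print(decide("מאדים", ["מאדים"]))
```

Keeping the rule this narrow follows the advice to keep the bot as simple as possible: anything it does not move is left for a person to move, split, or untag.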
after the change
Links from Wikipedia
The folks over at Wikipedia are trying to figure out how best to link to us now that we have decapitalized. If you'd like to weigh in, head to w:Template_talk:Wiktionary and put in your two cents. --Dvortygirl 03:35, 13 July 2005 (UTC)
- Thanks for the notice, but I don't think that there's much that I could do to help there. Eclecticology 05:44, July 13, 2005 (UTC)
- As of last night, the default search for the "Go" button was improved to handle the 1st character case problem better. That may relieve some of their angst. --Connel MacKenzie 13:54, 13 July 2005 (UTC)
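Editorial sketch: what the developers actually changed in the "Go" button is not described here. One plausible fallback order, shown purely as an assumption, is to try the term exactly as typed and then the same term with its first letter in the other case, reporting any other spelling that also exists. The names `first_letter_variants` and `go` are hypothetical.

```python
def first_letter_variants(term):
    """The term as typed, then with its first letter lower- and upper-cased."""
    if not term:
        return []
    head, tail = term[0], term[1:]
    variants = [term]
    for alt in (head.lower() + tail, head.upper() + tail):
        if alt not in variants:
            variants.append(alt)
    return variants


def go(term, existing_titles):
    """Return (page to show, other existing spellings worth mentioning)."""
    hits = [v for v in first_letter_variants(term) if v in existing_titles]
    if not hits:
        return None, []
    return hits[0], hits[1:]


existing = {"Kind", "kind", "candela", "RO"}
print(go("Candela", existing))  # falls back to the lower case page
print(go("Kind", existing))     # exact hit, with the other spelling reported
print(go("ro", existing))       # not found: RO differs in more than one letter
```

The last call shows the limit of a first-letter fallback: the ro vs. RO case raised earlier would need a fully case-insensitive search.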
Capitalization - double redirects
The server-side script has moved some Uppercase redirects to lowercase - creating double redirects a -> b -> c.
Thus Transwiki was moved to transwiki and Down's syndrome was moved to down's syndrome. If you come across this, the Uppercase version needs to be edited to become a copy of the lowercase version, instead of a redirect to the lowercase version. Keep on trucking. SemperBlotto 1 July 2005 13:54 (UTC)
- Why would we make the uppercase version a copy? Why don't we just reaim the redirects? 24 2 July 2005 18:18 (UTC)
- You may have misread it; that is what he is asking you to do (in the case of a double redirect, a -> b -> c becomes a -> c and b -> c.) --Connel MacKenzie 3 July 2005 03:34 (UTC)
- http://en.wiktionary.org/w/index.php?title=Special:DoubleRedirects&limit=500&offset=0 if you are feeling helpful for these 79 or so entries. --Connel MacKenzie 3 July 2005 19:31 (UTC)
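Editorial sketch: the fix described above (a -> b -> c becomes a -> c and b -> c) amounts to following each redirect to its final target. A minimal version, assuming the redirects are available as a plain mapping from page to target; `resolve_redirects` is a hypothetical name, not a MediaWiki function, and redirect loops are simply left unchanged.

```python
def resolve_redirects(redirects):
    """Point every redirect straight at its final, non-redirect target."""
    fixed = {}
    for page, first_target in redirects.items():
        seen = {page}
        target = first_target
        # Follow the chain until we reach a page that is not itself a redirect.
        while target in redirects and target not in seen:
            seen.add(target)
            target = redirects[target]
        # If we stopped on a redirect we hit a loop: leave this entry as it was.
        fixed[page] = first_target if target in redirects else target
    return fixed


print(resolve_redirects({"a": "b", "b": "c"}))
```

Single redirects pass through untouched, so the same function can be run over the whole list on Special:DoubleRedirects without special-casing.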
Capitalization is here!
It appears that our developers have stopped waiting for us to come to our own consensus and request that automatic first-letter capitalization be turned off, and made an arbitrary decision. This means that article titles should now begin with a lower-case letter, unless they are proper nouns or otherwise require capitalization (e.g. Tuesday). Therefore, please commence moving articles to the appropriate pages, breaking up articles that require different capitalizations, e.g. polish (to shine), and Polish (from Poland). This is an excellent opportunity, as well, to recheck and improve articles for conformance to current standards of article quality and formatting.
There are bots that can help with the transition. They exploit interwikis and the sections in the articles, where they exist, that have the word in bold, e.g. word. Bots can also help with deleting all of the redirects that so many moves will leave behind.
I recommend that the remaining nominations in Wiktionary:Administrators and Wiktionary:Bureaucrats be considered for approval soon, since this is going to be an enormous task. I would also recommend that we reach some consensus or policy on this transition, and perhaps on some of the related formatting cleanup quickly. Comments, all? --Dvortygirl 29 June 2005 20:58 (UTC)
- Of course, I also see that the thing which should have been updated first—the Go button— hasn't been updated yet. You can click on the link here to get to the lowercase candela, but we cannot get there from the Go button to read it, even if you create the article, or move an existing one. Gene Nygaard 29 June 2005 21:03 (UTC)
- Important note: If you're moving an article that is in a namespace (for example, Template:), do NOT select "Move "talk" page as well, if applicable." as this will move the talk page in the main namespace. For example, moving Template:Wikipedia to Template:wikipedia with that option selected would move Talk:Wikipedia not Template talk:Wikipedia. Thank you. --Wytukaze 29 June 2005 21:08 (UTC)
This is utter nonsense. What have our "developers" been smoking? 24 29 June 2005 22:34 (UTC)
- Developers' drug habits aside, this has happened and we'll all just have to live with it. Some of us like it, some of us don't, but either way, we better fix our little project here so all the links work again. And it's rather too late to complain, the process has already started, reverting it would be more difficult. --Wytukaze 29 June 2005 22:38 (UTC)
- The thing is, there's no damn reason whatsoever for each individual editor to have to reinvent the wheel to figure out how to cope with this. Wake up, fools.
- The idiots doing this should have had someone explaining various little things to look out for. They should have initiated the discussion here, or on a special page for that purpose, rather than having it called to our attention by some random user paying attention to what was going on, long before those little notices ever started to appear at the top of our screens. For example, it seems to me that links to lowercase spelling contained on an article page will link to the uppercase spelling, up until the time that the page containing the link is edited and saved. Then it will link to the lowercase spelling? Is that right? Has anybody else figured that out independently? Or did I just figure it out wrong, in the absence of any guidance? Will that always be the case, or are there other steps involved in the implementation process which will deal with that? Gene Nygaard 29 June 2005 23:07 (UTC)
This change was never supposed to have been made here until an automated fixup script was ready. Someone made the change without that being ready, and has not yet confessed to the crime. I'm working on a quick fixup script now, which hopefully should be ready in a few hours. Note that this will run much, much faster than any manual or bot attempt to rename pages. --Brion June 29, 2005 23:57 (UTC)
- I've run the script. There's still cleanup to do, but the majority of entry pages are most probably now at their correct titles (lowercase). --Brion June 30, 2005 01:49 (UTC)
- So now what the hell happens when I enter Kind, in that example everybody was talking about every time this discussion came up, as near as I can tell from various archives I've looked at? It takes me to the lowercase kind article, of course. So where are all those people who wanted this to "work right"?
- When I enter "kind" on the Go button, it takes me to kind. And when I enter Kind on the Go button, it also takes me to the same kind entry. Same thing happens when I click on either of these links here. So let's suppose I'm a real newbie, and not just a newbie here on Wiktionary. How the hell do I enter anything in the Kind article, when I can't even get there? How can I take the German noun out of "kind" and put it into "Kind"? Did anybody ever think that it might be a good idea to tell all the editors how they can accomplish this? Was it ever done anywhere, in any of the instructions on Wiktionary? It isn't so hard, once you know how to do it. But it would be nice if somebody made some effort to help people deal with the problems still left lingering, even after the magic pill of this "script" has been administered. Gene Nygaard 30 June 2005 02:17 (UTC)
- Then, of course, the next problem after I figure out how to move part of the kind article to the Kind article is this.
- What kind of link should it be?
- Where should it appear on the page?
- Is there any template to help me accomplish this?
- If not, can I create my own template to help me out?
- If I do, where do I let other people know that I have done so, so that they could use it as well?
- I think you get the idea. Where the hell is all this missing information? Gene Nygaard 30 June 2005 02:24 (UTC)
- Help:Redirect Uncle G 30 June 2005 03:58 (UTC)
- Good grief, Uncle G. That's totally useless for the issue I've raised. We will now have an article with substance at kind and another article with different substance at Kind. We don't want to redirect "kind" to "Kind", and we don't want to redirect "Kind" to "kind".
However, your mention of redirects leads us to the next problem. Now that the scripts have been run and all the uppercase initials have been changed to lowercase, how do we get rid of all the useless redirects after we move united States of America to United States of America and the like. We don't really need a redirect from new York and from charlie Brown and all the others like them, do we? Let's throw them out en masse, without having to nominate each and every one of them for deletion. Gene Nygaard 30 June 2005 04:26 (UTC)
- It's far from totally useless. It answers the very questions that you asked above. Uncle G 30 June 2005 09:59 (UTC)
- So show me!!
- If you can't see the forest for the trees, and if the answer appears under a topic different from what people would expect to find it in any case, it really isn't a sufficient answer even if you can "show me", is it? Gene Nygaard 30 June 2005 13:16 (UTC)
- Maybe I misunderstood you, and you misunderstood me as well. It does, of course, answer the first question I brought up, the question of how do I get to the redirect page to put the new content in what used to be a redirect, and will now be a new entry under Kind distinct from the entry under kind. See Help:Redirect#Changing a redirect.
- But what that Help:Redirect page does not answer, and does not address in any way, are the questions I raised as the next problem. It doesn't tell us how to go about interlinking two existing pages, and where those links should be placed, or even whether or not they should be interlinked, and questions along those lines. Gene Nygaard 30 June 2005 13:47 (UTC)
And it's rather too late to complain, the process has already started, reverting it would be more difficult. --Wytukaze
- Why on earth did the devs secretly tell only this contributor that a revert process would be more difficult? Or is this just misinformation? — Hippietrail 30 June 2005 04:14 (UTC)
Now if we truly are considered too childish to make our own decisions, what other secret changes are the devs planning to bestow on us without consultation nor warning? Perhaps they'll flip a coin on each controversial topic in the Beer parlour archives and implement their decisions one by one.
Now if it is here to stay and we are going to be obedient little serfs or guineapigs then they better damn well start implementing features for Wiktionary that we want, like fixing their braindead search function.
Also it will be up to us to put See also links between words differing only in capitalisation. To enter articles optionally spelled with a capital letter such as Zeppelin, zeppelin, to decide whether those redirect from one to the other, and in which direction, or if we create duplicate articles for them. Capitalisation will have to be researched properly from now on - no more guessing. And don't forget that a couple of Germanic languages formerly capitalised all nouns just like German still does. So all Danish nouns used before the 1950s, all Norwegian nouns used before the mid 1800s, and all Swedish nouns used before the 19th century will have to exist in one way or another as both a capitalised and non-capitalised page title. What a mess the redirections, duplications, or "see non-capitalised page" entries are going to be... — Hippietrail 30 June 2005 04:14 (UTC)
My information was that it was uncontroversial:
- <Oldak> en.wiktionary voted to turn off first-letter capitalisation several months ago. Is anyone free to do this?
If I had known there was disagreement about this, I wouldn't have done it. -- Tim Starling 30 June 2005 06:16 (UTC)
- Great, looks like lots of things are nicely f***ed up now. What has happened to {{rfc}}, for example? What has happened to the Recent changes earlier than today's? (This is not an attack on Brion's excellent work on moving pages over, by the way - well done you. Imagine if we'd had to do all that by hand.) Who knows what else has been screwed up in the process? Thanks (not!) to whoever decided to jump in and do this on our behalf and leaving us to sort out the mess it has caused. — Paul G June 30, 2005 08:59 (UTC)
- Recent Changes has been swamped by approximately 65,000 page moves. Use Special:Newpages instead. Templates such as {{rfc}} simply need moving back to lower case (as I have just done). I strongly suggest that this is a good time for editors to rid themselves of the bad Wiktionary habit of deleting redirects by reflex. The redirects for the templates are useful, for example, as editors have used both spellings of several templates. Uncle G 30 June 2005 09:59 (UTC)
- Although I've consistently supported turning off capitalization, this is not the way I would have had it happen. The person who asked Tim to do this does not appear to be well informed. That vote did take place, but I expected that some preliminary work would be needed before the change came into effect. Some very important problems have already been raised, but at least we now have the attention of the developers. I'll address these details in the next couple days after I've appointed 3 more admins when I come back on in the morning. Good night. Eclecticology June 30, 2005 09:50 (UTC)
- OK, I've calmed down now :) Thanks for the update, Uncle G and Eclecticology. I think the pseudo-namespaces definitely need to be reverted (for example, there are now two "Rhymes" namespaces - one capitalised and one not) as there is no reason for these to begin with lower-case letters. Could this also be done automatically? — Paul G June 30, 2005 11:02 (UTC)
- Wiktionary:* and Transwiki:* also need to be recapitalized. SemperBlotto 30 June 2005 11:05 (UTC)
- "Transwiki:" isn't, in fact, a namespace, and has in fact always been case sensitive because of this. Note that McBot won't be populating the incoming transwiki queue for a while. The software update broke a lot of 'bots, including McBot (and several major daily maintenance 'bots at Wikipedia). Daniel McDevit is waiting for Kevin Rector. Uncle G 30 June 2005 11:36 (UTC)
- It doesn't really matter for transwiki since these are only temporary pages that are eventually deleted when the work is done. Eclecticology July 5, 2005 02:02 (UTC)
For those with Mop and Bucket in hand ...
- Most templates need moving to lowercase (not copying and not deleting the redirects left behind because for many templates editors have employed both capitalizations)
- The templates that add categories to articles need to be checked to ensure that they have spelled the category name correctly.
- Many Category:Abbreviations, Acronyms and Initialisms need moving to uppercase.
- Thanks, somebody, for finally getting around to providing some of the practical information which should have been presented here on the Beer parlour before the change ever took place. Gene Nygaard 30 June 2005 13:24 (UTC)
- I've moved all the language templates to lowercase. 24 30 June 2005 14:19 (UTC)
- Many 2-letter Category:Symbols need moving to uppercase. Some, such as ba, need a casewise split.
- The case of Category:German nouns, Category:German months, Category:German cardinal number (which should be Category:German cardinal numbers), and Category:German proper nouns needs to be correctly determined, and any splits performed
- Many articles in Category:English proper nouns either need moving to uppercase (Darlington) or a casewise split (Ben/ben).
- There are very many proper nouns - mostly not in this category. You can find them by going to Google, selecting "advanced search", and searching for "proper noun" (no quote) within the "domain" en.wiktionary.org - the first few pages have all been fixed already. SemperBlotto 2 July 2005 07:46 (UTC)
- http://en.wiktionary.org/w/index.php?title=Special:DoubleRedirects&limit=500&offset=0 --Connel MacKenzie 3 July 2005 19:09 (UTC)
- Regarding abbreviations, etc., see also
. Many of these need to be moved to the proper capitalization (e.g., BIOS redirects to bIOS), and probably need to be checked for categorization, as well (i.e., Ab/Ac/In). - dcljr 4 July 2005 08:17 (UTC)
- I've done most of the proper nouns in Category:English proper nouns. I very much recommend putting the "See also" note at the TOP of pages that have both capitalised and uncapitalised forms, rather than in a "See also" section at the bottom. I'm using the form: ''See also'' '''[[blah]]''' — Paul G 4 July 2005 10:04 (UTC)
- According to Google, there are 3,990 articles containing the phrase "Proper noun" in en.Wiktionary. But I can't get it to show me more than 1,000. SemperBlotto 4 July 2005 10:12 (UTC)
Category tagging via templates
- Here's what I think is the right long-term direction for cattag etc. I've tried this out with Category:Basketball, and it seems to work OK, except that the automatic listing in the category is stale. This is actually a bug in MediaWiki. There seems to be a work-around of making a trivial edit to the misfiled article.
- Keep the actual entries the same. If it says {{basketball}}, that's fine.
- Move the templates to their lowercased versions. Template:Basketball becomes Template:basketball. This fixes all the dangling references in pages that use {{basketball}} in one go.
- Keep the template content the same. If it says {{cattag|Basketball}}, that's fine.
- As it happens, the capitalized categories now contain only articles that included the category by hand. It would be a good idea to edit them to say, e.g. {{cat|basketball}} instead of [[Category:Basketball]]
- After (optionally) fixing all manually categorized entries, move the frontmatter of the category to the uncapitalized version. There seems to be no way to move a category, so we have to create the new one and empty out the old.
- At this point everything's hakuna, except that the lowercased category will most likely be empty except for terms you've edited manually. Either wait for the problem to fix itself (a patch to MediaWiki, or something forcing the category to refresh), or use "what links here" with the template to find the misfiled articles and make a trivial edit (e.g., insert a comment) to get them refiled.
- Again, that last step should be unnecessary. If the page for foo includes template {{bar}} and this template contains [[Category:spam]], then the term foo should be in category spam. I.e., it shouldn't matter whether the category link is included directly or indirectly. This is in fact the case for pages that have been edited more recently than the template. It should be true uniformly. At the very least, there should be a "refresh category" function that one could invoke manually when updating a template that uses a category. This will make life easier any time we want to re-categorize a bunch of terms, this being a particular case. In fact, the ability to re-categorize in one go is one of the major advantages of the template approach (consistency and ease of editing being two of the others). -dmh June 30, 2005 16:49 (UTC)
- Update: It looks like the bug may be with templates, not categories. The "what links here" link for a template doesn't seem to get updated when the template moves. The list only contains articles edited more recently than the template. Note that there are ways of handling this situation efficiently. Tracking such changes does not inherently require exhaustively searching all articles when a template changes. On the other hand, I have no way of knowing how MediaWiki does things. -dmh June 30, 2005 17:15 (UTC)
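One way to picture the point about not needing an exhaustive search is a reverse index from template name to the pages that use it. The Python sketch below is purely illustrative and says nothing about how MediaWiki really tracks this; the function names, the regular expression, and the sample pages are all made up for the example.

```python
import re
from collections import defaultdict

# Captures the template name at the start of a {{...}} call, up to the first pipe or brace.
TEMPLATE_CALL = re.compile(r"\{\{([^|{}]+)")


def build_template_index(pages):
    """Map each template name to the set of page titles whose wikitext calls it.

    `pages` is a hypothetical dict of {page title: wikitext}.
    """
    index = defaultdict(set)
    for title, wikitext in pages.items():
        for name in TEMPLATE_CALL.findall(wikitext):
            index[name.strip()].add(title)
    return index


def pages_to_refresh(index, template_name):
    """Pages that would need refiling when the named template changes or moves."""
    return sorted(index.get(template_name, ()))


pages = {
    "dunk": "# {{basketball}} A shot made from above the rim.",
    "rebound": "# {{cattag2|basketball|uncountable}} Recovering a missed shot.",
    "kind": "# Having a benevolent nature.",
}
index = build_template_index(pages)
print(pages_to_refresh(index, "basketball"))
```

With such an index kept up to date on every page save, a template move only has to touch the pages listed under that template's name, rather than every article on the wiki.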
There has been no discussion or agreement about whether category names should be capitalized. I can live with doing it either way (except, of course, for proper nouns) but we can't assume that there has been agreement. There is certainly no agreement to embed all categories into templates. Using manual categorization must continue to be acceptable. Eclecticology June 30, 2005 18:41 (UTC)
- Is anyone assuming there has been agreement? I'm certainly not. That's why I decided to do an experiment with the basketball category which, as far as I can tell, no one really does much with except for me anyway.
- One of the side-effects of the capitalization change is a change in categorization. Previously, adding a link to "Category:foo" (whether directly or through a template) would place the term in category "Foo" (upper case). Now it places the term in category "foo" (lower case). It would have been nice to discuss the consequences of this before cutting over, but as you point out, we didn't.
- This side effect happens to break the "cattag" mechanism, a convenience which has never been and never will be required, but which I and evidently others find highly useful. There are several distinct ways to deal with this:
- Throw up our hands and say "Oh well, cattag is broken because it no longer fits with categories." I would not be in favor of this.
- Fix cat (and cattag, since cat has been inlined into it) to mimic the old behavior by capitalizing the category name. If there is a template/macro/whatever handy to do this, that might be the best solution.
- Rename the categories, where necessary, to use the same case as their tags. This would be consistent, at least. Terms relating to basketball are tagged basketball and categorized under Category:basketball, while terms relating to Siegfried and Roy are tagged Siegfried and Roy and categorized under Category:Siegfried and Roy.
- Manually fix all the existing cattag instances to use the uppercase category name, and deprecate the direct use of cattag, cattag2 and cattag3.
- Leave everything as is, and have two competing sets of categories, differing only by case.
- Only options 2 and 3 are stable in the long term and retain the ability to use cattag and related macros. -dmh 5 July 2005 15:32 (UTC)
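The side effect behind these options can be illustrated with a small Python sketch of the title rule before and after the switch. It is a rough model only: the function names are invented for this example, and real MediaWiki title normalization does more than change the first character.

```python
def normalize_title(title, capitalize_first):
    """Roughly model how a title is stored with first-letter capitalization on or off."""
    if capitalize_first and title:
        return title[0].upper() + title[1:]
    return title


def category_for_tag(tag, capitalize_first):
    """Category a tag lands in when the tag text is used as the category name."""
    return "Category:" + normalize_title(tag, capitalize_first)


for tag in ("basketball", "Siegfried and Roy", "US"):
    before = category_for_tag(tag, capitalize_first=True)   # behavior before the switch
    after = category_for_tag(tag, capitalize_first=False)   # behavior after the switch
    print(tag, "|", before, "|", after)
```

In these terms, option 2 amounts to having the tag template apply the capitalizing step itself, while option 3 accepts the uncapitalized result and renames the categories to match. Tags that already start with a capital, such as US, come out the same either way.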
- Not having heard any further discussion, I plan to continue moving to categories which match the tags. More precisely, I plan to move the categories pointed at by the various cattag templates. In practice, this will only affect categories whose tags start with a lower case letter (e.g., US should be fine). There are a couple of well-established categories, notably from the idiom template, that don't follow this and may never. -dmh 04:37, 10 July 2005 (UTC)
- OK, I notice that, instead of responding in discussion, people are just "fixing" the tag templates to write out the tag and the capitalized category. I can't see how this won't lead to a slow but steady drumbeat of people adding new tags and forgetting to capitalize the category, but evidently the priority here is to keep the category names capitalized.
- That being the case, I'd be much more comfortable if we booted cattag2 and cattag3 and changed the format of our articles slightly. The internal change would be from saying
# {{cattag2|foo|bar}} Some definition.
- to saying
# {{foo}}{{bar}} Some definition.
- This should be much easier to grasp for newcomers, and is also more robust since cattag2 tends to break in the presence of things like {{idiom}}. However, it also has a visible effect, namely that the article will no longer read
- (foo, bar) Some definition.
- but
- (foo)(bar) Some definition.
- If we're willing to live with that formatting, then things get a lot easier. Cattag2 and cattag3 go away (I never liked them much anyway), tagging always means just putting the tag in double curlies, and the only thing left to argue about is what the tag macros should expand to. Since there doesn't seem to be a handy {{capitalize|...}} template — please let me know if there is! — we're stuck with hand-writing each of them if we want capitalized categories. But if we can stand the proposed change in appearance, at least that's sufficient. You'll never get tripped up saying {{cattag2|uncountable|basketball}} and having your entry end up in Category:uncountable and Category:basketball when it should be in Category:Uncountable and Category:Basketball. -dmh 18:10, 20 July 2005 (UTC)
- The thing that nagged at my subconscious about cattag, cattag2 and cattag3 was actually the parentheses. If they lost the parentheses, the rendered formatting would not be adversely affected. Paul (and I imagine, all the other East pond folk) would be happy that I no longer try to italicize the parentheses. But perhaps there should be no parentheses at all, only a single colon following however many of these there are? So we would end up with: # {{cattag2|foo|bar}} Some definition. that would come out as
- foo, bar: Some definition.
- Or perhaps it could be specified as # ({{cattag|foo}}, {{cattag|bar}}): Some definition. which would come out as
- (foo, bar): Some definition.
- Either way, it seems prudent that the templates not add the parentheses. If we can get community agreement that the parentheses are too problematic to retain at all, then I will volunteer to hunt down these couple hundred entries. (I would've guessed these numbered in the thousands.) --Connel MacKenzie 01:52, 22 July 2005 (UTC)
Another problem
Another problem for new users is that one can't set up a new username with a lower case first letter. http://bugzilla.wikimedia.org/show_bug.cgi?id=2629 and there are related ones probably. 131.251.0.8 30 June 2005 13:21 (UTC)
Capitalization of the letters of the alphabet
I suppose that we need two versions for each, or for most, of them. I am going to create H as the chemical symbol for hydrogen, but I'm not sure what else needs to move. SemperBlotto 30 June 2005 10:55 (UTC)
- Well, chemically speaking, there's O, F, C, N (and a few others, I think) as well. — Paul G June 30, 2005 10:59 (UTC)
- I've got my periodic table out, and the other elements are: B, P, S, K, V, Y, I, W, U. Jonathan Webley 30 June 2005 20:44 (UTC)
- I've done all the chemical element symbols. But there is content in the lowercase versions that needs moving some time. ALL letters need to be looked at and most probably need splitting. SemperBlotto 2 July 2005 07:49 (UTC)
Unifying entries
I think that words should be capitalized correctly in their entries. Thus for example Pythagorean theorem is handled well. The lowercase version of the page redirects to the capital version, and the page uses the correct capitalization in the text entry.
However, I also think it's a mistake to separate entries from one another based on capitalization. Take for example bible and Bible. These two entries would be better on the same page. A single page can contain more than one entry. The page for bible should have a proper noun section that has a capitalized entry with the definition as seen on the current Bible page, and it should also have a noun section with a lower case entry containing the definitions on the bible page.
The current layout emphasizes precision at the cost of recall. Specific word lookup is inherently a very high precision / low recall domain (there's only one right answer and you always get it), so it is more helpful to users to allow the item to be found to be more comprehensive. The cost of a user finding a page with definitions he doesn't care about is that he has to ignore part of the page. The cost of a user finding the wrong information because he used the wrong capitalization is greater - he will not get what he came for at all. This is especially true given that one of the primary uses of a dictionary is for a person to learn about a word with which he is not familiar, in which case not understanding the proper capitalization is likely. 24.18.198.220 04:52, 13 February 2010 (UTC)
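The recall argument can be made concrete with a case-insensitive lookup that returns every capitalization of a title. The Python sketch below is only an illustration of that idea; the function names and the sample titles are assumptions for the example and do not describe how Wiktionary's Go button or search works.

```python
from collections import defaultdict


def build_case_index(titles):
    """Group page titles that differ only in capitalization under one folded key."""
    index = defaultdict(list)
    for title in titles:
        index[title.casefold()].append(title)
    return index


def lookup(index, query):
    """Return every existing title matching the query, whatever its capitalization."""
    return sorted(index.get(query.casefold(), []))


titles = ["bible", "Bible", "kind", "Kind", "polish", "Polish", "candela"]
index = build_case_index(titles)
print(lookup(index, "bible"))
print(lookup(index, "KIND"))
```

A reader who types the wrong capitalization still sees both entries and can pick the one wanted, which is the trade of a little precision for better recall that the comment argues for.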