Wiktionary:Beer parlour/2019/January

From Wiktionary, the free dictionary

WARNING! The title you are using may be wrong.

Very annoying and shouty. Do we still need it? If we do, it would be nice to have some option to stop showing it, or maybe disable it for non-anons. – Jberkel 13:47, 1 January 2019 (UTC)[reply]

Where is that message configured? DTLHS (talk) 16:11, 1 January 2019 (UTC)[reply]
Yes, is there any list of where all these messages are located? — SGconlaw (talk) 17:56, 1 January 2019 (UTC)[reply]
I did a search for mediawiki: insource:"the title you are using may be wrong" and found MediaWiki:Newarticletext. — Eru·tuon 19:13, 1 January 2019 (UTC)[reply]
To remove the "new article text" on any page, you can use CSS: .mw-newarticletext { display: none; }. But that will remove other messages besides the shouty case-insensitivity message. I could also add another class around the shouty message so people can selectively remove it. — Eru·tuon 05:50, 2 January 2019 (UTC)[reply]
It's still needed. The alternative is even more entries that need privileged access to remove. --RichardW57 (talk) 05:20, 2 January 2019 (UTC)[reply]
To make it less shouty, we could replace the text “WARNING! The title you are using may be wrong.” by the friendlier “Are you sure this is the right title?” Also, in the second next sentence, instead of “You probably want to edit the lowercase version of your word”, it is probably better not to presume the editor is “probably” mistaken; we may replace that by “You may want to edit the lowercase version of your term”.  --Lambiam 09:31, 2 January 2019 (UTC)[reply]
+1 to replacements proposed by Lambiam. No icons please. I can see the discussed text in Amadabacra, e.g., when I try to create it. --Dan Polansky (talk) 11:17, 5 January 2019 (UTC)[reply]
I can't seem to see this warning: I probably ad-blocked it if it was annoying. "Warning" seems good for Richard's reason above. Perhaps we could use a warning icon (yellow triangle etc.) instead of the word WARNING in caps? Equinox 20:07, 3 January 2019 (UTC)[reply]
There's already a warning triangle in the notification UI (above the notice), but it's blue instead of yellow and not very prominent. Changing the wording to be less aggressive/patronising as suggested above would be a first step to a friendlier interface. – Jberkel 21:07, 3 January 2019 (UTC)[reply]
This is not a popular opinion but I think that being aggressive is a good thing. We have certain rules and standards and we are not going to help ourselves by easing newbies into creating unusable entries. ("Patronising" is another matter...) Equinox 15:50, 6 January 2019 (UTC)[reply]
This is what I get in Amadabacra:
WARNING! The title you are using may be wrong.
Remember that Wiktionary is case sensitive. You probably want to edit the lowercase version of your word: amadabacra.
And this is what would be an improvement without reducing the warning effect much:
WARNING: Did you intend to edit amadabacra, starting in lowercase? Wiktionary is case sensitive.
--Dan Polansky (talk) 16:08, 6 January 2019 (UTC)[reply]
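
The case check behind the wording proposed above can be sketched in a few lines of Python. This is only an illustration of the idea, not how MediaWiki:Newarticletext is actually implemented; the function names `lowercase_variant` and `case_warning` are hypothetical, and the message text is the shorter wording suggested in this thread.

```python
from typing import Optional


def lowercase_variant(title: str) -> Optional[str]:
    """Return the title with its first letter lowercased.

    Returns None when the title is empty or does not start with an
    uppercase letter, i.e. when no warning would be needed.
    """
    if not title or not title[0].isupper():
        return None
    return title[0].lower() + title[1:]


def case_warning(title: str) -> Optional[str]:
    """Build the shorter warning proposed in the thread (hypothetical helper)."""
    variant = lowercase_variant(title)
    if variant is None:
        return None
    return (
        f"WARNING: Did you intend to edit {variant}, starting in lowercase? "
        "Wiktionary is case sensitive."
    )
```

Under these assumptions, `case_warning("Amadabacra")` produces the single-sentence warning, while `case_warning("amadabacra")` returns `None` and no notice would be shown.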

When do we include capitalized versions of words

I came across Zumeendar and zumeendar, and it made me wonder what the policy is around including both letter-case versions of nouns. With older nouns it is often trivial to find alternative letter-case examples since it used to be quite fashionable to capitalize lots of words (perhaps to Impress your Friends), but that may be the equivalent of YELLING in modern internet communications. What is the right thing to do? - TheDaveRoss 16:47, 3 January 2019 (UTC)[reply]

Yeah, I'd disagree that the only criterion should be whether it is verifiable by our rules. As mentioned by Dave, in the past it used to be the norm to capitalize nouns, so I wouldn't be surprised if many common-or-garden nouns would be found to be verifiable in a capitalized form. Something more is needed – perhaps it is an eponym and so occurs in both capitalized and uncapitalized form. — SGconlaw (talk) 16:58, 3 January 2019 (UTC)[reply]
Isn't this the same word as zamindar? Basically any function that bestows some authority will also be found in capitalized form: “the Protector”, “the Taxman”, “the Constable”. I think we need more than the occasional occurrence of a capitalized form before we decide it is an alternative spelling and not some instances where the custom of spelling honorifics and titles showing rank or prestige, usually only used when referring to a specific person, spilled over (possibly by ignorance of the customary rules) to a generic use.  --Lambiam 21:46, 3 January 2019 (UTC)[reply]
This is a bit of a grey area. My sense is that most of us agree that not just any capitalization should be included, since any common word is sometimes capitalized (hence Semper's The got deleted). I think that when a capitalized form does have a distinct sense, then it should be included: e.g. Native vs native (even though older works probably have instances of "Belonging to one by birth" capitalized as Native, and there are some uses of "Indian" written in lowercase as native), and likewise aboriginal/Aboriginal, black/Black, white/White, etc, where even if we host the definitions all in the lowercase entry, we have an {{altcaps}} soft redirect at the uppercase form. I think I'd rather exclude capitalizations of titles (unless they're almost always capitalized), but the current situation is haphazard; we have King, but not Secretary (it's blue only because it's a town), President but not Viceroy. I suppose you should RFD it. (I would also have deleted e.g. He, She, Me, Who, You, etc, but other people wanted to keep them.) - -sche (discuss) 22:30, 3 January 2019 (UTC)[reply]
Rather than create another topic, I'll add my bit here. I just came from K-hole and was left baffled by which of the following rules is in effect: only the proper case is acceptable; both the proper case and lower case are acceptable; or no position is taken on which case is normative. This is a horrific problem on Wikipedia, as well, whenever the bold topic item begins the lead sentence with a noun in sentence case (ambiguous with noun case). Furthermore, regardless of whether y'all think you know the answer to this, I assure you that we'all fly-by-night users don't, and neither can we puzzle this out from any particular word entry. While deciding how to treat both cases at the same time, you might spare a thought to illuminating what prevails in the existing policy already, for those of us who didn't wade through the ten essential pages of italics that introduce any printed dictionary worth having. Once upon a time, I used to read those pages. By now I arrive mainly through Google, and generally speaking, without exerting some serious effort, I don't even know where those pages reside. Each entry must therefore speak for itself, and on this point, few do. — MaxEnt (talk) 05:53, 26 February 2019 (UTC)[reply]

Users Luxipa and BigDom

Hello. I want to bring your attention to User:Luxipa, whose username clearly indicates that the primary function of that account is adding phonemic transcriptions of Luxembourgish words to Wiktionary. This is confirmed on his user page.

To me, this seems to be User:BigDom that could be using that account in a (somewhat) deceptive manner. This user has put hundreds if not thousands of incorrect Luxembourgish transcriptions on Wiktionary. Incorrect, because he put phonetic transcriptions such as */bəˈdʀekən/ ([ə] and [e] belong to one phoneme) to bedrécken (see [1]).

I corrected a few hundred of them (see the last ~600 edits of Mr KEBAB, my previous account), which then already got him a bit mad - notice that this edit itself contains an error, because the correct phonemic transcription of bedrécken contains either /e/ or /ə/ for all three vowels (depending on which symbol you choose to represent the phonological mid front vowel in phonemic transcription). But notice that on my user talk page, he was nothing but nice to me.

A year later, an anon appears and writes on my old user talk page (notice that was over 5 months after I switched accounts). The tone of that message is similar to the tone of the edit summary I've linked above. A similar style of interacting can be seen in this reply to my message ("I also thought it unfortunate that you had already made so many edits when you did. Particularly in a language that you don't know the first thing about. I can offer your to simply revert all of our edits and go back to system we had."). Notice the last sentence: he never brings up BigDom's mistakes and blames the whole thing on me, offering to go back to the system we had - the system that had so many mistakes. Not only that, in the next sentence he writes "Your system, as I said, is 'halfbaked' and therefore not only unnecessary but confusing" - but that's the thing - it's not my system. 99% of it is based on the JIPA article on Luxembourgish. In that reply, he also just dismisses the whole article we have about Luxembourgish phonology, most of which is based on said JIPA article.

Also, compare the latest 100 edits of Luxipa with those of BigDom. There's also this anon who (presumably) is Luxipa as well.

I also want to bring your attention to the fact that many of BigDom's phonetic transcriptions of Icelandic words are also wrong - see e.g. brúsi - vowel length isn't phonemic in Icelandic (see [2]), and so the transcription should read either [ˈpruːsɪ] or /ˈprusɪ/, [ˈpruːsɪ] but not */ˈpruːsɪ/. Kbb2 (talk) 06:16, 4 January 2019 (UTC)[reply]

@Kbb2 Sorry to disappoint you, but I am not the new user you are saying I am, and I hope a CheckUser confirms that for you. I simply haven't been editing over the holidays as I have been travelling - I was only alerted to this as I got an email saying you had commented on my talk page (which I will reply to separately).
I admit that in the past I didn't react well in the edit summary, but as you've pointed out on talk pages I always endeavour to be nothing but polite and courteous. Having seen the way Luxipa was responding to you on his/her talk page, I am genuinely dismayed that you even considered that I would converse with you or any other user in that way.
Having looked at the edits of this new user, I too have some ideas about who it may be but without proof I don't wish to speculate on this public forum. In any case, I hope that these allegations won't stop us working together going forward - I realise we might not always agree in our methods but I do think we both want what's best for the dictionary. BigDom 07:19, 4 January 2019 (UTC)[reply]
For what it is worth, there isn't much evidence to suggest that BigDom and Luxipa are related from a checkuser perspective. Even so, it is not against policy to have multiple accounts, only to abuse them. None of that takes away from your concerns about whether or not there are patterns of incorrect edits being made, but at least that aspect of the discussion can be put to rest. - TheDaveRoss 14:55, 9 January 2019 (UTC)[reply]
@BigDom, TheDaveRoss I apologize for not replying sooner.
BigDom, this is fair enough. I also remember that I compared the hours you and Luxipa edited on Wiktionary and some of them overlap, meaning that it's unlikely that you two are the same person. I apologize if I made you feel unfairly accused of something you didn't do.
Still, I find it interesting that Luxipa stopped editing altogether pretty much immediately after I posted here.
TheDaveRoss, thanks for the clarification. Kbb2 (talk) 16:25, 6 February 2019 (UTC)[reply]

Phrases comprised of words with multiple meanings should be kept

Since that's an argument regularly invoked in RFD, we might as well include it in the CFI, right? That will allow us to keep hollow victory of course, but also hollow vessel, hollow quest, hollow city; anything that can be hollow, in fact. Per utramque cavernam 18:15, 5 January 2019 (UTC)[reply]

And dear old brown leaf, which may indeed be a burned piece of paper as well as autumn foliage. Equinox 18:17, 5 January 2019 (UTC)[reply]
Ambiguity is part of the language, and there's no way we can make a dictionary ambiguity-proof without making it too wordy to use. Most words have multiple senses- wouldn't that make any random phrase containing those words includable, if it happens to be used often enough? Should we have an entry for "go to the bank" because it could refer to either a financial institution or the shore of a river? How about "go to the dry cleaner's to pick up a suit"? After all, "dry cleaner" could refer to a cleaner that's dry (not allowing alcohol?), "pick up" could refer to physically lifting something or to getting someone to go on a date, and "suit" could refer to a person in management, to cards or to a legal action. Think about the old joke where the Buddhist says to the hot dog vendor "make me one with everything", or just about any play on words- dictionary material? Chuck Entz (talk) 22:04, 5 January 2019 (UTC)[reply]
Sometimes people can't realise what a bad idea something is until you actually let them try it. I took Per's remark in that satirical vein. I might be wrong (in which case gawd help us all). Equinox 22:12, 5 January 2019 (UTC)[reply]
Yes, I'm being sarcastic. SemperBlotto isn't, though, so here we are. Per utramque cavernam 22:22, 5 January 2019 (UTC)[reply]
My interpretation was that PUC was stating the assertion positively for rhetorical purposes so we could discuss it and come to an explicit consensus. I knew he doesn't actually agree with it. Either way, I felt that the fact that people have been seriously using that argument meant that we should take the opportunity to point out its flaws. Chuck Entz (talk) 23:46, 5 January 2019 (UTC)[reply]
It would be my chance to sneak the German translator in.  --Lambiam 22:19, 5 January 2019 (UTC)[reply]
All entries of the form "[X] [Y]", where [X] and [Y] are entries (note the possible recursion), should be kept, but with the sole definition {{&lit|[X]|[Y]}} => Used other than with a figurative or idiomatic meaning: see X, Y. to allow for all possible combinations of the polysemic Xs and Ys. It wouldn't be fair to compel the advocates of such entries to insert all the possible combinations, even just the attestable ones. To make implementation more gradual we should restrict this initially to combinations of single words. Further I would recommend automating the process and imposing some grammatical restrictions to eliminate automated entries like the of. Automating attestation would be a help. DCDuring (talk) 23:26, 5 January 2019 (UTC)[reply]
Wouldn't it be better if the software just made reasonable suggestions and we didn't create trillions of pages which provided no actual information? - TheDaveRoss 13:23, 7 January 2019 (UTC)[reply]
I think the real utility of a dictionary comes from collecting terms with meanings that are unexpected for its users. If we could define somehow which senses are rare, we should imo add amply attested compounds containing them, such as hollow victory, which might not be idiomatic, but is useful in a way the headlessness isn't. Crom daba (talk) 15:01, 6 January 2019 (UTC)[reply]
None of the lexicographers who control the references at OneLook seem to share your opinion. Perhaps the OED? DCDuring (talk) 22:02, 6 January 2019 (UTC)[reply]
Yeah, maybe no one does, thinking in expectations is not very common but I believe it is very useful. My suggestion might not be practically implementable or attractive for anything, but it describes well what I intuitively feel is a useful entry as opposed to clutter. Crom daba (talk) 03:17, 7 January 2019 (UTC)[reply]
These unexpected combinations belong to {{uxi}}, in this case in hollow, where it is already since yore. The SOP rule has to be understood as implying that an entry should not be created in cases when even though the meanings of the parts used need additional mental strain for recognition one rather expects a comparatively infrequent meaning than an idiomatic use of the whole, even if this expectation is slanted by the experience of dictionaries restricting themselves in their coverage of composed expressions, since why assume the fulfillment of the inclusion criteria by idiomaticity if even an averagely astute learner is expected to look up the parts. Fay Freak (talk) 03:48, 7 January 2019 (UTC)[reply]
Conversational understanding is mostly derived from context and metaphor. You don't have to "know" that hollow means "empty" (also metaphorical) to understand hollow victory. The dictionary mostly helps some learners move something from the category of understood to that of able to be used by confirming the meaning, which can be done by looking up all the terms, though usually it is clear which term has the less certain meaning. DCDuring (talk) 14:21, 7 January 2019 (UTC)[reply]

@SemperBlotto: I see you making that argument again, so could you please spell out the logic? Are you okay with having an entry for empty glass (9 senses x 10 senses)? Per utramque cavernam 18:32, 6 February 2019 (UTC)[reply]

Imagine the fun of citing 90 possible definitions of the NP entry. DCDuring (talk) 23:39, 6 February 2019 (UTC)[reply]
I suppose RfVing each and every sense could be fun. DCDuring (talk) 23:40, 6 February 2019 (UTC)[reply]
@DCDuring, don't test Kiwima; s/he will absolutely do it. - TheDaveRoss 16:07, 7 February 2019 (UTC)[reply]
I was counting on a desire not to squander her? efforts. There are much better things for us to cite. The more terms and definitions we have that other dictionaries don't, the more effort we need to put into citations for those terms and definitions. DCDuring (talk) 16:15, 7 February 2019 (UTC)[reply]
Thanks, @DCDuring: - I think the effort of citing every sense would drive me away! It's like when someone in 2018 added every cliché they could think of to the requests for definition, I stopped supplying requested definitions. Kiwima (talk) 19:23, 7 February 2019 (UTC)[reply]
On a more serious note, I think the point behind this suggestion is that phrases which involve rare or surprising secondary meanings of words seem worth keeping. If the meaning is obvious from context, then there isn't really a point to it. Kiwima (talk) 19:27, 7 February 2019 (UTC)[reply]

PageNotice extension

I've finally posted on the Phabricator ticket titled "Review the PageNotice extension for deployment" that we would find it useful because it would allow {{reconstruction}} to be automatically transcluded at the top of pages in the Reconstruction namespace.

See also the previous discussions at Wiktionary:Grease pit/2018/September § {{reconstruction}}, Wiktionary:Beer parlour/2017/September § Proposal: install mw:Extension:PageNotice, and at Wiktionary:Grease pit/2017/June § Citations at citations. — Eru·tuon 00:07, 6 January 2019 (UTC)[reply]

@Erutuon I'm thinking it might be better to open a new ticket. --{{victar|talk}} 01:51, 31 January 2019 (UTC)[reply]
@Victar: Why? If the ticket is on the same topic, they'll just merge or close it. — Eru·tuon 03:34, 31 January 2019 (UTC)[reply]
@Erutuon: It might be for the same extension, but a completely different purpose. More to the point though, that ticket has too much baggage and is just going to be ignored. --{{victar|talk}} 05:55, 31 January 2019 (UTC)[reply]

The entry for sexually frustrated was deleted as SOP per consensus at RfD. That leaves behind Thesaurus:sexually frustrated, which could get the same treatment, or could be moved to one of its provided synonyms. bd2412 T 21:24, 6 January 2019 (UTC)[reply]

The names of Thesaurus entries don't have to be valid dictionary terms. They're supposed to be descriptive and unambiguous, which often means SOP. Chuck Entz (talk) 21:56, 6 January 2019 (UTC)[reply]
I have adjusted the thesaurus header to eliminate the red link. Further tweaking may be needed. Cheers! bd2412 T 05:13, 7 January 2019 (UTC)[reply]

Competition finished

So, the Christmas competition has finished. It was a resounding success with one and a half entries. Apparently, now the winner is to be decided democratically. I'm expecting a massive turnout for voters too. --Wonderfool Dec 2018 (talk) 10:58, 8 January 2019 (UTC)[reply]

Straw polls on criteria for including chemical formulas

Previous discussions: Talk:AsH₃, Talk:CO₂, Talk:LiBr, WT:RFDN#SiGe (will become Talk:SiGe)

To gauge what criteria for including or excluding chemical formulas/formulae might have consensus, probably as a precursor to a vote, let's straw poll some possibilities. This also allows for problems with proposals to be pointed out.
For example, some people previously suggested including only formulas which would be read by letter, like "aitch two oh", but AFAICT all formulas can be read as letters and unfamiliar ones are necessarily read as letters. Other people proposed including only formulas that have unformulaic common names, but e.g. AlF₆Na₃ would meet that criterion as cryolite while CO₂ would fail as carbon dioxide, which seems opposite to what most people would expect. (As a result, I didn't list those ideas below.)
- -sche (discuss) 02:27, 9 January 2019 (UTC)[reply]

Include all attested chemical formulas

Please indicate if you support or oppose including all chemical formulas, such as BaCO₃, H₂O, Al(NO₃)₃, HArF, and CH₃(CH₂)₂₄-COOH, if they are attested.

Exclude all chemical formulas

Alternatively, indicate if you support or oppose excluding all chemical formulas. Indicate if you would prefer to exclude them all without exception, or just exclude them by default but with the possibility for individual formulas (such as perhaps H₂O, which passes LEMMING) to be included on a case-by-case basis via consensus (presumably at WT:RFD, which is where requests for un-deletion are normally handled, and where consensus has occasionally been reached to keep other unidiomatic, non-translation-hub entries).

  • Oppose Some formulas intrude in material intended for broader than technical audiences, such as consumer protection, worker safety, and environmental literature. I don't see why we would limit STEM content. DCDuring (talk) 16:10, 9 January 2019 (UTC)[reply]
  • Oppose. I think there's obvious value in including formulas like H₂O and CO₂, which have a lot of currency, and I see no reason not to include attestable formulas that are used outside of chemistry-related subjects. In scientific contexts, we can probably expect readers to understand them and not need to look them up, but in non-scientific contexts, many people probably wouldn't know what they mean. Andrew Sheedy (talk) 22:47, 10 January 2019 (UTC)[reply]
  • Oppose. Some are so basic that excluding them would put us on the wrong side of being a dictionary. bd2412 T 14:13, 11 January 2019 (UTC)[reply]

Without exception

By default, with exceptions

Arguing here whether there shall be exceptions allowed “on a case-by-case basis via consensus” is nonsensical because the option always remains and posterior consent switch cannot be excluded by consent (a contradiction like “consensual non-consent”, “voluntary slavery” etc.), or it would not be allowed by rules we cannot decide. So, as long as I see an outline of an exception I desire exclusion without exception since I do not know any exception. Indeed “freely” does not mean anything so far and won’t probably. You can only argue for certain exceptions, not if there shall be exceptions or for exceptions of indeterminable meaning. Fay Freak (talk) 21:22, 10 January 2019 (UTC)[reply]
Indeed, I must say I don't see much difference with the "Include only formulas that are attested in non-scientific contexts" option below. Per utramque cavernam 21:33, 10 January 2019 (UTC)[reply]
I assume that if we adopt the rule exclude but, the exceptions will be stated as part of the rule, like we have done for WT:BRAND and WT:FICTION. My current preferred rule for exceptions is stated below at #Include only formulas that are attested in non-scientific contexts – which should not be a surprise, considering that this is essentially the rule I have proposed myself. I can imagine, though, that we can live with other versions. By “used freely”, I meant, “used without further explanation”. If a news report mentions that vats were labelled with C3H8NO5P, but then goes on to explain that this is the chemical formula of glyphosate, it would not count as free use. This is similar to what we have at WT:BRAND. The commonality is whether the author assumes the reader is familiar with the term.  --Lambiam 22:57, 10 January 2019 (UTC)[reply]

Exclude formulas with more than a certain number of symbols (how many?)

Please indicate if there is a cutoff beyond which you think formulas should be excluded; for example, if you would exclude any formulas with more than eight element-symbols (like CH₃CH₂OCH₂CH₃). Perhaps we will be able to agree (and then vote on) an upper bound.
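
To make a cutoff like this concrete, here is a rough Python sketch of how element-symbols in a formula might be counted. It assumes an element-symbol is an uppercase ASCII letter optionally followed by one lowercase letter, and it counts every occurrence of a symbol (not atoms, and not distinct elements). The names `count_element_symbols` and `exceeds_cutoff` are made up for illustration, and the default of eight is simply the example figure mentioned above, not an agreed value.

```python
import re

# Assumption: an element-symbol is one uppercase ASCII letter optionally
# followed by one lowercase letter (H, C, Na, Al, ...).
ELEMENT_SYMBOL = re.compile(r"[A-Z][a-z]?")


def count_element_symbols(formula: str) -> int:
    """Count occurrences of element-symbols, ignoring digits and brackets."""
    return len(ELEMENT_SYMBOL.findall(formula))


def exceeds_cutoff(formula: str, cutoff: int = 8) -> bool:
    """Return True if the formula has more element-symbols than the cutoff."""
    return count_element_symbols(formula) > cutoff
```

Counted this way, CH₃CH₂OCH₂CH₃ has nine element-symbols and would fall outside a cutoff of eight, while H₂O (two) and Al(NO₃)₃ (three) would stay in.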

Exclude formulas with parentheses

Please indicate if you would support or oppose excluding chemical formulas which have parentheses in them, like Al(NO₃)₃. A rationale is that these are more clearly formulas of which the component parts should be looked up separately.

Include only formulas that are attested in non-scientific contexts

For example, a scientific paper or popular-science magazine article on the synthesis of carbon compounds would not attest CO₂, but a murder mystery saying "the air in the scuba tank had been replaced with CO2" could.

Comment: deciding whether some works are "scientific" or not will be a bit fuzzy, but we have other fuzzy policies, most notably deciding whether or not something is WT:SOP (and to some extent WT:BRAND, in deciding exactly how much can be said about a product, e.g. that someone drank it, before the product counts as having been "identified" within the text). - -sche (discuss) 02:39, 9 January 2019 (UTC)[reply]
I think that we should exclude usages in textbooks (where such formulae will be used more than normal). SemperBlotto (talk) 07:37, 9 January 2019 (UTC)[reply]
  • Support, we can niggle over the details of what counts as we go, but I like the spirit of this option. If the formula is so common that it is being used without explanation in fiction or general news stories then it makes sense to define it. - TheDaveRoss 14:35, 9 January 2019 (UTC)[reply]
  • Support. This is similar to other exceptions to general exclusion rules, like for brand names or entities from fictional universes.  --Lambiam 16:11, 9 January 2019 (UTC)[reply]
  • Oppose We should certainly have formulas that are attested in popular science books (eg, Napoleon's Buttons) and journals (eg, Scientific American, Popular Science). DCDuring (talk) 16:34, 9 January 2019 (UTC)[reply]
  • Support. Andrew Sheedy (talk) 06:55, 10 January 2019 (UTC)[reply]
    To elaborate, I'll repeat what I said above: "In scientific contexts, we can probably expect readers to understand them and not need to look them up, but in non-scientific contexts, many people probably wouldn't know what they mean." Andrew Sheedy (talk) 22:47, 10 January 2019 (UTC)[reply]
  • Support. A formula worth including would likely be one that would occur outside of a technical context. bd2412 T 21:18, 10 January 2019 (UTC)[reply]
  • Oppose We shouldn’t introduce inclusion criteria by literary genres. Even more so then we shouldn’t include by a scientificity criterion which is an epistemic and not a formal criterion and of dubious identification, this being aggravated by teleological interpretation giving the concept of science another twist and thus adding even more confusion. Fay Freak (talk) 21:36, 10 January 2019 (UTC)[reply]
    We allow names from fictional universes, but do not accept the fiction in which they occur for attesting citations. Instead, we require citations that are independent of reference to that universe. This is not meant to discriminate against fiction as a literary genre (although it does). It does ensure that the author of the citation assumes that the term in question has entered the lexicon. The exception proposed here serves the same purpose.  --Lambiam 11:40, 12 January 2019 (UTC)[reply]
  • Oppose. Any time a chemical formula is used there's a scientific context. DTLHS (talk) 16:21, 11 January 2019 (UTC)[reply]
    If someone says "Drink lots of H20", that's not a scientific context, but it is a chemical formula. Andrew Sheedy (talk) 17:12, 11 January 2019 (UTC)[reply]
  • Weak Support: Looks not too bad. I have posted other candidate criteria to "General discussion" section below. I think this discussion would have better started with exploration of candidate criteria. --Dan Polansky (talk) 15:30, 13 January 2019 (UTC)[reply]

Soft-redirect (Template:no entry) any excluded formulas to Wikipedia

Rationale: this way, for any formula which we exclude, people can still type the formula into the search bar and find content.

While we're on this topic: should we lemmatize regular or subscript numbers?

For any chemical formula with numbers that we do include, please indicate if you'd rather lemmatize the form with regular numbers (H2O) or the form with special Unicode subscript numbers (H₂O). (We can create hard or soft redirects from the other form.)

I'd favor lemmatizing the forms with subscript numbers and creating hard redirects from the other form (unless it's citable, in which case it should be a soft redirect). Andrew Sheedy (talk) 04:36, 9 January 2019 (UTC)[reply]
As has been enough noted on other occasions, citations or usage are a bad guide in finer Unicode matters. In this case it is easily an editorial decision to have entries only in one form and always hard-redirect to the other. Even if chemical formulae are included – hopefully not –, then I doubt anyone wants to pursue attesting such typographic details. Then we would also want to display structural formulae in quotation templates and many other nasty things just to quote materials/books as they display content. Fay Freak (talk) 07:41, 9 January 2019 (UTC)[reply]
That reminds me: Wikipedia seems to mostly use regular numbers with <sub> tags, and it might often be impossible to tell whether a book was typeset using a mechanism like that, or using Unicode's special subscript numbers. That said, using Unicode subscript numbers to represent subscript numbers in books would seem(?) to be technically valid/sound, unlike using ʳ in Mʳ, so it's just a question of whether we want to do it or not. - -sche (discuss) 11:14, 9 January 2019 (UTC)[reply]
I would prefer regular numbers (I believe we can change the actual displayed headword with some kind of template). Wiktionary supports all kinds of formatting (bold, superscript, etc.) so we can rely on those capabilities and not on the rather hacky variant and legacy forms that Unicode is full of. Equinox 16:14, 9 January 2019 (UTC)[reply]
It raises the question, though, are there any chemical formulae that differ only in whether a number is subscripted? Equinox 16:14, 9 January 2019 (UTC)[reply]
Numbers occurring in chemical formulas are always subscripted or superscripted, but superscripts are used for specific purposes. For example, the formula for the phosphate ion containing radioactive phosphorus-32 is [³²PO₄]³⁻. Formulas with superscripts are unlikely to pass muster for lexical purposes. Disregarding superscripts, moving subscripts to the baseline is a lossless transformation. If superscripts have to be taken into consideration, regularizing all to the baseline might create an ambiguity, although it will be difficult to construct an example, and almost certainly impossible to find a realistic example.  --Lambiam 16:51, 9 January 2019 (UTC)[reply]
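
The point that moving subscripts to the baseline loses nothing (as long as superscripts are left out of the picture) can be illustrated with a small Python sketch. It assumes the formulas use the Unicode subscript digits ₀ to ₉; the helper names `to_baseline` and `to_subscript` are invented for this example, and neither is an existing Wiktionary tool.

```python
# Translation tables between Unicode subscript digits and regular digits.
SUBSCRIPT_DIGITS = "₀₁₂₃₄₅₆₇₈₉"
BASELINE_DIGITS = "0123456789"

SUBSCRIPT_TO_BASELINE = str.maketrans(SUBSCRIPT_DIGITS, BASELINE_DIGITS)
BASELINE_TO_SUBSCRIPT = str.maketrans(BASELINE_DIGITS, SUBSCRIPT_DIGITS)


def to_baseline(formula: str) -> str:
    """Map subscript digits to regular digits (H₂O becomes H2O)."""
    return formula.translate(SUBSCRIPT_TO_BASELINE)


def to_subscript(formula: str) -> str:
    """Map regular digits to subscript digits (H2O becomes H₂O).

    Only a valid inverse when every regular digit in the input
    stands for a subscript, i.e. the formula has no other numbers.
    """
    return formula.translate(BASELINE_TO_SUBSCRIPT)
```

For formulas without superscripts the two helpers undo each other, so either form could serve as the lemma with a redirect generated from the other. `to_baseline` leaves Unicode superscript characters untouched; the ambiguity mentioned above would only arise if superscripts were flattened to regular digits as well.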

Exclude all attested chemical formulas except for H2O and CO2

Rationale: this is what people keep giving as examples of things that we "should" include. DTLHS (talk) 17:15, 11 January 2019 (UTC)[reply]

Ha. But a good policy incorporates reasons for what it is doing, and isn't just a sort of black box. Equinox 17:20, 11 January 2019 (UTC)[reply]
What about excluding chemical formulae that aren’t also attestable from poetry (including raps)? There could be also the exclusion ground “this poem has mainly been invented to promote chemistry”, for cases like rapping professors, but else it sets a natural limit to used formulae by meter and consonance limits. Fay Freak (talk) 20:10, 11 January 2019 (UTC)[reply]
Blackalicious would like a word with you. They use Ca(OH)2 and NO2 at the very least. - TheDaveRoss 20:21, 11 January 2019 (UTC)[reply]
Not bad. Though I myself, being more radically inclined, am against these as SOP, my suggestion seems practicable: I wouldn’t care to add them, but it also smooths the sharp edges. Fay Freak (talk) 20:35, 11 January 2019 (UTC)[reply]

Exclude chemical formulae except those (attested in running text) that people may reasonably mistake for acronyms or for other non-chemical-formula words

This would allow KCN if attested in running text but not H₂O. Rationale: It's reasonable to look up KCN in a dictionary if it's found in running text. It's unreasonable to look up in a dictionary something that's obviously a chemical formula.​—msh210 (talk) 21:26, 14 January 2019 (UTC)[reply]

General discussion

Make comments here, or add additional proposals above this section. :) - -sche (discuss) 02:27, 9 January 2019 (UTC)[reply]

  • I note that most relatively popular works that have chemical formulas have them in or with structure diagrams. I believe we should favor entries for attestable formulas for which we can provide a graphical ostensive definition and for which there is a name found in running text, however technical the source. I suppose this could be considered a "value-added" criterion. If we can add sufficient value to a potential entry, it should become an actual entry. DCDuring (talk) 16:45, 9 January 2019 (UTC)[reply]
    Isn’t this more of an encyclopedic than a lexicographic task? A name found in running text, would that include “DOTA-E{E[c(RGDfK)]2}2”, as in the sentence “The structural formula of DOTA-E{E[c(RGDfK)]2}2 is shown in Fig. 1c.”?  --Lambiam 22:32, 9 January 2019 (UTC)[reply]
    It's a matter of providing useful definitions. Definitions need to break out of the cycle of words to establish contact with the physical world from time to time. That's why we have pictures and diagrams in entries and should have more. If we are going to have some chemical formulas or even tedious chemical names, we might want to make sure that we are adding value by having them. Image availability is a consideration. It is also a form of attestation. DCDuring (talk) 23:35, 9 January 2019 (UTC)[reply]
  • I hate to throw in a new option while there are so many votes already but this is a perfect use of the Appendix: space. We shouldn't delete valid information--we should store it appropriately. —Justin (koavf)TCM 18:50, 9 January 2019 (UTC)[reply]
As you say, we should store it appropriately. Even though trigonometric formulas are valid information, we don't host them here, even in the appendix. Per utramque cavernam 18:54, 9 January 2019 (UTC)[reply]
A formula isn't a word, term, or name. Dictionaries record the latter but not the former. —Justin (koavf)TCM 19:28, 9 January 2019 (UTC)[reply]
H2SO4 (formula) is to O (oxygen) as 6+3=9 (equation) is to 3 (digit) or + (operator). We can cover the components but IMO should not attempt to include the virtually limitless "sentences" spelled out with them. Equinox 19:30, 9 January 2019 (UTC)[reply]
Correct, also imho H₂O, CO₂, NaCl should be deleted because of being SOP. (Do you think I joke? Why?) This would also solve the problem of there being both Translingual and English entries. Remarkably enough, there aren’t English entries for the constituent parts.
I wanted to pun about “sum formulae” but I see that they are called so in languages other than English, while English calls them molecular formulae. But look at German Wikipedia “Summenformel”, which has a nice table of projections. Should we include all these projections? Sure, we can’t include the structural projections for technical reasons, but the linear projections aren’t of a different nature. Wiktionary does not mean “include everything that is linear”. If Wiktionary were to grow to include chemical formulae to a remarkable extent, I would be surprised if there weren’t some policy by which some Wikimedia bureaucrat is obliged to bust up this project because of this luxury. In Germany it would 1) be embezzlement by omission for Wikimedia trustees to tolerate Wiktionary adding chemical formulae and 2) lead to liability for additional expenses caused by these measures, perhaps for anyone who voted for a policy allowing it. Fay Freak (talk) 20:45, 10 January 2019 (UTC)[reply]
"H2O" means or refers to or is equivalent to "water" or "dihydrogen monoxide". "3 + (5/8)" doesn't stand for anything. Also, there can be an infinite number of mathematical statements but not an infinite number of chemical compounds, so it would be inherently foolish to start making pages like Appendix:15+8+87+9. —Justin (koavf)TCM 19:39, 9 January 2019 (UTC)[reply]
There is in fact literally an infinite amount of potential chemical compounds. DTLHS (talk) 22:28, 9 January 2019 (UTC)[reply]
I may have to defer to your expertise here (I don't see how that's possible with ~118 elements, many of which are ephemeral) but even so, there is no H48580, H48590,H48600... but there is 1+1, 1+2, 1+3, ... —Justin (koavf)TCM 22:33, 9 January 2019 (UTC)[reply]
There is CH4, C2H6, C3H8, C4H10, C5H12, C6H14, C7H16, C8H18, C9H20, C10H22, and so on, the simplest representatives of which are the linear alkanes. If the universe is infinite, there may even be an actual infinity of extant chemical compounds.  --Lambiam 22:45, 9 January 2019 (UTC)[reply]
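The series quoted above follows the general alkane formula CnH2n+2, so it can be continued mechanically for as long as one likes. A minimal Python sketch (the function names are invented for this illustration):

```python
# Illustrative sketch: generate molecular formulas of the alkane series C(n)H(2n+2).
from itertools import count, islice


def alkane_formula(n: int) -> str:
    """Return the molecular formula of the alkane with n carbon atoms."""
    if n < 1:
        raise ValueError("an alkane needs at least one carbon atom")
    carbon = "C" if n == 1 else f"C{n}"  # a count of 1 is conventionally omitted
    return f"{carbon}H{2 * n + 2}"


def alkanes():
    """Yield alkane formulas without end: the series has no last member."""
    for n in count(1):
        yield alkane_formula(n)


# The first ten members match the list given in the comment above.
print(list(islice(alkanes(), 10)))
```

The generator never terminates on its own, which is the sense in which the set of potential formulas is unbounded; whether each one is attested in running text is a separate question.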
Good to know. But the other good thing is that the citable chemicals are finite and by definition documented. So we don't need to guess if somewhere there is silicon-based life that makes 5H785L somehow, whereas I can "document" all kinds of new and perfectly valid mathematical statements all the time that have never existed before (e.g. "−194850329328230932238239238*(1893349834710138103823/10935038430583503498)"). I think the difference is frankly obvious and germane. —Justin (koavf)TCM 23:09, 9 January 2019 (UTC)[reply]
I think this discussion should have better started with exploration of putative criteria. In another discussion, I mentioned the following ones:
1) Keep a chemical formula only if it involves no more than 3 chemical elements and no more than 10 atoms.
2) Keep a chemical formula only if the chemical it denotes has a CFI-meeting name: e.g. H₂SO₄ has sulfuric acid or AsH₃ has arsine. This criterion ensures that the inclusion of chemical formulas no more than doubles the number of items in the dictionary.
I think especially 2) is worth considering. --Dan Polansky (talk) 15:28, 13 January 2019 (UTC)[reply]
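Criterion 1) is mechanical enough to sketch in Python. The following is only an illustration under stated assumptions: the function names are invented, the parser handles plain element symbols, counts and round parentheses only (no charges, hydrates or square brackets), and it does not check that a symbol is a real element. Criterion 2) is not sketched, since it depends on a lookup of CFI-meeting names rather than on the formula itself.

```python
# Illustrative sketch of criterion 1: at most 3 elements and at most 10 atoms.
import re
from collections import Counter

_SUBSCRIPTS = str.maketrans("₀₁₂₃₄₅₆₇₈₉", "0123456789")
# Either an element symbol with optional count, "(", or ")" with optional count.
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)|(\()|(\))(\d*)")


def atom_counts(formula: str) -> Counter:
    """Count atoms per element symbol in a simple molecular formula."""
    text = formula.translate(_SUBSCRIPTS)
    stack = [Counter()]
    pos = 0
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"cannot parse {formula!r} at position {pos}")
        element, number, opener, closer, group_number = match.groups()
        if element:
            stack[-1][element] += int(number or 1)
        elif opener:
            stack.append(Counter())
        else:  # closing parenthesis: merge the group into its parent
            if len(stack) == 1:
                raise ValueError(f"unbalanced parenthesis in {formula!r}")
            group = stack.pop()
            multiplier = int(group_number or 1)
            for symbol, n in group.items():
                stack[-1][symbol] += n * multiplier
        pos = match.end()
    if len(stack) != 1:
        raise ValueError(f"unbalanced parenthesis in {formula!r}")
    return stack[0]


def passes_size_criterion(formula: str, max_elements: int = 3, max_atoms: int = 10) -> bool:
    """Apply criterion 1 with the limits proposed above as defaults."""
    counts = atom_counts(formula)
    return len(counts) <= max_elements and sum(counts.values()) <= max_atoms


for candidate in ["H₂SO₄", "Ca(OH)2", "C10H22"]:
    print(candidate, passes_size_criterion(candidate))
```

Under these limits H₂SO₄ (3 elements, 7 atoms) and Ca(OH)2 (3 elements, 5 atoms) would be kept, while C10H22 (32 atoms) would not.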

Rename non-lemma categories to match the format approved in vote

This vote changed the naming scheme of categories for comparatives and superlatives, but in a way that does not match any existing non-lemma categories. The standard name for such categories was "POS xxx forms" (such as Category:English noun plural forms, Category:Northern Sami noun possessive forms, Category:Armenian verb passive forms, Category:English verb simple past forms, Category:Arabic adjective plural forms, Category:Bulgarian adjective feminine forms and so on). Meanwhile, the naming scheme for subcategorised lemmas was "xxx POSs" (such as Category:Dutch diminutive nouns, Category:English uncountable nouns, Category:German reflexive verbs, Category:Armenian diminutive adjectives etc.). In some cases, an entirely different POS term is used for non-lemmas, such as "participles" and "infinitives"; these are implicitly non-lemmas by virtue of the part of speech, i.e. a participle and an infinitive are always a non-lemma and cannot be a lemma.

Now that the vote has passed, however, there are two categories which do not fit this naming scheme anymore. Category:English comparative adjectives has the name of a lemma subcategorisation, suggesting that a comparative adjective is a kind of adjective lemma like "diminutive noun" is a kind of noun lemma, but it is now categorised as a non-lemma. The same for Category:English superlative adjectives. Since this suggests that there are no longer separate naming schemes for lemma and non-lemma categories, I propose to realign all other existing non-lemma categories with these two new names. Thus:

This will make the naming consistent with the vote. The only categories that will retain the word "forms" are the base-level categories without a qualifier, e.g. Category:English noun forms and Category:English adjective forms. —Rua (mew) 13:15, 11 January 2019 (UTC)[reply]

I'm going to link a thread on RFM and another one on RFC to see the context. For the record, I do not support this change, as I consider comparatives and superlatives to be similar to participles yet different enough from other forms for them to be exactly considered comparable. — surjection?13:21, 11 January 2019 (UTC)[reply]
The vote aligned "comparative adjectives" and "superlative adjectives" with "participles", by not including the word "form" anymore. This proposal aligns all the other categories with "participles" as well, by not including the word "forms" anymore. —Rua (mew) 13:30, 11 January 2019 (UTC)[reply]

Proposal: Japanese Classical (文語体) conjugation/inflection table for Japanese entries

This is what it is supposed to look like:

{{#invoke:User:Huhu9001/000|japanese_classical_conjugation|kanji=過|stem=す|ctype=2u-g}} {{#invoke:User:Huhu9001/000|japanese_classical_conjugation|kanji=得|ctype=2d-a|suffix_in_kanji=}} {{#invoke:User:Huhu9001/000|japanese_classical_conjugation|lemma=ぬ|kana_adv=ず<br>ん|kana_ter=ぬ<br>ん|kana_adn=ぬ<br>ん|kana_rea=ね}}

-- Huhu9001 (talk) 15:02, 11 January 2019 (UTC)[reply]

I like this, thank you. One concern: our readership consists of English-language readers, so listing conjugational info only in Japanese, such as ガ行上二段活用, seems inappropriate. Even more so when that potentially-illegible string isn't even included on the Appendix:Japanese_verbs page, leaving users unable to search easily. What about something like, g- stem, upper bigrade? Appendix:Japanese_verbs would also need updating to describe the situation for classical verbs. ‑‑ Eiríkr Útlendi │Tala við mig 21:25, 11 January 2019 (UTC)[reply]
To Eiríkr Útlendi: Appendix:Japanese_verbs#Classical_Japanese -- Huhu9001 (talk) 04:57, 12 January 2019 (UTC)[reply]
To Eiríkr Útlendi: How about 上二段活用 or ガ(ga)行上二段活用? -- Huhu9001 (talk) 04:59, 12 January 2019 (UTC)[reply]

It's done. {{ja-conj-bungo}} -- Huhu9001 (talk) 13:51, 13 January 2019 (UTC)[reply]

  • There are still usability issues here, presenting avoidable barriers to our English-reading users. I feel somewhat strongly that we cannot provide the conjugational type only in Japanese.
Although the Appendix:Japanese_verbs#Classical_Japanese section does describe classical conjugations, as previously noted, the strings ガ行上二段活用 and even 上二段活用 are nowhere to be found. Linking to is unfortunately of no apparent utility for explaining ガ行 in this context. While linking through to the JA entry for 上二段活用 is slightly better than plain text, it still hides the English rendering from the user, forcing them to click through. As currently (2019-01-14) implemented at {{ja-conj-bungo}}, ガ行上二段活用 links through to 上二段活用, leaving the ガ行 portion unexplained.
Could we not present this information in English instead? ‑‑ Eiríkr Útlendi │Tala við mig 20:16, 14 January 2019 (UTC)[reply]
To Eiríkr Útlendi: I suggest you give a list of the inflection names you want to apply to this template. -- Huhu9001 (talk) 13:31, 15 January 2019 (UTC)[reply]
To POKéTalker: Fixed. -- Huhu9001 (talk) 08:52, 20 January 2019 (UTC)[reply]

Unnoticed request for unblock

Hello! I write here because I haven't found a local equivalent of Wikipedia:Arbitration Committee. If it is my mistake, please direct me to the right place.

Today my semi-static IP was unblocked by timer after a one-month block. Soon after I found that I was blocked, on 23 December, I wrote an unblock request, but until now nobody has commented on it: no confirmation that the block was unjust and should be removed, nor, on the contrary, that it was a proper punishment for my deeds and should remain as is. No comment, no action. Is indifference to such requests normal here? --109.252.109.37 17:41, 13 January 2019 (UTC)[reply]

Can we get a Russian speaker to look into this please? Equinox 18:06, 13 January 2019 (UTC)[reply]
@Equinox What was written on Atitarev's talk page? It's hard to judge without that. But the block seems harsh. Per utramque cavernam 18:20, 13 January 2019 (UTC)[reply]
@Atitarev added a Russian translation to kick-ass a few years ago. 109.252.109.37 lately disagreed with the translation and he/she had the usual options of changing the translation, adding another translation, and/or leaving a note on Anatoli's talk page explaining his/her POV about the translation (using civil, nonconfrontational language). 109.252.109.37 chose not to add a different translation, but instead left this offensive message on Anatoli's talk page:
Your translation. Were you able to add the non-obscene term instead, which exists at least in the Russian Wiktionary? For example: наглый, задиристый, крутой etc. Or the Russian obscene lexicon is your primary dialect?
Anatoli initially ignored the attack and deleted it. 109.252.109.37, refusing to be ignored, came back with this comment:
That rollback is not an error, it's your moral position. Feel free to revert this post as well. Good luck in translations. --109.252.109.37 11:49, 13 December 2018 (UTC)[reply]
I would have blocked 109.252.109.37 as well. There is no excuse for this unprovoked attack. In my opinion, the block was appropriate. —Stephen (Talk) 20:32, 13 January 2019 (UTC)[reply]
I don't think that's particularly confrontational; it might just represent annoyance. Equinox 20:51, 13 January 2019 (UTC)[reply]
Anatoli's Russian translation was correct. 109.252.109.37 disagreed with the register, but there is no Russian term that means the same thing and also matches the register of the English word. Accusing Anatoli of only being able to speak in obscenities was a deliberate and disingenuous insult. There was no reason to be annoyed. If the Anon disagreed with the translation or register, he or she could have suggested another translation. Because of his or her aggressive and combative comment, he deserved to be blocked. —Stephen (Talk) 21:03, 13 January 2019 (UTC)[reply]
Banning instead of stating the main point that Atitarev wasn’t obliged to add any translations, and indeed an obscene word is a possible translation and particularly if the translated English term contains a mildly vulgar word, and that morality claimed by the IP does not make sense since adding a vulgar translation is better than adding none? Better teach the IPs what they miss instead of blocking them. Having strange morality is not a ban reason. Worsening the dictionary is, but it does not seem to happen if an IP asks for better translations, be it with strange arguments and an insolent rhetorical question or be it without. Assume good faith. Some people are just bad at being flattering. That “insult” is a conjecture. Why do you think he wanted to insult? What is “deliberate”? Anything written here is deliberate since we try to think before posting, but it does not get good completely anyway even if we try. I don’t see the “accusation”, it is a rhetorical question the answer of which was implied as “no”, plus even if it wasn’t an insult an insult is not a ban reason: We have defined insult as being insensitive, but people just are insensitive. Maybe he hasn’t learned well to be sensitive but he still is concerned about a good dictionary and actually moved towards this goal, so what? No bannable “annoyance” (which could harm the lexicon by siphoning off attention) is there if simultaneously honest questions are asked (apparently he wasn’t so smart to see that his argument was faulty and thus asked honestly), since no plan suitable for harming is there. People let themselves be insulted too much by concluding insults: Really, if the IP was there to insult he could have written the insult and it would have been an insult, otherwise it was just maladroit. Else everything that is written anywhere is annoyance, there isn’t anyone around here that doesn’t annoy me, myself included.
One could wish people wouldn’t post on talk pages and make best edits without them but somehow people need to talk on talk pages with rare avail. If we banned everyone who said objectively wrong or immoral or useless things … it’s about prognosis guys. Mojshahmiri (talkcontribs) has been banned for promising to add crank theories (his prognosis has been that it is not worth to groom him), this IP would have done what without a ban? Would it have learned to be sensitive? Man, Wiktionary is a minefield if sensitivities count. Feelings on tight reins please. Reason must prevail! Fay Freak (talk) 21:44, 13 January 2019 (UTC)[reply]
Blocking somebody and then deleting the history of what they did is damnatio memoriae at least, and Orwellianism at worst. Is one rude comment on a talk page worth this? I think that's disturbing and wrong. Equinox 21:48, 13 January 2019 (UTC)[reply]
Any blocks for actions other than blatant vandalism deserve explanation, especially if requested by the blocked party. Based on Stephen's translation I don't think the block was warranted, and certainly not without some form of communication. - TheDaveRoss 02:09, 14 January 2019 (UTC)[reply]
Anatoli DID communicate the reason for the block: Intimidating behavior/harassment. And many of us block anons all the time for the same or similar reasons, and with the same or less communication. Hardly a day goes by that I don't see some IP complaining about some admin abuse without any communication. As far as deleting the history, every one of us admins could still see it and could have looked at it just as easily as I did. This was a common action. If you want to take Anatoli to task, then let's pillory all the other admins who have done the same or worse. It's a tempest in a teapot. —Stephen (Talk) 05:17, 14 January 2019 (UTC)[reply]
I meant communication with the user prior to the block, i.e. about the translation issue and perhaps about their manner of raising their concerns, but I can see I wasn't clear. For the rest of it, the root problem is a lack of assumption of good faith in borderline cases. Even if the tone of the communication was poor, the content is perfectly reasonable and on topic, no reason to delete it and block the person for asking. I don't think Anatoli deserves to be punished for this or anything, but when issues such as this come up I think it is worth sharing how we each would handle it so that we can be more evenhanded going forward. - TheDaveRoss 13:47, 14 January 2019 (UTC)[reply]

FileExporter beta feature

Johanna Strodt (WMDE) 09:41, 14 January 2019 (UTC)[reply]

Banning Altaic reconstructions

At present there are no non-controversial reconstructions of Proto-Altaic; in fact, the Altaic theory itself is a controversial hypothesis. In practice, allowing reconstructed Proto-Altaic entries means copying from the Etymological Dictionary of the Altaic Languages (EDAL) by Starostin, Dybo and Mudrak.

EDAL reconstructions are based on ad-hoc soundlaws justified by semantically dubious comparisons, lack of strictness in lower-level languages, faulty philology and generally too many researcher degrees of freedom; it is not merely a controversial representation of an Altaist tradition, it is not homotopic to an earlier body of knowledge regarding sound correspondences within the proposed language family (such a thing exists only in fragments), rather it creates a completely new reconstruction using very tenuous soundlaws, with no prior precedent, to fit cognate sets which are also not traditionally accepted and which by themselves can only be called 'doubtful' at best.

I would like us to ban Altaic as a language family completely, since its only function seems to be smuggling lousy comparisons along with promising ones (usually Turkic-Mongolic, Mongolic-Tungusic or Korean-Japanese) and as a shorthand for "it appears in Mongolic and Turkic languages and I can't be bothered to investigate the etymology further", but I would be fine with reducing it to an etymology-only language to stop further proliferation of garbage copy-pasted entries. Crom daba (talk) 22:59, 15 January 2019 (UTC)[reply]

Support --{{victar|talk}} 23:25, 15 January 2019 (UTC)[reply]
Support Tom 144 (𒄩𒇻𒅗𒀸) 11:58, 16 January 2019 (UTC)[reply]
The earlier vote for this was Wiktionary:Votes/2013-11/Proto-Altaic. Personally, I feel that the fact that the reconstruction template itself needs to display a notice about how controversial the Altaic hypothesis actually is serves as good evidence that it isn't exactly our most useful content. — surjection?08:28, 16 January 2019 (UTC)[reply]
Of course it was Ivan who pushed for that. @Crom daba, if you want that vote reversed, I think you'll need to create a new vote. --{{victar|talk}} 17:53, 16 January 2019 (UTC)[reply]
I really know nothing about this so I will not vote on it, but it seems to me that even if Altaic is so controversial, the content should still be archived somewhere in an appendix - it seems like a waste to just erase it altogether. If we can have an Appendix:A Clockwork Orange, surely we can have an appendix for controversial reconstructions. (Perhaps that appendix should not be linked to from mainspace, however.) — Mnemosientje (t · c) 18:20, 16 January 2019 (UTC)[reply]
It's already archived by people who promulgate these reconstructions (StarLing). We are not and do not need to be a repository of all knowledge that is tangentially related to linguistics. DTLHS (talk) 18:23, 16 January 2019 (UTC)[reply]
Support, but I too think that we should rightly have a vote in order to overturn a previous vote. —Μετάknowledgediscuss/deeds 20:15, 16 January 2019 (UTC)[reply]

Okay, here's the vote. Not sure if I set it up correctly though. Crom daba (talk) 23:03, 16 January 2019 (UTC)[reply]

@Crom daba: I think it would be better to keep option 1 only for now. If it doesn't pass, we can put option 2 to the vote later. Per utramque cavernam 23:34, 16 January 2019 (UTC)[reply]
Okay, sounds good. Crom daba (talk) 23:45, 16 January 2019 (UTC)[reply]

Project proposal: Enrichment of multilingual STM terms

Hallo all,
I would like to propose a research project aimed at enriching the Wikitionary in the STM (scientific, technical, medical) domain.
The starting point is the observation that many terms (typically named entities) are present in scientific literature sources but still do not have an entry even on the English Wikitionary, which has the best coverage. This situation is even worse for some "new" terms, which are certainly of interest, and for non-English Wikitionaries.
On the other hand, some of the information which is not available on the Wikitionary can be extracted from Wikipedia. Hence the project objectives are:
a) the Wiktionary will be extended for STM relevant terms in English and Italian as well, for thousands of terms.
b) The whole process will be validated for two languages (English and Italian) having different coverage and characteristics between Wikitionary and Wikipedia.
The result would be very useful for those who work in the research field.

Tasks:
1) I will identify terms from the STM English literature from the sampled areas, including hot topics (e.g. Artificial Intelligence) and some new terms which are not present in the English Wikitionary;
2) Then, I will create such new English Wikitionay entries with a semi-automatic supervised process which will include as much as possible what can be inferred from Wikipedia (e.g. term disambiguation, different translations, etc.).
3) Then, I will validate this entry process for the Italian language also, which is my native language: in this case, I will directly enrich the entries manually in the cases where the algorithm identifies names which cannot be inferred from Wikipedia.
4) Then, I would document this (multi-language) process in detailed pseudo-code, resulting in an open-access paper as a further project. I think that this result is preferable to delivering a language-specific implemented piece of code, since creating/maintaining software should be further tasks.

To support the project proposal please leave a comment at the bottom of the project page.
Thank you,
Best
--Marco Ciaramella (talk) 16:30, 17 January 2019 (UTC)[reply]

How will you ensure that these terms meet WT:CFI? Wikipedia tends to invent words in order to translate concepts between languages. DTLHS (talk) 16:41, 17 January 2019 (UTC)[reply]
@DTLHS Good point. Briefly, since any translations into Italian of an existing word from the English Wikitionary could be problematic, this is the reason why such entries must be validated manually (as stated at the point #3). --Marco Ciaramella (talk) 19:46, 17 January 2019 (UTC)[reply]
I think this sounds great! My only concern is about the nature of the semi-automatic process. What part of entry creation do you see as being automatic? Andrew Sheedy (talk) 04:24, 18 January 2019 (UTC)[reply]
@Andrew Sheedy Thank you for the feedback ! :-) The semi-automatic fashion would involve the English terms enrichment process (however, it is intended that the analysis of the input sources and the generated names are one part of the project), referenced at the point 2. --Marco Ciaramella (talk) 08:31, 18 January 2019 (UTC)[reply]
I'm not too keen on using Wikipedia as a prime source of technical terms. Wouldn't it be better to harvest them from online scientific journals? I'm currently working my way through PLOS ONE, finding hundreds of words we would never otherwise have. And I can't see how you are going to arrive at definitions in a semi-automatic manner. Perhaps you could start in a very slow way so we can see some examples. SemperBlotto (talk) 07:13, 18 January 2019 (UTC)[reply]
@SemperBlotto This is another interesting and really to-the-point piece of feedback, thank you. One intended task of my proposal is the discussion about the use of Wikipedia as a bootstrap source for the seeds of new (related) terms, and some related topics (e.g. how the generated terms are related to each other, etc.). The human supervision at this stage is aimed mainly at assessing the results of such a process. However, this obviously cannot exclude from the discussion what can be generated from (open) literature, which is considered a primary source for Wikitionary too - and often cited as reference or in the Wikitionary meaning examples. I have some knowledge about your project and I would like to include some of the eventual related results, or at least a paper/project reference about PLOS, in my final (also open) publication. --Marco Ciaramella (talk) 08:31, 18 January 2019 (UTC)[reply]
@Marco Ciaramella: It's spelled Wiktionary, not Wikitionary :) You mention that typically named entities are missing in Wiktionary. But this is probably the least interesting type of entry, as the English and Italian translations will likely be identical? In general I don't see a problem with using Wikipedia as a source, especially for multilingual work it will be more useful. – Jberkel 23:05, 23 January 2019 (UTC)[reply]

Hyphens and dashes in entry titles

Hi, an editor just pointed out to me that dashes are not currently used for entry titles on English Wiktionary. Unfortunately, the only justification they managed to quote was this page: Wiktionary:Entry titles. This page currently doesn't say anything about not using dashes, and neither does the List of unsupported characters. So is there an actual policy on whether dashes could/couldn't be used in the entry titles? There are quite a few legitimate cases for them (as well as for hyphens obviously – as each of them has their own specific usage rules). But blindly advocating for using hyphens for everything resembling them (incl. en-dashes and em-dashes) seems to be an unnecessary simplification of typographic conventions. Cherkash (talk) 02:39, 23 January 2019 (UTC)[reply]

@Cherkash, see Wiktionary:Entry_titles#Punctuation: this section has clearly stated, since December 30, 2010, that:

In most languages, the HYPHEN-MINUS is used for the hyphen, not any of the dashes.

I suggest we continue with this common practice. ‑‑ Eiríkr Útlendi │Tala við mig 00:11, 26 January 2019 (UTC)[reply]
So just to be clear, @Eirikr: the way I read this (arguably, a badly phrased) passage is: "Don't use HYPHEN-MINUS for anything but the hyphen; and certainly don't use it for any of the dashes." Is this what you meant as well, just to reaffirm that hyphen-minus sign shouldn't be used for anything else but the hyphen? Cherkash (talk) 00:55, 26 January 2019 (UTC)[reply]
What are the "legitimate cases for them"? It would seem that we would just need to exercise or add search rules that fold all the hyphens and dashes into one. DCDuring (talk) 02:52, 23 January 2019 (UTC)[reply]
Or we could create redirects. — SGconlaw (talk) 03:27, 23 January 2019 (UTC)[reply]
Looks like the search engine does not automatically redirect "Tay–Sachs disease" with en-dash to Tay-Sachs disease with hyphen-minus, but it does return the hyphen-minus version at the top of the list of results because it considers punctuation characters as word separators. — Eru·tuon 04:25, 23 January 2019 (UTC)[reply]
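The idea of folding all hyphens and dashes into one for lookup purposes can be sketched as follows. This is purely an illustration of the suggestion made above, not a description of how the site's search engine works; the function name and the choice of which characters count as "dash-like" are assumptions.

```python
# Illustrative sketch: fold dash-like characters to HYPHEN-MINUS before comparing titles.
DASH_LIKE = (
    "\u2010"  # HYPHEN
    "\u2011"  # NON-BREAKING HYPHEN
    "\u2012"  # FIGURE DASH
    "\u2013"  # EN DASH
    "\u2014"  # EM DASH
    "\u2015"  # HORIZONTAL BAR
    "\u2212"  # MINUS SIGN
)
_FOLD = str.maketrans({ch: "-" for ch in DASH_LIKE})


def fold_dashes(title: str) -> str:
    """Return the title with every dash-like character replaced by '-'."""
    return title.translate(_FOLD)


typed = "Tay\u2013Sachs disease"   # what a reader might paste, with an en dash
stored = "Tay-Sachs disease"       # the entry title with hyphen-minus
print(fold_dashes(typed) == stored)
```

A folding step like this would let either spelling reach the same entry, while the headword line could still display whichever punctuation is considered typographically correct.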
@Cherkash: Can you give me an example of an endash- or emdash-appropriate title? —Justin (koavf)TCM 03:16, 23 January 2019 (UTC)[reply]
@Koavf: En-dash–appropriate example: Tay-Sachs disease. Cherkash (talk) 03:26, 23 January 2019 (UTC)[reply]
Good point. Perfectly valid name--please do create it in the correct form. —Justin (koavf)TCM 03:51, 23 January 2019 (UTC)[reply]
Do not create it, and don't tell people to create it. If you absolutely insist on using a special dash character it can go in the headword line. DTLHS (talk) 03:52, 23 January 2019 (UTC)[reply]
Agreed, unnecessary. --{{victar|talk}} 04:08, 23 January 2019 (UTC)[reply]
@DTLHS: Why? Why would we be opposed to proper typography, especially when we can trivially source it? —Justin (koavf)TCM 04:10, 23 January 2019 (UTC)[reply]
So put it in the fucking headword line. Why the fuck should we waste our time creating entries with trivial punctuation differences when instead we could just say every entry uses a hyphen and be done with it. DTLHS (talk) 04:15, 23 January 2019 (UTC)[reply]
Calm down, @DTLHS! Why so much anger? Please keep it civil. Dashes are no more special than hyphens. And they are standard punctuation in their own right, which has its own usage patterns. What's your reason to insist to avoid them? Cherkash (talk) 04:20, 23 January 2019 (UTC)[reply]
@DTLHS: No one is asking you to do anything and you don't have to be rude to me. —Justin (koavf)TCM 04:22, 23 January 2019 (UTC)[reply]
@Cherkash: You're wrong, em-dashes are more "special", as you say, because they aren't typically intended to be used in URLs and may even cause some encoding problems. They would definitely be f**k-all annoying for people typing in the URL. It's a bad idea all around. --{{victar|talk}} 04:31, 23 January 2019 (UTC)[reply]
Evidence for your claims, @victar? As far as I know, URL encoding schemes, as well as the Wiki engine, handle dashes and any other non-ASCII characters gracefully. Other Wikis (e.g., Wikipedias in many different languages) also have no problem with them. Cherkash (talk) 04:57, 23 January 2019 (UTC)[reply]
URLs have to encode en-dashes and em-dashes as %E2%80%93 and %E2%80%94, respectively, so you're actually suggesting Tay%E2%80%93Sachs_disease as opposed to Tay-Sachs_disease. One concern I have with it, other than it being a total hassle to type for zero benefit, is all the places where Lua mw.ustring.find functions, etc., might not include these special characters. --{{victar|talk}} 05:22, 23 January 2019 (UTC)[reply]
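For reference, the percent-encoded forms quoted above can be reproduced with a few lines of Python. This is only an illustrative sketch: the helper name `wiki_url_path` is made up here, and the space-to-underscore step is an assumption about how page titles map to URL paths, not a description of MediaWiki's own code.

```python
from urllib.parse import quote, unquote


def wiki_url_path(title: str) -> str:
    """Percent-encode a page title as UTF-8 for use in a URL path.

    Hypothetical helper: spaces become underscores (assumed convention),
    and every non-ASCII character is encoded byte by byte.
    """
    return quote(title.replace(" ", "_"), safe="/:")


# En dash (U+2013) is three UTF-8 bytes, E2 80 93; em dash (U+2014) is E2 80 94.
encoded = wiki_url_path("Tay\u2013Sachs disease")
# Decoding restores the original characters, which is why browsers can
# show the dash in the address bar even though the encoded form is sent.
decoded = unquote(encoded)
```

The hyphenated title passes through unchanged, since the ASCII hyphen-minus needs no encoding.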

On Wikipedia, for titles with a dash, the title with a hyphen instead usually redirects to the proper title; for example, Tay-Sachs disease redirects to Tay–Sachs disease. We can do the same here. A bot could create such redirects where they do not already exist.  --Lambiam 12:13, 23 January 2019 (UTC)[reply]
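The title-mapping step of such a redirect bot might look roughly like the sketch below. The function name and the choice to map both en and em dashes to a hyphen are assumptions made for illustration; checking whether the redirect already exists and creating it on the wiki are left out entirely.

```python
from typing import Optional

# Map en dash (U+2013) and em dash (U+2014) to the ASCII hyphen-minus.
DASH_TO_HYPHEN = str.maketrans({"\u2013": "-", "\u2014": "-"})


def hyphenated_redirect_title(title: str) -> Optional[str]:
    """Return the hyphen-only variant of a dashed title.

    Returns None when the title contains no en or em dash, because
    then no redirect would be needed.
    """
    hyphenated = title.translate(DASH_TO_HYPHEN)
    return None if hyphenated == title else hyphenated
```

A title that already uses only hyphens yields `None`, so a bot built on this would skip it.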

It looks like other dictionaries tend to normalize em- and en-dashes to hyphens. I think we should include entries for both the hyphenated and dashed versions when both exist, but for sanity's sake I think we ought to keep all content at the hyphenated version (using the "proper" punctuation in the header). No reason to exclude the dashed versions, the URL concerns are a red herring, we already have lots of page titles which don't play well with URLs (see: the majority of the world's languages) and even so the em- and en-dashes are valid in URLs anyway (https://en.wiktionary.org/wiki/Tay–Sachs_disease). - TheDaveRoss 13:55, 23 January 2019 (UTC)[reply]
Most of the entries get there in the first place because someone types in a search term and clicks on the redlink, and most people have no clue about how to produce any kind of dash on their keyboard. That means that most new entries will be "wrong" in a very subtle and non-obvious way, and end up being moved. It's bad enough that we have case sensitivity to mess people up, but at least there we have a very practical reason that people can understand. I think the reason for the depth of emotion on this issue is that it represents a type of prescriptivism, which the community here instinctively distrusts. I don't really want people going around changing double quotes to Smart Quotes and apostrophes from straight to curly ones, and this feels like the same sort of thing. Chuck Entz (talk) 14:55, 23 January 2019 (UTC)[reply]
What supports the assertion an en- or em- dash is the "correct" typography? Is it just printers' esthetics?
If I cut-and-paste some text that contains a dash from the results page of a search engine, how often will that contain a dash that is not a hyphen?
I don't really see the point of having long dashes even in the inflection line, where their presence might cause on-page (browser-based) searches to miss occurrences of characters that normal (ie, benighted — like me) contributors might expect to be included. DCDuring (talk) 15:31, 23 January 2019 (UTC)[reply]
“Is it just printers esthetics?” – Yes, and also there are rules or guidelines when to use which to specify these aesthetical requirements, though I don’t know where normal people learn them now. They are also language-specific, even of different distribution in various English-language countries though I am not sure whether in what lexically matters (like em dashes more often used in the UK for parentheses).
“how often will that contain a dash that is not a hyphen” – Suddenly now search engines count which list everything of any quality on the web. It depends on the search engine how Unicode confusables are handled. Well I automatically write en dashes when appropriate and everybody should use a layout where correct punctuation are comfortably at hand. But there seems to be no guide anymore. Now only practicability counts. Because people wrongly assume their keyboard layout has all the signs it actually should have.
Not sure why case-sensitivity would mess anything up. Case-sensitivity is everywhere in the Unix world, and users who presume Windows qualities should check themselves. Case-sensitivity is normal and smart quotes and en dashes and em dashes are also normal on keyboard layouts. People who insist on their default layouts being correct also leave IoT devices with the default password “admin”.
The difference between a hyphen-dash and an en dash is also easily seen by the accustomed eye. We can also have ‐ U+2010 HYPHEN and − U+2212 MINUS SIGN if we abolish - U+002D HYPHEN-MINUS which is a character that doesn’t exist in any language (only in programming languages). But we won’t, because Unicode has gone too far. As I have demonstrated Wiktionary:Grease pit/2018/November § U+2019 in notWordPunc there is neither a correct way to implement apostrophes in Unicode correctly since the standard is contradictory. URL standards and Lua standards are also bad because character encoding and input is in the desolate state it is in.
This is to say what can be considered. I do not lean to any of the solutions, though I must point out that if we have a whole sentence like e. g. a proverb entered into the dictionary the hyphen difference is really visible and thus I would expect the larger hyphen used in the dictionary that is used in the respective language. I mean one should not be bullied for assuming that a dictionary uses reasonable typography! Not sure what is expected for these disease names. Anglos are responsible for Unicode and bad keyboard layouts being widespread and for ASCII-centric computer languages and schemes and for typography being lost in 2019 so they shall make the mess themselves clear – they should know. When are en dashes or em dashes totally necessary? Maybe make a list of occasions that could appear as dictionary headers? One could even make a long vote where each use case gets voted upon (but it is per language: writing Tay–Sachs disease in English does not mean German shouldn’t use Tay-Sachs-Syndrom). It is an editorial decision and one cannot make it right for everyone; anything technically works. Fay Freak (talk) 16:39, 23 January 2019 (UTC)[reply]
As a privileged, old Anglo with poor vision, I accept full responsibility for all defects of life as we know it. Applications for reparations will be duly considered.
My concern is solely with English and Translingual entries, the English etymologies, usage notes, and definitions of FL words, and other English-language text in other namespaces. I don't see the point of adding redirects or in any way impeding the use of English Wiktionary by any passive user capable of using a computer or smartphone or any contributor who doesn't want to bother with knowing the first thing about variant dashes/hyphens, etc. DCDuring (talk) 19:42, 23 January 2019 (UTC)[reply]
@Victar: I don't think percent encoding is a problem; it happens with lots of entry names already. I thought of some places where en dashes might have effects on our Lua infrastructure. For some languages, Module:headword would automatically add entry names with en dash to "spelled with" categories; I think that could be changed easily by adding en dash to the PUNCTUATION variable in several language data modules or to individual languages' standardChars fields. And Module:headword has a pattern that matches a punctuation character that cannot appear inside a word (notWordPunc) that is used in the automatic linking in headwords, but I think that wouldn't need to be changed. Currently, "Tay–Sachs disease" would be automatically linked as [[Tay]]–[[Sachs]] [[disease]], which seems right. (If en dash were added to the list, it would be linked as [[Tay–Sachs]] [[disease]].) — Eru·tuon 21:42, 23 January 2019 (UTC)[reply]
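To make the two linking outcomes concrete, here is a small Python sketch. It is not the code of Module:headword; the function `autolink` and its `word_punc` parameter are invented for illustration, on the assumption that the list in question names the punctuation allowed to stay inside a linked word.

```python
def autolink(title: str, word_punc: str = "-'") -> str:
    """Wrap each word of a title in [[...]] link brackets.

    A character belongs to a word if it is alphanumeric or listed in
    word_punc; anything else (spaces, other punctuation) separates
    words and is copied through unlinked.
    """
    out = []
    word = []
    for ch in title:
        if ch.isalnum() or ch in word_punc:
            word.append(ch)
        else:
            if word:
                out.append("[[" + "".join(word) + "]]")
                word = []
            out.append(ch)
    if word:
        out.append("[[" + "".join(word) + "]]")
    return "".join(out)


# En dash not in the list: it separates words and stays outside the links.
split_links = autolink("Tay\u2013Sachs disease")
# En dash added to the list: it is kept inside a single link.
joined_links = autolink("Tay\u2013Sachs disease", word_punc="-'\u2013")
```

With the default list the en dash ends up between two separate links; with the en dash added, the whole compound becomes one link.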
Will it create any problems if (for example) the page Bose-Einstein condensate is moved to Bose–Einstein condensate, leaving a hard redirect? (Assuming that we can agree not to consider these to be “different hyphenation forms”, but instead “entries using alternative punctuation marks”.)  --Lambiam 18:55, 25 January 2019 (UTC)[reply]
This approach is problematic, as we must engage in different handling for any such hyphenated EN term that shares its spelling with terms in any other languages.
I'm baffled by this interest in changing the typographics of our headwords. This brings zero value as the new form is lexically equivalent to the existing form. Rather, this change arguably imposes *negative* value as we're having to spend time hashing this out, moving things, redesigning things, and all for a headword form that is demonstrably more difficult for our users to accurately input. Why, for the love of Wiktionary itself, are we wasting time with this? ‑‑ Eiríkr Útlendi │Tala við mig 00:04, 26 January 2019 (UTC)[reply]
Your response suggests that you place more priority on retaining incorrect presentation for the sake of avoiding change, rather than making a change for the sake of correcting the presentation.
—DIV (1.145.44.125 05:48, 29 June 2022 (UTC))[reply]
It can potentially serve to disambiguate, e.g. if Mr Fotheringay-Smythe and Ms Jones find a disease then it might be Fotheringay-Smythe–Jones syndrome. Many sources (including Wikipedia) seem to use en dash, not hyphen, to separate surnames in such terms. I've never really bothered mainly because my keyboard lacks en dash. Equinox 00:13, 26 January 2019 (UTC)[reply]
Another example is win-win (which should be win–win), where there is a bit more discussion on the Talk page.
My keyboard doesn't have all manner of accents or non-English characters, but apparently Wiktionary is able to handle the following, by way of example *pun intended* :
Did I mention that those're in the English-language version of Wiktionary? But then, you all knew that already, didn't you? *rhetorical*
So far as I can see, Wiktionary currently seems to have a policy that allows almost any required character except for an en-dash. Why on earth pick out this one character to forbid? *not rhetorical*
The only vaguely interesting point to the contrary is that other dictionaries have often (surely not "all" & "always") used hyphens when en-dashes would have been correct. Sure, and newspapers do this all the time (to save space), and social media influencers often do it too (for reasons that should be obvious). So what? *rhetorical* Well-written books, periodicals, and (as mentioned above) Wikipedia all do employ the en-dash when it's called for.
Make the main entry correct, and then include alternative forms and/or usage notes if need be.
—DIV (1.145.44.125 06:13, 29 June 2022 (UTC))[reply]
P.S. Doesn't Wiktionary also have a policy of (loosely speaking) "If you can demonstrate usage of the word, it should generally merit an entry." So why on earth does it seem that (by current practice) the supreme need to never use an en-dash in the headword trumps the general policy? *not rhetorical*
[Signed above, but apparently Wiktionary doesn't recognise that.] 1.145.44.125 06:13, 29 June 2022 (UTC)[reply]
Regarding URL's
Actually, the percent-sign rendering is not obligatory, as the following do work (subject to browser functionality)!
Yet another reason not to forbid the en-dash!
—DIV (1.145.44.125 06:19, 29 June 2022 (UTC))[reply]

Someone needs to go through all contributions of Erminwin


Obviously gibberish content on ngựa and many other pages - now blocked. Wyang (talk) 00:07, 26 January 2019 (UTC)[reply]

"Many other pages"? @Erminwin's edits on the relationship between (ine) and (yone) in OJP appear to be likely; and he corrected my edits on (kami). Are there any problems? If it regards Chinese/Vietnamese I'm out of this. ~ POKéTalker 02:36, 26 January 2019 (UTC)[reply]
@Wyang: I've been monitoring their edits, and they don't seem to be as bad as you make them out to be. Sure, they could be making some mistakes along the way, but I think they're quite conscientious about their edits. — justin(r)leung (t...) | c=› } 02:57, 26 January 2019 (UTC)[reply]

Wyang edit-warring to unilaterally remove an attested entry


User:Wyang is edit-warring to unilaterally remove an attested entry, while refusing to participate in the ongoing RFV discussion. Normally I would block the user in this situation, but Wyang is an administrator, so a block would be useless. I'm at a loss for how to deal with the situation and would appreciate input from others. —Granger (talk · contribs) 01:09, 26 January 2019 (UTC)[reply]

Again a non-native speaker thinking that he understands the language better than native speakers - the situation of Wiktionary:Beer parlour/2017/September#Modern Greek terms spelt with Latin characters played again, where non-native Greek speakers tell native Greek speakers that "marketing" is Greek. Chinese people interpret any three- or four-letter English word as acronym, and write it, pronounce it in such manner, probably because the average level of English there remains on the level of the alphabet in school. "app" is 'ei-pi-pi', "ugg" is "u-gi-gi", "doc" is "di-o-si", "jpeg" is "jei-pi-e-gi", "ppt" is "pi-pi-ti", ... you name it. Wyang (talk) 01:14, 26 January 2019 (UTC)[reply]
Thank you for finally engaging with the discussion. What you're saying is not always true—according to our entry, one counterexample is man#Chinese. I know I've encountered other examples as well and I'll mention them if I think of them. —Granger (talk · contribs) 01:18, 26 January 2019 (UTC)[reply]
You are missing the point. How does pronouncing it as 'ei-pi-pi' and writing it in ignorant capitals mean it is Chinese? Are all Category:English three-letter words Chinese words? Do Chinese people think it is Chinese and do Chinese dictionaries include it as a Chinese word? Wyang (talk) 01:21, 26 January 2019 (UTC)[reply]
立flag is another example.
No, three-letter English words are not Chinese words. I'm not sure how you got that from what I'm saying. I've given several reasons for thinking this is a Chinese word at the RFV discussion; I think the most convincing are that it is used as a word in running Chinese text and that it's the most common Chinese word for app. —Granger (talk · contribs) 01:27, 26 January 2019 (UTC)[reply]
Why would native speakers know better than non-native speakers about whether something is a word in the language for Wiktionary's purposes? Having a distinctive pronunciation in a language and having a distinctive spelling are good signs that something is a word in a language, and it's not unheard of for a foreign word to be adopted with new meaning and pronunciation, even when native speakers might counterfactually claim that the word is not part of their language.--Prosfilaes (talk) 01:58, 28 January 2019 (UTC)[reply]

In any case, discussion about the entry belongs at RFV. Could someone give me advice about how to deal with the edit-warring situation? —Granger (talk · contribs) 01:28, 26 January 2019 (UTC)[reply]

The reasons that you gave for proving it is Chinese, not English used in Chinese text were that it is written in capitals and it is pronounced as if it is an acronym, but neither of these is a telling argument because all English three- or four-letter words are read and used by the English-incompetent population of China as if they are acronyms. Why are you insisting on verifying something in the wrong language and pretending that you know the language better than people who speak it natively? You are wasting people's time, mate. Wyang (talk) 01:34, 26 January 2019 (UTC)[reply]
I stepped away from the computer to take a walk, and on the walk I decided this entry isn't worth the stress. I'm taking it and the associated discussions off of my watchlist, so please ping me if my input is needed. I hope that you will read my argument more carefully, including my comments above, and that someone will restore the entry. Otherwise its removal will be a loss for our readers. —Granger (talk · contribs) 02:17, 26 January 2019 (UTC)[reply]
Good. Wyang (talk) 02:21, 26 January 2019 (UTC)[reply]
  • This and above are behaviors just so unbecoming for an admin. (@Chuck Entz) --{{victar|talk}} 19:48, 31 January 2019 (UTC)[reply]
    Absolutely. That said, until WMF invests in genetic research to come up with the perfect admin, we're stuck with the imperfect humans that we have. Some of our most problematic admins have made tremendous contributions: Ivan Stambuk, Rua, Liliana60, and Vahagn Petrosyan, to name the first ones that come to mind.
    Most of Wyang's interactions are out of your sphere, but regular contributors in east Asian languages will tell you that he's been unfailingly helpful, patient, and generous with his time. That's in addition to his unmatched expertise in those languages and great technical skill with modules and templates. The problem arises when someone from outside his territory does something that affects it- he morphs from kindly Grandfather Wyang into a ruthless warrior against what he sees as outside meddling. That combination of well-intentioned virtue and disruptiveness makes it really hard to come up with an appropriate response. Chuck Entz (talk) 00:58, 2 February 2019 (UTC)[reply]
    @Chuck Entz: Yet then on the other hand, we're very quick to hand out 3-day blocks to users for the same unbecoming behavior. It seems to me like a double-standard, in a world where admins are immune from recourse. To echo a point I brought up in a current vote, I would like to see more admin tools broken up into roles -- roles that can be more easily given and taken away. In that way, we can also hold admins to a higher standard of interpersonal skills while not taking away needed tools from the less socially inclined. --{{victar|talk}} 01:30, 2 February 2019 (UTC)[reply]

Moving Translation Hubs (and Perhaps All Translations) Out of Mainspace


Buried deep in an interminable deletion discussion (Wiktionary:Requests for deletion/English#address using the formal pronoun) was a proposal by User:Per utramque cavernam that I think deserves more attention and full consideration here. I've taken the liberty of extracting the parts of his message and my reply that aren't specific to the discussion:


I still don't think cluttering the mainspace with entries such as "address using the formal pronoun" or "address with the formal pronoun" is a good idea. What happened of the first THUB provision: "The attested English term has to be common; rare terms don't qualify"?

I propose we create an appendix on the V-form / T-form, and put the translations there. Per utramque cavernam 10:03, 26 January 2019 (UTC)


Whether you put it in mainspace or somewhere else, a translation hub is more like an appendix or a footnote rather than an entry- it's not really English, though it claims to be, it violates the spelling-first organization of the dictionary as a whole, and being based on a concept rather than a specific term in a specific language makes it rather encyclopedic. Since no one arrives at it directly, there's no practical reason for it to be in any specific namespace that can't be fixed with a tweak or two to the code. Chuck Entz (talk) 13:26, 26 January 2019 (UTC)

I think his proposal should be generalized to all translation hubs, and, given that translation sections have had to be moved to subpages in several entries because of their use of system resources, we also might consider moving all translations out of mainspace, perhaps something along the lines of the Thesaurus namespace. This would be, in effect, like replacing every translation table in the entries with {{trans-see}}.

Having a separate namespace would make it easier to avoid dueling translation sections in synonyms or regional variants, and allow foreign language entries to have translations as well (everything would be a translation hub). It would take some work to figure out the best way to name them and organize them and to deal with duplication, but I think it would be worth it.

It would also take a major draw on system resources out of the entries. Since each translation table would be its own page, there would be more capacity available in most cases.

The main drawback is that they wouldn't be tied in as tightly with the entries, especially where there are multiple, subtly differentiated senses. I suppose they could be transcluded into the entries, but that's technically impossible for some of the larger ones.

What does everyone think? Chuck Entz (talk) 15:07, 26 January 2019 (UTC)[reply]

I don't like it. "They wouldn't be tied in as tightly with the entries" is an understatement. You are now basically creating two entirely separate dictionaries complete with definitions and parts of speech, but now as an "appendix" namespace to be forgotten about and diverge. DTLHS (talk) 17:31, 26 January 2019 (UTC)[reply]
I like the idea of reconsidering how translation hubs are handled, I am less excited about moving translations out of entries without a really strong demonstration of how it could be done in a way that improves usability. - TheDaveRoss 17:51, 26 January 2019 (UTC)[reply]
Couldn't such a namespace also be used to speed loading of entries with a very large number of translations, like [[water]]?[point made by Chuck]
Do those users who use Wiktionary as a translating dictionary like to see redlinks in translation tables in search results for every search or most searches that they do? Or would they rather only see full entries?
Can't users, at least registered users, somehow (JS?) specify multiple namespaces for their default searches? AFAICT we don't have anything in preferences that facilitates this. (This would also have value for incorporating all sorts of things into search results that probably shouldn't clutter most users' search results, such as snowclones, reconstructions, collocations (sometimes proposed).) DCDuring (talk) 18:44, 26 January 2019 (UTC)[reply]
I agree with DTLHS, and for the specified reasons. I would rather see translation hubs kept in the mainspace wherever possible, moved to subpages only as necessary to avoid out-of-resources errors. Benwing2 (talk) 19:26, 26 January 2019 (UTC)[reply]
@DCDuring: Both the regular and the advanced search interfaces on the search page allow selecting the default namespaces for searches. The CirrusSearch documentation claims the place to select namespaces is in the Search tab of preferences, but that seems to be out-of-date information. — Eru·tuon 21:30, 31 January 2019 (UTC)[reply]
Thanks. I hadn't noticed how that worked. But I was really thinking about the default for unregistered and new users. For English speakers principal namespace is a good default, with a possible future collocations space being a good addition. For others principal + any translation namespace would seem better. Do we consider it against our privacy and anonymity principles to read a user's preferred language to set such a default? DCDuring (talk) 21:55, 31 January 2019 (UTC)[reply]
Isn't that the concept of omegawiki? I believe they are not good when it comes to usability. Matthias Buchmeier (talk) 14:28, 31 January 2019 (UTC)[reply]
I think a Translations namespace would be a better solution for dealing with Lua memory problems than moving translations sections to /translations subpages as I have been doing. There are a fair number of entry titles that legitimately contain slashes and it's neater to reduce the number of slashes that mark subpages in the mainspace. The Unsupported titles subpages would still remain, though. I'm uncertain about the idea of moving all translations to a new namespace.
We could avoid separate pages if someone designed a way to maintain non-Lua versions of the translation templates ({{t}}, {{t+}}), not only for Latin terms with a limited number of parameters (as a script I created sort of does) but for terms with other parameters and non-Latin terms as well, but I think that would be a fairly complex task and neither I nor anyone else has volunteered to work on it yet. — Eru·tuon 01:40, 1 February 2019 (UTC)[reply]
I think we should keep separate translation pages as a workaround. The number of translations is growing very slowly, so that for the next ten years we would likely only have to use translation subpages on a couple of entries. And the Lua memory problem will eventually be fixed in the near future as server memory gets cheaper and larger. Matthias Buchmeier (talk) 15:13, 1 February 2019 (UTC)[reply]
The syncing problem is a very good point, so I've withdrawn the "all translations" part. That's not relevant to the translation hubs, though, since there's nothing to sync with. They're not really terms in English, they're concepts- the only reason we treat them as English is because we don't put translation tables under other language headers. In effect, they're already appendices- we just put them in mainspace and stick an "English" header on them.
I'm skeptical that being out of reach of the default search settings is such a big deal. After all, who actually searches for "consecrate a Buddha image"? It should be enough to link to the translation hub from Buddha and perhaps consecrate.
If we really have to have them in mainspace, an alternative might be to create a "language" header for them like we do for Translingual. Chuck Entz (talk) 23:54, 1 February 2019 (UTC)[reply]
  • I would support having a "Translations" tab similar to the current "Citations" tab, and moving all tables of translations to the namespace associated with the tab. We have a number of citations pages for terms that do not have entries (for example, terms deleted as SOP), and it is my understanding that these are permissible. I see no reason why we could not have uncoupled translations pages for terms that do not have entries for lacking an English meaning, so long as they are searchable, and are linked from some relevant pages in entry space. bd2412 T 01:03, 3 February 2019 (UTC)[reply]
    I would support having a Translations tab as you suggest, but absolutely oppose moving all translation tables to it. Translation tables grow slowly enough as it is...this would be a surefire way to get it to grind to a halt. I think the purpose of a Translations tab/namespace would be to house current translation hubs and the translations for the few entries (like water and cat) that are too numerous for them to be in the main entry. The goal would be to have all our translation tables look like that, so we could simply work towards making them all big enough to move to the Translations namespace. I just don't think we're ready to do it now. Andrew Sheedy (talk) 02:21, 3 February 2019 (UTC)[reply]

Inline "Imperfective:" and "Perfective:", similar to inline "Synonym:" etc.


(Notifying Atitarev, Cinemantique, Useigor, Wikitiki89, Stephen G. Brown, Guldrelokk, Fay Freak, Tetromino, Per utramque cavernam): Russian (and other Slavic-language) verbs come in perfective/imperfective pairs. Sometimes there is a one-to-one correspondence, but sometimes not. For example, a given imperfective verb may have 0, 1 or many corresponding perfectives, and it might differ from meaning to meaning when a verb has multiple meanings. As an example, the verb коло́ть (kolótʹ) is defined like this:

  1. to split, cleave, break (sugar), crack (nuts), chop (firewood), pf - расколо́ть (raskolótʹ)
  2. to stab, thrust, pf - заколо́ть (zakolótʹ)
  3. to kill, slaughter, pf - заколо́ть (zakolótʹ)
  4. to prick, sting, pf - уколо́ть (ukolótʹ), кольну́ть (kolʹnútʹ)
  5. to have a stitch, to feel a prick or a stab, pf - кольну́ть (kolʹnútʹ)
    У меня́ ко́лет в ле́вом боку́.
    U menjá kólet v lévom bokú.
    I have a stitch in my left side.
  6. to taunt

This indicates that e.g. in the meaning "to split" it has perfective расколо́ть (raskolótʹ), while in the meaning "to stab" or "to kill" it has perfective заколо́ть (zakolótʹ) and in the meaning "to prick" or "to sting" it has perfective either уколо́ть (ukolótʹ) or кольну́ть (kolʹnútʹ). We have taken to indicating the perfective correspondences on the same line, but I find this confusing and would rather put them on a following line just like the new format for synonyms/antonyms/etc. It gets especially annoying when you have both perfectives and synonyms listed, e.g. this example from пере́ть (perétʹ):

  1. (colloquial) to steal, to pinch, pf - спере́ть (sperétʹ) or упере́ть (uperétʹ)
    Synonym: тащи́ть (taščítʹ)

I am planning on creating templates {{perfectives}} and {{imperfectives}}, with short forms {{pf}} and {{impf}}, which work just like e.g. {{synonyms}} with short form {{syn}}, so you instead write this:

# {{lb|ru|colloquial}} to [[steal]], to [[pinch]]
#: {{pf|ru|спере́ть|упере́ть}}
#: {{syn|ru|тащи́ть}}

and get this:

  1. (colloquial) to steal, to pinch
    Perfectives: спере́ть (sperétʹ), упере́ть (uperétʹ)
    Synonym: тащи́ть (taščítʹ)

I'll then use my bot to convert existing entries to the new format. Comments? Benwing2 (talk) 19:04, 26 January 2019 (UTC)[reply]

Makes sense. There don’t seem to be many options to make such distinctions clear, and this is a known format already. Better than the format “{{lb|ru|colloquial}} to [[steal]], to [[pinch]], {{g|pf}} — {{m|ru|спере́ть}} or {{m|ru|упере́ть}}” at least. This format is what you mean to convert by bot? Fay Freak (talk) 19:54, 26 January 2019 (UTC)[reply]
@Fay Freak Yes. Benwing2 (talk) 00:23, 27 January 2019 (UTC)[reply]
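As a rough idea of what such a conversion involves, the Python sketch below rewrites one definition line from the old inline format quoted above into the new two-line format. It is an illustration only, not the bot's actual code: the function name, the regular expressions, and the restriction to `{{m|ru|...}}` links with no extra parameters are all assumptions.

```python
import re

# Old format: "# definition, {{g|pf}} <dash> {{m|ru|X}} or {{m|ru|Y}}"
# The dash may be a hyphen, an en dash (U+2013) or an em dash (U+2014).
OLD_FORMAT = re.compile(
    r"^(#+) (.*?),? *\{\{g\|(pf|impf)\}\} *[-\u2013\u2014] *(.+)$"
)
# A plain mention link with no further parameters.
TERM = re.compile(r"\{\{m\|ru\|([^|}]+)\}\}")


def convert_line(line: str) -> str:
    """Move inline aspect partners onto their own {{pf}}/{{impf}} line.

    Lines that do not match the old format are returned unchanged.
    """
    match = OLD_FORMAT.match(line)
    if not match:
        return line
    hashes, definition, aspect, tail = match.groups()
    terms = TERM.findall(tail)
    if not terms:
        return line
    template = "{{" + aspect + "|ru|" + "|".join(terms) + "}}"
    return hashes + " " + definition + "\n" + hashes + ": " + template
```

A real run would also need to handle links with extra parameters and lines that already carry `{{syn}}` sublines, which this sketch leaves alone.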
It's okay with me. —Stephen (Talk) 08:18, 27 January 2019 (UTC)[reply]
Seems okay to me too. Per utramque cavernam 12:47, 27 January 2019 (UTC)[reply]
(Notifying Atitarev, Cinemantique, Useigor, Wikitiki89, Stephen G. Brown, Guldrelokk, Fay Freak, Tetromino, Per utramque cavernam): This is done, please use the new format from now on. Benwing2 (talk) 22:27, 27 January 2019 (UTC)[reply]
Note that User:Ungoliant MMDCCLXIV/synshide.js needs to be updated since it just says "undefined" in this case (can we make this more robust?) DTLHS (talk) 00:44, 28 January 2019 (UTC)[reply]
@DTLHS: Jberkel's version, semhide.js, works properly. He and I have been working on necessary changes to the visibility toggle framework (see MediaWiki talk:Gadget-visibilityToggling.js) and at this point I just need to double-check the scripts and install them, which I've put off for a while. — Eru·tuon 20:06, 28 January 2019 (UTC)[reply]

Novial entries are all lacking citations


There are currently 572 entries in the category Category:Novial lemmas. All of them are lacking citations. Maybe it would be a good idea if those entries are moved to the appendix instead. Or is there a big Novial corpus somewhere to attest those words? Robin van der Vliet (talk) (contribs) 00:44, 28 January 2019 (UTC)[reply]

Nothing about Novial is currently in the public domain due to age in the US, and thus they aren't accessible from Google Books or HathiTrust. Otto Jespersen died in 1943, so his works are PD in the EU, and they're available online. I can't find any evidence of other writers in the language. Unless we want to let Wikipedia attest things (an idea which I don't think would get much success, which is why I haven't seriously proposed it), I think attesting anything is going to be hard.--Prosfilaes (talk) 02:24, 28 January 2019 (UTC)[reply]
Are there other published dictionaries other than the "Novial Lexike" as mentioned in the Wikipedia article? DTLHS (talk) 02:31, 28 January 2019 (UTC)[reply]
Neither http://web.archive.org/web/20120719040016/http://www.rickharrison.com/language/bibliography.html nor w:Novial mention anything.--Prosfilaes (talk) 03:17, 28 January 2019 (UTC)[reply]
The solution is to move all Novial entries to the Appendix, as we did with Lojban. Do we need a vote for that, or can we gather consensus to do so in this discussion? Also @Mx. GrangerΜετάknowledgediscuss/deeds 06:13, 28 January 2019 (UTC)[reply]
Moving looks boni to me.  --Lambiam 12:40, 28 January 2019 (UTC)[reply]
Looks good to me too. Robin van der Vliet (talk) (contribs) 20:01, 28 January 2019 (UTC)[reply]
Sorry to say, but the inclusion of Novial in the main space is baked into CFI, so the ensuing modification of CFI to exclude Novial would require a vote. This, that and the other (talk) 01:54, 31 January 2019 (UTC)[reply]
I created a voting page here to strike it from WT:CFI#Constructed languages. Robin van der Vliet (talk) (contribs) 02:03, 31 January 2019 (UTC)[reply]
@Metaknowledge I just saw this ping now; for whatever reason I don't think it worked at the time. Or maybe I saw it and forgot to respond. Anyway, thanks for creating the vote. —Granger (talk · contribs) 08:11, 5 February 2021 (UTC)[reply]

Deletion of own userspaces

I've always wondered why we can't delete our own userspaces. That seems kinda silly to me. Is there any technical solution around this, and if so, is that something people agree with? --{{victar|talk}} 01:46, 31 January 2019 (UTC)[reply]

Just put {{delete}} on your user page and an administrator will delete it shortly afterwards. Robin van der Vliet (talk) (contribs) 01:57, 31 January 2019 (UTC)[reply]
Obviously. That's not what I was asking. --{{victar|talk}} 07:12, 31 January 2019 (UTC)[reply]
I agree it should be possible. I'm curious to hear the rationale for it, besides “it increases code complexity / introduces potential security problems”. Maybe it stems from the general anti-deletist culture in MediaWiki. – Jberkel 07:04, 31 January 2019 (UTC)[reply]
From a programming standpoint, it seems easy enough: if (isAdmin or ownsPage) { showDeleteButton}. --{{victar|talk}} 07:12, 31 January 2019 (UTC)[reply]
Not sure, but could there be some objection to editors deleting talk pages to hide evidence of uncordial behaviour? Of course, in such cases an administrator could still view or undelete the pages. — SGconlaw (talk) 08:18, 31 January 2019 (UTC)[reply]
Yeah, I thought about that scenario, or maybe some joint project multiple people are contributing to, but a) edit other users' userspaces at your own risk, and b) exactly, an admin could just restore it. In truth, I think it's more dangerous allowing users to create userspaces than it is allowing them to delete them. --{{victar|talk}} 08:23, 31 January 2019 (UTC)[reply]
There are multiple paths to deletion, so the code changes would be more extensive than that, but I doubt that is the reason. Another possible reason is that not all installations of MW are going to have the same concept of what a user page is and who "owns" it; just because we are generally fine with people controlling their userspace doesn't mean everyone will be. I bet the actual reason is that nobody felt it was that important, and the vast majority of the people who design MW and suggest changes are administrators or developers on their primary wikis. Plus it is the status quo. - TheDaveRoss 13:39, 31 January 2019 (UTC)[reply]
Yeah, I didn't think it was that simple; I was just laying out the logic. You make a good point in that Wikt is different in allowing for userspaces, so this issue is unique to us. --{{victar|talk}} 19:53, 31 January 2019 (UTC)[reply]
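
For illustration only, the logic laid out in this thread (allow deletion if the user is an admin or owns the page) could be sketched as below. This is not MediaWiki code: `User`, `owns_page`, `can_delete`, the `"sysop"` group check and the prefix-based ownership rule are all hypothetical names and assumptions made for this sketch. Following the point about multiple paths to deletion, the check is written as a single permission function that every deletion path would have to call, rather than as logic attached to one button.

```python
from dataclasses import dataclass, field

# Assumption: "owning" a page means it is the user's own user page,
# user talk page, or a subpage of either. Other wikis may define this differently.
USER_NAMESPACES = ("User:", "User talk:")


@dataclass
class User:
    name: str
    groups: set[str] = field(default_factory=set)


def owns_page(user: User, title: str) -> bool:
    """Return True if the title lies in the user's own userspace."""
    for prefix in USER_NAMESPACES:
        if title.startswith(prefix):
            # "User:Alice/sandbox" -> root page name "Alice"
            root = title[len(prefix):].split("/", 1)[0]
            return root == user.name
    return False


def can_delete(user: User, title: str) -> bool:
    """Hypothetical permission check: admins, or the owner of the page."""
    return "sysop" in user.groups or owns_page(user, title)
```

Under these assumptions, `can_delete(User("Alice"), "User:Alice/sandbox")` is true while `can_delete(User("Alice"), "User:Bob")` is false, and a deleted page would remain restorable by an administrator as noted above.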