Wiktionary:Beer parlour/2017/November

I'd like to create categories for French compounds comprised of:

but I don't know how to name them. It would be useful to distinguish between words such as porte-clef (verb+noun) and porte-fenêtre (noun+noun). Any ideas? --Barytonesis (talk) 19:13, 1 November 2017 (UTC)[reply]

I'd just name them more or less as you said, Category:French verb-noun compounds. I notice that we have Category:French compound words with "words" at the end, but all the subcategories of Category:Subtypes of compounds by language, as well as this category itself, lack "words" in the name. Is this something that needs to be corrected? It might be better to do it now, before we start creating more of these categories. —Rua (mew) 20:14, 1 November 2017 (UTC)[reply]
Category:Subtypes of compound words by language then? Category:Compound words subcategories by language? Category:Types of compound words by language? (to follow up on Erutuon's suggestion at Wiktionary:Beer parlour/2017/July § Compound and fiction etymology categories)
See also Module talk:category tree/poscatboiler/data/terms by etymology.
Fine by me in any case. --Barytonesis (talk) 00:10, 2 November 2017 (UTC)[reply]
@Rua? --Barytonesis (talk) 15:00, 3 November 2017 (UTC)[reply]
I'm waiting for others to offer their views. —Rua (mew) 15:05, 3 November 2017 (UTC)[reply]
@Tropylium: do you have an opinion about this? --Per utramque cavernam (talk) 13:05, 17 December 2017 (UTC)[reply]
Sounds agreeable to me, even kind of overdue actually, and also more generally applicable than the already existing Sanskritic categories like Category:Tatpurusa compounds by language. --Tropylium (talk) 17:19, 17 December 2017 (UTC)[reply]
@Per utramque cavernam: I like the new umbrella category name you've chosen. One preference I have regarding the compound constituent categories is that the titles use an en-dash: Category:French verb–noun compounds. Difficult to type, but perhaps the categories could be added by templates. — Eru·tuon 01:44, 18 December 2017 (UTC)[reply]
@Erutuon: Thanks. I prefer the en-dash as I find it more "soigné" and pleasing to the eye, but it's true that it's harder to type... I even voiced my uncertainty at User talk:Ungoliant MMDCCLXIV § Portuguese compounds (where I've also had an idea I wanted to pitch to you: improving {{affix}} and using the pos parameters to bring about the categorisation).
I'll probably have to go back on the category names for disambiguation purposes anyway (I don't want people to put endocentric verb-noun compounds in there), so we could revert to the hyphen if you prefer. --Per utramque cavernam (talk) 02:11, 18 December 2017 (UTC)[reply]
@DCDuring: you were mentioning here the idea of categorising compounds by their head. Maybe you'd like to weigh in? --Barytonesis (talk) 15:23, 13 November 2017 (UTC)[reply]
I like the idea of adding depth to our existing entries by almost any means, including categorization.
For compound words one could categorize by head PoS and subcategorize by the PoS of the other element of the two-part compound. The subcategorization might be essential because for most English compounds the PoS of the compound is the same as the PoS of its head. English being what it is (terms that are nouns and verbs as well as nouns being used as noun modifiers), not all such categorization would be indisputable. I think such categorization could help non-natives understand English compounding better and might be useful to linguists. I don't think we would have to have complete coverage to be useful.
It would be consistent with a general scheme of characterizing all multi-word terms grammatically. We already have such categories as Category:English coordinates, Category:English non-constituents, and Category:English predicates.
Also, for words that have suffixes or prefixes it might be useful to distinguish via categories those words of denominal, deverbal, and deadjectival derivation. DCDuring (talk) 20:54, 13 November 2017 (UTC)[reply]

Participate in Dispute Resolution Focus Group

The Harvard Negotiation & Mediation Clinical Program is working with the Wikimedia Foundation to help communities develop tools to resolve disputes. You are invited to participate in a focus group aimed at identifying needs and developing possible solutions through collaborative design thinking.

If you are interested in participating, please add your name to the signup list on the Meta-Wiki page.

Thank you for giving us the opportunity to learn from the Wikimedia community. We value all of your opinions and look forward to hearing from you. JosephNegotiation (talk) 22:50, 1 November 2017 (UTC)[reply]

"The scope of this project is limited to the resolution of disputes concerning improper or disruptive behaviors." (from m:Research:Developing Wikimedia's anti-harassment and behavioral dispute resolution systems) DCDuring (talk) 13:24, 2 November 2017 (UTC)[reply]
P.S. I am setting up a group for the promotion of improper disruptive behaviours, because they are the only way to stop some bastards. You know whom to call. Equinox 13:25, 2 November 2017 (UTC)[reply]

Project idea kind-of maybe

SECRET CLUB LINK: User:Equinox/EWDC. The rest is babble.

Who else is obsessed with adding missing English words, like I am? I know there are some of you, although User:Visviva seems to have found something better to do. I've got a ton of missing English words from all kinds of sources. The ones I have really tried but couldn't work out can be seen in the alphabetical list on my user page. But I've got a zillion more. Does anyone want to be in ENGLISH WORD DEFINING CLUB? I think it would be pretty amusing, and maybe useful, to break down some of these huge lists and split 'em across a group, and everyone would take five words per week (or whatever) and see what they can do. Probably not alphabetically or there would be a rash of suicides as we hit M and everybody got 30-letter words starting with methyl. -- If anyone is into this I will tell more secrets about my word lists. Equinox 05:48, 2 November 2017 (UTC)

I'm in. Let's get to 1 million lemmas. DTLHS (talk) 05:54, 2 November 2017 (UTC)[reply]
If you wanna do it then sign on this page, and make a little Scout salute. User:Equinox/EWDC. Equinox 06:00, 2 November 2017 (UTC)[reply]
Minerals, geological periods and formations. DCDuring (talk) 13:20, 2 November 2017 (UTC)[reply]
It's only fun when the words are random. We could split them by theme but see above re methyl. Equinox 13:22, 2 November 2017 (UTC)[reply]
If I had done Webster 1913 from A to Z rather than having a random number generator, I would probably have chugged bleach in 2015 rather than 2018. Equinox 13:27, 2 November 2017 (UTC)[reply]

November LexiSession: toilets

The monthly suggested collective theme is toilet. It's not a fancy topic but an interesting one, and the 19th of November is World Toilet Day (no kidding). So we may improve Thesaurus:toilet and all the slang words referring to this crucial place! And you are not obliged to contribute from this place.

LexiSession is a collaborative experiment without any guide or direction. You're free to participate however you like and to suggest next month's topic. If you participate, please let us know here or on Meta, to keep track of the evolution of LexiSession. I hope there will be some people interested this month, and if you can spread it to another Wiktionary, you are welcome to do so. Ideally, LexiSession should be a booster for every project at the same time, to give us more insight into the ways our colleagues work in the other projects.

Have a good time! Noé 09:03, 2 November 2017 (UTC)[reply]

French Wiktionary October news

Hello!

Hey! The October issue of Wiktionary Actualités just came out in English!

A long Actualités this month, with five articles: what the patrol is, a review of research that uses Wiktionary, a dictionary with Celtic words in French, feedback from the Wikiconvention francophone and a short point on the phatic function of language. Plus, as usual: highlights from the press, statistics, videos, LexiSession and colorful pictures!

This issue was translated by Jberkel in less than a day, and may be improved by readers (wiki-spirit-love-love). We did not receive any money for this publication and we are not supported by any user group or chapter. It is only written by the community, for the large community of lexilovers! Feel free to send us comments on our writings, or on differences between our projects. Noé 08:53, 3 November 2017 (UTC)

Template:bor: Replace notext=1 with withtext=1

@Daniel Carrero, TheDaveRoss As part of the effort to remove the text from {{bor}}, we've been adding notext=1 to calls of this template. The idea is that any instances that are called the old way, without that parameter, are still in need of fixing. But this is really annoying and backwards so I propose to reverse the sense of the parameter:

  1. Module:etymology/templates is modified to accept both a notext=1 and withtext=1. withtext=1 is initially implemented as a no-op parameter.
  2. Any instances of {{bor}} that don't currently have notext=1 get withtext=1 added by a bot.
  3. The module is modified so that the text is not displayed by default, thus notext=1 becomes the no-op, and withtext=1 triggers the inclusion of the text.
  4. Any instances of {{bor}} that have notext=1 have it removed by a bot.
  5. The module is modified again to remove the notext= parameter.
  6. The template now works the way it should, and any further efforts at cleanup are now focused on Category:bor with withtext.

Rua (mew) 13:38, 3 November 2017 (UTC)[reply]
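The two bot passes in steps 2 and 4 can be sketched roughly as follows. This is only an illustration in Python, not the bot code that was actually run: the function names are made up, and it assumes that a {{bor}} call contains no nested templates and that the parameter is written exactly as notext=1 or withtext=1, with no extra spaces.

```python
import re

# Assumption: a {{bor|...}} call never contains nested templates or braces.
BOR_CALL = re.compile(r"\{\{bor\|([^{}]*)\}\}")


def add_withtext(wikitext):
    """Step 2 (hypothetical helper): tag old-style calls with withtext=1."""
    def fix(match):
        params = match.group(1).split("|")
        # Calls already marked either way are left untouched.
        if "notext=1" in params or "withtext=1" in params:
            return match.group(0)
        return "{{bor|" + "|".join(params + ["withtext=1"]) + "}}"
    return BOR_CALL.sub(fix, wikitext)


def remove_notext(wikitext):
    """Step 4 (hypothetical helper): drop the now redundant notext=1."""
    def fix(match):
        params = [p for p in match.group(1).split("|") if p != "notext=1"]
        return "{{bor|" + "|".join(params) + "}}"
    return BOR_CALL.sub(fix, wikitext)
```

Running add_withtext before the default is flipped (step 3) and remove_notext afterwards means no entry changes its displayed text at any point during the switch.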

I concur. --Barytonesis (talk) 15:00, 3 November 2017 (UTC)[reply]
I see no way in which this helps at all (just like using a bot to change "{{bor|aa|bb}}" to "Borrowed from {{bor|aa|bb|notext=1}}"). It's just more work for the editor. —Aryaman (मुझसे बात करो) 17:51, 3 November 2017 (UTC)[reply]
Why? —Rua (mew) 17:56, 3 November 2017 (UTC)[reply]
The question to ask is whether this does any harm other than temporary confusion to editors while the switch is underway. If not, the benefits of speeding up the phaseout of the "Borrowed" text would be reason enough, IMO.
The removal of this text was agreed to a while back, partly to reduce the arbitrary differences needed to be learned between the behavior of similar templates in the {{bor}}/{{inh}}/{{der}}/{{cog}} group, and partly because the added parameters needed to customize the text like |nocaps= are more trouble for many editors than just typing the text by hand every time.
Although the removal of the text makes more work when the template is at the very beginning of the etymology, it makes things simpler for new users and less work in all other cases. Even for those who don't like what's being done, there's something to be said for getting the transition over with. Chuck Entz (talk) 20:06, 3 November 2017 (UTC)[reply]
I concur too. It is regularly nerve-racking having to add “notext=1”, and the reasons explained by Chuck Entz are true to the facts. Palaestrator verborum (loquier) 22:14, 3 November 2017 (UTC)[reply]
Yes, let {{bor}} default to showing no text. Ideally, in future, let "Borrowed" be no longer shown in etymologies at all. --Dan Polansky (talk) 22:24, 10 November 2017 (UTC)[reply]
I'm supportive of unifying template behavior to no longer add text. However, why remove "borrowed" if an editor has added it? This is useful information for the reader. ‑‑ Eiríkr Útlendi │Tala við mig 22:30, 10 November 2017 (UTC)[reply]
It is not part of Rua's proposal, but anyway: I don't believe "Borrowed" is useful. It looks like cruft to me. Merriam-Webster does not have it. Whether term T1 is borrowed from T2 or is inherited seems to nearly always follow from which languages T1 and T2 belong to, and therefore, it is a rather trivial piece of text that does not need to be on the radar screen of the reader. --Dan Polansky (talk) 22:34, 10 November 2017 (UTC)[reply]
I write "From" for inherited terms, and "Borrowed from" for borrowed terms. I do this also for non-initial parts of the etymology, e.g. Latin borrowed from Greek in an English etymology. —Rua (mew) 22:43, 10 November 2017 (UTC)[reply]
It's a distinction without a difference. Czech cannot inherit from French, so it has to borrow; when I see "From French ...", I know it's a borrowing. --Dan Polansky (talk) 22:48, 10 November 2017 (UTC)[reply]
You know that, but does everyone else using Wiktionary also know? And there are some languages that are both borrowed and inherited from. Latin is a good example. For those languages, we need to specify borrowings. And for consistency it then makes sense if we do it for all of them. I really don't understand what the objection to it is. —Rua (mew) 22:51, 10 November 2017 (UTC)[reply]
There is nothing to know since "borrow" is only a terminology that indicates a trivial distinction, one without a difference. It's the same kind of objection that I have when someone adds "noun" before the word in etymology chain, or instead of "from" writes "which comes from"; I like things to be very compact, and get rid of all that looks like cruft. Merriam-Webster online seems to see things the same way I do. --Dan Polansky (talk) 23:02, 10 November 2017 (UTC)[reply]
The distinction between borrowing and non-specific derivation might not be significant, particularly for Greek terms borrowed using Latin spelling, but the distinction between borrowing and inheritance is significant for some doublets like Spanish palabra and parábola. Both of them originate from the same Latin term, parabola, but the inherited one has undergone sound changes that the borrowed one has not. I don't know how this exactly relates to having "derived" as well as specifically "borrowed" and "inherited" categories, though. — Eru·tuon 23:45, 10 November 2017 (UTC)
These doublets are interesting, thank you, but I do not see that rather small group of items as a sufficient reason to flood our etymologies with "borrowed" that for all but a very small portion of all cases (>99%?) adds nothing beyond the obvious. --Dan Polansky (talk) 08:11, 11 November 2017 (UTC)[reply]
If you find it fluff, but others find it useful, can you live with the presence of Borrowed from in entries? ‑‑ Eiríkr Útlendi │Tala við mig 03:44, 12 November 2017 (UTC)[reply]
I want to see "Borrowed" gone even if some people find it useful. I reject the following principle: If several users find a certain more loquacious and space-consuming presentation useful, all users should be exposed to that presentation, even if those several users are a small minority. If I am the minority, I have to live with "Borrowed" anyway. --Dan Polansky (talk) 08:25, 12 November 2017 (UTC)[reply]

I've done step 1 and a bot is now doing step 2. —Rua (mew) 22:41, 10 November 2017 (UTC)[reply]

Step 2 is complete, but I noticed that Category:bor without notext is still being filled with new entries day by day. I've notified Equinox of the upcoming change, but there may be others who still rely on the old format that need to be notified. —Rua (mew) 17:34, 11 November 2017 (UTC)[reply]
Step 3 has been done, now the template works in the intended way. The text is no longer shown by default. —Rua (mew) 12:36, 12 November 2017 (UTC)[reply]
A bot is now working on step 4. There's 40 thousand entries, so it will take a while, possibly multiple days. —Rua (mew) 12:44, 12 November 2017 (UTC)[reply]
Everything is now done! —Rua (mew) 19:36, 12 November 2017 (UTC)[reply]
Nice! --Daniel Carrero (talk) 01:29, 13 November 2017 (UTC)[reply]

Translations with a different part of speech that express the same idea

@Adelpine I just noticed diff. French may not even have a noun to express this idea, so I don't think it's helpful to remove this translation. I recall that we had a discussion in the past where we agreed that it was desirable to have translations that express the same idea even if the part of speech is different. A common example is a verb that expresses what is an adjective in English or vice versa. Stative verbs are a thing in many languages. —Rua (mew) 13:46, 3 November 2017 (UTC)[reply]

@Rua OK. I will not remove this kind of translation from now on. --Adelpine (talk) 20:42, 3 November 2017 (UTC)

Add pronunciation of Chinese words in the table titled "Dialectal synonyms of", under the "Synonyms" header

Just like in the box for "Derived terms from", under the "Compounds" header, there's plenty of space for it, and otherwise you have to click every word and then go back in your browser, expand the table again. --Backinstadiums (talk) 21:35, 3 November 2017 (UTC)[reply]

Since when do we have a separate header for Compounds? It should be Derived terms. —Rua (mew) 22:17, 3 November 2017 (UTC)[reply]
Oppose. This was discussed and it was decided to leave them out, for otherwise the table could get quite wide and also we don't hold pronunciations for most regional dialects. The discussion is somewhere. —suzukaze (tc) 22:30, 3 November 2017 (UTC)[reply]
@Rua: For example
That entry is malformed in all kinds of ways. There's no part-of-speech header, and there's a Compounds header at L3. I changed Compounds to L4 Derived terms, but the part of speech needs to be fixed still. —Rua (mew) 22:47, 3 November 2017 (UTC)[reply]
Compounds are not necessarily derived from the definitions; they are merely all multicharacter words containing the character. Characters can also be used phonetically. Wyang (talk) 22:48, 3 November 2017 (UTC)[reply]
Single-character Chinese entries don't get part-of-speech headers and it has been the language policy, de facto and by the About Chinese page, for quite some time. They get Definitions header. --Anatoli T. (обсудить/вклад) 00:37, 4 November 2017 (UTC)[reply]
It seems phonocentristic to me to assert that the contested formations are not derived terms. Of course they are derived terms, in so far as we take the graphical words, independent of any underlying phonetic value, and stick them together. It’s graphemological derivation in the sense as we also use the word “derivation” to subsume compounds. So I presume that Rua is right about that heading. Palaestrator verborum (loquier) 00:49, 4 November 2017 (UTC)[reply]
WT:ELE#Derived terms:

List terms in the same language that are morphological derivatives.

See Morphological derivation. Wyang (talk) 01:04, 4 November 2017 (UTC)[reply]
Seems like the definition there is a slip, as everywhere there are compounds listed under “derived terms”, notably German, while derivations are “contrasted with other types of word formation such as compounding”. Meseems WT:EL has a bug, because compounds and such as the Chinese formations need either to use the Derived terms heading against the WT:EL definition or have invented a new heading. Palaestrator verborum (loquier) 04:13, 4 November 2017 (UTC)[reply]
Morphological derivation and compounding are subsumed under "Derivation" in general Wiktionary terminology (see e.g. search:"insource:/derived from/"), whereas Chinese words using characters phonetically are not said to "be derived from" those individual characters. Consequently, CJKV character entries conventionally use "Compounds" instead of "Derived terms" to include these words, at L3, as they are not subordinate to the definitions. Wyang (talk) 05:21, 4 November 2017 (UTC)[reply]
That’s what I already have observed and is expressed above. I have explicitly stated that the practice of listing lexical compounding deviates from the description in WT:EL, or rather the description deviates from the current best practice, and Atitarev has elucidated that Chinese entries generally use the heading “Compounds”. However I contest the notion of “phonetical derivation”. Your outlinings exhibit some unjustly blurry and simplifying thinking. Most commonly “derivation” is understood as lexical, on the level of lexemes and lexes rather than phonemes and phons. But nothing gainstays speaking of derivation on a mere graphical level. But here I opine that “Derived terms” and “compounds” are alike misleading. “Compounds” are typically understood as lexical, i.e. inheriting the meanings of the uncompounded elements – though this is mitigated by the use of L3 instead of L4 –, and is here used quite untechnically, while “term” is a word that can hardly be used in English if no focus on meaning is intended, so the reader might wrongly assume or maybe needs wrongly assumes a little that it is spoken of “derived terms” because some meaning is inherited. Factually both headings are – on the L3 place – roughly correct as long as understood the intended way. However so far it seems to me that there is no way to solve the dilemma because the available English lexicon is too phonocentristic for a reliable generalized heading. Every handling is wrong and there is no solution as long as no cruel and unusual diction is adopted. Palaestrator verborum (loquier) 06:31, 4 November 2017 (UTC)[reply]
No, you still don't get it. "Derivation/derived/derives from" on Wiktionary generally means both morphological derivation and compounding, but the terminology is not used for phonetically used glyphs. You have to understand there isn't a perfect solution to many issues in language handling on Wiktionary; what is to be respected is conventional practices that the relevant editors have come up with and have deemed to be the most appropriate. The CJKV editors have long felt the use of "derived" in the description of phonetically chosen characters to be very misleading and have been avoiding such wording whenever possible. It provides way more harm than benefit to Wiktionary readers who are used to understanding "derived" as "etymologically derived".
From your initial, almost instinctive denunciation as an unfamiliar outsider, to the insistence that "derivation" on Wiktionary be interpreted in your manner after having been corrected, and the silly summarisation of the practice as "phonocentrism", your posts come across as quite misinformed and pretentious. You have not worked on CJKV languages, and are unfamiliar with them, so please appreciate that you are unversed in these languages' complexities and the rationales for the conventions. Wyang (talk) 07:20, 4 November 2017 (UTC)[reply]
I disagree that the editors and/or speakers of a language should always have the last say, as it can lead to politically/emotionally motivated decisions rather than linguistically based ones (Wiktionary:Votes/pl-2014-03/Unified Norwegian anyone? And the opposing votants barely contribute here, with the exception of Donnanz...). However, I have complete trust in the Chinese contributors, who are numerous, very active and very linguistically literate (as far as I can tell, which doesn't mean much in this case). So I think all the decisions about Chinese can be safely left to them. --Barytonesis (talk) 09:53, 4 November 2017 (UTC)[reply]
But you only repeat what I say, Wyang: “No, you still don't get it. ‘Derivation/derived/derives from’ on Wiktionary generally means both morphological derivation and compounding, but the terminology is not used for phonetically used glyphs.” I have said that derivation means 1. lexical/morphological derivation 2. in a broader sense, compounding 3. by virtue of the abstract meaning of the word, those Chinese “compounds” which are not formed because of meaning but because of matching sounds – I don’t think that it is false to speak of “derivations” in the third case, and I have not claimed that the third meaning is used with the “Derived terms” headings. I rather opine that it is wrong to speak of “derivations”, rather than false, because of the misleading consequences you already know (people thinking that the meaning is inherited).
“but the terminology is not used for phonetically used glyphs.” It’s what I have presumed; I have even given the reasons why it is misleading to speak of derived terms. I have said “Of course they are derived terms, in so far as we take the graphical words, independent of any underlying phonetic value, and stick them together,” but that is only the reason why one might believe that it is correct to speak about the “compounds” as “derived terms”, while on the contrary I have given a reason against using it and why it is wrong, but not false. “’term’ is a word that can hardly be used in English if no focus on meaning is intended, so the reader might wrongly assume or maybe needs wrongly assumes a little that it is spoken of “derived terms” because some meaning is inherited.”
“You have to understand there isn't a perfect solution to many issues in language handling on Wiktionary” – I said ”there is no way to solve the dilemma because the available English lexicon etc.”.
“the insistence that "derivation" on Wiktionary be interpreted in your manner” I have not insisted on anything, I have quite explicitly stated that I know no solution but rather deny its possibility – ”However so far it seems to me that there is no way to solve the dilemma”. My purpose has been to pronounce in which different meanings “derivation” can be understood by whomever, to elucidate the causes of the frictions about and that the definition in WT:EL needlessly contributes to a confusion by dint of displaying only half of the practice with the “Derived terms” heading. How difficult is it to get that I do not call for any action (short of possibly clarifications in WT:EL to regard compounds as in German and those here called “compounds” in the Chinese entries) but call for awareness about the terminology needs being frictious?
In other words:
  1. The ”compounds” listed at are all compounds.
  2. The ”compounds” listed at are not only compounds.
  3. The ”compounds” listed at are all derived terms.
  4. The ”compounds” listed at are not only derived terms.
I esteem all four propositions correct, I am with no side. No contradiction is intended. And if I have said “I presume that Rua is right about that heading” it’s not because I side but because it follows from her proposition that I have formulated in (3.) being correct and only that far. However I do not recommend that stance, as this choosing of words is more likely to be misunderstood than “compounds”; and I do not recommend any wording, as I have said, as all wordings I know are wrong. Palaestrator verborum (loquier) 17:59, 4 November 2017 (UTC)[reply]
TLDR. You lost me when I saw the number of bytes in your addition. Not everyone has time for this. Wyang (talk) 23:09, 4 November 2017 (UTC)[reply]
@Suzukaze-c: Neither of your reasons are very convincing; as I said, most tables have plenty of space, and otherwise they could be expanded lengthwise. As happens in the table for "Derived terms from", leaving out those pronunciations we are not aware of is not very disturbing, just as we do with unknown etymologies, translations, definitions themselves... and yet those words are added as "Derived terms from", appearing in red. --Backinstadiums (talk) 22:45, 3 November 2017 (UTC)[reply]
I don't think adding pronunciations to the module is sustainable, especially considering that there will be topolects not covered yet or words with multiple readings but I think adding simplified form is probably feasible and should be done. So far, the promise after the centralisation of the contents to always provide simplified characters in all entries was kept in all Chinese modules but not in this one. --Anatoli T. (обсудить/вклад) 03:27, 4 November 2017 (UTC)[reply]
What on earth happened to politeness in this functionality request (actually more like a demand)? Wyang (talk) 07:38, 4 November 2017 (UTC)[reply]
Hi guys, @Wyang: I am sorry if I seem too "demanding", or even cheeky, but I am just a language enthusiast, and so I try to improve this place with objective critical posts. --Backinstadiums (talk) 08:36, 4 November 2017 (UTC)
I didn't mean to sound demanding either, sorry if it sounded that way. It was a general comment - we have kept the promise but not always, which includes myself, since I have been involved in Chinese edits as well. The feature (of providing simplified variants) would be good to have. --Anatoli T. (обсудить/вклад) 09:59, 4 November 2017 (UTC)[reply]
@Wyang: No offense taken at all, do not worry :-) --Backinstadiums (talk) 10:36, 4 November 2017 (UTC)[reply]

Regarding Appendix:HSK_list_of_Mandarin_words, except for the first level, the rest presents first the simplified word, then traditional in parentheses, which as far as I can tell is the reverse approach taken here. Furthermore, the advanced level shows a formatting error, starting at letter "w", repeating "Template:l" in a column several times. --Backinstadiums (talk) 11:20, 4 November 2017 (UTC)[reply]

WT:ACCEL upgrade

Over at WT:GP there's a thread about improving the WT:ACCEL script. I've now completed this and I'd like to replace the original script with it. To spare you all the technical details of the discussion, here's what's changed in the new version:

  • Rather than generating the entire new entry for each accelerated link in a page, it only appends the acceleration data onto the URL.
  • If the OrangeLinks gadget is enabled, any orange links are modified to become edit links, and also have acceleration data added.
  • The data in the URL is picked up by the edit page, which is what actually generates the entry and places the content in the edit window. This relieves some of the workload from the main page.
  • If you are editing an existing page and acceleration data is given in the URL (as with modified orange links), the script will insert the new language section into the existing wikitext at the appropriate location.

This last point is especially a big improvement, as it means that you can now use accelerated links even if the entry already exists. The current creation rules work with the new script without changes, and no changes need to be made to accelerated links.

Is it ok to replace the old script with my new one? The code is at User:Rua/Gadget-AcceleratedFormCreation.js. —Rua (mew) 11:56, 4 November 2017 (UTC)[reply]

An additional feature has been added. Now, it is possible to specify multiple sets of grammar tags in the URL, using accel1=, accel2= and so on. The script will generate multiple entries in this case, but it will try to merge them so that information isn't duplicated. If two adjacent entries differ only in their definition, with everything else being equal, then the two entries are merged and the definitions are placed next to each other. Moreover, if both definitions use {{inflection of}}, then the inflection tags are also merged, so that e.g. definition 1 with {{inflection of|soddjil||com|s|lang=se}} and definition 2 with {{inflection of|soddjil||loc|p|lang=se}} are merged as {{inflection of|soddjil||com|s|;|loc|p|lang=se}}.
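To make the tag merging concrete, here is a small Python sketch of the idea. It is not the gadget's actual JavaScript; the helper names are invented for illustration, and it assumes that positional parameters never contain an equals sign and that the calls contain no nested templates.

```python
def parse_inflection_of(call):
    """Split an {{inflection of|...}} call into (lemma, alt, tags, named)."""
    if not (call.startswith("{{inflection of|") and call.endswith("}}")):
        raise ValueError("not an {{inflection of}} call: " + call)
    parts = call[2:-2].split("|")[1:]
    # Assumption: anything with "=" is a named parameter such as lang=se.
    positional = [p for p in parts if "=" not in p]
    named = [p for p in parts if "=" in p]
    lemma, alt, tags = positional[0], positional[1], positional[2:]
    return lemma, alt, tags, named


def merge_inflection_of(first, second):
    """Merge two calls that differ only in their tags; None if they differ elsewhere."""
    lemma1, alt1, tags1, named1 = parse_inflection_of(first)
    lemma2, alt2, tags2, named2 = parse_inflection_of(second)
    if (lemma1, alt1, named1) != (lemma2, alt2, named2):
        return None
    # The two tag sets are joined with a ";" separator parameter.
    parts = ["inflection of", lemma1, alt1] + tags1 + [";"] + tags2 + named1
    return "{{" + "|".join(parts) + "}}"
```

Called on the two soddjil definitions quoted above, merge_inflection_of produces the combined com|s|;|loc|p form.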

The part of the script that generates the links does not yet make use of this feature, so if two different forms have the same page name, each link will still only generate an entry for its own form. It would be desirable for the script to combine accelerated links if they have the same page name, but this is not trivial: the script would have to decide in which order to place the forms. Consider the entry soddjil: the comitative singular is identical to the locative plural. In the HTML, the locative plural appears first, since tables are organised by row and locative is in a higher row than comitative. It would be desirable, however, to generate an entry with the order com|s|;|loc|p as above, with singular taking precedence over plural. Either the script would have to be modified with more complicated logic to decide which goes first, or the ordering of the elements in the HTML would have to be changed so that all singular forms appear in the HTML before all plural forms. —Rua (mew) 14:53, 5 November 2017 (UTC)[reply]

After some more work, the script will now automatically combine identical forms into a single link. It does this on a per-language basis, meaning it gathers all the accelerated links for a particular language for a particular target page, and combines them. It uses the order that the links appear in the HTML to determine which order the entries should be generated in. But perhaps there are ways to change the HTML order? —Rua (mew) 16:21, 5 November 2017 (UTC)[reply]
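
As a rough picture of the per-language, per-target grouping described above, here is a small Python sketch. It is not the gadget's code: the link records, their field names and the `combine_links` helper are made up for illustration, and "example" stands in for some other target page.

```python
def combine_links(links):
    """Group accelerated links by (language, target page), keeping HTML order."""
    combined = {}
    for link in links:                       # links are visited in HTML order
        key = (link["lang"], link["target"])
        combined.setdefault(key, []).append(link["tags"])
    return combined


links_in_html_order = [
    {"lang": "se", "target": "soddjil", "tags": "loc|p"},
    {"lang": "se", "target": "example", "tags": "ess"},
    {"lang": "se", "target": "soddjil", "tags": "com|s"},
]
print(combine_links(links_in_html_order))
# prints {('se', 'soddjil'): ['loc|p', 'com|s'], ('se', 'example'): ['ess']}
```

Because the grouping keeps the HTML order, the locative plural still ends up before the comitative singular here, which is exactly the ordering question raised above.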

It is a definite improvement to not generate a bunch of new entries at once, and to allow the gadget to add language sections to existing pages. However, I haven't used the gadget much (it needs support for Ancient Greek), so I'm not a good person to test it.
For Ancient Greek and Latin at least, I doubt it would be worth trying to mess with the HTML in tables to get the forms to appear in the correct order. The HTML order for Ancient Greek and Latin nouns is by case and then by number, as the numbers are in column headers and the cases in row headers. The order could be changed by putting each number in a subtable, but that would be messy. The order for Ancient Greek verbs is probably okay though.
Perhaps the order could be imposed using maps and the array sorting function. Thinking of Latin nouns, maps from label to information: type (case or number) and a ranking number (nominative: 1, genitive: 2, ...; singular: 1, plural: 2). And a map from number and case to their ranking numbers (number: 1, case: 2). — Eru·tuon 06:32, 12 November 2017 (UTC)[reply]
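
The map-and-sort idea above might look something like this in Python. The tag abbreviations, the rank values beyond those mentioned above, and the names `LABEL_INFO`, `CATEGORY_RANK` and `order_forms` are assumptions made for the sketch; nothing here is taken from the gadget itself.

```python
# Map from label to (category, rank within that category).
LABEL_INFO = {
    "nom": ("case", 1),
    "gen": ("case", 2),
    "s": ("number", 1),
    "p": ("number", 2),
}

# Map from category to its priority: number is compared before case.
CATEGORY_RANK = {"number": 1, "case": 2}


def sort_key(tags):
    """Build a key that compares number first, then case."""
    return sorted((CATEGORY_RANK[LABEL_INFO[t][0]], LABEL_INFO[t][1]) for t in tags)


def order_forms(forms):
    """Return the forms with all singulars before all plurals."""
    return sorted(forms, key=sort_key)


table_order = [["nom", "s"], ["nom", "p"], ["gen", "s"], ["gen", "p"]]
print(order_forms(table_order))
# prints [['nom', 's'], ['gen', 's'], ['nom', 'p'], ['gen', 'p']]
```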
I've implemented another possibility instead, the one I described on WT:GP. Now, if you want forms in one column (e.g. singular) to always have their definition before forms in another column (e.g. plural), then you specify data-accel-col=number on the table cells of each column. Forms within the same column will then always come before forms in a column with a higher number. This can be seen in Module:se-adjectives (or for an example entry, soddjil). Although the comitative singular form appears in the HTML after the locative plural form, the column numbers make sure it comes first in the generated entry, so that you get {{inflection of|soddjil||com|s|;|loc|p|lang=se}} and not {{inflection of|soddjil||loc|p|;|com|s|lang=se}}. This should work for Latin and Greek as well. —Rua (mew) 16:47, 13 November 2017 (UTC)[reply]
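
A minimal sketch of the column-number ordering, again in Python rather than the gadget's JavaScript. The `col` field stands in for the data-accel-col value, and the column numbers 1 (singular) and 2 (plural) are assumed for the example; the helper names are hypothetical.

```python
def order_by_column(forms):
    """Sort forms by column number; Python's sort is stable, so forms
    in the same column keep their HTML order."""
    return sorted(forms, key=lambda form: form["col"])


def joined_tags(forms):
    """Join the tag sets of the ordered forms with ';' as separator."""
    return "|;|".join("|".join(form["tags"]) for form in order_by_column(forms))


# HTML order: the locative plural appears before the comitative singular.
forms_in_html_order = [
    {"tags": ["loc", "p"], "col": 2},
    {"tags": ["com", "s"], "col": 1},
]
print(joined_tags(forms_in_html_order))
# prints com|s|;|loc|p
```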

If there are no further comments, I will put in the new script on the 19th, two weeks after the initial post. —Rua (mew) 13:11, 14 November 2017 (UTC)[reply]

The script has now been activated. Have fun! —Rua (mew) 11:47, 19 November 2017 (UTC)[reply]

Thanks, seems like an improvement. Equinox 16:14, 21 November 2017 (UTC)[reply]

Chinese anagrams

鞦韆 indicates it's an anagram, but without being indexed to a category of anagrams. How can I find a list of the ones shown in Wiktionary?

This should be a good approximation of what you wanted. Wyang (talk) 00:07, 5 November 2017 (UTC)[reply]
@Wyang: Could you tell me where to find info. about the advanced search operators? --Backinstadiums (talk) 08:12, 5 November 2017 (UTC)[reply]
You can find more documentation on mw:Help:CirrusSearch and w:Help:Advanced search. Wyang (talk) 08:16, 5 November 2017 (UTC)[reply]
@Wyang: Thanks again! BTW, I'd rather use that space for the HSK levels, creating a category for anagrams and indexing words to it. --Backinstadiums (talk) 08:52, 5 November 2017 (UTC)[reply]
Wiktionary currently does not have an anagrams category in any language. —suzukaze (tc) 08:54, 5 November 2017 (UTC)[reply]
@Suzukaze-c: Regarding the English language, I've read similar proposals and discussions about the topic.--Backinstadiums (talk) 09:00, 5 November 2017 (UTC)[reply]
The problem is most words don't have anagrams. So if we created a category (automatically) for each word like "English words with alphagram abc", we would have hundreds of thousands of categories with only a single member. Adding the alphagram categories explicitly only for words with anagrams wouldn't provide any benefit over just listing the anagrams on the page with a bot (like I do now). DTLHS (talk) 21:54, 5 November 2017 (UTC)[reply]
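
To illustrate the alphagram point above: an alphagram is the word's characters in sorted order, and only alphagrams shared by more than one word correspond to actual anagrams. The Python below is a sketch with a made-up word list and hypothetical function names; it does not show how the anagram bot works.

```python
from collections import defaultdict


def alphagram(word):
    """Return the characters of the word in sorted order."""
    return "".join(sorted(word))


def anagram_groups(words):
    """Group words by alphagram and keep only groups with real anagrams."""
    groups = defaultdict(list)
    for word in words:
        groups[alphagram(word)].append(word)
    # A group with a single member is a word without anagrams.
    return {key: members for key, members in groups.items() if len(members) > 1}


print(anagram_groups(["listen", "silent", "enlist", "apple"]))
# prints {'eilnst': ['listen', 'silent', 'enlist']}
```

In this toy list, "apple" would get a one-member category of its own if categories were created for every alphagram, which is the objection raised above.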

Desysopping CodeCat aka Rua

FYI, I created Wiktionary:Votes/sy-2017-11/Desysopping CodeCat aka Rua. --Dan Polansky (talk) 22:40, 4 November 2017 (UTC)[reply]

variant forms of 𠃊

The cross-references for the variant forms of 𠃊 seem to have failed. Aren't they generated automatically? --Backinstadiums (talk) 08:57, 5 November 2017 (UTC)[reply]

nope! all manual —suzukaze (tc) 08:58, 5 November 2017 (UTC)[reply]
O.K. Is that on purpose, because of technical issues, or has nobody noticed it before? --Backinstadiums (talk) 09:02, 5 November 2017 (UTC)[reply]

Grouping descendants (by borrowing) by language family

It has come to my attention that this habit that I feel improves readability may be more controversial than I presumed, because the reader might interpret it as the word being borrowed into a common ancestor of the family.

To give an example, here's one from *buka:

Here's the alphabetical ordering for comparison:

I think that parsing meaningful information from the list in this form is much harder. Am I alone in this? Crom daba (talk) 22:35, 5 November 2017 (UTC)[reply]

I agree. I like the language family organization better, but it is potentially ambiguous as descendants lists in Reconstruction entries use family names as a proxy for proto-languages. — Eru·tuon 00:45, 6 November 2017 (UTC)[reply]
We could change that. —Rua (mew) 00:46, 6 November 2017 (UTC)[reply]
I agree. — Ungoliant (falai) 16:48, 13 November 2017 (UTC)[reply]

Chinese templates

Templates like {{Chinese-numbers}} could offer the pronunciation right under the characters (BTW, if sb. please would show me where one can add such info., I'll be so grateful). Their category page could also show the pronunciation since there're few items in them (I volunteer to do it manually, yet I'd need some guidelines on how to proceed).

Secondly, other templates such as {{list:days_of_the_week/zh}} are not displayed as an expandable box with translations, which worsens the user experience. --Backinstadiums (talk) 11:43, 6 November 2017 (UTC)[reply]

I believe the latter is by design. None of the list templates offer translations. —suzukaze (tc) 17:17, 6 November 2017 (UTC)[reply]
@Suzukaze-c: Is there any specific reason why the second design has been chosen? Can list templates start to show translations? What about pronunciations? --Backinstadiums (talk) 19:06, 6 November 2017 (UTC)[reply]

The Community Wishlist Survey 2017

Hey everyone,

The Community Wishlist Survey is the process by which the Wikimedia communities decide what the Wikimedia Foundation Community Tech team should work on over the next year.

The Community Tech team is focused on tools for experienced Wikimedia editors. You can post technical proposals from now until November 20. The communities will vote on the proposals between November 28 and December 12. You can read more on the 2017 wishlist survey page. /Johan (WMF) (talk) 20:17, 6 November 2017 (UTC)[reply]

Yeah! Go for thousands of proposals! I am convinced none of them will be picked by the Community Tech, but they may inspire a dev or a session at a future hackathon...or at least a dozen proposals may become a signal for the Wikimedia Foundation that Wiktionaries do exist and deserve some attention. So please, express your wishes! Noé 09:05, 8 November 2017 (UTC)[reply]
There's a section specifically for Wiktionary proposals at meta:2017 Community Wishlist Survey/Wiktionary. --Yair rand (talk) 18:51, 8 November 2017 (UTC)[reply]

Annotating the first English–Navajo dictionary

https://www.newyorker.com/culture/personal-history/annotating-the-first-page-of-the-first-navajo-english-dictionary —Justin (koavf)TCM 21:13, 7 November 2017 (UTC)[reply]

paywall :( --2A02:2788:A4:F44:39D4:48CB:D128:F4DE 21:19, 7 November 2017 (UTC)[reply]
I don't see one. —Justin (koavf)TCM 21:51, 7 November 2017 (UTC)[reply]

These categories are currently empty. Meanwhile, the label {{lb|metaphorically}} automatically redirects to {{lb|figuratively}}, but we have no CAT:Terms used figuratively by language or CAT:Terms with figurative senses by language. Should we create one of these and delete CAT:Metaphors, or start using it? --Barytonesis (talk) 23:39, 7 November 2017 (UTC)[reply]

Appendix:Fiction/Films

I created Appendix:Fiction/Films with a few fictional terms used in films. --Daniel Carrero (talk) 21:52, 9 November 2017 (UTC)[reply]

Oh thanks!! Equinox 23:20, 9 November 2017 (UTC)[reply]
You're welcome. Although I seem to remember that you don't like when I create appendices of works of fiction, so maybe there's a chance you're being sarcastic. --Daniel Carrero (talk) 23:23, 9 November 2017 (UTC)[reply]
lol Equinox 23:26, 9 November 2017 (UTC)[reply]
rsrs --Daniel Carrero (talk) 23:27, 9 November 2017 (UTC)[reply]
lol Equinox 23:29, 9 November 2017 (UTC)[reply]

When a city and a province/state/subdivision share a name

In the Netherlands, two provinces, Groningen and Utrecht, have cities in them by the same name. The cities of course came first, and are the usual referent of these names; the provinces were named after the cities. But what order should the definitions be in? Since the city is the most common sense, it should come first. However, the city is defined as being in the province of the same name, which hasn't been defined yet. So the city definition depends on the province definition. How should this be handled? —Rua (mew) 20:30, 10 November 2017 (UTC)[reply]

A map would give an ostensive definition of both (or all three) that surpasses the kind of verbal definition in those entries. A good verbal definition of the city wouldn't reference the province. The etymology or {{defdate}} (together with basic common sense on the part of the user) could address which definition came first. DCDuring (talk) 21:14, 10 November 2017 (UTC)[reply]
I guess the province should be listed last: A province in the Netherlands named after the city / capital ? DonnanZ (talk) 00:09, 12 November 2017 (UTC)[reply]
  • Just in passing…"Since the city is the most common sense, it should come first." – that isn't policy AFAIK, and I for one am dead against it (in favour of historical ordering). Ƿidsiþ 08:11, 16 November 2017 (UTC)[reply]
    • The city is also the oldest sense, so the point is moot. But the oldest sense should definitely not come first by default. Do we really want entries to start with a bunch of obsolete senses before getting to the ones that really matter? I would rather have the most important sense first, the one that is most likely the one intended by the user. —Rua (mew) 15:13, 16 November 2017 (UTC)[reply]
      • "Do we really want entries to start with a bunch of obsolete senses before getting to the ones that really matter?" I do, since for me those are the ones that matter. It also makes the development of senses much more obvious; otherwise it's hard to understand the connection between wildly different senses of a word that has many definitions. It's also not clear to me how you will determine which sense is most "important" in a word like set. But this is a discussion that should be pursued elsewhere; the only point to make now is that we do not have an agreed policy on the matter. Ƿidsiþ 15:50, 16 November 2017 (UTC)[reply]

French IPA pronunciation - express markup vs. autotemplate

Is it preferable to replace express IPA markup with autotemplate, like in diff?

Thus, replace {{IPA|/e.zɔ.te.ʁik/|lang=fr}} with {{fr-IPA}}.

--Dan Polansky (talk) 22:02, 10 November 2017 (UTC)[reply]

I don't have a problem with it, if the template generates the right output. —Rua (mew) 22:51, 10 November 2017 (UTC)[reply]
If the backend module is stable and well tested I am fine with it. DTLHS (talk) 01:17, 11 November 2017 (UTC)[reply]
So am I. I've seen a bunch of these replacements over the last several weeks and they always seem to give the right results. It is possible to give a phonetic respelling in |1= for words whose pronunciation is not predictable from the spelling. —Aɴɢʀ (talk) 08:12, 11 November 2017 (UTC)[reply]
More importantly: with well over 5,000 edits over 10 days at rates up to 8 per minute, this looks like an unauthorized bot. I've blocked them accordingly. Chuck Entz (talk) 23:29, 11 November 2017 (UTC)[reply]
Well, if you're referring to User:86.130.177.172, it's not a bot. Just a human who edits fast. Please don't block them. They are useful. --Spreaderofwords (talk) 15:49, 13 November 2017 (UTC)[reply]

Characters in the same phonetic series (Zhengzhang, 2003)

Could sb. please add how the info. "Characters in the same phonetic series (Zhengzhang, 2003)" can be used for language learning purposes? Thanks --Backinstadiums (talk) 20:52, 11 November 2017 (UTC)[reply]

It would help you understand how the characters were coined in the first place, I hope. Wyang (talk) 21:12, 11 November 2017 (UTC)[reply]
@Wyang: could the rest of reconstructions be added as well? --Backinstadiums (talk) 15:30, 12 November 2017 (UTC)[reply]
@Backinstadiums I don't have the data, unfortunately. Someone needs to key in the data manually or extract them from somewhere reliable. Wyang (talk) 07:38, 14 November 2017 (UTC)[reply]

Changes to the global ban policy

Hello. Some changes to the community global ban policy have been proposed. Your comments are welcome at m:Requests for comment/Improvement of global ban policy. Please translate this message to your language, if needed. Cordially. Matiia (Matiia) 00:34, 12 November 2017 (UTC)[reply]
On Meta, I posted my oppose since the change in wording makes it possible to globally ban someone based on only two bans on Wikimedia projects even if those bans are on projects ruled by small cliques without transparent processes. --Dan Polansky (talk) 11:28, 12 November 2017 (UTC)[reply]

Vote: Placing Wikidata ID in sense ID of proper nouns

FYI, I created Wiktionary:Votes/2017-11/Placing Wikidata ID in sense ID of proper nouns.

Let us postpone the vote as much as discussion requires, if at all. --Dan Polansky (talk) 09:30, 12 November 2017 (UTC)[reply]

Why is this just about proper nouns? Shouldn't we generalize this? - Jberkel (talk) 12:04, 13 November 2017 (UTC)[reply]
@Jberkel: You may already know this, but proper nouns seem to be a special case. Wikidata is for concepts, not words. This is OK for proper nouns. Wikidata is expected to have a specific data item for each place, like d:Q29 = Spain. But things get tricky with some other parts of speech. Like, d:Q3133 is green, but would we use that for the noun or adjective? d:Q31920 is the concept of swimming, so would we add that ID for multiple senses of "swim" and "swimming"? In any event, proper nouns may get the same ID in multiple entries, but the sense should be the same: d:Q30 means United States of America and is available for use in USA, US, United States of America, United States, etc. The same ID can be used for other languages, like in entry Estados Unidos, which means "United States" in Portuguese. Still, the sense is the same.
Let me know if there are any exceptions or problems I did not think of yet.
I would be fine with keeping the vote for proper nouns only. Or maybe creating 2 options: proper nouns, and everything else. But the "everything else" part was not properly discussed yet, so maybe I would oppose it. I support placing the Wikidata ID as the sense ID of proper nouns.
It is important to note that some Wikidata ID senses are already being added to proper noun entries. I assume this is why @Dan Polansky created this vote. This Wikidata ID project was discussed at least a couple of times (discussion links are in the vote). This project was never voted on. --Daniel Carrero (talk) 13:50, 14 November 2017 (UTC)[reply]
OK, but I can think of many cases where Wikidata ids are also applicable for “normal” nouns / senses, if they refer to specific concepts. As an example, class can refer to a class in Object Oriented Programming: this can be mapped to d:Q4479242. If class refers to a mathematical class, there's d:Q217594. etc.
Of course we shouldn't add Wikidata ids randomly to any sense, but really just to those where there is a good match and where it is necessary to distinguish multiple senses of a word. For proper nouns this is less controversial but also less useful (ok, the classic counter example here would probably be “Paris, Texas”). – Jberkel (talk) 17:02, 14 November 2017 (UTC)[reply]

Browser extension to label a selected Chinese word

Since I am starting to learn Chinese, instead of sitting down and manually editing entries systematically, I thought about doing it on the go, checking whenever they show up in an online text.

After selecting a Chinese word, I need an extension to show which of four groups of words, numbered 1 to 4, that word belongs to; if the selected word does not belong to any of the four groups, nothing should pop up. The four groups of Chinese words are in this list, and the label to appear is the "level number" of each category.

Regarding the appearance of the number (typeface, font, place etc.) any suggestions are welcome. @Atitarev --Backinstadiums (talk) 09:41, 12 November 2017 (UTC)[reply]

If you are interested in a general cursor translation tool for Chinese, you can try Lingoes (for Windows) or the in-built Mac dictionary. Both provide decent results for simplified Chinese at least. Wyang (talk) 09:47, 12 November 2017 (UTC)[reply]
@Wyang: That's not what I meant. Please check this thread. --Backinstadiums (talk) 09:54, 12 November 2017 (UTC)[reply]
That discussion is a bit hard to follow, I still don't quite understand your request or question. Do you mean editing Wiktionary pages themselves to include the levels, or ...? Wyang (talk) 09:58, 12 November 2017 (UTC)[reply]
@Wyang: A browser extension so that any user can select the headword of a Chinese entry which hasn't been indexed to an HSK level category yet, and check which one it belongs to, or whether it doesn't belong to any, thereby collectively editing them. --Backinstadiums (talk) 10:09, 12 November 2017 (UTC)[reply]
(edit conflict) @Wyang: User:Backinstadiums must be talking about a pop-up dictionary similar to Perapera Chinese Firefox or Chrome, which I use (there are also Japanese and Korean versions). Absolutely unreasonable requests. We are not producing browser extensions! --Anatoli T. (обсудить/вклад) 10:12, 12 November 2017 (UTC)[reply]
If Anatoli's right, then my first reply in this thread would apply - there are already mature software and extensions that do that. If the aim is to add the appropriate category to all HSK words, manually adding the category to entries missing it would be easier to achieve and much more efficient. Wyang (talk) 10:18, 12 November 2017 (UTC)[reply]
@Wyang: "If the aim is to add the appropriate category to all HSK words": correct, that's the goal. "manually adding the category to entries missing it would be easier to achieve and much more efficient": this is false, for the fact is that it's not done yet. I just want to engage with the thousands of users that use Wiktionary monthly. Creating a page of tasks or projects to carry out would really help, similar to https://commons.wikimedia.org/wiki/Commons_talk:Stroke_Order_Project. -- Backinstadiums (talk) 11:01, 12 November 2017 (UTC)[reply]

Topics of discussion

I'm new to Wiktionary. Regarding topics, and continuing from the BP discussion above: what topics are there besides (1) improving Wiktionary, (2) word etymologies, (3) policy topics, and (4) discussion about discussion? -Aision (talk) 03:13, 13 November 2017 (UTC)[reply]

Occasionally we slag off another that we find disagreeable. And every now and then we tell jokes or play a wiki-game. --Spreaderofwords (talk) 15:39, 13 November 2017 (UTC)[reply]

Defining affixes

I've seen that Equinox (talk • contribs) has been changing a lot of affix definitions to be less descriptive and more gloss-like; for instance, the definition of viscero- used to say "Forming compound words related to viscera", and it now reads just "viscera". Ditto for many dozens of similar affixes on my watchlist. Is this in accordance with some policy that I missed? It seems strange to me, because "viscero-" doesn't really mean "viscera", otherwise you'd be able to say "I removed the animal's viscero-". I understand that descriptive definitions are something of a grey area, but in the case of affixes they seem justifiable to me. Ƿidsiþ 09:57, 13 November 2017 (UTC)[reply]

Mh, yes, I preferred the old version. I've reverted back to it and added {{non-gloss}}. --Barytonesis (talk) 10:05, 13 November 2017 (UTC)[reply]
Well OK, but given how many pages this involves, I'd rather work out some consensus on the issue rather than changing one or two here and there. Ƿidsiþ 11:58, 13 November 2017 (UTC)[reply]
I concur as well. —Μετάknowledgediscuss/deeds 19:15, 13 November 2017 (UTC)[reply]
Hi! I've been doing this because I think that sense lines should tell us what is meant by the prefix (or suffix etc.). We don't define apple as "MEANING A SORT OF FRUIT", we just say what it means. So, to me, we should equally define a *fix like "-phobia" as meaning "fear" and not as "MEANING FEAR" because that isn't a definition per se. I seem to be in a minority here but let us talk a bit more. Why should pre/suffixes be defined in a weird way that we would revert in a second if we saw them for normal nouns or adjectives? Equinox 23:34, 13 November 2017 (UTC)[reply]
The quick-and-dirty fix is to put the old n-g template around some text that says "meaning XYZ" but I really think that's dodging the issue. Surely it's reasonable to say (for example) that the prefix "mega-" means "million", without spazzing too much. Yet would we write "MEANS A MILLION" at the capital symbol M, rather than merely writing "million"? Look at the SI units. Equinox 23:40, 13 November 2017 (UTC)[reply]
I sort of want to ask, if we need to write that a suffix -xx is used in "forming compound words", well what else would it do? Suffixes by their very nature form compound words. When I delete that stuff I feel like I'm deleting the bracketed part from "dog, n. NOUN [THIS IS A WORD AND A NOUN] an animal that barks". hissss. Equinox 23:45, 13 November 2017 (UTC)[reply]
I understand the concern but I think it makes things less clear. "An animal that barks" is already a (compound) noun, and "move quickly" for "run" is already a (compound) verb, but the same is not true of your affix definitions. It's a bit like interjections – we define words like sorry by saying that they "express regret" or similar, which is metadefinitional in the same way. I don't know exactly how it should be codified, but the point is that I think for most people, saying that viscero- means "viscera" is not sufficient, even given the stated part of speech. Ƿidsiþ 07:39, 14 November 2017 (UTC)[reply]
Perhaps we shouldn't give them a definition at all, but just redirect to the plain word somehow? --Barytonesis (talk) 10:46, 14 November 2017 (UTC)[reply]
The Equinox version has its charm, for the reason given by Equinox: "Forming compound words concerning viscera" does not tell me anything that I do not learn from "viscera". --Dan Polansky (talk) 18:05, 14 November 2017 (UTC)[reply]
https://www.merriam-webster.com/dictionary/cardio- does basically what Equinox does; so does AHD[1]; Macmillan has "relating to the heart"[2]. --Dan Polansky (talk) 18:08, 14 November 2017 (UTC)[reply]

Independent Synonyms section, or {{synonyms}} under each relevant sense?

I've noticed some use of {{synonyms}} just after relevant senses, replacing the independent ====Synonyms==== section, as in this edit to the 青空 (aozora) entry. I don't see anything relevant at Wiktionary:Entry_layout#Synonyms. Is this use of {{synonyms}} the expected practice going forward?

Curious, ‑‑ Eiríkr Útlendi │Tala við mig 17:29, 13 November 2017 (UTC)[reply]

I don't think it has widespread consensus yet, but I prefer it because it's much easier to associate synonyms with specific senses that way. When they're in a separate ====Synonyms==== section, there's a risk that a sense will get deleted, or a sense will be split into two, and then the synonyms listed in a separate section get stranded. When they're right there under the sense, people are more likely to remember to move them at the same time. They're also easier for users to find when they're right there. —Aɴɢʀ (talk) 17:52, 13 November 2017 (UTC)[reply]
One (current) problem with {{syn}} / {{synonyms}} is that it's not possible to put any qualifiers or notes along with the synonyms such as "rare". DTLHS (talk) 17:56, 13 November 2017 (UTC)[reply]
That problem could be (and should be) solved — each synonym can have an alternative display with alt1= etc, so they could have qual1= etc just as easily. —Μετάknowledgediscuss/deeds 19:14, 13 November 2017 (UTC)[reply]
I use {{syn|pt|[[word|words]] {{q|qualifier}}}}. — Ungoliant (falai) 19:23, 13 November 2017 (UTC)[reply]
Functional, but ick -- that's starting to look dangerously like Perl.  :-P ‑‑ Eiríkr Útlendi │Tala við mig 19:41, 13 November 2017 (UTC)[reply]
Yeah, it’s not pretty. Still, I think it’s better than using qual1=, qual2=, alt1=, etc., since that would generate incorrect content whenever someone moves, removes or adds synonyms without paying attention to the other parameters.
Another possibility is using regular parentheses and having the module format it like {{q}}, but that would remove the distinction between qualifiers and optional particles (e.g., deixar has a synonym largar (de), where deixar X is synonymous with largar X and largar de X). — Ungoliant (falai) 15:19, 14 November 2017 (UTC)[reply]
  • Ideally, synonyms, translations and other sense-specific information would all come in little drop-down boxes on the definition lines, much as citations do now. @Ruakh raised this issue years ago and suggested some technical ways to make it work, but it never got anywhere (mainly due to general apathy, I think – no one really objected to it). Ƿidsiþ 10:35, 14 November 2017 (UTC)[reply]
Yes, there should be drop-down boxes possible for semantic relations. For example I can’t make the entry graph very pleasing without it. Technically it would be fine if I could just take the hyponym table that is now in a separate section and put it (minus the terms belonging to other senses) with few changes under the sense it belongs to. Maybe what we need are wrappers {{hypowrap}} and so on in which we can insert the tables, but as I see for at least that table in graph that does not have multiple columns it would be fine to have parameters, like table=yes or hidden=yes, in the semantic relations templates because it is quite easy to convert that table for {{hypo}}, only that under a sense I require a hidden listing instead of a full listing without user interaction. Also, why don’t I see a template {{mero}}, {{holo}}, {{coordinate}} while there is {{syn}}, {{hypo}}, {{hyper}}, {{ant}}?
In fact, the same can be said about translations. We would then need a wrapper which the user could click for each sense to see semantic relations and translations. Or that table would have smart mathematics and hide translations always unless clicked (would be good for learners) and show semantic relations in so far as they do not have excessive length:
ausmachen
1. to turn off, switch off
Antonyms: anstellen, anmachen, anschalten, einschalten, (click to see more)
(click to see translations) (if it is an English sense) Palaestrator verborum (loquier) 18:33, 14 November 2017 (UTC)[reply]
One problem I have with putting more things directly under definitions is that it makes it harder to modify or combine English senses. This is especially relevant for translations since an English editor probably will not know all the languages under a particular sense. This problem does not occur with the current gloss system, although translation boxes may become out of sync with definitions. DTLHS (talk) 18:38, 14 November 2017 (UTC)[reply]
It does in fact occur currently, except that people ignore the translations when merging senses, so that they go out of sync. Putting them under senses would just make it harder to ignore this problem. —Rua (mew) 18:44, 14 November 2017 (UTC)[reply]
If an English sense is modified and the corresponding translation box is not, those translations are still valid since they have an independent gloss that they refer to. DTLHS (talk) 18:46, 14 November 2017 (UTC)[reply]
However one might still have the current translation headers. Translation sections sometimes need to be more differentiated than the English glosses. For example I added under interference a specific legalese sense “intrusion into the scope of protection of a guaranteed right” without finding it necessary to add any English gloss, as the sense is included by the most general definition while other languages have already a distinction there. For the inner-language semantic relations, it would help to keep the senses in sync with the synonyms and other semantic relations if these are directly under sense, as else editors are tempted to underspecify to which senses the synonyms belong if they do it in special sections, and, what Rua said, it is a gentle push to keep things in sync if the semantic relations and translations are under the senses. Palaestrator verborum (loquier) 18:52, 14 November 2017 (UTC)[reply]
Also one should be more conscious about when it can actually happen that senses get merged, modified, or added.
  1. If senses are added, there is no problem with the translations and semantic relations. There are none for the new sense and the older specified ones are not wrong.
  2. If you combine senses, it is, as I understand it, because it has been foolish in the first place to differentiate, so there can be no distinction discerned in two translation sections either. There are many such English definition pages, and I would like to ask how a translator shall distinguish all the allegedly different eleven concepts of understanding currently listed at enforce – it is hard to relate a table in this lemma to a gloss, and translators have already forgotten the sense which they actually want to translate when they have chosen a table, and one doubts the essentiality of the differences between the glosses; this is why there are almost no translations, unlike on enforcement. That article can hardly be made worse by adding the translations under glosses and also not further become worse by further merges, as long as and because the distinctions in the translations tables are incomprehensible. Actually having bilingual dictionaries at hand for some of the translation targets should tell you if it is really necessary to distinguish so many meanings. If in doubt, delete the translation tables because they are insufficient and unhealable to allow the births of new ones – it is better than letting confusing, poorly hedged glosses stay.
  3. If you modify senses, it is because you want to make the senses clearer: In those cases the translations will continue being correct and in the future it will be easier to translate because the sense to be translated is more clear for the translator. If you modify the sense because the former sense was just plain wrong and not extant for the lemma, which is highly unlikely, the translations can be deleted because they are misplaced. Palaestrator verborum (loquier) 19:25, 14 November 2017 (UTC)[reply]
Very true, but there's a fourth case: splitting senses. —Rua (mew) 19:37, 14 November 2017 (UTC)[reply]
Well, I presumed that the senses are somehow monadic, so splitting would be modifying (specifying) + adding. Because how wretched must one be to have actually two separate meanings as one gloss? At the latest the translator will split up such a gloss. It’s the whole problem of unreadable pages and translations conflicting glosses if the editors do not understand that the fact that they can describe meanings in multiple glosses does not imply that they are dealing with that many separate meanings. By talking more one does not necessarily make more statements, sometimes one clarifies, sometimes one even obfuscates by hiding the simple. For the lovers of examples, a complete fail of the described kind has happened at نِسْبَة (nisba) until fixed today where there was a list of glosses with one word each gloss as if the English words had no manifold meaning themselves. The skilled editor does not just list every possible rewording but distinguishes the meanings in relation opposite to each other and specializes them as feasible – with means of vocabulary or layout – for the reader to understand given sentences that contain the word; sometimes the translation tables are desired to be even more special than the meaning has been presented without them, but listing the semantic relations and translations for each gloss helps to distinguish independently of this. There is in the case of enforce one abstract meaning that, as I admit, needs at least two main glosses to describe (mainly force for the increase of the effect of something being added and force being just applied on something or someone for a certain effect, as I see before now actually regrouping the enforce lemma content), but not eleven. Palaestrator verborum (loquier) 23:16, 14 November 2017 (UTC)[reply]
I like separate synonym sections as mandated by WT:EL, but there was a recent poll showing many people like the new format. --Dan Polansky (talk) 07:28, 17 November 2017 (UTC)[reply]
The poll is at Wiktionary:Beer_parlour/2017/May#Poll: putting "nyms" directly under definition lines. It shows a 2/3-supermajority support placing -nyms (including synonyms) directly under definitions. --Dan Polansky (talk) 07:44, 17 November 2017 (UTC)[reply]
I have a stylistic objection, currently "Synonyms:" is so big and bold that it overshadows the definition line above it which should be more important. Crom daba (talk) 15:12, 22 November 2017 (UTC)[reply]
I agree with Crom daba. Plus, I can’t see why synonyms should be linked. — Ungoliant (falai) 15:26, 22 November 2017 (UTC)[reply]
I'd like to point out that the poll shows there is no consensus to put them under definitions if they are not hidden like quotations, which is not currently the case. There is only majority support for having them under definitions AND hidden. Andrew Sheedy (talk) 02:26, 4 December 2017 (UTC)[reply]
Nobody has bothered to make them hidden yet. —Rua (mew) 11:58, 4 December 2017 (UTC)[reply]
I express that I also have that stylistic objection – Synonyms should be somewhat greyed out, it is too obtrusive. Else I do not care if it is hidden or not. Maybe the best is a display on hovering, as with the CSS pseudo-class :hover, to introduce a third option. @Rua Palaestrator verborum (loquier) 16:05, 4 December 2017 (UTC)[reply]
Hiding something until the user happens to place their mouse in that location is terrible discoverability, and terrible usability. How would anyone know to mouse-over? ‑‑ Eiríkr Útlendi │Tala við mig 18:20, 6 December 2017 (UTC)[reply]
@Eirikr: Don't you like treasure hunts? --Barytonesis (talk) 18:50, 6 December 2017 (UTC)[reply]
Never mind platforms that don't even have a mouse cursor. —Rua (mew) 18:51, 6 December 2017 (UTC)[reply]

Hyphen at the end of a proto-language word

What is the meaning of the hyphen at the end of a proto-language word? For example, if the source uses *pura without a hyphen, can it be changed to *pura-? This change was made at the Hungarian verb fúr and I wonder if it is valid. Thanks. --Panda10 (talk) 18:11, 13 November 2017 (UTC)[reply]

It's how the lemma form has been chosen for Proto-Uralic verbs. Verb lemmas are the stem, noun/adjective lemmas are the nominative singular. —Rua (mew) 18:59, 13 November 2017 (UTC)[reply]

Proposal: delete all Latin script letter senses from all languages except Translingual

I suggest deleting all Latin script letter senses from all languages except Translingual. The "letter name" senses like "bee" = "name of letter B" may be kept.

Example: diff. (I just edited the first few language sections)

If we keep the current format without changes and add a language section for each language that uses "a", it will become basically infinite, and useless. It gets increasingly hard to find the non-letter senses.

I know I have proposed different things in the past concerning letters. I've been trying to figure out what to do with them.

Pronunciation can be found in appendices like Appendix:English pronunciation. The appendices may even be expanded with detailed info if needed.

Feel free to propose different things or say if there's any problem with this idea.

This is a major proposal, so if people like this idea, I'm sure it would need to be voted at some point. I'm not in a hurry. --Daniel Carrero (talk) 23:08, 13 November 2017 (UTC)[reply]

Why only Latin? The page А is also huge. DTLHS (talk) 23:12, 13 November 2017 (UTC)[reply]
I support this, but for other scripts as well. --Barytonesis (talk) 23:29, 13 November 2017 (UTC)[reply]
What about unusual or maybe even non-Translingual letters? For example, are there languages other than Georgian using Georgian letters like , or languages other than Cherokee using Cherokee letters like ? If there aren't, then properly speaking it's not translingual.
What about inflection? {{en-letter|upper=A|lower=a}} also produces "plural a's" which is an English plural. Well, it might belong into a noun and not a letter section, but there wouldn't be much gain in removing the letter section and adding new noun sections for inflection.
-84.161.12.88 00:05, 14 November 2017 (UTC)[reply]
Support, though some solution for inflection does need to be found. I definitely agree that these pages are so huge, and have the potential to get so much huger, that they are useless. Moreover, letters themselves don't convey meaning, they just stand for themselves just like any mention of a word stands for the word itself, so they don't even meet CFI. Yes, letters can be used in an ordinal fashion, but our current entries don't make any mention of this so there is no loss in that regard. —Rua (mew) 00:18, 14 November 2017 (UTC)[reply]
Oppose. I find pronunciation information for individual languages very helpful, especially for the actual name of the letter (how else could you figure out how to spell in another language?). As an alternate, would it be possible to make the Translingual page the default, and move most language-specific information on the letter to an appendix, or break it up into more than one mainspace entry somehow (with links from the main entry), or have two versions of the page? I very badly don't want to lose all the pronunciation information, so I think simply stripping the entries down is a bad idea. Andrew Sheedy (talk) 01:31, 14 November 2017 (UTC)[reply]
"Pronunciation can be found in appendices like Appendix:English pronunciation." ;) —suzukaze (tc) 01:54, 14 November 2017 (UTC)[reply]
Does that appendix explain that the letter A is pronounced /eɪ/? I can't see it. Ƿidsiþ 08:57, 14 November 2017 (UTC)[reply]
@Widsith: No it doesn't, but see my suggestion below. —Aɴɢʀ (talk) 10:08, 14 November 2017 (UTC)[reply]
Support, and for non-Latin scripts too. —suzukaze (tc) 01:54, 14 November 2017 (UTC)[reply]

Vote: Restricting Thesaurus to English

FYI, I created Wiktionary:Votes/pl-2017-11/Restricting Thesaurus to English.

Let us postpone the vote as much as discussion requires, if at all. --Dan Polansky (talk) 10:28, 15 November 2017 (UTC)[reply]

Comments on Recent Votes

I am curious if people feel as I do, that vote pages are often overrun with discussions and that those discussions are often tangential at best. I propose that we eliminate discussion within the vote page itself, and if someone would like to engage in some discussion concerning an individual's vote that they do so on the vote talk page. Perhaps a link to the discussion on the talk page can be placed on the voting page, but protracted conversations within the vote only obfuscate the results. I am only referring to secondary comments; a comment left by the voter to clarify or justify their vote is, of course, perfectly fine.
Secondly, a number of times recently there have been comments to the effect that votes should include a written rationale, or requests that individuals justify their votes in some way. This is inappropriate. Any member of the community is eligible to vote, and they can vote how they please for whatever reasons they please; if they wish to share their rationale that is their prerogative. Requesting clarification about a vote is acceptable (e.g. "if the proposal had been x instead of y would you have voted differently") but demanding justification is completely out of line. - TheDaveRoss 15:33, 15 November 2017 (UTC)[reply]

protracted conversations within the vote only obfuscate the results – well, that is why the three voting possibilities have colored badges so one can count through. And technically the “result” has only to be counted through at the end by one person. Isn’t it then actually good that the result is obfuscated? It diminishes group pressure and enables free voting because one does not see such heaps of votes.
Also I have suggested that some kind of hedge template is created that demarcates digressions from the topic and maybe needs a click for drop-down because often nobody does want to move or one does not know whither exactly to move the discussions. Palaestrator verborum (loquier) 11:04, 16 November 2017 (UTC)[reply]
Oppose. First of all, about: "demanding justification is completely out of line". I'm not sure about that. I would support having a different voting system where unjustified votes don't count. (except maybe votes for bots and admins, etc.) But I'm fine with the current system, anyway.
More importantly, I disagree with what has been proposed. I would prefer still freely allowing discussion in vote pages. Some written rationales are more likely than others to attract further comments from other people, and thus discussions begin. I think that's fine. In fact, I consider the in-vote discussions even more important than the actual simple count of "Support"/"Oppose"/"Abstain". The discussions should encourage us to think and justify stuff. If someone says "I'm voting because of reason X", it would most likely be an avoidable hassle to go to the talk page to discuss about reason X, if the alternative is replying right then and there in the vote page itself.
When someone adds the first comment to someone's rationale, it's not clear yet if this will become an actual discussion. It's not clear if someone else will reply. I don't suppose we're expected to create separate sections in the talk page for each little comment when we don't know if this will become an actual discussion, right?
Interestingly, we have this rule in WT:Voting policy: "Debate is welcome on these pages." But this rule was never voted itself. --Daniel Carrero (talk) 12:30, 16 November 2017 (UTC)[reply]
@Daniel Carrero Part of the reason I dislike discussions within the vote is that it indicates that the vote was premature, and there was not adequate discussion prior to its start. Votes should not be islands, they should be the final step in a process. The commentary often takes forms which I find distasteful, such as pressure to change one's vote, casting aspersions on the subject of the vote or the voter themselves, or other such nonsense. Why is having the discussion within the vote so important? Re whether or not a comment will lead to a discussion, I think any comment beyond that of the voter constitutes discussion, so every time it happens it is the case. - TheDaveRoss 14:08, 16 November 2017 (UTC)[reply]
I don't think discussions in the vote are (necessarily) an indication that the vote is premature. I think we can't ever say for sure that the discussions are over. If some idea is discussed by many people and has close to 100% approval before any vote starts, this would indicate that the issue is so noncontroversial that maybe a vote is not even needed. --Daniel Carrero (talk) 14:34, 16 November 2017 (UTC)[reply]
Approval and sufficient discussion are not the same thing, I am perfectly fine with a vote being run for a controversial topic. But if people feel the need to make arguments within the vote itself it is an obvious indicator that they have not had sufficient opportunity to do so ahead of time. - TheDaveRoss 15:07, 16 November 2017 (UTC)[reply]
I oppose removing discussions from the vote page. I like the idea of show-hide concealment, perhaps beginning a short time after the last comment in the thread or when the thread is 'too' long. DCDuring (talk) 13:08, 16 November 2017 (UTC)[reply]
I support using show-hide concealment in some cases too. --Daniel Carrero (talk) 13:09, 16 November 2017 (UTC)[reply]
@DCDuring If the consensus is that discussions within votes are appropriate, then I would support hiding them as well. - TheDaveRoss 14:08, 16 November 2017 (UTC)[reply]
  • I oppose removing, limiting or hiding discussions from the vote page. These discussions are a great feature. Votes should be understood to be votes/requests for comment. If votes have the potential to be "evil" to an extent, these discussions directly on the vote page make them less so. I think we should invite more discussion directly on the vote page. The discussions do not "obfuscate the results" since we use icons and vote counting. As for "demanding justification is completely out of line", I think the opposite: we should do more to encourage people to provide rationales and make votes more like requests for comments and discussions. The idea that enough people come to discuss a proposal before the vote starts is unrealistic; it is the vote that forces people to pay attention to a proposal lest it passes. --Dan Polansky (talk) 07:24, 17 November 2017 (UTC)[reply]
    That said, it may be a good practice to continue the discussion on the vote talk page once it becomes more protracted. --Dan Polansky (talk) 17:15, 17 November 2017 (UTC)[reply]
    I also oppose, for pretty much identical reasons to Dan. —Μετάknowledgediscuss/deeds 17:16, 17 November 2017 (UTC)[reply]
    @Dan, I would consider the sentiment of your final sentence to be an abuse of the voting system, not a feature. - TheDaveRoss 20:15, 17 November 2017 (UTC)[reply]
    @TheDaveRoss: I read that sentence as a simple statement of fact, not an endorsement of starting votes purely to force people to think about things. — Eru·tuon 22:40, 20 November 2017 (UTC)[reply]
This made me start vaguely wondering if a secret ballot (with names/votes posted at conclusion of vote only) would be an improvement over a public voting page. (I doubt it!) It wouldn't stop people from making comments, anyway; would just prevent voting based on how others have voted, which I bet also happens. Equinox 16:49, 17 November 2017 (UTC)[reply]
I like to see how others have voted. There are some editors I trust a great deal, and I do sometimes vote oppositely to them, but I am sure to think it through more carefully first. —Μετάknowledgediscuss/deeds 21:01, 27 November 2017 (UTC)[reply]

stroke order exceptions within the standard guidelines

Where could I find a database of exceptions to the stroke order that the standard guidelines would lead one to expect? Input methods' lists of input sequences could be used for this, which is of great lexicographical value. --Backinstadiums (talk) 09:34, 19 November 2017 (UTC)[reply]

After jumping from page to page for an hour, I still do not know how to request a translation. The article I'd like to have an English version of, among others, is https://zh.wikipedia.org/wiki/%E5%90%88%E6%96%87. Would it be possible to request the translation of a whole "category"? Thanks --Backinstadiums (talk) 13:49, 20 November 2017 (UTC)[reply]

We have WT:TRREQ, but requesting translation of whole Wikipedia articles is asking an awful lot. Of course, you could enter the URL of the Wikipedia page into Google Translate, but for the URL you gave the result is pretty useless (Chinese speakers may find it good for a laugh, though). Chuck Entz (talk) 14:25, 20 November 2017 (UTC)[reply]
Indeed, and it starts right from the title... lol! Wyang (talk) 14:28, 20 November 2017 (UTC)[reply]
@Chuck Entz: I meant there used to be a page for this in Wikipedia, namely https://en.wikipedia.org/wiki/Category:Translation_Request, yet it's inactive now, and I cannot find the current formal procedure --Backinstadiums (talk) 14:58, 20 November 2017 (UTC)[reply]
@Backinstadiums: Did you see w:Wikipedia:Translation#Requesting a translation from a foreign language to English? —Aɴɢʀ (talk) 16:38, 20 November 2017 (UTC)[reply]
@Angr: Yes, and still do not know where to post... it's easier in Wiktionary. I cannot seem to find the exact final thread to request it. I think those articles are highly relevant for Wiktionary as well --Backinstadiums (talk) 18:31, 20 November 2017 (UTC)[reply]

New print to pdf feature for mobile web readers

CKoerner (WMF) (talk) 22:07, 20 November 2017 (UTC)[reply]

Noto font family

Google has designed and released Noto Sans and Noto Serif (freely available) that can handle a large number of languages. Currently, Noto covers over 30 scripts, and will cover all of Unicode in the future. Noto download. Go to Noto Sans specimen to see how it looks. —Stephen (Talk) 05:31, 21 November 2017 (UTC)[reply]

Bah, still no Manichaean. Crom daba (talk) 14:23, 21 November 2017 (UTC)[reply]
I use this for everything, it's awesome. Their Devanagari serif font looks so cool and has a ton of ligatures. —AryamanA (मुझसे बात करेंयोगदान) 00:57, 30 November 2017 (UTC)[reply]

Gender-neutral French plurals

Should we be concerned with documenting these? —Justin (koavf)TCM 18:03, 21 November 2017 (UTC)[reply]

If they're attestable, of course. —Mahāgaja (formerly Angr) · talk 19:24, 21 November 2017 (UTC)[reply]
The · entry might need the addition of a "French > Symbol" section — again, if attestable. Equinox 19:31, 21 November 2017 (UTC)[reply]

Similarly spelled words (not alt. spellings) but somewhat related, e.g. skink and skunk

Hi, I'm not sure if this should go under "See also" but sometimes it's useful to show similarly spelled words. A majority of English speakers might know skunk but some might not remember or know skink (a type of lizard). Obviously this should be sparingly used. Thanks in advance for any advice! Facts707 (talk) 13:58, 26 November 2017 (UTC)[reply]

I don't think we should put "unrelated" stuff in the actual entry (in most cases). This kind of "did you mean...?" seems like a job for the search engine. (At present, it means you just have to click the Search button instead of the Go button; and I'm not sure how well it deals with soundalikes.) Equinox 21:31, 27 November 2017 (UTC)[reply]
Currently similar things are put under “usage notes”, which is not optimal: etymology is “not to be confused with entomology (“the study of insects”) or etiology (“the study of causes or origins”)” Palaestrator verborum (loquier) 17:04, 30 November 2017 (UTC)[reply]

Uncountable nouns under singularia tantum

The category for uncountable nouns is currently under the category of "singularia tantum". I don't think this is right for languages that do not really have grammatical number, like Chinese. Chinese still has uncountable nouns, i.e. nouns that do not take a classifier/counter, but these are not "singularia tantum". — justin(r)leung (t...) | c=› } 20:49, 27 November 2017 (UTC)[reply]

But our category structure is gem-like in the perfection of its universality. DCDuring (talk) 22:15, 27 November 2017 (UTC)[reply]
Indeed. We categorise everything from Category:Aa: ⠁ to Category:Zz: ⠵ via Category:Furry fandom and Category:Working dogs. --Lirafafrod (talk) 22:31, 27 November 2017 (UTC)[reply]
The underlying issue would seem to be that most of the entries use {{en-noun|-}} which automatically categorizes them as uncountable nouns. I don't think many of them are. We don't seem to have a {{en-singular noun}} that parallels {{en-plural noun}}. Category:English singularia tantum needs cleanup, as it, among other things, contains words that agree with both singular and plural verbs. I've never understood why we emphasize the form of the word rather than the number of the verb that it normally agrees with. DCDuring (talk) 18:43, 28 November 2017 (UTC)[reply]
I don't particularly like the term, it has a certain "urgh" factor, especially when there are categories in various languages for uncountable nouns, and where nouns can be recorded as uncountable by the template used. I have eradicated it from some entries. I'm not keen on pluralia tantum either. DonnanZ (talk) 19:53, 1 December 2017 (UTC)[reply]
I don't know of any dictionary that routinely uses singulare tantum and plurale tantum. Among OneLook dictionaries only Wiktionary and WP even define them. I'd be happy to get rid of them for all English entries and categories. DCDuring (talk) 22:21, 1 December 2017 (UTC)[reply]
It's being quite pretentious actually. So as the Daleks said: "Exterminate!". DonnanZ (talk) 17:19, 4 December 2017 (UTC)[reply]
Are there uncountable pluralia tantum? If there are, then in no way do uncountable nouns belong under singularia tantum. 1990s should be a plurale tantum and uncountable (as there is no a 1990s is/was, one 1990s is/was, two 1990s are/were but only the 1990s were (historic also: are)).
It could work the other way though: Both singularia tantum and pluralia tantum could belong under uncountable. However there could also be countable pluralia tantum, e.g. litterae (sense letter) could be countable like unae litterae (one letter), duae litterae (two letters). -80.133.105.203 17:53, 4 December 2017 (UTC)[reply]
Well, there is a template {{en-plural noun}} for English plural nouns, but I'm not sure whether it generates pluralia tantum as a category. I hope not. DonnanZ (talk) 01:04, 5 December 2017 (UTC)[reply]
Oh, it does... — justin(r)leung (t...) | c=› } 01:18, 5 December 2017 (UTC)[reply]
Looking at the first few entries that used {{en-plural noun}}, I found feet and computer graphics reported as plural nouns. Feet is not plural only in any of the definitions in the entry. Similarly for computer graphics, which is a plural of computer graphic in one sense and is a singular noun (in terms of the number of the verb for grammatical agreement) in the other senses. It would not surprise me to find that 10-20% of the entries of the category that use the template used it improperly, in whole or in part. I can't really get excited about issues like subcategorization when plain error is so common. DCDuring (talk) 02:18, 5 December 2017 (UTC)[reply]
I'm beginning to think that we should just make Category:English pluralia tantum and Category:English singularia tantum hidden so that normal users don't see all the evidence of a high rate of error. DCDuring (talk) 02:25, 5 December 2017 (UTC)[reply]
Hmm, that's a bit like sweeping the dust under the carpet. Wouldn't it be better to amend the templates and categories? DonnanZ (talk) 13:12, 5 December 2017 (UTC)[reply]
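
The point raised in this thread, that countability and number restriction may be two separate properties rather than one sitting under the other, can be sketched as a small data model. This is only an illustration under stated assumptions: the class, field and function names are hypothetical, and the example classifications simply follow the claims made by participants above (1990s as an uncountable plurale tantum, Latin litterae as a countable one), not any actual Wiktionary template or module logic.

```python
from dataclasses import dataclass
from enum import Enum


class Number(Enum):
    """Number restriction of a noun (hypothetical labels)."""
    BOTH = "has singular and plural"
    SINGULAR_ONLY = "singulare tantum"
    PLURAL_ONLY = "plurale tantum"
    NONE = "no grammatical number"  # e.g. Chinese, as argued in the thread


@dataclass(frozen=True)
class Noun:
    lemma: str
    countable: bool
    number: Number


def categories(noun: Noun) -> list[str]:
    """Derive categories from each property independently of the other."""
    cats = []
    if not noun.countable:
        cats.append("uncountable nouns")
    if noun.number is Number.SINGULAR_ONLY:
        cats.append("singularia tantum")
    elif noun.number is Number.PLURAL_ONLY:
        cats.append("pluralia tantum")
    return cats


# Classifications below follow the participants' claims, not verified data.
print(categories(Noun("1990s", countable=False, number=Number.PLURAL_ONLY)))
print(categories(Noun("litterae", countable=True, number=Number.PLURAL_ONLY)))
print(categories(Noun("(some Chinese mass noun)", countable=False, number=Number.NONE)))
```

Treating the two properties as independent means an uncountable noun in a language without grammatical number lands only in "uncountable nouns", which is the outcome the original post asks for.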

Removing Proto-North-Caucasian

Judging by Wikipedia, North-Caucasian is not a valid language grouping. This is a part of Moscow school "Sino-Caucasian" long range reconstruction which is supposed to connect Sino-Tibetan, Dravidian, North-East Caucasian, North-West Caucasian and Yeniseian into a single super family.

I understand that it's sometimes easier to just copy things from the Starling database, but I don't think we should give space to crackpot theories over here. Crom daba (talk) 15:52, 28 November 2017 (UTC)[reply]

Well, some parts of Starling are decent enough. The North-Caucasian etymologies are particularly notorious though. —AryamanA (मुझसे बात करेंयोगदान) 00:56, 30 November 2017 (UTC)[reply]
The problem is that Starling doesn't offer reconstructions on the proto-NEC level, so the editor (specifically @Vahagn Petrosyan since he seems to be the one making these) would have to know enough about NEC to excise aspects of the reconstruction formed through spurious NWC comparanda. Crom daba (talk) 13:51, 30 November 2017 (UTC)[reply]
I don't know enough about NEC to do that, but I support removing Proto-North-Caucasian. --Vahag (talk) 16:49, 30 November 2017 (UTC)[reply]
I previously thought that it would be possible to somehow maintain these reconstructions, Proto-North-Caucasian, Nostratic and so on, with extra templates that show that even the family is not accepted, perhaps even in a new namespace to hedge this “poison”, but I am in on the deletion, because I see that these macrofamilies attract all kinds of howthery-towthery people that burn the candle at both ends with their smattering learning. It’s hard enough to know the languages and collect the data for the accepted families. It’s beyond the level of man to do more in acceptable quality – those people who want Nostratic shall fork their own Wiktionary thus. That @Jaspet shall work on his own server where he continually clones Wiktionary entries for acceptable terms, and he can create entries for the acceptable languages here. Let’s see if that works: It won’t and he can’t work this way because such people are incompetent and try to hide it by working in areas that nobody can have enough concrete knowledge about. It is like metaphysics: Nothing serious people want to read. Palaestrator verborum (loquier) 21:18, 7 December 2017 (UTC)[reply]
I too support removal. @Crom daba, is it no longer referenced in mainspace? —Μετάknowledgediscuss/deeds 21:43, 7 December 2017 (UTC)[reply]
It's still referenced, I'll clean up derived forms. Crom daba (talk) 21:54, 7 December 2017 (UTC)[reply]
With all due respect I don't know what you're talking about, nor what that is meant to convey.
There is, by the way, a major difference between comparative linguistics and metaphysics: one is a posteriori and theoretical, for which all of the methodology rests on the idea that evidence is what conclusions about reality are drawn from; the other is a priori (thus very often ad hoc), philosophical, and hypothetical, being based on personal beliefs for the causes of experiences rather than on a scientific methodology. A small and irrelevant detail, I know. I'll let you humor yourself guessing which is which.  — J​as​p​e​t 22:31, 7 December 2017 (UTC)[reply]
"Judging by Wikipedia" is your first problem. The last time I checked, Wikipedia isn't itself a source and doesn't have an opinion. Regardless, I agree that we should be using original sources rather than mindlessly pulling from StarLing in a practically copy-and-paste manner. Rather, it is more logical to cite original sources and only include StarLing when it is absolutely necessary (i.e. when it is the only source, as is the case for the majority of this database). At least one such "original source" exists in this case for North Caucasian, but it is hosted on the StarLing site itself and is by the same authors, so in this case it might not make much difference.
That said, although I do not support deleting it outright, I want little part in North Caucasian: it is a case in which there is far too much long-term proximity of the two language families for any amount of evidence to appease people, and invoking the argument that every resemblance is due to contact/borrowing is not entirely ridiculous as it would be in some other cases.  — J​as​p​e​t 22:31, 7 December 2017 (UTC)[reply]
@Jaspet: As an aside--I don't want to get off-track with the main conversation--a neutral point of view is a point of view. It's not that Wikipedia has no perspective but a neutral one. —Justin (koavf)TCM 23:38, 7 December 2017 (UTC)[reply]
There is nothing stopping us from linking these potential cognates on PNEC and PNWC pages, this however does not necessitate making PNC pages.
Seeing as there are no regular phonologic correspondences between these languages that are accepted outside of Starostin's books, we cannot do our own informed reconstructions and the entries will unavoidably be StarLing copy-pastes.
Not to mention how misleading these are: if you saw a Proto-North-Caucasian form [term lost; the template's language code "ccn-pro" no longer resolves] in an etymology, you might imagine it to be a solid etymon (well, not if you had prior contact with StarLing wildcard etymologies) with many regular descendants. Instead it's a (seemingly solid) PNWC word for a jackal/fox plus two very dubious NEC words.
This is endemic to all of StarLing, methods of reconstruction are fixed so that long-range cognates line up, this is no way to do philology.
Crom daba (talk) 00:04, 8 December 2017 (UTC)[reply]
Fair enough. I would say it has a perspective but not an opinion, such that it tries not to take sides. But I suppose this is an unhelpful semantic issue.
On a more relevant note, there is no objective manner in which to settle differences in perspectives/opinions/beliefs about linguistic theories, and for better or worse it is ultimately up to the consensus of individual Wikimedia contributors. My two cents is that, though the theory itself isn't accepted at large, that doesn't mean Wiktionary shouldn't allow the reconstructions to be shown at all. In the same way that Wikipedia articles can exist for these theoretical language families, though with plenty of disclaimers, Wiktionary entries can exist for the specific comparisons and reconstructed protoforms, with disclaimers. As has been suggested, maybe the creation of a new template (yet another disclaimer) is warranted, or I alternatively suggest going back to using appendices specifically for these less accepted reconstructions. Additionally, it is difficult to discern whether the linguistics community at large even has a perspective on each of such connections between language families, since most study directed at the comparisons are only done by those arguing in favor of their relationship.  — J​as​p​e​t 00:18, 8 December 2017 (UTC)[reply]

@Crom daba In my opinion, you wrote complete nonsense. Why did you write about the Proto-Sino-Caucasian, Proto-Dravidian languages? In fact, the criticism will be negligible if you continue to consider only the proto-word forms you need. The fundamental knowledge itself has not been destroyed by criticism. Not knowing the basics of macrocomparative, you propose to remove a reconstructed language. You naively believe that between languages that diverged "8" thousand years ago, phonetic correspondences will be like between closely related dialects. In my opinion, you propagandize the malicious dilettantism that Vovin started in the 1990s. This means that there is no reason to delete the proto-language. Gnosandes (talk) 20:32, 22 May 2020 (UTC)[reply]

Oh my, I didn't realise you were a full-throated Altaic supporter. And defensive of Nostratic, to boot, unless you completely misunderstood Crom daba. This kind of pseudoscience has no place on Wiktionary. —Μετάknowledgediscuss/deeds 20:49, 22 May 2020 (UTC)[reply]
@Metaknowledge You're more likely to be doing pseudoscience here. What is worth it is that you continue to merge the Indo-European roots into one single root. At the same time ignoring modern data of accentological science. What resistance you show me in Proto-Slavic articles, and even more so in Proto-Balto-Slavic articles. After that, you can’t say anything about Altaic and Nostratic. I treat these reconstructions both not well and not bad. For as Illich-Svitych wrote, this is the experience of reconstruction, not the truth. At the same time, critics only criticize certain reconstructions, but fundamental knowledge does not (I repeat). When dealing with Caucasian languages, which are quite complex, you suggest deleting them without having any idea about them. There are about two dozen linguists engaged in Caucasian languages in the world, and I have not seen any other reconstructions, except for those of Starostin and Nikolaev, who studied these languages on expeditions to Moscow State University. I see one paltry criticism that is widely believed. For half a year, I have sufficiently understood the essence of the English Wiktionary. Gnosandes (talk) 21:11, 22 May 2020 (UTC)[reply]
The problem with wildcarded protoforms isn't that they can't be right, but that they fit a wide variety of possibilities ranging from tidy long-range relationships to random noise. Tidiness is more fun, so long-range comparativists tend to ignore all the rest. It's sort of like going through a truckload of scrabble tiles and coming up with a quote from Shakespeare... Chuck Entz (talk) 07:22, 23 May 2020 (UTC)[reply]

Restoring vmf and eliminating gmw-hfr

Five years ago we voted to exclude the code vmf because it was unclear what language it was supposed to refer to. However, in the meantime, SIL/Ethnologue has cleaned up their definition and it is now clear that it refers to East Franconian German: Ethnologue's name for the language is "Eastern Franconian" and the area where it is said to be spoken is "Bayern state: Oberfranken, Mittelfranken, and Unterfranken districts; Thüringen state: south", which corresponds to the usual understanding of East Franconian.

At the same time, we do have gmw-hfr for "High Franconian", which is supposed to cover both East Franconian and South Franconian; however, dialectologists no longer accept that these two lects are more closely related to each other than they are to the other Upper German lects, so "High Franconian" is apparently not actually a valid clade.

I therefore request that we reinstate vmf as a valid code and call it East Franconian, and abolish gmw-hfr. There are currently only four High Franconian entries, all of which are labeled with locations in the East (not South) Franconian area, so we can safely move all of them to vmf. —Mahāgaja (formerly Angr) · talk 16:13, 29 November 2017 (UTC)[reply]

Oh, they finally did something about that mess. Support. — Ungoliant (falai) 16:24, 29 November 2017 (UTC)[reply]
Wouldn't a new code gmw-sfr (german middle west - south franconian) for South Franconian then be needed too?
Isn't according to SIL (www-01.sil.org/iso639-3/documentation.asp?id=vmf) vmf = w:Main-Franconian dialects? www.ethnologue.com/language/vmf calls vmf "Eastern Franconian" (w:East Franconian German) and "Upper Franconian" (w:High Franconian German?) and mentions the alleged autonyms "Mainfränkisch", "Ostfränkisch". Thus isn't it still unclear what's it supposed to be? Wouldn't a clear code gmw-efr (german middle west - east franconian) be a better choice, also as it is similar to gmw-sfr?
-84.161.29.245 16:29, 29 November 2017 (UTC)[reply]
Yes, we can add gmw-sfr for South Franconian too, but at the moment we don't seem to have any South Franconian entries. As for what vmf refers to, I think it's clear enough from the Ethnologue entry that all of East Franconian is meant: Ethnologue says vmf is spoken in Oberfranken, but Oberfränkisch is not usually considered part of Mainfränkisch. So if vmf includes Oberfränkisch, then vmf must be broader than Mainfränkisch. And frankly, even if only Main Franconian were meant, we have enough flexibility to extend it to the whole of East Franconian. Compare nrf, which SIL uses only for Guernésiais and Jèrriais but which we have extended to cover Sercquiais and Continental Norman. —Mahāgaja (formerly Angr) · talk 16:52, 29 November 2017 (UTC)[reply]
OK, I have added vmf as a valid code and removed gmw-hfr. For some reason {{auto cat}} doesn't work on East Franconian categories, but the other boilerplate templates (e.g. {{poscatboiler}}) do. —Mahāgaja (formerly Angr) · talk 11:29, 7 December 2017 (UTC)[reply]

Excluding syr

A user has been adding Latin-alphabet entries in Syriac, using the code syr, which currently has no other lemmas but which we define as being written in the Syriac script, not Latin. For this reason, I have reverted them. However, it turns out that syr is a macrolanguage covering Assyrian Neo-Aramaic (aii) and Chaldean Neo-Aramaic (cld), which we treat as two separate languages. I therefore suggest we exclude syr as a valid code and ask the user to add entries only in Syriac script and only using one of those two codes (whichever is appropriate). —Mahāgaja (formerly Angr) · talk 16:40, 29 November 2017 (UTC)[reply]

Yes, if we split the language one way we can not lump it the other. Also, syr is an easy misspelling for syc (I have cleaned those at some point), that’s why it should go.
Also, can somebody add language data to the Old South Arabian dialects (as etymology-only languages, like ML.)? We treat Old South Arabian as one language, but currently if someone uses the codes xsa (Sabaean), xhd (Hadramautic), xqt (Qatabanian), inm (Minaean), xha (Harami) he can do what he wants but reasonable use of the codes is hindered because the templates set the South Arabian script cursive (they should just have the same properties as sem-srb in this regard). Palaestrator verborum (loquier) 00:21, 30 November 2017 (UTC)[reply]
@Palaestrator verborum: By cursive do you mean italicization? Italicization is controlled by MediaWiki:Common.css, not by language data. There is already a style rule preventing italicization for South Arabian script: .Sarb, .Sarb * { font-style: normal; }. It will not apply in the mobile version of the site, however. — Eru·tuon 01:14, 30 November 2017 (UTC)[reply]
Oh, I see. xsa and the others don't have a script listed. Do they all use South Arabian script? — Eru·tuon 01:16, 30 November 2017 (UTC)[reply]
@Erutuon Yes, the script had been pretty widespread throughout the Arabian peninsula. I can’t tell what they also used, as the period under question is to a great part the period of emergences and mixings of various scripts and their prototypes and the epigraphic literature is scattered and hard to reach (all trifles are digitized or collected by university libraries but rarely the crucial reproductions of Arabian inscriptions), but definitely one is not wrong with ascribing the Ancient South Arabian script to all the Old South Arabian languages. Palaestrator verborum (loquier) 01:33, 30 November 2017 (UTC)[reply]
@Palaestrator verborum: Okay, done (diff1, diff2). Now the correct script class will be used and the languages will display without italics. — Eru·tuon 01:54, 30 November 2017 (UTC)[reply]
To get back to the original topic, are there any objections to my removing syr from Module:languages/data3/s and listing it as excluded at WT:LT? —Mahāgaja (formerly Angr) · talk 13:05, 30 November 2017 (UTC)[reply]
It seems like it was planned to exclude it, as WT:LT says: Syriac (syr) See the entry for "Aramaic". But it’s not there. But note that you have to move some babel templates. Currently they are all on the code syr. Lol the deletion of {{User syc}}: “Not used, and the correct code syr already exists” – but that was 2011. Palaestrator verborum (loquier)
Oh Lord. I don't even know what language {{User syr}} and {{User syr-1}} are written in. Presumably not Classical Syriac since Kathovo claims to be a native speaker. I assume he's a native speaker of either aii or cld, but he hasn't edited here since 2013 or at Wikipedia since 2016. @Lingo Bingo Dingo, Profes.I.: what language did you guys mean when you put "syr-1" on your user pages? Classical Syriac, Assyrian Neo-Aramaic, Chaldean Neo-Aramaic, or something else? —Mahāgaja (formerly Angr) · talk 21:32, 30 November 2017 (UTC)[reply]
Classical Syriac, because I couldn't find a Babel tag for that variety. Lingo Bingo Dingo (talk) 09:54, 1 December 2017 (UTC)[reply]
OK, then I'm moving the templates from "syr" to "syc", removing "syr" from the language module, and updating LT. —Mahāgaja (formerly Angr) · talk 19:25, 4 December 2017 (UTC)[reply]
I think I'm done. I've corrected all the errors that appeared at CAT:E as of now. —Mahāgaja (formerly Angr) · talk 21:39, 4 December 2017 (UTC)[reply]
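
As an illustration only, the outcome of the two code discussions above (gmw-hfr removed in favour of vmf, syr excluded in favour of aii or cld) can be pictured as a lookup that rejects a retired code and points to its replacements. This is a minimal Python sketch under stated assumptions: the table `RETIRED_CODES` and the function `check_code` are hypothetical names invented for this example and are not part of Module:languages or any other Wiktionary module.

```python
# Hypothetical table for illustration; the codes and outcomes are the ones
# stated in the discussions above, the structure itself is an assumption.
RETIRED_CODES = {
    "gmw-hfr": ("vmf",),     # High Franconian removed; entries moved to East Franconian
    "syr": ("aii", "cld"),   # macrolanguage; one of the two languages is used instead
}


def check_code(code: str) -> str:
    """Return the code unchanged, or raise ValueError if it has been retired."""
    if code in RETIRED_CODES:
        alternatives = ", ".join(RETIRED_CODES[code])
        raise ValueError(f"{code!r} is not accepted here; consider: {alternatives}")
    return code


if __name__ == "__main__":
    print(check_code("vmf"))
    try:
        check_code("syr")
    except ValueError as err:
        print(err)
```

The Babel templates are a separate case: as noted above, the editors who used "syr" there meant Classical Syriac, so those templates were moved to syc rather than to aii or cld.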

A little last-minute but I'd like a Christmas competition

We haven't had one in a few years. I'd be willing to brainstorm for one and post it December 1 if the community thinks there is enough interest in some word games. —Justin (koavf)TCM 18:28, 29 November 2017 (UTC)[reply]

See Category:Wiktionary fun stuff for previous entries. —Justin (koavf)TCM 18:28, 29 November 2017 (UTC)[reply]
Never too late — I'd love a Christmas competition. The best ones are like the one from earlier this year in that they have a structure which encourages entry creation as side effect of the game. —Μετάknowledgediscuss/deeds 19:13, 29 November 2017 (UTC)[reply]
I’d love one too. — Ungoliant (falai) 11:07, 30 November 2017 (UTC)[reply]
I could organise multilingual Scrabble, as the original creator doesn't seem to be around any longer. --Lirafafrod (talk) 15:41, 1 December 2017 (UTC)[reply]
Or we can rehash old games - I doubt any of us were still around in 2008. Wiktionary:Possible Christmas Competition 2017 1 is a catchy name. --Lirafafrod (talk) 15:49, 1 December 2017 (UTC)[reply]
Wiktionary:Possible Christmas Competition 2017 2 - rewriting a Christmas song with Wiktionary-based lyrics. --Lirafafrod (talk) 15:55, 1 December 2017 (UTC)[reply]
Thanks, Lirafafrod. I don't have more to add, so I'm in favor of whatever everyone else decides. I'll brainstorm for an Easter competition for next year. —Justin (koavf)TCM 02:38, 2 December 2017 (UTC)[reply]
His name is Wonderfool (probably) —AryamanA (मुझसे बात करेंयोगदान) 03:12, 2 December 2017 (UTC)[reply]

Let's please have a final game by Monday the 4th, so we have three weeks to "compete". —Justin (koavf)TCM 02:42, 2 December 2017 (UTC)[reply]

Ingush palochka

I've just noticed that, in our entry Ingush, the Ingush translation for the language is гӏалгӏай (ğalğaj), with Unicode u04cf for the palochkas. Wikipedia's entry for w:Ingush language has гӀалгӀай, which uses the palochka codepoint u04c0 (the original Unicode palochka, and the one used by Ingush and Chechen writers). The palochka glyph has never had separate upper- and lowercase forms, but only a single case. Recently somebody created a special Unicode u04cf (as used in our гӏалгӏай) as a new lowercase form, and this new lowercase ӏ is identical to the original Ӏ (at least in my font). Since the palochka has never had two cases, and since the original form and the new form look identical, it causes a lot of confusion. I don't think the new form is being used by anyone (except for some of our editors). Besides that, the palochka is never initial in a word, so there is no place to use an uppercase form (the uppercase being represented by the original form which is virtually always used in lowercase environments). The only situation where an uppercase might theoretically be used is in all caps, a rarity. I think we should stop using the new form (u04cf) since no one else uses it.

One result of this confusion is that Ingush words using the original palochka are being transliterated strangely: гӏалгӏай (ğalğaj). See Wiktionary:Ingush transliteration. Another difficulty is that searching for one spelling misses words written with the other spelling. This is what I think. Using that new lowercase form is like creating a special apostrophe, identical to the normal apostrophe in every way except codepoint, for use in lowercase environments. It's just a big headache for no good reason. —Stephen (Talk) 10:54, 30 November 2017 (UTC)[reply]

I think you're right, we should be using the original palochka U+04C0 in entry titles. We could make hard redirects from the forms using U+04CF just in case anyone searches for them, and to prevent people from starting new entries with that character. —Mahāgaja (formerly Angr) · talk 13:07, 30 November 2017 (UTC)[reply]
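
The search problem described above (a search for one spelling missing entries written with the other) and the redirect idea can be sketched roughly as follows. This is an illustrative Python sketch, not an existing Wiktionary tool: `canonical_title` and `redirect_pairs` are hypothetical names, and the default choice of U+04C0 as the canonical form simply follows the suggestion just made. Passing `canonical=SMALL_PALOCHKA` gives the opposite convention.

```python
import unicodedata

PALOCHKA = "\u04c0"        # CYRILLIC LETTER PALOCHKA (the original code point)
SMALL_PALOCHKA = "\u04cf"  # CYRILLIC SMALL LETTER PALOCHKA (the later addition)


def canonical_title(title: str, canonical: str = PALOCHKA) -> str:
    """Replace both palochka code points with the one chosen as canonical."""
    return title.replace(PALOCHKA, canonical).replace(SMALL_PALOCHKA, canonical)


def redirect_pairs(titles, canonical: str = PALOCHKA):
    """Yield (variant, canonical form) for titles that differ from their canonical form."""
    for title in titles:
        target = canonical_title(title, canonical)
        if target != title:
            yield title, target


if __name__ == "__main__":
    # Show that the two visually identical letters are distinct code points.
    for ch in (PALOCHKA, SMALL_PALOCHKA):
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
    print(list(redirect_pairs(["г\u04cfалг\u04cfай"])))
```

Comparing titles through `canonical_title` on both sides makes the two spellings match in a search, whichever code point is chosen for the entry titles themselves.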
I beg to differ. The small palochka is the latest required introduction but it hasn't caught up yet everywhere. Besides, this is a strict dictionary style. While it's important to show how characters are/were used in the real life, we shouldn't promote the incorrect common usage. There are numerous examples in various languages:
  1. Russians don't usually use stress marks or the letter ё, usually replacing the latter with a е.
  2. Arabs don't use vowel points and, depending on country and style don't use a hamza over alif and under alif - أ and إ, final dotted ي is often replaced with ى and sometimes the final ة is replaced with a ه.
  3. All Hindi letters with nuqta: क़, ख़, ग़, ज़, झ़, फ़, ड़, ढ़ are replaced with their counterparts without it: क, ख, ग, ज, झ, फ, ड, ढ.
  4. Chuvash Cyrillic letters ӑ, ӗ, ҫ, etc. are replaced with their Latin lookalikes.
  5. Ossetian ӕ is replaced with the Latin lookalike.
  6. Mongolian (Cyrillic) standard ө and ү were replaced with є and ї.
The transliterations should work (ideally) with the wrong characters and substitutes but we should aim for the standard and dictionary styles in entries and redirects should be for the less standard forms, not the other way around. --Anatoli T. (обсудить/вклад) 03:15, 2 December 2017 (UTC)[reply]
Hindi nuqta letters aren't always unused, it's a matter of preference. I always use nuqtas in text, even though my sociolect doesn't distinguish them from non-nuqta consonants (except ज़ (za)) And ड़ (ṛa) and ढ़ (ṛha) would definitely be misspelled without the nuqta, since they are native sounds. —AryamanA (मुझसे बात करेंयोगदान) 01:13, 6 December 2017 (UTC)[reply]
@AryamanA: Thanks for reminding me, I slightly forgot the rules (it's been a while) but I know that some nuqta letters cause both misreadings and overcorrections and it may also depend on the origin of the word or frequency. फ़िल्म (film, film), AFAIK, would always be pronounced "film", even if it's spelled फिल्म (philm) but words of Persian or Arabic origin may not be so "lucky", like ग़रीब (ġarīb, poor) -> गरीब (garīb). I think these variants also need (eventually) to have their "alt form" entries with or without phonetic respelling, so that users know what pronunciations are acceptable - actual with or without nuqta, or of the alternative. E.g., users need to know for both ज़रूरी (zarūrī, necessary) and जरूरी (jarūrī), if both "zarūrī" and "jarūrī" can be tolerated, also which spelling is now common and included in dictionaries, etc. --Anatoli T. (обсудить/вклад) 06:20, 6 December 2017 (UTC)[reply]
The last but not the least example are the Hebrew geresh and gershayim. We normalise words that out there use a simple apostrophe with a ׳, etc. --Anatoli T. (обсудить/вклад) 04:30, 2 December 2017 (UTC)[reply]
I think you're conflating three different things here. There are cases where the underlying character is clear, but the wrong Unicode code point is used. There are cases where we use things like vowel points that are useful in a dictionary but not necessary in normal writing. And then there are points where the practice simply doesn't match with the theory, in which case we are supposed to go with the practice, not the theory.--Prosfilaes (talk) 06:35, 5 December 2017 (UTC)[reply]
The amount of contradictions and conflations and underspecifications in your post is baffling. Unicode is a theory we abide by as practice is inferior to it. Like the regulative idea of having tomorrow better entries than we have had the week before. Sometimes practice just lacks the right to life. I don’t know about those points where we should go with practice that contradicts theory. One can just handle the practice and else follow the reasonable rules to get the most for all. Dictionary editors needs have abstractions that commoners do not have and it shan’t be a disadvantage to follow the laws. Unicode is most accessible and available on all platforms, thus it is everyone’s obligation to consider it, and even to drop the platform that is unfit for serious working with language (Microsoft Windows). Palaestrator verborum (loquier) 07:00, 5 December 2017 (UTC)[reply]
Unicode is not a theory. It's a character encoding standard, a set of sometimes quite arbitrary rules created to allow computers to store written text.
Modern dictionary theory, and Wikimedia principles, deprives us of the right to follow "practice just lacks the right to life".--Prosfilaes (talk) 11:35, 5 December 2017 (UTC)[reply]
(edit conflict) @Prosfilaes: Dictionary creators/publishers use what they consider normal and standard for dictionaries, the most current and correct from various points of view. Palochka was replaced with a vertical bar "|" and other characters for a long time, these languages didn't have the luxury to have a keyboard provided for them. Then a single (upper case) was introduced. The lower case is the latest addition. The Ossetian and Chuvash situations are no different. The writers often choose what they have available and what they got used to but it's not consistent. Ossetian is the only language that uses [[ӕ]] but they used [[æ]] instead for various reasons.
Reasons for the Russian [[е]] instead of [[ё]], failure to spell out Hindi nuqta, Arabic hamza above and below, dots under the final yāʾ and over the tāʾ marbūṭa and Hebrew writing an apostrophe instead of geresh [[׳]] also vary but we choose to use what we consider the latest dictionary style for these languages, even if some dictionaries not necessarily use the same rules.
We use stress marks and vowel points even if most attested texts don't use them. (Some people consider writing out super- and subscript hamza and dots I mentioned above additional diacritics as well.)
The main point, perhaps, is that what we use at Wiktionary is not necessarily the same what we can find citations and real-life examples for even if reasons and frequencies are different with different symbols and languages. There are too many examples of normalisations and fixes and other things (like Japanese furigana) we use here, even if this is not normal in some running texts and many texts, like North Caucasian languages are full of incorrect characters. Call it fixes, normalisations or enhancements, dictionaries are different from running texts. --Anatoli T. (обсудить/вклад) 07:25, 5 December 2017 (UTC)[reply]
ӕ and æ are different code points. It's clear to me that in context, they're the same character. If you were looking at a printed text, you couldn't tell the difference, and there's no difference in intent. When it gets to things like "Mongolian (Cyrillic) standard ө and ү were replaced with є and ї.", it seems clear that we're changing spelling, and that's problematic.--Prosfilaes (talk) 11:35, 5 December 2017 (UTC)[reply]
I can fix the issue with transliteration. I would think that both palochkas should be transliterated the same way, even if one of them is not supposed to be used. — Eru·tuon 05:42, 2 December 2017 (UTC)[reply]
@Erutuon: Thanks but please be aware that palochka is used in a significant number of North Caucasian languages and the mix-up between the upper case and lower case palochka is not the only problem. It's often replaced with [[|]], I, l. --Anatoli T. (обсудить/вклад) 06:15, 2 December 2017 (UTC)[reply]
Note that with the Russian-based polyglot and reactionary keyboard layout you can use both palochkas; the explicit lower-case one U+04CF CYRILLIC SMALL LETTER PALOCHKA with Caps + й (Q), U+04C0 CYRILLIC LETTER PALOCHKA with Caps + Shift + й (Q).
So from the keyboard layout side there is no preference, the free desktops have a layout that fits quite well to wiki editing in any Cyrillic script; the usage is a bad guide if it comes to encoding. Palaestrator verborum (loquier) 13:42, 2 December 2017 (UTC)[reply]
@Atitarev: Done. Those other replacements are not the right script, or a formerly correct usage, so they should be unsupported. — Eru·tuon 20:02, 2 December 2017 (UTC)[reply]
@Erutuon: Thanks. I am OK to not support "|" and other substitutes. Are you able to apply the same fix for other modules, which use palochka, please? Adyghe (ady), Avar (av), Chechen (ce). @Vahagn: Have I missed any modules we have, which use the palochka? --Anatoli T. (обсудить/вклад) 02:13, 3 December 2017 (UTC)[reply]
@Atitarev: I think this search will find all the modules that transliterate palochkas. — Eru·tuon 02:41, 3 December 2017 (UTC)[reply]
@Erutuon: Wow, that's a lot. Thanks. They are the right ones. If a generic method for treating equally upper- and lower-case palochkas is found, then it should apply to all of them. For transliteration purposes, they are equal. --Anatoli T. (обсудить/вклад) 02:45, 3 December 2017 (UTC)[reply]
@Atitarev: The method I've arrived at is to make sure all palochkas in the module are lowercase, then convert any uppercase palochkas in the text to lowercase and proceed with transliteration. That's pretty straightforward and reliable. I had been creating testcases, which takes time, but I think I'll just go ahead and implement this method. — Eru·tuon 04:21, 3 December 2017 (UTC)[reply]
If we are to only use one type of palochka, why bother transliterating the other one? —suzukaze (tc) 04:14, 3 December 2017 (UTC)[reply]
I think the uppercase palochka is the older character and was formerly correct. Not transliterating it correctly would be like penalizing people for not using the newer character. I could imagine a variety of reasons for not using the newer character: ignorance, computer issues, or principled objection. And it's nice to be backwards-compatible. The use of the newer character can be enforced in other ways. (Also, I just discovered that Wiktionary:Abaza transliteration itself uses both characters, and the older one wasn't being transliterated correctly.) — Eru·tuon 04:56, 3 December 2017 (UTC)[reply]

Okay, I think all the modules treat both palochkas alike now. — Eru·tuon 05:08, 3 December 2017 (UTC)[reply]

@Erutuon: Great job and a good method! --Anatoli T. (обсудить/вклад) 11:57, 3 December 2017 (UTC)[reply]
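
For readers who want to see the method described above in concrete form (keep only lowercase palochkas in the module, convert any uppercase palochka in the input to lowercase, then transliterate), here is a minimal Python sketch. It is not the actual module code, which is not quoted in this discussion: `TOY_TABLE` and `transliterate` are hypothetical, and the tiny mapping is assumed from the гӏалгӏай (ğalğaj) example in this thread, apart from the plain г entry, which is a guess added for completeness.

```python
UPPER_PALOCHKA = "\u04c0"  # CYRILLIC LETTER PALOCHKA
LOWER_PALOCHKA = "\u04cf"  # CYRILLIC SMALL LETTER PALOCHKA

# Hypothetical, deliberately tiny table; a real transliteration table is far larger.
# Step 1: every key that contains a palochka uses the lowercase code point only.
TOY_TABLE = {
    "г" + LOWER_PALOCHKA: "ğ",
    "г": "g",  # assumption, not taken from the discussion
    "а": "a",
    "л": "l",
    "й": "j",
}


def transliterate(text: str, table=TOY_TABLE) -> str:
    # Step 2: fold uppercase palochkas in the input to lowercase.
    text = text.replace(UPPER_PALOCHKA, LOWER_PALOCHKA)
    # Step 3: transliterate, trying longer keys (digraphs) before single letters.
    keys = sorted(table, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(table[key])
                i += len(key)
                break
        else:
            # Characters without a mapping are passed through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(transliterate("г\u04c0алг\u04c0ай"))
    print(transliterate("г\u04cfалг\u04cfай"))
```

Because the fold happens before the table lookup, both spellings take the same path through the table, so the table never needs duplicate entries for the two code points.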

stroke order exceptions

It would be great to find a database/list of characters whose stroke order is different from the one to be expected according to the standard guidelines 现代汉语通用字笔顺规范. Prof. Yin mentions in the "Routledge's Encyclopedia of Chinese" 女 as a simple example.

Eventually, the characters that do not follow the standard guidelines should be arranged into groups according to the "type of irregularity", which would be very useful both for lexicographic and learning purposes.

Maybe the necessary info can be gathered from new input methods' lists of sequences as well as from stroke order gifs. Then it would be manually checked. --Backinstadiums (talk) 12:47, 30 November 2017 (UTC)[reply]

The moment I tried to apply this template to a confirmed sock user it was deleted ASAP by Metaknowledge. The deletion itself is not why I'm bringing it here.

In his deletion reason he claimed that the sockpuppeteer template is pointless in general. Does anyone else think it should be deleted? mellohi! (僕の乖離) 20:31, 30 November 2017 (UTC)[reply]

My general reasoning: it's a Wikipedia template that links to Wikipedia policies. It might be useful there, but block summaries are good enough for our modest needs. —Μετάknowledgediscuss/deeds 20:32, 30 November 2017 (UTC)[reply]
Not even Wonderfool got that far? mellohi! (僕の乖離) 20:36, 30 November 2017 (UTC)[reply]
Not sure what you mean, but WF has not taken any pains to hide his identity for many years. —Μετάknowledgediscuss/deeds 21:04, 30 November 2017 (UTC)[reply]
Who's Wonderfool? --Lirafafrod (talk) 15:53, 1 December 2017 (UTC)[reply]
@Lirafafrod: A user who was an admin that went rogue and caused all kinds of damage (like deleting the Main Page). —Justin (koavf)TCM 17:44, 1 December 2017 (UTC)[reply]
Hey Justin, look up at the ceiling, it says "gullible"... —Μετάknowledgediscuss/deeds 19:06, 1 December 2017 (UTC)[reply]
Confirmed. Lirafafrod is an anagram of "i arr da ffol", which is obviously Welsh for "I am the (Wonder)fool". Equinox 19:27, 1 December 2017 (UTC)[reply]
Wonderfoolati confirmed. — Ungoliant (falai) 19:31, 1 December 2017 (UTC)[reply]