Wiktionary:Beer parlour

From Wiktionary, the free dictionary

Welcome to the Beer Parlour! This is the place where many a historic decision has been made, and where important discussions are being held daily. If you have a question about fundamental aspects of Wiktionary—that is, about policies, proposals and other community-wide features—please place it at the bottom of the list below (click on Start a new discussion), and it will be considered. Please keep in mind the rules of discussion: remain civil, don’t make personal attacks, don’t change other people’s posts, and sign your comments with four tildes (~~~~), which produces your name with timestamp. Also keep in mind the purpose of this page and consider before posting here whether one of our other discussion rooms may be a more appropriate venue for your questions or concerns.

Sometimes discussions started here are moved to other pages for further development. In particular, changes to a major policy or guideline may be discussed on the corresponding talk page and “simple votes” (as opposed to drawn-out discussions) can be conducted on our votes page.

Questions and answers typically remain visible on this page for one to two months, but they can always be found in the appropriate monthly archive (based on the date discussion was initiated). While we make a point to preserve all discussions that were started here, talk that is clearly not appropriate for this page may be deleted. Enjoy the Beer parlour!

Bad ledes in Thesaurus namespace

@qwertygiy Standard practice in the Thesaurus namespace is currently to put a blank line between {{ws header}} and the first L2. See, for example, Thesaurus:person, Thesaurus:berry, etc. This gives a warning to anyone who makes a new Thesaurus entry (e.g. at this trigger of filter 115), because it's in violation of WT:NORM. So either the Thesaurus namespace should be excluded from filter 115 or the practice should be changed to match NORM. -saph (user · talk · contribs) 03:12, 1 January 2025 (UTC)

2024 – Top pageviews statistics

The top for en.wiktionary.org (unfiltered list from dump files) is:

	25840383 Special:Search
	21800424 Wiktionary:Main_Page
	 1993839 Appendix:Glossary
	 1645478 rainbow_kiss
	 1506823 xxx
	 1451758 -
	 1244194 黑料
	 1145183 吃瓜
	  938861 Category:English_swear_words
	  681510 bokep
	  672460 I'll
	  633276 视频
	  599259 XXXX
	  598818 aww
	  567081 colmek
	  527691 麻豆
	  523495 bocil
	  493837 Appendix:Protologisms/Long_words/Titin
	  463888 Appendix:Filipino_surnames
	  444811 Wiktionary:International_Phonetic_Alphabet
	  427258 XXX
	  395379 لا_إله_إلا_الله_محمد_رسول_الله
	  395192 
	  386054 «
	  377134 astaghfirullah
	  356207 変態
	  349388 Category:English_surnames_from_Old_English
	  343450 سکس
	  342540 ‘
	  338717 pajeet

Details and other WMF projects: https://archive.org/details/2024-top_2k_user_pageviews Dušan Kreheľ (talk) 08:49, 1 January 2025 (UTC)
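For anyone who wants to filter or re-sort the list above, here is a minimal Python sketch that reads one line of it. It assumes each line is "count, whitespace, page title", as shown in this post; the function name `parse_pageview_line` is made up for illustration and is not part of the dump tooling.

```python
def parse_pageview_line(line: str) -> tuple[int, str]:
    """Split one 'count title' line into (count, title).

    Assumption: the count comes first and the title is everything after
    the first run of whitespace. A line with no title yields ''.
    """
    parts = line.split(maxsplit=1)
    count = int(parts[0])
    title = parts[1].strip() if len(parts) > 1 else ""
    # Page titles in the dump use underscores where the wiki shows spaces.
    return count, title.replace("_", " ")


sample = [
    "\t25840383 Special:Search",
    "\t 1993839 Appendix:Glossary",
    "\t  395192 ",
]
for raw in sample:
    print(parse_pageview_line(raw))
```

Keeping the empty title as an empty string, rather than skipping the line, preserves the odd entry in the list instead of silently dropping it.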

Well, rainbow kiss is the most popular. I wonder which website(s?) are sending people (or bots) to it. Also, we can assume it is the favourite sexual act on en.wiktionary. I wonder what Wiktionnaire's top sex act is... Father of minus 2 (talk) 22:22, 12 January 2025 (UTC)

Pronunciation of irregular plurals

Currently there is no way of knowing how to pronounce, for example, ibices or sphinges. JMGN (talk) 00:46, 2 January 2025 (UTC)

@JMGN: As īʹbĭsēz, /ˈaɪbɪsiːz/ and sfĭnʹjēz, /ˈsfɪnd͡ʒiːz/, respectively. 0DF (talk) 13:30, 4 January 2025 (UTC)
@Chuck Entz: Should we add them to the headword entries too, since they appear there and are irregular? Namely, in ibex & sphinge. JMGN (talk) 18:03, 4 January 2025 (UTC)
Then add the pronunciations or add a pronunciation request? -saph (user · talk · contribs) 03:42, 2 January 2025 (UTC)
To all of them? I thought this was Beer parlor... JMGN (talk) 12:17, 2 January 2025 (UTC)
There's nothing preventing anyone from adding them now, hence, no need to bring the issue to Beer parlour. See, for instance, vertices or indices. Andrew Sheedy (talk) 18:33, 2 January 2025 (UTC)
Really surprised that this cannot be automated like the rest of the pronunciations though... JMGN (talk) 21:21, 3 January 2025 (UTC)
English is a special case: it represents the collision of two branches of Indo-European, followed by a thousand years of history including serving as the main language in two whole continents and in numerous countries worldwide, and as a second language for over a billion people. It has huge numbers of loanwords coming from languages all over the world throughout known history. On top of all that, it has no authoritative standard. Although it may be possible to automate the pronunciation, it would be a huge project and would probably add considerably to system overhead on the million+ pages where it would be deployed. Chuck Entz (talk) 02:01, 4 January 2025 (UTC)
Hard to believe that there're over a million entries with irregular plurals... JMGN (talk) 12:57, 4 January 2025 (UTC)
There don't have to be. What is there about irregular plurals that requires special treatment? Chuck Entz (talk) 14:01, 4 January 2025 (UTC)
@JMGN: We have 19,761 entries for English nouns with irregular plurals. 0DF (talk) 15:43, 4 January 2025 (UTC)
@0DF: Thnx. Let's do it then! JMGN (talk) 18:00, 4 January 2025 (UTC)

"number" or "numeral"?

We currently have a POS "numeral", hence CAT:Numerals by language, CAT:English numerals, etc. but for some reason we have CAT:Cardinal numbers by language and CAT:Ordinal numbers by language not #cardinal numeral or #ordinal numeral. Category:Numerical appendices has a mixture of appendices called "Foo numbers" and "Foo numerals". I'd like to straighten this out by using a consistent naming scheme, probably numeral instead of number. [The root of the issue seems to be that numeral is normally taken to be a symbol (like 2, 3, 4 in the Hindu-Arabic system) that refers to a number, which is an abstract concept, but (a) whether numerical words like two, three, four are considered "numerals" or "numbers" is less clear (technically it appears they are numerals, being symbols of sorts, maybe more correctly signs, that refer to abstract numbers), and (b) in common parlance, the distinction between numeral and number is elided.] Mixing both terms is unhelpful, so if we can settle on consistent terminology, either "numeral" or "number", I can do the renames. Benwing2 (talk) 03:33, 2 January 2025 (UTC)

I think I have a preference for numeral. Vininn126 (talk) 03:45, 2 January 2025 (UTC)
If we are to standardize to one of them, then "numeral" is the better option of the two. But I wonder if we could instead come up with some consistent distinction between the two. — SURJECTION / T / C / L / 07:58, 2 January 2025 (UTC)
@Surjection Can you be more specific? What sort of distinction were you thinking of? Benwing2 (talk) 09:17, 2 January 2025 (UTC)
Numerals are a POS, numbers are a semantic category. Ordinal numbers are often not numerals in many languages, same goes for adverbial numbers (once, twice) and fractional numbers (half, third), for instance.
In many languages, the numeral POS has different grammatical properties than other POS: in Finnish and Russian, it governs very specific cases on the adjacent noun, for instance. In many other languages, there is no numeral POS at all (e.g. Afar nammáy or Tokelauan lua). They are still numbers though.
So, basically, the appendices calling these "numerals" are 'wrong' (in the sense that they don't follow the above distinction), and should probably be standardised to numbers. Thadh (talk) 09:10, 2 January 2025 (UTC)
I'm not sure where you got the idea that a numeral word has to be its own part of speech to be a numeral. See Ordinal numeral on Wikipedia. That seems a Thadh-ism (if I may call it that), which you've extrapolated from a handful of languages. Benwing2 (talk) 09:16, 2 January 2025 (UTC)
OK, that was a bit snarky and I apologize for that, but I'm still confused as to where you've gotten your ideas from. Benwing2 (talk) 09:27, 2 January 2025 (UTC)
Yes, it was... No worries. I don't think I've said that, but rather that for now that's the distinction we do (or should/could) handle. 'Numeral' as a grammatical category is very useful for these languages that do have one, whereas they still have a distinct semantic category including other parts of speech. We can and should make a distinction between the two, and I think calling them by the same name will lead to much more confusion than the status quo. Thadh (talk) 09:16, 3 January 2025 (UTC)
I am not convinced of this. Not all Russian numbers work the same way by any means; they range from один (pure adjective) to миллион (pure noun), with in-between numbers getting progressively more noun-like and less adjective-like. I don't know Finnish but I wouldn't be surprised if things are similar. If you want to make a number vs. numeral distinction, you need to spell out when one term is used and when another is used, and what renames need to happen; otherwise I have no idea what you're getting at. Benwing2 (talk) 09:38, 3 January 2025 (UTC)
All right:
- number is used to denote any member of the semantic category/ies that denote a specified amount, position etc. that can be theoretically counted.
- numeral is used to denote any member of a syntactic category of nominals that exhibits syntactic behaviour not found in other nouns and adjectives, and is typically associated with amounts that can or cannot be counted.
Since POSs already are language-specific, I can't give you hard-and-fast rules where to use the latter: just like I can't give you a way to recognise a noun or a verb or an adjective, it differs by language. However, it is clear to me that there are languages where numerals are nominals (in flectional languages, they can usually be inflected, in others, they can be used as a head of a nominal phrase) that do not syntactically behave the same way as nouns or adjectives or (if those exist) determiners, and have very specific rules of governing the head noun.
Maybe indeed the numeral 'one' (yksi) could better be analysed as an adjective, but that doesn't take away that kaksi is neither an adjective nor a noun, as it agrees in the oblique cases but takes a partitive-case noun in the nominative (non-agreement). Same thing for Russian два (dva): две девушки, but двум девушкам - partial agreement, not identical to adjectives or nouns. Thadh (talk) 09:55, 3 January 2025 (UTC)

Extended Mover request: User:Rex Aurorum

Hello. I'd like to request extended mover rights, mainly to be able to fix issues: 1. Typos made by earlier editors (non-Indonesian speakers) 2. Typos made by myself (frequently made typos in certain clusters) ―Rex AurōrumDisputātiō 10:29, 2 January 2025 (UTC)

Nominated at WT:WL. Svārtava (tɕ) 16:18, 5 January 2025 (UTC)

Sundanese main entries

I noticed that Sundanese entries have the main entries in the Sundanese script but according to @Udaradingin, the most common script used nowadays is the modern script. Shouldn't the main entries (and possibly also links) be moved to Latin script entries just like how Tagalog uses Latin script and the Baybayin spelling is just shown as an alternative spelling? Thanks. 𝄽 ysrael214 (talk) 08:51, 3 January 2025 (UTC)

I agree. There are some Sundanese entries in Latin script that list both the definition and a redirecting link to the Sundanese script version of said entry. I think it would be great if the su-noun template showed the Sundanese-script form as an alternative spelling (as also seen in ms-noun or tl-noun). For now, cleanup of some entries is underway. Udaradingin (talk) 10:08, 10 January 2025 (UTC)

Stray Arabic-script digit entries

We have entries for the main series of Arabic-Indic digits in Unicode, and for what Unicode refers to as the Extended Arabic-Indic digits:

Comparison of digits
Digit | Main series | Main series language sections | Extended series | Extended series language sections
1 | ١ | Translingual | ۱ | Ottoman Turkish, Persian, Punjabi, Urdu
2 | ٢ | Translingual | ۲ | Ottoman Turkish, Persian, Punjabi, Urdu
3 | ٣ | Translingual | ۳ | Ottoman Turkish, Persian, Punjabi, Urdu
4 | ٤ | Translingual, Ottoman Turkish | ۴ | Persian, Punjabi, Urdu
5 | ٥ | Translingual, Ottoman Turkish | ۵ | Persian, Punjabi, Urdu
6 | ٦ | Translingual, Ottoman Turkish | ۶ | Pashto, Persian, Punjabi, Urdu
7 | ٧ | Translingual | ۷ | Ottoman Turkish, Persian, Punjabi, Urdu
8 | ٨ | Translingual | ۸ | Ottoman Turkish, Persian, Punjabi, Urdu
9 | ٩ | Translingual | ۹ | Ottoman Turkish, Persian, Punjabi, Urdu
0 | ٠ | Translingual | ۰ | Ottoman Turkish, Persian, Punjabi, Urdu

As you can see, the main series are all straightforward Translingual entries like we have for the Latin-script digits, though a few also have Ottoman Turkish entries. The Extended series, however, are all nothing but entries for individual languages. What's more, many of them have no headword templates, and the ones that do treat them as entries for the spelled-out word that the symbol represents in that language. I did my best to fix the entries at ۱ (in the Extended series), but then I realized that there shouldn't be entries for specific languages at all, just a Translingual section at the top.
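Since several of the glyphs in the two series look identical, it can help to check code points rather than rely on appearance. The following Python sketch uses only the standard `unicodedata` module; the helper name `describe` is made up for illustration. It shows that the main series sits at U+0660 to U+0669 and the Extended series at U+06F0 to U+06F9, and that both carry the same digit values.

```python
import unicodedata

# Build both series from code points so the source stays unambiguous
# even in editors that reorder right-to-left text.
MAIN_SERIES = [chr(0x0660 + d) for d in range(10)]      # ARABIC-INDIC DIGIT ZERO..NINE
EXTENDED_SERIES = [chr(0x06F0 + d) for d in range(10)]  # EXTENDED ARABIC-INDIC DIGIT ZERO..NINE


def describe(ch: str) -> tuple[str, str, int]:
    """Return (code point, Unicode character name, digit value) for one character."""
    return f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.digit(ch)


for main, ext in zip(MAIN_SERIES, EXTENDED_SERIES):
    print(describe(main), describe(ext))

# Both series are ordinary decimal digits as far as Unicode is concerned,
# so Python's int() accepts either one.
print(int("".join(MAIN_SERIES[1:4])), int("".join(EXTENDED_SERIES[1:4])))
```

A check like this only tells you which series a character belongs to; which series a given language actually uses is a separate question that the code cannot answer.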

I'm not sure what the Translingual entries for these characters should look like: some, at least, seem like variants used in some Arabic-script languages, but not others. Others seem like identical glyphs that are separate due to some quirk in the early history of Unicode. There's a task listed in WT:Todo for fixing entries that use the wrong character for a given language, so there's probably a lot more to this.

I do think that all of the pages in both series should have only a Translingual section, and all the other language sections should be merged into the spelled-out versions that the digits represent (if there's anything worth keeping). I didn't see any idiomatic senses like "4" used for "for" in texting.

The main problem is that I'm not really proficient in these languages, so I'm not sure how, exactly, to fix this, but I am sure it needs to be fixed, somehow. Thanks, Chuck Entz (talk) 01:28, 4 January 2025 (UTC)

You are making sense. The Ottoman entries were added by @Moonpulsar in 2023, the Persian and Urdu ones in 2006 and 2007, when rules, standards and consistency on Wiktionary had not yet developed to any extent relevant today and formatting was wild. The analogy to non-Arabic-script languages suggests that we keep only Translingual entries, perhaps even with hard redirects for alternative forms.
I have seen both series of numbers in Arabic, Persian, and Ottoman prints alike. The second series did not need to be encoded by Unicode in the first place; it could have been left to fonts that vary the display by language, just as italic б looks different depending on whether it is Serbian or Russian, and less clearly Bulgarian.
The numbers did not even have names distinct from the ones we use in Europe. Once again I note that the terminology of Eastern Arabic numerals vs. Western Arabic numerals is Wikipedia’s citogenesis, with its frequent problem of citing terms added to some list but barely used, in case anyone attempts to conceive what Wiktionary has to portray. Fay Freak (talk) 02:18, 4 January 2025 (UTC)

Affix template standardization

The templates prefix, suffix and confix (and their shortcuts pre, suf and con, respectively) can all be handled by affix (and its shortcut af). The template compound (and its shortcut com) can also be handled by af, although compound+ (and its shortcut com+) provides additional text that is not currently replicable with af. Both pre and suf are designated as "less-preferred" on category pages in favor of af, so it appears that af is the de facto standard. However, the other templates can still be found on many pages, so they will need to be converted to af. Once that is done, prefix, suffix and confix (and their shortcuts) can be formally deprecated, similar to circumfix. Netizen3102 (talk) 17:11, 4 January 2025 (UTC)

What is the rationale? By analogy, changing all the for loops in a computer program into while loops reduces the number of keywords but why is that better? Makes it harder to understand. 2A00:23C5:FE1C:3701:C9DA:1ED4:BE2C:8235 17:13, 4 January 2025 (UTC)
One rationale is that it would be easier for NEW users. Less complexity of templates equals lower barrier to entry. The downside is that these are fairly widely used templates and editors who have been using them for a while will have to adjust. I still think we should try to make it easier for new people, however, even if it is in a small way. Vininn126 (talk) 17:15, 4 January 2025 (UTC)
Just remember that most people know what prefixes and suffixes are, but quite a few have never heard of affixes. Chuck Entz (talk) 19:51, 4 January 2025 (UTC)
What would be interesting is if we were to merge these templates, would we suddenly see threads popping up about the lack of {{pre}} and {{suf}}, asking how to deal with prefixes and suffixes? A hypothetical, to be sure, but I think an interesting one. Vininn126 (talk) 19:55, 4 January 2025 (UTC)
This can be addressed by having a few well maintained perfectly formatted role model entries for each language, provided as examples for new users. Similar to how parrot is used as an example for Wiktionary:Quotations. --Ssvb (talk) 12:55, 5 January 2025 (UTC)
The af template requires dashes similar to how prefixes and suffixes are traditionally written, whereas pre, suf and con do not. For example, af and pre both produce the same output, un- + do, but pre does not require the dash after the prefix in its input, which could be confusing to editors. Netizen3102 (talk) 17:22, 4 January 2025 (UTC)
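To make the dash difference concrete, here is a rough Python sketch of what a conversion pass might do. It is an illustration only, not the bot or module anyone actually uses: the function names are made up, and it assumes that in pre every positional term but the last is a prefix, that in suf every positional term but the first is a suffix, and that con has a prefix first and a suffix last. Named parameters (anything containing "=") are passed through unchanged.

```python
import re

# Matches simple calls such as {{pre|en|un|do}}; nested templates and
# piped links inside the call are deliberately not handled in this sketch.
_CALL = re.compile(r"\{\{(pre|prefix|suf|suffix|con|confix)\|([^{}]*)\}\}")


def _as_prefix(term: str) -> str:
    return term if term.endswith("-") else term + "-"


def _as_suffix(term: str) -> str:
    return term if term.startswith("-") else "-" + term


def _hyphenate(kind: str, terms: list[str]) -> list[str]:
    """Add the dashes that af expects (assumed term roles, see lead-in)."""
    if kind == "pre":
        return [_as_prefix(t) for t in terms[:-1]] + terms[-1:]
    if kind == "suf":
        return terms[:1] + [_as_suffix(t) for t in terms[1:]]
    # con: first term is a prefix, last term is a suffix
    return [_as_prefix(terms[0])] + terms[1:-1] + [_as_suffix(terms[-1])]


def _convert(match: re.Match) -> str:
    kind = match.group(1)[:3]  # "prefix" -> "pre", "suffix" -> "suf", "confix" -> "con"
    lang, *params = match.group(2).split("|")
    positional = [p for p in params if "=" not in p]
    if len(positional) < 2:
        return match.group(0)  # too unusual to convert safely; leave as is
    fixed = iter(_hyphenate(kind, positional))
    rebuilt = [p if "=" in p else next(fixed) for p in params]
    return "{{af|" + "|".join([lang] + rebuilt) + "}}"


def to_af(wikitext: str) -> str:
    """Rewrite simple pre/suf/con calls in wikitext as af calls."""
    return _CALL.sub(_convert, wikitext)


print(to_af("{{pre|en|un|do}}"))
print(to_af("{{suf|en|do|able}}"))
print(to_af("{{con|en|neuro|genic}}"))
```

Terms that already carry a dash are left alone, which matches the point made in this thread that the older templates allow dashes without requiring them.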
I also personally find it easier to keep track of affixes when the dashes are present - the other prefixes do ALLOW for dashes, but they are not required. Vininn126 (talk) 17:25, 4 January 2025 (UTC)
I'm not convinced at all. Continuing the for/while analogy, that's like pointing out that while only needs a single expression and doesn't require semicolons or a "stepwise rule". Sure! But that's why we don't use it for everything. Because humans are not instruction sets, and benefit from context. I'm sure there are ways to DRY it by having one template call into another. Reducing everything to eventual Turing tape is nasty. 2A00:23C5:FE1C:3701:C54E:F82E:FAA1:E7A5 05:54, 6 January 2025 (UTC)
Fewer varieties of different templates make Wiktionary more machine readable, even though I'm not sure whether this is considered to be a desirable goal. --Ssvb (talk) 19:11, 4 January 2025 (UTC)

Category:Ojibwe stem-building elements

There are a few Ojibwe entries that keep showing up in WT:Todo lists because they're not in Category:Ojibwe lemmas or Category:Ojibwe non-lemma forms, and lots more that don't show up in the lists, but have similar problems.

First, some background: as with many American Indian languages, Ojibwe is polysynthetic, meaning that it uses mostly complex systems of morphemes bound together instead of separate words. That makes it hard to analyze Ojibwe grammar using the categories established for the better known European languages. There are prefixes, suffixes, infixes, and circumfixes that attach, not just to a central root or stem, but also to each other in very complicated ways.

Apparently Ojibwe carries this even farther by having stems that are made up of separate sub-elements: initials, medials, and finals, as explained on the page for Category:Ojibwe stem-building elements. These aren't completely arbitrary: they each have specific roles and carry specific types of information.

For 5 months in 2020, @SteveGat spent a great deal of time expanding our coverage of Ojibwe, but in ways that never really got integrated into our POS headers and categories. I would like to do that part now.

The question is: how should we do that? I can see a few approaches:

  1. Make initials, medials and finals into prefixes, suffixes, and/or infixes
  2. Make all of them just plain morphemes
  3. Add them to the modules as lemmas

For the first two options, we would want to have secondary categories to preserve the information. These entries already have secondary categories such as Category:Ojibwe noun finals and Category:Ojibwe verb finals and tertiary categories attached to those. For the third option, we would want to also integrate the new lemma types into the category-tree modules so the categories can use {{auto cat}}. For that matter, we could do the same for the secondary and tertiary categories no matter what we do with the rest.

I should also mention Category:Ottawa initials, which indicates that there are probably more languages with similar issues that I don't know about. That one shows up in Wiktionary:Todo/Lists/Uncategorised pages (all namespaces)#Category, but there may be more with categories added by hand. Chuck Entz (talk) 19:38, 4 January 2025 (UTC)

I forgot to ping @-sche, who knows a lot more about Algonquian languages like this one than I do. Chuck Entz (talk) 19:40, 4 January 2025 (UTC)
That's also a feature of other Algonquian languages like Cree (however that language(s) happens to be treated on Wiktionary), though Ojibwe is the one with the most content. Circeus (talk) 17:21, 5 January 2025 (UTC)
@Circeus: Most of the language is treated under Plains Cree, although several dialects have their own categories. The macrolanguage is a leftover that for some reason still hasn't been deleted, even though it's a hazard. Thadh (talk) 14:56, 20 January 2025 (UTC)
I can't say I know a lot about Ojibwe but in general I would prefer to try and fit things like initials, medials and finals into existing categories like prefixes, suffixes and infixes rather than just use the language-specific terminology directly. This latter approach, in the extreme, leads to a proliferation of lemma types that is singularly unhelpful, e.g. as was done with Lojban, where someone added Lojban-specific lemma types cmavo, cmene, fu'ivla, gismu, lujvo and rafsi to Module:headword/data. I and most people can't tell a gismu from a Ginsu knife, making these terms completely opaque. I went through a year or two ago and tried to rewrite the opaque Lojban grammatical terminology into the most similar comprehensible term, hence the terms in the category Category:Lojban gismu now have a header "Root" instead of "gismu"; similarly "Predicate" in place of "lujvo"; etc. The actual categories haven't yet been renamed but should be. IMO if there's a one-to-one mapping between initial <-> prefix, final <-> suffix, etc. there is no need to have the same term categorized into both CAT:Ojibwe prefixes and CAT:Ojibwe initials (just use the former), but if there is some extra information in the CAT:Ojibwe initials category, I am not averse to having the term categorized both ways. I can add the {{auto cat}} support for language-specific (or family-specific) terminology like "initials", "finals", etc.; this is not hard as the underlying functionality for language-specific categories is already present. The only other thing I'd add is that we have nastily-named categories in Special:WantedCategories like Category:Unami animate intransitive (vai), Category:Unami verb transitive inanimate and Category:Unami inanimate intransitive verb. 
I remember having a discussion with someone (probably the same SteveGat) about putting these into separate Category:Unami animate verbs and Category:Unami intransitive verbs categories; he eventually convinced me that there is a reason for combining them, as apparently a "transitive inanimate" verb is a different beast from an "inanimate intransitive" verb, not just the transitive equivalent. But these definitely should use the Wiktionary standard naming format Category:Unami inanimate intransitive verbs and such, not Category:Unami verb inanimate intransitive or some other weirdness, even if the latter is the standard format used in the grammar of these languages. Benwing2 (talk) 06:07, 6 January 2025 (UTC)

How about...

creating a template called "all", that can do everything? You just need to know what to put in the parameters, as described in the template documentation subpages (we're limited to 2 MB per page, so there would have to be a number of them). In other news, there's a new Swiss Army Knife™ that can do thousands of different things. The only problem: with all the attachments, it's over a meter wide...

There's a certain amount of complexity inherent in any given task. The question is, how do we distribute it?

With lots of templates, we don't have to know as much to do any one task. With fewer templates, we have to know about more things to do one task, but if we do multiple tasks, the information is in fewer places.

We should be thinking about what tasks go together, and have one template that does the things that go together, but multiple templates to do things that don't.

Also, we need to think about the range of things the individual user deals with: someone who edits Mandarin Chinese needs to know about Han characters, tones, and various particles like classifiers, but not affixes or grammatical gender. Someone who works with most European languages, on the other hand, needs to know about the morphology for things like cases, gender, number, mood, voice, tense, aspect, etc. Someone who works with Celtic languages needs to consider the interactions in sounds between syllables, separate words, and even sentences, while a Hawaiian doesn't really encounter synchronic phonological changes in some vowels, and nothing at all in consonants.

Considering this, we should think about whether all of those people need to use the same templates for everything. Yes, we have specialized templates for specific languages that do extra things, but we should also think about whether to have templates for specific languages that do less so users don't have to think about as much. We've been deleting lots of language-specific templates that can't do things that the general templates can, but also can't do anything that the general templates can't. The question that doesn't get asked is: are the things that the templates can't do things that editors in those languages will want to do?

Another thing to think about: knowing that a template called "xyz-noun" is all you need for headwords in language-xyz noun entries should make it easier to get started in that language. If it doesn't have features needed for that language, you can always use {{head}}, or learn to customize it. It's also nice to have things that are just for you and your community of editors.

That's not to say that such things should be used as barriers to keep others out or as a way to claim ownership over language entries or anything else. All of the things I mentioned above should be considered as needed, but shouldn't override all the other things we already look at; I'm talking about broadening the discourse, not replacing it. Chuck Entz (talk) 21:45, 4 January 2025 (UTC)

How about we not be so reliant on templates? There's too many templates as is and they change way too frequently. (And while we're at it, you shouldn't be required to code to create or edit a category). Purplebackpack89 16:12, 5 January 2025 (UTC)
Lua may have allowed incredible flexibility in templates, but it has also made them impossible to edit for 99% of people. I do not consider this a good thing. Circeus (talk) 17:23, 5 January 2025 (UTC)
Plus, any given template relies on a stack of dependencies that is completely impenetrable. I've given up trying on many templates and modules and I'm more knowledgeable than your average person (but still pretty ignorant about programming). —Justin (koavf)TCM 17:24, 5 January 2025 (UTC)
Hear, hear DCDuring (talk) 17:39, 5 January 2025 (UTC)
Lua is by far nicer, cleaner and easier to read than wiki templates. Complex templates are way too cryptic. --Ssvb (talk) 21:27, 5 January 2025 (UTC)
The closest thing I've ever come up with is a universal definition template. I don't understand the bellyaching here, where we already have to deal with tons of functions, regardless of language (those that think you can't are sorely mistaken) - on the other hand, I'm not sure you can create something THAT universal, at least not in one fell swoop. Vininn126 (talk) 17:50, 5 January 2025 (UTC)
I think the point is that templates shouldn't be designed for those who put in more than 30 hours a week on Wiktionary and/or have IQs over 200. {{en-noun}} is wonderfully powerful, but even high-volume contributors have had trouble with the keystroke-saving features using "+", "-", "~", not to mention the complexities of auto-pluralization. Why should users have to consult the documentation every third time they try to use the template? DCDuring (talk) 19:43, 5 January 2025 (UTC)
Sweet Heaven, you are singing my song. At the very least, there's no reason to not have more intelligible fall-back aliases like "plural=[foo]" or something for a normal person who casually edits. If you've ever edited SVGs at c: and tried to use c:Template:Valid SVG and its successor Templates, it's completely infuriating how clipped and counter-intuitive all the inputs are. —Justin (koavf)TCM 19:58, 5 January 2025 (UTC)
@Chuck Entz What is this in reference to? Is there something specific you're annoyed about? Benwing2 (talk) 21:33, 5 January 2025 (UTC)
I'm going to take a wild guess and say that it is not a matter of one or a few templates or even one or a few types of templates, but rather an attitude toward usability and the population of potential contributors. DCDuring (talk) 03:04, 6 January 2025 (UTC)
The downside is that the "xyz-noun" templates for inflected languages are enormously difficult to use and have a steep learning curve. Very few of the new editors can use them correctly on their first try, so their initial edits tend to need corrections. At least that's what I observed when looking at the new Belarusian entries added by new people. And I suspect that many potential new editors probably just give up rather than contributing incorrect edits if they notice problems in the previews of their edits. --Ssvb (talk) 21:42, 5 January 2025 (UTC)
@Ssvb I agree that many of the xyz-noun templates are complex, but I'm not sure there's anything much that can be done about this. The root of the issue is that your typical inflection system is itself quite complex, and if you want to support the system fully, the template itself will necessarily be complex. One alternative is only to support the most regular inflections, but (a) most people are more interested in the harder, less regular words, which also are usually the most common words; and (b) the templates are already designed (at least the ones I've designed) so they have sensible defaults in most cases that make it relatively easy to specify the inflection of words with regular declensions or conjugations. Another alternative is to require people to specify a lot more information manually in the case of irregular inflections (e.g. just type out the entire inflection by hand); on the surface that may make it easier to enter for a native speaker who knows the inflection but doesn't want to or can't figure out the syntax of something like {{be-ndecl}}. But in practice (a) it's extremely tedious, with the result that a lot of words never get inflections; (b) it leads to lots of mistakes. Whenever I design a new template for entering the noun, verb or adjective inflection of language Foo and convert old template uses, I invariably find tons of mistakes due to bad design in the previous template where too much info has to be given manually. So I'm not really sure what a better approach would be. Benwing2 (talk) 05:38, 6 January 2025 (UTC)

th-cls

(Notifying Alifshinobi, Octahedron80, YURi, Judexvivorum, หมวดซาโต้, Atitarev, GinGlaep, RichardW57, Noktonissian):

I've made an inline classifier template for Thai similar to {{zh-mw}}. Here's an example of what it looks like:

Are there any objections to me moving it into mainspace / any feedback? - saph ^_^⠀talk⠀ 00:28, 6 January 2025 (UTC)Reply

I already made {{cls}}, which can be used by many languages, not only Thai. (Tai languages and Vietnamese also use classifiers.) I oppose making a template only for Thai. You should expand this template instead. (It is used a lot at thwikt.) --Octahedron80 (talk) 01:07, 6 January 2025 (UTC)

I'm not sure I would agree that this is a situation where a one-size-fits-all template is ideal, especially not a wikitext template. {{th-cls}} has automatic translit where {{cls}} does not, for one. - saph ^_^⠀talk⠀ 01:13, 6 January 2025 (UTC)Reply
Because I add tr=- to prevent translit, since it results in too many parentheses. There is no need to show them all. --Octahedron80 (talk) 01:17, 6 January 2025 (UTC)
See th:อาทิตย์ th:ຄຳ th:ᦋᦲᧃᧉ th:ကျား for example. --Octahedron80 (talk) 01:25, 6 January 2025 (UTC)Reply
Didn't notice that, fair enough. I'll wait for other people to comment. - saph ^_^⠀talk⠀ 01:20, 6 January 2025 (UTC)Reply
@Saph I support @Octahedron80's view that we should have a single language-independent {{cls}} template, since there are a lot of languages with classifiers and otherwise we'd end up with a proliferation of incompatible and subtly different templates. This template can have language-specific behaviors for certain languages if it makes sense to do so, e.g. we could make the default transliterating and turn it off for certain languages. (IMO however, transliteration should usually be enabled, since most non-Latin scripts are unfamiliar and hard to read for the average Wiktionary user; it might make sense, for example, to turn off translit in some circumstances for Greek and Cyrillic, which aren't so hard to read and with which many people will be familiar, but for most scripts transliteration is helpful. If the issue with transliteration is display-related, we should be able to come up with a display format that works better.) If Thai needs some special behavior of some sort, that could be supported under the hood in {{cls}}.
@Octahedron80 My main complaint about {{cls}} is not its implementation but the default positioning before the headword. This is nonstandard (we usually put labels and other information after the headword) and IMO looks bad. If you're OK with it, I can do a bot run moving the {{cls}} invocations after the headword. Benwing2 (talk) 05:27, 6 January 2025 (UTC)Reply
No. Don't do that. Originally we put classifier(s) after th-noun (and lots of Tai noun headwords). But there are many cases where a sense cannot share the same classifier(s) with other senses, or where some senses cannot have a classifier at all. So the template cls was born to add classifiers per sense (just like zh-mw, you know; what is mw anyway?). --Octahedron80 (talk) 05:49, 6 January 2025 (UTC)
About transliteration, you can make it turn on or off tr display as you like. By the way, the zh-mw doesn't show pinyin, so I just follow that. --Octahedron80 (talk) 06:10, 6 January 2025 (UTC)Reply
About the Tày language: the template is not intended to be used before the headword with Tày, but someone is already widely using it that way, and I cannot stop them. Its classifiers should be integrated into the Tày tyz-noun template, like Vietnamese vi-noun. See for comparison. If tyz-noun supports classifiers by itself, we can remove cls there. --Octahedron80 (talk) 05:54, 6 January 2025 (UTC)
@Octahedron80 You are misunderstanding me. I'm not objecting to putting classifiers per sense, following the sense definition. What I'm objecting to is putting the classifier directly *before* the headword. If it goes on the headword line, it needs to follow. So I'm suggesting moving {{cls}} uses from before the headword to after the headword. BTW this is largely with Vietnamese, not with Tày or Thai. If it's better to not have it on the headword line at all, but instead on a sense line, that's fine, but I can't do that by bot; in the meantime it's better to have the classifiers after the headword than before. And since I assume the issue with per-sense classifiers occurs with all languages using classifiers (since classifiers are essentially semantic-based), I don't see how it's useful to integrate classifiers into the headword. BTW "mw" means "measure word". See measure word and classifier on Wikipedia. Benwing2 (talk) 06:13, 6 January 2025 (UTC)
Tày and Vietnamese use the classifier before the noun (same as Chinese), unlike other Tai languages, which use the classifier after the number and noun. Does Wiktionary need to show the classifier in the headword before the noun? If you asked Vietnamese users, they would say yes, I guess. --Octahedron80 (talk) 06:28, 6 January 2025 (UTC)
Whether the classifier comes before the noun or after the noun in the grammar of the language has nothing to do with where we should put the classifier in the headword. All headword-related information always goes after the headword itself. There is no other situation that I know of where we put any headword-related information before the headword. Thus, putting the classifier before the headword is highly nonstandard and looks really awful (IMO) and janky. So it's important we move its position. Benwing2 (talk) 06:34, 6 January 2025 (UTC)Reply
Okay. You can move cls to end of Tày headword at first, until we can make tyz (and vi?) templates better. --Octahedron80 (talk) 06:39, 6 January 2025 (UTC)Reply
@Octahedron80: Aside: I think classifier before noun is quite common amongst Tai languages in northern regions - quite possibly alignment with Chinese. --RichardW57 (talk) 08:18, 6 January 2025 (UTC)Reply
If we wanted to do per-language transliteration (/per-language turning off transliteration), would we keep it as wikitext? That seems like it would make the template a lot less readable. - saph ^_^⠀talk⠀ 11:33, 6 January 2025 (UTC)Reply
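
One way to picture the per-language transliteration question raised above is as a small data table consulted by a single function, rather than branching inside wikitext. The following is only a minimal sketch in Python, not the actual {{cls}} implementation: the function name `show_translit`, the set `TRANSLIT_OFF_BY_DEFAULT`, and the choice of `el` and `ru` as "off by default" languages are all assumptions made for illustration (Greek and Cyrillic were merely floated above as candidates). The `tr=-` convention for suppressing transliteration is the one mentioned in this thread.

```python
# Hypothetical per-language defaults: languages listed here would NOT show
# transliteration unless the editor supplies one. The membership of this set
# is illustrative only and does not reflect any agreed policy.
TRANSLIT_OFF_BY_DEFAULT = {"el", "ru"}


def show_translit(lang, tr=None):
    """Decide whether a classifier link should display a transliteration.

    lang: language code of the entry (e.g. "th").
    tr:   the editor-supplied |tr= value, "-" to suppress, or None if absent.
    """
    if tr == "-":
        # Explicit opt-out always wins, whatever the language default is.
        return False
    if tr:
        # A manually supplied transliteration is always displayed.
        return True
    # Otherwise fall back to the per-language default.
    return lang not in TRANSLIT_OFF_BY_DEFAULT
```

With a table like this, adding or removing a language is a one-line data change, which may be easier to read than nested conditionals in a wikitext template.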

Category:Artsakh and subcats

Are these needed anymore? The Republic of Artsakh dissolved a year ago. 115.188.138.105 11:16, 6 January 2025 (UTC)Reply

Cf. Category:Soviet Union. The words still exist or existed in regular use. Why would we delete this category? —Justin (koavf)TCM 11:19, 6 January 2025 (UTC)Reply
To OP's point, the category descriptions are worded as if Artsakh still exists. - saph ^_^⠀talk⠀ 12:43, 6 January 2025 (UTC)Reply
What? His point was about the existence of the category, not the wording of the description. No one needs to start a conversation about modifying the module's wording (which I will do now). —Justin (koavf)TCM 12:54, 6 January 2025 (UTC)Reply
https://en.wiktionary.org/w/index.php?title=Module%3Aplace%2Fshared-data&diff=83485136&oldid=83484470 Justin (koavf)TCM 12:58, 6 January 2025 (UTC)
Well, there's a Category:Rivers in Artsakh but no corresponding Category:Rivers in the Soviet Union or other toponym categories. 115.188.138.105 20:19, 6 January 2025 (UTC)
Sure, but the premise is "this place no longer exists (i.e. the state was dissolved), therefore, should we delete the categories?" and the answer is "no". There may be some subcats that shouldn't have existed or should be deleted, but that's not because the breakaway republic has been reintegrated into Azerbaijan. —Justin (koavf)TCM 11:19, 7 January 2025 (UTC)Reply
By that argument we should have categories for all polities that have ever existed. Category:en:Rivers in the Aztec Empire? I'd rather just have categories for currently-existing polities, as well as those which are of particular historical significance to particular languages (Category:la:Towns in the Roman Empire?). This, that and the other (talk) 12:00, 7 January 2025 (UTC)Reply
By what argument? My argument was "there are enough words about [topic] to have a category", so yes, if there are enough words about that topic, go for it. Why would polities be any different than sports or political movements or breads or any of the other things we have categories about? —Justin (koavf)TCM 12:06, 7 January 2025 (UTC)Reply
As a general rule, geographic features such as rivers, cities, etc. are categorized according to current political boundaries. So there should be no Category:Rivers in Artsakh any longer. Possibly an exception could be made for cities that no longer exist, but I'm skeptical of that, and rivers usually don't come and go, so there's no reason to put anything in Category:Rivers in Artsakh. Benwing2 (talk) 00:10, 9 January 2025 (UTC)Reply
The river categories can go but village and city categories should stay, because the invaders have either destroyed or renamed them. For example, Karin Tak belongs in the Category:en:Villages in Artsakh because it was a village only when Artsakh existed. Now neither the population, nor the village nor the name are there anymore. Vahag (talk) 08:37, 9 January 2025 (UTC)Reply

Para-Nakh languages

I would like to discuss here the addition of a reference to the Para-Nakh languages to the etymology for the Nakh languages. In my opinion, it should work the same way Ancient Greek forms refer to a pre-Greek substrate. An explanation of this is given in detail in Johanna Nichols' work {{R:cau-nkh:Nichols:2004}}, particularly from page 145 onwards. I think this option works well for explaining forms with phonologically close but irregular correspondences. For example, (1) Ingush ӏаж (ˀaž, apple), Chechen ӏа̄ж (ˀaaž, id.); (2) Ingush нихь (niḥʳ, hide, animal skin), Chechen неӏ (neˀ, id.); (3) Ingush зӏамига (zˀamiga, little, small), Chechen жима (žima, id.); (4) Ingush чил (čil, ashes), Chechen чим (čim, id.); (5) Ingush муа (mwa, scar), Chechen мо (mo, id.); (6) Ingush миинг (miı̇ng, alder), Chechen маъ (maʔ, id.), муъ (muʔ), Bats მურყაჼ (murq̇ã, id.) → Georgian მურყანი (murq̇ani, id.) as a suggestion from user:კვარია; (7) Ingush шуа (šwa, abomasum), Chechen шуа (šwa, id.) and their doublet forms with normal development Ingush шоа (šoa, id.), Chechen шо (šo, id.) as my example. Still, I think it would be wrong to reconstruct the Proto-Nakh form on the basis of these irregular daughter forms. I would very much like to avoid a situation like the one with Proto-Finnic *omëna (apple), where the daughter forms have no regularity. If you have a better idea on how to handle it here, please let me know. @Vahagn Petrosyan, კვარია, Tollef Salemann, Tropylium, Chuck Entz, Thadh, Fay Freak, Surjection ɶLerman (talk) 15:36, 6 January 2025 (UTC)

Just to be clear, do you propose simply making a code for a pre-Nakh substrate? Thadh (talk) 16:08, 6 January 2025 (UTC)Reply
@Thadh Yes, that's right, although Nichols doesn't have that. ɶLerman (talk) 16:16, 6 January 2025 (UTC)Reply
Why can't you simply say "borrowed from a {{bor|ce|qfa-sub}} language"? That would put the term into Category:Chechen terms borrowed from substrate languages. By the way, I consider Category:Ancient Greek terms borrowed from a Pre-Greek substrate redundant to Category:Ancient Greek terms borrowed from substrate languages. I don't believe people who claim they can distinguish between different substrate sources within the same language. Vahag (talk) 17:18, 6 January 2025 (UTC)Reply
Ok, I'll try to use this template, thanks. ɶLerman (talk) 10:53, 7 January 2025 (UTC)Reply

Splitting WT:RFVE?

This page is one of the slowest-to-load high-usage pages we have. Despite User:Pious Eterino's best efforts, it has been above 700K almost all the time since 12/7/24. It would help to find ways to split it. I can imagine three basic ways:

  1. by whether or not another dictionary has the challenged definition.
  2. by whether or not the challenged definition is labeled as restricted geographically (or otherwise?).
  3. by whether the challenged definition is hard to cite because it is for a term that is highly polysemic.

I don't know which one is the best to start with. The first might encourage people to at least look at a few other dictionaries (I like to use OneLook.com for convenient access to multiple dictionaries, but the OED is an obvious resource for those who have access). The second would encourage those familiar with the restricted domain to focus their efforts on those areas. The third would be useful for isolating terms that should have long dwell time in RfV.

It might also be useful to use categories and subcategories of RfVed items to isolate, say, challenged UK- or Commonwealth-specific definitions and those with other attributes suggested above as bases for splitting the page. Such a category system could be applied in other languages as well. DCDuring (talk) 22:14, 7 January 2025 (UTC)Reply

In my view the problem is that the rate of new RFVs exceeds the capacity of cite-seekers to process the requests.
I would point the finger particularly at an IP-hopping user who has, in recent weeks, been posting large volumes of words from Webster, without (as far as I can tell) making any effort to assist with other requests. I believe it is incumbent on such users to help out by looking for cites for RFVs posted by others on the page. Perhaps we need to be a bit heavy-handed in making this into a proper obligation and enforcing it, a bit like Wikipedia's "quid pro quo" system for "did you know" entries on their Main Page.
I'd prefer to try this before splitting the page. But if a split is absolutely necessary, I think the best way would be to create a (hopefully temporary) subpage WT:Requests for verification/English/Old, which would be for RFVs of {{Webster 1913}} words, words/senses marked (obsolete) and the like. This, that and the other (talk) 23:30, 7 January 2025 (UTC)Reply
@This, that and the other: I like that idea. Alternatively, we could impose a rule of, say, no more than two nominations a day. (In the case of nominations by IP addresses, nominations originating from the same IP range would be deemed to be from the same editor.) Also, I have asked this IP to sign and date nominations. If this request is ignored, I feel the nominations should simply be removed. — Sgconlaw (talk) 23:41, 7 January 2025 (UTC)Reply
"incumbent on such users to help out" — I don't think that forcing Wonderfool to cite and close RFVs will end well. He'll just make stuff up etc. I like the rule idea though. Anyone can lazily RFV something without doing any research. 2A00:23C5:FE1C:3701:CC32:2372:B6E0:DBC7 20:42, 9 January 2025 (UTC)Reply
  • I think that the lowest priority need for verification are words from very old dictionaries and words that can be found used by a "significant" author from history but so far no others. Yes, some truly dictionary-only words no doubt exist, but most words in old dictionaries probably have an original basis in fact. Also, the further back you go, the less and less searchable material we have, so the three-citation rule in practice becomes more stringent. I wouldn't mind a policy relaxation of the three "independent" cites rule for single-author words from hundreds of years ago. It is somewhat of a thankless task looking for some of these, and then if we don't find any further examples, what? We delete the entry? Does that really benefit anyone? Better to mark them "only known in X", I would say. Mihia (talk) 20:50, 10 January 2025 (UTC)
    RFVs for old words are certainly lower-priority in some sense (hence why I think shunting to a subpage would work, if a split was really necessary).
    Not sure if I can agree with your comments about keeping old hapaxes. We'd then have trouble deciding what was a legitimate use and what was simply a single author's error for another word, typo in EEBO etc. Even for the correctly written words, I'm not sure that a hapax of, say, Browne is any more valuable than a hapax made up by a modern author. Who, besides us, is reading Browne today anyway? His failed coinages are just clutter.
    The situation is different for famous and enduring works of literature (Shakespeare etc). I was sad to see the "one use in a well-known work" rule abolished, and all the Shakespearianisms subsequently deleted, as these are entries that people actually use. (There's one at RFVE now: solidare.) I would have preferred to see the rule tightened to "one use in a work of enduring literary interest found on a community-agreed list of such works on the language considerations page". This, that and the other (talk) 22:32, 10 January 2025 (UTC)
@User:This, that and the other IMO we have plenty of uncited, low-quality English entries, so limiting the number of RfVs seems likely to guarantee ever-declining quality of our English dictionary.
"incumbent on such users to help out" Do we want to have a "social credit" system to evaluate contributors, with only sufficiently net-positive contributors allowed to use credits to make RfVs? I think not.
Our rules allow items that have been in RfV for 30 days without sufficient citations to be deleted. Only about 20% of the RfV page is from items added after November 30, 2024. Most discussions peter out after a week. Perhaps the best thing would be to give User:Pious Eterino two hands: 1., a round of applause for his efforts to move resolved items off the page and, 2., assistance in those efforts. We should also acknowledge and follow the efforts of User:-sche and others to try to bring older items to a conclusion.
Some of the older unresolved items may be there because of unresolved policy issues, eg, Discordian#Etymology 2 (durably archived media). But many just seem to need decisiveness. DCDuring (talk) 23:05, 10 January 2025 (UTC)Reply
I do not agree that entries that are, on the face of it, plausible, i.e. excluding "obvious rubbish", should be automatically deletable after 30 days simply because no citations have been added. It could be that they were listed in error and nobody has looked at them. I think there should be some kind of check or safeguard, even with many listings and limited resources. Mihia (talk) 23:40, 10 January 2025 (UTC)
Further on this point, although my view is that few people actually bother to read instructions, I do think that the RFV pages should, even so, in the blurb at the top, ask editors to record negative findings, and even if this just repeats what someone else has said. I wonder whether everyone does this. If after 30 days we have two or three experienced editors saying "couldn't find anything", then we can be more confident about deleting it. Mihia (talk) 21:03, 11 January 2025 (UTC)Reply
That wouldn't hurt. I can't say I record that I can't find anything, unless I've searched really thoroughly. Andrew Sheedy (talk) 04:38, 13 January 2025 (UTC)Reply
I assumed that I wouldn't be able to edit that text myself, but actually I can, so I have added the following:
Recording negative findings: Editors who make a fair effort to find citations but fail to do so should state their negative result on this page (even if it only repeats another editor's negative result).
Mihia (talk) 19:48, 14 January 2025 (UTC)Reply
I do agree with you (This, that and the other, I mean) about a single author's errors, typos, totally made-up words used only once etc., that these don't have greater value just because they are in old books. I guess what I was referring to are words that in all probability did exist with that meaning at the time, just that because of paucity of material we can't verify this -- a bit like the words in old dictionaries that "probably" existed. For words in these categories, if we can distinguish them, I would support applying the benefit of the doubt, even if this means we risk including a small amount of junk. Mihia (talk) 20:55, 11 January 2025 (UTC)Reply
To me, it's odd that we include reconstructions, but not these types of words. I would be in favour of including such words with a disclaimer in the entry (along the lines of the defunct {{LDL}} template that's being discussed below). And as mentioned above, I think we should have all words in exceptionally famous and widely studied works, like Shakespeare (again, with a disclaimer/notice). Andrew Sheedy (talk) 04:38, 13 January 2025 (UTC)Reply
The size of RFVE has not always been so massive, see this graph that shows when it really started to blow up. Ioaxxere (talk) 04:55, 13 January 2025 (UTC)Reply
Someone certainly had a "spring clean" at the start of 2023! Mihia (talk) 10:29, 13 January 2025 (UTC)Reply
Could a major source of the problem be that, as Wiktionary covers more of the English language, the average quality and degree of editor interest in new entries is declining? That would suggest policies aimed at new entries, new L2s, and new definitions. Examples would be:
  1. requiring all new definitions to have
    1. at least one citation (or two or three) and/or
    2. per-definition references to other dictionaries.
  2. reviewing all new definitions before they "go live".
  3. using filters to identify (and exclude?) new definitions from IPs
  4. directing some classes of users to the entry talk page and then actually following up on proposed new definitions.
As new definitions are necessary to follow the expansion and evolution of English and to fill in missing definitions, any policy should not be so hard to follow that we excessively reduce desirable new content or cause would-be contributors to circumvent the restrictions by editing existing definitions to convert them into their new definition. DCDuring (talk) 14:19, 13 January 2025 (UTC)
Some of my devices are so old that they cannot download this page at all. And these devices are not so old compared to some of the devices used by many other people. Tollef Salemann (talk) 21:07, 14 January 2025 (UTC)

Analysis of words in terms of Pali roots

Is it legitimate to present an analysis of Pali words inherited from Old Indic in terms of Pali roots? For example, text books on Pali will present many participles sensu lato as being root + -ta or root + -ya, even though they have been inherited from Sanskrit. I believe that these are worthy of inclusion as surface analyses at the very least.

@Pulimaiyi has objected, «It has been brought to my attention that you have been creating Pali roots - some of which, have a questionable form - and then using these to synchronically derive terms which are clear-cut cases of inheritance from Old Indo-Aryan and as such cannot be analysed as intra-Pali derivations. This is very misleading. Case in point: satta and sakka. How can sakka, for instance, be categorised or analyzed as a "-ya" formation if it is indistinguishable from another hypothetical form "sakka", which could hypothetically derive from a hypothetical adjective *śakra (which would be a "-ra" suffix adjective)? These derivations were not done at the Pali-level. sakka is inherited from a "-ya" formation, it is not itself a "-ya" formation. Please desist from such edits. Thanks. -- 𝘗𝘶𝘭𝘪𝘮𝘢𝘪𝘺𝘪(𝘵𝘢𝘭𝘬) 17:07, 7 January 2025 (UTC)»Reply

In so far as sakka is indeed a gerundive (aka future passive participle), this analysis is legitimate. Multiple ancestries are possible, and can be noted in the etymology section. Kindly curb your objections to the presentation of Pali internal analyses, but rather enhance etymology sections if appropriate. So the short answer to your request is 'no', but I will heed a consensus. --RichardW57 (talk) 15:14, 8 January 2025 (UTC)
There was some debate regarding another word on surface analysis in terms of Pali roots, but sakka and satta are exceptionally clear-cut non-surface-analysable terms. Their function as participles is duly documented on the respective pages; but that is not any reason to necessarily be able to analyse them using the participle-forming suffixes inherited from Sanskrit - as a result, I have removed the surface analysis from these two pages. Svārtava (tɕ) 18:01, 8 January 2025 (UTC)
sakka cannot be analysed as sak + ya. Where is the -ya component in it that the etymology claims? -- 𝘗𝘶𝘭𝘪𝘮𝘢𝘪𝘺𝘪(𝘵𝘢𝘭𝘬) 18:45, 8 January 2025 (UTC)Reply
See Duroiselle[1] for roots to kam and Buddhadatta[2] for more of what happens to -ya. The commonest pattern, as in present (as in the third conjugation where div yields dibbati) and passive stem formation, is that the 'y' totally assimilates to an immediately preceding consonant. With dentals, it merges to form a geminate palatal. Are you just being obstructive? --RichardW57 (talk) 00:36, 10 January 2025 (UTC)
@RichardW57: The pattern you describe is an observation-based mixture of inheritance sound changes and Sanskrit combining rules (e.g. शक् (śak) + -त (-ta) = शक्त (śakta) -> Pali satta). This may be done by those studying grammar within Pali as an observation, but this shouldn't allow the inclusion of surface analysis. As an example, any Prakrit grammarian can make a rule from observation that while forming compounds, word medial -p- "assimilates" into -v- and surface-analyse 𑀲𑀯𑀢𑁆𑀢𑀻 (savattī) as 𑀲- (sa-) [ < Sanskrit स- (sa-) ] + 𑀧𑀢𑁆𑀢𑀻 (pattī), which I'm sure is uncontroversially undesirable and unuseful. Svārtava (tɕ) 06:38, 10 January 2025 (UTC)
@Svartava: You're wrong. If it was obvious to Prakrit speakers, it's not unhelpful. And I feel modern students of Pali are expected to recognise suffixed -ya in most of its forms rather than recognise the suffixed forms in their own right. (-yir- for -r + y- may be an exception.) And automatically mentally restoring the assimilation of final stops of roots of words in context cannot be too difficult - the Indic scripts of the Philippines used not to write final consonants. --RichardW57 (talk) 16:14, 10 January 2025 (UTC)
@RichardW57: It might have been obvious at some point before it changed even more into forms like 𑀲𑀉𑀢𑁆𑀢𑀻 (saüttī). However, the understanding comes from an understanding of inheritance sound changes only, not surface-analysability. Svārtava (tɕ) 16:20, 10 January 2025 (UTC)Reply
@Svartava: I think you underestimate the human ability to interpret speech. --RichardW57 (talk) 17:35, 10 January 2025 (UTC)Reply
@Svartava: Note that the purpose of Duroiselle's work is not to explain how Pali came about, but to help one understand it. Unlike Buddhadatta's work, it's not even aimed at helping one to write or speak it. For example, he gives guidance on interpreting an aorist, but not how to form it. --RichardW57 (talk) 17:33, 10 January 2025 (UTC)Reply

References

  1. ^ Charles Duroiselle (1921) A Practical Grammar of the Pali Language (overall work in English), Rangoon, section 472
  2. ^ A. P. Buddhadatta Thera (1956) The New Pali Course: Part II, 4th edition (overall work in English), Colombo, section 144, page 176

Exceptional behavior for modern Greek?

A lot of modern Greek pages do things differently from other languages, usually for no clear reason that I can see, e.g.:

  1. Many uses of {{col}}, {{col2}}, etc. set |sort=0 and |collapse=0.
  2. Many terms in {{col}}, {{col2}}, etc. manually disable transliteration.
  3. Modern Greek and Ancient Greek seem to be essentially the only users of {{see}}, which is used heavily in these two languages, esp. modern Greek.
  4. Several pages do unusual things like {{l|el|αγριοκοιτάω|αγριοκοιτάω/αγριοκοιτώ|t=}}, {{l|el|αγριοκοιτάζω}} (instead of just e.g. {{l|el|αγριοκοιτάω}}/{{l|el|αγριοκοιτώ}}, {{l|el|αγριοκοιτάζω}}).

The page κοιτάζω illustrates the first three, and κοιτάω illustrates (1) and (4). I am in the process of cleaning up {{col}} and variants and I'm going to fix (1) and (2) pending clear reasons why these things should remain. (1) requires manual auditing to see whether any invocations of |sort=0 should stay, but I expect there to be few cases of this. @Sarri.greek @Saltmarsh Benwing2 (talk) 00:24, 9 January 2025 (UTC)Reply

Dear @Benwing2, Happy 2025! Please do as you wish so that we can copypaste your final style to our cheatsheets. I see that you discuss your changes at Discord. Just to explain:
_1 I used to copypaste it from an older style found in 2019. collapse=0 when the reader should view the whole table (without the silly-hide of few lines). Later, I found the Template:topx handling columns ad libitum, now I see top2, 3 etc., which does not use pipes, but normal links with asterisks: they can be copypasted easily, here and at other wiktionaries. I was about to use it at all Related sections, but if you do not like it, please use the pattern that is advisable to copypaste everywhere.
_2 tr=- for repetitive similar transliterations; but if a reader wishes, repeat them.
_3 Template:see, used heavily at Related section. When the Related section is “polyplethes” -more than 5 words- we do not repeat it at each of its members. At a member (a derived or related word) we give links for its closest Rels +all its compounds and for the rest we urge the reader to see the full list at the central lemma which has the complete index of the etymological field (as we often see it in Ety.Dictionaries). Example: modern πείθω (peítho) with Rels by stem. Or ψήφος (psífos) with Rels by meaning, plus a large α...ω index (with Hide) to be able to find words easily. E.g. at the verb ψηφίζω (psifízo, I vote) or at ancient ψηφίζω (psēphízō, I count; vote) a selection of close Rels +the compounds are given plus the {{see|el|and=1|ψήφος}} link to the rest of the ety.field. The ancient ψῆφος (psêphos) has so many Derived terms, that, for the moment, I just gave a link to perseus.
_4 modern -άω/ώ twin.variant verbs Appendix:Greek_verbs#2nd_Conjugation my.2024.notes. The {{link|el|ωωωάω|ωωωάω/ωωωώ}} was used when it is judged that the only thing to see at the -ώ variant is "go to -άω". A separate link for -ώ variant is used when it is still in use, sometimes as isodynamous to the -άω, not dated. Rarely, an -ώ is the main lemma, not the -άω (τηλεφωνώ (tilefonó, I phone)). Or, an -ώ variant does not exist at all in practice. The modern -ώ is not a contraction of -άω (-áo) which is a modern suffix unlike the ancient uncontracted -άω (-áō), also -έω, -όω > contracting to -ῶ which is the basic verb from Hellenistic Koine onwards. The -ίζω (-ízo) is a completely different verb and always has a separate link.
Thank you for your hard work. PS ...#Waiting for Medieval Greek. ‑‑Sarri.greek  I 10:31, 9 January 2025 (UTC)Reply
I would add that I think the {{see}} template is a rather good idea, and the only reason it is restricted to the Greek lects is because non-Greek-focused editors don't know about it! This, that and the other (talk) 23:30, 9 January 2025 (UTC)Reply
Yeah you are probably right. BTW I am thinking of adding parameters |keepfirst= and |keeplast= indicating a number of lines at the beginning or end of {{col}} to keep in that position and not sort elsewhere, so you can e.g. use |keeplast=1 to prevent a use of {{see}} at the end of {{col}} from getting sorted among other lines. Maybe there is a better way of doing this. Benwing2 (talk) 23:37, 9 January 2025 (UTC)Reply
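
To make the |keepfirst= / |keeplast= idea above concrete, here is a minimal sketch in Python of the sorting behaviour being described: the first and last few lines stay pinned while everything in between is sorted. This is only an illustration of the proposal, not the actual {{col}} module code; the function name `sort_keeping_ends` and its handling of out-of-range values are assumptions.

```python
def sort_keeping_ends(items, keepfirst=0, keeplast=0, key=None):
    """Sort the middle of `items`, leaving the ends where they are.

    keepfirst: number of leading items to keep in position (hypothetical
               counterpart of the proposed |keepfirst=).
    keeplast:  number of trailing items to keep in position (hypothetical
               counterpart of the proposed |keeplast=).
    """
    if keepfirst < 0 or keeplast < 0:
        raise ValueError("keepfirst and keeplast must be non-negative")
    n = len(items)
    # Clamp so the pinned head and tail never overlap.
    keepfirst = min(keepfirst, n)
    keeplast = min(keeplast, n - keepfirst)
    head = items[:keepfirst]
    middle = items[keepfirst:n - keeplast]
    tail = items[n - keeplast:]
    return head + sorted(middle, key=key) + tail
```

For example, `sort_keeping_ends(["c", "a", "b", "see"], keeplast=1)` sorts the first three entries and leaves the final one, standing in for a trailing {{see}} line, at the end.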

Waiting for Medieval Greek

Still waiting for Medieval Greek (2024) L2 title to be implemented. En.wiktionary could become a pioneer, rectifying the absurd "Ancient Greek up to 1453" seen at all official catalogues, still standing as a relic, in 2025! Happy New Year! ‑‑Sarri.greek  I 10:31, 9 January 2025 (UTC)Reply

@Sarri.greek: I've been thinking about this issue some more. Would I be right in saying that your objection to the two-way split (pre- and post-1453) is primarily nomenclatural? If so, would you be happy if Ancient Greek, qua all Greek written in Greek before 1453 (i.e., excluding Mycenaean Greek, written in Linear B), were renamed "Classical Greek"? I assume it would seem less strange to call AD-second-millennium Greek "Classical" than it would to call it "Ancient". Alternatively, what about a two-way split based on orthography, denominated "Polytonic Greek" and "Monotonic Greek"? According to the latter dichotomy, Katharevousa would be included under Polytonic Greek, which would presumably be a good thing, given that it already uses grc templates for its declension and given that it is in large part a continuation of the Classical idiom (Atticism). These are just some thoughts from outside the box to attempt to cut through the impasse. What do you think? 0DF (talk) 03:02, 14 January 2025 (UTC)Reply
Dear @0DF, think grk, not grc for a moment and forget script (polytonic-monotonic, which led to misconceptions at en.wikt since 2006). We can discuss somewhere else lexicographic conventions of presenting any grk: briefly, ANYTHING grk, can be, and indeed was, written in polytonic. Our convention is that we do not duplicate pages with different scripts without reason (differentiating observations), and that we lemmatise Modern Greek in its monotonic version, which is the current script after 1982, although everything was written polytonically until then (quotations may be polytonic for 1st editions, thus el‑transliteration has to include every symbol as Module:grc-translit does).
Ref to wikt:el:Template:ελληνικά for the broad image. There is ancient (including Mycenaean, studied separately, and including Hellenistic Koine), medieval (including the Byzantine polity's output mentioned here) and modern (including Katharevousa) Plus dialects, regional idioms for each period. That is the question. Whether 'tis nobler in the mindset of en.wikt to separate grk in THREE periods. It is a language with too long a history of 3,000 years. Because of that, as a case study, it has too many lang-specifics, some probably absent from en.wikt, or systematically ignored, or even mocked as obsessive "force-feed" (sic) from us, native editors who are trained and daily exposed to different kinds of grk since the age of 12. The borders of periods and registers of Greek are diaphanous and “διάτρητος” “diatretus” ("perforated"); words reappearing with a different garment in each lemma creating intrinsic new behaviours & phaenomena. This is a difficulty for any grk-editor, even for native philologists. But are we going to put our heads in the sand?
Here, I am trying to ask the board of bureaucrats Why is this little vote not implemented?, why this procrastination? (because the turnout for any question about this 'little language', grk, is 5-8 people -most of whom do not specialise in any kind of grk in particular-). The excellent bureaucrat @Benwing is left alone to judge and carry out everything here?
The state of grk: At the moment, of the three directors (ref) of Ancient Greek @Erutuon, JohnC5, Mahagaja only M Mahagaja is active and has responded; also my mentor Erutuon responded. Generally, admins (all seriously trained linguists, as I understand) abstain and avoid expressing a decision when not specialised in Koine or Medieval Greek, whereas other editors advertise their decisions instantly; still any linguist could take a look at sources and bibliography, including greek bibliography -or is it discarded?- for such a broad question. The ModGreek admin, and my mentor, @Saltmarsh (studying and building Modern Greek since 2005), is less active now but always present, and has agreed manyfold. Plus, presently there is no active specialist (known as trained) in grc‑koi‑lat, or gkm, or gkm‑lat = el‑ear. How many times is this vote going to repeat itself? Everyone is fascinated with grc and its dialects (its texts being most prestigious), or exotic or very rare lemmata, grc‑koi under AncGr is in a very bad state and needs full review, MedGr gkm is mentioned under AncGr, el-kth under ModGr needs a polytonic reviewing plus sources, modern dialects and regional idioms under ModGr are promoted to 'Languages' indiscriminately(?), and ModGr el etymologies need review with refs. Thank you. PS could Template:code be nobr? The hyphen breaks lines. ‑‑Sarri.greek  I 11:49, 14 January 2025 (UTC)Reply
Thank you Dear @Sarri.greek: Mediaeval Greek would indeed be a good idea!
I am less active, more correctly inactive. I started learning Greek around 2004 - an interest spurred by a different alphabet (I was a scientist) and Greek holidays. But 2 weeks a year, bad hearing and a brain which remembers as many French words learnt 65 years ago as the Greek ones last week are no good for maintaining a vocabulary. So pay more attention to conjugations and declensions which are (after a fashion) systematic (for Greek).
Enough of that — whatever was done was done with the best of intentions, seeming like a "good idea at the time". (For example see also makes updating 'related' easier.) Those with a vision of a future unified appearance should be encouraged.
Overlong texts (like this one) and some factionalism have discouraged me taking part in our community.   — Saltmarsh 12:03, 24 January 2025 (UTC)Reply

SoP hyphenated compounds

The CFI presently states "Idiomaticity rules apply to hyphenated compounds in the same way as to spaced phrases", but does not explicitly mention hyphenated prefixes. I think it would be beneficial to clarify this section to make it explicit that the same also applies to hyphenated prefixes (where the prefix is, or should be, individually defined in the dictionary), consistently with the decision to delete all these (excepting those saved by a separate rule). To this end I propose that we amend the above sentence to read "Idiomaticity rules apply to hyphenated compounds, including hyphenated prefixed words, in the same way as to spaced phrases". Mihia (talk) 20:36, 9 January 2025 (UTC)Reply

Sounds good to me but we might need an explicit vote to change CFI; I'm not sure what the policy for this is. Benwing2 (talk) 23:41, 9 January 2025 (UTC)Reply
It seems worth mentioning that it has also been proposed to adopt the opposite policy, as discussed at Wiktionary:Votes/2024-11/Updating COALMINE rule and Wiktionary:Beer_parlour/2022/September#Including_hyphenated_prefixed_words_as_single_words.--Urszag (talk) 17:19, 10 January 2025 (UTC)Reply
  • I was thinking it might be nodded through as merely codifying something that we are already doing. Can anyone, perhaps an administrator, advise whether this should go to a formal vote? Mihia (talk) 18:28, 10 January 2025 (UTC)Reply
    I went ahead and added the proposed amendment, because, as stated, it is simply a clarification of the current practice, and the ex-teacher example given in the same paragraph is arguably already an example of a SOP hyphenated prefixed word.
    Relatedly, I am interested to know @Mihia's position on entries like non-existent, non-essential, etc. since these are also essentially SOP but are saved by COALMINE; as a proponent of revoking COALMINE, do you believe these entries should be deleted? Svārtava (tɕ) 19:22, 10 January 2025 (UTC)Reply
    Thanks for making that edit. To answer your question, if "non-X" means "non- + X", and nothing more, then I think it should count as SoP even if "nonX" exists, so yes, delete it. I don't see why the existence somewhere of "nonX" -- which is probably the case for almost all "non-X" anyway, for some value of "almost" -- should make any difference. I know there are arguments why it should, but last time round I remember not being convinced by these. In fact, I would go even further: if "nonX" means nothing more than "non + X" then I believe it is also SoP -- in other words, the accident of how we write things doesn't make any difference, logically -- but, then again, actually implementing this generally in any useful way seems impractical. But with the hyphen, the component division is obvious. I do think, however, that if you search for "non-X" and it does not exist in the dictionary then ideally you should be more helpfully directed to the parts (and similarly for other defined prefixes). Mihia (talk) 20:41, 10 January 2025 (UTC)Reply

Category for dog whistles

I wish to propose a new category, and possibly labels, for dog whistles: words that carry a political allusion whose intended meaning a certain audience is expected to recognise. This is a specific type of word that currently has no categorisation. Juwan (talk) 00:17, 11 January 2025 (UTC)Reply

I don’t know, it is heavily context-dependent. Even at our quotes for dog whistle, examples like globalist are given. What about slangy terms which are only understood at all by a certain demographic, or statistically make inferences about political convictions possible? If somebody calls a television idiot box, he is kind of an intellectual, if he calls it electric Jew, the intellectual circles the speaker moves in are discernibly narrow, yet the audience is not double-layered, which would suggest to us that an essential part of the definition of dog whistle is not satisfactorily reflected by us on its definition page. But our labels are supposed to describe lexemic meanings. Does dog whistle even belong to linguistic terminology rather than being a political catchword, perhaps presupposing epistemological, not to say ideological, paradigms not all dictionary readers and editors might comprehend? Fay Freak (talk) 07:22, 11 January 2025 (UTC)Reply
Would you object to it being an appendix? —Justin (koavf)TCM 07:31, 11 January 2025 (UTC)Reply
No, even less so to a user-page, which one may always try to be more convincing. We should see first how much rhyme or reason there is behind it, to assess the eventual useful content of a category. It requires an overview of political ideologies to which we believe us able to assign terms, doesn’t it? Would a new label and category not be a combination of existing labels and categories like Category:Neo-Nazism and Category:Conservatism plus the presence of a certain degree of double-entendre? Fay Freak (talk) 08:50, 11 January 2025 (UTC)Reply
I think this is far too subjective to be a useful category. — Sgconlaw (talk) 22:48, 12 January 2025 (UTC)Reply

Moving User:JnpoJuwan/Images into main

I am not sure about the process for adding policy into main. Many other contributors have already taken a look, and I request that it be moved to the Wiktionary namespace, replacing the pages Wiktionary:Images and Help:Images (the latter becomes a redirect). Juwan (talk) 10:03, 11 January 2025 (UTC)Reply

@JnpoJuwan: if it is to be policy, you’ll need to list it at “Wiktionary:Votes” to be formally voted on. — Sgconlaw (talk) 06:01, 12 January 2025 (UTC)Reply
I support the move; the two pages linked to are not useful in their current state. This could easily be a think tank policy without a formal vote. Ultimateria (talk) 19:15, 13 January 2025 (UTC)Reply
Done Moved as think tank policy. Juwan (talk) 22:40, 14 January 2025 (UTC)Reply

Sporadic senseid

If one wants to make reference to a specific sense of a lemma, is it bad practice to add {{senseid}} for just one sense? If applying it to one sense, should one ensure that all senses are identified by {{senseid}} or {{etymid}}? In the case prompting this question, I wanted to use the sense ID for a meaning of short, but saw that {{senseid}} had been used for none of them. --RichardW57 (talk) 13:26, 11 January 2025 (UTC) Incidentally, is it allowed to use sense IDs for translations to English, or does that breach the rule that non-English lemmas get translations, not meanings? I don't trust commonsense to apply. --13:26, 11 January 2025 (UTC)Reply

@RichardW57 I would say it is perfectly fine to add {{senseid}} as and when necessary, if you need to link to an individual sense, no matter whether it is a one-off or not. I do this fairly regularly. No need to add {{senseid}} to all senses unless you have a specific reason to!
As for "is it allowed to use sense IDs for translations to English" - not sure what you mean exactly, but {{senseid}} can be used in entries of any language. This, that and the other (talk) 00:17, 13 January 2025 (UTC)Reply
@This, that and the other: I can see an objection that using sense IDs is giving the meaning of a word, as opposed to translation. This urge to be helpful and save user clicks seems to irritate some users. For example, I often add |t= to forms and other script forms to remind the user of the rough meaning of a word if he's simply temporarily forgotten it, but that seems to irritate other editors, who seem to think it should only be used for disambiguation. --RichardW57 (talk) 14:13, 13 January 2025 (UTC)Reply

Table of Contents

I'm finding it hard to monitor the Beer Parlour and allied pages for new topics. Would it be in order to add something like {{minitoc}} so that one can immediately see new topics without being swamped by detailed discussions of no personal interest? I can imagine that adding it may need some careful crafting. Or am I missing a trick I should adopt? --RichardW57 (talk) 13:36, 11 January 2025 (UTC)Reply

Replacement for LDL

@This, that and the other: {{LDL}} was removed in 2024 on the basis that it wasn't being used where WT:CFI said it should be. I, however, was finding it useful for words which I suspected would not meet the requirements even if the language were well-documented, i.e. that the inability to furnish 3 examples was not due to accidents of preservation (or modern publishing or inadequate research on my part). What mechanism should now be used to sound a note of caution? --RichardW57 (talk) 14:05, 11 January 2025 (UTC)Reply

Incidentally, I can't find the vote removing the policy requiring the use of LDL or its like, only the discussion at Template talk:LDL. RichardW57 (talk) 14:05, 11 January 2025 (UTC)Reply

@RichardW57 we have {{lb|...|hapax}} if you think it's reasonably likely that only one attestation is available. If the word is attested twice, I do have to wonder whether such a fact needs any special flagging or attention for users. If a "note of caution" needs sounding, surely it would be better to write a usage note afresh for each entry to explain exactly why you feel "caution" is required. (Or at a maximum, a language-specific usage note template could be created.)
As for the policy change, I didn't bother with the formality of a vote, as the RFDO discussion was closed as delete (not by me) and it seemed absurd to leave a reference to a deleted template in the policy. However, I accept your perspective that you feel the whole thing should have been done by vote. If you feel the change should be retrospectively ratified by vote, I could hardly argue against you. This, that and the other (talk) 02:50, 12 January 2025 (UTC)Reply
@This, that and the other: Part of the objection to the template was that its output was far too prominent. The words I am worried about appear only in dictionaries (which could prompt their uses in mediaeval or modern texts), either as entries or parts of a definition - they barely rise to the status of hapax! If the 'Pali editor community' complied with the requirement to maintain a list of acceptable sole sources, the dictionaries I'm using should be on it. (I'd be inclined to make the dictionary requirement the presence on two or more of a list of dictionaries, but that wouldn't be compliant either.) At the moment I'm resorting to {{rfq}} to ask for attestation, but it asks for interesting examples of usage, and relies on HTML comments for one to say what one would like to see shown. (The justification for not maintaining such a list is that the editors have better things to do with their time.) There are some instances where it would be good to show the syntax associated with the word, but it's difficult to ask for that. I'm working on the principle that any old 'durably archived' attestation is better than none. I find lawfully (especially working from a non-rogue jurisdiction) providing literal enough translations hard work. We also seem to lack a decent mechanism for acknowledging translations, perhaps because it's not needed in the main rogue jurisdiction. Still, 'hapax' will do for words I can only find, directly or indirectly, in a single dictionary definition. Now what I need to find is a reference for the recommended syntax of an annotated translation to English. --RichardW57 (talk) 11:01, 12 January 2025 (UTC)Reply

redoing list templates

We have a ton of list templates like {{list:countries of Africa/en}}, most of which use an antiquated list-helper framework that dumps the results into a raw, hard-to-read list like this:

In languages with transliteration, as @Fenakhay points out, it is even worse, like the corresponding Japanese list:

I would like to redo these using {{col}}. Any objections?

BTW the current framework is really bad in that it pretty much forces you to format your own raw list; the badly named {{list helper 2}} template (and we inexplicably have both {{list helper}} and {{list helper 2}}) takes a |list= param whose value is a list of comma-separated pre-formatted links, which makes it difficult to change the format into something else. Instead the entries should be a comma-separated list of raw terms with inline modifiers as needed; or the entries can each be in their own parameter, again raw with inline modifiers. Either format makes it easy for the underlying framework to choose how to display the entries optimally.

Any other suggestions for improvements/etc.?

Benwing2 (talk) 00:27, 12 January 2025 (UTC)Reply

Have you mocked up a ferinstance for us to judge? —Justin (koavf)TCM 00:35, 12 January 2025 (UTC)Reply
@Koavf The use of {{col}} would be something like this:
Here I have linked the title to the actual category. I haven't yet mocked up an edit button but it could be displayed just to the right of the title, or right-justified. Benwing2 (talk) 00:44, 12 January 2025 (UTC)Reply
Looks nice enough and generally prettier, but I'd prefer no "show more" dropdown. —Justin (koavf)TCM 00:52, 12 January 2025 (UTC)Reply
@Koavf Do you mean you want all of them displayed by default? There is a param for that but I'd want to make sure there is consensus for that as it takes up a certain amount of space. Benwing2 (talk) 00:56, 12 January 2025 (UTC)Reply
Yes, that's what I want. I don't think that showing basically 12 arbitrary names out of a list of 55 is helpful. —Justin (koavf)TCM 00:58, 12 January 2025 (UTC)Reply
I agree. I find the show-more dropdowns a bit inconvenient.--Urszag (talk) 11:14, 12 January 2025 (UTC)Reply
Personally I'm a fan. I need to scroll past long lists more often than I need to look through them, especially on basic English entries and pages with many language sections. Ultimateria (talk) 19:01, 13 January 2025 (UTC)Reply
I agree in changing it to {{col}}, {{list:countries of Asia/en}} is also pretty convoluted. Also, if the problem is showing just part of the list, wouldn't it be better to do the opposite and have it fully collapsed by default? I think the Japanese one would be pretty big and a bunch of uncollapsed lists could be a nuisance in pages like Kuwait. Trooper57 (talk) 01:30, 12 January 2025 (UTC)Reply
I'm against collapsed by default, but collapsed entirely is better than sneak peeks of various parts of an alphabetical list. —Justin (koavf)TCM 01:40, 12 January 2025 (UTC)Reply
I Support this proposal, and see no reason not to use the built-in collapsibility of {{col}} for consistency with other term-list boxes in our entries. I note that this style of collapsibility for term lists was the beneficiary of clear community consensus. This, that and the other (talk) 02:38, 12 January 2025 (UTC)Reply
I hate the existing list format. I'd love for the replacement to not have any visible members, just the "show more". DCDuring (talk) 03:10, 12 January 2025 (UTC)Reply
Support Vininn126 (talk) 05:48, 12 January 2025 (UTC)Reply
Other than having the boring antiquated style, is the old list really hard to read? The new one takes up much more space. And the collapsibility, that is used to improve the space usage, is a usability issue on its own, because it requires a click to see the full list.
Speaking as someone having zero Japanese skills, the old-style Japanese version of the list would be easier for me to navigate if it were alphabetically sorted by the English transcriptions. But, I guess, I'm not supposed to be reading the Japanese entries in the first place, so my opinion is irrelevant. That's just the only reason why the demonstrated Japanese list subjectively feels awkward to me, but I don't see any other problems with it. --Ssvb (talk) 09:26, 12 January 2025 (UTC)Reply
Comment: How would it work with countries that have multiple names? I don't think {{col}} has support for multiple links in the same line, right? This doesn't really happen with African countries, but I don't think it'd be right to hide either Turkey or Türkiye from the European list. To do so would be to get prescriptive about it imo. And then the alternative of listing them in two bullet points seems even worse.
...Actually, I guess we do have Côte d'Ivoire vs Ivory Coast and Eswatini vs eSwatini vs Swaziland. It's worth thinking about those and Czech Republic vs Czechia. I'm sure other languages have similar conundrums as well. MedK1 (talk) 18:59, 12 January 2025 (UTC)Reply
@MedK1 See my comment below. {{col}} now has support for multiple links on a given line, either separated by a comma (intended for synonyms or alternative forms with significant differences) or a tilde (intended for alternative forms with minor differences). So we could write Côte d'Ivoire,Ivory Coast or Eswatini~eSwatini,Swaziland etc. (the latter displays as Eswatini ~ eSwatini, Swaziland). Benwing2 (talk) 21:22, 12 January 2025 (UTC)Reply
Ah, awesome! Support then, though I do think the format of a label and a colon preceding the list looks very very ugly. MedK1 (talk) 21:29, 12 January 2025 (UTC)Reply
@MedK1 Are you referring to my suggestion for (continents: mabara) or similar? This would go on a line by itself rather than preceding the list on the same line. If you think that's ugly, do you have any suggestions for how to improve it? I'm not sure what part of it you find ugly. Maybe we could potentially dispense with the mabara part entirely; I've just kept this because all the existing lists have it. If your objection is to the overall format of the header, maybe User:This, that and the other has some ideas how to prettify it. Benwing2 (talk) 21:44, 12 January 2025 (UTC)Reply
Yeah, I don't like the titles (the *(countries in Africa): bit before {{col}}). I don't like them in the other lists either — they always look jarring. I feel they don't look like titles per se either.
Factors that contribute to this feeling may be them being justified to the left, being in italic rather than bold, and them starting with lowercase characters. Changing any of these (but especially the last two) would go a long way.
But then they wouldn't fit right with being on a list, now would they? Ugly as they are, I get how they came to be that way. I can't think of something that'd both 'work' with lists while also looking good; it's why I hadn't made any suggestions before. I hope someone can come up with something nice... MedK1 (talk) 21:57, 12 January 2025 (UTC)Reply
Can you mock up some ideas that would look better to you? I'm not at all opposed to changing the way that {{col}} handles headers; it's something that has kinda come to be without a lot of thought put into it. Benwing2 (talk) 22:44, 12 January 2025 (UTC)Reply
@Benwing2 @MedK1 if you are looking for inspiration I made some mockups at User:This, that and the other/NavFrame and list-switcher#Interim improvement: add title to list-switcher. This, that and the other (talk) 00:11, 13 January 2025 (UTC)Reply
@This, that and the other Thanks. 3a and 3b look the same to me and both look a lot better than what we have. Benwing2 (talk) 00:17, 13 January 2025 (UTC)Reply
Also I'd like to add an [edit] box on the right side of the title bar, for use with lists; how hard is that to do using the 3a/3b style? Benwing2 (talk) 00:19, 13 January 2025 (UTC)Reply
@Benwing2 I was referring specifically to the "Interim improvement: add title to list-switcher" section - sorry I didn't make that clear. Maybe I should have made a separate page. The rest of that page is for a discussion I want to start at some point about unifying the style of NavFrames and list-switchers.
Anyway, yes it's very easy to add an "edit" link there. The whole thing will need some CSS implementation - not necessarily the way I have mocked it up on that page. This, that and the other (talk) 00:21, 13 January 2025 (UTC)Reply
So yeah, interim B looks better. Benwing2 (talk) 00:31, 13 January 2025 (UTC)Reply
Thanks!! I like these; ignoring interim A (which is not a significant improvement, imo), they're all better than what we have and I'm fine with them! Interim B is the best I think! MedK1 (talk) 00:30, 13 January 2025 (UTC)Reply
I do want to add that I think navboxes (er, aka NavFrames?) are fine as-is however. They're their own thing, so it's fine if they look a bit different from your standard page text. MedK1 (talk) 00:33, 13 January 2025 (UTC)Reply
Support. big lists deserve some love and better formatting. Juwan (talk) 22:46, 14 January 2025 (UTC)Reply

Comment: So far everyone except for User:Ssvb seems in favor of using {{col}}, although there is some disagreement about whether to display all the elements by default or only some of them (or none). I am writing the new list-helper code, which is likely to require each list:foo of Bar/CODE to do something like this (for {{list:continents/sw}}):

{{#invoke:topic list|show|sw
|hypernym=[[bara|mabara]]
|Afrika<t:Africa>
|Antaktika~Antaktiki<t:Antarctica>
|Asia<t:Asia>
|Ulaya,Uropa<t:Europe>
|Amerika ya Kaskazini<t:North America>
|Amerika ya Kusini<t:South America>
|Australia<t:Australia>
}}

This code is smart enough to know that the English hypernym to be displayed in the title is continents (based on the template name), and the corresponding category is Category:sw:Continents, although you can override either using |enhypernym= or |cat=. The format of each element is exactly as in {{col}}, meaning it can take inline modifiers; multiple comma-separated or tilde-separated items (the tilde is meant to delimit slight variants and the comma to delimit synonyms or variants with more significant differences); pre-formatted elements (e.g. for Japanese using {{ja-r}} or similar); etc. The title will currently display as something like (continents: mabara), where the English hypernym "continents" links to the appropriate category and the language-appropriate hypernym mabara links to the singular equivalent of the term mentioned. This means that the language-appropriate hypernym needs to be in the plural but typically linked to the singular, which isn't always the case in the current lists (a lot of the time, the hypernym is in the singular form, so I may need some help auditing the lists to rewrite the hypernyms in the plural). The reason that each list template directly invokes a module instead of having a wrapping {{topic list}} template is so that the user can specify parameters to the list template (e.g. |nocat=) and they are handled automatically without each list template having to explicitly pass all such parameters through. I assume this won't be such an issue as most people will just copy an existing list to create a new one.
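To make the description above concrete, here is a rough Python sketch of two of the steps: deriving the English hypernym and default category from a template name such as list:continents/sw, and splitting one element into comma-separated synonyms and tilde-separated variants with inline modifiers. The actual module is written in Lua and its internals are not shown in this thread, so every function name here is hypothetical. The category rule (capitalise the hypernym) is an assumption that only reproduces the simple Category:sw:Continents case, and the sketch does not handle names with an extra path segment such as a /sandbox suffix.

```python
import re

# Assumption: inline modifiers look like <key:value>, as in "Asia<t:Asia>".
MODIFIER = re.compile(r"<([a-z]+):([^<>]*)>")


def parse_list_name(name):
    """Split 'list:continents/sw' into (hypernym, lang code, default category)."""
    m = re.fullmatch(r"list:(.+)/([^/]+)", name)
    if not m:
        raise ValueError(f"not a list template name: {name!r}")
    hypernym, code = m.group(1), m.group(2)
    # Naive default; a real list could override it with |cat=.
    category = f"Category:{code}:{hypernym[0].upper()}{hypernym[1:]}"
    return hypernym, code, category


def split_outside_modifiers(text, sep):
    """Split on `sep`, ignoring separators that occur inside <...>."""
    return re.split(re.escape(sep) + r"(?![^<>]*>)", text)


def parse_term(text):
    """Return (term, modifiers) for e.g. 'Uropa<t:Europe>'."""
    mods = dict(MODIFIER.findall(text))
    return MODIFIER.sub("", text).strip(), mods


def parse_item(item):
    """Commas separate synonyms; tildes separate slight variants."""
    return [
        [parse_term(variant) for variant in split_outside_modifiers(group, "~")]
        for group in split_outside_modifiers(item, ",")
    ]


def render_item(item):
    """Plain-text rendering of the terms only, without links or glosses."""
    return ", ".join(
        " ~ ".join(term for term, _ in group) for group in parse_item(item)
    )


if __name__ == "__main__":
    print(parse_list_name("list:continents/sw"))
    print(parse_item("Ulaya,Uropa<t:Europe>"))
    print(render_item("Eswatini~eSwatini,Swaziland"))
```

Note that a name like list:countries of Africa/en would get "Countries of Africa" from this naive rule, whereas the existing category is named "Countries in Africa", which is the kind of case the |cat= override is there for.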

One nice thing about this is that we can easily decide the exact display format later, and change or optimize it at any time; we could even add a per-user parameter to control the appearance of such lists if there is a lot of disagreement over the appearance. Benwing2 (talk) 11:50, 12 January 2025 (UTC)Reply

One factor no one has mentioned: see Category:Latin script templates, most of which are transcluded in all the single-Latin-letter pages like "a". We need to be very careful not to add more Lua overhead to most of those pages. If anything, we need to come up with lite versions that use modules even less than before. Even if we split off letter/symbol entries to a subpage, that subpage could still end up having problems all by itself. Chuck Entz (talk) 23:13, 12 January 2025 (UTC)Reply
@Chuck Entz Those templates use their own dedicated module, Module:letters, which is out of scope at this point. IMO they might be better implemented using {{flatlist}}, which is non-Lua and uses a simple CSS implementation to display items horizontally with bullets in between them (which looks a lot better than commas). Benwing2 (talk) 23:31, 12 January 2025 (UTC)Reply
@Benwing2: I just asked what do people find unreadable in the compact look of the old list? Maybe giving it the same light blue background could make it look a bit better and more distinctly stand out? Clicking on the link https://en.wiktionary.org/wiki/Category:en:Countries_in_Africa of the old list is functionally roughly equivalent to clicking on the "show more" link of the new list, as both present the data formatted as columns. It's one click here versus one click there. Do small lists of only 3-5 items look nice with the new design? And the configurability of the template is a powerful feature, but it enables formatting discrepancies between different entries. I don't object to the changes per se, but I see a potential for a "simplification initiative" in a few years from now, essentially reverting this redesign. --Ssvb (talk) 07:51, 13 January 2025 (UTC)Reply
@Ssvb for long lists, the existing style is a barely-readable jumble of text. There's a reason typesetters use techniques such as spacing, columns, and tables to lay out long lists.
You're right that for very short lists it may be overkill to use the term-list style. The one that springs to mind is {{list:days of the week/en}}. @Benwing2 I wonder if we should keep using a similar style to the current style for short lists? "List" templates are, generally speaking, lists of coordinate terms, and so I wonder if it could make sense to have some kind of inline {{cot}}-style template for lists of, say, 8 items or less. (And also move the boxes up from a "See also" L3 to a "Coordinate terms" L4 - but I digress.) This, that and the other (talk) 10:13, 13 January 2025 (UTC)Reply
The terms should definitely be listed under "Coordinate terms" regardless of style, and I've been doing that as I encounter them listed under Related terms or See also. I've also had the thought that maybe short lists should use a horizontal layout style, although I somewhat prefer the {{flatlist}} style with bullets between the items rather than commas, something like this:
That would also allow commas and tildes to be used with their current meaning. Benwing2 (talk) 10:42, 13 January 2025 (UTC)Reply
BTW I'm now convinced that allowing for bullet or comma separation with horizontal display is a good idea, but it has to be specified manually. Something like doing it automatically if there are less than a certain number of items won't work because some items take up much more width than others. See for example ਖ਼#Punjabi. Under See also there are three lists of characters (Gurmukhi script letters, vowels and diacritics). The first list has 42 elements but is definitely better displayed horizontally with comma separators; the current display with three horizontal lists looks decent (other than IMO the category should be linked to the title rather than displayed separately). OTOH something like {{list:fundamental interactions/ru}} has only four items but looks terrible laid out horizontally like it currently is. Possibly we could do something involving the total character width of all elements but I think that would be tricky to get right. Benwing2 (talk) 23:58, 13 January 2025 (UTC)Reply
The manual override to enable horizontal display with the bullet/comma choice sounds good to me for creating new content. Still what will happen to the existing entries, such as the mentioned ਖ਼#Punjabi? Will they all need to be manually edited to enable horizontal display for them? --Ssvb (talk) 07:36, 14 January 2025 (UTC)Reply
@Ssvb No. The override would be done once in e.g. {{list:Gurmukhi script letters/pa}}, which is the template that displays that list. In fact, the system I am designing will automatically override the default value for horizontal display so as to enable it with a comma separator (and set certain other defaults, such as disabling transliteration and providing appendix lists for the script and alphabet if they exist) for *ALL* lists of the form list:Foo script letters/CODE. This is on the assumption that all lists of letters should be displayed in a similar fashion. An individual list can always override the default and supply different settings if the defaults don't look right for that particular list, but on the assumption that most lists won't do that, we'll get fairly uniform, good-looking lists of letters while other sorts of lists may display in a different fashion. (For example, days-of-the-week lists may set horizontal display with bullets as the default, since there are only 7 entries and they're usually not too long; but some particular languages may want to override this and go back to a columnar display if the horizontal list gets too long, e.g. if there are several synonyms of each day name, each one with translit, or something.) The only thing that might need to change on each page calling these lists is to remove the bullet to the left of the list template invocation, since the list module itself will generate it when the list is displayed horizontally. Benwing2 (talk) 07:50, 14 January 2025 (UTC)Reply
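The layering described above (global defaults, then defaults keyed on the form of the list name, then settings given by the individual list) can be sketched roughly as follows. This is not the actual module code, which is in Lua and not shown here; the setting names `horiz` and `translit` and the two name patterns are assumptions chosen only to mirror the examples in this comment.

```python
import re

# Hypothetical global defaults: columnar display, transliteration shown.
BASE_DEFAULTS = {"horiz": None, "translit": True}

# Hypothetical per-pattern defaults, keyed on the form of the list name.
PATTERN_DEFAULTS = [
    (re.compile(r"^list:.+ script letters/[^/]+$"),
     {"horiz": "comma", "translit": False}),
    (re.compile(r"^list:days of the week/[^/]+$"),
     {"horiz": "bullet"}),
]


def resolve_settings(template_name, overrides=None):
    """Merge base defaults, pattern defaults and per-list overrides,
    with later layers taking precedence over earlier ones."""
    settings = dict(BASE_DEFAULTS)
    for pattern, defaults in PATTERN_DEFAULTS:
        if pattern.match(template_name):
            settings.update(defaults)
            break  # first matching pattern wins in this sketch
    settings.update(overrides or {})
    return settings


if __name__ == "__main__":
    print(resolve_settings("list:Gurmukhi script letters/pa"))
    # A particular language opting back into columnar display:
    print(resolve_settings("list:days of the week/ru", {"horiz": None}))
```

The point of the design is that pages calling a list template do not need editing: the choice is made once per list, or once per family of lists, and an individual list only states what differs from its family's defaults.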

Comment: I have created an implementation of this, and a sample list in {{list:countries of Africa/ja/sandbox}} (it needs to be in mainspace because its name is parsed to get part of the title). It displays as follows:

As noted above, we are going to change the title bar to look better. Benwing2 (talk) 01:50, 13 January 2025 (UTC)Reply

And I have just gone ahead and changed the title bar, as envisaged.
On the Vector 2022 skin this template takes up 4 columns. This is a bit cramped and it should preferably only take up 3 columns in this skin. A solution to this problem is in the works. This, that and the other (talk) 06:19, 13 January 2025 (UTC)Reply
Thanks. Do you feel your prototypical algorithm in JavaScript is good enough to convert to Lua? If so I can go ahead and do it. Benwing2 (talk) 06:34, 13 January 2025 (UTC)Reply
I'll reply on your talk to avoid cluttering BP with technical discussion. This, that and the other (talk) 06:59, 13 January 2025 (UTC)Reply

obnoxious collapsibles on mobile

Many Wiktionary entries are blessed with a very rich list of linked terms (e.g. derived terms at hand). On desktop, these appear in a neat, handy collapsible box that can be opened to view the deluge of links. When opening a page, the table is shortened and can be scrolled past easily. I personally use mobile more often, though, and in my (Firefox) browser these collapsibles pose a serious obstacle to accessing the rest of the page, simply because they are non-collapsed by default, with the "show less/more" button at the bottom, so I have to scroll past hundreds of words to get to the sections below.

I have no clue whatsoever how these are programmed or by whom, but I can imagine two solutions: either have them start off closed when opening a page (like translations), or move the button to the top, so users don't have to scroll past to close them.

PS: I found these two discussions, https://en.wiktionary.org/wiki/Wiktionary:Beer_parlour/2021/April and https://en.wiktionary.org/wiki/Wiktionary:Grease_pit/2021/June#Experience_on_mobile, on the same problem with language headers. This is even worse, because users can at least immediately close a language header. Jan R Müller (talk) 13:11, 12 January 2025 (UTC)Reply

I'm sure issues like this have been discussed more recently than 2021, and some solutions have been proposed. I don't recall what the current blocker is; I recall some longstanding Phabricator ticket to do something or other. @Ioaxxere @Surjection @This, that and the other Can you comment? I vaguely remember some proposals for having a gadget to improve the mobile experience. Benwing2 (talk) 21:50, 12 January 2025 (UTC)Reply
There are two separate issues here. Ioaxxere can probably say more about the language headers one.
As for the collapsible term-list boxes, I have actually been noticing something similar on my phone, except for me, it is all the quotations in the entry that appear uncollapsed - the collapsible boxes are collapsed as they should be. It turns out that the sticky "visibility" links that are shown in the sidebar in the desktop Vector skin (and in the Tools menu on Vector 2022) appear at the very bottom of the page on mobile, and I have at some point turned on the "Show quotations" option. So @Jan R Müller please scroll to the very bottom of an affected entry and tap "Hide derived terms" or similar.
I wonder if we should move these links somewhere else on mobile... This, that and the other (talk) 00:10, 13 January 2025 (UTC)Reply
@Jan R Müller: The language header thing you mentioned is tracked at phab:T376446 but the developers are yet to fix this for us. Ioaxxere (talk) 04:27, 13 January 2025 (UTC)Reply
Good news as of Jan 9 (3 days ago):
We have decided to not block this on T374883. Anything that reduces the complexity of the section collapsing code is worth doing as it simplifies the transition to Parsoid. This means articles with only one heading will display with the heading collapsed by disabled, but given there is a user preference to override this behaviour, we don't see this as a problem. We've queued this up for next sprint since the patch is written, it's mostly removing code and we don't see this as a big time sync.
Benwing2 (talk) 04:32, 13 January 2025 (UTC)Reply
@Benwing2 @Ioaxxere just noting that this appears to have happened. Now all section headings appear collapsed by default, even when there is only a single section. This, that and the other (talk) 04:45, 18 January 2025 (UTC)Reply

Can we label Five-Percent Nation lingo or should we consider it slang?

Hi, I've been thinking about including a lot of Five-Percent Nation lingo in the dictionary. Some of the words are already there, like a-alike or 85er, but I thought that maybe we could create a label for it that would automatically add a category, let's say "Five-Percent Nation" or "Five-Percent Nation jargon"? Of course not all words should be allowed, as most of them are just English words that carry a similar meaning, but since the lingo was popularised in the 90s by many rappers and some of it is still used today in hip-hop, I think it should have its own category. What do you guys think? Tashi (talk) 21:34, 12 January 2025 (UTC)Reply

I don't know much about the Five-Percent Nation but if the inclusion of these terms can be justified according to CFI and there are enough of them, then IMO yes we should probably have a category. Category:African-American English and especially Category:African-American Vernacular English are a mess and a lot of recategorization is in order. @-sche Benwing2 (talk) 21:56, 12 January 2025 (UTC)Reply
I don't know how many terms we could include. Is there a minimum number of entries a jargon should have? Tashi (talk) 22:08, 12 January 2025 (UTC)Reply
IMO a category should have at least ~10 items to be worth creating. Benwing2 (talk) 23:32, 12 January 2025 (UTC)Reply
Agreed. Most people don't realize that Wikimedia categories aren't really for categorizing, they're navigation aids to help find pages that have specific things in common. If there's only a few pages, you might as well just link directly to the other pages from each of them (i.e. in a "See also" section or a usage note). Chuck Entz (talk) 00:04, 13 January 2025 (UTC)Reply
I think it's doable :) If I want to add it to the module, do I need permission, or can I just do it now? Tashi (talk) 23:02, 13 January 2025 (UTC)Reply

mismatches between list templates and categories

I made a list of all the current list templates and the categories they categorize into. In the following list I chopped off the language-specific component of the template and category names, counted up each (template, category) combination and sorted the result by count:

What you can see is:

  1. 42 languages (including English) categorize their Gregorian calendar months into LANG:Gregorian calendar months but 107 categorize just into LANG:Months. I'd like to fix that, and I think we have one of two options. (1) categorize only into LANG:Gregorian calendar months; (2) categorize both into LANG:Gregorian calendar months and LANG:Months. (English does (1) but then manually adds Category:en:Months to all the months.)
  2. Templates named countries of Continent vs. categories countries in Continent. The logic used in Module:place and related data modules is that "of" is usually reserved for political/administrative subdivisions of a country or sub-country-level entity while "in" is used for random lists (hence Category:en:Counties of Texas, USA and Category:en:Municipalities of Minas Gerais, Brazil but Category:Cities in Texas, USA and Category:Towns in Minas Gerais, Brazil). By this reckoning, a continent is not a political entity so it's Category:en:Countries in Africa not #Category:en:Countries of Africa. I propose making the template names follow the same logic.
  3. Templates like Devanagari script letters map to language-specific categories. Should they instead map to a topic cat LANG:Devanagari script letters (and similarly for other Foo script letters categories)? Benwing2 (talk) 03:23, 13 January 2025 (UTC)Reply
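A tally like the one described at the top of this thread could be reproduced along these lines. This is a minimal Python sketch rather than the script actually used: the helper names are hypothetical, and it assumes template names of the form `list:NAME/CODE` and category names of the form `CODE:NAME`, which will not hold for every list (language-specific categories, for one, have no `CODE:` prefix).

```python
from collections import Counter


def strip_language(template_name, category_name):
    """Drop the language-specific part of a (template, category) pair.

    Assumed shapes (hypothetical, for illustration only):
      "list:countries of Africa/ja" -> "countries of Africa"
      "ja:Countries in Africa"      -> "Countries in Africa"
    """
    template = template_name.removeprefix("list:").rsplit("/", 1)[0]
    if ":" in category_name:
        category = category_name.split(":", 1)[1]
    else:
        category = category_name
    return template, category


def count_combinations(pairs):
    """Count each language-neutral combination, most frequent first."""
    counts = Counter(strip_language(t, c) for t, c in pairs)
    return counts.most_common()
```

Fed the full set of (template, category) pairs, the output would show at a glance which template maps to more than one category name, as with the Gregorian calendar months in item 1.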

Importing hatnote templates from Wikipedia

for the purposes of non-mainspace (Wiktionary and Appendix namespaces), I wish to import some hatnote templates from Wikipedia, such as the ones below:

these would be helpful as they are already intuitive for wiki readers and writers. some of these are also being imported to fix their visual output, as some are misusing the bare : ''a'' syntax, which is bad for both desktop and mobile. Juwan (talk) 21:52, 13 January 2025 (UTC)Reply

@JnpoJuwan: could you give some examples of how you intend to use them? — Sgconlaw (talk) 22:55, 13 January 2025 (UTC)Reply
Generally I discourage importing code from Wikipedia as it almost always makes for a maintenance headache and usually there is already a way to do that in Wiktionary. In this case, we already have {{also}} for confusable terms, and a name like {{also2}} would be very confusing to editors. I think if it's desired to have a hatnote functionality for Wiktionary and Appendix there should be a single template called {{hatnote}} or similar, and in fact it already exists; so do {{selfref}} and {{main}}, along with {{maincat}}. I'd definitely discourage having more hatnote templates that don't obviously differ from each other. Benwing2 (talk) 23:30, 13 January 2025 (UTC)Reply
I see no valid use for the templates not created yet. The first three take account of the encyclopedic character of a project, which Wiktionary lacks, the fourth displays the same as our {{also}}, the fifth is already created for links to internal pages. Fay Freak (talk) 21:38, 14 January 2025 (UTC)Reply
Well, turns out we have encyclopedic content on appendix pages, which is where {{main}} is used on 152 pages, some uses of which are ugly web design (some rhyme pages such as Rhymes:Indonesian/i-), some redundant (e.g. Wiktionary:List of languages/special, where the link to the main page is auto-created due to the present page being a subpage). I doubt that Wikipedia editors are going to write those linguistic appendices, though, so like Benwing I am shrewdly afeared that the presence of the templates is a disproportionate maintenance burden. Fay Freak (talk) 21:47, 14 January 2025 (UTC)Reply
I believe that these would be helpful exactly for quick editing in internal pages, on the appendix and Wiktionary namespaces. it's not that only Wikipedia editors will write those, but I am suggesting these are globally well-known and helpful in the case that I suggested, as I find them missing sometimes. Juwan (talk) 22:38, 14 January 2025 (UTC)Reply
for a concrete example, I wanted these when writing Wiktionary:Images. I would find it also helpful for navigation around other policy pages. Juwan (talk) 22:43, 14 January 2025 (UTC)Reply
@JnpoJuwan You need to be specific about how the existing {{hatnote}} and such templates aren't sufficient. Just saying "let's do what Wikipedia does" isn't enough. Benwing2 (talk) 22:46, 14 January 2025 (UTC)Reply
I may be doing a bad job at explaining myself. the {{hatnote}} templates may be sufficient, however, these would be practical in quickly writing the wikitext with simpler formatting, including links and all. I don't mean that these templates actually have to be copies of the code on Wikipedia, as the templates sure are a lot to deal with. rather, their outputs should be similar.
I apologise for this discussion having a lot of words to try to explain a few. Juwan (talk) 22:58, 14 January 2025 (UTC)Reply
Especially since I specifically deny that people who know it from alleged global usage would deploy them for some quick editing in Wiktionary-internal pages. Editors are not equally invested in Wiktionary as well as Wikipedia or any project where these templates equally make sense, and if so it would take extra steps for them to try to use these templates and be seriously disappointed if they are referred to an existing more general method ({{hatnote}}) to format their text. We aren’t that heavy in formulated guidelines and policy pages. So currently the prospective uses of the additions – they hardly deserve the attribute “practical” – are pushed into the background by the maintenance concern. We actively remove rarely used templates and template redirects that do the same job as other templates that are used more often. Fay Freak (talk) 23:07, 14 January 2025 (UTC)Reply

Old Albanian

Discussion moved to Wiktionary:Language treatment requests#Old Albanian.

Usage of etym-only codes: what should or shouldn't be appropriate?

One thing I often see debated is how etym-only codes are to be used. Some argue they should be used only in etymology sections for categories (why not in {{cog}} or {{nocog}}?). Some argue they should be allowed in descendants sections. Some complain about uses such as купорос, where Middle Russian is an etym-only code for Russian, and this would be better handled as an alt form.

I'm not going to lie, I'm not optimistic about editors coming to some sort of consensus, but I figure it's worth a try to start the discussion and get input as well as potential problems or solutions to each approach.

Personally, I feel that etym-only codes are fine in any etymology template as well as descendants, which is a sort of "reverse" etymology section, and it could also be helpful in organizing things such as dialectal forms. For example, Polish has a few etymology codes - instead of everything being in just the alts section, it would be possible to show how each dialect group handled a given reflex to show better comparison of regional differences. Vininn126 (talk) 21:39, 14 January 2025 (UTC)Reply

Removing non-lemmas from Special:Random

In Wikipedia, a series of pages called disambiguation pages were removed from the pool of pages that Special:Random can direct you towards. Even though disambiguation pages make up 225,920/6,939,921 or ~3.26% of the total pages on Wikipedia, their repetitive and auxiliary rather than informative nature was enough that their removal was deemed an improvement to the overall "Random article" button experience.

If one is to consider the state of affairs prior to the above change as an issue, then here on Wiktionary, what we have is a much bigger version of it. Like disambiguation pages, our non-lemma pages are built to direct readers to the word's canonical version. Unlike disambiguation pages, however, non-lemmas actually tend to form a sizable chunk of entries for most languages with active editors; our Wiktionary:Statistics page and the many non-lemma and lemma category pages (highlighted blue just now) can provide a rough view of the situation, where the corresponding categories for English and French show that the amount of non-lemma entries is respectively ~61% and ~310% (!!!) the amount of lemmas.

It's worth noting that English is a weakly inflected language: the proportion of non-lemmas becomes far more noticeable if you take a highly inflected one like Finnish. User:Jberkel keeps a list of wanted entries, most of which are non-lemma entries. If we took the stats for his latest data dump and began creating all the pages in there, the creation of the Finnish pages alone would nearly quadruple our total number of entries as well as heavily skew the Special:Random output.

With these and other more anecdotal considerations in mind (such as my own experience with successive "Random entry" clicks and commentary by other users in the Wiktionary Discord server), I'd like to know if users here would be in favor of changing Special:Random to remove non-lemma pages from its pool, that is, pages where all of the definitions are non-lemmas. The way I imagine it, it'd be implemented by having code check for the absence of any "LANG lemmas" categories in the page, perhaps even by checking "whether any categories end in lemmas"; if that's not the case, then the page is not eligible for showing up in Special:Random and would therefore be invisible to/ignored by it.

Should people agree, I aim to get this proposal to Phabricator and use this discussion as evidence of community consensus (as has been done in the past) to hopefully get this sorted out! I'm looking forward to potentially going back to enjoying my favorite Wiktionary pastime. MedK1 (talk) 22:16, 14 January 2025 (UTC)Reply
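The eligibility rule proposed above (a page qualifies only if at least one of its categories ends in "lemmas") can be illustrated with a short Python sketch. This is not MediaWiki code and says nothing about how Special:Random would actually be changed; the function names and the input shape (a mapping from page title to category names) are assumptions made for the example.

```python
def is_lemma_page(categories):
    """True if any category name ends in " lemmas", e.g. "English lemmas".

    "English non-lemma forms" does not match, so a page carrying only
    non-lemma categories is treated as ineligible.
    """
    return any(name.endswith(" lemmas") for name in categories)


def random_pool(pages):
    """Return the titles that would stay in the random pool.

    `pages` is a hypothetical mapping of page title -> list of category names.
    """
    return [title for title, categories in pages.items()
            if is_lemma_page(categories)]
```

A page with a lemma in one language and a non-lemma form in another stays in the pool, which matches the stated intent of excluding only pages where all definitions are non-lemmas.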

Strong support Fay Freak (talk) 22:47, 14 January 2025 (UTC)Reply
Support Trooper57 (talk) 20:29, 15 January 2025 (UTC)Reply
Support. — Sgconlaw (talk) 22:33, 15 January 2025 (UTC)Reply
Support! Polomo47 (talk) 02:30, 16 January 2025 (UTC)Reply
Support Vininn126 (talk) 09:16, 16 January 2025 (UTC)Reply
Strong support Chihunglu83 (talk) 18:24, 16 January 2025 (UTC)Reply
Support, although I have little hope this will get implemented. Jberkel 13:44, 17 January 2025 (UTC)Reply
Indeed. Vininn126 (talk) 13:45, 17 January 2025 (UTC)Reply
Well, depends what people use Random for, doesn't it? If you want a word for its meaning and definition: lemmas are best. If you want something like a dice roll (a random valid word for a word game, or for composing surreal Dada poetry) then maybe you want all possible forms. I don't use it so I don't know the use cases. 2A00:23C5:FE1C:3701:2921:96CC:86C1:8A99 13:48, 17 January 2025 (UTC)Reply
Well, sure, but good luck getting this implemented - I don't think any such feature exists in MediaWiki.
Here's a more workable proposal: Many years ago we used to have a link in the sidebar, immediately below "Random page", which took you to WT:Random page, where you pick a language and get sent to a random lemma in that language. I'm not sure why it was removed. We could very well add that back, or at least make it easier to find. This, that and the other (talk) 14:17, 17 January 2025 (UTC)Reply
In the MediaWiki codebase there seems to be a hook to override Special:Random, but this would require custom development just for Wiktionary, which is unlikely to happen (unless there's a patch coming from the community, but even then there might be security concerns). Jberkel 14:37, 17 January 2025 (UTC)Reply
Strong support despite technical limitations pointed out by Jberkel and TTO. Making Wiktionary:Random page (a very interesting page I wasn't aware of before this) more visible and accessible is also a great idea. Svārtava (tɕ) 18:16, 20 January 2025 (UTC)Reply
Support although @MedK1 you have a typo up above; you wrote:
perhaps even by checking "whether any categories end in lemmas" and if that's the case, then the page is not eligible for showing up in Special:Random
when you presumably mean "if that's not the case". Benwing2 (talk) 09:21, 21 January 2025 (UTC)Reply
Oh, true! I'll fix it right away, thank you! MedK1 (talk) 15:29, 21 January 2025 (UTC)Reply

adding a topic category 'Religions'

Very surprisingly, we don't seem to have a set category listing religions, although we have CAT:en:Religion (a related-to category, which has 1,241 items and is in serious need of subclassifying) and even CAT:en:Religious occupations (?!). Any objection to me adding one? There are currently four lists of religions e.g. Template:list:religions/en and similarly for Telugu (although it's garbage, with English terms mixed in), Syriac and Georgian, and I can redirect the terms in those lists to the new 'Religions' categories. (Or should it be 'Religious movements'?) Benwing2 (talk) 05:18, 15 January 2025 (UTC)Reply

I also propose a set category 'Taxonomic ranks' (kingdom, phylum, class, order, family, genus, species and many others, e.g. subspecies, superclass, tribe, etc.). Benwing2 (talk) 08:04, 15 January 2025 (UTC)Reply
This can easily be filled by Template:list:taxonomic ranks/fa and 9 templates of the form Template:list:taxonomy/CODE (zh, ko, my, th, vi, ja, km, hi, ml), which should be renamed to Template:list:taxonomic ranks/CODE. Benwing2 (talk) 08:06, 15 January 2025 (UTC)Reply
@Benwing2 How would you propose to handle some things that are not exactly taxonomic ranks, to wit, clade, taxon, group? There are also ambiguous ranks like section, which is used both as a subgeneric and suprageneric 'rank'. I assume that we are talking only about current/recent definitions. I am not sure where exactly group names like cohort, series, and division fit in. There is also a bit of ambiguity because botanists, zoologists, prokaryotists, virologists, horticulturalists, etc. each have different codes which seem to affect terms like empire, domain, kingdom, and realm at the very top (or bottom?) of the chain. It is probably just a matter of presentation, with some items appearing outside of the main sequence of ranks. DCDuring (talk) 21:49, 20 January 2025 (UTC)Reply
If I'm understanding things right, terms like clade and taxon are at the meta-level; they are types of groupings. So maybe we should have a Category:Taxonomic groupings or Category:Types of taxonomic groupings to include them. Benwing2 (talk) 22:10, 20 January 2025 (UTC)Reply
No, those first three are terms for taxonomic names that may appear in the hierarchy anywhere, in principle. (Also, it is desirable that any taxonomic name be a clade, in the sense of a taxonomic group that includes a single group of organisms and all its descendants. For taxa in general this is at best a hypothesis or a goal.) I don't see why these and things like nothospecies, nothogenus, variety, oogenus, form, section, and cohort etc. should appear in a separate category. Each of them can make sense in an inheritance/descent taxonomic sequence of groups of organisms. DCDuring (talk) 22:36, 20 January 2025 (UTC)Reply
OK you know more than I do about this. I just think that 'taxonomic ranks' should only include actual ranks (inclusively; any term used as a rank by any related field should count); meta-terms like clade should go somewhere else, or only in Category:Taxonomy. Benwing2 (talk) 22:58, 20 January 2025 (UTC)Reply
I agree, except that I think the group names that can appear in multiple positions in taxonomic trees should also be included, specifically clade, section, series, division, as well as names such as cohort, and also group (which is often used in nearly SoP terms like species group, stem group, informal group, etc.) and, possibly, taxon. DCDuring (talk) 16:21, 21 January 2025 (UTC)Reply
No opinion either way, but note that we do have Appendix:Religions, which may explain the absence of a category. Andrew Sheedy (talk) 18:20, 17 January 2025 (UTC)Reply
I'm indifferent ultimately, but I will just note that the way that "religion" is used can be ambiguous and there may be complications. For instance, I routinely hear someone saying that (e.g.) Catholicism is a "religion", when I think of Catholicism as a subset of Christianity and Christianity is the "religion". And just like the word "language", there is a language like English and there's the sense of "he uses foul language" or "I don't like your (use of the English) language" and I think the ambiguity is similar here. So, to stop rambling: I agree with the kind of taxonomic approach (where you have examples like "Christianity > Catholicism > Roman Catholicism" etc.), but I can easily imagine a lot of disagreement or misuse of the category tree by putting things in the top level that belong lower in the taxonomy. —Justin (koavf)TCM 18:27, 17 January 2025 (UTC)Reply

Kiautschou German Pidgin

An entry has just been created at Gobenol with no headword and incorrect language codes. The problem is that we don't have a language code for Kiautschou German pidgin, so @WorldPeaceIsNotFarAway "improvised". It also should be moved to lowercase gobenol, if we decide to fix it rather than delete it. Chuck Entz (talk) 16:01, 15 January 2025 (UTC)Reply

@Chuck Entz: Kiautschou German pidgin makes interesting reading. The capitalisation of the noun is original and conforms to German grammar, so I'd say the entry's in the right place. The pidgin has no ISO 639-3 code, so we'll have to make our own; de-kiau, perhaps? 0DF (talk) 01:32, 18 January 2025 (UTC)Reply
@0DF: The standard scheme is to use the family code followed by three letters that make as much sense as possible, so I went ahead and created the code "crp-kia". There's only one entry, and there's not a lot of the language attested to make more, so if anything is wrong it won't take long to clean it up. Fortunately, it has a Wikidata item and there's another German-based pidgin I was able to use as a model, so I think I got the basics done right. I'm a bit fuzzy about the correct derivation templates to use for terms that come from a pidgin's lexifier, so I changed the ones in the entry to {{der}} to be safe. Chuck Entz (talk) 04:32, 25 January 2025 (UTC)Reply

Update on enabling dark mode

A while back I made a request to Phabricator (phab:T381058) requesting that dark mode be enabled for logged-out users. Recently, User:Jdlrobson (one of the developers leading the dark mode project) informed me that it would not be enabled until there were fewer "accessibility issues" (i.e. not enough contrast) in dark mode than in light mode. The results of the evaluation are here:

The number in light mode is surprisingly high, but from spot checking a few of them it seems like the tool is very sensitive and catching colour combinations that aren't really hard to distinguish in practice. The dark mode contrast issues are much more severe, so that's what we should prioritize for now.

Therefore I ask everyone reading this: if you edit a language, please make sure that its templates are using the palette colours! If a template is widely used, fixing it could bring down the contrast issues by a significant amount. I haven't had a lot of spare time lately but I will also try to help out over the weekend.

Ioaxxere (talk) 15:03, 16 January 2025 (UTC)Reply

@Ioaxxere: probably best to add something to “Wiktionary:Templates” about this. — Sgconlaw (talk) 18:27, 16 January 2025 (UTC)Reply
As I said at GP recently, there is a very long tail (probably thousands) of non-dark-mode-compliant templates that are used on only a handful of pages. Many of them have other issues besides colour (poor padding/spacing, needless fixed or relative width settings, small text, ...) and I have been converting many of these to {{inflection-table-top}}.
Sometimes this conversion gives rise to objections - and some editors say they would prefer to have been consulted beforehand - but in most cases the change is relatively minor, and if it is not liked by a group of editors, it can easily be reverted and discussed after the fact. Moreover, the conversion as a whole seems quite uncontroversial. I have converted templates in countless languages and I've received very few complaints - concerning only Greek (which was able to be resolved), Old English (still unresolved - although, to be fair, I did take the liberty of rearranging the template at the same time...) and Afar (just today).
@Jdlrobson according to the random sample at [1] we have 27% of pages that are non-compliant. What threshold would you consider acceptable? I would also add that the report doesn't indicate which template has the problem, which is a bit of a time-waster. Surely the name of the template can be extracted from the Parsoid HTML. Is it possible to improve this? This, that and the other (talk) 23:59, 16 January 2025 (UTC)Reply
At mw:Recommendations for night mode compatibility on Wikimedia wikis#Use accessible colors which pass WCAG AA checks there are browser extensions for Chrome and Firefox. I checked 吃瓜 and with such contrast checker I can see the issue in "trad" and "simp" from {{zh-forms}}. Note that this report is from the 500 most visited pages, and the figures change with weekly updates. The threshold is simply fewer issues in night mode than light mode, supposing that in light mode they are minor issues not reported so far. Vriullop (talk) 07:23, 17 January 2025 (UTC)Reply
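For anyone who wants to check a colour pair by hand instead of using a browser extension, the WCAG 2.x contrast ratio is simple to compute. The Python sketch below follows the published WCAG definitions of relative luminance and contrast ratio, with the AA thresholds of 4.5:1 for normal text and 3:1 for large text. The function names are my own, and it only handles six-digit hex colours, so treat it as a quick check rather than a replacement for the report's tooling.

```python
def _linear(channel):
    """Convert an 8-bit sRGB channel to its linear value (WCAG 2.x formula)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_colour):
    """Relative luminance of a colour given as "#rrggbb"."""
    h = hex_colour.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(foreground, background):
    """WCAG contrast ratio, from 1 (identical colours) to 21 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground, background, large_text=False):
    """True if the pair meets WCAG AA: 4.5:1, or 3:1 for large text."""
    threshold = 3.0 if large_text else 4.5
    return contrast_ratio(foreground, background) >= threshold
```

To use it, pass the text colour and the background colour a template produces in night mode, e.g. `passes_aa(text_colour, background_colour)`; hard-coded greys on a dark background are the sort of pair that tends to fail.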
The top of the report has selectors. We cannot infer templates.
A shortcut would be to add rules to MediaWiki:Common.css and MediaWiki:Minerva.css that cover those selectors and seek to migrate the rules to templates at your own pace. Jdlrobson (talk) 02:41, 24 January 2025 (UTC)Reply

Romance definite article boxes

We need to decide what to do with Romance definite article templates like {{Italian definite articles}}, {{Mirandolese Emilian definite articles}}, ... In my opinion these stick out like a sore thumb even in regular light mode:

  • in no other situation do we show inflection information in a right-floating box
  • they are multicoloured for some reason
  • the grey colour has poor contrast with the text

I'd like to propose moving these under an "Inflection" or "Declension" L4 header within the relevant "Article" POS, which is where most languages place similar tables. Naturally I would also convert them to {{inflection-table-top}}:

LANG definite articles
             singular   plural
masculine    ...        ...
feminine     ...        ...

Any comments or objections? This, that and the other (talk) 11:10, 17 January 2025 (UTC)Reply

There being no objections, this particular item is Done. This, that and the other (talk) 10:02, 22 January 2025 (UTC)Reply

what counts as a "country"?

I am cleaning up all the list templates and I notice that someone has stuck Do NOT add Kosovo here in a bunch of "countries of Europe" lists. Just 4 days ago, despite this, an IP went ahead and stuck it into Template:list:countries of Europe/en, noting (reasonably IMO) that Template:list:countries of Asia/en includes both Palestine and Taiwan, despite neither having full diplomatic recognition and Palestine being under military occupation. In order to head off controversy, I'd like to get consensus on which countries to include and which ones not to include. I propose:

  • Taiwan, Palestine and Kosovo all go in these lists (as well as the other obvious candidates listed under Wikipedia's Category:States with limited recognition, which include Armenia, China, Cyprus, Israel, North Korea and South Korea).
  • Other de-facto-independent countries with wide but partial diplomatic recognition and no military dispute involved also go in these lists (notably, the Cook Islands and Niue, which are technically "self-governing in free association with New Zealand", but there may be other island nations in a similar situation).
  • "Frozen conflict" areas that have very limited diplomatic recognition *DO NOT* go in these lists. This includes e.g. Transnistria, Abkhazia, South Ossetia, the occupied parts of Ukraine, Northern Cyprus, Somaliland, Puntland, and various more obscure areas listed in Category:States with limited recognition.
  • Constituent countries probably *DO NOT* go in these lists. (Although there is now support in {{col}} for indented sublists, which will show up as parenthesized sublists in horizontally-laid-out lists once I finish the support for this; so conceivably we could put Greenland and the Faroe Islands indented under Denmark; England, Scotland, Wales and Northern Ireland indented under the United Kingdom; etc.)
  • Finally, I don't know what to do about the Sahrawi Arab Democratic Republic (which claims Western Sahara but controls only a fraction of it), as I don't know whether it's more controversial to include it or leave it out. Per Wikipedia, as of Sep 2022 it had diplomatic relations with 46 states (but not including the US, Canada, anywhere in Europe, China, Russia, India or Brazil), with recognition frequently given and withdrawn; see International recognition of the Sahrawi Arab Democratic Republic for the gory details.

The intent here is to take the least controversial and least POV positions, as we're a dictionary and not in the business of being politically controversial. (You could maybe argue that it's best to not include any partly-recognized state, but that would entail leaving out not only Taiwan, Palestine and Kosovo but China, North and South Korea, Israel, etc. etc., which feels needlessly POINTy and hardly the least controversial approach.) Benwing2 (talk) 07:00, 17 January 2025 (UTC)Reply

I think we should go by the most common definition and simply list all 193 member states of the United Nations and its two observer states. Anything else would basically (accidentally) be Wiktionary making a statement; this is the least controversial option as it is the most basic—albeit somewhat random and illogical—definition.
The second choice is the above plus the Cook Islands, Kosovo, Niue, SADR and Taiwan. I would argue for the inclusion of SADR because ~40 countries recognising it are four times that of Taiwan, so listing just Taiwan would seem somewhat strange. The Cook Islands and Niue are very weird because they are essentially the exact same as the United States' “Compact of Free Association” rubbish (Marshall Islands, Federated States of Micronesia and Palau), the only difference being that the American ones are UN members whereas the New Zealand ones are not. However, as UN membership = country is quite the elementary definition, and, seeing as the Cook Islands and Niue have the exact same self-governance as the US's states, they should be kept.
But again, I am in favour of the first option because IMO it should not really be up to Wiktionary to decide what a “country” is here. LunaEatsTuna (talk) 09:21, 17 January 2025 (UTC)Reply
To be honest I think leaving out Taiwan makes us look "less neutral" than including it, simply because most countries acknowledge that it's a de facto independent country (but they can't say they have "diplomatic relations" with it), have some kind of representative office there (but can't call it an "embassy"), and so forth. To me, leaving out Taiwan is adopting the PRC POV.
On the whole I generally prefer Benwing's proposal (Luna's second choice). I don't really have a strong view on how we treat Western Sahara - if we don't include limited-recognition states I'd lean against treating it as a country, as it is effectively a government-in-exile at this point.
Another idea could be to emulate what Wikipedia does - it places limited-recognition states on the same level as "regular" countries. Compare their Template:Economy of Europe for instance. This, that and the other (talk) 10:48, 17 January 2025 (UTC)Reply
The English language definition of a country should guide what political entities are considered countries in Wiktionary entry definitions. United Nations membership is one factor, but it is arbitrary and is not the ultimate authority on which political entities meet the English language definition of country; the English language speaker's understanding of "country" is the authority. Taiwan meets the country definition; it also meets other definitions. --Geographyinitiative (talk) 11:02, 17 January 2025 (UTC)Reply
I agree with the point about Taiwan and with Benwing's proposal/Luna's 2nd choice being the best one. MedK1 (talk) 17:01, 17 January 2025 (UTC)Reply
@LunaEatsTuna: Note: Cook Islands and Niue are far from the exact same as the US's COFA, as there are notable differences. The two entities do not have their own citizenship laws and rely on NZ citizenship. They are a part of Realm of New Zealand and New Zealand does not consider them sovereign states. As such, this year, Cook Islands PM Mark Brown confirmed that they do not meet the UN criteria for membership, likely due to the relationship with New Zealand. Notably last month, New Zealand also rejected the Cook Islander request for a separate passport. This is in stark contrast to the COFA situation where the United States provides services to the countries, but has no control over their citizenship and foreign affairs. The citizens of the Marshall Islands, FSM, and Palau are citizens of those respective countries and not U.S. citizens, and they each have their own passports. Hence why they have been admitted to the UN as member states, as they are fully sovereign. They do not belong to or make part of the U.S. I just want to make it clear that the two situations are completely different. AG202 (talk) 15:44, 17 January 2025 (UTC)Reply
The Marshalls, Micronesia, and Palau are free to leave their compact of free association with the United States and not under its sovereignty (tho they are "insular" areas of the United States) and everyone there has citizenship in those three states. No one on Earth has "Cook Islands citizenship": they are all New Zealand citizens a part of the larger realm of New Zealand, just like those on Tokelau. —Justin (koavf)TCM 16:26, 17 January 2025 (UTC)Reply
@Koavf: While they are insular areas, they are not a part of the United States, so there's no way for them to "leave". AG202 (talk) 17:13, 17 January 2025 (UTC)Reply
Exactly my point: they are not part of the United States. They can leave the compact of free association if they want. —Justin (koavf)TCM 17:14, 17 January 2025 (UTC)Reply
Ahhh got it, the phrasing "are free to leave the United States" made it sound like they were a part of it, but I see what you mean now. AG202 (talk) 17:25, 17 January 2025 (UTC)Reply
Yes, I worded that in such a confusing and misleading way that you were perfectly reasonable in correcting me. —Justin (koavf)TCM 17:27, 17 January 2025 (UTC)Reply
Everything that fulfils the three criteria of statehood – population, territory, and government. International recognition is only declaratory, not constitutive (communis opinio). Wikipedia editors only find it relevant as they rest on tertiary sources rather than judge the available material (which we have to anyway to see the linguistic situation of a country realistically rather than just mentioning it indirectly). Again they made up the distinction between de facto and de jure countries.
Apparently they call the three criteria Montevideo checklist or Montevideo criteria for statehood in English. After the Second World War, when the United Nations were instituted, it seems to have been difficult to swallow that these pillars of international law have been formulated in Germany, by Georg Jellinek, → Drei-Elemente-Lehre. Anyway you have a master’s thesis by Ali Zounouzy Zadeh (2012) International law and the criteria for statehood for 60 pages dissertation where all is related to the English-reading audience, and the keywords and titles for more.
Is least controversial: If it looks like a duck, swims like a duck, and quacks like a duck, then it probably is a duck. Taiwan exists, Transnistria exists, Israel exists, not our business if it has a right to exist. Fay Freak (talk) 12:36, 17 January 2025 (UTC)Reply
@Fay Freak Would that definition of yours exclude or include entities that do not claim statehood but function as independent entities? Examples being Puntland and the Wa State; they are essentially countries (in the most neutral definition) with the only difference being they themselves do not use the label and cannot—in most circumstances—enter into foreign relations with other states. LunaEatsTuna (talk) 17:04, 17 January 2025 (UTC)Reply
@LunaEatsTuna: Puntland seems widely described as and even intend to be a federated state – why is this no entry? –, which would be an exclusion criterion since terminologically, and for the present categorization purpose, in English, we highlight countries from states, but this well-meaning opinion—without me visiting it to inspect its administration and legal system—does not appear to hold water since they are at war with their neighbouring states and entertain foreign relations with ministerial delegations, everything with African characteristics. In view of the government, Puntland is more of a state than Somalia is, unsurprisingly, since the latter is synonymous to anarchy, or at least the most common negative example in discussions of libertarianism or anarchocapitalism, for as long as we know the internet. Fay Freak (talk) 17:40, 17 January 2025 (UTC)Reply
I agree with your proposal and the SADR is a state. —Justin (koavf)TCM 16:24, 17 January 2025 (UTC)Reply
Very argumentative Koavf. — Fenakhay (حيطي · مساهماتي) 16:51, 17 January 2025 (UTC)Reply
? —Justin (koavf)TCM 16:54, 17 January 2025 (UTC)Reply
To be a state, you need a territory, a permanent population and state institutions; not just tents in a foreign country. — Fenakhay (حيطي · مساهماتي) 16:57, 17 January 2025 (UTC)Reply
The SADR has all of those things and not "just tents in a foreign country". Please don't spread misinformation. —Justin (koavf)TCM 17:12, 17 January 2025 (UTC)Reply
Hahahahaha. Quite the joke. Hahaha. — Fenakhay (حيطي · مساهماتي) 17:15, 17 January 2025 (UTC)Reply
Stop posting your lies and stop abusing rollback. —Justin (koavf)TCM 17:20, 17 January 2025 (UTC)Reply
Oh look who's upset for spreading misinformation and suppressing other peoples' comments. Get a grasp of reality. — Fenakhay (حيطي · مساهماتي) 17:22, 17 January 2025 (UTC)Reply
"An edit should be reverted if it is clearly and irredeemably nonconstructive". Your comments here are that. Stop posting them. —Justin (koavf)TCM 17:26, 17 January 2025 (UTC)Reply
Wow you can partially read. Please read the whole paragraph :) (this is a discussion) . — Fenakhay (حيطي · مساهماتي) 17:30, 17 January 2025 (UTC)Reply
That is a personal attack and is not relevant to this discussion. I am blocking you from this page for 72 hours. —Justin (koavf)TCM 17:37, 17 January 2025 (UTC)Reply
Ah, misuse of admin powers, classic move. Constructive feedback seems to be in short supply again, doesn’t it? :) — Fenakhay (حيطي · مساهماتي) 17:44, 17 January 2025 (UTC)Reply
@Benwing, Benwing2: as the person who started this thread and a fellow admin, I would like to request that you review this portion to see if it's constructive. Fenakhay, leave me alone. —Justin (koavf)TCM 18:20, 17 January 2025 (UTC)Reply
Koavf's block against me is problematic. As the founder of the SADR WikiProject, he’s clearly not neutral on this topic. I pointed out issues with his arguments and what seemed like a deliberate misreading of the revert rules. Instead of discussing it, he blocked me. There were repeated shouts of "Stop spreading misinformation" which is ironic, as that is exactly what is happening, given that the reality is far from the fiction being portrayed. — Fenakhay (حيطي · مساهماتي) 19:13, 17 January 2025 (UTC)Reply
I don't know much of anything about the SADR, so I'm just going on what is going on here: @Koavf made a rather absolute statement and @Fenakhay disputed it in a rather dismissive way. As a native speaker of Moroccan Arabic, Fenakhay might be expected to have strong opinions on the subject, and Koavf apparently does as well. Koavf responded by being more absolute, and Fenakhay responded by being more dismissive, calling Koavf's statement a joke. Koavf removed Fenakhay's remark, which Fenakhay reverted in order to restore that remark. Koavf then accused Fenakhay of lying and abusing rollback, and attempted to block him.
Neither party can be proud of their actions here, but being dismissive/characterizing a statement as a joke is not on the same level as censoring the other person's remark, characterizing the other person's statements as "lies" and using admin powers to attempt to enforce the censorship. One is expressing the person's own opinion, and the other is trying to prevent someone else from expressing their opinion. Fenakhay's use of rollback rather than simply typing his content in by hand is pretty minor. Koavf's blocking on grounds of "Repeated unconstructive edits, personal attacks, and misuse of rollback" is much worse, and the block reason could just as easily be applied to Koavf, with the substitution of "block" for "rollback".
This subject obviously hits a raw nerve for both parties, and I would ask both of them to step back, take a deep breath, and walk away- at least for a day or two. Chuck Entz (talk) 20:16, 17 January 2025 (UTC)Reply
Thanks. This will be my last comment here. —Justin (koavf)TCM 20:19, 17 January 2025 (UTC)Reply
Why the SADR specifically? It doesn't feel country-like to me, although I'm not familiar with African politics. CitationsFreak (talk) 18:23, 17 January 2025 (UTC)Reply
Fenakhay proposed three qualities, which all apply to them: territory (the Free Zone), a population (the Sahrawi refugees), and state apparatuses (a military, ambassadors, membership in international organizations like the African Union), etc. They also fit the Montevideo Convention requirements. —Justin (koavf)TCM 18:31, 17 January 2025 (UTC)Reply
I didn't propose anything. That was Fay Freak. Stop spreading misinformation. — Fenakhay (حيطي · مساهماتي) 19:15, 17 January 2025 (UTC)Reply
I told you to leave me alone. Leave me alone. Also, stop your lying, as everyone can see that you are lying. —Justin (koavf)TCM 19:56, 17 January 2025 (UTC)Reply
Koavf brought it up specifically, which is why I asked. (Personally, I don't support including it. The UN lists it on the same level as Guam in terms of sovereignty, which we should all agree is not a country.) CitationsFreak (talk) 19:27, 17 January 2025 (UTC)Reply
I presume you mean the list of non-self-governing territories, which is not some kind of index of what is or should be a sovereign state. Also, the UN is not determinative of what is or should be sovereign in the first place. —Justin (koavf)TCM 20:01, 17 January 2025 (UTC)Reply
I agree with Fay Freak and I would support the first choice presented by LunaEatsTuna, to be the most neutral. We could also add a separate list for partially-recognized "countries" to include those that are worth mentioning, like Taiwan and such. — Fenakhay (حيطي · مساهماتي) 16:53, 17 January 2025 (UTC)Reply
  • Much as I hate using this term, disputes over what is and isn't a country...ARE encyclopedic. If enough sources refer to something as a country to pass RfV, I reckon we're bound to call something a country. Purplebackpack89 18:47, 17 January 2025 (UTC)Reply
To define a term, we employ language (our working language English) to describe the thing denoted by the term and not just anything someone has called the thing in the context of the term as long as it happened three times. Otherwise we would declare Comirnaty a bioweapon. And the long list of micronations and reichtard countries would also be countries, with few attested translations though they be. Where and when to put the scare quotes? We would find more references contradicting these statements of course, “enough” sources to the contrary, but why? It is encyclopedic. Collating sources to determine what something is referred to as, rather than what a user of a linguistic symbol comprehends, is encyclopedic. They describe opinions about a thing on Wikipedia, we go directly to what is conceptualized for a term. One just has to employ circumspect language. If there is a problem with the State of Palestine and the West Bank meeting the usual idea of a state, we can just tell the issues with some hedges; a who's who is obviously out of our scope and wouldn’t solve the problem, you do that over at Wikipedia, and despite all the job conditioning for disserting sources it would still be a bad job since it would not discuss the issue of perspective, sources having to be relativized for having limited purposes in mind.
We could also dilute our concept of a country given that we just need vocabulary lists across many languages but then we can as well accept that we have a misnomer in calling a country everything someone can travel to and trade with which practically has different administration a polyglot has to know about, and therefore typically memorizes, roughly and for example. These aren’t disputes, these are different definitions. People live with different understandings of things and thus terms matched by them with them, either variously deranged, depending on which functional environment you take as a litmus test. Fay Freak (talk) 20:24, 17 January 2025 (UTC)Reply
@Benwing, your proposal (or Luna's second choice) seems reasonable to me. If we are (and we are!) a descriptivist dictionary and the point of our lists of countries is to help people find words for things which, descriptively, a significant number of speakers think of as countries, I think it makes sense to be relatively inclusive, and certainly include Kosovo and Taiwan (as long as they still have some recognition, as a sanity check). Western Sahara too is on a lot of maps, and lists of countries, and if it is recognized by dozens of countries, then yes, we should include it (but my instinct would be to list it under that name, Western Sahara, if our lists are mainly using countries' common names; only if we're making a list that has French Republic instead of France [etc] would I list it as SADR). - -sche (discuss) 19:23, 17 January 2025 (UTC)Reply
Thanks! This comment is very constructive and helpful. Benwing2 (talk) 22:24, 17 January 2025 (UTC)Reply
I agree with this. I think a certain amount of common sense is important here. If someone asks, "How many countries are there in the world", what are they likely to have in mind? I think Benwing's proposal captures the usual scope of the word fairly well. Andrew Sheedy (talk) 00:28, 18 January 2025 (UTC)Reply
For reference, see: Wikipedia's list of sovereign states. I think I can support Benwing's proposal. I personally would prefer the UN member (or UN observer) criteria and then separate other entities out on their own line, but as mentioned, that would run into issues with ex: Taiwan and would take up space. I'm a bit hesitant on Cook Islands & Niue for the reasons I've mentioned above + their limited international recognition as independent sovereign states. However, I guess since we're using "country" instead of "sovereign state", it doesn't matter that much. AG202 (talk) 01:15, 18 January 2025 (UTC)Reply
Thanks for the comments. I also thought about using separate lines or footnotes but it gets complicated real fast. Benwing2 (talk) 01:48, 18 January 2025 (UTC)Reply
A country can also include places like Scotland, Wales, Greenland, etc but they aren't sovereign states. 115.188.138.105 11:49, 18 January 2025 (UTC)Reply
True. We know Trump wants to get his hands on Greenland. Hopefully he'll fail. DonnanZ (talk) 09:22, 20 January 2025 (UTC)Reply
How about "in a UN agency, observer state of the UN, or recognized as a state by any of the states listed"? This should cover every state that should be in the list, along with a few stragglers (including the SADR and Cook Islands). CitationsFreak (talk) 09:33, 20 January 2025 (UTC)Reply
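If the criterion above is read recursively (each newly listed state can itself confer recognition), it amounts to a small fixed-point computation. A minimal sketch in Python, assuming a hypothetical `seed` set (UN agency members and observers) and a hypothetical `recognizes` mapping; the function name and the toy data are placeholders for illustration, not real recognition data:

```python
def recognized_closure(seed, recognizes):
    """Return seed plus every entity recognized, directly or transitively,
    by something already in the list."""
    listed = set(seed)
    frontier = list(seed)
    while frontier:
        state = frontier.pop()
        for other in recognizes.get(state, ()):
            if other not in listed:
                listed.add(other)
                frontier.append(other)
    return listed


# Toy data with placeholder names (hypothetical, not real recognition data).
seed = {"A", "B"}
recognizes = {"A": {"C"}, "C": {"D"}, "E": {"F"}}
print(sorted(recognized_closure(seed, recognizes)))  # ['A', 'B', 'C', 'D']
```

Under the non-recursive reading, only one pass over the seed would be needed; the sketch shows the broader reading so that the difference between the two is visible ("D" is only reached through "C").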

T:User lang-1 through T:User lang-5


Is there a reason there are 6 separate templates? I don't see why these can't just be merged into {{User lang}} and we use a switch for the color/category changes. - saph ^_^⠀talk⠀ 15:11, 17 January 2025 (UTC)Reply

They can be. It's just easier to make the five templates than the one to the extent that it requires less technical knowledge. —Justin (koavf)TCM 15:34, 17 January 2025 (UTC)Reply
It's not like it would make it any more complicated, though, really just changing the dash to a pipe:
{{User lang-4|en|This user speaks English at a near-native level.}}
{{User lang|4|en|This user speaks English at a near-native level.}}
Unless I'm misunderstanding what you mean by "technical knowledge." - saph ^_^⠀talk⠀ 15:45, 17 January 2025 (UTC)Reply
See Special:Permalink/83630137 for examples of a merged template I put together. - saph ^_^⠀talk⠀ 16:07, 17 January 2025 (UTC)Reply
I don't know that you're misunderstanding, I'm just answering the question you asked: it's easier to make a template that always looks one way and harder to make a template that changes how it looks based on your input. It's harder still to make one that changes twice based on two inputs. —Justin (koavf)TCM 16:10, 17 January 2025 (UTC)Reply
It only needs one input for proficiency, the names for the CSS classes and for the categories are the same. - saph ^_^⠀talk⠀ 16:14, 17 January 2025 (UTC)Reply
One input: language code. Another input: proficiency level. —Justin (koavf)TCM 16:22, 17 January 2025 (UTC)Reply
I agree. Benwing2 (talk) 00:00, 18 January 2025 (UTC)Reply
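As a rough illustration of the merged-template idea discussed above (this is not the template at Special:Permalink/83630137), the two inputs can drive a single lookup. The sketch below is in Python rather than wikitext; the function name `user_lang_box`, the `userlang-N` class names and the `User CODE-N` category pattern are assumptions made for illustration only:

```python
# Hypothetical set of proficiency levels, mirroring User lang-1 through User lang-5.
VALID_LEVELS = {"1", "2", "3", "4", "5"}


def user_lang_box(level, code, text):
    """Build the pieces a single merged template would need from its two inputs."""
    if level not in VALID_LEVELS:
        raise ValueError(f"unknown proficiency level: {level!r}")
    return {
        "css_class": f"userlang-{level}",    # hypothetical class name
        "category": f"User {code}-{level}",  # hypothetical category pattern
        "text": text,
    }


box = user_lang_box("4", "en", "This user speaks English at a near-native level.")
print(box["category"])  # User en-4
```

The level selects both the styling and the category, while the language code only feeds into the category name, so a single switch on the level would be enough.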

If someone revamps the language templates, can they please also add support for an "inactive=" parameter (or whatever anyone wants to name it) that a bot could set en masse to take inactive users out of the proficiency categories? See vote and 2023 GP. Currently I hover over usernames in the categories with the old Navigation popups to see when they were last active, to figure out who to ping. - -sche (discuss) 01:20, 18 January 2025 (UTC)Reply

Yup, I'm completely with you on this. The main issue is that a lot of users use the built-in parser functions, which we need to work around/prohibit/whatever. Benwing2 (talk) 01:46, 18 January 2025 (UTC)Reply
Yeah, should be trivial to implement on the template end. I think a lot of our Babel templates already have a |nocat= parameter anyway. - saph ^_^⠀talk⠀ 05:06, 18 January 2025 (UTC)Reply
Added. - saph ^_^⠀talk⠀ 06:04, 18 January 2025 (UTC)Reply

Declension tables for Arabic dialects


The Arabic dialects' noun and adjective declension, while limited compared to MSA, is extensive enough that it is not sufficiently represented in the headword. Even if no cases exist, features like feminine constructs and definite article assimilation are simply not represented. So why haven't basic declension tables already been made for them? Verb conjugation tables already exist, and dialectal variation in declension is not great enough to require more than a few tweaks to fit each dialect well. ☆ Vesper (talk) 06:43, 18 January 2025 (UTC)Reply

I'm not sure what you're referring to exactly. It's true that Arabic dialects don't get a lot of love, but neither feminine construct forms nor definite article assimilation is shown in the Arabic script and AFAIK they are quite predictable in most (if not all?) dialects. Benwing2 (talk) 07:54, 18 January 2025 (UTC)Reply
[edit]

User @Victar has removed a link to ज्वल् (jval) at the PIE *ǵwelH- page, reasoning that ज्वलति (jvalati) is already mentioned. My thinking is: given that the pages for Sanskrit roots are there (and a page like ज्वल् (jval) should have its derived terms expanded), it seems logical to have a link to them at the PIE level. Thoughts? Exarchus (talk) 13:26, 18 January 2025 (UTC)Reply

Agreed, if the root-to-derived morphology pipeline is still perceived as productive or at least generally seen to exist in Sanskrit (as it seems to be). — Ceso femmuin mbolgaig mbung, mellohi! (投稿) 15:54, 18 January 2025 (UTC)Reply
What supporting data does Sanskrit ज्वल् (jval) add to PIE *ǵwelH- that ज्वलति (jvalati) doesn't? Roots are just linguistic constructs, and it seems silly to add a descents section to all PIE root pages just for Sanskrit. --{{victar|talk}} 02:50, 22 January 2025 (UTC)Reply
After my edit it already adds more: it points to derived terms that shouldn't be projected back to PIE but were (often) created within Sanskrit.
It also seems simply a matter of consistency: if A refers to B as its ancestor, then it makes sense for B to refer to A as its descendant. Exarchus (talk) 09:26, 22 January 2025 (UTC)Reply
All said derivatives can and should already be on ज्वलति (jvalati). I don't understand your argument on consistency. If we had consistency, we would have Proto-Germanic root pages, like *kul-. --{{victar|talk}} 02:45, 23 January 2025 (UTC)Reply
"All said derivatives can and should already be on ज्वलति (jvalati)." That is not how Sanskrit is currently treated on this site, the primary overview of the derivatives is at the root pages. Also because the derivatives often can't be derived from the present verb, or because there are multiple present verbs.
"I don't understand your argument on consistency." The ज्वल् (jval) page gives *ǵwelH- as its ancestor. So it would make sense to have ज्वल् (jval) as a descendant at the *ǵwelH- page.
"If we had consistency, we would have Proto-Germanic root pages, like *kul-." Would such hypothetical Proto-Germanic roots fit the condition given above by mellohi: "if the root-to-derived morphology pipeline is still perceived as productive or at least generally seen to exist"? Exarchus (talk) 10:06, 23 January 2025 (UTC)Reply
I see you've added a bunch of secondary formations to Sanskrit ज्वल् (jval). Ignoring the unsupported headers, many of those forms could be worked into the declension table for ज्वलति (jvalati), eliminating the need to manually list them.
Sanskrit ज्वल् (jval) is only the direct descendant of PIE *ǵwelH- in an artificial academic sense. And yes, Proto-Germanic absolutely had, as it was put, a "root-to-derived morphology pipeline", which included secondary o-grade causatives and full-grade presents. --{{victar|talk}} 17:54, 23 January 2025 (UTC)Reply

Lashi revamp and on mass-deletions


Hey all. Way back when in 2018-2019, I created a bunch of Lashi (Lacid) lemmas based on {{R:lsi:Luk:2017}} and {{R:lsi:Wannemacher:2011}}. While these are perfectly respectable sources, they obviously do not capture the entirety of the language by themselves, not to mention the fact that they use a scientific, phonological orthography (in the case of Wannemacher, simply IPA) to note down the 30,000-speaker language which actually has both an orthography and a Bible translation in that orthography, both of which seem to be accepted by the speakers.

Now, since I was following the above grammars, we obviously don't follow this orthography, but rather a (dogshit) semi-phonological transcription. I would like to fix this situation, perhaps by contributing again, now more carefully and following written materials (like the above-described Bible translation, as well as primers and other books written in the language over the last years), but the sheer amount of work needed to convert those lemmata that are based on the linguistic works, and may lack the appropriate information I need to actually write these terms down in the orthography (basically I would need to track down every single word in the non-literal Bible translations and pray to God [pun wholeheartedly intended] that they are attested there) is not only frightening, but simply undoable. It would be much easier to start anew, and work from the written materials, rather than from the scholarly ones.

As such, I propose we delete all existing entries in CAT:Lashi lemmas, and start anew. I could write a pronunciation template based on Wannemacher's phonological description, gather some more resources and start consistently working from the written materials. That way our readers will not get confused by the inconsistency in our entries, and we will not have to go through the herculean task of converting every single entry to a written form that is likely not even attested outside of the linguistic analyses.

I would also like to quickly start a second, related topic, which will inevitably come up if we reach a consensus on the first one. We have a number of languages that have a significant number of entries, which makes cleanup difficult, but that desperately need it nonetheless - to name a few notable cases, we have languages that have been massacred by a certain user that is now blocked, a couple of other languages that have been edited by yours truly in their early and stupid years and those whose cleanup has already been discussed not only in the Beer Parlour, but even on Reddit! So my second question is, in case the creator of the entries was indefinitely blocked for the creation of these entries, or if the editor themself agrees that maybe they should've been (like me :P), would it be acceptable to start a discussion on WT:Language treatment requests about the deletion of all lemmas/all members of a subcategory in the given language? Thanks in advance for your input. Thadh (talk) 22:37, 18 January 2025 (UTC)Reply

IMO yes yes and yes to everything you say, and in general I would advocate nuking languages that are FUBAR (Mon seems an especially flagrant example). I have disagreed with you in the past about the idea that IPA or an IPA-like transcription should never be used to represent a language, but obviously it's preferable to use a proper orthography if it exists and is in use, which it sounds like it is. Benwing2 (talk) 23:02, 18 January 2025 (UTC)Reply

Ongoing vandal


User:68.188.203.200 is creating lots of made-up nouns like insecticidality. 2A00:23C5:FE1C:3701:ACCF:6A12:5948:AB6D 01:34, 19 January 2025 (UTC)Reply

https://duckduckgo.com/?q=%22insecticidality%22&ia=web —Justin (koavf)TCM 01:52, 19 January 2025 (UTC)Reply
Checking a sample of the user's last 50 creations, valueful, skeletality, simpliciality, exclamativity, medicinality, Facharztzentrum, Neueröffnung, エコノミクス, inhumatorio and 검붉다 all seem to be attested, though simpliciality may be a non-native speaker's word (or perhaps the reason it seems to show up mainly in technical texts by non-native speakers is just that it's a technical word). abusivity also seems to be uncommon, with modern uses by NNES, and inhumatorio does not seem to be as common as I would expect (is another word more usual for this?). For groupoidality, however, I only spot a single use on Google Scholar, and nothing on Google Books or Archive.org, so I've RFVed that one. (The fact that the user is quickly creating lots of entries in 5+ different languages does also give me some pause.) - -sche (discuss) 07:26, 19 January 2025 (UTC)Reply
The German ones are actively used in speech; the difference in content between edited languages is indicative of legitimate treatment of the material, since no one masters multiple languages equiformly. For charity I tend to assume that the editor is a polyglot with a biomedicine background or similar. Success in some academic fields depends on your ability to juggle vocabulary more than anything. Fay Freak (talk) 21:24, 19 January 2025 (UTC)Reply

Does Unbinding Disinherit?


Can an unbound form be inherited from a bound form as opposed to merely being derived? The question is applicable to the relationship of the Pali participle kārita to as yet unentered antecedent Sanskrit कारित (kārita) ; Monier Williams only records the latter as a bound form. --RichardW57 (talk) 11:39, 19 January 2025 (UTC)Reply

proposed set categories


WARNING: Long post. See also my above post about Category:Religions and Category:Taxonomic ranks. Pinging @-sche and @Ioaxxere as editors who have contributed intensively to prior discussions on topic categories.

I would like to add the following seemingly obvious gaps:

  1. Category:Units of time: millennium, century, decade, year, month, etc. I also propose to include geological time units in this category: eon, era, period, etc. Note that we currently have Category:Units of measure (and Category:en:Units of measure has 632 elements and really should be split, but that can be done later).
  2. Category:Western zodiac signs: Pisces, Sagittarius, etc. Bizarrely, we have Category:Chinese zodiac signs but no category for Western zodiac signs. We have 8 Template:list:Western astrology signs/CODE lists that can populate these categories.
  3. Category:Religious texts: We have three Template:list:religious texts/CODE lists giving names of religious texts/scriptures of various religions. Probably they should go in their own category.
  4. Category:French Republican calendar months: Yeah these are funny but they exist and we have three Template:list:French Republican Calendar months/CODE lists specifying them for different languages.
  5. Category:Poetic meters: Category:en:Poetry has 447 members, ugh. We have three misnamed lists Template:list:poetic meter/CODE that can populate these categories to begin with. Many terms for poetic meters like iambic pentameter and heroic verse are either only in Category:en:Prosody or nowhere.
  6. Category:Types of electromagnetic radiation: gamma ray, X-ray, visible light, microwaves, radio waves, infrared, ultraviolet, etc. I have been trying to avoid putting the type of topic category into the name of the category but here I don't see any alternative, as Category:Electromagnetic radiation by itself suggests a related-to category. Maybe this should be called Category:Electromagnetic waves but I dunno if this is a term of art in physics. Note that we have three lists Template:list:electromagnetic radiation/CODE listing types of electromagnetic radiation in different languages.

I would like to make the following splits/renames:

  1. Category:Lunar months should be deleted and split into categories for specific calendars:
    1. Those in Category:en:Lunar months should almost all be moved to Category:Hindu lunar calendar months, which should be renamed. Discussion in Discord has concluded that it should be either Category:Hindu calendar months or Category:Vikrami calendar months, but it's not clear which is better. The problem with Category:Hindu calendar months is that what is essentially the same calendar and should probably be unified with it is also used by Muslims in Bengal and Punjab, and the problem with Category:Vikrami calendar months is that per Wikipedia, this refers only to a subset of all the Hindu calendar variants (although some editors on Discord say it is also used in a wider sense, incorporating all the Hindu calendar variants). Wikipedia is somewhat schizophrenic; for example, w:Assamese calendar is considered a Hindu calendar, at least by its categorization, while w:Bengali calendar is not, but they are virtually the same. (This is probably because Assam is majority-Hindu while Bengal is majority-Muslim.) I personally think "Hindu calendar" is fine; note by comparison that Western digits are normally called "Arabic numerals" or "Hindu-Arabic numerals" even though the majority of users are neither Arab nor Hindu (and for that matter, most Hindus and many Arabs use different numerals); the term reflects its origin rather than its current use. An alternative, I suppose, is "South Asian calendar", but this term does not appear to be in use in this sense. (IMO what matters most for unification of these different calendar variants is whether the month names are cognate with each other, which they almost always are. Some variants in fact are solar, some are lunar and some are lunisolar, and some have different starting points, but these distinctions are not decisive.)
    2. Category:ban:Lunar months should be moved to Category:ban:Balinese calendar months, as the Balinese calendar seems quite different from others (there are actually two Balinese calendars but only one has months, it seems).
    3. All the Southeast Asian Buddhist calendars seem to be based on the Hindu calendar and should probably be merged into it. Alternatively, place them in subcategories of Category:Buddhist calendar months.
    4. The only remaining language with months in a subcategory of Category:Lunar months is Zulu, which should provisionally get its own Category:zu:Zulu calendar months. There are several lunar/lunisolar calendars used in Africa, and some of them are likely unifiable, but I don't know which ones.
  2. Category:Books of the Bible: Split into Category:Books of the Old Testament and Category:Books of the New Testament. Probably delete Category:Books of the Bible, since AFAIK there are no books of the Bible that can't clearly be categorized into Old or New Testament. Besides the fact that the Old Testament and New Testament are associated with different religions, on a practical level Category:en:Books of the Bible has 106 members and Category:zh:Books of the Bible has 278 members. Note that we also already have 8 Template:list:books of the New Testament/CODE lists, 6 Template:list:books of the Catholic Old Testament/CODE lists and 6 Template:list:books of the Protestant Old Testament/CODE lists. (FYI the latter is essentially the same as the books of the Jewish Tanakh.)
  3. Category:Greek deities: Split into something like Category:Greek mythology Olympian deities, Category:Greek mythology Muses and Category:Greek mythology Titans. Any that don't fit into these subcategories can stay in the supercategory. Category:en:Greek deities has 226 members and we already have 6 Template:list:Greek mythology Olympian gods/CODE lists, 5 Template:list:Greek mythology Muses/CODE lists and 2 Template:list:Greek mythology Titans/CODE lists.
  4. Category:Fingers: Split out Category:Names of fingers (or Category:Types of fingers or just Category:Terms for fingers? Is "ring finger" the name of a finger or a type of finger?). Same issue as Category:Types of electromagnetic radiation but no obvious naming alternative. Category:Fingers is a mixture of terms for names or types of fingers (ring finger, middle finger, thumb, ..., along with more colorful terms like leech-finger, pussy finger and tall man) and terms related to fingers such as polydactyly and fingernail. We have 9 lists of types of fingers in Template:list:fingers/CODE as well as Template:list:fingers-humorous/pt listing humorous names for fingers in Portuguese.
  5. Category:Size: Split out Category:Sizes, ranging from itsy-bitsy and extra-small to humongous and ginormous. Category:en:Size has 163 members.

Benwing2 (talk) 08:48, 20 January 2025 (UTC)

Support, although re books of the Bible we should also consider how to handle books of the Tanakh, which overlap with books of the Old Testament. At present, it seems like we just don't categorize books of the Tanakh at all(?), e.g. נחמיה is categoryless; it seems suboptimal to only categorize books of the Jewish Tanakh as "Old Testament" (since that's the Christian POV naming/framing of the Jews' books/religion as only being the old half of the whole story, though if any of our Jewish editors want to weigh in and say they DGAF, I defer to them), but it could be redundant to double-categorize a lot of books as both OT and Tanakh, but if we only have a "Category:Books of the Tanakh", it misses books which Christians regard as OT but Jews don't regard as Tanakh (the Apocrypha). Maybe we have "Books of the Tanakh", and then double-categorize that category into "Category:Books of the Old Testament" (also putting Christian apocrypha directly into that category) and "Category:Judaism"??
Re Hindu calendar months, if both Hindu and Muslim Bengalis use the same names for months, and those months are originally/mainly from the/a Hindu calendar, then my initial reaction (like yours) is to just call them "Hindu calendar months" and not sweat it—thinking of how e.g. lots of Malaysians, including Muslims and Malays, celebrate Chinese lunar new year, without necessitating calling it Chinese-and-Muslim-and-Malay new year. But if Hindu and Muslim Bengali-speakers use different month names, or just object to the calendar being called a Hindu calendar, is it a problem to have both a "CAT:hi:Hindu calendar months" and a "CAT:bn:Bengali calendar months" as (ultimately) subcategories of "CAT:Months"?
- -sche (discuss) 21:12, 20 January 2025 (UTC)
@-sche Thanks for your comments. I mention above that AFAIK the Protestant Old Testament and Jewish Tanakh have the same books. I agree it may be a bit strange to call נחמיה a "book of the Old Testament" (although that's exactly what our definition says) but I'm not sure we need to double-categorize; maybe we can call it "Books of the Old Testament and Tanakh" or "Books of the Old Testament and/or Tanakh" or something? Granted that the books of the Apocrypha are not in the Tanakh but IMO should still go in that category. There are also things like the Book of Enoch, not considered canonical by most Christians and Jews (but canonical for Ethiopian Jews and Ethiopian and Eritrean Orthodox Christians), which probably should go in the category as well.
As for the month issue, yeah my original thought was also to have a distinct Category:Bengali calendar months. But I ran into the issue of Template:list:Sylheti calendar months/syl, where per Wikipedia there isn't even a distinct Sylheti calendar (or at least, the page on it was deleted as being fictional), and months like ꠎꠂꠑ (zoiṭó) are given simultaneously as months in the Assamese, Bengali and Sylheti Hindu calendar (and where the Assamese calendar seems not to differ from the Bengali calendar). This led me to conclude that if we start classifying each calendar as different, we could end up triple or quadruple classifying a lot of terms, which would just be confusing. Benwing2 (talk) 21:39, 20 January 2025 (UTC)Reply
"AFAIK the Protestant Old Testament and Jewish Tanakh have the same books." Kind of: e.g. Book of Ezra and Book of Nehemiah are a single piece of literature in the Tanakh/Jewish Bible. —Justin (koavf)TCM 22:35, 20 January 2025 (UTC)Reply
This is a lot to respond to, so it should probably be separate threads or proposals, but
Gaps:
  1. Category:Units of time: Agreed
  2. Category:Western zodiac signs: Agreed
  3. Category:Religious texts: Agreed
  4. Category:French Republican calendar months: Agreed
  5. Category:Poetic meters: Agreed
  6. Category:Types of electromagnetic radiation: Call this Category:Electromagnetic spectrum. It's easier, more intuitive, and does not include all extraneous electromagnetic phenomena.
Splits/renames:
  1. Category:Lunar months should be deleted and split into categories for specific calendars: Agreed
  2. Category:Books of the Bible: Split into Category:Books of the Old Testament and Category:Books of the New Testament. Strong disagree. This runs into all kinds of issues with the Jewish Bible structuring, deuterocanon, and Ethiopian/Eritrean extended canon. It's not necessary and more trouble than helpful.
  3. Category:Greek deities: Weak disagree, as 226 is a navigable amount, but I'm open to clearly defined subcategories like the muses.
  4. Category:Fingers: Weak agree
  5. Category:Size: "Split out Category:Sizes, ranging from itsy-bitsy and extra-small to humongous and ginormous. Category:en:Size has 163 members." Okay, but what wouldn't be in the new category? I guess shoe size and maybe a few general terms related to the concept of sizing, but galactic and S and most of the other things in here are actual measures of sizes (formal or informal), so I don't think this would help much and 163 is a perfectly navigable category.
As an aside, what is the value of discussion in Discord? Why is any policy discussion happening off-wiki and not recorded here? —Justin (koavf)TCM 22:34, 20 January 2025 (UTC)Reply
Thanks for your comments. Discussion on Discord can happen in real time, and it is much better suited for conversations with extensive back and forth discussion than Wiktionary's forums are. It is easy to ask a question and get an immediate response, which tends not to happen on Wiktionary itself. The discussion about the Hindu calendar occurred in the #indo-iranian channel on Discord over maybe a total of 30 minutes; nothing like that could happen on Wiktionary. As for the splits and renames:
  1. I'm not sure what your issue is with splitting "Books of the Bible". You mention issues about canonicality but is there actually an issue with an Old/New split with inclusive policies as to what goes in (basically anything considered canonical by any major denomination)? Are there books where there is a question whether they are considered Old or New? The canonicality issues are the same whether we have a single "Books of the Bible" category (which IMO is very Christian-biased in a way that an Old/New split isn't) or two categories. Also keep in mind that we have lists like Template:list:books of the Protestant Old Testament/en and Template:list:books of the Catholic Old Testament/en that can help with specifying what is considered canonical by which group.
  2. Category:Size has terms like blow up, procerity, grow, pipsqueakery, lobsterling, long drink of water and other randomness that are not a specific size but just have some vague relation to size. It's a related-to category, which allows for random stuff like this, which a set category would not. After splitting out Category:Sizes, we could potentially merge Category:Size and Victar's Category:Quantity, which are not obviously distinct.
  3. As for Category:Greek deities, IMO categories esp. set categories should not have more than 100 or so members unless they're clearly all of the same type.
Benwing2 (talk) 23:21, 20 January 2025 (UTC)
Thanks yourself.
"Are there books where there is a question whether they are considered Old or New?" The extended canon of the Tewahedo churches does not fit this Old/New Testament split.
"The canonicality issues are the same whether we have a single "Books of the Bible" category (which IMO is very Christian-biased in a way that an Old/New split isn't) or two categories." I think it's the other way around: having an "Old Testament/New Testament" divide is a Christian notion, but Jews in common language could use the word "Bible" with others (tho "Tanakh" or "the Law" or something is more common when practitioners of Judaism are discussing scripture among themselves).
"As for Category:Greek deities, IMO categories esp. set categories should not have more than 100 or so members unless they're clearly all of the same type." That's a nice goal maybe, but there are just some sets that have more than 100 things. I don't see that as a problem really. If a clearly defined set that is actually useful for language (e.g. not something like "even numbers" or "objects with mass" or something) has a thousand members, it's still very useful to categorize them together. —Justin (koavf)TCM 00:00, 21 January 2025 (UTC)Reply
I am speaking as someone who has both a Jewish and Christian background, and who considers himself Jewish. The New Testament is a Christian addition to the Jewish scriptures; grouping them together as the "Bible" is a Christian concept that is foreign to Judaism. Imagine if we did not have a "Books of the Bible" category but instead had a category "Books of the Quad Combination" or "Books of the Standard Works" (see w:Standard Works) that indiscriminately grouped together the books of the Christian Bible along with those of the Book of Mormon, the Doctrine and Covenants and the Pearl of Great Price. And someone argued on technical grounds against splitting them into distinct categories? Surely Catholics and Protestants would object? Benwing2 (talk) 00:34, 21 January 2025 (UTC)Reply
Sorry if I'm stupid here, but I'm not following. As I intended to say earlier, the notion of a "New Testament" appended to prior Jewish scripture is a Christian notion, for sure. But what is the point you're making with the Mormon analogy? Calling the Jewish scripture the "Old Testament" is still just using the Christian terms for that tradition's holy literature. —Justin (koavf)TCM 03:25, 21 January 2025 (UTC)Reply
@Koavf Did you see my response to -sche? I proposed a category name "Books of the Old Testament and Tanakh" or "Books of the Old Testament and/or Tanakh" or similar, since the Old Testament is just the Christian interpretation of the Jewish scriptures (modulo some argumentation over what is considered canonical, which led to certain books being excluded from the Tanakh but included in the Catholic Old Testament, among other things). My point is that insisting on combining the Old Testament/Tanakh with the New Testament and calling it the "Bible" is objectionable to Jews in the same way that grouping the Bible with other Mormon scriptures and calling the resulting amalgamation by the Mormon name would be objectionable to non-Mormon Christians. Having separate categories is less biased. We could have two categories "Books of the Tanakh" and "Books of the Bible" but that would lead to double categorizing practically all of the Tanakh books, which I think would be more confusing than anything else. As for the extra Ethiopian Tewahedo "Church Order" books, IMO including them in a "Bible" category could well be considered biased in the same way as including Mormon-specific scripture in a "Bible" category would, since they are clearly not either New or Old Testament and postdate both. Benwing2 (talk) 08:28, 21 January 2025 (UTC)Reply
But it would be biased to not include them in "Books of the Bible", since they are. It's not for us to decide canonicity in the Bible. I can't speak to Jews being offended at the term "Bible" being used to refer to the common scripture tradition, but in my limited experience, it's just a word used for convenience's sake and not offensive. Others' mileage may vary, clearly. —Justin (koavf)TCM 09:01, 21 January 2025 (UTC)Reply
Doing a quick search, I found (e.g.) this:
FWIW, in academic discourse, "Bible" can be understood more restrictively as the "Hebrew Bible" when used by Jewish authors or in Jewish contexts, while it can be understood more expansively as (some form of) the Christian Bible ("Old Testament" + "New Testament") when used by Christian authors or in Christian contexts. Biblical scholars find this fluctuation in usage quite natural.
Also FWIW, the Jewish Study Bible refers to the New Testament without any impulse to put it in scare quotes. The Preface includes some comments that reflect tangentially on OP's question on page x.
So that aligns with what I think is generally true, but I appreciate that this is anecdotal. —Justin (koavf)TCM 09:04, 21 January 2025 (UTC)Reply
I really think this is anecdotal as it does not align at all with my experience as a Jew, and regardless of the term, the category itself is biased as containing all and only what Christians consider canonical (and even then only certain Christians; for some reason I can't understand, you insist on counting Ethiopian Christian-specific scriptures as canonical but not Mormon-specific scriptures). I have to say, having a discussion with you is exhausting and about as pleasant as a root canal, since you keep ignoring the main thrust of my argument and cherry-picking things to respond to. For this reason I am not going to engage any more with you. Your disagreement with splitting is registered, but you don't get a liberum veto in case of consensus in favor of splitting. Benwing2 (talk) 09:17, 21 January 2025 (UTC)Reply
Outside of Christian contexts I tend to refer to the Tanakh as the "Hebrew scriptures", though the presence of Aramaic in two books renders that not completely correct. I've seen "scriptures" used in a number of non-Christian contexts, but I think Christians would recognize it as referring to the Bible. The Apocrypha are kind of weird as part of the Judaic tradition and included in the early Jewish Septuagint translation, but not accepted as canonical by Judaism. The older Christian denominations accepted them but the Protestants didn't. As for anecdotal evidence: there are so many denominations just in Christianity that you can find someone who will agree with just about anything. While that isn't as true in Judaism, I'm pretty sure that the mere name of the "Jewish Study Bible" would be fairly effective at selecting for those who aren't offended by references to "the Bible". I'm sure if there were a publication with "MAGA" in the title, you would find the people there in favor of Trump... Chuck Entz (talk) 14:36, 21 January 2025 (UTC)Reply
I really have no idea what I wrote that was so off-putting, nor did I ever reject your claim about Mormons and Mormonism. —Justin (koavf)TCM 16:06, 21 January 2025 (UTC)Reply
Why not have every book of the Bible that some denomination of Christianity considers canon, including stuff like the Book of Mormon and the Church Order? Still with the Old/New Testament split (along with the others), of course. That would be the most neutral. CitationsFreak (talk) 20:46, 21 January 2025 (UTC)
I think that's the best solution and what I would argue for. I appreciate that Benwing is not interested in discussing and I don't want to speak for him, but I think that's consistent with what he was saying before and I would argue for this solution being consistent with that. —Justin (koavf)TCM 04:20, 23 January 2025 (UTC)Reply
Maybe the most neutral / simplest thing is to give Christianity and Judaism their own (sub)category(s), even if there is overlap, i.e. have both "CAT:Books of the Tanakh" and "CAT:Books of the Old Testament" even if some (but not all!) entries will be in both categories? After all, we correctly have Apollo as both a Greek god and a Roman god, and we regularly write things like {{lb|en|transitive|intransitive}}, putting (a single definition of) a verb into both the "transitive verbs" and "intransitive verbs" category: that there are entries in both categories is OK if both categories apply.
"Books of the Bible" would hold "CAT:Books of the Old Testament", "CAT:Books of the New Testament", and any entries that don't fit into one of those subcategories (if, as mentioned above, any Tewahedo books don't). - -sche (discuss) 22:21, 21 January 2025 (UTC)Reply
Category:Size feels like it should be a thesaurus, unless the idea is for it to have specific size standards like clothing sizes. Ioaxxere (talk) 18:30, 22 January 2025 (UTC)Reply

splitting Category:Music

Category:en:Music has 3,870 terms (?!). We have Category:en:Musical notes, which has things like sixteenth note and quaver (I would have thought F-sharp is a musical note as well, but apparently not), but we need a shitload more set categories. I suggest:

  1. Category:Musical notes: Rename to Category:Musical note values or Category:Musical note durations. This holds terms for note durations such as whole note, half note, quarter note, double whole note and corresponding British terms like quaver, crotchet, breve. (FYI "note value" is Wikipedia's term; I have studied Classical piano for 12+ years and have not obviously encountered this term. I just ambiguously call them "notes" but I didn't really study music theory super formally in school.)
  2. Category:Musical rests: This is the corresponding set of rests to the notes/note values/note durations of the previous category: half rest, quarter rest, British terms like crotchet rest, etc.
  3. Category:Musical tones or repurposed Category:Musical notes or Category:Musical scale notes: F-sharp, D-flat, E, also solfège notes like do/ut, re, mi, fa, sol, la, ti/si.
  4. Category:Musical keys: F-sharp minor, E-flat major, etc.
  5. Category:Musical scales: major/major scale, minor/minor scale, natural minor scale, harmonic minor scale, melodic minor scale, pentatonic/pentatonic scale, whole-tone scale and many others.
  6. Category:Musical modes: Ionian, Dorian, Phrygian, Lydian, Mixolydian, Aeolian, Locrian and many variants like hypolydian, hypomixolydian, etc.
  7. Category:Musical chords: major triad, dominant seventh chord, half-diminished seventh chord, ninth chord, Picardy third, Neapolitan chord (BTW we are missing Neapolitan sixth/Neapolitan sixth chord) etc.
  8. Category:Musical intervals: fourth, fifth, sixth, augmented sixth, diminished sixth, tritone, octave, etc.
  9. Category:Musical clefs: treble/treble clef (aka G clef), bass/bass clef (aka F clef), tenor clef, alto clef (both types of C-clef/C clef), etc.
  10. Category:Musical tempos: largo, lento, adagio, andante, andantino, moderato, allegretto, allegro, vivace, presto, prestissimo, etc.
  11. Category:Musical dynamic values or some related name: pianissimo, piano, mezzo piano, mezzo forte, forte, fortissimo, fortississimo, etc.
  12. Category:Musical articulations: legato, staccato, sforzando, pizzicato, glissando, arpeggiato, etc.
  13. Category:Musical tempo changes: accelerando, ritardando, rallentando, stretto, stringendo, rubato, meno mosso, più mosso, etc.
  14. Category:Musical dynamic changes: crescendo, decrescendo, diminuendo, smorzando, calando, morendo, etc.
  15. Category:Musical vocal ranges (update: we have Category:Musical voices and registers): soprano, alto, tenor, bass, mezzo soprano, baritone, basso profundo, treble, countertenor, etc.
  16. Category:Musical time signatures (update: we have Category:Musical meters): four-four time, three-four time/three-quarter time, cut time/alla breve, common time, six-eight time (which we are missing), twelve-eight time (likewise), etc.
  17. Category:Musical ornaments (maybe there is a better term): trill, shake, mordent, tremolo, vibrato, slide, arpeggio, etc.
  18. Category:Musical mnemonics: FACE, EGBDF (= every good boy does fine, every good boy deserves fudge, and other variants), ACEG (= all cows eat grass, all cars eat gas, etc.).

There are surely others (e.g. we need to split Category:Musical instruments even more than it currently is), but this comes to mind first. Benwing2 (talk) 10:27, 20 January 2025 (UTC)Reply

CAT:Musical genres? Vininn126 (talk) 10:31, 20 January 2025 (UTC)
We have it, you just misspelled it :) Benwing2 (talk) 10:35, 20 January 2025 (UTC)
I see my coffee and ADHD meds haven't fully woken me up yet... Vininn126 (talk) 10:37, 20 January 2025 (UTC)
lol! Benwing2 (talk) 10:38, 20 January 2025 (UTC)

OK some more:

  1. Category:Musical composition forms (need better name): song, instrumental, rock opera, ... (for popular music); prelude, etude, sonata, symphony, fugue, cantata, toccata, fantasy, mass, requiem, opera and a zillion others for classical music
  2. Category:Musical composition parts (need better name): verse, chorus, bridge, fadeout, ... (for popular music); movement, trio, exposition, recapitulation, coda, etc. for classical music
  3. Category:Musical chord progressions: Axis progression, backdoor progression, circle progression, 50's progression/'50s progression/ice cream changes/doo-wop progression, twelve-bar blues, eight-bar blues, etc.

Benwing2 (talk) 20:24, 20 January 2025 (UTC)

I agree subcategorization is needed! I am hesitant about whether people will grasp and maintain the intended distinctions between all of "tempos, dynamic values, articulations, tempo changes, dynamic changes, ornaments". Do we think the average user adding e.g. an -issimo term a year from now will expect/grasp that it goes in one category if it's more similar to prestissimo "very quickly", but a different category if it's more similar to fortissimo "very loudly", and intuit that a term like allegretto vs one like accelerando, or a term like accelerando vs crescendo, vs vibrato vs staccato, go in five separate categories? I am unsure. Maybe anyone who is dealing with musical terminology does perceive these as fundamentally different categories of direction and will have no problem maintaining the distinctions. But I wonder whether some of these should be consolidated into something like "musical directions" / "musical directives", or "musical changes" grouping tempo and dynamic changes. - -sche (discuss) 20:41, 20 January 2025 (UTC)Reply
Thanks for opening this topic. I generally agree with your decisions. I am fairly knowledgeable in classical music and you're welcome to contact me on Discord with questions. A few naming suggestions: 1. Musical note durations (or values is fine). 3. Musical pitches. 7. Name is okay but a bit awkward, maybe just Chords or Musical chord types. 11. Musical dynamics. Fold 13 and 14 into Musical directives, as outside of the most obvious set, there is a lot of subjectivity involved (some terms may indicate changing both tempo and dynamic at once, or are ambiguous/up to interpretation). I wouldn't mind if a word goes in both Musical directives as well as another category.
Re: items 1 and 2 for the second list, I feel it would be sensible to split classical from popular, partly because their meta-terminology (i.e. what "form", "genre", "part", etc. even mean) is quite different, and because I think users are more likely to want to use a list specific to their domain of interest than a miscellaneous one. If not, then Musical composition types and Musical composition sections. Since the situation is tricky, I don't have fully-formed thoughts right now on split-classical/popular category names beyond Classical music composition types/sections. I'll note that classical "genre" often encompasses many but not all items of 1(c) (whereas popular "genre" encompasses e.g. reggaeton and sludgecore); classical "form" encompasses sonata form, rondo, strophic, binary form, etc., and the two blur and are often confused; and classical composition "parts" without other context often refers to a musician's role e.g. "the violin part" – see first 3 bullets at Part.
I'm not that concerned about people grasping this setup. With a good structure (which Benwing has proposed), I feel confident that average users can either look at similar entries and copy by analogy, or otherwise do whatever and it's not the end of the world for entries to await a bit of cleanup. Hftf (talk) 21:28, 20 January 2025 (UTC)Reply
Thanks! I agree with all your suggestions, and I think splitting classical from popular in the first two items of the second list makes sense. The third item about chord progressions also seems mostly to refer to popular music; classical music seems to speak of cadences, which maybe should be a different category (see w:Cadence for an exhaustive discussion). @-sche I think that in practice, people familiar with music theory will understand the distinctions of the categories, esp. if the descriptions are clear, and they are more likely to be the ones adding music terms. I suspect people not familiar with music theory who add a music term will be more likely to just put it under Category:Music, regardless of whether we have a more ramified category tree or a less ramified one. Having a more ramified tree has the advantage of allowing people to more easily grasp the distinctions between categories if they're not clear about them, and there are more than enough terms to fill even a highly ramified tree. Benwing2 (talk) 22:01, 20 January 2025 (UTC)Reply

Category:Musicians: rename and add Category:Musical artists?

This category is for types of musicians (drummer, pianist, etc.) but manages to also include Beatles. We should maybe rename Category:Musicians -> Category:Types of musicians and create Category:Musical artists for things like Beatles and Rolling Stone (which are very questionable as entries, but ...). Under Category:Musical artists could go Category:Taylor Swift and Category:Justin Bieber (yes these damn categories exist) instead of having them directly under Category:Music. We probably need a Category:Beatles since we have Beatledom, Beatlemania, Beatles-esque, Beatle cut, Beatlehead, and others. Benwing2 (talk) 10:44, 20 January 2025 (UTC)Reply

"Old" and "Orkhon" Turkic, plus some more

Current coverage of Pre-Islamic (+ Karakhanid) Turkic languages is quite shoddy across Wiktionary. Here's what I mean:

  1. Academic coverage of 'Old Turkic' spans around 4 centuries [8th-11th] (Orkhon T., Old Uyghur with early Karakhanid texts (Kutadgu Bilig, Divan Lughat at-Turk) marking the literary end for this term[1][2][3]), yet in Wiktionary, the 'Old Turkic [otk]' label is used specifically for Orkhon (or Inscriptional) Turkic attested around the 8th and 9th centuries, which is simply not ideal.
    1. This leads many new editors to use Orkhon script for all 'Old Turkic' terms, which is quite misleading (since the terms in ""Runic"" script constitute the least amount of examples for this language.)
  2. Currently, 'Old Turkic' and Old Uyghur are shown under the Siberian branch, while Karakhanid is shown as a Karluk language. This may not be reflective of how these languages should be listed ideally[2] (according to Tekin.) That image would also need us to rewrite or agree upon a significantly different family tree than what we have now.
  3. Currently, descendants from Proto-Turkic follow rigid "family branches", like "Karluk", "Kipchak" or "Oghuz". I oppose this classification, which leads to some fringe and uncategorizable cases like Salar, Pecheneg, Western Yugur and (Northern/Southern) Altai, from what I understand.
  4. Currently we only have one language tag for Bulgar [xbo], but we have two different 'versions' of Bulgar, Danube- and Volga-[4]. Perhaps we can employ [xbo-dnb] and [xbo-vol] for these too, though I am reluctant to make this change.

For those I propose this classification:

  • Proto-Turkic [trk-pro]
    • West Old Turkic [*trk-wes] (PROPOSAL) (also listed as 'Proto-Bulgaric')
      • Khazar [zkz] (also listed as 'Kuban Bulgar')
      • Danube Bulgar [xbo], [*xbo-dnb] (PROPOSAL)
      • Volga Bulgar [xbo], [*xbo-vol] (PROPOSAL)
        • (...)
    • Common Turkic [trk-cmn]
      • East Old Turkic [*trk-eas] (PROPOSAL)
        • Orkhon/Inscriptional Turkic [otk]
        • Yenisei Kyrgyz [otk-kir]
        • Old Uyghur [oui]
        • Karakhanid [xqa]
        • (...)
          • (...)

Specifics can be decided on later, but this is the main frame I am going with for Proto-Turkic genealogy. What do you think (pinging users from [3])? @BurakD53 @Allahverdi Verdizade @Yorınçga573 @Blueskies006 @Ardahan Karabağ @Bartanaqa @Samiollah1357 @Zbutie3.14 @Rttle1

  1. ^ Erdal, M. (2004). A grammar of Old Turkic. BRILL. pp. 6-22
  2. ^ Johanson, L., & Csató, É. Á. (2021). The Turkic languages (2nd ed.). Routledge. p. 132 DOI: 10.4324/9781003243809-8
  3. ^ Tekin, T., & Ölmez, M. (2003). Türk Dilleri: Giriş (2nd ed.). Yıldız. pp. 18-28
  4. ^ Tekin, T., & Ölmez, M. (2003). Türk Dilleri: Giriş (2nd ed.). Yıldız. pp. 28-31

AmaçsızBirKişi (talk) 11:51, 20 January 2025 (UTC)Reply

1. It is not particularly important whether the Old Turkic language is labeled as "Orkhon Turkic" or "Old Turkic." Ultimately, when we refer to otk, we know what we mean. However, it would be more accurate to call it "Orkhon Turkic" for descendants. The label can be changed to "Orkhon Turkic" in the descendants list. Calling it "Inscriptional Turkic" would be incorrect, as we list Yenisei Kyrgyz inscriptions separately in the descendants list. Those are also inscriptions.
2. Karakhanid Turkic can indeed be considered a continuation of Old Turkic, but it diverges from it at a certain point. Its grammar does not entirely align with that of Old Turkic. While Old Turkic and Siberian languages fall under the "olur-" group, Karakhanid Turkic remains in the "o(l)tur-" group. I see no issue in classifying it as the initial stage of a separate branch.
3. Salar can historically be traced back to an Oghuz tribe. Salar Turkic also exhibits lexical features characteristic of Oghuz languages. For instance, using sağ instead of oň for the direction "right" is unique to Oghuz languages, and Salar aligns with this. Similarly, using dudak instead of erin for "lip" is unique to Oghuz languages, and Salar aligns here as well. Along with numerous other lexical features not listed here, Salar conforms to Oghuz languages. Morphologically and grammatically, it differs significantly not only from Oghuz languages but also from other Turkic languages. However, it can still be traced back to Proto-Oghuz. Even though it has been heavily influenced phonetically by Chinese and Tibetan, there is no obstacle to considering it part of the Oghuz branch; on the contrary, there is evidence to support this classification.
4. I agree. If otk-kir exists and does not refer to a separate language but directly redirects to otk, then xbo-dnb and xbo-vol can also directly redirect to Bulgar.
For those I propose this classification:
* Pre-Turkic/Proto-Turkic [trk-pro]
** Proto-Bulgaric
*** Khazar [zkz]
*** Danube Bulgar [xbo], [*xbo-dnb] (PROPOSAL)
*** Volga Bulgar [xbo], [*xbo-vol] (PROPOSAL)
(...)
** Proto-Common-Turkic [trk-cmn]
*** Siberian:
**** Orkhon Turkic [otk-ork] (PROPOSAL)
***** Yenisei Kyrgyz [otk-kir]
***** Old Uyghur [oui]
*** Karluk:
**** Karakhanid [xqa]
***** Khorezmian
****** Chagatai [chg]
(...)
(...) BurakD53 (talk) 13:02, 20 January 2025 (UTC)Reply
Your classification of Old Turkic/Karakhanid seems more accepted, and I guess we need not do away with family branches completely either. If no one objects to that, we can implement a version of your scheme in the Descendants section from now on.
However, 'probing' this question revealed yet another can of worms: what about the rest of the family tree? As @Zbutie3.14 pointed out, many PT pages are plagued with inconsistent and incorrect branch names, and there's no single, fully agreed-upon scheme for all pages to follow. Maybe a bot can be used for this, or we can use {{desctree}}, as Indo-European does, for standardization (this would necessitate creating new language headers like "Karluk Turkic", "Proto-Bulgar" etc.).
AmaçsızBirKişi (talk) 07:52, 21 January 2025 (UTC)Reply
I can help with this if (a) everyone agrees on a branching scheme, (b) someone makes a list of all the mappings from incorrect branches to correct branches and how to reorganize the incorrect branches. Benwing2 (talk) 08:03, 21 January 2025 (UTC)Reply
Thank you. We already have a lead at WT:ATRK#Descendants, but that list is going to change soon I believe. It would be nice to use more templates across RC:PT pages to make pages more uniform. AmaçsızBirKişi (talk) 08:36, 21 January 2025 (UTC)Reply
Let me know. I would definitely advise using {{desctree}} when possible, simply to avoid duplication if nothing else. Benwing2 (talk) 09:23, 21 January 2025 (UTC)Reply
Personally I strongly prefer tables over trees; I would like it if we had the look of the current Proto-Turkic pages but with the uniformity of using a template. Zbutie3.14 (talk) 21:13, 21 January 2025 (UTC)Reply
I support having Old Turkic as a larger category above the rest, so it would look like
  • Old Turkic:
    • Orkhon Turkic
    • Old Uyghur
    • etc
Would we need a bot to go through everything and change it? Also, most Proto-Turkic pages still have north/west/south for Kipchak, which I started a Tea room discussion about earlier in December and you replied, so I guess we should deal with that too if we're going to go through and change things.
Salar is Oğuz for sure.
From what I know, most Turkic languages fit into the current category system very well, so a few languages being hard to fit doesn't mean we need to change the whole system.
I don't know anything about the rest. Zbutie3.14 (talk) 18:20, 20 January 2025 (UTC)Reply
The points you raise are valid, but I agree with @BurakD53's classification more. See below:
• If we place Karakhanid under Old Uyghur and Old Turkic, won't this make modern Uzbek and Uyghur inherit from Old Turkic? Not that this hasn't happened, but I don't want readers to have the impression that Old Turkic = Proto-Turkic (which many underinformed people already think, just look at Turkish Wiktionary). Most people who use Wiktionary aren't linguists.
• To what extent are Danube and Volga Bulgar different languages? This isn’t my area of expertise but are they distinct enough to merit separate headers?
• We can (and should) inform readers that classification isn’t linear and strict; we can highlight any inconsistencies in etymology or usage notes
• Agree that Salar is Oghuz. Do we classify it as a third distinct Oghuz branch or put it with Turkmen under East Oghuz?
A few other questions about classification:
• Should we make headers for other unattested proto-languages (Kipchak, other Kipchak branches, Siberian, Oghur, etc.)? Having consistent reconstructions instead of skipping from Proto-Turkic reconstructions to next attested form might make diachronic development easier to follow
• To what extent is Balkan Gagauz Turkish its own language? Between Istanbul Turkish and (Moldovan) Gagauz I’m not sure if the other Balkan Turkish/Gagauz dialects are divergent enough to constitute a different language
• Where do we place Äynu ([aib])? I think it merits inclusion as a descendant of Uyghur or Chagatai Blueskies006 (talk) 21:03, 20 January 2025 (UTC)Reply
Just adding here that although I don't know much about Old Turkic, I know there have been prior discussions on this topic that AFAIK haven't led anywhere, so I would suggest someone look them up and see what the sticking points were. Benwing2 (talk) 22:03, 20 January 2025 (UTC)Reply
The difference between Volga Bulgar and Danube Bulgar is actually very clear. One is Muslim and, therefore, influenced by Arabic, using the Arabic script. Almost all linguistic data that has survived to the present day comes from this language written in Arabic script. It was also influenced by Volga Turkic and eventually gave way to Volga Turkic over time. Danube Bulgars, on the other hand, were not Muslim, so they were not influenced by Arabic and did not use the Arabic script. Instead, they used the Cyrillic or Greek alphabet. It was spoken earlier than Volga Bulgar. Over time, they became Slavicized and disappeared from history. However, they contributed words to Old Church Slavonic, and even today, it is possible to find a few Old Bulgar words in modern Bulgarian. Of course, there is no influence from Arabic or Old Tatar (Volga Turkic) in their language. We have a calendar consisting of animal names, personal names, and two inscriptions from them. Talat Tekin has two separate books on Danube Bulgar and Volga Bulgar, addressing each language individually. Tekin also mentions Kuban Bulgar, but since there is no linguistic data, he relies only on borrowings. So, there are very significant and distinct points separating these two languages. I am not saying they should be listed as separate languages on the site, just that they should be separated in the descendants section, as is done with otk-kir. This would also ensure that the borrowings are placed in the correct category. BurakD53 (talk) 23:43, 20 January 2025 (UTC)Reply
That makes sense, ty for clarifying Blueskies006 (talk) 01:16, 21 January 2025 (UTC)Reply
- I am not well-versed enough to comment on Salar, see Burak's reply above, but I don't think we need three branches in Oghuz tree personally (might be wrong, but I envision this:)
  • Oghuz (the attested language, not the RC "Proto-Oghuz")
    • Old Anatolian Turkish
      • (...)
    • Ajem Turkic (I am adding this from another BP or LPD chat, tentative)
      • (...)
        • Azerbaijani
        • Qashqai
    • Turkmen
    • Salar
Again, correct me if I am wrong, but this seems apt for Oghuz branch from what I get.
- Balkan Gagauz and Äynu I cannot tell much other than we should include them in our pages (which we didn't in the past.)
- - But for Äynu I think placing it under Chagatai works, I had done that once for a RC:PT page and it did not get replaced/shifted yet.
I can see a need for a separate, thorough talk on classification. Maybe now, maybe in the future.
AmaçsızBirKişi (talk) 08:03, 21 January 2025 (UTC)Reply
I agree with you. This Oghuz categorization is better. Old Anatolian Turkish should be divided as OAT and Ajem Turkic. Balkan Gagauz language should stay a dialect. Balkan Gagauz as a language is unnecessary. Support BurakD53 (talk) 13:40, 21 January 2025 (UTC)Reply
Implementing the following changes to WT:ATRK#Descendants, if nobody objects.
Overhauled descendants table (which will be added to every Proto-Turkic entry from now on). Please tell me any further improvements or your thoughts on this:
  • Proto-Turkic [trk-pro]
    • Proto-Bulgaric [*???] (a new language code will be necessary)
      • Khazar [zkz]
      • Danube Bulgar [xbo], [*xbo-dnb] (a new language code will be necessary)
      • Volga Bulgar [xbo], [*xbo-vol] (a new language code will be necessary)
        • Middle Chuvash [cv-mid]
          • Chuvash [cv]
            • (...)
    • Proto-Common Turkic [trk-cmn]
      • Old Turkic [otk]
        • Orkhon Turkic [otk-ork] (a new language code will be necessary)
          • Yenisei Kyrgyz [otk-kir]
          • Old Uyghur [oui]
            • (...)
      • Siberian [trk-sib]
        • North Siberian [trk-nsb]
          • (...)
        • South Siberian [trk-ssb]
          • Yenisei Turkic
            • (...)
          • Sayan Turkic
            • (...)
          • Northern Altai [atv] (this one may need to be shifted around)
      • Karluk [trk-kar]
        • Ili Turki [ili]
        • Karakhanid [xqa]
          • Khorezmian [zkh]
            • Chagatai [chg]
              • (...)
      • Oghuz [trk-ogz] (the language attested in Diwan Lughat at-Turk, not the "reconstructed" one)
        • Old Anatolian Turkic [trk-oat]
          • (...)
        • Ajem-Turkic [*???] (a new language code will be necessary)
          • Classical Azerbaijani [az-cls]
            • Azerbaijani [az]
            • Qashqai [qxq]
        • Turkmen [tk]
        • Salar [slr]
      • Kipchak [trk-kip]
        • (...)
I can see a bot reformatting every single Proto-Turkic entry and fixing the old classifications. Some specifics may be decided on further down the line.
AmaçsızBirKişi (talk) 08:51, 23 January 2025 (UTC)Reply
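For anyone scripting the reformat, here is a rough sketch of how the tree above could be kept as one data structure and rendered into the same indented bullet layout, so that every entry is generated from a single source. The `{ name, code, children }` shape and the function name `renderTree` are assumptions made for illustration only; this is not how {{desctree}} or any existing bot works, and only an uncontested part of the proposed tree is included.

```javascript
// Hypothetical data shape (an assumption, not an existing Wiktionary format):
// each node is { name, code?, children? }. Names and codes are taken from the
// tree proposed above; only part of that tree is reproduced here.
const proposedTree = {
  name: "Proto-Turkic",
  code: "trk-pro",
  children: [
    {
      name: "Proto-Common Turkic",
      code: "trk-cmn",
      children: [
        {
          name: "Karluk",
          code: "trk-kar",
          children: [
            {
              name: "Karakhanid",
              code: "xqa",
              children: [
                {
                  name: "Khorezmian",
                  code: "zkh",
                  children: [{ name: "Chagatai", code: "chg" }],
                },
              ],
            },
          ],
        },
        {
          name: "Oghuz",
          code: "trk-ogz",
          children: [
            { name: "Turkmen", code: "tk" },
            { name: "Salar", code: "slr" },
          ],
        },
      ],
    },
  ],
};

// Render a node and its descendants as an array of indented bullet lines,
// two spaces of indentation per level, in depth-first order.
function renderTree(node, depth = 0) {
  const label = node.code ? `${node.name} [${node.code}]` : node.name;
  const line = `${"  ".repeat(depth + 1)}• ${label}`;
  const childLines = (node.children || []).flatMap((child) =>
    renderTree(child, depth + 1)
  );
  return [line, ...childLines];
}

console.log(renderTree(proposedTree).join("\n"));
```

Changing a branch name or moving a language then means editing one node rather than every Proto-Turkic entry by hand.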
Reconstruction of Ajem is unnecessary. We don't need identical reconstructions, also we would add them without code anyway. We have that will. In other words, we do not need to list languages that do not have a code and that we did not need before. Also I don't understand why Old Turkic is not Siberian and I am asking. BurakD53 (talk) 01:31, 25 January 2025 (UTC)Reply
+ are we sure that Ili Turki is not Chagatai but Karluk? Because I don't know what separates it from being Chagatai. BurakD53 (talk) 01:35, 25 January 2025 (UTC)Reply
If there is no important difference between Ajem and az-cls, we can just use one of them. BurakD53 (talk) 01:37, 25 January 2025 (UTC)Reply
I was not reconstructing Ajem Turkic; [*???] means some future language code that I don't know (like [trk-ajm]). I placed Old Turkic there by mistake when copy-pasting; that can be fixed.
Zero clue as to where to place Ili Turki, lists it just after the Karluk independent of Karakhanid and its descendants.
I was not planning to include Ajem Turkic lemmas when I first proposed this tree, though I should have made it more clear. It'll be like "Sayan Turkic:" in essence, just a name of a branch inside the Oghuz tree.
I am implementing these changes now, seeing no objection.
AmaçsızBirKişi (talk) 09:55, 25 January 2025 (UTC)Reply

Types of taxonomic ranks for categories

I think an understanding of how taxonomic ranks work will help in deciding how to subdivide them. I haven't really studied the codes for viruses and prokaryotes, which have their own, separate logic, so I'll stick to the animal and algae/fungi/plant codes. These treat different broad levels differently:

  1. To begin with, the basic, "atomic" unit is the binomen:
    1. The genus or generic name, which is a noun
    2. The species or specific epithet, which is either:
      1. An adjective which agrees with the generic name in gender and number, or
      2. A noun which is:
        1. "in apposition", and only agrees with itself (in other words, it's its own referent), or
        2. a genitive, and agrees with the referent in gender and number. Of particular interest are names of parasites, which tend to have names that are the genitive of the name of the host- this is sometimes the only way we know the gender of names above the rank of genus.
  2. In fact, the animal code divides everything into:
    1. Species group: species, subspecies, etc. (species and below), which behave the same in matters of agreement- a lower rank like a subspecies
    2. Genus group: Genus and everything between genus and species. There may be some coordination in gender and number with the genus (I don't remember offhand), but no agreement.
    3. Family group: Everything from family down, but also a few ranks above that derive from family, such as superfamily. No agreement.
    4. Higher taxa: Everything above the family group. The animal code only really concerns itself with family group and below, except that it specifies standard endings for each rank above the genus group, and the stem generally derives from the genus (usually the genitive- thus Hominidae from Homo) of the type species. The other code forms names the same way, but with different standard endings. An exception, built into the Code, is made for a few very well established old family names that can optionally be used instead of the standard ones, such as the Compositae, Cruciferae and Leguminosae instead of Asteraceae, Brassicaceae and Fabaceae.
  3. The most natural way to split taxonomic names would be family, genus and species groups vs. higher taxa. Either that, or family group and higher.
  4. Ranks vs. clades:
    1. Taxonomy was originally based on the belief in divine creation, with classification being a way to discern the order God had set up in the process of creation. Thus, common traits would show they were meant to be a group. The traditional taxonomic ranks were mostly set up with this in mind, and are more about overall organization than about finer details of the structure.
    2. Modern taxonomy, on the other hand, is based on the idea that all living things are descended from a single organism or group of organisms, with changes in genetic traits being inherited by descendants, but not by non-descendants. That means that it would theoretically be possible to reconstruct the family tree of a group of organisms by statistically analyzing the distribution of traits among them. The science of doing this is cladistics, and the tree is a phylogeny. Any part of this tree is a clade. Theoretically, members of a clade should all be the descendants of the organism that diverged from its sibling organisms in some way.
      1. If taxonomists do their job right, every taxon should be a clade- but there are more of these divergences than there are taxonomic ranks. A clade can be anywhere on the tree below the very top, so it isn't necessarily the same as a taxonomic rank. There are also organizations such as the Angiosperm Phylogeny Group that are more interested in figuring out the tree than in making all the clades fit into the traditional taxonomic ranks. They come up with informal clades such as the Eudicots and Rosids that are supposed to eventually be mapped onto traditional taxonomic ranks- at which point they will have names formed according to the traditional rules that apply to those ranks. These informal clades are not official taxa, so they don't really have taxonomic ranks- you can tell the ranks above them and below them, but not in between.

I hope this helps in deciding how to set up the categories. Chuck Entz (talk) 01:30, 21 January 2025 (UTC)Reply
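As a small illustration of the "stem plus standard ending" pattern described above, here is a sketch that builds rank names from a stem. The endings listed are the standard ones for the zoological and botanical codes; the function name `rankName` and the table layout are made up for this example, and the caller is assumed to supply the correct (usually genitive) stem, since deriving it from the genus name requires Latin grammar that this sketch does not attempt.

```javascript
// Standard rank endings under the animal code and the algae/fungi/plant code.
// Only a few ranks are listed; the table layout is illustrative.
const RANK_ENDINGS = {
  animal: { superfamily: "oidea", family: "idae", subfamily: "inae", tribe: "ini" },
  plant: { order: "ales", family: "aceae", subfamily: "oideae", tribe: "eae" },
};

// Build a name from a stem that the caller has already derived from the
// type genus (e.g. "Homin" from Homo, genitive Hominis).
function rankName(stem, code, rank) {
  const endings = RANK_ENDINGS[code];
  const ending = endings ? endings[rank] : undefined;
  if (!ending) {
    throw new Error(`no standard ending listed for ${code}/${rank}`);
  }
  return stem + ending;
}

console.log(rankName("Homin", "animal", "family"));  // Hominidae
console.log(rankName("Aster", "plant", "family"));   // Asteraceae
console.log(rankName("Brassic", "plant", "family")); // Brassicaceae
console.log(rankName("Fab", "plant", "family"));     // Fabaceae
```

The old optional family names (Compositae, Cruciferae, Leguminosae) do not follow this pattern, which is exactly why the Code has to list them as exceptions.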

Sorry, this is confusing to me. Was this in regard to the short discussion between me and @DCDuring? I was just proposing a single set category 'Taxonomic ranks' to capture terms for specific taxonomic ranks, and maybe another set category for "meta-terms" like taxon, clade and rank. If this is what you are writing about and you have a better proposal, please let me know; otherwise, can you clarify what your intent was? Benwing2 (talk) 08:08, 21 January 2025 (UTC)Reply
To be clear, an organism does not belong to a group called a rank. Rank is a metaterm. Clade can have metaterm meaning, in the sense that any taxonomic name may or may not be a clade, clade being a Good Thing in modern taxonomy. A taxon, a clade, and a group can have members, just like the traditional ranks. The traditional ranks have the advantage of (relative) stability and are suggestive of relative position in the tree of life. OTOH, the traditional ranks are positively confusing in paleobiology, in which, for example, an extinct order can have a modern class as a member.
As to metaterms, there are a huge number of terms used in bionomenclature. (See Terms used in Bionomenclature GBIF (2010); not paginated, but my copy printed single-sided the body of which is about 3/4" thick and has about 10 terms per page.) It mixes rank names, morphemes, and a variety of other types of terms, some SoP. I don't know how to break this set of terms into useful subcategories. The first thing for us to do might be to make sure that nomenclatural terms are distinguished from other categories of terms used in biology (and virology). DCDuring (talk) 17:01, 21 January 2025 (UTC)Reply

Automatically expand the first section on mobile

Currently, when you go to an entry on mobile, every section is collapsed (example: [4]). This makes for a pretty awful user experience. See phab:T376446 for further discussion. I propose that we add some code according to this rule: when a page loads, if no section is expanded, automatically expand the first section. I already have this in my own common.js and I strongly recommend that we address this as soon as possible. (@Benwing2, Surjection, This, that and the other) Ioaxxere (talk) 18:25, 22 January 2025 (UTC)Reply

I don't think we should expand the first section automatically. If there are multiple non-English sections, opening the first one is completely arbitrary. — SURJECTION / T / C / L / 18:38, 22 January 2025 (UTC)Reply
How about something like this?
  1. If the first section is English (or Translingual?), open it.
  2. Otherwise if there are <= N sections, for some small N (e.g. 3), open them all.
  3. Otherwise, don't open. (It would be great in that case to display a "summarizing table of contents" that lists some short but crucial info about each language, such as the first five words of the first definition.)
  4. If possible, add the ability to specify preferred languages that auto-open; this might be more work, though.
Benwing2 (talk) 20:02, 22 January 2025 (UTC)Reply
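To make the four rules above concrete, here is a minimal sketch of the decision logic as a pure function, kept separate from any DOM code. The function name `sectionsToOpen`, its option names, and the choice to check preferred languages first are all assumptions for illustration; the proposal does not say how rule 4 should interact with rules 1 to 3, and this is not the code in anyone's common.js.

```javascript
// Decide which language sections to open, given the section names in page
// order. Returns the indices of the sections to expand.
// `maxOpenAll` is the small N from rule 2; `preferred` is the optional list
// of user-preferred languages from rule 4 (both names are hypothetical).
function sectionsToOpen(languages, { maxOpenAll = 3, preferred = [] } = {}) {
  // Rule 4 (assumed to take priority): open every preferred language present.
  const preferredHits = languages
    .map((language, index) => (preferred.includes(language) ? index : -1))
    .filter((index) => index >= 0);
  if (preferredHits.length > 0) {
    return preferredHits;
  }
  // Rule 1: if the first section is English or Translingual, open it.
  if (languages[0] === "English" || languages[0] === "Translingual") {
    return [0];
  }
  // Rule 2: with N or fewer sections, open them all.
  if (languages.length <= maxOpenAll) {
    return languages.map((_, index) => index);
  }
  // Rule 3: otherwise open nothing.
  return [];
}

console.log(sectionsToOpen(["English", "French", "German", "Spanish"])); // [ 0 ]
console.log(sectionsToOpen(["French", "German"]));                       // [ 0, 1 ]
console.log(sectionsToOpen(["Finnish", "French", "German", "Spanish"])); // []
```

Keeping the rule in one function like this would also make it cheap to change later: setting `maxOpenAll` to 1 opens a lone non-English section and nothing more, although rule 1 would still open a leading English or Translingual section.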
My opinion is that we should always expand a single section, but no more. I would be somewhat opposed to automatically opening an English section if there are multiple. One thing is clear, though: if the user enters a link through an anchor, then we should not expand any sections if there are multiple, since the anchor will automatically open whatever is needed. — SURJECTION / T / C / L / 12:06, 23 January 2025 (UTC)Reply
I think the automatic opening of sections needs to be as limited as possible, for three reasons:
  • The behaviour needs to be simple enough that it can be understood by regular or frequent visitors to Wiktionary.
  • @Jdlrobson said this at phab:T376446:

    Any code that expands headings in site JavaScript is going to be rough on performance and possibly SEO for all the pages that code runs. Note, JS is not blocking, so that code could end up executing very late

    Expanding where there is only one section seems safe in this context, but expanding a section high on the page after the user has already scrolled to find a section lower on the page would be annoying behaviour.
  • Even if there are only a handful of languages, tapping the language name you want is clearly better than having to scroll and "hunt" through the page for your desired language imho.
So:
  • I will obviously support auto-opening where there is only a single section on the page (I believe this is the behaviour that the WMF devs have agreed to develop, although I'm open to being corrected).
  • I could support auto-opening English as well, given this is the English Wiktionary, although this would be a weak support - it could be very annoying for people looking for non-English entries if the JS runs too late.
  • I'd oppose other changes.
This, that and the other (talk) 22:07, 22 January 2025 (UTC)Reply
The current behavior is completely non-intuitive. Today, I looked up broligarchy on my cellphone, didn't see what looked like an entry, and reported that we didn't have an entry. Admittedly, my age-related cognitive deficits may have played a role, but this just doesn't seem right.
I like @User:Benwing2's proposal or something very similar. (Unsurprisingly, I would like Translingual to open, too, if it is the first L2 [which it always is!!!].) If it were practical for registered users to specify some number (1, 2, ..., n [where n is small]) of preferred L2s, then all of those could be opened, with Benwing2's proposal as the default. DCDuring (talk) 23:50, 22 January 2025 (UTC)Reply
Maybe we should treat these three options as separate proposals:
  1. Expand the first section if it is the only one on the page.
  2. Expand the first section if it is #Translingual.
  3. Expand the first section if it is #English.
I think there is consensus for #1 but I'm not sure what your stance is on the others. Ioaxxere (talk) 02:13, 24 January 2025 (UTC)Reply
Another option is a social solution which makes this an editorial decision. Disable section collapsing on the page with an edit as in this example. It would be your most future proof solution.
Section collapsing doesn't apply to pages wrapped in div elements. That is how Main page works.
Write a bot that categorizes pages if you can decide the criteria and even autofix this if needed. Jdlrobson (talk) 02:47, 24 January 2025 (UTC)Reply
Example 2: disable on first section but not second. Jdlrobson (talk) 02:51, 24 January 2025 (UTC)Reply
Ugh, this use of <div> seems super hacky and doesn't work well with the way Wiktionary structures multi-language pages. Benwing2 (talk) 02:52, 24 January 2025 (UTC)Reply
IMO a solution that requires a bot to go through and mark pages like this is probably the least future proof solution as it is very painful to make changes to the algorithm; it requires a bot to change millions of pages. Benwing2 (talk) 02:55, 24 January 2025 (UTC)Reply
I'm not sure who "you" is here. You should also add my #2 suggestion as a separate proposal (expand all sections if there are a small number N where N can be e.g. 2 or 3). For reference there are 7.7 million pages with 1 entry, ~ 419,000 pages with 2 entries and ~ 101,000 pages with 3 entries. Benwing2 (talk) 02:49, 24 January 2025 (UTC)Reply
@Ioaxxere Also, there does appear to be consensus for proposal #1 so you may want to go ahead and implement that. If there are objections, it can always be disabled. Benwing2 (talk) 02:56, 24 January 2025 (UTC)Reply
I've gone ahead and implemented proposal #1 in MediaWiki:Mobile.js. Of course this can be changed or expanded later. This, that and the other (talk) 07:06, 26 January 2025 (UTC)Reply
...although it doesn't appear to actually work. Perhaps it runs too early. I assume this is why @Ioaxxere used a MutationObserver! This, that and the other (talk) 07:25, 26 January 2025 (UTC)Reply
Done Fixed - sorry about that. This, that and the other (talk) 10:52, 26 January 2025 (UTC)Reply
@This, that and the other: Yes, MediaWiki does some funky thing where the collapsible sections are set up by JavaScript after the page has already loaded. Hence silly bugs like phab:T384149... Ioaxxere (talk) 16:19, 26 January 2025 (UTC)Reply
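For readers unfamiliar with the timing issue mentioned above, here is a rough sketch of the general MutationObserver pattern: run an action immediately if the page is already in the expected state, otherwise wait for DOM changes and run it once the state is reached. `MutationObserver` is the standard browser API; the helper name `whenReady`, its parameters, and the selector in the usage comment are placeholders invented for this example, and this is not the code actually deployed in MediaWiki:Mobile.js.

```javascript
// Run `action` once `isReady()` returns true. If the condition does not hold
// yet, re-check it every time the subtree under `root` changes.
// `ObserverCtor` is injectable so the helper can be exercised outside a browser.
function whenReady(root, isReady, action, ObserverCtor = globalThis.MutationObserver) {
  if (isReady()) {
    action();
    return null; // nothing left to observe
  }
  const observer = new ObserverCtor(() => {
    if (isReady()) {
      observer.disconnect(); // stop watching so the action runs only once
      action();
    }
  });
  observer.observe(root, { childList: true, subtree: true, attributes: true });
  return observer;
}

// Browser usage (hypothetical selector; the real class names used by the
// mobile skin are not given in this thread):
//
// whenReady(
//   document.body,
//   () => document.querySelectorAll(".section-toggle").length > 0,
//   () => { /* expand the single section here */ }
// );
```

Disconnecting the observer as soon as the condition holds keeps the cost low, which matters given the performance concern quoted from phab:T376446 earlier in the thread.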

Universal Code of Conduct annual review: provide your comments on the UCoC and Enforcement Guidelines

Please help translate to your language.

I am writing to you to let you know the annual review period for the Universal Code of Conduct and Enforcement Guidelines is open now. You can make suggestions for changes through 3 February 2025. This is the first step of several to be taken for the annual review. Read more information and find a conversation to join on the UCoC page on Meta.

The Universal Code of Conduct Coordinating Committee (U4C) is a global group dedicated to providing an equitable and consistent implementation of the UCoC. This annual review was planned and implemented by the U4C. For more information and the responsibilities of the U4C, you may review the U4C Charter.

Please share this information with other members in your community wherever else might be appropriate.

-- In cooperation with the U4C, Keegan (WMF) (talk) 01:12, 24 January 2025 (UTC)Reply

Layout in the Vietnamese entry of mần

The entry links each of 3 senses to làm through {{syn}}, which looks pretty unpleasant. @PhanAnh123 put {{syn|vi|làm}} in 2 senses (as I infer from this edit; the other sense was done by me for good measure). His edit replaces my layout of placing {{syn of|vi|làm}} (which represents, and has, all 3 of those senses), on the grounds that the layout is incorrect because mần is not etymologically related to làm (although he seemingly does not dispute my point that the word is fully synonymous with làm in Central Vietnam, which is the sole reason I grouped the senses under {{syn of|vi|làm}}).

Neither the documentation of {{syn}} nor that of {{syn of}} seems to mention anything about this usage. I am also new, so I don't really know how Wiktionary editors use them in entries. Can any senior Wiktionary editor help, or explain the layout for mần to me? HungKhanh0106 (talk) 06:30, 25 January 2025 (UTC)Reply

@HungKhanh0106 Though not explicitly stated in the documentation, PhanAnh's point of his edit was to avoid the usage of {{syn of}} which would make Central Vietnamese mần look "lesser" in value than the more common làm. By using {{syn}}, he places both terms on an equal footing without preference for either term. Though you're also right, this practice is not very user-friendly layoutwise. --ChemPro (talk) 15:01, 27 January 2025 (UTC)Reply

{{senseid}}

What's with this template being used with serial number-like IDs, like {{senseid|en|Q13048847}} at kid? Is that the idea? They're not very intuitive, to say the least. @Taokailam Caoimhin ceallach (talk) 21:09, 25 January 2025 (UTC)Reply

I'd like to start adding entries for Ukrainian

Apologies if I'm putting this in the wrong forum, I'm still figuring all of this out. I'm a teacher of English for Ukrainian students and have been studying Ukrainian for about a year. In this time I've come across about a hundred (lemma form) words that are missing. I'd like to start adding them, but I've never done more than minor corrections to Wiktionary before and I'm worried that I will make more trouble by doing things wrong than I will help. I've found many of the guideline resources on creating entries, and I think most of it makes sense. I'm wondering if there is anyone here who can answer questions and/or review my sandboxed entries to help me get up to speed on process and expectations. I have six native speakers at my disposal to confirm my translations, and have a good grasp of grammar as a literary editor, so it will mostly be about proper formatting. Thanks! Proudlyuseless (talk) 01:20, 26 January 2025 (UTC)Reply

Where are the sandboxed entries? Nicodene (talk) 02:29, 26 January 2025 (UTC)Reply
Same question. Also, you just make entries by looking at other entries and observing any corrections other editors make to your entries. After all, you don't need to have entered anything but a heading with a head template and one or more definitions, plus an inflection table if known, in order to have an entry. Fay Freak (talk) 03:09, 26 January 2025 (UTC)Reply
I don't speak much Ukrainian, but I've cleaned up my share of technical errors in Ukrainian entries. First make sure you've read through our Entry layout page and our About Ukrainian page. After that, I would recommend visiting the templates in Category:Ukrainian templates and reading the documentation. These templates will save you a great deal of work, but they get downright nasty if you don't give them what they need up front. Especially important is knowing where the accent is. Chuck Entz (talk) 04:31, 26 January 2025 (UTC)Reply

rename 'name' categories to 'instance' categories

@-sche @Ioaxxere In previous discussions about topic categories, we ran into the issue of whether e.g. Afropunk or death metal are more like "types" of musical genres or "names" for musical genres. This distinction is clearer with tangible objects, e.g. a Stradivarius is a type of musical instrument while the Gibson Stradivarius is the name of a musical instrument; an atomic bomb is a type of bomb while Fat Man is the name of a specific bomb. But this distinction is often obscured: Is A minor a type or name of a musical key? Is thumb or index finger a type or name of (a) finger? I'm thinking that renaming "name category" to "instance category" will reduce the number of categories being treated indistinctly as "set categories" (whose current definition even says "terms for types or instances of X"). Thoughts? Benwing2 (talk) 23:38, 27 January 2025 (UTC)Reply

Unfortunately, I am unsure whether "instance" adds clarity overall, or has similar ambiguity... for example, offhand, both "atomic bomb" and "Fat Man" seem to me to have about as much or as little claim as each other to being "instances" of "bomb", and I don't know if the ambiguity over whether "index finger" is a "type" of finger is much reduced by whether the alternative is that it's an "instance" vs a "name" ("index finger" seems like a type of finger, as well as possibly an instance of a finger, and a name of a finger, to me). Hopefully other people will weigh in. Oof, it's hard to untangle this set (er, type?) of categories. "Names of individual Xs" or even something like "specific individual Xs" would be clear in the case of e.g. bombs, but I don't suppose enough individual people's fingers are as famous as Fat Man for "names of individual fingers" to be any clearer: it still sounds, at least to me, like a category that could contain "index finger". - -sche (discuss) 15:57, 28 January 2025 (UTC)Reply
@Benwing2: I think a good working definition is that an instance is something that can't be subdivided into smaller groups. Therefore index finger goes under "types of fingers" while Galileo's middle finger would be under "names of individual fingers". I don't see how you could subdivide the concept of A minor, so I would be tempted to call that an individual or instance, although I realize it's a bit weird to extend this system onto abstract concepts. Ioaxxere (talk) 17:45, 28 January 2025 (UTC)
If it helps, here is a touchstone about types versus instances (and thus hyponyms versus instances): a capital city is a type of city but Paris and London are instances of capital cities (and thus also of cities). Thus, an index finger is a type of finger, whereas your left index finger is an instance of an index finger (and thus also of a finger), and so is my left index finger (those add up to two instances). Admittedly it is easier to keep the distinction straight with things that have physical instances, as opposed to abstractions. But even with abstractions, we can see that intransigence is a type of predisposition, whereas your intransigence is an instance of predisposition, and so is mine (those add up to two instances). Quercus solaris (talk) 17:36, 29 January 2025 (UTC)
I am reminded of the kind of definition that new contributors sometimes add, say, for coupe "the name of a two-seater car". There is hardly ever a reason for the head of a definition to be "the name of" in a dictionary aimed at adults and children past elementary school. In elementary education a word might often be called a name of its referent. In the names of categories it is more understandable why we might need to use name sometimes, but normally we would still view the items in any category of terms at Wiktionary to be terms that have referents. I think that the categories Category:Taxonomic names and Category:Taxonomic names needing vernacular names include the word names because they are outside of the topical system.
Within the topical system, it would seem to me that we should always be thinking of the referents in the real world. There may be fuzzy areas in the realms of philosophy and linguistics where in some way a kind of name is a referent, but it seems to me that in those areas terms other than mere name are used.
AFAICR, the issue has been how to keep "instances" separate from (1) words "about" the category and/or its members and (2) words "used by" those within the relevant field. I don't see how the term names is any help in this regard, except, possibly, in communicating at a near-elementary-school level to some of our users. DCDuring (talk) 20:05, 29 January 2025 (UTC)

Reliability of Joseph Wright's English Dialect Dictionary

I came across an online copy of this dictionary and I was wondering if anyone else is familiar with it and whether it has legitimate dialectal information or is just nonce words and hoaxes. Any thoughts? —Justin (koavf)TCM 05:08, 28 January 2025 (UTC)

@Koavf: see {{R:English Dialect Dictionary}} for a full version of the 1st edition. It is widely cited by the OED. I understand it is essentially a compilation of glossaries of dialectal terms collected by various authors; the bibliography is in volume VI. You can probably get a better idea of the methodology used from the preface in volume I. — Sgconlaw (talk) 05:23, 28 January 2025 (UTC)
Thanks. Seems like it is reliable. —Justin (koavf)TCM 05:41, 28 January 2025 (UTC)
Yeah, in my experience checking whether dialectal terms meet Wiktionary CFI, I've found the EDD generally reliable, sometimes invaluable, though of course (as we all know) no dictionary is without errors (occasionally its definitions are mistaken, or somewhat more often its etymologies are outdated, and sometimes words are presented in it as real but still fail RFV for lack of uses). - -sche (discuss) 08:03, 28 January 2025 (UTC)
@Koavf: I'm curious what prompted you to ask this. I've never seen an academic source doubt its legitimacy or accuracy. Ioaxxere (talk) 17:53, 28 January 2025 (UTC)
I wanted to be conservative before I went about adding seemingly silly or hoax-y words. I have no reason to doubt this source in particular. —Justin (koavf)TCM 18:36, 28 January 2025 (UTC)
I trust you're checking for attestations of use, of course, especially if words seem hoax-y: a fair few things in the EDD don't meet our own CFI. Also note that some EDD words do meet CFI, even on the strength of the same 3+ cites the EDD has for them, but whereas the EDD has them as English, we'd put them under a ==Scots== header instead. - -sche (discuss) 19:56, 28 January 2025 (UTC)
Yes, doing a web search for "snerdle" and trying to determine if anyone had ever actually used this word outside of this reference work is what directly led me to ask. —Justin (koavf)TCM 21:05, 28 January 2025 (UTC)
@Koavf: sometimes the EDD will have quotations which one can look up and use as attestations. On other occasions, however, it only quotes examples of speech recorded by other authors. Personally I would also regard these as sufficient attestation, although one would then have to cite the EDD itself as the source. However, I don't know if everyone agrees. — Sgconlaw (talk) 22:17, 28 January 2025 (UTC)

Tibetic script suffixes

@Michael Ly as probably interested, feel free to ping others.

Currently, our {{suffix}} template automatically adds a hyphen before suffixes (as does {{prefix}} for prefixes) in Tibetic-script languages. However, the Tibetic script is scriptio continua, so in most cases, a suffix will simply function as a separate particle. In cases like ཅ་མ, however, it seems like a good idea to be able to create categories for productive suffixes (མ (ma) doesn't ever function as a word, nor does it have a grammatical function). I also notice our CAT:Tibetan suffixes does not have any hyphens. I wonder if it wouldn't be best to disable the feature of adding hyphens for the Tibetic script. However, I'm not sure as to the consensus in the field/among speakers.

Any ideas? Thadh (talk) 21:33, 29 January 2025 (UTC)
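
For illustration only, the proposed behaviour could be sketched roughly as follows. This is a hypothetical Python model, not the actual logic behind {{suffix}}: the function name `display_suffix` and the set `NO_HYPHEN_SCRIPTS` are invented for this example, and the only real-world detail assumed is that "Tibt" is the ISO 15924 code for the Tibetan script.

```python
# Hypothetical sketch of per-script hyphen suppression for suffix display.
# It does NOT reproduce how Wiktionary's templates or modules actually work.

# Scripts for which no hyphen should be prepended (assumption: "Tibt" is
# the ISO 15924 code for the Tibetan script).
NO_HYPHEN_SCRIPTS = {"Tibt"}


def display_suffix(suffix: str, script: str) -> str:
    """Return the display form of a suffix.

    A leading hyphen is added unless the script opts out or the
    suffix already starts with one.
    """
    if script in NO_HYPHEN_SCRIPTS or suffix.startswith("-"):
        return suffix
    return "-" + suffix


if __name__ == "__main__":
    print(display_suffix("ness", "Latn"))
    print(display_suffix("མ", "Tibt"))
```

Under this sketch, a Latin-script suffix still gets its hyphen, while a Tibetic-script suffix is left exactly as written, which would match the hyphenless entries already in CAT:Tibetan suffixes.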

Global ban proposal for Shāntián Tàiláng

Hello. This is to notify the community that there is an ongoing global ban proposal for User:Shāntián Tàiláng, who has been active on this wiki. You are invited to participate at m:Requests for comment/Global ban for Shāntián Tàiláng. Wüstenspringmaus (talk) 12:08, 2 February 2025 (UTC)