Wiktionary:Beer parlour/2025/January

From Wiktionary, the free dictionary

Bad ledes in Thesaurus namespace

@qwertygiy Standard practice in the Thesaurus namespace is currently to put a blank line between {{ws header}} and the first L2. See, for example, Thesaurus:person, Thesaurus:berry, etc. This gives a warning to anyone who makes a new Thesaurus entry (e.g. at this trigger of filter 115), because it's in violation of WT:NORM. So either the Thesaurus namespace should be excluded from filter 115 or the practice should be changed to match NORM. -saph (user • talk • contribs) 03:12, 1 January 2025 (UTC)

2024 – Top pageviews statistics

The top pages by pageviews on en.wiktionary.org in 2024 (unfiltered list from dump files) are:

	25840383 Special:Search
	21800424 Wiktionary:Main_Page
	 1993839 Appendix:Glossary
	 1645478 rainbow_kiss
	 1506823 xxx
	 1451758 -
	 1244194 黑料
	 1145183 吃瓜
	  938861 Category:English_swear_words
	  681510 bokep
	  672460 I'll
	  633276 视频
	  599259 XXXX
	  598818 aww
	  567081 colmek
	  527691 麻豆
	  523495 bocil
	  493837 Appendix:Protologisms/Long_words/Titin
	  463888 Appendix:Filipino_surnames
	  444811 Wiktionary:International_Phonetic_Alphabet
	  427258 XXX
	  395379 لا_إله_إلا_الله_محمد_رسول_الله
	  395192 
	  386054 «
	  377134 astaghfirullah
	  356207 変態
	  349388 Category:English_surnames_from_Old_English
	  343450 سکس
	  342540 ‘
	  338717 pajeet

Details and other WMF projects: https://archive.org/details/2024-top_2k_user_pageviews Dušan Kreheľ (talk) 08:49, 1 January 2025 (UTC)

Pronunciation of irregular plurals

Currently there is no way of knowing how to pronounce, for example, ibices or sphinges. JMGN (talk) 00:46, 2 January 2025 (UTC)

Then add the pronunciations or add a pronunciation request? -saph (user • talk • contribs) 03:42, 2 January 2025 (UTC)
To all of them? I thought this was Beer parlor... JMGN (talk) 12:17, 2 January 2025 (UTC)
There's nothing preventing anyone from adding them now; hence, no need to bring the issue to Beer parlour. See, for instance, vertices or indices. Andrew Sheedy (talk) 18:33, 2 January 2025 (UTC)
Really surprised that this cannot be automated like the rest of the pronunciations, though... JMGN (talk) 21:21, 3 January 2025 (UTC)
English is a special case: it represents the collision of two branches of Indo-European, followed by a thousand years of history including serving as the main language in two whole continents and in numerous countries worldwide, and as a second language for over a billion people. It has huge numbers of loanwords coming from languages all over the world throughout known history. On top of all that, it has no authoritative standard. Although it may be possible to automate the pronunciation, it would be a huge project and would probably add considerably to system overhead on the million+ pages where it would be deployed. Chuck Entz (talk) 02:01, 4 January 2025 (UTC)

"number" or "numeral"?

We currently have a POS "numeral", hence CAT:Numerals by language, CAT:English numerals, etc., but for some reason we have CAT:Cardinal numbers by language and CAT:Ordinal numbers by language, not #cardinal numeral or #ordinal numeral. Category:Numerical appendices has a mixture of appendices called "Foo numbers" and "Foo numerals". I'd like to straighten this out by using a consistent naming scheme, probably numeral instead of number. [The root of the issue seems to be that numeral is normally taken to be a symbol (like 2, 3, 4 in the Hindu-Arabic system) that refers to a number, which is an abstract concept, but (a) whether numerical words like two, three, four are considered "numerals" or "numbers" is less clear (technically it appears they are numerals, being symbols of sorts, maybe more correctly signs, that refer to abstract numbers), and (b) in common parlance, the distinction between numeral and number is elided.] Mixing both terms is unhelpful, so if we can settle on consistent terminology, either "numeral" or "number", I can do the renames. Benwing2 (talk) 03:33, 2 January 2025 (UTC)

I think I have a preference for numeral. Vininn126 (talk) 03:45, 2 January 2025 (UTC)
If we are to standardize to one of them, then "numeral" is the better option of the two. But I wonder if we could instead come up with some consistent distinction between the two. — SURJECTION / T / C / L / 07:58, 2 January 2025 (UTC)
@Surjection Can you be more specific? What sort of distinction were you thinking of? Benwing2 (talk) 09:17, 2 January 2025 (UTC)
Numerals are a POS, numbers are a semantic category. Ordinal numbers are often not numerals in many languages, same goes for adverbial numbers (once, twice) and fractional numbers (half, third), for instance.
In many languages, the numeral POS has different grammatical properties than other POS: in Finnish and Russian, it governs very specific cases on the adjacent noun, for instance. In many other languages, there is no numeral POS at all (e.g. Afar nammáy or Tokelauan lua). They are still numbers though.
So, basically, the appendices calling these "numerals" are 'wrong' (in the sense that they don't follow the above distinction), and should probably be standardised to numbers. Thadh (talk) 09:10, 2 January 2025 (UTC)
I'm not sure where you got the idea that a numeral word has to be its own part of speech to be a numeral. See Ordinal numeral on Wikipedia. That seems a Thadh-ism (if I may call it that), which you've extrapolated from a handful of languages. Benwing2 (talk) 09:16, 2 January 2025 (UTC)
OK, that was a bit snarky and I apologize for that, but I'm still confused as to where you've gotten your ideas from. Benwing2 (talk) 09:27, 2 January 2025 (UTC)
Yes, it was... No worries. I don't think I've said that, but rather that for now that's the distinction we do (or should/could) handle. 'Numeral' as a grammatical category is very useful for these languages that do have one, whereas they still have a distinct semantic category including other parts of speech. We can and should make a distinction between the two, and I think calling them by the same name will lead to much more confusion than the status quo. Thadh (talk) 09:16, 3 January 2025 (UTC)
I am not convinced of this. Not all Russian numbers work the same way by any means; they range from один (pure adjective) to миллион (pure noun), with in-between numbers getting progressively more noun-like and less adjective-like. I don't know Finnish but I wouldn't be surprised if things are similar. If you want to make a number vs. numeral distinction, you need to spell out when one term is used and when another is used, and what renames need to happen; otherwise I have no idea what you're getting at. Benwing2 (talk) 09:38, 3 January 2025 (UTC)
All right:
- number is used to denote any member of the semantic category/ies that denote a specified amount, position etc. that can be theoretically counted.
- numeral is used to denote any member of a syntactic category of nominals that exhibits syntactic behaviour not found in other nouns and adjectives, and is typically associated with amounts that can or cannot be counted.
Since POSs already are language-specific, I can't give you hard-and-fast rules where to use the latter: just like I can't give you a way to recognise a noun or a verb or an adjective, it differs by language. However, it is clear to me that there are languages where numerals are nominals (in flectional languages, they can usually be inflected, in others, they can be used as a head of a nominal phrase) that do not syntactically behave the same way as nouns or adjectives or (if those exist) determiners, and have very specific rules of governing the head noun.
Maybe indeed the numeral 'one' (yksi) could better be analysed as an adjective, but that doesn't take away that kaksi is neither an adjective nor a noun, as it agrees in the oblique cases but takes a partitive-case noun in the nominative (non-agreement). Same thing for Russian два (dva): две девушки, but двум девушкам - partial agreement, not identical to adjectives or nouns. Thadh (talk) 09:55, 3 January 2025 (UTC)

Extended Mover request: User:Rex Aurorum

Hello. I'd like to request extended mover rights, mainly to be able to fix two kinds of issues: (1) typos made by earlier editors (non-Indonesian speakers) and (2) typos made by myself (typos I frequently make in certain clusters). ―Rex Aurōrum Disputātiō 10:29, 2 January 2025 (UTC)

Sundanese main entries

I noticed that Sundanese entries have the main entries in the Sundanese script, but according to @Udaradingin, the most common script used nowadays is the modern script. Shouldn't the main entries (and possibly also links) be moved to Latin-script entries, just as Tagalog uses Latin script and the Baybayin spelling is just shown as an alternative spelling? Thanks. 𝄽 ysrael214 (talk) 08:51, 3 January 2025 (UTC)

Stray Arabic-script digit entries

We have entries for the main series of Arabic-Indic digits in Unicode, and for what Unicode refers to as the Extended Arabic-Indic digits:

Digit | Main series | Main series language sections | Extended series | Extended series language sections
1 | ١ | Translingual | ۱ | Ottoman Turkish, Persian, Punjabi, Urdu
2 | ٢ | Translingual | ۲ | Ottoman Turkish, Persian, Punjabi, Urdu
3 | ٣ | Translingual | ۳ | Ottoman Turkish, Persian, Punjabi, Urdu
4 | ٤ | Translingual, Ottoman Turkish | ۴ | Persian, Punjabi, Urdu
5 | ٥ | Translingual, Ottoman Turkish | ۵ | Persian, Punjabi, Urdu
6 | ٦ | Translingual, Ottoman Turkish | ۶ | Pashto, Persian, Punjabi, Urdu
7 | ٧ | Translingual | ۷ | Ottoman Turkish, Persian, Punjabi, Urdu
8 | ٨ | Translingual | ۸ | Ottoman Turkish, Persian, Punjabi, Urdu
9 | ٩ | Translingual | ۹ | Ottoman Turkish, Persian, Punjabi, Urdu
0 | ٠ | Translingual | ۰ | Ottoman Turkish, Persian, Punjabi, Urdu

As you can see, the main series are all straightforward Translingual entries like we have for the Latin-script digits, though a few also have Ottoman Turkish entries. The Extended series, however, are all nothing but entries for individual languages. What's more, many of them have no headword templates, and the ones that do treat them as entries for the spelled-out word that the symbol represents in that language. I did my best to fix the entries at ۱ (in the Extended series), but then I realized that there shouldn't be entries for specific languages at all, just a Translingual section at the top.

I'm not sure what the Translingual entries for these characters should look like: some, at least, seem like variants used in some Arabic-script languages, but not others. Others seem like identical glyphs that are separate due to some quirk in the early history of Unicode. There's a task listed in WT:Todo for fixing entries that use the wrong character for a given language, so there's probably a lot more to this.
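
For reference, both series carry the same numeric values in the Unicode character database and differ in code point and character name: the main series sits at U+0660 to U+0669 and the Extended series at U+06F0 to U+06F9. A minimal sketch in Python, using only the standard unicodedata module, lists them side by side. The function name digit_rows and the output layout are illustrative assumptions, not anything used on Wiktionary, and the sketch says nothing about which glyphs actually look different in a given font.

```python
import unicodedata

# First code point of each series, per the Unicode standard.
MAIN_START = 0x0660      # ARABIC-INDIC DIGIT ZERO
EXTENDED_START = 0x06F0  # EXTENDED ARABIC-INDIC DIGIT ZERO


def digit_rows():
    """Return (value, main_char, main_name, ext_char, ext_name) for digits 0-9.

    Hypothetical helper for illustration only.
    """
    rows = []
    for value in range(10):
        main = chr(MAIN_START + value)
        ext = chr(EXTENDED_START + value)
        rows.append(
            (value, main, unicodedata.name(main), ext, unicodedata.name(ext))
        )
    return rows


if __name__ == "__main__":
    for value, main, main_name, ext, ext_name in digit_rows():
        # Print code points and names rather than glyphs, so the output
        # does not depend on font or bidirectional rendering.
        print(
            f"{value}  U+{ord(main):04X} {main_name}"
            f"  |  U+{ord(ext):04X} {ext_name}"
        )
```

A check like this could help when auditing entries that use a digit from the wrong series for a given language, since the two characters are distinct code points even where the glyphs look alike.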

I do think that all of the pages in both series should have only a Translingual section, and all the other language sections should be merged into the spelled-out versions that the digits represent (if there's anything worth keeping). I didn't see any idiomatic senses like "4" used for "for" in texting.

The main problem is that I'm not really proficient in these languages, so I'm not sure how, exactly, to fix this, but I am sure it needs to be fixed, somehow. Thanks, Chuck Entz (talk) 01:28, 4 January 2025 (UTC)

You are making sense. The Ottoman entries were added by @Moonpulsar in 2023, the Persian and Urdu ones in 2006 and 2007, when rules, standards and consistency on Wiktionary were not developed to the extent that is relevant now, and formatting was wild. The analogy to non-Arabic-script languages suggests that we keep only Translingual entries, perhaps even with hard redirects for alternative forms.
I have seen both series of numbers in Arabic, Persian, and Ottoman prints alike. A second series did not need to be encoded by Unicode in the first place; it could have been left to font systems that vary display by language, just as, say, italic б looks different depending on whether it is Serbian or Russian, and less clearly Bulgarian.
The numbers did not even have names distinct from the ones we use in Europe. Once again I note that the terminology of Eastern Arabic numerals vs. Western Arabic numerals is Wikipedia’s citogenesis, with its frequent problem of citing terms that were added to some list but are barely used, in case anyone attempts to work out what Wiktionary has to portray. Fay Freak (talk) 02:18, 4 January 2025 (UTC)