This page is designed to discuss moves (renaming pages), mergers and splits. Its aim is to take the burden away from the Beer Parlour and Requests for Deletion where these issues were previously listed. Please note that uncontroversial page moves to correct typos, missing characters etc. should not be listed here, but moved directly using the move function.
Out of scope: Merging entries which are alternative forms or spellings or synonyms such as color/colour or traveled/travelled. Unlike Wikipedia, we don’t redirect in these sorts of situations. Each spelling gets its own page, often employing the templates {{alternative spelling of}} or {{alternative form of}}.
Tagging pages: To tag a page, you can use the general template {{rfm}}, as well as one of the more specific templates {{move}}, {{merge}} and {{split}}.
Note that discussions for splitting, merging, and renaming languages, once held here, are now held at WT:Language treatment requests.
Reviving the earlier discussion, I'm still bothered by the fact that we have two different categories for names. But the previous discussion also made it clear that it's not as easy as just merging them.
I think Category:en:Place names should probably be renamed to Category:en:Places, since it's really meant to contain terms for places. That is, since it's a topical/set-type category, the focus should be on the referent of the word, whereas part-of-speech categories like Category:English names focus on the word itself. A word is a name, and it refers to something bearing that name.
Category:en:Demonyms is a bit more problematic and I brought it up before, though I don't remember where. "Demonym", again, is a term focused on the word, not the referent. A word is a demonym. Perhaps this could be renamed to something else? Category:en:Peoples maybe?
"Category:en:Transliteration of personal names" could be renamed to "Category:English names transliterated from other languages", I suppose. What's the matter with the demonyms category? It contains demonyms, as expected. Would it be better titled "English demonyms", on the model of "English phrases"? - -sche(discuss)06:02, 10 November 2015 (UTC)Reply
@ExcarnateSojourner There being no opposition here, only support (albeit mostly old support), and no opposition or interest when I brought this up in the BP, let's revise whatever needs to be revised to put (at a minimum) all given names and surnames into subcategories of Category:Names by language, instead of some of them being in subcategories of Category:Names. The split is haphazard and arbitrary; I see the intention — put a name that was given within English in one top-level category and a name transliterating a foreign name in a different top-level category — but in practice that's not maintained, since e.g. Alexandra in the context of discussing ancient Greek is transliterating the Ancient Greek name, Sergei has been given to babies born in the Anglosphere (and to characters in English fiction), and we don't maintain such a split with place names. - -sche(discuss)16:01, 24 April 2023 (UTC)Reply
It making no sense to have Alexandra (in works about ancient Greece where it's romanizing a Greek name), Alexandra (in fiction about ancient Greece where it's a given name), Alexandra (as borne by British or American people today), Sonya, Vadim and Vladimir divided haphazardly into two different top-level categories, "Names" vs "Names by language", I'm now (attempting) editing the modules to consolidate them into "Names by language" subcategories. - -sche(discuss)14:37, 5 May 2023 (UTC)Reply
@ExcarnateSojourner @-sche I am going to take a stab at implementing this. Can you help with what the renames should be? I understand the separation between poscat categories and topic categories should be "lexical" vs. "semantic" but I sometimes have trouble putting this into practice. A tentative list based on what's already been proposed:
'DESTLANGCODE:SOURCELANG male given names' -> 'DESTLANG male given names transliterated from SOURCELANG'; same for 'female given names', 'surnames', etc. This doesn't work; these are not DESTLANG names but SOURCELANG names rendered into DESTLANG. So I propose 'DESTLANG renderings of SOURCELANG male given names' or similar. ("Transliteration" isn't quite right; sometimes these are transliterations, sometimes respellings, sometimes mere borrowings (cf. Italian Clinton).)
'LANGCODE:Foreign personal names' (a grouping category) -> 'LANG foreign personal names'
'LANGCODE:Named roads' -> 'LANGCODE:Names of roads' and remove from 'LANGCODE:Names'
'LANGCODE:Named prayers' -> 'LANGCODE:Names of prayers' and remove from 'LANGCODE:Names'
What about the following:
Subcategories of 'LANGCODE:Demonyms':
'LANGCODE:Armenian demonyms'?
'LANGCODE:Celestial inhabitants'?
'LANGCODE:Ufology' -> stays as a topic category.
'LANGCODE:Latvian demonyms'?
'LANGCODE:Nationalities'
'LANGCODE:Tribes'
'LANGCODE:Celtic tribes'
'LANGCODE:Germanic tribes'
'LANGCODE:Native American tribes'
See also 'LANGCODE:Mongolian tribes' under 'LANGCODE:Ethnonyms'.
Subcategories of 'LANGCODE:Ethnonyms':
'LANGCODE:Mongolian tribes' -> Goes wherever 'LANGCODE:Celtic tribes', 'LANGCODE:Germanic tribes' and 'LANGCODE:Native American tribes' go.
'LANGCODE:Place names' -> Delete and reclassify the terms under them using {{place}} so they end up in 'Places in FOO'.
'LANGCODE:Places' -> Leave as a topic category but remove 'LANGCODE:Names' as a parent?
Script-specific variants of 'LANGCODE:Letter names': 'LANGCODE:Arabic letter names', 'LANGCODE:Devanagari letter names', 'LANGCODE:Imperial Aramaic letter names', 'LANGCODE:Korean letter names', 'LANGCODE:Latin letter names'?
Subcategories of 'LANGCODE:Nicknames':
'LANGCODE:Nicknames' itself? This is a grouping category.
'LANGCODE:Nicknames of individuals'?
'LANGCODE:City nicknames'?
'LANGCODE:Country nicknames'?
'LANGCODE:Racist names for countries' -> Terminate with extreme prejudice, see WT:BP.
'LANGCODE:Sports nicknames' -> either 'LANGCODE:Sports team nicknames', 'LANGCODE:Nicknames of sports teams', 'LANG sports team nicknames', 'LANG nicknames of sports teams'
See also 'LANGCODE:Couple nicknames' above.
'LANGCODE:Onomastics' -> stays as topic category but should not have 'LANGCODE:Names' as one of its parents.
'LANGCODE:Language families'? Regardless, it should not have 'LANGCODE:Names' as one of its parents.
'LANGCODE:Languages'? Regardless, it should not have 'LANGCODE:Names' as one of its parents.
'LANGCODE:Taxonomic names' and subcategories:
'LANGCODE:Taxonomic names' itself?
'Taxonomic eponyms by language': Already a pos category.
'Specific epithets' -> 'Translingual specific epithets'?
Other topic categories not directly reachable through 'LANGCODE:Names' but needing consideration:
Sorry, didn't mean to ignore your ping, but got distracted by life after seeing it. As far as the categories for "English renderings of Ukrainian names" (or whatever), I have no strong preference for any particular name at this time. My immediate concern was just with addressing the odd point of bifurcation where "native English placename like Warwick or Alberta; English rendering of an Armenian placename like Stepanakert; English rendering of a personal name someone gave a baby born in Ukraine like Volodymyr" are in one top-level category system ("LANGCODE:Names", named like 'set' categories), and "personal name someone gave a baby born in Canada" is in a different top-level category system ("LANGNAME names", treated like a quasi-part of speech). It's hard to decide where exactly to split the spectrum of categories we're dealing with here, if we're wanting to keep e.g. "John" in "Category:English male given names" at that (part-of-speech-esque) category name, but wanting to consider some things like Category:en:Native American tribes to be clearly a set/list category (a set/list of tribes); my immediate point was just that I don't see a sound basis for considering "John, Jane" a POS-type (LANGNAME) category but "Volodymyr, Sergei" a LANGCODE:-set-type category — surely they're both one or both the other, and the greater momentum seems to be towards considering "names" a POS-type/LANGNAME category. But maybe we should think about that more carefully and consider them all to be "sets"? (But then, "Category:English verbs" is also just a category containing the set of English verbs. Hmm... should we perhaps allow only things that are truly "parts of speech" to have "Category:LANGNAME foobars" names, and make all the "names" categories that contain John and Volodymyr into set categories? Should that be the direction in which we eliminate the bifurcation of the 'John' vs 'Volodymyr' categories?) 
I do think even keeping names in two subcategories like "English given names" vs "English renderings of Ukrainian names"/"English renderings of Chinese names"/etc [whatever we call those categories] based on, in effect, whether they were born in Ukraine vs to a Ukrainian family in Canada (or in China vs to a Chinese family in America) may be less than ideal; e.g. what do we do if a transliterated Ukrainian or Chinese name is common in English-language fiction? What about if it's a German name; does the fact that those names are "natively" Latin script make the threshold for considering them to have become "English names" lower? Does it make a difference if the fiction is set in lightly-fictionalized Germany or Ukraine or China, vs in a space future or a generic medievalesque Middle Earth / Westeros? But I don't have time to think through and suggest any proposal for any better approach to that yet. "LANG foreign personal names" (e.g. "English foreign personal names") sounds a bit odd; would "LANG renderings of foreign personal names" (aligning with your proposed "DESTLANG renderings of SOURCELANG male given names") be better, iff we're sticking with moving "Names" categories to LANGNAME names and not LANGCODE names? I will try to respond more, and to the rest, later. - -sche(discuss)17:54, 4 November 2023 (UTC)Reply
@-sche Thanks for your comments. I have no issue with "LANG renderings of foreign personal names". I see your point about the line between nativized foreign-origin names and renderings of actual foreign names being fuzzy, but there does feel to me like a distinction, esp. in languages like Latvian that tend to respell foreign names according to Latvian spelling conventions, and the distinction is fairly clearly made in reality between e.g. the large number of Russian names respelled according to Latvian conventions (and used e.g. by the large population of Russians in Latvia) vs. the smaller number of Russian-origin names that have become nativized for naming of ethnic Latvians. In a multi-ethnic society like the US or Canada where nationality and ethnicity aren't always clearly distinguished, things get a lot fuzzier, although it still feels like there's some sort of distinction between names like Volodymyr or Volha that are unlikely to be borne by anyone other than someone who is Ukrainian (resp. Belarusian) or whose parents or grandparents are Ukrainian (resp. Belarusian), vs. a name like Vladimir or Olga that might be given to someone with no particular connection to Russia. As for whether these should use LANGNAME-type or LANGCODE-type naming, I'm not sure although I gather the distinction is supposed to be lexical vs. semantic, if that helps at all. Benwing2 (talk) 23:57, 4 November 2023 (UTC)Reply
I guess we should stick with LANGNAME naming for given names / surnames, then, at least for now. (Switching gears for a moment to address a different aspect:) Regarding "horse given names", we also have (but apparently don't currently categorize) dog given names like Scruffy, Fido, and Spot, and we have Polly as a name for a parrot, and Mittens, Kitty, Socks for cats (also e.g. Miming in Cebuano). Perhaps we should merge all the different animals into one category for "animal given names". To me, at least, it seems intuitive to then handle this category in whatever way we handle the human given name categories; so, if we're naming the category that contains 'John' "English male given names", then 'Fido' goes in "English animal given names", or if we're using language codes, then use codes for both. (Back to the first gear:) We also have names that belong to specific individual people (Confucius, Cicero) or animals (Laika, and mythically Cerberus, Garm); we seem to put these in LANGCODE-set categories; I suppose the rationale is that the category that contains "Confucius, Cicero" contains a set of individuals, whereas "John" and "Jane" are 'less restricted'... in practice, people have undoubtedly also named babies 'Confucius' and 'Cicero', but if we demonstrate that, then we add a {{given name}} sense, so I guess we're fine leaving the individuals in LANGCODE-set categories and the {{given name}}s in LANGNAME categories... I guess this also explains the difference between nicknames (LANGNAME nicknames) and relationship names (the category contains a set of specific ships)...? Never mind, "Category:Nicknames" doesn't contain what I would've expected ("Bob, Jim, Tom" for Robert, James, Thomas) - -sche(discuss)18:45, 5 November 2023 (UTC)Reply
Just checking, when your "list based on what's already been proposed" includes "'LANGCODE:Demonyms' -> 'LANG demonyms'" but then your follow-up proposal is for Subcategories of 'LANGCODE:Demonyms': like 'LANGCODE:Armenian demonyms'?, you're proposing to not actually rename "'LANGCODE:Demonyms' -> 'LANG demonyms'", right? I'm just checking that we're going to handle "Demonyms" and the subcategories like "Armenian demonyms" the same way, either all using LANGCODEs or all using LANGNAME. I could see handling the categories that actually have the word "demonyms" in their name either way, but since some of the other subcategories like "LANGCODE:Native American tribes" do seem more like set categories, maybe it's best to consider the whole batch to be set categories and stick with LANGCODE names like they have at present? (But maybe move them out of the "Names" category?) "Couple nicknames" is an interesting case, because intuitively it seems like those and (relation)ship names should be handled the same way, since they seem like the exact same thing: "Lumity" is the portmanteau name for the two specific individuals Luz Noceda and Amity Blight, and Billary is the portmanteau name for the two specific individuals Bill Clinton and Hillary Clinton... maybe LANGCODE:Couple nicknames should be renamed "LANGCODE:Couples" to be more clearly a set category? and moved out from under the "names" category, since we don't categorize ship names as "names"? - -sche(discuss)02:34, 6 November 2023 (UTC)Reply
@-sche Thanks for pointing out that inconsistency. Rua's point a while ago was that 'Native American tribes' is named correctly as a set category because the contents are "names of Native American tribes" but 'Armenian demonyms' isn't named correctly as the contents aren't "names of Armenian demonyms". Rua suggested renaming 'Demonyms' -> 'Peoples' although that seems a bit strange to me as the term 'demonym' is fairly well established, and furthermore a distinction could be made between nominal demonyms and adjectival demonyms (note, we have {{demonym-noun}} and {{demonym-adj}} for these two, respectively), which is clearly a lexical distinction. That suggests maybe they should all be considered lexical categories, esp. since I think something like Category:en:Exonyms doesn't make sense as a set category (being an exonym is completely a lexical property). If we are to make Category:en:Armenian demonyms a lexical category, IMO it should be Category:English demonyms for Armenians as Category:English Armenian demonyms doesn't make much sense. As for CAT:en:Couples, that seems ambiguous so maybe it should be CAT:en:Nicknames of couples or something (which would be in keeping with future names like CAT:Types of stars and such). Benwing2 (talk) 02:54, 6 November 2023 (UTC)Reply
"CAT:en:Nicknames of couples" works. Or should it even be "Nicknames of pairs", since it currently contains a few things like Bushbama {{subst:dash}} or should we remove those? (We don't categorize e.g. Republicrat as anything but "US politics".) Good point about exonyms. "Demonyms", or at least the things currently in the "Demonyms" categories, seem to straddle the line between being a set category like "Occupations", vs being lexical like "Exonyms"... ugh, as you said earlier, it's hard to pin down and "put into practice" the difference, since so many of these categories exist in a grey area with characteristics of both. Like: it would not technically be wrong AFAICT to say "Category:English male given names and Category:English nouns are set categories containing the set of all English male given names or nouns respectively" (it would just be madness, heh). And in the other direction, isn't being a placename as much a lexical property as being a given name? But should they go into the same top-level "LANGNAME names" category, or is that madness? Thinking aloud for a moment, I guess one difference is whether a term refers to one specific entity, or to an open-ended cast, which would rationalize why "John" and "Bob"—as names that can be given to an open-ended variety of people, new babies every day—are in (or belong in, in the case of "Volodymyr") "LANGNAME names" categories, whereas "Baghdad Bob" (individual's nickname), "Billary" and "Lumity" (real and fictional couples' nicknames) and e.g. "Saskatchewan" and "Yerevan" (placenames) refer to specific entities, and so are LANGCODE set categories...? So then, since demonyms like "Saskatchewanian" and "Yerevanian" also refer to an open-ended set of people (new babies born in Saskatchewan every day), and as you say, 'being a demonym' can be argued to be a lexical property like 'being an exonym', that justifies them being "LANGNAME demonyms" categories...? 
(Then the "type of"-set categories, like the category for "the set of all types of stars" or "the set of Native American tribes", are LANGCODE-set categories for a different reason.) - -sche(discuss)19:04, 6 November 2023 (UTC)Reply
@-sche Yes, that seems to make a lot of sense. BTW I have written the script to move topic (langcode) categories to lexical (langname) categories and I'm probably going to run it on exonyms first. Benwing2 (talk) 19:59, 6 November 2023 (UTC)Reply
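The langcode-to-langname move mentioned here can be pictured with a small sketch. This is not the script referred to above, only an illustration under assumptions: the `LANG_NAMES` table and the `topic_to_lexical` helper are hypothetical names, and only the 'LANGCODE:Foo' -> 'LANG foo' shape is taken from the renames proposed in this thread.

```python
# Hypothetical lookup table for illustration only.
LANG_NAMES = {"en": "English", "is": "Icelandic", "lv": "Latvian"}


def topic_to_lexical(category: str) -> str:
    """Turn a topic-style name such as 'en:Exonyms' into 'English exonyms'."""
    code, sep, label = category.partition(":")
    if not sep or not label or code not in LANG_NAMES:
        raise ValueError(f"not a recognised langcode category: {category!r}")
    # Lower-case only the first letter. Labels that begin with a proper noun
    # (e.g. 'Armenian demonyms') would need separate handling, as discussed above.
    return f"{LANG_NAMES[code]} {label[0].lower()}{label[1:]}"


print(topic_to_lexical("en:Exonyms"))
print(topic_to_lexical("lv:Foreign personal names"))
```

Names that do not follow the plain 'LANG foo' pattern, such as the proposed 'DESTLANG renderings of SOURCELANG male given names', would need their own rules rather than this one-line mapping.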
Relevant to the discussion above about creating a general animal given names category, this discussion points out "Ralph" for a raven, as well as "Rover" as another dog name. Whenever the situation with human names is sorted out, I suggest moving "LANGCODE:Horse given names" ("is:Horse given names") to "LANGNAME animal given names" ("Icelandic animal given names"), unless anyone has objections... (or we could add a general "animal given names" category and retain subcategories for specific animals if one or more languages had a lot of names for them, as might be the case for dogs and horses...) - -sche(discuss)17:24, 11 November 2023 (UTC)Reply
I'm a little confused about what's going on here. Are you RFV-ing every entry in this category? Or are you just looking for evidence that Khitan was written using this script? —Mr. Granger (talk • contribs) 12:45, 13 August 2016 (UTC)Reply
I understand that, but I don't understand what your goal is with this discussion. If you want to RFV every entry in the category, then I'd like to add {{rfv}} tags to alert anyone watching the entries. If you want to discuss what writing systems Khitan used, maybe with the goal of moving all of these entries to different titles, then I'm not sure RFV is the right place for the discussion. (Likewise with the Buyeo section below.) —Mr. Granger (talk • contribs) 17:55, 13 September 2016 (UTC)Reply
{{jiajie}} should be merged with {{liushu}}, which could be renamed as {{Han liushu}}, following {{Han compound}} and {{Han etym}}. It might not be a good idea to use a particular language code because these templates are intended for use in multiple languages now. They used to be used under Translingual, but we have decided to move the glyph origin to their respective languages. - justin(r)leung 20:22, 17 May 2017 (UTC)Reply
Support merging {{jiajie}} with {{liushu}}, this template only has a few uses and would be trivial to replace, I'm not sure why this discussion died out. The jiajie template does have functionality liushu is missing (e.g. at 云, where all the extra text produced by jiajie there is absent in liushu) though, but that also shouldn't be hard to add. - saph ^_^⠀talk⠀14:42, 3 February 2025 (UTC)Reply
sense: Noun: "(aviation) A large multi-engined aircraft.
The term heavy normally follows the call-sign when used by air traffic controllers."
In the aviation usage AA21 heavy ("American Airline flight 21 heavy") the head of the NP is AA21, heavy being a qualifying adjective indicating a "wide-bodied", ergo "heavy", aircraft.
I don't know what I meant 5 years ago, but that's what I mean now: move it to adjective. Though it would be good to confirm that there is not sufficient attestation of heavies and/or [DET] heavy. DCDuring (talk) 12:48, 18 October 2022 (UTC)Reply
I can find the plural in reference to large (sometimes restricted to widebody) commercial aircraft and heavy bombers (sometimes 2-engine, always at least 4-). Also "heavy" motor vehicles (eg. large trucks, esp semis). I'm not entirely sure what heavy refers to when used by the pilot of a Cessna. DCDuring (talk) 12:57, 18 October 2022 (UTC)Reply
Support. I oppose the existence of categories with language code like "en:" in the first place, but what is proposed here seems to be an improvement over the status quo. --Daniel Carrero (talk) 20:27, 20 October 2017 (UTC)Reply
@Rua It looks sane to me if politics are left out. But why is Abkhazia in Georgia though it is an independent state, statehood only depending on factual prerequisites and not on diplomatic recognition which has nothing to do with it? Where does the Crimea belong? (article Sevastopol is only in Category:en:Ukraine because it has not really been edited since 2014.) I can think of two solutions: First possibility: We focus on geographical and cultural constants. Second possibility: We focus on the actual political power. I disprefer the second slightly because it can mean much work in cases of war (i.e. how much the Islamic state holds etc., or say the current factions in Libya). But in neither case is Abkhazia in Georgia. But the first possibility does not even answer what the Crimea belongs to, i.e. I am not sure if it is historically correct to speak of the Crimea as Ukraine. And geographical terms are often fuzzy and subject to editorial decisions. All seems so easy if you start your concepts from the United States, which do not even have a name for the region they are situated in. And even for the USA your idea is questionable because the constituent states of the United States are states in their own right (Teilstaat, Gliedstaat in German), as is also the case for the Federal Republic of Germany and the Russian Federation partially (according to the Russian constitution only those of the 85 subjects are states which are called Republic, not the Oblasti etc.). Is Tatarstan Russia? Not even Russians can agree with such a sentence, as in Russia one sharply distinguishes русские and россияне, Россия and Российская федерация. Technically Ceuta and Melilla are in Morocco because Spain is not in Africa. Also, Kosovo je Srbija, and it would become just a coincidence if a place important in Serbian history is listed as X, Kosovo or X, Serbia. Palaestrator verborum (loquier) 16:06, 14 November 2017 (UTC)Reply
Starting with the above, I don't know how the Tokyo ward system works, but I imagine it's a subdivision of the city. In England wards are subdivisions in cities, boroughs, local government districts, and possibly counties. "Wards in" is the natural usage.
Municipalities similarly. For example in Norway there are hundreds of municipalities (kommuner) which are subdivisions within counties (fylker). Some of these can be large, especially in the north, but so are the counties in the north. To me "municipalities in" is the natural wording.
States and provinces in the USA and Canada: In nearly all cases it is unnecessary to add the country name as the names are unambiguous. The only exception I can think of is Georgia, USA. This could also apply to prefectures in Japan and states in India (is there a Punjab in Pakistan?). DonnanZ (talk) 18:52, 14 November 2017 (UTC)Reply
Yes, there is, like there is in India. Maybe categorisations should be abundant? Cities can belong to Punjab as well as to Punjab, India, and the Crimea is part of administration of both the Russian Federation and the Republic Ukraine at least for some purposes in the Republic Ukraine. We can make the least thing wrong by adding Sheikh Zuweid (presuming it exists) as well to the Islamic State as to the Arab Republic of Egypt, because we do not want to judge morally and formally states and terror organizations are indistinguishable. On the other hand of course we need sufficient data to relate towns to administrative divisions and ISIS presumably does not publish organigrams. Palaestrator verborum (loquier) 19:44, 14 November 2017 (UTC)Reply
A benefit to having it as a category is that theoretically it ought to be addable by the headword templates examining the pagename (like "English terms spelled with Œ"), which, if implemented (...if it could be implemented without excessive memory costs), would allow it to be kept up to date automatically. - -sche(discuss)17:16, 15 March 2018 (UTC)Reply
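The idea of deriving such a category from the pagename can be sketched roughly as follows. This is only an illustration, not how the headword templates or modules actually work: the function name `spelled_with_categories` and the choice of plain A-Z as the "standard" alphabet are assumptions made for the example.

```python
import string


def spelled_with_categories(lang_name, pagename,
                            standard=frozenset(string.ascii_uppercase)):
    """Return 'terms spelled with X' categories for letters outside `standard`."""
    # Only letters are considered, so hyphens, spaces and apostrophes are ignored.
    unusual = sorted({ch.upper() for ch in pagename
                      if ch.isalpha() and ch.upper() not in standard})
    return [f"{lang_name} terms spelled with {ch}" for ch in unusual]


print(spelled_with_categories("English", "œuvre"))
print(spelled_with_categories("English", "cat"))
```

Because the check runs on the title alone, a category filled this way would stay current as new entries are created, which is the benefit described above; the memory cost mentioned would come from running such a check on every headword.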
Meh. Mehhhhhh. On one hand, I still like the idea of a category which can be populated automatically any time a new relevant entry is added. OTOH, it's very trivial. Well, it would be simple for someone to copy the current contents of the category over to the appendix and then remove the category from the entries (maybe with AWB to speed things up). - -sche(discuss)09:04, 28 December 2023 (UTC)Reply
I would like to request the move of the content of entries like 茨城県(Ibaraki-ken, literally “Ibaraki prefecture”) to simply 茨城(Ibaraki, “Ibaraki”), cf. Daijisen. 県 is not an essential part of the name.
(edit conflict) It seems like a two-word phrase to me. I am not a native speaker, but I think that if someone asked "水戸市は何県?" ((in) What prefecture is Mito?) then "茨城です。" (It's Ibaraki) would be a correct answer. Entries such as 奈良 and 広島 should have both the city and the prefecture. (I see that 奈良 currently does.) Cnilep (talk) 04:01, 19 April 2018 (UTC)Reply
Yes, 茨城県 is also correct. And if someone asked どこの出身? (Where are you from?) the answer would probably be 奈良県 rather than 奈良, or else expect a follow-up question. But I don't think that is necessarily a matter of word boundaries. Compare Pittsburgh, Pennsylvania and Pittsburgh, Kansas; the fact that it is usually necessary, and always acceptable, to specify the latter doesn't mean that Pittsburgh on its own is not a proper noun. By the same token, I think that 茨城 (et alia) is a word. That's the point I had in mind. I will say nothing about what is more common. I don't even have good intuitions about frequency in my native language. Cnilep (talk) 04:54, 19 April 2018 (UTC)Reply
I fully agree that 茨城 is a term worthy of inclusion. I also think that 茨城県 is a term worthy of inclusion. We have entries for both New York and New York City, and even New York State. Similarly, I think we should have entries for [PREFECTURE NAME], and also for [PREFECTURE NAME]県 and [PREFECTURE NAME]市 and [PREFECTURE NAME]郡, etc., as appropriate. ‑‑ Eiríkr Útlendi │Tala við mig05:03, 19 April 2018 (UTC)Reply
A lot (maybe all?) of the prefecture names minus the 県(-ken) suffix are polysemous. Listing a few from the north to the south, limiting just to geographical senses, and just in the same regions at that:
Generally support. Less duplication is good, and it is not much different from Chinese etc. for which we generally delemmatise, if not completely hard-redirect, these forms. Wyang (talk) 04:49, 19 April 2018 (UTC)Reply
Support. For a dictionary, I think we don't need to keep entries with both prefecture name and prefecture, despite the usage but it's always helpful to provide usage notes (e.g. normally used with 県: ~県) and usage examples, e.g. 奈良県(Nara ken, “Nara (prefecture)”). --Anatoli T.(обсудить/вклад)05:45, 19 April 2018 (UTC)Reply
After some discussion on Category talk:Baybayin script (that went a bit off-topic), some of the Indian language editors (@Bhagadatta, Msasag and myself) have agreed that this category should be renamed to Category:Eastern Nagari script, the reasons being (1) several languages other than Bengali use this script, and (2) the Bengali alphabet is just a subset of this script and lacks some of the glyphs used by other Bengali-script languages (most prominently Assamese which has a separate r-glyph). I want to make sure that there are no objections to this by editors who were not in the discussion. —AryamanA(मुझसे बात करें • योगदान)02:06, 20 July 2018 (UTC)Reply
A number of Mecayapan Nahuatl words are currently written with U+0027 APOSTROPHE, which is a punctuation mark and not a letter. And a couple are using U+02BC MODIFIER LETTER APOSTROPHE, which is the wrong shape for this language. They should all be written with U+A78C LATIN SMALL LETTER SALTILLO instead.
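The character substitution requested here could be sketched as below. It is a minimal illustration only: the helper name `normalize_saltillo` is hypothetical, the sample strings are placeholders rather than real Mecayapan Nahuatl words, and the sketch assumes every apostrophe in a title stands for the saltillo (true punctuation would need to be excluded by hand).

```python
SALTILLO = "\uA78C"  # U+A78C LATIN SMALL LETTER SALTILLO

# Characters currently used in some entries that should become the saltillo.
REPLACEMENTS = str.maketrans({
    "\u0027": SALTILLO,  # U+0027 APOSTROPHE (punctuation, not a letter)
    "\u02BC": SALTILLO,  # U+02BC MODIFIER LETTER APOSTROPHE (wrong shape)
})


def normalize_saltillo(title: str) -> str:
    """Return the title with both apostrophe variants replaced by U+A78C."""
    return title.translate(REPLACEMENTS)


# Placeholder strings, not attested words.
for sample in ("a\u0027b", "a\u02BCb", "ab"):
    fixed = normalize_saltillo(sample)
    print([f"U+{ord(ch):04X}" for ch in fixed])
```

Printing code points rather than the strings themselves makes it easy to confirm which character ended up in the title, since the three glyphs look similar in many fonts.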
I feel determiner is the more common name for this in English; the different definitions of these terms across languages should not be a concern - e.g. we also use adjective differently for Korean. adnominal may be confused with the -eun, -neun, -eul, -deon forms of Korean verbs and adjectives. Wyang (talk) 03:57, 13 October 2018 (UTC)Reply
A lot of this is redundant to our suffix derivation categories. In many cases, the suffix used already determines what something is derived from. For example, -ness always forms deadjectival nouns, it can't really be anything else. —Rua (mew) 18:47, 25 February 2019 (UTC)Reply
However this does not work with nonconcatenative morphology thus far – you may link the previous discussions on those infix categorization matters here, but even if that pattern collecting is solved the derived terms listed at صَلِيب(ṣalīb, “cross”), for instance, would only be categorized by pattern but nothing would imply that the terms are denominal –, and the point I have made about the categorization and naming of these categories is still there. But I give you green light in any case, if you want to replace all those “[language] deverbals” and “[language] denominal verbs” categorizations by suffixation categories of the format “[language] words suffixed with -∅ [deverbal]”, as well as if it concerns action towards categorization of nonconcatenative morphology language terms, since your idea of uniformity is correct. Fay Freak (talk) 19:49, 25 February 2019 (UTC)Reply
Nonconcatenative morphology is still an underexplored part of Wiktionary, which is kind of annoying. But quite often, we simply show the concatenative part as the affix, and then leave a usage note saying what other changes occur when this form of derivation is used. For example on Northern Sami-i and -hit. —Rua (mew) 20:40, 25 February 2019 (UTC)Reply
How to create an affix category with an id: add the id to the definition line in the affix's entry with {{senseid|language code|id}}, add {{affix|language code|affix|id1=id}} (at minimum) to the etymology section of a term that uses the affix, find the resulting red-linked category and create it with {{auto cat}}. — Eru·tuon20:51, 25 February 2019 (UTC)Reply
Thanks, this is easier than I imagined, so it takes the category name from {{senseid}}. I thought it is in some background module data. Now where to document it? Add it to the documentation of {{affix}} under |idN=? This is the main or even only use of this parameter in this template, right? Fay Freak (talk) 21:18, 25 February 2019 (UTC)Reply
It's not that {{senseid}} has any effect on the category name, but that a category with a parenthesis after it, such as Latin words suffixed with -tus (action noun), expects a matching {{senseid}} in the entry for -tus, in this case {{senseid|la|action noun}}, because the link in the category description points to -tus#Latin-action_noun, which is the format of the anchor created by {{senseid}}. The |id= type parameters, including in {{affix}}, generally create a link of that type. In {{affix}}, the parameter also has the effect of changing the category name. Sorry, I am not sure if I am explaining this clearly. — Eru·tuon 22:36, 25 February 2019 (UTC)Reply
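The relationship described above between the id, the anchor and the category name can be illustrated with a minimal Python sketch. This is not Wiktionary's actual module code: the function names `senseid_anchor`, `suffix_category` and `entry_link` are hypothetical, and the anchor format (language name, a hyphen, then the id with spaces replaced by underscores) is an assumption taken only from the -tus#Latin-action_noun example in this thread.

```python
from typing import Optional


def senseid_anchor(language_name: str, sense_id: str) -> str:
    """Build an anchor in the form seen in the example above (assumed format)."""
    return f"{language_name}-{sense_id.replace(' ', '_')}"


def suffix_category(language_name: str, suffix: str,
                    sense_id: Optional[str] = None) -> str:
    """Build the category name; the id, if given, is appended in parentheses."""
    name = f"{language_name} words suffixed with {suffix}"
    if sense_id:
        name += f" ({sense_id})"
    return name


def entry_link(suffix: str, language_name: str, sense_id: str) -> str:
    """Build the link target that the category description would point to."""
    return f"{suffix}#{senseid_anchor(language_name, sense_id)}"


# The same id string feeds both the category name and the anchor,
# which is why the entry needs a matching {{senseid}}.
print(suffix_category("Latin", "-tus", "action noun"))
print(entry_link("-tus", "Latin", "action noun"))
```

The point of the sketch is only that one id string is shared by the category name and the link target, so the id chosen in {{senseid}} should be the one wanted in the category name.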
You explain this clearly. I just approached it from the other side: I need to choose, in {{senseid}}, the name that I want to appear in the category name, so that later {{affix}} categorizes into a reasonably named category, because in other cases the id can be arbitrary – not that {{senseid}} has an effect on the category name. Fay Freak (talk) 22:53, 25 February 2019 (UTC)Reply
Our affix system is not sufficient to handle morphological derivation we have to deal with (unless you want us to introduce lambdas...) Serbo-Croatian hardly has the intricacy of Arabic conjugation, but there are plenty of nouns that are created from verbal roots through apophony, and this needs to be categorized somehow. Crom daba (talk) 17:24, 2 March 2019 (UTC)Reply
@Crom daba At least for Indo-European, we do have a system for handling combinations of affixation + ablaut, like on *-os (notice the parentheses showing the root grade) and -ος(-os). Our current system totally fails where there is no affix, though, a case which also exists in Indo-European. For example, there are some Indo-European forms of derivation, called "internal derivation", which are built entirely around changing ablaut grades and accents: *krótus(“strength”) > *krétus(“strong”) or τόμος(tómos, “slice”) > τομός(tomós, “sharp”). We have no systematic way to indicate this kind of derivation, but it is sorely needed. —Rua (mew) 23:42, 30 April 2019 (UTC)Reply
Numerals can be words (one, two in spelling alphabets), while numeral symbols are not (Roman numerals). The difference is subtle, but I think it is there. — surjection ⟨??⟩ 18:51, 19 October 2021 (UTC)Reply
Some years ago, there was an RFM to rename all these pages, the discussion of which is archived at Wiktionary talk:English entry guidelines#RFM discussion: November 2015–August 2018. The original nomination mentions "and likewise for other languages", meaning that the intent was to rename these pages in parallel for every language. In the end, only the English page was moved, so that now the English page has a name different from all the others. User:Sgconlaw suggested starting a new discussion instead of moving the pages after the RFM has long been closed.
My own opinion on this is to rename the pages in other languages to match the English one. That was the original intent of the first RFM, and the new name better describes what these pages are for. The name "about" instead suggests something like a Wikipedia page where you can write any interesting fact about the language, which is of course not what they're actually for. Some discussion may be needed regarding the shortcuts of all these pages. They currently follow the format of WT:A(language code), so e.g. WT:AEN but also WT:ACEL-BRY with hyphens in the name. The original shortcuts should probably be kept, at least for a while, but we may want to think of something to match the new page name as well. —Rua (mew) 13:00, 29 April 2019 (UTC)Reply
@Metaknowledge FYI, this may take a little while. Lots of these pages have redirects to them and MediaWiki doesn't handle double redirects, so I have to find all the links to these pages (at least, those in redirects) and fix them. Benwing2 (talk) 01:19, 29 March 2021 (UTC)Reply
@Benwing2: 1. I don't think we need the word "languages". 2. The second option sounds more grammatically correct. 3 & 4. I would go with subpages, but you may want to hold off on those, as some of the pages are heavily used and links to them will have to be fixed. Opinions solicited: @Justinrleung, suzukaze-c, Atitarev, Tibidibi 5. It should be moved somewhere very inconspicuous; we could even delete it and nobody would miss it. 6. I guess the former? 7. The first one is now fine, the second can stay where it is, and the third seems somewhat useless (but @-sche may have an opinion). —Μετάknowledgediscuss/deeds02:29, 29 March 2021 (UTC)Reply
I think there's nothing on Wiktionary:About Algonquian languages that requires that page to exist, anyway, and am just going to make it a hard redirect to the About Proto-Alg. page instead of the soft redirect which is currently its entire contents, keeping the old edit history and old talk page comments. - -sche (discuss) 18:58, 26 July 2021 (UTC)Reply
@Globins Wiktionary's category structure only follows the first definition, which is the more common meaning. We shouldn't mix up the two definitions. —Rua (mew) 17:52, 13 May 2019 (UTC)Reply
Not really. An eponym is derived from a name. A toponym is a name. So a term derived from a toponym is derived from a name, but a term derived from an eponym is derived from another word that is then derived from a name. They're not equivalent. —Rua (mew) 21:18, 14 May 2019 (UTC)Reply
I think "eponymic terms" would be better if you want to preserve the "name that a term is derived from" sense of eponym (as opposed to the "term derived from a name" sense). "Terms derived from eponyms" seems odd, maybe tautological, to me because a name is not inherently an eponym, but only when we are discussing the fact that a term is derived from it. — Eru·tuon21:35, 14 May 2019 (UTC)Reply
I don't mind one way or another, but the whole category tree then needs to be renamed for consistency. (@Donnanz: how is car ambiguous? Do you mean it could be confused for, say, a train carriage or something?) — SGconlaw (talk) 10:34, 3 June 2019 (UTC)Reply
Well, car is used especially in US English for a railroad car (either freight or passenger), and can be used in BrE for a railway passenger carriage. I feel the word auto can be ambiguous as well; "auto parts" can be used in the UK, but "car parts" is preferred. The word "auto" isn't used for a motor car in the UK. There is another category, Category:Automotive, so Category:Automotive parts may be a solution. DonnanZ (talk) 13:52, 3 June 2019 (UTC)Reply
As I see it, have isn't part of the metaphor, but it is part of an expression that is not in turn a form of tie someone's hands. The passive (one's) hands are/were/being/been tied are such forms, though none make for a good lemma entry or likely searches. DCDuring (talk) 13:38, 25 June 2019 (UTC)Reply
Indeed. Unless the active form is very uncommon, I'd prefer it as the lemma. I don't think that we would be wrong to have both the active-voice expression and the have and/or get expressions, even though we could argue that it is a matter of grammar that one can transform certain expressions in the way Lambian describes. DCDuring (talk) 22:31, 25 June 2019 (UTC)Reply
We say ourselves in the entry for oxymoron that its use to mean "contradiction in terms" is loose and sometimes proscribed (despite the fact that many people use it this way nowadays). We say much the same thing at contradiction in terms as well.
The so-called oxymorons in this category are all or almost all contradictions in terms, where the contradiction is accidental or comes about only by interpreting the component words in a different way from their actual meanings in the phrase. An oxymoron in the strict sense has an intentional contradiction. I think we should be more precise about this, in the same way as we already are with using the term "blend" instead of "portmanteau", which has a narrower meaning. I therefore suggest we move this page to "Category:English contradictions in terms" (but see my second comment below). Likewise for any corresponding categories for other languages. — Paul G (talk) 06:51, 25 August 2019 (UTC)Reply
On second thoughts, I think this category should be retained but restricted to true oxymorons, such as "bittersweet" and "deafening silence". Ones such as "man-child" and "pianoforte" are not intended to be oxymoronic and are only accidentally contradictions in terms. — Paul G (talk) 17:18, 26 August 2019 (UTC)Reply
@Donnanz, Fay Freak, Rua I'm not sure what the real difference is between a city and a town, and I suspect most people don't know either. For this reason I think we should maybe merge the two into a single 'Cities and towns in Foo' category. Benwing2 (talk) 03:54, 17 January 2020 (UTC)Reply
I oppose this merger. I would not think to look for a category with such an unintuitive name, and I do not know of any examples where this is problematic. Wikipedia seems to be able to choose which word to use without trouble, so why can't we? —Μετάknowledgediscuss/deeds05:36, 17 January 2020 (UTC)Reply
Eliminating one of them is a good idea where there is no meaningful distinction between cities and towns. But that's going to be a country-specific decision: England makes the decision, the Netherlands does not. I think in cases without a distinction, we should keep "cities" and eliminate "towns". —Rua (mew) 10:15, 17 January 2020 (UTC)Reply
I wouldn't recommend merging them. It's a complex subject though, and the rules defining cities and towns can differ from country to country, and from state to state in the USA; I have come across "cities" with a population of less than 1,000 in the USA, sometimes around 50, but apparently they have that status. Cities in the UK have that status as granted by a monarch, towns can be harder to define in metropolitan areas, and villages can call themselves towns if they have a town council. Some villages large enough to be towns prefer to keep the village title. DonnanZ (talk) 10:34, 17 January 2020 (UTC)Reply
The odds that editors will accurately/consistently distinguish these categories when adding (the templates that generate) them ... seem low. However, even if the categories are merged, that problem will remain on the level of the displayed definitions. And, apparently some users above want to keep them distinct. So, meh. - -sche (discuss) 05:36, 18 January 2020 (UTC)Reply
I can see arguments for both sides, actually. The idea needs a lot more thought, as you would probably have to drag in villages etc. as well. DonnanZ (talk) 14:15, 19 January 2020 (UTC)Reply
Could merge them into Municipalities in Foo and have the various alternatives point to that category. Of course there are some "cities" which contain several municipalities, but I don't think there is a word which comprises every form of village/town/hamlet/city/urban area. - TheDaveRoss12:47, 21 January 2020 (UTC)Reply
In New York State alone, we have cities, towns, villages (which are subdivisions of towns), and unincorporated places, all of which exist within counties, except NY City, which is coextensive with 5 counties, each of which is coextensive with a borough of the City. The identities and borders of these places in NYS are generally fairly stable, though subject to occasional revision. Legislative and judicial districts are separate, with legislative districts changing after each decennial census. Census-designated places form a parallel structure with relationships to the state systems. The census system has the virtue of being uniform for the entire US, but the borders of many census places do not necessarily correspond to the borders of larger governmental units such as states and counties. Within New York State there are lists of each type of jurisdiction. In principle each US state has its own names for classes of jurisdictions. Finally, in popular practice, place names for inhabited places can differ from the names of governmental units and tend to have different boundaries even when the names are the same.
In light of the lack of homogeneity even within the US, let alone between countries, I think we need to respect national and state and provincial naming systems. If there is a worldwide system for categorizing places, we could also follow that, but I have not heard of such a system. Does the EU have some uniform system?
In the absence of any generally accepted uniform universal or near-universal system for categorizing places, I think we need to accept the fact that nations and semi-sovereign parts of nations (eg, US states, Canadian provinces) each have their own naming systems, which are accepted within their boundaries. I think it would be foolish for us to attempt to have our own system for categorizing places and derelict for us to fail to use the various national and subnational categories.
Oppose: Even though "out on a limb" can be used without "go", I think "go out on a limb" =/= "go" + "out on a limb", so I think we should keep them separate after all. — excarnateSojourner (ta·co)02:14, 7 April 2024 (UTC)Reply
In addition to all the tense, person, and number variants (also contractions) of the current entry, one can find variants omitting the pronoun, adding adverbs, using till or 'til instead of until; [VERB] oneself blue in the face; go|become|turn blue in the face; and blue-in-the-face and blue in the face as adjectives outside any of these expressions. The unchanging core of these is the set phrase blue in the face. It also has medical use (synonym cyanotic), which renders the figurative sense evolution and meaning obvious. DCDuring (talk) 17:39, 15 April 2020 (UTC)Reply
@Metaknowledge: I agree, and I don't think it's needed for any purpose, at least not for Azerbaijani. There was a user (or anon?) who insisted on adding those "underlying" verbs and creating templates for them, but I never understood the linguistics behind this reasoning. Allahverdi Verdizade (talk) 23:42, 8 March 2021 (UTC)Reply
Sense 3 of state should cover it. I think if 3(a) doesn't cover it, then "Never do anything against conscience even if the state demands it." is not an appropriate citation thereof; I think Einstein would consider national, state, and city governments all part of "the state".--Prosfilaes (talk) 07:04, 17 August 2020 (UTC)Reply
I wonder if these all ought to be merged into some entry akin to "play the ____ card" or something. There appear to be other words substituted aside from victim, race, and gender. Tharthan (talk) 22:09, 21 May 2020 (UTC)Reply
I lament that our way of handling snowclones is not optimal, banishing them to appendix-space, such that the choices here amount to 'have these multiple similar entries in the mainspace where users find them' or 'banish them to a tidy but less-findable appendix'. However, I see that we have a sense at card for this (although the definition could use some work), and between putting a link there and redirects from these entries, I suppose we could get by with migrating these to the snowclone appendix. Centralizing them does seem sensible since there are so many. ("Play the religion card" also exists.) - -sche(discuss)23:56, 21 May 2020 (UTC)Reply
Interesting idea. Perhaps there would be an extensive entry for play the (something) card, but with full entries for the main attestable instances (eg, race/gender and perhaps victim), derived terms, and a usage note about "(something)". Play the X card seems to be something that would be highly productive, unless its use in too many cases would be deemed a microaggression. Attestation for play the (something) card would have to be limited to "somethings" other than the forms that have their own attestation. Other instances that I can readily find are disability, oppression, and queer. The uses of feminist and bully don't fit the "victim" semantics, which might warrant a second figurative definition for play the (something) card in addition to a {{&lit}} "definition". DCDuring (talk) 15:10, 4 September 2022 (UTC)Reply
Besides those, there's "play the poverty card", "play the gay card", "play the abuse card", "play the disabled card", "play the rape card", etc., as well as ones which, as you say, seem like they may have different semantics (e.g. some uses of "play the Muslim card" in reference to legislation ?to get Muslim support?, and some uses of "play the Holocaust card"?) ... it seems too productive to have entries for every attested X (it becomes SOP). Should this be in the mainspace as play the something card, or at Appendix:Snowclones/play the X card like Appendix:Snowclones/X is the new Y? For snowclones like this that require placeholders other than "someone"/"one" in the title, we seem to in recent years prefer to put them in Appendix:Snowclones/ rather than in mainspace, but I do see a handful of mainspace titles where "something" is a placeholder, like give something a go. If we redirect all the variations people might search for, add usexes to the relevant sense we list at card, and maybe add a usex to whichever sense of play is relevant, it should be sufficiently findable. - -sche(discuss)16:48, 4 September 2022 (UTC)Reply
I'd favor having a full entry for any term (presumably they would be attestable) that another dictionary had. It is unfortunate that our basic search engine searching for "play the disabled card" (with or without quotes) does not take a user to any of our existing play the X card entries. (I have added test entries for play the card and play the something card.) That would imply that we could use hard redirects for as many attestable instances of the snowclone as seem likely to help users. It may well be that the hard redirects should go to the snowclone appendix subpages, but there is no particular reason to do so in preference to a mainspace entry. Concern about the aesthetics of headwords with placeholders seems misplaced. And (who knows?) someone might actually search for the expression using a placeholder and find it if it were in principal namespace. DCDuring (talk) 20:27, 4 September 2022 (UTC)Reply
Also, as the MWOnline entry shows play is not strictly essential; it can be replaced by use, among other verbs, such as deploy. So, perhaps a sense of card is an appropriate target for redirects. But I doubt that the entry for card is the right place for an intelligible presentation. For one thing, any etymology (sense derivation), usage notes, and derived terms or collocations (eg, race card) would necessarily be separated from the relevant definition for the polysemous noun, so as not to appear on the same screen. And, even if they did, that they belonged together would not be at all obvious. I realize that this kind of argument, if applied, might make for some inconsistency in our presentation of snowclones and might violate a strict reading of idiomaticity, but cases like this may merit exceptional treatment. DCDuring (talk) 21:05, 4 September 2022 (UTC)Reply
Onelook finds "play the race card" and "play the gender card" in various dictionaries, but "play the victim card" only in us, and it seems unlike the others in other ways, too: card seems unnecessary, as the same meaning is expressed by play the victim. As you note, all of these can also be found with other verbs, like "use". I am inclined to redirect play the victim card to either card's relevant sense or play the something card. It has a Swedish translation; if there are others, I would think play the victim would be the better THUB location. I'm not sure what to do about play the gender card and play the race card; on one hand, each is in other Onelook dictionaries; OTOH, you can swap out "play" for "use", "gender" for other things ("sexism", "sex", "woman", and with different meaning other things like "religion", etc), it's not a set phrase and the kernel of idiomaticity is obviously some smaller part, maybe just card, not the whole phrase. - -sche(discuss)17:03, 7 September 2023 (UTC)Reply
As to card, we may need two additional definitions, one for the general metaphorical sense, another (subsense?) for the more specific sociopolitical use. As Equinox observed elsewhere, the metaphor of a competitive card game must be understood for the expression to make any sense at all.
(figuratively) A ploy of potentially advantageous use in a situation viewed as analogous to a card game.
The only card left for him to play was playing dumb.
An invocation of an emotionally or politically charged issue or symbol, as in a political competition.
I don't think this is a special phrase with "you're", it sounds like a phrasal verb be on. They want a fight? They're on! She issued a challenge, so she's on! You can also use it in reference to the fight itself, e.g. the fight is on. 76.100.241.89 18:51, 23 May 2020 (UTC)Reply
Hmm, perhaps. But the IP is right that "on" can be used with other pronouns. I suppose the question is whether this is better viewed as someone is on, be on, or just on: we already have a sense for this at on, "(informal) Destined, normally in the context of a challenge being accepted; involved, doomed. "Five bucks says the Cavs win tonight." ― "You're on!" Mike just threw coffee onto Paul's lap. It's on now." - -sche (discuss) 04:25, 1 August 2020 (UTC)Reply
I have never heard the they're on or she's on examples given by OP, and the second one doesn't even fit, since it's about acceptance of a challenge. I've heard it's on, but that's slightly different. Theknightwho (talk) 02:43, 6 September 2024 (UTC)Reply
@PUC: I have no objections to the move, however, I'm not entirely sure that *vьlkodlakъ was the primary form. Semantically, it makes sense to analyze the lemma as Proto-Slavic *vьlkodolkъ = *vьlkъ (“wolf”) + *dolka (“skin”) + *-ъ with -ol- > -la- metathesis or possibly *vьlkodьlakъ (less likely in view of East Slavic forms with *-olo-, e.g. Russian вурдала́к (vurdalák, “vampire”)[1] /first recorded in written form in 18-19 cent./). You should consult with User:Rua in regard to which form should be created - *vьlkodlakъ or *vьlkodolkъ. I'm not so familiar with the style that Wiktionary likes to follow. Безименен (talk) 12:25, 10 July 2020 (UTC)Reply
If the original form had -dl-, why do we not see it in the other languages that preserve it, such as Polish? —Rua (mew) 13:25, 10 July 2020 (UTC)Reply
Not sure, but looking again at the entry, it seems not only Czech but also Serbo-Croatian and Slovene preserve the -dl- as well. --108.20.184.19 16:51, 10 July 2020 (UTC)Reply
An example of w:U and non-U English, which probably should be decided for the latter. While “scent” can possibly be broader, this category also has the danger of just about including anything that has a strong odour naturally. Hence I included بَارْزَد(bārzad, “galbanum”) and جُنْدُبَادَسْتَر(jundubādastar, “castoreum”). The English category has a weak six entries since created in 2011. But even Category:en:Perfumes includes dubious things. I doubt perfumes are something that can be categorized well – it’s basically anything smelly? –, maybe delete all? Fay Freak (talk) 01:09, 27 July 2020 (UTC)Reply
I think a case could be made for "scent" being not something that smells, but smell itself (like musk and maybe putridity). I don't see any reason why perfumes can't be categorized. I don't think it's meant to include anything that could be used as the scent of a perfume, but words that specifically describe perfumes. For instance, cologne isn't "cologne-scented", it's the name of a type of perfume; jasmine is a plant, but it is also used as the word for a perfume, not just to describe a perfume (you could say, "She always wore a liberal quantity of jasmine" and not just "She always wore a liberal quantity of jasmine-scented perfume". Of course, you could also say "She always wore a liberal quantity of Autumn Breeze" because it's a proper noun, but I don't think you could say "She always wore a liberal quantity of lilac". Instead you would say "lilac perfume".) Andrew Sheedy (talk) 03:07, 27 July 2020 (UTC)Reply
IMO it does not make sense to have some terms categorized directly into Category:Regional English (not its subcategories) and other terms categorized directly into Category:English dialectal terms, because in practice no-one seems to be maintaining a distinction as far as putting one kind of entry in one and another in the other, it seems haphazard as to whether an entry uses e.g. {{lb|en|US|regional}} / {{lb|en|UK|regional}} like pope, mercury, jack, snap, wedge, phosphate, tab, or gob, or else uses {{lb|en|US|dialectal}} / {{lb|en|UK|dialectal}} like pope (!), admire, haunt, on, sook, book, yinz, and gon. Many of the {{lb|en|US|dialectal}} / {{lb|en|UK|dialectal}} terms go on to specify which regions they're used in, like "Pittsburgh and Appalachia" or "Northern England" or "Scotland". And we put every more specific dialect category as a subcat of "Regional", not of "Dialectal". I'm not entirely sure which category the entries in the two top-level categories should be consolidated into, but I'm inclined to think they should go in one or the other. Or do we want to try to implement some distinction? (At the very least, entries that use "regional" but then go on to specify the regions, like "US, regional, Pittsburgh", can drop the unnecessary "regional".) The one situation I can think of where simply changing "regional" to "dialectal" would not work is that some entries are labelled "regional AAVE". Thoughts? - -sche(discuss)01:06, 10 October 2020 (UTC)Reply
I personally think that dialectal and regional terms should be separated, since a term for something in a region from an out-of-region dialect should be categorized into both regional dialects. -- 65.92.244.147 16:29, 22 November 2020 (UTC)Reply
That doesn't make sense. It's not the thing referred to that makes it regional or dialectal, it's the term itself. Do you have an example in mind? Chuck Entz (talk) 18:21, 22 November 2020 (UTC)Reply
I think the real problem is that it's not clear what we mean when we say something is dialectal. Linguistically, a dialect can be any speech variety that is separate from the rest of the language. With a language such as English that has multiple standards, you could say that much of the language is dialectal, though no one uses the term that way. I suspect there may be a value judgment involved: dialectal English is the way local people talk when they're not using proper English. Regional has less of that: I say potayto and you say potahto, but that's just a matter of geography. Theoretically, sociolects like AAVE and Cockney would be better described as dialectal than regional, but I'm not sure whether they're described as either. For a lot of people, though, it's probably whatever it's called in the references they check (or copy from). Chuck Entz (talk) 18:21, 22 November 2020 (UTC)Reply
"dialectal English is the way local people talk when they're not using proper English".
What, pray tell, is proper English? General Australian? Standard Canadian English? General American (*had trouble including that as a suggestion with a straight face*)? Standard Indian English?
If someone were to suggest that whatever is arbitrarily declared to be the 'standard' dialect of the English in their country is thus "proper English", and every other dialect is not, then that is obvious nonsense. I get that that is the reason why you used the phrasing value judgement, but if what you suggest to be going on is actually going on, then that is a problem.
Wiktionary aims to be descriptive, not prescriptive. So if the category "Regional English" is being used to suggest that certain dialectal terms are more "proper" than others, then we need to get rid of one category or the other. Tharthan (talk) 18:42, 22 November 2020 (UTC)Reply
I'm not agreeing with the value judgment. I was too lazy this morning to put everything in quotation marks. The basic problem is that this terminology goes back to earlier academic standards and it's hard to tell what it means in a more modern context. A dialectologist or other linguist would probably have a more rigorous definition, but we don't seem to. Chuck Entz (talk) 19:36, 22 November 2020 (UTC)Reply
Changing the name of the category will lead to greater consistency with Category:Conlanging, putting the contrast between the purpose of each category (names of constructed languages vs. conlanging terminology) in sharper relief.
The odd choice of wording was intended to avoid the topical category conflicting with Category:Constructed languages, which is a holding category for those languages. Given that our MediaWiki trappings make it impossible to resolve this conflict, I support this proposal as a better compromise. —Μετάknowledgediscuss/deeds06:16, 8 November 2020 (UTC)Reply
I'd bet that you couldn't come up with definitions on the merged entry that were both complete and substitutable as both adjective and adverb in such definitions. Also I'd expect that synonyms might need to be distinguished by PoS. DCDuring (talk) 16:22, 5 January 2021 (UTC)Reply
Hmm. I think the adjective is normally all-out. The adverb seems to mostly be "all out". So it seems like each POS is best situated where it is, on its own page...? - -sche(discuss)05:14, 31 March 2024 (UTC)Reply
These are terms that were historically used in the Dutch East Indies, perhaps to some degree also in Malay-speaking territories of the Dutch East India Company. A rename to Category:Dutch_East_Indies_Malay makes the most sense. It is doubtful that a category "Netherlands Malay" is needed because the number of speakers of Malay in the Netherlands is not very high. ←₰-→Lingo BingoDingo (talk) 19:57, 10 January 2021 (UTC)Reply
Not synonyms, of course, but certain senses overlap almost entirely (except people have edited one and not the other without realising). Equinox◑04:12, 14 February 2021 (UTC)Reply
An approach would be to put all and only the true definitions that most commonly use a given spelling under that spelling and also have a definition in each saying that it is a synonym of the other spelling. That might not be exactly true, but would be close. To rely on the other term appearing in related terms seems a bit weak. DCDuring (talk) 04:19, 14 February 2021 (UTC)Reply
Yeah, I think that's what we may have to do, with glosses in the {{synonym of}}s to make clear that each entry being a {{synonym of}} the other is not (just) circular. Like egoist vs egotist (we are not the only dictionary to have a sense line defining each of those terms as the other, in addition to other definitions). - -sche(discuss)19:45, 14 February 2021 (UTC)Reply
The definitions are different (I think fight shy is better; the other is too vague) and it seems that the entries should be merged anyway. Note that fight shy can occur alone, without of. Equinox◑02:45, 12 April 2021 (UTC)Reply
In this case (spitting) it should be "someone's", because "one" spits in "someone" else's face. We use "one" where the phrase is constructed so that it happens to oneself. Equinox◑20:27, 23 May 2021 (UTC)Reply
On the other hand, it's conceivable that someone says, "How dare you spit in my face?", meaning that the person addressed has treated the speaker disrespectfully. — SGconlaw (talk) 17:44, 4 June 2021 (UTC)Reply
Ah I see now, the distinction is that one's constructions are supposed to be reflexive. The distinction in titling however is not obvious and I wish it were made clear somewhere. — 69.120.64.15 03:57, 5 June 2021 (UTC)Reply
Latest comment: 3 years ago · 3 comments · 1 person in discussion
All of the other Proto-Tocharian entries so far use ⟨y⟩ for this phoneme */j/, equivalent to Adams' ⟨i̯⟩. This is also the letter used on the Wikipedia article for Proto-Tocharian and in the standard romanization of Tocharian languages, which we use, not to mention for the corresponding phoneme in PIE, *y. It would be nonsensical and confusing to use ⟨j⟩ instead for the Proto-Tocharian stage only. The page was created recently (April), so presumably its creator just forgot to check the existing entries. — 69.120.64.15 03:37, 5 June 2021 (UTC)Reply
Wait, apparently there is a distinction in how Adams uses ⟨i̯⟩ versus ⟨y⟩ for Proto-Tocharian, but I have no clue what it is. (It has nothing to do with PIE *d versus *y, for instance, and nothing to do with laryngeals.) — 69.120.64.15 04:18, 5 June 2021 (UTC)Reply
Ok, it seems to be non-phonemic and to have to do with the following vowel. */jä/ (⟨ä⟩ ≈ IPA /ɨ/) and */jē/ are written ⟨i̯ä⟩ and ⟨i̯ē⟩ respectively, but /jV/ for all other vowels seems to use ⟨y⟩. I doubt this is a necessary distinction for Wiktionary to make, since it seems entirely predictable from environment, but I'm still unsure what purpose it is meant to serve. @GabeMoore, might you be able to weigh in? — 69.120.64.15 04:34, 5 June 2021 (UTC)Reply
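Since the ⟨i̯⟩/⟨y⟩ split described above is predictable from the following vowel, folding it into ⟨y⟩ is a purely mechanical step. A minimal Python sketch, assuming Adams' ⟨i̯⟩ is encoded as "i" followed by U+032F COMBINING INVERTED BREVE BELOW (an assumption about the encoding, not checked against the entries); `to_y_spelling` is a hypothetical helper name, not an existing module function:

```python
import unicodedata

# Assumed encoding of Adams' non-syllabic i: "i" + U+032F COMBINING INVERTED BREVE BELOW.
NON_SYLLABIC_I = "i\u032f"


def to_y_spelling(form: str) -> str:
    """Rewrite every non-syllabic i as y, leaving the rest of the form untouched."""
    # Decompose first so the combining mark is always a separate code point.
    decomposed = unicodedata.normalize("NFD", form)
    replaced = decomposed.replace(NON_SYLLABIC_I, "y")
    # Recompose so vowels such as a-diaeresis and e-macron are single code points again.
    return unicodedata.normalize("NFC", replaced)
```

Forms that already use ⟨y⟩ pass through unchanged, so the helper could be applied to every Proto-Tocharian title without first checking which spelling it uses.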
Latest comment: 2 years ago · 10 comments · 9 people in discussion
I'm all but certain that one can't have a word without pronounced vowels, but I feel that it reads better if it's explicitly stated anyway. Johano★01:15, 15 June 2021 (UTC)Reply
Yeah (except maybe "English terms"), that would also reduce how dumb it looks that the category includes lots of numbers which are quite regularly pronounced with vowels, and things where the vowels have merely been obscured (b****cks), and abbreviations that aren't even "words" per se, like BHD. - -sche(discuss)22:03, 8 July 2021 (UTC)Reply
Also, why is it a subcategory of Category:English shortenings? Sure, a lot of shortenings omit the vowels, but the converse isn't true: hmm, grr, 1984 (unless every number is a shortening of its spelled out form, which doesn't seem all that useful). Do I need to start a separate request to remove a subcategory? Medmunds (talk) 18:53, 18 March 2022 (UTC)Reply
Latest comment: 3 years ago · 1 comment · 1 person in discussion
A bit fiddly: one entry is a verb and the other a noun, and they both have multiple senses with slight distinctions that should be ironed out. Equinox◑13:13, 25 July 2021 (UTC)Reply
User:Kwamikagami has moved a few Rapa Nui pages en masse from a straight apostrophe (U+0027) to a saltillo (U+A78C).
The reason they give for this is that, since Unicode classifies the apostrophe as a punctuation mark rather than a letter, it shouldn't be used as a letter, and thus the visually similar saltillo should be used.
The counter-reason given is that Unicode's classification is arbitrary and has little to do with actual usage in the language, which we as Wiktionary want to follow.
There is one mention of the saltillo being open to usage, in Kieviet (2007).
No usage of the saltillo for Rapa Nui has yet been found in the wild: both Kieviet (2007) and schoolbooks published by the Chilean government use either a straight apostrophe (U+0027), which is the most common, or a curly apostrophe (U+2018) provided with a font that renders it similar to a prime (U+2032). Other grammar books and dictionaries use any of the three characters.
I believe we should move these pages back to a straight apostrophe, and set the use of the straight apostrophe in stone at WT:ARAP. What do others think? Thadh (talk) 10:53, 3 September 2021 (UTC)Reply
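The punctuation-versus-letter classification that the move relies on can be checked directly against the Unicode Character Database. A minimal Python sketch using the standard `unicodedata` module; the `CANDIDATES` table and the `describe` helper are illustrative names, and U+2019 and U+02BC are included only for comparison (the post above cites U+2018, which Unicode names LEFT SINGLE QUOTATION MARK):

```python
import unicodedata

# Code points mentioned in this discussion, plus U+2019 and U+02BC for comparison.
CANDIDATES = {
    "straight apostrophe": "\u0027",
    "left single quotation mark": "\u2018",
    "right single quotation mark": "\u2019",
    "prime": "\u2032",
    "modifier letter apostrophe": "\u02bc",
    "saltillo": "\ua78c",
}


def describe(char: str) -> tuple[str, str, bool]:
    """Return (Unicode name, general category, whether Python treats it as a letter)."""
    return (unicodedata.name(char), unicodedata.category(char), char.isalpha())


for label, char in CANDIDATES.items():
    name, category, is_letter = describe(char)
    print(f"U+{ord(char):04X}  {category}  letter={is_letter}  {name}  ({label})")
```

The apostrophe and the prime come out in a punctuation category (Po), while the saltillo (Ll) and the modifier letter apostrophe (Lm) are letters. This only reports how Unicode classifies the characters; it says nothing about which one Rapa Nui writers actually use.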
We have four sources:
We have Du Feu, who used a special font because the usual fonts available to her were inadequate for Rapa Nui, which required two special letters (the glottal stop and the engma). If the ASCII apostrophe were adequate for glottal stop, there would've been no need for a special letter.
We have Kieviet, who states that, now that Unicode provides for the saltillo, there is no longer a need for a special font.
We have the ministry dictionary, which uses an apostrophe letter -- not ASCII input with smart quotes, because it has the '9' shape at the beginning of a word.
We have the ministry educational material, which uses a hodgepodge of ASCII apostrophes, curly apostrophes and curly quotation marks -- that is, sometimes '1' shaped, sometimes '9' shaped and sometimes '6' shaped, with little consistency. Presumably we wish to aim for better than that, even if it is common.
In most languages that use an apostrophe-like letter for glottal stop, it's common to substitute a keyboard <'>, but that doesn't mean we should do the same. When writing Chechen, it's common to use a digit <1> for palochka, but again that doesn't mean we should do the same. When writing Ossetian, it's common to use a Latin rather than Cyrillic æ, but if you did that in a domain name, it would likely be tagged as phishing. The shortcuts people take with typography may be common, but a dictionary is expected to be more professional. kwami (talk) 15:10, 3 September 2021 (UTC)Reply
To briefly summarise the important points of what I said on Kwamikagami's talk page: This move should have been raised here first, so the weight of the evidence should have to point to the saltillo for us not to move it back. Kwami is from Wikipedia, and believes that we should be "more professional", even at the cost of ignoring all actual usage in a language community. (He has not, to the best of my knowledge, taken me up on my suggestion that he should go to the Wikipedias of languages like Rapa Nui and Hausa that use the apostrophe, and tell them that they're doing it wrong — just us.) I was open to the possibility that the saltillo might see actual use, but the fact that it doesn't makes this seem to be all about the Unicode specifications, which are not relevant to a descriptive dictionary. As a result, I support moving back to the apostrophe. —Μετάknowledgediscuss/deeds16:15, 3 September 2021 (UTC)Reply
There are several recent cases where @Mahagaja has advocated a particular Unicode character instead of the straight apostrophe in such cases, but I don't remember the specifics off the top of my head. Chuck Entz (talk) 16:18, 3 September 2021 (UTC)Reply
We do need to use Unicode correctly. The straight apostrophe (U+0027) and curly apostrophe (U+2018) are punctuation marks and should not be used as letters. That's what the saltillo (U+A78C) and modifier letter apostrophe (U+02BC) are for. If using punctuation marks as letters were acceptable, Unicode wouldn't have bothered creating those characters. Using punctuation marks for letters is as bad as mixing Latin and Cyrillic (which is something we used to do for Montenegrin Serbo-Croatian, but don't anymore), as Kwami points out, and just because other sources do it doesn't mean we should. We can, of course, have hard redirects from spellings with the more easily typable straight apostrophe, or put the correct page name in {{also}} if the spelling with the straight apostrophe exists (as a punctuation mark) in another language. But Kwami was quite right to move these Rapa Nui pages to the spelling using the correct character, and they should not be moved back. —Mahāgaja · talk16:39, 3 September 2021 (UTC)Reply
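The redirect idea in the comment above amounts to deriving an easily typed title from each title that contains a letter apostrophe. A rough Python sketch under stated assumptions: `typable_variant` is a hypothetical helper, not an existing Wiktionary or MediaWiki function, and the test string used with it is a placeholder rather than a verified Rapa Nui word:

```python
from typing import Optional

# Letter-class characters named in the comment above.
SALTILLO = "\ua78c"
MODIFIER_APOSTROPHE = "\u02bc"
# The keyboard apostrophe that a redirect would be created from.
TYPABLE_APOSTROPHE = "\u0027"


def typable_variant(title: str) -> Optional[str]:
    """Return the straight-apostrophe spelling of a title, or None if no redirect is needed."""
    variant = title.replace(SALTILLO, TYPABLE_APOSTROPHE)
    variant = variant.replace(MODIFIER_APOSTROPHE, TYPABLE_APOSTROPHE)
    # A title without any letter apostrophe needs no redirect.
    return variant if variant != title else None
```

Whether the variant becomes a hard redirect or an {{also}} link would still depend on whether the straight-apostrophe spelling already exists as an entry in another language, which this sketch does not check.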
@Mahagaja: So if nearly everyone writing text in a given language (say, tens of millions of people) use a character that you consider "wrong", we should still avoid it because it doesn't respect Unicode? Whatever happened to descriptivism? (And if you think this is a silly hypothetical, it's not — I just described the situation with the apostrophe in Hausa.) —Μετάknowledgediscuss/deeds21:56, 3 September 2021 (UTC)Reply
It reminds me of when I started adding entries in the Cupeño language and had to figure out how to deal with a letter that the (pre-Unicode) main source defends as being very easy to replicate by filing bits off the $ key on a typewriter. People work with what they have available, and it doesn't always fit neatly into the right categories. Chuck Entz (talk) 23:00, 3 September 2021 (UTC)Reply
@Metaknowledge: If tens of millions of people used Rapa Nui, it would have its own keyboard layout and the saltillo would be easy to type for them. Descriptivism applies to language, not orthography. It's not anti-descriptivist to say that recieve is a misspelling, and using an apostrophe as a letter is also a misspelling. The only difference is that using an apostrophe instead of a saltillo isn't a mistake that can be made when writing by hand or by typewriter or that can be detected in a photocopy or a scan, so it's more subtle (like mixing Latin and Cyrillic), but it's still a mistake. —Mahāgaja · talk06:49, 4 September 2021 (UTC)Reply
@Mahagaja: As I said, my example wasn't a hypothetical. There are somewhere around 60 million native speakers of Hausa per WP. Mac offers lots of keyboards for lots of languages, including one for Hawaiian complete with ʻokina, but it doesn't provide a Hausa one. When I search for Hausa keyboards on Google, they provide the apostrophe and quotation marks, but no character designated by Unicode as a letter. So are you really maintaining that nearly all typed material in Hausa is misspelt? —Μετάknowledgediscuss/deeds07:11, 4 September 2021 (UTC)Reply
Yes, though of course that's not the Hausa users' fault, it's the fault of the software companies that care more about providing support for a minority language spoken by 24,000 people in the United States than about providing support for a language spoken by tens of millions of people in Africa (i.e. systemic racism). I don't blame Hausa users for doing the best they can with the materials available to them, and I know it's unrealistic to expect them all to type ʼ instead of just hitting the apostrophe key, but as a dictionary it's our responsibility to do things the right way rather than the easy way. —Mahāgaja · talk07:29, 4 September 2021 (UTC)Reply
@Mahagaja: Systemic racism is the root cause of lots of annoying things, but some of those things are set in stone. At this point, Hausa users have no reason to follow Unicode rules even when they can. I'm sure the editors at Hausa Wikipedia can figure out how to get the "correct" character if they wanted to, but I see that you too have no interest in going over there and telling them they're doing it wrong. I have a radical idea: let's respect their choices. —Μετάknowledgediscuss/deeds17:57, 4 September 2021 (UTC)Reply
I have a better idea. We'll let Hausa Wikipedia worry about Hausa Wikipedia, and we'll worry about Wiktionary, which, as I said, has a responsibility to use Unicode correctly, even when other Wikimedia projects use it wrong. —Mahāgaja · talk18:11, 4 September 2021 (UTC)Reply
If Wiktionary or our browsers represent the languages incorrectly, because they follow the Unicode definition that punctuation marks are punctuation marks, then we are not documenting the languages correctly. If a language commission chooses a specific Unicode point that is one thing, but that's seldom the case. Since we by necessity choose a Unicode point for each letter regardless, we might as well choose one that represents the language well. kwami (talk) 01:17, 11 September 2021 (UTC)Reply
Just to jump in quickly, as someone who is Nigerian and has had to go through the process of creating my own keyboards to be able to type properly in Yorùbá and as someone who is learning Hawaiian, while there definitely is systemic racism when it comes to African languages, I really would not pit them against Hawaiian. Hawaiian still lacks a ton of support, sometimes even less than Hausa, Igbo, & Yorùbá (see: spellcheck on PC Microsoft Word or language packs for Windows), and people are still trying to get more support for it. At the same time, Hawaiian is more than just "a minority language spoken by 24,000 people in the United States", it is an indigenous language that currently is the product of tons of effort gone towards revitalizing it and making sure that it's well-supported. And so, please do not pit them against each other saying that Hawaiian having more support (even though it doesn't) is systemic racism. The communities are aiming for very similar goals and are all dealing with racism in our own ways, not from each other.
Re: the main issue at hand. I would go with what the speakers of the language use. It's similar to what we do for Hausa, Igbo, & Yorùbá tones. No matter how annoying it can be, since the majority of speakers don't write tones out, we don't put them in page titles and only in headword lines, since we want people to be able to find words that they see "in the wild", which will often not be tone-marked. So it's a similar issue here, if the majority of speakers and majority of texts don't use the special character and it's not hard prescribed, then the page title shouldn't change, and the special character can be put in the headword line. That's my personal take on that issue. AG202 (talk) 13:48, 11 September 2021 (UTC)Reply
This isn't a case of leaving out elements like tone marks. All RS's for Rapa Nui use the glottal stop. It's a matter of deciding which Unicode point to use for it, not whether to include it.
Re. the poor support for W. African languages, that's not racism so much as bias in the interests of the people developing Unicode. When Unicode decided they would no longer accept precomposed Latin, there was a call for people to get what they needed in before the deadline. But the respondents were all working on European languages. After West Africans started complaining that Unicode didn't adequately support their languages, the Unicode people realized they'd fucked up. At least, the ones I've talked to say they wished they'd realized what was going to happen, and spent more time on major African languages than on minor European languages.
Now that there are supplemental planes, there's room for more precomposed Latin. But as computers improve, there's less and less need for it, so I doubt they'll start accepting precomposed letters again.
I find it amazing that you could write Yoruba without tone. I mean, you can write it without vowels as long as you include the tone! kwami (talk) 23:06, 18 October 2021 (UTC)Reply
There are many reasons that folks don't write Yoruba with tones, partly because of a lack of solid education, partly because of a lack of technological support, partly because you can (usually) tell what you mean from context, and a ton more. There was a solid seminar done last year at the British Library about it actually, but yea it's complicated. I wish that precomposed characters could be brought back, but that's a pipe dream. I don't think that you can write it without vowels as long as you include the tone though, as Yoruba is very vowel-heavy, and it'd get confusing quickly.
In terms of the question at hand though, I brought up the comparison more as a way to show how prescribed writing & everyday writing can interact on Wiktionary. If the majority of speakers type/write one way in informal & formal registers, that way should be the way that is primarily reflected on Wiktionary, while the prescribed way can be shown in the headword line or an alternative form or whatever. However, I don't know the specifics of the situation with Rapa Nui, so I won't comment directly on the specifics of addressing it. AG202 (talk) 23:15, 18 October 2021 (UTC)Reply
My impression of Yoruba, from the very very little I think I know of it, is that in fluent colloquial speech the vowels tend to assimilate to each other, and even consonants sometimes drop out, so that you might be left with a long [ooooo] with a bunch of tones and just a few consonants. It's the tone that makes it comprehensible. But that's by ear; I guess it wouldn't work well in writing.
But Hausa, yeah, I can see omitting the tone without any problem, except maybe the need to dab an occasional word. You might learn to write those few words with tone, the way accent marks distinguish homonyms in Romance languages, and otherwise ignore it. And some languages mark changes in tone, rather than the tone of each syllable. But I doubt that would work for Yoruba either. kwami (talk) 23:23, 18 October 2021 (UTC)Reply
Input needed
This discussion needs further input in order to be successfully closed. Please take a look!
I have a suggestion that might be able to square this circle, but it's a bit awkward to explain so bear with me:
There are situations when it makes sense to remember the difference between the orthographic character a person intends to write, and the Unicode character which they actually use. A good example of this is the full stop "." (U+002E), which is also used in English (and Translingually) to represent the decimal point. We all agree that a full stop and decimal point are two different things, because any competent French translator would have to treat them differently, but the important thing is that that remains true regardless of which Unicode codepoints we happen to use. Indeed, it's true whether or not we're even encoding the characters at all. The same is true in French with the decimal point and the comma, too. Equally, nobody who receives "A-" on their homework is receiving "A dash" or "A hyphen".
Conversely, just because I write in full-width doesn't mean that any of us actually think "ｊ" has a distinct identity to "j" etc. There might be technical, historical and/or stylistic reasons why we have both, but the point is that we consider them to have the same orthographic identity.
However, none of this prevents us from having a particular manual of style when it comes to certain characters. If we want to start using the en dash "–" (U+2013) or minus sign "−" (U+2212) in places where people intended to use them (i.e. intended characters with those orthographical identities), then that's fine. It would be no more of a problem than our choice to use a clear, black, legible font on white by default, when the original might be scrawled on a barely legible manuscript. Obviously there are no codepoints to interpret in cases such as that. Hell, a lot of the time the codepoints "used" are actually just whatever the OCR software vomited up anyway. Just like with misspellings, there needs to be some genuine intention, and it needs to be considered with respect to the orthographical identity of the characters, and not the codepoints they happened to pick.
A final point is that writers of a language don't necessarily know their own language perfectly, or they might not perceive a conscious distinction between two characters that does actually exist, because the context usually makes it so obvious (e.g. the full stop and the decimal point). It's not enough to say "yes, they intended to write an apostrophe because that's what they used". Are they really treating it as one?
I don't know enough about Rapa Nui to know whether the saltillo is the most appropriate character, but I hope that's a framework that makes it easier to determine the answer. Theknightwho (talk) 16:00, 7 July 2022 (UTC)Reply
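Unicode's own compatibility normalization draws part of the line discussed in this comment: it treats full-width letters as variants of the ordinary ones, but it does not equate any letter apostrophe with the punctuation apostrophe. A small Python sketch using the standard `unicodedata` module; `same_after_nfkc` is an illustrative helper name:

```python
import unicodedata


def same_after_nfkc(a: str, b: str) -> bool:
    """True if two strings become identical under NFKC compatibility normalization."""
    return unicodedata.normalize("NFKC", a) == unicodedata.normalize("NFKC", b)


# Full-width j (U+FF4A) folds to ASCII j: same orthographic identity.
print(same_after_nfkc("\uff4a", "j"))
# Saltillo (U+A78C) versus straight apostrophe: normalization keeps them apart.
print(same_after_nfkc("\ua78c", "\u0027"))
# Minus sign (U+2212) versus hyphen-minus: also kept apart.
print(same_after_nfkc("\u2212", "-"))
```

So normalization cannot settle the saltillo question mechanically. Deciding which orthographic identity a writer intended remains an editorial judgement of the kind described above.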
This Turkish suffix entry is probably the same as -i, and possibly -ı, due to vowel harmony. While I don't know much about Turkish, the fact that this was created as the only Turkish edit ever by this contributor and the other two were created by a veteran contributor who is a native speaker has to count for something. Chuck Entz (talk) 05:45, 18 September 2021 (UTC)Reply
-u is a harmonized form of -i, as are -ü and -ı. The canonical forms of suffixes are those with i and e. I disagree with the current policy of essentially providing the same definition 4 (or 2) times, see for instance the situation with -im, -ım, -um, -üm where only one contains all meanings and etymological information. I'm in favor of keeping the harmonized realizations of the suffixes as separate articles but I'm strongly in favor of converting the non-canonical forms into simple referral pages (see -dük). --Fytcha (talk) 09:59, 18 September 2021 (UTC)Reply
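The four-way harmony behind -i, -ı, -u, -ü, and the proposed mapping of non-canonical forms onto a canonical page, can be sketched in a few lines. This is a simplified Python illustration: the function names are hypothetical, input is assumed to be lowercase, and consonant alternations and other irregularities are ignored:

```python
# Last stem vowel -> harmonized high vowel of the suffix (four-way harmony).
HIGH_VOWEL_FOR = {
    "e": "i", "i": "i",
    "a": "ı", "ı": "ı",
    "o": "u", "u": "u",
    "ö": "ü", "ü": "ü",
}

# Harmonized suffix vowels -> canonical vowels (i and e), per the comment above.
CANONICAL_VOWEL = {"ı": "i", "u": "i", "ü": "i", "a": "e"}


def harmonized_i(stem: str) -> str:
    """Pick the form of the suffix -i that harmonizes with the last vowel of a lowercase stem."""
    for char in reversed(stem):
        if char in HIGH_VOWEL_FOR:
            return HIGH_VOWEL_FOR[char]
    raise ValueError("stem contains no vowel")


def canonical_suffix(suffix: str) -> str:
    """Map a harmonized suffix spelling to its canonical i/e spelling."""
    return "".join(CANONICAL_VOWEL.get(char, char) for char in suffix)
```

For example, `harmonized_i("okul")` selects u and `canonical_suffix("-üm")` gives `-im`, which is the page a referral entry for -üm would point to under this proposal.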
This would almost certainly not survive on Wikipedia. I have mixed feelings about the Appendix namespace here, as it seems a lot of things go there that would never be acceptable in the main dictionary, but that appendix pages are so hard to find for the casual user that it doesn't really bring us down. —Soap—11:36, 13 April 2023 (UTC)Reply
If we do end up deleting this, I'd hope we could try to contact the editor who wrote most of it (see User_talk:DKThel). There are other wikis that could host content like this where they wouldn't be pushed into the background like our appendix pages are. Admittedly the trade-off for that is having ads and using a site that is itself harder to find. —Soap—11:47, 13 April 2023 (UTC)Reply
There's definitely no point in moving it to Wikipedia, since it was originally moved here from Wikipedia, so they already decided they don't want it and foisted it on us. That's why it's written so encyclopedically. If anyone's interested in it, they should clean it up to make it more dictionarian; otherwise we should just delete it. —Mahāgaja · talk12:34, 13 April 2023 (UTC)Reply
Sfacimma is an alternative spelling of sfaccimma, which is how the word has been mostly written in the past years; the IPA pronunciation of the word is nowadays always /ʃfat͡ʃˈt͡ʃimmə/, with the voiceless postalveolar affricate having always the gemination, hence the spelling cc. Antomanu14 (talk) 13:59, 10 December 2022 (UTC)Reply
I think it's been an alternative form of guarantee for a couple of centuries. Only recently (this decade) has it become much less common than guarantee. DCDuring (talk) 17:39, 13 November 2021 (UTC)Reply
I'm in favour of moving this page to *én. As {{R:ine:LIPP|page=221|vol=2}} shows, there is no evidence that points to an initial laryngeal and Greek and Vedic speak against it.
We reconstruct all PIE terms with an initial laryngeal on the project, per current PIE theory, so *én = *h₁én. Sidenote, {{R:ine:LIPP}} is an embarrassment in the academic community, and should never be used as a primary source. --Sokkjo (talk) 02:24, 8 February 2023 (UTC)Reply
I have elaborated on why academics largely reject {{R:ine:LIPP}}, and referred you to this unfavorable review, DOI:10.1515/zcph-2019-0009. That's all neither here nor there, as on this project, we subscribe to laryngeal theory, which also calls for word-initial laryngeals before vowels. If you wish to make an argument for why we should do away with that standard, feel free to start a discussion in the WT:Beer parlour, but as is, your move request is unwarranted. --– Sokkjō08:52, 27 March 2023 (UTC)Reply
You seem to have not read that review. If you did you'd see that it is overwhelmingly positive:
"It will be a reference work for a long time to come." (translated from the French)
"These remarks take nothing away from the importance of the work, which can serve as a basis both for informed synchronic research devoted to a given group of languages and for a properly comparative study." (translated from the French)
The other review I'm aware of is also overwhelmingly positive:
"In this massive, and truly monumental, two-volume work that was years in the making, author George Dunkel (henceforth D) draws on the extensive research, and the literally dozens of articles, that he has done throughout his distinguished career as an Indo-Europeanist, investigating the uninflected bits and pieces – the ἄπτωτα (áptota), the indeclinabilia¹ – of the Indo-European lexicon that are so indispensable to the phrasal and sentential syntax and to discourse and text structure in all the family’s languages." https://www.jbe-platform.com/content/journals/10.1075/dia.33.4.05jos
Nothing about this thing takes away from laryngeal theory.
Since I'm guessing you don't have academic access to the second page:
(Translated from the French:) One reservation remains. Despite the author's caution, the processes by which the grammemes he studies were formed are, by definition, a matter of reconstruction, and the work does not study in detail the processes that took place in historical times. At times the reader may have the impression that the reconstructed system is so complex as to be typologically implausible; thus, in vol. 1, pp. 24–26, the author believes he can reconstruct for Indo-European four pronominal stems of proximal exophora, two of distal exophora, and four anaphoric stems (George Dunkel writes that the stems linked to proximal and distal exophora do not contrast semantically with one another, but only with the absence of deixis; this point is obscure to the reviewer). Such a wealth of demonstrative stems would call for an explanation. Moreover, the opposition between proximal and distal exophora is not necessarily sufficient to cover all the stems of Indo-European, which may, for example, have had three degrees of exophora. In fact, it may seem that the reconstruction of Indo-European grammemes is doomed to vagueness, for want of data allowing the study of, in particular, the exact semantics of the elements concerned at the various chronological stages and in the various geographical areas to be taken into account.
Again, I'm not here to argue about LIPP -- that's beside the point. The point is that the established convention we follow on the project for reconstructing PIE is that #VC- only possibly exists in pronouns, if even there. See {{R:ine:IEL|52}}. – Sokkjō23:03, 27 March 2023 (UTC)Reply
@Sokkjo, I read the whole review. I even quoted from the second page. As I said before, a reservation does not equal an invalidation.
The validity of LIPP is very much on point. I would like to mention an alternative reconstruction *én (and if others agree move the page), which is supported by evidence instead of on some misplaced assumption. You preclude any discussion by rejecting the evidence out of hand.
In addition, I would like to continue citing LIPP, so your violent objection to it ("embarrassment to the academic community") is relevant to me. I think it is fair to say that if you could back up your objection you would have done so by now.
I am of course aware that roots had a CₓVCₓ structure. There are good reasons for assuming this. However, this is not the case for suffixes, it is not the case for pronouns, it is not the case for adverbs and it is, indeed, not the case for particles.
Your "established convention" that #VC- entries aren't allowed, doesn't exist. If you think otherwise please point to it. WT:AINE does not mention the phonotactics of entries. And at any rate, WT:RECONS clearly says that "variants and disputed forms can then be addressed in great detail within the text of the pages themselves". If you don't want *én on the page, you need to have (at the very least) substantive arguments why the evidence supporting it is wrong.
But I'd ask you to please be more careful about your references. You keep quoting things which don't support your position. {{R:ine:IEL|52}}: "It seems that onsetless initial syllables (#VC-) were rare", i.e. not nonexistent. LIPP, the first systematic study of Indo-European particles, documents evidence for a substantial number of exactly these. —Caoimhin ceallach (talk) 01:23, 29 March 2023 (UTC)Reply
I'm aware of what I cited, "rare" meaning they are limited to pronouns, and to continue on to the following sentence, "It is common practice now to reconstruct initial laryngeals even when not strictly provable". You seem to be under the impression that I created this "common practice" and that I set that convention here on the project. I'm honored you think I have that seniority, but despite contributing here for over a decade, it long precedes me. If you want to argue against the status quo, not just on this project, but in academia at large, the weight is on you to do so. – Sokkjō03:51, 29 March 2023 (UTC)Reply
Latest comment: 3 years ago · 1 comment · 1 person in discussion
The art/literature senses are defined very differently at the two entries, which seems like a problem. One is already tagged for cleanup, so, good luck! Equinox◑00:21, 1 February 2022 (UTC)Reply
{{dialectboiler|zh|the 5th century BC to 2nd century AD, and continued as a [[literary language]] until the 20th century}}
[[Category:Old Chinese lemmas]]
[[Category:Middle Chinese lemmas]]
[[Category:Literary Chinese lemmas]]
I also think we should demote the resulting Category:Classical Chinese language to an etym-only language of Category:Chinese language. It's purely a literary construct and not on the same level as the spoken varieties. Note for example that we don't have separate languages for Classical Latin, Koranic Arabic or Modern Standard Arabic.
@Benwing2: "Literary Chinese" is generally regarded as the same as "Classical Chinese", but reading w:Classical_Chinese#Definitions I think we might want to keep them separate for lexicographic purposes, by treating Literary Chinese as the later stages. Note that Classical Chinese is also distinct from "literary Chinese" (note the capitalisation), although they overlap in certain places (such as Ming/Qing era usage).
Using the hypothetical conjunction "if" as an example, 誠 / 诚(chéng) and 向使(xiàngshǐ) are found in Qin-Han era Classical Chinese, but not in Ming-Qing era Classical Chinese (which uses words closer to modern usage instead, i.e. literary terms) - I wouldn't call the former ones literary terms, but rather obsolete or archaic.
Early Classical Chinese (Qin-Han) is significantly different from modern Chinese in terms of grammar and pronunciation, a reasonably educated person would have a hard time understanding a text even with annotations; Tang-Song era Classical Chinese is still somewhat incomprehensible with a couple of obsolete (in modern standards) terms; late Classical Chinese (Ming-Qing) is more fuzzy and one might simply call it literary Chinese; in early modern times these are all considered to be one thing, which is why we have the misnomer Literary Chinese.
I think #1 of your proposal would be relatively uncontroversial, though I would wait to see input from others.
#2 is questionable, depending on what we regard Chinese to be. Because we have everything placed under Chinese, this is like stuffing everything from Old Latin (or even earlier) to Neo-Latin into a subvariety of a Latin-Romance language without treating Latin itself as a language, which is a very awkward thing.
Classical Chinese has its own quotations (and as Fish bowl has mentioned we have Category:Korean Classical Chinese and Category:Vietnamese Classical Chinese - the quotations for these varieties are also placed under the Chinese L2), which are categorised as Category:Literary Chinese terms with quotations. Changing Classical Chinese to etymology-only would mean these quotations have nowhere to go - it is often impossible to discern where they should otherwise be treated. I would rather counter-propose that Category:Old Chinese language and Category:Middle Chinese language be treated as an etymology-only variant of Classical Chinese - OC and MC are essentially just a snapshot of the sound system at a particular time point in the history of Chinese.
#3 is 1000% a no, though I would support it if one day we were to accept Altaic languages as valid :)
Fish bowl's proposal might have been motivated by the similarities between late Classical Chinese and literary terms in modern Chinese, but a more in-depth look would suggest that this is untrue. – wpi (talk) 06:09, 19 September 2023 (UTC)Reply
@Wpi Old Chinese and Middle Chinese are more conventional languages, even if semi-reconstructed, so I would argue they should stay as full languages, whereas Classical Chinese is somewhat of an artificial construct, and normally we place those as etym languages. I think the issue here is that there is more than one Classical Chinese, whereas most Classical Foo languages are fairly unified. This suggests we should separate Classical Chinese into something like Old Classical Chinese or Early Classical Chinese (an etym language of Old Chinese), Middle Classical Chinese (an etym language of Middle Chinese), and Late Classical Chinese (an etym language of Chinese?). Benwing2 (talk) 06:20, 19 September 2023 (UTC)Reply
A few issues here. I think we've been kind of sloppy when it comes to the literary/classical distinction. Most entries have been using "literary" since that was the norm back in the day. "Classical" came later as the "Classical" label was introduced to other languages, which is probably why we have fewer uses of this label. While I think the Literary/Classical distinction is useful, I wonder how we should be making the distinction in labelling. If a term is used in both Classical Chinese and Literary Chinese, such as 首 "head" (now labelled "archaic"), do we have to label it with both? Whatever we decide on, I think we also need to think about how this is organized in {{zh-x}}.
In principle, I think #1 would be something okay to do. #2 doesn't seem okay per Wpi. The issue with Classical Chinese is that it cannot fit neatly in OC or MC because, as Wpi said, these are snapshots of the phonology. There are also Classical/Literary Chinese works written way after the Middle Chinese period, but not necessarily able to be considered as any modern Chinese variety. #3 would probably need to be worked out entry by entry. Some entries should probably be moved to Classical Chinese, but others may be used in highly formal modern writing. It might be difficult to distinguish the two given our previous usage of "literary" as a label. We would need to set stricter definitions for what goes where. — justin(r)leung{ (t...) | c=› }03:33, 20 September 2023 (UTC)Reply
@Justinrleung @Wpi @Fish bowl Pinging the people who previously participated as well as @Theknightwho. I am trying to convert all the bespoke variety codes in {{zh-x}} to standard codes. I added zhx-lit for "Literary Chinese", specifically the later stage of Classical Chinese; but this conflicts with the name of zhx. I really think we should rename zhx, probably to Classical Chinese. The only issue with this term is that sometimes Classical Chinese specifically seems to refer to the 5th century BC - 2nd century AD period, as in Category:Classical Chinese, or some other similar time period. I'm thinking maybe we need a different term for this: Han Classical Chinese? Although strictly speaking, the Han dynasty only began in the 3rd century BC. Or "Late Old Classical Chinese"? Please note, I also added zhx-pre for "Pre-Classical Chinese" corresponding to the old CL-PC code; but I have no idea if this makes any sense, as it seems awfully similar to Old Chinese. Benwing2 (talk) 04:47, 27 March 2024 (UTC)Reply
I should add, I also added a code cmn-bec for "Beijingic Mandarin", which is the primary branch of Mandarin that includes Beijing and environs. This is described in Wikipedia under Beijing Mandarin (division of Mandarin) whereas Beijing Mandarin itself (code cmn-bei) is described under Beijing dialect. The term "Beijingic" comes from Glottolog. This was added to correspond to the M-UIB code added and used primarily by User:Dokurrat. Since M-UIB is described as "dialectal Beijingesque Mandarin", I assume it approximately corresponds to the Beijingic primary branch. Note that the existence of Beijingic is somewhat controversial as some researchers place Beijing and surrounding dialects into Northeastern Mandarin. I also added labels (but not etym codes) for all primary Mandarin branches and many individual dialects under these branches; basically, any dialect that had 4 or more mentions among the labels as well as any dialect where I could find a corresponding English Wikipedia page describing it. (There are more dialects with Chinese Wikipedia pages but I haven't yet found them all.) Eventually I think we should assign etym codes to most or all of these dialects but for the moment I'm mostly just collecting them into labels; once I have a fairly complete set of labels it will be easier to assign codes in a semi-consistent fashion. Also, I am ready to push the code to allow both new (standard) and old (bespoke) variety codes in {{zh-x}} and then convert all uses to the new codes, but this can't run until my current run obsoleting {{zh-noun}} and {{zh-hanzi}} finishes. (It's run for ~ 22 hours so far and has maybe 14 hours to go.) Benwing2 (talk) 07:01, 27 March 2024 (UTC)Reply
BTW here is the current mapping I have worked out from old bespoke {{zh-x}} codes to standard codes:
On second thought, I don't think we should have yue-wvc and zhx-tai-wvc as they seem to be too similar to lzh-yue and lzh-tai, plus WVC-C is used only once in 萊苑 (and none for WVC-C-T). @Fish bowl who added the ux in 萊苑 for comment. – wpi (talk) 13:07, 27 March 2024 (UTC)Reply
WVC-C can probably be merged into C-LIT, but I don't have any particular suggestion for the *-C-T codes.
Mentioning this again: perhaps we should use a bipartite system giving the text language and the pronunciation language, such as lzh/cmn-TW (Literary Chinese in Taiwanese Mandarin pronunciation). —Fish bowl (talk) 03:49, 30 March 2024 (UTC)Reply
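The bipartite idea can be sketched in a few lines of Python. This is only an illustration of the notation, not how {{zh-x}} or Module:zh-usex actually works: the names `UsexVariety` and `split_bipartite` are hypothetical, the only code pair taken from the discussion is lzh/cmn-TW, and treating a bare code as "same language for text and reading" is an assumption made for the sketch.

```python
from typing import NamedTuple


class UsexVariety(NamedTuple):
    """Hypothetical container: language of the text and language of the reading."""
    text_lang: str
    pron_lang: str


def split_bipartite(code: str) -> UsexVariety:
    """Split a 'text/pron' code into its two parts.

    Assumption for this sketch: a bare code with no slash uses the same
    language for both the text and the pronunciation.
    """
    text_lang, sep, pron_lang = code.partition("/")
    # Reject empty halves such as "", "lzh/" or "/cmn-TW".
    if not text_lang or (sep and not pron_lang):
        raise ValueError(f"malformed variety code: {code!r}")
    return UsexVariety(text_lang, pron_lang if sep else text_lang)
```

With this, `split_bipartite("lzh/cmn-TW")` yields a text language of `lzh` and a pronunciation language of `cmn-TW`, which is the Literary Chinese in Taiwanese Mandarin pronunciation case mentioned above.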
@Fish bowl Thanks for bringing this up; I missed it last time. In fact my recent overhaul of Module:zh-usex/data implemented something very similar. Essentially, there is the variety code, which is typically an etym-only language code, and then for each such variety there is a second "norm code" that is used for romanization (i.e. pronunciation) purposes. There's nothing preventing us from implementing your suggestion on top of this, if it proves necessary. Benwing2 (talk) 04:15, 30 March 2024 (UTC)Reply
@Fish bowl @Wpi Just to confirm: All three of WVC-C (yue-wvc) "Written vernacular Cantonese", CL-C (lzh-yue) "Classical Cantonese" and C-LIT (yue-lit) "Literary Cantonese" can be merged? This is based on @Wpi suggesting merging WVC-C with CL-C and @Fish bowl suggesting merging WVC-C with C-LIT. I assume that if these were different they would refer to different time periods (?), but I don't know if there's enough difference to warrant separation. Benwing2 (talk) 05:02, 30 March 2024 (UTC)Reply
@Benwing2: Incorrect. I believe the problem here is that we have multiple understandings of the usage of the Cantonese codes. Generally speaking, there are these types:
1. modern spoken Cantonese
2. modern written vernacular Chinese (e.g. Hong Kong Chinese)
3. written vernacular Chinese from 19th/early 20th century
4. Classical Chinese that uses Cantonese pronunciation
5. spoken Cantonese from 19th/early 20th century (e.g. dictionaries from missionaries)
They are especially difficult to tell apart when the phrase/sentence is short and does not contain many grammatical features.
Justin, Alex and I use C for #1 and #5 (or the newly added C-HK when applicable), C-LIT for #2, and C-CL for #3 and #4.
Fish Bowl uses C-GZ for #1, WVC-C for #2 and #3, and C-CL for #4. (Please correct me if my understanding is incorrect)
@Wpi OK thanks and apologies for my confusion, I haven't encountered before (as a linguist) the situation where there's a big gap between the written and spoken forms and multiple ways of pronouncing a given written form. Benwing2 (talk) 20:11, 30 March 2024 (UTC)Reply
I should add, does anyone mind my renaming the old {{zh-x}} codes to the new ones? As shown in the above table, no information will be lost because there is a one-to-one mapping between the old and new codes. Benwing2 (talk) 02:56, 31 March 2024 (UTC)Reply
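The "no information will be lost" claim amounts to the mapping being one-to-one, which is easy to check mechanically. Below is a hedged Python sketch of such a check. The table contains only the old/new pairs that are spelled out in this discussion (the full mapping table is not reproduced here), and the helper names `is_one_to_one` and `invert` are made up for illustration.

```python
# Old bespoke {{zh-x}} codes -> standard codes, limited to the pairs that are
# named explicitly in this discussion. This is NOT the complete table.
OLD_TO_NEW = {
    "WVC-C": "yue-wvc",   # Written vernacular Cantonese
    "CL-C": "lzh-yue",    # Classical Cantonese
    "C-LIT": "yue-lit",   # Literary Cantonese
    "CL-PC": "zhx-pre",   # Pre-Classical Chinese
    "M-UIB": "cmn-bec",   # Beijingic Mandarin
}


def is_one_to_one(mapping: dict[str, str]) -> bool:
    """True when no two old codes share the same new code."""
    return len(set(mapping.values())) == len(mapping)


def invert(mapping: dict[str, str]) -> dict[str, str]:
    """Build the new -> old mapping; only meaningful for a one-to-one table."""
    if not is_one_to_one(mapping):
        raise ValueError("mapping is not one-to-one; inversion would lose codes")
    return {new: old for old, new in mapping.items()}
```

If two old codes were ever merged into one new code, `is_one_to_one` would return `False` and the rename could no longer be undone from the new codes alone.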
I don't particularly support the usage of the C-GZ [Guangzhou Cantonese] tag, and think that (for zh-usex at least) it can be safely merged into C [standard Cantonese]. For 2 (modern HK) I also use C-LIT.
@Fish bowl Is your thought that we should use just "Cantonese" (yue, or yue-can as suggested in the RFM discussion below) as the language code? This would be parallel to the normal handling of Latin, where Classical Latin terms are usually identified as just "Latin" (code la), although there's also a code for "Classical Latin" (code la-cla or CL.). The use of Guangzhou Cantonese specifically (yue-gua) as a code would then be restricted to cases where it's important to distinguish usage that is specific to urban Guangzhou speech as opposed to Standard Cantonese. I think this is also parallel to the use of cmn (Standard Mandarin) vs. cmn-bei for Beijing Mandarin. Benwing2 (talk) 00:48, 1 April 2024 (UTC)Reply
yue-gua: maybe yue-gzh? Keeping the initials is more sane IMO but I also remember that you wrote your own guideline in one of the other discussions.
The conjunction وَ(wa) is not really part of the phrase. The phrase does occur frequently with it, but this is mainly owing to the "idiomaticness" of conjunctions in Arabic, mostly in prose. It is a sentence in itself, roughly "There is no(thing) equal to", not like the English adverbials that have comparable meanings (such as particularly and especially or above all). The entry should therefore be moved to لَا سِيَّمَا(lā siyyamā), with the "variant" with وَ(wa) deleted. Roger.M.Williams (talk) 18:11, 24 February 2022 (UTC)Reply
There still seems to be a lot of overlap here, e.g. the chandelier sense. Is there any sense of the word that cannot be spelled both ways? Equinox◑03:47, 4 March 2022 (UTC)Reply
Any English word ending with -er occasionally shows up as -re. It doesn't seem like this needs a tag and discussion, though. Some editor at lustre was just wrong: There's no reason to say "alternate form of luster" + 3 repeated senses that're already at luster. Maybe add a usage note if some American speakers tend to still use re unexpectedly more often in some cases.
That said, the luster entry is currently a bit off. 'Shininess', '5-year period', and 'den' all get spelled with an -re in standard British English but using it for 'one who lusts' would still seem like a misspelling. The alt form needs to be with each etym that uses it and not headlining like it is now. — LlywelynII23:40, 13 June 2023 (UTC)Reply
I think I was trying to show that the open form was used more commonly for some definitions and the closed for others. Is there pure circularity remaining? DCDuring (talk) 17:18, 9 March 2022 (UTC)Reply
I have tried to make clearer the differences and have simplified the "Further reading" sections. I don't see why they should be moved, merged, or split, whichever it is that you are seeking. DCDuring (talk) 18:08, 9 March 2022 (UTC)Reply
It's not, just a doublet that came into the language by a different (rather convoluted) route. Serynga is listed as an alternative form at English seringa, but it looks like it's really a borrowing from French, where it's an alternative form of French seringa. The spelling is no doubt influenced by the taxonomic name.
From what I can gather, Latin syringa developed into Dutch sering, which was borrowed into Portuguese as the name for rubber plants in the genus Hevea and into French for the syringa, Philadelphus coronarius, both with an "a" added. English borrowed Portuguese seringa for the rubber plant and French serynga for the syringa.
So, are the definitions in both entries correct? Because they currently claim to both have the same first two definitions... in which case we should either have {{syn}} crosslinks between them or reduce both senses on one to {{synonym of}} (+gloss) of the other. - -sche(discuss)08:07, 28 March 2022 (UTC)Reply
@Biolongvistul: On the contrary- one of these should be made the main entry and the other an alternative form (if you can call it that: it's really just two slightly different ways of writing the exact same thing). Chuck Entz (talk) 08:02, 22 August 2023 (UTC)Reply
The first category contains things like "state", "county", "province". The second contains things like "California", "Yorkshire", "Guangdong". 70.172.194.2518:58, 12 April 2022 (UTC)Reply
The intended distinction (which, when I spot-check a few categories, actually seems to be decently well maintained) seems to be as IP 70.172 says. But I am inclined to agree that the current names don't convey a meaningful distinction. If we want to continue having separate categories for "county, burgh, kingdom, ..." vs "Mayo, Yorkshire, Idaho, ...", it would be better to devise more distinct names for the categories... - -sche(discuss)23:14, 20 February 2023 (UTC)Reply
IP is right. I just came here because Ottoman Turkish قضا(kaza) was in the wrong category, and pushed the panic button. The naming should be something more intelligent. Fay Freak (talk) 03:33, 21 February 2023 (UTC)Reply
I agree that the names are highly confusing. Maybe we should rename the first one “types of administrative division”, or something similar. Incidentally, that’s exactly the name of the corresponding en.wikipedia category. 70.172.194.2503:39, 21 February 2023 (UTC)Reply
Now the yerba gave me the idea. We just name the latter “named political subdivisions”, to avert the exemplified mistake. The former shall not be renamed because it is added manually while the other is a mediate effect of Template:place etc. I also briefly thought about going to Wikipedia to see how they do but we don’t have the same problems. Fay Freak (talk) 03:47, 21 February 2023 (UTC)Reply
So far as I can tell, the two senses refer to the same thing. Is this a case where differing terminology between chemistry and physics means that it's worth keeping both to better aid understanding? If so, we should probably clarify that they aren't referring to different things.
The redundancy was added in diff, I've merged the senses. There may be another sense, to which the first etymology (positron + ium) would apply, for positronium conceived of in sci-fi etc as an element or substance a la uranium, polonium, unobtainium, etc. - -sche(discuss)01:09, 29 May 2022 (UTC)Reply
This is only an English entry, and on English Wikipedia it is not capitalized inside the middle of sentences. The rationale for capitalizing it in 2007 was that it is a German language entry, except there has never been a German language section on this page. -- 65.92.246.14203:20, 13 May 2022 (UTC)Reply
I definitely agree that all of the headwords mentioned should be at "a few ...". Unfortunately there are probably more (attestable?) alternatives besides what -sche has found. Redirects from "few ..." are especially useful because many with beginning knowledge of English seem to have problems with English determiners. DCDuring (talk) 19:23, 29 May 2022 (UTC)Reply
This is a good issue to raise. I've mentioned before that, with proper nouns, we don't seem to have (or at least we don't consistently use) anything about the determiner/article: I mean it's the Eiffel Tower and the Cold War, but ∅ Dijkstra's algorithm and ∅ Greenpeace. Proper nouns aside, I usually drop the determiner/article from entry titles unless it seems absolutely 100% necessary all the time. But that's pretty vague and comes out of my wacky head. Equinox◑01:53, 4 June 2022 (UTC)Reply
Maybe. Yeah. I would imagine "some few..." etc. might be possible. But even I have better things to do than attest them. Just an observation. Equinox◑02:17, 4 June 2022 (UTC)Reply
It's a snowclone with many possible variants. I don't think many people are going to look up few or short of expecting to find this full phrase. And those words aren't in every variant anyway ... one can also say "two cards shy of a full deck" which uses neither of them.
What would be nice is if the Appendix namespace was in the default search space so that the snowclone page might at least turn up in a search. As it stands, I don't think we need all these mainspace pages since they are all exact synonyms of each other, but if we delete them there will be no way for a naive user to find the snowclone pages unless they somehow know that it's tucked away in the Appendix. —Soap—20:05, 30 June 2023 (UTC)Reply
@Dohqo: Because there's already a Persian entry on the same page, and the character is correct for Persian, it doesn't make sense to move the page. Just delete the Old Anatolian Turkish section from this page and create it on the correct page. You can do it all yourself, no admin rights needed. —Mahāgaja · talk08:36, 18 June 2022 (UTC)Reply
New Caledonia is a sui generis overseas collectivity of France. It has membership in the French parliament and France's rule of law and citizenship extend there just like in Corsica or Guadeloupe or Lyons. None of these are dependencies: they are all first-level administrative divisions of the French Republic. —Justin (koavf)❤T☮C☺M☯00:48, 24 June 2022 (UTC)Reply
I want a category for all overseas territories of France, and I don't much care about the technicalities. What is the right category? Benwing2 (talk) 01:52, 24 June 2022 (UTC)Reply
These were recently moved by @Apisite from their own user namespace to the Wiktionary namespace under "Requested entries (Chinese)". All of these pages are not requested entries but pronunciation requests. I'm not entirely sure where these should be moved instead, but I don't think they're in the right place currently. — justin(r)leung{ (t...) | c=› }16:59, 23 June 2022 (UTC)Reply
@Fytcha: A word of caution: anything involving pondian variation should be handled carefully. There are good arguments for going either way on most of these, and we don't want to start any kind of conflict. Our general practice has been to arbitrarily go with whichever version was first, though it's been a while since one of these came up. Chuck Entz (talk) 20:17, 26 June 2022 (UTC)Reply
In this case, fibre is older, but by only 14 hours. Also, the translation tables are all already at fibre, so I feel like making fibre the primary spelling and fiber the alternative spelling will be less work. —Mahāgaja · talk21:09, 26 June 2022 (UTC)Reply
Also, since 2016 fiber has been more common in Google's British English N-Gram corpus and six times more common in the American English corpus. DCDuring (talk) 21:54, 26 June 2022 (UTC)Reply
@Chuck Entz: I see. If that is de facto policy then the meat should go to fibre. However, if I could have devised the policy, I would have made it so that it always aligns with the frequency because that way the users land on the non-redirecting spelling more often. — Fytcha〈 T | L | C 〉 22:26, 26 June 2022 (UTC)Reply
We actually had an attempt by a Russian internet troll (geolocating to Crimea) to get us arguing about UK vs. US issues, but it went nowhere. At the time I just thought it was odd, but with the revelations after Trump was elected I finally put two and two together and realized what was going on. I still have no idea why they even bothered, since our discussion forums aren't exactly the center of the universe. I do know that the mutual respect between our US and UK editors, helped by this kind of practice, was the main reason it was such a non-issue. Chuck Entz (talk) 23:20, 26 June 2022 (UTC)Reply
@Chuck Entz I have it on my to-do list to build a template that duplicates the material from the "primary" entry, which should hopefully circumvent issues like this anyway. I've done something similar with Tangut already (e.g. see 𗁘(*rjijr²), 𗁩(*tẽ¹), 𗀏(*par²)), though the implementation would need some tweaking. Theknightwho (talk) 00:18, 27 June 2022 (UTC)Reply
The regular approach is to list these at WT:RFM. This seems, however, a place where proposals go to linger in limbo: there is an unresolved category move request (WT:RFM § Category:WC) from 2015. The sledgehammer approach is to create a vote at WT:VOTE. --Lambiam17:20, 15 July 2022 (UTC)Reply
It would be good if we fixed it, as we have with category and label inconsistencies previously. If not now, I am sure someone will bring this issue up and fix it sometime. J3133 (talk) 16:50, 15 July 2022 (UTC)Reply
Personally, I would lowercase the label (and anything else). On the other hand, Google Books Ngrams suggests Internet is more common. That said, it's less work to lowercase the label than to move all the categories... - -sche(discuss)23:33, 23 July 2022 (UTC)Reply
It should be capitalised. There is such a thing as "an internet" or internetwork (generic; although you very rarely hear this terminology any more), versus "the Internet" (the global thing we all use all the time). Same deal with "the Web" versus (I suppose) "a web" although I don't remember even the most braggart webmasters using the latter. As always, citable usage trumps what I say, but I am historically correct. Equinox◑03:14, 13 March 2023 (UTC)Reply
One is tagged as obsolete and defined as "A kind of furnace used in refining, to separate the metal from cinders and other foreign matter."; the other is not obsolete and is defined as "A furnace in which slags of litharge left in refining silver are reduced to lead by being heated with charcoal." Good luck to the potential merger. Dunderdool (talk) 18:06, 29 July 2022 (UTC)Reply
"Category:[Language] [word type]" is the standard naming convention of lexical categories. Category:English irregular nouns, Category:English onomatopoeias, Category:English fandom slang, etc. This category contains only English-language DoggoLingo terms, and thus the correct name should be "Category:English DoggoLingo". German-language DoggoLingo terms would go under "Category:German DoggoLingo", French DoggoLingo would go under "Category:French DoggoLingo", etc. (Presuming this meme has spread to other languages.) WordyAndNerdy (talk) 06:24, 30 July 2022 (UTC)Reply
We could use some empirical data here. Does DoggoLingo or a close equivalent actually exist in German or French? If it does, that provides some reason to approve this proposal (and possibly to update the relevant articles). If not, it provides some reason to reject it. 98.170.164.8806:40, 30 July 2022 (UTC)Reply
I agree with this. WordyAndNerdy, do you have proof that Internet slang related to dogs (i.e., of the type of DoggoLingo) exists in other languages, and would use the same name derived from English slang? J3133 (talk) 06:45, 30 July 2022 (UTC)Reply
*deep existential sigh* English-language lexical categories have an established naming convention. I have never seen an English-language lexical category that was just "Category:Word type" (e.g. "Category:Fandom slang", "Category:Military slang", etc.) in 10+ years of contributing. Can't speak for lexical categories in other languages, but if someone wants to change an established convention, they need to do so by obtaining consensus, not by unilaterally imposing a new standard. This is an extremely straightforward request and having to get bogged down in bureaucratic discussions like this means less time for doing productive things like attesting Internet slang. WordyAndNerdy (talk) 07:15, 30 July 2022 (UTC)Reply
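For readers unfamiliar with the two naming patterns at issue on this page, here is a small Python sketch. It only illustrates the string shapes ("Category:[Language] [word type]" for lexical categories, and the "Category:en:…" shape used by topic categories such as Category:en:Places); the function names are hypothetical and nothing here reflects how the category tree module is implemented.

```python
def lexical_category(language: str, word_type: str) -> str:
    """Lexical pattern: full language name, then the word type."""
    return f"Category:{language} {word_type}"


def topic_category(lang_code: str, topic: str) -> str:
    """Topic pattern: language code prefix, then the topic name."""
    return f"Category:{lang_code}:{topic}"


if __name__ == "__main__":
    print(lexical_category("English", "fandom slang"))
    print(topic_category("en", "Places"))
```

Whether a language-specific variety like DoggoLingo should follow the lexical pattern at all is exactly what this thread disputes; the sketch takes no position on that.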
A proper noun that specifically refers to English, if you are not already aware. Like Rotwelsch is a proper noun referring to German. J3133 (talk) 07:30, 30 July 2022 (UTC)Reply
The Wikipedia article defines DoggoLingo as an "Internet language" and doesn't specify that it's limited exclusively to English in said definition. In any case, this is completely perpendicular to the issue of what the category should be named. No one had to prove the existence of Dutch Twitch-speak, Korean Twitch-speak, etc. to create "Category:English Twitch-speak." That's what the category ought to be named following the established naming convention of English-language lexical categories. (And given that you haven't incorporated this category into the category tree module -- which is like step two of creating a new category -- maybe it isn't prudent to act as if you have special expertise or authority in this area.) WordyAndNerdy (talk) 07:51, 30 July 2022 (UTC)Reply
And the consensus of that discussion seems to be that "Thieves' cant" is a strictly English historical example of criminal slang, and that the non-English entries in Category:Thieves' cant should be moved to language-specific criminal slang subcategories- the opposite of this proposal.
It's true that there's a naming convention to put language names in category names, but that doesn't apply to this kind of entry, and saying it does shows a misunderstanding of the convention. While there's nothing to stop other languages from having their own equivalents to DoggoLingo, it seems to have been created by English-speakers using humor based on the peculiarities of the English language. If other languages come up with their own equivalents, I sincerely doubt that they would be called DoggoLingo. DoggoLingo is a variety of English, just like pig Latin and double Dutch, and "English DoggoLingo" would be redundant. Chuck Entz (talk) 08:13, 31 July 2022 (UTC)Reply
This was put up for RFM back in 2022, but I couldn't find a discussion so I assume one was never started. This should definitely be at a topic category Category:en:Doggo lingo if we should even have such a category in the first place (which I don't think we do). @WordyAndNerdy: pinging original nominator - saph ^_^⠀talk⠀14:32, 3 February 2025 (UTC)Reply
@Saph: I have merged your comment with the discussion. As has been mentioned by Chuck Entz (and I agreed), “English DoggoLingo” would be redundant (not “Doggo lingo” (see DoggoLingo), and not with “en:” because this is not a topic category). J3133 (talk) 14:46, 3 February 2025 (UTC)Reply
Do any languages other than English have "DoggoLingo" terms, which they call DoggoLingo? It's believable, but AFAICT undemonstrated (and I haven't spotted any in my own limited search). If "DoggoLingo" is, as presently defined, definitionally English-only, then the current name is in line with e.g. Category:Verlan (which is a subcategory of CAT:French language without having to be named "French verlan", since there is apparently no other kind of verlan), and , as patiently pointed out to OP above. (Category:English Thieves' Cant has "English" in the name despite being English-only, but that's partly my own fault and was the result of an RFM trying to solve a different problem, of people putting other languages' criminal slang in the category. Still, it means Wiktionary does not have one singular standard way of dealing with "language-specific slangs".) OTOH, if non-English DoggoLingo entries exist, then the categories would need to be specified by language. - -sche(discuss)23:56, 4 February 2025 (UTC)Reply
As AG202 stated in the DoggoLingo category RFM, “There’s no overarching Category:Pig Latin or Category Pig Latin terms, nor does there seem to be other languages linked to it, so there really shouldn’t be an English label there.” This was after Chuck Entz used the argument there that “English DoggoLingo” would be redundant, “just like pig Latin”, then Binarystep pointed out that the Pig Latin category does use the English label—redundantly. J3133 (talk) 11:24, 1 August 2022 (UTC)Reply
This idiom is far more versatile than the specific and somewhat informal phrasing we have here (which doesn't even match the quotation we have), it's a fully fledged verb phrase — see the examples at Teaching grandmother to suck eggs.
Two points: there is such a wide range of familiar terms for grandmother that can be used in this phrase so I think it's best to stick with "grandmother". However I think it's worth investigating if it's more common with or without the possessive pronoun (here "one's"); to me it sounds more natural with it but there are citations both ways. 86.145.59.12018:42, 14 August 2022 (UTC)Reply
I'm somewhat inclined to pick a most common or general negative form to lemmatize like not teach grandmother how to suck eggs, and also have the positive form (maybe teach grandmother to suck eggs since a possessive doesn't seem required? or if a pronoun is more common, then redirect the pronounless form to the pronouned form, either works). This is both because it's unclear how many translations can have the negative removed and because in general, as I said in the discussion of all it's cracked up to be further up this page, when we redirect a negative expression to a positive one or vice versa there's a risk that a reader who doesn't notice they were redirected will come away thinking the phrase means the opposite of what it actually means. To avoid duplication we could make the negative form almost a soft redirect, defining it like "To not teach grandmother to suck eggs(“presume to give advice to someone who is more experienced”)" or even "To not teach grandmother to suck eggs(see that entry)"; I don't know, I don't like splitting content across multiple pages, but I also think it's risky to silently strip away the negative polarity with a seamless little redirect and expect IPs who sometimes don't even notice they're on Wiktionary and not Wikipedia to notice and understand that the polarity of the headword has changed and thus that the definition of the term they looked up is the opposite of the one we're giving them. - -sche(discuss)15:20, 19 August 2022 (UTC)Reply
Negative polarity is "licensed" in many forms, starting with the negative being separated from the rest of the expression: conditionals, questions, infinitives with certain verbs (eg, try to) or other expressions (eg. hard to). These might lead someone to look up the positive form. I think that a "negative-polarity item" label (with link to WP or our Glossary), usage examples with adjoining and disjoint not and n't, and redirects would enable us to use the positive form as the lemma. I don't see how to use redirects in the other direction. Even usage examples would be problematic with not in the headwords. DCDuring (talk) 21:07, 19 August 2022 (UTC)Reply
What I mean is, I'm somewhat inclined to have both "not teach grandmother to suck eggs" defined as "not give advice to someone more experienced", and then also "teach grandmother to suck eggs" defined as "give advice to someone more experienced", redirecting all the various negative forms to the first one and the positive forms to the second one. But I'm not opposed to only having the positive form and redirecting everything to it; I do dislike splitting content across multiple pages, I just also think there's always a danger when someone types "not teach grandmother to suck eggs" into the search bar and is seamlessly sent to "teach grandmother to suck eggs" where they read a definition that's inverted from that of the term they typed in and which they think they looked up. - -sche(discuss)21:46, 19 August 2022 (UTC)Reply
Plenty of overlap, spesh with translations. Maybe there's just one species called this, maybe two... something for the animal nerds here... you know who you are Almostonurmind (talk) 00:48, 8 September 2022 (UTC)Reply
Formally, that's probably true, though I doubt most people make the distinction consistently colloquially. But that wouldn't be particular to bighorn. I think people who didn't make the distinction would be just as likely to use bighorn sheep when describing Dall's sheep. Andrew Sheedy (talk) 15:18, 21 October 2022 (UTC)Reply
I have split both into two subsenses and RfVed the O. dalli subsenses. I have not yet found any evidence that either term is applied to O. dalli. I would include O. dalli and thinhorn sheep under See also at both of these entries. DCDuring (talk) 15:57, 21 October 2022 (UTC)Reply
Are these the same? The Kiowa language does not appear to be related to Shoshone, nor does the Wikipedia article on the Kiowa people claim that they are from North Platte, Nebraska. I have a hunch that Kioway is an alt form of something, and this seems like the most obvious answer, but someone should check and try to make sense of this before merging them. 98.170.164.8807:56, 14 October 2022 (UTC)Reply
Shoshonean is an old term for the northern part of the Uto-Aztecan languages, from Shoshone. Many of the names for the Numic languages are only loosely correlated with linguistic reality, so terms like "Shoshone", "Paiute" and "Ute" are kind of hard to pin down without qualifiers. There is a Shoshoni language, but peoples like the Timbisha and the Bannock are also called Shoshone.
Kiowa is part of the Tanoan languages, which may very well be related to Uto-Aztecan as the Aztec–Tanoan languages, but linguists have yet to completely connect the dots. It was speculative in 1913, and it's still not definitively established in 2022. It reminds me of the Achilles and the Tortoise paradox.
It's all part of the confusion that results from early efforts to classify wide-ranging nomadic peoples who have moved into different regions and adopted different cultural patterns and lifestyles. Just as the Comanche were Great Basin Numic people who moved to the Great Plains and adopted a Plains Indian culture, the Kiowa also moved from the pueblos into the Great Plains and adopted a similar culture.
Latest comment: 1 year ago, 3 comments, 2 people in discussion
The definitions we give for all three terms are essentially identical, but the forms differ because they are borrowed from different Chinese lects (Mandarin, Cantonese, and Taishanese itself, respectively). Should these use {{alt form}} or {{syn of}}? 98.170.164.88 23:08, 14 November 2022 (UTC)
Merged into the first form which, per ngrams, is the most common. (For the place rather than the -ese, Taishan is particularly lopsidedly more common than the alternatives.) - -sche(discuss)07:26, 6 March 2023 (UTC)Reply
Latest comment: 2 years ago, 1 comment, 1 person in discussion
These two are essentially the same phrase sharing the same meaning, with the more common 食花生 being derived from the other. – Wpi31 (talk) 12:00, 18 November 2022 (UTC)Reply
Latest comment: 1 year ago, 4 comments, 3 people in discussion
Requesting to move snowsquall to a space-separated form snow squall. The unspaced form doesn't appear to have been used purposefully or frequently, if at all, in the past or present. It also does not appear to be used by either the US-American NWS or the Canadian MSC, and hasn't appeared in any online news coverage. Bailmoney27 (talk) 19:14, 19 November 2022 (UTC)Reply
I support this. I've never seen the bunched spelling before and I've been following winter weather for many years. It does seem to be in use, but distinctly less common. Wikipedia's favoring of the bunched spelling seems to be largely a matter of the article having been created early in Wikipedia's lifetime, and with a radar scan from 2004 featuring that bunched spelling. Essentially, we had a model to follow and we stuck to it, but it happens that most people, including the national weather services of both the US and Canada, prefer the two-word form. —Soap—21:22, 10 December 2022 (UTC)Reply
Well, I decided to just move the page myself, as it's been up here unopposed for six months, and because I want to fill in the usual "see also" hatnote which would require that both spellings exist. Since this would make a non-admin move impossible, I moved the page before I put in the hatnote. —Soap—11:19, 13 April 2023 (UTC)Reply
No, the two have different etymologies (one from Latin missa and another from Latin Massa), and obviously the two have different meanings too איתן קרסנטי (talk) 05:55, 31 July 2024 (UTC)Reply
They look like alternative case forms to me, with almost identical etymologies and definitions. It may be difficult to tease out whether the upper- or lowercase form is more common, based on collocation searches. As a start, since 1870 Holy Mass has been more common than holy mass, holy Mass, and Holy mass (probably mostly sentence initial) at Google N-Grams and much more common since 1940. (Is Holy Mass SoP???) And o'clock mass and o'clock Mass have been roughly equal since 1900. DCDuring (talk) 14:30, 31 July 2024 (UTC)
@Sgconlaw: I meant that “Etymology 1” and “Etymology 2” (but not “Etymology 3”) in one entry should be merged with the respective etymologies in the other entry. J3133 (talk) 19:02, 5 January 2023 (UTC)Reply
@J3133: ah, ha ha! I take it you mean that etymology 1 in haha and ha-ha duplicate each other, so one entry should be made the main lemma and the other converted to an alternative form; and likewise for etymology 2 in those entries. — Sgconlaw (talk) 19:19, 5 January 2023 (UTC)Reply
Latest comment: 2 years ago, 1 comment, 1 person in discussion
Shouldn't this be in the Reconstruction namespace? Tbh I'm not sure why we need an entry for this at all, even granting that the suffixes -iġ and -eġ are alternative forms of each other. If this specific non-attested form were mentioned in secondary literature then I could see a case for it, but I can't find anything. To be generous, it's plausible that a version with an /e/ vowel existed in Anglo-Saxon speech, if the versions of the suffix were interchangeable. For now at least, I'll just leave this at RFM, but feel free to send this to RFD if desired. 98.170.164.88 10:49, 11 December 2022 (UTC)
Latest comment: 6 months ago, 7 comments, 5 people in discussion
Currently this redirects to arsed and, further to the discussion in the Tea Room, I propose that we undo the redirect. After all we aren't currently redirecting can't be fucked or can't be bothered. It seems better to have stub entries for all synonyms of can't be bothered listing them as alternative forms only, with all the synonyms and translations listed on the same page. Though I'm not suggesting creating be arsed and be fucked, we should probably keep be bothered as a translation hub and for the purpose of distinguishing it from the rare word bebothered as we currently do. --Overlordnat1 (talk) 01:49, 14 December 2022 (UTC)Reply
I agree. I hate these redirects to single words - they rarely make sense without the rest of the term, and they're unintuitive even for experienced users. Theknightwho (talk) 23:39, 4 January 2023 (UTC)Reply
I don't know whether there are any other uses of fuck to mean "bother", nor of arse with that meaning.
Why wouldn't we RfV arse#Verb "To make, to bother" if the redirect doesn't seem right? If virtually the only usage with the "bother" sense is can't be arsed there is no reason for this not to be a lemma. DCDuring (talk) 23:44, 15 August 2023 (UTC)Reply
Latest comment: 1 year ago, 2 comments, 2 people in discussion
The capitalization of these entries is inconsistent, even though they are all coordinate terms for different views on the same issue. Note that Miaphysite and dyophysite don't (currently) exist, while both capitalizations of monophysite do. Also, some of these have adjective senses and some don't. Not technically a request for a move, merger, or split, but it's a similar issue to what often comes up here, so this seemed like a fitting venue. 70.172.194.25 11:37, 19 December 2022 (UTC)
IMHO, at the very least, one foul swoop needs explanation and therefore needs a full entry. Also, it has a distinct pronunciation and fell and foul are not close cognates, so they don't seem to be alternative forms of one another. One foul swoop seems to refer to (be derived from) one fell swoop. If one foul swoop gets the main entry I think it deserves, then in one foul swoop should redirect thereto. DCDuring (talk) 16:59, 5 January 2023 (UTC)
Agreed. As in one fell swoop redirects to one fell swoop, redirecting in one foul swoop to one foul swoop would seemingly be the only logical and consistent course of action. --Overlordnat1 (talk) 11:43, 6 January 2023 (UTC)Reply
Latest comment: 1 year ago, 3 comments, 3 people in discussion
parlez vous, parleyvoo, and parley-vous, whilst having the exact same meanings and roughly the same pronunciation, all have their own pages and the others are listed as synonyms. Two have the meaning of “a Frenchman”, one has “the French language” and all of them have “to speak a foreign language, especially French”. Are these not all the same word, with differing spellings? -CanadianRosbif (talk) 10:37, 7 January 2023 (UTC)
We should probably merge them into parlez vous but list the other two spellings as alternative forms. There is also the song 'Mademoiselle from Armentieres' aka 'Hinky Dinky Parley Voo'[2] which has the form parley voo, the spaced version of parleyvoo, though I don't think this bawdy WW1 song would be a good example to include in our entry, fun though it is, as it's not clear what the final refrain of parley voo at the end of each line is actually supposed to mean. There is also a version that appears in the final credits of Peter Jackson's film They Shall Not Grow Old which can be found on YouTube and which is where I first came across the song. --Overlordnat1 (talk) 02:21, 8 January 2023 (UTC)Reply
Latest comment: 3 months ago, 14 comments, 12 people in discussion
Currently, {{lb|de|EU politics}} categorizes as Category:European politics and the lede in Category:European politics says "terms related to politics of the European Union." I don't dispute that this ridiculous misnomer is widespread but we don't do ourselves any favors by leaning into it. I propose that we repurpose Category:European politics and make it the category of all European (i.e. taking place in or relating to the continent of Europe) politics categories and entries, not just those related to the politics of the European Union. Entries and categories pertaining to EU politics should instead be part of Category:EU politics which itself should be a subcategory of Category:European politics. — Fytcha〈 T | L | C 〉 08:22, 19 January 2023 (UTC)Reply
Fair point. We should probably change "US politics" to "United States politics" and "UK politics" to "United Kingdom politics", in that case. Best to be consistent with country/supranational entity names. Theknightwho (talk) 14:00, 12 July 2023 (UTC)Reply
Latest comment: 2 years ago, 1 comment, 1 person in discussion
Merge into Reconstruction:Proto-Indo-European/(s)mel-. Most modern sources agree these are part of one and the same root. The only descendant that (traditionally) requires PIE *a is Latin malus, which fits semantically better with the gloss at *(s)mel- anyway. In fact it is unnecessary to reconstruct *a at all, in light of *mo > *ma unrounding in an open syllable with coda resonant (see de Vaan:2011 p. 8: 7.1; p. 360), the same process that resulted in mare (“sea”) < *móri. In any case the reconstruction of the vowel is irrelevant to whether the Latin, Slavic and Germanic words are cognate, despite the last sentence of the Latin etymology 1 described at malus. — 69.121.86.13 19:31, 3 February 2023 (UTC)
Latest comment: 1 year ago, 3 comments, 2 people in discussion
English.
As the entry says "capitalization varies". I see no compelling reason that this shouldn't be a noun sense at boot with "always with 'the'", or something of the sort. Chuck Entz (talk) 23:43, 18 February 2023 (UTC)Reply
The Boot meaning the Saatse Boot should be somewhere uppercase, I think, whether Boot or the Boot or The Boot I'm not entirely sure, because it functions as a proper noun place name. I'm not familiar with how the (b/B)oot meaning Louisiana is used; in the one cite in the entry, or others I can imagine like referring to LA as America's boot, it seems like a metaphorical general sense for something or somewhere boot-shaped. So it may be an RFV question, does Saatse-style use as a proper noun place name exist (for either place ... I can't actually find the Saatse one in books, either, only online). - -sche (discuss) 17:51, 24 April 2023 (UTC)
Latest comment: 1 month ago, 6 comments, 6 people in discussion
The forms with noses are pretty dated at this point and not in widespread use. I think it'd be better if the noseless forms were the main entries, perhaps with a note on the older forms indicating that they were used first. Binarystep (talk) 20:24, 25 February 2023 (UTC)Reply
Noseless variants are also more common in East Slavic languages, albeit eyeless-and-noseless variants are more frequent. However I can’t find data regarding this, I am speaking solely from my experience. БудетЛучше (talk) 20:11, 6 September 2024 (UTC)Reply
Latest comment: 1 year ago, 10 comments, 5 people in discussion
We have two different entries for the same thing, while links generated with {{m}} or {{l}} like *vьśegъda link to the latter (vьsegъda) as they seem to ignore ś in Proto-Slavic reconstructions, which IMO is unexpected. This has led to the former (vьśegъda) being ignored and forgotten recently. I guess both entries should be merged and the language modules should be tweaked to make Proto-Slavic stuff ś-aware? // Silmeth@talk 12:28, 15 March 2023 (UTC)
@Silmethule Converting ś to s seems intentional, and asserts that there's no separate ś phoneme in Proto-Slavic. Reconstructing ś seems ahistorical to me; it's rather that the third (and second ...) palatalizations occurred post-Proto-Slavic. Benwing2 (talk) 06:27, 16 March 2023 (UTC)Reply
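To make the linking behaviour under discussion concrete, here is a minimal Python sketch of how a link template might derive a page title from a displayed reconstruction. It is only an illustration: the function name `make_entry_name` and the replacement table are hypothetical and are not taken from the actual language modules.

```python
import unicodedata

# Hypothetical per-language replacement table (assumption for illustration):
# the displayed letter "ś" (U+015B) is folded to plain "s" in the page title.
ENTRY_NAME_REPLACEMENTS = {"\u015b": "s"}


def make_entry_name(display_form, replacements=None):
    """Derive a page title from a displayed reconstructed form.

    Strips the leading reconstruction asterisk, normalises to NFC so that
    precomposed and combining spellings compare equal, then applies the
    character replacements.
    """
    if replacements is None:
        replacements = ENTRY_NAME_REPLACEMENTS
    name = unicodedata.normalize("NFC", display_form.lstrip("*"))
    for source, target in replacements.items():
        name = name.replace(source, target)
    return name
```

With the table as written, a link displayed as *vьśegъda resolves to the page vьsegъda; passing an empty table keeps ś in the title, which is roughly what making the links "ś-aware" would amount to in this sketch.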
@Benwing2: but it has different reflexes in different branches. So, either those palatalizations happened post-Proto-Slavic and *ś is a valid dia-phoneme projected back and reconstructing *s in those places for Proto-Slavic is wrong, or it was an actual Proto-Slavic phoneme with some value separate from both *s and *š that merged with those at a later stage – in which case we’re justified to reconstruct *ś and *s is wrong. In either case, unless we undo all progressive and 2nd regressive palatalizations of *x (and all the other sounds? there are traces of non-palatalization in *otьcь in the east too), we need to treat *ś as a (dia)phoneme of its own and *s is wrong. Also WT:About Proto-Slavic seems to treat *ś as a separate phoneme (and even ascribes a specific IPA value to it). // Silmeth@talk10:00, 16 March 2023 (UTC)Reply
@Benwing2: What primary sources? Proto-Slavic is a reconstructed, not directly attested, language.
If you mean etymological dictionaries and historical linguistic papers – depends, you get all sorts of things (*vьšь in Polish dictionaries, *vьsь in some southern ones, non-palatalized *vьxъ in Vasmer, etc.) – although in general progressive and 2nd regressive palatalizations are commonly marked. But *x is problematic as it has different reflexes in the west vs south+east; hence Derksen’s notation with *ś, as he puts it:
The introduction of *ś, on the other hand, could not be avoided, cf. *vьśь ‘all’ vs. *vьsь ‘village’
I already agreed that we should use ś. Third palatalisation is only absent in Old Novgorodian and most of our entries already do apply the sound law to stops, so I don't see why we should treat the sibilant any differently. Thadh (talk) 22:23, 16 March 2023 (UTC)Reply
I don't think this would normally be spelled with a hyphen, at least not as a verb. heave-to with a hyphen looks like a noun, probably meaning "the act of heaving to", though as a landlubber I don't know if such a noun exists. —Mahāgaja · talk10:47, 24 March 2023 (UTC)Reply
You may be right as kaffir seems to be slightly more widely used than kafir, though oddly enough we (and Wikipedia) have an entry for Kafiristan and not Kaffiristan (which is a far more prevalent form on GoogleBooks). Though on a raw Google search 'Kafir' is twice as popular as 'Kaffir' and 'Kafiristan' is a lot more popular than 'Kaffiristan' and there does seem to be a slight tendency of late to differentiate the 2 words so that 'kaffir' is the South African insult and 'kafir' is the Islamic one. --Overlordnat1 (talk) 09:06, 13 April 2023 (UTC)
Latest comment: 1 year ago, 1 comment, 1 person in discussion
So from doing a lot of research and hearing testimonies from elders who speak this North African Judeo-Spanish language, I think there should be a separate list and code for Haketia. It has been associated as just a dialect of Ladino but that is not the case. Haketia has consonants and words directly from Arabic that are never used in Ladino as well as an array of different phrases and spellings. It is a separate language. Let me know if this can be done. I have a lot of words, pronunciations and phrases ready for adding to it after it is set up. Shukur/thanks. Nevermiand. (talk) 18:43, 16 April 2023 (UTC)Reply
IMO these are usually uncapitalised in later use, though it's quite hard to tell because it's usually the first thing in a sentence so gets a capital anyway. And older texts, pre-mid-19th century, would capitalise nouns fairly commonly anyway. But conventionally ods bodikins, od rat it, odzooks etc. are written with small Os. The OED and Chambers both lemmatise od and derivatives uncapitalised. Ƿidsiþ13:32, 18 September 2023 (UTC)Reply
Latest comment: 1 year ago, 1 comment, 1 person in discussion
“akrasia” is currently listed as the alternative spelling of “acrasia”, which contradicts Wikipedia, as well as the fact that “acratic” is (correctly) listed as the alternative of “akratic”. Also, “akrasia” has 4.5× as many Google results as “acrasia” does. (There’s probably a better metric I could cite, but oh well.) IMO we should swap the two and make “akrasia” the main one. —Will • B[talk]23:30, 5 May 2023 (UTC)Reply
Other static copulas can replace be. Should we consider be a generic static copula as we might consider do a generic transitive verb and something a generic NP? DCDuring (talk) 01:17, 18 September 2023 (UTC)Reply
@76.100.240.27 How is the idiomatic meaning of trust someone to used in a sentence? Would one say, "I trust her to spend all day reading" to mean "It is predictable for her to spend all day reading"? If so that makes our current gloss unsubstitutable. Is it ever used without any words between trust and to? If not I think we should not move it. — excarnateSojourner (talk · contrib)16:34, 22 May 2023 (UTC)Reply
I agree that trust to is a worse location for the expression than trust someone to, though both are worse than trust + to, IMHO.
Doesn't whip it on require a person (or personified object) as complement? I suppose we could handle that with a label. Also, it is possible that there might be another meaning involving inanimate objects or other expressions. It would probably then be easier on users to be able to compare meanings. DCDuring (talk) 00:42, 23 May 2023 (UTC)
I would not move this entry. People say "I put it behind me" and "You need to put it behind you and move on", not *"I put it behind myself" and *"You need to put it behind yourself and move on". —Mahāgaja · talk06:48, 23 May 2023 (UTC)Reply
Latest comment: 1 year ago, 1 comment, 1 person in discussion
molly-mawk is given as an alternative form of mollemoke, and not of mollymawk. The etymologies given for those two are half-different too, while both mention fulmars. There's probably some obsolete taxonomy in there too, so a taxo-specialist's eyes would be more than welcome. Skisckis (talk) 20:40, 11 May 2023 (UTC)Reply
Latest comment: 1 year ago, 2 comments, 2 people in discussion
English. There seems to be some conflation between the two. {{lb|en|China}} categorizes into the former, though people often do mean the latter, which only has 3 entries. For example, typhoon shelter, Hong Kong foot, add oil, and aiya are labelled as both {{lb|en|China}} and {{lb|en|Hong Kong}}.
Also, "Chinese English" technically includes Hong Kong English by the criteria of geography, but linguistically and lexicographically speaking, there is very little influence on HKE from the mainland, which means there are not many instances where we actually need to categorize into both; the existing ones in the category that I'm aware of are (excluding the four already mentioned above) joss stick, Ins, KMT, and ACG. Note that this also causes abominations like the one at ACG, which is meant to include Taiwan as well. (We can ignore Macau for the sake of simplicity, since the English used there is basically a toned down version of formal HKE) – Wpi (talk) 17:29, 30 May 2023 (UTC)Reply
Off-topic: In my opinion, KMT and joss stick are not regional forms of English; indeed, the latter is currently not labelled as such. (Indeed, 'joss' is not so labelled, though it's not part of my active vocabulary.) --RichardW57m (talk) 09:18, 2 June 2023 (UTC)Reply
@Elevenpluscolors Per the OED entry, no, it shouldn't be split but geological senses (and probably the cuckold sense too) should be separately listed as appearing with a lower-case initial letter. Whichever is currently the more common form should be the main entry. The other one should still have those senses but use the template for alternative case form of. — LlywelynII22:04, 9 June 2023 (UTC)Reply
Latest comment: 1 year ago, 5 comments, 5 people in discussion
Should these categories be merged? Many terms in the -cide categories end in -icide, and thus should be moved, unless we decide not to make this distinction. J3133 (talk) 12:45, 14 June 2023 (UTC)Reply
I am familiar with the similar queer the deal, defined by "NetLingo" as "To ruin a potential business deal or arrangement despite all favorable odds. For example, 'They are a liberal company, so don't queer the deal by letting them know our conservative tactics.'" The "deal" version is more common with "the" than with possessive pronouns. But I wonder whether the right approach isn't to make both the "the" and the "someone's" versions redirect to the right sense of queer#Verb, adding usage examples there. DCDuring (talk) 20:15, 28 June 2023 (UTC)
It looks like the only quotations for cotton could be moved to one of the other two. Unless someone has a quote without a preposition/with other prepositions, that would seem to be the best solution. Then the verb entry at cotton could simply be deleted. Mazzlebury (talk)
It would be better if we had some evidence for all similar cases. OTOH, if we think new normal users are able to use the failed-search page, then they would find [[bee's knees]], even if they searched for "the bee's knees" (and vice versa). I personally think that normal users can't be assumed to make good use of that page. DCDuring (talk) 12:53, 8 August 2023 (UTC)Reply
We are quite inconsistent about whether we include the or not, e.g. cat's pyjamas redirects to the cat's pyjamas, contrary to the direction of the the shits → shits redirect. It would be better to try to decide on a general approach rather than move entries piecemeal. DCDuring, you argued in favor of redirecting verb oneself to verb even when it's never attested other than with a reflexive pronoun; it seems to me the same logic would make it better to centralize content at bee's knees, too. "The" is dropped from constructions like this when they're used attributively and in certain other cases (peruse the cites at google books:"and bee's knees"), and in headlinese ("Mayor Says New Parks Are Bee's Knees"). I pointed this out about Talk:The Rock, too. - -sche(discuss)16:15, 16 August 2023 (UTC)Reply
@Benwing2: it seems like {{circa}} and {{circa2}} serve different purposes. The template {{circa}} (and {{ante}} and {{post}}) appear to have been created for quotations in entries that do not use quotation templates. That is why the year appears in bold and there is a comma after the year. On the other hand, {{circa2}} is for adding circa or c. before a year in other contexts, such as in etymology sections or image captions. I suppose {{circa}} and {{circa2}} could be merged, but then some parameter would have to be added to allow for switching between the two formats. Alternatively, if all quotations using {{circa}}, {{ante}}, and {{post}} were replaced with quotation templates, then {{ante}} and {{post}} could be eliminated and {{circa2}} could be renamed as {{circa}}. — Sgconlaw (talk) 05:55, 2 August 2023 (UTC)Reply
@Sgconlaw I have eliminated the aftercomma from {{circa}}, {{ante}} and {{post}}. What differences remain? Just the boldface? That seems a pretty small thing to have two templates for, esp. given the horrible naming. Benwing2 (talk) 06:17, 2 August 2023 (UTC)Reply
{{quote-book|en|year={{ante|1597}}|first=William|last=Shakespeare|authorlink=William Shakespeare|title={{w|The Merry Wives of Windsor}}|section=Act 3, Scene 5|passage=No, Master Brook, but the peaking '''cornuto''' / her husband, Master Brook, dwelling in a continual / 'larum of jealousy, comes me in the instant of our / encounter, after we had embraced, kissed, protested, / and, as it were, spoke the prologue of our comedy}}
I am cleaning all of them up to use e.g. a. 1597 instead. Note that |origyear=, |year_published=, etc. now support a., c. and p. prefixes. Benwing2 (talk) 08:22, 13 August 2023 (UTC)Reply
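As a rough illustration of the prefix convention just described, the Python sketch below splits a year value such as "a. 1597" into a qualifier and a year. The function name `parse_year` and the mapping are assumptions made for this example; it does not reproduce the module code behind the quotation templates.

```python
import re

# Hypothetical mapping from the short prefixes to the qualifier they stand for.
PREFIXES = {"a.": "ante", "c.": "circa", "p.": "post"}

# Optional prefix (a., c. or p.), optional whitespace, then a 1-4 digit year.
_YEAR_RE = re.compile(r"^\s*(?:(a\.|c\.|p\.)\s*)?(\d{1,4})\s*$")


def parse_year(value):
    """Return (qualifier, year) for strings like 'a. 1597' or '1597'.

    The qualifier is None when no prefix is present.
    """
    match = _YEAR_RE.match(value)
    if match is None:
        raise ValueError(f"unrecognised year value: {value!r}")
    prefix, year = match.groups()
    return PREFIXES.get(prefix), int(year)
```

Under these assumptions, a parameter value like year=a. 1597 would carry the same information as year={{ante|1597}} without nesting a template inside the quotation template.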
Hmm. I would not merge these as things stand now, with them having the differences re bolding that they do, and the differences in where they're used: they currently serve different purposes. (Since we don't normally bold years in etymologies, descendants lists, etc, a template used in etymologies to qualify a year as circa shouldn't bold the year either, whereas we do normally bold years at the start of quotation metadata, so a template that supplies circa there should bold the year.) However, if we replace all of the relatively few (~680) uses of {{circa2}} with just the spelled-out word "circa" — formatted however: "circa", "c.", whatever we decide — rather than a template, we could just delete {{circa2}}. And/or if we made sure all uses of {{circa}} were inside quotation templates (not manually-formatted quotations), then we could presumably have the quotation templates know that if year={{circa|####}}, then format #### in bold (but don't bold circa?), and then if {{circa}} and {{circa2}} stopped differing in the formatting they apply, they could be merged. - -sche(discuss)16:06, 16 August 2023 (UTC)Reply
Was the ping specifically to this discussion? Fascinating. (If it was just to this general page, I might speculate that my recent removal-then-readdition of a bunch of discussions pinged people somehow, even though it shouldn't because I think you have to add four tildes at the same time as linking someone's username to ping them.) Pings seem to be wonky lately; AG202 pinged me in an edit summary on this entry recently and I didn't get the ping, only noticing that it existed because I had the entry in my watchlist and was looking at the edit history. - -sche(discuss)14:59, 29 March 2024 (UTC)Reply
@-sche @Sgconlaw I just got a bunch of pings that claim to be from -sche but were actually old responses of mine *TO* -sche. Strange. As for {{circa2}}, I am not sure anymore; I think when I looked into this awhile ago, I concluded they indeed serve slightly different purposes, although the naming is definitely bad. Benwing2 (talk) 19:32, 29 March 2024 (UTC)Reply
I notice all the pings are from 6 hours ago and are in WT:RFM specifically, so I think they are indeed related to your removal/readdition of discussions at that time. Benwing2 (talk) 19:38, 29 March 2024 (UTC)Reply
@-sche: when I clicked on the notification I was led to this discussion. Anyhoo, about this discussion, @Benwing2: what about changing {{circa}} to {{circa-quote}}, and {{ante}} and {{post}} similarly since they are all intended only for use with quotations (though they should be phased out in favour of the {{quote-*}} templates), and then renaming {{circa2}} to {{circa}}? Would that be confusing? — Sgconlaw (talk) 19:43, 29 March 2024 (UTC)Reply
(Merged from a request to merge into Category:en:Christianity)
Not clear to me how these are supposed to be distinguished. The boilerplate description at Category:Ecclesiastical terms by language says "terms used only by religious figures", but that's manifestly wrong for the terms at Category:English ecclesiastical terms which are also variously used by commentators like historians or musicologists who may or may not be religious themselves. In reality the category, certainly for English, seems to just contain terms topically related to Christian churches—not just religion in general—and these should be listed under Category:Christianity instead. The "ecclesiastical" label should perhaps also be made an alias of "Christianity". @Andrew Sheedy —Al-Muqanna المقنع (talk) 15:56, 29 August 2023 (UTC)Reply
This is actually already on this page! See Ioaxxere's discussion above. It was pointed out that not all the terms are related to Christianity. However, I do agree that "ecclesiastical" is not the best label. Simply labelling according to religion would be preferable, I think. Andrew Sheedy (talk) 17:42, 29 August 2023 (UTC)Reply
@Andrew Sheedy: Oops, completely missed that. I'll merge the discussions (and add a template to save anyone else making the same mistake). @Theknightwho The Thai category is very interesting, looking through it, but it seems to be describing a very different thing from the English category—maybe the problem is specifically how the English category is being used? —Al-Muqanna المقنع (talk) 18:03, 29 August 2023 (UTC)Reply
It seems like everything which is in this category would be better off in a specific religion's category or, if pan-religious, in the "religion" category. (But many things currently in the "religion" categories are Christianity-specific, as I raised at Wiktionary:Information desk/2023/August#Christianity_terms_labelled_broadly_"religion" and intend to deal with at some point.) The widespread misuse of the label / category for terms that are better in other categories means we might be better off retiring it, although the other possibility is making it an alias of "religion" and then trying to monitor misuse, which we have to do with "religion" already anyway. - -sche(discuss)16:33, 6 September 2023 (UTC)Reply
In the Thai case it might be useful to distinguish between terms that are topically relevant to religions and terms used in religious contexts. I'm not convinced that distinction is generally useful, though: stuff like PBUH would certainly fall into the latter category but I think the (Islam) context label does the job (and labelling it "ecclesiastical" would come off as decidedly odd in general). My inclination would also be to merge it in the way you describe, so moving it to the relevant religion(s) or to the overall religion category if it's non-specific, but that leaves more complicated stuff like the POS subcategories at Category:Thai ecclesiastical terms up in the air given that we don't generally do that kind of breakdown for topic categories. —Al-Muqanna المقنع (talk) 17:19, 6 September 2023 (UTC)Reply
Do the Thai POS subcategories make any sense or can they simply be deleted? "to kill (a god, high priest, or royal person)." does not seem to be an "ecclesiastical verb"-as-different-from-a-"verb", any more than deicide is an "English ecclesiastical noun", it seems to just include religious figures in its scope. And if specific verbs are only used by Buddhists (or whatever), then using the usual POS categories and then also using {{lb}} would seem to be the normal way of handling that, right? - -sche (discuss) 03:13, 7 September 2023 (UTC)
All three -- {{lj}}, {{jaru}}, and {{ruby/ja}} -- were the creations of Fumiko Take. They did very little to document any of the templates or aliases they created, and if dim memory serves, they were even aggressively oppositional when asked to provide documentation.
Stepping back -- what is the use case for this infrastructure? Do we not already have functional ruby text provided by {{ja-r}}?
Granted, {{ruby/ja}} offers the ability to specify arbitrary ruby text -- but I struggle to think of when we'd actually want that. It's used to great effect in manga, when authors will not uncommonly spell a word to convey a particular sense, and gloss it with ruby to indicate a different word entirely -- but for a dictionary, this is aberrant behavior outside of direct quotes of such texts. I suspect that, in most cases, {{ja-r}} would do just fine for our needs.
Stepping back a bit further -- do we need ruby text at all?
Serious question. Tiny kana over the kanji is something that only provides value to people who can already read kana, and is otherwise likely to confuse anyone unfamiliar with Japanese typography (which is probably the greater part of our user base). If a given user can already read kana, they are likely to be savvy enough to be able to match up any provided romanized string to the kanji, much as we get when using {{m|ja|TERM|tr=romanization}}.
I argue that kana ruby text over kanji is snazzy, but it also presents usability issues.
@Eirikr I'm in full agreement that these are superfluous to {{ja-r}} (and {{ryu-r}}), but I disagree that we should be getting rid of rubytext. I think the aim should be to incorporate rubytext into {{l}}, {{m}} (et al). The infrastructure for language-specific formatting in links already exists (and is already used by Chinese and the Chinese lects to generate simplified forms), so we could add something for the Japonic languages that essentially reimplements {{ja-r}} (for the relevant language). Theknightwho (talk) 22:09, 15 August 2023 (UTC) Forgot to ping Benwing2. Theknightwho (talk) 22:12, 15 August 2023 (UTC)Reply
Just to add a bit further to this - I'd also like to automate much of the kanji/kana mapping which is currently necessary with {{ja-r}}. It won't be possible to do away with it entirely, due to redlinks or when there are multiple possibilities, but {{ja-pos}} (and all the other headword templates) are able to do this already by looking at the input for {{ja-kanjitab}}, so there's no reason why link templates shouldn't be able to do this as well.
This would greatly simplify a lot of the complexity encountered when adding Japanese links, which would help with the usability issues Eirikr mentions. Theknightwho (talk) 22:17, 15 August 2023 (UTC)Reply
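As a rough illustration of the per-kanji mapping described above, here is a minimal Python sketch. It assumes the readings are supplied one per kanji, in order (similar in spirit to the input of {{ja-kanjitab}}); the function names `is_kanji` and `to_ruby_markup` are hypothetical and do not correspond to any existing module. It emits the `[base](ruby)` markup used elsewhere in this discussion.

```python
def is_kanji(ch: str) -> bool:
    """Rough check: CJK Unified Ideographs block plus the iteration mark."""
    return "\u4e00" <= ch <= "\u9fff" or ch == "々"


def to_ruby_markup(term: str, readings: list[str]) -> str:
    """Pair each kanji in `term` with the next reading; pass kana through unchanged.

    Hypothetical helper: assumes exactly one reading per kanji, in order.
    """
    remaining = iter(readings)
    out = []
    for ch in term:
        if is_kanji(ch):
            try:
                reading = next(remaining)
            except StopIteration:
                raise ValueError(f"no reading left for {ch!r}") from None
            out.append(f"[{ch}]({reading})")
        else:
            out.append(ch)
    if next(remaining, None) is not None:
        raise ValueError("more readings than kanji")
    return "".join(out)


print(to_ruby_markup("振り仮名", ["ふ", "が", "な"]))
```

A one-reading-per-kanji model cannot handle readings that span several kanji at once, which is one of the "multiple possibilities" cases where manual input would presumably still be needed.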
Blah. So much crappy East Asian code (and templates) out there. Even if the conversion is possible automatically in only say 80% of the cases, that would probably be good enough, as we can do the remainder by hand or just leave them. If for example there are cases that can be handled using {{ruby}} and not with {{ja-r}} that is probably fine, but we should not have two ways of doing the same thing and randomly use one or the other. Benwing2 (talk) 23:31, 15 August 2023 (UTC)Reply
@Benwing2 Yeah, I suspect a conversion is possible, and as a last resort 1,000 uses is doable manually if a few of us handle it.
@Theknightwho, my usability concern is not about editing, it's about reading, and about accessing the text as it is rendered in the browser.
On the reading side, things like 漢方(kanpō) are visually unclear to anyone not already somewhat familiar with Japanese typography -- it looks like the entire block of kanji + furigana together is the Japanese "word", ruby and all, when in fact the Japanese term is 漢方. Even if a reader understands that the ruby text is not actually part of this term, the kana are only useful for someone who already knows how to read kana. The kana are also superfluous, as we already include a romanization, which provides the same information just in a different script.
In terms of the accessibility of the rendered text, for reasons obscure to me, the <ruby> element in the HTML seems to render the Japanese term un-copyable. If I select the text "things like 漢方(kanpō) are" as rendered, and hit CTRL+C and then try to paste that somewhere, I only get "things like (kanpō) are" -- the Japanese text itself is missing entirely. Meanwhile, if I select the text "the Japanese term is 漢方." and do the same, I get "the Japanese term is 漢方." -- the pasted text includes everything I expected.
@Eirikr I think realistically, most Japanese entries are going to be used by people already familiar with Japanese enough to know what the function of the rubytext is. Although we’re a dictionary in English, that doesn’t change the reality that most dictionary entries are of little use to a complete novice.
You’re right about there being an issue from a copy and paste point of view, and it’s something that it would be good to solve if at all possible. I’m sure there is a solution, but I’d need to look into it. Theknightwho (talk) 23:07, 15 August 2023 (UTC)Reply
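To make the copy-and-paste expectation concrete, here is a minimal Python sketch of what "the text a reader expects to get" could mean: the base text with the ruby annotations dropped. It assumes the rendered markup uses the standard `<ruby>`, `<rt>` and `<rp>` elements; the class and function names are hypothetical, and this does not diagnose or fix the browser behaviour itself.

```python
from html.parser import HTMLParser


class BaseTextExtractor(HTMLParser):
    """Collect text content, skipping anything inside <rt> or <rp>."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # > 0 while inside an annotation element
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("rt", "rp"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("rt", "rp") and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def base_text(fragment: str) -> str:
    """Return the fragment's text with ruby annotations removed."""
    parser = BaseTextExtractor()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.parts)


sample = "things like <ruby>漢方<rp>(</rp><rt>かんぽう</rt><rp>)</rp></ruby> (kanpō) are"
print(base_text(sample))
```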
Also, just adding that the rubytext does actually serve an additional purpose to the romanisation, as it shows the reading for each kanji; romanisation can’t do that (unless we used rubytext for that instead, which I don’t think would be very helpful as it wouldn’t show semantic word breaks). Theknightwho (talk) 23:14, 15 August 2023 (UTC)Reply
If folks are familiar enough with Japanese to where they understand both kana and how furigana (kana used as ruby text) work, then they also have some idea of how Japanese phonemes break down, and how kanji readings work -- so again, furigana wind up largely superfluous to the only audience that knows how to use them.
I really think we (speaking generally) get too caught up in technical details and the coolness factor, and lose sight of usability and usefulness. Outside of those manga-esque cases where the spelling and the intended reading are really orthogonal, like 騎士(naito), I honestly don't think that furigana are useful enough to offset the negative impacts on usability.
... One idea occurs to me. Is there any easy way of toggling ruby display on and off? Thinking further, would there be any way of indicating in the wikicode if ruby is really needed (as in the 騎士(naito) example, otherwise anyone who can read Japanese that looks at 騎士 would expect to read it as kishi), or if the ruby is optional (such as when the ruby just indicates the regular reading of a given spelling)? ‑‑ Eiríkr Útlendi │Tala við mig19:51, 16 August 2023 (UTC)Reply
@Eirikr I know next to nothing about Japanese but I can see how ruby text is useful. For example, I can read Cyrillic but I don't know the ins and outs of irregular pronunciations in Russian; in cases like that we show a respelling in Cyrillic as well as give the IPA, and I think the Cyrillic respelling is useful. I imagine there are plenty of Japanese learners who will be able to read Hiragana (it's probably one of the first things taught) but have difficulty with Kanji (keep in mind it takes around 10 years for native speakers to learn to read and write Kanji, and probably only a few weeks to learn Hiragana). Benwing2 (talk) 20:30, 16 August 2023 (UTC)Reply
I was afraid of some confusion, and indeed, here we have it. :)
Speaking specifically about ruby for Japanese -- I grant that there are plenty of other use cases in other languages. By no means do I advocate for getting rid of {{ruby}}. I'm looking solely at the use case for {{ruby/ja}} and redirects.
@Eirikr The ruby text seems to allow for convenient markup of running Japanese text without interrupting the flow; putting romanizations in parens in the middle of a sentence would interrupt the flow, which is why it gets added at the end. I could imagine putting romanization in ruby text but it seems that isn't conventional. Benwing2 (talk) 21:05, 16 August 2023 (UTC)Reply
{{usex|ja}} and {{ja-usex}} put romanization afterwards, not mid-text. I can't think of any case where a romanization would be inserted in the middle of an otherwise-running Japanese text.
{{usex|ja|これは見本です。|This is an example.|tr=Kore wa mihon desu.}} →
これは見本です。
Kore wa mihon desu.
This is an example.
{{ja-usex|これは見本です。|これ は みほん です。|This is an example.}} →
{{ja-r|これは見本です。|^これ は みほん です。|This is an example.|linkto=-}} →
これは見本です。(Kore wa mihon desu., “This is an example.”)
In terms of the wikicode used to call the templates, I'd argue that {{ruby/ja}} is more of a mess, and the syntax is confusingly different from the rest of our Japanese infrastructure.
From the markup example on the Module:ja-ruby page (what {{ruby/ja}} actually invokes):
[[振る|[振](ふ)り]][[仮名|[仮](が)[名](な)]]
Yuck. Granted, part of the problem here is borderline link abuse, but by way of comparison, we could use {{ja-r}} to similar effect, with a more straightforward syntax:
Separately, in looking for examples of {{lj}} just now, I'm finding cases where {{lj}} seems to have been used as a replacement for {{lang|ja}} -- there are no ruby characters provided. See this snippet of the wikicode source at 会う#Japanese, for instance:
@Eirikr For reference, on MacOS, using Chrome, when I copy the text with Ruby in it and paste it into TextEdit I get this:
things like 漢方
かんぽう
(kanpō) are
The same thing happens using Safari, which suggests it's an OS issue, although possibly there are carriage returns in the underlying text that are leading to this. Benwing2 (talk) 23:10, 15 August 2023 (UTC)Reply
Geez. I asked this one in February and again in March to update the documentation of Module:languages/data/2 for the "generate_forms" stuff that is otherwise largely unexplained. With the promise "I'll add it shortly" half a century passed and the documentation is still nowhere to find. Now he suddenly jumps out and complains how Japanese does not follow the Chinese model... -- Huhu9001 (talk) 01:54, 16 August 2023 (UTC)Reply
@Huhu9001, Eirikr My original proposal was to rewrite {{lj}} and {{jaru}} into {{rja}} as a shortcut for {{ruby/ja}}, but given what Eirikr says, maybe we don't need either of them, or {{ruby/ja}} for that matter. It sounds like maybe the best thing is for {{ruby}} to take a language code and use it to wrap the generated text appropriately, and to simply use {{ruby|ja|FOO}} when you really need to display arbitrary ruby that can't be handled by {{ja-r}}. Then we can get rid of {{ruby/ja}} and its shortcuts. Thoughts? Benwing2 (talk) 19:42, 15 August 2023 (UTC)Reply
T:ruby sometimes serves to prevent double wrapping of language HTML classes, mainly in |title= or |chapter= of quotation templates, like this one |title={{lw|ko|s:님의 침묵/생의 예술|{{ruby|[生](생)의 [藝](예)[術](술)}}|tr=Saeng-ui yesul}} in 열.
@Huhu9001 We seriously need to avoid having to wrap one template in another. Maybe we need to make {{ruby}} smarter so that it can handle cases like the one above. Can you enumerate other cases where {{ruby}} gets wrapped in another template, or vice-versa, that can't simply be replaced by the equivalent of {{lang|FOO|{{ruby|...}}}}? Benwing2 (talk) 03:43, 16 August 2023 (UTC)Reply
There are some cases when you want to ruby only a part of text. Then it can be done like: {{lang|LANG|unrubied text, blahblah, {{ruby|somehow rubied text}}, more blahblah}}. One such usage is in 閣下. -- Huhu9001 (talk) 04:14, 16 August 2023 (UTC)Reply
1. Sometimes wrapping the whole text produces repetition. It may become:
{{lang|LANG|unrubied text, blahblah, {{furigana|text to be rubied|ruby}}, more blahblah}} vs
{{furigana|LANG|unrubied text, blahblah, text to be rubied, more blahblah|unrubied text, blahblah, ruby, more blahblah}}
2. Sometimes the wrap is not t:lang. It may be t:quote, like when you have {{quote|ja|text=(ruby text)...}}. How do you "wrap the whole text" for this one?
@Huhu9001 You need to think outside the box a bit. For #1, we're talking for the moment about {{ruby}} not {{furigana}}, but {{furigana}} can be made smarter like {{ruby}} is, so that you can annotate part of the text. For #2, {{quote}} should be modified not to language-tag text that already is language-tagged, so it's OK to write {{ruby|ja|...}} inside of {{quote}}; and/or we make a ruby-quote template, similar to how we already have {{ja-x}}; and/or we add built-in support to {{quote}} for ruby text. In general, having to manually wrap using both {{lang}} and {{ruby}} inside of each other is super ugly and should be avoided. Benwing2 (talk) 05:22, 16 August 2023 (UTC)Reply
@Huhu9001 What I'm probably going to do is modify {{ruby}} so it takes a language param, but you can write {{ruby|-|...}} to force no language wrapping, so that if you really want to embed one template in another, you can do it without fear. Benwing2 (talk) 05:27, 16 August 2023 (UTC)Reply
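The proposed behaviour can be sketched in Python as follows. This is only an illustration of the idea, not the actual {{ruby}} module: the function name `render_ruby`, the regular expression, and the exact HTML emitted are all assumptions. It takes the `[base](ruby)` markup quoted earlier in this discussion, wraps the result in a language-tagged span, and skips the wrapping when the language argument is `-`.

```python
import html
import re

# Matches the [base](ruby) pairs used in the markup examples above.
RUBY_PAIR = re.compile(r"\[([^\[\]]+)\]\(([^()]+)\)")


def render_ruby(lang: str, text: str) -> str:
    """Hypothetical renderer: convert [base](ruby) pairs to <ruby> HTML.

    The result is wrapped in <span lang="..."> unless lang is "-",
    which forces no language wrapping (e.g. when nested in another template).
    """

    def pair_to_html(match: re.Match) -> str:
        base, gloss = match.group(1), match.group(2)
        return f"<ruby>{base}<rp>(</rp><rt>{gloss}</rt><rp>)</rp></ruby>"

    # Escape &, < and > first; brackets and parentheses are left untouched.
    body = RUBY_PAIR.sub(pair_to_html, html.escape(text, quote=False))
    if lang == "-":
        return body
    return f'<span lang="{html.escape(lang)}">{body}</span>'


print(render_ruby("ja", "[振](ふ)り[仮](が)[名](な)"))
print(render_ruby("-", "[生](생)의 [藝](예)[術](술)"))
```

With the `-` form, an outer template such as {{lang}} or {{quote}} would remain the only source of language tagging, which is the double-wrapping concern raised above.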
I don't see any real need for {{lj}} or {{jaru}}, but I haven't looked at any current uses. It seems to me that {{ruby|ja|}} should suffice. As regards e.g. {{ja-r}}, it makes sense to me to use hiragana ruby with kanji, as this is fairly commonly done in Japanese-learning materials. It seems to me (again naively, without having done any specific research into the question) that users are likely to include a fair number of Japanese language learners. Cnilep (talk) 01:49, 16 August 2023 (UTC)Reply
Translingual.
Asked to request a move because this is not attested as 'translingual', being so far found only in Northern Thai. I don't know why we'd want to request a move rather than just fixing the section header. kwami (talk) 14:14, 27 August 2023 (UTC)Reply
The same sense is in both places; it shouldn't be. Either leave it at hinge where it has a usage note about the "(up)on", or move it to hinge upon and reduce the relevant sense-line at hinge to a {{used in phrasal verbs|en|hinge on}} pointer. - -sche(discuss)01:16, 7 September 2023 (UTC)Reply
'Druther have a redirect from hinge on to hinge. Personally, I am loath to assert that hinge on is a phrasal verb. If it really is one, we should have an entry for it.
I would vote for putting the sense under the top one, because that's where people will look for it. Additionally, even if the true etymology is from Latin, it's certainly not widely seen that way, since people will say "I second this" ... "I third this", and so on, rather than using whatever the appropriate Latin word would be. —Soap—16:53, 9 September 2023 (UTC)Reply
All senses are from Latin, displacing native twoth. The problem here is that the "agree" sense is partly (and originally) from Etymology 3, and partly rederived from Etymology 1. Rather than reduplicating the sense a better solution would probably be to stick to one section and note the reinforcement in the etymology. —Al-Muqanna المقنع (talk) 18:24, 9 September 2023 (UTC)Reply
Rather, it muscled in on the territory of native other, for which at Wiktionary we have to go back to Old English ōþer. I can only find twoth as part of a compound ordinal, which is a new function for the meaning 'second'. --RichardW57 (talk) 23:44, 9 September 2023 (UTC)Reply
There are two senses which are probably the same. Etymology 3 - A hollow; a valley, especially the upper end of a narrow mountain valley when it is nearly encircled by smooth, green slopes; a combe. and Etymology 4 - A sloping plain between mountain ridges Jewle V (talk) 18:48, 14 September 2023 (UTC)Reply
English. To be moved to be going to. Most modern grammars and some dictionaries treat be going to (" ~ will") as an idiom. The inflection line for [[going to]] has long been "(be) going to". Edit summaries show that contributors here have thought the expression included be. I can't think of another copula that could substitute for be. I also can't picture anything other than adverbials like yet, still, later, and some other short temporal adverbs (with or without not) appearing between be and going to. IOW, it's close to being a set phrase. The adverbial insertions would look good in some of the usage examples. DCDuring (talk) 18:51, 14 September 2023 (UTC)Reply
Yeah, I did wonder about that after writing the above and apparently in Early Modern usage (according to the chapter I cited as a source for the etymology) it appears without be as well. So I'm not sure. It might be worth first collecting attestations without "be" to verify how to treat the elided use (e.g. it could be moved and the elided version turned into an informal altform). —Al-Muqanna المقنع (talk) 18:08, 19 September 2023 (UTC)Reply
Aren't our normal users better off with a main entry at an unelided form with redirects from any elided forms? In this case, were we to have an additional entry at going to, we should have at least three definitions at going to, to wit, 1. "elided form of be going to"; 2. Used other than figuratively or idiomatically: see going, to.; 2.1 "moving toward" (subsenses for 2.1.1 progressive verb, 2.1.2, for gerund?); and, possibly, 2.2, et seq. DCDuring (talk) 22:11, 19 September 2023 (UTC)Reply
I think I'm thinking of the "be elision" kind of as "be elision" in other cases, such as with "be" + adjective. In other words, the "be" is not core to the construction. But perhaps this is debatable. — justin(r)leung{ (t...) | c=› }02:49, 20 September 2023 (UTC)Reply
Surface comparisons can be misleading. English has lots of collocations that seem word-by-word analogous, but behave differently. I think supposed to is a false parallel. Supposed to can (infrequently) be used with other copulative verbs. I'd like to see evidence that going to is used without be in recent (~200 years) English. The few (2) other OneLook dictionaries (MWOnline, Collins) that cover (be) going to cover it at be going to. DCDuring (talk) 11:58, 19 July 2024 (UTC)Reply
English. I have moved senses (and their translations) between on the way and on one's way, making on the way a lemma with most of the definitions from on one's way rather than an alternative form. It occurs to me that there might be a pondian difference that would account for the previous arrangement. In any event, please review the changes. DCDuring (talk) 15:00, 23 September 2023 (UTC)Reply
1. If a set category is truly restricted to one language (e.g. Translingual), should we leave it at whatever prefixless name it may have, or move it to "mul:" (or whatever other language code is appropriate) and put it into the "set category" system, even if it only exists for one language?
2. Do the categories named above actually only exist in one language (Translingual)? Should .გე, .հայ, .한국, etc go in the same category as .de, or would they belong in "ka:Top-level domain codes" (and "hy:", "ko:", etc)?
@-sche My current tendency is to only create topic and set categories if they exist (or may exist) for more than one language. However, I think this is probably the wrong thing to do. The poscatboiler system supports language-specific categories like Category:Bulgarian conjugation 2.1 verbs and I don't see why we can't do the same in the topic category system. (BTW the poscatboiler system now handles all categories of all sorts except for topic and set categories. I've been thinking for awhile of making it handle topic/set categories as well and eliminate the separate topic category system; this would make it possible to consolidate the generic category code into the poscatboiler system, so there's only one unified category system.) For #2, I'm not really sure, but my instinct is that non-Latin-script top level domains should also be translingual. Note for example that Korea created Korean-specific Latin-script TLD's like .kia, .samsung and .hyundai (see .kr on Wikipedia); if these are translingual I don't see why the Korean-script ones shouldn't be. Benwing2 (talk) 02:22, 26 September 2023 (UTC)Reply
I just added this category to the list template {{ccTLD}} which goes in TLD entries so that all the mainspace transclusions will be in the category. I figured we might as well have the category full until we decide to change it. That means there will be some entries with the category both hard-coded and template-generated. Chuck Entz (talk) 15:34, 12 December 2024 (UTC)Reply
There are multiple constellation systems, but we only have one category for all constellations - contrast this with Category:Chinese astronomy which is a subcategory of Category:Astronomy. In the label tree there is already {{lb|zh|Chinese constellation}} which categorises into Category:LANG:Chinese astronomy and Category:LANG:Constellations, and therefore makes these categories very messy (see e.g. Category:zh:Constellations where terms ending in 座 are in the European system while the rest are the Chinese ones - I'm in the process of adding more for the latter). Also note that there are still a bunch more that have been incorrectly categorised, e.g. Ox which has {{lb|en|astronomy}} rather than {{lb|en|Chinese constellation}}, so there would be a decent amount of terms to warrant a split.
I imagine this was done this way because in the US, "football" universally refers to American football (or occasionally Canadian football, which is quite similar), and never to soccer (except in the names of certain soccer clubs, which often call themselves "football clubs" (F.C. for short) in imitation of European football clubs). But it looks out of place, and Canada similarly refers to Canadian football as just "football" but our category is Category:Canadian football not Category:Football (Canadian). Wikipedia has its article on American football at American football (logically) and similarly for Commons. BTW once we rename Category:Football (American) to Category:American football, we might consider renaming the soccer category to Category:Association football (consistent with Wikipedia), but that's a separate can of worms. Benwing2 (talk) 04:01, 4 October 2023 (UTC)Reply
I would think that our contributors could tolerate a lack of parallelism in topical categories where the base terms reflect common usage and the differentia are in parentheses. This seems like overtidying, letting one's own personal preferences for parallelism override broader, user-oriented considerations. The (non)problem only appears in the Category:Football page. DCDuring (talk) 12:16, 4 October 2023 (UTC)Reply
I would've guessed it was done this way so someone typing "Footba..." into Hotcat (or typing "Category:Footba..." into search) would notice that they needed to specify rather than just using bare "Football". I'm not wedded to the current names, but I don't see a compelling reason to change them, either. - -sche(discuss)05:56, 3 November 2023 (UTC)Reply
Well if some national anthems are referred to with an article and some without I think it would be more helpful for us to document these differences (as we seem to be doing) rather than creating and imposing an artificial consistency. Besides, isn't The in The Call of South Africa part of the official title of the song? Wouldn't removing it be like removing The from The Lion, the Witch and the Wardrobe? — excarnateSojourner (ta·co)20:22, 15 January 2025 (UTC)Reply
Oppose – The citations for both senses almost exclusively feature the hyphenated form. Google Trends results shouldn't be the guidepost here. This would also create unnecessary asymmetry with the antonym anti-shipper. WordyAndNerdy (talk) 23:00, 29 May 2024 (UTC)Reply
@WordyAndNerdy: The citations for the second sense are from print media, which doesn't reflect common usage of the term. If we included social media citations (which we could, per current policy), proshipper would outnumber pro-shipper 10-to-1. As for the asymmetry, this assumes that anti-shipper is more common than antishipper, which isn't necessarily true either. Hyphens in general are becoming increasingly uncommon in English, and slang terms like proshipper are naturally some of the first to reflect this trend. Incidentally, this is why I don't think the citations for the first sense should be counted in this discussion – that sense is rather dated, and not really used anymore in fandom spaces. I don't think it benefits readers to have the lemma form be at a spelling that they're unlikely to even encounter in the first place. Binarystep (talk) 10:48, 30 May 2024 (UTC)Reply
Unhyphenated forms may occur more in online spaces less for orthographic reasons and more as a product of often-poorly-punctuated informal Internet speech. We wouldn't have entries for hannigram or dni based on Internet-speak all-lowercase tweets like "proshippers dni, hannigram sucks" (the links are to the standard forms). This seems like a solution in search of a problem. The current weight of evidence supports the prevalence of the hyphenated form. I don't think we should set out to tip the balance in the other direction for prescriptive reasons. Readers can easily locate the current lemma through the unhyphenated redirect. WordyAndNerdy (talk) 11:15, 30 May 2024 (UTC)Reply
@WordyAndNerdy: There's a difference between all-lowercase typing (which is inconsistent with standard English orthography) and using prefixes without hyphens (which isn't grammatically incorrect). You're right, we wouldn't have entries for hannigram or dni – but we do have an entry for proshipper, because it's an equally valid spelling, not a mere error resulting from informal speech. Further proof is the fact that one can easily find plenty of uses from people who use standard spelling and punctuation. I also find it rather ironic that you'd characterize my reasoning as prescriptive, given that my argument is based solely on frequency of use; if anything, it's more prescriptive to invalidate proshipper on the basis of it being "a product of often-poorly-punctuated informal Internet speech". Binarystep (talk) 19:13, 31 May 2024 (UTC)Reply
Something I discovered while attesting comship/comshipper is that a not-insignificant number of people seem to regard proshipper (unhyphenated) as a blend of problematic + shipper. This strikes me as a rather unlikely folk etymology. Forgive the prescriptiveness but I now favour retaining the hyphenated form to avoid creating confusion over which etymology is more likely to be accurate. The pro- + shipper interpretation is better supported by evidence. WordyAndNerdy (talk) 08:14, 1 June 2024 (UTC)Reply
Oppose based on the cites. Google results are notoriously unreliable (they'll say a search finds X hits, but only display far less); I don't know if Trends are any better. - -sche(discuss)04:54, 30 May 2024 (UTC)Reply
Oppose (provisionally) - I have never seen or used the word, but I do see that Teen Vogue used the unhyphenated version at least once- see [3]. I would like to see more citations on the unhyphenated version before a final analysis be made. I am always wary of downplaying of hyphenated words in favor of unhyphenated words because of what I dimly perceive to be a "systemic bias" (accidental or intentional) against hyphenated words on Wiktionary caused by various factors. --Geographyinitiative (talk) 10:09, 30 May 2024 (UTC)Reply
English slang terms whose usage is typically restricted to users of the website Reddit.
However, I am not sure that "Reddit slang" is a particularly viable subcat of "Internet slang"; a lot of these terms didn't originate on Reddit and aren't actually restricted to Reddit. It probably only seems that way because of the prominence of Reddit as a site where people use Internet slang heavily.
Even if "Reddit" isn't enough for a category, isn't it enough for its own label? And if a usage or, at least, a term originated in Reddit, shouldn't there be an etymology saying so? DCDuring (talk) 22:13, 6 April 2024 (UTC)Reply
They're not. Not all multiword terms are phrases by our reckoning, and the multiword term category contains more than 10 times as many entries as the phrase category. —Mahāgaja · talk08:49, 13 November 2023 (UTC)Reply
The descriptions in the categories are
Hebrew groups of words elaborated to express ideas, not necessarily phrases in the grammatical sense.
and
Hebrew lemmas that are an idiomatic combination of multiple words.
I agree the descriptions aren't clear, but "phrases" in Wiktionary are a grammatical concept and indicate things that can't be clearly classified as nouns, verbs, adjectives and the like, while any POS can be multiword. Benwing2 (talk) 23:09, 14 November 2023 (UTC)Reply
I found 460 hits for "CBSテレビ", versus only one hit for "シービーエス・ソニーグループ" (CBS/Sony Group), in archives for Mainichi Shimbun. I would say the romaji is the common form, and should be used, if the proper name is kept. Cnilep (talk) 04:37, 9 September 2024 (UTC)Reply
Strong oppose. @Musetta6729 and I have discussed this previously in private and have already cleaned up Shanghainese Chinese, which we both found unnecessary as most of the terms in it can be classified as either "chiefly Shanghainese (Wu)" or just plain Shanghainese. As correctly identified previously, the Chinese category contained mostly Wu terms, which we have already dealt with. We have already dealt with the majority of the category's pages, and left four that could also be removed:
鄉下人/乡下人 (shián-gho-gnin), 硬盤/硬盘 (ngan-boe), and 硬盤人/硬盘人 (ngan-boe-gnin) are all generally "xenophobic" terms that can be classed as "chiefly Shanghainese (Wu)" (or something similar)
三環/三环 (sé-gue) is a geographical term that pertains to the city of Shanghai. We can simply remove the Shanghainese Chinese label and deal with it much like the other geographical terms, cf. 筲箕灣/筲箕湾 as just one example
If we implement these two measures, the Chinese category will be completely vacated and can potentially be removed. Even if we do not remove it, I would like for at least some dignity to be given to Shanghainese, as the to-be completely unused label will get the succinct "Shanghai" name while the language of urban Shanghai will be relegated to the term "Shanghainese Wu", which to be frank, we both found somewhat insulting. — 義順 (talk) 12:57, 26 January 2024 (UTC)Reply
@ND381 I am confused why you think "Shanghainese Wu" is insulting, unless you deny that Shanghainese is a variety of Wu. As for the label, that is an orthogonal discussion and we can change it any way we want. Benwing2 (talk) 19:48, 26 January 2024 (UTC)Reply
@Benwing2: Wu is a grouping of languages. No one speaks "Wu". We treat it as part of Chinese for practical reasons, but the Wu languages are quite divergent from the rest of Chinese, and presumably fairly distinct from each other. I suppose they see it as analogous to "English West Germanic" or "Ukrainian East Slavic". Chuck Entz (talk) 20:30, 26 January 2024 (UTC)Reply
I would like to add a bit onto what has already been said here. Shanghai is incredibly complex sociolinguistically, and what is referred to as "Shanghainese" (on wiktionary as much as elsewhere) tends to be the city-centre varieties that developed during the course of the last centuries as a lingua franca between the original Shanghai locals and migrant populations from nearby areas who now constitute a major part of Shanghai.
But Shanghai in fact has a whole range of regional languages - a range of Wu varieties, in fact, which can all be fairly divergent from each other but still very much maintain mutual contact and influence internally. When someone speaks of "Shanghainese", if they don't specify non-city-centre Shanghainese, then one would usually assume they are talking about city-centre or something adjacent to that. But "Shanghainese Wu" feels then more vague somehow as to whether it refers to any dialect, sociolect or topolect that can be considered "a Wu variety of Shanghai which is not necessarily city-centre", a label which is not in itself necessarily useful, and can potentially even be quite confusing in my opinion.
As of now we have been adding modifiers such as "urban" or "suburban" in front of "Shanghainese" when we come across situations where we need to clarify, and that's been working alright. But coming back to the original point, I think it is also just that "Shanghainese Chinese" - which currently is used as "Standard Mandarin terms found in Shanghai" (the language itself not being native to Shanghai, simply spoken in Shanghai for being the official national language) - should arguably not take precedence to the Chinese varieties that are native to Shanghai instead. — Musetta6729 (talk) 02:47, 27 January 2024 (UTC)Reply
This discussion is very much more of a footnote, but what bothers me is that the significantly more irrelevant category gets the label that the language is meant to have (i.e. I would prefer for S’nese the language to get "Shanghainese" or even just "Shanghai" like how other non-top level groups/lects are handled), while we have to settle for the (intentionally obtuse?) mouthful that is "Shanghainese Wu", not even Northern Wu à la Quanzhou Hokkien or Hong Kong Cantonese. Again, this is very much not the main point, and from your profile I'm assuming you don't know that much about socioling and language politics in the area, so it would be, I suppose, easier to leave the discussion here.
The main problem is still just the category: S’nese Wu is unnecessarily obtuse and if we can get back to the point of whether or not we can just clear S’nese Chinese’s four remaining pages we can have a more fruitful consensus — 義順 (talk) 20:34, 26 January 2024 (UTC)Reply
As there have not been any negative comments regarding the vacating of Category:Shanghainese Chinese, I have removed all four remaining entries in the category.
Regarding the situation of the naming convention, unless there are any further objections, the current Category:Shanghainese Wu should be renamed to just "Shanghainese", and S'nese Chinese is to be either kept as is, renamed to something like "Standard Chinese in Shanghai", or deleted. Of the three options, I believe the last one would be best, as there genuinely isn't a need for it: "chiefly Shanghainese" would cover most if not all cases of words in Standarin that are used in Shanghai, as those terms almost always originate from the local variety anyway. Misspellings or Shanghainese-influenced sayings in Standarin that are not found in Shanghainese should perhaps be labelled with "influenced by Shanghainese", if, again, that is necessary, which I highly doubt.
For the time being, the "Shanghainese Wu" label will be renamed to "Shanghainese" as per above discussions, and to stay in line with other "-(n)ese" labels (cf. Hainanese, Sichuanese). If for whatever reason S'nese Chinese (ie. Standarin used in Shanghai that isn't "chiefly Shanghainese") is actually needed, unless there are any objections, something along the lines of "Standard Chinese, Shanghai" or "influenced by Shanghainese" (if appropriate) is to be used, though again, there really aren't any words that would warrant this designation. — 義順 (talk) 13:50, 31 January 2024 (UTC)Reply
Apologies for the late reply; I too am fine with renaming or removing "Category:Shanghainese Chinese" (and updating Module:labels). In some similar situations we've used noun forms instead of adjectives to make this kind of distinction, e.g. "Category:Switzerland German" (for de) was renamed to that name to distinguish it from "Swiss German" the Alemannic lect, so if we need a category for "standard Chinese / Mandarin terms chiefly found in Shanghai", it would fit the overall schema to name it something like "Category:Shanghai Chinese"... but if people just don't want such a category, and want {{lb|zh|Shanghainese}} / {{lb|cmn|Shanghainese}} to throw an error and put the entry in a cleanup category so someone can re-code it as a wu entry, that works too... - -sche(discuss)01:30, 6 March 2024 (UTC)Reply
Comment: If we are trying to make a distinction, one category should be referring to Shanghainese Wu, and another should be referring to any variety spoken in Shanghai (i.e. both Shanghainese Wu and Mandarin). I don't know if this distinction should/can be made, though. — justin(r)leung{ (t...) | c=› }04:06, 11 October 2020 (UTC)Reply
I guess the issue then is, do we have native Shanghainese speakers here who can make this distinction? It looks to me like most entries in both categories are Wu terms. Benwing2 (talk) 22:05, 11 October 2020 (UTC)Reply
If we have any entries that make this distinction (and one such entry has been convincingly adduced above), then merger would result in losing information. Do you want Shanghai-specific Mandarin terms to go uncategorised as such? —Μετάknowledgediscuss/deeds03:26, 12 October 2020 (UTC)Reply
@Benwing2, Metaknowledge: @Thedarkknightli probably knows the Mandarin terms and may know some of the Wu terms. For Shanghainese, we have some resources we can consult, so it's the Mandarin terms that are more difficult to figure out. The terms that are in CAT:Shanghainese are Wu for sure (and I would prefer to call the category "Shanghainese Wu" to make it clear). We would need to sift through the CAT:Shanghainese Chinese category to check what's actually Wu and relabel them with "Shanghainese Wu" or just "Wu". BTW, there might be some need to revamp other labels/categories, like "Sichuan" displaying as "Sichuanese" and categorizing to CAT:Sichuanese Mandarin, which could be confusing when we introduce terms in Sichuanese Hakka or Xiang (which we might have some already). — justin(r)leung{ (t...) | c=› }03:40, 12 October 2020 (UTC)Reply
(edit conflict) A native Shanghainese speaker would be User:辛时雨 but he is not very active.
What we lack with regional labels, which is specific to Chinese since the merger needs to work for varieties and subvarieties, is the ability to add variety-specific categories. {{lb|zh|Shanghai|Wu}} is meant to not only label a term but also categorise it as Shanghainese Wu, but {{lb|zh|Shanghai}} is for general Chinese, esp. Mandarin. --Anatoli T.(обсудить/вклад)03:43, 12 October 2020 (UTC)Reply
I think you would need to use {{lb|zh|Shanghai Wu}} or something, not {{lb|zh|Shanghai|Wu}}, since I don't think the same label ("Shanghai") can categorize into two categories. Anyway, add my voice to those saying that if there is intended to be a distinction here, the category names (and, probably, boilerplate texts) should be made clearer. We could also consider "see also"-style crossreferencing them, like Category:Louisiana French and Category:Louisiana Creole French language. - -sche(discuss)17:26, 13 October 2020 (UTC)Reply
Still not sure about Murex pecten's vernacular names.
The main issue here is strictly a matter of orthographic rules: how do you spell the combination of the possessive clitic, 's, with a word that ends in "s" in the singular? I was taught " s' ", but it looks like professionally edited works have used " s's " as well, or just avoided the issue by omitting the clitic. There's variation along those lines for both the plant and the mollusk. I suspect the differences in occurrence of the spellings have as much to do with time and place of publication as with any difference between usage of the plant name vs. the animal name. Chuck Entz (talk) 02:17, 31 December 2023 (UTC)Reply
This seems like a cleanup operation, covering several entries and potential entries.
No other OneLook dictionary has open-pit mine. MWOnline, Oxford, Dictionary.com, and Collins have open-pit, Collins having it as a noun. (Attestable as noun, but SoP?) Also we have some of opencast, open-cast, open cast. We should use GoogleNGrams to determine the most common for each of the -pit, -cast, and -cut forms, use Google Books/News to determine which are attestable, include all attestable forms as alt forms, and make sure that at least the main forms show the main form of the other groups as synonyms. There is also the possibility that some of these are used adverbially. DCDuring (talk) 15:18, 19 January 2024 (UTC)Reply
Agreed. As a general rule we don't change the spelling of terms with Pondian differences once the entry has settled on one spelling or another. (There are exceptions, e.g. if British spelling allows both A and B equally and American spelling prefers B, I think it would be reasonable to move a term spelled as A to B.) Benwing2 (talk) 02:53, 27 January 2024 (UTC)Reply
If we check frequency, we should be ready to change which is the main entry. Google NGrams makes it easy, though it covers books (only?). Whether the criterion should be recent usage or all usage is a matter of judgment, at least for now. One can also search in News for usage by location (nation, province/state?) of the source. DCDuring (talk) 16:06, 27 January 2024 (UTC)Reply
Latest comment: 10 months ago, 3 comments, 3 people in discussion
I understand that the distinction between 's and -'s is that the former is a contraction of is, was or has and the latter is a possessive, but I think this distinction is likely to be lost on the majority of Wiktionary users and is better made by merging both pages to 's and making the distinction using different Etymology sections. As it is, there is some duplication between these two entries. Benwing2 (talk) 22:44, 31 January 2024 (UTC)Reply
(Oppose unless it can be demonstrated that we don't normally lemmatize suffixes like this at titles with hyphens.) I'm very sympathetic to the fact that content being somewhere that some people don't expect is a problem, and to the need to prominently flag when content is on a different page than some people expect, not just in this kind of case, but also e.g. when we usually lemmatize singulars but occasionally put some senses at the plural, or usually lemmatize without "the" but occasionally have some senses at separate "the X" entries, or when we lemmatize phrasal verbs outside the main verb entry. I'm a big fan of Template:used in phrasal verbs and "See..." links like at message. But if lemmatizing the possessive at -'s is technically correct and is consistent with how we treat other suffixes, then we should continue lemmatizing at -'s and just take whatever other measures we can to obnoxiously prominently crosslink it to and from the other page... because if we make an exception and lemmatize this page at an incorrect title, it's inconsistent with other entries... do we also move -'#English? What about -'s#German and -'#German? What about -s? And that inconsistency confuses other users and editors who do understand our system, and look in the right/expected place, only to find that the content isn't there because we moved it to an incorrect/inconsistent place to try to outsmart them. I think we have to do things consistently (e.g. if suffixes usually start with hyphens, do so here too), and use prominent "See also..." links where necessary. For verbs linking to phrasal verbs, and for things like message, such links can just be on definition lines; here, I'd be fine with the link taking the form of a big T:LDL-esque yellow box or something if people want, if people feel a ===See also=== link is insufficient. Obviously, any incorrect duplication should be cleaned up. - -sche(discuss)14:31, 29 March 2024 (UTC)Reply
Latest comment: 10 months ago, 2 comments, 2 people in discussion
I can't find anyone reconstructing *kwh₂et-. Most sources seem to have trouble deriving the Slavic, Latin, and Armenian words from the same root. They probably don't belong here, but I don't know enough about these languages to decide. —Caoimhin ceallach (talk) 17:02, 23 March 2024 (UTC)Reply
I agree there is not enough distinction. I think the distinction some people hope for is "unknown means no-one has any ideas, uncertain means people have ideas" (?), but I'm sure I've seen even other dictionaries use "Uncertain." as the complete etymology for a word they have no ideas about, and conversely I've seen things like "Unknown. Theories include..."; there is no logical or maintainable distinction; if you're not certain what the etymology is, you don't know (with certainty) what it is (you just hypothesize), and conversely if it's unknown you're not certain what it is. I would not object to renaming the category as Benwing proposes, but I would also not object to just merging "unknown" into "uncertain" (or vice versa). - -sche(discuss)15:55, 4 April 2024 (UTC)Reply
The argument is fallacious because editors regularly do not have precious knowledge about existence and extent of previous attempts, so template application is quite a guess and theology. Given that the different categorization invites wasteful concerns of editors (adding to the learning curve load), I do not only not see the utility of it but also reckon it harmful, and am also sure that Metaknowledge would position himself likewise, as confronted by my argument about underspecified species names vs. uncertain meaning words on Talk:بركة. If you go from unknownness to uncertainty you can also visit underspecification and other more “science-theoretical” details that can only be left to philosophy papers nobody will actually want to write. Fay Freak (talk) 16:27, 4 April 2024 (UTC)Reply
Coming back to this, if the distinction is not clear enough to have two categories, then why have two templates? Could we merge the templates, as @-sche mentioned? AG202 (talk) 02:51, 7 January 2025 (UTC)Reply
Fine with me. We seem to have an apparent consensus for merging (you, me, Victar, Surjection, -sche and I think FF, although as usual his writing is impenetrable). Maybe ask on Discord to see if anyone else has any opinions? If not I can go forward with it. I think the merged template should be called {{uncertain}} because it's rare, at least in well-researched languages, for an etymology to be truly unknown; typically there are various speculations. Benwing2 (talk) 04:22, 7 January 2025 (UTC)Reply
I agree we should merge the templates, as there is no clearly definable difference between an unknown etymology and an uncertain one. —Mahāgaja · talk08:19, 7 January 2025 (UTC)Reply
@This, that and the other Support. The existing categories are especially problematic when you have multiple ideographic description characters, such as ⿰⿳⿰SIR木阝. However, why are you proposing to use "entry titles" in the category instead of just "terms"? Benwing2 (talk) 06:31, 5 April 2024 (UTC)Reply
Well, the term itself is not spelled with the ideographic description character. That's just a consequence of the fact the character is not encoded in Unicode. Nobody would consider these characters to be part of the spelling of the term. Moreover, it's ludicrous to say that ⿰亻尭 is spelled with ⿰ when 侥 is not – they are both equally composed of two CJK characters placed side-by-side (not sure of the technical CJK term for that). Compare this to Category:Translingual terms spelled with ◌́, which includes terms that use the combining accent character as well as those using precomposed Unicode characters, hence truly containing all terms spelled with the accent. This, that and the other (talk) 09:31, 5 April 2024 (UTC)Reply
Hah, I see you didn't actually argue for the use of the word "spelled". Whoops! I guess my argument against "terms" still runs along the same lines though. The terms themselves do not use these sequences, it is their Unicode encodings of the entry titles that do. This, that and the other (talk) 09:33, 5 April 2024 (UTC)Reply
@Benwing2 I see this situation as unique, on the grounds that no other category tree picks so heavily on the specific Unicode encoding of the entry title at the total exclusion of the term's actual, human-centred visual appearance or orthography. Even Category:English terms spelled with ◌́ includes titles that use the precomposed characters like é.
Anyway, I'm not going to press the point any further - a merger is the most valuable outcome here. I'd be satisfied to merge to "Category:Translingual terms spelled with/using ideographic description sequences" or any similar name. This, that and the other (talk) 04:05, 1 December 2024 (UTC)Reply
@Benwing2 @This, that and the other There are two ways to handle these: where possible, I think we should do what Module:zh-pron currently does by constructing the character (e.g. see Category:Chinese terms spelled with ⿰氵厶). Where that isn't possible, either the term is a mistake, or the IDS characters are being used for some other purpose (i.e. the term actually contains them as characters), so they should retain the current separate categories. I can only imagine the latter case coming up with emoticons. What I would oppose would be any kind of category like "terms using IDS" or whatever - the IDS are just a tool to represent unencoded characters; we, as a dictionary, only care about them insofar as they help us create entries, but the terms are in no way actually spelled with them (with the exception of the emoticon example I mentioned before). TTATO isn't being nitpicky by pointing this out, imo - it's actually a crucial distinction. Theknightwho (talk) 04:32, 1 December 2024 (UTC)Reply
@Theknightwho The reason I proposed this merge is that it seems valuable or useful to have a category keeping track of entries using IDS in their titles - the Han characters using IDS are necessarily unusual or exceptional in some way and it somehow makes sense to me to group them into the category. Splitting by the individual IDS character used doesn't seem worthwhile, but an overarching category may be. (Perhaps farfetched, but I could imagine it being useful for people looking for new Han characters to propose for inclusion in Unicode!) This, that and the other (talk) 04:49, 1 December 2024 (UTC)Reply
@This, that and the other I have no problem with having an IDS category, but it should be “entries with IDS in the title” or something, not “terms spelled with”, since that’s just factually wrong, as the IDS is only there due to the fact these characters haven’t been encoded yet.
I don’t think there’s a problem in having separate categories for them - they’re still separate characters in their own right, and the only thing that unites them is the fact they aren’t encoded, which isn’t lexically relevant. Chinese and Japanese already do just fine with them, and the general headword template only creates those categories when the title consists of more than a single character anyway (note that I’m taking a complete IDS sequence to be one character). Chinese and Japanese have language-specific reasons for creating categories for single-character entries, but most (maybe all) our IDS entries wouldn’t create Translingual categories for this reason in the first place. They only exist now because the headword module doesn’t know IDS sequences are special. Theknightwho (talk) 12:30, 1 December 2024 (UTC)Reply
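(Illustrative note on the distinction drawn in this thread.) The difference between a title that merely contains an ideographic description character and a term that is genuinely "spelled with" a diacritic can be sketched in a few lines of Python. This is only a rough sketch under stated assumptions: the function names are hypothetical, it is not how Module:headword or Module:zh-pron actually work, and it assumes the Ideographic Description Characters block is U+2FF0 to U+2FFF.

```python
import unicodedata

# Assumption: the Ideographic Description Characters block is U+2FF0..U+2FFF.
IDC_FIRST, IDC_LAST = 0x2FF0, 0x2FFF


def ids_characters(title: str) -> list[str]:
    """Return the ideographic description characters in a title, in order.

    Hypothetical helper: this looks only at the Unicode encoding of the
    entry title and says nothing about how the term itself is spelled.
    """
    return [ch for ch in title if IDC_FIRST <= ord(ch) <= IDC_LAST]


def uses_combining_acute(title: str) -> bool:
    """True if the title contains U+0301 after canonical decomposition (NFD).

    Decomposing first means precomposed characters such as U+00E9 and the
    sequence 'e' + U+0301 are treated alike.
    """
    return "\u0301" in unicodedata.normalize("NFD", title)
```

With these helpers, ids_characters("⿰亻尭") finds the leading ⿰ while ids_characters("侥") finds nothing, even though both characters have a left-right structure; uses_combining_acute, by contrast, gives the same answer for the precomposed and the decomposed spelling of é. That asymmetry is the reason "entries with IDS in the title" describes such a category more accurately than "terms spelled with".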
Latest comment: 10 months ago, 1 comment, 1 person in discussion
English. Move/convert to Appendix. Any red-linked item included in this automatically causes that page to be "wanted", thereby clogging Special:WantedPages with pages almost all or all of the "wants" for which are created by the template. There are now 13 such redlinks.
Support. "France French" fits existing practice, as you say; we already use nouns rather than adjectives in some other cases, like "Switzerland German" to avoid the ambiguity of "Swiss German". In fact, now that it's possible to have labels categorize differently for different languages, we could consider changing "Switzerland French" and "Switzerland Italian" back to "Swiss...", since those two are not ambiguous and were just collateral damage of people wanting to rename the German category. But in the other direction... I wonder if we should consider changing not only "French French" but also e.g. "French Yiddish" to "France Yiddish", and "Vietnamese Chinese" to "Vietnam Chinese": I wonder if we should in general try to avoid categories that look like "[language name] [language name]". But that's probably a bigger discussion... - -sche(discuss)05:58, 15 April 2024 (UTC)Reply
In the vein of "Peninsular Spanish", it occurs to me that "French French" could be "Metropolitan French" (though then people unfamiliar with that term might think it means French spoken in metropolises, so I don't know if that's better or worse than "France French"). "England English" seems to be an actual term I can find in use (contrasted with e.g. "American English" and "Australian English"). - -sche(discuss)15:52, 15 April 2024 (UTC)Reply
Latest comment: 8 months ago, 10 comments, 5 people in discussion
We give this as a noun, but our cite shows it's a verb that can occur in other tenses. Do we move it to tail wag the dog? Or do we consider it too awkward to find a tail-containing title for the verb to live at, leave the verb on wag the dog, and make this entry a phrase "the tail is wagging the dog"? Either way, it needs to be moved and re-POSed, no? - -sche(discuss)06:33, 19 May 2024 (UTC)Reply
As worded it is clearly and correctly an NP headed by tail.
The lexicographic issue is the appropriate headword, which, in our case, is influenced by our avoidance of the idiom PoS. MWOnline has "the tail wagging the dog" as an idiom. Most OneLook dictionaries don't seem to cover this at all. DCDuring (talk) 14:21, 19 May 2024 (UTC)Reply
@Benwing2 I think we should have a policy for phrases in English which can take multiple tenses, as this comes up relatively often and it would be nice to have something to point to (e.g. time stand still was recently moved to time stands still after quite a long thread at WT:RFDE). As with other parts of speech, I’d prefer we had a consistent lemma format, even if it’s not usually said that way (e.g. lemmatising at kiss one's ass goodbye, which I can only find one durable use for with the pronoun one, despite being relatively common). Theknightwho (talk) 11:58, 6 June 2024 (UTC)Reply
What should be the lemma? Should there be entries, redirects, or nothing for classes of the often-numerous alternative forms (variations in verb inflection, number, pronoun, determiners, grammatical structure, licensed adjective or adverbs, etc)? Do we have to research relative frequency of the forms to make these decisions? How should the variations be acknowledged on the lemma entry? What differences by language type or individual language? DCDuring (talk) 13:16, 6 June 2024 (UTC)Reply
This is the kind of thing that, I believe, other dictionaries cover in a style guide. We could use Wiktionary:Style guide as a location for a set of subpages on relatively narrow lexicographic issues, so that they would be easy to find. Entry types, like this one, that recur would benefit from some principles inferred from examples and will probably generate disagreement, but not major conflict. We could have votes and make individual subpages policy, but that should not be necessary. DCDuring (talk) 12:48, 6 June 2024 (UTC)Reply
@Theknightwho I totally agree. Sometimes I've moved pages to try to make them more consistent, which sometimes led to complaints, so I think a style guide or whatever would be very helpful. For example:
Those seem like good rules to me. There is an interaction with what I think is our preference not to have headwords with leading the. Also, to clarify, when you say infinitive you mean the 'bare infinitive', not the 'to infinitive'. When should something be used instead of someone? (Does it depend on the relative frequency of use of the expression with non-gendered things? Threshold?) Are there circumstances in which we would go with a different lemma headword? Should we have alt form entries for some of the inflected and other variant forms or just hard redirects? I don't know how complete we should try to be. Too much detail might delay implementation and course correction. DCDuring (talk) 01:53, 7 June 2024 (UTC)Reply
@DCDuring These are good questions. You are right that I mean "bare infinitive" rather than "to-infinitive". As for something vs. someone, I think if it can reasonably occur with both, one should be a soft redirect to the other. Generally I prefer soft redirects over hard redirects, although I understand that hard redirects are easier to enter. Another issue is, what's the inanimate equivalent of one's? Is it its? I will bring these rules to the BP and see what people say. Benwing2 (talk) 03:01, 7 June 2024 (UTC)Reply
Latest comment: 8 months ago, 2 comments, 2 people in discussion
It seems like this is just a capitalization of the lowercase common noun peraia when referring to a specific one. See Citations:peraia (one cite explicitly says "'The peraia' is not a place name but a common noun"). Should Peraia be moved to lowercase and redefined as a common (rather than proper) noun, maybe with the capitalized form left as an {{altcase}}? (Or, compared to e.g. "the boundaries of the (City|County|State|Province|Duchy|Kingdom) in the area of the river", does the Greek-ness or some other factor make the meaning of the capitalized entry different or unintuitive enough that we should have both?) - -sche(discuss)18:46, 23 May 2024 (UTC)Reply
While we’re at it, could we add an experimental parameter (to this and {{altform}}) that disables “standard” categorisation (like POS, gender, etc) and dumps words into a “[language] alternative forms and spellings” category? Nicodene (talk) 04:04, 29 May 2024 (UTC)Reply
@Nicodene There would have to be an extra setting passed to full_headword() in Module:headword that indicates that the term is an alt form, which would change the categorization to Category:Franco-Provençal alternative forms and spellings (or whatever) and would disable all the normal categorization into lemmas, by gender, etc. You'd then need to thread a param for this through the various frp-* templates. This is assuming you want whatever inflections/etc. get auto-generated by Module:frp-headword; if not, you could just say {{head|frp|alt form}} or whatever. Benwing2 (talk) 08:16, 6 June 2024 (UTC)Reply
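(Illustrative note.) The behaviour described above, one extra flag that swaps the usual lemma, part-of-speech and gender categories for a single alternative-forms category, can be modelled as a toy in Python. This is only a sketch under stated assumptions: the function name, its parameters and the category-name patterns are hypothetical stand-ins, and it is not the actual logic of full_headword() in Module:headword.

```python
def headword_categories(lang_name, pos_plural, gender=None, alt_form=False):
    """Toy model of headword categorisation with an alt-form switch.

    Hypothetical: names and category patterns are illustrative only and
    do not reproduce Module:headword.
    """
    if alt_form:
        # The flag disables all of the normal categorisation below.
        return [f"Category:{lang_name} alternative forms and spellings"]

    categories = [
        f"Category:{lang_name} lemmas",
        f"Category:{lang_name} {pos_plural}",
    ]
    if gender:
        categories.append(f"Category:{lang_name} {gender} {pos_plural}")
    return categories
```

Called as headword_categories("Franco-Provençal", "nouns", gender="masculine"), the sketch returns the three usual categories; adding alt_form=True returns only Category:Franco-Provençal alternative forms and spellings. In the real setup the flag would still have to be threaded through each of the frp-* templates, as noted above.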
I guess admins should still decide which languages (like Chinese) shouldn't be merged, but the borrowed terms of a child language should be a subcategory of those of the parent language. This already works for derived terms; it just needs to be implemented for borrowed terms as well. 𝄽ysrael214 (talk) 20:59, 7 June 2024 (UTC)Reply
I don't understand why you want to treat these alternative forms differently than other alternative forms that have their own entries.... Kiwima (talk) 18:04, 8 June 2024 (UTC)Reply
Latest comment: 8 months ago, 18 comments, 5 people in discussion
"Aphetic form" is just a fancy way of saying "clipping at the beginning". I doubt we need to make such a fine distinction, and using opaque linguistic jargon is IMO not helpful. I propose eliminating "aphetic form (of)" in favor of "clipping (of)". Pinging User:PUC (creator of {{aphetic form of}}, with only 60 uses) and User:Adam78 (creator of {{aphetic form}}, with only 71 uses). Benwing2 (talk) 00:13, 9 June 2024 (UTC)Reply
‘Clipping’ covers all of these, is a proper term used in linguistics, and is much more comprehensible to the average person. Nicodene (talk) 01:36, 9 June 2024 (UTC)Reply
To me, clipping has a slightly different connotation. Aphesis, apocope, and syncope suggest that the word was reduced slightly by rapid speech or "laziness" whereas a clipping has been shortened in a much more substantial way, like favourite > fave or unprofessional > unprofesh (where an entire chunk of the word has been "clipped" off). Ioaxxere (talk) 21:26, 9 June 2024 (UTC)Reply
@Ioaxxere: They are, formally, subsets of clipping. The following are synonymous (confirmable by searching them):
It's not strictly incorrect to call them clipped forms, but it's not how people actually think of them. Anyone seeing the Ancient Greek forms labeled "clippings" in a dictionary would be baffled by it, and would want to correct what they would perceive as either mistaken or laughably pedantic. I'm trying to think of a parallel case where something that's technically accurate falls so short of common sense that we can't realistically expect anyone to do it. The best I can think of right now off the top of my head is the fact that we categorize CAT:Birds directly under CAT:Vertebrates and not under CAT:Theropods < CAT:Dinosaurs < CAT:Reptiles. Yes, birds are technically theropod dinosaurs, but classifying them as such here would fly in the face of how people actually think. —Mahāgaja · talk07:28, 10 June 2024 (UTC)Reply
I don't know where you got this impression? I do think of them as clipped forms, because that is what they are. And of the two words clipping and apocope, the latter is infinitely more pedantic – the average person wouldn't even recognize it.
If it's useful to have a specific subcategory for forms produced by regular final clipping in Greek (or Italian), that's fine. Nicodene (talk) 07:44, 10 June 2024 (UTC)Reply
Hmm, the average person wouldn't recognize the term "apocope"? That doesn't make it pedantic, that makes it unfamiliar. If only there were a website that provided definitions of words so that people who were unfamiliar with them could look them up and find out what they mean. Something like a free online dictionary, maybe. —Mahāgaja · talk07:51, 10 June 2024 (UTC)Reply
I don't know what definition of pedantic we're supposed to be operating with, then, because in my world, insisting on fine-grain distinctions and the usage of unnecessarily obscure terminology is in fact more pedantic than not doing so.
The technical terms are used for phonologically-induced changes. Most of what I would call clippings are more a matter of style than necessity. There's no phonological process that changes "brother" and "sister" to "bro" and "sis" - otherwise we'd be calling parents "mo" and "fa". Chuck Entz (talk) 08:42, 10 June 2024 (UTC)Reply
There is nothing about phonological regularity (nor for that matter style or necessity) in the actual definition of apocope. Likewise syncope, aphesis, and clipping. Nicodene (talk) 09:32, 10 June 2024 (UTC)Reply
@Mahagaja There is already a problem with editors having a hard time keeping apart ellipses from clippings, and having three more subvarieties of clippings just adds to the confusion, not to mention the opaqueness of using Greek-origin terms that few people have ever heard of. Possibly we could keep apocope of referring specifically to loss of a single final vowel, since this occurs frequently in Greek and Italian, but I would be definitely opposed to keeping aphesis and syncope, which are underused and I would posit are even more obscure than apocope (since apocope is somewhat well-known specifically in the context of the languages where it is a regular process). Benwing2 (talk) 07:16, 10 June 2024 (UTC)Reply
@Benwing2: "There is already a problem with editors having a hard time keeping apart ellipses from clippings." Actually, by a cursory scroll through User:Ioaxxere/ellipses I couldn't find a single instance where {{ellipsis of}} was used improperly. Also, I would like to note that not all syncopic forms are "clippings" by our definition ("a short form created by removing syllables"), since (for example) collard and its etymon colewort have the same number of syllables. Ioaxxere (talk) 20:52, 10 June 2024 (UTC)Reply
My understanding is that Han script forms are typically only used by Chinese researchers for convenience; while they can be defensible for Sinitic morphemes under our current "unified Chinese" scheme, keeping them for loanwords seems forced and unnecessary. —Fish bowl (talk) 23:04, 13 June 2024 (UTC)Reply
@Fish bowl: Super late reply, but I think they can be kept, maybe as alternative forms. 耶提目 is used in some varieties of Lanyin Mandarin / Central Plains Mandarin beside Dungan, I believe, so that should all the more be kept. — justin(r)leung{ (t...) | c=› }05:05, 30 January 2025 (UTC)Reply
The question is not whether an alternative term page is necessary (It is.) but which form is more common. We also need an English L2 for logatom. DCDuring (talk) 12:47, 12 July 2024 (UTC)Reply
Latest comment: 5 months ago, 7 comments, 3 people in discussion
I'm really not keen on these two categories, because they don't really make sense with the way that Japanese is traditionally analysed (and the way we treat it everywhere else on Wiktionary). For instance:
き(ki) is described as "the seventh syllable in the gojūon order", but the etymology section clearly refers to the origin of the kana itself (i.e. the glyph), not the development of the sound in Japanese. The distinction is clearer if you remember that あ(a) and ア(a) are distinct kana that refer to the same mora (a).
キャ(kya) is described as a "katakana syllable", and while it can function as a syllable, if you were to analyse Japanese syllabically, you could rightly say that キャン(kyan, “kiang”) consists of one syllable that can be broken down into two morae: キャ(kya) and ン(n). I don't think anyone would support having a syllable entry for キャン(kyan), though, since there's nothing meaningful about that. This is in contrast to every other language in Category:Syllables by language, where you can't subdivide their syllables into component units (other than letters, in some cases).
Category:Japanese combining forms is used as a kludge to get around the problems caused by calling (full-size) kana syllables, as it's a dumping ground for the kana (and other glyphs) that can't be analysed as syllables. This is mostly okay for vowels like ゃ(-ya), which is described as a "combining form of や(ya) used in yōon mora ...", but it makes a lot less sense with っ and ー, which are full morae in their own right, and therefore function in a completely different way to the small vowel kana. However, because they can't form independent syllables, they've been shoved into the same category. I can also see that 酒 and 水 have been put in there as well, for some reason, which suggests this category just causes confusion at best.
I therefore suggest the following:
Allow a "kana" part of speech (as an alias of "letter", in the same way "kanji" is an alias of "Han character"), which should be used for full-size and small kana. The definitions should refer to the glyphs themselves, so the kana entries for あ(a) and ア(a) would be distinct, since they belong to different systems and have different origins, even though they refer to the same sound. This also goes for any hentaigana etc.
Allow a "mora" part of speech, which should encompass yōon like キャ(kya), but also the gojūon as well, which are written with a single kana. In that respect, きゃ(kya) and キャ(kya) both refer to the same mora, so it's fine for one or other to be an alt form.
As a side point, I also think these entries need serious cleaning up, as I'm not convinced some of these morae actually exist. For instance, ゐゅ(wyu) claims it is "rarely used, with うゅ seeing more use", but is うゅ(wyu) even used in the first place? Seems like someone just got overexcited and created all the theoretical syllables they could think of.
I agree that Category:Japanese syllables seems like an inappropriate way to categorize what are orthographic representations of morae. I'm not totally convinced that kana and mora are "parts of speech" in the sense of grammatical roles, but as categories for things in a broad dictionary, they are probably better than calling e.g. え a "syllable". Similarly, the small kana are not "combining forms" in any real sense – though I would argue that things such as 酒 and 水 really are. I think those were categorized automatically because the lemma entries use {{com form}}. So in the latter case, the problem may be how the category is currently used rather than the category as such. To the side point, I can't recall ever seeing ゐゅ and can't imagine it being used, but one never can tell. Advertising, for example, sometimes uses bizarre forms to capture attention. Cnilep (talk) 23:31, 17 July 2024 (UTC)Reply
@Cnilep I completely agree that "mora" and "kana" aren't parts of speech in the strict sense, but I don't think it's possible to craft a definition that excludes them while still including "syllable" or "letter", which we use quite widely cross-linguistically (especially "letter"). If we still want to keep using the "combining forms" category for 酒 and 水 then I have no issue with that, but that's quite a different meaning of "combining form" (more akin to an affix). Theknightwho (talk) 23:47, 17 July 2024 (UTC)Reply
For my part, the inclusion of 酒(saka-) and 水(mi-) in Category:Japanese_combining_forms looks like a mistake, brought on by unclear definitions.
As I understand it, Category:Japanese_combining_forms is intended for combining orthographic forms, while the saka- and mi- readings for 酒 and 水 are combining morphophonemic forms, relating (in part) to still-poorly-understood vowel-fronting behavior seen in certain ancient nouns when used as standalone nouns or the latter element in a compound, versus when used as the first component in a compound; and (in part) to how certain ancient nouns could appear in compounds in abbreviated forms (perhaps the original words? or perhaps as contractions? uncertain).
Similarly, the inclusion of ん in this category also appears to be a mistake -- this glyph is not a combining orthographic form, nor does any such exist (AFAIK).
If you mean the "I therefore suggest the following:" part above about new pseudo-POS headers and consequent entry restructuring, I support your proposal 👍, with the addendum that I think we need to treat combining orthographic forms separately from combining morphophonemic forms, regarding 酒(saka-) and 水(mi-).
@Eirikr Thanks - good to know. Re the categories, they can be deleted right now tbh, since they’re empty. If we do discover any bizarre terms using those morae and add them, they’ll get populated/readded automatically anyway. Theknightwho (talk) 19:21, 21 August 2024 (UTC)Reply
Poking around briefly in online search results, it looks like the katakana form is amply confirmable for this sense of reverse-"Amazon OK". The hiragana, however, appears to be just the straightforward SOP of この(kono, “this”) + ざま(zama, “pitiful appearance / situation / state”), and as SOP should probably not have an entry.
Rather than outright moving, I'd suggest keeping コノザマ(konozama) as an alt form and having the lemma at konozama(konozama). (Can anyone explain why the heck {{m|ja|konozama}} is outputting romanization on a romanized term? Not wanted.) For instance, compare:
The current title is awkward (IMO). I would tentatively suggest moving to dogs are barking with redirects from any other commonly-attested forms like dogs were barking. Perhaps someone has a better idea, or even wants to defend the current name as best. - -sche(discuss)03:23, 30 July 2024 (UTC)Reply
Agree with महागज. Latin spellings of Faliscan are more rare here, so it's better to move it to the original spelling i guess. Or do you find it problematic? Tollef Salemann (talk) 21:35, 8 August 2024 (UTC)Reply
Let's give Mellohi! a chance to explain why he moved it from *Wenikaros to *Uenicaros in the first place before moving it back. And why do we need a separate page for the Gaulish reconstruction at all when we have RC:Proto-Celtic/Wenikaros? The latter can list a Gaulish reconstruction without a link and then put Latin Venicarus under that. Having a whole separate entry for the Gaulish reconstruction feels unnecessary. —Mahāgaja · talk07:21, 10 August 2024 (UTC)Reply
I was basically moving all the Gaulish reconstructed entries to match up with the orthography of actual Gaulish inscriptions (many Latin-script Gaulish inscriptions exist). Latin-script Gaulish inscriptions basically used U for /w/ (and rendered with the letter U in scholarly mentions of words with the glide) and C for /k/. — Ceso femmuin mbolgaig mbung, mellohi! (投稿) 07:58, 10 August 2024 (UTC)Reply
Alternatively “Category:English terms spelled with numbers” should be renamed to an Arabic-numeral category and instead be a supercategory for Arabic- and Roman-numeral subcategories. J3133 (talk) 06:07, 11 August 2024 (UTC)Reply
I am not convinced that there is any need for action on either issue:
The "spelled with" categories call out the specific characters used to spell the term, with no concern for what they are being used to represent. The entries featuring Roman numerals in their titles are "spelled with" I, X, etc.
Looking at the situation in English in isolation, the term "number" is not ambiguous and changing it to "Arabic numerals" seems somewhat pedantic. The situation could be different in languages which use other systems of numeration, but I'm not familiar with these and can't comment.
I suggest we remove the word "Scottish" from the following categories, as it is redundant and in most if not all cases less common than the name with just plain "Gaelic":
Sounds reasonable to me. I agree Scottish in this context is redundant. (Arran Gaelic might be slightly confusing to some due to Aran islands being a Gaeltacht area in Ireland… but I don’t think the impact of this would be significant anyway, the spelling is different and specific dialects of Ireland are rarely referred to with the word Gaelic) // Silmeth@talk00:00, 24 August 2024 (UTC)Reply
"Canadian Gaelic" might be confusing for similar reasons: without knowing about Scottish immigration in Canada, it might not be clear which Gaelic. Also, I hope the module isn't just tacking the language name to the end of the variety to make the category. Chuck Entz (talk) 00:25, 24 August 2024 (UTC)Reply
@Chuck Entz: Canadian Scottish Gaelic is the one in the list for which I am most amenable to keeping "Scottish" in the name. And yes, Module:labels/data/lang/gd lists regional_categories = true, which means it is just tacking the language name to the end of the variety to make the category. But we can change it to plain_categories = "Argyll Gaelic", etc. to change the category name. —Mahāgaja · talk09:39, 24 August 2024 (UTC)Reply
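The naming behaviour described in the comment above can be sketched roughly as follows. This is only an illustrative Python model, not the actual Lua code of Module:labels: the function name `category_for_label` and its parameters are hypothetical, and only the two settings mentioned above (`regional_categories = true` and `plain_categories = "..."`) are modelled.

```python
# Hypothetical model of how a regional label might map to a category name.
# It mirrors only the behaviour described in the discussion; the real
# module is written in Lua and may handle more cases.

def category_for_label(label, lang_name, regional_categories=False,
                       plain_categories=None):
    """Return the category name for a label, or None if it has no category."""
    if plain_categories:
        # An explicit string is used verbatim as the category name.
        return plain_categories
    if regional_categories:
        # The language name is tacked onto the end of the variety name.
        return f"{label} {lang_name}"
    return None


print(category_for_label("Argyll", "Scottish Gaelic", regional_categories=True))
print(category_for_label("Argyll", "Scottish Gaelic",
                         plain_categories="Argyll Gaelic"))
```

Under these assumptions, switching a label from the first setting to the second changes only the category name; the label text shown next to the term is unaffected.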
It would seem highly strange to me to have the categories for dialects of a language not contain the actual name of the language. Scottish English is a dialect of English; it's not at all necessary to speak of a given sub-dialect as Glasgow Scottish English. The Scottish is absolutely redundant there. But Scottish Gaelic is not merely a variant or dialect; it's a fully standardized language of its own. To me, logistically, this idea feels somewhat equivalent to, for example, renaming Gotlandic Swedish to Gotlandic Norse.
There's a few other reasons, as well. At least in the United States, Gaelic sans modifier is almost completely synonymous with Irish, unless the context makes it explicit that one is discussing Scottish Gaelic or the Goidelic family as a whole. (Which is why individual dialects would most often be referred to as simply Gaelic; they're not being presented in a multilingual context like Wiktionary.) I feel like the possibility for innocent confusion isn't too dangerously high, given the fairly niche nature of the language and the lack of Irish spoken natively in Scotland; but to Chuck Entz's point, there was also a dialect of Newfoundland Irish, extant all the way into the 1900s, that could rightly be deemed Canadian Gaelic. Qwertygiy (talk) 02:24, 24 August 2024 (UTC)Reply
@Qwertygiy: It's really a very different case from Gotlandic Swedish. Swedish is never called simply "Norse" the way Scottish Gaelic is frequently – even usually – called simply "Gaelic". As I mentioned above in my reply to Chuck, I am most amenable to keeping the word "Scottish" in Canadian Scottish Gaelic, but even there, I think "Canadian Gaelic" is a more common name for it; and in the literature "Canadian Gaelic" refers only to Scottish Gaelic spoken in Canada. As for Newfoundland Irish, Irish language in Newfoundland says "The Irish language was once spoken by some immigrants to the island of Newfoundland before it disappeared in the early 20th century", suggesting it may have been extinct before 1949, when Newfoundland joined Canada. In that case, Newfoundland Irish really couldn't reasonably be called Canadian Gaelic (quite apart from the fact that it never is). —Mahāgaja · talk09:59, 24 August 2024 (UTC)Reply
Well, we also need to ask ourselves whose standard we want to follow:
For a majority of speakers in Ireland and the UK, saying "Gaelic" automatically and exclusively refers to Gàidhlig na h-Albann, while "Irish" automatically and exclusively refers to Gaeilge na hÉireann. As @Qwertygiy reports, this is not the case in the U.S., where "Gaelic" seems to refer to Gaeilge na hÉireann.
I think it is redundant to denote that Argyll, Perthshire, Uist, etc. are all in Scotland by adding "Scottish", but we should be aware of the fact that this has the potential to cause confusion for less well-informed people.
P.S.: I was never before confused by the term "Canadian Gaelic", because I didn't even think a variety of Gaeilge might have ever been spoken there, but I stand corrected! The potential for confusion abounds everywhere, it seems.
I really think the potential for confusion is minimal, especially since these aren't L2 language headers we're talking about, but just labels next to terms under a ==Scottish Gaelic== header and the corresponding categories that are all inside CAT:Regional Scottish Gaelic. —Mahāgaja · talk15:53, 3 September 2024 (UTC)Reply
I am cautiously in favour. Gaelic really isn't used to refer to either of the other members of the family (in English), and those category names are what those subvarieties are habitually called (including Canadian Gaelic). embryomystic (talk) 01:38, 28 August 2024 (UTC)Reply
Oppose for the reasons cited by User:Qwertygiy. In general we do include the full name of the language in categories containing varieties of that language, and I at least would find it confusing to see something like Category:Harris Gaelic, as I would not know if this is a variety of Scottish Gaelic or Irish. For similar reasons, varieties of e.g. 'Walser German' should be 'Foo Walser German' not just 'Foo German'. Benwing2 (talk) 09:56, 3 October 2024 (UTC)Reply
You wouldn't be confused for long, since anything in CAT:Harris Gaelic would be under a ==Scottish Gaelic== header, and the category itself would be within CAT:Regional Scottish Gaelic. The comparison with Walser German is poorly chosen, since "German" unmodified refers to a different language (de), while "Gaelic" does not refer to a different language. Or consider the varieties of Regional Ancient Greek, which with the exception of Egyptian Ancient Greek are called "Foo Greek", not "Foo Ancient Greek" (despite the fact that "Greek" unmodified refers to the modern language). I ran some Google Ngrams searches:
In every single instance, the "Foo Scottish Gaelic" variant was too rare to be plottable on an ngram. In the cases not listed above, both "Foo Gaelic" and "Foo Scottish Gaelic" were too rare. There was no case where "Foo Scottish Gaelic" was common enough to be plotted at all, let alone being more common than "Foo Gaelic". —Mahāgaja · talk10:18, 3 October 2024 (UTC)Reply
I’m not a Scottish Gaelic editor. But what about a compromise like “Scottish (Arran) Gaelic” or “Scottish Gaelic (Arran)”? Just a thought. — Sgconlaw (talk) 11:19, 3 October 2024 (UTC)Reply
A tough one. I feel that a good analogy would be if we called English "Modern English". Then we'd end up with regional categories like "Australian Modern English" instead of "Australian English". This name isn't wrong, nor is it even confusing. It would just be a Wiktionary-ism; readers could easily make the connection to the formal language name and realise why we were using an unusual name for the dialect.
On the other hand, this project has seen fit to call the category "Attic Greek" instead of the Wiktionary-ism "Attic Ancient Greek". I would prefer the latter name, but if we make exceptions for Ancient Greek it's difficult to justify why we shouldn't do so for Scottish Gaelic too. This, that and the other (talk) 14:01, 3 October 2024 (UTC)Reply
Given that several people have either expressed outright opposition, suggested alternatives that don't involve removing the word "Scottish" or given at most grudging acceptance, I would not recommend doing this at all. Benwing2 (talk) 19:28, 3 November 2024 (UTC)Reply
It may not logistically work but e.g. "Harris Gaelic" is universally used both by linguists and speakers, and e.g. "Harris Scottish Gaelic" is not a phrase anyone uses. If this change is possible without messing up the coding of the site I would strongly recommend this change.
Strong oppose. It's still officially known as Wi-Fi and that's the spelling you're more likely to see in official, formal, and careful writing (followed by WiFi). It's also the exclusive spelling used by the Wi-Fi Alliance which owns the trademark. wifi remains a largely colloquial spelling. Merriam-Webster, the OED, and Dictionary.com all lemmatize at Wi-Fi as well. If anything, I could see noun sense 3 being moved, since it's already informal, but then that'd put us at a weird juncture. AG202 (talk) 22:11, 23 January 2025 (UTC)Reply
If we are talking "careful writing", then Google Ngram Viewer has Wi-Fi about twice as common as all others combined. WiFi is about a third as common with wi-fi and wifi about equal making up the balance.
I didn't know that we prescribed "careful writing" or gave such sources more weight. As all of these spellings are pronounced the same, "colloquial" doesn't seem the right term to apply to the less formal uses in books. Using durably attested sources naturally gives a major edge to any spelling backed by a plausible legal threat, eg, trademark infringement. DCDuring (talk) 22:38, 23 January 2025 (UTC)Reply
The problem of division of pearlash has been observed. The following is from Riddles and Conundrums (1924):
My first's a precious stone;
My next a well known tree;
Or call my first a fruit,
The next a thong will be.
Whichever way you choose:
This puzzle to divide,
You still will find my whole
A powder will abide.
Pearl-ash, or Pear-lash.
Ash is used in combination to form names of a range of natural alkalis (made from vegetable ash), refined versions, and their synthesized replacements, soda ash, kelp ash, potash.
My hypotheses:
pearl ash may have gotten its name not from the product being derived from mollusk shells but by being a refined (white, shiny?, more valuable?) form of potash; hence, pearl-ash, then pearlash
I have yet to find any discussion of mollusc shells being used in the manufacture of potassium carbonate. Therefore, I am skeptical of the definition in our entry for pearl-ash, which is sourced from "Universal Dictionary of the English Language [UDE], 1896; under snuff.". Looking it up I found a use of pearl-ash in a discussion of the adulteration of snuff. UDE's definition of pearl-ash did not mention shells or molluscs, just plant ashes. I wonder whether a jesting or naive folk-etymologist contributor is responsible for the entry.
Google Ngrams shows incomparable to be over 400 times more common than uncomparable. I appreciate that incomparable is often used with the meaning "beyond compare", rather than "not comparable" but it is trivially easy to find examples of it in use in terms like incomparable adjective and incomparable adverb, while Ngrams doesn't even register uncomparable adjective when I try to compare them ([4]). While it's certainly possible to find some examples of uncomparable being used as a grammatical term ([5]), results for incomparable are much more numerous ([6]).
For some reason, our entry at incomparable had had the "not comparable" sense marked as "rare" by an old admin since 2008 ([7]), so I think the current situation stems from their misconception that uncomparable was the proper term when talking about grammar, which does not seem to be the case. Theknightwho (talk) 01:51, 4 September 2024 (UTC)Reply
Comment: "incomparable adjective" (presumably pronounced "incompárable") sounds wrong to me because of the clash with "incomparable" pronounced "incómparable". Benwing2 (talk) 22:47, 15 September 2024 (UTC)Reply
There is only a single word suffixed with -ferous that doesn't have a preceding i (indigoferous); it especially bothers me how many pages have "|-i-|-ferous}}" (which also clutters up the category of -i-interfixed terms). Could somebody make a bot replace all "|-i-|-ferous}}" and all other "|-ferous}}" (except indigoferous) with "|-iferous}}", if that's possible?
-form could also be moved to -iform along with most of its derivations (there are about 17 latinate adjectives in -form without i, I think), but I don't think it's as pressing.
"|-i-|-stan}}" could also be replaced by a new entry -istan. Suryaratha03 (talk) 22:06, 5 September 2024 (UTC)Reply
In general a policy would be useful about whether or not to add -i- to etymologies of the structure "Latin lemma + English suffix" because a linking -i- is something of a default but there are enough exceptions Suryaratha03 (talk) 22:33, 5 September 2024 (UTC)Reply
Your first sentence is false. MWOnline has isidioferous "bearing isidia".
That there are also several at Urban Dictionary suggests that -ferous is productive.
I just feel it's relevant that none of these words have entries on here. Like we might as well keep -ferous as an alternative form of -iferous, but all the "-i- + -ferous" are still unjustified in my perspective, just from a lexicographical view. Suryaratha03 (talk) 14:19, 7 September 2024 (UTC)Reply
Ok, but as I said in my second sentence, we can keep -ferous but is there any argument against changing -i- + -ferous to -iferous in all these pages? And potentially also -ferous to -iferous in all these lemmas that do end in -iferous? Suryaratha03 (talk) 12:56, 9 September 2024 (UTC)Reply
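The replacement requested in this thread could be sketched along the following lines. This is a minimal Python illustration on raw wikitext strings, not an actual bot: the function name `normalize_ferous` and the `exceptions` parameter are hypothetical, it only handles the case where the suffix is the last template parameter (as in the quoted patterns), and a real run would need review of each change.

```python
import re

# The two patterns quoted in the request, matched literally.
TWO_PART = re.compile(r"\|-i-\|-ferous\}\}")
ONE_PART = re.compile(r"\|-ferous\}\}")


def normalize_ferous(title, wikitext, exceptions=frozenset({"indigoferous"})):
    """Rewrite '-i- + -ferous' and bare '-ferous' etymologies to '-iferous'.

    Pages whose title is listed in `exceptions` are returned unchanged.
    """
    if title in exceptions:
        return wikitext
    # Collapse the interfix form first, then any remaining bare suffix.
    wikitext = TWO_PART.sub("|-iferous}}", wikitext)
    return ONE_PART.sub("|-iferous}}", wikitext)


print(normalize_ferous("aquiferous", "{{af|en|aqua|-i-|-ferous}}"))
print(normalize_ferous("indigoferous", "{{af|en|indigo|-ferous}}"))
```

Keeping the exceptions as a parameter means further words without the linking i, such as the isidioferous mentioned above, could be excluded without touching the patterns.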
Several Gaulish pages were renamed last month without consultation. Since this discussion regarding reconstructions, specifically regarding Gaulish, received enough votes in favour of Romanising reconstructions (including in Gaulish), I am formally requesting the following moves:
@DCDuring, JeffDoozan, Theknightwho AFAICT, the only difference between {{taxfmt}} and {{taxlink}} is that the former is intended for when we do have a Translingual entry for the taxon in question and the latter for when we don't, and {{taxlink2}} seems to be a failed experiment to replace {{taxlink}}. Whether there's a Translingual entry can easily be autodetermined, so I propose merging all three into a new {{taxlink}} template that autodetermines whether there's a Wiktionary entry and acts accordingly. If necessary, we can add a flag to override the autodetermination. Benwing2 (talk) 05:10, 27 September 2024 (UTC)Reply
Support replacing {{taxlink2}} with {{taxlink}}, Abstain merging {{taxlink}} and {{taxfmt}}. There's no technical reason to keep them separate, but I understand that doing so would break an important process used by the biggest contributors of taxon stuff, although now that wantedpages has been cleaned up, it might be possible to use that instead. See the first few posts at Template_talk:taxlink#Convert_to_Lua for more details. JeffDoozan (talk) 10:34, 27 September 2024 (UTC)Reply
@JeffDoozan: Wantedpages is worthless for "Translingual" and English and probably many other reasonably well-covered languages because it is limited to 5,000 items. The least "wanted" page is wanted on 53 pages. The most wanted taxonomic name is usually wanted at most 20 times and necessarily fewer pages than that. DCDuring (talk) 15:12, 27 September 2024 (UTC)Reply
I have a couple of questions:
How will I be able to collect lists of taxonomic names that do not have entries and the counts of their uses? Such lists don't have to be available in real time.
Are we now at a point where Moore's Law has brought the cost of checking for the existence of entries down so far that we can have lists of hundreds of species without appreciable burden? That was one of the reasons for the existing arrangement. I do not add lists of hundreds of species, but some genus entries have lists of such size, sometimes commented out completely or in part. DCDuring (talk) 15:00, 27 September 2024 (UTC)Reply
@DCDuring As for #2, according to @Theknightwho there are Chinese-language pages that scrape over 1,000 other pages for transliterations without problem, so I don't expect this to be a major issue. I'm pretty sure fetching page contents is not considered an "expensive" operation so we aren't limited by the limit of 500 expensive operations per page. As for #1, there will be categories automatically generated in real time that list all the wanted taxonomic names. For a specific nonexistent taxonomic name, you can check the transclusions in real time using Special:WhatLinksHere to see how many other pages are using the nonexistent taxon, but I don't think it's possible to generate a real-time sorted list of pages by use. However, this can definitely be done offline. It's very similar to the weekly lists already generated by User:This, that and the other, and I imagine if they don't already have a list that will suffice for this purpose, they can easily modify their scripts to generate such a list. Benwing2 (talk) 07:26, 28 September 2024 (UTC)Reply
@Benwing2 is the intent for the new {{taxlink}} to do an existence check on the target entry every time it is invoked? If it also adds something that gets stored in the database link tables (like a WT:Tracking transclusion or even an external link to a fake domain) upon finding a nonexistent target entry, then DCD can manually navigate to WhatLinksHere or Special:LinkSearch to find transclusions of a specific name (although basic search would probably suffice in that case), and I can easily create a new weekly SQL-based todo list. Without the special transclusion or link, I would have to do it by parsing dumps, which is more tedious. This, that and the other (talk) 07:39, 28 September 2024 (UTC)Reply
@This, that and the other Yes, the intent is to merge {{taxfmt}} (which is intended for the case where the target entry exists) with {{taxlink}} (which is intended for the case where the target entry doesn't exist) and check automatically for existence (which means there should be a Translingual section with the appropriate name, and maybe also checking that it has a {{taxoninfl}} header — which BTW we should rename to {{taxonhead}}). I was planning on adding a category for the taxonomic term uses that link to nonexistent entries but I can easily add a tracking page as well. Would it suffice to have a single tracking page for all nonexistent entries or do you want one tracking page per nonexistent entries? If it's easy for you to create essentially a list of nonexistent but tracked Translingual entries on a weekly basis, sorted by number of uses, that would be great. Benwing2 (talk) 08:20, 28 September 2024 (UTC)Reply
As it is now, pages often have multiple instances of {{taxlink}} for a given taxon, because multiple languages can use the same spelling for the vernacular name for the taxon and the taxon ought to be part of the definition. That seems like something that will be lost using categorization, which is page-oriented, not L2-oriented.
The lists generated by Special:WantedPages, sorted by incoming links were flooded by links from User pages. I commented out most of such pages that were from my user pages, but there are many others that have been run periodically that repeatedly show the same entry, even when blue-linked. The value of the capability to sort ANY searchbox results by the number of incoming links is obviously compromised when User pages and indeed any pages outside principal namespace are included by default.
It is amazing to me that everyone is so solicitous of me wasting my time, without asking me whether I thought I was wasting my time. I actually found that the erroneous uses of {{taxlink}} led me to L2s that had other problems, both errors of commission and of omission. But, obviously others know better whether and how I am wasting my time. Dare I say that perhaps others are wasting their time worrying about me wasting mine. DCDuring (talk) 14:34, 28 September 2024 (UTC)Reply
@DCDuring This isn't just about you - it's about the fact that having completely manual infrastructure for taxonomy makes it very difficult for anyone else to get involved unless they want to dedicate as much time as you do. I also don't understand your point anyway: what exactly would be lost on pages where multiple languages point to the same taxon with {{taxlink}}? The thing it would be checking for is whether the taxon page exists, so the entry's language isn't relevant. Theknightwho (talk) 00:03, 30 September 2024 (UTC)Reply
@Theknightwho: The only reason we have "manual" elements of the taxon infrastructure is that nobody seemed to care enough to actually understand (1) the process I think Wiktionary needs to follow with respect to adding taxonomic names, (2) taxonomic name formatting, and (3) the desirable style for taxonomic name entries. I think the number of "wants" is good for prioritizing taxa, so we avoid having too large a proportion of orphan taxonomic name entries. At present we have, almost always, the right formatting if the person adding the taxon has the right taxonomic rank (or equivalent) for the taxon. I don't see how we can usefully automate definitions very much.
I am interested in the number of "wants" from languages, not pages. If five languages use the same spelling for a vernacular name of a taxon, I count that as being worth as much as five pages that refer to the taxon once. DCDuring (talk) 03:45, 30 September 2024 (UTC)Reply
I don't need to differentiate by language in which the entry exists, so I don't do that with the script that tallies the number of {{taxlink}} (not {{taxfmt}}) occurrences for a given taxonomic name. Once the list is compiled, I add the taxa from the list in decreasing order and change {{taxlink}} to {{taxfmt}} for the taxon in question, reviewing the L2 sections as I go, sometimes adding vernacular names, derived terms, incoming and outgoing links, images, etc.
@DCDuring I agree with @Theknightwho. You're not the only one who uses these templates. In particular I've been using {{taxlink}} and {{taxfmt}} myself to clean up formatting on several pages and find it very awkward to have to manually look up each term. You will need to explain to me what benefits the current system provides that are lost when making it automatic. How does the current system distinguish languages? I doubt it does. (For that matter, it's possible to automatically fetch the current section's language and use it in categorization if so desired.) Benwing2 (talk) 00:25, 30 September 2024 (UTC)Reply
@Benwing: I am not sure what it is that you manually look up for each term. If an ordinary wikilink for a taxon is red, then it needs {{taxlink}}; if it is blue, we have lately decided it needs to be formatted using {{taxfmt}} to remove the need for anyone other than module programmers to master taxonomic formatting. DCDuring (talk) 03:45, 30 September 2024 (UTC)Reply
@DCDuring It is possible to automatically check (a) whether the page exists, (b) if so, whether a Translingual entry exists and (c) if so, whether that entry is a taxonomic entry. The check would be whether all of those conditions are satisfied. It may be possible to make it even more specific, if necessary. Theknightwho (talk) 03:52, 30 September 2024 (UTC)Reply
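The three-step check (a), (b), (c) described above might look roughly like this. It is a sketch in Python over an in-memory mapping of page titles to wikitext, not the Lua that a real template would use: the names `is_taxon_entry` and `choose_mode` are hypothetical, and the test for step (c), looking for {{taxoninfl}} in the Translingual section, is an assumption based on the header template mentioned earlier in this thread.

```python
import re

# Capture the body of the Translingual L2 section, up to the next L2 header
# (a line starting with exactly two '=' signs) or the end of the page.
TRANSLINGUAL = re.compile(
    r"^==Translingual==\s*$(.*?)(?=^==[^=]|\Z)", re.M | re.S
)


def is_taxon_entry(pages, title):
    """Return True only if conditions (a), (b) and (c) all hold."""
    text = pages.get(title)
    if text is None:                      # (a) the page does not exist
        return False
    section = TRANSLINGUAL.search(text)
    if section is None:                   # (b) no Translingual section
        return False
    return "{{taxoninfl" in section.group(1)   # (c) assumed taxon marker


def choose_mode(pages, title):
    """Pick which existing template's behaviour a merged template would use."""
    return "taxfmt" if is_taxon_entry(pages, title) else "taxlink"


pages = {
    "Felis": "==Translingual==\n\n===Proper noun===\n{{taxoninfl|i=1}}\n",
    "felis": "==Latin==\n\n===Noun===\n{{la-noun|fēlis}}\n",
}
print(choose_mode(pages, "Felis"))
print(choose_mode(pages, "Canis lupus"))
```

In this model a page that exists but lacks a taxonomic Translingual section is treated the same as a missing page, which is the stricter reading of the proposal.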
@Theknightwho: I assumed that (a), (b), and (c) would be possible. I assume that you are talking about real time.
Please tell how the single template will differentiate between names with Wiktionary entries and names missing them in a way that yields a list of missing taxonomic names ordered by the number of occurrences of the taxon enclosed in the template. Can that be done in real time? Does it need to be done by dump-processing (as it is done now)? DCDuring (talk) 18:40, 30 September 2024 (UTC)Reply
@DCDuring Okay, how about you tell me how you're achieving that now? Plus, going by your comment above, you only need to compile that list in order to manually change the template, which would no longer be necessary anyway! Theknightwho (talk) 18:43, 30 September 2024 (UTC)Reply
You miss the point. I add taxa in order of descending number of instances of occurrence of the taxon enclosed in {{taxlink}}. I do a run occasionally of a perl script against a dump that picks out occurrences of {{taxlink}}, groups them by taxon, and orders the taxa by the number of taxlink instances for each taxon. I add entries for the most "wanted" taxa when I have the enthusiasm to do so. DCDuring (talk) 14:13, 1 October 2024 (UTC)Reply
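The dump-processing step described above (pick out {{taxlink}} occurrences, group them by taxon, order by count) can be approximated as follows. This is a Python sketch, not the perl script itself, which is not shown in this discussion: `wanted_taxa` is a hypothetical name, the input is assumed to be a mapping of page titles to wikitext already extracted from a dump, and the taxon name is assumed to be the template's first parameter.

```python
import re
from collections import Counter

# Capture the first parameter of each {{taxlink|...}} call.
TAXLINK = re.compile(r"\{\{taxlink\|([^|{}]+)")


def wanted_taxa(pages):
    """Count taxlink occurrences per taxon, most wanted first.

    Every occurrence is counted, so a page where several language
    sections use the same taxon contributes once per use.
    """
    counts = Counter()
    for text in pages.values():
        for name in TAXLINK.findall(text):
            counts[name.strip()] += 1
    return counts.most_common()


pages = {
    "cat": "{{taxlink|Felis catus|species}} ... {{taxlink|Felis catus|species}}",
    "wolf": "{{taxlink|Canis lupus|species}} and {{taxfmt|Felis catus|species}}",
}
for taxon, uses in wanted_taxa(pages):
    print(uses, taxon)
```

Counting occurrences rather than pages matches the preference stated earlier in the thread for tallying "wants" per language section instead of per page.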
I admit that I selfishly assumed responsibility for taxonomic entries and use of {{taxlink}} when its creator seemed to cease being interested. I was (and perhaps am?) the only one who regularly removed redundant {{taxlink}} and, now, who converts them to {{taxfmt}}. I also seem to be the only one adding taxa based on the number of "wants", though I have advertised User:DCDuring/MissingTaxa a few times over the past dozen or more years. I have offered help to any users who come to my talk page.
The 'workflow' for adding taxa is dispersed among a small number of people who add {{taxlink}} when a bare link (or {{taxfmt}}) is red. We apparently want either {{taxfmt}} or {{taxlink}} to be added for the taxon formatting. {{taxlink}} also provides a means to meliorate over our very modest coverage of taxa by referring people to Wikispecies. It also provides a way to count "wants" for a taxon. In the absence of an explicit proposal (not hand-waving) for dump-processing to identify missing taxonomic names by their number of wants with the merged templates, I don't see how combining these templates offers any reduction in workload to anyone. DCDuring (talk) 03:45, 30 September 2024 (UTC)Reply
@DCDuring I'm completely confused as to how having two templates the choice of which depends on manually checking for a red vs. blue link is superior to doing the same automatically. Maybe you're not understanding my technical proposal, because what you're saying doesn't make a lot of sense to me. If {{taxlink}} does A, and {{taxfmt}} does B, my proposal is simply to have a combined template that automatically does A when the link is red and B when it's blue. How does {{taxlink}} allow us to count "wants" for a taxon that a combined template won't? This doesn't make any sense technically to me so I will need a detailed technical explanation of how this works. Thanks! Benwing2 (talk) 03:52, 30 September 2024 (UTC)Reply
@DCDuring BTW you explicitly mention that some people (not you) use the wrong template, and you have to then go and clean this up. This is exactly the sort of work that will vanish by having only one template. With only one, there's no way to use the wrong one. Benwing2 (talk) 03:53, 30 September 2024 (UTC)Reply
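The "does A when the link is red and B when it's blue" idea can be illustrated with a small Python sketch. It is not how a real template or module would be written on the wiki. `render_taxon` and the `page_exists` callback are hypothetical names, and the returned dictionary merely stands in for whatever each branch would actually output.

```python
def render_taxon(name, page_exists):
    """Choose the behaviour of a hypothetical combined template for one taxon.

    page_exists: callable taking a page title and returning True if the entry exists.
    """
    if page_exists(name):
        # Blue link: only format the name, the role {{taxfmt}} plays in this discussion.
        return {"template": "taxfmt", "taxon": name, "wanted": False}
    # Red link: behave like {{taxlink}}, so the taxon still counts as "wanted".
    return {"template": "taxlink", "taxon": name, "wanted": True}
```

For example, `render_taxon("Felis catus", {"Homo sapiens"}.__contains__)` takes the red-link branch, because that title is not in the set of existing pages.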
Please tell how the single template will differentiate between names with Wiktionary entries and names missing them in a way that yields a list of missing taxonomic names ordered by the number of occurrences of the taxon enclosed in the template. If this can be done, why not do it for {{l}} (and {{m}}, and all the column templates) and make ordered lists of missing entries in order of "want" for all of our languages? DCDuring (talk) 17:58, 30 September 2024 (UTC)
@DCDuring I do think this is possible offline. However, the more pertinent question is, can you do this currently? If so, please explain how it's done currently and I will explain how to do the same in the new system. If you can't do this currently, why are you insisting on it when you can't do it in any case? Benwing2 (talk) 18:58, 30 September 2024 (UTC)Reply
@DCDuring I created WT:Todo/Lists/Wanted taxa, which seems similar to your existing list in your userspace, but may be of use to you nonetheless. (Note this is a dump-based list just like yours, so the todo list infrastructure will automatically regenerate it twice a month.) If it is, please let me know and I will try to fix the problems currently present in the list. Otherwise I will retire it. This, that and the other (talk) 12:23, 5 October 2024 (UTC)Reply
If it matches mine, having it outside of user space is better. It's also a little quicker to use because one can scan the 'wanting' pages to get an English vernacular name or other indication of a part of a potential definition. Adding items from these lists at least means that the added items are not orphans, which are often a waste of contributor time. DCDuring (talk) 14:58, 5 October 2024 (UTC)Reply
Latest comment: 4 months ago · 2 comments · 2 people in discussion
The expression appears to be used outside of the first person[1]. It's also the name of the challenge, #PassThePhone. This leads me to believe that the definition would be best fit in the infinitive and without the preposition, as it's written in the hashtag — and isn't that how people search for things on the internet? They'll read I'm passing the phone to but google the lemma pass the phone.
The move can also be to a different title; pass the phone is just an example. Polomo47 (talk) 17:18, 29 September 2024 (UTC)Reply
Latest comment: 4 months ago · 1 comment · 1 person in discussion
Hello there, fellow wiktionarians. This page was tagged for moving with the reason given being "This is a terrible StarLing reconstruction that needs to be moved, but I don't know what the correct form is". So I have been digging around through the Internets and found this paper from Maarten Kossman which gives the reconstructed root as √swʔ, the Aorist as *ăswəʔ, the Imperfective as *săssăʔ, and the Verbal Noun as *-săs(s)eʔ; I couldn't find a reconstruction for the lemma, perhaps because of too much variability in the daughter languages? - I don't know. I don't know what's the current policy/preference for reconstructions with just the root consonants; most proto-semitic ones have the full word instead of the root; I did find one with just the root - Reconstruction:Proto-Semitic/w-r-d- (which, however, seems to me to be relatively easy to reconstruct as *warad-, but I'm just an amateur)
Latest comment: 2 months ago · 2 comments · 1 person in discussion
I propose to move the relevant forms at *ǵʰrem- to *gʰrem- insofar as they aren't there already. In practice this means moving the Sogdian forms, but only after correction. Then the forms at *ȷ́ʰárati and subpages would need to be disentangled as different roots are given for those by Cheung (and others).
@Hitsuji777, This, that and the other: I guess I have no objection to "R:OED1" or "R:NED" (the latter probably being more technically correct), but since most people know the work as the Oxford English Dictionary I wonder if it is worth actually changing the template name? Also, I would suggest leaving the template at the long form, and making "R:OED"/"R:OED1", etc., shortcuts (i.e., redirects). — Sgconlaw (talk) 15:17, 8 November 2024 (UTC)Reply
I probably would go with making {{R:OED1}} the main entry and leaving the long one as a redirect to encourage people to use the shorter one (a redirect from {{R:NED}} could be made as well, but personally I don't think it's worth it). Numbering the entry seems more intuitive as well, as it's not just any OED. That's only what I would do myself, I wouldn't mind settling on something else as long as the duplicate problem gets solved. Hitsuji777 (talk) 16:13, 8 November 2024 (UTC)Reply
I don't really mind what the template is renamed to. I would note that the modern OED refers to its own first edition as N.E.D., and we often use that abbreviation at RFVE too. But "OED1" makes sense too. This, that and the other (talk) 10:37, 9 November 2024 (UTC)
@Sgconlaw These reference templates are specifically referencing the first edition of this work, as opposed to the work in general, or the modern edition, which is still called the "Oxford English Dictionary". I think it is quite important to distinguish this. This, that and the other (talk) 10:40, 9 November 2024 (UTC)Reply
Support moving as well. Now that you mention it, that seems like the option that makes the most sense. {{R:OED}} could be turned into a redirect to {{R:OED Online}}, similar to Spanish's {{R:DRAE}} linking to the latest online version of the work and not the first edition.
Latest comment: 1 month ago · 2 comments · 2 people in discussion
The categories do not contain citations per se, but Citations: namespace pages. There are plenty of citations in the main namespace, which are not in these categories.
Latest comment: 3 months ago · 3 comments · 2 people in discussion
Ancient Greek. The page already says "The present is used only as a participle", but according to Beekes ἔθων is not related to εἴωθᾰ. I suggest both get their own pages. If there is to be a page for 'ἔθω', I think it should be as reconstruction *ἔθω, like Beekes gives under 'εἴωθα' (p.395). (moved from rfv) Exarchus (talk) 19:37, 28 October 2024 (UTC)Reply
Hello M @Exarchus. I tried to concentrate on attested forms and expressions as we would do at our exams. I do not do Etymologies. Participles have their own pages in en.wikt anyway. (Most of the many quotations seen are for participles, not for the verb). εἴωθα(eíōtha) may have its own page (without repetitions of material) with its own Etymology. The unattested ἔθω(éthō) may have in front of it an asterisk (lemmatised or just added in-page, according to en.wikt's policy for such cases, perhaps with a little template for * with tooltip). Admin @Mahagaja could review. Thank you. ‑‑Sarri.greek♫I21:18, 28 October 2024 (UTC)Reply
Latest comment: 3 months ago · 2 comments · 1 person in discussion
To be moved to what is reconstructed by the references (hint: they don't reconstruct 'fellō'), or to be removed altogether as a Frankic etymon is just one of several etymologies for Latin fellō(“criminal”). I already explained why the Dutch 'descendant' shouldn't be there. Exarchus (talk) 22:14, 29 October 2024 (UTC)Reply
Latest comment: 3 months ago · 1 comment · 1 person in discussion
After I removed a few terms, there are only nouns now (as there were originally). Rastorgueva & Edelman reconstruct *xʷāpa- for those, so maybe the page should be moved to *hwāpa (possibly related to Sanskrit स्वाप(svāpa), but maybe independent creations as the Sanskrit term occurs fairly late). The verb that should be reconstructed for Proto-Iranian is rather a verb on -sati: R. & E. give "*hufsa-, *xʷafsa-", LIV gives the zero grade as original. Exarchus (talk) 22:49, 29 October 2024 (UTC)Reply
Latest comment: 2 months ago · 3 comments · 2 people in discussion
These are basically two competing reconstructions for the same thing. I'm inclined to think the reconstruction should be *krā́mHti. Exarchus (talk) 11:13, 9 November 2024 (UTC)Reply
Latest comment: 1 month ago · 2 comments · 1 person in discussion
I think this should be moved to an athematic deponent verb. Sanskrit ओहते(ohate) is given by Lubotsky as sometimes 3pl.ind. and sometimes 3sg.subj., see {{R:inc:IAIL|page=352}}. The Avestan verb is pretty clearly athematic, see for example {{R:ira:Cheung|page=169}}. This means {{R:grc:Beekes|page=486}} is simply wrong when suggesting those are thematic verbs.
This verb is generally reconstructed to come from a reduplicated present and {{R:ine:HCHIEL|1879}} happens to reconstruct the following: "PIIr. *Ha(H)ugʰžʰa,*Ha(H)ugʰdʰa 2,3sg.inj.med." Exarchus (talk) 18:24, 15 November 2024 (UTC)Reply
Latest comment: 2 months ago · 5 comments · 3 people in discussion
Yiddish. If this does exist it's in the wrong script. The third quote suggests it's actually a Hebrew word (it's on Hebrew Wikisource...), but I'm not sure. -saph668 (user—talk—contribs) 22:46, 27 November 2024 (UTC)Reply
The English quote uses it as a nonsense word parallel to "dinglebop". The "Yiddish" has links to what seem to be rabbinical texts in some variety of Aramaic or a mixture of Hebrew and Aramaic, of which the first two use שליימ"ל or שלימל with an extra syllable at the end, and the other uses what looks like Yiddish שלײַם(shlaym), which is closer to slime than membrane. I'm far from fluent in Hebrew, Aramaic or Yiddish, but at best they're sprinkling Yiddish terms into running text in other languages (though the first labels שליימ"ל as "Ashkenazic", which no doubt means Yiddish), and two out of the three aren't even the right number of syllables. I could be wrong, but I don't think any of them are evidence for a Yiddish word that can be transliterated as "schleem" rather than something like "schlaim" or "schlaimel". Chuck Entz (talk) 00:51, 28 November 2024 (UTC)
I created it on EN wiki because the term (or at least a homograph) also appears in English, in "Rick and Morty". Secondly, this word is a possible etymology for English "slim" =thin, which has never been convincingly traced to the cognates meaning "bad" or "crooked". See the struggle of Anatoly Liberman, Word Origins p. 200. Also this wiki is much more active than Yiddish, and easier for me to edit.
Editors who are not familiar with these languages should not be wildly speculating here. There is no question that these three Rabbinic Hebrew quotations use a Yiddish word. The first explicitly describes it as "בלשון אשכנז"=in the language of Germany. This is the normal way for books of the period to cite Yiddish, and it does not seem to have existed in other German dialects, so far as I can tell from dictionaries. The second describes the word as "בל"א" which is an acronym for the same phrase. The third spells it with a diacritical mark known as "gershayim" which is the equivalent of English italics, used here to mark the word as foreign. Admittedly, there is nothing explicitly to say that it is Yiddish as opposed to Polish here, except for its connection to the previous quotations.
It cannot possibly mean slime. I have translated the Hebrew in each of the three quotes into English on the page, leaving the loanwords in transliteration, so anyone should be able to see that "slime" is an impossible translation in context. The first reference tells you that it is a type of "membrane" (Hebrew קרום) and that it is equivalent to French teile, which is what French rabbinic texts use for "membrane". The second and third references give the nearest Hebrew equivalent as "thin skin" (Hebrew עור דק). I don't know what the modern anatomical terms for these exact bits of flesh are, but it seems to be a general term because it's used for 3 different bits in the 3 different quotes.
As for the pronunciation, we can't be certain, and schleim is possible. But schleem or schlim is much more likely, because the normal way of spelling the ei vowel in Rabbinic Hebrew transliterations is with a double yodh. The printer of the version of Sirkis's book I linked to has seen fit to insert a second yodh into his text (שלימ"ל in the first edition) so I assume that he recognized the word and thought it should be pronounced schleimel. By contrast, the third quotation is spelled שלי"ם, with only a single yodh.
I don't know of any book in Yiddish which uses this term. Yiddish printing until the modern era was restricted to certain genres, and did not include technical anatomical works or Rabbinic studies of anatomical subjects. I couldn't find any modern work in any language which uses it. The use of loanwords that survive in Rabbinic Hebrew is a completely standard practice for reconstructing historical dialects. Old French dictionaries, for example, are deeply indebted to Rabbinic Hebrew works of the period (anatomical teile seems to have been missed). GordonGlottal (talk) 21:10, 2 December 2024 (UTC)
I wasn't contending your transliteration. The script it's in is objectively wrong, no 'wild speculation,' and nothing you said in that comment addresses the concern I brought up at all. Our Yiddish entries are in the Hebrew script. -saph668 (user—talk—contribs) 21:14, 6 December 2024 (UTC)Reply
@GordonGlottal: more to the point: our entries are organized by spelling: schleem, Yiddish שלײַם(shlaym), Yiddish schleimel, Yiddish שליימ"ל(shleym"l), Yiddish שלימל(shliml), etc. belong on different pages, though a case could be made for ignoring the vowels that are only present as diacritics, and perhaps the variation in yods- though I think Yiddish is less permissive in that respect than Hebrew (I would have to read through WT:AYI (and maybe WT:AHE) to be sure). Some of the entries would be "alternative form of" soft redirects, but having the main form at a spelling that's different from anything in the quotes supporting it would only work if the editors for the language in question had decided to organize things that way. What you did there would be like having a Hebrew entry at shalom with quotes like "יִשָּׂ֨א יְהוָ֤ה פָּנָיו֙ אֵלֶ֔יךָ וְיָשֵׂ֥ם לְךָ֖ שָׁלֽוֹם".
While I'm at it, I might as well point out that most of the early Rabbinical writings such as the Talmud are written in what we treat as Jewish Babylonian Aramaic, though the writing system is the same as Hebrew and they discuss a lot of Hebrew texts and the concepts in them, so there's lots of overlap. Chuck Entz (talk) 23:11, 6 December 2024 (UTC)
Latest comment: 1 month ago · 5 comments · 2 people in discussion
These two templates for Danish verb conjugations could hardly be more different. Not only is the visual appearance completely inconsistent, but the structure doesn't match either: {{da-conj}} has present/past across the top, while {{da-conj-reg}} has active/passive. They don't even have all the same forms.
To add to the confusion, we have some Danish verbs that don't even have a conjugation box at all, like smile.
Okay, I have read some Danish grammar books and now I am less confused. It seems that we need to show:
infinitive and imperative (trivially related)
present tense
past tense (five types: add -te; add -ede; vowel change and add -ede; strong verbs; irregulars)
passive infinitive, passive present, passive past
Note: {{da-conj}} shows the active infinitive but not the passive infinitive. This seems counterintuitive. Putting active/passive as the columns of the table allows space for this form.
present participle
past participle
the auxiliary verb to be used (missing from {{da-conj-reg}})
We also need a table for deponent verbs like enes.
The grammar books I looked at don't mention a gerund, but both our templates identify this form. Is it obsolescent?
Anyway here's my first draft of a replacement, merged Danish verb template, using {{inflection-table-top}} to get the benefits of a sensible width and dark mode support:
@This, that and the other Yes, I agree, quite a mess... I started drafting an Appendix to gather my thoughts on the matter before revamping the templates but life got in the way and I'm not sure I'll get back to it in the immediate future.
The situation with the -s form (probably better referred to as middle voice than passive, as elaborated by the pdf linked below. Mediopassive is also occasionally used in the literature and I would prefer either to passive since many are not at all passive and it's also not even the only passive in North Germanic) is further complicated by the fact that the -s forms actually have several different functions, depending on verb, and it can be difficult to tell when a given usage is being used in a passive, anticausative or reciprocal function even when one knows the language well. For instance, see the examples here and at Mediopassive voice. I don't remember off the top of my head but have a vague feeling that some verbs also do not have an -s form, though we should probably confirm that with a native speaker or someone more qualified before acting on that hunch. Additionally, as you mentioned there are a handful of deponent verbs (such as enes, synes), which occur only in the -s forms. Depending on dictionary, -s forms may get their own independent entries, or they may be subsumed under the active form, where possible (for instance Den Danske Ordbog has [8] redirecting to se, but [9] with its own entry; in the Politikens Nudansk Ordbog, both get their own entry). So, what do we call them and should we automatically generate them for all verbs?
Regarding what might be described as the gerund, the so-called centaur constructions or centaur nominals, this seems to be a topic deserving of more academic research (which is obviously beyond our scope here). I believe they are theoretically possible for many (all?) verbs, although I cannot comment on the actual frequency of their usage in the contemporary language. They are certainly very rarely covered by grammar books, with at least one large English work outright denying the existence of a gerund. Full disclosure, I was the one who translated that wikipedia page on centaur nominals from Danish. Helrasincke (talk) 22:38, 5 December 2024 (UTC)Reply
@Helrasincke I picked red because it's on the Danish flag! I'll go with brown for now, as it is more muted, but still distinct from Swedish blue. You are, needless to say, welcome to change the palette parameter at {{da-conj}} to whatever you wish!
The situation of the gerund seems rather odd to me - in general I would say we should follow grammar books and dictionaries, but clearly some form like this does exist, and since my knowledge of Danish is extremely limited I would naturally defer to you.
Latest comment: 2 months ago · 4 comments · 3 people in discussion
The two dictionaries I could find which gave a time frame for this word – besides OED, which just calls it obsolete and Scottish – attribute it to "Old Scots":
Slang and Its Analogues Past and Present: "BARLA-FUMBLE! intj. (old Scots)"
Google Books also says it's found in A Dictionary of the Older Scottish Tongue: From the Twelfth Century to the End of the Seventeenth, volume 1, but I don't have access to it.
The earliest use is in Christis Kirk on the Green, from around 1500 and which is in Middle Scots. And James Maidment's A Book of Scottish Pasquils has a quote for it dating from between 1568-1715 (or maybe he authored it himself? I can't really tell, but if it's not him quoting it then it's from 1868), firmly in the range for Middle Scots. The latest use I could find, in the form barley, was in Walter Scott's Waverley, in 1814, which I think would place it in Scots. It should probably be under Middle Scots. -saph668 (user—talk—contribs) 16:20, 6 December 2024 (UTC)Reply
Latest comment: 2 months ago · 1 comment · 1 person in discussion
There should be a separate reconstruction for forms like Sanskrit हृद्(hṛd) (with Vedic nominative/accusative हार्दि(hārdi)) and Avestan 𐬰𐬆𐬭𐬆𐬛(zərəd). In {{R:inc:IAIL|page=230}} Lubotsky gives "j́ʰārd-/j́ʰard-/j́ʰrd-; j́ʰrdaia-". I guess most Iranian forms come from the form without -aya-, one example that does come from the latter is Ossetian зӕрдӕ(zærdæ) (says Lubotsky). Exarchus (talk) 09:59, 11 December 2024 (UTC)Reply
Latest comment: 6 hours ago · 26 comments · 12 people in discussion
This is about Portuguese verb forms ending in -se, e.g., suicidar-se. A discussion regarding those forms in Spanish, Galician and Portuguese was previously held at WT:RFDI#curvar-se (see there for more information), but this proposal is only about our handling of Portuguese forms so that a resolution may be achieved more simply.
I propose the following, after discussing with other editors.
If entries exist for both the forms with -se and without it, they will get merged under the page without -se. Thus, the entry at the page with -se will be deleted.
If an entry exists only at the page with -se, it will be moved to the page without -se.
This deserves more compiled context. So, the primary argument for this move is that picking the forms with enclitical -se shows an illogical preference for enclisis, while proclisis is just as acceptable (arguably more acceptable). The reason why entries are currently listed under the enclitical forms is, of course, because those take a hyphen while proclitical forms don’t. However, having a hyphen does not automatically make an entry meet CFI: Idiomaticity rules apply to hyphenated compounds in the same way as to spaced phrases. — though this excerpt talks about compounds, it should definitely apply in this context too.
In comparison, the argument in favor of keeping reflexive-only verbs at the versions with -se is that those verbs are only used reflexively. But that doesn't take into account that they can well be used with proclisis instead. Additionally, it leads to a complicated situation when dealing with verbs that are used reflexively in addition to other manners — do the senses get split into different entries? They shouldn't, but if they don't it's inconsistent with how reflexive-only verbs get treated. The best solution is to not put anything under -se! Polomo47 (talk) 18:24, 11 December 2024 (UTC)Reply
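The two-step proposal at the top of this thread (merge into the page without -se when both pages exist, otherwise move the -se page) can be sketched as a small planning function. This is a hedged illustration only: `plan_se_moves` is a hypothetical name, it works on a plain list of page titles rather than on the wiki itself, and it does not check whether a given title is actually a Portuguese verb.

```python
def plan_se_moves(titles):
    """Return (action, source, target) tuples for every title ending in "-se".

    action is "merge" when the page without -se already exists, otherwise "move".
    """
    titles = set(titles)
    plan = []
    for title in sorted(t for t in titles if t.endswith("-se")):
        base = title[: -len("-se")]  # e.g. "suicidar-se" -> "suicidar"
        action = "merge" if base in titles else "move"
        plan.append((action, title, base))
    return plan
```

Called on `["curvar", "curvar-se", "suicidar-se"]`, the function plans a merge of curvar-se into curvar and a move of suicidar-se to suicidar.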
Weak support. I'm fine with this going either way tbh. Frankly it'd just really suck if Portuguese loses these and then they're kept for every other Romance language. But it'd be super cool if all those redundant non-lemma entries went away; I just hope that this is the first step toward eradicating them and not just, well, it. MedK1 (talk) 18:17, 21 December 2024 (UTC)Reply
I support the first bullet point and oppose the second. I agree that we need consistency. I have mixed feelings about pages like despedir-se; I wouldn't mind it being deleted, but does the hyphen make it so different from Spanish despedirse (which is allowed under CFI)? As for the second point, I believe readers are much more likely to search for reflexive-only verbs with -se in the search, and will find it strange to see the bare infinitive as the page title. This is consistent with the Romance dictionaries I'm familiar with. Looking up suicidar/se, I see -se in the title at Infopédia, Michaelis, and Aulete, but not at Priberam (and not because there's a transitive sense). Personally I'm happy with the Spanish approach; put all definitions under the non-reflexive form unless it's reflexive only, in which case the definition is under the -se form and the bare infinitive is a soft redirect (see agripar). Ultimateria (talk) 19:56, 21 December 2024 (UTC)
It’s critical to note that, though the dictionaries you mentioned include the enclitical reflexive form in their pages’ headwords, such pages are listed under URLs without the -se, invariably, for all dictionaries. I am not necessarily opposed to, in the headword line, listing a combined form, as long as the actual page the word is under does not include the enclitic.
The reason why I don’t want the enclitic is the same as the reason you mentioned for keeping it: usefulness to readers. In Brazilian Portuguese, absolutely no one uses enclitical forms instead of the proclitical ones, so I really doubt they would Google the combined form with the enclitic. They either search for the proclitic form, or a form with no clitic; I think I mostly stick to the latter myself, when Googling.
Further, like I mentioned above: proclisis is used in both Brazilian and European Portuguese, while enclisis is very restricted to European Portuguese — mainly in speech, but in writing too, even if it results in nonstandard writing. (Implicitly) prescribing one form over the other is questionable, and prescribing the latter, which sees less use, even more so.
Regarding consistency with Spanish, I don’t see an immediate need for it myself. While yes, it’s true that CFI says those terms can stay — though, isn’t that something that could be changed by vote? even if it’s unlikely —, it’s also true that CFI says the Portuguese equivalents cannot. Given all the arguments in favor of removing the Portuguese entries, the best way forward should be to do this for Portuguese only (which is why I brought it up) and later, if the community still wants “consistency”, it can instead look into changing how Spanish, Italian, etc. do it. Though MedK’s proposal to straight-up delete is a possibility, another possibility is to move the lemma forms to the version without the enclitic, while keeping the enclitic combined form as, well, a combined form.
@Ultimateria you should look at this BP thread, where we talked about the scenario in other Romance languages, and see if your thoughts still apply. The conclusion MedK and I reached: Portuguese and Spanish verb forms have nothing to do with each other. Polomo47 (talk) 02:59, 28 December 2024 (UTC)Reply
Oppose lemmatizing reflexive-only verbs at the non-reflexive variant and Support everything else. I talked with Polomo over Discord, just mentioning that I believe reflexive-only verbs should be lemmatized at their reflexive form and this should apply to all languages with reflexive verbs, at least those with floating reflexive verbs. Possibly the ones with fixed clitic reflexives (Russian, Icelandic, etc.) can be different. Benwing2 (talk) 21:21, 21 December 2024 (UTC)Reply
Support. I agree the lemma should always be the infinitive without any pronouns, but reflexive verbs need to show the proper conjugation. Currently, it's done with {{pt-conj|TERM-se}}, but I've seen it in very few pages (and you need to write it twice, unlike {{es-conj}}). Trooper57 (talk) 21:07, 22 December 2024 (UTC)Reply
@Ultimateria, Benwing2, AG202 I understand that the only remaining argument against this proposal is the possibility that learners of Portuguese, or other users of the site — knowing how Spanish and Italian lemmatize the verbs — might expect Portuguese entries to do it the same way.
On the other hand, please note how the votes among the Portuguese editors are, as of now, unanimous in favour of the proposal. I previously told Benwing on Discord that we couldn't get User:Sarilho1’s opinion — it slipped my mind then how it was his thoughts on the matter that motivated this proposal in the first place: I do think that if one is deleted, all of them should (posted two and a half years ago on the RfD linked at the top of the thread).
If we want to be even more sure of the general opinion on the matter, we can try to ask the remaining editors who did not reply to my initial contact. However, this really is a lengthy debate and I’d understand if they’re not interested.
We’ve established already that Portuguese grammar absolutely supports the arrangement we’re proposing. And since different languages do things differently, can’t they, well, do things differently on Wiktionary, too? Polomo47 (talk) 05:43, 12 January 2025 (UTC)Reply
@Polomo47 I understand your push to make this change and I don't want to be a party-pooper but I really think consistency among Romance languages (esp. Iberian Romance languages) is more important. It's not just that learners of Portuguese will expect things to work a particular way, but more importantly, the dictionary itself looks even more amateurish and poorly put-together than it otherwise does when it eschews consistency among closely related languages. I also don't really buy some of the arguments (particularly the grammatical arguments) made in favor of this change in Portuguese because fundamentally, the choice as to where and how to lemmatize is conventional and not dependent on a particular language's grammar. I should also add that User:Sarilho1's quoted opinion was made in the context of curvar-se (a verb with a non-reflexive equivalent), not a reflexive-only verb; this is a major distinction. Benwing2 (talk) 06:24, 12 January 2025 (UTC)Reply
Strong oppose the second bullet point. I maintain that we're lacking enough input from more PT-PT editors. As I stated in the BP thread:
It seems that PT-BR-based dictionaries don't lemmatize at "-se", which tracks with the comments made here, but we really need input from PT-PT editors before making such a sweeping change. I see that Ultimateria mentioned it, and @Polomo47: your point about "such pages are listed under URLs without the -se, invariably, for all dictionaries" isn't accurate. As seen by the links I've added, both the URLs with -se and the ones without it point to the entries, so it's not like there's a redirect from the latter to the former. And as I've said, having the headword line be different to that extent is not desirable.
Also @Trooper57: having the lemma be at the infinitive without pronouns, but requiring them to show the proper conjugation with -se only creates more confusion on the learner's end. AG202 (talk) 15:41, 12 January 2025 (UTC)Reply
I think that reflexive and pronominal verbs should only have an entry without the -se, similar to how German treats them (as in erinnern). It should have the reflexive or pronominal label in its meanings rather than in the entry's name. Also, we need to keep in mind that reflexive is not the same as pronominal. OweOwnAwe (talk) 16:23, 14 January 2025 (UTC)
Indeed, the nuance between reflexive and pronominal confounds many.
RFD-Moved. It seems that there is strong consensus for moving all the remaining -se Portuguese verbs to their non-se counterparts. I will accordingly work on moving all such entries. Imetsia (talk (more)) 19:05, 14 February 2025 (UTC)Reply
@Imetsia I don't see this as strong consensus. There are three clear opposes, one weak support, and six (I think) supports. This is a possible consensus but with a lot of opposition. Benwing2 (talk) 23:42, 16 February 2025 (UTC)Reply
I stand by the fact that we have now lost crucial information for learners such as conjugation information for verbs like Portuguese arrepender using the reflexive pronouns, which puts us out of line with learner resources and PT-PT dictionaries, and that we did not have enough input from PT-PT editors (only 1) on the second point. But alas, there’s only so much you can do. And @Imetsia: I do note that you only participated in this discussion (and declined to participate in the BP one) to move the entries, which, combined with your comment on the discussion for curvar and lack of reply, continues to give me pause on your impartiality on this. AG202 (talk) 03:20, 17 February 2025 (UTC)
I also got a notification recently that you thumbs-downed a comment that I made on Discord last May that your initial close for curvar should’ve been avoided, which I said in the discussion and I still stand by that. And that honestly just seems really petty and a bit weird for an admin to go out of their way to do, after the discussion was already closed. AG202 (talk) 03:24, 17 February 2025 (UTC)Reply
There are currently two different category trees for taxonomic names:
Category:Taxonomic name is a variety of Category:Translingual language (language code mul-tax) that's based on the concept of taxonomic nomenclature as a language: its members are all names in the standard taxonomic nomenclatural systems. Aside from the names of viruses, taxonomic nomenclature is basically a rather artificial construct formed from New Latin. It's new and was made possible by the expansion of the capabilities of language varieties, a.k.a. etymology-only languages. Right now it's a redlink, and I haven't figured out how to get {{auto cat}} to recognize it as valid.
Besides which, the name is kind of silly, since no one uses it to refer to taxonomic names as a group or a system, let alone a language. It would have been better named "Category:Taxonomic names", but the name has been taken by the following.
Category:Taxonomic names is a name category, part of the topical category system, and based on the concept of taxonomic names being something that different languages have. This system has been developed over the years in an ad hoc fashion, so it's a real mixed bag.
Prescriptively, "taxonomic name" refers only to names that are part of past and present systems used to create and manage the official names of biological taxonomic entities. All the names in specific languages as opposed to "translingual" or "multilingual" terms shared across all languages can't be referred to that way. That's because modern taxonomy is done by scientists acting as taxonomists, who have all agreed to abide by the taxonomic codes that dictate what valid taxonomic names are.
The language-specific subcategories mostly violate that. Many of their members are just adapted borrowings of the "real" taxonomic names, equivalent to "cervids" in English from the family Cervidae, which are deer. Others are vernacular names from within the languages that coincide with the taxonomic entities referred to by the official taxonomic names.
Remove all the entries from subcategories that aren't either taxonomic names in the strict sense or derived from taxonomic names in the strict sense.
Move the "terms derived from taxonomic name" categories to "terms derived from taxonomic names" categories under the new Category:Taxonomic names
Convert the language-specific categories to "terms derived from taxonomic names" categories and move them under the new Category:Taxonomic names, or merge them with any of the categories in the previous step if they would have the same name.
Either that, or leave them as they are, but move them under the "terms derived from taxonomic names" categories in the new Category:Taxonomic names and also make them children of the Category:Eponyms by language subcategories
It seems like a good program to rationalize what has grown a bit like Topsy, fitting in to the existing framework, however awkwardly. I'll look at the individual-entry category cleanup today and report back if there are any problems.
I think it is clear that "Translingual" is too big a wastebasket. Segregating taxonomic entries and CJKV characters would leave us with a much smaller wastebasket, itself to be rationalized eventually. DCDuring (talk) 16:03, 15 December 2024 (UTC)Reply
As far as I know, Cumbric is not attested directly. It seems that the following terms are only either mentioned in documents in other languages or deduced from things like place names and personal names, so they should probably be moved to Reconstruction: space:
Merge with I don't speak English § Translations (“I don't speak (fill with the name of the current foreign language)”), as both serve the same purpose (instead of duplicating the content in two places). J3133 (talk) 06:00, 4 January 2025 (UTC)Reply
Please note there was a previous discussion which is archived at “Category talk:Nautical”. I had previously suggested “Category:Water transport” to align the name with “Rail transport” and “Road transport”, but it didn’t gain sufficient consensus. I’m not sure “Nautical terms” is a good alternative; we don’t use the word terms in labels. — Sgconlaw (talk) 23:46, 7 January 2025 (UTC)Reply
A summary of the proposal is below. The rest of the proposal, as well as the first comment in the thread, comprise rationale, examples, and detailing.
Spellings like accôrdo, appêllo, máo shall be mentioned in usage notes under accordo, appello, and mao rather than in entries of their own.
Spellings like pessôa, vêr, and most* words where, if you take the accent away, you just get the normal spelling shall be kept.
*Spellings like espêlho, as well as inflections like almôços, arrôtos are more commonly — if not exclusively, to the point of otherwise not passing WT:ATTEST — seen as misspellings from 1911–1945 or 1943–1971. I'm proposing their deletion.
The proposal is, generally, to merge entries for accented pre-reform spellings with the equivalent entries without accents (if one exists), where a usage note will be added.
Examples of changes:
appêllo shall be merged with appello (and thus deleted).
Use of accents prior to the spelling reforms of 1911 (in Portugal) and 1943 (in Brazil) varied greatly between authors. The spelling appêllo, with an explanatory accent, was occasionally seen.
Conversely, words that were more commonly spelled with an accent than without it will be merged under the accented form. A usage note will be added. These are chiefly spellings that use ⟨é⟩ as an etymological way to write the diphthong /ɛj/.
In this pre-reform spelling, the letter ⟨é⟩ represented a diphthong sound. The unaccented spelling idea may be occasionally seen, especially in older texts.
Compare also spellings that end in stressed ⟨á, é, ê, ô, ó⟩, commonly spelled with accents.
Words ending in ⟨as, es, is, os, us⟩, with or without accents, were also written ⟨az, ez, iz, oz, uz⟩ prior to the reforms. This and other factors make it that very few spellings of this type differ at all from the word’s modern spellings (as such, there are few such entries).
Spellings ending in ⟨í, ú⟩, which were less common, are often categorized instead with {{pt-1931|mis=1}} or as general misspellings.
Additionally, some spellings are excepted from any changes. If a corresponding unaccented spelling did not exist, the words shall be kept.
Derivates of the word á, such as áquelle (which have an accent for different reasons) shall be kept.
Words that, aside from their accents, do not differ from the modern spelling. These entries shall be kept so as not to remove them entirely from the dictionary, and they shall employ (uncommon), especially when listed in Alternative forms sections.
Excepted from this criterion are words which are significantly more frequently attested as misspellings from the 1911–1971 period; it is pointless to keep misspellings from a bygone era.
Use of accents prior to the spelling reforms of 1911 (in Portugal) and 1943 (in Brazil) varied greatly between authors. The spelling with no accent, ver, was significantly more frequent overall.
@MedK1 and I are convinced that use of accents was entirely up to an author’s whims. For any and all pre-reform spellings with stressed a, e, o, you can attest spellings that clarify the stressed syllable and its height by writing á, ê/é, ô/ó; however, the two of us have, without fail, found the unaccented forms to have been much more common. These "clarifying" accents were often employed even when there were no homographs (such as with ceo), or even any possible ambiguity in reading (such as with pessoa); see the common spelling of João Pessoa’s name as João Pessôa. There is precedent that supports this proposal, namely how we handle languages with inconsistent (i.e., optional) or frequently unmarked diacritics, but it stands alone even without drawing the connection.
Our reason for this change in handling is to not give the wrong idea about what was the norm in pre-reform Portuguese spelling, because accents were not it. We also aim to greatly reduce the size of "alternative forms" sections for words like apelo — note that, if that section were complete, it would also include some of apélo, appélo, appéllo, apéllo...
It’s worth clarifying that currently the only entries that get classified under the 1911 and 1943 categories are spellings that were still in use immediately prior to the reforms being enforced, so nothing that had long fallen out of use — we’ve been counting those as obsolete. Spellings that were used right before 1911/1943, but then stayed in use up to later reforms (or are still used today), are not listed under such categories either — this means that words like espêto are not counted in this proposal because they were the correct spelling up until 1945/1971. However, spellings like espêtos, azêdos were not standard during the same time period and are mainly attested in pre-1911/1943 texts: in order for them to receive an entry, they would need to be attested in works using post-1911/1943 orthography.
To address the exception from the proposal: unlike most former uses of accents, marking oxytone words with ⟨á, é, ê, ô, ó⟩ was indeed the norm; funnily enough, putting aside verb forms, a majority of these are just the same as the modern spellings of the words. It’s worth noting, however, that this was not the case with using ⟨í, ú⟩ — such words are instead marked as 1931-prescribed spellings if they have more than two syllables, or as misspellings if not.
Examples of two authors’ writing. Just because I wanted to talk about this.
Maria Amália Vaz de Carvalho used bountiful accents in her writing; however, her choice between accentuating or not may be inconsistent within seemingly equivalent words.
She would always mark the pluperfect tense with -êra, -ára, etc.
She would always accentuate the diphthongs -áo, -éo.
She would accentuate most oxytones ending in a, e, o, like lá, má, pé, dá, avô, but notably not the verb há (spelled ha)
She would write some infinitives with -êr, and some with -er: verem but revêrem.
She would write fóra and fôra, but also côr.
Graciliano Ramos comparatively used very few accents in his writing.
Of course, he would accentuate all oxytones ending in a, e, o.
He would accentuate to distinguish some homographs: fóra and fôra, pêlo from pelo...
Right. I considered, for a moment, making a usage note template — something like that could be used on pre-reform pages like appello, but we couldn’t place one on a page such as pessoa. I think the best idea is to include this on the Appendix to-be we’re working on over at WT:Aliança Galego-Portuguesa/Reforms of Portuguese orthography.
I think we could move that to the Appendix namespace already, but before finishing the page, we’d need to figure out what format makes it easiest to digest all the information (because I know what we have now isn't very good). Polomo47 (talk) 02:23, 12 January 2025 (UTC)Reply
Support. However, I must say I'd be hesitant about implementing the last bullet point (concerning pessôa, corôa, etc.) alongside this merger as it is possible that those still existed as misspellings decades after standardization (somewhat akin to pêras). Otherwise, 100% agreed. MedK1 (talk) 05:51, 12 January 2025 (UTC)Reply
I must clarify that we should still include the spellings I mentioned if they are thrice-attested in works that otherwise follow the reforms. If pessôa remained a common misspelling after 1911/1943, then it would still be a worthy inclusion. If it did not, then that just shows the use of the accent was just as a reading aid.
Note, however, that we need to distinguish quotes not by date, but rather by the general orthographic choices in the work. There are many books published well into the 1910s that refused to implement the reformed spelling — just like there are many Portugal news outlets today that refuse to implement the 1990 reform, lol. Polomo47 (talk) 20:00, 12 January 2025 (UTC)Reply
Oppose. I see no benefit in deleting pages with obsolete/superseded spellings. If [[appêllo]] becomes red, and someone searches for it, they'll be taken to the search results page and have to decide whether the word they're looking for is [[appello]] or [[appellò]], which is a waste of their time for no benefit. And if the Portuguese section of [[máo]] gets deleted, but the Chinese section remains, then someone looking for the Portuguese word [[máo]] will land on a page with no Portuguese entry, which will disappoint and annoy them, and may even incite them to create a Portuguese section saying that it's an {{obsolete spelling of}}mau, without realizing the page used to say that until it was deleted. Keeping these entries as they currently are, soft redirects using {{obs sp}}/{{sup sp}}, certainly does no harm, while deleting them would do harm. —Mahāgaja · talk14:43, 13 January 2025 (UTC)Reply
Consider that, at the moment, Wiktionary has entire languages with this search problem. If someone sees the Yorubá word dára, they’ll search for it and have to decide between Dara, dara, dará...
I think that yours is a valid concern, but the problem extends much further than these Portuguese words.
Further consider that these pages have very few views. In the past four years, the average page seems to have around 80 views. This is the case for Hyrcânia. Some pages, like Bósphoro, have less than 40. Further compare how Zéphyro received 80, while Zephyro got 170, and Zéfiro got 512.
These words are actually readily recognizable, so there's not much reason for someone to look them up unless they wanna know about the spelling itself. I imagine most of these pages are found by clicking on "See also", because carthaginez has very few views.
I totally support doing something about the "Diacritic" languages in WT:LOL, but I don't see any development on that front. This change to Portuguese pages harms very little and helps way more: I did point out how such entries make Alternative forms sections harder to navigate (and would even more if they were complete). I hold the section precious, after all, because it is the place to document how a word’s spelling developed.
As mentioned in my comment below, the situation with Yoruba is different from the one you're proposing. And yes, it is a bit frustrating that someone has to go to dara to find the Yoruba entry, and it's too much effort to try and move them now. (Though I'm not sure why they'd need to go to Dara or dará) We have proposed creating soft redirects so that there's a Yoruba entry at dára that points to dara, but I have not followed up on it yet. AG202 (talk) 19:07, 13 January 2025 (UTC)Reply
Getting very few views is also not a reason to delete anything. And while these words may be readily recognizable to someone with a very good knowledge of Portuguese, someone with a beginning level may genuinely not know whether [[máo]] is an old spelling of [[mau]] or of [[mão]]. —Mahāgaja · talk20:10, 13 January 2025 (UTC)Reply
I did not mention the amount of page views as a reason for deletion, but rather to show that these pages are not useful in the way you are hypothesising. I mentioned in my opening comment that these spellings are considerably rarer than the equivalent spellings with no accents, such that I consistently manage to find just over 3 citations for the accented forms, while the forms with no accents have hundreds. The data I got was just to support that (1) there really aren't many people looking for these words in the first place and (2) the words people do look for are the ones without accents (by almost double?). Polomo47 (talk) 20:25, 13 January 2025 (UTC)Reply
Oppose for now. We should aim to have every word in every language. Since these forms aren't "misspellings", which would put them under the misspellings policy in WT:CFI, they should be subject to the same WT:ATTEST rules as any other word. They must be cited in 3 different texts from 3 different authors spanning at least a year; if they are not, then they would not pass WT:RFV under the normal rules. We are not limited by space on Wiktionary, and plenty of languages, including English, include obsolete spellings all the time. There is no harm in including them.
"There is precedent that supports this proposal - how we handle languages with inconsistent (i.e., optional) or frequently unmarked diacritics."
This isn't really accurate or the full picture. For languages where we strip diacritics from the entry title, we still include the diacritics in the headword line, and this proposal doesn't propose that. That also makes this more of a WT:RFD issue than a WT:RFM issue. On the other hand, there is direct precedent for maintaining older obsolete spellings like English cœnæsthesis, Spanish ántes, French extraördinairement, and Italian ànno. Again, if these are misspellings then that's another thing (and would go through RFD), but if not, these should just be subject to the normal WT:RFV guidelines per WT:ATTEST. AG202 (talk) 19:02, 13 January 2025 (UTC)Reply
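For readers unfamiliar with the attestation threshold referred to in this comment, here is a minimal Python sketch of the rule as it is paraphrased in this thread (three different texts, three different authors, spanning at least a year). The `Citation` class, the function name, and the sample data are hypothetical illustrations; WT:ATTEST itself has further conditions that this sketch does not model.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Citation:
    """A hypothetical record of one quotation supporting a spelling."""
    author: str
    work: str
    published: date


def meets_attestation_sketch(citations, min_count=3, min_span_days=365):
    """Simplified check: enough distinct authors and works, over a long enough span.

    This only models the rule as paraphrased in the discussion; it is not
    the full policy.
    """
    authors = {c.author for c in citations}
    works = {c.work for c in citations}
    if len(authors) < min_count or len(works) < min_count:
        return False
    dates = [c.published for c in citations]
    return (max(dates) - min(dates)).days >= min_span_days


# Placeholder citations (not real quotations).
sample = [
    Citation("Author A", "Work 1", date(1890, 1, 1)),
    Citation("Author B", "Work 2", date(1895, 6, 1)),
    Citation("Author C", "Work 3", date(1901, 3, 1)),
]
print(meets_attestation_sketch(sample))      # True
print(meets_attestation_sketch(sample[:2]))  # False
```

Under this reading, a spelling with only a handful of citations can still pass, which is the situation the proposer describes for the accented forms.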
"I must clarify that we should still include the spellings I mentioned if they are thrice-attested in works that otherwise follow the reforms." I just saw this comment. Why not just have these forms go through the normal RFV process? AG202 (talk) 19:10, 13 January 2025 (UTC)Reply
I wrote that comment considering the words’ inclusion as misspellings post-1911/1943. I wrote this RFM precisely because subjecting most of these to RfV would end up with them passing, because for every weird use of accents, there was a very small minority of authors who wrote with them. Note that I’m not really trying to delete the spellings, but rather combine them under a single page aiming to maximize the usefulness of the dictionary. It is not useful for readers to see equivalent forms máo and mao listed separately and taking up double the space in Alternative forms sections — space is a concern for sections like the one in apelo. While I'm, sadly, unacquainted with Yorubá, I'm not ignorant to how Russian works, and this accent usage is very close to the situation with Russian-language acutes to mark stress.
I did not initially propose changing the headword on these entries because the most common spellings did not use accents. My impression is that part of the reason we include optional diacritics in other languages’ headword lines is to clarify the pronunciation (even though we have the Pronunciation section, we do it because dictionaries do it; though I'd appreciate if you could tell me more reasons), and since these Portuguese words are listed for the spelling and not any other content, then I don't see a value to including them.
I am open to ideas about how to clarify that every single Portuguese word had, at some point, a spelling that used weird accents. I plan to make a usage note template like I mentioned to Davi above, but that would unfortunately not work for all entries. Polomo47 (talk) 19:22, 13 January 2025 (UTC)Reply
Space is really not a concern for Wiktionary. Someone seeing an additional word or two on a line is frankly not really a justification for deleting entries and historically reduces our usefulness. Plenty of entries have many more alternative forms, and the alternative forms section is not the section that takes up the most space on average. I don't think you'd advocate for limiting the pronunciation section to just two standard pronunciations for space reasons or limiting the etymology section to just one ancestor. Space optimization really only becomes a concern with super long lists, and even then, we have collapsible tables for a reason and compression methods, way before we get to the point of deleting entries. I've never seen entries get deleted for the purpose of "saving space", and I don't want to see it start now. We're a dictionary on the internet for a reason; we don't need to save paper. Ideally we want to have as much information as reasonably possible as a wiki.
Nonetheless, if they are actually misspellings, then they should be sent to RFD under our existing misspelling policy. There's no need for an RFM. If they are not, then they deserve to have their own entries like the ones I mentioned above, following WT:ATTEST, with another example being English coöperate.
Optional diacritics are not just for clarifying pronunciation. Some languages like Yoruba require diacritics in the standard orthography for disambiguation, but the reason why they're not in the entry title is that they're not often used by the general populace. Additionally, Yoruba does have archaic diacritics like in ẽrú; we simply have not created all of the entries yet, as we have other things we're focusing on right now. But we wouldn't outright eliminate them. AG202 (talk) 19:51, 13 January 2025 (UTC)Reply
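The distinction drawn in this thread between the entry title (diacritics stripped) and the headword line (diacritics shown) can be sketched in Python as follows. This is only an illustration: it strips every combining mark, which is an assumption and cruder than the per-language rules Wiktionary actually applies, and the function name is hypothetical.

```python
import unicodedata


def strip_diacritics(text):
    """Remove all combining marks from text (a blunt, language-agnostic sketch)."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


# Headword forms mentioned in the discussion, mapped to a diacritic-free page title.
for headword in ["dára", "mēnsa", "appêllo"]:
    print(headword, "->", strip_diacritics(headword))
# dára -> dara
# mēnsa -> mensa
# appêllo -> appello
```

In this model the accented form survives on the page as the displayed headword even though it has no page of its own, which is the point being made about languages with optional diacritics.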
Clarifying: the reason for deletion is not to save space — I worded it poorly. The reason is that it is not useful for a reader to see, in an Alternative forms section, two entries for what is really the same spelling: appêlo and appelo; appello and appêllo; apello and apêllo. My thinking is that the only reason for someone to click on that is if they want to know more about the word’s historical spelling, and for that purpose it’s only convenient to consolidate in that page information about how accents were occasionally used (by means of a usage note).
Note that the situation is different for this from how it is with things like English coöperate, in that there really are people who want to know about the spelling with diereses. Not the case with Portuguese, for a variety of reasons I mentioned previously.
Some data: coöperate has >6700 views in the past four years; English cooperate has 10k views (i.e., the alternative spelling gets 2/3 as many views). In comparison, apelo has 1540 views while appêllo has 170, which is closer to 1/10.
The diacritics are also different from the Yorubá diacritics in that there was no standard orthography prior to 1911/1943. At least these diacritics were used to clarify pronunciation, which is why I drew the comparison with Russian. And, in Russian, I am under the impression that one of the reasons we include the diacritics is for learners of the language, which does not apply for such Portuguese words. Polomo47 (talk) 20:21, 13 January 2025 (UTC)Reply
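The view-count comparison in this comment is simple arithmetic; a short Python sketch makes the two ratios explicit. The figures are the approximate ones quoted above (">6700", "10k", 1540, 170), and the function name is hypothetical.

```python
def view_ratio(alternative_views, main_views):
    """Share of views the alternative spelling gets relative to the main spelling."""
    return alternative_views / main_views


# Approximate four-year view counts as quoted in the comment.
print(round(view_ratio(6700, 10000), 2))  # 0.67, roughly 2/3 (coöperate vs cooperate)
print(round(view_ratio(170, 1540), 2))    # 0.11, roughly 1/10 (appêllo vs apelo)
```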
We really should not base things on views. Obviously more archaic spellings are going to get fewer views. More minority languages are going to get fewer views. I work on Jeju and I'm sure that 으키여(-eukiyeo) & 으키어(-eukieo) get similar amounts of views that appêllo gets (mostly from myself), but that doesn't mean that it devalues the entry or that I should delete one of them. No, the goal of Wiktionary is to describe and document a language, regardless of how popular an entry is. An accurate entry will always be useful. If we based our work on popularity, then 95% of the languages covered would simply not exist on this site.
For Yoruba, there was no official standard orthography before 1966, so entries like ẽrú do give a somewhat parallel situation (it was replaced by eérú in that orthography). The tilde was used by some authors to show the long vowel. AG202 (talk) 21:10, 13 January 2025 (UTC)Reply
I don't think you get it. These variations aren't comparable to English naïve/naive or fœtus/foetus/fetus. They're analogous to Latin mensa/mēnsa/ménsa in that albeit helpful for learners, the variants with the diacritics are rarely used (due to a feeling that its presence should be obvious for any user of the language), not considered 'more correct' than the accentless version and the matter of 'what accent to use' being actually fairly contentious. Indeed, we don't mention "ménsa" at mensa, even though some people employ it on purpose because acutes look closer to apices on Classical inscriptions than macrons do.
I actually can agree with putting accent marks in the headword for all these pre-standardization forms as, like with Latin, that's what people would do when teaching the language (see my user page for evidence of this). Plus it's what the plan is for Old Galician-Portuguese, so that's nice too.
I don't think these points should be ground for an entire 'oppose' vote though. Accents back then did not constitute new words — unlike in the modern day — and so there's no real reason to treat them as such, especially since the treatment for Latin is so much closer to the reality of the day.
It might be worth noting that the didactic value an altered obsolete headword can bring — informing a word's pronunciation — is severely diminished when chances are those pages' readers came from the page for the modern-day spelling, which already gave them the pronunciation. In the case they didn't, then the modern spellings (with all the information they could need) are just one click away.
Polomo's point about page views is to drive home the idea that unlike with Latin or with superseded/antique English spellings, nobody really hops into the Portuguese L2 hoping to learn about the etymological spellings, especially not beginner learners. The people who do tend to know a thing or two about the language. This is to say, although I see where you come from when you bring up accenting all these headwords, there's really no reason to find these accents' presence a big deal at all — again, they're of a different nature from the modern-day obligatory accent marks. MedK1 (talk) 04:00, 14 January 2025 (UTC)Reply
My concern centers around removing attested spellings in running text without putting them somewhere in the mainspace, whether in the headword line or in their own entries. I've already given the caveat for misspellings. And FWIW, you and Polomo have given two very different situations. Are they spellings to mark pronunciation or are they misspellings? I also gave more examples in other Romance languages that are more analogous, not just English. If the spellings were obsolete and rare, but attested, then we could easily use {{lb|pt|rare}}{{obsolete form of|pt|...}}. It's also possible to put labels in the headword line, if folks would prefer that over full entries, as with roof. CC: @Benwing2 for more on that. I simply do not want attested spellings to disappear. AG202 (talk) 04:30, 14 January 2025 (UTC)Reply
I never addressed what you said about misspellings because I didn't recall mentioning misspellings in the context of the proposed merges. So, to elucidate: no, they are not misspellings; the misspellings I mentioned are different types of entry.
Also, I mentioned employing usage notes (noting that the use of accents did exist), which would still mean mentioning the accented spellings in mainspace; I indeed do not wish to remove all mentions of the spellings.
If this is the one point of contention, then it applies only to terms for which the only thing setting them apart is the accent: vêr, pessôa, propôr, etc. I didn’t feel that strongly about deleting those anyway, so I think what could be done is, for these words, to keep the entry and add a usage note stating the converse (that the form without an accent was more common).
Yes, I would personally support that option that you mentioned at the end of your comment. Sorry that this took so long to resolve. AG202 (talk) 05:07, 14 January 2025 (UTC)Reply
@Polomo47 I am generally sympathetic to the idea of not including loads of obsolete spellings and especially not misspellings. One thing to keep in mind is the |altform= flag to {{head}} and corresponding internal flag in Module:headword; this takes alternative forms out of Category:Portuguese lemmas and the corresponding POS category and puts them in Category:Portuguese alternative forms (which doesn't yet exist; there isn't yet a split by POS in this category but it could easily be implemented). I would recommend using this flag for obsolete alternative spellings, so they don't clog up the lemma categories. As for the proposal itself, I had some difficulty following it; would a form like espêlho and rapôsa (which I suspect was relatively common pre-reform) go in the ==Alternative forms== of espelho and raposa, or go nowhere? Benwing2 (talk) 00:17, 27 January 2025 (UTC)Reply
The Latin analogy actually takes away from your point: if the disambiguation issue were as pervasive as in Latin and the orthographic situation comparable to that of Latin, noting these forms on every Portuguese entry would be of utter importance. That's why the citation form of a Latin word in just about any dictionary includes the diacritics, and why we display the headwords that way (in {{head}} or a language-specific variant). That same practice exists for Russian and various Slavic languages, among others, hence Wiktionary's adoption of the headline form with diacritics. I am not proposing we do this with Portuguese, but based on this discussion I would say the situation with Portuguese is far more akin to that with Italian. — 2600:4808:9C30:C500:5472:3DDE:ACC3:B69220:25, 4 February 2025 (UTC)Reply
The point was that while Latin entries have diacritics in their headwords for clarification, it doesn't make sense to do it for Portuguese non-lemmas (these are alternative spellings; pre-reform spellings) when the feature of interest is precisely the spelling quirks. Did you understand that this is not modern-day Portuguese, but Portuguese from ~120 years ago?
Indeed my comparisons are flawed because I don't believe any other language, as currently implemented on Wiktionary, has the same scenario as Portuguese pre-reform spellings. Polomo47 (talk) 20:33, 4 February 2025 (UTC)Reply
Please enlighten me as to how other languages’ pre-reform spellings are treated, and how those correspond with Portuguese and the accent trouble I mentioned. Having “the same treatment” would mean there is the same scenario. Polomo47 (talk) 10:32, 7 February 2025 (UTC)Reply
-ável's usage notes: "Some words ending in -ável are descended directly from Latin as opposed to being formed from a stem and a suffix". Likewise, these primitive adjectives, descended directly from Latin, end in -ével, -óvel, or -úvel: delével, móvel, resolúvel, solúvel, and volúvel, and the endings have the same meaning. But Wiktionary doesn't cover those cases.
There are -abilidade and -ibilidade but no entries for the aforementioned cases. Due to them and to every noun ending with -abilidade or -ibilidade's having a corresponding adjective ending in -ável or -ível (see compatibilidade, digestibilidade, and elegibilidade), the two entries are merely SoPs of two suffixes, -ável or -ível + -idade.
What I think we should do is lemmatize at -vel, and list -bil (not an infix) as an alternative form. I want to keep -ável and -ível as alternative forms also, just like we have -amento and -imento — seems this is par for the course with suffixes that take verbs as input.
@Davi6596 These sorts of issues also come up in Russian. The correct analysis IMO of governabilidade is, as you say, governável + -idade, where -idade causes certain morphological modifications to preceding suffixes, such as converting -vel to -bil-. Take a look at Russian -ный(-nyj) for how a similar situation is handled in that language; basically, I added a bunch of usage notes explaining the phonological changes that happen to stems preceding this suffix. These changes are intuitive to a native speaker but definitely non-obvious to a non-native speaker. For how to handle -vel vs. -ável/-ível and similar issues, you might take a look at
Russian abstract noun-forming suffixes -ание(-anije) and -ение(-enije) (the former is usually added onto verbs in -ать(-atʹ) and the latter onto all other verbs, including both verbs with vocalic stems in -е-, -и- etc. and verbs with consonant stems; sometimes they are analyzed as two suffixes -ние(-nije), which is added onto verbs in -ать(-atʹ) and maybe -еть(-etʹ), and -ение(-enije), which is added onto all other verbs);
Russian adjective-forming -ский(-skij) and the extended variants -еский(-eskij), -ческий(-českij), -ический(-ičeskij), etc. (see under ==Related terms== in -ский(-skij) for the full list);
Russian adjective-forming suffix -альный(-alʹnyj), which is sometimes productive on its own but in many cases is best analyzed as -ный(-nyj) added onto French and German adjectives ending in -al, -ell or similar;
Russian verb-forming -овать(-ovatʹ) vs. -ировать(-irovatʹ), where I've analyzed verbs ending in surface -ировать(-irovatʹ) that derive from German verbs in -ieren as actually being the German verb plus the shorter Russian suffix -овать(-ovatʹ), and only used -ировать(-irovatʹ) as an actual suffix where the -ieren + -овать analysis is impossible.
The case with -ável and -ível seems much like the case with -ание(-anije) and -ение(-enije) in that -ável is usually added onto verbs in -ar and -ível onto other verbs. In Russian we chose not to create an entry -ние(-nije) since this suffix is never productive in the modern language; instead -ание(-anije) and -ение(-enije) are alternative forms of each other. You might follow a similar approach; at least, -ável and -ível should include their own etymologies, derived terms and usage notes, as they do now. The only reason I can think of to create -vel would be for surface etymologies of terms in -úvel and such, but I'd actually caution against a surface etymology such as resolver + -vel -> resolúvel unless there are clear synchronic rules governing such creations; otherwise they are mostly just confusing and misleading, and it would be better to just derive resolúvel from Latin resolūbilis, which includes its own derivation from resolvō and -bilis.
For similar reasons, I don't think there should be a Portuguese entry at -bil, -abil, -ibil or similar (because they're not affixes, but simply allomorphs that occur before -idade), and I would likewise delete -abilidade and -ibilidade, which require an incorrect etymological analysis for them to exist at all. Benwing2 (talk) 23:19, 26 January 2025 (UTC)
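To make the analysis above concrete, here is a minimal sketch in Python (not Wiktionary module code) of -idade attaching to an adjective in -vel, with -vel surfacing as -bil-. The function names add_idade and strip_accent are hypothetical, and the rule that the vowel before -vel loses its written accent is an assumption generalized from the examples cited in this thread (governabilidade, compatibilidade, digestibilidade, elegibilidade); it is not meant to cover every Portuguese word.

```python
import unicodedata


def strip_accent(text: str) -> str:
    """Remove combining diacritics, e.g. 'á' -> 'a'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def add_idade(adjective: str) -> str:
    """Attach -idade to an adjective in -vel (hypothetical helper).

    Assumed rule, based only on the examples in this discussion:
    -vel becomes -bil- and the preceding vowel loses its written accent.
    """
    adjective = unicodedata.normalize("NFC", adjective)
    if len(adjective) < 4 or not adjective.endswith("vel"):
        raise ValueError(f"expected an adjective ending in -vel: {adjective!r}")
    stem = adjective[:-3]                       # 'governá'
    stem = stem[:-1] + strip_accent(stem[-1])   # 'governa'
    return stem + "bil" + "idade"               # 'governabilidade'


if __name__ == "__main__":
    for adj in ("governável", "compatível", "elegível"):
        print(adj, "+ -idade ->", add_idade(adj))
```

Note that -bil- never appears as an input to the function: it only arises as a by-product of adding -idade, which is the point made above about it being an allomorph and not an affix.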
Your proposed approach also makes sense. But our treatment of, e.g., Latin, does make entries for allomorphs... like those of con-, though they are also mentioned in usage notes. I'm not sure which way I prefer.
My reason for creating -vel is that the a in -ável is just from -ar. But then again, that can't be explained for second conjugation verbs, nor for those ending in -or, which end in -nível, for similar reasons as how -ável itself becomes -ábil when suffixed. How should we treat that? Polomo47 (talk) 19:45, 27 January 2025 (UTC)
So the difference between con- and -bil is that the former is a productive affix, while -bil isn't; it's just a variant of -vel before -idade, which never occurs as an affix in an etymology. What I mean is that there's no word that is formed by adding -bil onto a verb. Instead you add -idade onto a word ending in -vel and the -vel changes to -bil, but by then it's just three letters at the end of the word. The fact that this happens with other suffixes like -issimo and others shouldn't matter; similarly, there are various suffixes in Russian -ный(-nyj), -ник(-nik), -ница(-nica), etc. that all cause the same modifications to the previous word when added (palatalization, insertion of a prop vowel, stress movement, etc.). The way I'd handle that is to just mention in the usage notes of each suffix that changes -vel to -bil that it does this; or, if there is a cluster of changes that occur in multiple suffixes, put them all in the usage notes of one of the suffixes and refer to that suffix in the usage notes of the other suffixes. As for -ável vs. -ível vs. -nível, this is again reminiscent of -ание vs. -ение in Russian, where the choice mostly depends on the verb class (and there are in fact some verb classes that take -тие instead) but with very occasional exceptions; and similarly there are likely to be exceptional cases in Portuguese where -ável is added onto a verb not ending in -ar. Each of these variants needs its own etymology, as well as its own usage notes describing the circumstances when the variants are used. The definition of -ável might read "-able, -ible; alternative form of -ível and -nível" and similarly for the other two. Benwing2 (talk) 20:20, 27 January 2025 (UTC)Reply
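As a rough illustration of the distribution described above, the sketch below picks a variant from the verb's infinitive ending. It is Python written for this discussion, not anything in Wiktionary's modules; the function name vel_variant is hypothetical, and the mapping (-ar takes -ável, -er/-ir take -ível, -or/-ôr take -nível) is only the general tendency stated in this thread, so the occasional exceptions mentioned above would need a lookup table on top of it.

```python
import unicodedata


def vel_variant(infinitive: str) -> str:
    """Return the variant of the suffix that a verb usually takes.

    Hypothetical helper; encodes only the default pattern discussed
    in this thread and ignores lexical exceptions.
    """
    verb = unicodedata.normalize("NFC", infinitive).lower()
    if verb.endswith("ar"):
        return "-ável"    # first conjugation, e.g. governar
    if verb.endswith(("or", "ôr")):
        return "-nível"   # pôr and its compounds, e.g. compor
    if verb.endswith(("er", "ir")):
        return "-ível"    # second and third conjugations, e.g. eleger
    raise ValueError(f"not a recognizable infinitive: {infinitive!r}")


if __name__ == "__main__":
    for verb in ("governar", "eleger", "compor"):
        print(verb, "->", vel_variant(verb))
```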
{{Latin personal pronouns}} should be reoriented to give the cases down the side, which matches the other Latin declension templates.
I also don't like the way it implies "possessive" is a separate case form - it should be treated as part of the other axis, since each of the possessive determiners has its own declension paradigm. It should be possible to do this without cluttering things too much.
It's wrongly stating that eius is a possessive determiner, when it isn't - there is no third-person, non-reflexive possessive determiner; the genitive of the third-person pronoun is used instead (which is what eius is). I suspect this is because eius resembles meus, tuus and suus, but it doesn't decline in agreement with its referent (e.g. for an accusative referent, meus → meum, tuus → tuum, suus → suum, but not eius → *eium). Same goes for eōrum and eārum in the plural.
@Theknightwho thanks for your input. I think I will go ahead with remodelling {{la-decl-ppron}} into a per-pronoun template.
As for your other comments:
From looking at many of our Cat:Personal pronoun boxes, I've formed the view that, when there's a choice between making the template "landscape" (wider than it is tall) or "portrait" (vice versa), "portrait" is the better option, because it gives a better experience to mobile users and is more likely to avoid horizontal scrolling on Vector 2022. Here, the "portrait" option involves keeping the orientation of {{Latin personal pronouns}} - persons down the side, cases across the top. Yes, this would be inconsistent with {{la-ndecl}}, but I think the inconsistency is worth it for the other benefits. Moreover, it is how most of our personal pronoun boxes for other declining languages are currently arranged. (Admittedly, a few were transposed by me, but most were already like that before I came along.)
At minimum, the "possessive" column should be set off by a separator (a narrow gap, as seen in {{la-adecl}}). Doing what you suggest may make the template too large.
@This, that and the other I'm happy to do the work for {{la-decl-ppron}}, as I've been working on the Latin inflection modules quite a bit at the moment. In terms of the size issue: we could incorporate selective expansion for different parts of the template, which might be a way to have it both ways. Theknightwho (talk) 13:01, 3 February 2025 (UTC)
@Grande1900: I think you have it backward: it doesn't matter how it's transliterated or mistransliterated. Presumably this word is attested, and we should go by the attested spelling(s). We need to find out how it's spelled in actual inscriptions, and then decide where to move it based on that. If it's not attested, we'll need to move it to the Reconstruction namespace. Chuck Entz (talk) 05:49, 6 February 2025 (UTC)
After looking further, an inflected form 𐤮𐤱𐤠𐤭𐤷 as well as multiple adjectival forms (namely 𐤮𐤱𐤠𐤭𐤣𐤶𐤫𐤴) appear twice on this inscription that we (thankfully) have here on commons. This is the exact same inscription Gusmani references for his entry on the word. So unless there is an attestation for either 𐤳𐤱𐤠𐤭𐤣 or 𐤮𐤱𐤠𐤭𐤣 (which we don't know of since the article is completely unsourced), it should be at least moved to Reconstruction:Lydian/𐤮𐤱𐤠𐤭𐤣.
These three lists of English rhymes, for their Phonetic Transcription, are inaccurate for dialects of English, broadly, outside of the United Kingdom. Words like bang, dangle, thanks, and thang are given the IPA and enPR to suggest they are said as -æŋ(-), while the accompanying audio file, whether it is AuE or GA/US, dramatically raises the vowel from /æ/ to /eɪ/. These pronunciations are accurate! /æ/ and /eɪ/ are not allophones in these dialects! One can only assume they are native speakers faithfully representing their national dialect, while audio sources which are phonetically identical are all over the web. Sang even has /seɪŋ/ written for its US English pronunciation, which I think is very appropriate.
Unfortunately, most dictionaries, even chiefly American ones, don't reflect this difference properly. I know that it must be a gradual change for every pronunciation of <a> before <ng> to be altered on Wiktionary. But, I still think an important change is in order.
If this petition fails, the next most appropriate step would be to add an alternative pronunciation to pages like Rhymes:English/æŋ. Most appropriate after that would be to add ambiguity to enPR. But here are my reasons why a split would be better:
Giving a rhymes page two different broad transcriptions is kinda silly. Broad transcription should be informative of a general way things are pronounced. It is normal and common for two distinct pronunciations to exist, but being distinct, they themselves would not rhyme. They do not have the same rhyme. I think that's very silly, almost comical. But it's better than the alternative, inaccuracy:
The grouping of sounds /æŋ/ is not present in many representative dialects of English. Wiktionary should seek to accurately represent these dialects to aid in informing curious people.
RP should not be prioritized or treated as a default. People who are learning English are not always (and, I am inclined to say but cannot numerically prove, not usually) learning Standardized, Prestige-Dialect, British English. Numbers are, of course, not everything, and Standard British English should be given equal preference to all other standard dialects. For this reason, both Rhymes should get their own pages.
Rhymes exist in a network. Rhymes:English/æŋ is connected to Rhymes:English/æ-, and Rhymes:English/eɪŋ would connect to Rhymes:English/eɪ-. In my dialect of English, the Onset and Coda of "Sang" and "Safe" are definitively identical, and so should go together on Rhymes:English/eɪ-.
There may exist a word used in English that you and I are not aware of, or that may exist in the future, which rhymes with an /-eɪŋ.../ word and not an /-æŋ.../ word, or vice versa. Not only that, but rhyme pages also often include near-rhymes, and I think Rhymes:English/eɪŋ would already have one that differs from /-æŋ/ (perhaps even as a pure rhyme): ginseng!
So, for all these reasons and more, please support beginning the work of differentiating these pronunciations. Because they're different! Cam0mac (talk) 05:04, 11 February 2025 (UTC)
Completely oppose. The point of Rhymes pages is to group together words that rhyme with each other. Regardless of whether you pronounce hang and rang as /hæŋ/ and /ɹæŋ/ or [heŋ] and [ɹeŋ] (the accents with raising still don't have the ɪ-offglide), the point is that they rhyme with each other and it would be redundant to have two sets of Rhymes pages with identical lists of words. If there are accents in which ginseng also rhymes with these, that can be marked with a note like {{q|in some accents}}, which is already widely done on Rhymes pages. The contrast between the phonemes /æ/ and /eɪ/ is neutralized before /ŋ/, so it doesn't matter from a linguistic point of view which symbol we pick to represent the underlying vowel, but /æ/ has the weight of tradition behind it, not to mention the fact that even in the U.S. a whole lot of people (myself included) really do pronounce these words with [æŋ], maybe raised slightly to [æ̝ŋ] but certainly not raised as high as [e(ɪ)ŋ]. —Mahāgaja · talk 08:31, 11 February 2025 (UTC)
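The redundancy argument above can be illustrated with a small Python sketch. Everything in it is hypothetical: the NEUTRALIZED table, the function names, and the tiny word list are made up for illustration and do not describe how Rhymes pages are actually generated. The idea is only that transcriptions differing in a neutralized contrast collapse to one page key, so the word lists come out identical whichever symbol is chosen.

```python
from collections import defaultdict

# Hypothetical table: rhyme transcriptions that differ only in a
# contrast neutralized before /ŋ/, mapped to one conventional key.
NEUTRALIZED = {
    "eɪŋ": "æŋ",
    "eŋ": "æŋ",
}


def rhyme_key(rhyme: str) -> str:
    """Return the page key under which a rhyme would be listed."""
    return NEUTRALIZED.get(rhyme, rhyme)


def group_rhymes(entries):
    """Group (word, rhyme) pairs into {page key: set of words}."""
    pages = defaultdict(set)
    for word, rhyme in entries:
        pages[rhyme_key(rhyme)].add(word)
    return dict(pages)


if __name__ == "__main__":
    entries = [
        ("hang", "æŋ"),   # transcribed without raising
        ("rang", "eɪŋ"),  # transcribed with raising
        ("bang", "eŋ"),   # raised, no offglide
        ("safe", "eɪf"),  # unaffected: not before /ŋ/
    ]
    for key, words in sorted(group_rhymes(entries).items()):
        print(key, sorted(words))
```

A word that rhymes with these only in some accents would still need a per-word qualifier, which a simple key table like this does not model.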
I agree that there’s no need to split these, due to the lack of minimal pairs. Americans may say rang as ‘reng’ or ‘raing’ but those aren’t actual words. The only exceptions I can think of are pang becoming peng and peng becoming ping. Overlordnat1 (talk) 10:55, 11 February 2025 (UTC)
I hear you, and I'm softened to the prospect of leaving things be (though I don't lay too much weight on my changes or suggestions). I'm still bothered by the fact that there is less information about this pronunciation. Mostly I'd bring it to anyone's attention because I think a record should be kept of this vowel change, which I know is present. Cam0mac (talk) 11:31, 11 February 2025 (UTC)
I don't know if I should make this comment here, on this (doomed) Rhymes-pages-centric proposal. But: in our {{IPA|en|...}}s (not Rhymes pages), I support acknowledging the /-eɪŋ/ (or at least the [e(ɪ)ŋ]) pronunciations of bang, sang etc, because as you say, bang and BAME share a vowel and bang and ban do not, for a lot of Americans. Rhymes pages, however — as others have noted above — intentionally overlook differences in transcription wherever these don't actually create splits; we don't have separate /əʊ/ and /oʊ/ pages, either, because the same list of words would be duplicated on each page. The traditional argument against acknowledging /-eɪŋ/ anywhere is of course that there are supposedly not minimal pairs for /æ/ and /eɪ/ in this exact position, preceding ŋ. But when it comes to {{IPA|...}}s (again, not Rhymes pages), I am of the opinion that once two phonemes are distinct somewhere in a language/dialect and we're notating them as separate phonemes there, they should be notated consistently—represented wherever else they occur in that dialect. (This is also why I support our practice of notating that e.g. German Rad ends in /-t/, instead of notating it as /-d/ and expecting casual readers to 'just know somehow' that "the phoneme /d/ means /d/ most of the time, but it secretly means the separate phoneme /t/ sometimes, but we only notate /t/ as /t/ some of the time and at other times notate it as /d/": that approach, which a few people have wanted over the years, seems to me quite hostile to readers. I similarly appreciate that our Russian pronunciations reflect its various vowels, instead of collapsing them all into one vowel as some theorists would like to.) So, because ban and BAME/bane show that /æ/ and /eɪ/ are different sounds, I am inclined to agree that we should acknowledge that both of them, not only one of them, can be found in bang. 
Or else we might as well notate the pronunciation of bang as /bæh/, since after all, many linguists have claimed there are no minimal pairs for /h/ and /ŋ/ either! (Those linguists are mistaken; there are /h/-/ŋ/ minimal pairs, but that's getting off-topic.) - -sche (discuss) 20:58, 11 February 2025 (UTC)
The uppercase version seems better because it has a smaller display box: it should be the survivor IMO. The lowercase spelling should also be the survivor. I don't know which of the two has better internal logic, consistency with our ways of doing templates, and documentation. DCDuring (talk) 14:51, 12 February 2025 (UTC)
I avoid this kind of project box in taxonomic and vernacular organism name entries. There seem to be fewer than 500 transclusions of each. DCDuring (talk) 14:56, 12 February 2025 (UTC)
I use the inline template {{comcatlite}} (and {{pedia}} and {{specieslite}}). Some people like the big box project links though. Those big rhs boxes appear in the wrong places for someone, like me, who uses rhs ToC. Even the reduced-size project boxes are unsatisfactory, for the same reason. DCDuring (talk) 02:44, 13 February 2025 (UTC)
We need the best features of all: appearance in dark mode, category linking, smaller box, lower-case title, and consistency with our code and documentation standards. DCDuring (talk) 23:36, 12 February 2025 (UTC)
While we're on this, could we also look into merging these two? This has been brought up before. I don't really know the difference between them, though @ExcarnateSojourner says they are useful for different purposes. Could we combine the boxes, and use parameters to modify any features found in one template but not in the other? I use {{Navbox}} for quotation navigation boxes. — Sgconlaw (talk) 22:05, 12 February 2025 (UTC)