Wiktionary:Information desk/2024/August


Probably doesn't need to be protected Linkyspoot (talk) 10:50, 7 August 2024 (UTC)[reply]

Can this word be added?

Some Brazilians mistranslate aprimorar as the verb "aprimorate", which doesn't exist in English. I found many uses of it on Google Scholar and Google Search. I think this accidental gap happens because some Portuguese verbs ending in -ar have English cognates (e.g. propagar and propagate). Can that word be added to Wiktionary? Davi6596 (talk) 14:17, 7 August 2024 (UTC)

Please see WT:CFI and WT:ELE to learn more about editing, and what words can be added and how. Also please see WT:About Portuguese. Vininn126 (talk) 14:22, 7 August 2024 (UTC)[reply]
It's a fake English verb that came from Portuguese, but it isn't used in Portuguese, and I read the criteria but am still unsure if it can be added because no native English speaker uses it. But some Brazilians use it in texts written in English, as you can check on Google Scholar. Davi6596 (talk) 14:38, 7 August 2024 (UTC)[reply]
The issue of code-switching vs true borrowing is a difficult-to-measure and much-debated subject. Vininn126 (talk) 14:39, 7 August 2024 (UTC)
I think this is a case of borrowing (tho it's limited and mistaken) since aprimorar was adapted to look like a Latinate English verb. Davi6596 (talk) 14:46, 7 August 2024 (UTC)[reply]
@Koavf What do you think? Sorry if mentioning you isn't allowed, I'm new here. Davi6596 (talk) 23:42, 8 August 2024 (UTC)[reply]
It's fine to ping me and since I'm both an admin and someone with some Portuguese competence, it's totally valid to seek me out, but as noted above, this is a pretty tricky case. I'd be in favor of keeping it and finishing all the conjugations but it's probably best to provide some durable sources of the word being used, too. —Justin (koavf)TCM 00:03, 9 August 2024 (UTC)[reply]
Just note that I'm talking about "aprimorate" (used by Brazilians in English) and not "aprimorar" (used in Portuguese), so there are few verb conjugations. And the sources are at the Google Scholar and Search links I put. I'll quote some sentences:
@Vininn126 It's not code-switching, because it's been adapted to English: Portuguese aprimorar has been translated as aprimorate. Theknightwho (talk) 00:29, 11 August 2024 (UTC)[reply]
@RodRabelo7: As the most recent native pt-BR speaker with whom I've interacted, do you have a perspective on this? My Portuguese is pretty elementary and it's a mix of Brazilian and European, so I'm hardly an expert. —Justin (koavf)TCM 20:36, 9 August 2024 (UTC)[reply]
It is used as an English verb in English texts, including the gerund (“regional guidelines for aprimorating the clinical management of fungal infections in pediatric patients”[1]). The relevant question (IMO) is whether the relatively few uses (many more frequent misspellings have no entry) suffice to satisfy the CFI criteria. If included, it can be labeled as {{lb|en|Brazil}}.  --Lambiam 22:07, 9 August 2024 (UTC)
I'm afraid my knowledge on Wiktionary policies isn't sufficient to help you all with this question. I would nevertheless like to point out that some English phrases have different meanings in Portuguese; see home office and print, for instance. They're probably used every now and then in English texts by Brazilians... Would they satisfy WT:CFI though? RodRabelo7 (talk) 00:42, 10 August 2024 (UTC)[reply]
I mean, "aprimorate" could be added to a "Portuglish" appendix or something else that isn't a standard Wiktionary entry. Native English speakers should be able to find the meaning of that word.
But is there someone with more knowledge on Wiktionary policies that can help us? Davi6596 (talk) 02:51, 10 August 2024 (UTC)[reply]
Let's just create the entry already. The fact it was coined via a mistake doesn't change the fact that it has seen enough use to pass WT:CFI. All this talk of code-switching and misspellings doesn't really make sense, as it's a clear example of Category:Non-native speakers' English. ludopathy is another one from Portuguese, for instance. Theknightwho (talk) 00:28, 11 August 2024 (UTC)[reply]
We even add misspellings and misconstructions of a certain frequency, so there is little argument not to add distinct English words particular to those writing from the non-English speaking world, for which we also have a label {{lb|en|NNSE}} relating its misguidedness. The issue wasn’t “code-switching vs true borrowing” but lexicalization vs. just wrong, and we add wrong, facetious, offensive entries, since they happen to be part of some people’s moderately tended lexica. Fay Freak (talk) 01:12, 11 August 2024 (UTC)[reply]

How to change the title of an article

This article has a misspelling of "संस्करण", with it being misspelled "संस्कारण" in the article. I'm not exactly sure how to change the title of an article, so can somebody else either teach me or do it for me?

https://en.wiktionary.org/wiki/%E0%A4%B8%E0%A4%82%E0%A4%B8%E0%A5%8D%E0%A4%95%E0%A4%BE%E0%A4%B0%E0%A4%A3 B.P. Koirala (talk) 18:52, 8 August 2024 (UTC)[reply]

It is possible to move a page and that will rename it, but I'm personally reluctant to do that, since I'm totally ignorant about Hindi. Note that we include variant spellings (color and colour) and occasionally misspellings (teh for the), so are you 100% certain that the current entry is not actually a word at all? —Justin (koavf)TCM 19:21, 8 August 2024 (UTC)[reply]
I suppose it is entirely possible that it's a rarer spelling variant of संस्करण, but it's definitely not the most common form. I'll look into it more later to see if I can find it being used in Hindi sources. B.P. Koirala (talk) 20:02, 8 August 2024 (UTC)
धन्यवाद. If necessary, I can contact Hindi-competent editors before moving the page. —Justin (koavf)TCM 20:33, 8 August 2024 (UTC)[reply]

Download IPA Pronunciation of Azerbaijani terms/words

Category:Azerbaijani terms with IPA pronunciation

I would like to download all the words with their pronunciations for my university project. The project involves training Montreal Forced Aligner's acoustic and G2P models for several languages, including Azerbaijani. For this, I need to download a large pronunciation dictionary, but I couldn't find how to download the entries in full or even in part. Could you please point me to a link from which I could scrape these pronunciations?

Thanks in advance.

https://montreal-forced-aligner.readthedocs.io/en/latest/user_guide/workflows/g2p_train.html

https://montreal-forced-aligner.readthedocs.io/en/latest/user_guide/workflows/train_acoustic_model.html Thegamercoder19 (talk) 10:56, 11 August 2024 (UTC)
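
One possible starting point for the request above is the MediaWiki API's categorymembers list, which can enumerate the entries in the category named in this thread. The following is only a minimal sketch: the User-Agent string and the helper names (build_params, parse_page, fetch_all_titles) are made up for illustration, and it collects entry titles only. The pronunciations themselves sit in each entry's wikitext and would need a separate extraction step, which is not shown here.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wiktionary.org/w/api.php"
CATEGORY = "Category:Azerbaijani terms with IPA pronunciation"


def build_params(category, cmcontinue=None):
    """Build the query parameters for one page of category members."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category,
        "cmnamespace": "0",  # main (entry) namespace only
        "cmlimit": "500",
        "format": "json",
    }
    if cmcontinue:
        params["cmcontinue"] = cmcontinue
    return params


def parse_page(response):
    """Return (titles, continuation token or None) from one decoded response."""
    titles = [member["title"] for member in response["query"]["categorymembers"]]
    token = response.get("continue", {}).get("cmcontinue")
    return titles, token


def fetch_all_titles(category=CATEGORY):
    """Yield every entry title in the category, following continuation tokens."""
    token = None
    while True:
        url = API_URL + "?" + urllib.parse.urlencode(build_params(category, token))
        # The User-Agent value is a placeholder; use your own contact details.
        request = urllib.request.Request(
            url, headers={"User-Agent": "mfa-dictionary-sketch/0.1 (hypothetical)"}
        )
        with urllib.request.urlopen(request) as resp:
            titles, token = parse_page(json.load(resp))
        yield from titles
        if token is None:
            break
```

Calling list(fetch_all_titles()) would walk the whole category one request at a time. For a large one-off download, the published database dumps may be a gentler option than many API requests.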

Entry page textbox case

The main lookup textbox at https://en.m.wiktionary.org/wiki/Wiktionary:Main_Page should be reprogrammed to show your word in lowercase by default, not uppercase. If you type in a word without deliberately changing it to lowercase, you often end up with a German noun (capitalized) or a proper name spelled the same way as your desired word. 2603:9000:AC00:5935:7002:E210:AEA6:B99C 18:28, 13 August 2024 (UTC)[reply]

How can I mark the separability of English phrasal verbs?

In English, there are separable phrasal verbs like try on: we can say both "try on the shoes" and "try the new shoes on".

There are also inseparable phrasal verbs like count on: we can say "count on me" but not usually "count me on".

How should I include this information in entries? 185.18.68.65 22:37, 13 August 2024 (UTC)[reply]

Currently, there is no standardized way to include this; the Usage notes section is sometimes used to display such information, e.g. turn back, hear out, run into. (Last month, there was a nice proposal to display the correct placement in the headword, although this hasn't been implemented yet.) Einstein2 (talk) 22:57, 13 August 2024 (UTC)
One could make the argument that only transitive phrasal verbs have this, and verbs such as count on are really just count + prepositional argument on (compare Polish liczyć na kogoś...) Vininn126 (talk) 08:41, 14 August 2024 (UTC)
Interestingly, the separation for count in and count out is obligatory; in (S)VO order one can say “count me in” but not *“count in me”. When pronounced, there is a difference in the stress pattern: COUNT on versus count IN.  --Lambiam 20:00, 14 August 2024 (UTC)[reply]

Etymology scriptorium

A year ago I posted the origin of a word in the Etymology scriptorium: HULLABALOO https://en.wiktionary.org/wiki/Wiktionary:Etymology_scriptorium/2023/August#Hullabaloo

What is the process for moving it to the main page? (Entry/Discussion/Citation) Ajayjo (talk) 17:15, 14 August 2024 (UTC)

How do I work with inflection-table syntax?

Hello! I noticed the example table in Template:ine-conj-impf (accessible via the "more" button) and wished to include *h₁ésmi as the active voice, first-person singular present indicative form within *h₁ésti. However, after reviewing the documentation, I'm uncertain about how to proceed. Can someone help me? Eduardogobi (talk) 03:28, 18 August 2024 (UTC)[reply]

newly created bot account flagged for anti-abuse measures

hi there - i just created User:ColumbaBushBot and was doing a test run of edits per Wiktionary:Bots

however my bot account has been flagged before i could get started... is there any way the flag can be lifted? ColumbaBush (talk) 21:42, 19 August 2024 (UTC)[reply]

Per that page, you should get consensus prior to running a bot, so I appreciate that you were just trying to do a test, but an explanation of what you're trying to do and then consensus to do it should happen first. The bot's user page at least gives a hint, but could you show a diff or two from a manual edit that shows what your bot would do so we can give informed consent. —Justin (koavf)TCM 22:03, 19 August 2024 (UTC)[reply]
Thanks Justin - Sorry I misunderstood, I was under the impression that i first needed to do a test run "on some 10–50 entries" before requesting bot status at Wiktionary:Votes
Right now the bot account itself is banned from even making manual edits.
Where do I go from here? Should I start a convo in Beer parlour and try to get consensus for this bot? ColumbaBush (talk) 22:17, 19 August 2024 (UTC)
I apologize myself: I made it complicated. You are correct that it lists a test run. Let me see why the bot account was unable to edit. —Justin (koavf)TCM 22:20, 19 August 2024 (UTC)[reply]
This is confusing to me: the bot isn't actually blocked as an account and I don't see any prohibited edits in the abuse log. Which page(s) did you try to edit? —Justin (koavf)TCM 22:21, 19 August 2024 (UTC)[reply]
you're good, so i actually was able to make a manual edit just now https://en.wiktionary.org/w/index.php?title=%DC%9A%DC%AC%DC%90&diff=prev&oldid=81326284 (previously not allowed) but when i run pywikibot via
python pwb.py template.py 'aii-infl-noun-m-fempl' 'aii-infl-noun/m-fempl' it says
WARNING: You can't edit page en:ܐܬܐ
also - our paths crossed online probably 10-15 years ago, i hope you're doing well ColumbaBush (talk) 22:28, 19 August 2024 (UTC)[reply]
Weird. The abuse log for that entry doesn't show anything, so that makes me think that maybe the problem is on the end of the bot itself: that it's somehow misconfigured or is trying to... access the wrong API or something? Sorry, I'm not a Python guy. :/ Maybe they can help at the Grease Pit?
And it's nice to run into you again, friend. "Well" is a strong word, but I am doing, that's for sure. —Justin (koavf)TCM 22:34, 19 August 2024 (UTC)[reply]
thanks so much for your help - i forgot to select "Edit existing pages" under https://www.mediawiki.org/wiki/Special:BotPasswords/ so it's working now
glad you are "doing" (sometimes that's all we can hope for) ColumbaBush (talk) 22:45, 19 August 2024 (UTC)[reply]

Proto-Baltic

Why is a "Category:Proto-Baltic language" page missing, as well as "Category:Proto Baltic lemmas" with nouns, adjectives, numerals, verbs, adverbs etc.?

Moreover, when I put a reconstructed Proto-Baltic word in an etymology (a red link), why does Wiktionary only allow me to create a new page called "Reconstruction:Proto-Balto-Slavic/*..." instead of the correct "Reconstruction:Proto-Baltic/*..."? I am using the correct code for the language, which is 'bat-pro', while Proto-Balto-Slavic is 'ine-bsl-pro'!

I'll make a concrete example:

  • Proto-Baltic: *ainas

Now check the red link. Cicognac (talk) 09:26, 24 August 2024 (UTC)[reply]

I don't know, but w:Balto-Slavic languages and w:Baltic languages indicate that the existence of a Baltic clade is debated: some linguists (though seemingly not a majority) would consider Baltic a paraphyletic grouping. Wiktionary:About Proto-Balto-Slavic refers to "the uncertainty of the existence of a separate Baltic branch".--Urszag (talk) 09:40, 24 August 2024 (UTC)[reply]
Should we create pages and categories for Proto-Baltic reconstructions, then? Some reconstructed words are in fact available (many of these sources are in Lithuanian, though, but I can try to translate them since the text can be copied and pasted).
As for the technical issue, I could try to contact somebody else. I mean, PBS and PB are two different proto-languages. Cicognac (talk) 10:13, 24 August 2024 (UTC)[reply]
If Baltic is not a clade, then Proto-Baltic never did exist as a separate language from Proto-Balto-Slavic. (This would be the case for example if the split of West and East Baltic occurred around the same time or earlier than the split of Slavic from Baltic.) As far as I can tell, the current position of Wiktionary editors has been to take a conservative stance and avoid reconstructing Proto-Baltic. But we should wait to see what someone actually active in editing these languages says.--Urszag (talk) 11:18, 24 August 2024 (UTC)[reply]
Well, at the moment I am among the very few people editing PB (and a bit of PBS) "-.- As far as I know, PB is assumed by the majority to have split from PBS, and it is understood as a unified language before the Western and Eastern Baltic languages split, reconstructed through Old Latvian, Old Latgalian, Old Lithuanian and Old Prussian (which is scarcely attested but precious). The only one who did a great job on PB in Wikipedia is @Ed1974LT, whom I cooperated with yesterday in improving numerals in PBS and PB (I'll call him on his user talk page). Others are @SeriousThinker, @gnosandes (he is not so active recently) and @Rua. We can try to decide whether or not to put new pages about Proto-Baltic reconstructions here. Cicognac (talk) 11:58, 24 August 2024 (UTC)
How can you edit the Proto-Baltic language when one does not exist? It is decided that we don't make PB lemmas. Proto-Baltic is an etymology-only language, that's why it links to PBS. You are making a mess. Sławobóg (talk) 18:36, 24 August 2024 (UTC)
I don't think that the Wikipedia article for Proto-Baltic can serve as guidance in any way for Wiktionary. It is stuck in an antediluvian stage with things like "Linguists are considering the possibility of present-day Baltic and Slavic languages having a common point of linguistic development." The possibility! –Austronesier (talk) 19:53, 24 August 2024 (UTC)
I only put numbers from 1 to 10, I don't think I'm making a mess since I stopped after 5 minutes and asked here. @Austronesier I can use etymological dictionaries as a guidance (I found at least two about PB), but you are telling me that the status of PB is still problematic in 2024. That being the case, I won't add PB etymologies. If this is a guideline regarding PB, you should notify it somewhere so that users won't add whole new pages about PB, just to name one. It's nice to see people participating in this discussion! Cicognac (talk) 20:22, 24 August 2024 (UTC)[reply]
One problem is that there's a complicating layer of nationalism involved, somewhat like with Serbo-Croatian. As I understand it, the concept of a common proto-language was used during the Soviet era to subordinate the Baltic languages to Slavic and neutralize them as a force of nationalism in the Baltic states. After independence, it was hard to accept anything that even superficially resembled that. How much that has influenced the consensus among Latvian and Lithuanian scholars is hard for me to say, but we should be aware of the possibility. Chuck Entz (talk) 21:14, 24 August 2024 (UTC)[reply]
@Cicognac it's really cool that you added Proto-Baltic words to the 1-10 entries. Don't remove them. I'd like to see more reconstructed Proto-Baltic words like these in Proto-Balto-Slavic entries. It's probably better to use *w instead of *u̯ and *j instead of *i̯. AshFox (talk) 08:09, 25 August 2024 (UTC)[reply]
Yes, but I am told by our colleagues that PB is still problematic for a number of reasons, even if there are dictionaries about PB and Baltic etymologies! I could investigate more about the debate around PB and maybe put all the different opinions on Wikipedia with sources. But I don't want to violate guidelines, if the guidelines forbid users from adding reconstructions in problematic proto-languages (even if the reconstructions are sourced). I can keep the numbers (they are very basic words), but I am not willing to add more PB etymologies or brand-new pages until a consensus or a compromise solution is reached. Of course, I don't ignore the existence of PB as a reconstructed language and I'm open to compromises. Yes, I know that PB uses a different way of transcribing *w and *j, but I opted for the kind of transcription often used by Balticists, which in turn is used in Wikipedia. I took this transcription as a rule/guideline instead of merging it with the PBS transcription, which actually uses *j and *w. Cicognac (talk) 08:57, 25 August 2024 (UTC)
I talked with a user active in the field of PB; he suggested that I use the traditional transcription and not an updated transcription, even if the updated one is easier to read and type and is nearer to PIE and PBS. He stressed tradition a lot, since "we are following a tradition". But I am still not sure whether or not to add PB reconstructions (let's only consider reconstructions in etymologies and NOT new pages), even if they are sourced and reconstructed by eminent linguists such as Maziulis, just to name one (he seems neutral to me; moreover, he of course accepts the existence of both PB and PBS and published a lot of works over a long timespan). I was waiting for this discussion to proceed a bit further. Cicognac (talk) 18:30, 25 August 2024 (UTC)

table of Old Chinese → Middle → modern sounds

Indo-European sound laws has tables concisely showing the common outcomes of various PIE sounds in various IE languages. Is there anywhere I can find such a table showing the theorized development of various Old Chinese sounds to various modern Chinese varieties' sounds?
If not, would it be possible to generate one from {{zh-pron}}'s data? It seems like the data is already present that "character X has been reconstructed as being pronounced like 1 in Old Chinese, like 2 in Middle Chinese, and is pronounced 3 in Mandarin and 4 in Min (etc)" : could that be used to generate, if not tables of "OC *X- normally produces MC *x-, which normally/most commonly produces Mandarin k-" (or whatever), then at least just a bunch of (sortable) tables of all the cases where OC pronunciations are present in the data, like:

  • character X - OC: /foobar/ - MC: /fooba/ - Mandarin: /fubar/ - Min: /fuba/
  • character Y - OC: /foobaz/ [etc etc]

? - -sche (discuss) 20:13, 25 August 2024 (UTC)[reply]

The best and most updated and innovative reconstruction of Old Chinese is available in "Old Chinese: a New Reconstruction" by Baxter and Sagart (2014); you can download it for free on the internet. Currently I am working on this for the second time (the article in the Italian Wikipedia is mine, I'm rewriting it to improve it). The best reconstruction for Middle Chinese is made by Baxter, 2011, and you can download it again from the internet. To find the development of sounds in Middle Chinese, you should start from the position of the character in a famous rhyme table called Qieyun, then Guangyun; then you must take into account the Shanghainese dialect (very conservative) and both general and Taishan Cantonese dialects (very conservative); finally, you must take into account the pronunciation of a Chinese loanword in a Sino-Xenic language (Ancient Korean, Japanese and Vietnamese: see hanja, kanji and han tu) in the oldest glosses and dictionaries.
As for Old Chinese, you must again take into account the Chinese Qieyun and Guangyun, the OC substrata in nearby languages (Korean, Japanese, Vietnamese, Thai, etc. just to name some), the tone in Qieyun (e.g., rising tone develops from the loss of a glottal stop at the end of the syllable), the syllables used mostly in loanwords from Sanskrit during Middle Chinese, proto-Min (a conservative variety that emerged from OC) and the existence of pre-Qin varieties of a sinogram (then, don't forget that the pronunciation of a character in oracle bones is different from the pronunciation of a newly coined character during the Qin and Han empires). You should try not to use the Shuowen Jiezi since it reflects pronunciation during late Han Chinese and it takes into account later versions of sinograms.
The romanization by Baxter only offers some information about the pronunciation: it's not the actual pronunciation, but it gives a nice idea of how a character could be pronounced.
This is just to start. Cicognac (talk) 03:20, 26 August 2024 (UTC)[reply]
Thanks for your informative response! It sounds like what you're doing is different from what I'm proposing to do. For a large number of characters, the English Wiktionary already contains the information that they have been reconstructed by so-and-so as having such-and-such pronunciation in Old Chinese, Middle Chinese, Mandarin, etc; this is how {{zh-pron}} is able to display "Middle Chinese: kwaeng, Old Chinese: (Baxter–Sagart): /*[k]ʷˤraŋ/, (Zhengzhang): /*kʷraːŋ/" in the entry on even though the only input is just a "y" ("yes", flipping a switch to display the info). I am not suggesting anyone duplicate the efforts of Baxter et al. in comparing Taishanese and the Qieyun etc; I am asking — since Baxter et al. have already done that, and have already reconstructed pronunciations, which we already have in our Chinese data modules — whether anyone can take that information which we already have, and instead of outputting it one character at a time and only on the page for each character, either (1) output it into one or more large (sortable) tables, from which it should be possible to observe what the regular outcome of a given OC initial [etc] in a given environment is in MC, Mandarin, Min, etc, or even (2) extract that information in (semi-)automated fashion (perhaps a script could look over the data and output things like "entries with OC a- list the following MC initials: a- (4602 cases), b- (23 cases), c- (11 cases)") without requiring a human to manually look over tables. I appreciate the list in the Baxter paper you reference, which does show various OC pronunciations, MC pronunciations, and modern Mandarin pinyin for various characters, which is very helpful, although a table of individual phonemes (and more varieties of Chinese) like the PIE table would be ideal IMO. - -sche (discuss) 07:42, 26 August 2024 (UTC)[reply]
Yes, I understood. I don't know if we can do something similar. Proceeding mechanically can be dangerous, since for instance two characters that had the same pronunciation in EMC can have a different pronunciation in OC; one sound in OC can have multiple reflexes in EMC and vice versa. That's why I told you every possible source to reconstruct OC and EMC even if you want to do a mechanical and automated thing. I am a bit worried that putting individual pronunciation is the best solution at least for OC. What you suggest can work in EMC, which is more regular, or it can work in inflection tables... but EMC is a language without pronunciation and OC had a small amount of morphology. Chinese languages are not like Indo-European languages. Cicognac (talk) 12:58, 26 August 2024 (UTC)
As for the latter part of your message, creating pages with criteria like "Chinese syllables starting/ending with sound XXX in Old Chinese" or "Old Chinese type A syllables" (they display pharyngealization in the Baxter-Sagart reconstruction) vs "Type B syllables" or "Chinese syllables derived through morphological prefix *m-" or "suffix *-s" etc. can be a good idea just to make some comparisons, if the creation of such category pages is well-motivated. I hope I understood everything and answered everything. Cicognac (talk) 13:03, 26 August 2024 (UTC)
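
The tallying idea raised earlier in this thread (reporting, for each OC initial, which MC initials occur and how often) could be sketched roughly as follows. This is only an illustration under stated assumptions: the records are invented placeholders in the spirit of the "character X" examples above, not real reconstructions, and the field names (oc_initial, mc_initial) are hypothetical and do not reflect how the Chinese data modules actually store their data. As cautioned above, the counts would only describe what the data lists and could not stand in for a reconstruction.

```python
from collections import Counter, defaultdict

# Placeholder records (NOT real reconstructions); a real run would first have
# to extract initials from the existing pronunciation data, which is not shown.
RECORDS = [
    {"char": "X", "oc_initial": "a", "mc_initial": "a"},
    {"char": "Y", "oc_initial": "a", "mc_initial": "a"},
    {"char": "Z", "oc_initial": "a", "mc_initial": "b"},
    {"char": "W", "oc_initial": "b", "mc_initial": "b"},
]


def tally(records, src_key, dst_key):
    """Count, for each source-stage value, how often each later-stage value occurs."""
    table = defaultdict(Counter)
    for record in records:
        table[record[src_key]][record[dst_key]] += 1
    return table


def format_report(table, src_label, dst_label):
    """Render the tallies as one summary line per source-stage initial."""
    lines = []
    for src in sorted(table):
        outcomes = ", ".join(
            f"{dst}- ({count} cases)" for dst, count in table[src].most_common()
        )
        lines.append(
            f"entries with {src_label} {src}- list the following "
            f"{dst_label} initials: {outcomes}"
        )
    return lines


if __name__ == "__main__":
    for line in format_report(tally(RECORDS, "oc_initial", "mc_initial"), "OC", "MC"):
        print(line)
```

The same tally function could be pointed at any pair of stages (for example MC to Mandarin) by changing the two key names, so irregular outcomes show up as low-count entries that a human can then inspect.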

How do I set a noun to Dutch?

Like in the "en-noun" form. I put "dum" and it was Middle Dutch. How do I make it regular Dutch? (i'm kind of a newbie so don't call me dumb or criticize me)

Got it from https://www.loc.gov/standards/iso639-2/php/langcodes-keyword.php?SearchTerm=dutch&SearchType=ALL EliteSlimeJumpingAround (talk) 23:42, 26 August 2024 (UTC)

The language code for Dutch is nl. See WT:LOL for all language codes.
Stujul (talk) 07:29, 27 August 2024 (UTC)[reply]