Wiktionary:Information desk/2020/April

How to create verb inflection table templates.

I am a somewhat experienced user, but I know next to nothing about the computer engineering side of Wiktionary. I want to add verb inflection tables for Munsee Delaware (and others as well). I am totally able to create such tables linguistically, but really do not understand how to make them appear on Wiktionary. Can someone run me through the process? I don't know what kind of scripts or computer languages it involves, and I really don't know much HTML, Lua or JavaScript.

A verb inflection table is crucial for being able to add entries in Algonquian languages, whose vocabulary is very verb-heavy, so it is essential that such templates exist. Thanks — This unsigned comment was added by Hk5183 (talk • contribs) at 20:24, 3 April 2020.

Would the specific inflected forms of a given lemma be added by hand, or are they best computed by an algorithm according to a limited number of paradigms (like for example in French, if you encounter the verb acalifourchonner, even if you have never heard of this and have no idea what it means, you do know that its third person plural past imperfective indicative is acalifourchonnaient)? In the first case, there are templates that you can use as a model. In the second case, you will either need to learn some Lua or recruit someone to do the coding. --Lambiam 23:06, 4 April 2020 (UTC)[reply]

Can you tell me how difficult it would be to program rules for a language with somewhat complex verb morphology? The inflected forms could definitely be added by hand (this is what I have done so far), but as there are countless thousands of possible verb forms this would be very tedious. While an algorithm could certainly produce proper forms, there are quite a few orthographic conventions which mean that certain letters combine with prefixes and suffixes to form diphthongs which can then further combine. Would it make sense to use a blank verb template for all verbs, or to make many templates for the many different types of stems? I do not know Lua well enough to encode complex spelling rules. For example, certain letters change when preceded by specific letters, such as N. After N: /t/ --> /d/; /k/ --> /g/; /s/ --> /z/; /sh/ --> /zh/; /ch/ --> /j/... Additionally, many verbs have "unstable stems", which means that in some cases the verb stem itself changes. For example, take the Animate Intransitive present indicative conjugation of the verb (stem form) /-amangíixsi-/ "to talk in a loud voice". The first and second person forms can be broken down semantically into the personal prefixes /n-/ (1st person), /k-/ (2nd person) and the verb stem /-amangíixsi-/: Nŭmamangíixsi, Kŭmamangíixsi. In the 3rd person form, however, person is not marked by a prefix, but by a suffix, /-w/. The final /i/ of the stem blends with the 3rd person suffix in all such 3rd person singular unstable stem forms ending in /ii/, and is realized as a /u/, producing the form amangíixsuw "he/she talks in a loud voice". Hk5183 (talk) 20:58, 1 May 2020 (UTC)[reply]
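As a rough illustration of how rules like these might be encoded, here is a minimal sketch in Python (an actual Wiktionary module would be written in Lua). The function names and the rule table are hypothetical and cover only the two rules described above: voicing of a consonant directly after lowercase n, and the 3rd person singular of a stem ending in i. It is not a full treatment of Munsee morphology.

```python
# Illustrative sketch only: names and rule coverage are assumptions,
# not an existing Wiktionary module.

# Consonants that change when directly preceded by "n".
VOICING = {"sh": "zh", "ch": "j", "t": "d", "k": "g", "s": "z"}
# Digraphs are tried first so that "sh" is not read as "s" + "h".
ORDER = ("sh", "ch", "t", "k", "s")


def voice_after_n(word: str) -> str:
    """Apply the voicing rule to any listed consonant that follows 'n'."""
    out = []
    i = 0
    while i < len(word):
        ch = word[i]
        out.append(ch)
        i += 1
        if ch == "n":
            for src in ORDER:
                if word.startswith(src, i):
                    out.append(VOICING[src])
                    i += len(src)
                    break
    return "".join(out)


def third_person_singular(stem: str) -> str:
    """Add the suffix -w; a stem-final i merges with it and surfaces as u."""
    stem = stem.strip("-/")
    if stem.endswith("i"):
        return stem[:-1] + "uw"
    return stem + "w"


print(third_person_singular("-amangíixsi-"))
print(voice_after_n("nt"), voice_after_n("nsh"))
```

Keeping each spelling rule in its own small function would let a single general template call them in sequence, rather than needing a separate template for every stem type.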

IP transcribing Hepburn's English-Japanese dictionary

I found Wiktionary:Waei Gorinshūsei 1910 and subpage Wiktionary:Waei Gorinshūsei 1910/1 which is linked from Wiktionary:Public domain sources. Apparently it is the seventh edition of Hepburn's A Japanese-English and English-Japanese Dictionary. It was started by an IP 8 years ago, and that anonymous editor was able to transcribe only one page, Page 1.

Should these two be moved to an Appendix page? ～ POKéTalker（═◉═） 11:41, 4 April 2020 (UTC)[reply]

It should be moved to Wikisource. —Suzukaze-c ◇◇ 02:19, 9 April 2020 (UTC)[reply]

Translations that are SOP

Hi, what's the procedure for adding a translation where the translation isn't a single word but the term in the foreign language is SOP and unlikely to ever get an entry of its own? I've seen this a few times, but as an example my most recent encounter was while adding translations at Cistercian—Chinese and Korean have no single word to my knowledge meaning "a Cistercian" and will instead say "monk of the Cistercian Order" (시토회 수도사 in Korean). But the Korean is straightforwardly just 시토회 (sitohoe, “Cistercian Order”) + 수도사 (sudosa, “monk”). It would make sense to just link the words separately but not sure how to do that properly (two t templates?). Nizolan (talk) 13:17, 11 April 2020 (UTC)[reply]

{{t|ko|[[시토회]] [[수도사]]}}. —Suzukaze-c ◇◇ 22:43, 11 April 2020 (UTC)[reply]

@Suzukaze-c: Thanks, have corrected accordingly. Didn't think to check since I assumed link syntax would break it. Nizolan (talk) 23:13, 11 April 2020 (UTC)[reply]
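For anyone preparing many such translations offline, the pattern in the reply above can be generated mechanically. This is a minimal Python sketch; the helper name is hypothetical, and it only reproduces the wikitext shape shown in this thread.

```python
def translation_with_links(lang_code: str, words: list[str]) -> str:
    """Build a {{t}} call whose term links each component word separately."""
    linked = " ".join(f"[[{word}]]" for word in words)
    return "{{t|" + lang_code + "|" + linked + "}}"


print(translation_with_links("ko", ["시토회", "수도사"]))
```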

Word frequency

I am new on this platform and have a question. Not sure if this is the way to trigger replies.

With a colleague I wrote a book in Dutch on how to deal with dilemmas. Title "Dansen met Dilemma's - Op weg naar wederzijdse winst". (see www.dansenmetdilemmas.nl) We are now considering having it translated into English but need to adapt some of its content to an English-speaking audience.

In a certain paragraph we discuss the meaning of the word dilemma and graphically list the frequency of usage of the word in articles (in our case the frequency in the NRC newspaper over the last 10 years). We compare its usage to that of words with a similar meaning: paradox, contradiction, conflict.

In the English version we may limit ourselves to just mentioning the frequency number of those words. How do I find the frequency number of those words? For instance in the Project Gutenberg list or in TV programs? How can I sort the list in alphabetical order?

Thanks Allard Everts Evertsal (talk) 14:00, 14 April 2020 (UTC)[reply]

Those words have rather different meanings. In any case, you should probably use Google Ngrams results. —Μετάknowledge^{discuss/deeds} 14:26, 14 April 2020 (UTC)[reply]

Thank you so much. Didn't know this existed. Indeed the meaning of the words is very different, but each of them is used to indicate a situation of tension. For your interest: many times the words dilemma and paradox are used as synonyms. This is utterly wrong, but such is the way people (mis)use words. The Google statistics are not very recent, but they show the same trend as what we have in our book (2005-2016 statistics from the major Dutch newspaper). I could still approach the Guardian or the Financial Times, but for now this is perfect. Thanks again Evertsal (talk) 15:08, 14 April 2020 (UTC)[reply]

For future reference, another platform for asking questions is the Wikipedia:Reference desk/Language. --Lambiam 20:42, 14 April 2020 (UTC)[reply]
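If a plain-text corpus is available locally (for example, a collection of newspaper articles), a comparable frequency figure can be computed directly. The following is a minimal Python sketch under that assumption; the function name and the simple tokenisation are illustrative choices, and it does not query Google Ngrams or any other service.

```python
import re
from collections import Counter


def per_million(text: str, words: list[str]) -> dict[str, float]:
    """Return how often each word occurs per million tokens of the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    return {
        word: (counts[word.lower()] * 1_000_000 / total if total else 0.0)
        for word in words
    }


sample = "A dilemma is not a paradox, and a conflict is not a dilemma."
for word, freq in sorted(per_million(sample, ["paradox", "dilemma", "conflict"]).items()):
    print(word, round(freq))
```

Sorting the resulting dictionary by key, as in the loop above, gives the alphabetical ordering asked about.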

Performing bulk edits

I'm an NLP researcher who uses Wiktionary to collect pronunciation data. As part of this effort we have noticed various inconsistencies (see also here) in phonemic transcriptions, and in one case even developed a spreadsheet with fixes. However, there are often thousands of such fixes. Does there exist any tool or API that could allow us to apply bulk edits? Our team has the technical expertise to make use of such tools if they in fact exist. Kylebgorman (talk)

@Kylebgorman: Yes; see WT:BOT. However, you would need to create a vote to get your bot approved, and I would also like to see confirmation from a fluent Georgian speaker that these fixes are correct. —Μετάknowledge^{discuss/deeds} 16:50, 15 April 2020 (UTC)[reply]

Thanks for that. Some Georgian grammars use one, some use the other, but none have both. And it seems that they're not separate phonemes, they're just in free variation or something like that. I have a student who works with Georgian speakers and I'll ask if we can get an official opinion. Kylebgorman (talk) 18:03, 15 April 2020 (UTC)[reply]

@Kylebgorman: I think rather than manually editing transcriptions, it would be better to replace {{IPA}} with {{ka-IPA}} to automatically generate the transcription. Then if the transcription system needs to change, it requires a single edit to the module rather than many edits to entries. Some of the words in the TSV have already been switched over to {{ka-IPA}} and therefore don't need fixing (but not all based on this search). — Eru·tuon 18:07, 15 April 2020 (UTC)[reply]

@Erutuon: What you said about {{ka-IPA}} and Georgian...would this not also apply to {{bg-IPA}} and Bulgarian (which also has noted inconsistencies along similar lines) and also the Lithuanian pronunciation module (there's no template yet, but once again, there are known and seemingly random inconsistencies)? — This unsigned comment was added by Kylebgorman (talk • contribs) at 22:32, 15 April 2020 (UTC).[reply]

Well, not all languages can be applied without human intervention; respelling or special parameters may be necessary, like pitch accent in Lithuanian. Georgian is apparently a case where this can be done automatically, according to Giorgi. —Μετάknowledge^{discuss/deeds} 23:58, 15 April 2020 (UTC)[reply]

In general, language-specific IPA templates make it easier to keep transcriptions consistent, so it's a good idea to switch over to them when they exist, though it's not always possible to do it using a bot, as Metaknowledge says. {{fr-IPA}} is a good example; it often requires manual input because French orthography isn't completely phonemic. — Eru·tuon 06:45, 16 April 2020 (UTC)[reply]

@Kylebgorman: the standard way to add pronunciation to a Georgian article here is to use Module:ka-IPA which is a simple one-to-one mapping from Georgian to IPA. The inconsistencies you guys noticed are due to some articles not using this module.

BTW, geo.tsv seems wrong. It claims აალებადი has this IPA ɑɑlɛbɑdɪ which it does not.

I give a green light to anyone who owns a bot to change all articles not using {{ka-IPA}} to use one. I can even give the interested person a javascript snippet I have been using to correct entries semi-automatically. Giorgi Eufshi (talk) 18:09, 15 April 2020 (UTC)[reply]

@Giorgi Eufshi, Kylebgorman: I'm running a bot on it now! User:Aryamanbot. Let me know if you spot anything weird. —Aryaman^A ^{(मुझसे बात करें • योगदान)} 00:39, 19 July 2020 (UTC)[reply]

I guess I should ping @Dixtosa as well. —Aryaman^A ^{(मुझसे बात करें • योगदान)} 00:47, 19 July 2020 (UTC)[reply]

I thank you Aryaman! Giorgi Eufshi (talk) 13:28, 25 July 2020 (UTC)[reply]
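The replacement discussed in this thread can be pictured as a text substitution over each entry's wikitext. The Python sketch below is not the code Aryamanbot ran. It assumes the manual transcription is written in the form {{IPA|ka|...}} with no nested templates, and that a bare {{ka-IPA}} is enough for the module to generate the transcription from the page title. Both assumptions would need checking against real entries before any bot run.

```python
import re

# Assumed shape of a hand-written Georgian transcription: {{IPA|ka|...}}
# with no nested templates inside it.
MANUAL_KA_IPA = re.compile(r"\{\{IPA\|ka\|[^{}]*\}\}")


def switch_to_ka_ipa(wikitext: str) -> tuple[str, int]:
    """Replace manual Georgian IPA calls with {{ka-IPA}}; return text and count."""
    return MANUAL_KA_IPA.subn("{{ka-IPA}}", wikitext)


new_text, changed = switch_to_ka_ipa("* {{IPA|ka|/kʰɑrtʰuli/}}")
print(new_text, changed)
```

Returning the number of substitutions makes it easy to skip saving pages where nothing changed.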

Creating word-forms en masse using ACCEL

I would like to ask if it is possible to create word-forms en masse using WT:ACCEL. Let's say the conjugation table of a word generates 100 word-forms that could be created using ACCEL (i.e. by opening the pages one by one and saving the pages). Instead of going through the tedious process of opening and saving 100 pages, is it possible to quickly create all the word-forms generated by the table (given that I have confirmed the correctness of the word-forms, of course)? I am aware of the existence of bots, but I don't have the skills to create a bot. Jonashtand (talk) 10:21, 16 April 2020 (UTC)[reply]

I think it would be easy and useful to have a bot that just does what WT:ACCEL does. It would call the modules directly and generate the entries, then save. —Rua (mew) 11:41, 18 April 2020 (UTC)[reply]

@Rua: Not all languages are equally reliable. But you do have the ability to do that kind of bot run, and I would support it as long as you only do it for languages where a contributor has confirmed that the entries are generally reliable. —Μετάknowledge^{discuss/deeds} 19:22, 22 April 2020 (UTC)[reply]

Would it be possible for someone to do this without relying on another party to run a bot? Editor clicks "create accelerated forms" on a table, and whatever it is does its thing. Perhaps it could make edits in that editor's name, instead of having a bot account. —Rua (mew) 20:13, 22 April 2020 (UTC)[reply]

Help:Language inflection bot is a decent starting point for botting. --Vitoscots (talk) 00:05, 23 April 2020 (UTC)[reply]

@Rua: That doesn't really help with the problem at hand, e.g. hundreds of Swahili nouns that need plurals, etc. —Μετάknowledge^{discuss/deeds} 02:30, 23 April 2020 (UTC)[reply]

@Vitoscots Cool! I wasn't aware of such a help page. Thanks! Jonashtand (talk) 19:11, 23 April 2020 (UTC)[reply]

Is there a way to search by Middle Chinese Initials?

Either the characters themselves, or the various transliterations. I found Module:ltc-pron, and it seems like there should be some way to find entries based on that, but I have no clue if that's possible. Kiragecko (talk) 15:35, 17 April 2020 (UTC)[reply]

What was the word for "insect" in Middle English?

"Insect" sounds scientific and Latinate, and "bug" feels like a modern Americanism. What would Chaucer have called this type of organism? Equinox ◑ 09:25, 18 April 2020 (UTC)[reply]

beastie? --Vitoscots (talk) 09:32, 18 April 2020 (UTC)[reply]

Nah that is any kind of animal. That could be cats or dogs. Equinox ◑ 11:36, 18 April 2020 (UTC)[reply]

If I had to fake a Chaucer text, I'd call them creepie-craulies or beastlete. --Vitoscots (talk) 15:25, 18 April 2020 (UTC)[reply]

Perhaps it was still, like in Old English, wyrm – not yet specialized to the legless creepy-crawlies.

Maybe more specific terms were more common, like bitle (see the etymology of beetle) or flye? Or maybe there was an SOP umbrella term for them all.... Andrew Sheedy (talk) 17:55, 18 April 2020 (UTC)[reply]

wyrm most definitely did refer primarily to worm-shaped animals, as that is the primary meaning in all the Germanic languages. What the Germanic languages also have in common is the lack of an umbrella term for all insects. It's quite likely that speakers of the time simply didn't consider them to have much in common, other than all being small. In other words, not only is the word "insect" modern, but the concept, too. —Rua (mew) 15:32, 19 April 2020 (UTC)[reply]

It's definitely not a match for the modern concept, but it does seem like worm might be the closest term in Middle English, where it could be used in a broad sense to include all 'crawling things' (arthropods, worms, and reptiles). Taking some quotes from the MED, worms can be

spiders ('The venymous spyþur hatte aranea and is a worme')
scorpions ('Scorpiun is a cunnes wurm')
grasshoppers ('If hungir were sprungyn in þe lond & pestilence…& locust & werm [L bruchus]')
moths, six-legged and many-legged arthropods, salamanders, snakes ('Of wormes beþ many maner diuerse kyndes…somme beþ water wormes and somme beþ londe wormes, And of þilke some beþ in herbes and in wortes, as melschragges and oþere suche…and some in cloþes, as moþþes…and among wormes, some beþ footeles, as addres and serpentes, and some haueþ many feete, and some haueþ sixe feete, And some beþ…enemyes to mankynde, as serpentes and oþere venemous wormes…And some wormes…beþ ygendred and gendreþ nouʒt, as þe salamandra')
fireflies ('To make a continuall lyght withoute fyre…Take…wormys that schynen anyghte tyme in the ffeeldys')
ants and flies ('Many wormes he made also, As amptis, flies and oþir mo')

There's about half as many quotes for worm in this broad sense as for worm in reference to worm-shaped animals, so it wasn't the primary meaning by any stretch, but still pretty common. — Vorziblix (talk · contribs) 16:15, 12 May 2020 (UTC)[reply]

Maybe it didn't appear as much in formal writing? "Bug" is a very common word, but is typically limited to an informal register in the general sense. Perhaps "worm" was the general word for bug among the illiterate classes. Andrew Sheedy (talk) 02:55, 13 May 2020 (UTC)[reply]

Mount Niqiu now as Mount Nishan

Hello, can anyone explain, if they know or have some information, why Mount Niqiu was changed to Mount Nishan? Just curious in my Confucius studies...thanks

A better name than either in English is Mount Ni. The terms 丘 (qiū) and 山 (shān) are synonyms, both meaning “mount”, “hill”. So Mount Nishan amounts to “Mount Mount Ni”. I don’t know the reason for the Chinese name change. Perhaps to avoid confusion with the town by the name of Niqiu located some 300 km to the south. The spelling in Chinese characters of the two Niqius is different, though, so Chinese pilgrims wouldn't have been confused. --Lambiam 22:49, 19 April 2020 (UTC)[reply]

According to this archived web page, the name change was to avoid a taboo on the name Qiu, which, if I understand correctly, was given to Confucius. Someone who understands Chinese can probably be of more help. --Lambiam 23:16, 19 April 2020 (UTC)[reply]

The same taboo explanation is offered in this book. --Lambiam 23:25, 19 April 2020 (UTC)[reply]

And here is that story again, now in English. --Lambiam 13:22, 20 April 2020 (UTC)[reply]

Why is there no category for homonyms?

We're having a discussion over at w:Wikipedia:Articles for deletion/List of true homonyms and are curious why there is no category for homonyms. Also, could a bot take words copied and pasted over somewhere, search for those words on Wiktionary, and add that category automatically? I searched around and see you have a template at Template:nyms that lists various other categories that could be added as well. Dream Focus (talk) 18:49, 21 April 2020 (UTC)[reply]

@Dream Focus, we have Category:English terms with homophones and Category:English terms with multiple etymologies, which cover both senses of homonym. — Ungoliant ^(falai) 18:36, 22 April 2020 (UTC)[reply]

RQ

In templates like {{RQ:Milton Paradise Lost}}, which I've been using for ages now, what do the letters RQ stand for? This is out of pure curiosity. Reference Quote?--Vitoscots (talk) 11:14, 22 April 2020 (UTC)[reply]

Well, R is the prefix for reference templates, and Q is the template (previously a template prefix) for quotation templates, so I suppose so. —Μετάknowledge^{discuss/deeds} 19:20, 22 April 2020 (UTC)[reply]

Avestan font?

On ire, I get "this font is missing" boxes for Avestan in the section "Etymology 2". How do I get to see it correctly? --Palnatoke (talk) 15:57, 22 April 2020 (UTC)[reply]

@Palnatoke: You need to download an Avestan font, or use a browser that already applies one for you (as I believe Safari does, for example). I don't think there's anything we can do from our side. —Μετάknowledge^{discuss/deeds} 19:18, 22 April 2020 (UTC)[reply]

@Palnatoke: Two fonts you can download and install: Ahuramazda and Noto Sans Avestan. These should be automatically used on Wiktionary at least once you have installed them and maybe restarted your browser. (They are assigned to Avestan-script text in MediaWiki:Common.css.) — Eru·tuon 05:51, 23 April 2020 (UTC)[reply]

Thank you very much, both of you. Not that I can read Avestan, but I do find the "this font is missing" boxes quite annoying. --Palnatoke (talk) 10:58, 23 April 2020 (UTC)[reply]

Now I have Avestan on the page, but in the edit box I still have the boxes. --Palnatoke (talk) 11:05, 23 April 2020 (UTC)[reply]

@Palnatoke: It might be browser-specific behavior. My browser is Firefox and it automatically applies Noto Sans Avestan to Avestan-script text in the edit box. (It tends to get fonts right more often than I remember Chrome doing.) Maybe you have a different browser? — Eru·tuon 19:31, 23 April 2020 (UTC)[reply]

Sounds plausible. I use Brave, which is Chromium-based. Oh well. Thank you for the help anyways. --Palnatoke (talk) 20:25, 23 April 2020 (UTC)[reply]

Loanwords list

How can I get a list of loanwords? Borneq (talk) 16:58, 22 April 2020 (UTC)[reply]

@Borneq Category:English borrowed terms. — Ungoliant ^(falai) 18:31, 22 April 2020 (UTC)[reply]

How to edit content that is inside a template portion of an entry?

How do I access the content of a templated portion of an entry? I can see the source, which is only a link to the template, but I cannot select the text inside the template box:

 Ex: 齿

Trying to remove the extra "etc." TEXT: For pronunciation and definitions of 齿 – see 齒 (“tooth; tooth- or zigzag-like thing, such as sawtooth, cogwheel, fern, etc.; etc.”). (This character, 齿, is the simplified form of 齒.) Salty3dog (talk) 21:19, 24 April 2020 (UTC)[reply]

@Salty3dog: It depends on the template. In this case, the template is {{zh-see}} and the text comes from the entry being linked to, 齒 / 齿 (chǐ). — Eru·tuon 21:45, 24 April 2020 (UTC)[reply]

I've edited the entry to remove the extra "etc." — Eru·tuon 21:50, 24 April 2020 (UTC)[reply]

Why is it a lemma?

Excuse me for asking, but why is the Roman Empire a lemma (and similar entries)? It is not a term like the Holy Roman Empire which has specific meaning. ‑‑Sarri.greek ^♫ | 19:33, 26 April 2020 (UTC)[reply]
Of the empire#Derived_terms some regard both space and time: There is no 'other' Roman Empire. It is an empire by the Romans, by the Ottomans, etc. and the first word explains it all. Others are not, and I understand why they have a lemma (Russian Empire, British Empire: it is not clear what period we are referring to). ‑‑Sarri.greek ^♫ | 19:41, 26 April 2020 (UTC)[reply]

The proper noun does have a specific meaning: the empire that existed from 27 BC to AD 476/1453, rather than any empire pertaining to Rome. "Roman" only explains it sufficiently when it's used as a common noun, e.g. "Barbarossa ... aimed to establish a Roman empire that could compete with the church" ([1]). —Nizolan ^(talk) 21:29, 14 May 2020 (UTC)[reply]

Thank you Nizolan, I just saw your answer. ‑‑Sarri.greek ^♫ | 14:25, 23 May 2020 (UTC)[reply]

looking for word

Is there a word in English to denote something that causes a disaster, or any kind of trouble? Thanks. ---> Tooironic (talk) 08:44, 28 April 2020 (UTC)[reply]

Maybe troublemaker is too obvious... I think in general, the term that comes to mind is "inciting incident" or "cause" more than any specific word that means, "This is an immediate cause of something exclusively bad". A "hazard" has the potential to cause a problem but isn't necessarily actualized as one. Maybe that helps? —Justin (koavf)❤T☮C☺M☯ 09:00, 28 April 2020 (UTC)[reply]

In some cases trigger, but not really specific to disasters. Equinox ◑ 14:35, 28 April 2020 (UTC)[reply]

Thanks everyone. ---> Tooironic (talk) 01:21, 29 April 2020 (UTC)[reply]

Meaning of "c" in TV listings

e.g. "ALL NEW Bering Sea Gold | Friday 10/9c". Is it catch up? Should we add it to Wiktionary? Equinox ◑ 14:33, 28 April 2020 (UTC)[reply]

If that's a US listing, that stands for "Central Time". As I understand it, network TV broadcasts are delayed to the appropriate hour for the timezone in the western half of the US, but not in the Central timezone, which is only an hour different. Chuck Entz (talk) 14:48, 28 April 2020 (UTC)[reply]

Yeah. And, for the curious, it's spoken aloud as "Friday [at] ten, nine Central". (There's probably audio of it on YouTube, somewhere.) - -sche (discuss) 21:47, 19 July 2020 (UTC)[reply]

Additionally, the mainland United States has four time zones (plus Alaska and Hawaii), so Central Time (Chicago, Houston, etc.) airs at the same time as Eastern (Boston, Miami, New York City, etc.) and Mountain Time (Boise, Denver, Phoenix, etc.) airs alongside Pacific (Los Angeles, Portland, Seattle, etc.). It's also occasionally true that live programs have an East Coast and a West Coast broadcast. —Justin (koavf)❤T☮C☺M☯ 22:46, 19 July 2020 (UTC)[reply]
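The arithmetic behind "10/9c" can be checked with a short Python sketch. It uses fixed standard-time offsets (UTC-5 for Eastern, UTC-6 for Central) as a simplifying assumption; daylight saving time shifts both zones together, so the one-hour difference stays the same. The function name is illustrative.

```python
from datetime import datetime, timedelta, timezone

# Fixed standard-time offsets; an assumption that ignores daylight saving.
EASTERN = timezone(timedelta(hours=-5), "EST")
CENTRAL = timezone(timedelta(hours=-6), "CST")


def central_hour_for_eastern(hour_24: int) -> int:
    """Return the Central clock hour at the moment Eastern clocks show hour_24."""
    airtime = datetime(2020, 1, 3, hour_24, 0, tzinfo=EASTERN)  # a Friday
    return airtime.astimezone(CENTRAL).hour


# A show airing simultaneously at 10 PM Eastern starts at 9 PM Central.
print(central_hour_for_eastern(22))
```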

Grammatical terms used in the definitions of the Oxford English Dictionary

For example, the entry for the noun "counsel" reads:

(Usually a collective plural, but sometimes treated as a numeral plural; formerly, in ‘to desire the benefit of counsel’, treated as a collective sing.: cf. quot. 1681.)

The terms don't appear in the Glossary of grammatical terms either.

I'd like to know where I can find the definitions/explanations of all the grammatical terms used throughout the OED, such as collective plural, numeral plural, collective singular. --Backinstadiums (talk) 11:26, 1 May 2020 (UTC)[reply]

The term “collective plural” can be used for a collective noun that takes a plural verb form in British English: “Counsel have had little part in regulating and disciplining their colleagues.” If (mainly in non-British English) a collective noun is used with a singular verb form, for example as in “The counsel was disbanded”, it is a “collective singular”. Some nouns can also be used as a count noun that retains its singular form although representing a specified number of members: “The fourth respondent is not one of the three counsel nominated.” This is an example of a numeral plural. --Lambiam 08:03, 3 May 2020 (UTC)[reply]

@Lambiam: Would numeral plurals and "unchangeable plurals" then be the same? --Backinstadiums (talk) 11:51, 3 May 2020 (UTC)[reply]

It is likely that people mean the same, although I’d prefer to say “unchanged plural”. It is not hard to find examples of the plural “counsels”, as in “each of the counsels”; likewise for collective unchanged plural “fish” vs. the inflected plural “fishes”. So the nouns “counsel” and “fish” are not per se unchangeable. --Lambiam 15:38, 3 May 2020 (UTC)[reply]