Wiktionary:Grease pit/2019/November
Categorising -er (agent noun)
While updating the etymology section at beachcomber, I noticed that adding "id2=agent noun" does not categorise this term in the same way that it does for comber. Can a change be made to the 'compound', 'com', 'affix', and 'af' templates to allow for this when the second component has the suffix? Leasnam (talk) 02:25, 3 November 2019 (UTC)
- I hope I'm not hijacking if I raise my occasional wonder about whether this category is appropriate for something probably formed as beach + comber and not *?beachcomb + -er? Equinox ◑ 02:50, 3 November 2019 (UTC)
- @Leasnam: That's because there's no affix in the template ({{af|en|beach|comber}}). The template only adds categories for affixes ({{af|en|beach|comb|-er|id3=agent noun}}). {{af}} doesn't know that comber has the suffix -er. — Eru·tuon 03:25, 3 November 2019 (UTC)
- @Erutuon:, exactly; but would it be possible to have the template altered so that it does? beachcomber is also suffixed with -er, even if it isn't formed from beachcomb + -er. Leasnam (talk) 08:18, 3 November 2019 (UTC)
- Except that it isn't. Our idea of "suffixed" doesn't mean "contains the suffix" but rather "formed by suffixation". That's not the case for beachcomber, which if your etymology is correct is formed by compounding. —Rua (mew) 10:34, 3 November 2019 (UTC)
- Ok, I see. Thank you for the explanation. Leasnam (talk) 17:41, 3 November 2019 (UTC)
- Yes, let us please admit there is an elephant in the room. Thank you. Equinox ◑ 10:44, 3 November 2019 (UTC)
- It seems beachcomber may have been the originator term (see here [[1]]) with beachcombing and beachcomb coming later. Leasnam (talk) 17:50, 3 November 2019 (UTC)
- It would be quite a bit of work to try to make {{affix}} figure out that comber has -er in it: get the wikitext of the page comber, find the correct entry in it, get its etymology section, find {{affix}} in that section, parse it, find -er in its parameters. And it would increase Lua memory usage. I don't think it's a good idea. — Eru·tuon 18:21, 3 November 2019 (UTC)
- Well, if we were to do this, we would probably want to add an optional argument to the template, so it wouldn't have to figure anything out. It would be stated explicitly. Leasnam (talk) 06:05, 6 November 2019 (UTC)
- It seems to me that we are trying to get the affix templates to do two things without doing the work to do those things properly. If we use them to analyze the headword into the smallest morphemes we do violence to the likely course of derivation. If we fail to do so, we will on occasion miss things such as this. We could augment the existing template {{affix}} (only?); require two template families, one for synchronic and the other for diachronic derivations; or require hard categorization for cases such as the one at hand. I suspect that hard categorization is adequate for this case. DCDuring (talk) 23:51, 3 November 2019 (UTC)
Google Code-In will soon take place again! Mentor tasks to help new contributors!
Hi everybody! Google Code-in (GCI) will soon take place again - a seven week long contest for 13-17 year old students to contribute to free software projects. Tasks should take an experienced contributor about two or three hours and can be of the categories Code, Documentation/Training, Outreach/Research, Quality Assurance, and User Interface/Design. Do you have any Lua, template, gadget/script or similar task that would benefit your wiki? Or maybe some of your tools need better documentation? If so, and you can imagine enjoying mentoring such a task to help a new contributor, please check out mw:Google Code-in/2019 and become a mentor. If you have any questions, feel free to ask at our talk page. Many thanks in advance! --Martin Urbanec 07:28, 5 November 2019 (UTC)
Category:English verbs suffixed with -en -- looping link needs fixing
Hi.
Category:English verbs suffixed with -en
Reads as follows:-
This category should be empty. The contents of this category should now be found at Category:English words suffixed with -en (inchoative). If any pages link here, please update the link, as this page may be deleted.
Click given link to get:-
Category:English words suffixed with -en (inchoative)
(Blah, blah, blah)
English words ending with the suffix -en.
Subcategories
This category has only the following subcategory.
!
English verbs suffixed with -en (157 e)
Click that link takes you straight back to Category:English verbs suffixed with -en.
And so the world goes round and round and up and down .... la-la-la.
I was going to add redden, however ...... --ALGRIF talk 11:28, 6 November 2019 (UTC)
- Hm, I agree with Rua that Category:English words suffixed with -en (inchoative) is preferable, because it's more descriptive (though it uses a technical term: see w:Inchoative verb). So I am emptying Category:English verbs suffixed with -en into Category:English words suffixed with -en (inchoative), and deleting it. — Eru·tuon 16:23, 6 November 2019 (UTC)
Developer environment for this wiki
I have raised a question here about creating developer-friendly images of each wiki, which people can use to create scripts and mediawiki extensions on their own PC without needing to request an account on Wikimedia Tools or their sister services. I hope an image can be available for install in one click. If you are experienced in how to make this happen (or does it already exist...?), or know how it would be useful for this wiki, please respond either here or there. Thanks. --Gryllida/ (talk) 23:25, 6 November 2019 (UTC)
Search bar in the mobile site
(is this the right place for this?)
On the mobile site, the search bar capitalises the first letter unless the setting is turned off locally. I think it should be disabled with <input autocapitalize=off> because I think when most people type they expect the search bar to take them to what they've written, not the proper noun that has the same spelling as the thing they've written. --betseg|g 20:33, 7 November 2019 (UTC)
- @Betseg: I think this is something that Wiktionary admins have no control over. If so, it would have to be submitted as a bug report on Phabricator. — Eru·tuon 20:57, 7 November 2019 (UTC)
- We might be able to affect it via MediaWiki:Mobile.js, I can't test because I joked about using Common.js to mine bitcoin. - TheDaveRoss 21:13, 7 November 2019 (UTC)
- I tried on User:Betseg/Mobile.js, doesn't work. --betseg|g 21:27, 7 November 2019 (UTC)
- uhh I did something wrong, gonna try some more things, brb --betseg|g 21:32, 7 November 2019 (UTC)
- @Betseg: I thought your code might be running before the search bar was loaded, so I edited your JS. But I tried the code and it still doesn't work. Perhaps the search bar is dynamically generated, in which case this might require a MutationObserver, which is a bit more complicated. — Eru·tuon 21:35, 7 November 2019 (UTC)
- @Erutuon: Wait, does this mean we can't have user-specific mobile JS scripts? --betseg|g 22:21, 7 November 2019 (UTC)
- Yep, it does. 'MobileFrontend does not load [...] any of the user stylesheets.' (and presumably, scripts) --betseg|g 22:25, 7 November 2019 (UTC)
- @Betseg: Oh, yeah, the Mobile.js user subpage doesn't work. I wasn't thinking. Your common.js subpage should work though. One of the scripts that I have in my common.js sent a notification on the mobile site. — Eru·tuon 22:29, 7 November 2019 (UTC)
- @Erutuon: OK, so, what I'm doing in User:Betseg/common.js is technically working, but it doesn't work in practice, because I think what's happening after the user taps on the searchbox is that the site loads the searchbox window, the keyboard opens in capitalize mode, and then the autocapitalize attribute gets set. So basically I'm out of ideas. --betseg|g 22:54, 7 November 2019 (UTC)
- v2 of what I think is happening is that my .click() is being called before the actual event call that opens the search window, which is why I can't edit its attributes. --betseg|g 23:01, 7 November 2019 (UTC)
- It works with debugger; on desktop in mobile simulation mode with developer tools open, but doesn't work if any of those things aren't there??? I give up lol. --betseg|g 23:46, 7 November 2019 (UTC)
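For reference, the MutationObserver approach mentioned above could look roughly like the sketch below. It is only an illustration: the selector is an assumption (the real MobileFrontend search markup is not quoted in this thread), the function names are made up, and betseg's reports suggest the keyboard mode may already be decided before the attribute is set, so this may well not fix the problem.

```javascript
// Assumed selector; the actual search input on the mobile site may differ.
const SEARCH_SELECTOR = 'input[type="search"]';

// Set autocapitalize="off" on every matching input under `root`.
// Returns how many inputs were changed by this call.
function disableAutocapitalize(root) {
  let changed = 0;
  for (const input of root.querySelectorAll(SEARCH_SELECTOR)) {
    if (input.getAttribute('autocapitalize') !== 'off') {
      input.setAttribute('autocapitalize', 'off');
      changed += 1;
    }
  }
  return changed;
}

// Patch the inputs that exist now, then patch again whenever nodes are added,
// which covers a search box that is generated dynamically.
function watchSearchInputs(doc, ObserverClass) {
  disableAutocapitalize(doc);
  const observer = new ObserverClass(() => disableAutocapitalize(doc));
  observer.observe(doc.documentElement, { childList: true, subtree: true });
  return observer;
}

// Start automatically only where a DOM and MutationObserver are available.
if (typeof document !== 'undefined' && typeof MutationObserver !== 'undefined') {
  watchSearchInputs(document, MutationObserver);
}
```

Passing the document and the observer class in as arguments is just a convenience so the logic can be tried outside a browser; in a user common.js the last three lines are the only entry point needed.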
Development environment for this wiki (take 2)
If I make a (Vagrant or Docker or VirtualBox) development environment for this wiki, would you be interested in including
- 1. all gadgets or only some (if so then which)
- 2. all extensions or only some (if so then which)
- 3. all content or only some (if so then what part)
- 4. all settings or only some (if so then which ones)
Also, what extensions, or improvements to existing extensions, would you like to see developed if someone volunteers to do it? Is there a wish list?
Thanks, --Gryllida 05:10, 12 November 2019 (UTC)
no element before hydrogen
The template {{elements}} allows parameters 3 and 4, the previous element, to be blank, but in hydrogen (obviously the only place this can arise) it doesn't suppress the line but outputs it as Previous: {{{3}}} ({{{4}}}). A similar problem no doubt exists for whatever successor of ununennium we stop at. --80.169.223.146 10:35, 12 November 2019 (UTC)
- Okay, hydrogen is fixed now. — Eru·tuon 21:08, 12 November 2019 (UTC)
The categories under this one all mistakenly link to the page medical signs and symptoms. Ultimateria (talk) 23:50, 12 November 2019 (UTC)
There are 39 Latvian adjectives that link to (none) as the adverb form. Ultimateria (talk) 03:32, 13 November 2019 (UTC)
- @Ultimateria Fixed. Benwing2 (talk) 16:35, 13 November 2019 (UTC)
Errors with Miscellaneous Symbols and Pictographs pages
After problems with one page, I tried looking at more and they all have the same problem. Any page in Category:Miscellaneous Symbols and Pictographs block fails to display properly; the page content is there but it seems to be missing its style sheet, so it is badly broken. See e.g. 🍩, 🍞 or 🏧, but it’s all of them as far as I can tell.
Looking in Safari’s Web Inspector, there is an error that does not normally appear:
Failed to load resource: The operation couldn’t be completed. Protocol error
And it seems to have failed to load something that could correspond to a stylesheet, though Safari only identifies it as 'load.php' among its resources.--109.157.71.242 11:08, 13 November 2019 (UTC)
I found some more with the same problem. Some but not all of the pages in Category:Miscellaneous Symbols block and all the ones I checked in Category:Vai block. I also had a look on an iPad, and in Mobile Safari in desktop mode the pages have the same problem.--109.157.71.242 19:44, 13 November 2019 (UTC)
- I don't see this issue, does anyone else? Benwing2 (talk) 14:52, 14 November 2019 (UTC)
- I just tried them again and the problem seems to have fixed itself. I would say it’s due to restarting my browser or Mac, but then how come it was happening on my iPad too? The only other thing is my internet connection was playing up so I reset the router earlier, and maybe that fixed it.--109.157.71.242 20:24, 14 November 2019 (UTC)
- I looked at some of the entries mentioned right after the original post, but didn't encounter the bug. — Eru·tuon 22:18, 14 November 2019 (UTC)
@Hippietrail, Octahedron80, Alifshinobi, Lo Ximiendo: These templates serve the Northern Thai language, which is written in two scripts, the Tai Tham (Lana) and Thai scripts. Moreover, the Thai script is used in two different ways - with the Siamese sound values ('thap sap') and with the sound values of the corresponding symbols in the Tai Tham script as used in Northern Thai ('rup pariwat'). Converting between the three systems needs lexical exception tables.
These two headword templates are set up to produce abstract nouns from verbs/adjectives by adding a prefix; in all examples on Wiktionary so far, they use the default prefix. The default works well enough for the Tai Tham script, but the forms are different for the two Thai script systems.
One way forward is to add an option to the headwords to specify which Thai script orthography is used - many spellings are appropriate for both systems. Should this be used to determine a label for the headword? Are there precedents to follow for the name of the option and the form of the values - I am tempted to abbreviate the names to "ts" and "rp", with "both" as a default value calling forth both. (BCP 47 variant tags have yet to be requested.)
The default adjectival prefix ᨤ᩠ᩅᩣ᩠ᨾ/ความ/ฅวาม, a well-established borrowing from Siamese, has an older, native alternative ᨣᩤᩴ/ᨣᩣᩴ/กำ/คำ (two different Tai Tham spellings, both very likely to be mistyped). Should it be possible to simply choose the prefix, or should editors be expected to type these forms in full? --RichardW57 (talk) 00:48, 18 November 2019 (UTC)
- I added ᨣᩤᩴ/กำ as another adjectival prefix. (Some verbs can have both ก๋าน and กำ/ความ.) Also, ᨢ-ᨤ reads /x/; that's why we use ฃ-ฅ. (You should know that ฃ-ฅ were once pronounced /x/ in ancient times.) --Octahedron80 (talk) 02:17, 18 November 2019 (UTC)
- @Octahedron80: The recent biscript New Testament translation, which is written using thap sap, doesn't use ฃ or ฅ at all. There's only a case for them if [x] and [kʰ] are different phonemes; I don't think there's necessarily such a distinction, though some individuals make a consistent distinction, with [kʰ] typically reserved for what are perceived as loans from Siamese. The letter ᨢ is redundant unless it is going to replace ᨡ. On the other hand, the distinction between ᨣ and ᨤ remains very real, which gives ฅ a major role in rup pariwat. Rup pariwat tends to be used for academic work, and in rup pariwat the purely verbal prefix is การ with no tone mark. --RichardW57 (talk) 09:02, 18 November 2019 (UTC)
- ᨡᩬᩴ < PT *k.roːᴬ means to beg and ᨢᩬᩴ < PT *xɔːᴬ mean hook, are they redundant? I have a lot of distinct ᨡ VS ᨢ vocabulary. --Octahedron80 (talk) 10:19, 18 November 2019 (UTC)
- Yes. The second word also gets spelt the same way as the first word, and the MFL explicitly lists a whole set of words spelt with HIGH KXA purely on the basis of their existence with that sound in White Tai. I have read a Northern Thai complain about academics trying to restore HIGH KXA and HIGH CHA (ᨨ). --RichardW57m (talk) 13:17, 18 November 2019 (UTC)
- I got these from มาลา คำจันทร์, a S.E.A. Write writer. [2] --Octahedron80 (talk) 16:40, 18 November 2019 (UTC)
- I can say that the New Testament doesn't use ฃ-ฅ because they are just obsolete in Thai. --Octahedron80 (talk) 10:39, 18 November 2019 (UTC)
- And unnecessary for thap sap - at least, until such time as /kʰ/ is established as a new phoneme in Northern Thai. Or are you going to provide a set of thap sap quotations (not mere 'examples') to show that those two letters are used in thap sap? The New Testament is the only durable archived thap sap I know of. --RichardW57m (talk) 13:17, 18 November 2019 (UTC)
- NVM I will use ค as its sound then. --Octahedron80 (talk) 17:02, 18 November 2019 (UTC)
- If you're talking of respelling to show the sound in a notation for Thais, feel free to use ฃ and ฅ for /x/. I was only concerned with how Northern Thai is *normally* written. --RichardW57 (talk) 23:51, 18 November 2019 (UTC)
- @Octahedron80: Meanwhile, you've now created the problem I raised this topic to avoid. At least, I don't believe กำแพง means 'belovedness'. The relevant word would be *คำแพง or กำแปง, though I'm not sure how we qualify them for inclusion in Wiktionary - I don't think they're common words. All Thai script Northern Thai adjectives that aren't simultaneously thap sap and rup pariwat now generate incorrect abstract nouns by default. --RichardW57m (talk) 13:17, 18 November 2019 (UTC)
- กำแปง should only exist for แปง because we read that. I mean แพง (transliteration) for nod should not be there. You may see Northern Thai, Southern Thai, and Isan lemmas categories at my wiki, we (almost) spell Thai script as real sound (some may need cleanup). If we had not done that, all lemmas would have fallen back on central Thai every word, and transcription might mislead visitors about how to read them. --Octahedron80 (talk) 16:48, 18 November 2019 (UTC)
- Kindly use the proper procedure for deleting entries. What the English Wiktionary should be recording is how words are normally written, not phonetic transcription. English speakers are used to the concept of words being pronounced quite differently in different regions despite being written the same, even if the variation is nowhere near as extreme as the variation across Sinitic languages. For example, I don't read Pali with an attempt at Thai sounds just because it's written in Thai characters. --RichardW57 (talk) 23:51, 18 November 2019 (UTC)
- Whatever then. I will add one more parameter like tl=yes to tell that a lemma is a 'transliteration' (รูปปริวรรต, as you called it); with this we will generate proper prefixes for it. แพง is okay with a usage note given. tl=yes currently does nothing on other POSes but it may be useful in the future. --Octahedron80 (talk) 02:44, 19 November 2019 (UTC)
- I object to the term 'transliterated' because it's more complicated than that. Whether Tai Tham BA winds up as bo bai mai or po pla is complicated. I think I've seen mai tri turn up on dead syllables; Northern Thai Tai Tham is slow to acquire extra tone marks, so that can hardly be called transliteration. "Rup pariwat" works in English because being in a foreign language weakens its original meaning. I think a better keyword would be "es" for 'etymological system'. In a family of etymological writing systems, the inherited words look the same across the diasystem. It seems that a translation of the Nan Chronicle was done by first making an etymological transcription to the Thai script and then working with that form. --RichardW57m (talk) 13:55, 19 November 2019 (UTC)
- I'm uneasy with a binary opposition, because I thought the headword would be a good place to record the writing system. It seems that I am wrong. We may have a further division of the thap sap writing system on the basis of the system of phonetic tones. Chiang Mai tones are heavily assimilated to Siamese, and Chiang Rai makes a different association of its tones with Siamese tones, and this is reported to affect the way thap sap is used. I don't believe this division affects the regular derivation of abstract nouns. --RichardW57m (talk) 13:55, 19 November 2019 (UTC)
- If we turn one Thai script form into soft redirect except for the spelling issues, the 'redirection' can state the writing system of the form being redirected from. How then do we document the writing system of the main lemma? --RichardW57m (talk) 13:55, 19 November 2019 (UTC)
- I think you can use {{tlb}} at the end of the headword line; that's how Ancient Greek dialects are marked. The form that receives the primary entry lists the dialects that it belongs to in {{tlb}} in the headword line, and the other dialectal forms have soft redirects with {{alternative form of}}. See for instance ἥλιος (hḗlios). You would want to add labels for the spelling systems in Module:labels/data/subvarieties. — Eru·tuon 19:33, 19 November 2019 (UTC)
- That looks good. I think it may even allow us to distinguish between a Chiang Mai thap sap spelling used as a main lemma and a Chiang Mai dialect word given in thap sap spelling, and let the displayed text be revised without disturbing the source text of the entries. As the etymologies seem to prefer the Lanna script, it probably makes sense to use them as the main entries. It may not always be possible - there seems to be a slight element of diglossia between the two scripts, caused by a Siamese form replacing its cognate. --RichardW57 (talk) 22:51, 19 November 2019 (UTC)
- Thai script for Northern Thai, we are *likely* to use sound values on basic words and transcription on Indic loanwords (since the latter is harder to express sound values), that yields unstable Thai spellings. I see your article at Incubator that is impressive. --Octahedron80 (talk) 02:37, 18 November 2019 (UTC)
- Thanks. I'm sure the grammar needs improvement, though. I couldn't persuade my wife to proofread it in any of the three writing systems. It may need slicing up - Lua runs out of memory if I ask it to do two transliterations. I've yet to work out how to edit it in the Thai script. --RichardW57m (talk) 13:17, 18 November 2019 (UTC)
Off topic, I have some nod collection if you want to see: [3]. Plus one printing that I cannot share online. มาลา คำจันทร์. พจนานุกรมคำเมือง. เชียงใหม่ : บุ๊คเวิร์ม, 2551. →ISBN --Octahedron80 (talk) 17:36, 18 November 2019 (UTC)
- Mostly good background material and useful for checking. Of course, for Wiktionary, what would be really useful is a usenet group with a lot of discussion written in Northern Thai. --RichardW57 (talk) 07:38, 20 November 2019 (UTC)
Some Data on Memory Use
The out-of-memory module errors in 9 CJKV character entries (我, 一, 人, 學, 彼, 月, 水, 生 and 酒) have defeated our best efforts to fix them. There's also a discussion about how much more memory we should ask for if it were made available. It's really, really hard to understand the use of memory in an entry with dozens of templates calling dozens of modules: some memory is shared between module calls, some isn't, and I have yet to see any explanation that can accurately account for what is and isn't shared.
I can't help much with the theory, so I decided to see if I could come up with some data. I chose 水 as my subject, which is the character for "water", and has about as many different language sections as any entry of this type. I systematically commented out parts of the entry in preview mode to see how memory usage varied. These tables show the relationship between language sections and memory use. There are 16 language sections, but Chinese, Japanese, Korean, Okinawan and Vietnamese are the ones using the most memory, so I concentrated on those. I tried every combination of the 5 languages with everything else treated as a group and recorded the memory usage. The languages are abbreviated C,J,K,O and V, and "Oth" refers to everything outside those sections. A "0" means the language section was commented out, a "1" means it wasn't:
Oth | C | J | K | O | V | Mem (MB) |
---|---|---|---|---|---|---|
0 | 0 | 0 | 0 | 0 | 0 | 00.00 |
0 | 0 | 0 | 0 | 0 | 1 | 06.29 |
0 | 0 | 0 | 0 | 1 | 0 | 06.13 |
0 | 0 | 0 | 0 | 1 | 1 | 09.71 |
0 | 0 | 0 | 1 | 0 | 0 | 11.22 |
0 | 0 | 0 | 1 | 0 | 1 | 14.63 |
0 | 0 | 0 | 1 | 1 | 0 | 14.49 |
0 | 0 | 0 | 1 | 1 | 1 | 16.08 |
0 | 0 | 1 | 0 | 0 | 0 | 20.12 |
0 | 0 | 1 | 0 | 0 | 1 | 22.32 |
0 | 0 | 1 | 0 | 1 | 0 | 22.19 |
0 | 0 | 1 | 0 | 1 | 1 | 22.19 |
0 | 0 | 1 | 1 | 0 | 0 | 22.17 |
0 | 0 | 1 | 1 | 0 | 1 | 23.67 |
0 | 0 | 1 | 1 | 1 | 0 | 24.56 |
0 | 0 | 1 | 1 | 1 | 1 | 27.78 |
Oth | C | J | K | O | V | Mem (MB) |
---|---|---|---|---|---|---|
0 | 1 | 0 | 0 | 0 | 0 | 28.53 |
0 | 1 | 0 | 0 | 0 | 1 | 28.52 |
0 | 1 | 0 | 0 | 1 | 0 | 29.45 |
0 | 1 | 0 | 0 | 1 | 1 | 32.66 |
0 | 1 | 0 | 1 | 0 | 0 | 34.31 |
0 | 1 | 0 | 1 | 0 | 1 | 37.52 |
0 | 1 | 0 | 1 | 1 | 0 | 38.97 |
0 | 1 | 0 | 1 | 1 | 1 | 42.18 |
0 | 1 | 1 | 0 | 0 | 0 | 46.36 |
0 | 1 | 1 | 0 | 0 | 1 | 47.01 |
0 | 1 | 1 | 0 | 1 | 0 | 47.00 |
0 | 1 | 1 | 0 | 1 | 1 | 47.00 |
0 | 1 | 1 | 1 | 0 | 0 | 47.04 |
0 | 1 | 1 | 1 | 0 | 1 | 47.05 |
0 | 1 | 1 | 1 | 1 | 0 | 47.04 |
0 | 1 | 1 | 1 | 1 | 1 | 47.04 |
Oth | C | J | K | O | V | Mem (MB) |
---|---|---|---|---|---|---|
1 | 0 | 0 | 0 | 0 | 0 | 09.42 |
1 | 0 | 0 | 0 | 0 | 1 | 13.91 |
1 | 0 | 0 | 0 | 1 | 0 | 13.69 |
1 | 0 | 0 | 0 | 1 | 1 | 14.25 |
1 | 0 | 0 | 1 | 0 | 0 | 18.27 |
1 | 0 | 0 | 1 | 0 | 1 | 20.52 |
1 | 0 | 0 | 1 | 1 | 0 | 19.54 |
1 | 0 | 0 | 1 | 1 | 1 | 20.20 |
1 | 0 | 1 | 0 | 0 | 0 | 22.62 |
1 | 0 | 1 | 0 | 0 | 1 | 25.86 |
1 | 0 | 1 | 0 | 1 | 0 | 26.50 |
1 | 0 | 1 | 0 | 1 | 1 | 29.35 |
1 | 0 | 1 | 1 | 0 | 0 | 28.97 |
1 | 0 | 1 | 1 | 0 | 1 | 28.97 |
1 | 0 | 1 | 1 | 1 | 0 | 28.98 |
1 | 0 | 1 | 1 | 1 | 1 | 28.98 |
Oth | C | J | K | O | V | Mem (MB) |
---|---|---|---|---|---|---|
1 | 1 | 0 | 0 | 0 | 0 | 34.01 |
1 | 1 | 0 | 0 | 0 | 1 | 37.22 |
1 | 1 | 0 | 0 | 1 | 0 | 38.09 |
1 | 1 | 0 | 0 | 1 | 1 | 41.30 |
1 | 1 | 0 | 1 | 0 | 0 | 43.52 |
1 | 1 | 0 | 1 | 0 | 1 | 46.73 |
1 | 1 | 0 | 1 | 1 | 0 | 47.42 |
1 | 1 | 0 | 1 | 1 | 1 | 47.62 |
1 | 1 | 1 | 0 | 0 | 0 | 47.32 |
1 | 1 | 1 | 0 | 0 | 1 | 47.32 |
1 | 1 | 1 | 0 | 1 | 0 | 47.33 |
1 | 1 | 1 | 0 | 1 | 1 | 47.32 |
1 | 1 | 1 | 1 | 0 | 0 | 47.46 |
1 | 1 | 1 | 1 | 0 | 1 | 47.48 |
1 | 1 | 1 | 1 | 1 | 0 | 47.48 |
1 | 1 | 1 | 1 | 1 | 1 | 50+ |
From this we can see that Korean, Okinawan and Vietnamese all overlap with either the Chinese or the Japanese section: if both are present, it doesn't really matter what combination of the other three is there. The big exception is when all the language sections are present: for some reason the memory usage jumps by at least 2 1/2 MB and causes a module error by going over the 50 MB limit. There's a similar jump at the bottom of the first table, but not at the bottoms of the second or third. If we could figure out why, we might be able to fix this particular entry.
I also did some work with the templates within the Chinese section, but this is all I have time and energy for tonight. Chuck Entz (talk) 05:56, 18 November 2019 (UTC)
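To make the overlap concrete, here is a small JavaScript sketch that compares what a section costs on its own with what it adds once Japanese is already present. The figures are copied from the first table (Oth = 0, C = 0); the single-letter keys and the function name are just illustrative choices, and the arithmetic says nothing about why the memory is shared.

```javascript
// Memory in MB, copied from the first table above (Oth = 0, C = 0).
// Keys list the sections left in: J = Japanese, K = Korean, O = Okinawan, V = Vietnamese.
const mem = { '': 0.00, V: 6.29, O: 6.13, K: 11.22, J: 20.12, JV: 22.32, JO: 22.19, JK: 22.17 };

// Extra memory (MB) reported when `section` is added on top of the sections in `base`.
function marginal(base, section) {
  const key = [...(base + section)].sort().join('');
  return Number((mem[key] - mem[base]).toFixed(2));
}

for (const s of ['K', 'O', 'V']) {
  console.log(s, 'alone:', marginal('', s), 'MB; added to Japanese:', marginal('J', s), 'MB');
}
```

On these figures Korean drops from 11.22 MB alone to 2.05 MB on top of Japanese, Okinawan from 6.13 MB to 2.07 MB, and Vietnamese from 6.29 MB to 2.20 MB, which is the overlap described above.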
- @Rua Do you have any insight into how repeatedly loading the same module works? Does it cache anything? How aggressively does Lua free unused memory? I know for example that Python does reference counting and also has GC to detect cycles; the reference counting means that unused memory should be freed the instant it is no longer used, provided there are no cycles. I notice for example that duplicating the {{inh|yoi|jpx-ryu-pro|sort=みん|*mezu}} call multiple times results in about 100K extra memory used per call, even though it should be triggering the exact same code paths repeatedly (if reference counting is being used, I would expect the amount of memory to increase only by the size of the generated text, not by 100K). Benwing2 (talk) 02:38, 19 November 2019 (UTC)
- That's the part that puzzles me too. It seems that the invocations do not share memory, not even when they are identical. Both are apparently allocated their own separate memory space to be executed in parallel. —Rua (mew) 09:57, 19 November 2019 (UTC)
- Lua 5.1 has a garbage-collector that can be customized with a function that is not made available to Scribunto, described here. I don't have an in-depth understanding of the garbage-collection, but unused memory is freed at garbage-collection cycles, not immediately.
- I read somewhere that the bytecode for a module is cached (so the parsing step need not be repeated), but it must be re-executed in each invocation through {{#invoke:}}. If the tables or functions returned by modules were cached, then they could be mutated so that information could be passed between module invocations, which is not permitted (phab:T67258). So if Module:languages is loaded with require in one {{l}} and then in another {{l}}, the module is parsed into bytecode once, but executed twice, once in each invocation, to yield two distinct export tables. So at least some part of the memory total used by the export table would be doubled, as if the body of Module:languages (minus the final return) were placed in a loop and executed twice. But how that relates to the memory usage figure, I don't know. I haven't figured out how memory is tracked in the Scribunto source code. (It doesn't help that I've never learned a class-based programming language.)
- However, inside a single invocation, calling require("Module:languages") more than once yields the same table, and mutations to the table are visible inside the same invocation. So if one module sets require("Module:languages").getByCode = false, and then calls another module that calls require("Module:languages").getByCode("en"), the second module will trigger the error "attempt to call a boolean value" because require("Module:languages").getByCode is false.
- mw.loadData, unlike require, only executes a module once per page. The table returned by the module is retained between invocations, but is prevented from being mutated by being restricted to immutable types, and by being supplied to Scribunto wrapped in a metatable. The wrapping metatable itself uses extra memory and accessing the data is slower because it is done through metamethods. This does not matter unless some code accesses data many times, as in Module:family tree and Module:list of languages, in which case loading the data with require reduces memory and execution time. — Eru·tuon 19:30, 19 November 2019 (UTC)
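The behaviour described above for require and for a read-only data wrapper can be imitated in plain Lua 5.1, outside Scribunto. What follows is only a rough sketch under stated assumptions: the module name demo_languages and the helper read_only are invented for illustration, clearing package.loaded merely stands in for a new {{#invoke:}}, and the proxy is not the actual mw.loadData implementation.

```lua
-- Hypothetical stand-in for a data module; "demo_languages" is an invented
-- name for this sketch, not a real Wiktionary module.
local run_count = 0
package.preload["demo_languages"] = function()
    run_count = run_count + 1  -- counts how often the module body executes
    return {
        getByCode = function(code) return { code = code } end,
    }
end

-- Inside one Lua state, require runs the module body once and caches the
-- result in package.loaded, so both calls return the same table.
local first = require("demo_languages")
local second = require("demo_languages")
assert(first == second)
assert(run_count == 1)

-- A mutation made through one reference is visible to every later require.
first.getByCode = false
local ok = pcall(function()
    return require("demo_languages").getByCode("en")
end)
assert(not ok)  -- the call fails because getByCode is now a boolean

-- Assumption: dropping the cached entry imitates a fresh invocation. The
-- body runs again and yields a distinct, unmutated export table.
package.loaded["demo_languages"] = nil
local fresh = require("demo_languages")
assert(run_count == 2)
assert(fresh ~= first)
assert(type(fresh.getByCode) == "function")

-- Illustrative read-only proxy (not Scribunto's code): reads go through
-- __index, writes raise an error, nested tables are wrapped on access.
local function read_only(data)
    return setmetatable({}, {
        __index = function(_, key)
            local value = data[key]
            if type(value) == "table" then
                return read_only(value)
            end
            return value
        end,
        __newindex = function()
            error("table is read-only", 2)
        end,
    })
end

local data = read_only({ en = { name = "English" } })
assert(data.en.name == "English")
assert(not pcall(function() data.en.name = "Anglish" end))
```

Every read of data.en in this sketch goes through a metamethod and builds a new proxy table, which gives a feel for why a wrapped table may cost more time and memory than a plain one when it is accessed many times.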
Can we add the Hachijō language?
Can we add the Hachijō language as a full language in the Japonic family? — This unsigned comment was added by MiguelX413 (talk • contribs).
- Probably yes (in which case the code could be jpx-hac). One thing that gives me pause is that it seems to be considered a dialect rather than a language in Japanese sources; do you have some evidence relating to mutual intelligibility to demonstrate that it can't be covered as part of ja? —Μετάknowledgediscuss/deeds 20:21, 18 November 2019 (UTC)
- Shibatani, in his The Languages of Japan, mentions the "Hachijōjijima dialects" in the same sentence as the "Ryūkyuan dialects" (page 207). We consider Okinawan to be a language, likewise for Miyako, North vs. South Amami-Oshima, etc., so I consider his use of "dialects" here to be either outmoded, or reflective of the Japanese government's official view. Interesting related Linguistics Stackexchange thread here, suggesting that Shibatani may have changed his mind after writing the book. ‑‑ Eiríkr Útlendi │Tala við mig 21:31, 18 November 2019 (UTC)
- @Metaknowledge As Eirikr mentioned above, 八丈語 being called a dialect is typically done by those who also call the Ryukyuan languages dialects. MiguelX413 (talk) 20:19, 21 November 2019 (UTC)
- That is useful, but obviously some things Shibatani might call a dialect (Kansai-ben, say) really are dialects and don't deserve language codes. I'm asking for some positive evidence that we should treat Hachijō as a language. —Μετάknowledgediscuss/deeds 05:21, 22 November 2019 (UTC)
- This suggests Hachijō is distinct enough to possibly form its own branch of Japonic. ←₰-→Lingo Bingo Dingo (talk) 15:30, 22 November 2019 (UTC)
- @Metaknowledge, would you like some sample texts and Japanese resources as proof? I think there's enough for it to be considered a language if we consider the Ryukyuan languages all languages. MiguelX413 (talk) 03:25, 25 November 2019 (UTC)
no errors and yes request
Congratulations all. This week I have encountered zero module errors. Please pat yourselves on the back. In other news, I'd like a list of all the templates for abbreviations, acronyms and initialisms. I assume a fancy search involving the letters "init", "acron" and "abbr" would be a good place to start. --Vealhurl (talk) 01:50, 19 November 2019 (UTC)
Help info in Category Pages
I think we should at the very least add a reference to Help:Category to all of our category boilerplate. There are people who try to add things directly to the category pages or, worse, to the category tree data modules. Chuck Entz (talk) 15:03, 21 November 2019 (UTC)
- @Chuck Entz I can implement this. What format should the reference take? Can you give an example, e.g. of how one of the {{poscatboiler}} or {{topic cat}} pages should look? Benwing2 (talk) 21:22, 23 November 2019 (UTC)
Renaming "Lamboya" as "Laboya"
I've renamed "Lamboya" to "Laboya" (Category:Laboya language), as requested by User:Allahverdi Verdizade, due to more hits on Google Scholar and Google Books for "Laboya language" compared to "Lamboya language". Can someone help with the following tasks?
- Renaming all categories that begin with "Category:Lamboya ..." as "Category:Laboya ..."
- Renaming the language header "Lamboya" as "Laboya" KevinUp (talk) 10:26, 26 November 2019 (UTC)
- I've moved all the categories beginning with "Lamboya" and deleted all the ones that were redirects. — Eru·tuon 19:54, 26 November 2019 (UTC)
- Done All entries, categories and translations containing "Lamboya" have been renamed to "Laboya". KevinUp (talk) 07:26, 30 November 2019 (UTC)
Latvian vocative
The declension template for Latvian is not rendering the vocative singular correctly. I'm not even lv-1 on Latvian, but I believe that 1st declension masculines drop the -s of the nominative, so draugs (“friend”) should become draug, but it's showing as unchanged draugs. Likewise zirgs (“horse”) (the example I first spotted, so I read around to see if vocative only applied to people), and the palatalized ending in kaimiņš (“neighbour”). More strangely, tēvs (“father”) actually adds an -s, and becomes tēvss. The -is type nouns seem to be correct, going by brālis (“brother”), which shows as brāli. --80.169.223.146 13:08, 29 November 2019 (UTC)
- Fixed, I think. — Eru·tuon 18:34, 30 November 2019 (UTC)
<nowiki>*</nowiki>mæser
Hi. Can someone please check the Declension table at *mæser? It's displaying the text in the Subject line above. Leasnam (talk) 04:20, 30 November 2019 (UTC)
- Done—Suzukaze-c◇◇ 22:59, 30 November 2019 (UTC)
- Thank you ! Leasnam (talk) 03:22, 1 December 2019 (UTC)
Manual cleanup list
Hey all. I remember we have a manually created list somewhere of all pages with non-standard POS headings. Something like Category:Entries with non-standard headers but made by one of you technically gifted users. Where is it? Can a new page be generated? --Vealhurl (talk) 17:55, 30 November 2019 (UTC)
- @Vealhurl: My list is User:Erutuon/mainspace headers/possibly incorrect. I've got to clean up a bunch of them before the next dump starts being generated (sometime tomorrow) or the exact same bad headers will be in the next list (very boring and annoying). — Eru·tuon 22:50, 30 November 2019 (UTC)
- Kinda cool. I like how kelaunikui became a FWOTD having "Word" as its POS. Anyway, @Erutuon: I removed Initialism, Abbreviation and Acronym from your whitelist to catch a few more that may not be included in the Cleanup category. --Vealhurl (talk) 21:17, 2 December 2019 (UTC)