Wiktionary:Grease pit/2016/October
Be consistent about template 'lang' parameters
To take one example: (suffix|en|...) and (suffix|...|lang=en) both work, but (lb|en|...) does not. Can this kind of thing be made consistent one way or another? Equinox ◑ 15:51, 1 October 2016 (UTC)
- The lang= parameter is supported for backwards compatibility because there hasn't been an agreement to bot-fix all the entries and remove the parameter. —CodeCat 15:52, 1 October 2016 (UTC)
- Mostly because it is a bad idea. We should be moving in the direction of transparency not obscurity. - TheDaveRoss 17:24, 1 October 2016 (UTC)
- I haven't seen you propose ideas of that kind, rather than merely complain about changes that benefit editors' ability to improve the dictionary. —Μετάknowledgediscuss/deeds 18:19, 1 October 2016 (UTC)
- It is all subjective, but I have opined against proposed changes which result in general obfuscation of the wikitext. I don't agree that fewer characters is de facto easier on editors, especially on newer or lower-volume editors. It is nearly impossible to keep up with the correct order of parameters in all templates, and it is harder to use a template if you have to go look up the order before you can use it. I get the appeal of typing |en| instead of |lang=en|, but that sometime convenience is outweighed in my view by the long term frustration that unnamed parameters can result in. - TheDaveRoss 18:30, 1 October 2016 (UTC)
- Which is exactly why it makes sense to use {{suffix|en}} everywhere, because having the first parameter be the language is a standard practice. To learn which templates need lang= and which don't is a lot more complicated to editors. They should all be changed to conform to the same standard so that there is less confusion. —CodeCat 18:46, 1 October 2016 (UTC)
- I agree here. I don't think a new user is going to have an easier time with a lang= parameter than a required first parameter. If anything, it may be harder because their instinct will be to leave out the lang= parameter and then they'll get confused by the resulting error message. Having the language as the first param makes it very clear that it's required. Benwing2 (talk) 19:18, 1 October 2016 (UTC)
- Sure. However if you are not sure and type |lang=en| then you will not break anything, whereas if you type |en| then it may or may not work, may throw module errors, may create incorrect display text... - TheDaveRoss 20:07, 2 October 2016 (UTC)
- Can we have our cake and eat it too? Does our software have to be so fragile? If a user were to include a lang= parameter anywhere among the arguments, couldn't we make sure that the template does not take the first unnamed parameter as a language code?
- We can make all of the templates much more complex and accomplish that, or MediaWiki could be modified to accomplish that, or a module could be created to accomplish that. I think just using named parameters does it with a lot less fuss, and also results in template source which is easier to read, as well as wikitext which is easier to decipher. Currently if the lang parameter defaults to the first unnamed parameter, and the lang parameter is explicitly defined, it seems to ignore the first parameter altogether. - TheDaveRoss 12:49, 3 October 2016 (UTC)
- We seem to always default to what is easier to program, rather than what is easier for ordinary users, which, I venture, is part of what discourages new users. DCDuring TALK 13:47, 3 October 2016 (UTC)
- In OOP land there is the concept of an interface. For example, if a template implemented the Language interface, it must be able to support input with lang= or the first parameter. Or if a template implemented a Transliteration interface, it must be able to support input with tr= or translit= or transliteration=. We would have to come up with some way to enforce this on the module side. DTLHS (talk) 13:59, 3 October 2016 (UTC)
- (ec) I think that may be the case in some instances, but I don't think that is the case here. I think that making templates easier to read and modify is a boon to editors, especially new editors. I don't think it impacts readers either way (except that they may become editors more readily). All of the arguments I hear about unnamed parameters being preferable are about editing being more laborious, either via increased volume of wikitext or additional keystrokes while entering templates (which could probably be worked around via bots if it was truly a concern). - TheDaveRoss 14:02, 3 October 2016 (UTC)
- I think of three classes of contributors: module and template mavens, frequent editors, occasional editors (close to "normal" users). Occasional editors are the ones who might benefit from named parameter like lang=; frequent editors benefit from brevity, ie, a numbered parameter; mavens benefit from just doing one or the other, not both. A supermaven might have standard code for accepting either numbered or named parameters to be applied where beneficial. Less skilled template authors (eg, me, a template copier/adapter) are happy to get a functioning template without regard to ease of use for anyone other than themselves, let alone normal users. We less skilled would need to be handed the means for allowing both named and numbered parameters on a silver platter.
- All benefit from consistency across templates, Equinox's original point. Having code that supported both for language code use in templates would allow all template users to have the consistency that Equinox is seeking. DCDuring TALK 17:50, 3 October 2016 (UTC)
- I agree with you (and Equinox) that consistency is a desirable outcome. I disagree that removing the lang= parameter is the right direction to take. I think we should add the parameter names where they don't exist, instead of removing them where they do. Requiring that all templates be behemoths with layers upon layers of case statements or module integration does not seem like a solution which would improve things for anyone whose primary concern is anything but minimizing the length of template calls in wikitext. - TheDaveRoss 17:59, 3 October 2016 (UTC)
- I don't think new users are baffled at all by unnamed positional parameters, provided (1) the first positional parameter is reliably the language code, and (2) the template has an informative documentation subpage. Unfortunately, there are currently some 50,000 Templates and modules needing documentation, and that, I submit, is what makes using templates confusing to newcomers. —Aɴɢʀ (talk) 21:00, 3 October 2016 (UTC)
- The templates which need work are a red herring in this case, since nothing under discussion proposes to change the documentation either way. I agree that it would be better if more templates, especially oft-used ones, were documented. My position is that named parameters are easier to decipher, even if sometimes unnamed parameters are decipherable. - TheDaveRoss 21:36, 3 October 2016 (UTC)
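The "accept either form" idea raised above (a lang= parameter or the language as the first unnamed parameter) can be sketched roughly as follows. This is only an illustration in Python, not Lua, and not how any existing Wiktionary module works; the helper name `resolve_lang` and the argument shapes are hypothetical. It assumes that when lang= is present, the first unnamed parameter is not consumed as a language code.

```python
def resolve_lang(positional, named):
    """Return (lang_code, remaining_positional_args).

    Hypothetical helper: if lang= is given explicitly, no positional
    argument is treated as the language code; otherwise the first
    positional argument is taken as the code.
    """
    positional = list(positional)
    lang = named.get("lang")
    if lang:
        # Explicit lang= wins; positional arguments are left untouched.
        return lang, positional
    if not positional:
        raise ValueError("language code required: pass it first or as lang=")
    # No lang=, so the first unnamed parameter is the language code.
    return positional[0], positional[1:]


# Both calling styles resolve to the same language and the same terms.
print(resolve_lang(["en", "happy", "ness"], {}))
print(resolve_lang(["happy", "ness"], {"lang": "en"}))
```

A shared helper like this is one way the "silver platter" could look: each template would call it once instead of carrying its own case logic.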
"Sense" categorisation using the "label" template
The template {{lb}} often categorises an entry.
- Using "archaic" - for example with an archaic word τας. It is incorrectly categorised in Category:Greek terms with archaic senses; it should be in the non-existent Category:Greek archaic terms. But on the other hand
- Using "dated" - with συντροφιά, which has current usage - labelling one of its senses as dated places it in Category:Greek dated terms rather than the non-existent Category:Greek terms with dated senses.
(1) Shouldn't {{lb}} behave in the same fashion with each descriptive?
(2) Don't we need two types of category "Category:Greek terms with xxxx senses" and "Category:Greek xxxx terms"?
- And a way of labelling each? — Saltmarshσυζήτηση-talk 06:06, 3 October 2016 (UTC)
- Per Wiktionary:Votes/2011-04/Lexical categories, which was voted and approved in 2011, it should be "Category:English archaic terms". --Daniel Carrero (talk) 06:30, 3 October 2016 (UTC)
- I don't think that answers my point - what about terms where ONE sense is dated/archaic? — Saltmarshσυζήτηση-talk 06:34, 3 October 2016 (UTC)
- @Saltmarsh Personally, I would prefer having a single category for everything that is archaic. Why make people navigate between two separate "archaic" categories? Splitting the "archaic" category (and "dated", "rare", etc.) in two would require us to work hard in editing all entries, and I'm not sure the benefits justify it. It would require continuous maintenance: an entry with only 1 sense that is archaic would have to be transferred to the other category as soon as it gets a second, non-archaic, sense. Instead of doing it, we could just have Category:English archaic senses and that's it. The sense is being categorized, not the sense in relation to the other senses. In my opinion, having two categories Category:English archaic terms and Category:English terms with archaic senses is like splitting Category:English nouns into Category:English terms that are only nouns and Category:English terms that are nouns as well as other parts of speech. But it may be just me. --Daniel Carrero (talk) 21:38, 6 October 2016 (UTC)
Perhaps a solution to the problem regarding archaic terms versus terms with archaic senses would be wider use of {{term-label}}. {{term-label}} (which is placed at the end of the headword line) should be used to categorize things in, for instance, Category:English archaic terms, while {{label}} should categorize in Category:English terms with archaic senses (or Category:English archaic senses). Then, to switch from one category to the other, you simply switch templates (and of course the position of the template), while the label remains the same. I suppose the current form of Module:labels does not allow this, though... — Eru·tuon 21:47, 6 October 2016 (UTC)
- @Erutuon - I've been away - I was originally drawing attention to the differing behaviour of {{lb}} depending upon whether archaic or dated is used, ie a technical problem. I'll take up the matter elsewhere. (@Daniel Carrero, Erutuon - I think discussion on the appropriate categorisation belongs elsewhere.) — Saltmarshσυζήτηση-talk 05:48, 16 October 2016 (UTC)
Stuff's going to break in 2017
TLDR:
- Someone/some page at this project should probably be subscribed to m:Tech/News.
- The announcement in the most recent edition about mw:Parsing/Replacing Tidy is going to affect Wiktionaries, too.
More details:
w:HTML Tidy is a tool that the devs have been using to silently compensate for some typos in HTML and wikitext code after a page has been saved. Tidy is being removed (but not during 2016) as part of a multi-year plan to update the parsers and improve accessibility.
To give a simple example, </br> is an invalid HTML code (it should be <br> instead). This currently displays as if it were correct because Tidy cleans it up. You can see the pages affected by this particular error by searching for insource:/\<\/br\>/ in the regular search box. There are only about 35 pages in the mainspace and about 15 templates at the English Wiktionary that have this particular error, but that's only one of the errors.
More information, and a list of the major changes, is available at mw:Parsing/Replacing Tidy. In (probably) December, there will be a tool that you can use to visually check previews on pages that you're concerned about (it'll probably be available in Special:Preferences, but turned off by default). In the meantime, there is a list of known errors at mw:Parsing/Replacing Tidy that you may want to review and check your wiki for. I also recommend dropping by w:Wikipedia talk:WikiProject Check Wikipedia and watching for information about scripts and tools. Much of this work can be handled with scripts or bots, but some of the changes (e.g., where to close a table that is missing the |} code to signal the end of the table) will require human judgment.
Most of the information about projects like this is delivered via m:Tech/News. However, nobody at this wiki is subscribed to that weekly newsletter. If you aren't reliably getting this information via another wiki or mailing list, then you may want to subscribe and start watching for announcements like this.
I expect formal announcements about this change to go out later, but I thought you'd want to know about this sooner rather than later. Also, if you work at any other project, please share this information. If you have questions or information to share with the devs about this project, please feel free to {{ping}} me. Whatamidoing (WMF) (talk) 17:19, 5 October 2016 (UTC)
- @Whatamidoing (WMF): This kind of post should go to the WT:Grease pit, not the WT:Tea room. --WikiTiki89 17:40, 5 October 2016 (UTC)
- Moved accordingly. —Μετάknowledgediscuss/deeds 17:44, 5 October 2016 (UTC)
- @Whatamidoing (WMF): Pinging again since the discussion was moved. --WikiTiki89 17:58, 5 October 2016 (UTC)
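For anyone who wants to check a local copy of pages for the </br> example mentioned above, a minimal sketch follows. It assumes you already have page titles and wikitext in a dictionary (for instance from a dump); the function names and the sample data are made up for illustration, and it only covers this one error out of the list at mw:Parsing/Replacing Tidy.

```python
import re

# Same literal pattern as the insource:/\<\/br\>/ search quoted above.
INVALID_BR = re.compile(r"</br>")


def find_invalid_br(pages):
    """Yield (title, count) for each page whose wikitext contains </br>."""
    for title, wikitext in pages.items():
        hits = INVALID_BR.findall(wikitext)
        if hits:
            yield title, len(hits)


def fix_invalid_br(wikitext):
    """Replace every </br> with <br>, leaving everything else untouched."""
    return INVALID_BR.sub("<br>", wikitext)


# Hypothetical sample data, not real page content.
sample_pages = {
    "Template:example": "line one</br>line two</br>",
    "example entry": "already fine<br>here",
}
for title, count in find_invalid_br(sample_pages):
    print(title, count)
```

Cases that need human judgment, such as an unclosed table, are deliberately out of scope for a blind replacement like this.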
Disabling the character box
I'm not using the huge character box when editing pages, and it always takes a small moment to load as a full list and then collapse, which bothers me. Can I disable the character box somehow? I didn't find anything at "Preferences" and the character list does not seem to be stored at a module or template. --Daniel Carrero (talk) 22:39, 5 October 2016 (UTC)
Why does the headword module generate such sortkeys?
It appends 0x0a (newline) and the headword again. This is not a WM thing, as you can see here: for French Table, only the lemma category is affected, not the proper nouns category, which was added manually.
This is also true for English, but not for Georgian. One difference I have noticed about these three languages is that Georgian has an empty _rawData.sort_key property while the other two do not.
Bonus question: why does Module:languages make the sortkey uppercase in the Language:makeSortKey function? We could save some time by not doing a pointless conversion, right? --Giorgi Eufshi (talk) 11:33, 6 October 2016 (UTC)
- Table has a sortkey of Table\nTable, which may seem a meaningless duplication, while Amérique has Amerique\nAmérique : [1]. It guarantees multi-level sorting. Making uppercase was necessary when Wiktionary didn’t automatically do it. Now it can be deleted, if there is no problem in handling σ and ς. — TAKASUGI Shinji (talk) 12:29, 6 October 2016 (UTC)
- The makeSortKey function converts to lowercase, and it does so because the custom per-language sorting rules expect the input to be all lowercase. Removing it would break the sort key generating rules for any entries containing uppercase letters that need to be modified. For example, for German, Ä would no longer be converted to plain A. —CodeCat 18:39, 6 October 2016 (UTC)
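The multi-level sorting that TAKASUGI Shinji describes (Amerique\nAmérique) can be illustrated with a small sketch. This is not the code in Module:headword or Module:languages; it just strips combining marks with Python's unicodedata and ignores the per-language rules and case conversion discussed above. The function names and the word list are made up.

```python
import unicodedata


def strip_diacritics(text):
    """Drop combining marks after canonical decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def two_level_sortkey(headword):
    """Primary key without diacritics, then a newline, then the headword."""
    return strip_diacritics(headword) + "\n" + headword


words = ["Table", "Amers", "Amérique"]
# Plain code-point order puts "Amers" before "Amérique" because e < é.
print(sorted(words))
# With the two-level key, "Amérique" sorts as "Amerique" first, and the
# original spelling only breaks ties between otherwise equal keys.
print(sorted(words, key=two_level_sortkey))
```

For a headword with no diacritics the key is simply the word twice, which is why Table ends up as Table\nTable.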
Changing Ancient Greek prepositions to prefixes in etymologies
Could this be done by bot? There are a lot of Ancient Greek etymologies in which prepositions – for example, ἐκ (ek) – need to be changed to prepositional prefixes – for example, ἐκ- (ek-) – and I was doing it manually, but it occurs to me that it's such a trivial task that a bot could do it.
I don't know if this is helpful, but I was using the following regex on gEdit: m(\|grc\|\w{2,3})(\|)\|([\w ]*)\}\} \+ \{\{m\|grc(\|\w*\|)\|([\w ]+) > affix\1-\2t1=\3\4t2=\5. Wouldn't work in all cases, but it is a start. — Eru·tuon 18:28, 6 October 2016 (UTC)
- @Erutuon: Why do we even need to have them as prefixes? Why not just treat them as compounds? — ObſequiousNewt — Geſpꝛaͤch — Beÿtraͤge 21:22, 7 October 2016 (UTC)
- @ObsequiousNewt: Well, sometimes the prefixes have a similar meaning to one of the prepositional phrases composed of the corresponding preposition and the genitive, accusative, or dative, or the corresponding adverb: for instance, when ἐκ- (ek-), κατα- (kata-), συν- (sun-), and μετα- (meta-) mean "out", "down", "with" and "after". In other cases they don't: for instance, when ἐκ- (ek-), κατα- (kata-), and συν- (sun-) are just intensifiers, or when μετα- (meta-) expresses the idea of change of state. And the prefix can never have all the meanings expressed by the preposition. So either a set of meanings associated with the prefix has to be appended to the end of the entry on the preposition or adverb, or a separate entry for the prefix has to be created. I favor the separate entry, because it's neater – and then you don't have to decide which POS to put the prefix definitions under: for instance, whether the definitions of μετα- (meta-) should go in the Preposition or Adverb section of the μετά (metá) entry. — Eru·tuon 21:56, 7 October 2016 (UTC)
- @Erutuon: Ah, yes, I hadn't considered that. So, yeah, that sounds like a good idea. Make sure you don't apply it to adverbs, though, where the preposition itself is going to be the root. — ObſequiousNewt — Geſpꝛaͤch — Beÿtraͤge 22:33, 7 October 2016 (UTC)
- Also, treating them as prefixes allows categories like CAT:Ancient Greek words prefixed with μετα-; we don't have any infrastructure for a category like "Ancient Greek words compounded with μετά". —Aɴɢʀ (talk) 22:55, 7 October 2016 (UTC)
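Erutuon's gEdit search-and-replace can be tried out in Python as a rough sketch. The pattern is the one quoted above; the replacement uses \g<n> instead of \1 only to keep the group numbers unambiguous. The function name and the sample etymology line are made up for illustration, and the input is normalised to NFC here so that \w sees precomposed Greek letters. As noted in the thread, this will not be right in all cases (adverbs, for one), so the output would still need review.

```python
import re
import unicodedata

# Pattern quoted in the thread: two {{m|grc|...}} links joined by " + ".
PATTERN = re.compile(
    r"m(\|grc\|\w{2,3})(\|)\|([\w ]*)\}\} \+ \{\{m\|grc(\|\w*\|)\|([\w ]+)"
)
# Same replacement as affix\1-\2t1=\3\4t2=\5, written with \g<n>.
REPLACEMENT = r"affix\g<1>-\g<2>t1=\g<3>\g<4>t2=\g<5>"


def prepositions_to_prefixes(wikitext):
    """Rewrite 'preposition + word' link pairs as a single affix call."""
    text = unicodedata.normalize("NFC", wikitext)
    return PATTERN.sub(REPLACEMENT, text)


# Hypothetical etymology line, used only to show the shape of the change.
before = "{{m|grc|ἐκ||out}} + {{m|grc|βάλλω||to throw}}"
print(prepositions_to_prefixes(before))
```

Text that does not match the pattern is returned unchanged, so running it over a whole entry only touches the matching etymology fragment.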
Module:grc-translit is messing up iota+rough breathing combinations
Example: ἱ (hi) ἵ (hí) ἳ (hì) ἷ (hî). All these are showing as ih with an accent on the h, when it should be hi accented on the i. Even weirder things happen if you combine them (not that this is ever needed, but I tried to do it just now writing this post): ἵ ἳ ἷ (hí hì hî). This bug doesn't seem to be affecting any other vowels, or even capital iotas: Ἱ (Hi) Ἵ (Hí) Ἳ (Hì) Ἷ (Hî) are fine. 86.138.252.146 18:56, 6 October 2016 (UTC)
- @ObsequiousNewt: any ideas? —Aɴɢʀ (talk) 18:28, 7 October 2016 (UTC)
- In the quotation at Θεός, ὁ is currently being transliterated as oh instead of ho; yet when I link ὁ (ho) by itself, its transliteration is correct. —Aɴɢʀ (talk) 18:48, 7 October 2016 (UTC)
- Another similar bug, capital letters with rough breathings don't come out with the correct capitalization if they're not at the start: e.g. οἱ Ἕλληνες (hoi Héllēnes). (I guess with this one that the hÉ is what the algorithm spits out, and there's a check for miscapitalization that only looks at the first two characters rather than the whole string.) 86.138.252.146 21:02, 7 October 2016 (UTC)
- I'm aware of this; working on fixing it. — ObſequiousNewt — Geſpꝛaͤch — Beÿtraͤge 21:25, 7 October 2016 (UTC)
- Done. I feel confident in saying this fix should work for everything; the code is much more robust now. — ObſequiousNewt — Geſpꝛaͤch — Beÿtraͤge 22:30, 7 October 2016 (UTC)
- You should probably write some testcases. DTLHS (talk) 22:32, 7 October 2016 (UTC)
- Done. — ObſequiousNewt — Geſpꝛaͤch — Beÿtraͤge 18:54, 8 October 2016 (UTC)
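The cases reported in this thread make a natural set of testcases. The sketch below is not Module:grc-translit; it is a deliberately tiny Python stand-in that covers only the handful of letters needed here, to show the expected behaviour: the h goes before the whole vowel or diphthong, accents stay on the vowel, and a capital moves onto the H. The names `translit_word` and `translit` and the letter table are hypothetical, and rho with rough breathing is not handled.

```python
import unicodedata

ROUGH = "\u0314"   # combining rough breathing
SMOOTH = "\u0313"  # combining smooth breathing
# Greek accent marks mapped to the Latin combining marks used in output.
MARKS = {"\u0301": "\u0301", "\u0300": "\u0300", "\u0342": "\u0302"}
# Only the letters needed for the examples in this thread.
LETTERS = {
    "α": "a", "ε": "e", "η": "e\u0304", "ι": "i", "ο": "o", "υ": "u",
    "ω": "o\u0304", "λ": "l", "ν": "n", "σ": "s", "ς": "s",
}


def translit_word(word):
    out = []
    rough = False
    for ch in unicodedata.normalize("NFD", word):
        if ch == ROUGH:
            rough = True          # remember it; the h is added afterwards
        elif ch == SMOOTH:
            continue              # smooth breathing is not written
        elif ch in MARKS:
            out.append(MARKS[ch])
        else:
            latin = LETTERS.get(ch.lower(), ch)
            out.append(latin.capitalize() if ch.isupper() else latin)
    text = "".join(out)
    if rough:
        # Assumes the breathing sits on the word-initial vowel or diphthong.
        if text[:1].isupper():
            text = "H" + text[0].lower() + text[1:]
        else:
            text = "h" + text
    return unicodedata.normalize("NFC", text)


def translit(text):
    return " ".join(translit_word(word) for word in text.split(" "))


# Expected values are the ones given in the bug reports above.
CASES = {
    "ἱ": "hi", "ἵ": "hí", "ἳ": "hì", "ἷ": "hî",
    "Ἱ": "Hi", "ὁ": "ho", "οἱ Ἕλληνες": "hoi Héllēnes",
}
for greek, expected in CASES.items():
    assert translit(greek) == unicodedata.normalize("NFC", expected), greek
```

Checking the whole string rather than only its first characters is what catches the mid-phrase capital in οἱ Ἕλληνες.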
I just noticed that the Ionic form ἑωυτοῦ (heōutoû) in ἑαυτοῦ (heautoû) is transliterated as heōytoû. Seems odd. ωυ is a diphthong, and so the second element should be u. Or is ōy supposed to be the Attic and Koine pronunciation, because Athenians or worldwide Greek speakers wouldn't recognize ωυ as a diphthong? — Eru·tuon 08:51, 8 October 2016 (UTC)
- Whoops, I missed that. Fixed. — ObſequiousNewt — Geſpꝛaͤch — Beÿtraͤge 18:54, 8 October 2016 (UTC)
- Why was the transliteration of υ changed to y? I've reverted it, per WT:GRC TR. —CodeCat 19:10, 8 October 2016 (UTC)
- y is a more accurate representation of the fronted pronunciation of the monophthongs ῠ, ῡ in Classical Attic, Ionic, and Koine than u. However, it's probably not accurate for Aeolic or Doric, though unfortunately there isn't as much information on them. Boeotian (an Aeolic dialect) apparently had a back ῡ, since it was sometimes written as ου in an Attic-influenced spelling system. Perhaps, then, there should be an Attic–Ionic–Koine transliteration that uses y and a transliteration for other dialects that uses u. — Eru·tuon 19:19, 8 October 2016 (UTC)
- Transliterations don't have to be phonetically correct, and the consensus was arrived at by people who were aware of the phonetic facts. This isn't the place to debate it. More urgently, @ObsequiousNewt, the recent edits have resulted in module errors in dozens of entries. Chuck Entz (talk) 19:56, 8 October 2016 (UTC)
- Fixed; dumb error. I changed ⟨u⟩ to ⟨y⟩ because ⟨y⟩ always appears in Latin (and so in French, English, etc.), because /y/ is the sound which υ represents in Attic (and presumably Ionic/Epic as well, and together those make up the vast majority of the Greek corpus) and is not the same sound that appears in diphthongs, and because the transliteration of υ as ⟨u⟩ is an idiosyncrasy which I have only seen elsewhere in beta code. — ObſequiousNewt — Geſpꝛaͤch — Beÿtraͤge 23:20, 8 October 2016 (UTC)
- It's also worth noting that in Boeotian, /u/ was spelled ⟨ου⟩ upon the adoption of the Ionic alphabet (as occasionally in some other dialects, which is partly how we know that they had /u/ to begin with), and οι, which had become a front rounded monophthong (probably /ø/), was often spelled ⟨υ⟩. — ObſequiousNewt — Geſpꝛaͤch — Beÿtraͤge 00:57, 9 October 2016 (UTC)
- I support using y in transliterations, at least for Attic, Ionic, Koine, and Medieval Greek words. I suppose using u for /y/ wouldn't be totally nonsensical, because after all that's its pronunciation in French, but it's misleading to use the same symbol for the second element of diphthongs and for the independent vowel when they have different pronunciations. ObsequiousNewt, I think W. Sidney Allen said Ionic also had /y/ based on a borrowed word in Herodotus, though I don't have his book on hand. I'm not sure how we would know if the language at the time of the writing of the Iliad and Odyssey did; at the very least, the form of the text that we have is heavily influenced by Attic, so it makes the most sense to use an Attic pronunciation. I would rather use u for whichever dialects didn't have a fronted pronunciation, though I am not sure how that would be implemented... by creating language codes for Ancient Greek dialects, perhaps? — Eru·tuon 03:05, 14 October 2016 (UTC)
l-self and Modern Greek
I recently edited {{el-nF-η-ες-1}} to use {{l-self}} rather than bare links. Unfortunately, it's causing things to break in inflection tables whenever the generated form is identical to the page name (i.e. exactly the circumstances under which it's supposed to present boldface and no link rather than roman face and a link). See the declension table at αρχή#Declension, for example. Is there something at {{l-self}} or one of the modules it uses that needs to be fixed? —Aɴɢʀ (talk) 18:27, 7 October 2016 (UTC)
- Fixed; {{el-decl-noun}} is what should have been edited. --WikiTiki89 18:41, 7 October 2016 (UTC)
grc-conj
@ObsequiousNewt or anyone else who feels up to the task: {{grc-conj}} or Module:grc-conj is having a problem with the passive of the liquid/nasal futures. {{grc-conj|fut-ln|X|Y}} is causing the future passive of all verbs to appear as ήσομαι instead of as -ήσομαι suffixed to Y. See πίνω (pínō), for example, where the template says {{grc-conj|fut-ln|πῐ|ποθ|form=mp}} but the table is showing the future passive as ήσομαι rather than ποθήσομαι. —Aɴɢʀ (talk) 22:07, 7 October 2016 (UTC)
- @Angr Fixed. — ObſequiousNewt — Geſpꝛaͤch — Beÿtraͤge 23:58, 7 October 2016 (UTC)
I've done a big dump
Hoho. I've just finished the overnight processing of all the English-language mainspace entries, so that I can do some of my todos, like creating missing etymologies and plurals. Any other statistics etc. that anyone needs on these entries? I might be able to do that. (Provisos: there are literally just three or four entries I had to skip over, because my code choked on a handful of rare symbols like the Hawaiian okina.) Equinox ◑ 09:33, 8 October 2016 (UTC)
- It would be useful to see if there are any entries with English headers that aren't in either CAT:English lemmas or CAT:English non-lemma forms. If categories are a problem, then look for English entries lacking headword templates. Chuck Entz (talk) 18:06, 8 October 2016 (UTC)
- Category links don't seem to be present in the page dumps, so I would have to deduce this by checking for the presence of any headword template (as you say). Where would I get a list of all valid headword templates? Sounds like quite a parsing challenge. Equinox ◑ 11:38, 9 October 2016 (UTC)
- Might be easier to grab the category links dump. I think it is sql only (no xml) but you wouldn't have to reinvent the wheel if you are running a SQL server or want to just parse the query. - TheDaveRoss 18:09, 21 October 2016 (UTC)
- @Equinox, find circular definitions. That'd be great.--Dixtosa (talk) 19:34, 22 October 2016 (UTC)
- I wonder if one could prove a fixed point theorem or something in graph theory for lexicography, demonstrating that there must be circular definitions. Also, see w:Circular_definition#Circular_lexicographic_.28dictionary.29_definitions at WP.
- I think that circular definitions that involve cycles of four or more definitions would not be noticed by users. I don't know about 3-definition circularity. Two-definition circularity definitely seems like a problem. Perhaps circular definitions of 3 or more headwords with the same stem (eg, quick, quickly, quickness, quicken) would be noticed. DCDuring TALK 23:12, 22 October 2016 (UTC)
- Every graph that is not a tree contains a cycle. If the dictionary was structured as a tree there would be words without definitions. Therefore there will always be circular definitions. DTLHS (talk) 23:23, 22 October 2016 (UTC)
- D'oh. But the only ones users notice are the ones involving the smallest of cycles? DCDuring TALK 00:14, 23 October 2016 (UTC)
- I think there is a different definition of cyclical which is more relevant than the mathy one. We care more about short-form definitions which are often simply more common synonyms than we do long-form definitions which describe the meaning of the word instead of providing a synonym. An example: definition of boat is ship, definition of ship is boat. It is OK that the definition of the contains an a, and the definition of a contains a the. - TheDaveRoss 18:32, 1 November 2016 (UTC)
- @Equinox, also find entries that have mismatched number of senses and translation tables.--Dixtosa (talk) 18:28, 1 November 2016 (UTC)
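A minimal sketch of the short-cycle search discussed in this thread (boat defined as ship, ship defined as boat), assuming the definitions have already been reduced to a mapping from each headword to the headwords its definition links to. The function name `find_short_cycles`, the `max_len` parameter and the toy data are illustrative assumptions, not anything taken from the dump script mentioned above.

```python
from typing import Dict, List, Set, Tuple


def find_short_cycles(defs: Dict[str, Set[str]], max_len: int = 3) -> List[Tuple[str, ...]]:
    """Return definition cycles of length 2..max_len, each reported once.

    `defs` maps a headword to the set of headwords used in its definition
    (a hypothetical, pre-parsed form of the dictionary).
    """
    found = set()

    def walk(start: str, path: List[str]) -> None:
        for nxt in defs.get(path[-1], ()):
            if nxt == start and len(path) >= 2:
                # Rotate so the smallest word comes first; the same cycle
                # reached from different starting words is then stored once.
                i = path.index(min(path))
                found.add(tuple(path[i:] + path[:i]))
            elif nxt not in path and len(path) < max_len:
                walk(start, path + [nxt])

    for word in defs:
        walk(word, [word])
    return sorted(found)


# Toy data only: two short cycles and one harmless one-way link.
toy = {
    "boat": {"ship"},
    "ship": {"boat"},
    "quick": {"fast"},
    "fast": {"rapid"},
    "rapid": {"quick"},
    "the": {"a"},
}
print(find_short_cycles(toy))  # [('boat', 'ship'), ('fast', 'rapid', 'quick')]
```

Capping the cycle length keeps the search cheap and matches the point made above that only the smallest cycles are likely to bother readers. It does not address the separate question of which links are synonym-style definitions and which are incidental function words.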
Where's the English word count?
On our front page, we give our total entry count ("4,915,340 entries with English definitions from over 2500 languages") and we show the word counts for Wiktionaries in other languages, but we don't list how many English words we have in en.wikt. Is that trivial to add? I wanted to compare it with the results of my dump script, which generated 675,738 English entries on my computer. Equinox ◑ 11:37, 8 October 2016 (UTC)
- @Equinox: Category:English lemmas yields 443,225 entries. That category seems to be a pretty good handle on the English term count. I agree that English term count seems to be a more interesting number than the total number of lemma and non-lemma entries since the latter very much depends on our coverage of non-lemmas of highly inflected languages, and therefore, it would be cool to see the number of English lemmas on the front page. --Dan Polansky (talk) 13:11, 8 October 2016 (UTC)
- Hm, I wonder why the disparity. What English entries don't go into English lemmas? (head|en|noun) does (e.g. moules marinières) and so do symbols (e.g. the verb ♥). I did not keep redirects since they have no language header. Anyway, everything I've got seems to look okay. Equinox ◑ 13:25, 8 October 2016 (UTC)
- Oh, haha, I'm an idiot. My list includes non-lemmas, which... kind of aren't lemmas. Equinox ◑ 13:26, 8 October 2016 (UTC)
- And Category:English non-lemma forms yields 240,611 items, which, together with the lemma count, adds up to 683,836. I kind of do not think of non-lemmas as words; to me, words are lexemes. --Dan Polansky (talk) 14:17, 8 October 2016 (UTC)
- Since there are entries such as wound that are both lemmas and non-lemmas, that can only be an approximation. Chuck Entz (talk) 17:55, 8 October 2016 (UTC)
- Good point. The sum can still serve as some sort of a quick sanity check for a number determined by other means. --Dan Polansky (talk) 12:53, 15 October 2016 (UTC)
- If words were lexemes then we wouldn't really need the word "lexeme"... Equinox ◑ 18:09, 8 October 2016 (UTC)
- @Equinox: We would, to remove ambiguity. Scientists like to coin specialized terminology that is less ambiguous. If someone asks me what is the number of words covered in a dictionary, I expect them to ask about the number of lexemes. Others may differ. Similarly, if someone asks what the number of words a language has, I expect them to be talking about lexemes. --Dan Polansky (talk) 12:53, 15 October 2016 (UTC)
- The number comes from Special:Statistics. That particular number is the "number of content pages". Perhaps we should give 2297446 as the total number of "gloss1 definitions" (Footnote: "1 The number of gloss definitions a language has is the number of senses the words in that language have. Inflected forms which are merely defined as "plural of x", "past tense of x", etc are not included in the count of gloss definitions, while they are included in the count of total definitions."). --WikiTiki89 17:56, 10 October 2016 (UTC)
- All interesting stuff. The lemma to non-lemma figures can swing the other way in foreign languages compared to English. In Bokmål currently 12176 lemmas and 31739 non-lemmas, and some of those can be both as Chuck Entz mentioned. DonnanZ (talk) 18:57, 14 October 2016 (UTC)
- Another thought. I don't think "more" and "most" comparative forms of English adjectives are recorded as non-lemmas, so the non-lemma figures probably reflect that. Similarly in other languages that use this form of comparative. DonnanZ (talk) 19:08, 14 October 2016 (UTC)
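As a quick check of the arithmetic in this thread, the sketch below adds the two category counts quoted above and shows why the sum can only be an upper bound: a page such as wound sits in both categories and would be counted twice. The helper `count_entries` and the tiny page sets are illustrative assumptions.

```python
# Figures as quoted in the thread (October 2016).
lemmas = 443_225       # Category:English lemmas
non_lemmas = 240_611   # Category:English non-lemma forms

upper_bound = lemmas + non_lemmas
print(upper_bound)  # 683836


def count_entries(lemma_pages, non_lemma_pages):
    """Count distinct pages; a page in both categories is counted once."""
    return len(set(lemma_pages) | set(non_lemma_pages))


# "wound" is both a lemma and a non-lemma form, so 2 + 2 pages give 3 entries.
print(count_entries({"wound", "dog"}, {"wound", "dogs"}))  # 3
```

Any gap between the summed categories and a count taken from a dump may also reflect the dump being older than the live categories, so the difference should not be read as an exact overlap figure.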
"book rendering failed" when generating PDF
See Wiktionary:Information_desk/2016/October#book_rendering_failed. Our PDF generation feature seems to be broken, with this error: "book rendering failed. There was an error while attempting to render your book". To reproduce the error, visit the dog entry (for example) and choose Print/Export, Download as PDF on the left-hand menu panel. Apparently it works fine on French Wiktionary. Equinox ◑ 08:55, 9 October 2016 (UTC)
- Seems to have also been a problem on Wikipedia but got fixed. See w:Wikipedia:Village_pump_(technical)/Archive_150#Download as PDF completely broken? —Enosh (talk) 12:21, 9 October 2016 (UTC)
Can we use CSS in place of ---- between language sections?
Would it be possible to use CSS to have the lines between language sections be generated automatically so that we wouldn't have to manually add ----? --WikiTiki89 20:07, 10 October 2016 (UTC)
- Would we have to do something like h2 { border-bottom: 1px solid black }? That would also put a line after the last section on the page, something we don't do now. DTLHS (talk) 22:19, 10 October 2016 (UTC)
- h2 is just the header, not the whole section. There's currently no HTML element to contain the section itself, though I think they are "working" on that. —CodeCat 22:21, 10 October 2016 (UTC)
- I think this would work:
.ns-0 h2:not(:first-of-type) {
border-top: 1px solid #aaa;
margin-top: 1.5em;
}
- It adds a grey line above each h2 in the Main mainspace except for the first one on a page. —suzukaze (t・c) 22:32, 10 October 2016 (UTC)
- Where are the h2 styles defined anyway? Does anyone know? --WikiTiki89 16:04, 11 October 2016 (UTC)
- I don't think it matters. Putting styles in MediaWiki:Common.css can override any default MediaWiki styles. —suzukaze (t・c) 10:34, 13 October 2016 (UTC)
- @Suzukaze-c: It matters because I want to see the code. --WikiTiki89 20:28, 17 October 2016 (UTC)
- Right-click any second level header and choose "inspect element". If you are landed on a span element click h2 which should be just above it. On the right you'll see all the styles that affect h2. This is for Chrome but other browsers behave similarly. --Giorgi Eufshi (talk) 06:10, 21 October 2016 (UTC)
- this gives exactly the same look at least for the vector skin. --Giorgi Eufshi (talk) 12:20, 13 October 2016 (UTC)
hr { display:none; } .ns-0 h2:not(:first-of-type){ height: 48px; line-height: 69px !important; margin-top: 3px !important; border-top: 1px solid #aaa; }
- If we did this, would we have to remove all of the manual ----'s first? DTLHS (talk) 22:29, 14 October 2016 (UTC)
- We could probably hide those with css, too. Chuck Entz (talk) 22:54, 14 October 2016 (UTC)
- What we do first or second doesn't really matter, but yes, if we do this, we will get rid of the manual ----s. --WikiTiki89 20:30, 17 October 2016 (UTC)
- I tested that CSS code. It seems to work well: it hides the manual "hr" horizontal line and it causes the automatic CSS horizontal line to appear above each h2 title except the first, in the main namespace only. Can we implement it? --Daniel Carrero (talk) 03:31, 11 November 2016 (UTC)
- The obvious question: do any of our templates use <hr>? It can't be that many, so workarounds would be simple, but we should at least be aware whether they do. Chuck Entz (talk) 05:07, 11 November 2016 (UTC)
- If we implement that CSS code and subsequently remove all the ----'s from all entries, we can remove the code hr { display:none; } and let templates use horizontal lines freely again if needed. --Daniel Carrero (talk) 05:10, 11 November 2016 (UTC)
- Specifically, I tested the CSS code with both Tabbed Languages and non-Tabbed Languages (with and without all the ----'s), and the proposed CSS seems to work perfectly well in both cases. Technically, the Tabbed Languages view already hides the "hr" between languages anyway and all the "h2" (language names, like ==English==) are converted to a table so nothing changes. --Daniel Carrero (talk) 15:22, 11 November 2016 (UTC)
- I do think this is pretty ingenious. The "----" is devoid of actual meaning and only functions as a visual separator, so it would be really good to replace it with visual markup. Equinox ◑ 04:52, 11 November 2016 (UTC)
- Can this be done now, or does it need a vote? I recall that this proposal was made before by me, but went nowhere. —CodeCat 14:08, 11 November 2016 (UTC)
- I think we should have a vote for a couple of reasons: a vote might help more people see this (in conjunction with a BP topic) since it is such a major change in terms of how people will see the wikitext displayed, and we will also be modifying the ELE which needs a vote. - TheDaveRoss 15:08, 11 November 2016 (UTC)
- A vote would be wise.
- Doesn't the testing need to include the different major browsers (with different OSes, including the mobile ones), both with and without tabbed languages? This seems either burdensome or impossible for one or two people to do. Perhaps a page containing testing instructions and a list of hardware, OS (version), browser (version) combinations to be tested could help get others involved and get more assurance that the change had no major glitches. I, for example, could test FF and Chrome on Android devices Samsung Galaxy Nexus and Samsung tablet. DCDuring TALK 16:22, 11 November 2016 (UTC)
- If we want to be really thorough, we can either make it a gadget that can be tested on all pages, or put code in either common.css or common.js that applies to only one or a few pages. I prefer the latter, so that we can have everyone look at the page with their own customary configuration, and some without logging in. The test page should have a random combination of language headers both with and without the "----", and some representative content. Among that content should be known trouble items such as images, {{wikipedia}} and other boxes, and collapsible boxes, both before and after the language header. I'd especially like to know what happens when floating items get pushed past the end of a language section.
- We should have a section on the talk page or on a separate page where people can log their observations, along with their browser, skin, and any gadgets that they may have active which have anything to do with language headers (or even that can intrude things into the boundary between language sections, such as right-hand TOC). I know that, just among the regulars, we have Macs and PCs, and at the very least Chrome, MS Edge, Firefox, IE, and Safari browsers, and undoubtedly other configurations. Chuck Entz (talk) 17:12, 11 November 2016 (UTC)
- I tested the new CSS using only Firefox 49.0.2 on Windows 7. w3schools states that the selectors used work with all the 5 listed major browsers: "first of type", "not".
- If you guys want to be really thorough and do all these tests, I'm OK with it, be my guest. But I "vote" for just implementing the CSS, because TabbedLanguages won't be affected anyway since it already hides both the "hr" and "h2" as I said above, and non-TabbedLanguages is basically just the addition of a border. Although I admit it would be ugly if either ":first of type" or ":not" didn't work in a certain browser, which does not seem to be the case for those 5 major browsers listed. Just my 2 cents.
- Incidentally, text-only browsers presumably would not be able to display the borders or the horizontal lines anyway -- or so I believe, but I didn't test it. I tried installing Lynx (which is a text-only browser) now from this site, but the browser generates an error message saying "SSLEAY32.dll is missing" and won't start. --Daniel Carrero (talk) 17:37, 11 November 2016 (UTC)
- I would not rush into this. Giorgi's styling does not give the same look in the other three skins, and even in Vector the result depends on the last HTML tag in the previous language section. This often happens to be ol, ul or NavFrame's div, though. Also, we have a gadget that places country flags in second-level headers, and that also needs testing. Lastly, there will probably be a little delay due to CSS being downloaded separately. See this entry in French Wiktionary and note the delay owing to styling headers (shrinks) and table of contents (collapses to some level). Not true: there is no delay for CSS, for some reason. --Dixtosa (talk) 21:52, 11 November 2016 (UTC)
- (I think Giorgi/Dixtosa's "identical look to Vector" is unnecessarily specific. —suzukaze (t・c) 05:32, 13 November 2016 (UTC))
Diff problem
I found a problem with the proposed CSS code. If we compare revisions of any entry, we'll notice that the diff page contains an h2 title like "Latest revision as of 23:25, 12 November 2016" at the top. Therefore, in this case, the first language section (like "English") is actually the second h2 title of the current page. As a result, if the proposed CSS code is enabled, a horizontal border is added at the top of the first language section in diff pages. --Daniel Carrero (talk) 01:30, 13 November 2016 (UTC)
hr { display: none; } .ns-0 h2:not([class]) ~ h2:not(:first-of-type) { border-top: 1px solid #aaa; margin-top: 1.5em; }
- seems to work. —suzukaze (t・c) 02:00, 13 November 2016 (UTC)
- I'm pretty sure the border-top should not be 10px; it's too thick. --Daniel Carrero (talk) 04:12, 13 November 2016 (UTC)
- Oops, that was for testing purposes to make sure that it worked. (It was bright red too.) —suzukaze (t・c) 05:19, 13 November 2016 (UTC)
- Apparently, you successfully fixed the diff problem that I had mentioned: I compared revisions now with that CSS enabled, and the first language section did not have a border above it. The top border only appears from the second language onwards, as planned.
- But, if we are aiming to use a CSS code that is either identical or very close to the current use of ----'s, then the proposed code still has a very noticeable problem: the normal look using ----'s has the top border somewhat far from the language name. With the proposed CSS, the top border is too close to the language name; in fact, the top border is as close to "English" as the bottom border is. --Daniel Carrero (talk) 06:04, 13 November 2016 (UTC)
- I forgot to test in the Vector skin... (in Monobook there is a comfortable space under the fake "----" already) What about this:
hr { display: none; } .ns-0 h2:not([class]) ~ h2:not(:first-of-type) { border-top: 1px solid #aaa; margin-top: 1.5em; padding-top: .5em; }
- —suzukaze (t・c) 06:13, 13 November 2016 (UTC)
Edit request: MediaWiki:Common.css
Please change the /* Chinese (Han) */ block to the following:
/* Chinese (Han) */
/* Hani: generic */
/* Hans: simplified */
/* Hant: traditional */
.Hani,
.Hans {
font-family: PingFang SC, Heiti SC, DengXian, Microsoft Yahei, SimHei, Source Han Sans CN, Noto Sans CJK SC, SimSun, NSimSun, SimSun-ExtB, Song, sans-serif;
}
.Hant {
font-family: PingFang TC, Heiti TC, Microsoft Jhenghei, Source Han Sans TW, Noto Sans CJK TC, PMingLiU, PMingLiU-ExtB, MingLiU, MingLiU-ExtB, Ming, sans-serif;
}
.Hani,
.Hans,
.Hant {
font-size: 1.2em;
}
.Hani, .Hani *,
.Hans, .Hans *,
.Hant, .Hant * {
font-style: normal;
font-weight: normal;
}
big.Hani, strong.Hani, b.Hani, b .Hani,
big.Hans, strong.Hans, b.Hans, b .Hans,
big.Hant, strong.Hant, b.Hant, b .Hant {
font-size: 137%;
}
.Hani b,
.Hans b,
.Hant b {
font-size: 125%;
}
per discussion at Wiktionary talk:About Chinese#New font for Chinese?. Thanks. Wyang (talk) 10:40, 13 October 2016 (UTC)
- (But leave the Korean and Vietnamese sections alone. —suzukaze (t・c) 10:44, 13 October 2016 (UTC))
- @Atitarev Could you help please? Wyang (talk) 06:55, 14 October 2016 (UTC)
- @Wyang I will, as soon as I get to my desktop and have a moment. Sorry for the accidental revert, BTW. Silly Safari on iPhone.--Anatoli T. (обсудить/вклад) 06:58, 14 October 2016 (UTC)
- No worries, thanks! Wyang (talk) 06:59, 14 October 2016 (UTC)
- @Wyang Done, please check. --Anatoli T. (обсудить/вклад) 07:18, 14 October 2016 (UTC)
- Looks good, thanks! Wyang (talk) 07:25, 14 October 2016 (UTC)
OrphicBot and variation appendices
There's a bug in User:OrphicBot: it doesn't keep track of the existence of the variation appendices. See this diff, where at Uru the Appendix:Variations of "uru" already appears, but OrphicBot adds accented and capitalized variants atop that as well.
I would suppose the fix would be to check for the existence of a variations page, scrape that appendix and see if the variations that OrphicBot is tracking are listed on the appendix page or not, and then add a fixme banner with the missing variations to the top of the appendix page. The bot would then add the appendix page to entry pages that are missing links to the appendix.
-- 65.94.171.217 04:00, 14 October 2016 (UTC)
- Pinging the operator, @Isomorphyc —Μετάknowledgediscuss/deeds 04:56, 14 October 2016 (UTC)
- @Metaknowledge, 65.94.171.217: Currently, the also links consist of computed and user-added variations. If there are eight or fewer computed variations, I add the direct links, whether or not an appendix exists. If there are more than eight, I do not add links, but only the appendix. (I have not committed anything to appendices yet; writing to the appendices is still a work in progress.) I think this is a reasonable policy because appendices often contain data outside of the primary scope of the also templates, such as alternate script variants. The suppression of links and creation of an appendix depend on the size of the computed list, but neither depends on the other, because anyone can add an appendix for any reason independent of list size. Isomorphyc (talk) 10:02, 14 October 2016 (UTC)
- How about adding the {{also}} to the variation appendix instead, with the variations that OrphicBot is tracking; while each entry page would get linked to the variation appendix? Anyone cleaning up the appendix page can add entries to the proper locations there, while entry pages show the appendix link instead of the variations that OrphicBot is tracking. (this would not affect non-tracked links in the {{also}} on entry pages) -- 65.94.171.217 04:36, 15 October 2016 (UTC)
- @65.94.171.217: This is close to the policy which will be followed for pages with more than eight computed variants, except that since there are only a few hundred appendices, it is really not that hard to put words under appropriate headings directly after performing some de minimis normalisation of the appendices. But about 99.85% of the {{also}} templates do not involve appendices. The primary purpose is to provide easily accessible links in a prominent place. A line has to be drawn someplace for including links whether or not an appendix exists, and eight was the smallest number suggested. The largest was fifteen. Thanks for thinking about this, though. Isomorphyc (talk) 18:13, 16 October 2016 (UTC)
- Does/Can OrphicBot also add the variation appendices into the {{also}} on each entry page for which a variation appendix page exists? -- 65.94.171.217 08:44, 19 October 2016 (UTC)
- @65.94.171.217: It does not currently, but it will in the next set of updates. Isomorphyc (talk) 15:10, 24 October 2016 (UTC)
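The linking policy described in this thread can be sketched roughly as follows. This is not OrphicBot's code: the function name `also_links`, its parameters and the example titles are assumptions for illustration, and the sketch follows the planned behaviour (an existing appendix is always linked) rather than what the bot did at the time.

```python
from typing import List, Optional

MAX_DIRECT_LINKS = 8  # threshold stated in the thread; fifteen was also suggested


def also_links(computed: List[str], user_added: List[str],
               appendix: Optional[str]) -> List[str]:
    """Build the link list for an {{also}} line (hypothetical helper).

    computed   -- variants found automatically (case, accents, ...)
    user_added -- links editors added by hand; always kept
    appendix   -- title of the variation appendix, or None if there is none
    """
    links = list(user_added)
    if len(computed) <= MAX_DIRECT_LINKS:
        # Few variants: link them directly, whether or not an appendix exists.
        links += [v for v in computed if v not in links]
    if appendix is not None and appendix not in links:
        # With many variants the appendix stands in for the suppressed links.
        links.append(appendix)
    return links


print(also_links(["Uru", "urú"], [], 'Appendix:Variations of "uru"'))
```

Keeping the two decisions separate (whether to show direct links, whether to link an appendix) mirrors the remark above that neither depends on the other.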
Noun Classes in translation tables
When you enter a genus and a class, you get an error saying you can't have both, but within Germanic languages, speaking of masculine class I vs. feminine class I is quite common. Why the blockage? Korn [kʰũːɘ̃n] (talk) 09:36, 14 October 2016 (UTC)
- Classes here are meant to be used for languages that have classes in place of genders, such as the Bantu languages. It is not meant to be a declension class. --WikiTiki89 14:56, 14 October 2016 (UTC)
- Although it's such a pain and so unhelpful that I pretty much never add noun classes. Makes me think we really should do away with gender altogether for translations. —Μετάknowledgediscuss/deeds 04:57, 15 October 2016 (UTC)
- I don't see how this initial intention is a reason to actively block the usage of both when such usage could be used. Korn [kʰũːɘ̃n] (talk) 10:48, 15 October 2016 (UTC)
- Translation tables aren't grammar tables. —CodeCat 21:26, 22 October 2016 (UTC)
- I agree. The translation tables should just link to the entry and the entry should have all the information. I would also do away with transliterations/transcriptions everywhere except entries and etymologies. But these are radical changes that are unlikely to be accepted. --WikiTiki89 13:48, 24 October 2016 (UTC)
Template:wikipedia
Hello. I copied the wikipedia template to Azerbaijani Wiktionary, but I have a design problem. Can somebody fix it? --Aabdullayev851 (talk) 18:32, 14 October 2016 (UTC)
- What's the problem? --WikiTiki89 18:34, 14 October 2016 (UTC)
Edit request: template:audio
There's a redundant and very obtrusive space between a list marker and the box, and this is visible everywhere. This is one way to fix it. This is the third time I have asked for this. --Dixtosa (talk) 12:11, 15 October 2016 (UTC)
- OK, I fixed it the way you requested. Please let me know if something breaks as a result. Benwing2 (talk) 21:44, 19 October 2016 (UTC)
- @Dixtosa Benwing2 (talk) 21:45, 19 October 2016 (UTC)
- @Benwing2: I am sorry that was not a perfect fix. Please apply this change too. --Giorgi Eufshi (talk) 06:36, 20 October 2016 (UTC)
Removing redundant en-noun 'head' parameters
I've been thinking about doing this automatically. Is it worthwhile/desirable?
- For English noun entries only: if (i) the headword contains at least one space, and (ii) the entry uses the en-noun template with an explicit head parameter, and (iii) the value of that head parameter is identical to the headword except that each space-separated word is enclosed in [[...]]...
- ...then remove the head parameter. For example, this would change {{en-noun|head=[[puppy]] [[mill]]}} to {{en-noun}}.
- Rationale: this parameter value is redundant: it no longer makes any difference to the formatting and hyperlinks in the displayed headword, therefore it is just another piece of clutter in the markup. However, it is very commonly present because it used to be required.
Equinox ◑ 12:33, 15 October 2016 (UTC)
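The three conditions proposed above could be checked along these lines. This is only a sketch under simplifying assumptions: it looks at a single en-noun call passed in as a string, the names `redundant_head` and `strip_redundant_head` are made up for illustration, and it deliberately refuses anything it does not recognise (piped links, nested templates) rather than guessing.

```python
import re

# Matches "|head=..." up to the next pipe or brace; piped links such as
# [[a|b]] therefore never match the expected value and are left alone.
HEAD_RE = re.compile(r"\|head=([^|{}]*)")
EN_NOUN_RE = re.compile(r"\{\{en-noun\s*(\||\}\})")


def redundant_head(title: str, template: str) -> bool:
    """True if head= only wraps each space-separated word of the title in [[...]]."""
    if " " not in title:                      # (i) headword contains a space
        return False
    if not EN_NOUN_RE.match(template):        # (ii) it is an en-noun call...
        return False
    m = HEAD_RE.search(template)              # ...with an explicit head=
    if m is None:
        return False
    expected = " ".join("[[" + w + "]]" for w in title.split(" "))
    return m.group(1).strip() == expected     # (iii) value adds nothing


def strip_redundant_head(title: str, template: str) -> str:
    """Remove head= only when redundant_head() says it is safe to do so."""
    if not redundant_head(title, template):
        return template
    return HEAD_RE.sub("", template, count=1)


print(strip_redundant_head("puppy mill", "{{en-noun|head=[[puppy]] [[mill]]}}"))
# {{en-noun}}
```

An entry like ice cream maker with head=[[ice cream]] [[maker]] is left untouched, because the expected value under rule (iii) would be [[ice]] [[cream]] [[maker]].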
- Why limit this to English nouns? —Aɴɢʀ (talk) 13:32, 15 October 2016 (UTC)
- The automatic thing cannot inflect, right? And it is preferable to directly link to lemmas, I think.--Dan Polansky (talk) 13:38, 15 October 2016 (UTC)
- Because I'm most familiar with en-noun and not sure what I might break by doing it to other parts of speech... Equinox ◑ 14:54, 18 October 2016 (UTC)
- I interpreted Angr's question as referring to nouns in other languages - have I got that wrong? DonnanZ (talk) 14:59, 18 October 2016 (UTC)
- I have removed a few redundant headers, but sometimes headers are still necessary, e.g. in a three-word entry which is actually derived from two parts. I also find headers are necessary for proper nouns such as Den dominikanske republikk. OK, that's not English, but the same can apply in English. DonnanZ (talk) 13:46, 15 October 2016 (UTC)
- A three-word entry derived from two spaced parts, like ice cream maker, won't be matched by the rules I stated above (unless already mis-entered, in which case my process will neither help nor harm it, but only remove the redundant markup). Equinox ◑ 13:50, 15 October 2016 (UTC)
- That looks like the right logic. Conversely, I think instances, with rare exceptions, of English lemma noun headwords containing a hyphen ought to have head= and wikilinked components. DCDuring TALK 14:02, 15 October 2016 (UTC)
- While it achieves not a lot, yes, I think it's always good practice to remove redundant parameters from templates, as you do see things like {{en-noun|puppy mills|head=[[puppy]] [[mill]]}} which can simply be converted to {{en-noun}}. It's mostly for human readability although making pages slightly smaller seems like a good enough reason also. If you can make a page load 1% faster with no ill effects, why not do it? Renard Migrant (talk) 15:24, 18 October 2016 (UTC)
- Okay. Should I start a vote or something? I have never run a "bot" as such although I sometimes create software that performs edits in small batches, which I babysit, rather than leaving overnight. Equinox ◑ 00:35, 19 October 2016 (UTC)
- @Equinox I created a tracking category: Category:English terms with redundant head parameter. DTLHS (talk) 00:49, 19 October 2016 (UTC)
- There are still some redundant headers in Norwegian inflection entries which I remove when I find them, the same is also true in Danish. DonnanZ (talk) 10:28, 19 October 2016 (UTC)
Could someone with access to Module:translations allow it to pass lit= to full_link? Crom daba (talk) 17:14, 15 October 2016 (UTC)
- I'll try to look at this when I have a chance. Benwing2 (talk) 21:42, 19 October 2016 (UTC)
- @Benwing, @Benwing2 here's the edit that I'm proposing, and here's what the results should look like. Crom daba (talk) 17:06, 23 October 2016 (UTC)
- @Crom daba Done. Benwing2 (talk) 20:00, 23 October 2016 (UTC)
- Come on, could you at least keep the parameters in alphabetical order? :p —CodeCat 20:01, 23 October 2016 (UTC)
Categories for number of syllables in a word
I noticed @Bcent1234 manually categorizing words by number of syllables. It seems like a worthwhile but bot-like task that could be accomplished much more easily. Is there any way this could be done automatically by {{hyphenation}}, at least for words over one syllable? — Eru·tuon 20:23, 17 October 2016 (UTC)
Whoops, I see that Module:hyphenation has a categorize_syllables array, so maybe English just has to be added to that array? — Eru·tuon 20:25, 17 October 2016 (UTC)
It seems there was already a discussion on this: Wiktionary:Beer parlour/2016/February#Proposal: "Category:English trisyllabic words" and similar categories. Since hyphenation is different from syllabification, the categorization cannot be done by the template. Would be neat if it could be done by the IPA template, but syllabification itself is such a thorny issue that perhaps that is impossible too... — Eru·tuon 20:30, 17 October 2016 (UTC)
- We've already discussed this a billion times. English hyphenation is not equivalent to the syllabification. --WikiTiki89 14:10, 18 October 2016 (UTC)
@Daniel_Carrero wrote some Lua code which looked at the information in the IPA template, but the marking of syllables was not there consistently, some words were mis-classified. There was some discussion, and even though we were manually marking the syllabification in the IPA for wrong cases, it wasn't fast enough, and someone with the ability to disable it chose to remove the code. As I need the syllabification for my class for English as a second language, I'm manually marking it. Bcent1234 (talk) 11:54, 18 October 2016 (UTC)
- Okay. It sounds like both the hyphenation template and the IPA transcriptions are unlikely to work. But what if there were a template, like {{syllables|en|6}} or {{syllabification|en|syl|la|bi|fi|ca|tion}}? Would that make things easier? — Eru·tuon 15:58, 19 October 2016 (UTC)
- How is that easier than just adding the category? I've already suggested {{cln|en|X-syllable words}}, which uses an existing template, but that is just a minor improvement. --WikiTiki89 17:12, 19 October 2016 (UTC)
- @Wikitiki89: Well, {{syllables|en|6}} would be a little shorter to type than [[Category:English 6-syllable words]] or {{cln|en|6-syllable words}}; {{syllabification|en|syl|la|bi|fi|ca|tion}} could maybe both categorize and display some sort of spelled-out syllabification, similar to the hyphenation... — Eru·tuon 17:26, 19 October 2016 (UTC)
- I would oppose having a spelled-out syllabification, otherwise we'd do it in the IPA. And when you're copying and pasting, the length doesn't really matter, there is still just one character in [[Category:English 6-syllable words]] that needs to be changed from page to page. --WikiTiki89 17:29, 19 October 2016 (UTC)
- @Wikitiki89: Perhaps {{IPA}} should have a parameter that turns on categorization, to be used with transcriptions that actually have all syllable breaks marked. — Eru·tuon 17:34, 19 October 2016 (UTC)
- But since we decided that it is detrimental to show syllable breaks in English, that wouldn't solve this problem. --WikiTiki89 17:35, 19 October 2016 (UTC)
- @Wikitiki89: Huh. Why would it be detrimental? Is there a discussion you could link to? — Eru·tuon 17:44, 19 October 2016 (UTC)
- In case you're unaware, everything in this discussion has already been discussed in a million different places over the past month or two. Check the WT:BP and User talk:Daniel Carrero, and I'm definitely forgetting some other places. --WikiTiki89 17:51, 19 October 2016 (UTC)
- Meh. I found a couple of relevant discussions. English syllabification is such a problem, and the IPA really needs a symbol for ambisyllabicity. Maybe there should be a note about the result of these discussions somewhere... — Eru·tuon 18:05, 19 October 2016 (UTC)
It occurred to me that syllables can be counted by searching for syllable nuclei (vowels or syllabic consonants). I'll work on a module that does that. — Eru·tuon 19:13, 19 October 2016 (UTC)
- That's a much better idea. That was the path I was trying to get Daniel Carrero to go down, but he seemingly didn't want to. However, there are a few things that would need to be considered, especially involving quasi-syllabic consonants. Does fire have one syllable or two? Does real have one syllable or two? We sort of agreed that these sorts of ambiguous cases should be placed in both categories. See this discussion that I finally found. --WikiTiki89 19:31, 19 October 2016 (UTC)
- I saw that discussion at some point when making the first few posts in this thread. All those words with vowels and a liquid should probably have two transcriptions – for GA real, /ɹil/ and /ɹi(.)əl/ or /ɹi(.)l̩/ – and one transcription would cause them to be categorized as having x number of syllables, the other as x + 1 (and so on, if there is more than just one ambiguous sequence in the word). — Eru·tuon 21:52, 19 October 2016 (UTC)
- Can this be done for all languages simultaneously or would you have to account for language specific rules? DTLHS (talk) 22:46, 19 October 2016 (UTC)
- There will have to be some sort of language-specific rules to determine which vowel sequences are diphthongs. English transcriptions don't currently use non-syllabic diacritics, so I'm going to have to define which vowel sequences are diphthongs and which aren't. For instance, GA lower is transcribed as /ˈloʊɚ/, with a three-vowel sequence, but two of those vowels form a diphthong, while the other forms another syllable. Finnish transcriptions, on the other hand, do seem to use non-syllabic diacritics, so I could just use the presence of that diacritic as an indication that something is a diphthong. (On the other hand, it might be good to have rules defining diphthongs just in case some Finnish transcriptions don't use the non-syllabic diacritic.) — Eru·tuon 23:27, 19 October 2016 (UTC)
- (e/c) I was going to say the same thing .... /aɪ/ is a diphthong in English but two vowels in Russian. As for real, I don't think you should count on there being two transcriptions and instead try to have the module automatically categorize into both N and N+1 syllables in ambiguous cases, unless it's unpredictable. Keep in mind that our use of IPA for English isn't super-consistent and there are potentially lots and lots of cases that would need to be fixed up the way you suggest. Benwing2 (talk) 23:31, 19 October 2016 (UTC)
- BTW as for syllable divisions, they are helpful in some cases, particularly at juncture boundaries in compound words (e.g. /ˈgeɪtˌweɪ/ is pronounced quite differently from /ˈgeɪˌtweɪ/; OTOH in most of these cases there will be a primary or secondary stress marker at the syllable boundary, making the syllable dot unnecessary) but I think the general agreement is that syllable divisions aren't terribly helpful in English in most cases because of the difficulty of deciding where to place the boundary in common words like color, writer etc. I don't think there's even a generally accepted theory of how to divide English words into syllables, how to determine which consonants are ambisyllabic, or for that matter whether ambisyllabic consonants exist at all. Benwing2 (talk) 23:38, 19 October 2016 (UTC)
- When the Toronto Globe and Mail fouled up one of my contributions in 1984 I asked them how they could ever hope that their machine would know that "realm" was always and everywhere monosyllabic. They shrugged me off, but Lo! twenty years later even a vastly improved system managed to divide it. I presented them with their own rea-lm in triumph. They conceded my point -- and duly got it fixed in something like another ten years flat.
- David Lloyd-Jones (talk) 12:42, 20 October 2016 (UTC)
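The nucleus-counting idea discussed above can be sketched roughly as follows. This is only an illustration in Python, not the code of Module:syllables: the vowel set, the per-language diphthong lists, and the function names are all assumptions made for the example. Known diphthongs are collapsed to a single nucleus first, then every remaining vowel or syllabic consonant counts as one syllable.

```python
import unicodedata

# Hypothetical, incomplete inventories chosen for illustration only.
DIPHTHONGS = {
    "en": ["aɪ", "aʊ", "eɪ", "oʊ", "ɔɪ"],
    "ru": [],  # here /aɪ/ would be two separate vowels
}
VOWELS = set("aeiouyɑɒæɐɔəɚɛɜɝɪʊʌ")
SYLLABIC_MARK = "\u0329"      # combining vertical line below, as in syllabic l
NON_SYLLABIC_MARK = "\u032f"  # combining inverted breve below


def count_syllables(ipa: str, lang: str) -> int:
    """Count syllable nuclei in an IPA transcription (rough sketch)."""
    text = unicodedata.normalize("NFD", ipa)
    # Collapse each language-specific diphthong into one placeholder nucleus.
    for diphthong in DIPHTHONGS.get(lang, []):
        text = text.replace(diphthong, "V")
    count = 0
    for i, ch in enumerate(text):
        following = text[i + 1] if i + 1 < len(text) else ""
        if ch == "V":
            count += 1
        elif ch in VOWELS and following != NON_SYLLABIC_MARK:
            count += 1
        elif ch == SYLLABIC_MARK:
            # A syllabic consonant forms its own nucleus.
            count += 1
    return count


def category_name(language_name: str, syllables: int) -> str:
    """Build a category title in the style used in this thread."""
    return f"{language_name} {syllables}-syllable words"
```

With these assumed inventories, GA /ˈloʊɚ/ counts as two syllables, and /aɪ/ counts as one nucleus for "en" but two for "ru". Ambiguous words such as real would still need either two transcriptions or extra rules to land in both the N and N+1 categories.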
Okay, so I have created a set of definitions for English diphthongs in Module:syllables, based on the list of symbols in Appendix:English pronunciation. There's so far one ambiguity: IPA(key): /iə/ is a diphthong in New Zealand and a disyllabic sequence in GA. Thus, the module currently says that IPA(key): /aɪˈdiə/, the transcription of idea in General American, has two syllables rather than three. Other than that, I can't think of any ambiguities. I s'pose this could be solved by adding a dialect parameter to the {{IPA}}
template, and then defining which diphthongs exist in each dialect. — Eru·tuon 03:15, 21 October 2016 (UTC)
- Not a bad idea, but that would essentially make the
{{a}}
template useless in the long term. Also to consider: will every call to{{IPA}}
refer to at most one dialect? —CodeCat 21:30, 22 October 2016 (UTC)- This isn't exactly an answer, but I am thinking an easier solution would be to simply replace /iə/ with /i.ə/. I did a search in the source code, and there were only 104 results. Wouldn't be too hard to go through them and make replacements when appropriate. Adding dialects to the IPA template seems like it would be too much work to solve one tiny problem. And there are cases where one transcription is given for two dialects (for instance, in how). Currently there are transcriptions that don't have dialect specified, but that should be corrected, because most words will have at the very least one vowel that is transcribed differently across the dialects in Appendix:English pronunciation. — Eru·tuon 22:06, 22 October 2016 (UTC)
- This issue is far from unique to English. In the "standard" pronunciation of Finnish, /ua/ is a two-vowel sequence, but in Savonian it's a diphthong. —CodeCat 22:18, 22 October 2016 (UTC)
- I guess either we allow
{{IPA}}
to accept a dialect parameter, or make sure that either diphthongs are marked with the non-syllabic diacritic or ambiguous non-diphthongs are marked with a syllable divider. I did a bunch of edits to add syllable divisions in /i.ə/. There are still cases of (old-fashioned?) RP /ɪə/ that's supposed to be /ɪ.ə/, though, and those may be harder to correct... though they may be findable by Module:IPA/templates, since their syllable will never be preceded by a stress mark. (It would require more complex code.) — Eru·tuon 21:51, 23 October 2016 (UTC)
- We now have a working function in Module:syllables. We should probably integrate this into Module:IPA though. Can someone unprotect the module? —CodeCat 23:54, 22 October 2016 (UTC)
- @Daniel Carrero —CodeCat 14:30, 24 October 2016 (UTC)
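The /iə/ to /i.ə/ replacement proposed in this thread could be sketched as below. This is a minimal Python illustration, not the edit procedure anyone actually used; the function name is hypothetical, and a real run would still need a human to skip transcriptions where /iə/ is a genuine diphthong (for instance New Zealand English).

```python
import re


def mark_hiatus(ipa: str) -> str:
    """Insert a syllable dot between i and a directly following schwa.

    Transcriptions that already contain the dot are left unchanged,
    because the lookahead only matches when schwa follows immediately.
    """
    return re.sub(r"i(?=ə)", "i.", ipa)
```

Applied to the transcription of idea quoted above, /aɪˈdiə/ becomes /aɪˈdi.ə/, which a nucleus counter can then treat as three syllables without needing a dialect parameter.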
Pronunciation for alternate spellings?
What is our policy on how to handle alternate spellings like oute? Do we put a pronunciation with them from the current spelling [out], do we ask someone to put a pronunciation on the word using rfap with the argument that it might have been pronounced differently when the alternate spelling was in vogue, or do we just leave it without a pronunciation, thereby impoverishing Wiktionary? Bcent1234 (talk) 14:52, 18 October 2016 (UTC)
- It doesn't impoverish Wiktionary to keep all the information at the main entry. Also, oute probably never had a final schwa sound, the only difference in pronunciation back then might have been the main vowel, which would have applied equally to the spelling out. --WikiTiki89 14:57, 18 October 2016 (UTC)
Template:wikipedia layout considerations -- luafication has broken certain HTML / CSS behavior
Apparently the {{wikipedia}}
template was luafied on 13 October. This now breaks certain kinds of layout. In the past, multiple instances of {{wikipedia}}
could all be clustered in a single <div> element for on-page layout and grouping. That is no longer possible. This causes some unfortunate layout artifacts, such as at 野兎, where the {{ja-kanjitab}}
template should (and formerly did) appear immediately to the left of the {{wikipedia}}
templates as rendered, where the three appeared in the same block, all aligned on the right of the page, roughly like:
-------- -------------
| ja-k | | wikipedia |
| anji | -------------
| tab | | wikipedia |
-------- -------------
As now observable at 野兎 and numerous other pages, this layout is no longer possible. In the past, one could explicitly wrap the {{wikipedia}}
calls in a <div> element to force this, such as:
<div style="float:right;">
{{wikipedia|lang=ja}}
{{wikipedia|ENGLISH TITLE}}
</div>
{{ja-kanjitab|...}}
However, this no longer works, apparently because the Lua invocation forces a break in the containing div, so it only wraps the first {{wikipedia}}
template and not the second.
This leads me to questions:
- Why luafy this in the first place? This template doesn't do much (not many moving parts, nothing that requires heavy logic), and it's been stable for a long time.
- Can this be reverted without causing serious disruption?
- If reversion is undesirable, can the layout at least be fixed, so that multiple instances of
{{wikipedia}}
can still be wrapped in a single <div> element?
TIA,
‑‑ Eiríkr Útlendi │Tala við mig 20:33, 18 October 2016 (UTC)
- There was one too many closing
</div>
s, I fixed that now. But I still can't get 野兎 to look the way you want. —CodeCat 20:49, 18 October 2016 (UTC)
- @Eirikr: I think in the case of the entry 野兎, the reason why the kanji box does not display to the left of the two Wikipedia boxes is actually because of the image that intervenes between the two. I removed the image and added the
<div style="float: right;"></div>
around the Wikipedia boxes, and then the kanji box moved up and sat to the left of the Wikipedia boxes rather than underneath them. I think the module generates the exact same Wikipedia boxes as the original template did; it's just your addition of an image that messed things up. — Eru·tuon 00:14, 19 October 2016 (UTC)
- Actually, I just put the div around the Wikipedia boxes and the image in 野兎, and now, at least for me, the kanji box sits in the position that you want it to. What do you think; does it look better now? — Eru·tuon 00:18, 19 October 2016 (UTC)
- That doesn't look bad to me even with rhs TOC. DCDuring TALK 00:40, 19 October 2016 (UTC)
- Fabulous! @CodeCat, I suspect it was the extra </div> -- thank you for finding and fixing that. :)
- @Erutuon, exactly what you describe -- the wrapping div trick is what I've used for some time (including the image too, sorry I forgot to mention that before). With CodeCat's fix, the div wrapper now works again. :)
- Cheers all! ‑‑ Eiríkr Útlendi │Tala við mig 01:46, 19 October 2016 (UTC)
Improving the display of Module:grc-pronunciation
Several people have complained about the Ancient Greek IPA template {{grc-IPA}}
. By default it displays three transcriptions with right arrows, but it has a "more" button that you click to view more information. This button displays on the far right side of the content area. The HTML content of this template is generated by Module:grc-pronunciation. The <div>
that contains the whole transcription thingy does not have width
set, so it defaults to width: auto;
. This apparently makes it the width of the whole content area. An example follows:
Pronunciation
POS header
Now, I came up with what I thought was a solution: setting width: max-content;
, white-space: nowrap;
, and adding some more <div>
s. But for some reason that makes the POS header immediately below the Pronunciation section vanish. See the example to the right (generated by {{grc-IPA/sandbox}}
and Module:grc-pronunciation/sandbox). There's a blank space under the pronunciation, where the POS header should be. The header is in the HTML source code, but it doesn't display.
I did a little tinkering, and it's the width: max-content;
that's the culprit. When I remove that CSS property, the POS header reappears. I have no idea why this is the case. Does anyone know what's going on, or have any alternative ways to fix the Ancient Greek IPA box so that the "more" button is not all the way over on the right? — Eru·tuon 16:26, 19 October 2016 (UTC)
Creating Redlink Categories
Can anyone explain to me how redlink categories (for example, Category:German redlinks) are created and added to Category:Redlinks by language? I think editors would find it useful. --Lo Ximiendo (talk) 03:51, 20 October 2016 (UTC)
- It is controlled by
{{redlink category}}
. And it's only useful if there's someone who wants to use it. DTLHS (talk) 04:13, 20 October 2016 (UTC)- Apparently, populating redlink categories for all languages causes a huge load on the server and some pages end up with module errors. That's why only some languages have redlink categories activated at the moment. You can add/remove languages by editing that template linked by DTLHS. --Daniel Carrero (talk) 17:50, 21 October 2016 (UTC)
- It would be much better accomplished by a dump-parsing bot or toolserver tool. Anything computationally expensive which doesn't need to be real-time probably shouldn't be real time. - TheDaveRoss 17:54, 21 October 2016 (UTC)
- I don't mind if people decide to replace that system by dump parsing. I created the current redlink categorization and I don't have a lot of experience in parsing dumps, so the current situation is the best that we have until further notice. One thing that I like about the current system is that, if someone wants to clean up redlink categories, they get smaller in real time and it's clear how many entries are left. Parsed dumps can get outdated soon if not constantly updated, like our indices. We probably only need to list redlink categories for a few languages anyway-- those where people want to fix the redlinks, so the strain is much smaller than it would be if all languages had those categories. @SemperBlotto expressed interest in fixing the Italian entries at some point. --Daniel Carrero (talk) 18:06, 21 October 2016 (UTC)
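The dump-parsing alternative suggested above might look something like this in outline. It is a simplified Python sketch under stated assumptions: the dump is represented as an in-memory mapping of page title to wikitext (a real dump would be streamed from XML), only {{l}} and {{m}} links are examined, and "exists" means only that a page with that title is present, not that it has a section for the language. The function and variable names are hypothetical.

```python
import re

# Matches the language code and term of {{l|xx|term}} and {{m|xx|term}}.
LINK_RE = re.compile(r"\{\{(?:l|m)\|([^|{}]+)\|([^|{}]+)")


def find_redlinks(pages: dict, lang: str) -> dict:
    """Return {page title: sorted list of linked terms with no page}."""
    existing = set(pages)
    redlinks = {}
    for title, wikitext in pages.items():
        missing = sorted(
            {
                term
                for code, term in LINK_RE.findall(wikitext)
                if code == lang and term not in existing
            }
        )
        if missing:
            redlinks[title] = missing
    return redlinks
```

Because this runs offline, it puts no load on page rendering, at the cost of the results going stale between dumps, which is the trade-off raised in the thread.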
Lining up ruby.
In an ideal world we need better type for ruby. E.g.:
In the entry for 国籍, we find the synonym 市民権 with the hiragana ruby しみんけん (shiminken), both correct, but the しみん appears over the 市, which is actually only the し. 民 is みん.
Almost nobody is going to be led astray by this, but it would be nice in both senses of the word if the ruby lined up correctly. This is the sort of stuff that the typography folks have been working on tirelessly back to Gutenberg in the West, and who knows who in the East, for centuries, so I won't hold my breath, simply being satisfied that somebody will get around to it in due course... All praise to Donald Knuth.
David Lloyd-Jones (talk) 12:33, 20 October 2016 (UTC)
- @David Lloyd-Jones: See diff, which tells the
{{ja-r}}
template of boundaries to follow when positioning kana. (This must be a manual job at the moment.) —suzukaze (t・c) 05:18, 23 October 2016 (UTC)
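To illustrate why explicit boundaries help, here is a small Python sketch that pairs each kanji segment with its own kana once both strings are segmented. The space delimiter and the function names are assumptions made for this example and are not a description of the actual {{ja-r}} syntax.

```python
def align_ruby(base: str, kana: str) -> list:
    """Pair space-separated base segments with space-separated readings."""
    base_parts = base.split()
    kana_parts = kana.split()
    if len(base_parts) != len(kana_parts):
        raise ValueError("base and reading must have the same number of segments")
    return list(zip(base_parts, kana_parts))


def to_ruby_html(pairs: list) -> str:
    """Render each pair as its own <ruby> element so readings line up."""
    return "".join(f"<ruby>{b}<rt>{k}</rt></ruby>" for b, k in pairs)
```

With the segments given, し sits over 市, みん over 民 and けん over 権, instead of the whole reading being spread unevenly across the word.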
What controls this list? Why is File:MalayanSunBear.jpg for example, on it? DTLHS (talk) 16:36, 21 October 2016 (UTC)
- It is just a normal page. The bear is there because Connel added it [3]. Equinox ◑ 16:42, 21 October 2016 (UTC)
- I think it had to do with a vandal who liked to add it to entries. - TheDaveRoss 17:59, 21 October 2016 (UTC)
- Is the page actually used for anything, e.g. anti-vandal bots? It hasn't been updated since mid-2009. Equinox ◑ 18:01, 21 October 2016 (UTC)
- [[File:MalayanSunBear.jpg|100px]] results in "" vs. [[File:Malaienbaer2 fcm.jpg|100px]] resulting in
- I think it is built into MW that files listed on that page are prevented from being displayed on other pages. - TheDaveRoss 18:15, 21 October 2016 (UTC)
- Right. To quote WP's description, "The images listed on MediaWiki:Bad image list are prohibited by technical means from being displayed inline on pages, besides specified exceptions. Images on the list have normally been used for widespread vandalism where user blocks and page protections are impractical." If the vandal who liked the bear picture is gone, we could remove it from the blacklist. - -sche (discuss) 13:41, 26 October 2016 (UTC)
- We could probably remove all of them, image insertion vandals are very rare these days. - TheDaveRoss 13:49, 26 October 2016 (UTC)
- I have removed the red links and the picture of the bear. DTLHS (talk) 14:21, 26 October 2016 (UTC)
Setting things to "template editor" protection level
Now that we have this protection level, we should use it. That means changing the protection settings of a whole lot of stuff, more than would be reasonable to do manually. Does anyone want to do this automatically for stuff like the language data modules? —Μετάknowledgediscuss/deeds 17:27, 21 October 2016 (UTC)
- I don't mind making the changes through the API, the trickier part is defining what should be changed in a way which is able to be fed to the bot. If you have some criteria? - TheDaveRoss 17:44, 21 October 2016 (UTC)
- Can we just apply it to all currently protected modules and templates? DTLHS (talk) 17:46, 21 October 2016 (UTC)
- Yes, that would be the easiest solution of all. The question is whether or not there are modules and templates which should not be reduced to that protection level. - TheDaveRoss 17:55, 21 October 2016 (UTC)
- I don't think there are any, actually. Protection levels could of course be changed, should there be reason to do so. —Μετάknowledgediscuss/deeds 19:29, 21 October 2016 (UTC)
- Well, if there are no objections, I can go ahead and write a script to change all of the protection levels. I'll give it a little while so that anyone who is concerned can speak up. - TheDaveRoss 12:34, 24 October 2016 (UTC)
- Should I increase the level of protection on autoconfirmed templates and modules? Or just lower the protection on the sysop ones? There are not actually that many sysop ones. - TheDaveRoss 14:55, 25 October 2016 (UTC)
- I'm going to guess that if someone gave something a lower protection level (just autoconfirmed), then they did that for a reason, so you might as well just lower the protection on sysop ones. —Μετάknowledgediscuss/deeds 17:08, 25 October 2016 (UTC)
- Done. I think that all of the Template and Module namespace pages which were protected at the sysop level are now protected at the template editor level. - TheDaveRoss 20:17, 25 October 2016 (UTC)
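The selection rule agreed on above (lower only the sysop-protected Template and Module pages, leave autoconfirmed ones alone) can be sketched as a pure function that prepares parameters for the MediaWiki API's action=protect. This is a Python illustration, not the script that was run: the input format, the function name, the reason text, and the level name "templateeditor" are assumptions, and a real request would also need a CSRF token and an authenticated session.

```python
def build_protect_requests(pages, old_level="sysop", new_level="templateeditor"):
    """pages: iterable of (full page title, current edit protection level).

    Returns one parameter dict per page that should be re-protected.
    """
    requests = []
    for title, level in pages:
        namespace = title.split(":", 1)[0] if ":" in title else ""
        if namespace in ("Template", "Module") and level == old_level:
            requests.append(
                {
                    "action": "protect",
                    "title": title,
                    "protections": f"edit={new_level}",
                    "expiry": "infinite",
                    "reason": "Lowering to template editor level",  # placeholder
                }
            )
    return requests
```

Keeping the selection separate from the network calls makes it easy to review the list of affected pages before anything is changed.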
Outdated code in MediaWiki:Gadget-legacy.js
The code related to retrieving a random page in a given language is outdated and can't be updated either, because the way we get a random page has changed. It used to query hippie's tool on toolserver; now we use Special:RandomInCategory and the lemma categories. --Dixtosa (talk) 21:42, 22 October 2016 (UTC)
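For reference, the current approach amounts to pointing at Special:RandomInCategory with a lemma category as the subpage. A minimal Python sketch of building such a link follows; the base URL, the "X lemmas" naming pattern, and the function name are assumptions for illustration, and the gadget itself is JavaScript.

```python
from urllib.parse import quote


def random_lemma_url(language_name: str,
                     base: str = "https://en.wiktionary.org/wiki/") -> str:
    """Build a link to a random entry in the language's lemma category."""
    category = f"{language_name} lemmas".replace(" ", "_")
    return base + "Special:RandomInCategory/" + quote(category)
```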
I note that {{ux}}
(and other templates) require a parameter "inline=1" to force templated content to appear where one would want it. See strazds recent history. Using the "inline" parameter leads to improper display of {{taxlink}}
. Can the misbehavior of {{ux}}
and/or of {{taxlink}}
be corrected or must one use workarounds (eg, removing the taxa to derived terms in the case of [[strazds]] or dropping use of {{ux}}
when it doesn't give desired results)? DCDuring TALK 23:27, 22 October 2016 (UTC)
- What specifically is the problem with the display? DTLHS (talk) 23:28, 22 October 2016 (UTC)
- The desired behavior in the case of [[strazds]] is that the translation of the Latvian usex appear on the same line as the Latvian text and that
{{taxlink}}
produce its usual display of its taxon. DCDuring TALK 23:36, 22 October 2016 (UTC)- The last usage example didn't have the inline parameter. I still don't see what the difference is in the taxlink display. DTLHS (talk) 23:39, 22 October 2016 (UTC)
- I may have made a mistake when converting the page: I changed all instances of "=" to "
{{=}}
"- if that parameter was inside of another template it would create an error. I'll try to fix this. DTLHS (talk) 23:41, 22 October 2016 (UTC) - (After e/c) These were the behaviors I saw with various combinations of presence or absence of inline=1 and of
{{=}}
:
#: {{ux|lv|dziedātāj'''strazds'''|song '''thrush''' ({{taxlink|Turdus philomelos|species|noshow=1|nocat=1}})|inline=1}} [edited, replacing "{{=}}" with "=" to decategorize this page]
#: {{ux|lv|dziedātāj'''strazds'''|song '''thrush''' ({{taxlink|Turdus philomelos|species|noshow=1|nocat=1}})}}
#: {{ux|lv|dziedātāj'''strazds'''|song '''thrush''' ({{taxlink|Turdus philomelos|species|noshow{{=}}1|nocat=1|noshow=1}})|inline=1}}
I neglected the one combination that produced the correct result, ie, removed {{=}}:
#: {{ux|lv|dziedātāj'''strazds'''|song '''thrush''' ({{taxlink|Turdus philomelos|species|noshow=1|nocat=1}})|inline=1}}
- If there are a lot of insertions of
{{=}}
, then it is beyond my capability to reverse them. DCDuring TALK 23:52, 22 October 2016 (UTC)- I removed
{{=}}
from the two entries that had both{{ux}}
and{{taxlink}}
, the only misbehavior that I was looking for. DCDuring TALK 00:04, 23 October 2016 (UTC){{ru-ux}}
seems to prevent named parameters within{{taxlink}}
from working, eg, noshow=1, which is supposed to remove the entry from a category. DCDuring TALK 00:08, 23 October 2016 (UTC)- @DCDuring: Can you link to an example of this? --WikiTiki89 13:54, 24 October 2016 (UTC)
- @Wikitiki89 русский, for one. Here are all six (only 6!) entries that use both
{{ru-ux}}
and{{taxlink}}
. - I had misdiagnosed the problem as categorization, where it was actually something to do with search. My searchlist when using "insource" wasn't updating promptly when there were many (c. 800) results of the search AND
{{taxlink}}
was within{{ru-ux}}
. If this portends other problems with{{ru-ux}}
, it might be worth addressing. Apparently I can easily work around it now, mostly by making the search more selective, using MediaWiki's suggested means for doing so. DCDuring TALK 17:02, 24 October 2016 (UTC)
- I have analyzed my edits and determined the only problems were on strazds and ģints. DTLHS (talk) 00:09, 23 October 2016 (UTC)
Detecting incorrect use of Template:etyl and Template:der
It seems like there are lots of cases where {{etyl|language|ancestor}} {{m|ancestor|term}}
is used instead of {{inh|language|ancestor|term}}
. It wouldn't be too hard to correct these using some regex and AWB, but there's no way to select for these cases. I imagine, since ancestor languages are defined in the data modules attached to Module:language, that there's some way to program Module:etymology to put all uses of {{etyl|language|ancestor}}
into a category, which could then be swept through using AWB. Or maybe a bot could do it, because who knows how many entries there would be.
I think there was a bot replacing {{etyl}}
{{m}}
with {{der}}
, but that was discontinued, presumably because it doesn't result in different categorization. But more fully implementing {{inh}}
would move terms from derived categories to inherited ones.
I suppose there are the problematic cases, where a word was not inherited from its ancestor language, but more of a learned borrowing. That's a problem. I mean, you don't want a bot making an etymology say that French linguiste was inherited from Latin. Man, that's kind of irritating. But it wouldn't prevent an AWB user from doing it, if people could trust them not to mess things up... — Eru·tuon 05:12, 23 October 2016 (UTC)
- We talked about this a couple months back; most people preferred having humans rather than bots decide whether to replace
{{etyl}}
+{{m}}
with{{der}}
or with{{inh}}
. —Aɴɢʀ (talk) 09:17, 23 October 2016 (UTC)- I continue to use
{{etyl}}
, as I don't want to decide between borrowed and inherited. I don't like the term "borrowed" anyway. DonnanZ (talk) 09:36, 23 October 2016 (UTC)- You can use
{{der}}
if you don't want to decide between borrowed and inherited. —Aɴɢʀ (talk) 10:56, 23 October 2016 (UTC)- OK,
{{der}}
looks more realistic, I may try that. DonnanZ (talk) 11:07, 23 October 2016 (UTC)- If people are going to start using
{{der}}
as a replacement for{{etyl}}
, then the whole argument of not doing a bot replacement because it allows editors to see what needs to be checked is no longer valid. —CodeCat 14:30, 24 October 2016 (UTC)- An editor may make a wrong decision anyway when choosing between
{{bor}}
and{{inh}}
. DonnanZ (talk) 14:50, 25 October 2016 (UTC)
- It occurred to me that no changes to the modules are needed; I can just browse categories such as Category:English terms derived from Middle English in AWB. — Eru·tuon 17:42, 24 October 2016 (UTC)
- Until the proposal is implemented that makes
{{inh}}
and{{bor}}
categorise in that too. Then you're sore out of luck. —CodeCat 17:47, 24 October 2016 (UTC)- What proposal? I thought
{{inh}}
just categorized entries in Category:English terms inherited from Middle English. That seems to be confirmed by a random entry from the category, which is in the "inherited" category and not the "derived" one. — Eru·tuon 18:34, 24 October 2016 (UTC)- WT:Beer parlour/2016/September#{{bor}} and {{inh}} should also categorize into "Foo terms derived from Bar". —CodeCat 18:57, 24 October 2016 (UTC)
- Huh, thanks for the link. Somehow I didn't see that. — Eru·tuon 19:25, 24 October 2016 (UTC)
Term "singular" in Template:definite of necessary?
I think we no longer need the term "singular" in template:definite of, since it has already been included in another template. Considering the template's name, it is also misleading to include the term "singular". I would like to apply this template to Old Javanese words, such as ṅuni, which have neither a singular nor a plural marker. Cahyo Ramadhani (talk) 11:43, 23 October 2016 (UTC)
A little background: Cahyo Ramadhani changed the template, but I reverted the edit as it affected thousands of Norwegian noun and adjective inflection entries. DonnanZ (talk) 12:03, 23 October 2016 (UTC)
- I'm sure replacement of the template for the Norwegian entries can be done using AWB. Cahyo Ramadhani (talk) 12:24, 23 October 2016 (UTC)
- I haven't got the faintest clue about AWB. DonnanZ (talk) 14:50, 23 October 2016 (UTC)
- I have decided to start making Norwegian inflection entries more "bombproof" anyway. DonnanZ (talk) 12:05, 25 October 2016 (UTC)
It does not work. I rewrote it. The code is here. Please update it. --Giorgi Eufshi (talk) 08:30, 25 October 2016 (UTC)
Is there any way to rename this ugly-as-sin category? I assume it is automatically generated by MediaWiki but we may be able to at least alter it. Renard Migrant (talk) 20:45, 26 October 2016 (UTC)
- I have no idea. From this page I gather this is a temporary tracking category while they are moved to a parser function. DTLHS (talk) 21:04, 26 October 2016 (UTC)
Entries that should have subscripts or superscripts
I would like to add a definition of "Tg" (the "glass transition temperature" of a material). Should I add it to Tg? (I note that, as an example, we have H2O instead of "H2O".) SemperBlotto (talk) 10:39, 27 October 2016 (UTC)
- Actually H2O is a redirect to H₂O with the Unicode "subscript two" character. However, AFAICT there is no "Latin subscript small letter g" character, nor is there any way to add formatting to page names, so I think Tg is your only option. —Aɴɢʀ (talk) 14:22, 27 October 2016 (UTC)
- OK I've done that. I couldn't get the {{en-noun}} template to display the headword properly. SemperBlotto (talk) 14:29, 27 October 2016 (UTC)
- @SemperBlotto: Is it English only, or is it rather translingual? —Aɴɢʀ (talk) 14:35, 27 October 2016 (UTC)
- Not sure. I've only seen it in English texts. SemperBlotto (talk) 14:39, 27 October 2016 (UTC)
- French Wikipedia uses Tv (and calls Tg "anglais") and German Wikipedia uses TG, so it's probably best to call it English. —Aɴɢʀ (talk) 14:48, 27 October 2016 (UTC)
- For H2O, I prefer having H2O as the main entry and H₂O as a redirect. --WikiTiki89 14:34, 27 October 2016 (UTC)
- That seems like an issue for a different discussion. —Aɴɢʀ (talk) 14:35, 27 October 2016 (UTC)
- What is wrong with having the most common orthographically nice form be the main entry and all the easy-to-type forms be hard redirects? We could even have all the possible Unicode forms be hard redirects as well. (Could we speed the creation of such redirects in the same way we speed the creation of English plurals?) DCDuring TALK 15:17, 27 October 2016 (UTC)
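The point above about there being no "Latin subscript small letter g" can be checked against the Unicode character database. The following is a minimal Python sketch, not anything used on the wiki; the helper name `subscript_form` is made up for illustration, and the result depends on the Unicode version bundled with the Python interpreter.

```python
import unicodedata


def subscript_form(ch):
    """Return the dedicated Unicode subscript character for ch, or None.

    Hypothetical helper: it derives the expected character name
    (e.g. "SUBSCRIPT TWO") and looks it up in the Unicode database.
    """
    name = unicodedata.name(ch, "")
    if name.startswith("DIGIT "):
        candidate = "SUBSCRIPT " + name[len("DIGIT "):]
    elif name.startswith("LATIN SMALL LETTER "):
        candidate = "LATIN SUBSCRIPT SMALL LETTER " + name[len("LATIN SMALL LETTER "):]
    else:
        # Capitals and other scripts are not handled by this sketch.
        return None
    try:
        return unicodedata.lookup(candidate)
    except KeyError:
        # No character with that name exists in this Unicode version.
        return None


two = subscript_form("2")  # the character used in the page title H₂O
g = subscript_form("g")    # None if no subscript g is defined
```

With this, a title such as H₂O can be built from plain characters, while a title with a subscript g cannot, which matches the conclusion that the plain spelling Tg is the only option for the page name.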
Auto cat
The template {{auto cat}} does not handle plural category names properly. For example, at "Category:en:Roses", it displays the text "English terms for rosess". — SMUconlaw (talk) 19:31, 29 October 2016 (UTC)
- There was a mistake in Module:category tree/topic cat/data/Plants. DTLHS (talk) 19:34, 29 October 2016 (UTC)
- {{auto cat}} doesn't actually do any of the work itself. Its job is only to decide which other template to use. That other template, {{topic cat}} in this case, does the rest. —CodeCat 19:44, 29 October 2016 (UTC)
- Thanks. — SMUconlaw (talk) 19:48, 29 October 2016 (UTC)
Module:documentation changes
I have updated it to detect specified data modules (Module:Quotations/*, Module:zh/data/dial-pron/* etc.) and categorize automatically. Also specified data module lists won't get "needing documentation" category. --Giorgi Eufshi (talk) 08:05, 31 October 2016 (UTC)
- Nice, that was really needed. The category just went down from 39,000 to 3,180. —Enosh (talk) 12:05, 31 October 2016 (UTC)
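For readers unfamiliar with the idea, the detection described above amounts to matching page titles against a list of wildcard patterns. The sketch below is in Python rather than Lua and is not the code of Module:documentation; the function names are hypothetical, and the pattern list contains only the two examples quoted in the post (the real list is presumably longer).

```python
from fnmatch import fnmatchcase

# Title patterns quoted in the post above; assumed to be incomplete.
DATA_MODULE_PATTERNS = [
    "Module:Quotations/*",
    "Module:zh/data/dial-pron/*",
]


def is_data_module(title):
    """True if the title matches one of the listed data-module patterns."""
    return any(fnmatchcase(title, pattern) for pattern in DATA_MODULE_PATTERNS)


def needs_documentation_category(title, has_doc_page):
    """Hypothetical rule: only undocumented non-data modules are flagged."""
    return not has_doc_page and not is_data_module(title)
```

Under this rule an undocumented page such as "Module:Quotations/en/Example" (an invented title) would be skipped, which is consistent with the category shrinking once data modules were excluded.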
1400 bytes less yet no difference
diff. Has anyone faced this abnormality and figured it out?--Dixtosa (talk) 20:53, 31 October 2016 (UTC)
- I don't know. How did you intend to edit the page? DTLHS (talk) 21:08, 31 October 2016 (UTC)
- I opened and saved it without changing a bit trying to null-edit. --Giorgi Eufshi (talk) 05:41, 1 November 2016 (UTC)
- Maybe there is some kind of metadata that got updated, since the page hadn't been edited in 2 years. DTLHS (talk) 21:15, 31 October 2016 (UTC)
- The software ignores you if you try to save a null edit. What did you actually do? (P.S. I recently saw a clever zip file that contains a compressed version of the identical zip file. Turtles all the way down!) Equinox ◑ 21:17, 31 October 2016 (UTC)
- But then it wouldn't generate a diff, would it? This is a case of "this shouldn't happen, but it did". —CodeCat 21:22, 31 October 2016 (UTC)
- It must have something to do with the recent edits to the documentation module. DTLHS (talk) 21:37, 31 October 2016 (UTC)
- Oddly, Module:zh/data/hak-pron/* modules along with Module:Quotations/*'s were the only ones that stayed longer than they should have stayed (12h+) in the category, but mod:Quotations did not have this thing. --Giorgi Eufshi (talk) 05:41, 1 November 2016 (UTC)
- I think I know what happened. The bytes coincide with the number of lines. So I think it is a \r\n and \n thing. --Giorgi Eufshi (talk) 05:47, 1 November 2016 (UTC)
- Wow, how did you figure that out? You'd think the software would normalize line endings the way it normalizes combining characters. Anyway, I guess this is evidence that Wyang uses Windows... --WikiTiki89 19:05, 1 November 2016 (UTC)
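The line-ending explanation is easy to verify with a little arithmetic: a CRLF line break is two bytes and an LF line break is one, so converting a page saves exactly one byte per line break while the visible text stays the same. Below is a small Python sketch using a synthetic 1400-line page; the content is invented, and only the line count is chosen to match the byte difference in the heading.

```python
# Synthetic page with 1400 lines, each terminated by a Windows line ending.
lines = [f"line {i}" for i in range(1400)]
crlf_text = "\r\n".join(lines) + "\r\n"

# The same page after the line endings are normalized to LF.
lf_text = crlf_text.replace("\r\n", "\n")

# Size difference in bytes between the two encodings of the same text.
saved = len(crlf_text.encode("utf-8")) - len(lf_text.encode("utf-8"))

# One byte is saved for every CRLF that became LF.
line_breaks = crlf_text.count("\r\n")
```

Here `saved` equals `line_breaks`, which is the coincidence noted above: the diff shows no visible change, yet the page shrinks by one byte per line.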