Template talk:ja-see
"For a list of all kanji with on'yomi えい, not just those used in Sino-Japanese terms, see [...]"
@Dine2016 What about linking to Category:Japanese kanji with on reading えい? —Suzukaze-c◇◇ 04:55, 5 February 2019 (UTC)
@Dine2016, Poketalker This page doesn't have any categories added by the template. DTLHS (talk) 14:31, 22 April 2019 (UTC)
- @DTLHS: solved; just forgot to put an "h" to {{ja-see}}. ~ POKéTalker(═◉═) 21:19, 22 April 2019 (UTC)
- "h" is no longer needed :) --Dine2016 (talk) 06:10, 26 May 2019 (UTC)
historical kana spellings
@Suzukaze-c Do you think showing the historical kana spelling in {{ja-see}} (e.g.
For pronunciation and definitions of 白川夜舟 – see the following entry at 白河夜船. |
|
(This term, 白川夜舟, is an alternative kanji spelling of 白河夜船.) |
) is a good idea? I'm afraid casual readers may mistake it for the katakana spelling or the (modern) pronunciation of the term, but I don't know any better way to place it. --Dine2016 (talk) 06:10, 26 May 2019 (UTC)
- Hm, perhaps it is inappropriate if {{ja-see}} is supposed to be simple. —Suzukaze-c◇◇ 06:15, 26 May 2019 (UTC)
- @Suzukaze-c: Thanks. Actually the template would look much clearer if the whole header (しらかわよふね〔シラカハヨフネ〕【白河夜船・白川夜船・白川夜舟】) were removed:
For pronunciation and definitions of 白川夜舟 – see the following entry at 白河夜船. |
|
(This term, 白川夜舟, is an alternative kanji spelling of 白河夜船.) |
- The header was added solely to distinguish between words in rare cases like this:
For pronunciation and definitions of 白川夜舟 – see the following entry at 白河夜船. |
|
(This term, 白川夜舟, is an alternative kanji spelling of 白河夜船.) |
- What about removing the header when there is only one matching word and displaying the headers when there is more than one? --Dine2016 (talk) 06:42, 26 May 2019 (UTC)
- I don't have a particular opinion. —Suzukaze-c◇◇ 06:50, 26 May 2019 (UTC)
@Dine2016, see かんじざいぼさつ for 観自在菩薩. Category:Japanese terms with usage examples should not appear on the kana spelling of the kanji entry. Is this intentional? ~ POKéTalker(═◉═) 22:58, 17 August 2019 (UTC)
- @Poketalker: Um..., when I wrote {{ja-see}}, I was influenced by western linguistics, which regarded the spoken language as the language and the written language as a mere encoding of it. Given this view, かんじざいぼさつ and 観自在菩薩 denoted the same term, and since that term had usage examples, it followed that かんじざいぼさつ and 観自在菩薩 would both be in Category:Japanese terms with usage examples. What distinguished the two was that 観自在菩薩 belonged to Category:Japanese spellings with usage examples while かんじざいぼさつ did not.
- However, it seems that Wiktionary doesn't distinguish spellings from terms, so I wouldn't object to removing that category. (It's easy, just remove
elseif a == 'ja-usex' or a:find('^quote') then -- special hack
    return '[[Category:Japanese terms with usage examples]]{{=' .. b
- and add
['ja-usex'] = true,
- to templates_to_exclude in Module:ja-parse.) --Dine2016 (talk) 11:42, 18 August 2019 (UTC)
Should we consider using Template:ja-see for romaji entries as well?
This {{ja-see}} template provides much more useful information to the user than the older {{ja-romanization of}}. Should we consider using {{ja-see}} in romaji entries, instead of {{ja-romanization of}}?
If readers of this page support this idea, then {{ja-see}} would need a bit of reworking. Here's an example of what {{ja-see}} looks like on romaji entries. The template now states that the romaji form is an "alternative" spelling.
Also, if we decide to proceed with this idea, considering the kerfuffle from last time when we tried a few iterations of {{ja-romanization of}}, we should probably broach the topic at WT:BP or WT:GP. ‑‑ Eiríkr Útlendi │Tala við mig 22:34, 28 August 2019 (UTC)
- I don't think so. The very first reason I created {{ja-see}} was that the older format for kana soft-redirects,
==Japanese==
===Noun===
{{ja-noun}}
# {{ja-def|重箱読み}} A reading pattern for certain kanji compound words, using the Chinese-derived ''[[on'yomi]]'' for the first kanji, and the native Japanese ''[[kun'yomi]]'' for the second kanji.
[[Category:ja:Linguistics]]
- duplicated content (POS, definition, category) from the lemma entry. The current format for rōmaji,
==Japanese==
===Romanization===
{{ja-romaji}}
# {{ja-romanization of|じゅうばこよみ}}
- does not duplicate any content, so there is no reason to replace it with {{ja-see}} when the current format does a good job. --Dine2016 (talk) 01:35, 29 August 2019 (UTC)
- Granted, reducing data duplication is a good motive, and {{ja-see}} does an excellent job of that.
- I realize I wasn't very clear on my main motivation for bringing this up for romaji entries: usability. The current approach with {{ja-romanization of}} offers poor usability, in that it presents the user with nearly no information, and it requires the user to click through two different entries (the links on the romaji entry, and then the links on the kana entry) before arriving at the desired main entry. I think it would be much more useful and user-friendly to do something at least similar to {{ja-see}}, by providing users with entry information already on the romaji page, without having to click through -- and if they want to see a full entry, have the romaji page provide direct links, rather than the indirect link to the kana entry, where the user would have to click through again.
- Perhaps {{ja-see}} itself isn't the correct template for the job for romaji entries. Would you be supportive of something similar? ‑‑ Eiríkr Útlendi │Tala við mig 04:09, 29 August 2019 (UTC)
- Rōmaji could use a similar idea to that for kyūjitai entries (cf. Talk:天道蟲), namely to simply point to the kana form (in source code) and have the template find the lemma (by fixing double redirects). But it needs to filter the result once again. For example, after tō fetches content from とう and fixes double redirects, it should discard words like 問う as well as POS like the "proper noun" part of 東, which are romanized differently. (Well, kyūjitai also needs to filter the result once again, if the words involve ambiguous kanji like 弁.)
- Alternatively, rōmaji could link to the lemma entries (in source code) directly. The advantage of this approach is that acceleration is faster (after creating 重箱読み, simply make じゅうばこよみ and jūbakoyomi point to it, instead of creating a two-level hierarchy). The disadvantage is that homophone lists like 刀, 灯, 当, ... must be repeated on both とう and tō. --Dine2016 (talk) 06:38, 29 August 2019 (UTC)
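The extra filtering step described above (fetch everything reachable from the kana page, then keep only what is romanized the same way as the rōmaji page) could be sketched roughly as follows. This is only a toy illustration in Python, not how any existing module works: the `Candidate` record, the `filter_for_romaji_page` helper and the romanizations in the sample data are all assumptions made up for the example.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """One word reached from the kana page (hypothetical record layout)."""
    lemma: str   # lemma entry the kana page points to
    pos: str     # part of speech of this word
    romaji: str  # romanization of this word (illustrative values only)


def filter_for_romaji_page(page_title: str, candidates: List[Candidate]) -> List[Candidate]:
    """Keep only candidates whose romanization equals the romaji page title."""
    return [c for c in candidates if c.romaji == page_title]


# Assumed stand-in for what would be fetched from とう after fixing double redirects.
FETCHED_FROM_TOU = [
    Candidate("刀", "noun", "tō"),
    Candidate("灯", "noun", "tō"),
    Candidate("問う", "verb", "tou"),        # romanized differently, so discarded
    Candidate("東", "proper noun", "Tō"),    # capitalized, so discarded
]

kept = filter_for_romaji_page("tō", FETCHED_FROM_TOU)
print([c.lemma for c in kept])
```

The comparison is deliberately exact (case-sensitive, no normalization), which matches the idea that "Tō" and "tou" belong on other rōmaji pages; a real implementation would have to decide how to treat hyphens and spaces.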
withdrawn |
---|
@Eirikr I'm sorry, but I still don't get why anyone would want to look up modern Japanese terms by rōmaji. First, there are two transcriptions, of which Kunrei-shiki is official and Hepburn is de facto with numerous variants (and we have created our own variant). Should we build both or stick to our own variant of Hepburn? Second, some words have pronunciation variants. For example, 食う could be either クウ or クー, and 用いる could be either モチイル or モチール. This leads to different transcriptions even in the same transcription scheme. Moreover, capitalization, spaces and hyphens all add further complexities. For example, when the user searches "san", should the results of "San" or "-san" be displayed as well? By the way, sometimes rōmaji usage conforms to neither Kunrei-shiki nor Hepburn. For example, anime websites often use a variant of Hepburn with the long sounds expanded according to the rules of 現代仮名遣い or Waapuro. So 東方天空璋 is "Touhou Tenkuushou", neither proper Hepburn "Tōhō Tenkūshō" (nor its usual English rendering "Toho Tenkusho") nor proper Waapuro "Touhou Tennkuusyou/-shou". Here are more examples. I think the ultimate solution is to change the underlying software to use a more sophisticated search interface which supports various transcription schemes, instead of building the search result pages (i.e. rōmaji entries) ourselves. --Dine2016 (talk) 07:52, 4 September 2019 (UTC) |
- @Dine2016: The inclusion of romaji forms was based on discussions many years ago relating to usability and discoverability. As this is the English Wiktionary, we can safely assume that our readership can read English, which is written in the Latin alphabet (romaji). We cannot assume that our readership can read kana or kanji. So if an EN WT user has encountered a Japanese word, possibly in transcription, and they come here to look it up but without being able to input Japanese, the argument went that we still needed some way for them to find the entry. Since Hepburn is the most common Japanese transcription system used in the English-reading world, this was what we adopted here at Wiktionary (with some tweaks). That doesn't mean that we cannot include other romaji renderings -- just that we only include modified Hepburn in our "official" links, such as in translation tables, or in entry headword lines, or as the romanization we target with {{ja-r}} and similar templates.
- We specifically don't target 訓令式, as that is the romanization scheme adopted in Japan for Japanese readers, and it has various oddities that make it inappropriate for English readers (like zi not being pronounced /zi/, or syu not being pronounced /sju/), and oddities that actually render it deficient for describing Japanese (the inability to transcribe certain sounds, like ファ or ティ).
- We also specifically don't target ワープロ式, as this isn't a standard so much as a de facto practice with many variations, based on what various input method editors will accept for conversion. For instance, long vowels might be the same vowel twice, or the vowel plus a hyphen. Various consonantal sounds have multiple representations, with ちゃ renderable as tya, tixya, cha, cya, and possibly more. ん might be nn, as you note, but even in ワープロ式 it could be a single n or even an m so long as it's followed by a consonant.
- Again, we have no stricture against the creation of romaji entries based on such alternative spelling conventions, and indeed, if users create such entries, I believe we should keep them, so long as they are properly formatted and redirect the user to the appropriate Japanese entries. However, we do not target these for display in our lemma entries, and in Wiktionary:Japanese transliteration (linked to from WT:AJA#Romaji_entries) we explicitly explain that we use a modified version of Hepburn.
- Returning to your main question of "why anyone would want to look up modern Japanese terms by rōmaji", it comes down to the basic position that we have no alternatives for how to help English readers find Japanese terms, when they don't know kana or kanji and might not even have a Japanese IME installed. This is the same reason we have romanization entries for Gothic, and why the topic of romanized entries for other scripts keeps coming around from time to time in the Beer Parlor and other discussion pages. If you have some technical approach that would allow a user to input romaji in either the search bar or the URL and still land on the lemma page (or at least the kana soft-redirect page) corresponding to that romaji string, and somehow that romaji string cannot also be interpreted as a word in another language, then I think we can safely get rid of all of our Japanese romaji entries. I agree that romaji entries are a kludge, and an inelegant one at that, but we (the EN WT community dealing with JA entries) have not been able to come up with a better approach. ‑‑ Eiríkr Útlendi │Tala við mig 17:44, 17 September 2019 (UTC)
- @Eirikr: What about building rōmaji indexes in the Appendix or Index namespace? We can support multiple transcriptions (standard Hepburn, Waapuro Hepburn and Kunrei-shiki) this way.
- As for mainspace, I have no objections to rōmaji entries as long as they're voluntary and within reasonable bounds. I vehemently opposed them in the posts above because I mistakenly believed they would be given equal weight to kana entries like a writing system. For example, the current い entry is already taking up 20 MB of memory. If we transclude or build the same list at i it will probably cause a memory error and break the rest of the page. But that's clearly not your intention, and I apologize for that. I should have said "The time for using {{ja-see}} for rōmaji entries is not mature" rather than attacking users looking up by rōmaji directly (though I still suspect anime fans may try to find 竜 at "ryuu" and be disappointed).
- What about this compromise: employ something like {{ja-see}} in ordinary entries like tentō mushi, but switch back to the older format for entries like "i" once memory limits are exceeded? --Dine2016 (talk) 19:40, 17 September 2019 (UTC)
- @Dine2016:
- Re: multiple romanization schemes in the Appendix: or Index: namespaces, I think that's a wonderful idea. Theoretically, we could have a separate appendix or index set up for each romanization scheme, with the boilerplate at the top of each such page explaining what the scheme is and (in brief) how it encodes the Japanese kana and/or sound values. Presumably, so long as each scheme is a regular encoding, this could be programmatically generated? And we wouldn't need to have existing romaji pages for each word's spelling, like we do for the categories? I have no idea how to go about implementing something like this, however.
- Re: using something like {{ja-see}} for romaji entries that don't have memory issues, I would also welcome that.
- Good ideas, thank you! ‑‑ Eiríkr Útlendi │Tala við mig 21:10, 17 September 2019 (UTC)
inflected forms
Hi everyone. What do you think is the best way to show inflection of alternative spellings?
My initial plan was to make them automatically generated by {{ja-see}}, on the respective entries of alternative spellings. For example, let's suppose we're soft-redirecting 言いだす to 言い出す. In addition to fetching the definitions and categories from 言い出す, {{ja-see}} could also fetch the inflectional type (godan verb ending in -su) and inflect the alternative spelling accordingly:
For pronunciation and definitions of 言いだす – see the following entry at 言い出す. (This term, 言いだす, is an alternative kanji spelling of 言い出す.)
Conjugation of "言いだす" (See Appendix:Japanese verbs.)
Form | Kanji spelling | Kana | Rōmaji |
---|---|---|---|
Stem forms | |||
Imperfective (未然形) | 言いださ | いいださ | iidasa |
Continuative (連用形) | 言いだし | いいだし | iidashi |
Terminal (終止形) | 言いだす | いいだす | iidasu |
Attributive (連体形) | 言いだす | いいだす | iidasu |
Hypothetical (仮定形) | 言いだせ | いいだせ | iidase |
Imperative (命令形) | 言いだせ | いいだせ | iidase |
Key constructions | |||
Passive | 言いだされる | いいだされる | iidasareru |
Causative | 言いださせる 言いださす | いいださせる いいださす | iidasaseru iidasasu |
Potential | 言いだせる | いいだせる | iidaseru |
Volitional | 言いだそう | いいだそう | iidasō |
Negative | 言いださない | いいださない | iidasanai |
Negative continuative | 言いださず | いいださず | iidasazu |
Formal | 言いだします | いいだします | iidashimasu |
Perfective | 言いだした | いいだした | iidashita |
Conjunctive | 言いだして | いいだして | iidashite |
Hypothetical conditional | 言いだせば | いいだせば | iidaseba |
The advantage of this approach is that no extra work is needed, as far as modern spellings are concerned. However, once we start creating kanji spellings involving historical kana such as 言ひ出す or 言ひだす, then we need some way to tell the template to use the volitional ending -さう instead of -そう, to keep kana orthography consistent. In other words, the template needs to know whether the current alternative spelling is in modern or historical kana, so that it can supply appropriate inflectional patterns. (Spellings like 言出す add additional complexity: if we give modern inflections to 言いだす and historical inflections to 言ひ出す, then it makes sense to give both to 言出す. So there are really three possibilities, not two.)
It might be tempting to add a new parameter to {{ja-see}} to indicate the kana orthography of the current alternative spelling, but that defeats the purpose of automatically generating the inflection table and is essentially no better than adding {{ja-go-su-hist}} manually. An alternative solution is to mark the kana orthography in the lemma entry, for example via {{ja-kanjitab|...|alt=言いだす,言出す|halt=言ひ出す,言ひだす,言出す}} or {{ja-kanjitab|...|alt=言いだす,言出す-mh,言ひ出す-h,言ひだす-h}}. I prefer this approach, but I'm not sure which format is better. What do you think?
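To make the second marking format concrete, here is a rough Python sketch of how the `-h` / `-mh` suffixes could drive the choice between the modern volitional ending -そう and the historical -さう for a godan verb in -す. The reading of the suffixes (no suffix = modern only, `-h` = historical only, `-mh` = both) is taken from the proposal above; the function names and lookup tables are hypothetical, and a real template would of course be written in Lua and cover every form, not just the volitional.

```python
# Assumed meaning of the suffixes in alt=...: "" modern, "h" historical, "mh" both.
ORTHOGRAPHY_FLAGS = {
    "": ("modern",),
    "h": ("historical",),
    "mh": ("modern", "historical"),
}

# Volitional ending of a godan verb in -す, by kana orthography.
VOLITIONAL_SU = {"modern": "そう", "historical": "さう"}


def parse_alt(alt):
    """Map each spelling in an alt= value to the orthographies it is marked for."""
    marked = {}
    for item in alt.split(","):
        spelling, _, flag = item.partition("-")
        marked[spelling] = ORTHOGRAPHY_FLAGS[flag]
    return marked


def volitional_forms(spelling, marked):
    """Return the volitional form(s) of a -す verb, one per marked orthography."""
    if not spelling.endswith("す"):
        raise ValueError("this sketch only handles godan verbs ending in す")
    stem = spelling[:-1]
    return [stem + VOLITIONAL_SU[orthography] for orthography in marked[spelling]]


marked = parse_alt("言いだす,言出す-mh,言ひ出す-h,言ひだす-h")
for spelling in marked:
    print(spelling, volitional_forms(spelling, marked))
```

With this marking, 言出す gets both 言出そう and 言出さう, which is the "three possibilities, not two" case mentioned earlier.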
Alternatively, we could expand the lemma entry directly:
Form | Modern kana orthography | Historical kana orthography | Rōmaji |
---|---|---|---|
Stem forms | |||
Imperfective (未然形) | いいださ【言い出さ・言いださ・言出さ】 | いひださ【言ひ出さ・言ひださ・言出さ】 | iidasa |
Continuative (連用形) | いいだし【言い出し・言いだし・言出し】 | いひだし【言ひ出し・言ひだし・言出し】 | iidashi |
Terminal (終止形) | いいだす【言い出す・言いだす・言出す】 | いひだす【言ひ出す・言ひだす・言出す】 | iidasu |
Attributive (連体形) | いいだす【言い出す・言いだす・言出す】 | いひだす【言ひ出す・言ひだす・言出す】 | iidasu |
Hypothetical (仮定形) | いいだせ【言い出せ・言いだせ・言出せ】 | いひだせ【言ひ出せ・言ひだせ・言出せ】 | iidase |
Imperative (命令形) | いいだせ【言い出せ・言いだせ・言出せ】 | いひだせ【言ひ出せ・言ひだせ・言出せ】 | iidase |
Key constructions | |||
Passive | いいだされる【言い出される・言いだされる・言出される】 | いひだされる【言ひ出される・言ひだされる・言出される】 | iidasareru |
Causative | いいださせる【言い出させる・言いださせる・言出させる】 いいださす【言い出さす・言いださす・言出さす】 | いひださせる【言ひ出させる・言ひださせる・言出させる】 いひださす【言ひ出さす・言ひださす・言出さす】 | iidasaseru iidasasu |
Potential | いいだせる【言い出せる・言いだせる・言出せる】 | いひだせる【言ひ出せる・言ひだせる・言出せる】 | iidaseru |
Volitional | いいだそう【言い出そう・言いだそう・言出そう】 | いひださう【言ひ出さう・言ひださう・言出さう】 | iidasō |
Negative | いいださない【言い出さない・言いださない・言出さない】 | いひださない【言ひ出さない・言ひださない・言出さない】 | iidasanai |
Negative continuative | いいださず【言い出さず・言いださず・言出さず】 | いひださず【言ひ出さず・言ひださず・言出さず】 | iidasazu |
Formal | いいだします【言い出します・言いだします・言出します】 | いひだします【言ひ出します・言ひだします・言出します】 | iidashimasu |
Perfective | いいだした【言い出した・言いだした・言出した】 | いひだした【言ひ出した・言ひだした・言出した】 | iidashita |
Conjunctive | いいだして【言い出して・言いだして・言出して】 | いひだして【言ひ出して・言ひだして・言出して】 | iidashite |
Hypothetical conditional | いいだせば【言い出せば・言いだせば・言出せば】 | いひだせば【言ひ出せば・言ひだせば・言出せば】 | iidaseba |
The advantage of this approach is that it is more logical, and users searching for alternative spellings of inflected forms are likely to land on the lemma entry directly, with MediaWiki's searching facility. The disadvantage is that the inflection templates must be reworked, and such a format would increase data duplication.
Which approach do you prefer?
(Notifying Eirikr, TAKASUGI Shinji, Nibiko, Atitarev, Suzukaze-c, Poketalker, Cnilep, Britannic124, Marlin Setia1, AstroVulpes, Tsukuyone, Aogaeru4, Huhu9001, 荒巻モロゾフ, Mellohi!): --Dine2016 (talk) 11:08, 16 September 2019 (UTC)
- @Dine2016: Both approaches are very interesting, thank you for the efforts. I have no objections. Let's see what other people are going to say. --Anatoli T. (обсудить/вклад) 11:46, 16 September 2019 (UTC)
- After briefly looking over the proposal, my only concern at this point is the proposal for expanding the lemma entry's table, particularly in edge cases where a given term might have multiple historical kana spellings. The sample above seems to be able to show two forms side-by-side well enough, but I'm not sure it would scale very well to three, as with 用いる (mochiiru, “to utilize, to use”) -- modern kana もちいる, historical etymological kana もちゐる, historical technically-misspelling kana もちひる. There are probably other examples out there as well of terms with multiple historical spellings. ‑‑ Eiríkr Útlendi │Tala við mig 19:35, 16 September 2019 (UTC)
- I'm not sure which one I like yet, but the second one is rather busy. Perhaps newlines would be an improvement. —Suzukaze-c◇◇ 04:27, 17 September 2019 (UTC)
- @Dine2016: It seems that only users with certain permissions (sysop, autopatrolled, etc.) can edit pages that use this template (when the whole page contains only a Japanese entry, I guess). I got this error when trying to edit 除ける:
Errors: * This action has been automatically identified as harmful, and therefore disallowed. If you believe your action was constructive, please inform an administrator of what you were trying to do. A brief description of the abuse rule which your action matched is: strips L3
Marlin Setia1 (talk) 23:43, 14 October 2019 (UTC)
- @Marlin Setia1: That's because the template breaches the standard entry layout, which requires alternative spellings to be formatted as
==Japanese==
{{ja-kanjitab|よ|yomi=k}}
===Verb===
{{ja-verb|さける|type=2}}
# {{alternative spelling of|ja|除ける}}
- The new format with {{ja-see}} has not been formally recognized.
- This issue needs to be brought to Wiktionary:Grease pit. In the meantime, you can try the following instead:
==Japanese==
{{ja-kanjitab|よ|yomi=k}}
===Definitions===
{{ja-see|避ける}}
==Japanese==
===Etymology 2===
{{ja-kanjitab|よ|yomi=k}}
{{ja-see|避ける}}
- (It's a bit wonky, as there's ===Etymology 2=== even though there's only one etym on that page right now, and it's missing the のける reading, but anyway. :) )
- The absolute smallest form for these soft-redirects, for spellings with only one reading, would be something like the following (assuming that 除ける were only read as yokeru):
==Japanese==
{{ja-kanjitab|よ|yomi=k}}
{{ja-see|避ける}}
- I'm uncertain how @Marlin Setia1 was running into the abuse filter? ‑‑ Eiríkr Útlendi │Tala við mig 19:03, 15 October 2019 (UTC)
Sorting on pages using this template
@Dine2016, Suzukaze-c, anyone else interested --
I'm curious if this template / module could be updated to apply sorting. By way of example, 抱っこ uses {{ja-see|だっこ}}, but the 抱っこ entry is currently sorted in Category:Japanese_childish_terms under 抱っこ rather than the expected だっこ. ‑‑ Eiríkr Útlendi │Tala við mig 18:05, 7 November 2019 (UTC)
- I gave it a try, but I still think sortkeys should eventually be eliminated. Sorting だっこ under た and 抱っこ under 抱 (or 扌/手) allows users to look up the same term by either spelling. This may not be obvious for small categories like Category:Japanese childish terms but will make a difference for larger categories with hundreds or thousands of words. But the most important argument against custom sortkeys is that many editors forget it. For example, the editors of the current entry for だっこ forgot to add a sortkey, so that it is sorted under だ instead of the correct た. --Dine2016 (talk) 03:34, 8 November 2019 (UTC)
- I've asked a few times in a few different fora over the years, both here and on other WM sites, about how to fix sorting for Japanese, and no one seems to know jack shit about how to improve things at the base level, frankly. (That may be my frustration showing. 苦笑) There was a related thread not long ago asking similar questions about categories for Hungarian, which at least uses the Latin alphabet. The approach there was to use Lua to customize how sorting happens. Given the MediaWiki team's complete apathy with regard to some of our basic-functionality needs, perhaps a similar approach, leveraging Module:languages/data2 or some other code, could be applied to Japanese? (Asking in ignorance of the possible complexity, as I have not understood the current module infrastructure -- IMHO, our module documentation and code comments are pretty horribly lacking...) ‑‑ Eiríkr Útlendi │Tala við mig 18:58, 8 November 2019 (UTC)
- @Eirikr: If we use modules then it must be for converting だ to た', 抱 to ⼿05, etc. There is no easy way to convert kanji to kana.
- MediaWiki categories are unserviceable in the first place. The best we can do is to sort kana under kana and kanji under kanji, so that the user can look for terms beginning with むらさき by https://en.wiktionary.org/w/index.php?title=Category:Japanese_lemmas&from=むらさき, and look for terms beginning with 紫 by https://en.wiktionary.org/w/index.php?title=Category:Japanese_lemmas&from=紫, but there's no way to look for terms ending with something or do composition (kanji beginning with X and kana beginning with Y, both a verb and obsolete, etc.). And most importantly, there is no way to customize how entries appear in categories, for example to make むらさきいろ appear as "むらさきいろ【紫色】" and 紫色 appear as "紫色(むらさきいろ)". Suzukaze-c thinks that the best way to improve Wiktionary's usability is via some third-party searching function, and this requires Wiktionary data to be machine parseable. Looking at the current entry layout, I don't think it is. And more disappointing is the fact that the community's efforts are wasted in pandering to MediaWiki's deficient facilities (sorting, references, etc.) even when it would add more complexity to page sources and make them unserviceable to interfaces other than MediaWiki. --Dine2016 (talk) 01:26, 9 November 2019 (UTC)
- Apologies for my lack of clarity; regarding sorting features, what I was envisioning was actually what you propose: for kana, using Lua to deal with the current hackish workarounds of adding ' on the end for initial kana with 濁点 and '' for initial kana with 半濁点, and for kanji, using Lua to sort by radical + additional stroke count, rather than just sorting by the raw character itself. Ideally, editors wouldn't have to bother with sortkeys for Japanese at all. ‑‑ Eiríkr Útlendi │Tala við mig 07:14, 10 November 2019 (UTC)
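The kana half of that idea is small enough to sketch. The Python below relies on Unicode canonical decomposition (NFD), which splits a voiced kana such as だ into its base た plus a combining 濁点 (U+3099), and ぱ into は plus a combining 半濁点 (U+309A). The function name `kana_sortkey` is hypothetical, only the initial character is handled (as in the comment above), and the kanji half (radical plus stroke count) is left out because it would need a lookup table; kanji-initial titles are returned unchanged.

```python
import unicodedata

DAKUTEN = "\u3099"     # combining voiced sound mark (濁点)
HANDAKUTEN = "\u309A"  # combining semi-voiced sound mark (半濁点)


def kana_sortkey(term):
    """File an initial voiced kana under its unvoiced base, marking it with ' or ''."""
    if not term:
        return term
    # NFD splits e.g. だ into た + U+3099; unvoiced kana and kanji stay as they are.
    decomposed = unicodedata.normalize("NFD", term[0])
    base, marks = decomposed[0], decomposed[1:]
    if DAKUTEN in marks:
        suffix = "'"
    elif HANDAKUTEN in marks:
        suffix = "''"
    else:
        return term
    return base + term[1:] + suffix


for title in ["だっこ", "ぱん", "かな", "抱っこ"]:
    print(title, "->", kana_sortkey(title))
```

Generating the key from the page title or from the kana reading passed to the template would mean editors never have to type a sortkey by hand, which addresses the "many editors forget it" problem raised above.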
New implementation
The new implementation parses the lemma entry in one pass, instead of dividing it by Etymology headers and accepting/rejecting each in an all-or-nothing manner. This means that the following snippet is parsed correctly:
===Noun===
{{ja-noun|おたまじゃくし}}
# musical note
// current list of alt spellings: おたまじゃくし
===Noun===
{{ja-noun|おたまじゃくし|オタマジャクシ}}
# tadpole
// current list of alt spellings: おたまじゃくし, オタマジャクシ
But it also means that fewer categories are copied. In fact, only categories from headword lines and definitions are copied, since the new implementation does not make any assumptions of the format of the rest of the entry.
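For readers who want a feel for the one-pass idea, here is a heavily simplified Python sketch. It is not the actual module code: it walks the lines once, updates a "current list of alt spellings" whenever it meets a headword template, and keeps a definition line only if the spelling being looked up is in the list that is current at that point. The regular expression, the handful of template names it recognizes and the `definitions_for` helper are assumptions for illustration; kanjitab `alt=` values and categories are ignored entirely.

```python
import re

# Assumed headword templates; captures everything between the first "|" and "}}".
HEADWORD = re.compile(r"\{\{ja-(?:noun|verb|adj)\|([^{}]*)\}\}")


def definitions_for(spelling, wikitext):
    """Collect definition lines whose current headword line lists `spelling`."""
    current_spellings = []
    found = []
    for raw_line in wikitext.splitlines():
        line = raw_line.strip()
        match = HEADWORD.match(line)
        if match:
            # Positional arguments are readings/spellings; named ones (x=y) are skipped.
            current_spellings = [
                arg for arg in match.group(1).split("|") if arg and "=" not in arg
            ]
        elif line.startswith("# ") and spelling in current_spellings:
            found.append(line[2:])
    return found


SNIPPET = """\
===Noun===
{{ja-noun|おたまじゃくし}}
# musical note
===Noun===
{{ja-noun|おたまじゃくし|オタマジャクシ}}
# tadpole
"""

print(definitions_for("オタマジャクシ", SNIPPET))
print(definitions_for("おたまじゃくし", SNIPPET))
```

Because acceptance is decided per definition rather than per Etymology section, the katakana spelling picks up only the "tadpole" sense while the hiragana spelling picks up both.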
@Eirikr, Suzukaze-c, Huhu9001 The new implementation also handles {{ja-see}} and {{ja-see-kango}} in the same manner (as the old {{ja-see-kango}}), and their only difference is that the latter speaks of "Sino-Japanese terms" instead of "terms". What about using {{ja-see}} for both kango and wago, and putting "Sino-Japanese" in the Etymology section? --Nyarukoseijin (talk) 11:26, 23 March 2020 (UTC)
- I think it is good to unify these 2 temps. I don't think it is quite necessary to clearly distinguish kango from wago in a soft redirect page. -- Huhu9001 (talk) 02:44, 24 March 2020 (UTC)
- I discovered what might be a failure mode of sorts. See 仙 and セント (sento). The call to {{ja-see}} on the 仙 page should presumably only pull in the Etymology 1 section from セント (sento), the section that has {{ja-kanjitab|alt=仙}} (which I think was also the previous implementation's behavior). However, what I see on the 仙 page is senses from both etym sections at セント (sento), which incorrectly gives 仙 the "saint" sense as well. ‑‑ Eiríkr Útlendi │Tala við mig 17:44, 27 May 2020 (UTC)
@Nyarukoseijin, Suzukaze-c, anyone else -- could you please have a look at the いただきます page and suss out why this template isn't working correctly there? I suspect it might be related to the POS header at lemma form 頂きます, but that's just a guess. ‑‑ Eiríkr Útlendi │Tala við mig 22:15, 11 June 2020 (UTC)
- Done? This made it better somehow. —Suzukaze-c (talk) 04:26, 13 August 2020 (UTC)
Pinging @Suzukaze-c, Huhu9001, welcoming anyone else with insight --
Issue
{{ja-see-kango}} doesn't handle alt spellings very well. If a listed kanji compound is an alternative form entry that is just a stub, it gets placed at the bottom of the list in smaller font:
- (The following entry is uncreated: [KANJI].)
I haven't confirmed, but this might affect {{ja-see}} too.
Background
I just had a go at 器械・機械 and きかい, lemmatizing at 器械. The other two use {{ja-see-kango}}. The two kanji entries are rendering as expected. However, きかい has 機械 (the stub entry) at the bottom, stating that it hasn't been created yet.
Ideas
I've found that this happens if {{ja-see-kango}} can't find an alt spelling or a kana spelling. I'm not sure of the best way of adding it to a stub entry; 機械 has {{ja-kanjitab}}, so we could presumably add alt=きかい, but that feels weird since this isn't really an "alternative" spelling, strictly speaking, and listing kana in the "alternative spellings" box looks odd. I also tested that and it didn't seem to work, so if we decide to go with this approach, it would require a change to {{ja-see-kango}}, and possibly {{ja-see}}.
Looking forward to your input.
‑‑ Eiríkr Útlendi │Tala við mig 18:24, 21 September 2020 (UTC)
- I don't understand the code, but my guess is that it calls 機械 'uncreated' because there isn't really content, just {{ja-see}}. —Suzukaze-c (talk) 20:22, 22 September 2020 (UTC)
- (I still don't really understand, but following variables, it seems to call it 'uncreated' because there aren't any definitions, and {{ja-see}} isn't a definition. —Suzukaze-c (talk) 20:25, 22 September 2020 (UTC))
- I'll echo Suzukaze's guess.
- To add to that, in combination with Dine2016's notes at User talk:Eirikr#Reply to your question: {{ja-see-kango}} is not intended to point to entries that amount to little more than stubs as alternative spellings, which is effectively what 機械 is.
- I didn't understand the intended behavior when I last edited the きかい entry, and the template's output message stating that the 機械 entry didn't exist -- when it clearly does, albeit as a stub -- was confusing to me.
- The easy solution at the きかい entry is simply to remove 機械 from the list of arguments. ... Which I've now done. :) ‑‑ Eiríkr Útlendi │Tala við mig 17:31, 19 July 2021 (UTC)
"key" param -- what is this for?
The documentation isn't clear what the use case is for this parameter. Could someone please explain? ‑‑ Eiríkr Útlendi │Tala við mig 22:49, 24 December 2020 (UTC)
- @Eirikr: As far as I understand, it's just like |pagename= of other templates. Used only in tests. -- Huhu9001 (talk) 14:55, 12 January 2021 (UTC)
Message from Dine2016
This is Dine2016, the original creator of this template.
There are some things I did wrong with this template:
- 1. Having two variants, {{ja-see}} and {{ja-see-kango}}, is certainly wrong. Most Japanese dictionaries don't treat Sino-Japanese words specially. It is unclear why Wiktionary should.
- I initially introduced {{ja-see-kango}} so that Sino-Japanese words could be grouped together. Later I realized that native words were also rich in homophones, and remodeled {{ja-see}} after {{ja-see-kango}}. Now the two templates are almost identical; the only difference is that the latter says "Sino-Japanese" in the footer.
- The current practice seems to group soft-redirects to native words in one Etymology header, and those to Sino-Japanese words in another. I suggest dropping this boundary, using a single ===Etymology n=== section and a single {{ja-see}} for all soft-redirects on the same page.
- 2. Creating {{ja-gv}} was also a mistake. The original assumption was that graphemes like 竜/龍, 灯/燈, 画/畫/畵, and 代々/代代 were always interchangeable, regardless of which words they spell. So we could simplify
WORDS SPELLINGS だいだい / 代々 代々 ─ 代代 / 代々 よよ / 代代 世々 ─ 世々 \ 世世 せぜ 世々 ─ 世々 \ 世世
- to
WORDS SPELLINGS GRAPHEME-VARIANTS だいだい 代々 代々 \ / 代々 \ 代代 よよ / 世々 世々 \ / 世々 \ 世世 せぜ / 世々
- The problem is that I didn't convey this two-level hierarchy clearly (I couldn't write good English, to begin with), so people started using the template in the wrong way, like redirecting from 代代 to 世々. This totally defeated what the template was created for.
- 3. {{ja-see}} requires the alternative spelling to appear in the main entry. This is not necessary if the main entry contains only one word, in which case no ambiguities will arise.
- 4. Current version looks too ugly. Unfortunately I was not a web designer and didn't know how to make it mobile-friendly.
Part of the reason I couldn't write good English lies in the fact that Wiktionary's terminology was unclear. For example, Wiktionary has never developed a concept for "words" (or more accurately, "lexical items"). Instead, its entries are organized around spellings. So one entry may contain several words (くらい = 'dark', 'rank/approximately', 'eating') and one word may span several entries ('rank/approximately' = くらい, くらゐ, 位). {{ja-see}} requires you to think in words, to organize the data around words, which is difficult in a spelling-centric mode of editing. So I kinda regret creating these templates.
(Notifying Eirikr, TAKASUGI Shinji, Atitarev, Suzukaze-c, Poketalker, Cnilep, Marlin Setia1, Huhu9001, 荒巻モロゾフ, 片割れ靴下, Onionbar, Shen233, Alves9, Cpt.Guapo): --2409:894C:3C36:279D:5B65:2A08:CFCA:DED6 16:23, 11 July 2021 (UTC)
"The following entry is uncreated"—what?
I don't understand what the template means when it says this. Is it broken in detecting whether entries are created, or is it worded/placed poorly?
I see it in entries like 遣る, where it says:
- (The following entry is uncreated: やる.)
but やる does, in fact, exist—it's the main entry for the word. What's going on here? --TreyHarris (talk) 20:01, 27 November 2021 (UTC)
- Fixed. The reason is that you used a separate kanjitab for each alternative form, when in fact you just need one kanjitab with the alternative forms grouped together, separated by commas. Because of how the template was designed, having two kanjitabs confused the ja-see template, as it didn't know which one to direct to. Shen233 (talk) 20:36, 6 December 2021 (UTC)
Category
I noticed that this template automatically adds entries to Category:Japanese non-lemma forms, which seems problematic. It's a bit inconsistent for Category:Japanese lemmas to contain some kanji spellings, but not others. Binarystep (talk) 23:02, 14 March 2022 (UTC)
Show romaji at kana-only spellings
I understand why we don't transliterate when this template is placed at a spelling that contains kanji, but when there's nothing but kana there are no technical reasons not to: we don't have to worry about dealing with multiple readings, and it should be easy to transliterate plain kana without having to fish around in other entries like we do when kanji are involved. We don't even have to link to a romaji entry- it would be fine to just display it like we do in our other templates: かな (“kana”). As for the matter of consistency: I would argue that almost all kana-only entries with this template are basically transliteration entries already- but in a script that's opaque to those who haven't completely mastered the kanas (i.e., most of our readers). Chuck Entz (talk) 07:39, 1 May 2023 (UTC)
Hyphen after keyword
I'm not a big fan of the hyphen in e.g.: For pronunciation and definitions of ことば – see the following entry/entries. I was thinking about switching the sentence around, as "See the following entry for" etc., thus putting the term at the end. But I wanted to ask if this isn't a stupid idea first. DAVilla 15:39, 3 August 2024 (UTC)