Module talk:IPA/data/symbols
Multiple XSAMPA symbols?
@Giorgi Eufshi, Dixtosa: Is there any way to associate multiple XSAMPA symbols with the same IPA symbol? There are comments saying that P _R _F _~ are equivalent to v\ _/ _\ ~ respectively; in addition ' is equivalent to _j, and I'd personally like to add , as an equivalent to = since using = within templates like {{x2i}} is awkward. —Aɴɢʀ (talk) 16:04, 30 November 2016 (UTC)
- I just moved some of the data from Module:IPA/data. I have no understanding of either data module. --Dixtosa (talk) 17:50, 30 November 2016 (UTC)
- @ZxxZxxZ, CodeCat, what do y'all think? Is it possible? —Aɴɢʀ (talk) 18:45, 30 November 2016 (UTC)
@Angr, Dixtosa: It would be easy to achieve this if a separate table of X-SAMPA and IPA were created. I could copy my table from w:Module:Sandbox/Erutuon/X-SAMPA. — Eru·tuon 00:16, 21 February 2017 (UTC)
- Done. And now N=, N_0 are correctly transformed to ŋ̍ ŋ̊ rather than ŋ̩ ŋ̥. — Eru·tuon 01:37, 21 February 2017 (UTC)
- But I need to make the "with descender" form of the diacritic also be used when there is already a diacritic underneath the letter. — Eru·tuon 01:30, 27 February 2017 (UTC)
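The separate X-SAMPA table described in this thread could look roughly like the sketch below. It is written in Python for illustration only and is not the contents of the actual module: the table entries, the `DESCENDERS` set and the function name are all assumptions, and `,` is only the alias proposed above. Several X-SAMPA spellings point to the same IPA symbol, and a below-diacritic is swapped for its above variant after a letter with a descender.

```python
# Hypothetical alias table: several X-SAMPA spellings map to one IPA symbol.
XSAMPA_TO_IPA = {
    "P": "\u028b", "v\\": "\u028b",   # both spell ʋ
    "_R": "\u030c", "_/": "\u030c",   # rising tone
    "_F": "\u0302", "_\\": "\u0302",  # falling tone
    "_~": "\u0303", "~": "\u0303",    # nasalization
    "'": "\u02b2", "_j": "\u02b2",    # palatalization
    "=": "\u0329", ",": "\u0329",     # syllabic ("," is the proposed alias)
    "_0": "\u0325",                   # voiceless
    "N": "\u014b",                    # ŋ
}

# Below-diacritics and the above variants used after a descender.
ABOVE_VARIANT = {"\u0329": "\u030d", "\u0325": "\u030a"}

# Illustrative, incomplete set of letters with a descender.
DESCENDERS = set("ŋɡjpqy")


def xsampa_to_ipa(text):
    """Convert X-SAMPA to IPA using longest-match lookup in the alias table."""
    keys = sorted(XSAMPA_TO_IPA, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                symbol = XSAMPA_TO_IPA[key]
                # Use the above variant when the previous letter has a descender.
                if symbol in ABOVE_VARIANT and out and out[-1] in DESCENDERS:
                    symbol = ABOVE_VARIANT[symbol]
                out.append(symbol)
                i += len(key)
                break
        else:
            # Unknown characters are passed through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)
```

This sketch only looks at the immediately preceding letter, so it does not cover the case mentioned above where the letter already carries a diacritic underneath.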
Pharyngealisation
@CodeCat, could you please add ˤ as an allowable character? I'm not sure what the right way of adding it is. —Μετάknowledgediscuss/deeds 00:06, 21 February 2017 (UTC)
- I noticed that character wasn't included in the list. It's actually not a real IPA character: MODIFIER LETTER SMALL REVERSED GLOTTAL STOP. The IPA character is MODIFIER LETTER REVERSED GLOTTAL STOP. The two look identical in some fonts: ˤ vs. ˁ. — Eru·tuon 00:13, 21 February 2017 (UTC)
- That's great to know! Then we won't add it, and I'll tell DTLHS to add it to the characters to be globally replaced. —Μετάknowledgediscuss/deeds 00:16, 21 February 2017 (UTC)
- @Erutuon, ř is a character that should be added, however, since it seems that the whole Unicode decomposition thing is getting in the way again. —Μετάknowledgediscuss/deeds 00:27, 21 February 2017 (UTC)
- Or perhaps the IPA could be decomposed and we could search for combining diacritics (making the list of valid characters much shorter). Hmm. In any case, I've added that character. — Eru·tuon 01:11, 21 February 2017 (UTC)
- @Erutuon: That would indeed be good. We have the same issue with ŋ̍, ɺ̡, ɺ̢, and other combinations as well, it would seem. —Μετάknowledgediscuss/deeds 07:32, 21 February 2017 (UTC)
- @Metaknowledge: Ahh, the problem with those is the diacritics. Apparently they're not on the list. The list is frustrating because it's not very organized. I'll do some work on it. — Eru·tuon 07:46, 21 February 2017 (UTC)
- Much appreciated. Category:IPA pronunciations with invalid IPA characters had about 15,000 members when I suggested working on it to DTLHS and now it has 11,000-odd members. I've been cleaning up ones with actual problems by hand, but a large part seems to be IPA that's fine, just not recognised by the module. —Μετάknowledgediscuss/deeds 07:51, 21 February 2017 (UTC)
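The decomposition idea floated above can be sketched as follows, in Python rather than the module's own language. The two whitelists are small hypothetical samples, not the module's real lists. The text is put into NFD form so that a precomposed letter such as ř becomes r plus a combining caron, and base letters and combining diacritics are then checked separately.

```python
import unicodedata

# Hypothetical, deliberately tiny whitelists for illustration.
VALID_BASES = set("abdefhijklmnoprstuvwxzŋɺ")
VALID_DIACRITICS = {"\u030c", "\u030d", "\u0329", "\u0325"}


def invalid_characters(ipa):
    """Return the characters of the NFD-decomposed input that are not whitelisted."""
    bad = []
    for ch in unicodedata.normalize("NFD", ipa):
        if unicodedata.combining(ch):
            # Combining marks are checked against the diacritic list.
            if ch not in VALID_DIACRITICS:
                bad.append(ch)
        elif ch not in VALID_BASES:
            bad.append(ch)
    return bad
```

With this approach a precomposed ř needs no entry of its own, while ɺ̡ is still flagged because U+0321 is not in the sample diacritic list.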
@Metaknowledge: Hm, it seems that the palatalized hook below and rhotic hook below (Unicode hex numbers 0x321 and 0x322) are obsolete or nonstandard, but are not in the list of obsolete or nonstandard symbols. I think that instead of ɺ̡, ɺ̢ invalid IPA characters (̡), ɺʲ, invalid IPA characters () should be used, but I could be wrong because I am not familiar with any languages whose transcription uses those symbols. — Eru·tuon 01:35, 27 February 2017 (UTC)
- I think its only use here is for Pashto, and it follows the use in the Wikipedia article Pashto phonology. I see as a box, which certainly does not make me want to recommend it, but I don't know what characters are best. —Μετάknowledgediscuss/deeds 01:40, 27 February 2017 (UTC)
- I see that character as a box unless it is enclosed in templates that tag it with class="IPA". I guess it must be so newly added that my browser (Chrome) doesn't know which fonts to assign it. Gentium, the font that I have assigned to the IPA class in my common.css, does have the letter, though. Oh, now that I look at w:Retroflex lateral flap, the symbol invalid IPA characters () isn't standard IPA? Huh. I guess I will add the rhotic hook, but not the other diacritic. It's not listed in w:International Phonetic Alphabet, but it is probably semi-valid at least. — Eru·tuon 02:17, 27 February 2017 (UTC)
- is a Private Use character. Don't add it, its appearance depends on the whims of the font maker. —suzukaze (t・c) 02:23, 27 February 2017 (UTC)
@Metaknowledge, Erutuon According to w:Pharyngealization#IPA symbols (as mentioned at Wiktionary:Grease_pit/2019/June#IPA_template), we have the pharyngealization symbols backwards. The official IPA symbol is ˤ (U+02E4 modifier letter small reversed glottal stop), which the module is currently deprecating. The symbol the module is currently preferring is ˁ (U+02C1 modifier letter reversed glottal stop), which WP says "the IPA Handbook does not mention … at all". —Mahāgaja · talk 13:42, 19 June 2019 (UTC)
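Since ˤ and ˁ look identical in some fonts, and a Private Use character shows up as a box, printing code points is a safer way to tell such characters apart than eyeballing them. A minimal Python sketch using only the standard `unicodedata` module; the helper names are made up for illustration.

```python
import unicodedata


def describe(ch):
    """Return the code point, Unicode name and general category of one character."""
    name = unicodedata.name(ch, "<unnamed>")
    return "U+%04X %s (%s)" % (ord(ch), name, unicodedata.category(ch))


def is_private_use(ch):
    """True for Private Use characters, whose appearance depends on the font."""
    return unicodedata.category(ch) == "Co"


for ch in "\u02e4\u02c1":
    print(describe(ch))
```

Private Use characters have no Unicode name, so `describe` reports them as `<unnamed>`, and `is_private_use` can be used to reject them outright.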
Spacing "raised" diacritic
@Erutuon, CodeCat: Can we add ˔ (the spacing equivalent of the "raised" diacritic ̝ ) to the list of valid characters? It needs to be used, for example, at tlakate, since the symbol it modifies already has another diacritic beneath it. Thanks! —Aɴɢʀ (talk) 18:03, 21 February 2017 (UTC)
- @Angr: Done! I also added the spacing version of the "lowered" diacritic, which was missing. — Eru·tuon 04:25, 23 February 2017 (UTC)
Superscript parentheses
The majority of the entries with invalid characters are coming from {{ru-IPA}} generating pronunciations with ⁾ and ⁽. Is this absolutely incorrect, or can these symbols be added? DTLHS (talk) 02:48, 22 February 2017 (UTC)
- In what context are those parentheses used? Around superscript j? — Eru·tuon 03:01, 22 February 2017 (UTC)
- @Benwing2 Can you provide context? DTLHS (talk) 03:59, 23 February 2017 (UTC)
- Yes, these are around superscript j, indicating optional palatalization of a consonant, typically when directly preceding another palatalized consonant. Benwing2 (talk) 14:17, 23 February 2017 (UTC)
- The raised parens are also used by {{ny-IPA}} for the transcription of the sound formerly spelled with ŵ. —Aɴɢʀ (talk) 15:18, 23 February 2017 (UTC)
I think the superscript parentheses should be allowed; ⁽ʲ⁾ and ⁽ᵝ⁾ look neater than (ʲ) (ᵝ). I've added a rule in Module:IPA that allows these parentheses, but only around superscripts. (Hopefully I haven't broken anything in creating the list of superscripts, though.) — Eru·tuon 22:12, 23 February 2017 (UTC)
- Still seeing the invalid character message (абсорбировался for example) DTLHS (talk) 22:38, 23 February 2017 (UTC)
- Oops, a regex syntax error. Now it works. — Eru·tuon 22:42, 23 February 2017 (UTC)
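A rule of the kind described above, allowing superscript parentheses only around superscripts, might be sketched like this in Python. The list of superscripts and the function name are assumptions for illustration, and the real rule in Module:IPA may be written quite differently.

```python
import re

# Illustrative, incomplete list of superscript letters.
SUPERSCRIPTS = "ʲᵝʷˠˤʰⁿ"

# A well-formed group: raised parentheses enclosing one or more superscripts.
ALLOWED_GROUP = re.compile("⁽[" + SUPERSCRIPTS + "]+⁾")


def has_stray_superscript_parens(ipa):
    """True if a raised parenthesis occurs anywhere outside a well-formed group."""
    remainder = ALLOWED_GROUP.sub("", ipa)
    return "⁽" in remainder or "⁾" in remainder
```

Removing the well-formed groups first and then looking for leftover parentheses catches unpaired parentheses as well as ones wrapped around ordinary letters.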
Preocclusion symbols
@Erutuon, Octahedron80, DTLHS: Can someone please add ᵇ ᵈ ᶢ to the list of valid IPA characters? They're needed for preoccluded consonants in languages like Manx. Or should we be using b͡m d͡n ɡ͡ŋ instead? —Aɴɢʀ (talk) 15:43, 10 October 2017 (UTC)
- @Angr: I'm not sure which is correct according to the standard use of the IPA. The Wikipedia article doesn't mention pre-occlusion. But I've added those symbols anyway. It can be up to you which to use. — Eru·tuon 22:38, 10 October 2017 (UTC)
- I thought these are [bⁿ] [dⁿ] [gⁿ]. --WikiTiki89 22:43, 10 October 2017 (UTC)
- Those would be nasal release. kwami (talk) 21:02, 11 May 2022 (UTC)
extIPA
@Benwing2, Erutuon, Octahedron80, Rua, Surjection: On Kwamikagami's talk page (User talk:Kwamikagami#𝼆) he and I discussed the possibility of adding some extIPA characters to the list of approved IPA characters here. That way we could use ⟨𝼆⟩, for example, instead of ⟨ʎ̥˔⟩ for the voiceless palatal lateral fricative. What do y'all think? I would support adding only the extIPA characters that represent sounds found in non-disordered speech, namely (per Wikipedia) ⟨ʪ ʫ ꞎ 𝼅 𝼆 𝼄 ¡⟩. —Mahāgaja · talk 20:41, 11 May 2022 (UTC)
- For a bit of background, JIPA will accept extIPA characters for its 'illustrations of the IPA' articles. (I specifically asked about <𝼆 𝼄>.) Ladefoged used <ꞎ> in his description of Toda.
- There's also [𝼈], found in languages spoken by hundreds of millions of people, and the rarer [ᶑ] and [𝼊] (the latter most commonly rendered with non-IPA <‼>). The IPA supported superscript variants of all three of what they called "implied" IPA letters as part of last year's expansion of Unicode support of the IPA.
- Some of the extIPA diacritics are also used in non-disordered speech, e.g. unaspirated, alveolar and the parentheses, occasionally others. kwami (talk) 20:45, 11 May 2022 (UTC)
- That's true; we already use ◌͈ "stronger articulation" for Old Irish and, I believe, Korean. ⟨ᶑ⟩ is already whitelisted in the module. —Mahāgaja · talk 06:47, 12 May 2022 (UTC)
Chi
@Mahagaja We should be using the Greek letter, not the Latin one. Appendix 2 of the Handbook of the International Phonetic Association specifies the code point for the voiceless uvular fricative to be 03C7, as does the List of symbols and diacritics with descriptions & identifiers published in 2020. Nardog (talk) 22:58, 17 June 2023 (UTC)
Update: I thought Mahagaja had added code that suggested the Latin one for a Greek input, but now I see it was the other way around. So my comment only concerns his summary. Nardog (talk) 11:44, 20 June 2023 (UTC)
- OK. Since I knew what I had done all along, I assumed that the comment above was solely in reply to my edit summary and not my actual edit, which is why I didn't bother answering. —Mahāgaja · talk 11:55, 20 June 2023 (UTC)
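For reference, the direction of the replacement discussed in this section (Latin chi in, Greek chi suggested) can be sketched as below. This is a Python illustration and not the code that was added to the module; the constant and function names are made up, and it assumes the Latin letter in question is U+AB53.

```python
# Assumed code points: U+AB53 is the Latin chi, U+03C7 the Greek chi cited above.
LATIN_CHI = "\uab53"
GREEK_CHI = "\u03c7"


def normalize_chi(ipa):
    """Replace the Latin chi with the Greek chi; all other text is unchanged."""
    return ipa.replace(LATIN_CHI, GREEK_CHI)
```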
Invalid characters
@Theknightwho, are your recent edits the reason why there are 13,000+ entries in Category:IPA pronunciations with invalid IPA characters? Most of them are alphabetized in the category under "Ɩ" even though they don't even use that character and never did. —Mahāgaja · talk 07:50, 4 December 2023 (UTC)
- @Mahagaja There was a bug a couple of days ago which accidentally put everything into that category - I assume the ones still in there are simply filtering through, since pages don't update immediately. Theknightwho (talk) 15:00, 4 December 2023 (UTC)