Wiktionary talk:About Hebrew/archives/2008

My thoughts on Hebrew

Hi. I'm Gilgamesh, and I've been working privately in Hebrew for years. I'm neither Jewish nor Israeli, nor am I 100% fluent in the language (mostly just as an issue of finer points I periodically refresh myself with). My main study has been Biblical Hebrew from a nonsectarian approach, though I eventually started to study some Modern Hebrew so I could better understand the nuances and differences. And here are some things from my years of observation.

Biblical Hebrew is still an important study for historical, religious and academic purposes. No one person has really seemed to agree on which phonemic analysis of Biblical Hebrew is the "right" one, but observations have shown that by far the most detailed and helpful one is Tiberian vocalization, a now-abstract set of phonological distinctions reflecting the liturgical language in Tiberias when and where the Masoretic Text was standardized. When transcribed diligently, it is phonologically adaptable to any living dialect. I've seen subtle variations in transcription style, but they all have had these symbols in common: ʼ b g d h w z ḥ ṭ y k l m n s ʻ p r š ś t i ē e a ā o ō u. Beyond that, I've seen subtle variations. Most of the most professional sources I've encountered have been either in print or PDF, as it seems HTML and Unicode aren't yet used or exploited adequately.
- The question of distinguishing hard and soft begadkepat is one of mere allophony, as the Masoretic Text attested both soft-hard and soft-soft consonant clusters of these. In their most polished academic forms, I've seen the soft forms written with macrons below the letters, or above the letters with descenders (such as g p). However, considering that there is no p-macron in Unicode and it is still difficult to do this neatly using combining diacritics, sometimes ġ ṗ are used for the letters with descenders. ġ has a precedent in Arabic transcription where it is also used for soft G, and Hebrew has no emphatic P so there can be no educated confusion of ṗ. I have also seen (both electronically and in print) bh gh dh kh ph th, though this seems to draw the ire of transcriptionists who seem to demand one-symbol-per-phoneme. These are not ambiguous though, as hard-starting b-h g-h d-h k-h p-h t-h clusters do not exist in the Masoretic Text.
- For צ, I've seen ṣ, ẓ and ç in various publications. ṣ is in common with the transcription for Arabic ص, and is widely used by linguists who believe the Tiberian dialectal pronunciation of צ was fricative. Some use it whether or not they believe it was fricative or affricate. But as often as ṣ, I've also seen ẓ in use for this same Masoretic phoneme. ç is used in at least one of the books on my bookshelf (Teach Yourself Biblical Hebrew). Considering that the Hebrew Academy for the modern language has traditionally used ẕ for the modern language, it seems more familiar overall to use ẓ to transcribe the Masoretic phoneme, though this may ultimately be a matter of individual taste as there are no other phonemes in Hebrew that could be confused with any of these symbols.
- For ק, the vast majority of sources I've encountered used q. An extremely slim minority use ḳ. Again, probably a matter of individual taste.
- Using ś for שׂ with sin dot is under the understanding that it was traditionally a lateral fricative and written ש not ס, but it does not imply by necessity that it was any different from s at the time of the Masoretic Text.
- The academic consensus is at the very least that Hebrew of the Masoretic Text has at least seven long vowels and at least five short vowels. Not everyone agrees on exactly how to distinguish these. Zere and holem are always long, and few disagree on transcribing them ē ō. Likewise, long A-grade qamez is transcribed ā. The five short vowels are i e a o u. Among hireq, seghol, pathah, qamez and qibbuz/shureq, the vowel is short if it is unstressed word-final, or followed by a consonant cluster or a doubled consonant (when the vowel is still not stressed). A vowel is long if it is stressed or otherwise emphasized, or far more commonly always when it comes before a single consonant before another syllable with a fully articulated vowel, thus giving Masoretic Hebrew a tendency towards having generally more long vowels than short vowels. If a long vowel exists before a consonant before a shewa, then it usually (but not always) means that the shewa is shewa na and not shewa nah, e.g. Asenath not Asnath. But this rule isn't absolute, e.g. Gershom not Gereshom. This confused translators of the King James Version of the Bible, where both (otherwise identically-named) Basemath (wife of Esau) and Basmath (daughter of Solomon) can be found. Syllables before maqqeph are treated as if they flow run-on into the next word, so that even the vowel in כָּל־ is short.
- Every consonant written immediately before another consonant without taking so much as a shewa nah is silent and treated as if it doesn't exist, so that even the spelling of Issachar (יִשָּׂשכָר) is not ambiguous at all. Consonants at the ends of words or before maqqeph are fully pronounced with the exceptions of aleph without mappiq, or he without mappiq, or yodh after hireq, zere or seghol, or waw with holem, or shureq. When a vowel carries such a quiesced consonant (hireq-yodh, zere-yodh, qamez-he, holem-waw, shureq, etc.), some transcribe them î ê ệ ậ â ô û. This is fundamentally extraphonemic and does not necessarily imply that the vowel is long, as context and not quiescence principally determines whether a Masoretic vowel is long or short.
- Vowels of words with nonfinal accents are commonly transcribed with an acute mark ḗ é á ṓ etc. But Unicode has no ā with additional acute mark, and combining diacritic neatness problems still exist. Some may use ǻ, though this is fundamentally a kludge as it does not mesh visibly with ḗ ṓ.
- Some approaches (including the one I use) fundamentally assume that the last long vowel of a word is stressed, and the last stressed vowel of a word is long, and the vowels after the stressed vowel can be treated as if they are short. (This was partially a concession to an Ashkenazi Orthodox friend in England who insisted that I not prescribe the stress explicitly, as he along with many old-world Ashkenazi Jews tend to use different syllable stress.) I have tended to use ī ē ẹ ạ ā ō ū, avoided mixing with any î ê ệ ậ â ô û, and I've transcribed word-final unstressed qamez as å, which brings me to my next point.
- Though the Masoretic Text would seem to imply that qamez has only one vowel articulation in its long and short forms, this would only mean that Tiberian Hebrew merged its A-grade and O-grade qamez like Ashkenazi and Yemenite pronunciations, but this is known not to have been universally true, otherwise A-grade and O-grade distinctions would not exist at all. To solve this, and without by necessity prescribing different pronunciations for them, I transcribe typical long qamez as ā (as it is almost always A-grade), typical short qamez as o (as it is almost always O-grade), final unstressed qamez as å (as it is always A-grade but unstressed), and I transcribe the rare long O-grade qamez as ọ (where it usually represents an emphatic variant pronunciation of a short O-grade, e.g. ʼOhŏlīḇāmā and ʼỌhŏlīḇāmā). Few premodern dialects traditionally distinguished the A-grade and O-grade long qamez (not even Spanish Sephardic, e.g. Aholivama), but I use ọ to emphasize "this is O-grade, however you intended to pronounce it". It is notable though that Israeli Hebrew has fully congealed A-grade and O-grade qamez as a and o.
- Shewa na and the hataphim have been seen in different styles. Some write them all as small superscript letters ^{ə e a o}, but this is not commonly trivial in computer displays. Far more commonly, ə ĕ ă ŏ is seen. Sometimes, əy becomes ĭy (I tend to do this). Some transcriptions mark them as is regardless of the consonant under which they are found (ʼAškănạz and Mordŏḵạy), whereas others (including myself) tend to treat the hataphim as vowels that behave identically to shewa (ʼAškənạz and Mordəḵạy) except when they are under the guttural letters aleph, he, heth or ayin. And while many (including myself) are content to let a shewa nah remain silent under one of these consonants (Isaiah יְשַׁעְיָ֫הוּ is Yəšaʻyāhu not Yəšaʻăyāhu), shewa na under one of them tends to become ă (Halah חְלַח is Ḥălạḥ not Ḥəlạḥ). And in my own work, since short vowels before single consonants before vocalized vowels can only be either shewa or hataphim (except if the latter vowel is itself one of the hataphim), I found myself after a while not marking the hataphim with a breve at all, and so they become ə iy e a o everywhere without the slightest bit of ambiguity in transcription. And though shewa na and hateph seghol are not universally pronounced the same, under these conditions, it is possible even to write them e iy e a o without any real confusion, though in my more polished work I tend to still use ə.
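The hataphim convention in the last point can be written out as a small decision rule. The Python sketch below is only an illustration: the function name `reduced_vowel` and the mark labels are invented for this example, and it encodes just the personal convention described above (breve forms kept under gutturals, plain ə elsewhere), not ISO 259 or any published standard.

```python
import unicodedata

# Guttural letters in the transcription used above: aleph, he, heth, ayin.
GUTTURALS = {"\u02bc", "h", "\u1e25", "\u02bb"}  # ʼ h ḥ ʻ

# Reduced vowels as commonly written with a breve (labels are hypothetical).
BREVE_FORMS = {
    "shewa_na": "\u0259",       # ə
    "hateph_seghol": "\u0115",  # ĕ
    "hateph_pathah": "\u0103",  # ă
    "hateph_qamez": "\u014f",   # ŏ
}


def reduced_vowel(consonant: str, mark: str) -> str:
    """Pick the transcription of a reduced vowel under a given consonant."""
    if mark not in BREVE_FORMS:
        raise ValueError(f"unknown mark: {mark}")
    # Normalize so that 'h' + combining dot below is recognized as heth.
    consonant = unicodedata.normalize("NFC", consonant)
    if consonant in GUTTURALS:
        # Under a guttural, shewa na surfaces as ă; hataphim keep their colour.
        return "\u0103" if mark == "shewa_na" else BREVE_FORMS[mark]
    # Under any other consonant, every reduced vowel is treated like shewa.
    return "\u0259"
```

Under this sketch, a hateph pathah under k comes out as ə (as in ʼAškənạz), while a shewa na under heth comes out as ă (as in Ḥălạḥ).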
Though often seen as stilted, the old-fashioned Hebrew Academy transcriptions are still useful, as long as official bodies such as the United States government and the Israeli Central Bureau of Statistics (not to mention Israeli road signs) still use them. Ẕefat, Reẖovot, Qiryat Gat, Pétaẖ Tiqwa, Newe Shalom etc.
In terms of comparing Biblical and Modern Hebrew, I really think they are fundamentally the same language, closer than, say, Classical Greek is to Modern Demotic Greek. The differences are primarily in phoneme articulation (Modern Hebrew tends to merge many consonants and vowels), grammar (classical participles becoming modern present tense verbs, and the widespread modern use of the preposition של instead of the more traditional construct noun, etc.), vowel preferences (yoqṭān vs. yuktan, Yərōḥām vs. Yrukham, ribbī vs. rabi, etc.), some vocabulary, and modern creolization in Israel. The latter is notable because it has produced the e/ei phoneme split in a way different from how it existed in Ashkenazi Hebrew. Sephardic Hebrew didn't have ei, but Ashkenazi Hebrew had it for zere whether or not it was before yodh, and Israeli tends to leave zere without yodh as e, but has both seghol and zere before yodh as ei. Masoretic Hebrew has no true diphthongs at all, where zere-yodh is a pure vowel, seghol-yodh is a pure vowel, and other vowels before yodh are...a vowel before the consonant yodh.
Years ago at Wikipedia, when I was younger and quite a bit less mature and more arrogant, I wrote List of Hebrew names, which was later moved here to Wiktionary. It was chock full of misspellings (both in consonants and in vowels), arrogant false grammatical assumptions made by learners, and quite a bit of original research, and insufficient source citations. I also *ahem* hadn't actually read the entire Hebrew Bible even in English, and I left out broad huge swaths of otherwise notable names. Since then, I've gotten much better so that my original research has sharply decreased, and I can provide adequate sources and sound grammatical explanations. This is some of my latest work: User:Gilgamesh/Hebrew names. Recently I've been encouraged to participate more in Hebrew on Wiktionary because someone saw me doing this. And so, here I am.

I've been working heavily at Ancient Greek, adding tons of new entries, writing lots of technologically advanced templates (Category:Ancient Greek declension templates, Category:Ancient Greek IPA tokens, etc.), and I'm willing to help here if my input is welcome. :3 - Gilgamesh 01:48, 18 February 2008 (UTC)

One must be careful to distinguish between transliteration into English (romanizations, if you will) on the one hand and transcription of pronunciation on the other. For the latter, we can have as many transcriptions as there are or were pronunciations, and those should be in IPA, SAMPA, or what-have-you. As far as transliteration goes, though, we really need not concern ourselves much with historical phonology, and we certainly don't want to transcribe the same word twice using different schemes (unless, perhaps, we really can't agree on one). In my opinion, transcriptions should be easy to read and (hence) primarily (or completely) Latin-1. One decent transcription system was proposed by Ruakh above on this talk page (although there are some specifics in there that I don't like); another is on my Web site at [1]. The idea should be readability by a speaker of English: specific pronunciation information can be found in the Pronunciation section of the entry, and specific vowelization information in the headword (inflection line), so those two types of information are not needed in the transliteration.—msh210℠ 21:33, 26 March 2008 (UTC)

Agreed. Complex technical transliterations are used in situations when authors are not capable of 1. Hebrew script or 2. IPA script. We lack neither. Thus, the full Hebrew word, with all the crazy little marks that those Hebrew scribes put in there can be put in the entry, as well as as many pronunciations as you Hebrew folks can agree upon. However, the transliterations on non-English entries should be as simple as humanly possible, as it is simply meant to allow the reader who is not familiar with Hebrew script to get the faintest gist of the word. More specific than that belongs on the entry itself. However, I disagree with the statement about multiple transliterations if agreement cannot be reached. Agreement must be reached. There must be a single standard. So get to it. :) -Atelaes λάλει ἐμοί 05:08, 27 March 2008 (UTC)

The problem is, Hebrew transcription is one subject that no one of attached cultural/religious background has been able to agree on for centuries. For a neutral secular approach that Wikimedia projects require, we should use the best independent approach, independent of religious, sectarian or national biases. Even using something as seemingly broad as general Israeli is not neutral enough, as it is surprisingly offensive to anti-Zionist Orthodox Jews such as Satmar. There will always be someone who is going to be offended by any religious or national approach we use. So the ISO 259 standard is simultaneously secular and neutral. As for constraining transcriptions to Latin-1, I do not agree at all—this is locale-biased (Western Europe and its former colonies), and Wiktionary supports the whole of UTF-8. No corner can be cut to provide a proper scientific transcription, especially that which has been honored in professional linguistics for decades. I cannot in appropriate academic conscience abandon ISO 259, even if I am in the minority among lay people. If we want articles to sparkle with informed data, we should use the high educated standards, and not dumb it down until it's less potentially useful for higher academic consumption. Lay people may be confused at first, but that is only because they are not appropriately educated, and it would be perverse to reward ignorance for the sake of momentary convenience. Maybe I should create a new template system for Hebrew transcriptions similar to what I designed for Ancient Greek, but following Hebrew phonemic and morphological principles as the language has been phonologically codified from the 8th century (the focus of Biblical Hebrew study) to modern terms (the focus of Israeli study). - Gilgamesh 11:15, 27 March 2008 (UTC)

I am writing to agree with Ruakh's statements, below, that transcription is inherently for the uninformed, and that having a transcription aimed at those who know Hebrew is silly. Moreover, concerning Latin-1, I differ with you, Gilgamesh: you write that "this is locale-biased (Western Europe and its former colonies), and Wiktionary supports the whole of UTF-8". Well, first of all, I meant "the subset of Unicode corresponding to Latin-1" rather than "Latin-1" itself: sorry for the ambiguity. More importantly, though, if you were complaining about restricting to that subset of Unicode because it's too regional, I differ strongly: this is English Wiktionary, and we cater to those who can read our pages in English.—msh210℠ 16:36, 27 March 2008 (UTC)

Yes, for people who read in English. But editors are held to a higher standard. Besides, I could not exclusively use a transcription system that is phonologically simplified purely for Israeli use—for all Biblical words, I would use at least ISO 259, no exceptions. It would be sloppy for me to do anything less. - Gilgamesh 21:02, 27 March 2008 (UTC)

Editors are held to the standard of making sure their readers can understand what's written. Hence, we should use legible transcriptions, as I've said. Moreover, I don't understand your "I would" and "I could not". Do you mean that you would follow your own opinion even if community consensus is found to conflict with it? (There is no clear consensus at this point. I'm just asking.)—msh210℠ 22:01, 27 March 2008 (UTC)

I cannot eliminate that much phonemic detail, especially for distinctions I have well learned and practice. - Gilgamesh 11:08, 28 March 2008 (UTC)

*finished eating a small meal* I just realized how confusing some of that must have sounded. What I mean is...Hebrew is a situation where using only one transcription standard is untenable, because of delicate conflicting politics among the various established interests that make any practical use of Hebrew. If limited to one, I would use only ISO 259 always. I do not object in any way to using general Israeli phonological transcription—in fact I provide a column for it in my own projects, as you can see. But that alone is insufficient for any serious deep Biblical Hebrew study. It may be tempting to simply declare that Biblical Hebrew and Israeli Hebrew are two different languages, and to treat them differently like Ancient Greek and Modern Greek. But the situation is not quite the same, as Hebrew went for centuries not being spoken in any serious secular capacity, and so secular use was extended in the 19th century from established casual intrareligious use, like when Eliezer Ben Yehuda visited Algiers and Hebrew was the one language he and local Sephardic Jews could mutually understand. That said, if we were to start treating Biblical and Modern Hebrew as two separate categories, I would not object, though I may end up working almost entirely on the Biblical Hebrew category as that's been more or less my only study of it for years (just as I work only in Ancient Greek and not Modern Greek). I have put barely any exclusive study in Modern Hebrew (when I've studied it's always been with Biblical Hebrew also squarely in my thoughts), and to use only a general Israeli approach is painfully counterintuitive for someone like me. I routinely distinguish all the consonants when I vocalize, including even samekh [s] and sin [ɬ]. 
I once even wrote a short song to the melody of Wilt Heden Nu Treden, and the lyrics still roll off my tongue like this (with one additional syllable of pretone compared to the Dutch song, resulting in two syllables of pretone before the first formal measure): [ɾə.uː.βeːn̪ wə.ʃim.ʕoːn̪ wə.leː.wiː wiː.huː.ðɒː wɒː.ðɒːn̪ wə.n̪aɸ.t̪ɒːliː wɒː.ɣɒːð wə.ʔɒːʃeːɾ wi.jiɬ.ɬɒː.xɒː.ɾ-uz.βuː.luːn̪ wi.joː.seː.ɸ-um.n̪aʃ.ʃɛː wə.ʔɛɸ.raː.ji.m-u.βin̪.jɒː.miːn̪]. (I flapped all the reshes except for the last one, which I found myself always heavily trilling off the tip of my tongue.) - Gilgamesh 11:54, 27 March 2008 (UTC)

It seems like transliterations are inherently for the uninformed — those who don't know how to read the script in question. (Also for the fontless, I suppose — those whose browsers won't display the script in question — but that's not usually an issue for Hebrew, and anyway what you're suggesting is no better for them.) Catering our transliterations specifically to those whose understanding of Hebrew is deeper than most native speakers' doesn't seem like a good idea; however, I'd be O.K. with a proposal that was essentially Modern-Hebrew-transliteration-plus-stray-marks-for-the-erudite. Unfortunately, for this we need at least a few digraphs (shin, tzadi, probably khaf/khet, … maybe also thav, to split the difference between tav and sav). —Ruakh_TALK 11:58, 27 March 2008 (UTC)

ISO 259 specifies certain spellings. See this at Wikipedia. It's all taken care of. - Gilgamesh 12:04, 27 March 2008 (UTC)

It's not as simple as that. There is no reason that Wiktionary needs to retain that standard, as Wiktionary is a different situation with different needs. Ultimately, the major difference between ISO 259 and our current standards is the removal of all the diacritics. The simple fact is that we don't need to be that specific. Since we deal with the actual Hebrew spellings and can have actual IPA pronunciations on the entries, the kind of specificity that ISO 259 has is not necessary. On Wiktionary, transliterations are for the uninformed, and thus using an extremely specific, technical romanization is oxymoronic. -Atelaes λάλει ἐμοί 21:44, 27 March 2008 (UTC)

Agreed.—msh210℠ 22:01, 27 March 2008 (UTC)

Disagreed. It's not merely the removal of all the diacritics. It's the removal of phonemic distinction for the sake of Israeli-only phonology. Aleph and ayin are not the same. Bheth and waw are not always the same. Heth and khaph are not the same. Teth, taw and thaw are not the same. Kaph and qoph are not the same. Samekh and sin are probably the same at least during the Masoretes, and I could concede that. Zadhe is not necessarily the same as taw+samekh, taw+sin, teth+samekh or teth+sin. As I always studied Biblical Hebrew, I learned these distinctions well, and it is how I've always practiced them. And I'm not alone—there is still some distinction in the Ashkenazi dialects, and far far more in the Yemenite dialects. But no one is going to insist firmly that these are separate languages. We do not treat English as if only Received Pronunciation matters—we also have other parts of the U.K., and Ireland, and the U.S., Canada, Australia, New Zealand, South Africa, Singapore. We don't treat them all as irrelevant. And we shouldn't treat global Hebrew that way either, as nearly all of it (except Samaritan Hebrew, which is a different enough language) derives from a common Masoretic standard. For specifically Israeli vocabulary, I have no problem with using Israeli-only transcriptions. But for Biblical vocabulary, it would be wholly irresponsible to use anything less than codified Masoretic phonology—ISO 259. I would sooner have this project split in two than abandon it, because to abandon it would be to say that the only thing that matters internationally is what's spoken on the streets of Tel Aviv and Haifa, and though that tends to be better known, it is not the entire picture of Biblical Hebrew. I will work on this project if I can, but I will practice the nucleus of my years of book studies—Biblical Hebrew and its codified Masoretic phonological detail. Everything else I work with is only as a subset of that study. And I've studied a great deal over the years.
You cannot suggest to me that there is no place for that here, at "About Hebrew". If it's only the Israeli vernacular, then it might as well be "About Israeli Hebrew". But it's "About Hebrew". And where "Hebrew" is considered, it includes Tiberian Hebrew, Yemenite Hebrew, Iraqi Hebrew, Ashkenazi Hebrew, Lithuanian Hebrew, Sephardic Hebrew, and Israeli Hebrew. They are a common language, unless this project is forced to split on the issue. - Gilgamesh 11:08, 28 March 2008 (UTC)

*ate another meal* (I think I become easier to work with and less argumentative after I have a good meal. X3) You know, on second thought... If the major complaint is unreadability of transcription, I am also familiar with more informal transcription systems for Biblical-detail Hebrew that use fewer Unicode characters. Consonants: ', bh, b, bb, gh, g, gg, dh, d, dd, h, w, ww, z, zz, ħ, ţ, ţţ, y, yy, kh, k, kk, l, ll, m, mm, n, nn, s, ss, `, ph, p, pp, ç (or ş), çç (or şş), q, qq, r, š (or sh), šš (or ssh or shsh), ś (or s), śś (or ss), th, t, tt. Vowels: ə (or ĕ or e), ĕ (or e), ă (or a), ŏ (or o), i, í (or ī), ī, î (or ī or i depending on length), ē, ê (ēy maybe to mark the Israeli diphthong but technically does not exist masoretically except in ēyy), e, é (or ẹ), ẹ, ệ (or ẹ or e depending on length), a, á (or ạ), ạ, ậ (or ạ or a depending on length), å (or ā word-final if there's an acute-accented vowel earlier in the word), ā, â (or ā or å depending on length), o, ō, ô (or ō), u, ú (or ū), ū, û (or ū or u depending on length). Now, to acute-accent or not to acute-accent... In my own experience, I've been able to work without it just by clearly marking which vowels are short or long and marking a final unstressed qamez as å. But the majority of approaches I've seen use some kind of acute-accentation, at least with é. As for the circumflex diacritic, it was used by ISO 259 to mark a long vowel that has swallowed a silent consonant, but it isn't necessarily pronounced differently from a normal long vowel; some people prefer to keep it (at least î, ê, ệ, ô, û where its presence may indicate deeper grammatical possibilities), but I tend to go without for phonological straightforwardness. Anyway, the result of this simplified transcription system still preserves phonological detail: Re'ūbhēn, Shim`ōn, Lēwī, Yehūdhā, Dān, Naphtālī, Gādh, 'Āshēr, Yissākhār, Zebhūlūn, Yōsēph, Menasshẹ, 'Ephrạyim, Binyāmīn. Berēshīth bārā Elōhīm ēth hashShāmạyim we'ēth hā'Āreç.
- Gilgamesh 11:43, 28 March 2008 (UTC)
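The informal system listed in the comment above amounts to a table of substitutions, so it can be sketched as a simple replacement pass. The Python below is a rough illustration only: the name `simplify`, the choice of input symbols (macron-below letters for the soft begadkepat, š, ś, ə and so on), and the particular alternatives picked where the comment offers several (sh for š, ç for ṣ, e for ə) are all assumptions made for this example, and capital letters other than Š are not handled.

```python
import unicodedata

# Hypothetical mapping from a diacritic-heavy transcription to the informal
# digraph style; keys and values are written as escapes to avoid ambiguity.
SIMPLE = {
    "\u02bc": "'",             # ʼ (aleph)
    "\u02bb": "`",             # ʻ (ayin)
    "\u1e07": "bh",            # ḇ
    "\u1e21": "gh",            # ḡ
    "\u1e0f": "dh",            # ḏ
    "\u1e35": "kh",            # ḵ
    "p\u0304": "ph",           # p + combining macron
    "\u1e6f": "th",            # ṯ
    "\u1e25": "\u0127",        # ḥ -> ħ
    "\u1e6d": "\u0163",        # ṭ -> ţ
    "\u1e63": "\u00e7",        # ṣ -> ç
    "\u0161\u0161": "ssh",     # doubled š
    "\u0161": "sh",            # š
    "\u0160": "Sh",            # Š
    "\u015b": "s",             # ś
    "\u0259": "e",             # ə
}


def simplify(text: str) -> str:
    """Rewrite a scholarly transcription in the informal digraph style."""
    text = unicodedata.normalize("NFC", text)
    # Apply longer keys first so a doubled š becomes "ssh", not "shsh".
    for key in sorted(SIMPLE, key=len, reverse=True):
        text = text.replace(key, SIMPLE[key])
    return text
```

Long vowels with macrons and the dotted vowels are left untouched, which is why forms such as Zebhūlūn and Menasshẹ keep their vowel marks in the output.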

A few things to keep in mind. First, it is possible that it might be useful to divide ancient and modern Hebrew into two projects (SIL does). However, fairly minor pronunciation differences like this are not, in my opinion, enough justification. Greek has different orthography, inflection, and meanings between ancient and modern (on top of pronunciation), and all this makes the split a good one. However, a discussion of why splitting would be a good or bad idea is probably worth having. Although, at this point, the evidence is pointing me to favour a unified approach. In any case, Gilgamesh, you need to keep in mind that transliteration on Wiktionary is not a strict phonemic representation, it is a rough approximation. A unified standard is more important, even if it does not accurately represent every pronunciation scheme. For example, the transliteration scheme for grc does not represent the phonology of Koine and Byzantine Greek well, but them are the breaks. Israeli Hebrew seems like the natural standard to base the transliteration on. Differences in pronunciation between Israeli and others can be distinguished at whatever detail you like within the pronunciation section of the entries themselves. -Atelaes λάλει ἐμοί 18:38, 28 March 2008 (UTC)

I still disagree, but...thank you for trying to be sensitive. I've been feeling like my years of studies were being dismissed out of hand as unfamiliar on a modern random street. I've met editors on Wikipedia fluent in Israeli Hebrew who were unfamiliar with and didn't want to learn or even have to deal with things like nequddoth, cantillation, dialectal distinctions, historical terms, etc. It was feeling like much of the same sense of...apathy or even antipathy towards the classical language. Anyway, I have learned in my studies and among my various international contacts on the issue, that Israel itself is a very divisive subject, even within the Jewish diaspora. I have encountered observant Orthodox Jews (such as my aforementioned Lithuanian Jewish friend from England) who are staunchly ideologically opposed to the conventions of the State of Israel, and find the exclusive use of Israeli conventions to be insulting and incendiary. After such startling negative impressions, it just seemed safer and more neutral (less offensive to somebody) overall to avoid approaches to Hebrew linguistics that are centered on any one national standard. I don't really care for all the messy politics. I just like Hebrew. But when it comes to the State of Israel, even within Jewish diaspora communities, it becomes inalienably political, and I'm not interested in insulting anyone by making my academic projects Israeli-centered. So at best, I spread it around. ISO 259 is neutral if it's the only standard, and I include Israeli stuff if I cite multiple conventions together. But it has seemed...Israeli conventions cannot be considered neutral enough to be a central standard. - Gilgamesh 19:03, 28 March 2008 (UTC)

Rest assured that my training in Hebrew was a classical one as well (although, we were taught modern pronunciation, with some clarifications on some of the more important distinctions between ancient and modern). Ultimately, Wiktionary has a general policy of not heeding people's sensitive sensibilities in favour of just doing what works and what makes sense. Ultimately, we're going to offend someone no matter which way we go. Since Israel is the current center of Hebrew speech, it is the natural choice for a transliteration standard. And yes, your expertise and knowledge are certainly appreciated and welcome here, but it must be borne in mind that the living standard of a language is always going to take precedence over an extinct one. However, I think that, aside from transliteration, Wiktionary provides the space to do justice to every form of the language. -Atelaes λάλει ἐμοί 19:33, 28 March 2008 (UTC)

I hope you're right. I'll do what I can, with what I know and can provide. But I believe...the ship of monosystemic Hebrew may have already sailed. I'll do my best to meet both interests. - Gilgamesh 23:46, 28 March 2008 (UTC)

gershayim and quotation marks

I made the entry אחדשה"ט as a hard redirect to אחדשה״ט. Should it be an "alternative form" entry instead?—msh210℠ 20:12, 13 March 2008 (UTC)

I think you did right. English contractions don't get separate entries for their straight-apostrophe (') and curly-apostrophe (’) forms, and this strikes me as very much analogous. —Ruakh_TALK 23:06, 13 March 2008 (UTC)

Yes, this is exactly what I've been doing too. Same with geresh and apostrophes. It's more a technology issue regarding keyboards and fonts than a spelling issue involving letters and characters. — hippietrail 09:01, 14 March 2008 (UTC)

Entries from Strong's

There are many entries copied from Strong's Concordance. Some of them have what seem to be folk etymologies - I don't know that much about Hebrew etymology, but the Greek entry λύκος had an etymology that was clearly wrong. What should be our policy about etymologies, or definitions, from Strong's? PierreAbbat 23:56, 23 April 2008 (UTC)

I don't believe they are actually copied from Strong's, as no version of Strong's that I've ever seen has done much in the way of etymologies, but rather they're simply sorted by Strong's numbers, which I see as a good thing. I don't know exactly where Dubaduba and his sock/disciple 8 got their information from. Most of their Ancient Greek and Sanskrit entries have been cleaned up and checked. The Hebrew ones still need a whole lot of work. Anyone who is capable and willing is more than welcome to get at them. -Atelaes λάλει ἐμοί 00:06, 24 April 2008 (UTC)

Oh, wow! I was also assuming these oddball etymologies were coming from Strong's. I've been wrongly besmirching his good name. :-( Should we just remove any Hebrew etymologies that look suspicious and aren't sourced? :-/ —Ruakh_TALK 01:53, 24 April 2008 (UTC)

That may not be a terrible idea. Also, Ivan is an incredible etymology source, and has already done some remarkable work with Semitic etymologies. And yes, Strong's was simply a concordance, an exhaustive index and numbering system, with perhaps the briefest of definitions. -Atelaes λάλει ἐμοί 02:06, 24 April 2008 (UTC)

Shva na transliteration

I've been reading up on the newer transliteration rules issued by the Academy of the Hebrew Language in 2006 and adopted by the United Nations in 2007[2]. Setting aside whether this should be the standard used in Wiktionary, the one clause that struck me the most is this:

The shva ( ְ ) is of two kinds: shva naẖ, which is omitted in transliteration, and shva na, which occurs at the beginning of a word or syllable. It is transliterated by e only where it is actually sounded. Example: בְּנֵי בְּרַק Bne Brak (not Bene Berak), but גְּאוּלִים Ge'ulim.

The problem is, for words not before encountered, how do you know which shva na have their syllables opportunistically collapsed and which shva na retain their syllables? I already know that certain clusters have shva na that have a tendency to completely collapse the shva syllable, such as in ביאליק Byalik, בני ברק Bne Brak, שבט Shvat, דלילה Dlila, דבורה Dvora, כפר Kfar, שפרעם Shfar'am, כנען Kna'an, etc. But what are all the consonant combinations that tend to do this regularly on a stable basis in a word-initial position, and which combinations are not stable enough for this? I figure that words with a consonant followed by aleph or ayin do this, such as Ze'ev, and words where the first two consonants are the same, such as Dedan. Which of these word-initial consonant combinations tend to collapse and completely lose their syllables? The ones I'm pretty sure collapse are shown with their e removed. I'm not a native Modern Hebrew speaker...help me narrow down all the pairs and correct my mistakes? And then, this could help others using the 2006 Academy rules.

ve'	vev	veg	ved	veh	vez	vekh	vet	vy	vl	vem	ven	ves	vef	vets	vek	vr	vesh
be'	bev	beg	bed	beh	bez	bekh	bet	by	bl	bem	bn	bes	bef	bets	bek	br	besh
ge'	gv	geg	ged	geh	gz	gekh	get	gy	gl	gm	gn	ges	gef	gets	gek	gr	gesh
de'	dv	deg	ded	deh	dz	dekh	det	dy	dl	dem	den	des	def	dets	dek	dr	desh
ze'	zv	zg	zd	zeh	zez	zekh	zet	zy	zl	zm	zn	zes	zef	zets	zek	zr	zesh
te'	tv	teg	ted	teh	tez	tkh	tet	ty	tl	tem	ten	tes	tf	tets	tek	tr	tesh
ye'	yev	yeg	yed	yeh	yez	yekh	yet	yiy	yel	yem	yen	yes	yef	yets	yek	yer	yesh
khe'	khv	kheg	khed	kheh	khez	khekh	khet	khy	khl	khm	khn	khs	khf	khets	khek	khr	khsh
ke'	kv	keg	ked	keh	kez	kekh	ket	ky	kl	km	kn	ks	kf	kets	kek	kr	ksh
le'	lev	leg	led	leh	lez	lekh	let	ly	lel	lem	len	les	lef	lets	lek	ler	lesh
me'	mev	meg	med	meh	mez	mekh	met	my	ml	mem	men	mes	mef	mets	mek	mr	mesh
ne'	nev	neg	ned	neh	nez	nekh	net	ny	nel	nem	nen	nes	nef	nets	nek	ner	nesh
se'	sv	sg	sd	seh	sez	skh	st	sy	sl	sm	sn	ses	sf	sets	sk	sr	sesh
fe'	fev	feg	fed	feh	fez	fkh	fet	fy	fl	fem	fn	fs	fef	fets	fek	fr	fsh
pe'	pev	peg	ped	peh	pez	pkh	pet	py	pl	pem	pn	ps	pf	pets	pek	pr	psh
tse'	tsv	tseg	tsed	tseh	tsez	tskh	tst	tsy	tsl	tsm	tsn	tses	tsf	tsets	tsk	tsr	tsesh
re'	rev	reg	red	reh	rez	rekh	ret	riy	rel	rem	ren	res	ref	rets	rek	rer	resh
she'	shv	shg	shd	sheh	shez	shkh	sht	shiy	shl	shm	shn	shes	shf	shets	shk	shr	shesh

- Gilgamesh 12:26, 14 July 2008 (UTC)
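One way to make the table above usable mechanically is to treat it as a lookup keyed on the first two consonants. The Python sketch below is only an illustration: it copies three rows (b, d, k) from the table, which are themselves Gilgamesh's guesses rather than confirmed facts, and the names `TABLE_ROWS`, `collapsed_pairs` and `initial_cluster` are invented here.

```python
# Three rows copied from the table above (b-, d- and k- initial pairs).
# A cell written without a vowel letter is one the table marks as collapsed.
TABLE_ROWS = """\
be' bev beg bed beh bez bekh bet by bl bem bn bes bef bets bek br besh
de' dv deg ded deh dz dekh det dy dl dem den des def dets dek dr desh
ke' kv keg ked keh kez kekh ket ky kl km kn ks kf kets kek kr ksh
"""


def collapsed_pairs(rows):
    """Return the set of cells that contain no vowel letter (e or i)."""
    return {
        cell
        for line in rows.splitlines()
        for cell in line.split()
        if not any(vowel in cell for vowel in "ei")
    }


COLLAPSED = collapsed_pairs(TABLE_ROWS)


def initial_cluster(first, second):
    """Transliterate a word-initial consonant pair separated by shva na.

    Returns the bare cluster when the table marks it as collapsed,
    otherwise keeps the e (assumption: unknown pairs keep the vowel).
    """
    pair = first + second
    return pair if pair in COLLAPSED else first + "e" + second


print(initial_cluster("b", "n"))  # Bne Brak
print(initial_cluster("d", "d"))  # Dedan
```

Pairs whose first consonant is not in the copied rows fall through to the e-retaining form, so a real tool would need the complete, corrected table.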

It's hard to think of examples, but the shvas are normally silent in "zkukim" ("needing"), "tzdadim" ("sides"), "pkakim" ("traffic jams"), "ktiv" ("writing"), and "ktzat" ("a bit"). Conversely, they're normally pronounced in the conjunction "v'-" and prepositions "b'-" and "l'-", no matter what word follows. Also in "m'laf'fon" ("cucumber"), and in the pi`el/pu`al participial prefix "m'-" ("m'rashem" ("registrar"), "m'lamed" ("teaches"), "m'yabesh" ("drier")). (Actually, I can't think of any mr-/ml-/my- cases where the shva is silent, but will take your word for it.) —Ruakh_TALK 15:22, 14 July 2008 (UTC)Reply

More generally, the issue seems to involve more than just phonotactics; for example, I say תְּקוּעָה with /tk-/ and תְּקַבֵּל with /tək-/. On introspection, I think it's because that's just how the prefix /tə-/ is pronounced, regardless of anything. —Ruakh_TALK 16:59, 14 July 2008 (UTC)

I recognize those v/b/l cases—inseparable prepositions. So if it's an inseparable preposition, it's pronounced, right? That's fairly predictable. But I still crave more information. Based on the new Academy rules, I would appreciate a more methodical approach of how to determine whether shva na is collapsed. X3 It's relevant for this page I'm working on, User:Gilgamesh/Tanakh names, since I'm machine-generating both the Hebrew spellings and the Latin script transliterations from a common rich markup format. For example, the biblical name Merari (מְרָרִי) uses the markup code m @ r a: r i:y, and the Java utility I wrote automatically parses and algorithmically generates Modern transliteration Mrari. But if this is a three-syllable word in Modern Hebrew phonology, should it be Merari? I don't know either way—I'm guessing based on what I've observed. The PDF is rather vague in this area, since masoretically all shewā nāʻ have their own syllable, albeit a short one. Linguistically, this appears to be a phonemic split in the Modern language. - Gilgamesh 20:39, 14 July 2008 (UTC)Reply
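The markup-to-transliteration step described above can be sketched in a few lines. This is not Gilgamesh's Java utility: the token meanings are guesses from the single example given (`@` as shva na, `a:` as a long a, `i:y` as a long i written with yod), and `transliterate`, `SHVA_POLICIES` and `VOWELS` are hypothetical names. The point is only that the shva na decision can be isolated as one switchable policy.

```python
# Assumed renderings of shva na under three possible policies.
SHVA_POLICIES = {"collapse": "", "vowel": "e", "apostrophe": "'"}

# Guessed vowel tokens, based only on the Merari example.
VOWELS = {"a:": "a", "i:y": "i"}


def transliterate(markup, shva_policy="apostrophe"):
    """Turn space-separated markup tokens into a Modern-style transliteration."""
    pieces = []
    for token in markup.split():
        if token == "@":
            pieces.append(SHVA_POLICIES[shva_policy])
        else:
            # Unknown tokens are passed through as plain consonants.
            pieces.append(VOWELS.get(token, token))
    word = "".join(pieces)
    return word[:1].upper() + word[1:]


for policy in SHVA_POLICIES:
    print(policy, transliterate("m @ r a: r i:y", policy))
```

With this shape, switching between Mrari, Merari and M'rari is a one-argument change rather than a change to the parser.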

Eh, forget it. Modern Hebrew shva na collapse prediction is a royal pain in the ass for a non-speaker. I'm going to use apostrophes throughout. Thank you very much Hebrew Academy... - Gilgamesh 10:16, 15 July 2008 (UTC)

Heh, sorry. BTW, I suppose you don't need it now, but I thought of another example: the triliteral pa`al second-person plural past tense forms, "sh'martem" and so on, are always trisyllabic. Growing up, I always took this as "shamar" + "-tem", but grammatically the first vowel is a sh'va na, and I suppose maybe it is. (I also always thought "l'vad" was "lavad", so this wouldn't be a unique case for me.) Also, I'm not sure "phonemic split" is quite the right term, though I can't think of a better one. I think it's not that sh'va na has become two phonemes, but rather that the single sh'va na phoneme has disappeared in many contexts. For another example (but one that's hopefully similar enough for analogy), would you say that English /k/ underwent a phonemic split when /kn-/ became /n-/ (e.g. in "known")? —Ruakh_TALK 15:44, 15 July 2008 (UTC)Reply

Maybe phonemic split was the wrong term, but a split of some kind did happen, though merely with one of the two directions going to zero. Also, I'd read at least two different places that remaining syllabic shva na is pronounced [e], not [ə]. So it's actually really [ə], essentially a sixth pure vowel in Modern Hebrew? - Gilgamesh 16:07, 15 July 2008 (UTC)Reply

I've read that many speakers pronounce it as [e], but I don't think I've read that all speakers do. I've always known that my mother says "be'emet" and not "b'emet", for example, and when I learned enough grammar to determine that the definite form of "emet" is "ha'emet" (as in Genesis 32:10; and for that matter, in my mother's speech, in "ha'emet hi she-"), so that one would expect indefinite "b'emet" or definite "ba'emet", I was confused and didn't have an explanation. (I still don't have an explanation, but my mother's pronunciation is borne out by Proverbs 29:14. And if anyone does have an explanation, please let me know or edit the relevant entry here.) —Ruakh_TALK 16:52, 15 July 2008 (UTC)Reply

It's beemet because it's a segol rather than a sh'va na. The aleph has a chataf-segol, and a sh'va cannot precede another sh'va at the start of a word (where by "sh'va" I here mean to include chataf-vowels). So, usually, before a chataf-segol, a sh'va turns to a segol; before a chataf-patach, to a patach; and before a sh'va, to a chirik. (I can't picture a chataf-kamatz at the start of a word almost at all, and am not sure what a preceding sh'va would turn into.)—msh210℠ 17:25, 15 July 2008 (UTC)Reply

Oh my gosh, you have no idea how long I've wondered that! Thanks! :-D Does that only affect b'-/l'-/k'-/v'-, or does it affect other places where a sh'va na` might precede a sh'va na` or khataf __ (if there are any)? By which I mean, which entries' usage notes should we edit? :-) —Ruakh_TALK 21:03, 15 July 2008 (UTC)Reply

Hm, actually, a sh'va before a chataf-segol doesn't always turn into a segol: consider בֵּאלֹהֵינוּ. I'm not sure what the rule is. In any event, what "other places" are you thinking of? In future-tense verbs' "prefixes" (those are quote marks for distance), the same rule applies: for masculine singular second-person (אתה), for example, piel has tav-sh'va תְּ, but paal has either tav-chirik תִּ or tav-segol תֶּ or tav-patach תַּ, depending on the following letter: tishmor תשמר, techenak תחנק, taavod תעבד.—msh210℠ 21:59, 15 July 2008 (UTC)

Re: future-tense prefixes: You're quite right, and those rules are still very much followed in Modern Hebrew (though sometimes we have /ta-/ before sh'va, as in "takhzor", which I believe used to be "takhazor"). It never occurred to me that there might be a larger pattern at work. Re: be'e- vs. bei- for aleph: I think this is part of a larger pattern. If I recall correctly, my conjugation book, which unfortunately I lost in the mail a few months ago (though it was in Russian anyway, and I don't speak a word of Russian, so it was of limited use to me), distinguished between conjugations that were pei-alef and those that were nakhei (?) pei-alef. My assumption (which I don't have any evidence for) was that some Ancient Hebrew words started with vowels, others with glottal stops, initial א־ being used for both. On the other hand, by this theory le'ekhol would reflect a glottal stop in 'akhal, while tokhal would reflect the lack of one, so I can't claim it really makes sense. :-P (I don't remember if the book classified 'akhal's alef as nakhei or not.) —Ruakh_TALK 23:31, 15 July 2008 (UTC)Reply

I'm no expert in Modern Hebrew, but I do study a lot of Biblical Hebrew in Tiberian vocalization. In the following examples, C represents a consonant and H represents a guttural consonant. ʼ represents an aleph glottal stop.

əCəC → iCC
əyəC → īC
əHĕC → eHĕC
əHăC → aHăC
əHŏC → oHŏC
əʼĕC → eʼĕC or ēC
əʼăC → aʼăC or āC
əʼŏC → oʼŏC or ōC

From what I understand though, Modern Hebrew tends to dispose of these habits and just crams consonants together throughout, e.g. b'y'rushaláyim instead of birushaláyim ("at Jerusalem"). - Gilgamesh 22:07, 15 July 2008 (UTC)Reply
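The rewrite rules listed above can be expressed as a small function. The sketch below is a reading of that list only, not a full Tiberian grammar: the word is assumed to arrive already split into first consonant, first vowel and remainder, and `attach_prefix`, `HATAF_TO_SHORT`, `SHORT_TO_LONG` and `GUTTURALS` are illustrative names. Where the list gives two outcomes for aleph, both are returned.

```python
HATAF_TO_SHORT = {"ĕ": "e", "ă": "a", "ŏ": "o"}
SHORT_TO_LONG = {"e": "ē", "a": "ā", "o": "ō"}
GUTTURALS = {"ʼ", "h", "ḥ", "ʻ"}


def attach_prefix(prefix, consonant, vowel, rest):
    """Attach a shewa-bearing prefix consonant (b, k, l ...) to a word.

    The word is given as consonant + vowel + rest. Returns a list of
    candidate forms, because the aleph rules allow two outcomes.
    """
    if vowel == "ə":
        if consonant == "y":
            return [prefix + "ī" + rest]              # əyəC → īC
        return [prefix + "i" + consonant + rest]      # əCəC → iCC
    if vowel in HATAF_TO_SHORT and consonant in GUTTURALS:
        short = HATAF_TO_SHORT[vowel]
        forms = [prefix + short + consonant + vowel + rest]   # əHĕC → eHĕC
        if consonant == "ʼ":
            forms.append(prefix + SHORT_TO_LONG[short] + rest)  # or ēC
        return forms
    # No rule applies: the prefix keeps its shewa.
    return [prefix + "ə" + consonant + vowel + rest]


print(attach_prefix("b", "y", "ə", "rūshālạyim"))
print(attach_prefix("b", "ʼ", "ĕ", "lōhēnū"))
```

The second call returns both the glottal-stop form and the contracted form, which matches the observation earlier in the thread that בֵּאלֹהֵינוּ does not follow the be'e- pattern.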

Thanks! I think it disposed of those habits with the clitics (except in fixed expressions like "be'emet", "bimkom", etc.), but not with the verb conjugations; but since I'm very bad at knowing which words have which niqqud (segol vs. khataf-segol and so on), it's hard to be sure. —Ruakh_TALK 23:31, 15 July 2008 (UTC)Reply

Oh, I just remembered some more cases. Tiberian vocalization has three stress modes, see? Stressed, secondarily stressed and unstressed. For inseparable prepositions, if the preposition is to be attached immediately before a fully stressed syllable, then the shewa of the preposition becomes fully qamez gadhol, e.g. kāyām = "as a sea". This is not necessarily ambiguous, because "as the sea" would be kayyām (since ha- geminates the following consonant). The tensing of the preposition to qamez gadhol also applies equally to bā-, wā- and lā-.

The rules for inflecting wə- are additionally nuanced and rather subtle. If the consonant after wə- is a labial consonant (b/bh, w, m, p/ph) carrying a syllable with secondary stress, then wə- becomes u-, e.g. ubhattīm = "and [some] houses". In most cases, wəCəC becomes uCC even if the first C is not a labial consonant, e.g. uKhnạʻan = "and Canaan". However, wəyəC becomes wīC everywhere, and wəHăC etc. become waHăC etc. everywhere. In Tiberias at the time of the standardization of the Masoretic Text, u- was pronounced ʼu-, as if there was an invisible aleph before it. But not all Middle Eastern varieties of Hebrew were like this; some others pronounced this wu-. It is suspected that Tiberias had ʼu- because its waw and bheth had either merged or were nearly merged to [v], and since bə- still became bhə- ([və-] or [βə-]) after another vowel, and wə- was pronounced [və-] or [ʋə-], this made wə- and bhə- sound the same or almost the same. So wə- in wəCə- or before a labial consonant became glottalized to ʼu- (not wu- or wi-) to ease this ambiguity. In locales where w was still pronounced [w] though, this glottalization was unnecessary, and wu- remained for as long as waw was distinct from bheth. Interestingly, this glottalization of wə- to ʼu- also happened in Samaritan Hebrew because bheth and waw had merged, but in a different way: beth was [b] and bheth was [w], and waw joined the b/g/d/k/p/t club by being [b] at the beginning of a word or after shewa nah, and [w] everywhere else. Samaritan speakers call their language Iwrith. And...I ramble far too easily to try to illustrate a point. - Gilgamesh 00:48, 16 July 2008 (UTC)
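The choice of form for wə- described in the two paragraphs above amounts to an ordered series of checks. The following Python sketch encodes one possible ordering; it is an interpretation, not an authoritative rule set. The function name `conjunction`, the `pretonic` flag and the input split (first consonant, first vowel, remainder, with begadkephat softening already applied by the caller) are all assumptions, and the "secondary stress" condition on the labial rule is simplified to "not pretonic".

```python
LABIALS = {"b", "bh", "w", "m", "p", "ph"}
HATAF_TO_SHORT = {"ĕ": "e", "ă": "a", "ŏ": "o"}


def conjunction(consonant, vowel, rest, pretonic=False):
    """Pick the form of the conjunction wə- for a following word.

    pretonic=True means the word's first syllable is fully stressed.
    Hataf vowels are assumed to occur only under gutturals.
    """
    if vowel == "ə" and consonant == "y":
        return "wī" + rest                                   # wəyəC → wīC
    if vowel in HATAF_TO_SHORT:
        return "w" + HATAF_TO_SHORT[vowel] + consonant + vowel + rest
    if pretonic:
        return "wā" + consonant + vowel + rest               # pretone lengthening
    if consonant in LABIALS or vowel == "ə":
        # u- before labials, and wəCəC → uCC (the second shewa is dropped).
        stem = consonant + rest if vowel == "ə" else consonant + vowel + rest
        return "u" + stem
    return "wə" + consonant + vowel + rest


print(conjunction("bh", "a", "ttīm"))                  # "and houses"
print(conjunction("bh", "ō", "qer", pretonic=True))    # "and morning"
```

Checking the pretonic case before the labial case is what lets a labial-initial word still take wā-, which is one way to read the difference between ubhattīm and the liturgical va-voker.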

Re: wā-: Ah, that explains "erev va-voker" (found in the liturgy). My father told me that happens when the next syllable is stressed, but there were just too many exceptions. A multiple-levels-of-stress system would account for the variation. Re: u- before labials: Yeah, that still happens in "grammatical" Hebrew (it's called the "bumaf" rule), though I don't know anyone who really talks that. This language has too many rules! —Ruakh_TALK 01:10, 16 July 2008 (UTC)Reply

Well, it helped that these rules, though varied, were entirely organic to those using them, whether in speech or in liturgy. I believe many Tiberian Hebrew phonology rules can be broken down logically. If you think about it, there are fewer vowels than are actually spoken. The legacy of three Semitic short vowels and three Semitic long vowels is obvious, but other vowels came about from the particular position (or stress) of vowels, and from diphthongs that evened out into monophthongs. The Semitic short vowels a/i/u in particular became a/i/o in Hebrew shut syllables, ā/ē/ō in stressed syllables, etc. (though ā was ẹ or ạ in certain conditions in many segolate words), but I think there is subtle evidence to account for a fourth short metavowel that arose in Hebrew that was not present in Proto-Semitic. *ə is i in shut syllables, ẹ in segolates and ā in stressed syllables. Most Hebrew words that have this vowel correspond with *a in other languages, such as Hebrew Miryām vs. Arabic Maryam, and Hebrew Gilʻādh vs. Arabic Jalʻād, and Hebrew Shimʻōn vs. Arabic Samʻān, etc., but it seems in general that this Hebrew phenomenon arises from vowel weakening followed by vowel retensing in Hebrew developmental phonotactics. This *ə is very plastic and easily influenced by its surroundings, which is part of why there are so many rules and exceptions to rules of how to inflect it—it reflected more on the way the language evolved organically than on trying to torture the learner. In this sense, Modern Hebrew's way of ironing out and simplifying the rules of how to inflect this vowel is no less valid than the old rules, from a morphological point of view. Theoretically, even many of the *i in verb inflections don't have to be there; "he will protect" can just as easily be y'sh'mor or yashmor instead of yishmor. From a broken down phonotactic point of view, pa'al is *pəʻal (with a tense long ə becoming ā—re: stress rules), and yif'ol is *yəpəʻôl, etc. 
This can also clearly explain why, for instance, Sidon is Hebrew Çīdhōn but Arabic Ṣaydā—if you treat it logically as *Ṣəydā(n), then the historic phonotactics of each language fall neatly into place. Arabic kept distinct *a while Hebrew developed *a and *ə meta vowels, and *əy became ī. Semitic ā became Hebrew ō, and -n is an archaic Semitic grammatical ending for absolute nouns that became fused to the root or stem in certain Hebrew versions of words (like how Latin -us and -um became -es and -on in some French words but no longer has strict grammatical significance in most cases). - Gilgamesh 02:06, 16 July 2008 (UTC)Reply

Interesting, thanks! And yes, I certainly agree that these rules developed naturally and felt natural to their users — every language has lots of crazy rules, that native speakers don't realize the craziness of until they're pointed out to them. And I definitely don't object to the changes in Modern Hebrew, but it's a bit of a pain that there's such a sharp difference between the form of the language that people usually use (which is the only form I know) and the form of the language that people consider correct. —Ruakh_TALK 02:24, 16 July 2008 (UTC)Reply

They consider it correct because it's honestly very, very pretty. If you can figure it out. XD I myself honestly prefer biblical grammar and phonotactics over those in the Modern language—the Modern language sounds like a scrunchie in a pressure cooker in my ears. Though I understand that the different verb rules in Modern (as opposed to Biblical) evolved organically that way in neighboring Aramaic (where participles became present tense, imperfect tense became future tense, etc.). I just prefer the old grammar. The ubiquitous word shel does not exist by itself in Biblical Hebrew—I like the pretty construct/absolute noun forms better. XD - Gilgamesh 11:23, 16 July 2008 (UTC)Reply

Well, "grammatical" Modern Hebrew still isn't Biblical Hebrew in either grammar or phonotactics. It's somewhere in between normal Modern Hebrew and Biblical Hebrew (closer in some ways to one, in some ways to the other; I'd guess it's most similar to late liturgical Hebrew, but I don't know enough to say for sure). But then, it's also hard to distinguish between "grammaticality" and a formal register, since "grammaticality" is already so arbitrary. —Ruakh_TALK 15:12, 16 July 2008 (UTC)Reply

The explanation I heard for erev vavoker, and which I believe is correct, is that when words come in pairs, and the second has its first syllable stressed and starts with v', then it starts with va instead. Thus Genesis 8:22 (http://www.mechon-mamre.org/i/t/t0108.htm); also, for example, bayit vagan.—msh210℠ 15:57, 16 July 2008 (UTC)

I just reread what you, Gilgamesh, wrote above on this, and which I quote because it may have gotten lost in the length of the discussion:

For inseparable prepositions, if the preposition is to be attached immediately before a fully stressed syllable, then the shewa of the preposition becomes fully qamez gadhol, e.g. kāyām = "as a sea". This is not necessarily ambiguous, because "as the sea" would be kayyām (since ha- geminates the following consonant). The tensing of the preposition to qamez gadhol also applies equally to bā-, wā- and lā-.

"Kayam" with two kamatzes sounds terrible to my ears; are you sure it's correct? Moreover, I can't think of any similar words where k'-, b'-, or l'- gets a kamatz. Can you point to examples in, say, the Aleppo Codex?—msh210℠ 16:26, 16 July 2008 (UTC)Reply

Kayam was a random possible example I formed in my head. Actually, I read about it in my copy of Teach Yourself Biblical Hebrew (by R.K. Harrison). In chapter XI, The Inseparable Prepositions, page 64:

If the preposition falls in the pretone, the vowel under it is frequently qameç, e.g., בָּמַ֫יִם in water, לָבֶ֫טַח securely.

Does that help? - Gilgamesh 17:16, 16 July 2008 (UTC)

Yes and no. It would help, if I knew what a fully stressed syllable were, as opposed to a secondarily stressed one. I assume that this does not correspond to the difference between a cantillation mark and a meteg, right? (Those are two different stresses also.)—msh210℠ 16:10, 17 July 2008 (UTC)

Well, think about it this way. Tiberian Hebrew is a mora-timed language, similar to Ancient Greek, Finnish or Japanese. Hebrew in Tiberian vocalization has short and long vowels, and single consonants, consonant clusters no greater than two, and geminated (doubled) consonants. Word stress falls on the final long vowel of a word. A short vowel (shut vowel or shewa/hateph) is one mora long, a long or stressed vowel is two morae long. One consonant before a vowel does not occupy a mora, but a consonant before a consonant takes one mora, and a doubled consonant adds one mora, and a consonant at the end of a word takes one mora. A pretone (as mentioned above) is a long vowel syllable immediately before the stressed syllable. This makes lengthened (two-morae) inseparable prepositions possible only if the first syllable of a word is its stressed syllable. Otherwise, the preposition occupies only one mora, scrunching if necessary according to Tiberian phonotactics. Tiberian phonotactics tends to favor more length in absolute-mode grammatical forms than in construct-mode grammatical forms, which creates a favorable environment for the existence of long pretones before a stressed syllable. To illustrate the mora-timed concept, here are some examples split into mora ticks:

Word	1	2	3	4	5	6	7	8	9	10
ירושלים	ye	rū		shā		lạ		yi	m
בירושלים	bī		rū		shā		lạ		yi	m
אסנת	ʼā		se	nạ		th
לאסנת	le	ʼā		se	nạ		th
בנימין	bi	n	yā		mī		n
ובנימין	u	bh	ni	yā		mī		n
ישראל	yi	s	rā		ʼē		l
ישראלי	yi	s	re	ʼē		lī
אשכנז	ʼa	sh	ke	nạ		z
אשכנזי	ʼa	sh	ke	nā		zī
מצריים	mi	ç	rạ		yi	m
מצריימה	mi	ç	rạ		y	må
שווא	she	wā
השווא	ha	sh	she	wā
מים	mạ		yi	m
במים	bā		mạ		yi	m
מי	mē
במי	be	mē

The long pretone preposition takes advantage of the absolute grammatical form, and it scrunches into a shorter vowel further back in the construct, just like non-final vowels in construct forms of absolute nouns. Does this help? I'm not entirely sure it does, as Modern Hebrew has mostly eliminated the biblically ubiquitous construct-then-absolute grammar forms in favor of the newer structure involving the use of the preposition של shel with absolute noun forms almost throughout. Shel as a word all by itself does not exist in the Bible—ירושלים של זהב Yerūshālạyim Shel Zāhābh would have been ירושלי זהב Yerūshelē Zāhābh. XD But do notice that Yerūshelē (construct) and Yerūshālạyim (absolute) rhyme nicely with bemē (construct) and bāmạyim (absolute). - Gilgamesh 22:16, 17 July 2008 (UTC)Reply
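The mora-counting rules stated before the table can be checked against its rows with a short script. This is a sketch under assumptions: each word is entered by hand as a list of `(kind, text)` segments, where `kind` is `"cons"`, `"short"` or `"long"` (long also covers stressed vowels), a doubled consonant is entered as two `"cons"` segments, and the function name `count_morae` is made up for this example.

```python
def count_morae(segments):
    """Count morae in a word given as (kind, text) segments.

    short vowel = 1 mora; long or stressed vowel = 2 morae;
    a consonant counts 1 mora only before another consonant or word-finally.
    """
    total = 0
    for index, (kind, _text) in enumerate(segments):
        if kind == "short":
            total += 1
        elif kind == "long":
            total += 2
        elif kind == "cons":
            is_last = index == len(segments) - 1
            if is_last or segments[index + 1][0] == "cons":
                total += 1
        else:
            raise ValueError(f"unknown segment kind: {kind!r}")
    return total


YERUSHALAYIM = [
    ("cons", "y"), ("short", "e"), ("cons", "r"), ("long", "ū"),
    ("cons", "sh"), ("long", "ā"), ("cons", "l"), ("long", "ạ"),
    ("cons", "y"), ("short", "i"), ("cons", "m"),
]
BAMAYIM = [
    ("cons", "b"), ("long", "ā"), ("cons", "m"), ("long", "ạ"),
    ("cons", "y"), ("short", "i"), ("cons", "m"),
]
BEME = [("cons", "b"), ("short", "e"), ("cons", "m"), ("long", "ē")]

for name, word in [("yerūshālạyim", YERUSHALAYIM), ("bāmạyim", BAMAYIM), ("bemē", BEME)]:
    print(name, count_morae(word))
```

The totals agree with the number of ticks each of those words occupies in the table, counting the empty cell after a two-mora vowel as its second tick.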

Re: "Modern Hebrew has mostly eliminated the biblically ubiquitous construct-then-absolute grammar forms": Certainly smikhut is less common in Modern Hebrew than in Biblical Hebrew, but it's still productive in forming compounds, and in formal registers it's fairly common. Also, there's no guarantee a Biblical speaker would have chosen "y'rushalei zahav" over the other ways of expressing that: my Biblical Hebrew is not what it could be, but I think "y'rushalayim [asher] l'zahav" would work — and personally I don't think "y'rushalei ha-zahav v'ha-n'khoshet v'ha-or" sounds very good. :-P —Ruakh_TALK 01:02, 18 July 2008 (UTC)Reply

Very true, I know shel is historically a contraction of asher-l'. I suppose I'm not really sure the construct form would be Y'rushlei—it was just a semi-educated guess, as I always thought Y'rushalayim was a grammatical dual, but I could be wrong. However, you understand my point, I take it. :3 - Gilgamesh 01:14, 18 July 2008 (UTC)Reply

Quick question

Am I right in guessing that, in Modern Hebrew, these would be homophones? These aren't actual words, but phonetic combinations:

אָיִי ayi and אָאִי a'i
אִיָה iya and אִיאָה i'a

Considering aleph and ayin seem to function more as syllable breaks than anything else in the Modern language. I've seen /iyya/ reduced to /ia/ all over the place, and I've known at least one Israeli person online to consistently use /a'i/ instead of /ayi/ in on-the-spot transliteration of words. - Gilgamesh 19:32, 16 July 2008 (UTC)Reply

My spelling is atrocious, but for your first example, I think "goat" has ayi- and "American" has -a'i, yes? They have pretty much the same sound, maybe the exact same sound. I can't think of any examples for the second. —Ruakh_TALK 20:35, 16 July 2008 (UTC)Reply

As I said, I didn't intend to cite any actual words—just phonetic combinations. But for specific examples, I've seen Yirmia for Yirmiya, Y'rushala'im for Y'rushalayim, etc. This yod-clipping was consistently used in ad-hoc transliterations by someone I know in Ra'anana. - Gilgamesh 21:15, 16 July 2008 (UTC)Reply

Not having studied Modern Hebrew phonetics (nor that of any language besides French), I'm very much dependent on specific examples to consider and extrapolate from. When I say "Y'rushalayim", I don't think I generally pronounce the /j/ (but I definitely do pronounce it if I say each syllable separately, in hyper-articulated speech, so I'm surprised someone would drop it from the transliteration). —Ruakh_TALK 01:51, 17 July 2008 (UTC)Reply

Well, of course. Yitsħak doesn't sound like Itsħak, since word-initial /j/ is always articulated. But my impression is that, when syllables are strung together, /j/ with /i/ on one side and any other vowel on the other side tends to be elided. It's similar, say in Japanese, to ohayou ("good morning"). Enunciated, mora by mora, it's o-ha-yo-u, but it's always spoken ohayō as a united phrase, to the point where saying ohayou in one breath produces a foreign accent. However, it depends on which viewpoint you want to treat it. Japanese spell that word with kana for o-ha-yo-u, but standard Hepburn romanization mandates ohayō. So one has to decide what's more important—back-end structure or front-end pronunciation. So for Modern Hebrew, the question is whether it's more important to transliterate the back-end (Ħayyim, Eliyyahu, Saray) or the front-end (Khaïm, Eliahu, Sarai). - Gilgamesh 02:13, 17 July 2008 (UTC)Reply
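The "front-end" option can be approximated with a single substitution. The sketch below implements only the impression stated above, that /j/ between /i/ and another vowel tends to be elided, and it works on Latin transliterations rather than on Hebrew script. The name `elide_glide` and the vowel set `aeou` are assumptions; no hiatus mark (apostrophe or diaeresis) is inserted, so the output differs slightly from spellings like Y'rushala'im or Khaïm.

```python
import re

# y (single or doubled) with i on one side and a different vowel on the other.
# Word-initial y has no vowel before it, so it is never matched.
_GLIDE = re.compile(r"(?<=i)y+(?=[aeou])|(?<=[aeou])y+(?=i)")


def elide_glide(word):
    """Drop an intervocalic y that is adjacent to i, keeping everything else."""
    return _GLIDE.sub("", word)


for back_end in ["Yirmiya", "Eliyyahu", "Y'rushalayim", "Yitskhak"]:
    print(back_end, "->", elide_glide(back_end))
```

Keeping the back-end spelling as the stored form and deriving the front-end spelling this way avoids having to choose one of the two permanently.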

err...sorry for that mini-ramble. I'm rather drowsy. I'm not entirely certain what my point was supposed to be. Sometimes I ramble for no other reason than I am pedantic and think out loud. X3 - Gilgamesh 02:30, 17 July 2008 (UTC)

present tense verbs, and "actor" nouns

(See, e.g., פורץ.) Can every present-tense verb be used as a noun indicating the actor? Do we want to list both sections on every page? If so, do we want to somehow minimize the noun section? Discuss.—msh210℠ 18:13, 17 December 2008 (UTC)Reply

Any present-participle/present-tense verb form can be used nounishly, yes. In many cases this gives rise to an independent noun that has its own meaning and can be used any way that a noun can, such as רופא (rofé) (from רופא (rofé), masculine singular present participle and present tense of רפא (rafá)). Obviously these cases need their own entries. But in most cases, it seems to be fairly restricted in nounish use; for example, I think הבוחר בעמו ישראל באהבה (habokhér b'amó yisra'él b'ahavá) basically means "who chooses His people Israel with love", not really "the chooser of His people Israel with love". (I have some vague thoughts on how one could test this hypothesis, but what I'm describing right now is just my gut instinct.) I think similar things happen in a lot of languages, including Arabic and Latin. (And English, for that matter: our gerund-participles frequently act as action nouns, as in "Giving is good", and I think they can act as agent nouns sometimes as well; "The starving should not refuse bread" sounds awkward to me, but not ungrammatical.) It might be worth asking at the Beer parlour for other editors' insights; or alternatively, asking some editors who are knowledgeable in those languages (such as EncycloPetey and Stephen G. Brown) to comment here if they have any thoughts. —Ruakh_TALK 13:23, 19 December 2008 (UTC)

Hm, yes, as you say, Ruakh, this seems analogous to the situation in English with present participles' being used as action nouns: Defenestrating is fun! Although [[defenestrating]] is listed as a verb only, as (I suspect) are the vast majority of such words, some are listed as both, such as eating (defined as "the act of consuming food"), flyering ("the act of distributing flyers (leaflets)"), and kicking ("the action of the verb to kick"). I've mentioned this discussion on EP's and SGB's talkpages.—msh210℠ 20:19, 1 January 2009 (UTC)

It sounds analogous to the situation with Latin adjectives. Almost every Latin adjective can be used in a substantive (noun) sense. How have I dealt with this? For Latin adjectives with a specific substantive sense, but without specific gender when used as a noun, I add a context label of (substantive) to the head of the definition line but keep the definition in the Adjective section. This is a bit like what we do for English nouns, which don't get a separate adjective section just because they're used attributively. However, if the substantive sense of a Latin adjective applies only to a specific gender, such that the inflection of that sense is restricted to just the one gender, then the entry will have a separate Noun section in order to accommodate a separate inflectional table, etc. --21:24, 1 January 2009 (UTC) — This unsigned comment was added by EncycloPetey (talk • contribs).

Can you show me an example of each of those two scenarios, please?—msh210℠ 22:05, 1 January 2009 (UTC)

Yes:

rusticus - Primarily adjective. A substantive sense exists for both masc. and fem., but is clearly related to the adjective sense(s).
sagittarius - Substantive sense exclusively masculine, so a separate noun section exists in the entry.
planus / planum - Substantive sense exclusively neuter, for which the lemma is a separate form and therefore on a different page. (Yes, there is a noun sense on the planus page, but it's from a different etymology.)

--EncycloPetey 23:22, 1 January 2009 (UTC)

Arabic has a verb form that is sometimes called a verbal noun, sometimes an infinitive, but which I think is the form you’re talking about here. Every verb has one and it’s like the English -ing form. Arabic participles are used not only as adjectives and nouns, but also stand in for finite verbs. A present participle is often used as the present tense of a verb: انا عارفك = I know you (literally, I knowing you). Or هم مسافرين = they are leaving (literally, they leavers). —Stephen 15:56, 2 January 2009 (UTC)Reply

In the example הבוחר בעמו ישראל באהבה, I think it's a noun, and would be curious to see your, Ruakh's, ideas on how to test your hypothesis. רופא is of course a special case, in that its use as a noun is exceedingly more common than its use as a verb, and in that its feminine form is not what one would expect from the verb. (The feminine verb is רוֹפֵאת, rofet; the feminine noun is רוֹפְאָה, rof'a.) I didn't mean to include such nouns in this discussion.—msh210℠ 21:21, 5 January 2009 (UTC)