Talk:sută
Sută/sutã in Romanian and Aromanian
Hi Nicodene! Just want to provide you with my reasoning behind the revert. Scholars have for a long time disputed a direct Slavic origin due to the fact that all the other numbers are of Latin origin and a ŭ > u sound shift is hard to explain. This has led some scholars to believe that a Persian origin is worth considering. I don't believe that one theory is more plausible than the other, but it does give us reason to mark the etymology as "unknown" or at least "disputed" and provide readers with brief explanations of the existing theories. As I said before, I have the utmost respect for your contributions and expertise and hope that this doesn't cause a rift between us. Robbie SWE (talk) 17:12, 26 January 2025 (UTC)
- @Robbie SWE:
all the other numbers are of Latin origin
- All numbers from 11 to 19, and all multiples of ten from 20 to 90, are calqued from Slavic. For the next multiple (100) to have been directly borrowed, rather than calqued, would not be surprising since there the Slavic pattern breaks: there is no **desętь-desętь (“ten-tens”), only a non-compositional *sъto.
a ŭ > u sound shift is hard to explain
- That is in fact what happened in early borrowings from Slavic to neighbouring languages, when yeru would have been [ʊ] or similar. Various examples attested in Greek, Old High German, etc. are listed here (pp 21–2). For Romanian in particular, the following words are mentioned:
- One might add other comparanda like:
- Nicodene (talk) 04:03, 27 January 2025 (UTC)
- Let me address your comments @Nicodene:
- The numbers
- You're bypassing the issue: all the cardinal numbers – from one to ten – are of Latin origin in Romanian and all of its variants (Aromanian, Megleno-Romanian and Istro-Romanian). Even the numbers from 11 to 19 are formed using the Latin super (e.g. cincisprezece). In Aromanian, yinghiț (< Latin vīgintī) survived while it was lost in Romanian. Why would the word for "hundred" be the only diverging numeral? Even thousand (o mie) is of Latin origin. Let's not forget that Albanian has a similar way to form numbers above ten and to some extent, even Armenian. It could be argued that this might be a substrate feature, which in Romanian might have been enforced by Slavic (for more info, read Indo-European Numerals. (Trends in Linguistics. Studies and Monographs, 57). Edited by Jadranka Gvozdanovic. Berlin & New York: Mouton de Gruyter, 1992). Sidenote: the Romanian literary language was formed in the 19th century. Informally today and in the vernacular of the time, they did not use forms like unsprezece, doisprezece, treisprezece etc. They used unșpe, doișpe, treișpe etc. with regional variants such as unsprece, unspece, doisprece, treipce, treiprece and triisprece. All these seem eerily similar to other Romance languages such as Italian undici, dodici, tredici or even French onze, douze and treize.
- Romanian sută is feminine and always requires a determiner. All the Slavic derivatives are either masculine or neuter, making sută closer to Proto-Finno-Ugric *śëta than Slavic *sъto. As I said, there are several reasons why scholars still discuss its origin, be it Daco-Thracian, Iranian or Slavic.
- The comparanda
- Highly disputed examples:
- The word mătură could just as easily have come from Latin since it has several near-cognates in Southern Italian dialects. It could also just be a substrate word akin to Proto-Slavic *metla. The intervocalic shift l > r usually presents itself in inherited Latin terms, rarely in Slavic.
- Rather confused as to why you mentioned bulgar – the sources I've seen mention *bulgarinŭ or баъгарини. In either way, the original form was bălgar and bolgar which today are considered archaic or regional, the form bulgar being a more recent development probably influenced by Bulgarian българин.
- The word cumătru is more likely to be derived analogically as a masculine from the related feminine form cumătră (from Latin commāter, commātrem) especially considering that the equivalent forms in Slavic languages are very different (Bulgarian, Croatian, Russian kum, Serbian kumak).
- Names of villages in historically Hungarian regions are almost exclusively borrowed directly from Hungarian. Where the Hungarian names came from has very little to do with Romanian morphology.
- The word tocmai and its variants tocma, tomna, togma, togma, toima, toma, tomnai, tucma, tucmai, tucna, tuma and tumna are said to be borrowed from a Slavic form токма.
- Robbie SWE (talk) 20:43, 27 January 2025 (UTC)
- @Robbie SWE
You're bypassing the issue: all the cardinal numbers – from one to ten – are of Latin origin in Romanian and all of its variants (Aromanian, Megleno-Romanian and Istro-Romanian). In Aromanian, yinghiț (< Latin vīgintī) survived while it was lost in Romanian. Why would the word for "hundred" be the only diverging numeral? Even thousand (o mie) is of Latin origin.
- Odd question.
- Surely you do not think 1 or 2, but there is no other way to read your question.
Even the numbers from 11 to 19 are formed using the Latin super (e.g. cincisprezece).
- Yes, that is what is meant by calquing: language A arranges its own elements in a template taken from language B. If as you say this is a ‘substrate feature’, that too would imply calquing one way or another.
Let's not forget that Albanian has a similar way to form numbers above ten
- Let’s also not forget that Albanian has a mixed vigesimal system without any parallel in Balkan Romance.
- Or for that matter the lack of evidence from the pre-Slavic period for any of the following in the Balkans:
- A word like suta for ‘hundred’
- ‘N-on-ten’ compounds for 11/12/etc
- New ‘N-ten’ compounds for 20/30/etc replacing the old PIE *-ḱomt type
even Armenian
- Armenian has neither ‘N-on-ten’ compounds for 11/12/etc nor new ‘N-ten’ compounds for 20/30/etc replacing the *-ḱomt type.
They used unșpe, doișpe, treișpe etc. with regional variants such as unsprece, unspece, doisprece, treipce, treiprece and triisprece. All these seem eerily similar to other Romance languages such as Italian undici, dodici, tredici or even French onze, douze and treize.
- treispce and treisprece.
- The consistent /spr~sp~ʃp/ (< spre) breaks any ‘eeriness of resemblance’ for me, but I digress.
The word mătură could just as easily have come from Latin since it has several near-cognates in Southern Italian dialects.
- I suppose you are referring to the argument of Ciorănescu.
- He tries a couple of ways to arrive at the form mătură:
- 1) matta f sg ⇒ *maturi n pl ⇒ maturi f pl ⇒ mătură f sg.
- Comments:
- No such neuter form is attested.
- This does not account for the apparent cognates with /l/.
- 2) matta (“rush-mat”) + meto (“to reap, harvest”) + -ula (diminutive ending) ⇒ *metula (“broom”, somehow) > mătură.
- Comments:
- None of the languages in the Balkans have a verb continuing meto, and neither the Latin verb nor its (genuine) Romance continuations have any apparent relation to sweeping, brushing, or otherwise using a broom.
- That is quite the labyrinthine journey just to arrive at a form and meaning *metula (“broom”) that are all but identical to a simple Proto-Slavic compound *metъla.
- By the way, the reason Ciorănescu has to invoke a crossing with the Latin verb meto is that the a of matta would fail to account for:
- In fact this is a problem for his first theory as well, but he only mentions it in the second.
- As for the supposed ‘several near-cognates in Southern Italian dialects’, only one actually has the sense of ‘broom’, namely the mattǫ́rrə recorded by Rohlfs in the town of Oriolo, alongside the common Italian type skǭ́pə (< Latin scopa).
- Comments:
- The same survey found nothing like mattǫ́rrə ‘broom’ anywhere else in Italy. I have not managed to find anything like it either.
- Interestingly, there are no less than four Arbëreshë towns near Oriolo (Castroregio, Farneta, San Costantino Albanese, San Paolo Albanese) so some form of Albanian influence can easily be at play.
The intervocalic shift l > r usually presents itself in inherited Latin terms, rarely in Slavic.
- Both forms are attested in Istro-Romanian: méturę and métulę.
- Given that Romanian has zero inherited words ending in unstressed ulă, the adjustment of that sequence to ură in an early borrowing (thereby conforming to words like ghindură, lamură, lingură, marmură, negură, păcură, papură, scândură, vergură, volbură) would not be unusual. It seems măgulă (cf. Albanian magulë) underwent just such an adjustment to măgură.
Romanian sută is feminine and always requires a determiner. All the Slavic derivatives are either masculine or neuter, making sută closer to Proto-Finno-Ugric *śëta than Slavic *sъto.
- Not in the slightest. As pointed out by Loporcaro 2021 (p 78, underline mine):
- ‘As is well known, Romanian borrowed sută “100” from Old Slavic sŭto, which has been adapted as feminine like all o-ending neuters among ancient loanwords from Slavic (Mihăilă, 1960; Petrovici, 1962; Buchi, 2006: 75f.; Livescu, 2008: 2648). In addition, Romanian calqued all numerals from “11” on, except inherited mie “1000”: unsprezece/nouăsprezece “11/19” = OBlg. jedinŭ/devętĭ na desęte, doizeci = OBlg. dŭva desęti “20”, etc. (cf. e.g., Schulte, 2009: 248)’.
- Compare cerneală, ciudă, copită, nicovală, ocnă, pravilă, sită, sticlă, vadră, vâslă.
Rather confused as to why you mentioned bulgar – the sources I've seen mention *bulgarinŭ or баъгарини.
- блъгаринъ (blŭgarinŭ) is transparently блъгар-инъ (blŭgar-inŭ), just as блъгарьскъ (blŭgarĭskŭ) is блъгар-ьскъ (blŭgar-ĭskŭ). See *-inъ, *-ьskъ. The morpheme borrowed into Romanian is блъгар- (blŭgar-), without suffixes.
In either way, the original form was bălgar and bolgar which today are considered archaic or regional
- Is bulgar/bolgar/bălgar any more recent than mătură/mătoră/mătără?
the form bulgar being a more recent development probably influenced by Bulgarian българин
- Do you believe both of the following?
- 1) Older Slavic /ъ/ > Older Romanian /u/ is implausible.
- 2) Modern Bulgarian /ъ/ > Modern Romanian /u/ is plausible.
The word tocmai and its variants […] are said to be borrowed from a Slavic form токма.
- Whether it entered Romanian via an older Slavic тъкъма or a younger Slavic то́кма, or even both at different times, the etymon is still Proto-Slavic *tъkъmo/-a and this is still an example of original ъ resulting in a Romanian u (~o).
The word cumătru is more likely to be derived analogically as a masculine from the related feminine form cumătră (from Latin commāter, commātrem) especially considering that the equivalent forms in Slavic languages are very different (Bulgarian, Croatian, Russian kum, Serbian kumak).
- Per my comment, the point is not the direction of borrowing but rather the correspondence of Slavic ъ (*kъmotrъ > kmotr, kmotor) and Romance u (cumătru).
Names of villages in historically Hungarian regions are almost exclusively borrowed directly from Hungarian. Where the Hungarian names came from has very little to do with Romanian morphology.
- I suppose you mean phonology. In any case, per my comment, the point is not how the name reached Romanian but rather, again, the correspondence of early Slavic ъ (*mъx-) and another language’s u.
- The same point is made more bluntly by Saenko, who goes over about a hundred examples (pp 21–2). His comments afterwards:
- «Получается, что на тот момент, когда славяне, расселившись с территории своей прародины, вступили в контакт с греками, румынами, немцами, финнами, скандинавами и балтами, *ъ и *ь в их языке звучали достаточно близко к u и i в языках этих народов. Однако если мы считаем окончанием праславянской эпохи именно расселение с территории прародины (V–VI вв.), то следует признать, что в течение всей праславянской эпохи *ъ = u, *ь = i, а какие-либо изменения в месте артикуляции этих звуков начались вряд ли раньше конца общеславянского периода (X–XI вв.) и являются уже скорее фактами истории отдельных славянских языков». [‘Apparently when the Slavs dispersed from their urheimat and came into contact with the Greeks, Romanians, Germans, Finns, Scandinavians, and Balts, their ъ and ь sounded rather similar to the u and i found in the languages of those nations. If we take this dispersal (5th–6th c.) to mark the end of the Proto-Slavic period, then we have to accept that throughout it *ъ was u and *ь was i and that any changes to their place of articulation hardly began before the end of the Common Slavic period (10th–11th c.) and are rather phenomena belonging to the histories of the individual Slavic languages.’]
- Nicodene (talk) 09:42, 29 January 2025 (UTC)
- Thank you for your input, Nicodene. It's clear that this is a nuanced issue and ongoing discussions among linguists highlight the complexity of determining the origin of sută. Given the valid points on both sides of the debate, I believe that rewriting the etymology in any definitive direction at this point in time would be premature and not within Wiktionary's scope. I appreciate your examples and the highlights you've made, even if I wouldn't come to the same conclusions. I look forward to reading more on this subject and for further clarifications as research continues. Robbie SWE (talk) 19:51, 29 January 2025 (UTC)
- @Robbie SWE
- Michele Loporcaro, cited above as stating that the Slavic origin of sută is well known (and citing several other sources), is one of the leading Romance linguists in the world.
- Likewise Martin Maiden, head of the Research Centre for Romance Linguistics at Oxford University, who comments for instance in The Oxford Guide to the Romance Languages that sută is a Slavonic borrowing (p 582).
- A search through Google Books pulls up countless other examples of Romance linguists saying the same.
- A similar search for supporters of a substratum theory pulls up hardly anyone other than Sorin Paliga (and him, more than once). In fact even this search pulls up more sources supporting a Slavic borrowing.
- As far as the field of Romance Linguistics is concerned, the substratum theory is WP:FRINGE.
- I am amenable to mentioning that there are supporters of such a theory, but certainly not to presenting the etymology as ‘unknown’, given the state of academic opinion and the apparent absence of any valid counterargument against a borrowing from Slavic. I am open to hearing one if you wish to make the case.
- Nicodene (talk) 01:16, 30 January 2025 (UTC)
- I guess a sensible approach would be to mark it as "origin disputed" and proceed by presenting the different theories, possibly mentioning that the Slavic origin is "more likely". Does this sound like a good idea? Robbie SWE (talk) 20:20, 30 January 2025 (UTC)
- @Robbie SWE The phrasing I usually see in cases like this is more like ‘From [X language]. Some scholars propose [Y language, Z language] instead’. Nicodene (talk) 20:34, 30 January 2025 (UTC)
- I don't know, that kind of negates that there is an ongoing discussion. I've seen several examples of "Disputed" in the beginning of the entry and the theories presented with all the necessary arguments (compare gërfej). Robbie SWE (talk) 20:05, 31 January 2025 (UTC)
- @Robbie SWE Not when there is an obvious scholarly majority against a minority. The better comparison for sută is a case like berserker.
- As for the arguments, I am happy to paste this thread over to the relevant talk page or summarize it in the etymology. Nicodene (talk) 20:39, 31 January 2025 (UTC)