Wiktionary:Beer parlour/2021/March

From Wiktionary, the free dictionary

Wikifunctions logo contest

01:51, 2 March 2021 (UTC)

Dentonius

Equinox has indefinitely blocked User:Dentonius. While I agree that Dentonius has been disruptive, annoying, and has engaged in an awful lot of transparently political gamesmanship, I feel this should have been discussed and a consensus reached.

I'm not going to undo the block for now, but we need to make sure that the community agrees to it.

Aside from the rather nitpicky objections of a certain German IP and their apparent alter ego, User:Dubitator, which are themselves probably politically motivated, I haven't seen any indication of problems with his work in Jamaican entries. The only real problem has been his participation in rfd, in votes, and his canvassing on user talk pages.

I'm not sure if he would go along with it, but we might explore more creative approaches such as a Wikipedia-style ban on participating in deletion discussions and deletion-related votes, blocking him only from editing the Wiktionary and vote namespaces, etc. Chuck Entz (talk) 14:58, 2 March 2021 (UTC)[reply]

I think his behavior warrants a namespace block more than a sitewide block, at least to start. Vox Sciurorum (talk) 15:04, 2 March 2021 (UTC)
"Dentonius" should be made into a half-user because I'm unpopular? That is just sad. Some egalitarian community this turned out to be. I'd like to continue participating in all areas of community life here with no restrictions on my ability to participate. It's unfair to take this away because, at this point in time, my views are different from the community standard. I am not a troll and I contribute though not as frequently as some of you since I'm unable to. This block is an admission of something else. (Dentonius) — 2003:C2:2F01:3A09:7DCF:5762:2954:A62E 15:08, 2 March 2021 (UTC)[reply]
It's not a matter of your opinions or your popularity. It's a matter of the disruptive tactics you're using to forward your agenda. It's a matter of the rather divisive and borderline paranoid characterization of people who disagree with you as "terrorists" and "the clique". It's a matter of voting in rfd not based on the merits of individual cases, but on making a political point. Chuck Entz (talk) 15:33, 2 March 2021 (UTC)[reply]
Okay. Do what you have to do. — (Dentonius) 2003:C2:2F01:3A09:A1BD:96A:5416:A838 15:43, 2 March 2021 (UTC)[reply]
I don't agree with the block. Even though Dentonius can be a huge pain in multiple body parts at times, I believe his edits to Patwa are invaluable and that he shouldn't be censored. I would, however, like to once again ask Dentonius to start contributing to the mainspace more than to the RFDs and votes (for example by just voting keep/delete, support/oppose; you don't have to say anything while voting). Thadh (talk) 16:02, 2 March 2021 (UTC)[reply]
I don’t agree with the block, for much the same reasons as Thadh. I think it should have been discussed beforehand. I also agree with Thadh that it would be better for Dentonius (if he comes back) to contribute less in RfD and more on the Creole entries.--Tibidibi (talk) 16:08, 2 March 2021 (UTC)[reply]
@Thadh, Tibidibi: I agree that his Jamaican Creole edits are of high value to the project, but he makes very few of them, only creating one or two entries a month. (He claims this is due to being busy, despite having enough time to generate lists of RFD enemies and other questionable pursuits.) Would you both support Chuck's proposed namespace block? —Μετάknowledgediscuss/deeds 16:10, 2 March 2021 (UTC)
I personally would rather he not yet be blocked from the Votes and WT namespaces. I believe that if he agrees to my proposal (i.e. not explaining his votes, unless he can do it in a way that doesn't offend other editors), he may not cause as much disruption, despite his highly unusual thoughts on RFD (which I myself don't agree with). Any user should have the right to fight for the Wiktionary he believes in, even if he stands alone in this fight. If he, however, continues to behave the way he did up till his block, I believe a namespace block will be necessary. Also, if he is blocked from WT, won't he be blocked from contributing to Jamaican Creole's RFVs and BP/TR/ID discussions? Thadh (talk) 16:27, 2 March 2021 (UTC)
I agree with Thadh and would prefer that his rights be restored for the time being.--Tibidibi (talk) 00:41, 3 March 2021 (UTC)
  • For clarification, the lockdown has changed my life circumstances. I have been teaching my child at home more and have less personal time. The way I contribute in mainspace is fairly time-consuming. I like to create detailed entries from the get-go. It would be more of a fair assessment to look at my global contributions. I contribute to nine other Wiktionary servers and there's only one of me. I wasn't aware that there's a deadline? If I'm not making daily contributions, am I no longer useful? Are mainspace contributions necessarily more valuable than Wiktionary space contributions? That's debatable. For every word saved in RFD, that's a word not removed from our dictionary. Believe it or not, keeping an eye on what's happening in recent changes in Jamaican Creole and editing when needed is also of value. In January, I took a little time off from Wiktionary to consider how I was using my time here. This hobby can easily consume a lot of your time if you let it. You may not have noticed, but for some time, I have stopped justifying my voting decisions in RFD. I only vote. In the recent votes, my opinions were sought. People pinged me to get my view on something. They would not have heard from me had they not pinged me. I agree, however, if my username is returned to me, with all the rights it had before, I will be less talkative. This process, as I said, was already underway. — (Dentonius) 2003:C2:2F01:3A09:6924:D1B8:144F:E758 17:17, 2 March 2021 (UTC)[reply]
  • I support excluding him from RFD discussions. Simply asking him to explain his votes has not proved successful in the past, because he just says "It seems like more than SOP to me" (or something along those lines). And yes, I do believe his persistent voting "Keep" on blatant SOPs is intended to be disruptive. He first came to my attention when he tried to get a bunch of entries including the N-word deleted on the grounds of their being offensive. When he failed to get them deleted, because they were both attested and idiomatic, he turned around and started voting Keep on all sorts of garbage. This is classic w:WP:POINTy behavior: "If you won't delete the entries I want deleted, then I'm going to try to stop you from deleting the entries you want deleted." But I don't think he should be completely blocked from editing Wiktionary, either. I bet we don't have a single other regular editor who's a native speaker of Jamaican Creole, and even adding only one or two entries a month is a hell of a lot more coverage of that language than we would get without him. —Mahāgaja · talk 17:21, 2 March 2021 (UTC)[reply]
Mahagaja, you are mistaken about how sincere I am about wanting to keep words in this dictionary. Anybody who has been paying attention to me knows that I care deeply about this. I am Black. I believe there is another Black person now who contributes to Yoruba. You have very few Black people here. Why do you think I was upset by the "n-word killer" entry which is gone now, I think? The admins just blocked a user called "kikeshooter". I think Jewish people would be upset by that too. With that said, I have come to accept that N-word entries are here to stay as full entries. I love this dictionary so I have dealt with it. — (Dentonius) 2003:C2:2F01:3A09:6924:D1B8:144F:E758 17:28, 2 March 2021 (UTC)[reply]
I don't doubt your desire to help the dictionary by contributing to our coverage of Jamaican Creole, and I absolutely agree that we need both more editors of color in general and more editors to help us with African and African-diaspora languages in particular. But there is a huge difference between blocking someone with an offensive username (which I wholeheartedly support; as a gay man, I certainly wouldn't want us to permit a user to call himself "Fagbasher" or something) and trying to delete offensive entries from the dictionary (which I absolutely reject as long as they are attested and idiomatic, and no, nigger killer has not been deleted). —Mahāgaja · talk 17:56, 2 March 2021 (UTC)[reply]
I think you have that backwards: the campaign against the n-word and the like was a response to not getting the response he wanted to his campaign against SOP as a reason for deletion. The premise seems to have been "if you're going to be deleting things, why aren't you deleting this"? Chuck Entz (talk) 07:53, 3 March 2021 (UTC)[reply]
  • Preventing disruptive edits to votes and RFD should be a matter of policy. I've been turning over in my head a proposal to only, for example, allow autopatrollers to vote on RFDs. Perhaps we could instead make a special role for those users we entrust to vote on these matters. As Equinox himself has said in the context of user names, "If there are unspoken rules... then they need to be codified, right?" This is even more true when it comes to an indefinite block to any user. Relatedly, I don't know that Dentonius had sufficient notice that this behavior would have him blocked, let alone blocked indefinitely. So I do not agree with the block as presented, but would be on board with excluding him from RFD. Imetsia (talk) 17:49, 2 March 2021 (UTC)[reply]
    • He has been warned and then blocked before for disrupting RFD. Vox Sciurorum (talk) 00:08, 3 March 2021 (UTC)
      • Vox, I recall a post you made in RFD with the summary "keep the black man down". Here's the diff. Did anybody ever speak to you about that? For the record, I found that comment pretty offensive and insensitive, but I didn't raise the matter because I already know what the outcome would be. — (Dentonius) 2003:C2:2F01:3A63:C852:9163:C0CC:E532 00:53, 3 March 2021 (UTC)
        • The question here is whether user Dentonius should be blocked. Whether another user should be blocked or warned has no bearing on that. But whether Dentonius is going to keep playing the victim is highly relevant. Vox Sciurorum (talk) 01:10, 3 March 2021 (UTC)[reply]
          • Vox, my point is: if your open racism is tolerated here, why should I be blocked forever for giving Ultimateria a response when she pinged me for feedback? I'm assuming that's why I was blocked since there's no warning or anything else on my talk page to explain what's going on. — (Dentonius) 2003:C2:2F01:3A63:C852:9163:C0CC:E532 01:14, 3 March 2021 (UTC)[reply]
            • What open racism? That was a play on words, not meant to be taken at face value. Insensitive, perhaps, but not at all racist.
But it does illustrate a big part of the problem: a wiki is a community, and personal attacks against a fellow member of your community damage the community itself. It's not "us" vs. "them"- it's "us" vs. "us".
A great deal of your argumentation has been gratuitously ad hominem, and some of your most egregious stunts seem to have been solely for the purpose of setting things up for more ad hominems. There have been occasional arguments based on ideas, but mostly it's been something along the lines of "those people over there are lying racist hypocrites." The stunt that I blocked you for is a perfect example: you've never nominated anything like it for deletion, and when you nominated this one, you didn't nominate the lemma- you nominated a minor alternative form entry that happened to have been added by someone who was on your target list. There's simply no plausible explanation other than your wanting to be able to say "see, they're all hypocrites: they just want to delete everyone else's entries, not their own."
I would say, from my observations, that 99% of the nominations and the votes- both "delete" and "keep"- have been based on people's honest opinions. I see people change their minds all the time when someone brings up a good point. Of course people have their prejudices, and of course people sometimes go overboard defending their point of view- but there's no conspiracy or dishonesty involved. Chuck Entz (talk) 08:11, 3 March 2021 (UTC)[reply]
  • Sure, he's been blocked before, but that was for disruptive and "trolling" edits to RFD. If I remember correctly, it was for him nominating the entry fuck like bunnies just to prove a point. But the repeated delete votes are not of that same mold. Imetsia (talk) 02:55, 3 March 2021 (UTC)[reply]
I mostly agree: there are a lot of people whose only participation is to make the same vote on a number of entries at a time. That, in itself, isn't a problem. It's only when the sheer volume and the comprehensiveness starts to give the impression that he's just holding his thumb down on the "keep" button all through rfd without bothering to even read what he's voting on. Chuck Entz (talk) 08:11, 3 March 2021 (UTC)[reply]
  • Thank you. Though this actually looks like a success story to me. Robbie SWE essentially tells Dentonius "cut it out with all these undeletion requests or you're going to get banned", and (after a bit of squirming), Dentonius heeded the warning. It looks like they didn't nominate any more entries for undeletion after Dec. 20 2020. That just reinforces my belief that Dentonius would have been responsive to a warning in this case. Colin M (talk) 22:26, 3 March 2021 (UTC)[reply]
  • Disenfranchising me would be unfortunate because it would mean that there is no room for alternate viewpoints. It would mean that I have no say in the future of our dictionary. It sometimes feels as if I'm building on a foundation of sand here. This has demotivated me somewhat. The thought that your hard work can be so easily removed here deeply disturbs me. How many editors have been driven away from Wiktionary for this very reason? Please restore my username "Dentonius" with all the permissions it had prior to its being blocked. You have nothing to fear from a single person who only has the power of one vote. — (Dentonius) 91.32.93.82 18:22, 2 March 2021 (UTC)[reply]
  • In full disclosure, I brought the "RFD terrorist" comment to Chuck's attention, mainly because I didn't want to have to be the one to do something about it (Dentonius has accused me multiple times of having a vendetta against him). I'm all for a block – permanently? Maybe not, but Dentonius has not shown me any willingness to actually contribute in a productive or constructive way. I completely agree with Mahagaja about the w:WP:POINTy behaviour – it's been my hunch from the get-go that Dentonius has been trying to prove a point ever since the nigger killer discussion. Sure the Jamaican Creole entries are great and it's a pity that he didn't just continue adding them, despite being his self-proclaimed mission. I just can't turn a blind eye to the politicising of our forums, turning users against each other, instigating outright wars between so called inclusionists vs. deletionists (I'd never heard of the terms until Dentonius started throwing them around like confetti at a wedding), nominating random users for adminship without asking them first (oftentimes not taking into consideration failed nominations, opening old wounds as a result), pressuring people to reconsider their votes, personal attacks, keeping tabs on users' voting habits to prove some kind of point and last but not least, calling people who vote delete terrorists. It became painfully clear that users took offence – especially people who have experienced acts of terrorism in their real lives. One might expect that from a pubescent boy, but not from a grown man and father, claiming to be all about linguistics. I'm just really disappointed and even if the block is revoked, I fear we'll be here again in a couple of months. With that said, I'm willing to try excluding him from RFD discussions. Call me an incurable optimist...--Robbie SWE (talk) 18:47, 2 March 2021 (UTC)[reply]
  • This dictionary means a lot to me and I would like to continue contributing here in all namespaces. To have fewer rights would mean that I'm not really a part of the dictionary. As mentioned previously, I would be unable to defend my own entries in RFD and I would have no say in the future of this dictionary if I am denied my right to vote on policy or who gets to have special database management roles. It also represents the silencing and suppression of an alternative viewpoint here which I suspect is shared to different degrees by other users here. I would like to address a few of the points raised:
    • I have indicated that I'm Black and generally say "n-word" instead of the actual word because I find the word that offensive. Has anyone shown me any consideration above? You continue to use that word in this topic, yet feign offence at the word "terrorist."
    • Metaknowledge, I'm not sure how you can assert that I have more time than I do. The statistics are generated by a shell script which I wrote a long time ago. I'll reiterate: I don't have a lot of time nowadays. It's become worse under the lockdown. Family, work, and other personal matters eat up my time. I'm giving of my time now to defend my account because this is important but there are other things I should be doing right now. When the lockdown ends here in Germany, I will have more time to contribute more frequently to mainspace.
    • Mahagaja, you said I nominated several n-word entries for deletion. How many n-word entries did I nominate for RFD?
    • Mahagaja, who withdrew the RFD for "n-word killer"?
      I have not thought about that entry in months. Your bringing it up here as a motive for my actions is a straw man.
    • Robbie and Equinox, who specifically did I call an "RFD terrorist"? Not to defend the term, but is the term "terrorist" possibly a playful Caribbean way of referring to mischief makers? My way of speaking English may come across as harsh to other English speakers but we Caribbean English speakers are way more direct than Brits or Americans. We will tell you precisely what we feel but it doesn't mean that we view you as enemies.
    • Chuck, what exactly is wrong with having an agenda? I have been very transparent about that. I'm not happy with the status quo and I want to change it.
    • Chuck, what is wrong with my believing that there is a clique here? I am either wrong or I'm right. But who am I hurting? At worst, others will find it laughable. At best, it could be true.
    • Nothing I say will earn me any sympathy, but the ones calling for the removal of my RFD and voting rights are RFD regulars who consistently vote "delete". It's just a statement of fact. I have been told that appearances matter. This certainly has the appearance of being extremely biased.
      • See my comments above: personal attacks are a grievous breach of wiki-etiquette. It's one thing to say that too many entries are being deleted, and explaining why you believe they shouldn't be. It's another to say that it's all because some lying hypocrites are out to destroy everyone else's work. Chuck Entz (talk) 07:53, 3 March 2021 (UTC)[reply]
    If I hadn't responded to your pings, we would not be having this conversation. It is one of the reasons that I no longer respond in RFD. Unfortunately, I responded to Ultimateria's comment about what my priority in choosing an admin should be. She pinged me. I should have remained silent. Hindsight is 20/20. In the future, I'll be responding to even fewer pings assuming that my user account is restored. I again implore you in the interest of impartiality, fairness, and democracy here in this egalitarian community: restore my account with the permissions it had prior to its being blocked by Equinox. — (Dentonius) 2003:C2:2F01:3A09:C852:9163:C0CC:E532 20:24, 2 March 2021 (UTC)[reply]
  • Moreover, Robbie, which random users have I nominated to become admins without asking them? Please name them and provide links to those admin votes. — (Dentonius) 2003:C2:2F01:3A09:6924:D1B8:144F:E758 21:11, 2 March 2021 (UTC)[reply]
    I'm surprised you didn't think of the recent Donnanz admin vote yourself – I even called you out on it. In addition to this, you also went on a nomination spree in order to counterweight PUC's (at the time) possible adminship. It baffled me that you for some reason came off as rushed – Sonofcawdrey received the exact same message as Habst, just an hour apart, despite having declined a nomination a couple of months prior. Votes need to be diligently prepared and canvassed before setting them into motion (just look at BigDom's nomination which was at least three months in the making). Your motives are inherently dishonest and risk damaging already strained relationships (I still feel bad for what happened to Donnanz). --Robbie SWE (talk) 18:07, 3 March 2021 (UTC)[reply]
  • Final point before I go to bed: I was not warned that I would be blocked. I haven't spoken to Equinox in a long time. He gave me no warning. The block for that reason alone should be overturned. Good night all. Sleep tight. — (Dentonius) 2003:C2:2F01:3A09:6924:D1B8:144F:E758 22:16, 2 March 2021 (UTC)[reply]
1. Based purely on my visceral personal reaction to almost all of D's contributions in Wiktionary namespace, I would vote for an infinite ban. 2. But trying to look at the situation in terms of the good of the project, there are some positives in the area of principal namespace contributions and challenging some of our ways of doing things. 3. But I also believe that I am not the only one with a visceral negative response to D's contribution in Wiktionary namespace. That visceral reaction made me not want to participate in any discussion in which D was active. Though I was more angry than afraid, it actually helped me understand what Wikimedia contributors mean when they feel that the environment is not "safe", as much as I dislike the loss of freedom associated with such PC concerns.
If this last factor is something many agree with, then, for the good of the project, we need a longish ban. If there are not some others who agree, then I would swallow my visceral dislike and vote for removing the ban, especially since a ban could be instituted again if necessary. DCDuring (talk) 23:54, 2 March 2021 (UTC)[reply]
@DCDuring: I cannot comprehend your third point, or last factor. It is something about feelings, so there is nothing to agree with. This was legit something about socialization and feel-good but this form of approach does not do justice to this WWW medium, in conjunction with the manifold psychogeneses editors are subject to. Every decision must be expressed in terms of reason; or in other words, viscera spawn factors too uncertain to be agreed upon.
So I cannot be but amused by attempts of group distinctions, or the display of clique formations, like there is little cause to care whether somebody is a black or a terrorist or rather a victim of terrorists or one who calls others terrorists (nice try!) or a sodomite or a dog.
It appears that bans are all about disruption, but only in so far as one should be disrupted, not when one man should withstand in sober self-assertion.
Consequently I found it most appropriate to ignore and trust that the bait will be ignored, so maybe the ignorant will eventually discern his vanity. The only concerning point was the waste of time. This is a legitimate point of concern, as attention is expensive, and it is to be observed that some value it more than others, thus Equinox took the bait, that Dentonius gratuitously offered, rather than to continue to let everyone dance around it. This is the yardstick to be objectivized. For a comparing case, to advance the science of bannology by way of contrast, it appeared to me particularly tenable when Gnosandes (talkcontribs) got a time-out due to blather, for being awfully and longfully beside any point. But this is a recoverable ailment. Fay Freak (talk) 02:07, 3 March 2021 (UTC)[reply]
@Fay Freak: I did not ask you to point your finger at me and make an example or comparison out of me. Gnosandes (talk) 08:50, 3 March 2021 (UTC)
  • From the block log: Equinox: "Yes, thought about this carefully. User is not contributing to mainspace, doing almost no dictionary work, mainly obstructing votes etc., accuses of "RFD terrorists" -- he is an entryist troll." (1) I am contributing to mainspace here and on other Wiktionary servers. I'm just not as prolific as some of the others. I simply don't have that kind of time right now. I will have more time to make more mainspace contributions after the lockdown is over in Germany. (2) How am I obstructing votes? I have one vote like everybody else here. From a technical point of view, I know what it would take to create several difficult to detect sockpuppets (VPNs; change browser connection strings + MO; etc.) but I'm not a dishonest person. That's not worth it for me. I have one username and one vote and I'd like to keep it that way. I don't understand how my one vote obstructs all the people here who outvote me (3) Re: "RFD terrorists." That was a remark which didn't refer to anybody specifically. It referred to people who almost always vote "delete" in RFD. It's something which is very demotivating as an editor here to see how easily people's work gets destroyed. It just isn't normal to see so many deletion requests each day, but for many of you, it is. Is there anybody here who really believes that I think others here are in the same league as Bin Laden? We Caribbean English speakers employ hyperbole often when we speak (Ol' terroris'). For us, it's in our jocular way of speaking. For others, I imagine how it must come across in the written medium. I just don't understand why my account was blocked permanently with no warning given. Can someone please restore my account so I can continue contributing (not just to Jamaican Creole) but to English and other languages as well? Please restore my account with all the rights it had prior to its being blocked. — 2003:C2:2F01:3A63:C852:9163:C0CC:E532 01:52, 3 March 2021 (UTC)[reply]
So many of your defenses are specious and exhausting. At every turn you make it harder to assume good faith:
  • "You have very few Black people here" -- I've already pointed out to you that it's impossible to know that.
  • "The admins just blocked a user called 'kikeshooter'." Choosing a username is tantamount to espousing an idea. Including an entry in a descriptive reference work while clearly labeling it offensive is not.
  • "why should I be blocked forever for giving Ultimateria a response when she pinged me for feedback?" This is pathetic. I shouldn't even have to say this, but it was the content of your message that was deemed unacceptable, not the mere fact that you responded. I didn't lay a trap for you, I gave you my honest opinion, which I regret wording as absolutely as I did. For the record, Vox's edit summary was inappropriate because it could be taken literally by those who don't know that it's not only a reference, but its inverse.
  • "You continue to use that word in this topic, yet feign offence at the word 'terrorist.'" Choosing not to censor an offensive word (that isn't directed toward you or anyone else) in a discussion about the word itself is not equivalent to calling someone a terrorist.
  • And lastly, "That was a remark which didn't refer to anybody specifically. It referred to people who almost always vote "delete" in RFD." A crystal clear allusion obviates the need for names; there's no room here for plausible deniability. Everyone here understands that you're referring to me, Imetsia, Robbie, and others.
I don't think you're comparing me to an actual terrorist. I can give you the benefit of the doubt on this issue; a European friend told me how she learned the hard way to stop casually calling people "autistic" or "bipolar" in the US. I had this in mind when I calmly explained why you should avoid the word. What I find most disturbing about these comments is that you're talking about us like we're your enemies in a war. We have squabbles and there are some grouches here, but I've never felt real hostility from anyone. That changed with your list tracking our RFD votes. To paraphrase DCDuring, I'm not threatened by it, but it contributes to what has essentially become a toxic workplace. A toxic workplace staffed by volunteers! It's surreal. But that's subjective and unquantifiable, and you've already been blocked for your blockable offenses. In the end I don't know what to do, but a permaban doesn't seem like the best choice. Ultimateria (talk) 03:48, 3 March 2021 (UTC)[reply]
As for your assertion that you could avoid detection as a sock: maybe. The checkuser tool has its limitations. Of course, your "MO" is precisely the problem. If you could participate without being disruptive and toxic, you wouldn't have been blocked in the first place. Any persona you adopt is going to have the same fundamental misunderstanding of what Wiktionary is and how it works. It wouldn't be necessary to see through the disguise- all it takes is looking for someone fake who's trying very hard to be inconspicuous. Chuck Entz (talk) 15:59, 3 March 2021 (UTC)[reply]
I've unblocked Dentonius for the time being. It's silly having a discussion with one of the main parties having to sneak in to participate. Whether he ends up being reblocked, left unblocked, or something else depends on his behavior and the outcome of this discussion. Chuck Entz (talk) 07:53, 3 March 2021 (UTC)[reply]
I support this block and do not think Dentonius should have been unblocked; 'entryist troll' is an accurate (and honestly mild) description. I have no good words for this user; his behavior is sanctimonious and toxic and he poisons every discussion he participates in (all the while contributing shockingly little content to this project, given the energy he displays in politicking and agenda-pushing). But others have made these points well enough. Equinox' block was as generously late as Chuck Entz' unblock was premature. — Mnemosientje (t · c) 10:30, 3 March 2021 (UTC)[reply]
I agree. Dentonius is wasting our time. Vox Sciurorum (talk) 10:44, 3 March 2021 (UTC)
As more harm has been done to the project and the community than good, I would also agree to putting a stop to that. J3133 (talk) 11:20, 3 March 2021 (UTC)
  • I have thought about the comments here and I agree that it can't be business as usual. My behaviour has pissed off a lot of people; I acknowledge that. It is possible, I think, for me to keep my ideals without creating needless conflict. My presence in RFD is hurting any chance of realising what I'd like to see here. (1) Thus, I will no longer participate in RFD; I renounce my right to vote or participate in RFD English and RFD Non-English. If an administrator would like to enforce this with a lock preventing my modifying those particular pages, that would be acceptable. (2) As indirect name-calling is just as hurtful and divisive as direct name-calling, I will do my utmost to avoid doing either. (3) I will make more mainspace contributions. This will be an automatic consequence of not participating in RFD. Is this acceptable to anybody here? — Dentonius 20:27, 3 March 2021 (UTC)[reply]
Your willingness to compromise seems to show you're willing to put your money where your mouth is (i.e. put the project's development first). I appreciate that, and I think you should definitely not be blocked if you're willing to show this kind of self-restraint. It's great to see people editing a language like Jamaican Creole, so it would be a shame to lose you altogether. Andrew Sheedy (talk) 05:44, 4 March 2021 (UTC)[reply]
  • This block seems totally contrary to the policy described at WT:BLOCK. In particular, it doesn't seem like Dentonius is being accused of any of the criteria for an indefinite block which are listed at WT:BLOCK#Block_length. Furthermore, the policy says that the block tool "should not be used unless less drastic means of stopping these edits are, by the assessment of the blocking administrator, highly unlikely to succeed". It seems insane to not even leave a warning on their talk page before resorting to a ban. In this discussion, Dentonius has proved themselves to be open to discussion and to modifying their editing habits. Colin M (talk) 21:20, 3 March 2021 (UTC)[reply]
  • Although I haven’t been abused by any name calling, I too have been rather annoyed by what I considered unproductive and unbecoming behaviour in a project so dependent on a spirit of collaboration. The promise to avoid conflict by avoiding RFD discussions as well as name-calling is acceptable to me, for one.  --Lambiam 01:52, 4 March 2021 (UTC)[reply]
  • My initial impression was that it's healthy to have new people come in from time to time and question things. Unfortunately the way this was done was overly confrontational, and making people angry isn't the best way to get things to change. If this constant confrontation is avoided Dentonius could be a good contributor to the project. – Jberkel 09:49, 4 March 2021 (UTC)[reply]
  • Welcome back, Dent. Oxlade2000 (talk) 00:38, 7 March 2021 (UTC)[reply]

Venetian orthography

Is there any convention for use of accents on Venetian vowels, or for orthography in general? For example, {{R:tr:LF}} uses bastón for the word {{R:vec:Boerio}} spells bastòn. We have bełézsa for what Boerio spells belezza. (According to Wikipedia, "several proposed alphabets for the Venetian language" use this variant of l.) Several words with accents have unaccented alternate forms.

Venetian used to be an influential language and I want to add missing links in etymologies. But I can't hit the Internet to find out what spellings are found in the wild because I mostly find a small set of dictionaries (including Wiktionary). It's a spoken language. From one point of view, anything in a reputable dictionary is fair game. But unless acute and grave accents have distinct meanings, it would be better to settle on a standard (in the absence of a citable work of Venetian language literature using a contrary spelling). And likewise with ł vs. l. Vox Sciurorum (talk) 15:04, 4 March 2021 (UTC)[reply]

Venetian does have open and closed mid vowels, and apparently even a near-open central vowel spelt á. I spent some time listening to Venetian on Forvo in the past, and from my impression every word-final -on is closed - and this ending is extremely common. This page on our Italian sister lays out a 1995 orthography saying that only the open e/o should be indicated with a diacritic. Lack of graphical accents in a word predictably indicates penultimate (for vowel-final) or ultimate stress (for cons-final words). From all the indications, l and ł don't indicate two different phonemes and the diacritic version is just there to alert the reader that the writer is a central/city Veneto speaker and that in their speech the phoneme is pronounced differently from Italian. I have a bigger question here: Italian doesn't use diacritics in entry names, and Italian regional languages use them sparingly as well. I don't see any reason to include diacritics in entry names for them - this is not Spanish, why not let vowel openness be indicated in the headword/IPA? This is even more pressing in the case of Sardinian, on which I will post below. Brutal Russian (talk) 19:59, 4 March 2021 (UTC)[reply]
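A minimal sketch of the default-stress rule described in that 1995 orthography, assuming a simplified five-letter vowel inventory. The function names are made up for illustration, and the rule is only applied to words written without any accent mark:

```python
import unicodedata

# Assumed vowel inventory for this sketch; the orthography itself may differ.
VOWELS = set("aeiou")


def has_written_accent(word: str) -> bool:
    """Return True if any letter carries a combining mark (as in 'bastón')."""
    decomposed = unicodedata.normalize("NFD", word)
    return any(unicodedata.category(ch) == "Mn" for ch in decomposed)


def default_stress(word: str) -> str:
    """Predict stress placement for a word spelled without accents."""
    if has_written_accent(word):
        raise ValueError("word has a written accent; the default rule does not apply")
    # Vowel-final words: penultimate stress; consonant-final words: ultimate stress.
    return "penultimate" if word[-1].lower() in VOWELS else "ultimate"
```

Under these assumptions, `default_stress("belezza")` gives penultimate stress and `default_stress("baston")` gives ultimate stress, which is why the accent would be predictable in those spellings.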

Sardinian accent marks in page names

Related to the question above, this has been bothering me for a while now. Sardinian doesn't have anything approaching a single standard or associated orthography. It doesn't have phonemic open or close vowels - mid vowels harmonically close if followed by a high vowel (in Campidanian this is now opaque due to final mid vowel raising: sa d/o/mu, sas d/ɔ/mus (house, houses); deu sp/ɛ/ru, su sp/e/ru (I hope, hope)). Those sticking to Italian orthography use different diacritics depending on openness, but many (most?) orthographies prescribe using these only in case of homophony (as in speru), and otherwise only using the grave accent - for example this one. Currently on the website òmine follows the latter prescription, but inevitably there's an 'alternative form' that does the opposite; atzàrgiu opts for marking the antepenultimate vowel letter/penultimate syllable in a case of an extremely common suffix that can really only be read one way (if there are words in /'d͡ʒi.u/ they must be very rare indeed) while rubiu and dighidu do not. And písche is just obscene considering Sardinian avoids final stress in a manner very similar to Latin. To make a long story short, in my opinion the most obvious solution is that Sardinian entry names should contain no diacritics. Brutal Russian (talk) 20:27, 4 March 2021 (UTC)[reply]

Considering there is a second language with the same problem as Venetian, I prefer no diacritics in the page names for these rarely written languages without generally accepted standards. They will make the terms harder to find. The module wizards can strip them out of links like vowel length markings in Latin and old Germanic languages. You can use {{head|vec|noun|head=bastón}} to provide them, as is done for vowel length markings. And if somebody else decides to change to {{head|vec|noun|head=bastòn}} no links have been broken. Vox Sciurorum (talk) 23:19, 4 March 2021 (UTC)[reply]
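To make the link-stripping idea concrete, here is a rough sketch in Python (the real modules are written in Lua, and this is not their actual code). The name `entry_name` and the explicit mapping for `ł` are assumptions made for illustration; whether `ł` should be folded to `l` at all is one of the open questions in this thread.

```python
import unicodedata

# "ł" has no Unicode decomposition, so it needs an explicit mapping.
# Folding it to "l" is an assumption of this sketch, not settled practice.
EXTRA_MAP = str.maketrans({"ł": "l", "Ł": "L"})


def entry_name(display_form: str) -> str:
    """Derive a diacritic-free page name from an accented display form."""
    decomposed = unicodedata.normalize("NFD", display_form)
    # Drop combining marks (acute, grave, etc.) but keep the base letters.
    stripped = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    return unicodedata.normalize("NFC", stripped).translate(EXTRA_MAP)
```

With this, both `bastón` and `bastòn` would resolve to the same page name, so changing the accent in the headword would not break any links.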
Here's a few more thoughts on Venetian:
  • Link #1: per esempio molti termini che contengono le vocali “e” e “o” anche a distanza di pochi chilometri possono venire pronunciate aperte o chiuse (“mòcolo” o “mócolo”, “calièro” o “caliéro”). It gives bastón and even bón, where the vowel is etymologically open. Thus the close vowel does indeed seem to be conditioned by the syllable-final nasal, and then gets transferred throughout the paradigm. Word-final o's seem to be all open as in Italian.
  • Link #2: gives the same. Additionally a description on the same website's root mentions that di solito il dialetto non si scrive con gli accenti (ed è un errore perché è un guaio per chi legge, anche se più comodo per chi scrive). In our case though it is a guaio to find the terms if diacritics are used.
  • Link #3: this one takes a more liberal approach: Data la varietà delle pronunce, anche fra abitanti delle stesse zone o zone vicine, solo utilizzando opportunamente gli accenti grafici sulle vocali “o – e “ si può far capire al lettore come lo scrivente intende siano pronunciati i suoni chiusi o aperti. C’è chi, infatti, dice pòco o póco, pìe o piè, sèra o séra e così via. Tuttavia è invalso l’uso, copiato dall’italiano, di non segnare mai tali accenti nelle scritture normali, lasciando ai lettori piena facoltà di pronuncia., nonostante il rischio di ingenerare qualche inesattezza. Per questo, e solo per facilitare una dizione esatta (almeno secondo la parlata padovana dell’autore) in molti esempi del presente lavoro sono stati segnati accenti grafici (acuti e gravi) su “o - e “ anche di parole piane. Gives bón and bastón as the other two.
  • Link #4: it's just the one speaker, but I don't think he pronounces bòn any differently from the other spellings.
In view of this, getting rid of diacritics in page names seems to be the sane approach - ditto for Sardinian, even if for somewhat different reasons. What do we do with existing pages? RFD and redirect in the meantime? Brutal Russian (talk) 10:56, 6 March 2021 (UTC)[reply]

Kana lemmatization of Japanese entries

There seems to have been an agreement some time ago between Japanese Wiktionary editors to lemmatise native Japanese words fully in kana, whereas they are typically spelt with kanji in most other dictionaries and academic material (if you are not familiar with the Japanese language, hiragana is a Chinese-character-derived syllabary that is often used coupled with kanji to indicate the readings of words (okurigana), but in some cases also used by itself). The reason given here is that non-Sino-Japanese vocabulary is more likely to include words that have multiple, equally used different kanji spellings, and to prioritise the lemmatisation of one spelling over another would be to make a hasty judgement on their frequency. An example given to me is つける (tsukeru, "to apply"), which has the two kanji spellings 着ける (tsukeru) and 付ける (tsukeru), both being used with relatively equal frequency.

Very well, I'm not necessarily against the application of this standard to the few words that actually have alternative, equally used kanji spellings, but it seems the community, in a move which I find to be mistaken, has taken it upon itself to apply it to absolutely all native Japanese words, regardless of whether they have different spellings or not.

For a recent example, the word 着太り (kibutori), a word without any alternative spellings and whose kanji spelling has approximately 800x more hits on Google than its hiragana, is lemmatised as きぶとり (kibutori) on Wiktionary. If this is simply a measure to mitigate the inconveniences of lemmatising certain words that don't confine themselves to a single spelling, then I see no reason why it should also be applied to words that don't pertain to this category, lest we pester the reader with spellings that he is 800 times less likely to search for, or copy-paste entries for no reason (some may argue that the {{ja-see}} template is not a copy-paste, but it is for all practical purposes so).

Another problem is that, if we are to take this standard to its very extreme, you are very likely to wind up grouping two completely unrelated words on the same page. The verb 知る (shiru, "to know"), one of many such native Japanese words with no other common spellings, would (though fortunately it is not at present), in a twist of fate, be put on the same page as the noun 汁 (shiru, "juice"), despite having different pronunciations, etymologies (and spellings!) and a different grammatical classification. If the objective is to reduce data usage, then it fulfils its purpose perfectly; at the expense of readability, of course.

Yet another issue, taking from the last point, is that, as you can see, it makes reading and differentiating different words very difficult. Chinese characters (kanji) have been integrated in the Japanese language for over a thousand years, and even the native words have been accommodated to fit the system. Before the spelling reform it was not as much of an issue, but nowadays you cannot tell the difference between 居る (iru, "to be"), 要る (iru, "to need") and 入る (iru, "to enter") if they are written in hiragana, even though they are three completely different words with completely different meanings. I believe in the one entry per lemma ethic, and I don't want to end up on the word for "juice" when searching for the word "to know" simply because editors found it convenient to put them in the same page.

To reiterate, I am not for completely abolishing the current standard of making hiragana entries for words with multiple, extant kanji spellings. I simply find it unwise for it to be applied to words that have otherwise no reason to be put in hiragana entries, for what I perceive to be simply convenience or some sense of uniformity. I've been told this is a temporary measure while editors get used to the system; if it is true, it's counterproductive insofar as it will only create more work in the future. (Notifying Eirikr, TAKASUGI Shinji, Atitarev, Suzukaze-c, Poketalker, Cnilep, Marlin Setia1, 荒巻モロゾフ, 片割れ靴下, Onionbar, Shen233): Alves9 (talk) 02:56, 5 March 2021 (UTC)[reply]

I think overall the conventional form or the most common form of a word should be lemmatized (which can be debatable for a lot of words). It could contain kanji or be all in kana. Sometimes I believe it is reversed, like 依怙贔屓(えこひいき) (ekohīki), whose kana form I'm sure is more common than the kanji. But as long as both forms have their entries created, I'm okay with that. Shen233 (talk) 03:48, 5 March 2021 (UTC)[reply]
We already do create entries for 1) spellings with kanji (if any), 2) spellings in kana only, and 3) romanized spellings (romaji). Romanizations are only soft-redirects to kana spellings. For kana or kanji spellings, whichever is the non-lemma entry should use {{ja-see}} or {{ja-see-kango}} to display a limited set of information dynamically fetched from the lemma entry, which also refers the user to the lemma entry for fuller detail.
For examples, see also non-lemma kanji entry [[齎す]], lemma kana entry [[もたらす]], and romanized entry [[motarasu]]. ‑‑ Eiríkr Útlendi │Tala við mig 06:05, 5 March 2021 (UTC)[reply]
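As a rough model of the lemma and soft-redirect arrangement just described, here is a small Python sketch using the もたらす example. The dictionaries and function names are hypothetical; real entries are wiki pages and the actual fetching is done by {{ja-see}}, not by code like this.

```python
# Hypothetical in-memory model: one lemma entry keyed by its kana spelling.
LEMMAS = {
    "もたらす": {"romaji": "motarasu", "kanji": ["齎す"]},
}


def build_redirects(lemmas: dict) -> dict:
    """Map every non-lemma spelling (romaji, kanji) to its kana lemma title."""
    redirects = {}
    for kana, info in lemmas.items():
        for alias in [info["romaji"], *info["kanji"]]:
            # Simplification: a real spelling can point to several lemmas.
            redirects[alias] = kana
    return redirects


def resolve(title: str, lemmas: dict, redirects: dict):
    """Return the lemma title for any spelling, or None if it is unknown."""
    if title in lemmas:
        return title
    return redirects.get(title)
```

Whichever spelling is chosen as the lemma, the lookup works the same way; only the direction of the redirects changes.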
There is no such consensus. Now every editor is just editing based on personal preference. Your example, 着太り, is lemmatized in kana just because its creator, User:Suzukaze-c, is a supporter of kana lemmas.
By the way, I seem to be the only user intentionally left out from the workgroup ping for Japanese editors. Can I know why? -- Huhu9001 (talk) 05:54, 7 March 2021 (UTC)[reply]
If it's really the case that there is no consensus agreed on, then that's a problem of similar magnitude. We can't make a dictionary where even the writing system in which entries are submitted is based on personal preference; it would be like asking for dissent and chaos. It's better to decide on a fair standard now than to suffer later.
The reason why you weren't pinged is probably the same as why I'm usually not pinged, that is: sadly, I don't know who you are. Also, the pings being copy-pasted from another post was likely another factor. Alves9 (talk) 08:28, 7 March 2021 (UTC)[reply]
@Alves9: Then you are not using t:wgping correctly. The correct way to use it is: {{subst:wgping|ja}}. If you want to get pinged as well when others use this, add your username to mod:workgroup ping/data like this:
groups["ja"] = {
	desc = "Japanese";
	-- (your username here)
	-- ...
}
-- Huhu9001 (talk) 10:10, 7 March 2021 (UTC)[reply]
I see, thanks for taking the time to explain. If I must, I'll use that from now on. Alves9 (talk) 11:24, 7 March 2021 (UTC)[reply]
Without {{ja-see}} / {{ja-kanjitab|alt=}}, I lemmatized, and would still lemmatize, at "the most common form".
(In the context of native words:) Lemmatizing at kana avoids the inconsistency of deciding "what is more common" (御 / お・ご) based on Google©™ hits or the intuition of a bunch of freakish word nerds. Following the principle of using kana for all entries is a consistency that I believe in.
If this consistency is achieved, and a user knows that Wiktionary lemmatizes at kana for all native words, they can arrive at an entry instantly with just the pronunciation, whether it is とる or きぶとり. The entry title should not be a partial guessing game on the part of users *and* editors.
As for your "one entry per lemma ethic" and "grouping two completely unrelated words in the same page", I present 擦る, 燻る, and 上下. Homophones are no less troublesome than homographs. —Suzukaze-c (talk) 06:23, 8 March 2021 (UTC)[reply]
Dictionaries are made by people; that means that human thought must be unavoidably involved in the creation of entries. It seems you don't like that, but that is just how it is. Once again, your full hiragana policy does not even succeed at neutralising the problem of multiple spellings, as it simply uses another form of spelling that actually happens to be the most uncommon one between all of them.
Your とる example was a queer one, as I thought I'd specifically explained that I have no problem with lemmatising words with multiple, commonly used kanji spellings with hiragana, if only for the sole purpose of centralisation of the spellings. The きぶとり one is a bad one, as nobody wants or should need to make the pointless effort to 'deconstruct' a term to kana just to search for it in this dictionary, when said term is spelled with kanji 99% of the time and does not even have any alternative spellings.
Your retort concerning homographs is fallacious, as just because those terms exist does not mean we have to concoct an absurd standard that puts "homophones" (to risk being viewed as pedantic, the example 汁/知る that I gave is not a homophone, as they have a different accent) on the same page for absolutely no reason. It reminds me of a child justifying a certain behaviour because their peers "do the same thing"; the circumstances must be taken into consideration.
I'll restate that I have absolutely no problem with using this standard for what it was likely originally devised for, that is, words that, at the very least, actually have alternative spellings; however, applying it to words like 着太り is undeniably futile, no matter how I see it. The Japanese language was never written in full hiragana: concede, fellow. Alves9 (talk) 08:59, 9 March 2021 (UTC)[reply]
re: "the Japanese language was never written in full hiragana": this is never something I asserted. I am fully aware that pedantic usage of a pure kana spelling is unrealistic in terms of typical writing and reflective solely of pronunciation.
re: "peers": when everyone else on en.wikt has generally stuck to the long-standing status quo? or do you refer to me using monolingual dictionaries as justification? (and to make it clear: I do believe that there is superiority in the practice of long-standing monolingual dictionaries.)
But I did not spend the good part of an hour agonizing over how to explain, condense, and rationalize my position and intuition for a smug response like "concede, fellow". You will see no more of my time used on this topic with you (supposedly not a "platform to crush your opponent"). —Suzukaze-c (talk) 10:35, 9 March 2021 (UTC)[reply]
I also didn't spend my time building a coherent argument so it could be judged solely based on the closing remark. I'd just like to get a rational explanation for why absolutely all (native Japanese) entries should be submitted using a full hiragana spelling, besides uniformity, which seems to be the strongest driving factor, but in fact means little in the context of a dictionary (some words will be written in katakana, some words will be written in hiragana, some words will be written in kanji, it is simply the reality of a language with three different writing systems). If it's so difficult to explain your position, then it may be because it is incoherent or poorly supported. Now, to be clear, I didn't mean to insult you, and I didn't anticipate that the use of such an inoffensive phrase could be interpreted as a display of smugness. Maybe it wasn't such a good idea, but I just thought this conversation needed some good humour. Alves9 (talk) 14:03, 9 March 2021 (UTC)[reply]
  • I'll note that Japanese monolingual print dictionaries are explicitly indexed by kana spelling. If you know how to pronounce a word, you can find it immediately. Due to how the back-end MediaWiki software indexes things, we have not quite been able to reproduce this ease of use: for terms with kanji in them, folks must know how to spell the kanji (and input it appropriately) before they can find the entry immediately. We thus also have kana-only and romanization-only entries, as additional means for users to find what they are looking for.
Regarding the claim that "the Japanese language was never written in full hiragana", I'll grant that it's unusual, but it's not "never". The National Institute of Japanese Literature even held an exhibit a few years ago of books written entirely in hiragana or katakana. Kana are also how children are taught to read -- so everyone in Japan starts by reading the words in kana, and then later "graduates" to reading the kanji forms.
Personally, I don't care too much about where the lemma entries live, so long as they are easily findable. The relatively-recent addition of the {{ja-see}} and {{ja-see-kango}} infrastructure has vastly improved usability for our Japanese entries. We could do even better, but I think most of us active in the JA space care more about building out the entries, than working on the infrastructure. Certain ideas that have been floated in the past would apparently require buy-in from non-JA editors, and these generally meet with opposition from the wider Wiktionary editing community.
  • One of the more ambitious such proposals was about storing entry data for terms with kanji at the kanji+kana page address as a unique identifier, and transcluding that as appropriate. As an example, there would be separate entries stored at [[水・みず]], [[水・すい]], [[水・み]], and [[水・もい]], to account for the separate mizu, sui, mi, and moi readings for the kanji [[水]], while avoiding overlap with homophones like 針孔 (mizu, eye of a needle), (mizu, freshness; luster; beauty), 美豆 (Mizu, a place name); (sui, the essence of something; the pith of something), 酸い (sui, sour, tart), (sui, sleep, sleeping)... etc. That entry data at the [[水・みず]] page could then be transcluded, or otherwise fetched as the {{ja-see}} templates do, into the pages at [[水#Japanese]] and [[みず#Japanese]] (heck, possibly even at the romanized entry at [[mizu#Japanese]] -- allowing users to immediately find Japanese entries even if they don't have a Japanese IME and can't input Japanese characters). This would more closely mimic the behavior of dedicated software for Japanese dictionaries, which can index more flexibly and accurately than the MediaWiki setup, or indeed the EN Wikt setup, currently allow.
Any such more-aggressive reorganization of the JA entry space to deal with the many-to-many relations between Japanese spellings and pronunciations (i.e. kanji and kana) has so far fallen flat, as the targeted structure is so very different from what we (Wiktionary as a whole) have for other languages. This raises concerns that other tools and infrastructure would break. I, for one, have mostly given up on trying to rework the infrastructure.
If I've worked on or created an entry, and someone else moves it to a different lemma address, I'm generally fine with that so long as 1) the new address is at least accurate (and not a typo or something similar), and 2) the overall usability isn't impeded, and users can still find things.
I care more about the entry information, so that's where I'm spending my time. ‑‑ Eiríkr Útlendi │Tala við mig 19:47, 9 March 2021 (UTC)[reply]
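The kanji+kana keying proposal described above can be sketched as a small index in Python. This is only an illustration of the data structure under stated assumptions: the tuple layout and the `build_index` helper are hypothetical, and nothing here reflects how MediaWiki or the existing templates store data.

```python
from collections import defaultdict

# (kanji spelling, kana reading, romanization), taken from the examples above.
ENTRIES = [
    ("水", "みず", "mizu"),
    ("水", "すい", "sui"),
    ("水", "み", "mi"),
    ("水", "もい", "moi"),
    ("酸い", "すい", "sui"),
]


def build_index(entries):
    """Map each kanji, kana, and romaji handle to the unique kanji+kana keys."""
    index = defaultdict(list)
    for kanji, kana, romaji in entries:
        key = f"{kanji}・{kana}"  # the unique identifier, e.g. 水・みず
        for handle in (kanji, kana, romaji):
            if key not in index[handle]:
                index[handle].append(key)
    return index
```

Looking up `水` would list all four readings, while looking up `sui` would list both `水・すい` and `酸い・すい` as separate entries rather than merging them on one page.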

Recently I've been asked to please include "Borrowed" before "from" when using the {{bor}} template, which struck me as superfluous because it should follow from the knowledge of the relationship between the two languages, which is oftentimes simply common knowledge. Of course it's not objectionable por seigh, but if it's so desirable, I thought, why isn't this automatised? - Indeed, why? said the echo inside my head. - For there's that {{m+}} that is strictly superfluous because {{cog}} exists, unless one experiences ideological hatred towards the idea of clickable language links. Anyway, thus it was that the idea of a {{bor+}} template was born, which would come with "Borrowed from" already prepended; and since I actually find spelling out "Inherited from" to be actually desirable in my actual fly-raisin separating quest, also the idea of {{inh+}}. Has there ever existed a more splendid idea than this? Feedback sought. Maybe you'd prefer to include a simple parameter to add the same text, which begs the question: why {{m+}}?.. Brutal Russian (talk) 20:11, 6 March 2021 (UTC)[reply]

I agree that writing "borrowed" before "from" is superfluous (but not harmful), unless the borrowing is from a source that the receiver language is descended from, e.g. English from Old English, or any Romance language from Latin, in which case saying "Borrowed from" is probably a good idea. I see no need for {{inh+}} and {{bor+}}; indeed {{bor}} itself did include the text "Borrowed from" many years ago until we decided to scrap it with the vote at Wiktionary:Votes/2017-06/borrowing, borrowed. {{m+}} is not actually redundant to {{cog}}, because {{m+}} can be used when identifying words that are not cognates as well (i.e. it covers {{noncog}} as well as {{cog}}). I usually use {{m+}} when a language is mentioned for the second time (or more) in an etymology section, to avoid excessive linking. —Mahāgaja · talk 20:37, 6 March 2021 (UTC)[reply]
I totally get why scrap the text from the default template; but wouldn't having an optional template that includes the text bring nothing but piece and prose parity to the website? You just add a plus and voi lah. Brutal Russian (talk) 22:21, 6 March 2021 (UTC)[reply]
  • My bias: I find myself looking askance at arguments that rely on "common knowledge". I fear that, too often in projects like ours that rely on a certain degree of specialist knowledge from the creators, the creators forget that the users exist in a different context, often without that specialist knowledge.
I note here that the EN Wiktionary can only assume that our readers have sufficient fluency in reading English to read our pages. Assuming more than that may not be accurate, or even fair.
Including the "borrowed from" is trivial, and it makes it explicitly clear what the derivational relationship is. We can be even clearer by linking the word "borrowed" through to [[Appendix:Glossary#loanword]], as the {{bor}} template formerly did prior to that 2017 vote. The passing Proposal 1 of that vote even advocated for adding the "borrowed from", just adding it manually outside of the template.
If all we say is "from", we leave things less clear than is ideal for a dictionary. Did the term come via some other language, or directly? Was it a local coinage to mirror that other term, resulting in a word so close to the source that for some reason it isn't quite considered a calque (and assuming that our readers are even aware of what calques are)? Was there some other kind of process at work?
Similarly, I always specify if a compound is a compound. My main language area here is Japanese, which doesn't use whitespace. So if we don't specify that the term is a compound, and just say "from TERM + TERM", the reader might plausibly think that this might be a looser collocation of independent terms, and that these components could be used freely and productively in other combinations.
As we spell out at WT:WWIN, this isn't a paper dictionary. Yes, we should be brief, but there's no good reason to omit pertinent and relevant information that helps describe the term in useful ways to our readers. ‑‑ Eiríkr Útlendi │Tala við mig 23:04, 6 March 2021 (UTC)[reply]
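For anyone trying to picture the difference between the plain and "+" variants discussed in this thread, here is a hedged Python sketch. It only models the text output; the function `render` and its arguments are invented for illustration, and the real templates are wikitext/Lua that also handle links and categories.

```python
# Lead-in text each proposed "+" variant would add (per the proposal above).
PREFIXES = {"bor": "Borrowed from", "inh": "Inherited from"}


def render(template: str, source_language: str, term: str) -> str:
    """Model the visible text of {{bor}}/{{inh}} and the proposed '+' variants."""
    base = template.rstrip("+")
    if base not in PREFIXES:
        raise ValueError(f"unsupported template: {template}")
    core = f"{source_language} {term}"
    if template.endswith("+"):
        # The "+" variant supplies the lead-in text itself.
        return f"{PREFIXES[base]} {core}"
    # The plain variant leaves any lead-in text to the editor.
    return core
```

So an editor would write either "Borrowed from" followed by the plain template, or the "+" variant on its own, and the reader would see the same sentence.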

Japanese Dialectal Synonyms Module

@Eirikr, @Whym, @Atitarev Hi! I'm tagging Japanese editors that I know, feel free to tag others. I'm not sure if this has already been suggested before, but I feel like Japanese needs a dialectal synonym module similar to what Chinese has done. It's an opportunity to be able to more systematically include dialectal Japanese entries into Wiktionary, as well as clarify which terms are used where. I noticed sometimes that there are very literary Japanese words, that I see are colloquially used in certain dialects. So info like that would be easier to look at if we had a Japanese dialectal synonyms module. I'm happy to read your thoughts. Thanks! --Mar vin kaiser (talk) 08:31, 7 March 2021 (UTC)[reply]

Indeed, there seem to be a lot of dialectal terms being added recently. But I'm afraid the Japanese language is not nearly as linguistically diverse as the said Chinese 'dialects' are, nor are there as many different varieties contained within it, so it likely would be a very situational template. Alves9 (talk) 08:52, 7 March 2021 (UTC)[reply]
  • Agreed that something like this would probably be useful. Also agreed that Japanese lects are probably not as many nor as divergent as the various Chinese lects.
That said, Japanese dialectal terms often display differences in conjugation that cannot be clearly displayed using our standard Japanese templates and modules. In some cases, a Japanese dialect may have distinctions of tense or aspect completely missing from the national standard, such as Tosa dialect with its separate verb endings for "the verb has happened and the result is the ongoing state" as opposed to "the verb is happening continuously or repeatedly as an ongoing action". This distinction exists also in Korean, interestingly, but in standard mainstream Japanese, both are expressed using the -te iru ending, with any sense distinction having to rely on context.
I have very little personal expertise or experience at present with dialectal Japanese of any flavor, aside from a little experience with the everyday speech of prefectural capital Morioka, exhibiting some influence from Tōhoku dialect, and some passing familiarity with Kansai dialect from friends from that area. I have thus avoided working on Japanese dialect entries, aside from infrastructure issues like formatting and entry organization.
I would be interested to see more coverage of Japanese dialects. I am hopeful that a more supportive template and module setup might help in that regard. ‑‑ Eiríkr Útlendi │Tala við mig 21:44, 7 March 2021 (UTC)[reply]
I don't think it's necessary that the Japanese language be as linguistically diverse to have a module like this. As long as we can have a systematic way to show that, for example, for the term "father", they use this term here and use this term there, it would be great if it were in a chart format. I also noticed that, since Japanese uses kanji and a kanji can have several pronunciations, different dialects sometimes use the same kanji for a certain object while the pronunciations vary significantly. That's kinda something that would be cool to explore with a dialectal synonym module. And given various sources online on Japanese dialects, references won't be a problem. --Mar vin kaiser (talk) 00:59, 8 March 2021 (UTC)[reply]

@Mar vin kaiser I've made a language-agnostic {{dial syn}} that just needs location data. —Suzukaze-c (talk) 06:25, 8 March 2021 (UTC)[reply]

@Suzukaze-c: Cool! I see the samples in Korean and Arabic. Has this already been used in any existing entry (except for Chinese, of course)? --Mar vin kaiser (talk) 08:13, 8 March 2021 (UTC)[reply]
@Mar vin kaiser It's in use for Korean (and not for Chinese, which currently uses Module:zh-dial-syn; Module:dialect synonyms is a rewrite.) —Suzukaze-c (talk) 08:14, 8 March 2021 (UTC)[reply]
@Suzukaze-c: Thanks! I'll try to work on one for Japanese. --Mar vin kaiser (talk) 08:56, 8 March 2021 (UTC)[reply]
@Mar vin kaiser BTW, what are you using for sources? Jlect (User_talk:Kwékwlos#Japonic_dialects)? —Suzukaze-c (talk) 08:58, 8 March 2021 (UTC)[reply]
@Suzukaze-c: Yeah, I know about that, but what was in my mind was the published resource, "日本方言辞典". --Mar vin kaiser (talk) 09:04, 8 March 2021 (UTC)[reply]
@Mar vin kaiser: Cool. I think it is best to add references to an entry for accountability (especially since we have other entries copied from Jlect for some reason, which is probably questionable). —Suzukaze-c (talk) 09:15, 8 March 2021 (UTC)[reply]
@荒巻モロゾフ —Suzukaze-c (talk) 10:34, 8 March 2021 (UTC)[reply]
As for sources, [1] seems like a good online resource on dialectal kinship terminology in Japan. Since dialectal words are primarily spoken, I believe it usually makes sense to describe them in kana, unless other written (kanji) forms are established as such in literature. That is how most sources document them, too. When they describe the meaning of オヤッサン as 親爺さん, they are not necessarily claiming 親爺さん is the standard written form of the dialectal word, although the two might well be related. Whym (talk) 12:08, 8 March 2021 (UTC)[reply]
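To illustrate the kind of "location data" and chart output discussed in this thread, here is a minimal Python sketch. It is not the {{dial syn}} module (which is Lua), the function `render_chart` is invented for the example, and the region names used with it are placeholders rather than real attestations.

```python
def render_chart(concept: str, data: dict) -> str:
    """Render a plain-text chart of dialectal terms, one location per row."""
    lines = [f"Dialectal synonyms of {concept}"]
    # Pad location names so the term column lines up.
    width = max((len(location) for location in data), default=0)
    for location in sorted(data):
        terms = ", ".join(data[location])
        lines.append(f"{location.ljust(width)} | {terms}")
    return "\n".join(lines)
```

A call such as `render_chart("father", {"Region A": ["オヤッサン"]})` would give a one-row chart; terms are kept in kana here, in line with the point above that dialectal words are mostly documented that way. Each row would still need a cited source before going into an entry.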

Reconstructions in Latin script

I'd like to get a discussion going about adding a guideline to WT:PROTO that states that all reconstructions should be in Latin script. Most already are, but here's a list of the ones that buck that standard: Gothic (52), Ashokan Prakrit (15), Old Armenian (9), Old Korean (6), Proto-Norse (5), Pictish (4), Sanskrit (4), Primitive Irish (3), Hittite (2), Old Korean (2). @Mnemosientje, AryamanA, Vahagn Petrosyan, Tom 144 --{{victar|talk}} 09:37, 7 March 2021 (UTC)[reply]

I agree for languages with an inconsistent orthography (e.g. cuneiform), especially if they can be written in multiple scripts (e.g. Middle Persian). Strongly disagree for languages with a well-established orthography written in one script, such as Ancient Greek, Old Armenian, Old Georgian. --Vahag (talk) 09:52, 7 March 2021 (UTC)[reply]
Same as Vahag, would be weird to have Gothic reconstructions lemmatized with a Latin script entry title while the rest of the entry titles are Gothic script. Same for Proto-Norse, I guess. Reconstructed languages (e.g. PIE) should obviously be lemmatized at Latin script entry titles by default, but corpus languages with their own script in which some terms can be reconstructed (e.g. Gothic) are another story altogether, and the two should not be confused. — Mnemosientje (t · c) 10:32, 7 March 2021 (UTC)[reply]
I feel like that depends a bit on the language. Having reconstructed Sanskrit terms in Devanagari seems perfectly natural to me, reconstructed Gothic terms in Gothic script somewhat pedantic, and reconstructed Primitive Irish terms in Ogham downright perverse. I guess it depends on the extent to which the native script is actually used in the historical linguistics literature: Sanskrit is very often given in Devanagari, Gothic rarely in Gothic script, and Old Primitive Irish absolutely never in Ogham. —Mahāgaja · talk 11:10, 7 March 2021 (UTC)[reply]
I suppose that's a broader discussion that can be had. But as long as the regular, attested Gothic entries are at the Gothic script spelling, using the Latin script for the handful of reconstructions we have seems odd. — Mnemosientje (t · c) 11:16, 7 March 2021 (UTC)[reply]
If we're going by academia, reconstructions will almost always be in Latin script, which also goes for Sanskrit and Avestan. Seeing *लुट्टति (luṭṭati) is rather weird to my eyes. I probably have the most sympathy for Old Armenian, since it isn't too different from modern standard Armenian. I however have no sympathy for the dead language Gothic because you would be hard-pressed to find any work that reconstructs Gothic in Gothic script. --{{victar|talk}} 21:22, 7 March 2021 (UTC)[reply]
Beyond just reconstructions you'd be hard-pressed to find a work that reproduces any actually attested Gothic in Gothic script. Yet we use Gothic script for the entries. Whether that's desirable or not is a larger discussion but again, imo it's odd to make an exception for reconstructions on that basis as the same reasoning also applies to regular, attested terms where we so far have opted to use the Gothic script and not the Latin script for entry titles. — Mnemosientje (t · c) 22:42, 7 March 2021 (UTC)[reply]
One issue to consider is that the romanization, if used for this purpose, must not lose information. (In mathematical terminology, it must be an injection.) This may require tagging the constituents of romanized reconstructions with all kinds of diacritics and indices. Also, the phonetic values of various graphemes may be uncertain; should we pile such uncertainty on top of the inherent uncertainty of reconstructions?  --Lambiam 15:53, 7 March 2021 (UTC)[reply]
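The injectivity requirement can be checked mechanically. What follows is only a minimal sketch in Python, assuming a romanization is available as a function from native-script strings to Latin-script strings; the toy table and the name `find_collisions` are hypothetical and do not correspond to any real transliteration module. It groups native forms by their romanization and reports every romanized form that more than one native form maps to, since each such group is a place where information is lost.

```python
from collections import defaultdict


def find_collisions(romanize, native_forms):
    """Return {romanized form: [native forms]} for every romanization
    shared by more than one native form (i.e. where the mapping is
    not injective on the given word list)."""
    by_roman = defaultdict(set)
    for form in native_forms:
        by_roman[romanize(form)].add(form)
    return {r: sorted(forms) for r, forms in by_roman.items() if len(forms) > 1}


# Hypothetical toy scheme: two distinct graphemes, C and K, both become "k".
TOY_TABLE = {"A": "a", "B": "b", "C": "k", "K": "k"}


def toy_romanize(word):
    """Romanize a word grapheme by grapheme using the toy table."""
    return "".join(TOY_TABLE[ch] for ch in word)


collisions = find_collisions(toy_romanize, ["AB", "CA", "KA"])
print(collisions)  # {'ka': ['CA', 'KA']}
```

An empty result only shows that the scheme is injective on the word list that was tested, not in general. Removing a collision means adding a diacritic or index to one of the clashing graphemes, which is exactly the tagging cost described above.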
Obviously in Hebrew and Arabic works Hebrew and Arabic words are reconstructed in vocalized Hebrew and Arabic script. Just do the natural: If a script is successful, a script that is continued to be used because being able to represent languages well at least with expansion, then use that. Hence don’t reconstruct in cuneiform or any of those scripts Middle Persian actually used because that is generally fanciful and arbitrary and as difficult and ambiguous does not get to the point to the extent that we desire. But Devanagari and Cyrillic work very well. Like one can read and write languages in it like natively, without understanding why there would be a different script. Aye, I like this dictionary because its aim of everything being in the script it should be written. You know the ting, there was an argument somewhere that now in the 21st century one should represent writing correctly while the “standard” in academia was wrong and now one is sluggish enough to continue it. Then again, there may be cases where one deliberately uses scripts which are defective in some respect and writing them in Latin would add something and make a statement that is not desired to be reconstructed.
See, knowledge is not by default in Latin script. Somewhere I argued by hyperbole that it even starts only when one has forsaken one’s native world. I don’t find this argument, but at least using the reconstruction namespaces in foreign scripts is consistent with our former votes of not letting encroach Latin transcriptions too much, Wiktionary:Votes/2019-05/Lemmatizing Akkadian words in their transliteration, Wiktionary:Votes/pl-2018-12/Allowing attested romanizations of Sanskrit. It’s actually writing in Latin that needs a justification, not the other way round, that’s the point. With cuneiform you have that justification obviously but as we have recognized not that much when there are attestations you can encode, so here lies the difference.
True that it is inconsistent if some languages use native scripts and others not and some only when attested, but then we have at least consistency in some languages, e.g. Ashokan Prakrit is always in native script, whether found by attestation or conjecture; this makes sense in isolation, and if it makes sense in isolation it can’t be wrong. And of this can’t-be-wrong we have in total, proceeding so, more than if we always reconstruct in Latin script, because if we always reconstruct in Latin script then only those languages natively written in Latin have this consistency. Fay Freak (talk) 19:54, 8 March 2021 (UTC)[reply]

Borrowings vs. code-switching


This RFV discussion has halted because of disagreement about what constitutes a borrowing, and so it was suggested to take it here.__Gamren (talk) 09:34, 8 March 2021 (UTC)[reply]

  • If a Latin phrase is used in several languages with similar meaning, and is not treated as a part of speech, we should include it as a Latin phrase only. Vox Sciurorum (talk) 15:27, 8 March 2021 (UTC)[reply]
  • Several dictionaries, old and modern, classify the phrase as Latin: [2], [3], [4]. Authors sometimes immediately supply a translation,[5] or set off the expression by placing in between quotation marks[6] or putting it in italics,[7] or sometimes all three at once.[8] We have a guideline on code-switching that mentions these as clues that something is code-switching rather than borrowing. (Disclosure: I was a major contributor to that page.)  --Lambiam 17:14, 8 March 2021 (UTC)[reply]
  • “Treating as a part of speech” is nebulous. If that were so, we could have ius cogens as Russian, Ukrainian, Bulgarian, Macedonian (search term "ius cogens" "право")—it even has preceding adjective-qualifiers—, as well as lex rei sitae and what not. Editors in these languages would be reluctant to agree with that 😧. And a minore ad maius, the fact that Turkish and Azerbaijani are written in Roman script does not make there such terms more acceptable than in Kazakh or Kyrgyz. Accordingly, in the RFD to “Danish” de gustibus non est disputandum (Talk:de gustibus non est disputandum) Lambiam found it silly to have petitio principii as Turkish. The way one can bracket all these foreign terms, whether as longer phrases or phrases or smaller parts of speech, cannot be decisive – it’s the sorites paradox and abuse of lexicographic categories designed for distinctions within one individual language (it does not work that way: “it’s a noun, boom, it’s Macedonian!” Sorry @Vox Sciurorum, one needs something wittier). And again borrowings vs. code-switching is a false dichotomy. This isn’t the formula either. I have expounded the problem and solution longer in Wiktionary:Beer parlour/2021/February § Translingual. Fay Freak (talk) 15:20, 9 March 2021 (UTC)[reply]
  • Longer Latin phrases should certainly be considered Latin. Shorter phrases have distinct pronunciation by language, distinct grammatical usages, and simply are used in a select set of languages--far from complete, and not even in the complete set of languages that sometimes use Latin in this manner. i.e. has a large English entry, and we lack any data on its use in most other languages. Where is ius cogens and jus cogens used, and which are used in what languages? If they're actually spoken in courtrooms, I'm sure they have conventionalized pronunciations that may have little to do with Latin and differ vastly between Russian courtrooms and English courtrooms.--Prosfilaes (talk) 03:07, 18 March 2021 (UTC)[reply]

What is Wiktionary needing the most?


Wiktionary is not yet complete, but in your opinion, which is more important for Wiktionary? Creating English entries or creating non-English entries? — This unsigned comment was added by BrightSunMan (talk • contribs) at 09:14, 9 March 2021 (UTC).[reply]

  • Both are equally important. We aim to include ALL words in ALL languages. English is more or less complete except for many technical or rare words. SemperBlotto (talk) 09:41, 9 March 2021 (UTC)[reply]
    Many technical or rare words, although used, are not found in many dictionaries. We should aim to include also those. A sample of missing words starting with stra: strabismometry; strackling; straddlewise; strainproof; straitlacing; Straitsman; stramineously; strandage; strangerdom; strangership; strangulable; strangullion; strappable; stratagematical; straticulation; stratospherical; strawbreadth; strayaway.  --Lambiam 13:54, 9 March 2021 (UTC)[reply]
    I would disagree with this. We're missing a lot of English idioms and slang, both old and new. I've been reading a lot of late-19th c. fiction recently, and every couple of pages I'll come across an idiom I don't recognize which we lack an entry for (e.g. God save the mark, Sunday girl, give vent to). We also lack entries for a lot of modern slang/idioms that are in wide use, at least in the niches that I edit in (the heavily overlapping circles of gay slang, drag slang, 'stan' slang, and AAVE). Recording recent slang in particular seems like something that could deliver a lot of value to a lot of readers. One of the virtues that has allowed Wikipedia to thrive and out-compete its fusty competitors was its agility - the fact that it can rapidly incorporate new information - and its willingness to cover topics that wouldn't ordinarily be covered in a paper encyclopedia (particularly in the area of pop culture). Wiktionary seems well-poised to gain a similar edge over traditional dictionaries. One thing that would help a lot here would be explicitly updating our "permanently recorded" criteria to incorporate online sources beyond UseNet (there's nothing at WT:ATTEST that explicitly excludes such sources, but it seems a lot of editors exclude them from their personal interpretation of the phrases "permanently recorded"/"durably archived"). I recently read a news article that described Urban Dictionary as "the go-to site for researching slang terms in the English language" and that made me very, very sad. Colin M (talk) 20:12, 11 March 2021 (UTC)[reply]
Urban Dictionary is also a cesspit of any old rubbish someone thought was funny at the time. I've seen entries there that are basically some poor rendition of yo mama jokes. Sure, they don't restrict entries to terms that have demonstrable track records, so it's great for rapidly evolving slang -- if you can sort out the actually-used from the so-niche-only-a-handful-of-people-have-any-idea-what-this-is.
I don't think you should be sad.  :) The fault is that "reporter's", not ours. (Intentional quoting since I have a low opinion of their research abilities.) ‑‑ Eiríkr Útlendi │Tala við mig 23:03, 11 March 2021 (UTC)[reply]
I agree with your assessment of UD's editorial practices, and that's precisely why it makes me so sad that it should be regarded as the best source on the net for modern slang.
I also agree with you that the site in which that particular story was published is fish wrap, but if I search the New York Times archives, I get 129 results for "Urban Dictionary" and only 12 for "Wiktionary". So even respectable publications seem to cite UD a lot more than us.
Maybe some of that is for uninteresting reasons like Urban Dictionary having better SEO or marketing, but I think part of the reason is that there is a lot of legitimate slang (attestable across many independent sources over a period longer than a year) that UD covers (even if badly) which we don't. And I think it's possible to tweak the culture here in a way that will help us close that gap without giving up our fundamental principles around attestation. Colin M (talk) 00:09, 12 March 2021 (UTC)[reply]
I'm also puzzled by this. Maybe it's because UD is not only documenting, but also shaping slang? (Social features like rating/sharing etc. Guessing here). –Jberkel 00:34, 12 March 2021 (UTC)[reply]
I'm not so sure that all of those 129 mentions of UD in the NYT are cases of citing UD, so much as just writing about UD? It's certainly flashier, and more controversial. I wonder if that might be some of why UD is mentioned more -- it's not proper lexicography, it's more of a social-media metadiscussion about current linguistic fads and bad jokes. That inherently makes UD more interesting to media outlets like the NYT: Drama!
By contrast, we generally avoid Drama (intentional capital "D"); some of us gravitated to Wiktionary away from Wikipedia due to the Drama there and lack of it here. ‑‑ Eiríkr Útlendi │Tala við mig 00:41, 12 March 2021 (UTC)[reply]
  • Non-English. The easiest measure of how good this dictionary is is: when someone looks up a word, does he get the definition (etymology, pronunciation, etc.) for that word? And the quality of English entries is greater than non-English. Additionally, there are many, many more possible entries in non-English than English. —Justin (koavf)TCM 10:13, 9 March 2021 (UTC)[reply]
Reformatting entries so that there isn't a huge ton of etymology and pronunciation before the meaning. This is what we call a "quick win" in business and our users would love us. Also probably good for SEO, fuck SEO, but we are all slaves to Google. Your English-non-English dichotomy, BrightSunMan, isn't the real problem. Equinox 10:14, 9 March 2021 (UTC)[reply]
Sure but his question still has merit: if we had to choose between a dictionary in English that is complete for the English language but has no foreign terms versus a dictionary in English that is complete with all foreign terms but has no English ones, I think it is pretty clear that the latter is a much greater achievement. —Justin (koavf)TCM 10:17, 9 March 2021 (UTC)[reply]
I reject this dichotomy. Our project's stated goal is "all words in all languages." We seek neither to create a complete dictionary of all English terms and no non-English, nor a complete dictionary of all non-English terms and no English. We seek to document all language, with English as the medium of description.
Also, inasmuch as this is a volunteer project, and we all work on what we freely choose to work on, I'm not sure what the OP's goal is in asking this question. I do very little in English entries, by my own choice. If the Wiktionary community were to go all-in on focusing exclusively on English entries, I would spend a lot less time here. ‑‑ Eiríkr Útlendi │Tala við mig 19:05, 9 March 2021 (UTC)[reply]
Incidentally, BrightSunMan, do you realise there are separate Wiktionaries for different languages, like e.g. fr.wiktionary.org will take you to a French Wiktionary (with its own culture and rules)? It's not fair to split everything between X and non-X. Look at the reaction if you talk about "non-white people" etc. Equinox 10:15, 9 March 2021 (UTC)[reply]
That seems needlessly racially charged and also a very weird way of implying that a term like "person of color" is somehow inappropriate. I think you can make your point in a far more intelligible and inviting way. —Justin (koavf)TCM 10:18, 9 March 2021 (UTC)[reply]
I stand by my accurate and non-racist remark and I don't know what is wrong with you. Equinox 10:23, 9 March 2021 (UTC)[reply]
I don’t get what looking at the reaction to a racial characterization has to do with anything, including the question being posed here. Obviously, “English” versus “non-English” is the same dichotomy as we make between Wiktionary:Requests for deletion/English and Wiktionary:Requests for deletion/Non-English, which, as far as I am aware, has not elicited any reactions that this is an unfair split.  --Lambiam 13:54, 9 March 2021 (UTC)[reply]
No one wrote that your comment was racist. What I wrote was that your analogy is needlessly inflammatory and extreme. He posed a very mundane question to just assess our attitudes on this project which is nowhere near as controversial as the reaction if you talk about "non-white people". —Justin (koavf)TCM 18:31, 9 March 2021 (UTC)[reply]

It bears noting: Wiktionary will never be complete. ‑‑ Eiríkr Útlendi │Tala við mig 18:57, 9 March 2021 (UTC)[reply]

  • I think we need more emojis. As well as that, more quotes from sports journalism, more definitions written in old-fashioned language imported straight from 100-year-old dictionaries, more accusations of racism, and more FUN. Oxlade2000 (talk)
  • As a language learner, these are of primary importance to me: IPA, hyphenation, examples, conjugation/declension. Wiktionary, in my opinion, blows everybody out of the water where pronunciation is concerned. We've done a great job of documenting the IPA transcription of words. However, others have done a better job of providing audio recordings. What do I think we need? I think we need to slow down a bit ... We need to spend less time expanding the dictionary with sparse entries (headword and definition entries) for words which are rarely used. We could focus more energy on expanding all the good entries we already have. It's great to have lots of obscure words here, but it would be nice to have more IPA, hyphenation, examples, conjugation/declension in the entries we already have. — Dentonius 08:51, 12 March 2021 (UTC)[reply]
I fully agree with @Dentonius's comments here. Audio is great, but should not be treated as a substitution for IPA, and vice versa.
  • In favor of IPA: In various environments, a user might not be able to access the audio, or if deaf or otherwise hearing-impaired, might not be able to use the audio.
  • In favor of audio: Users might not be familiar enough with IPA to be able to understand the phonetic notation. There is definitely a learning curve for IPA. Meanwhile, audio (for those who can access and hear) is immediately usable, and is how (hearing) humans have been learning language since before we were humans.
Also agreed that stub entries with just a headword, a POS header, and a gloss, are inadequate. It's a bit better than nothing (so long as it's correct), but it's certainly not ideal: much better to provide fuller information to help a reader understand much more about the term -- how it's used, how it isn't used, how it's pronounced, where it came from and what it's related to, any common collocations, etc. etc.
I've been doing my best to build out the Japanese entries with the above all in mind. I've also chipped in some in entries for other languages I've been studying, such as by adding {{rfe}}, {{rfp}}, and adding in glosses where I can to improve the usability of the pages.
Looked at another way, I think Wiktionary has achieved a certain useful degree of breadth -- we have entries for a lot of different terms, in a lot of different languages. But we lack depth in many places -- there is a paucity of useful detail in too many of our entries. I would love to see our content offering enriched, and I work towards that end where and when I can. ‑‑ Eiríkr Útlendi │Tala við mig 00:06, 13 March 2021 (UTC)[reply]
+++DEPTH DCDuring (talk) 01:24, 13 March 2021 (UTC)[reply]
+1 for more examples. As a side note, I'm surprised it's not expected as a matter of policy that newly created entries have at least one quotation. It's not a lot of extra work, it would help readers, and would greatly reduce the volume of requests at RfV. It would make it harder to rapidly create new entries using automated tools, but in most cases I would regard that as a feature rather than a bug. Colin M (talk) 20:37, 13 March 2021 (UTC)[reply]
  • Just having a few audio recordings for some languages (e.g. Zulu) would be awesome, as it's very hard to interpret IPA without some examples to go on. Troll Control (talk)
  • This is a very fun discussion topic. I would say that Wiktionary just needs more editors. As the quality of Wiktionary increases, I expect that more people will be attracted to the project. I have also attempted to build and strengthen the links between Wikipedia and Wiktionary so that more people in my subject matter of editing will realize that Wiktionary exists and then come and edit over here. --Geographyinitiative (talk) 01:21, 14 March 2021 (UTC)[reply]
  • More quotations. A collected corpus of attestations in the wild is the foundation of any lexicographical project that isn’t just reproducing previous efforts. When we have doubts about whether definitions are distinct, what exactly they’re referring to, and so on, the illustrating quotations should be the first point of reference. Similarly, if we want to have information about dates of attestation, details of usage and register, and so on, that aren’t just copied out of the OED, we need to have a clear corpus of quotations to draw on. This is even more important for less documented languages, for which existing materials and descriptions may be highly inadequate. Unfortunately (but understandably) it’s exactly in those languages that we’re most lacking. — Vorziblix (talk · contribs) 17:44, 16 March 2021 (UTC)[reply]

Wiktionary is always looking for attestations of words. And so, it will never be completed either. 119.56.97.150 13:33, 31 March 2021 (UTC)[reply]

Sunda-Sulawesi Language Group


For a while I've noticed @Austronesier systematically removing the proto-language for this group from etymologies. Just now I discovered that they've replaced all the entries for Proto-Sunda-Sulawesi languages with {{delete}} tags.

While I agree that this language group is probably utter nonsense (I would have intervened immediately, otherwise), we can't have individuals unilaterally deciding whether an entire language or language group should exist on Wiktionary or not. Nor should we have people nominating entries for speedy deletion without extremely solid consensus that such entries shouldn't be allowed.

I'm going to give Austronesier the benefit of the doubt as someone relatively inexperienced with Wiktionary's rules and policies, and I'm going to start the necessary discussion here. To get the ball rolling, here's the deletion reason given in one of the delete tags:

The Sunda-Sulawesi subgroup is a spurious artefact from Wikipedia, cf. w:Sunda–Sulawesi languages. No scholar has ever proposed a subgroup under such a name or with this scope. All reconstructions were created ad hoc by a Wiktionary user based on the spurious WP subgrouping scheme.

Chuck Entz (talk) 07:22, 10 March 2021 (UTC)[reply]

@Chuck Entz: Thank you for the ping and initiating the discussion. I apologize if the addition of the delete-tags was overly bold. My main reason for requesting a speedy was not driven by the fact that the reconstructions were home-made (I am aware that Wiktionary gives room for well-argued etymologies and reconstructions even if they have no explicit scholarly source), but by the close-to-"hoax"-like nature of the entity these are related to. Before that I had boldly edited away these reconstructions from etymologies (including redlinks) in numerous entries. Unilaterally, yes, but ready to discuss if there's any disagreement about it.
We dug into the history of "Sunda-Sulawesi" as a WP artefact two years ago, details can be found at the WP link. In a nutshell: "Sunda-Sulawesi" is the reified complement set of all w:Nuclear Malayo-Polynesian languages (a low-impact, little-cited subgrouping proposal) which do not belong to the w:Central–Eastern Malayo-Polynesian languages (an oft-cited, but still controversial subgrouping). The term "Sunda-Sulawesi" does not appear in scholarly literature on Austronesian subgrouping (unless copied from Wikipedia, as happened in rare cases: Hans Henrich Hock(!) obviously took his background info for Palauan from WP in this paper), nor were we able to find a blog or similar source where the complement set to the Central–Eastern Malayo-Polynesian languages was assigned into a subgroup, either of that name or any other. Since "Sunda-Sulawesi" is unattested outside of "WP-generated" content and an artefact of Wikipedia, "Proto-Sunda-Sulawesi" reconstructions are essentially in-house artefacts of Wiktionary. As such, I believe it makes little sense to keep them.
If the community decides that these entries still have their merit, so be it. But in this case I would at least propose that Sunda-Sulawesi should be removed as a clade from the subgrouping hierarchy, because it still appears e.g. in the category hierarchy. I'm sure nobody wants to have the equivalent of Category: Arabic lemmas derived from Celto-Tocharian languages (supposing that no-one ever has come up with "Celto-Tocharian"; and even if, well...) here. –Austronesier (talk) 08:39, 10 March 2021 (UTC)[reply]
The thread at Wiktionary:Requests for deletion/Others#Category:Sunda-Sulawesi languages and Category:Borneo-Philippines languages is also relevant to this discussion. @TagaSanPedroAko, Tropylium participated in that discussion (which petered out without coming to any conclusion, but also without anyone expressing a desire to keep Sunda-Sulawesi). —Mahāgaja · talk 14:37, 10 March 2021 (UTC)[reply]
Since you ask, sure, I'm in favor of getting rid of Sunda-Sulawesi categories (I even still hold my opinion from 2015 that also some valid subfamilies would be better off as not being modelled as their own categories on Wiktionary). No opinion on procedural soundness, though FWIW principles held at Wikipedia like BOLD and IAR would seem to approve.
For the future though, a more relevant question to ask might be if we have any other such cases lurking around. In the wake of some classification problems being discussed in the recent years (both just at Wikimedia and in scholarly literature), I've been thinking that it might be worthwhile to comb thru WP's language (sub)group articles to see which of them have any of (1) sources, (2) extensive definitions and (3) characteristics given. Anything missing at least two of these should probably be treated with suspicion. Throwing in also WT's categories and checking if and how they might mismatch with WP would probably be a natural follow-up. --Tropylium (talk) 23:22, 10 March 2021 (UTC)[reply]
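The screening rule suggested above is simple enough to sketch in code. This is only an illustration in Python, assuming someone has already recorded by hand, for each Wikipedia subgroup article, whether it gives sources, an extensive definition, and characteristics; the class name, field names, and placeholder entries are all hypothetical, and no real article is assessed here.

```python
from dataclasses import dataclass


@dataclass
class SubgroupArticle:
    """Hand-recorded audit of one language (sub)group article."""
    name: str
    has_sources: bool
    has_definition: bool
    has_characteristics: bool


def missing_count(article):
    """Count how many of the three criteria the article lacks."""
    checks = (article.has_sources, article.has_definition, article.has_characteristics)
    return sum(1 for ok in checks if not ok)


def is_suspicious(article):
    """Apply the rule from the comment above: missing at least two of three."""
    return missing_count(article) >= 2


# Placeholder data only, not an assessment of any real subgroup.
articles = [
    SubgroupArticle("Placeholder group A", True, True, True),
    SubgroupArticle("Placeholder group B", True, False, False),
    SubgroupArticle("Placeholder group C", False, True, True),
]

flagged = [a.name for a in articles if is_suspicious(a)]
print(flagged)  # ['Placeholder group B']
```

A flag here is only a prompt for a human to look at the article. Comparing the flagged names against WT's own category tree would be the follow-up step mentioned above.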
Thank you for the links to the earlier discussion. Apart from the rather unique cases of "Sunda-Sulawesi" and "Borneo-Philippines", I agree with Tropylium that categorizing shouldn't be overly fine-grained, and should only contain uncontroversial subgroups. The mid-level subgrouping of the Malayo-Polynesian languages is still debated, so it is advisable to have an agnostic rake structure here. "Malayo-Sumbawan" has been silently abandoned by Adelaar so we shouldn't use it (FWIW, I have left them untouched in my "Sunda-Sulawesi" cleanup). Follow-up proposals regarding the languages of western ISEA like Blust/Smith's "Greater North Borneo" and "Western Indonesian" (which haven't found their way into WT yet) are clearly not the last word about the matter (especially the latter is very tentative). Blust's "Central-Eastern Malayo-Polynesian" and "Central Malayo-Polynesian" are also doubtful, but most sources accept them at least as preliminary handy bookkeeping entities. To the defense of Blust (in reply to comments by Chuck Entz in the cited threads): he flirted on one occasion with computational phylogenetics when the ABVD was built up, but his actual work on subgrouping and reconstruction is deeply rooted in the comparative method–as he never fails to bombastically mention. Major flaws of the ACD are the sloppy treatment of central and front mid vowels (which are mostly uniformly spelled "e" for many languages); his perseverant use of the label "Western Malayo-Polynesian" when nobody believes that such a subgroup exists (resulting in 3,620 reconstructions of unclear status as either PMP or some level below it); and quite a number of bad entries for bound morphemes. Apart from that, it is the best collection of reconstructions we have, and a real gold mine of lexical correspondences.
@Tropylium: You should take your idea to sift through WP for problematic subgroups to the WikiProjects Languages and Linguistics, I fully support it. There's definitely some work to do; WP has articles e.g. about subgroups proposed by Blench, even though many of them have rapid decay rates with half-lives of 2-5 years (I don't know if anyone has mapped any of these into the structure of WT). –Austronesier (talk) 10:56, 11 March 2021 (UTC)[reply]

Latin pronunciation notes templatificated


So better today than next year, I'm looking for suggestions in creating a template that would absolve me from manually writing all those pronunciation notes in Latin entries. I'm wondering what phenomena you think are worth templating - and having useful categories for - as well as how many different templates this should be done with. Here's an interim list:

  • unknown/unattested vowel length;
  • uncertain/conflicting (inscriptionally attested) vowel length;
  • metrically attested, but variable (such as muta cum liquida);
  • metrical licenses (Ītalia), Grecisms (-īa);
  • tower of babel: late/medieval, cited length being best guess based on etymon, but could have been anything;
  • modern - recent/New Latin borrowing, e.g. from English, where there's at least phonemic vowel length; or from Russian etc, where stress is identified with length in vowel length-enabled languages;
  • other stuff a learner concerned with correct pronunciation will want to browse/watch out for, such as vocalic /i/ in Grecisms like ïūlus, or long vowels/diphthongs before another vowel (Dīāna, dīus, fīat, Gnaeus; the pronominal -īus), or unwritten double consonants (mīles(s), hoc(c));
  • maybe attested syncopated forms like assecla, calda...

I'm thinking of having one template with multiple arguments and a possibility of adding/overwriting with a manual description. Also, I'd like to avoid monstrosities like this, instead aiming at a brief and eloquent note with links to Appendix:Latin_pronunciation (which should be expanded). Any theoretical or technical suggestions, encouragements and discouragements to excite or temper my imagination greatly appreciated. I won't mind some basic implementation either :D

Also, if you don't give an ass about Latin pronunciation but are concerned about other areas of grammar worth having usage templates and categories for, feel free to make suggestions in a new section; for ex. I suggested but haven't implemented yet one for stative and inchoative verbs. Brutal Russian (talk) 21:05, 10 March 2021 (UTC)[reply]
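As a starting point for the "one template with multiple arguments" idea, here is a rough sketch of the logic in Python rather than wikitext or Lua, just to make the design concrete. Everything in it is an assumption: the argument codes, the note wording, and the category names are invented for illustration and would need to be agreed on; only the link to Appendix:Latin pronunciation comes from the proposal above. Each code maps to a short note and a category, several codes can be combined, and a manual description can be appended.

```python
# Hypothetical codes -> (note text, category name); none of these exist yet.
NOTES = {
    "unknown": ("vowel length unknown or unattested",
                "Latin terms with unknown vowel length"),
    "conflicting": ("vowel length uncertain, with conflicting inscriptional evidence",
                    "Latin terms with conflicting vowel length"),
    "variable": ("vowel length metrically attested but variable",
                 "Latin terms with variable vowel length"),
    "late": ("late or medieval term; cited length is a best guess based on the etymon",
             "Latin terms with conjectural vowel length"),
}


def pronunciation_note(*codes, manual=None):
    """Build one brief note plus category links from the given codes.

    `manual` is free text appended after the generated parts.
    """
    unknown = [c for c in codes if c not in NOTES]
    if unknown:
        raise ValueError(f"unrecognised note code(s): {unknown}")
    parts = [NOTES[c][0] for c in codes]
    if manual:
        parts.append(manual)
    if not parts:
        raise ValueError("give at least one code or a manual description")
    text = "; ".join(parts)
    note = f"{text[0].upper()}{text[1:]}. See [[Appendix:Latin pronunciation]]."
    categories = [f"[[Category:{NOTES[c][1]}]]" for c in codes]
    return note, categories


note, categories = pronunciation_note("unknown", manual="cited length follows the etymon")
print(note)
print(categories)
```

Keeping the wording in one table means a note can be reworded once for every entry that uses it, and the categories come for free. A real implementation would presumably live in a template or module on the wiki, which this sketch does not attempt to reproduce.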

What is the value of "plural only"?


It seems to me that users who come to an English entry like that for wild oats might well be interested in whether it is singular or plural for purposes of verb and pronoun agreement. I don't think we clearly and correctly address that. Instead we lead users on a chase in vain in pursuit of the answer, sometimes leading to a wrong answer.

In an appendix (not principal namespace!) we "define" it as follows:

A noun (or a sense of a noun) that is inherently plural and is not used (or is not used in the same sense) in the singular, such as pants in the senses of "trousers" and "underpants", or wheels in the sense of "car", is plural only or a plurale tantum. In practice, most pluralia tantum are found in the singular in rare cases. (See Category:English pluralia tantum.) Contrast words which are singular only (singularia tantum).

At plurale tantum we have:

A noun (in any specific sense) that has no singular form, such as scissors (in most usage).

In these "definitions" there do not seem to be any specific usage implications of the label.

The online Lexicon of Linguistics ("LoL") defines pluralia tantum [sic] ("p.t.") as follows:

MORPHOLOGY: a traditional term used for words which (a) end in a plural affix, (b) have a plural meaning, and (c) do not have a singular counterpart.

I am not sure what the implication of plural only in the headword line or as a definition label is supposed to be. If it is only a statement that the headword looks like a plural, as the definition above, who cares? If it is a statement that the headword requires a plural verb, then it is often misapplied. For example, we show wild oats as plural only in the headword line. It is not very hard to find attestation for its use with singular verbs: "Wild oats is a crop most people sow when they live like children"; "Wild oats is a factor in the trade."; "Wild oats was more competitive than kochia and sunflower at lower soil temperatures."; "Wild oats is an annual weed with growth habits similar to those of small grain."; "wild oats was controlled with herbicides"; "early emerging wild oats was more vigorous and competitive with wheat.".

By the LoL definition of p.t., the various definitions of wild oats are incorrectly labelled, as necessary condition (b) is fairly often not met in usage.

According to my reading of CGEL (2002) there are at least 40 nouns (and, in addition, compounds derived from them) that require some lexical information to make clear what number agreement restrictions apply, for which the term invariant is more applicable than plural only, though invariant is not satisfactory either. I propose that we adopt the practice of showing the same form (eg, wild oats) as both singular and plural on the inflection line and wherever else "plural only" appears. Where either singular or plural agreement is much more frequent, usage notes or labels should be used. DCDuring (talk) 15:31, 11 March 2021 (UTC)[reply]

I propose that we adopt the practice of showing the same form (eg, wild oats) as both singular and plural on the inflection line and wherever else "plural only" appears.

This seems like a bad idea. In most of these cases, the headword has the morphological form of a plural, and a corresponding singular form is inferrable. In many cases, that singular form is attestable, but just rare/obsolete/dialectal/etc. e.g. pajamas/pajama. If we're going to list any singular form for pajamas, I would think it should be pajama (along with a note such as the one we have currently: normally plural).
I'm also not totally convinced that the examples of singular agreement for wild oats are representative of all the terms we currently class as plurale tantum. I think there's a general ability to flip into singular agreement when talking about a normally plural noun as a commodity. e.g. "Fresh cranberries was another item that sustained some spoilage.", "Cranberries is a commodity which, for volume purposes, is about the same as cherries." It's not really a lexical property so much as a grammatical one.
I would read the "plural only" label as meaning that the corresponding sense uses the plural-inflected form of the headword, and that it follows the normal grammatical rules for plurals (e.g. "a pyjamas", "five pyjamas", "pyjamas is" are all forbidden). I agree that there are more subtle cases, e.g. where only the plural-inflected form is used, but it has singular agreement, and in these corner cases, we should be explicit with our labels/usage notes about how they work. e.g. for scissors, OED has separate senses for a) in singular form b) in plural form with plural agreement and c) in plural form with singular agreement. Colin M (talk) 21:09, 11 March 2021 (UTC)[reply]
It’s English agreeing with the sense rather than with grammatical properties. Constructio ad sensum. “A ratnest emerged in my datcha. They were controlled with rodenticides.” “I informed my team. They will contact you back about the costs.” Things that don’t happen with other languages, so therefore it is a true statement that certain terms in English are plural only but agreeing with the singular. What now, @DCDuring? I am afraid that due to the Latinate, classical-education origin of lexicographic catchwords there is no terminology to describe number restrictions specific to English in their kind, except perhaps in some obscure journal which due to its little clout on usage cannot aid understanding. So, unless we want to become even obscurer by pushing neologisms to be more unequivocal, I conceive that we have to stick to the current descriptions, as opposed to having no markers about why the form be plural and whether a formally expected singular exist; this is the actual purpose of this label and thus it is correct unlike you contended, you just expected agreement with that label in a wrong way. I don’t read the LoL snippet as stating three necessary conditions for a plurale tantum but alternative or typical ones. “Plurale tantum” basically means “there is something plural where with other things there isn’t this plurality, so therefore I confirm that this is correctly lemmatized.” Fay Freak (talk) 01:34, 12 March 2021 (UTC)[reply]
Fundamentally, the problem is that plural only conveys no practical information or is misleading or wrong. I don't think anyone at Wiktionary has dealt with this at the level of specificity required to accommodate the variety of English phenomena subsumed under the category or label.
You are probably right about the existence of some singular forms having the s removed, but the attestable fact remains that many plural-only nouns agree with both singular and plural verbs and pronouns. We could put multiple singular forms in the headword.
I don't propose this for all items labeled plural only. There are some 40 itemized in CGEL, each of which would need to be examined.
I don't care at all what the phenomenon is called. I just want Wiktionary to be usable by folks other than language geeks.
Your reading of LoL definition is incomprehensible to me. I rely on linguist definitions to use a logical definition of and.
I will make a list of the plural-only terms as categorized and discussed in CGEL on yet another of my user subpages and add a link thereto here. DCDuring (talk) 01:55, 12 March 2021 (UTC)[reply]
Latin pluralia tantum always behave grammatically as a plural: idus dividant mensem (“the ides divide the month”). I think this is true of most English plurals-only; to me, *“the scissors is in the drawer” is not acceptable. Attestable exceptions (such as, when a subject, permitting verb agreement in either number) should be noted, for which the Usage notes section can be used. BTW, what’s with wages? We now say: “It may take a singular verb. E.g. 'the wages of sin is death' (Romans 6:23 KJV)”. Is there any other phrase than Romans 6:23 in which wages ‘is’?  --Lambiam 16:30, 12 March 2021 (UTC)[reply]
I have no interest in Latin grammar, nor is it germane.
Neither scissors nor wages would be leading examples for me, but singular scissors appears in both edited works and in reported speech.
Consider: "In the United States that scissors sells to the user for $3 per pair." (1938)
"This scissors has recently been much improved by Messrs. Tiemann & Co." (1888)
"He testified that there was no difference in the breakage rate of this scissors compared to any other scissors" (1989)
"This scissors will trim the wick evenly and the lower blade extension holds the charred part of the wick" (1916)
An 1845 grammar reported wages in "The wages of sin is death" to be singular.
My main point is that the label plural only is often misleading, confusing, or wrong and fails to convey the important information about acceptable number agreement. DCDuring (talk) 00:40, 13 March 2021 (UTC)[reply]
Consider the instance of terms including the word and in the category "English:Pluralia tantum". What is "plural only" telling users? What does it imply about me and mine? That it can be a subject? And that ladies and gentlemen does not permit lady and gentlemen or ladies and gentleman if audience demographics warrant?
It's telling the reader that these are plural noun phrases for which no parallel singular form exists. I agree that these are non-central examples of plural nouns (e.g. they can't take determiners like "some" or "5"), and that the fact that they're plural is not very actionable in these cases. Though, despite being proscribed, many speakers would accept "Me and mine are staying home", whereas few would accept "Me and mine is staying home".
I think the main argument for presenting the headword line that way is that it's consistent with the pattern we use for all nouns. I'm curious how you would prefer it be formatted? The only alternative I can think of is to use {{head|en|noun}} to render the headword without any further text. I don't think that's an unreasonable option.
That said, I agree that these "X and Y" NPs are definitely not pluralia tantum. I would suggest that {{en-plural noun}} should accept an optional nocat parameter for these cases, so they're not placed in the Pluralia tantum cat. Colin M (talk) 21:06, 13 March 2021 (UTC)[reply]
You should be aware of the all-too-common phenomenon of contributors getting caught up with some linguistic concept and applying it liberally to entries, including those for which it is inappropriate. Examples abound; 'negative polarity items' is a recent one. It has happened to me, though I usually realize I am confused before doing too much damage.
I believe that me and mine always takes a plural as does every other NP consisting of and-linked nominals, the phenomenon being part of grammar, not the lexicon. It is likely that someone decided that Wiktionary was deficient in not putting all English headwords containing and into Category:English pluralia tantum.
More generally, I believe that some of the members of Category:English pluralia tantum should be presented as "invariant", ie, the plural is identical to the singular. The facts need to support that, of course, in each case. I think we need to clean the Augean stables that is (!) Category:English pluralia tantum, by getting rid of clear errors, of which there are many, and in every case making sure that we are communicating what the permitted, required, or more/less common number agreement is (verb, determiner, pronoun). I would suggest that we use {{head|en|noun}} or {{en-noun}} and abandon {{en-plural noun}}. I also note that {{en-plural noun}} lacks a way of marking uncountable usage, required for words such as measles. DCDuring (talk) 16:02, 14 March 2021 (UTC)[reply]

Tangwang language and Xieheyu


Can Tangwang language and Xieheyu be included in Wiktionary? -- 07:38, 13 March 2021 (UTC)[reply]


It's been suggested on Commons that an audio recording of a word is too simple to copyright, users here might be interested in this discussion: [9] Troll Control (talk) 11:00, 13 March 2021 (UTC)[reply]

I wish this were true, but it seems more like you suggested it, and we need a legal opinion (presumably from the WMF legal team), not wishful thinking. —Μετάknowledgediscuss/deeds 21:17, 15 March 2021 (UTC)[reply]

The result of the archived discussion, summed up: words that are performed, like a dictionary recording, have performance copyright. On the other hand, if you lift a word from an audio clip, and especially when it is hard to identify the word with the audio clip, the copyright claim is weak. Legal Wikimedia was not involved in the discussion. 119.56.97.84 10:56, 29 March 2021 (UTC)[reply]

Deletion


How do I request deletion of my user page? It's the same as my global user page, so it is unnecessary. Chicdat (talk) 12:53, 13 March 2021 (UTC)[reply]

Combining multiple PoS


As an example, full speed ahead has four separate PoSes: interjection, noun, adverb and adjective. full steam ahead is said to be an alternative form of it (whether this is really true seems debatable to me, but for the purposes of this question, let's assume that it is). Presently full steam ahead has only two PoSes, but there seems no reason why it should not have the same number as full speed ahead. Given that there is presently no more information for any PoS than "Alternative form of full speed ahead", that would create a rather tiresome article, with four sections all repeating the same thing. Do we have a mechanism for combining all under one heading? I am tempted to combine under "Interjection, Noun, Adverb, Adjective", but then what to do about the next line, presently PoS-specific "en-intj", "en-noun" etc.? Is there a way to combine these? Mihia (talk) 17:47, 14 March 2021 (UTC)
Actually, I just thought ... I suppose I could make "full steam ahead" a "phrase", but this only works because it is multi-word, which needn't always be the case. Mihia (talk) 17:51, 14 March 2021 (UTC)[reply]

Interesting question. I suppose you could do a combined header with a generic {{head}} headword line, and then manually add the corresponding PoS categories that would normally be added by the different PoS headword templates. But as much as I agree that it's inelegant to have four sections all just repeating the "Alternative form of..." line, I can see an argument for it. In my ideal world, Wiktionary would eventually have quotations (and maybe usexes) for everything, including alternative forms like this. And if someone does eventually add quotations/usexes, they should be separated by PoS. Colin M (talk) 16:28, 15 March 2021 (UTC)[reply]
The first question is whether each PoS header is correct. I'd bet on the term not meeting adjectivity criteria. I'm also not sure what the point of the adverb PoS is. Do we need an adverb PoS section for mph because one can say "He was driving 60 mph"? (BTW, is mph any more a symbol than BTW? Don't we want parts of speech for such abbreviations?) DCDuring (talk) 17:08, 15 March 2021 (UTC)[reply]
Yeah, I was trying to address the general principle, but I had the same thought about this particular example. Though I personally find the adverb label unobjectionable. At least at first glance, I would read the phrase as an AdvP headed by ahead. And browsing examples on Google books, it seems to be most commonly used to modify a verb ("A twenty-something young man runs full speed ahead into the ocean"). But I'm dubious of the noun and adjective labels. Colin M (talk) 17:41, 15 March 2021 (UTC)[reply]
For the purposes of this question, I am less concerned about the justification for all the PoS at full speed ahead, though of course that can be amended if anyone wishes, and more interested in the general principle. Mihia (talk) 22:12, 16 March 2021 (UTC)[reply]

I notice that the logo which appears at the top left of every page has quite blurry text at the bottom, at least on my display. Is there any way it could be fixed so that it'd look a little nicer? Sdkb (talk) 19:52, 14 March 2021 (UTC)[reply]

  • It's either your device or your eyesight. SemperBlotto (talk) 16:32, 15 March 2021 (UTC)[reply]
    It's somewhat fuzzy for me too (much less sharp than the links in the sidebar), but not so much that I've ever really paid any attention to it. I've attached a screenshot of what I see, for reference.
Screenshot of logo

Andrew Sheedy (talk) 02:52, 16 March 2021 (UTC)[reply]

It's not blurry, it's gray instead of black, which makes it more like the white background and harder to distinguish from it. If you look at it in high magnification, you can see that the edges are just as sharp as those of the black letters. Chuck Entz (talk) 04:47, 16 March 2021 (UTC)[reply]
When I look at it in high magnification, I see a thin, lighter border around most of the letters, which at lower magnification makes the letters look blurry. Either way, it's not enough that it's ever bothered me or ever will. Andrew Sheedy (talk) 00:17, 17 March 2021 (UTC)[reply]
The logo text is anti-aliased, which is the norm, otherwise it would look jaggy. However, to my eye the anti-aliasing does in fact look soft on both "Wiktionary" and "The free dictionary" text, even allowing for the shades of grey. Mihia (talk) 02:33, 17 March 2021 (UTC)[reply]
For anyone who wants to take a closer look, here is a direct link to the image. Chuck Entz (talk) 03:40, 17 March 2021 (UTC)[reply]
The text on the original image looks sharper, but the original is a larger size than what I see in the browser, so there will be an additional factor of the browser scaling it. I don't know whether the logo on the webpage is dynamically scaled, but, if not, it might be better to create an original logo of the exact correct size, to eliminate the browser scaling. Of course, nothing can be done about people also having local scaling in their browser. I guess a vector logo would be another option. Mihia (talk) 18:00, 17 March 2021 (UTC)[reply]
+1 vector logo. The 2x resolution isn't sufficient for modern hardware. MediaWiki will convert it to PNGs, but it can create scaled up versions automatically. – Jberkel 23:01, 17 March 2021 (UTC)[reply]

In Category browsing, term "page" confusing


E.g. "The following 200 pages are in this category, out of 7,074 total. (previous page) (next page)".

I think we mean "200 entries", while the "(previous page) (next page)" are correct. Facts707 (talk) 21:53, 14 March 2021 (UTC)[reply]

Formatting of foreign language proverbs


@Equinox

Should the literal translations of proverbs in foreign languages be permitted in the sense line?

Many proverbs in non-European languages don't have clear English equivalents, and proverbs are not lexicalized words where the etymology is of only tangential importance; whenever you use a proverb, you're intentionally seeking to evoke the literal meaning of the words. Understanding the literal meaning of the proverb is of crucial importance to understanding the full force of the proverb, which justifies its inclusion in the sense line. Besides, having a separate etymology section for the literal meaning is far from succinct.

Here's an example of the format that I have used in Korean entries, with the literal translation in the sense line:


Here is a Vietnamese entry with the literal translation in the Etymology:

Thoughts?--Tibidibi (talk) 10:42, 15 March 2021 (UTC)[reply]

Literal meanings are etymologies, they should go to etymology sections naturally.
the etymology is of only tangential importance
Understanding the literal meaning of the proverb is of crucial importance
Your argument seems self-contradictory. -- Huhu9001 (talk) 12:19, 15 March 2021 (UTC)[reply]
The etymology is not of tangential importance, as I said explicitly.--Tibidibi (talk) 12:23, 15 March 2021 (UTC)[reply]
Sorry, I misunderstood it. -- Huhu9001 (talk) 12:31, 15 March 2021 (UTC)[reply]
  • When adding an idiom under a sub-heading of a headword that forms a part of that idiom, I prefer a format like the following, formerly at Japanese 閑古鳥 (kankodori, cuckoo bird):

Idioms

This includes the literal meaning in quotes, and an explanation of what that expression means in actual use.
For reasons unexplained, @Huhu9001 insists on removing the literal meaning and much of the gloss as well, using the following now at Japanese 閑古鳥 (kankodori):

Idioms

The lack of detail strikes me as poor usability for our readers, which I think @Tibidibi addresses some above. In cases where we have a full entry for the idiom (as we do now for the above two), I can acquiesce on this point with some reservations, as users can at least click through to the full entries to find those details. If we do not have full entries for the idioms, the inclusion under the sub-heading may be the only place where we have any gloss or explanation of the idiom at all, as currently at 鳩#Idioms:
Current version
Clearer layout
Incidentally, as a minor formatting point, I think that including the literal and explanatory meanings in the third parameter as in the "Current version" above, and thus both within the quotation marks, is sub-optimal and potentially confusing.
  • When building out a full entry for an idiom, I am open to suggestions. The 閑古鳥が鳴く (kankodori ga naku) entry is one possibility, where the etym lays out the literal meaning and explains some of how the idiomatic meaning came to be (albeit, in this case, by referencing the details of another entry). The sense line uses {{n-g}} to format an explanation of the idiomatic meaning.
Looking at it again, I am leaning towards the thought that the sense line should also include the literal meaning -- particularly since not all readers will look at the etymology, and also since, in some cases, the etymology sections are contained in collapsing elements.
In addition, I must admit some confusion about the appropriate POS heading for full entries. At various points, we have had ===Idiom===, or ===Proverb===, or just plain ===Phrase=== as at 閑古鳥が鳴く (kankodori ga naku). I've lost track of which is currently the recommended heading.
Interested to see what others think. ‑‑ Eiríkr Útlendi │Tala við mig 22:34, 16 March 2021 (UTC)[reply]
I personally like the inclusion of the literal translation in the sense line. —Suzukaze-c (talk) 23:48, 16 March 2021 (UTC)[reply]
I agree. Mihia (talk) 02:28, 17 March 2021 (UTC)[reply]
I simply don't understand why you must put the details of one entry in another entry's page. Doesn't Wiktionary have enough page for you to write in? Or is it too painful for readers to click those links? Literal meanings, or in my opinion, even the gloss, should not be given in sections like "Derived terms" or "Idioms". Readers that want to know more can easily navigate by themselves. -- Huhu9001 (talk) 11:17, 17 March 2021 (UTC)[reply]
Requiring readers to click through to other pages makes the site more difficult to use.
Turning your argument sideways, editors that want to add more can easily add more by themselves. Why remove such detail when others have added it?
An ideal would be for there to be the needed templates and modules so that references to other terms, as in "Derived terms", "Synonyms", "Idioms", what-have-you, would automatically present users with at least a little bit more information about that term, without having to click through. This could be similar to what the {{ja-see}} template does, as an existing example.
We have the capability to do this already, we just lack the implementation.
@Huhu9001, given past discussions with you (particularly this one), it seems that you desire for Wiktionary to be more "normalized", in a database kind of way -- where the data only lives in one place, and there is no data duplication. Given the current structure of the website, this isn't really possible, certainly not without negatively affecting site usability. Given that data duplication doesn't present any negatives to readers, while data de-duplication does make things harder for readers to find and understand, I have intentionally erred on the side of reader usability. ‑‑ Eiríkr Útlendi │Tala við mig 19:43, 17 March 2021 (UTC)[reply]
@Eirikr: No, this has nothing to do with "data duplication". I just wish you could stop bombarding readers with information they did not ask for. I know you are all prolific linguists, who cannot help brandishing your erudition by writing verbose essays in entries, like what you have done to many etymology sections. But Wiktionary is not the right place to do this. In what other languages have editors done things like you have in Japanese? In pages like French tête, did they place wordy glosses with each idiom, or write long-winded etymology sections at the top? That's not going to make Wiktionary easier to use, but rather it's distracting. Maybe you prefer Wikibooks. -- Huhu9001 (talk) 22:20, 17 March 2021 (UTC)[reply]
@Huhu9001: I am sorry you find the additional detail distracting. I guarantee that not all readers do.
I cannot agree with your contention that "Wiktionary is not the right place to do this [write etymologies explaining the derivations of terms]". A dictionary is exactly the place to write etymologies. The fact that so many of our entries lack any such detail is, conversely, a negative in my view. Take Danish pudse på, for instance. Where did this term come from? When was it first used? What is it related to? These are all questions I have as a beginning studier of the Danish language. As a reader and consumer of the Danish entries here at EN Wikt, I would very much appreciate fuller details.
Taking your example of French tête instead, I view the bare-bones list of links in the Derived terms section as deficient -- a reader must already have substantial knowledge of French before that list becomes all that useful. Ideally, I would like to see at least a short gloss for each term, to support those readers who might only be at the start of their French abilities. My perspective comes in part from my own experience -- learning Japanese was a hard slog, where one often had to read the dictionary in order to read the dictionary. When I edit entries here, I think back on that experience, and I try to make things easier for Wiktionary readers than they were for me.
While you "did not ask for" such a level of detail, I have. As have others here at Wiktionary (see above in this very thread). As have others I've spoken with IRL and even taught over the years. It is much easier for you to ignore the pieces you don't want, than it is for others to read content that was deleted or never written.
I am even actively trying to help you ignore those pieces, most recently by starting up the [[Wiktionary:Grease_pit/2021/March#Collapsible_etymology_sections?]] thread. But I cannot support any effort to just remove correct and relevant detail. ‑‑ Eiríkr Útlendi │Tala við mig 23:10, 17 March 2021 (UTC)[reply]
@Eirikr: Sorry, I literally burst out laughing. I would never expect that the user who said "but I cannot support any effort to just remove correct and relevant detail" is the very one that mass-removed all classical conjugation tables from Japanese vowel-stem verb entries only a short time ago.
Anyway, tired of double standard talks. Good luck with your revolution against those old "deficient" entries. Wiktionaryを革命する力を!!!なんちゃって -- Huhu9001 (talk) 11:46, 18 March 2021 (UTC)[reply]
  1. That information was not correct: it was located in the wrong entries. As, indeed, my edit comments clearly stated. As, indeed, other editors also agreed when this was discussed last month.
  2. That information is autogenerated by a template, and is trivial to add to the correct entries.
I am sorry that you are unable to engage in respectful discourse and contribute to an amicable solution here. ‑‑ Eiríkr Útlendi │Tala við mig 04:14, 20 March 2021 (UTC)[reply]
@Eirikr: Turning your argument sideways, editors that want to add more can easily add more by themselves. Why remove such detail when others have added it? It is much easier for you to ignore the pieces you don't want, than it is for others to read content that was deleted or never written. While you "did not ask for" those conjugation tables, I have. As have others here at Wiktionary. I have intentionally erred on the side of reader usability and I cannot support any effort to just remove correct and relevant detail.
How's that? Am I "respectful discourse and amicable solution" enough now? -- Huhu9001 (talk) 14:07, 20 March 2021 (UTC)[reply]
@Huhu9001:
As I noted earlier, and as was discussed last month, the key point in the removal of the conjugation templates is that they were in the wrong entries.
They were not correct information.
I would hope that you would support the removal of incorrect detail, but apparently you do not care to address that point? ‑‑ Eiríkr Útlendi │Tala við mig 21:34, 25 March 2021 (UTC)[reply]
@Eirikr: As I noted above in this thread, the key point in the removal of the glosses of these proverbs is that they were in the wrong entries. They were not correct information. I would hope that you would support the removal of incorrect detail, but apparently you do not care to address that point? -- Huhu9001 (talk) 21:59, 25 March 2021 (UTC)[reply]
  • @Huhu9001: I feel like you're nitpicking and misrepresenting my words.
I shall address now your contention that proverb glosses "were in the wrong entries".
I contend, and have for years, that every item in a list of terms in non-English languages should have at least a minimal gloss as an aid to the English-reading audience we aim to serve. I referred obliquely to this same standpoint earlier in this thread, in my description of the link list at French tête.
In fact, our template infrastructure supports this, by explicitly including options for editors to add glosses for non-English terms.
Considering this, I dispute your statement that the proverb glosses "were in the wrong entries". Indeed, the main thrust of this very thread (at least, the portion of it dealing specifically with formatting for non-English proverbs) is that multiple users apparently agree that links to proverbs would benefit from both an idiomatic gloss and a literal gloss.
Meanwhile, as I laid out last month, Classical Japanese conjugation tables for the specific verb types at issue there explicitly do not include the modern lemma form. This makes it confusing and awkward to include these tables in the entries for the modern verb forms, since the Classical tables do not -- and cannot -- include the headword of the modern verb's entry. Much as the lemma forms for Middle English don and English do are different, even though this is essentially the same verb, so too do we have different lemma forms for Classical Japanese 食ぶ (tabu) and modern 食べる (taberu). Conjugation tables belong in the lemma entries, so the conjugation table for 食ぶ (tabu) belongs at 食ぶ (tabu), not at 食べる (taberu). If the Classical entries are missing, we should not shoehorn their info into the modern entries -- instead we should create the Classical entries. ‑‑ Eiríkr Útlendi │Tala við mig 22:21, 25 March 2021 (UTC)[reply]
@Eirikr: So we see that details you like (Why not? You have been longing for them "for years".) are "correct information" and "should be shoehorned", while details you dislike are "confusing and awkward" and "should not be shoehorned".
"we should not shoehorn their info into the modern entries -- instead we should create the Classical entries". And on the other hand we should shoehorn the proverbs' glosses into other pages, partly because you "generally haven't felt comfortable building out full entries for these (proverbs)". Oh yes. It matters a lot that you are "comfortable" with the former but not the latter.
Perhaps the ultimate guidance of editing Wiktionary is your, User:Eirikr's, personal preference and comfort. No one can have any "respectful discourse and amicable solution" with such hypocrisy present. -- Huhu9001 (talk) 22:52, 25 March 2021 (UTC)[reply]
If we were to have our brains uploaded to computers and spend the next 100,000 years working on Wiktionary, I would have each word have a detailed essay showing when it appeared in the language, its course through the language, and how its meanings and implications have changed through time. As mere mortals, that is outside our reach, but while I appreciate the need to make the major parts upfront, I see no reason to abandon the detailed information about the language; in an optimal world, we would have that information and more.--Prosfilaes (talk) 03:25, 18 March 2021 (UTC)[reply]

POS of words for "X tribe/people, collectively" like British, Chinese, Cheyenne, Xhosa


I was reminded by a Tea Room thread that we are really inconsistent in classifying collective (plural) terms for people-groups as proper nouns or common nouns or failing to cover them at all:

  • As proper nouns, we have British "(collective) The residents or inhabitants of Great Britain", Cheyenne "An indigenous people of the Great Plains", Abenaki "An Algonquian First People from northeastern North America", etc,
  • As common nouns, we have Irish "The Irish people", Chinese "(collective) All people of Chinese descent" (this one is my doing), Zulu "An African ethnic group [...]" (though before 2018 we had it as a proper noun), etc,
  • We don't cover collective use at all in Japanese (where we only cover the uncommon/nonstandard count-noun use), Xhosa, or Lakota (though before 2017 we did), nor in e.g. Finnish (where collective use is uncommon and might be dismissable as substantivization of the adjective).

Can we pick a consistent approach here? - -sche (discuss) 19:49, 15 March 2021 (UTC)[reply]

Most dictionaries don't bother with the distinction between proper nouns and common nouns. I am not sure what the behavioral consequences of the distinction are, other than for philosophers of language, either in general or for demonyms.
1. Don't almost all of these have a countable common-noun sense? There are exceptions (Irish, English, British, Spanish come to mind immediately.) Aren't many of these invariant (singular form same as plural form)?
2. To me, it seems natural to designate the noun referring to the entirety of these peoples as proper nouns, not common nouns, and not fused-head use of the corresponding adjective. There is an obvious parallel with the "Translingual" proper nouns that designate taxonomic groups, which are treated as proper nouns (not just here but in taxonomic theory), though we treat the corresponding vernacular names not as proper nouns, but as common nouns. It may be that the fact that they require plural agreement makes us feel that they are plurals of the common noun, but the collective unit does seem to be a different referent than groups of multiple individuals from the collectivity. Even when the collectivity is a homonym of the regular plural of the singular common noun ("the Germans"), it still seems like a proper noun to me.
3. There is a (dated?) usage of such demonyms as in "the Lakota is not as warlike as the Apache", with the demonyms being singular with a definition like "The typical Lakota/Apache". That also 'feels' like a proper noun. We may not have such definitions because they seem conducive to stereotyping, but they certainly have been common enough, at least in Col. Blimp's times.
4. The corresponding senses of the homonymic adjectives may not meet the adjectivity PoS criteria in many cases.
I don't know what considerations I may have ignored, but I think there is potential for consistency, while acknowledging exceptions to the pattern. DCDuring (talk) 21:33, 15 March 2021 (UTC)[reply]
"The Lakota is not as warlike" seems analogous to "the ostrich typically lives in Africa", which we have so far considered to be a common noun (and not even a separate sense, just a feature of how English and some other languages are able to use the usual sense "a large flightless bird..."), which does complicate my inclination to view "the Lakota are..."-type collective usage as a proper noun. I acknowledge that ones which are count nouns could be considered invariant plurals of the count nouns ("the Abenaki are" could just be considered to be using "Abenaki" as a plural of the singular noun "Abenaki" that means "a member of the Algonquian First People known as the Wôbanaki"), and the ones which are not normally count nouns, like "Japanese" and "British", could be argued to be substantivized adjectives like "the poor", "the deaf", but as you say they may not meet criteria of adjectivity. - -sche (discuss) 22:09, 15 March 2021 (UTC)[reply]
The fact that we have treated "the ostrich" that way does not make it true. I view our treatment of vernacular names of organisms as a simplification, partially based on the fact that most taxonomic usage uses the scientific Latin name whenever referring to the entirety. Another reason is the semantic redundancy that we would introduce into both vernacular name entries if each such term had both an individual and a collective definition. We can make the same simplification for this purpose if we'd like, but more users will probably care about the completeness of our demonym definitions.
It does make a difference whether we are referring to an entire nation/tribe/race or whether we are referring to some definite collection of members of such collectivity. A given use of Abenakis could be referring to the tribe as a whole or "the Abenakis that have been tracking us." I presume that both kinds of uses are possible, though evidence could show me wrong.
It is easy enough to find examples of singular Japanese, especially in older writings (before 1930), but abundantly in books right up to the present. It is much harder to find "an|one English is|was|has|does|seems|looks" where English refers to a person, at least in writings or speech of native English speakers.
There probably are some empirical regularities (at least of the statistical sort) in the existence or absence of given PoSes and plurals, some depending on suffix. Sadly, there is lots of variation even among terms with a single ending: English, Irish, Cornish, Amish, Flemish, Englishman, Irishman, Cornishman, Amishman, ?Flemishman; Scottish, but both Scot and Scotsman; Flemish, but both Fleming and ?Flemishman; Turkish, Finnish, Lettish, but Turk, Finn, Lett; Polish, but Pole; Spanish, but Spaniard. DCDuring (talk) 00:44, 16 March 2021 (UTC)[reply]
I think the use of "Japanese" and "Chinese" in the singular is simply a convergence between the collective form and the singular, countable form, in which case, there should be separate definitions for each at Japanese and Chinese. English, on the other hand, has (as you mention) the singular counterpart Englishman, Polish has the counterpart Pole, etc. The singular counterpart of Flemish is Fleming, BTW. Andrew Sheedy (talk) 02:46, 16 March 2021 (UTC)[reply]
Searching for "many [x] are" turns up at least a couple genuine uses for just about any demonymic adjective you can think of, though there are lots of false positives caused by OCR not recognizing that there are multiple columns on a page.
As for "the ostrich", that's simply a metonymic reference to a hypothetical individual as a stand-in for the class to which it belongs. It's usually in a very educational register: "the Lion is a noble Beast", or my new favorite example sentence: "the fragrance of roasted grugru is said to be most tempting to epicures" (not really a good example, but I just had to work it in there somewhere...). Chuck Entz (talk) 05:42, 16 March 2021 (UTC)[reply]
In summary, empirically, (all?) demonyms like Japanese can be found agreeing with both singular and plural verbs, determiners, and pronouns. In addition they can be found in expressions like the Japanese referring either to the demos as a whole, a typical member of the demos, or to definite groups of individuals. We could dismiss the "typical" definition as metonymy and even go further and declare the proper noun to also be metonymy, going from individuals to the ethnic entirety. I see elements of ideology influencing our choice about presentation. We can use the metonymy argument to dismiss any proper noun (either the entirety or the typical) or we can go with the idea of dignifying demonyms by making them proper nouns.
I wonder what the PoS sequence of English usage of such terms has been. Is it adjective → common noun → proper noun?
In Latin both descent groups (gens) and alien or geographically distinct (from Roman PoV) groups had the same kind of name, plural in form. How were individuals from such groups referred to? DCDuring (talk) 14:10, 16 March 2021 (UTC)[reply]
The ostrich analogy is a good one. This is a form of polysemy that is entirely "systematic" - i.e. rather than thinking of these meanings as residing inside each lexical entry, it's much clearer to think of them as being part of the rules of language use which we all implicitly understand. Compare:
  1. John bought the dishwasher. [instance of X]
  2. G.E. is introducing the new dishwasher in July. [sub-class of X]
  3. Joel Houghton invented the dishwasher. [type of X]
These apply to anything. The fact that we explicitly list senses corresponding to, e.g. the "sub-class of X" relation demonstrated in 2 (cheese#Noun 2: "Any particular variety of cheese.") is, IMO, silly. The examples above are modified from Nunberg 1977 pg. 101, and are just the tip of the iceberg - he goes on to list many, many more examples of these (what he calls "hypostasizing" functions). Maybe I'll do a separate effortpost on this phenomenon sometime. Colin M (talk) 18:31, 16 March 2021 (UTC)[reply]
Which of the three types of definition of dishwasher above should be the one that we have in the case of dishwasher and in general?
I don't think that all nouns allow all of the patterns. Which ones they do allow seems lexical, because there is no generally understood classification of nouns that would enable us to say which nouns allow which of the possible combinations of the three (or more) patterns. DCDuring (talk) 22:53, 17 March 2021 (UTC)[reply]
The natural way of defining dishwasher already works pretty well for all the three variants above. Look what happens when we replace dishwasher with the first definition at dishwasher:
  1. John bought the machine for washing dishes.
  2. G.E. is introducing the new machine for washing dishes in July.
  3. Joel Houghton invented the machine for washing dishes.
It's free real estate! (Largely because machine itself benefits from the same lability.) I'd be curious to hear about which nouns you think don't permit this pattern. Even if it fails to apply to 10% of all nouns (I think the real fraction is vastly smaller), it still seems awful to have formulaic entries for the remaining 90% like "any particular variety of cheese", "any particular variety of computer", "any particular variety of sadness", etc. Colin M (talk) 05:54, 18 March 2021 (UTC)[reply]
@Colin M: It's true that at least most nouns could be semantically diverged as per your description. Our definition lines, however, are distinguished as a result of the pragmatically determined conventional uses, not the theoretical formulaic permutations. While lots of nouns can be used in the same linguistic structure of sense #2 of cheese, the discrepancy between the widespread attestation of the latter in comparison with the former fairly justifies the status quo. Assem Khidhr (talk) 23:57, 18 March 2021 (UTC)[reply]
Yes, I think conventionalization is key, but I think it's tricky deciding where to draw the line. Apparently OED does have a corresponding ('variety of') sense in their entry for cheese, so that one at least passes the lemming test. However, they don't have a 'variety of beer' sense for beer (which we do). They also don't have the "serving of X" sense which we have for wine and beer. Colin M (talk) 05:17, 23 March 2021 (UTC)[reply]

Please see my comment here for my definition of those demographic collectives. I believe it's important to contextualize this discussion within some linguistic concepts:

  1. DCDuring's the Lakota and -sche's the ostrich are both definite generics. These refer to the abstract mental notion elicited by the word, regardless of their concrete actualizations/instances. This explains DCDuring's note why they might be understood as derogatory, since they by definition de-individualize the demographic member as a mere limited mental representation. (See this useful presentation)
  2. When preceded with the, Abenaki in -sche's the Abenaki are can be understood as a plural of the corresponding count noun only when they refer to a group of instances of their singular's class, not to the entire instances thereof. They can't thereby work out as a rebuttal for classifying the matter at hand as proper nouns.
    • A simple test I've thought of to distinguish the two is to check for the construction's dependence upon the context to determine its extension: a plural of a count noun would rely on a contextual determination of its referents (a definite collection of members, in DCDuring's words). Our subject matter, the demographical collective, however, is always understood as representative of the whole class members, meaning that it's definite by means of being a proper noun, and only preceded by a definite article as a redundant emphasis; hence, a weak proper noun.
  3. When plural and not preceded with the (i.e. bare plural), morphologically plural count nouns can be used to compose a construction called zero generics (e.g. girls seem to be more subject to peer pressure in childhood than boys). In cases where a demonym has a respective count noun, such construction can utilize it to denote the same significance of a proper noun (e.g. Egyptians make up most of Arabs). Here, one can point out the intuitive human tendency to treat a unitary demographic group (vs. an otherwise less concretely delimited notional class), by analogy, as more particular, individual, or for the sake of this discussion, proper; thus being semantically, at least, similar to a proper noun. Conversely, the demographical syntactically plural strong proper nouns I was originally discussing, lacking a count noun of their own, can hardly be considered generic in that sense. For example, Quraysh are most revered among Arabs sounds like a statement of fact predicated of a definite entity rather than a generalization for an indefinite class.
  4. The metonymy argument lacks factual accuracy, since, at least in some constructions, speakers willing to communicate the collective sense would literally intend to speak of every individual of a taxon as included in the referents of which they predicate a proposition; not use a group of individuals as signifiers of the whole taxon as a signified. Similarly, it can hardly be argued that speakers intending to communicate the definite generic actually think of a specific instantiation of the class and use it figuratively to denote the whole class. It's just, to me, a mental possibility that is way less natural than considering that construction a utilization of a mental notion.

Apologies for the wall of text. Assem Khidhr (talk) 23:57, 18 March 2021 (UTC)[reply]

  • Is there a consensus that terms like the Texans, the Chinese, Blacks, the Dead Rabbits, the Goths, or the Hamites can, in principle, be used as proper names, whether of groups of residents, of citizens, of identity groups, or of descent groups, possibly fictional or wrongly attributed? Are we reasonably confident that there is usage that could demonstrate that the terms are used as proper nouns? DCDuring (talk) 17:53, 23 March 2021 (UTC)[reply]
    Under the first definition for Chinese#Noun I have added citations which, IMO, clearly show that the term is used to refer to the Chinese as "a people", while having Chinese in agreement with a singular verb. To me that seems to be good evidence that Chinese is, among other things, a proper noun. DCDuring (talk) 02:02, 24 March 2021 (UTC)[reply]
I was hoping any of the users who add these as (or convert them to) common nouns would explain the rationale for them being common nouns, as I am not 100% sure they are proper nouns, but it does seem like the evidence and the discussion above point in that direction, and in the absence of other input, we should start converting the ones which are labelled common ===Noun===s to be ===Proper noun===s. - -sche (discuss) 03:31, 7 April 2021 (UTC)[reply]
Ditto. Assem Khidhr (talk) 23:48, 7 April 2021 (UTC)[reply]

Audio files by non-native speaker


If I am learning say Zulu and I am confident I am pronouncing a word correctly with the help of my teacher, would it be worth uploading an audio file of my pronunciation? i.e. would that be better than nothing? Troll Control (talk) 08:12, 16 March 2021 (UTC)[reply]

I do not see how that is any different from any other kind of contribution. So yes that is better than nothing I think. Giorgi Eufshi (talk) 09:21, 16 March 2021 (UTC)[reply]
I disagree in the case of a language like Zulu that has millions of native speakers. If you're learning an endangered language that has only a few hundred native speakers left, and there's no realistic chance of getting recordings from any of them, or if you're learning an extinct language with no native speakers, then an audio file from a nonnative speaker is better than nothing. But for a language like Zulu, I'd say it's better to have no audio at all than audio from a nonnative speaker. —Mahāgaja · talk 13:19, 16 March 2021 (UTC)[reply]
Extinct languages (excluding e.g. Latin but including e.g. Old English) should not have audio files at all IMO.--Tibidibi (talk) 14:00, 16 March 2021 (UTC)[reply]
@Troll Control: Don't do it. We've had problems before with people who thought they were confident and unwittingly produced bad audio. If you wanted to help, the useful thing to do would be to convince a native speaker to record a bunch of words and upload them. —Μετάknowledgediscuss/deeds 17:00, 16 March 2021 (UTC)[reply]
@Metaknowledge Do you want to add something to Help:Audio_pronunciations#What_to_record? Troll Control (talk) 09:03, 17 March 2021 (UTC)[reply]

Why do we consider names of taxa to be proper nouns?


I've searched the Beer Parlour for an answer to my question, but the only relevant snippets I've found are in a discussion dating to 2014 in which DCDuring wrote, "Many folks don't act as if taxonomic names are proper nouns, but most theoretical taxonomists seem to," and Angr responded in part (and explicitly not rhetorically), "What usage of taxonomic names indicates that theoretical taxonomists treat them as proper nouns?"

Yes, names of taxa are to be capitalized, but that alone doesn't make them proper nouns. Nor for me does our own definition of "proper noun," which is

"A noun denoting a particular person, place, organization, ship, animal, event, or other individual entity,"

shed any light on the matter.

Certainly a taxon constitutes a well-defined collection of organisms. By "well-defined" I mean that any organism either is or isn't in some taxon—modulo disputes among taxonomists. But consider for instance the collection of barbers: an organism is a member precisely if it is a human, has an official license, and practices the trade, but barber is still a (mere?) common noun.

What am I missing?—PaulTanenbaum (talk)

Some linguists distinguish between proper nouns (George, Washington) and proper names (George Washington). As for taxonomic names, they are all proper names, but binomial (etc.) names are not, nor, I suppose, are names like "Rosa L.". We don't follow this distinction in our PoS headings and categorization.
A taxon is an "individual entity", whether viewed through a Darwinian or other lens. Taxa have been thought of as such at least since Linnaeus: both genus and species names are singular in Latin form, though they are names of groups of individuals. Higher taxa are plural in form, constituted as groups of genera. In Darwinian thinking, all of these groups are descent groups. Taxonomy strives to enable users of each name to keep track of how workers in the field have used the terms (circumscription and placement, description) over time. The way that taxonomists use taxonomic names is more formal than the way normal humans use vernacular names, though there are those who attempt to make vernacular names correspond more or less precisely to taxonomic names, eg, birders. Many vernacular names of macrofauna and macroflora are capitalized, indicating perhaps that they name individual entities and should be considered proper nouns. There is nothing in the usage of taxonomic names that would make a linguist want to exclude them from the class "proper nouns". In the end whether a noun (or NP), is called a proper noun is a matter or convention. One such convention is that we deem taxonomic names to be proper nouns, but call all vernacular names of organisms (common) nouns.
Elsewhere on this page we are discussing which senses of demonyms may or should be called proper nouns. DCDuring (talk) 16:49, 16 March 2021 (UTC)[reply]

Hey, DCDuring, thanks for your reply. I'm still left somewhat wanting though. As an example, you write that, "There is nothing in the usage of taxonomic names that would make a linguist want to exclude them from the class 'proper nouns'." Well, OK, but what is there in the usage that would make a linguist want to include them? I totally subscribe to taxonomy's convention of capitalization, but I still don't see that what's going on here is any more than that convention.

Consider an analogy from mathematics involving the set B of Beatles. So B = {John, Paul, George, Ringo}. Would you want to call "B" a proper noun? If so, then what about the name of some set B* = {John, Paul, George, Ringo, m}, where m is any other (unspecified) musician? And more generally, a set A = {a1, a2, a3}?

You go on to write, "In the end whether a noun (or NP), is called a proper noun is a matter or [sic] convention." I'm beginning to see that that's so, which leaves two possibilities. Either (1) it's nothing for me to fret further about and I should just let it slide, or (2) a matter of convention that varies by language community, field of knowledge etc. isn't of much use in any general way, and therefore doesn't merit inclusion within Wiktionary entries. I guess I know which way I lean. What specific benefit can a Wiktionary user derive from the knowledge whether any given noun is proper or common besides whether to capitalize it (pace e e cummings and k.d. lang)?—PaulTanenbaum (talk)

We do consider it worthwhile to record matters of convention within "language communities, field(s) of knowledge etc". Definitions are themselves conventional, after all.
It is much easier to make a decision about typical members of a word class than about non-central members. In taxonomy taxa are well-defined formal conventional entities that they strive to make correspond to real-world groups that are not quite as well-defined, depending on speciation events, which can last for many generations, for their commencement and on sexual reproduction for their persistence and boundaries. Grammatically, I suppose that proper nouns do not normally accept modification by most adjectives when they are being used as or as part of proper names. Exceptions include adjectives like new, old, last, young, middle-aged, the ordinals, certain evaluative adjectives etc. The possibility of modification by the specific adjectives also reminds me that usually persistence of the named entity is required. DCDuring (talk) 22:44, 16 March 2021 (UTC)[reply]
FWIW, I agree that the distinction is not very useful. I think the vast majority of readers will have internalized a definition of proper noun that's no more nuanced than "a word that's always capitalized". Apparently there was a vote to do away with "Proper noun" headers back in 2011. It failed 6-8, though several of those who opposed indicated some level of support for getting rid of the header, but disagreed with other specifics of the proposal (like doing away with the Proper noun category). Colin M (talk) 05:02, 17 March 2021 (UTC)[reply]
At the risk of getting off the original topic, I wonder if we should have a new discussion about whether to remove the header but keep some indication, e.g. keep "X proper nouns" as an additional category (so Russia or Rebecca would be headered "Noun" and go in both "English nouns" and "English proper nouns"), and/or have a usage note. I do think that, given how few other major online dictionaries provide information on what is a proper noun, we can do a service here by providing that information! (Embarrassingly many reference works I've seen, both from a hundred years ago and from recently, including college-level textbooks on language(!), simplistically equate proper = capitalized, common = uncapitalized—pssh, tell it to the Marines!) A usage note + categorization would allow for "disputed" cases to be handled. Sometimes I wonder if we could compactify our display of parts of speech even further, e.g. dropping the header entirely and moving the info onto the headword line: "cat, noun, plural cats", as this would reduce the ridiculous amount of whitespace in entries that exists especially in Mobile view, but I realize it would present formidable challenges as far as entries being easy to edit or navigate by TOC. - -sche (discuss) 20:20, 17 March 2021 (UTC)[reply]
I agree that we waste a lot of vertical screen space with excessive header font size, header spacing, etc.
Categorization is, at best, at the level of the L2 section, not even the Etymology level, or the PoS level, whereas any proper noun (or proper name) categorization would probably need to be at the definition level, which would mean we would need labels to point appropriately. This would not eliminate the problem of saying what we mean by "proper noun". DCDuring (talk) 16:04, 18 March 2021 (UTC)[reply]
I agree entirely with -sche—even to the point that I too had been thinking that the discussion of taxon names was played out but the discussion of proper nouns more generally was the real heart of the matter in my original post anyway.
The challenges that DCDuring mentions in dealing with "non-central members" of this word class are one of the reasons I question the wisdom of highlighting the class. That and the fact that even if there were an unambiguous definition of "proper noun," I still have not been able to learn what value any Wiktionary user might obtain from knowing a noun's properness vice commonness. Except, perhaps, whether it should be capitalized (pssh, maybe we should ask the Marines).
Is it time to revisit the whole issue by posting a proposal along the lines that -sche describes?—PaulTanenbaum (talk) 13:59, 18 March 2021 (UTC)[reply]
I would also like to consider the value of the proper noun/proper name distinction. After all, James is not really in itself anyone's proper name, not having a sufficiently unique referent. And don't get me started on Karen. There are other distinctions of low value such as that between the pronoun and noun word classes. Native speakers know almost everything they need to know about proper nouns and proper names. It is really only language learners that may benefit from knowing about word-class categories below the level of the traditional PoSes. "Proper noun" is not even a traditional PoS. Every day at Quora I see queries about noun classifications like abstract, concrete, and collective. I don't recall a single question about proper nouns. DCDuring (talk) 15:39, 18 March 2021 (UTC)[reply]
The Millian characterization of proper names is that they are simply arbitrary signs. Clearly there is something to be said for maintaining the distinction between Karen as proper noun ("arbitrary sign") used as a component of a proper name and as an ellipsis of such proper names and Karen as common noun ("assertive, entitled white female"). DCDuring (talk) 16:04, 18 March 2021 (UTC)[reply]
I'm all for retaining the distinction between 'Karen' the name and 'Karen' the slang epithet. To me they're analogous to, say, 'Macintosh' the family name and 'macintosh' as a synonym for 'raincoat.'—PaulTanenbaum (talk) 17:32, 18 March 2021 (UTC)[reply]
  • As to the original question of 'why we consider names of taxa to be proper nouns', we are simply following the insistence of taxonomists (eg, E.O. Wiley), and philosophers of biology (eg, Hull, Sober) that they are proper nouns. They meet some criteria that philosophers of language have for proper nouns (unique referent, arbitrary sign, rigid designator). That they are rigid designators is clear from the fact that in taxonomic practice it is usual to precisely limit a given use of a name by specifying the names of those who have defined and redefined the term, sometimes even adding dates when authors have used the term differently in different publications. The intent is that there be a singular unique referent, which in current taxonomic thinking is a descent group of reproducing organisms, prototypically those reproducing sexually. That they are arbitrary signs is evident from their resistance to modification by adjectives when the name is not being applied metonymically, as to an individual specimen, population, developmental stage, etc. That taxonomic names have formal systems for maintaining these characteristics makes it particularly easy to characterize them as proper nouns. Other proper names have the same general characteristics. Names of human descent groups (ethnicities, Roman gens, tribes, families, etc), groups of citizens of jurisdictions or residents of places, toponyms of all kinds, individual persons, groups of students and alums of schools and universities, names or addresses of buildings or lots, legally recognized organizations, brands, products, and models of products would all qualify. Less formal, but durable groups would also qualify, street gang names, for example. Many of these would not meet CFI, of course, but we are trying to think clearly about what constitutes a proper noun or, rather, a proper name. DCDuring (talk) 16:41, 18 March 2021 (UTC)[reply]
Again, I'm squeamish at the thought that a general-purpose dictionary should decide properness vice commonness on domain-specific bases. For instance there are lots of words used in mathematics that seem to fit the criteria that DCDuring lists above but that I don't think any mathematician would consider to be proper nouns. As just one example, consider the noun integer. It has an utterly unambiguous and durable extensional definition: membership in the set of integers has completely crisp boundaries. And if, arguendo, the taxonomists and philosophers of biology were to undergo a change of heart and come to believe taxons' names to be common nouns, then should we at Wiktionary automatically shift all of our entries to correspond to the way we're currently treating the nouns from mathematics?—PaulTanenbaum (talk) 17:32, 18 March 2021 (UTC)[reply]
I don't think general-purpose is as good a term as multi-purpose to characterize English Wiktionary. We are a monolingual English dictionary, a translating dictionary, a historical dictionary, an etymological dictionary, a thesaurus, a slang dictionary, a dialect dictionary, as well as including terms usually found only in technical glossaries. As a historical dictionary we would keep our existing taxonomic proper nouns as a record of this current school of thought. The current school of taxonomic thought goes back to the middle of the 18th century, so it is not a flash in the pan. The first taxonomic codes were standardized internationally in the second half of the 19th century. The descent group concept is more recent, but in its modern form goes back to the middle of the 20th century. It is really onomastics that seems to be lagging, not covering many classes of proper names.
There are philosophical questions such as whether a name like 'Rosa L.' was a proper name when first used by Linnaeus or whether it became one only when all the criteria I've suggested were met, but those questions are not far different in kind from determining when determiners came into existence. (Was it before the concept was named, only when fully described, etc?).
I simply can't speak to the question of the nature of unique mathematical entities. The absence of capitalization for entities such as the integers, the real numbers, the rational numbers, the prime numbers, the even numbers, the odd numbers suggests that not too many of "the vast majority of readers", whether or not mathematicians, think of them as proper nouns. Maybe proper names also have to correspond to entities that exist or potentially exist in Newtonian space and time. DCDuring (talk) 21:28, 18 March 2021 (UTC)[reply]
I may have somewhat obscured my point by providing thus far examples coming only from math. So consider instead the word planet. It has a precise definition thanks to the International Astronomical Union, and its extension is a set of objects, each of which has—or has every right to—a name: Earth, Neptune, etc. All of those are of course themselves proper nouns. So for me, the noun planets seems completely parallel to, say, Canis. And yet the biologists' term is a proper noun and the astronomers' isn't. Is there perhaps one of the criteria you mention above in the context of 'Rosa L.' that is not met by planets?—PaulTanenbaum (talk) 12:23, 22 March 2021 (UTC)[reply]
@PaulTanenbaum So, if I understand you correctly, your point is not restricted to taxonomic names. You are saying that EITHER:
  1. the name of any entity defined as the group of all the things that are referred to by any common noun (e.g., the planets, the barbers, the Chinese, the plants, Plantae) should be a proper noun, OR
  2. none of them should, with proper noun being reserved for the names of singular named entities such as Jupiter; Franco Gallo, the barber; the National Association of Barbers; Sun Yat-sen; the People's Republic of China; and Rosie the Rosebush in my garden.
Further, you would not seriously entertain the first proposition.
A Roman gens would not have a proper name, nor clan MacGregor, nor the Royal Family. The Detroit Tigers would be a proper name by virtue of being an organization and/or a trademark, not because it was a unique team of individual baseball players. DCDuring (talk) 16:28, 22 March 2021 (UTC)[reply]
Actually, @DCDuring, I don't believe I take any position of the form, "All x-nouns should be considered proper," or, "No y-nouns should." What I do hope there is (wish there were?) is some clear and consistent specification of the extension of the term "proper nouns" such that its intension is (were?) useful. Or heck, I'd be just as pleased to come across a clear specification of its intension that allowed questions of the form, "Is word w among its referents?" to be answered (moderately) unambiguously.
If neither such specification exists, and nobody can explain the value that any Wiktionary user might obtain from our using the term to label some nouns, then I don't see that the concept merits the highlighting that we currently grant it.—PaulTanenbaum (talk) 00:15, 23 March 2021 (UTC)[reply]
@PaulTanenbaum Do you accept that Mary, Stephenson, (the) White House, (the) Atlantic, Gambia, and (the) Gambia are proper nouns? If so, you should have no trouble accepting Rosa multiflora, Rosa, and Rosaceae. If you do not, then the least grammatical departure between the behavior of these terms and the behavior of common nouns would put paid to your lengthy arguments. DCDuring (talk) 13:56, 23 March 2021 (UTC)
This is just an assertion without an argument. What Paul is saying, as I understand it, is that insofar as Rosa multiflora satisfies the criteria you described above ("unique referent, arbitrary sign, rigid designator") so do integer and planets. AFAICT, you have not answered the question they posed above: "Is there perhaps one of the criteria you mention above in the context of 'Rosa L.' that is not met by planets?" Colin M (talk) 14:23, 23 March 2021 (UTC)[reply]
There are two basic classes of entities whose characteristics are relevant: the proper name and its referent.
  1. A proper name, when used as such and not in its various secondary (often metonymic) uses, refers to a single entity. Grammatically, a strong proper name is not used with any determiner; a weak proper name is only used with the. Those proper names that agree with singular verbs and pronouns linguistically refer to a single entity. In the case of taxonomic names there should be no controversy because the names agree with singular verbs and pronouns. When used as a name, a proper name cannot be used as a semantic predicate. This is related to its having no meaning apart from its referent. These distinctive characteristics may be enough to earn proper names a distinct PoS header. Capitalization seems to be virtually a necessary condition, but by no means a sufficient condition, for a noun to be a proper name.
  2. The referent of a proper name, when it is not a group but a common-sense individual, is not normally controversial. When the referent consists of many individuals, more potential for controversy apparently exists. In the case of all taxonomic names, the referent is intended to be a descent group. In the case of sexually reproducing groups there are relatively clear boundaries for species and thus for the higher-ranked taxa. Asexual reproduction, hybridization, ring species, and horizontal gene transfer all make the extension of the species concept harder, but biologists (and even virologists) continue to use the word species as if referring to a single entity.
I am perfectly willing to entertain the possibility that barbers, integer, or planets could be used as proper nouns. I am not aware that they are actually so used. They are typically treated as a class. If capitalization is, as I believe it to be, virtually a necessary condition for a noun to be linguistically regarded in English as a proper name, then these terms don't seem to be so considered. DCDuring (talk) 16:33, 23 March 2021 (UTC)[reply]
@DCDuring, now I'm further confused. You write, "When used as a name, [a] proper name cannot be used as a predicate." It can't? What about, "The state with the largest population is California"? Or indeed, "The primary field of Dr. Pabst's research is Artiodactyla"? To be clear, I presume that when you write of use as a predicate, you mean use as a predicate noun, or anyway as a predicate nominative.
In any event, could you help me understand what's involved in your notion of the possible use of barbers, integer, or planets as proper nouns? Perhaps a few sentences illustrating what such uses might look like? I ask because I agree that "these terms don't seem to be so considered." That's why I offered them as examples: to me they are used much the same way as names of taxa, but I don't think anyone considers them proper nouns.
Acknowledging your disinclination to opine about names in mathematics, I hope you'll indulge another example from that discipline: "Every complete ordered field is isomorphic to the reals." In that sentence, the object of the preposition to is a name, and no mathematician will have any question about that name's meaning or extension. And as to your thought that perhaps proper names must denote things that do or could exist, I'd argue that the reals meet that test at least as well as, say, the Furies. Or the Flying Spaghetti Monster.—PaulTanenbaum (talk) 19:30, 23 March 2021 (UTC)[reply]
I didn't say that any of planets, barbers, or integers were proper names, or that I thought that they were, only that I would entertain arguments to that effect. I have no doubt that the names of human individuals, certain uses of demonyms, toponyms, etc are proper names. I doubt that anyone has too much of a problem with that. As for taxonomic names of sexually reproducing species, few have doubts about that. I believe that the species concept is being extended to cover other classes of organisms without many being greatly troubled. I have not been following the efforts to extend the definition.
I really don't care to continue a philosophy discussion any further. I'm not that good at it and it is largely irrelevant to Wiktionary. I believe that I have indicated some reasons why users might want to know whether a term was a proper noun and why taxonomic names qualify. If you disagree with those conclusions or there is some other lexicographic matter up for discussion, please let me know, in simple English that even a simpleton like me could understand. DCDuring (talk) 00:08, 24 March 2021 (UTC)[reply]

Actually splitting WT:RFVN

No one has voted at Wiktionary:Beer parlour/2021/February#Splitting WT:RFVN in almost a month, so I guess the vote is over. It looks like both options passed, option 1 more decisively than option 2:

Option 1: 12× support, 0× oppose, 5× abstain
Option 2: 4× support, 2× oppose, 5× abstain

Shall we actually do the split now? Who wants to do it? I wouldn't know how to begin. —Mahāgaja · talk 18:29, 16 March 2021 (UTC)[reply]

An undecided issue is how to name the pages resulting from the split. We could go for “RfV/CJK” and “RfV/Romanic”, but how to name the “None of the above” page? “RfV/Other”?  --Lambiam 10:30, 17 March 2021 (UTC)[reply]
I sort of assumed it would stay RFV/Non-English, though there'd have to be a big notice at the top saying that CJK and Romance languages didn't belong there anymore but on their own subpages. —Mahāgaja · talk 13:31, 17 March 2021 (UTC)[reply]
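For reference, the two tallies quoted at the top of this section can be compared by support share. This is only a minimal sketch: the function name `support_share` is made up for illustration, leaving abstentions out of the denominator is an assumption, and no passing threshold is implied.

```python
def support_share(support, oppose):
    """Support as a percentage of support + oppose votes (abstentions excluded)."""
    decided = support + oppose
    if decided == 0:
        raise ValueError("no support or oppose votes cast")
    return round(100 * support / decided, 1)


# Tallies as quoted above: (support, oppose); abstentions are ignored here.
tallies = {"Option 1": (12, 0), "Option 2": (4, 2)}
shares = {name: support_share(s, o) for name, (s, o) in tallies.items()}

for name, share in shares.items():
    print(f"{name}: {share}% support")
```

On these numbers option 1 is unanimous among those who took a side, while option 2 has two supporters for every opponent.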

Pending vote

I wonder whether one or more people could please take a look at Wiktionary:Votes/2021-03/Clarification_of_supermajority_rule and check for any obvious flaws or errors in wording, and if it looks OK then we can start it. Since there have been no comments so far, I would prefer at least someone to cast their eye over it first. Thanks. Mihia (talk) 22:17, 16 March 2021 (UTC)[reply]

Is this necessary? Voting proposals may be worded in any of multiple kinds of way for which it is not clear how to interpret and implement the outcome. Obviously, a proposal to be voted on should be a proposal to make some change. Even a proposal to freeze the content of Wiktionary as having attained a degree of perfection that can only be spoilt by further changes is a proposal to change the status quo – which currently holds that improvements can and should be made. The wording of proposals should be clear in being specific about the proposed change so that voters know what they are voting on, but this has nothing to do with the supermajority rule.  --Lambiam 10:04, 17 March 2021 (UTC)[reply]
I personally see it as necessary, or at least very desirable, for the reasons that I have explained at Wiktionary_talk:Votes/2021-03/Clarification_of_supermajority_rule and previously at Wiktionary:Beer_parlour/2020/December#Operation_of_"supermajority"_voting_rule. Mihia (talk) 18:44, 17 March 2021 (UTC)[reply]
At Wiktionary:Beer parlour/2020/December#Operation of "supermajority" voting rule I asked for a concrete example where the proposed modified rule would be helpful. I do not see how it would have helped to mitigate the flawed original formulation for Wiktionary:Votes/2020-10/Use of "pronunciation spelling" label. The problem with the original proposal had, as far as I can see, nothing to do with the supermajority voting rule. If the wording of a proposal is problematic, it should be fixed before it goes live, under any system of voting rules.  --Lambiam 11:57, 20 March 2021 (UTC)[reply]
  • Please see my comment here: Wiktionary_talk:Votes/2021-03/Clarification_of_supermajority_rule#Administrator_role Dentonius 10:55, 17 March 2021 (UTC)
  • When Mihia originally proposed this idea of clarifying the supermajority vote, I understood where he was coming from. I should hope that he first became aware of the problem when I first pointed out a problem with his "pronunciation spelling" vote which, all things being equal, would have changed the status quo even in the absence of votes. Mihia and I worked together and corrected that vote. Metaknowledge, Mihia, and I discussed the matter. Mihia believes that this understanding of how the supermajority rule should function needs to be voted on. I don't. I don't believe Metaknowledge thinks so either. Nevertheless, here we are. Now, I hadn't been paying attention to what Mihia was draughting in the votes area. However, it appears that the vote, as it is currently worded, doesn't even properly address the original problem we had discussed: what to do about votes which seek to push a status quo change by default even in the absence of participation. Mihia's proposed change does nothing to improve the situation and, in my opinion, makes it even worse. We are being asked to trade community consensus in for a select few individuals to decide which votes should be a simple majority vote and which votes should be a supermajority vote. In this regard, Mihia's vote is not a "clarification of the supermajority rule" but would be more appropriately titled: "allow administrators to decide which votes are simple majority and which are supermajority." Mihia, I would have supported your vote if you had stuck to the original issue. — Dentonius 20:14, 17 March 2021 (UTC)[reply]
We have been through all this before. In the absence of ANY votes, clearly nothing changes. I do not know whether a quorum is anywhere specified, or what it is. If you are concerned about votes passing on very low participation, propose a quorum rule, or a stricter one than what we have, if we do have one. This is nothing to do with the problem that my proposal seeks to address, which, frankly, I believe that you do not properly understand. Mihia (talk) 20:27, 17 March 2021 (UTC)
Mihia, I understand very well what you wrote. I hope others do too. — Dentonius 20:30, 17 March 2021 (UTC)[reply]

References

At, for example, bench test, we have:

References

  • “bench test”, in Lexico, Dictionary.com; Oxford University Press, 2019–present.

I see this kind of thing from time to time. Is there any reason for these sporadic instances of "References" sections that refer to entries in random other dictionaries? Unless we are to be systematic, I am tempted to delete them. We don't require our definitions to be sourced to other dictionaries, and we can assume readers already know that "other dictionaries exist", right? Mihia (talk) 02:01, 17 March 2021 (UTC)[reply]

Dictionaries in references are attestations; in LDL languages they are more than essential, in HDL languages they may help as well. However, online dictionaries (with a few exceptions, like the Online Cree Dictionary or INFCOR) don't help with this, so this particular reference can be deleted IMHO. Thadh (talk) 07:49, 17 March 2021 (UTC)

(Notifying Ungoliant MMDCCLXIV, Metaknowledge, Ultimateria): @Fay Freak I am in the process of cleaning this template up but it would help a lot if I had an online reference indicating, for each adjective, how its feminine is formed, particularly for adjectives ending in consonants. The Real Academia Española has detailed rules on plurals but the section in their grammar on adjective feminines is small and vague. I suspect there are a lot of mistakes in existing entries but without a reference I don't know for sure which ones are actually errors. Any ideas? Benwing2 (talk) 04:31, 17 March 2021 (UTC)

The Routledge volume Spanish: An Essential Grammar presents the rules very clearly (starting on p. 52). It's possible there are exceptions they omit, but I can't think of any. —Μετάknowledgediscuss/deeds 04:43, 17 March 2021 (UTC)[reply]
@Metaknowledge I went through the possible mistakes. Of the first half of adjectives, I found the following that I'm not sure of:
  • guanaco "Salvadoran": f=guanaca or f=guanaco (as current)?
  • rector "governing": f=rector, fpl=rectoras or f=rector, fpl=rectores (as current)?
  • protector "protective": f=protector or protectriz (as current)? Which other such words have -triz?
  • retro "retro": f=retra or f=retro (as current)?
  • extensor "extending": f=extensora, fpl=extensoras or f=extensor, fpl=extensores (as current)?
  • sensor "acting as a sensor": f=sensora, fpl=sensoras or f=sensor, fpl=sensores (as current)?
  • señor "great big, whopping": fpl=señoras or fpl=señores (as current)?
  • mafioso "mafia (relational)": f=mafiosa or f=mafioso (as current)?
  • bollo "dyke (relational), lesbian": f=bolla or f=bollo (as current)?
  • colmo "summit (relational?)": f=colma or f=colmo (as current)?
  • nadando "naiant (heraldry)": f=nadanda or f=nadando (as current)?
  • obturador "obturator (anatomy; relational)": f=obturadora/obturatriz or f=obturatriz only (as current)?
  • cingalés "Sinhalese, Sri Lankan": f=cingalesa or f=cingalés (as current)?
  • monoposto "single-seat": f=monoposta or f=monoposto (as current)? pl=monoposto or pl=monopostos (as current)?
  • productor "producing": f=productora or f=productor (as current)?
  • borrachín "drunkard": f=borrachina, fpl=borrachinas or f=borrachín, fpl=borrachines (as current)?
  • tropezón "stumbling": f=tropezona, fpl=tropezonas or f=tropezón, fpl=tropezones (as current)?
  • gil "naive (Argentina/Chile/Uruguay)": f=gil or f=gila (as current)?
  • bretón "Breton": f=bretona, fpl=bretonas or f=bretón, fpl=bretones (as current)?
  • gandul "lazy": f=gandul or f=gandula (as current)?
  • chiclán "having only one visible testicle": f=chiclana, fpl=chiclanas or f=chiclán, fpl=chiclanes (as current)?
  • dicotiledón "dicotyledonous": f=dicotiledona, fpl=dicotiledonas or f=dicotiledón, fpl=dicotiledones (as current)?
  • guasón "dull; funny": f=guasona, fpl=guasonas or f=guasón, fpl=guasones (as current)?
  • agraz "unpleasant, disagreeable": f=agraz or f=agraza (as current)?
  • adenín "adenine": f=adenina, fpl=adeninas or f=adenín, fpl=adenines (as current)?
  • adenosín "adenosine": f=adenosina, fpl=adenosinas or f=adenosín, fpl=adenosines (as current)?
  • regulador "regulatory": f=reguladora, fpl=reguladoras or f=regulador, fpl=reguladores (as current)?
  • multiuso "multiuse": pl=multiuso or pl=multiusos (as current)?
  • dilatador "dilating": f=dilatadora, fpl=dilatadoras or f=dilatador, fpl=dilatadores (as current)?
  • contra reloj "time trial (relational?)": is this invariable?
  • montaraz "wild, rough": f=montaraz or f=montaraza (as current)?
  • oculomotor "oculomotor": f=oculomotora, fpl=oculomotoras or f=oculomotor, fpl=oculomotores (as current)?
  • retractor "retracting": f=retractora, fpl=retractoras or f=retractor, fpl=retractores (as current)?
  • flector "bending": f=flectora, fpl=flectoras or f=flector, fpl=flectores (as current)?
  • antitabaco "antismoking": pl=antitabaco or pl=antitabacos (as current)?
  • benjamín "8 to 11 years of age": pl=benjamínes or pl=benjamíns (as current)?
  • colegial "collegiate": f=colegial, fpl=colegiales or f=colegiala, fpl=colegialas (as current)?
Can you review? Benwing2 (talk) 06:32, 17 March 2021 (UTC)[reply]
There are a bunch of words in this list I don't think I've ever heard. You should ping native speakers. —Μετάknowledgediscuss/deeds 06:46, 17 March 2021 (UTC)[reply]
@Metaknowledge Who are the native speakers here? (Sorry, I don't know most people's native language ...) Benwing2 (talk) 14:23, 17 March 2021 (UTC)[reply]
I thought you'd remember, as we had this same conversation recently regarding es-pronunc. @Koszmonaut, Pablussky, AugPi, Ser be etre shi, Vivaelcelta are all active native speakers. —Μετάknowledgediscuss/deeds 19:40, 17 March 2021 (UTC)[reply]
God, there's so many things here I don't really know. I can tell you, though, a couple of things:
  • rector: f=rectora, fpl=rectoras
  • protector: Some other words that use the suffix -triz to form the feminine are: actor → actriz, emperador → emperatriz, although these are nouns. I actually don't think I had ever heard of an adjective using it.
  • retro: f=retro
  • extensor: f=extensora, fpl=extensoras
  • señor: fpl=señoras
  • mafioso: f=mafiosa
  • bollo: f=bollo
  • nadando: f=nadando, for this is actually a gerund of a verb, and gerund in Spanish doesn't have gender.
  • obturador: both feminines are OK, although I have only heard obturatriz irl. (But the dictionary states obturadora as the feminine of obturador, so I guess it should be right)
  • cingalés: f=cingalesa
  • productor: f=productora
  • borrachín: f=borrachina
  • tropezón: according to the dictionary, f=tropezona, but I swear I didn't know this could be used as an adjective.
  • gil: f=gila
  • bretón: f=bretona, fpl=bretonas
  • gandul: f=gandula
  • dicotiledón: to be honest, I don't actually think this word has a feminine form. Whenever I have to use it with a feminine noun, I would just use dicotiledónea, feminine form of dicotiledóneo, which is a synonym.
  • guasón: f=guasona
  • agraz: f=agraz
  • adenín: this is actually just an apocopated form of adenina used in some chemical names, so I don't think it can ever be feminine, if it can even be considered an adjective.
  • adenosín: same as before
  • regulador: f=reguladora
  • multiuso: at least where I'm from, we'd always use the plural form multiusos, even with singular words, so I don't really know.
  • dilatador: f=dilatadora, fpl=dilatadoras
  • montaraz: f=montaraz
  • contra reloj: yes, invariable. there's contrarreloj, tho, which can have plural.
  • oculomotor: I'd say f=oculomotora, fpl=oculomotoras, although I'm not 100% sure.
  • antitabaco: invariable, pl=antitabaco
  • benjamín: pl=benjamines
  • colegial: when used as an adjective, f=colegial, fpl=colegiales. when used as a noun, f=colegiala, fpl=colegialas.
Hope this was helpful. :) --Pablussky (talk) 20:17, 17 March 2021 (UTC)[reply]
@Benwing2 I agree with all of Pablussky's choices, yes, even the one on dicotiledón (except, well, I've never heard gil or gandul at all... so no idea about those, really). In my dialect we do use multiuso in the singular, and we say f=multiuso (una bodega multiuso). I'm really not sure about the plural (in real life, this adjective tends to be used with rooms that are unique within an institution or building, so I haven't gotten to hear it much in the plural...), but maybe it would vary between fpl=multiuso and fpl=multiusos.
Regarding some of those he left unaddressed:
  • guanaco: yes, f=guanaca, fpl=guanacas. And I'm Salvadoran, so I would know.
  • colmo: I've never heard the adjective (it's really rare), but the current DRAE at least gives the example fanega colma, which points to f=colma, fpl=colmas
  • nadando: after a quick search, it appears this heraldry term is nadando when used as a noun, and nadante when used as either a noun (un nadante = un nadando) or an adjective (un pez nadante)...
  • flector: f=flectora, fpl=flectoras
Finally, I have truly never heard sensor, monoposto or chiclán... Any answer I'd give would be merely analogical (namely monoposta like monocroma; chiclana like patana, truhana). I would tend to say the entry is correct in some vague way about f=sensor and fpl=sensores, if only because if meaning 'acting as a sensor', most likely we're dealing with a compound noun, not an adjective (la puerta sensor, las puertas sensor(es))...--Ser be être 是talk/stalk 01:18, 18 March 2021 (UTC)[reply]
@Koszmonaut, Pablussky, AugPi, Ser be etre shi, Vivaelcelta Thank you very much for your detailed responses! Here is another round; after this I think there are only a few adjectives left to fix.
  • mogol "Mogul", mongol "Mongol", huichol "Huichol", chol "Ch'ol": f=mogola, etc. or f=mogol, etc. (as current)?
  • lao "Lao" pl=lao or laos or both?
  • azul-petróleo "teal" pl=azules-petróleos or azul-petróleo?
  • multigrado "multigrade": f=multigrada or f=multigrado (as current)? pl=multigrado or multigrados (as current)?
  • antisuicidio: pl=antisuicidios (as current) or pl=antisuicidio?
  • monomotor: pl=monomotor or pl=monomotores (as current)?
  • carmesín: f=carmesina or f=carmesín (as current)?
  • záparo: f=zápara or f=záparo (as current)? pl=záparo or pl=záparos (as current)?
  • antihurto: pl=antihurto or pl=antihurtos (as current)?
  • intergrupo: pl=intergrupo or pl=intergrupos (as current)?
  • multimodo: pl=multimodo or pl=multimodos (as current)?
  • blemio: f=blemia or f=blemio (as current)?
  • incomún: f=incomún or f=incomuna (as current)?
  • hominoideo: f=hominoidea or f=hominoideo?
  • kushán "Kushan": f=kushana or f=kushán (as current)? pl=kushán or pl=kushanes (as current)?
  • pijao: pl=pijao or pl=pijaos (as current)?
  • kitán "Khitan": f=kitana or f=kitán (as current)? pl=kitán or pl=kitánes (as current)?
  • caxcán "Caxcan": f=caxcana or f=caxcán (as current)? pl=caxcán or pl=caxcánes (as current)?
  • warao "Warao": pl=warao or pl=waraos (as current)?
  • multinúcleo "multicore": pl=multinúcleo or pl=multinúcleos (as current)?
  • bugarrón: f=bugarrona or f=bugarrón (as current)?
  • antinarco: pl=antinarco or pl=antinarcos (as current)?
  • cubeo "Cubeo": f=cubea or f=cubeo (as current)? pl=cubeo or pl=cubeos (as current)?
  • papiado "buff": f=papiada or f=papiado (as current)? pl=papiado or pl=papiados?
Benwing2 (talk) 05:18, 18 March 2021 (UTC)[reply]
The following are the very last ones:
  • multidispositivo "multidevice": f=multidispositiva or f=multidispositivo? pl=multidispositivo or pl=multidispositivos?
  • beaumontés "of or supporting Luis de Beaumont": f=beaumontesa or f=beaumontés?
  • vigesimonoveno "29th": f=vigesimonovena or f=vigesimanovena?
  • abacanao "presumptuous (Chile)": f=abacanao or f=abacanaa or f=...?
  • heterodermo "?": f=heteroderma or f=heterodermo?
  • butuco "short and stout (Honduras)": f=butuca or f=butuco?
  • mosuo "Mosuo": f=mosua or f=mosuo? pl=mosuo or pl=mosuos or both?
  • senufo "Senufo": f=senufa or f=senufo? pl=senufo or pl=senufos or both?
  • zorrón "sluttish": f=zorrona or f=zorrón? does it differ as a noun vs. an adjective?
  • exgay "ex-gay": pl=exgays or pl=exgais?
  • improlijo "sloppy (Argentina)": f=improlijo or f=improlija?
  • antirretorno "non-return": pl=antirretorno or pl=antirretornos? f=antirretorno or f=antirretorna?
  • suido "swine": f=suida or f=suido?
Benwing2 (talk) 06:43, 18 March 2021 (UTC)[reply]
  • mogol, mongol: f=mogola and f=mongola. I don't really know about the others, but they probably follow the same rule.
  • azul-petróleo: I'm quite sure this hyphen here is just a typo. It should be azul petróleo, as two separate words, and the plural form would be azules petróleo.
  • multigrado: Never heard this word, but I'm positive the feminine is invariable: multigrado.
  • antisuicidio: probably pl=antisuicidio, just like in antitabaco
  • antihurto: same as before
  • blemio: f=blemia
  • incomún: f=incomún
  • hominoideo: probably f=hominoidea
  • multinúcleo: I'd say pl=multinúcleo
  • bugarrón: apparently it's a word with a meaning that only applies to men, so it shouldn't have feminine, I guess. Searching the web, I found the term bugarrona, applied to women, mainly used as the English term dyke.
  • multidispositivo: invariable both in gender and number.
  • beaumontés: f=beaumontesa
  • vigesimonoveno: f=vigesimonovena
  • abacanao: since it's a pronunciation spelling of abacanado, I guess the feminine form should be abacaná, pronunciation spelling of abacanada.
  • heterodermo: probably f=heteroderma
  • exgay: the plural of gay is gais in Spanish, so I guess it should be pl=exgais
  • improlijo: probably f=improlija, but I'm not from Argentina, so I have never actually heard this word.
  • antirretorno: invariable in gender or number.
I hope these help you, and I'm sorry I'm not able to help with the rest. --Pablussky (talk) 11:37, 18 March 2021 (UTC)[reply]
FWIW, I can find "la presentación evidentemente deficiente e improlija", "las manos improlijas" and "las recolecciones apresuradas e improlijas", etc in Argentinian books, confirming Pablussky's guess above. (I can also find books using bugarrona, but the sense is obviously somewhat different in the feminine from in the masculine, as he says.) - -sche (discuss) 05:37, 19 March 2021 (UTC)[reply]
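The regular patterns that the answers above mostly confirm can be sketched roughly as follows. This is a heuristic under stated assumptions, not the logic of the template being cleaned up: the function names `feminine` and `plural` are hypothetical, the list of endings that take -a is a textbook simplification, and input is assumed to use precomposed accented letters.

```python
# Accented vowels that lose their written accent when a syllable is added.
_ACCENTS = {"á": "a", "é": "e", "í": "i", "ó": "o", "ú": "u"}


def _drop_final_accent(word):
    """bretón -> breton, cingalés -> cingales; other words are returned unchanged."""
    if len(word) >= 2 and word[-2] in _ACCENTS and word[-1] in "ns":
        return word[:-2] + _ACCENTS[word[-2]] + word[-1]
    return word


def feminine(adj):
    """Default feminine singular of a masculine singular adjective (heuristic)."""
    if adj.endswith("o"):
        return adj[:-1] + "a"
    # Assumed set of consonant endings that add -a; everything else is invariable.
    if adj.endswith(("or", "án", "ón", "ín", "és")):
        return _drop_final_accent(adj) + "a"
    return adj


def plural(form):
    """Default plural of a singular form (heuristic)."""
    if form[-1] in "aeiou":
        return form + "s"
    if form.endswith("z"):
        return form[:-1] + "ces"
    return _drop_final_accent(form) + "es"


for adj in ["mafioso", "rector", "bretón", "cingalés", "agraz"]:
    print(adj, feminine(adj), plural(adj), plural(feminine(adj)))
```

Several items in the lists above do not follow these defaults and would need per-entry overrides: gil and gandul take -a according to the replies, retro and bollo stay invariable despite ending in -o, the anti- and multi- compounds are reported as invariable in number, and colegial differs between its adjective and noun uses.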

Please comment on this new vote, which intends to build on the work with place names to cover extraterrestrial entities as well. —Μετάknowledgediscuss/deeds 22:28, 18 March 2021 (UTC)[reply]

Entries for specific names as Translingual?

For example, should the specific name falciparum (from Plasmodium falciparum) get its own Translingual entry? Or should it be entered as Plasmodium falciparum instead (like what the French folks have done with fr:Plasmodium falciparum)? I just really want its etymology to be included in Wiktionary. Kritixilithos (talk) 14:49, 19 March 2021 (UTC)

Implying that Gorilla gorilla and Gorilla gorilla gorilla exist, sure. ॥ সূর্যমান 22:00, 19 March 2021 (UTC)[reply]
First of all, we would want to have the main entry at falciparus if we have one, since falciparum is just the neuter form (to agree with Plasmodium, which is a neuter noun). There's also Laverania falcipara, which is what you get when you treat the subgenus Laverania as a genus in its own right.
Apparently the specific epithet has only been used for this one species, so creating a translingual entry might not be a high priority, and there are those who would argue against a Latin entry, because it's apparently never been used except in the context of this species. Ironically, the original version of the name, "Haematozoon falciparum", was rejected because it was argued that the author didn't intend to create a taxonomic genus named "Haematozoon", which is Latinized Ancient Greek for "blood animal". But then, the history of the taxonomic name is a colossal mess. They finally had to overrule the normal rules via a vote by the governing body in order to settle the matter and prevent massive confusion.
I'm not sure about the etymology anyway, since English par is already an adjective and there's a verb, pareo, which fits much better, semantically (though I don't know Latin well enough to explain what happened to the "e"). Chuck Entz (talk) 23:48, 19 March 2021 (UTC)[reply]
This does look like a good use of the practice of including the epithet etymology in the entry for the species. Generally, I'd rather see it on a separate page for the epithet, but there doesn't seem to be any justification for the separate page. I found an image (now on [[falciparum]]) that seems to provide a clue why the epithet was (more or less) appropriate. DCDuring (talk) 03:35, 20 March 2021 (UTC)
"so creating a translingual entry might not be a high priority", OK I'll make a translingual entry for Plasmodium falciparum sometime, because I too agree that falciparum by itself shouldn't belong as a Latin entry for the same argument you present. Regarding the etymology, according to https://doi.org/10.1016/0169-4758(87)90153-0 falciparum is wrong Latin (specifically the -parum part, as you have noted). The article also presents this confusion that is its generic name. Kritixilithos (talk) 16:12, 20 March 2021 (UTC)[reply]
I can see nothing wrong with the formation of falciparum if it's meant to mean "bringing forth a sickle(-shaped gametocyte)". The -parum part is from pario (to bring forth) and is formed according to viviparus and Deipara. Or am I missing something? --Akletos (talk) 07:04, 21 March 2021 (UTC)[reply]
You are. fr.wikt claims that falciparus is "Composé de falx (« faux »), par et -us, littéralement « pair, égal »." (faux meaning "sickle" in this case, not "false", of course). I considered pario, but I like pareo better, semantically. Whether it works morphologically is another matter entirely... Chuck Entz (talk) 07:46, 21 March 2021 (UTC)[reply]
That which is visible (pāreō) as a sickle shape are not the motile sporozoites that infect the host, but the gametocytes (germ cells) they produce (păriō). So the infectious agent is falcipārentipărus :).  --Lambiam 12:22, 21 March 2021 (UTC)[reply]
Yup, basically what that article says too regarding par being used incorrectly. Kritixilithos (talk) 12:50, 21 March 2021 (UTC)[reply]
Who's saying that it's supposed to mean "sickle-shaped, resembling a sickle" apart from fr.wt (which isn't providing refs)? Does this wrong explanation appear already in the species description? "producing sickles" does make sense, doesn't it? --Akletos (talk) 07:13, 25 March 2021 (UTC)

Etymology bashing

Hello. I'm airing my complaints. Don't mind me walking by the newcomers desk, grabbing a beer and a chair.

Etymologies are an important part of today's comprehensive and self-respecting dictionary, yes? Well, the word is said to come from Greek etumologia, but the senses of a word do change, in this case to its current meaning in English: something like a hypothesis of the evolution of a word to its current form.

Now then, on Wiktionary, what do we want Etymology to stand for? I see business-made trademarks inserted as an Etymology. I see Internet web-speak referenced as one. I have seen recent foreign loanwords given as etymologies, and wow (!) even the transference of linguistic practices is held up as etymology.

Are etymologies so broadly defined now? I cannot grasp this concept of etymologies, so forgive me. The wikiproject for etymologies has an RfD tag on it, and searching the archives for etymology gives me a clutter of pages with no sorted dates. So help me, Wiktionary. This is not an early April Fools' joke and I really feel like I'm losing sleep tonight over this. (I'm following from Recent Changes) 119.56.96.202 16:05, 19 March 2021 (UTC)[reply]

I think both the origin and development of words and other terms viewed as a sequence of phonemes or graphemes, and the origin and development of their senses (meanings) are of interest. I cannot think of a reason to exclude the origins of terms that are recent. Oxford English Dictionary defines the term as having the meanings “The study of the origin of words and the way in which their meanings have changed throughout history” and “The origin of a word and the historical development of its meaning”.  --Lambiam 00:02, 20 March 2021 (UTC)[reply]
I don't understand. Etymology is where the word comes from. If that's a recent foreign loanword or a business-made trademark or from the Internet, that's where the word comes from.--Prosfilaes (talk) 09:15, 21 March 2021 (UTC)[reply]
Hello, it took a while for me to get back to this page. Recent Changes got very cluttered. Anecdote: one of my relatives pointed me to wikipedia:etymology, not that it helped.
Okay, let me try going into specifics. The English meaning now of etymology is more or less "the becoming of a word". So quite new words can have etymologies, yes. So what if an entry is less than a word by itself, say a glyph (single written symbol). Can a glyph have an etymology, which usually applies to words? How about stretching it further to have etymologies of linguistic marks like punctuation? And an extreme example where it is not even a mark but just a spacing, etymologies in Unsupported titles/Space?
I am not saying that the historical information on glyphs, marks and other practices does not belong here. It can be very helpful. It contributes substantially to a better understanding of the entries. But applying the word "etymology" to such entries seems far-fetched in the year 2021, especially when the word is meant to refer to words. 119.56.99.36 09:10, 24 March 2021 (UTC)[reply]
@119.56.99.36 I honestly don't see any way around having business trademarks, web-speak and recent foreign loanwords in etymology sections. People derive new words by turning trademarks generic ("to google") or deriving words from them with an affix. People quickly derive words from new web-speak words (pog > poggers, with -ers). People borrow words and quickly derive new words from them (boba tea > bubble tea, via phonosemantic matching). It's simply the reality of our linguistic situation today, even if these business/Internet/fast-and-foreign origins seem unsightly to you, and to do otherwise seems a denial of reality... Try to think of them as fun rather than ugly?
For what it's worth, big well-reputed dictionaries also include new words and their origins. The Oxford English Dictionary mentions "meme" as originating in the 1970s (in Richard Dawkins' book). And you could check out Coromines & Pascual's Spanish etymological dictionary to see how long and discursive entries can get in some publications, including plenty of discussion of evolving meanings, examples in specific works of famous literature, comments by writers or grammarians, etc. There are words that were introduced by individuals like Shakespeare, or in Spanish, by Juan de Mena. Cicero excuses himself for using elementum in the philosophical sense of "element" (water, fire, earth, air), either seeking lenience from the reader for the unusual usage, or coining a new meaning for the existing word.--Ser be être 是talk/stalk 14:03, 24 March 2021 (UTC)[reply]
Some characters have fascinating etymologies. Emojis have an etymology. Your point is that "Etymology" doesn't make sense when it's not restricted to old words? What's the alternative? – Jberkel 14:29, 24 March 2021 (UTC)[reply]
I have done work on geographical terms in Asia and their etymologies, and I want to say that the origin of many geographical terms, especially those created in the 19th, 20th or 21st centuries, is a subject that is not that well understood. On Wiktionary, I have gotten to explore the origins of loan words from Asian languages into English in a way that no classroom setting or research project allows. I heartily disagree with the spirit of this etymology bashing. --Geographyinitiative (talk) 14:40, 24 March 2021 (UTC)[reply]
There's nothing really new about these kinds of etymologies: look at aspirin, dry ice, hansom, mackintosh and tiddlywinks, all trademarks or brand names. Words like silhouette, daguerreotype and chauvinist were named after people in ways reminiscent of a lot of modern coinages. As for characters, what about ampersand and asterisk, which have interesting histories? History is not static: today will be a quaint matter of historical curiosity in a century or two. Chuck Entz (talk) 15:20, 24 March 2021 (UTC)[reply]

@Chuck Entz @Geographyinitiative: @Jberkel: @Ser be etre shi: @Lambian: @Profilaes

Hello, 119.56.0.0 /16 here. I do appreciate your considerations. So I do my best to explain my concerns.
I have one reason in support, namely that the use of etymology has evolved, and two reasons against: that Wikimedia projects should describe the world as it is and not prescribe what it should be, and that we write using formal writing here.
The supporting reason is that the word etymology has indeed evolved, as some here have said. Etymology, as an English word, was created by some scholar/academic (which can be a clue why this word can feel so technical and so restrictive). In the old, strict meaning, etymologies have to refer to predecessor forms of words (note the emphases). That is, a word used to have an older form from somewhere, which is cited as its etymology. Now etymology has evolved so that it can mean any possible origin story of a word that is supported by attestations, or that at least amounts to an educated guess. (Trademarks can have etymologies today.) Note how etymology is still obsessed with talking about words. But still, the meaning of etymology has expanded (become broader). This is its current evolution, as we can see it being used in reputable English dictionaries now in 2021. It may expand further to also refer to emoticons or even CJK radicals in future years. I would not be surprised if it happens 10 or 50 years later, but not yet now in 2021, I say.
My first objection is that Wikimedia projects should describe the world as it is. This means using etymology as is used normally in other dictionaries, and not going over the bounds of its normal use. This means not using etymology in English to refer to the origins of characters, glyphs or graphemes.
I know not if other languages use etymology to also refer to characters or part of a word. But are we sure that the other language used a word that is truly cognate, and not a translation to hyponym or hypernym senses? I brought this up because Wiktionary accepts words from all languages, so I added this as a cautionary note.
My second objection is that we write in formal English here. This means we write with commonly-accepted words and senses. (This builds on the first objection.) I do not see reputable English dictionaries use etymology to describe anything other than words. That is, a word means that which can stand on its own, without being conjoined to any other characters, and has its own definition. If I don't see the reputable English dictionaries using etymology to refer to other than words, I do not see why we should be doing so.
In closing, I do recognize the consensus policy, that common agreement is the basis of decisions. Wiktionary may very well decide as a whole to continue using etymologies to refer to all part(icle)s of words. Instead of following generally accepted use now, WT can be an early adopter and break new ground if it chooses to. But that does go against the policy of being descriptive and not prescriptive. I just want WT to be aware that this issue exists, no matter the final decision.
So what other words could replace etymologies? Why not history or origins, which can refer to many things anyway? If you are unhappy with the narrow senses of etymology, don't blame me! A scholar/academic/researcher created the word in English and I can't help it. 119.56.99.230 09:42, 29 March 2021 (UTC)[reply]
Follow up to etymologies on Unsupported titles/Space. Can the practice of leaving spaces between words be an etymology? How about further down. We have vaporwave as an etymology for the spacing practice there. And so on. Did you feel that these are ridiculous? Weird? 119.56.99.230 09:42, 29 March 2021 (UTC)[reply]
The problem is that our naming system for headings (WT:EL) is fixed and has to be consistent throughout entries, so it has to be "Etymology" everywhere. The French Wiktionary uses templates in headings (==={{S|étymologie}}===), which means the actual text could be changed or parametrized. This has other consequences for data processability, however. I'm not sure much would be gained by mass-renaming all our etymology headers to "Origins", though. – Jberkel 11:37, 29 March 2021 (UTC)[reply]
@Jberkel So there is grandfathered programming code. Sounds difficult. So let's keep to the lesser evil? (whichever it is) 119.56.98.224 17:06, 29 March 2021 (UTC)[reply]
Yes, but the question is if the "Etymology" heading is really confusing in the small percentage of entries where it is not related to a word. – Jberkel 18:09, 29 March 2021 (UTC)[reply]
Here: [10], 仔 is split into six etymologies. I think that's what this person is against? --Geographyinitiative (talk) 13:14, 29 March 2021 (UTC)[reply]
The writing system makes it a bit more complex, but I don't see it as any different than cat or record; it's tracing the words, not the written representation thereof.--Prosfilaes (talk) 13:30, 29 March 2021 (UTC)[reply]
@Geographyinitiative Hello. The number of etymologies is not a problem, as long as each one is tracing the past of the word. It is also possible for single characters to form words. Like in Chinese? In English, a is the common example. The issue in this discussion is about writing etymologies for non-words. 119.56.98.224 17:06, 29 March 2021 (UTC)[reply]

Putting this another way, the core issue appears to be a concern about the meaning of the term etymology itself.

  • If this term is limited in scope to referring solely to the derivation and history of words, then I agree that we have a (minor) problem in using the ===Etymology=== heading on entries for things like emoji, punctuation, etc.
  • If this term is not limited in scope to referring solely to the derivation and history of words, as implied by the etymology of etymology, then it would appear that our entry at etymology needs to have an additional sense or two.

The Wikipedia page at [[w:Etymology]] suggests that this is limited to words. However, even there, there's a discussion of morphemes, which opens the door a bit.

Meanwhile, Merriam-Webster's entry defines the term as (emphasis mine):

the history of a linguistic form (such as a word) shown by tracing its development since its earliest recorded occurrence in the language where it is found...

This suggests that etymology need not be limited to words, so long as we define "linguistic form" to include things like emoji, punctuation, etc. Merriam-Webster's entry for "linguistic form" describes this as "a meaningful unit of speech", but I don't know why this should necessarily exclude "a meaningful unit of writing".

Curious as to others' thoughts. ‑‑ Eiríkr Útlendi │Tala við mig 20:32, 29 March 2021 (UTC)[reply]

@Eirikr Hello, it's T-s here. Thank you for your detailed reply. I just checked my c.1994 Merriam-Webster paperback. Etymology is defined there as

1 : the history of a linguistic form (as a word) shown by tracing its development and relationships 2 : a branch of linguistics dealing with etymologies

Another Macmillan millennium dictionary, lists etymology as noun [C] the origin and development of a particular word.

Maybe lexicographers evolve their definitions or something. 119.56.99.206 07:12, 30 March 2021 (UTC)[reply]

Call for review, comment and discuss my PhD thesis on Wikimedia movement

Hello,

Just a short message to call people interested to review, comment and discuss my PhD thesis on Wikimedia movement. All the best, Lionel Scheepmans (talk) 19:32, 19 March 2021 (UTC)[reply]

Spanish reflexive-only verb forms (again)

@DTLHS, Metaknowledge, Ungoliant MMDCCLXIV, Ultimateria, Koszmonaut, Pablussky, AugPi, Ser be etre shi, Vivaelcelta How should we handle nonlemma forms of reflexive-only verbs in Spanish (and for that matter, any other languages with lexicalized reflexives)? User:NadandoBot has created tons of Spanish verb forms of the nature of e.g. agilipollo, which claims to be the first-person singular present indicative form of agilipollarse. This is misleading at the very least, and maybe wrong entirely; the actual first-singular present indicative of agilipollarse is me agilipollo, while agilipollo is the first-singular present indicative of agilipollar (which doesn't seem to exist other than as a reflexive verb). The tricky thing here is that the reflexive pronoun is a clitic attached to the infinitive (and present participle, and imperative forms), but a separate word with respect to all other verb forms. In Russian, meanwhile, the reflexive pronoun is always a clitic, and my bot has created non-lemma reflexive forms with the reflexive form attached, pointing to the infinitive also with the reflexive form attached. Hence влюби́лся (vljubílsja, I/you/he fell in love) points to влюби́ться (vljubítʹsja, to fall in love), both with attached -ся (-sja). The corresponding way of doing things in Spanish would be to create nonlemma forms me agilipollo, te agilipollas etc. rather than agilipollo, agilipollas, but that might not be so helpful to the beginner, who might not realize that agilipollarse is reflexive-only and would try to look up agilipollo instead of me agilipollo. One possibility is to have agilipollo say "Only used in me agilipollo", while the latter is listed as the first-singular present indicative of agilipollarse. Thoughts?

BTW I'm sure this issue has come up before, but I don't remember the outcome (if any). Benwing2 (talk) 02:26, 24 March 2021 (UTC)[reply]

I don't know much of Wiktionary's criteria on this, but I think the most useful option would be having both entries, the one-word one redirecting to the reflexive one, just as you suggested in the end. --Pablussky (talk) 08:48, 24 March 2021 (UTC)[reply]
I've never quite understood why Spanish-English bilingual dictionaries so often treat reflexive verbs separately. Is the lack of space between infinitive and pronoun really so important? Many verbs can be used either way, with the reflexive form being a bit more emotional or colloquial (morir, morirse), or are "labile" verbs with the reflexive being a kind of passive but with no change in meaning (quemar algo, quemarse), or involve an indirect object reflexive that's very productive (darse algo, quedarse con algo), sometimes with a similar emotional/colloquial nuance (comer algo, comerse algo; beber algo, beberse algo). Considering how frequent the relationship is between so many pairs (the stronger difference between ir 'to go' vs. irse 'to leave' is more exceptional), it seems more useful to just point at the bare infinitive, and include the meanings there, even in the case of the likes of ir vs. irse. It's what the DRAE does.--Ser be être 是talk/stalk 13:41, 24 March 2021 (UTC)[reply]

I think it makes more sense to keep reflexive pronouns out of page titles for verb forms, but they could be worked into the headword as:

se agilipolla

  1. Formal second-person singular (usted) present indicative form of agilipollarse.
  2. ...

or

agilipolla (with reflexive pronoun se)

  1. Formal second-person singular (usted) present indicative form of agilipollarse.
  2. ...

with the definitions unchanged. Either one would be an improvement to these entries. Thoughts? Ultimateria (talk) 16:45, 24 March 2021 (UTC)[reply]

@Ultimateria, Ser be etre shi I generally like User:Ser be etre shi's suggestion. In any case there is no consistency at all in Wiktionary in how we treat reflexive verbs. Sometimes they're lemmatized under the reflexive form but much of the time even for reflexive-only verbs, they're lemmatized under the non-reflexive form, e.g. autogobernar. (What about e.g. abofar, which appears to be almost always reflexive but is listed at the non-reflexive form?) We can have the reflexive versions simply redirect to the non-reflexive ones, and maybe include reflexive conjugations along with or instead of non-reflexive ones. I also think User:Ultimateria's suggestion is good if we are to keep reflexive-only verbs lemmatized at the reflexive form. Benwing2 (talk) 05:09, 25 March 2021 (UTC)[reply]
As long as we're consistent with the formatting and provide all the necessary soft redirects, I'm fine with lemmatizing under the bare infinitive. Ultimateria (talk) 21:36, 25 March 2021 (UTC)[reply]
I strongly support this, reduce all reflexive entries to a simple link or redirect to the non-reflexive and have all the usage in one place. It's not only simpler, it's more informative since you'll see whether or not the verb has reflexive usage without having to do another lookup, and you'll see what, if anything, changes when it's used reflexively. JeffDoozan (talk)
For all languages, collapse all modified entries to the most basic form if they actually mean the same thing.
Also, for the page-not-found MediaWiki message: encourage people to search for a word (first or second bullet) instead of always first suggesting creating a new page? Wonder if it will work. 119.56.98.252 10:14, 29 March 2021 (UTC)[reply]

Spanish "irregular verbs"

Would it be possible to purge the Spanish irregular verb categories of c>z and g>gu and the like? It decreases their functionality a lot. Dngweh2s (talk) 23:29, 24 March 2021 (UTC)[reply]

@Dngweh2s Sounds good to me. Benwing2 (talk) 05:09, 25 March 2021 (UTC)[reply]
I am redoing {{es-verb}}; after that I'll see about fixing the irregular verb categories to distinguish truly irregular verbs from those with predictable spelling changes and those with certain unpredictable changes that aren't truly irregular, just subconjugations (e.g. recordar -> recuerdo, confiar -> confío). Benwing2 (talk) 08:21, 25 March 2021 (UTC)[reply]
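To illustrate the distinction drawn above between truly irregular verbs and those with merely predictable spelling changes (c>z, g>gu and the like), here is a minimal sketch in Python. It is not how {{es-verb}} works; the function names and rule tables are hypothetical, and it checks only one diagnostic form per verb: the first-person singular preterite for -ar verbs (where the ending begins with e) and the first-person singular present for -er/-ir verbs (where the ending is o).

```python
# Hypothetical rule tables: stem-final respellings that keep the consonant
# sound unchanged when the following vowel changes.
BEFORE_E = [("gu", "gü"), ("c", "qu"), ("g", "gu"), ("z", "c")]  # -ar verbs
BEFORE_O = [("gu", "g"), ("qu", "c"), ("c", "z"), ("g", "j")]    # -er/-ir verbs


def adjust_stem(stem, rules):
    """Apply the first matching stem-final respelling, if any."""
    for old, new in rules:
        if stem.endswith(old):
            return stem[: -len(old)] + new
    return stem


def expected_form(infinitive):
    """Regular diagnostic form, allowing only predictable spelling changes.

    -ar verbs: first-person singular preterite (pagar -> pagué).
    -er/-ir verbs: first-person singular present (vencer -> venzo).
    """
    stem, ending = infinitive[:-2], infinitive[-2:]
    if ending == "ar":
        return adjust_stem(stem, BEFORE_E) + "é"
    return adjust_stem(stem, BEFORE_O) + "o"


def regular_modulo_spelling(infinitive, attested_form):
    """True if the attested diagnostic form is regular up to respelling."""
    return attested_form == expected_form(infinitive)


if __name__ == "__main__":
    for inf, form in [("pagar", "pagué"), ("vencer", "venzo"),
                      ("conocer", "conozco"), ("tener", "tengo")]:
        print(inf, form, regular_modulo_spelling(inf, form))
```

Under these assumptions pagar and vencer would stay out of the irregular categories, while conocer and tener would not. Stem-vowel alternations such as recordar -> recuerdo are not detected by this check, since they do not show up in the diagnostic forms used here.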

12-year-old entries

According to OldPages, there are now a few entries which haven't been edited in twelve years. Either they are perfect, or so obscure that nobody would ever find them. Anyway, the winner is the entry mandelstein, the first entry not to have been touched in 12 years. It was made by Jackofclubs (talkcontribs), a mediocre editor if there ever was one. I wonder what came of them... Yellow is the colour (talk) 11:58, 25 March 2021 (UTC)[reply]

I spruced up some Italian and Catalan pages. Ultimateria (talk) 21:39, 25 March 2021 (UTC)[reply]
Good place to hide some easter eggs. – Jberkel 23:55, 26 March 2021 (UTC)[reply]
I assume this is your "subtle" way of letting us know that you're Wonderfool?__Gamren (talk) 02:42, 27 March 2021 (UTC)[reply]
WF's MO is pretty easy to spot, esp. since they like to edit User:AryamanA/Wonderfool and add their user names there :) ... Benwing2 (talk) 18:44, 28 March 2021 (UTC)[reply]

Thank you! Finally someone brought up the forsaken pages of yore *cough*.

They are too obscure. Not noteworthy enough. Sometimes I wonder if we should have a project to clean up the most AncientPages, to see if there are attestations and if they meet CFI. (Does this go against the principle of inclusivity? Uh, never mind.) 119.56.98.252 10:07, 29 March 2021 (UTC)[reply]

: Lua error: not enough memory

I'm mainly on zhwikt, so I'm not sure what to do. Maybe provide a subst: version of {{Japanese_numbers}}? Also, {{Japanese_numbers}} doesn't include (). EdwardAlexanderCrowley (talk) 12:21, 25 March 2021 (UTC)[reply]

Sadly, removing {{Japanese_numbers}} is not enough. EdwardAlexanderCrowley (talk) 12:24, 25 March 2021 (UTC)[reply]
@EdwardAlexanderCrowley: See Wiktionary:Lua memory errors. J3133 (talk) 12:27, 25 March 2021 (UTC)[reply]
Well, after imagining 10 琉球 dialects being added, I retreated. EdwardAlexanderCrowley (talk) 12:38, 25 March 2021 (UTC)[reply]

Does anyone want to organize another round of A multilingual word game and see if Mattel sues us for copyright issues? Yellow is the colour (talk) 23:09, 26 March 2021 (UTC)[reply]

Rhymes:Spanish - word forms, alliteration

Seeing @Benwing2's recent great work in Template:es-IPA, and the good quality of the generated pronunciations, I was thinking of writing a bot using similar code that would 1) generate pronunciations of all Spanish word forms for itself (without adding them to the Wiktionary entries), and then 2) generate lists of rhymed words for me to review and add here, to Rhymes:Spanish.

I have two questions:

  • Does anyone oppose me doing this with word forms, namely, listing conjugations?
  • I am also interested in adding assonant rhymes, if only for words stressed in the third-to-last syllable. E.g. -ˈaCaCo (ábaco, cálamo, cántaro, pájaro), -ˈeCaCo (pétalo, piélago), -ˈoCaCo (estómago, sótano), -ˈuCiCa (túnica, última). This would be a very unusual thing to include on Wiktionary, but I think users could find it very helpful for poetry. (Although I should probably focus on Spanish Wiktionary for this, rather...)
  • Note: in either case, forms with clitics would not be included; rather, there would be just a comment about -ˈaCa verbs being able to rhyme via -nos/-os/-lo(s) (alcánzalo, cántanos).
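
As a rough illustration of the assonant grouping described in the list above, here is a minimal sketch in Python. It works from spelling rather than from the IPA generated by Template:es-IPA, which is a simplifying assumption (a real bot would presumably start from the generated pronunciations), and the names assonance_key and group_by_assonance are hypothetical. It relies on the fact that proparoxytones always carry a written accent in Spanish.

```python
import re
from collections import defaultdict

ACCENTED = {"á": "a", "é": "e", "í": "i", "ó": "o", "ú": "u"}
# Stressed vowel plus exactly two more vowels, each separated by consonants.
PROPAROXYTONE = re.compile(r"[aeiou]C[aeiou]C[aeiou]C?")


def assonance_key(word):
    """Return a key such as '-ˈaCaCo', or None if the word does not qualify.

    Limitations of this sketch: it works on spelling only, so silent u
    (as in 'máquina') and hiatus or diphthongs after the stress make the
    word fall outside the pattern and yield None.
    """
    word = word.lower()
    for i, ch in enumerate(word):
        if ch in ACCENTED:
            tail = ACCENTED[ch] + word[i + 1:]
            break
    else:
        return None  # no written accent, so not a proparoxytone
    # Collapse every run of non-vowel letters into a single 'C'.
    key = re.sub(r"[^aeiou]+", "C", tail)
    if PROPAROXYTONE.fullmatch(key) is None:
        return None
    return "-ˈ" + key


def group_by_assonance(words):
    """Group words by assonance key, skipping words that do not qualify."""
    groups = defaultdict(list)
    for w in words:
        key = assonance_key(w)
        if key is not None:
            groups[key].append(w)
    return dict(groups)


if __name__ == "__main__":
    sample = ["ábaco", "cálamo", "cántaro", "pétalo", "piélago",
              "túnica", "última", "canción"]
    for key, members in group_by_assonance(sample).items():
        print(key, members)
```

The resulting groups would still need manual review before being added to Rhymes:Spanish, as proposed above.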

Pinging @Metaknowledge due to his general interest in conventions and upkeep around here.--Ser be être 是talk/stalk 16:40, 27 March 2021 (UTC)[reply]

@Ser be etre shi This sounds great to me. I think if you want to list non-lemma forms on rhyme pages, you should list them in a separate section on the page. As for generating the pronunciations themselves, with a bit of change to Module:es-pronunc it can be used to directly generate those pronunciations, i.e. you can invoke the module from a bot and get the generated pronunciation without having to rewrite the code in Python or whatever. If you are interested in doing this, let me know and I'll help you set it up. Benwing2 (talk) 16:47, 27 March 2021 (UTC)[reply]
BTW, not opposed to adding assonant rhymes but again they should be in a separate section. It seems to me there might be a ton of them, though. Benwing2 (talk) 16:48, 27 March 2021 (UTC)[reply]
@Benwing2 Yeah, that is why I wanted to limit them to proparoxytonic words, otherwise there would be far too many, and they would be well served by the normal consonant rhyme pages anyway (like this one).--Ser be être 是talk/stalk 17:14, 27 March 2021 (UTC)[reply]
@Ser be etre shi I added the function IPA_string to Module:es-pronunc for bot usage; see the comments above this function. Benwing2 (talk) 20:52, 27 March 2021 (UTC)[reply]

Capitalisation and full stops in definitions; also in form-of templates

So looking at some recent and quite welcome edits by @JoeyChen like this one, this old sore of mine has been disturbed. WT:STYLE tells us not to capitalise glosses or end them with stops, which seems like prudent counsel to me - in the linked edit, the result feels positively wrong. The same goes for gloss + article, and likewise for a series of glosses with or without articles. Neither does it feel right to start adding capitalisation and full stops all of a sudden when conjunctions or remarks like "especially" find their way into the definition. Nor does adding a {{gloss}} to any of the above make it look like a worthy occasion for treating it as a sentence. To illustrate: what exactly makes polka a gloss but "A polka jacket." or "To dance polka." a sentence? The first is a Noun Phrase. The second is a Determiner Phrase, equivalent in most ways to a noun phrase. The third is a non-finite Verb Phrase heading a noun phrase; non-finite verb phrases function similarly to noun phrases. The definition of a sentence is notoriously tricky, but I feel confident that the vast majority will not consider these stand-alone sentences if only because they lack predication; and lacking predication is true for the absolute majority of definitions.

I'm not talking about merely visual preference, mind: capitalisation signals to my subconscious reading faculty that I'm reading a finished sentence. This is like writing in ALL CAPS - IT FEELS LIKE SHOUTING or MAKING A FIRM STATEMENT on a subconscious level. In this case, finishing your sentences with stops online has been found by researchers to be insistent and rude: Google full stop rude for some articles on the topic. This is doubly true of one to three-word sentences, and in our case this is precisely what one will end up with most of the time.

I agree that a capital and a stop are absolutely the right choice for italicised Category:Form-of_templates. In this respect I was genuinely perplexed to find out through practice that there's seemingly no logic to why some of the templates use stops while others don't, so changing templates involves keeping track of stops; not to mention capitalisation!! What's going on there?

I tried using capital+stops on numerous occasions, inspired by {{R:OLD}} entries which use cap-stops, and came to the conclusion that it's impossible to find a dividing line between a gloss and a full sentence; indeed, that it's unusual for a full sentence with a predicate to appear in a definition on the website, and that these feel wrong as definitions (let alone several full sentences); that reading a list of such definitions feels not only wrong (like being insistently told by someone), but also jerky - there's a very good reason list entries are commonly not capitalised and not ended with full stops. Besides, a capital feels wrong after a label. In view of this, I've ended up never capitalising definitions at all, but reserve it for non-gloss definitions {{n-g}} and forms-of, since these can generally be rephrased with a predicate. What are other editors' thoughts? Brutal Russian (talk) 18:34, 28 March 2021 (UTC)[reply]

@Brutal Russian Yes, there is no consistency. See Category:Form-of templates, where I have categorized all form-of templates as to whether they include an initial capital letter and/or final period. My belief is that italicized form-of phrases should not be capitalized or include a final period in foreign languages because foreign languages use non-capitalized glosses primarily, but should be capitalized and include a final period in English because we use sentence-like definitions in English entries. However, I proposed implementing this before and it got shot down. Benwing2 (talk) 18:40, 28 March 2021 (UTC)[reply]
(sorry, I had seen the category bit messed up the category link) I think the problem with this approach is that it would be reasonable for an editor who doesn't like reading those long and numerous policy articles (among whom I count myself) to assume that English entries are the model for other languages, being the most numerous and most developed. And I don't see why it shouldn't be the case: surely if other language-entries are lacking, they're to be helped to reach the same standard instead of introducing handicap rules for them. That most entries are mere glosses with as much utility as a generic translation website is an unfortunate situation and one that Hungarian entries have successfully, even spectacularly overcome, and my hope is to see Latin entries follow suit.
Anyhow, even apart from these considerations, I like it when guidelines are grounded in logic that's both intuitive and objectively testable. While your suggestion does well on the latter criterion, I think mine scores better overall: there should be an objective way to determine what is a sentence for our purposes, and if by "sentence-like" you mean "A polka jacket." and "To dance polka.", then I think these can be rejected on the criterion I mentioned: they lack a predicate. So for the linguistically-savvy the guideline would be that, explicitly rejecting stranded noun, determiner and non-finite verb phrases. For everyone else, the criterion would be "does it answer what does it mean? or what does it do?" If the latter (>> forms part of verbal expressions; signifies that the speaker is uncertain), capitalise and end with a stop; if the former (>> polka; a dance; to dance polka with one's trousers down), do neither. This works both intuitively and objectively, and its application across the board would also be objectively and intuitively reasonable. Brutal Russian (talk) 19:12, 28 March 2021 (UTC)[reply]

Indeed there is no prescriptive standard so far. Just use as much language as is needed to do the job. At most, be aware of the difference between a phrase and a sentence. 119.56.98.252 09:56, 29 March 2021 (UTC)[reply]

CFI: Spanish non-idiomatic cliticized verbs and German compound words

I posted over at WT:RFDN about deleting non-idiomatic reflexive verbs in Spanish. My take is that non-idiomatic combinations of verb + clitic, e.g. comprarse (to be bought; for one to buy), comprarlo (to buy it), comprarla (to buy it (feminine)), comprarlos (to buy them), comprarlas (to buy them (feminine)), should not be kept unless we also think that non-idiomatic German compounds should be kept. The problem as I see it is that we have created a rule that says anything written without separating spaces is a "word", based on English usage, and we are trying to apply it to other languages, where it doesn't work so well. English conveniently writes non-idiomatic compounds with spaces between the words; compounds may move from being written with spaces or hyphens to being written as a single word as they become more idiomatic, cf. database, filename, dataset, hotcake (originally and sometimes still written data base, file name, data set, hot cake). Likewise, English cliticized pronouns and verbs are written with an apostrophe, as in "the king's" (possessive or a contraction of "the king is").

OTOH, German has effectively the same compounding process as English, but happens to write compounds without spaces between them, even if non-idiomatic. I know there are some inclusionists like User:SemperBlotto who believe that "all words in all languages" even includes non-idiomatic German compounds, but I don't think this is the norm here. I argue that the same should apply to Spanish cliticized compounds; we don't keep the corresponding cliticized compounds in English, nor in Portuguese, which is almost exactly parallel to Spanish but happens to write the compounds with a hyphen, e.g. amar-se instead of amarse, compremo-nos instead of comprémonos, comprando-o instead of comprándolo. Why should an arbitrary choice of whether to write a separator (space/hyphen/apostrophe) or no separator determine whether we include terms? Benwing2 (talk) 18:35, 28 March 2021 (UTC)[reply]

It seems to me a reasonable choice, albeit arbitrary. Yellow is the colour (talk) 20:10, 28 March 2021 (UTC)[reply]
Why not have a hot streak of applying some linguistically sound criteria? It seems we're talking about phonetic words that aren't necessarily lexical words here. Clitics are easy to weed out as being, well, words by definition; orthography does not deprive them of their word status. There are some very clear cases of clitic+word lexicalisation (e.g. itaque); in all other cases no argument can be made that such combinations are lemmas any more than there can be for prepositions+nouns.
Compounds in compounding-prone languages are a separate and trickier thing (good thing I don't know Chinese!), but the diagnostic here is language-dependent and can't be reduced to same spelling meaning same word-status across different languages. In English, what is a word can most straightforwardly be identified phonetically: a syllable that isn't the primary stressed one inside a phonetic word cannot receive Nuclear Stress. Here's a short paper by Truckenbrodt exploring how this stress is assigned; there's probably way more accessible treatments.
English compounds typically involve a leftward stress shift, but there's a very similar, not-to-be-confused-with phenomenon that functions across word boundaries, but inside phrases, called the Rhythm Rule, which eliminates stress clashes. What's worse, although pairs such as blàckboard <> black bòard have traditionally been analysed as compound word vs. non-compound phrase (Hayes 1995), this here presentation takes both accentuations to be on the same compound footing: Arndt-Lappe. Now we could introduce a different criterion, a semantic one. It seems to me however that this is already being done for us by the speakers: words that are understood to be semantic compounds, that feel different from a combination of two standalone lexemes, tend to be spelled together; incidentally, all of these cases will be found to have the leftwards compound stress shift. I don't believe any English speaker will spell hot dìsh as a single word, a consequence of not considering it a compound. I would argue that we know they don't consider it a compound precisely because of the lack of even the possibility of a stress shift; as such, I believe this should be the main distinguishing criterion. Brutal Russian (talk) 20:13, 28 March 2021 (UTC)[reply]
I see I haven't directly addressed the German problem here. In German, compounds are very clearly distinguishable from noun phrases syntactically. Most obviously, they cannot be separated, and I can't think of examples of infixation or anything like that. They don't show adjective agreement; noun+noun phrases are inflected for case, compounds show no inflection. However non-idiomatic they may be and however transparent their meaning, they still are undeniably single words. And their stress is very similar to the English situation in that the last element, the head, is typically unstressed (Schwàrzbrot), as opposed to noun phrases where the last word will receive the Nuclear Stress by default (schwarzes Bròt) - even a separate spelling of the former (which isn't unheard of) does not obscure the distinction. In fact German seems to present the clearest distinction one can hope for. The only criterion for inclusion of such words can only be the general attestation requirements. To put it another way, not including them can only be based on the same reasoning as not including all words suffixed with -ig. Brutal Russian (talk) 20:38, 28 March 2021 (UTC)[reply]
My position is that for languages like Spanish (and German) where words are usually delineated by spaces, someone who sees ...en vez de comprarla á... will take comprarla to be a word and look it up as such, so we should have an entry soft- (or hard-) redirecting them to the lemma, to the extent that we also have entries for inflected forms like comprases. We have sometimes deleted affixes that can apply to any word, like Latin -que and Tzotzil -e, but if (-)la is only found on verbs, I'm not sure why comprarla shouldn't be a (soft-)redirect to comprar, just as much as e.g. hombres is to hombre, where one could likewise argue that someone should know to drop the -s, but someone who was looking the word up precisely because they didn't know the language wouldn't necessarily know that.
As for German compounds, the norm has been that they are kept, from Talk:Zirkusschule to this July 2017 BP discussion. I think this is more helpful than the alternative. - -sche (discuss) 01:22, 31 March 2021 (UTC)[reply]
I support the status quo that SOP single-word compounds are kept and that SOP combinations of prefixed prepositions and prepositional objects (e.g. in several Semitic languages) and SOP strings in scriptio continua are excluded. It's also worth considering that some SOP compounds may be quite unusual in a given language, as not all languages use the same typologies to the same degree. Adj. + noun compounds are ten a penny in German and Afrikaans, but in Dutch they are quite rare and often they are borrowings or calques from German. Regulating for this on a language-specific or type-specific basis will in my expectation prove tedious and not very workable. ←₰-→ Lingo Bingo Dingo (talk) 17:39, 3 April 2021 (UTC)[reply]

Shortcut templets

I created {{uv}} as a shortcut for {{univerbation}} some time before. I hope the community has no problems therewith. Maybe I should have asked permission to create it, but I have not put the shortcut templet to use thus far. If you do not like it, please go for another-un.

Also, I would like to have a shortcut for {{doublet}}: please decide which shortcut should we have for’t. Thanks. -- inqilābī inqilāb·zinda·bād 21:15, 28 March 2021 (UTC)[reply]

To me, {{uv}} (the one you created) seems appropriate. {{un}} would be quite vague, because a lot of templates start with 'un' and hence 'un' could be used for any of these (compared to this, 'uv' seems much better).
For a shortcut for {{doublet}}, I would suggest either {{dl}}, {{dlt}} (these 2 can be mistaken for a shortcut for {{delete}}, so mind that), {{dbt}} or {{dblt}} (most specific). 🔥शब्दशोधक🔥 04:10, 29 March 2021 (UTC)[reply]
Oh, and I had created these 2 template-shortcuts :- {{alt}} for {{alter}} and {{com form}} for {{combining form of}}. {{alt}} now seems to have been widely adopted by other editors too (particularly @Hans-Friedrich Tamke), but {{com form}} is currently used on only 6 pages. I seek approval for these 2. 🔥शब्दशोधक🔥 04:48, 29 March 2021 (UTC)[reply]
@SodhakSH: {{dblt}} is too long for a 7-letter word, and also looks quite odd as one of the letters is a sonorant while the rest all are obstruents; so I am okay with {{dbt}}, and I shall go ahead & create it. Also, since {{alt}} has been, as you say, widely used, what should be done with’t? Before putting that to use yourself, you should have made an announcement here; but I guess it’s too late now. [Though it’s not uncommon to have two shortcuts for a templet, see for example {{template}}, {{calque}}; so I personally find no problem with having {{alt}}, and furthermore, {{alter}} is used a lot in entries, so that’s not any big deal.] -- inqilābī inqilāb·zinda·bād 20:04, 29 March 2021 (UTC) P.S. Oops, I mistakenly assumed that {{alter}} was itself a shortcut, but I see it is not. -- inqilābī inqilāb·zinda·bād 20:15, 29 March 2021 (UTC)[reply]
  • I'm a big fan of shortcut templates. The best way to get them to be kept is to use them on a large number of pages, meaning that other editors get much less likely to want to go through the hassle of changing them. Better still, have multiple accounts that all use them, thereby giving the illusion there is community consensus. Yellow is the colour (talk) 07:19, 29 March 2021 (UTC)[reply]
    I think a block is warranted for that trolling comment. -- inqilābī inqilāb·zinda·bād 20:15, 29 March 2021 (UTC)[reply]
  • Shortcuts only make sense for widely used templates. And is it really worth saving 2 keystrokes? {{alt}} for {{alter}} (which is already an abbreviation) is ridiculous. You can always set up shortcuts on the OS level, to expand keystrokes locally, or improve your touch-typing skills. – Jberkel 09:35, 29 March 2021 (UTC)[reply]

The number of templates is getting to be too much. Someone really has to get to documenting which templates are needed. A MOS for templates could be a way to do it. 119.56.98.252 09:51, 29 March 2021 (UTC)[reply]

Two- and three-letter names clash with the ISO language codes. To avoid confusion, such short names should be avoided as far as possible. 119.56.97.84 10:45, 29 March 2021 (UTC)[reply]
@Inqilābī Maybe {{dub}} would work? brittletheories (talk) 10:43, 26 March 2022 (UTC)[reply]

Headings for non-POS non-words

From: Wiktionary:Etymology_scriptorium/2021/March#Unsupported_titles/Space

Words usually are used as a Part of Speech. A few words usually form a phrase or an idiomatic expression.

On the other hand, we have entries which are less than a word. Alone, they often do not amount to a unit of speech.

What headers should we keep to then? Are we going to list them as Symbol, Punctuation, Prefix, Suffix, and so on? Or do we create an umbrella category and put them under Others?

There are times when the use of a non-word entry does not stick to one category. One is mentioned at the link above. I also pull out the next article as an example: !

! can be used as a punctuation mark in some entries. In others, it is used as a symbol: alone as a warning sign, as a math symbol (usually for factorial), in chess to signal a good move, and in computer program logic as negation. In these latter examples ! is used as a symbol, yes, but definitely not as punctuation.

Now I wonder if there is a need for such difficult distinctions. Your thoughts? 119.56.98.200 06:40, 30 March 2021 (UTC)[reply]

It is not unusual for a term to have multiple personalities; English set has three PoS assignments split over five etymologies. I do not see why this should become more of an issue if the term happens to be a symbol or character. The PoS assignment Symbol functions as the "Other" category if more specific ones, such as Letter or Punctuation mark, do not apply.  --Lambiam 20:05, 30 March 2021 (UTC)[reply]
@Lambiam: In the first-mentioned entry, I guess the issue came about because it is a whitespace. Whitespace is interpreted differently when it comes to typesetting words: Notepad, other computing parsers, or printing presses. So this creates some diversity in typography. The original idea of space really was as a void, an absence of everything else. Kind of how we think of outer space as empty.
I recognize Wiktionary adopts the Unicode standard, as do most other people using a computer. The Unicode standard treats all whitespaces as character symbols (official name is point?). So by taking this as the major definition for most people, we can possibly say that we categorize its PoS as symbol.
Can we change the Typography header to Symbol then? For me, it is like ticking the box "symbol" from a list, just because only "symbol" was available. 119.56.103.124 14:27, 31 March 2021 (UTC)[reply]
Wiktionary does not adopt the Unicode standard. I don't even know what that means. Most of the terms we have can be represented using the Unicode code system, which is convenient since the software platform, the MediaWiki engine, recognizes it. We make do with what we have. How various typographic rendering systems deal with whitespace is not lexicographic but encyclopedic information. It is not clear to me that the Typography section serves a lexicographic function.  --Lambiam 15:10, 31 March 2021 (UTC)[reply]
It's not so much MediaWiki supporting Unicode, but the Wiktionary policy of using Unicode infoboxes that is the adoption. So following a listed Unicode symbol, its use has to be mentioned for this Unicode to be fully explained. At least a gloss is asked for. I get you also for your point about a lexicographic definition. It can come across as a description of its prescribed use (a bit weird but it kinda works like that). I can write EN but I do find writing glosses difficult. Give me a while to rework the section. 119.56.103.124 16:32, 31 March 2021 (UTC)[reply]
@Lambiam: Hi Lambiam, sorry for the long interval. I have been looking at transferring the non-definition-of-usage material out into the Appendix, so that what remains is about what it is used to represent. I hope I have taken the right steps towards this aim. 119.56.96.199 19:46, 30 April 2021 (UTC)[reply]
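
On the "official name is point?" question above: Unicode's term for the numbered slot is "code point", and each code point carries a General Category. A minimal sketch, assuming Python's standard `unicodedata` module and an arbitrary, illustrative choice of whitespace characters (the helper name `describe` is made up for this example), shows how one might check what category the standard assigns to spaces:

```python
import unicodedata

# A few whitespace characters; this selection is illustrative, not exhaustive.
samples = ["\u0020", "\u00A0", "\u2003", "\u3000", "\u0009"]

def describe(ch):
    """Return (code point label, Unicode name or None, General Category)."""
    label = f"U+{ord(ch):04X}"
    # Control characters such as TAB have no name, so pass a default of None.
    name = unicodedata.name(ch, None)
    return (label, name, unicodedata.category(ch))

for ch in samples:
    print(describe(ch))
```

The space characters report category Zs ("Separator, space") and the tab reports Cc (control), rather than any of the Symbol categories (Sm, Sc, Sk, So), so "Symbol" as a header would be Wiktionary's own choice rather than something taken over from Unicode.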

Combining POS and Declension ofs

So on the subject of POS headings and what not, there's a problem with duplication, particularly with Latin entries which are rich in homographs. It doesn't look like people know how to handle this, so I think this needs to be brought up and decided.

  • Take pila: there's 4 different words, which makes reading the article difficult enough. 3 of these further have 3 homographic forms each: nom., acc. and voc. sg. Some entries add these in definitions, which seems counter-intuitive but readable (and I've seen other users remove them). Other entries, e.g. membrāna, stick a Noun POS somewhere at the end of the entry: not only is it irrelevant to 95% of the readers, the fact that it's a separate POS and tucked away Devil knows where makes it useless to the other 5%.—That one has only the ablative; we could add the vocative, and often it's there as well. Now repeat the same for the 3 different Etymologies and you get a totally unreadable mess.—Here's an experimental edit where I decided to just remove all of it, because if someone decides to add this to every 4th declension noun I'm gonna have diar...rhoeiah or whatever. All of these forms are already presented in the main headword's table - I say it's completely silly to spell out the same information right below the thing and make the whole article look ugly.—Elsewhere (can't find), I moved the declined forms from a separate POS header into the same one as the lemma - telling the reader there's a whole different noun only for it to turn out to be some declension of the same noun achieves nothing.—Here in salsus I decided to list two headwords that would have identical tables, and provide one for both.
  • A parallel problem concerns those words that can function as adjectives, participles and then nouns. This whole phenomenon is very characteristic of Latin and I think should be treated in some different way from how it's currently handled, which leads to half the language section occupied by duplicate tables for the different POSs. Practically no Latin dictionary singles out substantivised adjectives into standalone entries because firstly it would be unworkable due to how productive they are, and in fact they don't really stop being adjectives when used substantively. I think we're trying to account for purely syntactic phenomena in an inadequate way by giving separate POS headings. The opposite also happens frequently enough and is known as apposition: causa victrīx "the winning cause".
  • Another piece of ugliness I've been trying to get rid of is the pronunciation stuff I removed in this edit. Sorry @JoeyChen but I don't see any utility for this whatsoever, while the downsides are screaming; an option would be to add a pronunciation section to each different POS that differs by vowel length, but this ties to the main issue: why? The only situation this would be of any help is when the reader has completely no clue about what the macrons stand for. In which case, they won't be able to read the IPA either, that's for bloody sure.
  • So to sum up I think some of these headers need to be combined in some way, and the duplicate information reduced to a minimum. I haven't found a way to handle substantivized adjectives yet; {{R:OLD}} as well as DMLBS simply give them inside the adjective's definition as (as sb. n.), because in the majority of cases they have the same meaning regardless. A definition-section template similar to {{syn_of}} seems like it would work a treat; it could be added in-line (also as) or as a sub-sense. Or maybe use labels for this? Oh the labels, another ill-defined head-ache.
  • The issue of telling apart adjectives and participles is also nothing new, and it exists because both terms are traditional and descriptively inadequate. What is dēfūnctus - a noun, an adjective or a participle? Why do we need to necessarily distinguish them? Is it not time to get over traditional grammar? Why not have the possibility of combining parts of speech when it's manifestly senseless to list three different POSs all meaning the same exact thing, with the difference being mainly in how they're translated in English? It's borderline readable if the language has no declension, as is the case with Italian. But imagine something like 75% of Latin participle pages getting the same "expansion". Brutal Russian (talk) 20:49, 30 March 2021 (UTC)[reply]
Agreed. Making the part of speech the primary organizational element of the page (rather than definitions) leads to nonsense like this. DTLHS (talk) 21:22, 30 March 2021 (UTC)[reply]
Yes. Well, just don’t duplicate, a known yardstick. On مَغْرِبِيّ (maḡribiyy) I found it noteworthy that it is per se both an adjective and a noun but at the noun I just referred back to the adjective, no reason why not, if only the part of speech itself was to be noted. It would be consequential also to have noun, adjective, participle in the same line and in one head template—which would perhaps also save memory—as the opposite layout derives from the wrong assumption that terms need to be ordered by their English translations. Fay Freak (talk) 22:56, 30 March 2021 (UTC)[reply]
In defunto, I suspect the noun could be deleted for the same reason English sick was. Even the adjective defunto could be deleted, and all the usexes could go under the Participle header if Italian-speaking editors think that's a good idea. (If we consider substantivized defunto or sick to be 'really' an adjective or participle, examples of substantivized use do belong under that POS: hence sick#Adjective has a usex "care for the sick" which used to be under the noun.)
When an inflected form is homographic (on the level of page name) to the lemma, and is already mentioned in the inflection table / headword, I don't think it needs to be repeated as a definition. Some -ese entries ("Chinese", maybe?) used to have "Plural of Chinese." as a sense, which is unnecessary. As long as someone searching the page for "pīlā" will find it in the inflection table, I don't think it needs a ===Noun=== header, though people sometimes give it one. - -sche (discuss) 00:57, 31 March 2021 (UTC)[reply]
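
As a rough illustration of "homographic on the level of page name": a minimal sketch, assuming that the page name is obtained by stripping macrons and other combining marks from the headword, and using a partial, hand-typed table for membrāna (the function names and slot labels are hypothetical, not any actual Wiktionary module), could flag which slots of an inflection table land on the lemma's own page:

```python
import unicodedata

def page_name(form):
    """Strip combining marks (e.g. macrons) after NFD decomposition."""
    decomposed = unicodedata.normalize("NFD", form)
    return "".join(c for c in decomposed if not unicodedata.combining(c))

def slots_on_lemma_page(lemma, table):
    """Labels of table slots whose page name equals the lemma's page name."""
    target = page_name(lemma)
    return [label for label, form in table.items() if page_name(form) == target]

# Partial first-declension singular table, typed by hand for illustration.
membrana = {
    "nom.sg": "membrāna",
    "gen.sg": "membrānae",
    "acc.sg": "membrānam",
    "abl.sg": "membrānā",
    "voc.sg": "membrāna",
}

print(slots_on_lemma_page("membrāna", membrana))
```

Every slot this returns is already visible in the table on the lemma's page, which is the case where a separate definition line or ===Noun=== header would only repeat it.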
WT:About Dutch contains the sentence
This means that, if an adverb has the same meaning as the adjective, it is not included separately.
I think this philosophy could be extended to, for example Latin, and we could decree that substantivized adjectives, participles, etc. should only be included if their meaning is not predictable from the meaning of the original word. Actually, I think most English -ly adverbs could be described as "adverbial form of" instead of what we do now as well. MuDavid 栘𩿠 (talk) 01:00, 31 March 2021 (UTC)[reply]

I wanted to show the Simple English Wiktionary. They have a good idea of listing all the forms of a word in a box. I say it makes good use of the width of a page. 119.56.100.56 15:31, 31 March 2021 (UTC)[reply]