User:Chernorizets/Bulgarian Lemma Improvement Project
Overview
The goal of the Bulgarian Lemma Improvement Project (BLIP) is to raise the overall standard of Bulgarian lemma entries on English Wiktionary. To that end, the project proposes specific actions editors can take, grouped into tiers. The tier system is meant to help match project expectations to individual editors' available time and energy, as well as language fluency. In particular, Tier-1 is designed to be within reach for every currently (2023) known active Bulgarian editor.
Vision
The "north star" of this project is that, in the fullness of time, every Bulgarian lemma entry should have, at a minimum:
- an informative etymology section, with select cognates in other languages when such exist
- a pronunciation section that's as helpful as possible to learners of the language
- high-quality English translations, for each word sense, with either usage examples, quotations or both.
- the corresponding English terms should list the Bulgarian term in their "Translations" sections
- derived terms, related terms and "-nyms" (e.g. synonyms, antonyms) to place the word in its broader context
- dictionary references, or quotations for words not yet in published dictionaries
We envision that this will be an ongoing rather than a fixed-duration project, since it naturally competes with other worthy goals such as increasing Bulgarian coverage. The hope is that, over time, the baseline quality of Bulgarian lemma entries increases, thereby also encouraging a higher standard for new entries.
Participation
Participation in this project can take many forms, depending on editors' availability and personal interest. We're happy with on-and-off, regular, time-limited or even one-off participation. We're also happy with editors choosing their focus - e.g. whether they want to work on entries starting with a particular letter, or on a particular subject, or of a particular grammatical category, etc.
We encourage, but do not require, participants to put "BLIP: " or "[BLIP]" (or their lowercase equivalents) in their edit messages to indicate that an edit is related to this project. There is also a section at the end of this document where editors can optionally let others know what they're working on.
Tiers
There are currently four tiers, each consisting of a set of recommended edits to an entry, if the entry needs them. The tiers are designed to be cumulative - e.g. Tier 2 implicitly subsumes everything under Tier 1. However, it's OK for editors to make later-tier changes before earlier-tier ones, if that's consistent with their interests and inclination. It's also OK to split the overall work into multiple edits - e.g. you can bring an entry to the Tier-2 bar by making a Tier-1 edit first, and later an incremental Tier-2 edit.
All that said, it is the strong hope of this project that all lemma entries meet at least the Tier-1 quality bar. Tier 1 should be accessible to any editor who can make use of one of our standard Bulgarian online dictionaries (e.g. see {{R:bg:RBE}}).
Tier 1 - the minimum viable bar
This is the "baseline" tier that expresses our vision of a minimally viable Bulgarian entry. While a lot of entries already meet that bar, many don't. For BLIP to be considered a success, the proportion of Bulgarian entries that meet this bar needs to be very high.
Pronunciation
Every entry should have a pronunciation section. Based on the currently available Bulgarian templates, a pronunciation section should have, at a minimum:
- an IPA transcription using {{bg-IPA}}, indicating the correct stress for polysyllabic words.
  - a stress is considered "correct" if it's listed in one of the official dictionaries published by the Bulgarian Academy of Sciences. Dialectal and regional pronunciations are out of scope for Tier 1.
  - per that criterion, some words still have more than one valid stress pattern (e.g. молив (moliv)). In that case, each stress pattern should get its own {{bg-IPA}} line.
  - verbs ending in stressed "-а" or "-я" require the |endschwa= parameter to be set to 1.
- for polysyllabic words, a hyphenation using {{bg-hyph}}.
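As a rough illustration of the list above, a Tier-1 pronunciation section might look like the sketch below. The two stress placements shown for молив (moliv) are assumptions made for illustration and should be checked against the dictionaries. The sketch also assumes that {{bg-IPA}} takes the stressed headword as its first parameter and that {{bg-hyph}} can work out the hyphenation from the page title when given no arguments; confirm both in the templates' documentation.

```wikitext
<!-- entry for молив: one {{bg-IPA}} line per valid stress pattern -->
===Pronunciation===
* {{bg-IPA|мо́лив}}
* {{bg-IPA|моли́в}}
* {{bg-hyph}}

<!-- entry for a verb ending in stressed "-а", e.g. чета -->
===Pronunciation===
* {{bg-IPA|чета́|endschwa=1}}
* {{bg-hyph}}
```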
Part of speech
The part of speech should be one of the allowable parts of speech listed in WT:EL (see link above).
For parts of speech that have a custom Bulgarian headword-line template - e.g. {{bg-noun}} - that template should be used. In all other cases, the {{head}} template should be used - e.g. {{head|bg|prefix}}. See the category above for currently available Bulgarian headword templates.
Some basic expectations:
- nouns should specify gender. For pluralia tantum nouns, the gender is p (for plural).
- verbs should specify aspect - imperfective, perfective or both - as well as their perfective or imperfective counterpart (if it exists). See {{bg-verb}} for how to do this.
- adjectives should specify whether they are indeclinable, e.g. коскоджа (koskodža). See {{bg-adj}}.
- adverbs should specify their comparative and superlative forms, if either one exists. See {{bg-adv}}.
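A minimal sketch of the two kinds of headword line described above. It assumes that {{bg-noun}} takes the stressed headword as its first parameter and the gender as its second; the exact parameters for aspect, indeclinability and comparison in {{bg-verb}}, {{bg-adj}} and {{bg-adv}} are not shown here and should be taken from each template's documentation.

```wikitext
<!-- part of speech with a dedicated Bulgarian template: feminine noun вода -->
===Noun===
{{bg-noun|вода́|f}}

<!-- part of speech without a dedicated template: fall back to {{head}} -->
===Prefix===
{{head|bg|prefix}}
```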
English definitions
For Tier 1, what we're looking for is quality translations from Bulgarian to English. Usage examples and quotations will be covered by subsequent tiers.
There is already a wealth of English-Bulgarian and Bulgarian-English dictionaries in existence, so there isn't much new ground to break here. PONS is a pretty decent bi-directional online dictionary, and useful if you're unsure about a translation. If you're unsure about how to translate something, ask another editor, or post on Wiktionary:Tea room.
Be sure that you have familiarized yourself with the distinction between a translation, a gloss definition (see {{gloss}}) and a non-gloss definition (see {{ng}}). In a nutshell - a translation makes sense when there's a direct English equivalent of a Bulgarian word; otherwise, you need a gloss or non-gloss definition. Both gloss and non-gloss definitions explain the meaning of a word, but gloss definitions can be substituted for the word in a sentence. E.g. if the verb "to flaffle" was given the definition "to make a gurgling sound", then you could substitute that definition in the sentence "He flaffled" → "He made a gurgling sound." That's a gloss definition. If, instead, you had defined "to flaffle" as "an onomatopoetic verb that describes making a gurgling sound", then you couldn't replace "flaffle" with that in the example sentence - it's a non-gloss definition.
Recommendations:
- don't go overboard with the number of English words you provide as translations. Wiktionary has a lot of English synonyms, some of which are rare, obsolete, specific to a region, etc. Provide the most common translations in the English variety you know (e.g. American or British English).
- if the English translation has many meanings (e.g. set or can), use {{gloss}} to list out the meanings that apply to the Bulgarian lemma. See for example балон (balon) - it has 5 of the 14 meanings of English balloon, each listed out separately.
For Tier 1, don't worry about labels ({{lb}} and {{tlb}}) unless you feel comfortable adding them. Tier 2 goes more in depth on label use.
Declension and conjugation
Verbs should have a "Conjugation" subsection which uses {{bg-conj}}. Nouns and adjectives should have a "Declension" subsection, utilizing {{bg-ndecl}} and {{bg-adecl}}, respectively.
Expectations:
- always double-check the declension/conjugation table generated by the template. If something looks off, it's often because the template syntax used is not entirely correct. Inflection templates have lots of options and a learning curve. If you're not sure the table is correct, ask another editor.
- unless you really know better, suppress vocative masculine adjective forms by specifying /-voc in the options to {{bg-adecl}}. See the template's documentation on how to do that.
- unless you really know better, don't add vocative forms to noun declensions via {{bg-ndecl}} (they are suppressed by default). Use of the vocative in modern standard Bulgarian is limited outside of given names, and it takes a certain amount of background to know where it makes sense to be included. Such background is not assumed for Tier 1.
References
Every entry should ideally have at least one reference in its "References" section.
The most common type of reference for Bulgarian entries is a dictionary reference. We have templates for several popular dictionaries which are available online, and the most commonly used ones are {{R:bg:RBE}} and {{R:bg:RBE2}}. It's often a good idea to just add those two, and double-check in Preview Mode before publishing that the generated links actually work. If you click on them and they take you to a URL where you can find definitions for the word, then they work. When a word is missing from {{R:bg:RBE}}, you should check if it's in {{R:bg:Infolex}}.
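A minimal sketch of such a "References" section, assuming both templates look up the current page title when called without parameters (check this in Preview Mode, as described above):

```wikitext
===References===
* {{R:bg:RBE}}
* {{R:bg:RBE2}}
```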
A user-edited online slang dictionary a la Urban Dictionary is available via {{R:bg:BGJ}}. A dictionary of neologisms is available via {{R:bg:Neolex}}. The Bulgarian Etymological Dictionary is available via {{R:bg:BER}}; however, etymology is considered out of scope for Tier 1. This is primarily to allow non-native editors who are concurrent language learners to contribute to Tier-1 improvements. BER is not an easy read, and incorporating it in an entry is often not as simple as just translating the Bulgarian text into English.
Some words may be too new to have made it into Bulgarian dictionaries yet. Don't worry about such words for Tier 1.
Tier 2 - usability and discoverability
Tier 2 builds on the bar set by Tier 1, by providing users with better clues on how Bulgarian words are used in practice, as well as by indicating relevant stylistic and grammatical considerations. The latter include, among other things, whether a word is dialectal, obsolete, uncountable or derogatory.
The second goal of Tier 2 is to improve the discoverability of Bulgarian entries, by:
- ensuring that Bulgarian words are listed under the appropriate "Translations" section of English words
- organizing Bulgarian vocabulary into topic categories, so users can find words related to their area of interest more easily
Richer headwords
In Tier 2, we want to take advantage of the ability to specify related word forms that some Bulgarian headword templates provide.
Where applicable, consider specifying:
- {{bg-noun}}: relational adjectives, feminine/masculine equivalents, diminutives and augmentatives (undocumented but available via |aug=).
- {{bg-verb}}: as already expected in Tier 1, the imperfective or perfective counterpart of the verb
- {{bg-adj}}: abstract nouns, adverbs and diminutive forms
Not every noun, verb or adjective will have all (or any) of these additional forms. Some may have multiple applicable forms, e.g. multiple noun diminutives (as in вода (voda)).
Adding labels
We use labels to add grammatical, stylistic and topic categorization information to entries. Grammatical information includes things like whether a noun is uncountable, or a verb is transitive. Stylistic information captures whether a word is colloquial, derogatory, dialectal, archaic, etc. Topic categorization lets users know whether a word is e.g. a physics or an art history term. Topic labels often automatically add an entry to a particular topic category.
The two main label templates are the term label: {{tlb}}, and the context label: {{lb}}. A term label applies to all listed senses of a word, and is placed directly after the headword template. A context label applies to a specific word sense, and is placed in front of it. In other words, if you find yourself applying the same label to all senses of a word, it should most likely be a term label. For a sense label example, see реотан (reotan). For a term label example, see лих (lih) (which also uses several sense labels). For information on what labels are available, consult the documentation of {{lb}}.
Recommendations and expectations
- nouns should indicate whether they are uncountable, either with a term label (if true for all senses) or a context label (if true only for some senses). Otherwise, a noun is considered countable by default.
- verbs should indicate whether they are transitive, intransitive or reflexive, either with a term label or a context label. There is no default for verbs. Note that Bulgarian uses the special labels reflexive-se and reflexive-si.
- if a word (or word sense) is dated, archaic or obsolete, one of those labels should be provided using these guidelines. As in English, it's possible for e.g. a word in current use to have individual obsolete senses, so be careful about whether the label should be a term label or context label.
- if a word (or word sense) is dialectal, colloquial, slang or derogatory, that should be indicated via the appropriate labels.
- for choosing topic labels to add (e.g. things like music, physics, etc), take a look at the topic labels added to the equivalent English word sense (if any). That's what we did with e.g. пулсар (pulsar).
- when multiple different kinds of labels apply, list them in the order: grammatical, stylistic, topic. For example {{lb|bg|uncountable|dialectal|agriculture}}.
Labels often automatically put an entry into a category, so when you add or modify labels, always check for red-link (not yet created) categories at the bottom of the entry, and create them using {{auto cat}}.
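To make the placement of the two label types concrete, here is a schematic sketch. The text in angle brackets is placeholder text rather than real template syntax, and the particular labels are chosen only for illustration. Both templates take the language code (bg) as their first parameter.

```wikitext
===Noun===
<!-- term label: sits right after the headword template, applies to every sense -->
{{bg-noun|<stressed headword>|<gender>}} {{tlb|bg|uncountable}}

<!-- context labels: sit in front of the individual sense they apply to -->
# {{lb|bg|dialectal|agriculture}} <definition of the first sense>
# {{lb|bg|colloquial}} <definition of the second sense>
```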
Updating English entries' translations
To make Bulgarian entries more discoverable, we should ensure they're listed in the "Translations" sections of the corresponding English entries. That way, Wiktionary can function as a bidirectional English-Bulgarian/Bulgarian-English dictionary. While many English entries list their Bulgarian translations that are also on Wiktionary, quite a few don't.
Well-crafted English entries have separate translation tables per word sense, making it possible to add Bulgarian translations that match the correct sense(s). That's not true for all English entries, and some English entries don't even have a Translation section. Translations are added using a Wiki gadget called TranslationAdder (see WT:TADDER), which is enabled for everybody by default.
Note that not all Bulgarian entries will have corresponding English entries on Wiktionary. That could be because there's no adequate English translation in general, or because the English translation hasn't yet been added to Wiktionary, or e.g. because the English translation would be considered a "sum-of-parts" (WT:SOP), and thus ineligible for addition to Wiktionary. The rest of this section assumes that you're working with a Bulgarian entry which has English translations, and those translations have entries on Wiktionary.
Steps
A lot of this is already covered by the links provided in "See also" at the top of this section.
For each listed sense of a Bulgarian lemma:
- click on the English translation(s) to open their Wiktionary entries
- locate the "Translations" section of the English term
  - if it's missing, check whether there's a Translations section under a synonym of the English term. Sometimes, to reduce duplication, English entries pick a "representative synonym" to be the entry where all translations are listed.
  - if it's missing, and there is no "representative synonym", add the section to the English entry, and initialize its contents using {{trans-top}} and {{trans-bottom}}. See the template docs, and any English entry for a common word, to get an idea of how these two are used.
- if the English term has multiple senses, locate the translations table for the right sense. If one doesn't exist, create it using {{trans-top}} and {{trans-bottom}}, giving it an appropriate gloss to serve as a heading.
- inspect any already added Bulgarian translations. Some might be incorrect, or missing word stress. If you spot that, edit the page directly to remove the incorrect translations and/or add word stress.
- add the Bulgarian translation using the UI gadget, and click the Save button that will show in your browser when you're done. When adding Bulgarian translations:
  - enter the bg language code and hit TAB to load applicable checkboxes.
  - write out the Bulgarian word, indicating word stress if it's multisyllabic. We find it easiest to just copy-paste the headword from the Bulgarian entry, since it already has the stress.
  - for nouns, indicate gender. For verbs, indicate imperfective/perfective/both. There will be checkboxes for those options.
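For orientation, the wikitext behind a translation table looks roughly like the sketch below. This is also what you would write by hand when creating a missing table or fixing an existing Bulgarian line. The gloss passed to {{trans-top}} is a made-up heading for illustration; the gadget normally generates the {{t}} line for you.

```wikitext
====Translations====
{{trans-top|clear liquid that falls as rain}}
* Bulgarian: {{t|bg|вода́|f}}
{{trans-bottom}}
```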
Cleanup
Click on the "What links here" link under "Tools" in the Wiktionary main menu. This should list all the English pages where you added the Bulgarian word as a translation. You may see additional English pages, which means other editors before you added the Bulgarian word as a translation of those additional English words. Double-check those translations, removing incorrect ones and/or adding word stress as needed.
Usage examples and quotes
In Tier 2, we start complementing the quality English translations provided in Tier 1 with example sentences and quotes. This gives learners, translators and anyone else interested in Bulgarian an even better idea of a lemma's actual usage in the language.
General guidelines for adding example sentences can be found at Wiktionary:Example sentences. The main template for formatting example sentences is {{ux}}, and its variant {{uxi}} for short examples. Guidelines for adding quotations can be found at Wiktionary:Quotations. There are several templates - such as {{quote-book}} and {{quote-journal}} - depending on the type of durable media quoted.
Another kind of example is a collocation - a combination of two or more words that commonly go together in a language, such as "tight budget". A Bulgarian example is потомък (potomǎk, “descendant”), which includes the example collocation пряк потомък (prjak potomǎk, “direct descendant”). Collocations are not complete sentences, but they show users common word combinations that they might encounter in practice. For general guidelines on adding collocations, see Wiktionary:Collocations. Collocations go before example sentences.
For the scope of Tier 2, the main expectation is that editors add well-chosen collocations and/or example sentences. Quotations are a stretch goal, except in the case where a word has no applicable dictionary references. In those cases, at least one quotation is required. Well-chosen examples don't just mention a word, but rather use it in a way that helps a reader form a mental picture of the word. For instance, if you had to give an example for "chair", a not-so-good example might be: "On top of the pile of trash there was a chair." In this example, the word "chair" is mentioned, but not in a way that represents what a chair is or does. A better example might be: "The old man sat in his favorite chair in front of the TV." It is, indeed, common to sit in a chair in order to watch television, so this example captures more of the real-life use of the word. Very common words like "chair" don't necessarily need an example sentence, but the principle applies in general.
Stress should be indicated at a minimum on the word for which examples, collocations or quotations are being given. We recommend that all stressed words in collocations and example sentences indicate their stress. For quotations, use your judgment - older and dialectal texts might contain words stressed differently from the contemporary standard language, so add stress when you're sure you're right. Remember that Bulgarian prepositions, conjunctions, pronouns and a few other word types are usually pronounced together with the word that follows, so they wouldn't get their own stress (unless done for emphasis).
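A sketch of how a collocation and an example sentence might sit under a sense of потомък (potomǎk). The example sentence is invented here for illustration and the stress marks are assumptions to be verified against a dictionary. The sketch also assumes {{co}} is the collocation template, with the same parameter order as {{ux}} (language code, text, translation).

```wikitext
# [[descendant]]
<!-- collocation first, then the example sentence; stress marked on the polysyllabic words -->
#: {{co|bg|пряк '''пото́мък'''|direct '''descendant'''}}
#: {{ux|bg|Той е '''пото́мък''' на стар род.|He is a '''descendant''' of an old family.}}
```

The preposition на and the monosyllabic words carry no stress mark here, in line with the guidance above.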
Topic categories
Topic categories help organize entries by subject matter - e.g. astronomy, music, fabrics, biological taxa, etc. They provide another way for users to discover entries in a language, taking advantage of users' interests, hobbies or professional needs.
As previously discussed, certain context labels will automatically add entries to topic categories. Additional topic categories are added to an entry using the {{C}} template, e.g. {{C|bg|Physics}}. Using this template is preferred over raw category markup, e.g. [[Category:bg:Physics]] - in fact, there is a maintenance category listing Bulgarian entries with raw category markup. If you want to save yourself some typing, you could also add topic categories via the HotCat gadget.
A good starting point for considering what topic categories to add to a Bulgarian entry is to look at the topic categories of the corresponding English entry (if one exists). If there is no corresponding English entry, use information provided in Bulgarian dictionaries, Wikipedia, or your personal understanding of the subject matter. Topic categories have subcategories, and it's often best to assign an entry to the most specific (sub-)categories that apply to it. It is, however, not a requirement or expectation that every entry should be assigned to topic categories.
The English category tree can be found at Category:en:All topics. It is a superset of the Bulgarian category tree Category:bg:All topics, because we simply haven't created all the same categories for Bulgarian yet. Note that you can't just make up any category name and have it work properly with Wiktionary - see Module:category_tree/topic_cat/data for the names of recognized topic categories and subcategories. This is where looking at the topic categories applied to an equivalent English entry can save you some time.
It will sometimes be the case that a valid category you specify via {{C}} won't exist for Bulgarian yet - in that case, click on the red link and create the category by setting its text contents to {{auto cat}}. That template works with the category tree module, and will automatically do the right thing if you're using a recognized category name.
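Put together, the two pieces of wikitext involved look like this. The sketch reuses the Physics example from above, and the placement of {{C}} at the end of the Bulgarian section is a common convention rather than a requirement stated here.

```wikitext
<!-- in the entry, at the end of the Bulgarian section -->
{{C|bg|Physics}}

<!-- the entire content of a newly created category page, e.g. Category:bg:Physics -->
{{auto cat}}
```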
Tier 3 - Etymology
Tiers 1 and 2 ensure that Wiktionary is a good standard Bulgarian-English and, to an extent, English-Bulgarian dictionary. That's a great milestone to reach, but it falls short of the unique strengths that Wiktionary brings to the table compared to standard bilingual dictionaries. One of those strengths is that Wiktionary entries can, and often do, include the etymologies of words and expressions, deepening users' understanding, and revealing the unique and often surprising histories of those words and expressions. Connections with other languages - both within the same language family and outside of it - are elucidated, as are connections between words in the same language.
Bulgaria has sat at the crossroads of civilizations throughout its history, and that's reflected in its lexical makeup - alongside a solid Slavic core formed by inheritance, derivation and borrowing, there are influences from Greek (through prolonged contact), Latin, Ottoman Turkish and through it - Classical Persian and Arabic, French, German, Italian, and a host of other languages, including English. Tier 3 is about providing etymologies for Bulgarian entries, as well as the related and derived terms revealed by etymological information.
Etymology section
The main reference source for etymologies of Bulgarian words is the Bulgarian Etymological Dictionary (BED). As of this writing (11/2023), eight volumes have been published, of which the first seven - covering words up to терясвам (terjasvam) - are available online for free. We use {{R:bg:BER}} to cite the dictionary. Dictionary entries contain the headword's origin, cognates and derived terms (among other more detailed information).
BED will often provide the Old Church Slavic etymon for modern Bulgarian words, if it is attested in the OCS canon. A lot of Bulgarian entries today are missing this link, and instead show inheritance directly from Proto-Slavic. Make sure entries with etymology sections (including those you write yourself) reflect inheritance/derivation from OCS whenever possible. For words that aren't available in BED, check out {{R:sla:EDSIL}} or {{R:sla:ESSJa}}.
There are a few things to watch out for when citing BED:
- Ottoman Turkish etymons are often given as just "Turkish" and rendered in the Latin alphabet. Most words of Turkish origin entered Bulgarian during the Ottoman Turkish period, so our etymologies should reflect the Ottoman Turkish form, written in the Arabic script. If you're lucky, the Ottoman Turkish word survives in modern Turkish, the modern Turkish word has a Wiktionary entry, and the entry's etymology section lists the Ottoman Turkish ancestor form. You won't always be this lucky. You can find help in the #turkic channel on Discord, or you can request that the term be added by specifying {{l|ota||tr=<Latin equivalent>}} (if you don't know the Arabic-script form).
- the provided Proto-Slavic ancestor forms refer to an older stage of the Proto-Slavic language than the one used for Wiktionary reconstructions. For example, BED gives *nagarditi instead of *nagorditi (see *nagorda) for наградя (nagradja) (vol. 4, p. 464), showing the state of affairs before short /a/ became short /o/. Wiktionary has a number of conventions for representing Proto-Slavic etymons, which you can find on Wiktionary:About Proto-Slavic. If you can't determine the Wiktionary-compliant Proto-Slavic form to use in an etymology section, use the form given in BED, and ping someone for help. You can do so on the entry's talk page, on the Etymology scriptorium page, or in the #balto-slavic channel on Discord.
Guidelines and recommendations
- familiarize yourself with {{inh+}}, {{der+}} and {{bor+}} and when to use them.
  - if a term is inherited via successive stages of ancestor languages, e.g. OCS < Proto-Slavic < Proto-Balto-Slavic < Proto-Indo-European, use {{inh+}} for the nearest ancestor language, and {{inh}} for its ancestors.
  - if a term is borrowed from another language, and that language in turn borrowed it from yet another language, use {{bor+}} with the immediate donor language, and {{der}} for its donor language, etc.
- sometimes, an ancestor chain can get pretty long and winding. There is a balance between showing too much information and hiding useful detail. The template {{dercat}} is available to indicate that a word has additional ancestors without including them in the etymology section text explicitly.
  - usually, for Bulgarian terms deriving from Proto-Slavic, it's sufficient to show ancestry up to Proto-Slavic, leaving the rest to {{dercat}}. However, if there is a direct, reconstructed Proto-Indo-European (PIE) ancestor, it's OK to go up to PIE. English entries often do that.
  - for Bulgarian terms deriving from non-Slavic languages, it's sufficient to go up to the last non-proto-language. E.g. if a word is ultimately from Ancient Greek, there's no need to show the Proto-Hellenic or PIE form.
  - even with those strategies in mind, you may occasionally end up with ancestor chains that subjectively feel too long. It's ok to omit intermediate ancestors for brevity, so long as you indicate that in the etymology section, usually by saying "ultimately from" instead of just "from".
  - don't automatically assume the existence of ancestors without checking. Not every Proto-Slavic term goes back to PIE, and neither does every Ancient Greek term, to give just two out of many examples.
- the inclusion of cognates is a generally good practice, but like ancestor chains, you shouldn't overdo it.
  - familiarize yourself with {{cog}} and {{ncog}}.
  - common words like ден (den) which have close cognates in every Slavic language don't particularly benefit from listing all of those cognates in the etymology section, especially if we have a Proto-Slavic reconstruction page that already lists them all
  - words that are particular to the South Slavic area or the Balkans do, on the other hand, benefit from the inclusion of cognates. See for example Appendix:Balkanisms.
  - words that lack a PS reconstruction or are otherwise not common within Slavic (e.g. only showing up in South and East Slavic, but no West) can benefit from cognates as well.
  - since this is an English dictionary, it's a good idea to mention an English cognate if one exists
  - words like захар (zahar) have cognates in a huge number of languages. Use your discretion.
- if a Bulgarian lemma is inherited, derived, borrowed or calqued from a lemma in an ancestor language, the latter's "Descendants" section should list the Bulgarian lemma. Make sure you're familiar with {{desc}}.
  - if the "parent" entry doesn't have a "Descendants" section, add one, and add the Bulgarian descendant to it.
  - if the parent entry has a "Descendants" section, add the Bulgarian descendant in correct alphabetical order by language name. Some entries list direct descendants first, and non-direct descendants (loans, derivations) second. Adhere to the way the entry orders its descendants. Make sure that the Bulgarian descendant indicates lexical stress (e.g. by simply copying it from the headword).
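The guidelines above can be sketched for an inherited word such as ден (den). The ancestor forms and language codes shown (cu for Old Church Slavonic, sla-pro for Proto-Slavic, ine-bsl-pro and ine-pro for the remaining ancestors) are given for illustration and should be verified against BED and the existing reconstruction pages before use.

```wikitext
<!-- in the Bulgarian entry: {{inh+}} for the nearest ancestor, {{inh}} for the next one,
     {{dercat}} for ancestors that are categorized but not spelled out in the text -->
===Etymology===
{{inh+|bg|cu|дьнь}}, from {{inh|bg|sla-pro|*dьnь}}. {{dercat|bg|ine-bsl-pro|ine-pro}}

<!-- in the parent (ancestor-language) entry -->
====Descendants====
* {{desc|bg|ден}}
```

ден is monosyllabic, so the descendant line needs no stress mark. A polysyllabic descendant would carry one, copied from the headword.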
Derived terms
BED often includes a rich list of derived terms, some of which may be dialectal, archaic or obsolete. Use your judgment on how many of them to include in the "Derived terms" section of an entry. It can also be helpful to look up those derived terms in our standard dictionaries.
To keep things neatly organized, separate derived terms by part of speech by using e.g. {{col-auto|bg}} with |title=adjectives. {{col-auto}} will figure out the number of columns for you in case there are multiple derived terms, as well as whether or not to collapse the list. It is also used by the specific template we have for derived verbs - {{bg-derived verbs}}. For an example tying all of these together, see черпя (čerpja) or кадър (kadǎr). If a word only has a few derived terms, it's OK to forgo these templates and list the derived terms directly.
Derived terms should indicate word stress. Derived verbs should further indicate aspect, which you get (almost) for free by using {{bg-derived verbs}}.
Related terms
Words in the "Related terms" section generally have the same morphological root as the main lemma, but aren't derived from it using affixes. For example, if you're working on the entry for изправям (izpravjam), related terms include прав (prav) and правило (pravilo), all three of which ultimately come from *pravъ.
There are no hard and fast rules about how many related terms to include in an entry, so use your judgment. It might be a good idea to try and reduce duplication across entries - for instance, rather than copying all the derived terms of черпя (čerpja) into each derived term's "Related terms" section, you could simply put черпя (čerpja) in that section. A user clicking on черпя (čerpja) would then see all of its remaining derived terms. For general guidance, see Wiktionary:Entry layout#Related terms. By that guidance, изправност (izpravnost) is arguably a better related term for изправям (izpravjam): it looks as though it might be derived from the verb's past passive participle, but it's actually borrowed from Russian исправность (ispravnostʹ).
Descendants
Bulgarian words and expressions get borrowed by other languages too! Romanian is the most common such language, followed primarily by other languages spoken in the Balkans. English sometimes has unadapted Bulgarian borrowings related to traditional culture, such as gadulka and sharena sol. BED usually indicates if a Bulgarian word is borrowed by other languages.
Generally, if you list a foreign-language word as a descendant of a Bulgarian word, you should also make sure that the foreign-language word's etymology section indicates that it was borrowed or derived from Bulgarian. In practice, since you likely don't know all the languages that borrow from Bulgarian, you should exercise caution and be open to collaboration. In particular:
- if the foreign-language entry already has an etymology section, and it doesn't mention Bulgarian, contact an active editor in that language - either on the entry's talk page or on Discord - before making changes. Other languages have their own etymological dictionaries, and they don't always agree with ours.
- Bulgarian loanwords in Albanian might actually be from Macedonian. Our dictionaries don't make a distinction, so it can be hard to tell. The older the loanword is in Albanian, and the closer it follows Bulgarian phonological characteristics (e.g. position of the stress; the presence of /ɤ/), the likelier it is to be from Bulgarian. Generally, it should predate the standardization of the Bulgarian language in the 2nd half of the 19th century.
- Bulgarian loanwords in Romanian might be from Serbian, especially if the Bulgarian and Serbian equivalents are very close or identical.
- Bulgarian loanwords in Turkish, if borrowed before the dissolution of the Ottoman Empire, might be better classified as loanwords in Ottoman Turkish.
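As a rough sketch of how the two sides of a borrowing can be kept in sync, the Bulgarian entry lists the descendant and the foreign-language entry names Bulgarian in its etymology. The example below uses gadulka, mentioned above; the choice of {{desc}} with bor=1 and of {{bor}} is an assumption about which templates fit best, so check each template's documentation (and the existing entries) before copying it.

```wikitext
<!-- In the Bulgarian entry гъдулка: -->
====Descendants====
* {{desc|en|gadulka|bor=1}}

<!-- In the English entry gadulka (assumed wording; adapt to the existing etymology): -->
===Etymology===
Borrowed from {{bor|en|bg|гъдулка}}.
```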
Tier 4 - Enrichment
Tiers 1 to 3 spell out a roadmap for making Wiktionary a capable explanatory, etymological and synonym Bulgarian dictionary, all in one! If we've gotten this far consistently for most Bulgarian entries, we're in really good shape and we should be proud of our work. Tier 4 is about going the extra mile and taking full advantage of Wiktionary's unique format and strengths.
All activities listed here are optional and largely independent, so editors can pick and choose the ones they're passionate about - it all qualifies as Tier 4 work. Of these, adding audio recordings is a particularly helpful activity, especially for current or potential Bulgarian learners.
Audio recordings
Wiktionary gives us the opportunity to add native-speaker recordings to entries. User:Kiril kovachev has done an incredible amount of good work towards increasing Bulgarian audio coverage, so if he's still around when you're reading this, you might want to collaborate with him on that. For guidance on uploading audio files, check out Wiktionary:Pronunciation#Audio files. Once the file is uploaded to Commons, you can add it to an entry using {{audio}}.
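For illustration, a minimal {{audio}} line in an entry's Pronunciation section might look like the sketch below. The file name is a placeholder rather than a real Commons file, and only the two basic parameters (language code and file name) are shown; see the template's documentation for the optional ones.

```wikitext
===Pronunciation===
<!-- <CommonsFileName> is a placeholder for the file you uploaded to Commons -->
* {{audio|bg|<CommonsFileName>.ogg}}
```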
Additional quotations
In Tier 2, we got started on adding quotations for Bulgarian words. Per WT:ATTEST, a word has to either be in "clearly widespread use" (e.g. котка (kotka)), or we need to provide "at least three independent instances [of usage] spanning at least a year". Since Bulgarian is a "well-documented language" (see WT:WDL), it is subject to that attestation requirement. Tier 4 is a good time to add quotations for each applicable word sense towards meeting the minimum count.
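As a sketch of what meeting the minimum count could look like under a single sense, the layout below places three quotations beneath one definition line. Everything in angle brackets is a placeholder, and {{quote-book}} is only one of the quotation templates (it is assumed here that the sources are books); the years should come from independent sources and span at least a year.

```wikitext
# <English gloss of the sense>
#* {{quote-book|bg|year=<year 1>|author=<author 1>|title=<title 1>|passage=<Bulgarian passage 1>|t=<English translation 1>}}
#* {{quote-book|bg|year=<year 2>|author=<author 2>|title=<title 2>|passage=<Bulgarian passage 2>|t=<English translation 2>}}
#* {{quote-book|bg|year=<year 3>|author=<author 3>|title=<title 3>|passage=<Bulgarian passage 3>|t=<English translation 3>}}
```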
Images
For lemmas that would benefit from a picture (e.g. пъстърва (pǎstǎrva), more so than краставица (krastavica)), you can add it using [[File:CommonsFileName.png|thumb|<caption in Bulgarian>]]. Put this right after the L2 language header and before any of the L3 headers.
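The placement described above can be sketched as follows. The L3 headers shown are just typical examples, and the "..." lines stand for whatever content the entry already has.

```wikitext
==Bulgarian==
[[File:CommonsFileName.png|thumb|<caption in Bulgarian>]]

===Etymology===
...

===Pronunciation===
...
```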
There is also the ability to add a picture dictionary to an entry, usually for entries that represent "umbrella" terms with many individual examples. A Bulgarian entry with a picture dictionary is гризач (grizač). See {{picdic}} and {{picdicimg}} for more information, and take a look at the English pages where those templates are used to get an idea of applicability.
Tables and lists
Colors, months, zodiac signs, chemical elements and days of the week are examples of closed sets of related terms. Wiktionary allows us to organize them in lists and tables, so that users can get an at-a-glance view, and quickly navigate to any of the terms in the set. Check out Category:Bulgarian auto-table templates and Category:Bulgarian list templates for what we have today, and how these templates are used. It's also good to check out the English ones, for an idea of where else these might be applicable.
Plant and animal species
The systematic taxonomic names of living things are sometimes available as Translingual entries on Wiktionary, but not always. Our sister project Wikispecies is devoted specifically to the cataloguing of species and their "vernacular names", i.e. non-scientific names in different languages. See the template {{taxlink}} for including a link to Wikispecies.
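A definition line using {{taxlink}} might look like the sketch below, assuming the basic form of the template (taxon name, then rank). The taxon is only an illustrative choice: before using {{taxlink}} for any taxon, check whether it already has a Translingual entry, in which case an ordinary link to that entry is the better option.

```wikitext
<!-- Illustrative only: verify the taxon and whether a Translingual entry exists -->
# [[brown trout]] ({{taxlink|Salmo trutta|species}})
```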
Other
These are just some of the ways to enrich Wiktionary entries - look around, see what you like, and consider bringing it over to Bulgarian entries!
Participant notes
If you'd like to let your fellow Bulgarian editors know what you're working on as part of this project, feel free to update the list below. This is completely optional, and it's only there for visibility and to help prevent duplicate effort.
Current participants:
- add a line below with your username and what you're working on
- User:Chernorizets: words starting with the letter "С с". It's a big letter, so other editors are welcome to join.
- User:Kiril kovachev: working alphabetically from the beginning of the dictionary. Currently, I'm up to аглика. I previously added many words with я and ш, which I'll go and review as well.
- User:SimonWikt: adding/improving everyday words for learners of the language.