User talk:DPMaid

From Wiktionary, the free dictionary
Latest comment: 5 years ago by Dan Polansky in topic Updating Telugu further reading and URLs

Of course a man needs a maid. SemperBlotto (talk) 09:42, 2 March 2014 (UTC)

Restoring pedialite

I will restore the pedialite template using AWB, working from the dump enwiktionary-20140728-pages-articles.xml.

--DPMaid (talk) 18:52, 22 August 2014 (UTC)

The actual number of edits was around 2700; the remaining pages from the worklist still used "pedialite".

Example diff: diff

Prevalence of the competing markups, obtained from the dump using the Windows command line:

  • 'find /c "{{pedialite" enwiktionary-20140728-pages-articles.xml' yields 12980 hits.
  • 'find /c "{{projectlink|wikipedia" enwiktionary-20140728-pages-articles.xml' yields 1 hit. Yes, only 1 hit.
  • 'find /c "{{projectlink|pedia" enwiktionary-20140728-pages-articles.xml' yields 601 hits.
  • 'find /c "{{projectlink" enwiktionary-20140728-pages-articles.xml' yields 857 hits.

--DPMaid (talk) 01:00, 23 August 2014 (UTC)
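
A note on the counts above: `find /c` reports the number of lines that contain the string, not the number of individual occurrences, and any line counted for "{{projectlink|pedia" is also counted for the shorter "{{projectlink". A minimal Python sketch of the same kind of count follows; the function name and the sample lines are hypothetical and stand in for the real dump.

```python
from typing import Dict, Iterable, List


def count_matching_lines(lines: Iterable[str], needles: List[str]) -> Dict[str, int]:
    """Count, per needle, the lines that contain it (case-sensitive), as `find /c` does."""
    counts = {needle: 0 for needle in needles}
    for line in lines:
        for needle in needles:
            if needle in line:
                counts[needle] += 1
    return counts


# Hypothetical sample standing in for lines of the dump file.
sample = [
    "{{pedialite}}",
    "{{projectlink|pedia}}",
    "{{pedialite|lang=cs}} and {{pedialite}} again",  # two occurrences, one line
    "plain text",
]
needles = ["{{pedialite", "{{projectlink|pedia", "{{projectlink"]
print(count_matching_lines(sample, needles))
```

For the real dump, one would pass the opened file, e.g. `open(path, encoding="utf-8")`, instead of the list, so that the file is read line by line.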

Restoring Czech rhyme pages

I have restored Czech rhyme pages to their state before a run by MewBot from the beginning of September 2014. The MewBot run was probably undiscussed, as is their habit.

Some characteristics:

  • Technology: AWB
  • Number of replacements: over 1300
  • Replacement:
    • Restore the top line to read "[[Wiktionary:Rhymes|Rhymes]] > [[Rhymes:Czech|Czech]]".
    • Place Category:Czech rhymes at the bottom
  • Consequence: the rhyme pages are no longer categorized into rhyme subcategories. I oppose rhyme subcategories.
  • Example edit: diff

This DPMaid run restored the pages to status quo ante.

--DPMaid (talk) 08:35, 14 September 2014 (UTC)

Adding Polish external links

I added Polish external links to between 1000 and 2000 pages. I first started alphabetically and then by a frequency list. To add an external link, I would open a command URL that uses User:DPMaid/common.js.

The external link is to a PWN dictionary page, which has multiple dictionaries and corpus search results; overall, it looks quite neat.

I only added links to pages for which the target page had dictionary results. To do that, I ran the candidate term list through the following filter:

import re, sys
import urllib.error, urllib.parse, urllib.request

# Take a list of Polish terms and output only those that are found in Polish dictionaries online.
# Titles of the dictionaries that count as a hit; "..?" stands for "ł" or "ę" regardless of encoding.
dictTitle = re.compile("<span class=.entry-head-title.>(S..?ownik j..?zyka polskiego"
                       "|Wielki s..?ownik ortograficzny|Wielki s..?ownik W. Doroszewskiego)")

# Assumption: the input encoding is UTF-8
for term in open(sys.argv[1], encoding="utf-8"):
  term = term.rstrip()
  # Percent-encode the term so that non-ASCII letters are valid in the URL
  termURL = urllib.parse.quote(term)
  url = "http://sjp.pwn.pl/szukaj/" + termURL + ".html"
  notFound = False
  dictPolishFound = False
  try:
    page = urllib.request.urlopen(url).read().decode("utf-8", errors="replace")
  except urllib.error.HTTPError:
    page = ""  # an HTTP error status is treated as "not found"
  for line in page.splitlines():
    if "Nie znaleziono" in line:
      notFound = True
      break
    if dictTitle.search(line):
      dictPolishFound = True
      break
  if not notFound and dictPolishFound:
    print(term)
  else:
    print("Not found in PWN: " + term, file=sys.stderr)

--DPMaid (talk) 18:14, 14 September 2014 (UTC)

Moving Wikipedia box and images into English language section

I have moved the Wikipedia box from above the English language section into that section.

  • Technology: WT:AWB
  • Search term: {{wikipedia}}\n==English==
  • Replacement term: ==English==\n{{wikipedia}}
  • Number of edits: 2018
  • Example edit: diff
  • Edit summary: Move WP box into English language section

--DPMaid (talk) 12:07, 22 March 2015 (UTC)
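
AWB uses .NET regular expressions. As a rough illustration only, the search and replacement terms above can be sketched in Python; the function name and the sample page text are made up, and the braces are escaped here for clarity.

```python
import re


def move_wikipedia_box(wikitext: str) -> str:
    """Move a bare {{wikipedia}} box from above the English heading to just below it."""
    return re.sub(r"\{\{wikipedia\}\}\n==English==", "==English==\n{{wikipedia}}", wikitext)


# Made-up page text with the box above the language heading.
page = "{{wikipedia}}\n==English==\n\n===Noun===\n"
print(move_wikipedia_box(page))
```

A box that carries parameters, or that is separated from the heading by a blank line, does not match this search term and is left alone.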

I have, furthermore, moved images from above the English language section into that section.

  • Technology: WT:AWB
  • Search term: (\[\[(File|Image):.*\]\])[\n ]*==English==
  • Replacement term: ==English==\n$1
  • Number of edits: 385
  • Example edit: diff
  • Edit summary: Move image into English language section
  • Note: I sometimes added more edits manually.

--DPMaid (talk) 13:12, 22 March 2015 (UTC)
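
The image rule above relies on a captured group, written `$1` in AWB and `\1` in Python. The sketch below is a Python rendering of that rule for illustration; the function name and the sample text are hypothetical. Since `.` does not match a newline, the greedy `.*` runs to the last `]]` on the image line, which keeps links inside the caption intact.

```python
import re

IMAGE_ABOVE_ENGLISH = r"(\[\[(File|Image):.*\]\])[\n ]*==English=="


def move_image(wikitext: str) -> str:
    """Move an image line from above the English heading to just below it."""
    return re.sub(IMAGE_ABOVE_ENGLISH, r"==English==\n\1", wikitext)


# Made-up page text: an image with a link in its caption, then a blank line.
page = "[[File:Cat.jpg|thumb|A [[cat]]]]\n\n==English==\n\n===Noun===\n"
print(move_image(page))
```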

A follow-up on the 1st step, to catch e.g. "{{wikipedia|dab=digit}}":

  • Search term: ({{wikipedia\|[^}]*}})[\n ]*==English==
  • Replacement term: ==English==\n$1
  • Number of edits: 777
  • Example edit: diff

To put these numbers in perspective:

  • Search for ==English==[\n ]*{{wikipedia found 30093 items
  • Search for ==English==[^=]*\[\[(File|Image) found 7282 items

--DPMaid (talk) 14:11, 22 March 2015 (UTC)
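
Why the follow-up was needed can be shown with a small Python sketch, again only an illustration with made-up names and sample text: the search term of the 1st step does not match a box that carries parameters, while the follow-up term does.

```python
import re

PLAIN = r"\{\{wikipedia\}\}\n==English=="
WITH_PARAMS = r"(\{\{wikipedia\|[^}]*\}\})[\n ]*==English=="


def move_parametrized_box(wikitext: str) -> str:
    """Move a {{wikipedia|...}} box with parameters below the English heading."""
    return re.sub(WITH_PARAMS, r"==English==\n\1", wikitext)


page = "{{wikipedia|dab=digit}}\n==English==\n"
print(re.search(PLAIN, page))  # prints None: the 1st-step term misses this page
print(move_parametrized_box(page))
```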

Adding Slovak external links

I added over 250 external links for Slovak entries.

I am trying to figure out how to query the slovniky.korpus.sk web site using Python to tell me whether the target page exists, doing that for a list of pages that have a Slovak lemma and are missing Slovak external links. For some reason, I am getting weird results for URLs like https://slovniky.korpus.sk/?w=konšpiračná+teória. I expect to get a page containing "nič nebolo nájdene", but I get some other page instead, such as that for "čo". Maybe later. --DPMaid (talk) 16:25, 2 August 2015 (UTC)
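
One thing worth ruling out with such URLs is how the non-ASCII letters are encoded before the request is sent. The sketch below builds the query with percent-encoded UTF-8; it is only an illustration, the function name is hypothetical, and whether the site expects UTF-8 in the `w` parameter is an assumption that the above does not confirm.

```python
from urllib.parse import urlencode


def korpus_query_url(term: str) -> str:
    """Build a slovniky.korpus.sk query URL with the term percent-encoded as UTF-8."""
    return "https://slovniky.korpus.sk/?" + urlencode({"w": term})


print(korpus_query_url("konšpiračná teória"))
```

`urlencode` turns the space into `+` and each non-ASCII letter into its UTF-8 bytes, so the request no longer depends on how the HTTP library treats raw non-ASCII characters.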


I added over 1000 external links to Slovak entries:

  • I proceeded as above, using command URLs that take advantage of User:DPMaid/common.js.
  • I manually checked presence of entries in slovniky.korpus.sk.
  • I achieved a rather high degree of coverage. There are now about 200 Slovak lemma entries with no external link to slovniky.korpus.sk, out of 2545 Slovak lemma entries.
  • I used the following little Excel VBA, having "Yes", "Y", or "y" in the third column to indicate presence in the dictionaries:
Sub SubmitYeses()
  For Row = 2 To 300
    If Cells(Row, 3) = "Yes" Or Cells(Row, 3) = "Y" Or Cells(Row, 3) = "y" Then
      addr = "http://en.wiktionary.org/wiki/" & Cells(Row, 1) & _
        "?action=edit&task=addel&language=Slovak&buttonToPress=wpSave"
      ActiveWorkbook.FollowHyperlink Address:=addr
      Application.Wait (Now + TimeValue("0:00:06"))
    End If
  Next
End Sub

--DPMaid (talk) 23:04, 9 August 2015 (UTC)
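
The VBA above opens one command URL per row marked as present. As a rough Python counterpart, the sketch below only builds those URLs. The function name and the sample rows are hypothetical, the query parameters are copied from the VBA, and percent-encoding of the page title is an addition that the VBA does not make.

```python
from typing import Iterable, List, Tuple
from urllib.parse import quote

YES_VALUES = ("Yes", "Y", "y")


def command_urls(rows: Iterable[Tuple[str, str]], language: str = "Slovak") -> List[str]:
    """Return a command URL for each (title, verdict) row whose verdict is a yes."""
    urls = []
    for title, verdict in rows:
        if verdict in YES_VALUES:
            urls.append(
                "http://en.wiktionary.org/wiki/" + quote(title)
                + "?action=edit&task=addel&language=" + quote(language)
                + "&buttonToPress=wpSave"
            )
    return urls


# Made-up rows: page title and the verdict from the third spreadsheet column.
rows = [("pes", "Yes"), ("mačka", "no"), ("čo", "y")]
for url in command_urls(rows):
    print(url)
```

To mimic the VBA fully, one could pass each URL to `webbrowser.open` and sleep for six seconds between calls.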

Adding Czech external links

I added external links to {{R:PSJC}}, {{R:SSJC}}, and sometimes {{R:KNLA}} to over 800 Czech entries. I went by a frequency list; many more can be added. I used the same technique as in #Adding Slovak external links; I updated User:DPMaid/common.js to support Czech. I checked the presence of terms in PSJC and SSJC by automatically querying the dictionaries online. --DPMaid (talk) 14:02, 22 August 2015 (UTC)

I expanded a little less than 1500 additional Czech entries with external links. All links were checked to have targets via a run of a Python script before the adding began, although there was a hiccup during the checking that I had to correct. --DPMaid (talk) 23:13, 22 August 2015 (UTC)

Adding Romanian external links

I added 100 external links to {{R:DEX}}.

--DPMaid (talk) 21:56, 21 December 2015 (UTC)

Adding Turkish external links

I added over 50 external links to {{R:TDK}}, after creating the template.

--DPMaid (talk) 12:05, 24 January 2016 (UTC)

Adding headword templates to Czech entries

I added headword templates to Czech entries using AWB. One consequence is a better population of the lemma category. Where a headword template was already there, I removed the manual category if present. On occasion, I made some manual edits along the way.

  • Example edit: diff
  • Number of edits: Over 650
  • Note: Heavy human supervision was required since the regexes below are very heuristic.
  • Regex for AWB:

Run 1, the latest version (I started with a worse version):

 \[\[Category:Czech nouns\]\]\n\n		False	True	False	True	False	False	True	
 \[\[Category:Czech nouns\]\]\n		False	True	False	True	False	False	True	
 '''.*''' *{{g.([mfn])}} *{{g.p}}	{{cs-noun|g=$1-p}}	False	True	False	False	False	False	True	
 '''.*''' {{g.([mfn])-p}}	{{cs-noun|g=$1-p}}	False	True	False	False	False	False	True	
 '''.*''' *{{g.([mfn])}}	{{cs-noun|g=$1}}	False	True	False	False	False	False	True	
 '''.*''' *''([mfn])''	{{cs-noun|g=$1}}	False	True	False	False	False	False	True	
 '''.*''' *$	{{cs-noun}}	False	True	True	False	False	False	True	

Run 2:

 \[\[Category:Czech adjectives\]\]\n\n		False	True	False	True	False	False	True	
 \[\[Category:Czech adjectives\]\]\n		False	True	False	True	False	False	True	
 \[\[Category:Czech adjectives\]\]		False	True	False	True	False	False	True	
 '''.*''' *{{g.([mfn])}}	{{cs-adj}}	False	True	False	False	False	False	True	
 '''.*''' *''([mfn])''	{{cs-adj}}	False	True	False	False	False	False	True	
 '''.*''' *$	{{cs-adj}}	False	True	True	False	False	False	True

--DPMaid (talk) 10:14, 27 December 2016 (UTC)
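
How the Run 1 rules act in sequence can be sketched in Python. This is an illustration under assumptions, not the AWB run itself: the braces are escaped, the names and sample text are made up, and of the flag columns only multiline matching for the last rule is modeled, on the assumption that the column set to True there stands for it.

```python
import re

# (search pattern, replacement, flags), adapted from the Run 1 rules; order matters.
NOUN_RULES = [
    (r"\[\[Category:Czech nouns\]\]\n\n", "", 0),
    (r"\[\[Category:Czech nouns\]\]\n", "", 0),
    (r"'''.*''' *\{\{g.([mfn])\}\} *\{\{g.p\}\}", r"{{cs-noun|g=\1-p}}", 0),
    (r"'''.*''' \{\{g.([mfn])-p\}\}", r"{{cs-noun|g=\1-p}}", 0),
    (r"'''.*''' *\{\{g.([mfn])\}\}", r"{{cs-noun|g=\1}}", 0),
    (r"'''.*''' *''([mfn])''", r"{{cs-noun|g=\1}}", 0),
    (r"'''.*''' *$", "{{cs-noun}}", re.MULTILINE),
]


def apply_rules(wikitext: str, rules=NOUN_RULES) -> str:
    """Apply each rule in order; once a headword line is rewritten, later rules skip it."""
    for pattern, replacement, flags in rules:
        wikitext = re.sub(pattern, replacement, wikitext, flags=flags)
    return wikitext


# Made-up entry text with a bold headword line and a manual category.
entry = "===Noun===\n'''hrad''' {{g|m}}\n\n# [[castle]]\n\n[[Category:Czech nouns]]\n"
print(apply_rules(entry))
```

The last rule rewrites any line that ends in bold text, including a definition or example line, which is one reason such rules need the human supervision noted above.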

On tajit you seem to have removed the category without making sure there was a headword template, resulting in an uncategorized page. DTLHS (talk) 17:15, 29 December 2016 (UTC)

Switching Wikisaurus to Thesaurus

For the Thesaurus namespace, I made an AWB run to replace links that use Wikisaurus with links that use Thesaurus. Furthermore, in that namespace, I replaced the External links heading with Further reading.

  • Example edit: diff
  • Edit count: 1088

Votes: Wiktionary:Votes/2017-07/Rename the Wikisaurus namespace, Wiktionary:Votes/2017-03/"External sources", "External links", "Further information" or "Further reading". --DPMaid (talk) 09:06, 17 November 2017 (UTC)
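
The AWB rules for this run are not recorded above. Purely as a sketch of what such a replacement could look like, the Python below rewrites link prefixes and the heading; the function name, the exact patterns and the sample text are all assumptions.

```python
import re


def switch_to_thesaurus(wikitext: str) -> str:
    """Point [[Wikisaurus:...]] links to Thesaurus and rename the External links heading."""
    text = wikitext.replace("[[Wikisaurus:", "[[Thesaurus:")
    # Keep the heading level and any spaces around the title unchanged.
    return re.sub(r"^(=+ *)External links( *=+)$", r"\1Further reading\2", text, flags=re.MULTILINE)


# Made-up page text.
page = "See also [[Wikisaurus:cat]].\n\n===External links===\n* example\n"
print(switch_to_thesaurus(page))
```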

I started doing the same (Wikisaurus to Thesaurus) in the mainspace, making over 1000 edits; I lost track of edit count when AWB dumped. The job is not completed yet. --DPMaid (talk) 10:50, 17 November 2017 (UTC)

Etymologies for -í

I fixed a couple of Czech etymologies for words ending in -í to use {{af}}. It was a heavily supervised regex replace:

  • (===Etymology===[^\[{]*)\[\[(.*?)\]\]
  • $1{{af|cs|$2|-í}}
  • Regex, SingleLine.

--DPMaid (talk) 13:38, 24 February 2018 (UTC)
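
In Python terms, the SingleLine option corresponds to `re.DOTALL`, and `$1`, `$2` become `\1`, `\2`. The sketch below shows the replacement on made-up entry text; the names are hypothetical. The class `[^\[{]*` stops at the first `[` or `{` after the heading, so an etymology that already starts with a template is left alone.

```python
import re

ETYMOLOGY_LINK = re.compile(r"(===Etymology===[^\[{]*)\[\[(.*?)\]\]", re.DOTALL)


def templatize_etymology(wikitext: str) -> str:
    """Turn the first plain link after the Etymology heading into an {{af}} call with -í."""
    return ETYMOLOGY_LINK.sub(r"\1{{af|cs|\2|-í}}", wikitext)


# Made-up entry text for a word ending in -í.
entry = "===Etymology===\nFrom [[list]].\n\n===Noun===\n'''listí''' {{g|n}}\n"
print(templatize_etymology(entry))
```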

Templatization of Slovnik afixu

I made a small regex replacement, 31 edits. --DPMaid (talk) 07:57, 19 December 2018 (UTC)

Adding Malagasy further reading

I added further reading {{R:MGW}} to over 100 Malagasy entries. After adding a couple of them, checking what the further reading says, I grew suspicious. I started to skip entries where the definition did not reasonably match the external source, or where the part of speech did not match, e.g. noun vs. passive verb. Presence of further reading does not make any accuracy claim, but still. --DPMaid (talk) 09:54, 6 July 2019 (UTC)

Updating Telugu further reading and URLs

I replaced bare URLs to Brown for Telugu with {{R:CPB}} for pages where the template was able to automatically generate a working link from the headword alone; this yielded about 450 replacements. Where the template could not do that, I at least replaced the defunct URLs with working ones, preserving the page number in the URL. --Dan Polansky (talk) 14:43, 8 July 2019 (UTC)