Wiktionary:Project - Minimize uncategorized pages

From Wiktionary, the free dictionary
See also: Categorization

Categorization Project

Goal

The goal of this project is to get all pages categorized, i.e. minimize the size of Special:Uncategorizedpages.

How to

To categorize pages, use Template:infl or one of the other language-specific templates; for those, see Wiktionary:Inflection templates. These templates categorize a page by language and part of speech automatically. In addition, you may manually add [[Category:XXX]] lines to categorize pages by topic.
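As a rough illustration of the two kinds of lines mentioned above, here is a minimal Python sketch that builds a Template:infl call and appends topic category lines to a page's wikitext. The helper names are hypothetical, and the parameter order of Template:infl (language code first, then part of speech) is an assumption; check the template documentation before relying on it.

```python
def infl_line(lang_code, pos):
    """Build a Template:infl call.

    Assumption: the template takes the language code first and the
    part of speech second, e.g. {{infl|en|noun}}.
    """
    return "{{infl|" + lang_code + "|" + pos + "}}"


def category_lines(topics):
    """Build one [[Category:XXX]] line per topic name."""
    return ["[[Category:" + topic + "]]" for topic in topics]


def add_topic_categories(wikitext, topics):
    """Append topic category lines that are not already on the page."""
    missing = [line for line in category_lines(topics) if line not in wikitext]
    if not missing:
        return wikitext
    return wikitext.rstrip("\n") + "\n\n" + "\n".join(missing) + "\n"
```

Calling `add_topic_categories` twice with the same topics leaves the page unchanged the second time, which keeps copy / paste work from producing duplicate category lines.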

This is a large task. It took contributors a long time to finish, (semi-)manually, just the pages that were capitalized or started with the letter 'a'. There is still 'c' to 'z' to do, and we need your help.

Lists of uncategorized pages

To make this more efficient, I wanted to sort those thousands of uncategorized words by language and part of speech (POS) first. For that I wrote a local script that fetches the pages, parses their headings, and stores the results in a local MySQL database. From that database I can easily sort them and create lists of uncategorized pages by language and by POS. I will insert those lists below, so you can pick a category you are familiar with and categorize its pages efficiently by copying and pasting the right template line. When you have finished categorizing a group of pages, please strike or delete them from this page. You can also put your name next to a group of pages to volunteer to keep watching it in case new pages appear.
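The original script is not shown here, so the following is only a sketch of the general approach described above: parse the section headings of each page, store (title, language, POS) rows in a database, and read them back grouped by language and POS. It uses Python's built-in sqlite3 in place of MySQL so that it runs without a server. The function names, the table layout, the POS list, and the sample pages are all assumptions made for illustration.

```python
import re
import sqlite3
from collections import defaultdict

# Matches wikitext headings such as ==English== or ===Noun===.
HEADING = re.compile(r"^(={2,6})[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)

# Hypothetical, incomplete list of part-of-speech heading names.
POS_NAMES = {"Noun", "Verb", "Adjective", "Adverb", "Proper noun"}


def parse_headings(wikitext):
    """Return (language, pos) pairs found in one page's wikitext."""
    language = None
    pairs = []
    for match in HEADING.finditer(wikitext):
        level = len(match.group(1))
        title = match.group(2)
        if level == 2:
            # Assumption: level-2 headings name the language.
            language = title
        elif language is not None and title in POS_NAMES:
            pairs.append((language, title))
    return pairs


def build_index(pages):
    """Store one row per (title, language, pos) in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE uncategorized (title TEXT, language TEXT, pos TEXT)"
    )
    for title, wikitext in pages.items():
        for language, pos in parse_headings(wikitext):
            conn.execute(
                "INSERT INTO uncategorized VALUES (?, ?, ?)",
                (title, language, pos),
            )
    return conn


def list_by_language_and_pos(conn):
    """Group page titles by (language, pos), sorted alphabetically."""
    groups = defaultdict(list)
    rows = conn.execute(
        "SELECT language, pos, title FROM uncategorized "
        "ORDER BY language, pos, title"
    )
    for language, pos, title in rows:
        groups[(language, pos)].append(title)
    return dict(groups)


# Made-up sample pages standing in for fetched wikitext.
SAMPLE_PAGES = {
    "cat": "==English==\n===Noun===\n# A feline.\n",
    "run": "==English==\n===Verb===\n# To move fast.\n===Noun===\n# A jog.\n",
    "chat": "==French==\n===Noun===\n# cat\n",
}
```

With the sample data, `list_by_language_and_pos(build_index(SAMPLE_PAGES))` groups "cat" and "run" under English nouns, "run" under English verbs, and "chat" under French nouns. Fetching the real pages is left out on purpose, since how the original script retrieved them is not described.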

Since Special:Uncategorizedpages is limited to 5000 entries, I could not get all pages right away. I will start with a few lists and add more later, step by step.

Thanks for your help! Mutante 17:50, 5 April 2008 (UTC)

Progress

WE ARE DONE!! YAY!

0 pages left to categorize

(*) I know because, by begging on IRC #mediawiki-toolserver, I got a raw list of ALL uncategorized pages. I will use that as the basis for my local db and keep counting down from now on. Very helpful; thanks to "SQLDb", who provided the file. Mutante 08:59, 6 May 2008 (UTC)

A new script is currently running; here is a GUI to watch. Mutante 12:40, 6 May 2008 (UTC)
  • 18,839 - as of 06 May 2008
  • 12,275 - as of 20 May 2008
  • 9,986 - as of 01 June 2008, all English verbs done
  • 6,564 - as of 28 June 2008, letters A - L completely done
  • 4,820 - as of 04 July. We have now reached the letter Z. You can see the non-Latin characters in [1], and the complete up-to-date list is on the normal Special page.
  • 2,927 - as of 29 July
  • 1,373 - as of 16 Aug
  • 906 - as of 28 Aug
  • 756 - as of 04 Sep
  • 499 - as of 07 Sep. All the twelve-XXX pages have been deleted.
  • 425 - as of 10 Sep
  • 309 - as of 13 Sep
  • 0 - as of 15 Sep
  • 24 and 0 again - as of 16 Sep

Now you can go on to uncategorized templates if you like. ;)

Note: I've started dealing with templates beginning with "L" or "l", since the majority of them are Latin templates that have been (or will be) deprecated. --EncycloPetey 20:17, 16 September 2008 (UTC)

Uncategorized pages by language

English

English uncategorized pages grouped by POS

English adjectives

All
