User talk:Dodde

From Wiktionary, the free dictionary

Welcome

A bit belated, I see, but welcome, nonetheless.

Welcome!

Hello, and welcome to Wiktionary. Thank you for your contributions. I hope you like the place and decide to stay. Here are a few good links for newcomers:


I hope you enjoy editing here and being a Wiktionarian! By the way, you can sign your name on Talk (discussion) and vote pages using four tildes, like this: ~~~~, which automatically produces your name and the current date. If you have any questions, see the help pages, add a question to the beer parlour or ask me on my Talk page. Again, welcome! —Dvortygirl 00:53, 6 October 2006 (UTC)

thanks ;) Dodde 01:03, 6 October 2006 (UTC)

Pronunciation

Hello, I've recorded an audio collection of 2800 Swedish words... you can get the collection in the .ogg or .flac format from this address: http://shtooka.net/project/swac/?lng=en (it is the "swe-balm-voc" package)

This package is distributed under a CC-BY license. Here is a demonstration of what we can do with such a collection: http://shtooka.net/dico/swe/

Contact me for more information. You can find me (zMoostik) on the #wiktionary irc channel.


Bot

How does DoddeBot find the pages that have the numeric HTML entities? It is a very good idea to replace them; there are a number scattered about our Han character entries (that I am working on reformatting). Robert Ullmann 23:18, 5 November 2006 (UTC)

It is using the XML dump to find articles including &#\n\n\n?\n?\n?; then it decodes all HTML entities it finds in those articles (even non-numeric entities, like ° = &deg ;). For examples of performance, check the last 250 edits of Doddebot on the Swedish Wiktionary [1]. ~ Dodde 05:10, 6 November 2006 (UTC)
I see that the bot is making some errors. For one, it replaces & # 133 ; (…) with nothing. If it doesn’t know how to convert some HTML code or other, it should leave it unchanged rather than simply deleting it. —Stephen 00:06, 9 November 2006 (UTC)
Taking a second look, it isn’t deleting the "…", it is converting it into an invisible character "…" (it’s between the quote marks). —Stephen 00:21, 9 November 2006 (UTC)
I am using a free Perl module called HTML::Entities [2] to perform the decoding of the HTML entities. & # 133 ; was found in two articles, раз and однажды. In these two articles the script using the module failed to decode the entity & # 133 ; correctly. Why this is failing is beyond my knowledge. I can do a workaround for future changes by first changing & # 133 ; to the corresponding named entity & hellip ; and by doing this it seems to be able to decode correctly and preserve the original character. Why this works, I have no idea. Did you find other errors than these two articles? ~ Dodde 12:43, 9 November 2006 (UTC)
No, that was the only one I discovered. Thanks. —Stephen 13:05, 9 November 2006 (UTC)
& #hellip is & #8230, not 133. 133 is a control code in the C1 space: "…" that Microsoft (urk! #^$%&&# Microsoft!) decided to use for an ellipsis. So HTML treats "#133" as an entity name, not a numeric, and displays the correct character. This is true for everything between #130 and #159; you should probably look up the correct characters instead of calling decode_entity. (Or, as you say, subbing the correct entity number or name and then calling it.) Let me find you a reference ... okay, [3] should do; see the table about halfway down. Robert Ullmann 13:10, 9 November 2006 (UTC)
One more thing, you need to pick up 5-digit numbers as well; Han characters start at 19968. (4E00, 一) Robert Ullmann 13:14, 9 November 2006 (UTC)
I was referring to the same document when I assumed #133 was the same as #hellip, but I guess I was wrong. To make this safe in the future, should I simply rename these numbers (presented in green) to the corresponding entity names before calling the decode function? Regarding the 5-digit numbers, it is matching these; my post here just wasn't entirely accurate. (changing the original post with +\n?). ~ Dodde 13:53, 9 November 2006 (UTC)
There was a total of 8 articles that had these #130-#159 included. These have now been corrected (after being blanked, unfortunately). ~ Dodde 14:28, 9 November 2006 (UTC)
Yes, change them to the entity names and then let decode do its thing. Robert Ullmann 14:22, 9 November 2006 (UTC)
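
The fix agreed on in this thread (match numeric entities of any length, map the #130-#159 range to the characters they were meant to be, and leave anything that cannot be decoded untouched) can be sketched roughly as follows. This is an illustrative Python sketch, not DoddeBot's actual code, which was Perl using HTML::Entities. The function name `decode_entities` is hypothetical, and treating the C1 range as Windows-1252 bytes is an assumption based on the comment above about Microsoft's use of 133 for an ellipsis.

```python
import re
from html.entities import name2codepoint

# Matches numeric (&#133;) and named (&hellip;) entities in a single pass,
# so already-decoded text is never decoded a second time.
ENTITY = re.compile(r"&(#[0-9]+|[A-Za-z][A-Za-z0-9]*);")


def _decode(match):
    body = match.group(1)
    if body.startswith("#"):
        code = int(body[1:])
        if 128 <= code <= 159:
            # C1 range: interpret as a Windows-1252 byte (assumption),
            # so 133 becomes U+2026 instead of an invisible control code.
            try:
                return bytes([code]).decode("cp1252")
            except UnicodeDecodeError:
                return match.group(0)  # undefined in cp1252: leave unchanged
        try:
            return chr(code)
        except (ValueError, OverflowError):
            return match.group(0)  # not a valid code point: leave unchanged
    if body in name2codepoint:
        return chr(name2codepoint[body])
    return match.group(0)  # unknown entity name: leave unchanged


def decode_entities(text):
    """Replace HTML entities in text, keeping anything undecodable as-is."""
    return ENTITY.sub(_decode, text)
```

With this sketch, `decode_entities("&#133;")` gives the ellipsis character U+2026, and an unknown entity such as `&bogus;` is returned unchanged rather than deleted.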

malfunction

Hi, it just blanked several entries. Blocked temporarily (just tell me you've got this message ... it will expire in an hour anyway ... ;-) Cheers! Robert Ullmann 14:12, 9 November 2006 (UTC)

I am aware of it, I am correcting it now. Thanks... ~ Dodde 14:13, 9 November 2006 (UTC)
okay, unblocked ;-) Robert Ullmann 14:17, 9 November 2006 (UTC)

regarding creation of t-lang templates

Hi, BLOCKED

You didn't (that I can see) even suggest a bot run to create these on WT:GP. Robert Ullmann 00:20, 30 March 2007 (UTC)

Aspects of this matter have been discussed intensively for over a week. I am part of the discussion because I have been told that people would appreciate Doddebot running this service on enwikt. To make it all happen, Doddebot needs to be used for some preparatory work, which I am confident everyone participating in the discussions supports. Connel suggested creating this extra set of templates twice in the discussion, and creating it by bot is simple and risk-free. Connel supports this too. Apart from what is discussed on Wiktionary, many hours of discussion have taken place in the chatroom as well, in order to present an appealing package that fits the wishes of the community. ~ Dodde 01:29, 30 March 2007 (UTC)

t2 template: solution!

So I got lucky. Please see User:ArielGlenn/Template:atg1 for a working version which does not generate spurious includes of nonexistent templates, and see User:ArielGlenn/Sandbox for examples of it in action, taken from the t2 talk page.

Is it ugly? Yup. Does it work? Barely! Here's the summary:

It's not a MediaWiki bug. The m:Help:Template#Template_expansion page, in the section "Template expansion", describes exactly what is going on. All calls to templates get expanded, whether they sit in the true OR the false branch of an if statement. This is a known "side effect". The only way to beat this is to put an if statement *inside* of the template braces, and figure out what is going to be called there on the spot. This essentially is what I did with every template call in t2. There is probably a better way to arrange the logic, I'm sure.

Happy trails! ArielGlenn 07:09, 30 March 2007 (UTC)

t7 error

I am aware it doesn't provide an anchor, but this will only be used for languages which have no Wiktionary. That means probably less than 1 of 100 translations will have this code, and probably less than 1 of 10 of these words will have its own entry on enwikt, and probably less than 1 of 10 of the words which will have an entry will have the entry in an article with so many languages that an anchor is even needed. This multiplies up to a single case every 10,000 translations. And only one of 100 articles that have the exact syntax {t7|-|word} will need an anchor at all. So I think the code should be changed back to keep the simplicity. For any option we will later use, we will make an enhanced template which can be used for various options, but imo t, t- and t+ should be the simplest possible for the task. (I noticed your change broke the syntax, and I don't understand what you mean by "see {{t7!}}" since the link is red-colored). Btw, are you on IRC? Then we could discuss some of the issues? ~ Dodde 09:47, 1 April 2007 (UTC)

Not an error, a work in progress. The results should be the same now, except I'm not sure how you had displayed the case of an invalid language code. Is what I have okay? DAVilla 10:54, 1 April 2007 (UTC)
I understand, I just kinda liked the t7-code as it was, since there was very little code in it and it was very effective, just like t7+ and t7- along with it, and nothing more was needed. When you totally changed the content of the code and added the language template, I felt that maybe a completely separate option was a better way to present it. Regarding {{language}}, I am not sure if it is suited for being used with a translation template at all. Personally I hope that the extension [4] will be added for use in {{t}}. In practice it will be much less server-intensive than both calling the second set of language templates and using the language template you've been working on. Until then I think it is ok to use the second set of language templates, since the translation lists have not grown as large as they might some time in the future... ~ Dodde 13:31, 1 April 2007 (UTC)
Why do you say I completely changed the content of the code? {{language}} may be complicated but its implementation is not that far from what you were doing. It too relies on template arrays, it just does about twice as much error checking. All I did was replace your unchecked implementation with a more user-friendly version, using {{language}} instead of {{t-xx}} directly and {{langcode}} instead of the raw language code. The skeleton of your code is identical and the results for the correct uses are unchanged. DAVilla 15:59, 1 April 2007 (UTC)

Bots, bots, bots!!

Hi Dodde. I was just wondering ... umm ... if you could possibly run your bot over on ga.wikti, too. We're in the middle of reviving that wiktionary and are making excellent progress, and a bot like yours would be really helpful. We'd really appreciate it ;) - Ali-oops 19:41, 13 September 2007 (UTC) (ga.wikt/gd.wikt sysop)

I registered an account at gawikt; let's talk about how I can help. ~ Dodde 01:49, 1 October 2007 (UTC)