Module talk:grc-utilities

From Wiktionary, the free dictionary
Latest comment: 6 years ago by Erutuon in topic New tokenize function

New tokenize function

I created a new version of the tokenize function in Module:User:Erutuon/grc and added it in this edit.

The old version used a lot of Ustring functions, especially mw.ustring.sub and mw.ustring.find. This made Module:grc-decl slow, because it frequently uses functions in Module:grc-accent that require tokenization.

The new version uses no Ustring functions at all, except in the initial step that decomposes the characters. To decide which token the current character belongs to, it looks only at the current character, the previous character, and the current token. It is more verbose and harder to follow, but significantly faster. As a result, adding accents to every declined form in Module:grc-decl/sandbox with add_accent in Module:grc-accent is a bit faster than it was before. — Eru·tuon 21:10, 10 November 2017 (UTC)
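
For readers who want a feel for the approach described above, here is a minimal sketch of a single-pass tokenizer that consults only the current character, the previous character, and the current token. It is written in Python rather than Lua and is not the code from Module:grc-utilities. The diphthong list, the diaeresis handling, and the names DIPHTHONGS, VOWELS and DIAERESIS are assumptions made for illustration.

```python
import unicodedata

# Assumed for illustration: a simplified diphthong inventory.
DIPHTHONGS = {"αι", "ει", "οι", "υι", "αυ", "ευ", "ηυ", "ου", "ωυ"}
VOWELS = set("αεηιουω")
DIAERESIS = "\u0308"


def tokenize(text):
    """Split Greek text into tokens: a letter or diphthong plus its diacritics.

    The input is decomposed once up front; after that, each step looks only
    at the current character, the previous character, and the current token.
    """
    decomposed = unicodedata.normalize("NFD", text)
    tokens = []
    token = ""
    prev = ""
    for ch in decomposed:
        if unicodedata.combining(ch):
            # A diaeresis on the second vowel means the two vowels are not
            # a diphthong after all, so split the token before attaching it.
            if ch == DIAERESIS and len(token) == 2 and token.lower() in DIPHTHONGS:
                tokens.append(token[0])
                token = token[1]
            token += ch
        elif (
            prev
            and prev.lower() in VOWELS
            and token == prev
            and (prev + ch).lower() in DIPHTHONGS
        ):
            # The previous character is a bare vowel that forms a diphthong
            # with the current one, so extend the current token.
            token += ch
        else:
            # Any other base character starts a new token.
            if token:
                tokens.append(token)
            token = ch
        prev = ch
    if token:
        tokens.append(token)
    return tokens
```

Under these assumptions, tokenize("εἰμί") returns three tokens in decomposed form: the diphthong ει with its breathing mark, μ, and ι with its acute accent. No substring or pattern-matching calls are made inside the loop, which is the property the comment above credits for the speedup.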