Module talk:grc-utilities

New tokenize function


I created a new version of the tokenize function in Module:User:Erutuon/grc and added it in this edit.

The old version used a lot of Ustring functions, especially mw.ustring.sub and mw.ustring.find. This made Module:grc-decl slow, because it frequently uses functions in Module:grc-accent that require tokenization.

The new version doesn't use any Ustring functions at all, except initially to decompose the characters. It relies only on the current character, the previous character, and the current token to decide which token to put the current character in. It's more verbose and difficult to understand, but significantly faster. That means that adding accents to every declined form in Module:grc-decl/sandbox with add_accent in Module:grc-accent is a bit faster than it was before. — Eru·tuon 21:10, 10 November 2017 (UTC)
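For readers who want a feel for the approach, here is a rough sketch of a single-pass tokenizer of this kind. It is written in Python for illustration and is not the code in Module:grc-utilities. The diphthong table, the `DIAERESIS` constant, and the exact grouping rules are simplifying assumptions. The only Unicode-aware step is the initial decomposition; after that, each character is placed by looking at the current character, the previous character, and the token being built.

```python
import unicodedata

# Assumed, simplified diphthong table: first vowel -> possible second vowels.
DIPHTHONGS = {"α": "ιυ", "ε": "ιυ", "η": "υ", "ο": "ιυ", "υ": "ι", "ω": "υ"}
DIAERESIS = "\u0308"  # combining diaeresis


def tokenize(text):
    """Split text into tokens: a letter (or diphthong) plus its diacritics.

    Tokens are returned in decomposed (NFD) form.
    """
    tokens = []
    current = ""  # token being built
    prev = ""     # previous character
    # Decompose once up front; everything after this is plain iteration.
    for ch in unicodedata.normalize("NFD", text):
        if ch == DIAERESIS and len(current) == 2 and not unicodedata.combining(prev):
            # The token holds two bare vowels that were joined as a diphthong,
            # but a diaeresis on the second vowel means they are separate.
            tokens.append(current[0])
            current = current[1] + ch
        elif unicodedata.combining(ch) and current:
            # A diacritic attaches to the token in progress.
            current += ch
        elif current and current == prev and ch in DIPHTHONGS.get(prev.lower(), ""):
            # The token is a single bare vowel and this vowel can follow it
            # in a diphthong, so keep both in one token.
            current += ch
        else:
            # Anything else starts a new token.
            if current:
                tokens.append(current)
            current = ch
        prev = ch
    if current:
        tokens.append(current)
    return tokens
```

Under these assumptions, καί is split into κ and αί, while πραΰς is split into π, ρ, α, ΰ, ς because the diaeresis breaks up the would-be diphthong.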