Status: offline

vvprok

Forum User
Newbie
Registered: 07/07/03
Posts: 10
Geeklog has been translated into many languages, which is great!
However, GL does not handle multibyte characters correctly.

As you know, the string functions strlen, strpos, substr, etc. do not take string encoding into account and work on byte sequences only. As a result, for example, the links plugin incorrectly composes the shortened title for the "What's New" block: it keeps the first 16 bytes of the link title and then appends "...". With the uk_UA.UTF-8 locale I got 7 characters of the Ukrainian title, followed by some garbage characters before the "...".

And as you also know, there is another set of functions specifically for multibyte encodings: mb_strlen, mb_strpos, mb_substr, and so on.
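To make the difference concrete, here is a minimal sketch of the byte-versus-character problem. The sample title is made up for illustration (it is not the link title from the report above), and the snippet assumes the mbstring extension is loaded and the file is saved as UTF-8.

```php
<?php
// Illustrative title only: 13 Cyrillic letters (2 bytes each in UTF-8) plus 1 space.
$title = 'Новини України';

$byteLen = strlen($title);              // strlen counts bytes
$charLen = mb_strlen($title, 'UTF-8');  // mb_strlen counts characters

// Byte-based truncation: byte 16 is the first half of a 2-byte letter,
// so the result ends in an incomplete character.
$cut = substr($title, 0, 16);
$isValid = mb_check_encoding($cut, 'UTF-8');

echo "$byteLen bytes, $charLen characters\n";
echo $isValid ? "cut is valid UTF-8\n" : "cut ends in a partial character\n";
```

The dangling half character at the end of `$cut` is the kind of garbage that shows up before the "..." in the block.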

I have already fixed the links plugin using the mb_* functions (see here).
I simply changed calls of the form
Text Formatted Code
str...(...)

 
to
Text Formatted Code
mb_str...(..., $LANG_CHARSET)
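As a concrete instance of this substitution, a character-based version of the 16-character truncation might look like the sketch below. The helper name `shorten_title` is hypothetical and is not taken from the links plugin; `$LANG_CHARSET` is assumed to be set by the active language file, and is set by hand here only so the snippet runs on its own.

```php
<?php
// Assumption: in Geeklog this global comes from the language file.
$GLOBALS['LANG_CHARSET'] = 'utf-8';

// Hypothetical helper, not part of Geeklog or the links plugin.
function shorten_title($title, $maxChars = 16)
{
    global $LANG_CHARSET;

    // Compare and cut by characters, not bytes.
    if (mb_strlen($title, $LANG_CHARSET) <= $maxChars) {
        return $title;
    }
    return mb_substr($title, 0, $maxChars, $LANG_CHARSET) . '...';
}

echo shorten_title('Новини України сьогодні'), "\n";
```

Because the cut is made on a character boundary, the shortened title never ends in a partial character.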

 
However, changing every call by hand like this looks too cumbersome to serve as a general solution for all string-related operations.

So, I propose creating a lib-strings.php module that contains string-related functions. These functions would hide the implementation details of string handling from the rest of the GL code. All of them would look like this:
Text Formatted Code

function gl_strlen($string)
{
    global $LANG_CHARSET;
    return mb_strlen($string, $LANG_CHARSET);
}
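Following the same pattern, two more wrappers could be sketched as below. The names `gl_substr` and `gl_strpos` are assumptions that simply mirror `gl_strlen`, and the fallback to the byte-based functions when mbstring is not installed is one possible design choice, not something the proposal specifies.

```php
<?php
// Assumption: in Geeklog this global comes from the language file.
$GLOBALS['LANG_CHARSET'] = 'utf-8';

// Hypothetical wrapper: character-based substr with a byte-based fallback.
function gl_substr($string, $start, $length = null)
{
    global $LANG_CHARSET;

    if (!function_exists('mb_substr')) {
        // mbstring is missing: fall back to the old byte-based behaviour.
        return $length === null
            ? substr($string, $start)
            : substr($string, $start, $length);
    }
    return mb_substr($string, $start, $length, $LANG_CHARSET);
}

// Hypothetical wrapper: character-based strpos with a byte-based fallback.
function gl_strpos($haystack, $needle, $offset = 0)
{
    global $LANG_CHARSET;

    if (!function_exists('mb_strpos')) {
        return strpos($haystack, $needle, $offset);
    }
    return mb_strpos($haystack, $needle, $offset, $LANG_CHARSET);
}
```

With wrappers like these, calling code would only ever use the gl_* names, and the decision about which underlying function to use stays in one file.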


 


So, what do you think?

Status: offline

sakata

Forum User
Junior
Registered: 12/17/01
Posts: 25
Hi,
I have created a COM_titlesplit function.
See
http://www.geeklog.net/forum/viewtopic.php?showtopic=65070

I think having lib-strings.php is a good idea.