Geeklog Forums
search bots and multilingual support
genetikci
Anonymous
What will a search bot do if you activate multi-language support?
I have a website with a default language, and all my stories so far are in that language. I was thinking about starting to write some stories in English. However, the language switch is done with a cookie, and I don't think Googlebot accepts cookies. Alternatively, Geeklog may detect the bot's language as English and show it only the English content.
Can people who use this feature let me know what happens?
Thanks.
LWC
Forum User
Full Member
Registered: 02/19/04
Posts: 818
Not only is it done with a cookie, it may also be done with a form (if you have more than two languages, and always in the pre-official hack).
I used to think Google was smart enough to handle all of that somehow, but I've just tried Googling some of my sites, and it seems that only a very few of the pages that aren't in the default language are indexed. I have no idea what's so special about those few that got them through.
Come to think of it, when I say "used to" I mean back when I used it as a hack. It's as if, when the hack turned official, something was done that made it less Google-friendly (and I don't mean the choice form itself; again, in the hack it was there even for two languages).
Maybe a dynamic page (linked from the main page) with a site map (one that ignores languages) would solve this issue. I know some people in these forums have asked about a site map in the past. I wonder if anyone has anything in progress.
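A rough sketch of the language-independent site map idea, in Python rather than Geeklog's PHP. Everything specific here is an assumption made for illustration: the `STORIES` records, the `article.php?story=...&lang=...` URL scheme and the `example.com` host are hypothetical and are not taken from Geeklog. The only point is that every language version gets its own plain URL, so a crawler can reach it without a cookie or a form.

```python
from xml.sax.saxutils import escape

# Hypothetical story records: (story id, language code).
# This is NOT Geeklog's real schema, just sample data for the sketch.
STORIES = [
    ("welcome", "de"),
    ("welcome", "en"),
    ("news-2008", "de"),
]


def story_url(base_url, story_id, lang):
    # Assumed URL scheme: the language travels in the query string,
    # so no cookie is needed to select a language version.
    return f"{base_url}/article.php?story={story_id}&lang={lang}"


def build_sitemap(base_url, stories):
    """Build a sitemaps.org style XML document listing every story
    in every language, regardless of the visitor's language."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for story_id, lang in stories:
        # escape() turns "&" into "&amp;" so the XML stays well-formed.
        loc = escape(story_url(base_url, story_id, lang))
        lines.append(f"  <url><loc>{loc}</loc></url>")
    lines.append("</urlset>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_sitemap("http://example.com", STORIES))
```

The same list could just as well be rendered as a visible HTML page of ordinary links; the XML form is only one possible output.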
genetikci
Anonymous
So, Google sees the default language.
I was thinking about adding invisible links in header.thtml to static pages that list the stories in other languages.
LWC
Forum User
Full Member
Registered: 02/19/04
Posts: 818
genetikci wrote: "So, google sees the default language."
...except for certain stories, all of them from sites that used to have the hack, which was seemingly more Google-friendly (not that I know why).
genetikci wrote: "I was thinking about adding invisible links"
If Google notices that, they will lower your PageRank. And if they don't notice it themselves, they have a report form where even a random searcher can report that your site uses invisible links.
So it has to be a visible link unless you want trouble.
Then again, if someone comes up with a dynamic site map, we could always just submit it to Google Webmasters instead of linking to it from the main page. But then you'd have to register every site you have with Google Webmasters, and you'd also lose the other search engines.
ivangan
Forum User
Newbie
Registered: 03/14/08
Posts: 1
The reason Google and other search engines don't properly index multilingual sites is that they do NOT understand the problems associated with them. Google, for example, does not have a truly multilingual web site; it has very limited translations of its site.
The same can be said for almost all web sites, because there are exceptionally few truly multilingual sites.
Compare static, dynamic and native translation:
Static: Get a translator to translate the pages individually and create a translated version of the site.
Dynamic: Use some form of reference file (e.g. a PO file or include files), translate it, and include it according to the selected language.
Native: The NLSO library provides a back-end database that caters for 'n' languages. In theory it is currently possible with NLSO to support almost all written languages, in excess of 5000!
As for indexing:
Static: Can be indexed easily enough, though some robots get upset when trying to read Unicode; most development environments give poor Unicode support, so the foundation to build on is not there.
Dynamic: Depending on the implementation and the search engine, you will get better or worse results. It also depends on whether the robot sends the HTTP Accept-Language header and on the site's ability to detect and use it.
Native: NO hope. Search engines cannot effectively index a site that supports not only hundreds of languages but also their dialects.
The best you can hope for in this case is the Accept-Language header, which NLSO fully supports.
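For anyone wondering what "detect and use" the Accept-Language header amounts to, here is a minimal sketch in Python. It is not NLSO's or Geeklog's code; the function names `parse_accept_language` and `pick_language` and the sample language sets are made up for illustration. It orders the requested languages by their q-values and falls back to the site default when nothing matches, which is also what a robot that sends no such header would get.

```python
def parse_accept_language(header):
    """Return the language tags of an Accept-Language value, best first.

    Tags with q=0 (explicitly not acceptable) are dropped.
    """
    prefs = []
    for position, part in enumerate(header.split(",")):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        quality = 1.0  # a missing q parameter means q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed q-value as unacceptable
        # Sort key: higher quality first, then original order.
        prefs.append((-quality, position, tag.strip().lower()))
    prefs.sort()
    return [tag for neg_q, _, tag in prefs if neg_q < 0]


def pick_language(header, supported, default):
    """Choose a site language for a request, or the default if none fits."""
    for tag in parse_accept_language(header or ""):
        if tag in supported:
            return tag
        primary = tag.split("-")[0]  # e.g. "en-us" -> "en"
        if primary in supported:
            return primary
    return default


if __name__ == "__main__":
    site_languages = {"de", "en"}  # hypothetical site configuration
    print(pick_language("en-US,en;q=0.8,de;q=0.5", site_languages, "de"))
    print(pick_language(None, site_languages, "de"))
```

With this approach one URL serves different content depending on the header, so a robot still only ever sees one language per address.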
What is NLSO?
Natural Language Support Objects, a PHP framework designed to enable web site creation with full native language support.
Example: http://comchatter.com/nlso
I know of no effective remedy for this problem, and until there are large numbers of native multilingual sites, I expect there is little hope of the search engines making much effort to support them.