Geeklog Forums
Google, Geeklog Weirdness
chief123
Forum User
Chatty
Registered: 05/02/03
Posts: 58
On one GL site, which has been temporarily put on the back burner, I could never figure out why Google was only spidering part of the site. It came by every couple of days, but while it picked up just about everything else, it would not pick up the actual stories. After a little digging around I discovered something.
There are 2 types of stories I'm talking about here: those with no extended text and those with extended text. If a story has no extended text, Google will pick up part of the story as it crawls that topic, but not any comments.
For example, type "Is this a real community?" into Google, with the quotes, and the first result is an extended story from a while back here. Google has only the first part of the story, because it spidered the topic page that lists the stories and those words appeared on that topic page, if that makes sense.
However, it has no links to the extended part or to the comments, because Google apparently will not spider the stories themselves.
Google has links to various other parts of Geeklog sites but not the stories unless it picks up part of a story as above.
This may not be important to some, but it is to others who rely on traffic from Google. Many search engine experts recommend putting a lot of content on your site because Google and the others love good content. If Google isn't picking up a big part of it, then all that content is for naught (for search engine purposes).
Furthermore, this cuts down on the relevance of your rankings. One reason to put the story titles in the title tag is this: if someone is searching for ways to lose weight fast, and you had 50 stories on that topic with title tags such as "Mysite: Lose Weight Fast by Eating Less" and "Mysite: Lose Weight Fast by Exercising More", that may inch you closer to the top of the rankings.
However, since the stories aren't being picked up, those titles are basically irrelevant.
Even for a site like this one, if more of your words were in Google (because the stories were fully spidered), you would have more visitors, which means more potential contributions (financial, code, or otherwise).
Google has no problem with Geeklog's dynamic content, the length of its URLs, or anything like that as it stands today; otherwise, other GL pages that are dynamic and use the same URL format wouldn't be in Google either.
What would cause Google to get just about everything but the full stories themselves with comments? Does anyone know what the limiting factor is here or am I missing something?
Thanks for everything.
Jamie
Anonymous
Long query strings, i.e. big strings of numbers/letters after a "?"
Yes, smaller query-string variables are OK; it's just the big ones, which I think Google assumes to be some sort of unique session identifier, so it skips those URLs.
That's my theory, though I haven't tested it properly.
Jamie
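To make the theory above a bit more concrete, here is a small Python sketch of the kind of check a crawler might apply. It only illustrates the guess in this post and is not anything Google has documented: the function name `looks_like_session_id`, the 12-character cutoff, and the example URLs are all assumptions made up for the example.

```python
from urllib.parse import parse_qsl, urlsplit

# Hypothetical cutoff: query values at least this long are treated as suspicious.
MAX_SAFE_VALUE_LENGTH = 12


def looks_like_session_id(url: str, max_len: int = MAX_SAFE_VALUE_LENGTH) -> bool:
    """Return True if any query-string value is a long, purely alphanumeric run."""
    query = urlsplit(url).query
    for _name, value in parse_qsl(query, keep_blank_values=True):
        if len(value) >= max_len and value.isalnum():
            return True
    return False


# Made-up URLs, for illustration only.
print(looks_like_session_id("http://example.com/index.php?topic=News"))  # False
print(looks_like_session_id("http://example.com/article.php?story=20030502031712345"))  # True
```

If a crawler did use a rule like this, short values such as a topic name would pass while a long story ID would be skipped, which would match what was described in the first post.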
jamie
Anonymous
I'm referring, of course, to the ?story=xxxxxxx ID.
Try making the ID part of the URL (probably not practical for people who don't have admin access to their server, but it may serve as a test), or splitting the ID into smaller bits (horrible hack!), or maybe even using characters that don't look like a typical session ID (even more horrible hack).
Maybe I'll get around to testing this theory out one day :-)
Jamie (jamie at bishopston.net)
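As a rough illustration of the "make the ID part of the URL" idea, the Python sketch below turns a link like article.php?story=ID into article.php/ID and reads the ID back out of such a path. It is only a sketch under assumptions: the helper names `story_url_to_path` and `path_to_story_id`, the script name article.php, and the example URL are hypothetical, and the web server would still have to be set up to hand the extra path segment to the script, which is where the admin access comes in.

```python
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def story_url_to_path(url: str, param: str = "story") -> str:
    """Rewrite .../article.php?story=ID as .../article.php/ID (hypothetical helper)."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    story_ids = [value for name, value in pairs if name == param]
    if not story_ids:
        return url  # no story ID in the query string, leave the URL alone
    # Keep any other query parameters exactly as they were.
    remaining = [(name, value) for name, value in pairs if name != param]
    new_path = parts.path.rstrip("/") + "/" + story_ids[0]
    return urlunsplit(
        (parts.scheme, parts.netloc, new_path, urlencode(remaining), parts.fragment)
    )


def path_to_story_id(path: str, script: str = "article.php") -> Optional[str]:
    """Recover the ID from a path such as /article.php/ID, or None if absent."""
    marker = "/" + script + "/"
    _before, found, after = path.partition(marker)
    if not found or not after:
        return None
    return after.split("/", 1)[0]


# Made-up URL, for illustration only.
print(story_url_to_path("http://example.com/article.php?story=20030502031712345"))
print(path_to_story_id("/article.php/20030502031712345"))
```

The point of the test would be to see whether stories linked this way start showing up in Google while the ?story= versions still do not.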
anon
Anonymous
Did this help?
chief123
Forum User
Chatty
Registered: 05/02/03
Posts: 58
Thanks for your response, but I have no clue how to do that.
Any ideas on how to resolve the problem, or has anyone tried the ideas quoted below?
Thanks.
Quote by jamie: I'm referring, of course, to the ?story=xxxxxxx ID.
Try making the ID part of the URL (probably not practical for people who don't have admin access to their server, but it may serve as a test), or splitting the ID into smaller bits (horrible hack!), or maybe even using characters that don't look like a typical session ID (even more horrible hack).
Maybe I'll get around to testing this theory out one day :-)
Jamie (jamie at bishopston.net)