In GL 2.1 there is a new and highly appreciated logfile called 404.log.
This file contains all 404 errors encountered (dead links). Some of them are caused by normal users who simply mistype a real link. Others are caused by bots scraping incorrect links from somewhere else.
However, I also see a lot of 404s listed that have an encoded link, with '%3F' in place of the question mark and '%26' in place of the ampersand. It seems that Apache does not handle these URL-encoded requests the way the sender intended: with the '?' encoded, the whole string is treated as a path instead of a script name plus query string.
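To illustrate what happens to such a request, here is a minimal Python sketch. The URL is a made-up example (not taken from my log), and `split_target` is just a helper name for this illustration. It only shows how the encoded '?' ends up inside the path; it is not how Apache is implemented.

```python
from urllib.parse import urlsplit, unquote

def split_target(raw_target):
    """Return (decoded_path, query) for a request target.

    The split happens on a literal '?' only, so an encoded '%3F'
    stays in the path and only reappears after decoding.
    """
    parts = urlsplit(raw_target)
    return unquote(parts.path), parts.query

# Hypothetical request as a bot might send it, with '?' and '&' encoded.
encoded = "/article.php%3Fstory%3Dexample%26mode%3Dprint"
# The same link as a normal browser would send it.
plain = "/article.php?story=example&mode=print"

encoded_result = split_target(encoded)
plain_result = split_target(plain)

print(encoded_result)  # path holds everything, query is empty
print(plain_result)    # path is the script, query holds the parameters
```

In the encoded case the server ends up looking for a file whose name literally contains the '?', which does not exist, hence the 404.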
Then I see a lot of 404s starting with 'RK=0/RS='. These seem to come from Yahoo search results.
The following is a quick patch to .htaccess that corrects Apache's behaviour. There is a lot of information on the Internet on the subject. Just search for '%3F' or 'RK=0'.
The patch below did the job for me.
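RewriteEngine On
# Oddly behaving bots: these URLs are scraped from Yahoo (bots scraping for
# links; Yahoo search links contain RK and RS). Tenants modification:
RewriteRule ^(.*)RK=0/RS= /$1 [L,NC,R=301]
# The caret is escaped so that it matches a literal '^' after 'RS=' (assumed intent).
RewriteRule ^(.*)RS=\^ /$1 [L,NC,R=301]
# A literal '?' in the decoded path means it arrived as %3F: turn the rest into a query string.
RewriteRule ^(.*)\?(.*)$ /$1?$2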
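For anyone who wants to check what the three rules do before touching a live site, here is a rough Python sketch that mimics them with the `re` module. It is a simplification under a few assumptions: the input is the already decoded path, the leading slash is stripped as Apache does in .htaccess context, the caret after 'RS=' is treated as a literal character, and the sample paths are invented. The names `RULES` and `rewrite` are only for this illustration.

```python
import re

# (pattern, substitution template, is_redirect) for each RewriteRule.
RULES = [
    (re.compile(r"^(.*)RK=0/RS=", re.IGNORECASE), r"/\1", True),
    (re.compile(r"^(.*)RS=\^", re.IGNORECASE), r"/\1", True),
    (re.compile(r"^(.*)\?(.*)$"), r"/\1?\2", False),
]

def rewrite(decoded_path):
    """Return (new_target, is_redirect) for the first rule that matches."""
    path = decoded_path.lstrip("/")  # .htaccess rules see no leading slash
    for pattern, template, is_redirect in RULES:
        match = pattern.search(path)
        if match:
            # Like RewriteRule, the whole URL is replaced by the substitution.
            return match.expand(template), is_redirect
    return decoded_path, False

yahoo = rewrite("/index.php/RK=0/RS=abc123")
encoded = rewrite("/article.php?story=example&mode=print")
untouched = rewrite("/index.php")

print(yahoo)
print(encoded)
print(untouched)
```

Note that the first rule keeps everything before 'RK=0/RS=', including a trailing slash if there is one.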