"I am webby and I think webby" - AjiNIMC aka Aji Issac Mathew - "I thought and I wrote".

I am Aji Issac Mathew, also known as AjiNIMC on various forums. I am webby and I think webby. I am a part-time blogger, and this blog documents my experiences and what I have learned.


Google bot went unhappy

Sep
16

This happened the last time we moved servers, some 6 months ago. After the move we kept a watch on the bots, and we started facing THE PROBLEM with a few of them: “cache loss”.

I checked everything I could: robots.txt, .htaccess, the PHP programs, frames. I validated robots.txt and ran XHTML validation on all the pages to make sure I was not doing anything wrong.

It did no good. The number of cached pages kept going down, from over 20,000 to 10,000, and then from 10,000 to 5,000. It started to worry me and my team, as search engines contribute a large share of traffic (almost 60% in our case).

Then I started investigating:

  • Investigation part 1:
    I changed my browser's user agent to that of Google bot (using a Firefox extension). I was still able to access the pages.
  • Investigation part 2:
    I checked the log files manually and could find no trace of Google bot.
  • Investigation part 3:
    I made sure that Google was not having problems at its end. I read almost all the recent search engine postings at webmasterworld, search engine watch, digg.com, webproworld, hedir.com and blogs like mattcutts.com, and found nothing. Our other sites were not losing their cache either.
  • Investigation part 4 to 100:
    I did every other check I could think of.
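Parts 1 and 2 can also be done from a script instead of by hand. The following is only a minimal sketch, not the method used at the time: the user-agent string is a commonly published Googlebot one (the post does not say which string was used), the function names are made up for illustration, and the log check assumes an access log format that records the user agent on each line.

```python
import urllib.request

# Assumption: a commonly published Googlebot user-agent string.
GOOGLEBOT_UA = (
    "Mozilla/5.0 (compatible; Googlebot/2.1; "
    "+http://www.google.com/bot.html)"
)


def request_as_googlebot(url):
    """Build (but do not send) a request that identifies itself as Googlebot."""
    return urllib.request.Request(url, headers={"User-Agent": GOOGLEBOT_UA})


def fetch_status(url):
    """Send the spoofed request and return the HTTP status code.

    A 4xx or 5xx response raises urllib.error.HTTPError, which would
    suggest the server treats this user agent differently.
    """
    with urllib.request.urlopen(request_as_googlebot(url), timeout=10) as resp:
        return resp.status


def count_bot_hits(log_lines, token="googlebot"):
    """Count log lines that mention the bot token, ignoring case."""
    token = token.lower()
    return sum(1 for line in log_lines if token in line.lower())
```

To check a log, pass an open file to `count_bot_hits`, for example `count_bot_hits(open("access.log", errors="replace"))`, where `access.log` is a placeholder path. A result of zero matches what I saw by hand: no trace of the bot. Keep in mind that spoofing the user agent only rules out blocking by user agent; it says nothing about what the real bot receives.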

No way out - Last shot
When we saw that there was no way out, we decided to switch back to the old server. Then, while testing with Live HTTP Headers, I saw that the header being passed had content type “text/html”.
Our servers were not passing content type “text/plain” for the txt files. I asked about this at various forums and everyone said that it shouldn’t make any difference. I had no other options, so I decided to pass the right content type, “text/plain”. I configured it and left it to God.
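The same header check can be scripted. This is a rough sketch under assumptions: the function names are hypothetical, and it assumes the server answers HEAD requests for the file being tested (for example a site's robots.txt).

```python
import urllib.request


def media_type(content_type_header):
    """Return the bare media type, dropping parameters such as charset."""
    return content_type_header.split(";", 1)[0].strip().lower()


def is_plain_text(content_type_header):
    """True when the header declares text/plain."""
    return media_type(content_type_header) == "text/plain"


def served_content_type(url):
    """Ask the server for the headers only and return Content-Type."""
    req = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.headers.get("Content-Type", "")
```

Calling `is_plain_text(served_content_type(url))` on a txt file returns False when the server sends it as “text/html”, which is the situation described above.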

It was the Eureka moment: Google started visiting us again and soon cached all the pages. Believe it or not, the header matters to Google bot. They may change this later, but it certainly mattered for us at that time.

This post was written by AjiNIMC aka Web Kotler at 2:42 pm under category Tech Talks




1 Comment »

  1. [...] As usual after the shift you are suppose to keep a check on the spiders esp the Google bot. Last time I faced a strange problem and lost almost all the cache. This time our team who were checking the raw log file directly and with log analyzer (sawmill, awstats) told me that Google bot is not visiting our site. I took it lightly and took it as their mistake as I could see the latest cache with Google. When the team forced me to look at the raw log file I found them with no guilt, they reported the truth. I did a grep and found no trace of Google bot. It certainly worried me. [...]



    My Abode » Google Bot mystery on September 16, 2006 @ 3:25 pm
