Googlebot "attacked" my site!
I had the strangest thing happen to me recently.
These are my server's statistics.
I had about 8,000 visits per day. At first I thought it was a counter error, but later on I saw that my website traffic had reached 1 GB in a few days!!! I thought that someone was trying to hack my website, and only after checking the IPs of the "intruder" did I find out it was none other than Googlebot!!
I immediately uploaded a robots.txt file to my server telling Googlebot to go away, and the day after, the visits ceased.
What went wrong? Why did googlebot keep visiting my site so many times?
I didn't even have meta information prior to the "attack".
These visits used up a lot of my server's bandwidth. That is some strange behaviour from something that should be the "friendliest" thing on the web.
Does anyone know what might be the reason for it? This is the first time such a thing has happened, even though it's not the first time I have worked with PHP-Fusion on the same server, on a different domain (with the same settings).
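A minimal sketch of that IP check, in Python, for anyone who wants to repeat it: do a reverse DNS lookup on the address, confirm the hostname ends in googlebot.com or google.com, then resolve that hostname forward and confirm it maps back to the same address. The function name is made up, and the two stand-in resolvers exist only so the example runs offline; the address and hostname are just sample values.

```python
import socket

# Hostname endings that a genuine Googlebot reverse lookup should have.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")

def is_googlebot(ip, reverse=socket.gethostbyaddr, forward=socket.gethostbyname_ex):
    """Hypothetical helper: reverse lookup, suffix check, then forward lookup."""
    try:
        host = reverse(ip)[0]            # (hostname, aliases, addresses)
        if not host.endswith(GOOGLE_SUFFIXES):
            return False
        return ip in forward(host)[2]    # hostname must map back to the same IP
    except OSError:                      # lookup failed, treat as not verified
        return False

# Stand-in resolvers (assumptions, so no network is needed for the demo).
def fake_reverse(ip):
    return ("crawl-66-249-72-50.googlebot.com", [], [ip])

def fake_forward(host):
    return (host, [], ["66.249.72.50"])

print(is_googlebot("66.249.72.50", reverse=fake_reverse, forward=fake_forward))
print(is_googlebot("203.0.113.9", reverse=fake_reverse, forward=fake_forward))
```

Called without the two stand-ins, the function uses the real DNS lookups from the socket module. The second call shows why the forward lookup matters: a hostname that merely looks like Googlebot is not enough if it does not resolve back to the visiting address.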
Re: Googlebot "attacked" my site!
I am confused too... :confused:
Where in the statistics does it say 8000 visits per day?
Where in the statistics does it say 1GB?
Are you sure you posted the right stats?
Risto
Re: Googlebot "attacked" my site!
He did post the right statistics, though you might find that it doesn't say exactly 1GB or 8000, I found 15000 and 56000kb...
This behavior is indeed weird for Googlebot... I've actually never heard of anything similar...
I even searched a hackers' forum and found nothing... (they usually have info on security problems and such)
Re: Googlebot "attacked" my site!
Quote:
Originally Posted by
Manwe
He did post the right statistics, though you might find that it doesn't say exactly 1GB or 8000, I found 15000 and 56000kb...
This behavior is indeed weird for Googlebot... I've actually never heard of anything similar...
Ok, so what's the problem again? This is what I see:
19981 35.14% 11141 40.74% 325488 56.23% 38 12.38% crawl-66-249-72-50.googlebot.com And that's for the month.
Looks normal to me? What's weird about it?
Availor, here's what you can do to limit the impact from search engines (SEs) if bandwidth (BW) is an issue for you:
(1) Implement a proper robots file. One that asks SEs to stay away from admin / structure files, and only look at content. Why ban them from everything? I assume you still want to be visible in search engines, and for people to find and come to your site? Ask SEs to stay away from big files (big downloads, movies, games etc.)
(2) Use Google Webmaster Tools to tell the robot to slow down its crawl rate.
Both 1 and 2 can take more than a few days to kick in.
Also, once robots learn about your site's update pattern - they will adapt accordingly.
You asked: What went wrong? Why did googlebot keep visiting my site so many times?
It has only visited you 38 times so far? That's nothing...
Risto
Re: Googlebot "attacked" my site!
Maybe I posted the wrong statistics file :confused: Trust me on this one, I had about 8000 visits per day, and I found that it's not an uncommon problem:
http://www.google.com/support/webmas...571&topic=8460
So I'm not the only one - it happens. At least I know how to solve the problem. :rolleyes:
Re: Googlebot "attacked" my site!
Maybe you were thinking of "hits" at 8000? The hits can get high, as the server counts everything accessed - every file (including those used to build the page), every single graphical element etc.
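To make the difference concrete, here is a small Python sketch that counts hits, pages and visits from the same handful of requests. It is only an illustration under stated assumptions: the request list is made up, "page" is assumed to mean a path ending in .php, .html, .htm or /, and a new visit is assumed to start after 30 minutes of silence from the same IP. Your stats package may use different rules.

```python
# Assumption: which paths count as a "page" rather than a supporting file.
PAGE_SUFFIXES = (".php", ".html", ".htm", "/")
# Assumption: 30 minutes without a request from an IP ends its visit.
VISIT_TIMEOUT = 30 * 60

def summarize(requests):
    """requests: iterable of (ip, unix_time, path) tuples (made-up format)."""
    hits = pages = visits = 0
    last_seen = {}
    for ip, when, path in sorted(requests, key=lambda r: r[1]):
        hits += 1                                  # every request is a hit
        if path.endswith(PAGE_SUFFIXES):
            pages += 1                             # only real pages count here
        if ip not in last_seen or when - last_seen[ip] > VISIT_TIMEOUT:
            visits += 1                            # first request or long gap
        last_seen[ip] = when
    return {"hits": hits, "pages": pages, "visits": visits}

# Made-up sample: one robot loads a page plus its images and stylesheet,
# reads a second page a minute later, then comes back two hours later.
sample = [
    ("66.249.72.50", 0, "/index.php"),
    ("66.249.72.50", 1, "/images/logo.gif"),
    ("66.249.72.50", 2, "/images/spacer.gif"),
    ("66.249.72.50", 3, "/style.css"),
    ("66.249.72.50", 60, "/news.php"),
    ("66.249.72.50", 7200, "/index.php"),
]
print(summarize(sample))  # {'hits': 6, 'pages': 3, 'visits': 2}
```

Six hits, three pages, two visits, all from one visitor. A counter that reports the first number as "visits" will look alarming very quickly.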
Looking at the stats, you had 38 separate visits from Google. This is low... If you have a frequently updated site, you would want to see more frequent visits than that. The more visits, the quicker your content will (most likely) end up in the SERPs.
Again, you need a good robots file.
Risto
Re: Googlebot "attacked" my site!
Quote:
Originally Posted by
Availor
So I'm not the only one - it happens. At least i know how to solve the problem. :rolleyes:
I know it happens too - but it's not you. It would be in the stats.
This is really only an issue for sites that have an incredible amount of content, especially those with big files, and even more so for sites hosted with limited BW (that are tapdancing around the allowed quota). I guess server load can become an issue too, if you are being bombarded by SEs and traffic and have all kinds of (sometimes poorly designed) backend scripts running.
Risto
Re: Googlebot "attacked" my site!
Availor, try clicking on the home page of talk graphics where it says "currently active users" and see how often this site is hit by the Yahoo! Slurp Spider. It's quite an eye opener.
Re: Googlebot "attacked" my site!
I had 18,000 visits in two to three days. I don't think that's normal. Things are better now that I have set up a robots.txt file.
Re: Googlebot "attacked" my site!
Quote:
Originally Posted by
Availor
I had 18,000 visits in two to three days. I don't think that's normal. Things are better now that I have set up a robots.txt file.
Availor, are you sure you don't mean "HITS"? :confused: A robot normally looks at more than one page per visit... Also, one single page can result in hundreds of "hits" depending on what it contains - images, php files etc. used to build that one page.
Looking at the updated stats for your site (yesterday, the 25th) I see:
104 visits (people and robots) - they viewed a total of 703 pages - a total of 6402 files were accessed - 12192 hits (note that this number includes the tiniest spacer image) - the bandwidth is negligible (almost nothing).
Where do you see the 18,000 visits?
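Just to show how far apart those numbers are, a quick bit of arithmetic in Python using the figures for the 25th quoted above (the variable names are mine, nothing more):

```python
# Figures for the 25th as quoted from the stats page.
visits = 104
pages = 703
files = 6402
hits = 12192

pages_per_visit = pages / visits   # how many pages each visitor looked at
hits_per_page = hits / pages       # how many requests one page view caused
hits_per_visit = hits / visits     # how many hits one single visit produced

print(f"pages per visit: {pages_per_visit:.2f}")  # 6.76
print(f"hits per page:   {hits_per_page:.2f}")    # 17.34
print(f"hits per visit:  {hits_per_visit:.2f}")   # 117.23
```

So on that day one visit produced well over a hundred hits on average, which is how a hit count can look like an invasion when the visit count is modest.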
Risto
Re: Googlebot "attacked" my site!
I think this is a good point to link to our "Web site statistics" thread, which has some explanations of Hits, Visits and Page Views.
Regards,
Remi
Re: Googlebot "attacked" my site!
Quote:
Originally Posted by
Risto Klint
Availor, are you sure you don't mean "HITS"? :confused: A robot normally looks at more than one page per visit... Also, one single page can result in hundreds of "hits" depending on what it contains - images, php files etc. used to build that one page.
Looking at the updated stats for your site (yesterday, the 25th) I see:
104 visits (people and robots) - they viewed a total of 703 pages - a total of 6402 files were accessed - 12192 hits (note that this number includes the tiniest spacer image) - the bandwidth is negligible (almost nothing).
Where do you see the 18,000 visits?
Risto
18,000 - that was the counter number. I supposed that the counter shows only unique visits. Anyhow, I hope this will not repeat itself. Now I have a running website which is doing quite well. Remi, thanks for the explanation, I hadn't really given it a thought prior to reading this post :)
Re: Googlebot "attacked" my site!
If by counter you mean one of those simple little scripts (no tracking of IPs) that supposedly show the visitors to a page as a plain number (the "000018000" kind of thing), they are highly inaccurate...
Availor, last words from me... :p You are making a big mistake by not allowing robots to index your site. Yes, I guess one can do without search engine traffic, but if you wish to have a well visited site, why ignore it? This also means no pagerank for you... For some advertisers no pagerank means - no money for you.
Why not just block them from indexing everything but your content?
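A rough sketch of what that could look like, checked with Python's standard urllib.robotparser module. The directory and file names in the robots.txt below (/administration/, /includes/, /downloads/, /print.php) and the example.com address are made up for illustration; substitute whatever your own site actually uses.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep robots out of admin/structure files and
# big downloads, leave the content pages open. All paths are assumptions.
ROBOTS_TXT = """\
User-agent: *
Disallow: /administration/
Disallow: /includes/
Disallow: /downloads/
Disallow: /print.php
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Ask the parser which URLs a well-behaved robot may fetch.
for path in ("/news.php", "/administration/index.php", "/downloads/movie.avi"):
    allowed = parser.can_fetch("Googlebot", "http://example.com" + path)
    print(path, "allowed" if allowed else "blocked")
```

With these rules the content page stays crawlable while the admin area and the downloads are skipped, which is the opposite of a blanket "go away".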
Risto
Re: Googlebot "attacked" my site!
Risto, I intend to delete the robots.txt soon, to see whether it happens again :) Maybe I will allow them to index only a portion of the site or the main page.
Re: Googlebot "attacked" my site!
Ok, I said in the last post that it would be my last one... but... :rolleyes:
Again, don't delete the robots.txt file - use it the way it was intended to be used! It's very useful with the big 3 at least - Google, Yahoo and MSN.
With the robots.txt file:
1. You can remove unnecessary files from the index, like files used to build your pages.
2. You can remove files that kill your BW, like movies, flash games, images and other big downloads.
3. You can remove files that make it appear that you have duplicated content on your site. Meaning, several independent files and pages (categories, pagination, feeds, random page scripts, archives etc.) that link to or represent content in a way that makes it appear duplicated. This might result in some of your content not being listed in the SERPs but in the supplementary results, or being ignored altogether.
It's not a question of denying everything or allowing everything, but of finding a way that works for your site.
If 1-2-3 is completely confusing - PM me, and I'll show you (It's not appropriate here).
Risto
Re: Googlebot "attacked" my site!
You sure it's not the AdSense bot rather than the search index bot?
Re: Googlebot "attacked" my site!
Hello marck_don,
I hope you are not expecting a response from the original poster of this thread.
He has been inactive on the forums for about 3 years now.