I’m certainly not addicted to web stats. This blog has a Google Analytics marker but I don’t follow the statistics on a regular basis. After all, I’m blogging for fun and I don’t need to keep my audience at a certain size or attract more visitors – even if a growing audience is very rewarding. That’s a good opportunity to thank all my readers! Did you also notice that no commercial ads are displayed here? (Except for some specific security events or podcasts, but they deserve it!)
On the other hand, I keep an eye on the server logs. I’m addicted to “logs”. They provide very useful information about your visitors and their behavior. Never forget: you need logs and you need to take care of them. Even if they contain non-critical information, the same details may become very valuable in the future when you have to investigate a security incident. Think about this…
So, while reviewing the log file of the web server running this blog, I found something interesting. I published my last post yesterday at 18:40 GMT+2. Google fetched and indexed the data less than three minutes later:
126.96.36.199 - - [29/Aug/2010:18:41:01 +0200] "GET /2010/08/29/back-online-2/ \
HTTP/1.1" 200 15085 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; \
Another statistic? Since the beginning of this month, the Google bot has hit this blog 30056 times! OK, honestly, blogs are not the best reference. Lots of blogging platforms notify Google when new content has been published with messages such as “Hey, Google, I’ve something for you!“. But regular websites are also very often “crawled” by Google. A small forum that I maintain (with very low activity) has been visited by Google 3509 times this month.
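For readers who want to pull the same kind of number out of their own logs, here is a minimal sketch in Python. It assumes an Apache "combined" style log where the user agent is the last quoted field. The function name `count_bot_hits`, the sample lines and their addresses are made up for illustration; they are not taken from this blog's real logs.

```python
import re
from collections import Counter

# Captures the date of the request and the last quoted field of the line,
# which is the user agent in a combined-format access log (assumption).
LOG_LINE = re.compile(
    r'\[(?P<day>\d{2})/(?P<month>\w{3})/(?P<year>\d{4}):[^\]]*\]'
    r'.*"(?P<agent>[^"]*)"\s*$'
)


def count_bot_hits(lines, marker="Googlebot"):
    """Count hits per month ('Mon/YYYY') whose user agent contains `marker`."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and marker in match.group("agent"):
            hits[f'{match.group("month")}/{match.group("year")}'] += 1
    return hits


# Hypothetical, simplified sample lines (documentation-range IP addresses).
SAMPLE = [
    '192.0.2.10 - - [29/Aug/2010:18:41:01 +0200] "GET /a/ HTTP/1.1" 200 15085 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"',
    '192.0.2.20 - - [29/Aug/2010:18:45:12 +0200] "GET /a/ HTTP/1.1" 200 15085 "-" "SampleBrowser/1.0"',
    '192.0.2.10 - - [30/Aug/2010:02:03:04 +0200] "GET /b/ HTTP/1.1" 200 9876 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"',
]

if __name__ == "__main__":
    for month, total in count_bot_hits(SAMPLE).items():
        print(month, total)
```

To run it against a real file, pass the open file object instead of `SAMPLE`. Keep in mind that a user agent string can be forged, so this counts hits claiming to be the Google bot, nothing more.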
What does it mean? If you publish some content on the Internet, don’t expect to get a chance to take your data offline. By the time you finish reading this post, it has already been indexed! Bots like the Google one have powerful algorithms and know where to find relevant information. “CTRL-Z does not work on the Internet”.
Google released interesting statistics about the number of requests they received from government agencies around the world. If you offer free services on the Internet, there are (mis)chances that people will try to abuse them. Google is certainly not the exception with all the services they provide: webmail, web indexes, collaborative tools and all the sites they bought month after month like Picasa, YouTube & co.
The sampling period was between July 1, 2009 and December 31, 2009. Two types of requests are reported:
- Removal requests: Official requests to ask for removal of content from Google search results or from another Google product like YouTube.
- Data requests: Official requests to ask for information about Google user accounts or products.
Statistics are interesting but, like any number, can be interpreted in different ways! Let’s have a look at my country – Belgium:
- 67 data requests
- <10 removal requests, and none (0%) were complied with by Google
Different observations could be:
- Belgian Internet users are wise (only 67 requests)
- or Belgian Internet users are crafty (only 67 of them were caught!)
- or Belgian authorities are overloaded (only 67 requests were processed)
- or the workload to investigate online crime is too high (only 67 cases were closed)
On the other hand, Google complied with none of the removal requests issued from Belgium. Lack of knowledge? Lack of procedure? Google makes its own observations. Interesting to read.
One more time, companies turned to the courts to fight against Google. This time, the Google Suggest tool is the target. This service is quite simple and you probably use it on a daily basis. When you type your search terms in the search engine, Google offers keyword suggestions in real time. Google Suggest is certainly not bullet-proof and can sometimes give funny results.
Funny or insulting results? Google has been convicted of public insult (“injure” in French law) by a French court (it looks like the French justice system is quite busy at the moment). The complainant company blamed Google for suggesting a combination of the company name and the word “arnaque” (“scam” in English). After some months of fighting between the parties, Google removed the association.
I won’t take sides in this story. Google can be the worst evil and I don’t know this French company. But there is no smoke without fire! If Google suggested this query, it means that the Google bots found some references to this company and its complaining customers in blogs and forums. This reminds me of the story of the Belgian jeweler who got very bad publicity on the Internet a few months ago.
When will companies realize that there is “life” on the Internet? The era of Dad’s static website containing only your own filtered information is over! Customers exchange information about your company on the Internet, and potential customers look for more information. Keep an eye on your online reputation!
Once again, Google hit hard! They announced yesterday a new service via their blog: Google Public DNS.
The new Google baby is a public DNS resolver open to everyone. Just reconfigure your TCP/IP stack to use the following DNS server and you’re done!
Google’s arguments address the current DNS limitations directly: security (DNS is a key component of the Internet infrastructure. Lots of attacks may compromise DNS caches and redirect traffic to fake websites), speed (Google implemented a prefetching technique to continuously refresh popular domains before the TTL on a record expires) and availability.
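To make the prefetching idea more concrete, here is a toy sketch in Python. It is not Google’s implementation (which is not described here), just an illustration of the principle: a cache that re-resolves "popular" names shortly before their TTL runs out, so that clients rarely wait for a cold lookup. The class `PrefetchingCache`, its parameters `refresh_margin` and `popular_after`, and the injected `resolve` function are all assumptions made for this example.

```python
import time
from dataclasses import dataclass


@dataclass
class Record:
    address: str
    expires_at: float
    hits: int = 0


class PrefetchingCache:
    """Toy DNS cache that refreshes popular records before they expire."""

    def __init__(self, resolve, clock=time.monotonic,
                 refresh_margin=5.0, popular_after=3):
        self._resolve = resolve              # callable: name -> (address, ttl)
        self._clock = clock                  # injectable clock, eases testing
        self._refresh_margin = refresh_margin
        self._popular_after = popular_after
        self._records = {}

    def _fetch(self, name, now, hits=0):
        address, ttl = self._resolve(name)
        record = Record(address, now + ttl, hits)
        self._records[name] = record
        return record

    def lookup(self, name):
        """Return the cached address, resolving only on a miss or expiry."""
        now = self._clock()
        record = self._records.get(name)
        if record is None or record.expires_at <= now:
            record = self._fetch(name, now, record.hits if record else 0)
        record.hits += 1
        return record.address

    def prefetch(self):
        """Re-resolve popular records close to expiry; return their names."""
        now = self._clock()
        refreshed = []
        for name, record in list(self._records.items()):
            popular = record.hits >= self._popular_after
            expiring = record.expires_at - now <= self._refresh_margin
            if popular and expiring:
                self._fetch(name, now, record.hits)
                refreshed.append(name)
        return refreshed
```

In a real resolver, `prefetch()` would run periodically in the background and `resolve` would query the authoritative servers. Here both are left to the caller to keep the sketch short.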
Immediately, OpenDNS, another well-known DNS service provider, posted some comments on their blog about the Google announcement. I personally liked the 3rd point in the article: Google is “the largest advertising and redirection company on the Internet”. Using their DNS, they will be able to track you even more and build user profiles. DNS server logs are a goldmine!
Google already handles our web searches (google.com), our e-mail (gmail.com), online documents (docs.google.com), instant messaging (gtalk.com), now DNS requests and soon your operating system (ChromeOS)? Don’t put all your eggs in one basket!
Unusual but tonight, Gmail was unavailable as reported by the Google Apps Status Dashboard. Besides the fact that more and more people rely on the number-one webmail interface to handle their e-mails, this problem has impacted other Internet social services like Twitter!
When users detected the problem, they immediately tweeted to ask if it was a local or global problem. Check out the result on search.twitter.com.
Other services more focused on security like the Internet Storm Center received a huge number of notifications. What can we deduce from this incident?
First, and not a surprise, Gmail can be considered a critical service on the Internet (it does not seem unfair to compare Google with the delivery of water or electricity in real life). Any minor problem is immediately detected and reported by the user community. Second, third-party services like Twitter can suffer from the outage. They must be prepared to face a peak of network traffic or server load.
Now let’s wait for some official communication from Google. They have to be transparent to keep the confidence of their users. A few days ago, Apache did it perfectly after an SSH key was compromised. They gave a detailed overview of the incident via a blog article. I’m curious about Google’s feedback!
[Edit 03/09/2009 09:16]
Google posted more information about the issue which affected Gmail yesterday evening (GMT time): More on Today’s Gmail Issue.
The Google toolbar is a powerful add-on for your browser. It adds very nice features (of course, to be used sparingly if you don’t want Google to know everything about you). There is also an API which offers extra features for webmasters such as creating custom buttons.
Here is a quick how-to for adding custom search engines to your Google toolbar. By default, Google provides a lot of alternate search engines (example: Wikipedia). Week after week, I’m using the Twitter search engine more intensively, so why not add it as a custom search engine?
Step 1: Go to the Twitter search page, right-click the search field and select “Generate custom search“.
Step 2: Fill in the required parameters if needed.
Step 3: Done! Your new search engine is available via the menu on the right side of your search field.
That’s it! (thanks to @brianrose for the tip)
I was driving into Brussels this afternoon and saw two Google cars. The second one stopped just in front of me at a traffic light. The cameras were recording (I saw some activity on the LEDs on top of the cams).
I wrote down the place and time. Let’s see now how much time it will take to have the images available online!
The Cult of the Dead Cow team is back with a new toy called Goolag.
One more time, the power of the Google search engine is diverted to help webmasters find security breaches in their web site(s). Of course, as a good boy, you will always use Goolag against your own site! Won’t you? :-]
Goolag is a frontend (today, only available for Windows – via a mirror in Belgium) and uses the well-known Google Hacking Database. The source code is also available.
Notice that the tool properly handles the Google scan protections! It allows you to open a browser, enter the captcha and resume the scan!
Let’s make some tests…
 If you combine Goolag with Tor, your anonymity will be preserved…