It all started with an email that Panos received from Amazon Web Services LLC informing him of monthly charges larger than the previous ones. The mail itself was a routine customer care message, the kind meant to show clients the features and functionality of Amazon's services.
The researcher was stunned to read the amount of $720, seven times greater than his ordinary charges for Amazon Web Services. Once logged into his account, he noticed that the total amount for usage charges was $1177.76! Reading the traffic details, he saw that $1065 had been charged for 8.8 terabytes of outgoing traffic! The total cost kept growing hour after hour, alarming the researcher.
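As a sanity check on those numbers, a back-of-the-envelope calculation lands in the same ballpark as the bill. The per-GB rate below is an assumption about AWS outbound transfer pricing at the time, not a figure taken from the article:

```python
# Back-of-the-envelope check on the bandwidth bill. The $0.12/GB
# outbound transfer rate is an assumed ballpark figure for AWS
# pricing at the time, not a number taken from the actual bill.
RATE_PER_GB = 0.12            # assumed USD per GB of outgoing traffic
traffic_gb = 8.8 * 1024       # 8.8 TB of outgoing traffic, in GB
cost = traffic_gb * RATE_PER_GB
print(f"{traffic_gb:.0f} GB -> ${cost:.2f}")  # in the ballpark of the $1065 charged
```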
What was happening? Why were the costs skyrocketing like that?
The researcher enabled logging on his S3 buckets and analyzed the reports provided by Amazon on the usage of the cloud service. He discovered that his S3 bucket was generating 250GB of outgoing traffic per hour.
He discovered that the source of the traffic was a large bucket containing images used for the Amazon Mechanical Turk web domain, approximately 250GB of pictures. Considering that each image was at most 1MB in size, the bucket must have been serving at least 250,000 images per hour, on the order of 100 requests per second, which is anomalous.
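This kind of analysis can be reproduced with a few lines of code. The sketch below sums outgoing bytes per hour from S3 server access logs; the regex is a simplified reading of the standard log format, so treat the field layout as an assumption and adapt it to real log lines:

```python
import re
from collections import defaultdict

# Sketch: aggregate outgoing bytes per hour from S3 server access
# logs. The regex is a simplified take on the standard S3 access log
# format (timestamp in brackets, quoted request line, HTTP status,
# error code, then bytes sent); verify it against your real logs.
LOG_RE = re.compile(
    r'\[(?P<ts>[^\]]+)\]'        # [06/Apr/2012:10:00:01 +0000]
    r'.*?"(?:GET|HEAD)[^"]*"'    # "GET /bucket/key HTTP/1.1"
    r' \d{3} \S+'                # HTTP status and error code
    r' (?P<sent>\d+|-)'          # bytes sent ("-" if none)
)

def bytes_per_hour(lines):
    totals = defaultdict(int)
    for line in lines:
        m = LOG_RE.search(line)
        if not m or m.group("sent") == "-":
            continue
        hour = m.group("ts")[:14]   # keep "dd/Mon/yyyy:HH"
        totals[hour] += int(m.group("sent"))
    return totals
```

For the researcher's bucket, the hourly totals produced this way would have hovered around 250GB.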
According to the researcher:
“Somehow the S3 bucket was being “Slashdotted” but without being featured on Slashdot or in any other place that I was aware of.”
Checking the logs, the researcher discovered that a Google agent was aggressively crawling the bucket.
Here are the IP address and the user agent of the requests:
126.96.36.199 Mozilla/5.0 (compatible) Feedfetcher-Google; (+http://www.google.com/feedfetcher.html)
Why would Google crawl this bucket? The images in the tasks posted to Mechanical Turk are not accessible for Google to crawl. The effect was that an S3 bucket holding 250GB of data generated 40 times that amount of traffic. Normally Google would download the images in the bucket only once, but in this case each image was being downloaded every hour.
The user agent in the logs does not belong to Google's web crawler (named Googlebot for web pages and Googlebot-Image for images); Feedfetcher is Google's grabber for the RSS or Atom feeds that users add to their Google homepage or Google Reader.
Note also that this agent ignores robots.txt, the file created by webmasters to mark the files or folders they want to hide from search engine spiders.
Continuing his analysis, the researcher noted that all the URLs of the images were also stored in a Google Spreadsheet; in particular, to display a thumbnail of each image, the researcher had inserted the image(url) command in the sheet. As a result, every time the spreadsheet is viewed, Google downloads all the images to create the thumbnails.
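To make the mechanism concrete, here is a small sketch that builds the kind of spreadsheet column the researcher describes. The bucket name and object keys are placeholders, not the real ones from the incident:

```python
# Sketch: build a column of image(url) formulas like the one in the
# researcher's spreadsheet. The bucket name and object keys below are
# placeholders, not the real ones from the incident.
urls = [
    f"https://s3.amazonaws.com/example-bucket/img/{i:04d}.jpg"
    for i in range(1, 4)
]

rows = [f'=image("{u}")' for u in urls]
print("\n".join(rows))
```

Pasting such a column into a sheet means that every render triggers a fresh fetch of every URL, which is exactly the amplification observed.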
But why download the same content repeatedly?
The explanation is simple: Google uses Feedfetcher as a generic "URL fetcher" for all sorts of "personal" URLs that someone adds to its services, not only for feeds. Google does not cache the fetched pictures because the URLs are considered private. In this specific case, it is in any event impossible to place a robots.txt in the root directory of https://s3.amazonaws.com, which is a server shared by many different customers.
Working with a huge domain such as s3.amazonaws.com, which contains terabytes of data belonging to different accounts, Google has no reason to apply rate limiting; it is normal for it to open thousands of concurrent connections to a set of URLs hosted there.
The steps to conduct a similar attack are:
- Collect a large number of URLs from the targeted website, preferably pointing to big media files (JPG, PDF, MPEG and similar)
- Put these URLs in a feed, or just put them in a Google Spreadsheet
- Add the feed to a Google service, or use the image(url) command in the Google Spreadsheet
- Sit back and watch Google launch a Slashdot-style denial of service attack against your target
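On the receiving end, the pattern produced by these steps is easy to spot in the access logs: a single user agent dominating the request count. A minimal detection sketch, assuming the user agent is the last quoted field on each S3 log line (the sample lines are invented):

```python
import re
from collections import Counter

# Sketch: count requests per user agent to spot an amplification
# pattern like the one described above. In S3 access logs the user
# agent is the last quoted field on the line; adjust as needed.
UA_RE = re.compile(r'"([^"]*)"[^"]*$')  # last quoted field on the line

def requests_per_agent(lines):
    counts = Counter()
    for line in lines:
        m = UA_RE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts
```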
The lesson learned is that it is possible to use Google as a cyber weapon to launch powerful denial-of-service attacks against other platforms.
In reality, the service in this case was not interrupted, but the attack made it extremely expensive to run, which is why the researcher named it a "Denial of Money" attack.
The event alerts security professionals to the possibility of attackers adopting cloud platforms or search engine services as primary attack tools. In the past we have already discussed the use of search engines as hacking tools. A fundamental aspect that must be considered during the design of a new service is its interaction with existing platforms.
Let us also consider the opportunities introduced by the cloud paradigm, a concept that, from a security perspective, must be analyzed in greater depth in the coming years. Cloud platforms could offer incredible opportunities to hackers if used as weapons; they represent a mine of information that is still far too vulnerable.