On March 26, Bloomberg published a report, “How Private Data Became Public on Amazon’s Cloud,” with chilling implications. At least some of the exposed data was clearly intended to be private, and its exposure leaves the companies involved dealing with potential losses and embarrassment. Revelations like this make many, both inside and outside IT, look at cloud privacy with trepidation.
But even this discovery does not show that any cloud storage option is less safe than on-premises storage. Let me explain why.
Cloud Storage Basics
To understand the storage threat described in the Bloomberg article, we can start with the basics of Amazon Simple Storage Service (S3). Individuals and businesses use S3 to store large quantities of data on Amazon’s storage hardware in Amazon data centers spread over the globe. S3 offers storage in increments called “buckets.” Users request buckets and receive a URL and credentials for accessing the buckets, which users then fill with data. Only the owner can access private buckets. Anyone on the Internet can access a public bucket.
When the user receives the bucket, S3 defaults it to private; only the credentialed owner can access the contents. Private buckets are not the only protection available for S3 data. Each file in a bucket has an access control that can block public access. Amazon offers data encryption services, and users can apply their own encryption to render the data unreadable without the key. No protection is completely impregnable, but gaining access to a locked and encrypted file stored in a private bucket is a formidable challenge for even the most skilled hacker.
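To make the idea of an access control concrete, here is a minimal sketch in Python of how one might spot a control that opens data to the public. It assumes the access control list is available as a dictionary with a "Grants" list, which is the general shape AWS client libraries return; the function name and the sample ACLs are hypothetical and written by hand for illustration. The two group URIs are the predefined S3 groups for "everyone" and "any AWS account."

```python
# Predefined S3 grantee groups that reach beyond the bucket owner.
ALL_USERS = "http://acs.amazonaws.com/groups/global/AllUsers"
AUTHENTICATED_USERS = "http://acs.amazonaws.com/groups/global/AuthenticatedUsers"
OPEN_GROUPS = {ALL_USERS, AUTHENTICATED_USERS}


def public_grants(acl):
    """Return the permissions an ACL gives to everyone or to any AWS account.

    `acl` is assumed to be a dict with a "Grants" list; this helper is a
    hypothetical illustration, not part of any AWS library.
    """
    exposed = []
    for grant in acl.get("Grants", []):
        grantee = grant.get("Grantee", {})
        if grantee.get("Type") == "Group" and grantee.get("URI") in OPEN_GROUPS:
            exposed.append(grant.get("Permission"))
    return exposed


# Hand-written sample ACLs (assumptions for illustration only).
private_acl = {
    "Grants": [
        {"Grantee": {"Type": "CanonicalUser", "ID": "owner-id"},
         "Permission": "FULL_CONTROL"},
    ]
}
public_acl = {
    "Grants": private_acl["Grants"] + [
        {"Grantee": {"Type": "Group", "URI": ALL_USERS},
         "Permission": "READ"},
    ]
}

print(public_grants(private_acl))  # owner-only ACL: nothing exposed
print(public_grants(public_acl))   # ACL with a grant to the AllUsers group
```

An empty result means the control grants nothing beyond named accounts; any entry in the result deserves a second look before real data goes into the bucket.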
Aim for the Foot
Amazon’s documentation on protecting data on S3 spells out the facilities available for protecting S3 data and recommends how to use them. The document indicates that Amazon’s facilities support compliance with some of the strictest privacy standards in the industry: PCI DSS (Payment Card Industry Data Security Standard) and HIPAA (Health Insurance Portability and Accountability Act). S3 default settings are secure, not open. In other words, to expose S3 data to the public, the S3 user must intentionally release the data. As someone on Slashdot put it, Amazon handed the user a gun that clearly said, “Do not pull trigger while pointing at self.”
The Bloomberg report was based on a blog post by Will Vandevanter, a security consultant. Vandevanter wrote a script to search out S3 buckets and check whether they were public or private; he found roughly 12,000 S3 buckets, of which about 2,000 were public. From the public buckets, he gathered a list of 126 billion exposed files, which he sampled.
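Vandevanter’s script is not reproduced here, so the following is only a sketch of how such a check could work, not his method. It assumes that an anonymous request to a bucket’s URL answers with HTTP 200 when the bucket is publicly listable, 403 when it is private, and 404 when no such bucket exists. The function names are hypothetical, and a probe like this should only be pointed at buckets you own or are authorized to test.

```python
import urllib.error
import urllib.request


def classify_status(status):
    """Map the HTTP status of an anonymous bucket request to a label."""
    if status == 200:
        return "public"    # contents can be listed without credentials
    if status == 403:
        return "private"   # bucket exists, anonymous access is denied
    if status == 404:
        return "missing"   # no bucket by that name
    return "unknown"       # redirects, throttling, or anything unexpected


def probe_bucket(name, timeout=5):
    """Send one anonymous request to a bucket URL and classify the answer.

    The URL pattern is the standard virtual-hosted S3 address; the helper
    itself is a hypothetical illustration.
    """
    url = f"https://{name}.s3.amazonaws.com/"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return classify_status(response.status)
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; the status is in err.code.
        return classify_status(err.code)
    except urllib.error.URLError:
        # DNS failure, refused connection, timeout, and similar problems.
        return "unknown"
```

Calling `probe_bucket` with the name of a bucket you control shows how it looks from the street rather than from inside your own account, which is exactly the view that mattered in the Bloomberg story.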
Not all the files in the public buckets were readable, but his findings are hair-raising: documents containing sales records from a car dealership and personal employee information were publicly readable. Among more than 5 million text documents, spot checks revealed many marked “Confidential” or “Private.”
The owners of the public buckets, for whatever reason, aimed for the foot and, intentionally and contrary to best practice, exposed 2,000 buckets. If they simply stopped doing that, the story would end there, but it does not.
The fact that carelessness, not system design, exposed critical data does not make me less worried about filling out the forms to buy a car. People trade security for convenience all the time. I have been in data centers where the cipher lock doors were wedged open and industrial plants where key server passwords were blank. The convenience of not punching in those keys and passwords trumped security. It happens. I’m speculating, but exposing those buckets to public access was probably just another form of convenience.
Are cloud deployments more vulnerable to these mistakes than on-premises deployments? Well, in some ways, yes. On premises, people will sometimes prop the door open, but you can hope that it happens only when someone responsible has an eye on the door. The cloud’s vulnerability is tied to habits carried over from traditional on-premises security hygiene. Cloud technology is often poorly understood, especially when it neutralizes measures that have traditionally assured users that their IT activities are safe and private.
System administrators tend to assume that traditional physical security and firewalls are effective. In a typical car dealership, only employees with a need to know have access to the dealership network. The public has no access to this network and cannot read files stored there. Leaving a file unprotected in that environment may not be a good idea, but it is not a disaster either. When a dealership starts storing files on S3, administrators may not think through cloud best practices and, in the interest of convenience, apply the same permissions that they would on their private network. In other words, they open the S3 bucket to give their users their customary convenient access. But now, instead of leaving open an interior door accessible only to authorized employees, they have left the door open to the street, and we all have another reason to fret about buying a new car.
As I said at the beginning of this post, Vandevanter’s discoveries do not show that Amazon S3, or any other cloud storage for that matter, is less safe than on-premises storage. But system administrators and individuals must realize that safety in the cloud is not the same as safety on the premises. Properly secured, cloud storage is undoubtedly safer than on-premises storage. The back window of a car dealership is far more vulnerable to a smash-and-grab identity thief than a physically secure cloud data center. Nevertheless, the emphasis must be on “properly secured.” When balancing convenience and security for cloud storage in general, the rules are different, and the user must understand and follow them. Cloud providers have a stake in the success of their customers, and their best practices are one good place to start understanding the rules. There are some security standards and principles that I discuss in my book Cloud Standards that may be helpful, but in the end, the user must aim and fire at the miscreants, not their own feet.