The Cloud Conundrum: S3 Encryption
AWS will now encrypt all new data in its Amazon S3 storage service by default. Huge announcement, secure default for the win, sure, but it *may* give a false sense of security. Here’s how.
👋 Dear reader: Hope you’re staying safe, and going strong with your new year resolutions. This is the first part of a series of posts I wish to write on peculiar cloud security challenges. In this post, I will cover:
Encryption at rest in cloud
Amazon S3 and its encryption options
How cloud’s server side encryption can give a false sense of security, and what you can do about it
Encryption is a tricky concept. It’s simple at the surface, but dig a level deeper and it unravels like Game of Thrones subplots.
Let’s take AWS' recent announcement that all new objects in Amazon S3 (Simple Storage Service) will now be encrypted by default.
Amazon S3 now applies server-side encryption with Amazon S3 managed keys (SSE-S3) as the base level of encryption for every bucket in Amazon S3. Starting January 5, 2023, all new object uploads to Amazon S3 will be automatically encrypted at no additional cost and with no impact on performance.
What was earlier a 1-click setup is now zero-click. AWS, and its S3 service especially, operates at a mind-boggling scale. S3 holds 280 trillion objects and averages over 100 million requests per second. Supporting transparent encryption on all new objects without breaking any existing functionality, dependencies, or applications is impressive, to say the least. Kudos to the engineering teams.
But... does the default SSE-S3 encryption provide effective confidentiality? And does it help reduce the impact of one of the primary causes of CISO migraines, i.e., data leakage via intentionally or accidentally public S3 buckets?
Short answer: No.
I’m a big proponent of AWS, and can say from experience that AWS takes security very seriously. However, in this case, server side encryption with default S3 keys (SSE-S3) can be misconstrued, potentially leading customers to hold off on employing stricter encryption schemes (which are all available natively in AWS, btw) for sensitive data.
Let’s dive in.
Cryptography - a quick primer
Encryption converts readable data to a random looking blob. It is the reason we can watch dog videos in private, or crib about things on group chats (well, mostly), or buy toilet paper online securely. On a serious note, encryption is a fundamental tool for cybersecurity to the extent that it can be an enabler of human rights, by allowing freedom of speech through end-to-end encryption. It is a big deal to get it right.
Technically, encryption is the process of converting plaintext data into ciphertext. There are 2 types of encryption: symmetric and asymmetric. Symmetric encryption uses the same key to encrypt and decrypt the data. Asymmetric encryption uses separate keys to do the same, and is also called public key cryptography. We will limit this blog post to symmetric encryption of data at rest, which is the data stored on disks.
encrypt(plaintext, key) → ciphertext
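To make the shape of that function concrete, here is a minimal sketch of a symmetric cipher in Python. It is a toy of my own construction (a SHA-256 based keystream XORed with the data), used only because it runs on the standard library. It is not AES and must not be used to protect real data. The point is simply that the same key both encrypts and decrypts.

```python
import hashlib
import secrets

NONCE_SIZE = 16  # bytes of per-message randomness, stored in front of the ciphertext


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from key + nonce (toy construction)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out.extend(hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(out[:length])


def encrypt(plaintext: bytes, key: bytes) -> bytes:
    """encrypt(plaintext, key) -> ciphertext. TOY ONLY, not a vetted algorithm."""
    nonce = secrets.token_bytes(NONCE_SIZE)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def decrypt(ciphertext: bytes, key: bytes) -> bytes:
    """Reverse encrypt() using the same key, which is what makes it symmetric."""
    nonce, body = ciphertext[:NONCE_SIZE], ciphertext[NONCE_SIZE:]
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))
```

With `key = secrets.token_bytes(32)`, calling `decrypt(encrypt(b"dog videos", key), key)` returns the original bytes, while decrypting with any other key returns gibberish. In real code you would reach for a vetted AES implementation instead.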
The efficacy of a good encryption scheme depends upon the strength of the encryption algorithm (the lock) and of the encryption key (the key).
🔒 First is the lock, i.e., the algorithm itself. There are many encryption algorithms available; the most common today is the Advanced Encryption Standard with a 256-bit key (AES-256). There is enough evidence, backed by decades of public cryptanalysis, to safely assume that the AES-256 encryption algorithm is not broken, for now. From one estimate, if we use the combined compute power of every PC on earth (estimated 2 billion PCs), it’d take 13,689 trillion trillion trillion trillion years to brute force AES-256. For comparison, the age of the universe is a meager 14 billion years. Quantum computers might change the equation sooner rather than later, but for now AES-256 is considered a quantum resistant algorithm. Moral of the story here is to trust the researchers and not invent your own crypto.
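For a feel of the arithmetic behind estimates like this, here is a rough back-of-the-envelope sketch. The 2 billion PCs and the 14 billion year age of the universe come from the text above. The guessing rate of one billion keys per second per PC is purely my assumption, so the result will not match the quoted estimate, whose inputs I don't know. Only the order of magnitude matters.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 14e9          # figure used in the text

PCS_ON_EARTH = 2_000_000_000          # estimate quoted in the text
KEYS_PER_SECOND_PER_PC = 1e9          # ASSUMPTION: illustrative guessing rate


def brute_force_years(key_bits: int, machines: int, keys_per_second: float) -> float:
    """Expected years to find a key by exhaustive search.

    On average the key turns up after trying half of the keyspace,
    i.e. 2**(key_bits - 1) guesses.
    """
    expected_guesses = 2 ** (key_bits - 1)
    total_rate = machines * keys_per_second   # guesses per second, all machines
    return expected_guesses / total_rate / SECONDS_PER_YEAR


years = brute_force_years(256, PCS_ON_EARTH, KEYS_PER_SECOND_PER_PC)
universes = years / AGE_OF_UNIVERSE_YEARS
print(f"{years:.2e} years, or {universes:.2e} times the age of the universe")
```

Even if my assumed rate is off by a factor of a million in either direction, the answer stays absurdly larger than the age of the universe, which is why the realistic attack is always on the key, not the algorithm.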
🔑 Then comes the key. As a wise man once said: if a thief has your key, no lock is strong enough. That’s why protecting the key is the most important part of a secure encryption scheme. So in theory, it’s pretty simple to protect your data. Choose a vetted encryption algorithm, and protect the key. In practice, things are more complex.
Any modern production application usually has many different data sources, and hence many encryption schemes and keys. For instance, Pinterest currently stores and manages 1 exabyte of data on AWS. Nope - that’s not a typo, that is one frickin’ exabyte, or 1 billion gigabytes of data, which needs to be protected.
This humongous amount of data is unlikely to be in a single data store. So now, you need to manage encryption across all the applicable data stores, equating to potentially thousands of encryption keys. Add to it the security best practice of rotating the keys periodically, or in case of an incident, deleting a whole bunch of keys. Doing this on your own is a nightmare. Ask Harry Potter.
Luckily, you don’t have to. There are key management services for both on-premises and cloud environments. In AWS, the service is called AWS Key Management Service (AWS KMS).
This brings us to the types of encryption choices available in S3.
Encryption in Amazon S3
You can either do Server Side Encryption (SSE), in which Amazon S3 encrypts your data as it writes it to disks in its data centers and decrypts it for you when you access it. With server side encryption, there are 3 broad ways to manage your encryption keys.
One option is for S3 to fully manage the encryption keys (SSE-S3). This option places the most trust in AWS, and is the reason I’m writing this post. A second option is for customers to use a key that is managed by the AWS Key Management Service (SSE-KMS). This option gives customers control and transparency over access to their keys with strong auditing. Spoiler: this is my recommendation for most use-cases. The third option is for the customer to provide and manage the key, but have S3 perform the actual encryption and decryption (SSE-C). This gives customers a level of separation between themselves and AWS; do note that there’s a small window where the encryption key will be present on AWS servers to do encryption and decryption. With any of these 3 options, you hand all the encryption, decryption and associated compute headaches to AWS.
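From the caller's side, the three server side options differ only in a few request parameters. The sketch below builds the keyword arguments that I would expect to pass to boto3's S3 `put_object` call for each mode. The helper name `put_object_kwargs`, the bucket, the object key and the KMS key ID are all hypothetical, and nothing here talks to AWS.

```python
from typing import Optional


def put_object_kwargs(bucket: str, key: str, body: bytes, mode: str,
                      kms_key_id: Optional[str] = None,
                      customer_key: Optional[bytes] = None) -> dict:
    """Build put_object parameters for SSE-S3, SSE-KMS or SSE-C (hypothetical helper)."""
    params = {"Bucket": bucket, "Key": key, "Body": body}
    if mode == "SSE-S3":
        # S3 owns and manages the key; this is now the default anyway.
        params["ServerSideEncryption"] = "AES256"
    elif mode == "SSE-KMS":
        params["ServerSideEncryption"] = "aws:kms"
        if kms_key_id is not None:
            # If omitted, S3 falls back to the AWS managed key for S3.
            params["SSEKMSKeyId"] = kms_key_id
    elif mode == "SSE-C":
        if customer_key is None or len(customer_key) != 32:
            raise ValueError("SSE-C needs a 256-bit (32-byte) customer key")
        # The key travels with every request; S3 uses it and does not store it.
        params["SSECustomerAlgorithm"] = "AES256"
        params["SSECustomerKey"] = customer_key
    else:
        raise ValueError(f"unknown mode: {mode}")
    return params
```

Assuming boto3 is installed and credentials are configured, usage would look like `boto3.client("s3").put_object(**put_object_kwargs("example-bucket", "report.csv", data, "SSE-KMS", kms_key_id="alias/example-key"))`. Note that with SSE-C you must send the same key again on every read.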
Or, you can say hey AWS, I don’t trust you, I will do Client Side Encryption (CSE), in which you encrypt your data locally and pass it to the Amazon S3 service for storage and retrieval. You have 2 further options here: use a key stored in AWS Key Management Service (AWS KMS), or use a key that you store within your application.
Security is a tradeoff problem. Your security decisions may come at the cost of convenience or performance or a higher spend. If you can safely create and manage your own keys in your applications for instance, you, and only you, will have access to the unencrypted material (assuming your access controls are rock solid). Choose client side encryption for highly regulated industries, business critical workloads and the most paranoid of use cases. For the rest, the tradeoff may lean towards the other option, server side encryption.
And as per an AWS blog, server side encryption may be the way to go. I agree.
While client-side encryption still has an important role in security and data protection, two of its disadvantages are that it depends on clients having a secure source of randomness, which is not always easy, and it is CPU intensive on the client. For more simplicity and efficiency, our services also offer server-side encryption.
(Source: AWS blog)
Now, let’s go back to the news announcement, that AWS now encrypts all new object uploads with SSE-S3 server side encryption. So does it provide any meaningful confidentiality?
So does it?
In my opinion, no, the SSE-S3 server side encryption does not provide any meaningful security assurance when it comes to confidentiality of data. Here’s why.
One. It is mostly a checkbox exercise. This may appease some auditors but not all (disclaimer: nothing against auditors, I work very closely with them at Amazon, and they understand security better than most). For example, SSE-S3 meets PCI DSS’ encryption requirement but not the segregation of duties requirement.
Next, at best SSE-S3 adds defense-in-depth protection against physical loss, theft or confiscation of an AWS hard drive storing your data. Think crazy scenarios like a tornado or fire, followed by more chaos and somehow the AWS hard drive landing at Goodwill. If the data on it is unencrypted, game over. As you can imagine, the likelihood of this happening is about the same as that of the United States winning a cricket world cup.
Lastly, in SSE-S3, since S3 encrypts and decrypts the data transparently for anyone with access to the bucket, on its own it will not protect leaked S3 buckets’ contents from being read. Public S3 buckets are unfortunately still a fairly common scenario.
Why does AWS even provide this option then? For one, some encryption is better than no encryption. A few compliance attestations may be happy with it, since it gives you a defense in depth option. It also provides a practical benefit for AWS: hard drives can be wiped more easily and securely (delete the key and you get crypto shredding). Also worth noting, there are no additional costs for using SSE-S3.
What should you do instead? My suggestion is to go with any of the other options in Figure 1. At a minimum, go with the server side encryption with KMS keys, SSE-KMS.
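One way to act on this is to change the bucket's default encryption from SSE-S3 to SSE-KMS. As a hedged sketch, the function below builds the configuration document that I would expect to hand to boto3's `put_bucket_encryption` as `ServerSideEncryptionConfiguration`. The function name and the key ARN are hypothetical placeholders, and nothing here calls AWS.

```python
import json


def default_kms_encryption_config(kms_key_arn: str, bucket_key: bool = True) -> dict:
    """Default-encryption rule that makes new uploads use SSE-KMS (hypothetical helper)."""
    return {
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    "KMSMasterKeyID": kms_key_arn,
                },
                # S3 Bucket Keys cut down the number of calls S3 makes to KMS.
                "BucketKeyEnabled": bucket_key,
            }
        ]
    }


# Placeholder ARN for illustration only; substitute your own key.
EXAMPLE_KEY_ARN = "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID"
config = default_kms_encryption_config(EXAMPLE_KEY_ARN)
print(json.dumps(config, indent=2))
```

Keep in mind that a default encryption setting only affects objects uploaded after the change. Objects already in the bucket keep whatever encryption they were written with until you copy or re-upload them.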
SSE-KMS provides a good balance between security and usability. For reading and writing contents of S3, it requires users to have access to both the object and the key. Enter multiple permission policies at the IAM, S3 and KMS levels, and hence segregation of duties. Now if a bucket is made public, and if it’s encrypted with SSE-KMS, it’s very unlikely that its contents will be world readable. Win!
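The difference is easier to see as a decision rule. Below is a deliberately simplified model of my own, not how AWS evaluates policies: a read succeeds under SSE-S3 with object access alone, while SSE-KMS additionally requires an authenticated caller holding `kms:Decrypt` on the key. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Principal:
    name: str
    allowed_actions: frozenset = frozenset()
    anonymous: bool = False   # True for an unauthenticated internet visitor


@dataclass(frozen=True)
class StoredObject:
    key: str
    encryption: str           # "SSE-S3" or "SSE-KMS"


def can_read(principal: Principal, obj: StoredObject, bucket_is_public: bool) -> bool:
    """Simplified model of who gets plaintext back from a GET request."""
    has_object_access = bucket_is_public or "s3:GetObject" in principal.allowed_actions
    if not has_object_access:
        return False
    if obj.encryption == "SSE-S3":
        # S3 decrypts transparently for anyone who can fetch the object.
        return True
    if obj.encryption == "SSE-KMS":
        # A second, independent permission check happens at the key.
        return (not principal.anonymous) and "kms:Decrypt" in principal.allowed_actions
    raise ValueError(f"unknown encryption mode: {obj.encryption}")


stranger = Principal("internet-stranger", anonymous=True)
analyst = Principal("analyst", frozenset({"s3:GetObject", "kms:Decrypt"}))
```

In this model, `can_read(stranger, StoredObject("pii.csv", "SSE-S3"), bucket_is_public=True)` is true, while the same call against an SSE-KMS object is false. The analyst, who holds both permissions, can read either.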
Takeaways
If you’re new to AWS, you might be wondering, wow this is complicated. It is, and I didn’t even cover all the scenarios. Here are the takeaways:
Don’t invent your own crypto. Choose a cryptographic algorithm vetted by academia and industry such as AES-256.
Outsource key generation and management. Prefer not to create and manage your own cryptographic keys if it’s not your core competency. Use the cloud service provider’s key management service instead.
SSE-KMS for the win. That means, in AWS for your data in S3, prefer the server side encryption with KMS keys (SSE-KMS) for most use cases.
SSE-S3 may be misleading. Server side encryption with S3 keys (SSE-S3) shows AWS’ commitment to security, but IMO it doesn’t provide benefits beyond a compliance checkbox and a very low probability scenario of AWS data-center compromise.
To wrap it up, here’s a relevant quote, attributed to Amazon CTO Werner Vogels:
“Dance like nobody's watching. Encrypt like everyone is.”
Disclaimer: Opinions expressed are solely my own and do not express the views or opinions of my employer.
P.S. This article made it to the front page of Hacker News. The discussion there makes for good further reading.
📕 Security Wale is a blog about cloud, cybersecurity, and in between - written by Aditya Patel. This is a passion project, where Aditya shares his learnings, opinions and rants from over a decade of working in the IT industry in the United States. For a living, currently, he protects ☁️ cloudy things at Amazon/AWS. Earlier, Aditya did software security consulting, earned a master’s in Information Security from Johns Hopkins, and studied computer science engineering. To support this effort, consider subscribing (it’s free) and spreading the word.