Two AWS bugs you'd never have heard about, and the fix was yours

AWS disclosed two SageMaker SDK flaws on its own bulletins page. They may carry a CVE ID with no CVSS, they'll never hit CISA KEV, and patching them is the customer's job.

The dangerous thing about the two SageMaker SDK flaws AWS disclosed on February 2 isn’t either bug. It’s that if you run a normal vulnerability process, Patch Tuesday plus NVD plus CISA KEV, you’d never have heard about them, and the patch was your job the whole time.

SageMaker isn’t everywhere. But the disclosure channel and the fix-ownership model that produced this are the same ones AWS uses for every client-side SDK it ships, and those run in your containers, your CI, your Lambda layers. The visibility gap is the story. The bugs are the example.

What was actually wrong

Two flaws in the SageMaker Python SDK, both cleared by upgrading to v3.2.0 or the v2.256.0 backport. CVE-2026-1777 is the interesting one. The SDK’s “remote functions” feature cloud-pickles your functions, arguments, and results into S3, and protects that data’s integrity with a per-job HMAC signing key. The idea is sound: sign the payload, verify before deserializing, refuse anything that doesn’t match. The implementation undercut it. Per the AWS security bulletin and the GitHub advisory, that signing key was stored in the training container’s environment variables and returned in cleartext in the DescribeTrainingJob API response.

So the mechanism meant to prove a payload was trustworthy became the thing that let an attacker forge trust. An attacker holding sagemaker:DescribeTrainingJob plus write access to the job’s S3 output reads the key, signs a malicious pickle so the integrity check passes, overwrites the artifact, and gets code execution the next time the job runs. NVD lists it as CWE-319, with a CVSS 3.1 score of 7.2 and a CVSS 4.0 score of 8.5, both High.
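To make the failure concrete, here is a minimal sketch of sign-then-verify using only the Python standard library. It is not the SDK's code: the SHA-256 digest, the function names, and the placeholder payload are all assumptions chosen for illustration. What it shows is that an integrity check stops tampering only for as long as the key stays secret.

```python
import hashlib
import hmac
import secrets


def sign(key: bytes, payload: bytes) -> str:
    """Return a hex HMAC tag over the payload (SHA-256 is an assumption)."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify(key: bytes, payload: bytes, tag: str) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign(key, payload), tag)


job_key = secrets.token_bytes(32)          # hypothetical per-job signing key
original = b"serialized function + args"   # stands in for the stored blob
tag = sign(job_key, original)

# Normal case: the stored payload matches its tag.
assert verify(job_key, original, tag)

# Tampering without the key is caught.
tampered = b"something else entirely"
assert not verify(job_key, tampered, tag)

# Anyone who can read the key can mint a tag the verifier accepts.
forged_tag = sign(job_key, tampered)
assert verify(job_key, tampered, forged_tag)
```

The last assertion is the whole bug in one line: the verifier cannot tell the job owner from anyone else who has read the key.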

State the precondition plainly, because it matters: this is not an unauthenticated internet exploit. The vector is PR:H. The attacker already holds elevated IAM permissions before any of this works. This is a privilege-amplification and lateral-movement primitive, not a front-door breach. Read it as “anyone can pop your training jobs” and you’ve overstated it.

CVE-2026-1778 is cruder and more familiar. To suppress errors while downloading models from public sources like TorchVision, the Triton Python backend globally disabled TLS certificate verification. Not for one host. For every HTTPS connection made after the Triton Python model was imported. That opens the door to interception and to model or dependency replacement, leading to RCE in the Triton container. NVD lists it as CWE-295, with a CVSS 3.1 score of 5.9 and a CVSS 4.0 score of 8.2. Both flaws were reported internally and neither appears in CISA KEV; EPSS sits around 0.01% for each, consistent with a vendor-found bug. AWS’s response was fast, as it usually is.
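For contrast, a sketch of the scoped alternative in standard-library Python. The function names and the ca_bundle path are hypothetical, and the override mentioned in the comment is one common way a process-wide disable happens in Python, not necessarily what the Triton backend did.

```python
import ssl
import urllib.request
from typing import Optional

# Anti-pattern, shown only as a comment: assigning
#   ssl._create_default_https_context = ssl._create_unverified_context
# turns off verification for every later urllib HTTPS call in the process.


def make_verified_context(ca_bundle: Optional[str] = None) -> ssl.SSLContext:
    """Build a context that checks certificates and hostnames.

    ca_bundle is a hypothetical path to a PEM file holding a private CA.
    When it is None, the system trust store is used.
    """
    return ssl.create_default_context(cafile=ca_bundle)


def fetch(url: str, ctx: ssl.SSLContext, timeout: float = 30.0) -> bytes:
    """Download one URL using the supplied context and nothing global."""
    with urllib.request.urlopen(url, context=ctx, timeout=timeout) as resp:
        return resp.read()
```

The context is passed per call, so a download that needs a private CA gets one without touching any other connection the process makes.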

Why your feeds went dark

Here’s the mechanism that should bother you more than the pickle.

Feed	Why it misses this class
Patch Tuesday	It’s Microsoft’s calendar. AWS doesn’t ship to it.
NVD enrichment	The CVE may exist with no CVSS, CPE, or CWE attached, the exact fields your scanners key on.
CISA KEV	Requires documented active exploitation, not a PoC. A vendor-found, pre-exploitation SDK bug doesn’t qualify.

Amazon became a CVE Numbering Authority in July 2024, so it self-assigns CVE IDs for its products and SDKs. Good. But an ID is not enrichment. NIST narrowed NVD enrichment in 2026 to prioritize CVEs in KEV, federal-use, or critical categories, against a backlog of roughly 27,000 at the end of 2025. An AWS self-reported SDK bug with no exploitation doesn’t clear that bar, so it can sit in the database with an identifier and none of the metadata your tooling consumes. KEV, by design, only lists what’s already been weaponized. If KEV is your trigger for cloud software, you catch AWS issues only after someone has already used them.

Meanwhile AWS published where it always does: its own Security Bulletins page, a channel you probably never subscribed to. That page often opens in reassurance mode. The 2022 OpenSSL bulletin led with “AWS services are not affected, and no customer action is required” before getting to the part where self-managed OpenSSL 3.0 did need patching. The headline and the homework don’t always match.

Then there’s who owns the fix. The shared responsibility model draws the line at “security of the cloud” versus “security in the cloud,” and a client-side SDK runs entirely on your side of it, in your environments, containers, and CI. AWS patched the library and posted the bulletin. Deploying it is on you. This is the normal lifecycle of an AWS-self-reported SDK vulnerability, and that lifecycle has a hole in it.

The part that’s a pattern, stated carefully

CVE-2026-1777 isn’t the first time an AWS service stored internal state somewhere an API would hand it to the wrong principal. The defensible version of the claim: AWS has repeatedly shipped trust assumptions (ambient service principals, predictable resource names, internal state readable through APIs) that outside researchers keep proving exploitable. Aqua’s “Bucket Monopoly” work at Black Hat 2024 found six AWS services, SageMaker among them, auto-creating S3 buckets with predictable names open to pre-claim takeover. Aqua again, on the CDK bootstrap bucket in October 2024: a role that trusted a predictable bucket name without an ownership check. Datadog’s AppSync confused-deputy finding in 2022 was an ARN-validation bypass into victim accounts. TrustOnCloud’s DataZone confused deputy in 2024 had cross-account validation simply absent. And Lightspin’s 2021 SageMaker Jupyter work showed notebook RCE reaching the metadata endpoint and the execution role token. Five cases, four years, several research teams, SageMaker twice.

That’s enough to call it a recurring design blind spot. It is not enough to call it negligence, and I won’t. AWS’s response times were generally fast: about five days for AppSync, roughly twenty-seven for DataZone. Several older cases were contested as working-as-designed, and several never got a CVE, so the set undercounts and skews toward whatever researchers happened to look at. The honest read is a design assumption AWS keeps making, not a posture I’d indict.

What to actually do

Subscribe to the AWS Security Bulletins RSS feed and wire it into the same queue as NVD and KEV. AWS’s own Prescriptive Guidance treats that feed as a required input. If it isn’t in your queue, the gap above is permanent.
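A rough sketch of the wiring, assuming the feed is ordinary RSS 2.0 with item, title, link, and pubDate elements. The FEED_URL value is an assumption to confirm against the Security Bulletins page, and the function names and keyword filter are illustrative.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Iterable, List

# Assumed location; confirm the current URL on the AWS Security Bulletins page.
FEED_URL = "https://aws.amazon.com/security/security-bulletins/rss/feed/"


def parse_bulletins(rss_xml: str) -> List[Dict[str, str]]:
    """Turn RSS text into a list of {title, link, published} records."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "published": (item.findtext("pubDate") or "").strip(),
        })
    return items


def matching(items: Iterable[Dict[str, str]],
             keywords: Iterable[str]) -> List[Dict[str, str]]:
    """Keep bulletins whose title mentions any keyword, case-insensitively."""
    lowered = [k.lower() for k in keywords]
    return [i for i in items if any(k in i["title"].lower() for k in lowered)]
```

Fetch FEED_URL on a schedule with whatever HTTP client you already use, pass the body to parse_bulletins, and push the result into the same ticket queue as your NVD and KEV items. Filtering by keyword is optional; at this feed's volume, reading everything is a reasonable default.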

Stop gating on KEV for cloud-provider SDKs. KEV is a good trigger for internet-facing appliances under live attack. For client-side libraries AWS self-reports, it’s the wrong instrument and it will always be late.

Then hunt the pin. The SDK gets installed independently of your main lockfile more often than you’d like, so check requirements.txt, pyproject.toml and poetry.lock, conda environment files, training and inference Dockerfiles, Lambda layers, and any CI step that runs pip install sagemaker. Upgrade to v3.2.0 or v2.256.0.
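The hunt can be partly automated. The sketch below scans text for exact sagemaker== pins and flags anything below the two fixed releases named above. It assumes that every earlier 2.x and 3.x release is affected, which the advisories should confirm. It does not understand poetry.lock's name/version layout, range specifiers, or unpinned installs, so treat a clean result as a starting point. All names are illustrative.

```python
import re
from typing import List, Tuple

# Minimum fixed release per major line, from the versions named in the text.
FIXED = {2: (2, 256, 0), 3: (3, 2, 0)}

# Matches "sagemaker==X.Y.Z" and "sagemaker[extra]==X.Y.Z", but not
# differently named packages such as "sagemaker-core" or "my-sagemaker".
PIN = re.compile(
    r"(?<![\w.-])sagemaker(?:\[[^\]]*\])?\s*==\s*([0-9]+(?:\.[0-9]+)*)",
    re.IGNORECASE,
)


def parse_version(text: str) -> Tuple[int, ...]:
    """Convert '3.1' to (3, 1, 0) so tuples compare sensibly."""
    parts = [int(p) for p in text.split(".")]
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_vulnerable(version: Tuple[int, ...]) -> bool:
    """Flag only the 2.x and 3.x lines; other majors are out of scope here."""
    fixed = FIXED.get(version[0])
    return fixed is not None and version < fixed


def find_vulnerable_pins(text: str) -> List[Tuple[int, str]]:
    """Return (line number, pinned version) for each pin below the fix."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        code = line.split("#", 1)[0]  # ignore trailing comments
        match = PIN.search(code)
        if match and is_vulnerable(parse_version(match.group(1))):
            hits.append((lineno, match.group(1)))
    return hits
```

Run it over requirements files, Dockerfiles, and CI configs alike, since a RUN pip install sagemaker==... line matches the same pattern. Any pip install sagemaker with no version at all deserves a separate grep, because what it resolves to depends on when the image was built.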

Split the IAM permissions while you’re in there. The CVE-2026-1777 attack needs both sagemaker:DescribeTrainingJob and s3:PutObject on the SageMaker output prefix. DescribeTrainingJob gets handed out broadly because it looks read-only. Audit who holds each, and if no role legitimately needs both together, don’t grant them together. For the TLS bug, the fix is to add your private CA to the container trust store, never to disable verification to silence a download error.
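A first pass at that audit can run over exported policy JSON. The sketch below is deliberately coarse: it looks only at Allow statements and Action patterns and ignores Resource, Condition, NotAction, explicit Deny, and permission boundaries, so it over-reports and is no substitute for IAM's own policy evaluation. The role_policies input shape and every name here are assumptions.

```python
from fnmatch import fnmatchcase
from typing import Dict, List

RISKY_PAIR = ("sagemaker:DescribeTrainingJob", "s3:PutObject")


def _as_list(value) -> list:
    """IAM allows a single string or a list; normalise to a list."""
    if value is None:
        return []
    return [value] if isinstance(value, str) else list(value)


def allows(policy: dict, action: str) -> bool:
    """True if any Allow statement's Action pattern matches the action."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        for pattern in _as_list(stmt.get("Action")):
            # IAM action names are case-insensitive and accept * and ? wildcards.
            if fnmatchcase(action.lower(), pattern.lower()):
                return True
    return False


def roles_with_both(role_policies: Dict[str, List[dict]]) -> List[str]:
    """Return the sorted names of roles allowed both halves of the pair."""
    flagged = []
    for role, policies in role_policies.items():
        if all(any(allows(p, a) for p in policies) for a in RISKY_PAIR):
            flagged.append(role)
    return sorted(flagged)
```

Every role this returns is a candidate for review, not a finding. The useful next question for each one is whether the s3:PutObject grant actually covers the training job's output prefix.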

The generalizable lesson is one sentence: environment variables are readable by anyone with Describe-level access to the resource, so they’re no home for signing keys, integrity secrets, or tokens. Use Secrets Manager or Parameter Store behind a separate IAM boundary. The SageMaker SDK proved the point by storing the key that vouches for a payload in the one place a read-only API would hand to an attacker.

At PatchDayAlert we read the cloud-provider bulletin pages alongside NVD and KEV, and when a vendor self-reports a client-side bug your scanners won’t flag, it goes in the digest. The point of these two CVEs was never SageMaker. It’s that “no CVSS, not in KEV, vendor patched it” is exactly the shape of the bug already running in your environment, and the only person who can deploy the fix is you.

Sources

AWS Security Bulletin 2026-004 — 2026-02-02
GHSA-rjrp-m2jw-pv9c (CVE-2026-1777) — 2026-02-02
NVD, CVE-2026-1777 — 2026-02-02
GHSA-62rc-f4v9-h543 (CVE-2026-1778) — 2026-02-02
NVD, CVE-2026-1778 — 2026-02-02
Amazon added as CVE Numbering Authority — 2024-07-16
Help Net Security, “How NIST fumbled management of the National Vulnerability Database” — 2026-06-01
AWS Security Bulletins index / RSS — ongoing
AWS Prescriptive Guidance, “Monitor AWS security bulletins” — AWS docs
AWS Shared Responsibility Model — AWS docs
The Register, “AWS ‘Bucket Monopoly’ attacks” — 2024-08-07
The Register, “AWS CDK flaw exposed accounts to full takeover” — 2024-10-24
Datadog Security Labs, “A confused deputy vulnerability in AWS AppSync” — 2022-11-21
TrustOnCloud, “Confused Deputy Vulnerability in Amazon DataZone” — 2024-11
