BVA · Daily Brief — AWS + Snowflake · what needs your attention today

Mon May 11, 2026 · cost owner Dor Hanegby · AWS target $35–37K/mo cap $40K/mo · Snowflake Feb baseline $5,245/mo
Yesterday clean ✓ — May trajectory over cap, but an $11K AWS credit is pending.
Daily ✓: Saturday $1,513 (below baseline). Spike resolved. No new anomalies.
Monthly ⚠: May projecting ~$67K vs your $40K cap. Drivers: May 1 settle ($8K, recurring) + May 7-8 spike ($14K cost, $11K refund requested from AWS — human error, verbose logging) + intentional Neptune backfill (~$7.5K/mo above baseline) + ongoing c5.18xl experiment (~$5K/mo).
If AWS credit lands: May projection drops to ~$56K (still $16K over cap). If denied: ~$67K stands. Either way, June's lever question (see below) is what's actionable today.
May MTD: $31,580 (9d)
Forecast: $66,890
Credit pending: −$11,000?
Net forecast: ~$56–67K
Yesterday: $1,513 · ▼ −$92 vs $1,605 baseline ✓
May projected total: $56–67K · ▲ +$16–27K over $40K cap (range = $11K AWS credit pending)
Current daily baseline: $1,605 · elevated +$705/day from natural $900 base
Post-cleanup baseline target: $1,100 (~$33K/mo) · within $35-37K target ✓
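
The forecast in the summary above is consistent with a simple run-rate model: month-to-date actuals plus the current daily baseline for each remaining day, with the pending credit applied as a range. A minimal Python sketch, assuming that is how the forecast is built and that May is counted as 31 days (the function name is hypothetical):

```python
def forecast_month(mtd, days_elapsed, daily_baseline, days_in_month=31):
    """Month-to-date actuals plus baseline spend for each remaining day."""
    remaining_days = days_in_month - days_elapsed
    return mtd + remaining_days * daily_baseline

forecast = forecast_month(mtd=31_580, days_elapsed=9, daily_baseline=1_605)
pending_credit = 11_000  # requested from AWS, not yet granted
cap = 40_000

net_low = forecast - pending_credit  # credit granted
net_high = forecast                  # credit denied

print(f"Forecast: ${forecast:,}")
print(f"Net range: ${net_low:,} to ${net_high:,}")
print(f"Over cap: ${net_low - cap:,} to ${net_high - cap:,}")
```

Because the remaining days are priced at the elevated baseline, any lever that lowers the daily rate before month-end also lowers the forecast.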

What needs your attention today

1. May is a write-off; the only question is whether to clear the runway for June

Operating model is now set: $40K cap, $35–37K target, daily review. Against that: May lands at $56–67K depending on whether AWS grants the $11K credit for the May 7-8 human-error spike. Even with the credit, May is $16K over cap. None of this can be undone for May — drivers are all known and in flight below. The forward question is June:
Baseline math — why $1,605/day is not your natural rate
Pre-ramp natural baseline (Apr 11): $862/day · $26K/mo
+ c5.18xlarge × 3 (alex experiment): +$300/day · +$9K/mo
+ Neptune backfill (intentional): +$160/day · +$5K/mo
+ CloudWatch from LoginFeatureExtraction: +$120/day · +$4K/mo
+ general drift (S3, Lambda, small): +$163/day · +$5K/mo
Current elevated baseline (May): $1,605/day · $48K/mo
Post-cleanup target (Neptune + c5 + CW reverse): ~$1,100/day · ~$33K/mo ✓
The instinct is right: baseline should return to $1,000–1,200/day once the in-flight ramps reverse. That puts steady-state monthly spend at $30–36K, at or below your $35–37K target and well under your cap.
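
The baseline math above can be checked with a few lines of arithmetic. This is a sketch using the daily figures from the breakdown; the 30-day month and the assumption that general drift stays after cleanup are mine, not stated in the brief:

```python
# Daily cost components in USD/day, copied from the baseline breakdown.
natural_base = 862
ramps = {
    "c5.18xlarge x3 (experiment)": 300,
    "Neptune backfill": 160,
    "CloudWatch from LoginFeatureExtraction": 120,
    "general drift": 163,
}
DAYS_PER_MONTH = 30  # assumption: monthly figures use a 30-day month

current_daily = natural_base + sum(ramps.values())

# Cleanup reverses the three named ramps; general drift is assumed to stay.
reversible = [
    "c5.18xlarge x3 (experiment)",
    "Neptune backfill",
    "CloudWatch from LoginFeatureExtraction",
]
post_cleanup_daily = current_daily - sum(ramps[name] for name in reversible)

print(current_daily, current_daily * DAYS_PER_MONTH)
print(post_cleanup_daily, post_cleanup_daily * DAYS_PER_MONTH)
```

The components sum to the stated $1,605/day. Full reversal of the three ramps gives a figure slightly below the ~$1,100/day target, so the target leaves some margin for partial reversal.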
To hit the June cap, there are two levers, worth roughly $5K/mo and $9K/mo:
Neptune downsize after backfill: saves ~$5K/mo (the +$160/day in the table above), and may also reverse part of the +$300/day EC2 increase if Neptune-adjacent compute scales down with it.
3× c5.18xlarge shutdown: saves ~$9K/mo (the +$300/day in the table above). Owner alex (data team).

If both land before June 1 → June lands at ~$33K, below target and $7K under the cap. If only Neptune lands → ~$42K (over cap). If only c5s land → ~$39K (just under cap). If neither → ~$48K. The brief tracks drift daily; June is the deadline this brief is built to defend.
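
The four June outcomes above can be enumerated mechanically. The sketch below uses the rounded monthly figures ($48K baseline, $5K and $9K levers), so its results land within about $1K of the numbers quoted in the prose; the function name is hypothetical:

```python
from itertools import product

current_monthly = 48_000  # current elevated baseline, USD/month
levers = {"Neptune downsize": 5_000, "c5.18xlarge shutdown": 9_000}
cap = 40_000

def june_scenarios(baseline, levers, cap):
    """Return (levers landed, projected spend, under cap?) for every combination."""
    names = list(levers)
    rows = []
    for flags in product([True, False], repeat=len(names)):
        landed = [name for name, on in zip(names, flags) if on]
        spend = baseline - sum(levers[name] for name in landed)
        rows.append((landed, spend, spend <= cap))
    return rows

for landed, spend, ok in june_scenarios(current_monthly, levers, cap):
    label = " + ".join(landed) or "neither"
    print(f"{label}: ${spend:,} ({'under' if ok else 'over'} cap)")
```

Only the scenarios that include the c5.18xlarge shutdown finish under the cap, which is why that conversation matters most.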
Owner: you. Conversations needed: Kostya/Emile (Neptune end-date), alex (c5.18xl status). Both tracked in In-flight below — today's job is to put them on the calendar with explicit "before June 1" framing.
June deadline

Daily change log — yesterday vs day-before

Service Δ · May 9 vs May 8

Service             May 9    May 8    Δ
AmazonCloudWatch    $35      $5,633   −$5,597
AWS Lambda          $6       $1,267   −$1,261
Amazon S3           $32      $801     −$769
Amazon DynamoDB     $9       $37      −$28
EC2 - Other         $106     $124     −$18
EC2 - Compute       $384     $399     −$15
RDS                 $115     $120     −$4
Neptune             $310     $314     −$4
All "expected" drops = spike resolution. Nothing new moved up.

Account Δ · May 9 vs May 8

Account             May 9    May 8    Δ
gh-stg + gh-prod    $496     $8,170   −$7,674
shared-dev          $321     $336     −$15
org / payer         $271     $271     $0
ug-prod             $119     $129     −$10
global              $50      $57      −$7
za-prod             $44      $42      +$1
za-stg              $26      $27      −$1
ug-stg              $21      $22      −$1
gh-stg+prod is the only account that moved (resolution). All others are within ±$15.
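
The "only account that moved" check above amounts to comparing each day-over-day delta against a noise band. A minimal sketch, assuming a ±$15/day band taken from the remark above (the function name and the band constant are illustrative):

```python
# (May 9, May 8) daily cost per account in USD, from the account table.
accounts = {
    "gh-stg + gh-prod": (496, 8_170),
    "shared-dev": (321, 336),
    "org / payer": (271, 271),
    "ug-prod": (119, 129),
    "global": (50, 57),
    "za-prod": (44, 42),
    "za-stg": (26, 27),
    "ug-stg": (21, 22),
}
NOISE_BAND = 15  # USD/day; assumption based on the "within ±$15" remark

def movers(rows, band):
    """Return {account: delta} where the day-over-day change exceeds the band."""
    deltas = {name: today - prior for name, (today, prior) in rows.items()}
    return {name: d for name, d in deltas.items() if abs(d) > band}

print(movers(accounts, NOISE_BAND))
```

Deltas computed from the rounded dollar figures can differ by $1 from the displayed Δ column, which was presumably computed before rounding.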

Weekly change log — last 7d vs prior 7d

Service Δ · May 3–9 vs Apr 26–May 2

Service               Last 7d   Prior 7d   Δ
AmazonCloudWatch      $8,499    $383       +$8,116
AWS Lambda            $1,743    $48        +$1,694
Amazon S3             $1,338    $242       +$1,096
EC2 - Compute         $2,594    $1,601     +$993
Amazon Neptune        $2,054    $1,525     +$529
SP for AWS ML         $535      $173       +$363
EC2 - Other           $835      $757       +$78
EKS control plane     $215      $156       +$59
SP for AWS Compute    $995      $995       $0
Amazon ElastiCache    $201      $370       −$169
Amazon RDS            $868      $1,447     −$579
Tax                   $421      $5,307     −$4,886
Top 3 ↑ rows are spike-driven. Tax ↓ is the May 1 settle moving out of the prior-7d window.

Account Δ · May 3–9 vs Apr 26–May 2

Account               Last 7d   Prior 7d   Δ
gh-stg + gh-prod      $14,337   $6,052     +$8,285
za-prod               $246      $75        +$171
shared-dev            $2,314    $2,450     −$136
org / payer           $2,291    $2,586     −$295
ug-prod               $1,190    $1,439     −$249
za-stg                $203      $267       −$64
global (ECR)          $402      $953       −$551
others (combined)     $741      $905       −$164
gh-stg+prod is the only account with a material increase, entirely the May 7-8 spike. za-prod is up $171 on a small base; all other accounts are down.

Service drift from Feb 2026 — golden baseline

Per-service drift · Apr 2026 vs Feb 2026 baseline (April = last clean full month, no spike noise)

Service                                 Feb (base)   Mar       Apr       Δ vs Feb          Verdict
Amazon Neptune                          $3,067       $2,663    $7,995    +$4,929 · 2.6×    In-flight: backfill
EC2 - Compute                           $1,317       $1,696    $4,082    +$2,766 · 3.1×    Drift (alex c5.18xl + Karpenter)
Tax                                     $5,000       $5,478    $6,608    +$1,608 · 1.3×    Passive, grows with everything
SP for AWS Database Usage               $0           $1,455    $1,475    +$1,475 · NEW     Verify commitment was right-sized
Amazon SageMaker                        $93          $160      $412      +$319 · 4.4×      Small base, watch
Amazon Rekognition                      $644         $83       $77       −$567 · 0.1×      ✓ face-dup scan moved off
Amazon ECR                              $891         $406      $359      −$532 · 0.4×      ✓ image cleanup paid off
Bedrock (Sonnet 4.5)                    $956         $863      $515      −$441 · 0.5×      ✓ usage trending down
Amazon RDS                              $4,176       $4,529    $3,890    −$286 · 0.9×      ✓ on baseline
SP for AWS Compute                      $4,025       $4,456    $4,266    +$241 · 1.1×      ✓ on baseline
AmazonCloudWatch                        $1,114       $1,342    $1,266    +$152 · 1.1×      ✓ on baseline (May spike unrelated)
EC2 - Other                             $3,246       $3,519    $3,156    −$90 · 1.0×       ✓ on baseline
Amazon S3                               $1,103       $1,133    $1,020    −$84 · 0.9×       ✓ on baseline
SP for AWS ML                           $2,131       $2,359    $2,055    −$76 · 1.0×       ✓ on baseline
Total org (ex-Snowflake $40K one-off)   $33,891      $37,110   $43,851   +$9,960 · +29%    Net drift over 2 months
The diagnosis: 14 of 18 top services are at or near Feb baseline. All the drift comes from 4 services: Neptune (+$4.9K, an intentional backfill that reverses when complete), EC2 Compute (+$2.8K, alex c5.18xl plus general Karpenter scale), Tax (+$1.6K, passive growth), and the new Database SP commitment (+$1.5K). Net drift is +$10K/mo over 2 months. Once the Neptune backfill ends and the EC2 governance fix lands, drift collapses to ~$2K/mo, well within "minor growth" territory.
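
The drift figures above follow from the Feb and Apr columns. A small sketch of that arithmetic, assuming the Neptune and EC2 Compute increases reverse in full (an optimistic simplification; the helper name is hypothetical):

```python
# (Feb 2026 baseline, Apr 2026) monthly cost in USD for the four drifting services.
drifting = {
    "Amazon Neptune": (3_067, 7_995),
    "EC2 Compute": (1_317, 4_082),
    "Tax": (5_000, 6_608),
    "SP for AWS Database Usage": (0, 1_475),
}
total_feb, total_apr = 33_891, 43_851

def drift(base, current):
    """Return (absolute delta, multiple of baseline or None if the baseline is zero)."""
    ratio = round(current / base, 1) if base else None
    return current - base, ratio

net_drift = total_apr - total_feb

# Remove the two drivers expected to reverse: Neptune backfill and EC2 governance.
reversible = ["Amazon Neptune", "EC2 Compute"]
residual = net_drift - sum(drift(*drifting[name])[0] for name in reversible)

for name, (feb, apr) in drifting.items():
    print(name, drift(feb, apr))
print(net_drift, residual)
```

The zero-baseline guard matters for the Database SP row, which is new since Feb and has no meaningful multiple.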

4-week rolling totals — raw vs normalized (normalized = excludes monthly settle + spike days)

Week                       Raw total   Normalized   Excludes                   vs prior wk (norm.)
Apr 12 – Apr 18            $10,924     $10,924      none                       baseline
Apr 19 – Apr 25            $9,071      $9,071       none                       −$1,853 (−17%)
Apr 26 – May 2             $15,488     $7,515       May 1 settle ($7,973)      −$1,556 (−17%)
May 3 – May 9 (last 7d)    $22,273     $8,297       May 7+8 spike ($13,975)    +$782 (+10%)
Underlying business spend is healthy. Once you exclude the monthly settle and the spike events, weekly run-rate has been $7.5K–$10.9K and is roughly flat. The "+44% WoW" headline at the top is entirely the May 7-8 spike. If PR #567 closes today and no new spikes appear, next week's raw should land back at ~$8K.
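
The normalization above is a subtraction of known one-offs followed by a week-over-week comparison. A minimal sketch using the table's figures; results can differ by $1 from the table because the inputs here are already rounded to whole dollars:

```python
# (week label, raw total, one-off amount to exclude) in USD.
weeks = [
    ("Apr 12 - Apr 18", 10_924, 0),
    ("Apr 19 - Apr 25", 9_071, 0),
    ("Apr 26 - May 2", 15_488, 7_973),  # May 1 settle
    ("May 3 - May 9", 22_273, 13_975),  # May 7+8 spike
]

def normalize(weeks):
    """Subtract one-offs, then compute change vs the prior normalized week."""
    rows, prev = [], None
    for label, raw, one_off in weeks:
        norm = raw - one_off
        change = None if prev is None else norm - prev
        pct = None if prev is None else round(100 * change / prev)
        rows.append((label, norm, change, pct))
        prev = norm
    return rows

rows = normalize(weeks)
# Raw week-over-week change for the latest week, before any exclusions.
raw_wow = round(100 * (weeks[3][1] - weeks[2][1]) / weeks[2][1])

for row in rows:
    print(row)
print(f"raw WoW: {raw_wow:+d}%")
```

Comparing `raw_wow` with the last normalized percentage shows how much of the headline change is spike noise rather than run-rate.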

In-flight — owners working it, track but don't re-investigate

Tomorrow's automatic checks (re-run this brief Mon morning)

Daily total · last 14 days (Apr 26 – May 9)

[Chart: daily totals, Apr 26 – May 9. Legend: baseline ($1,200–1,700) · monthly settle (expected) · real spike (in-flight: PR #567) · yesterday, back to baseline]

Standing cleanup backlog — doesn't change daily, do when you have a window

5+ idle SageMaker endpoints (za-dev / zm-dev / zm-stg document-labeling, 2 dev RiskModel) · stale 30+ days · ~$300/mo
CW Logs retentionInDays: None on data-platform Lambdas · 2.06 TB stored · ~$60/mo + caps blowup
2 unattached EIPs in ug-prod · unknown · ~$7/mo
EC2 owner-tag enforcement SCP (only ~30% of shared-dev tagged) · no policy yet · prevents next surprise
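
For the CW Logs retention item above, log groups that never expire can be found by checking for a missing `retentionInDays` field in the output of `aws logs describe-log-groups`. A minimal sketch that works on already-fetched data; the log group names below are made up for illustration:

```python
def groups_without_retention(log_groups):
    """Return names of log groups with no retentionInDays set (logs never expire)."""
    return [g["logGroupName"] for g in log_groups if "retentionInDays" not in g]

# Sample data shaped like the "logGroups" list in a describe-log-groups
# response. These names are placeholders, not real groups from this org.
sample = [
    {"logGroupName": "/aws/lambda/example-a", "storedBytes": 1_200_000_000},
    {"logGroupName": "/aws/lambda/example-b", "retentionInDays": 30, "storedBytes": 5_000},
]

print(groups_without_retention(sample))
```

Each group returned could then be given a policy with `aws logs put-retention-policy --log-group-name <name> --retention-in-days <days>`. The right number of days is a team decision that this brief does not specify.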

Snowflake — month trajectory

May 7 spike to $1,169 (6× normal) — eden.b loan-account-history backfill.
May trajectory: MTD $2,673 across 11 days. If the May 7 spike doesn't repeat, May normalizes to ~$4,500, under the $5,245 Feb baseline ✓.
The spike driver: a NEW warehouse BACKFILL_LOAN_ACCOUNT appeared 7d ago at 261 credits ($990) vs $0 prior week. All 3,396 queries by eden.b (ACCOUNTADMIN). 3,356 of those are MERGE statements into GHANA_PROD.SEMANTIC_LAYER.LOAN_ACCOUNT_HISTORY_BACKFILL, scanning 38.9 TB.
This is a SECOND backfill — separate from the fido-score backfill (PR #567) that drove the AWS spike. Two concurrent data-team backfills.
May MTD: $2,673
Excl. May 7: $1,504 / 10d
Projection: ~$4,500
vs Feb base: −$745 ✓
Yesterday (Snowflake): $198 · vs $155 trailing baseline · normal
May 7 spike: $1,169 · 309 credits · 6× normal day
May projected (Snowflake): ~$4,500 · under Feb baseline $5,245 ✓ (excluding May 7)
BACKFILL_LOAN_ACCOUNT WH: $990 · 7d · new warehouse · eden.b · 38.9 TB scanned
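
The Snowflake projection above is consistent with a linear run-rate over a 30-day month, computed once on raw month-to-date spend and once with the May 7 spike day removed. A sketch under that assumption (the 30-day month is inferred from the figures, not stated in the brief):

```python
DAYS_PER_MONTH = 30  # assumption: a 30-day month reproduces the brief's figures

def project(mtd, days, days_per_month=DAYS_PER_MONTH):
    """Linear run-rate projection from month-to-date spend."""
    return mtd / days * days_per_month

mtd, days = 2_673, 11
spike = 1_169  # May 7 backfill day
feb_baseline = 5_245

raw = project(mtd, days)
# Drop the spike day from both the spend and the day count.
ex_spike = project(mtd - spike, days - 1)

print(round(raw), round(ex_spike), round(ex_spike - feb_baseline))
```

The ex-spike figure assumes the backfill does not run again this month. Any further backfill days would need to be added back on top.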

Top warehouses · last 7d

Warehouse                7d $    vs prior wk
BACKFILL_LOAN_ACCOUNT    $990    NEW · $0 prior
COMPUTE_WH               $755    +42%
ANALYTICS_WH             $445    −3%
DEV_STG_WH               $128    +75% (small base)
CLOUD_SERVICES_ONLY      $2      trivial
Strip BACKFILL_LOAN_ACCOUNT and 7d total drops from ~$2.3K to ~$1.3K (back to baseline).

Monthly trend · Snowflake

Month                  Credits   USD      vs Feb base
Nov 2025               1,379     $5,211   −1%
Dec 2025               1,799     $6,802   +30%
Jan 2026               1,450     $5,479   +4%
Feb 2026 (baseline)    1,388     $5,245
Mar 2026               1,043     $3,942   −25%
Apr 2026               1,169     $4,421   −16%
May MTD (11d)          707       $2,673   trending ~$7,290 raw / ~$4,500 ex-spike
$/credit = $3.78 at Fido (NOT $2 docs default).
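
The USD column in the monthly trend can be approximated from credits at the $3.78 rate. The sketch below uses the whole-credit figures from the table, so results may differ from the USD column by a dollar or two:

```python
USD_PER_CREDIT = 3.78  # rate stated in the brief

monthly_credits = {
    "Nov 2025": 1_379, "Dec 2025": 1_799, "Jan 2026": 1_450,
    "Feb 2026": 1_388, "Mar 2026": 1_043, "Apr 2026": 1_169,
}

def to_usd(credits, rate=USD_PER_CREDIT):
    """Convert Snowflake credits to USD at the given per-credit rate."""
    return credits * rate

baseline_usd = to_usd(monthly_credits["Feb 2026"])
for month, credits in monthly_credits.items():
    usd = to_usd(credits)
    pct = 100 * (usd - baseline_usd) / baseline_usd
    print(f"{month}: ${usd:,.0f} ({pct:+.0f}% vs Feb)")
```

Using the $2 default here would understate every month by almost half, which is why the rate is called out explicitly.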

Combined BVA context

Two concurrent data-team backfills

Both intentional, both currently inflating cost. Worth tracking when each ends so projections normalize.

Recommended cadence: run this brief every weekday at 09:00. Friday's run catches Thursday issues; Monday's catches the weekend.
Re-run: AWS — python3 .claude/skills/aws-finops-skill/scripts/collect_finops.py · Snowflake — queries via mcp__snowflake__read_query against SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY + QUERY_HISTORY.
Sources: AWS Cost Explorer (org payer 107812215209) · per-account aws ec2/rds/logs/sagemaker describe-* · Snowflake ACCOUNT_USAGE schema · Slack #7may-logs-indcident + #engineering-group for in-flight context. AWS detail at aws-finops.pages.dev.
Caveats: AWS Cost Explorer 24h lag (May 11 actuals not finalized). Snowflake $/credit = $3.78 not the $2 docs default. May 7 Snowflake spike is real but driven by a planned backfill, not a runaway query.
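
The Snowflake re-run step above reads from `SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY`. The brief does not show its actual queries, so the following is only a sketch of what a per-warehouse 7-day cost query might look like; the function name and defaults are hypothetical:

```python
def warehouse_cost_query(days=7, usd_per_credit=3.78):
    """Build a SQL string totalling credits and USD per warehouse over the last `days` days."""
    return f"""
SELECT warehouse_name,
       SUM(credits_used) AS credits,
       SUM(credits_used) * {usd_per_credit} AS usd
FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY
WHERE start_time >= DATEADD(day, -{int(days)}, CURRENT_TIMESTAMP())
GROUP BY warehouse_name
ORDER BY usd DESC
""".strip()

print(warehouse_cost_query())
```

The resulting string can be passed to whatever query tool is in use. Coercing `days` to `int` keeps the interpolated value numeric.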