⚠
Yesterday clean ✓ · May AWS is on a ~$56K trajectory (~$62K combined with Snowflake), but two known one-off events explain nearly all of the overage.
Daily ✓: May 15 AWS $1,055 + Snowflake $129 = $1,184 combined — below baseline, no new anomalies.
Monthly ⚠: May is projecting ~$62K combined AWS+Snowflake vs $40K AWS cap.
Two events account for nearly all of it: May 1 monthly RI/Savings Plans settle (+$7.6K, recurring, not actionable)
and May 7–8 LoginFeatureExtraction Lambda bug (+$14K AWS CloudWatch/Lambda, Datadog credit request pending).
The underlying clean baseline of $1,062/day projects to a normal June — no structural drift.
June lever: Neptune downsize (Dor Segal, post model-build) + alex c5.18xl × 3 resolution (Anael) could save ~$5–8K/mo.
May 15 AWS: $1,055 ✓
May 15 Snowflake: $129 ✓
May MTD AWS: $39,180 (15d)
AWS proj. (clean fwd): ~$56K
SF proj. (clean fwd): ~$6K
Combined proj.: ~$62K
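The projections above appear to follow a "clean forward" method: month-to-date actuals plus the clean daily baseline applied to the remaining days of May. The sketch below is a minimal illustration under that assumption. The function name is hypothetical, and the ~$155/day Snowflake baseline is taken from the Snowflake monthly trend note rather than from this panel.

```python
import calendar

def project_month_end(mtd_spend, days_elapsed, daily_baseline, year=2026, month=5):
    """Clean-forward projection: MTD actuals + baseline for the remaining days."""
    days_in_month = calendar.monthrange(year, month)[1]
    remaining_days = days_in_month - days_elapsed
    return mtd_spend + remaining_days * daily_baseline

aws = project_month_end(39_180, 15, 1_062)     # AWS MTD + 16 days at $1,062/day
snowflake = project_month_end(3_468, 15, 155)  # assumes ~$155/day clean baseline
print(aws, snowflake, aws + snowflake)         # 56172 5948 62120
```

These values round to the ~$56K, ~$6K, and ~$62K figures shown.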
Yesterday (May 15) combined
$1,184
▼ AWS $1,055 + Snowflake $129 · below $1,220 baseline ✓
May projected total (AWS+SF)
~$62K
▲ +$22K over $40K cap · driven by two known one-offs (see stories)
AWS clean baseline
$1,062/day
May 11–15 avg · June on track if Neptune + c5.18xl resolved
June target daily rate
$1,333/day
Current $1,062 baseline is ▼ $271 below June $40K daily target ✓
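The June target rate is the monthly cap spread evenly over the days in June. A small sketch of that arithmetic, assuming the $40K cap applies to June unchanged (the helper name is illustrative):

```python
import calendar

def daily_target(monthly_cap, year, month):
    """Even daily budget implied by a monthly cap."""
    days_in_month = calendar.monthrange(year, month)[1]
    return monthly_cap / days_in_month

june_rate = daily_target(40_000, 2026, 6)  # June has 30 days
headroom = june_rate - 1_062               # vs the current clean AWS baseline
print(f"${june_rate:,.0f}/day target, ${headroom:,.0f}/day headroom")
# $1,333/day target, $271/day headroom
```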
What needs your attention today
1
Lock in Neptune downsize timing with Dor Segal
Neptune is running model-build (slim → 8 feature families). When the compute-heavy phase wraps, downsize to a smaller instance class saves ~$1,500–2,000/mo. Need a commitment on "done by when" so June costs can be forecasted. Check #graph-db-special-force for current status.
Owner: Dor Segal · Close signal: date commitment captured in JIRA or Slack
June lever
2
Resolve alex c5.18xlarge × 3 decision (Anael)
Three c5.18xlarge instances tagged to "alex" are consuming ~$5K/mo and have been in "pending status check" for weeks. This is the single largest standing savings lever. Needs a go/terminate decision before June 1.
Owner: Anael · Close signal: instances terminated or tagged with approved-until date
June lever
✓
LoginFeatureExtraction billing credit — Datadog outreach sent
Datadog credit request email sent May 11 (Roman DM) for the ~$11K Datadog overage from the May 7–8 Lambda logging storm. AWS CloudWatch portion (~$5.6K) was one-time. Wait for credit confirmation; no action needed today unless Datadog responds.
Owner: Dor · Close signal: credit confirmation from Datadog account manager
tracking
Cost stories — named root causes
May 7–8 · LoginFeatureExtraction Lambda · Python logger handler accumulation
+$14K above baseline (AWS account: gh-stg + gh-prod)
Data team · fix merged (PR #567) · Datadog credit pending
The ghana-prod-LoginFeatureExtraction Lambda function accumulated duplicate Python logging handlers across warm invocations, producing ~17,300 events per invocation vs the designed 10–15. Over a ~23-hour window, approximately 59.4 billion redundant log events were ingested into CloudWatch and Datadog. A one-line fix was deployed; the PR merged May 10. A formal Datadog credit request for ~$11K was sent May 11.
| Date | CloudWatch | Lambda | S3 | Total |
| May 6 | — | — | — | $1,604 |
| May 7 | ~$4,200 | ~$700 | — | $4,755 ↑ |
| May 8 | $5,610 | $1,265 | $791 | $9,220 ↑ |
| May 9 | — | — | — | $1,514 |
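The root cause named above, logging handlers accumulating across warm invocations, is a common Python pattern: if handler setup runs on every call, each warm invocation attaches one more handler and every record is emitted once per handler. The sketch below is a generic illustration of that pattern and one common guard. It is not the code from PR #567, and all names in it are hypothetical.

```python
import logging

class CountingHandler(logging.Handler):
    """Stand-in for a real log shipper: counts records instead of sending them."""
    def __init__(self, counter):
        super().__init__()
        self.counter = counter

    def emit(self, record):
        self.counter["events"] += 1

def buggy_invoke(logger, counter):
    # Bug pattern: a new handler is attached on every invocation, so a warm
    # container emits each record once per accumulated handler.
    logger.addHandler(CountingHandler(counter))
    logger.info("feature extracted")

def guarded_invoke(logger, counter):
    # Guard: attach a handler only if the logger has none yet.
    if not logger.handlers:
        logger.addHandler(CountingHandler(counter))
    logger.info("feature extracted")

def simulate(invoke, logger_name, invocations):
    """Run `invocations` warm calls against one logger; return events emitted."""
    logger = logging.getLogger(logger_name)
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the demo isolated from the root logger
    counter = {"events": 0}
    for _ in range(invocations):
        invoke(logger, counter)
    return counter["events"]

print(simulate(buggy_invoke, "demo.buggy", 100))      # 5050 = 1 + 2 + ... + 100
print(simulate(guarded_invoke, "demo.guarded", 100))  # 100
```

The event count grows quadratically with the number of warm invocations in the buggy version, which is consistent with a small per-invocation log volume turning into thousands of events per invocation over a long-lived container.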
May 7 · Snowflake · BACKFILL_LOAN_ACCOUNT warehouse · eden.b loan-account-history MERGE
$1,169 vs ~$165 daily norm (Snowflake; same day as the AWS spike but unrelated)
Data team · backfill complete · warehouse idle now
eden.b (ACCOUNTADMIN) ran 3,367 queries on the BACKFILL_LOAN_ACCOUNT warehouse over 22.6 hours, scanning 38.9 TB via MERGE statements for the loan-account-history backfill. This is the separate Snowflake backfill that coincided with the AWS LoginFeatureExtraction incident — different workloads on the same calendar day. BACKFILL_LOAN_ACCOUNT credits this week: 2.5 ($9.52) vs prior week 259 ($980) — confirming the backfill is complete.
| Date | Credits | Cost | Note |
| May 6 | 52.7 | $199 | baseline |
| May 7 | 309.3 | $1,169 | backfill spike ↑ |
| May 8 | 50.5 | $191 | resolved |
| May 14 | 39.1 | $148 | clean |
| May 15 | 34.3 | $129 | clean ✓ |
May 1 · Monthly RI + Savings Plans + Tax settlement
$8,858 (7.2× one-day baseline) — recurring, not actionable
Billing artifact · happens every 1st · label and ignore
Every month on the 1st, AWS settles Reserved Instance commitments, Savings Plans charges, and accrued taxes in a single billing event. The $8,858 on May 1 vs a $1,235 prior-day baseline is entirely explained by this mechanic — visible in the HeavyUsage + ComputeSP + NoUsageType usage types. No investigation or action needed. Mark amber in sparkline, footnote as settled.
| Account | May 1 spend | Note |
| gh-stg + gh-prod | $3,810 | largest RI settlement |
| shared dev | $1,416 | dev-env RIs |
| org (payer) | $1,262 | Savings Plans + tax |
Daily change log — May 14 → May 15 (T-2 anchor)
By service — May 15 vs May 14
| Service | May 14 | May 15 | Δ |
| EC2 Compute | $167 | $150 | ▼ −$17 |
| Amazon Neptune | $157 | $157 | — |
| RDS | $124 | $123 | — |
| Savings Plans Compute | $142 | $142 | — |
| EC2 - Other | $110 | $108 | — |
| Savings Plans ML | $79 | $79 | — |
| VPC | $50 | $50 | — |
| EKS | $38 | $42 | ▲ +$4 |
| Snowflake credits | $148 | $129 | ▼ −$19 |
By account — May 15 vs May 14
| Account | May 14 | May 15 | Δ |
| gh-stg + gh-prod | $315 | $314 | — |
| org / management | $271 | $271 | — |
| shared dev | $154 | $148 | ▼ −$6 |
| ug-prod | $129 | $129 | — |
| global (ECR/CA) | $54 | $53 | — |
| za-prod | $41 | $42 | ▲ +$1 |
| ug-stg | $37 | $37 | — |
| zm-stg+prod | $29 | $29 | — |
Snowflake — Daily + Warehouse breakdown
Snowflake health · May 15 = $129 (34 credits × $3.78) · all warehouses clean
| Warehouse | Last 7d | Prev 7d | Δ |
| COMPUTE_WH | $600 | $672 | ▼ −11% |
| ANALYTICS_WH | $447 | $490 | ▼ −9% |
| DEV_STG_WH | $143 | $106 | ▲ +35% |
| BACKFILL_LOAN_ACCOUNT | $10 | $981 | ▼ −99% (backfill done) |
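The Δ column above is a plain week-over-week percentage change. A minimal sketch that reproduces it from the table values; the 20% watch threshold is borrowed from the automatic-checks list, and the function name is illustrative:

```python
def wow_change_pct(last_7d, prev_7d):
    """Week-over-week change in percent; None when there is no prior spend."""
    if prev_7d == 0:
        return None
    return (last_7d - prev_7d) / prev_7d * 100

warehouses = {
    "COMPUTE_WH": (600, 672),
    "ANALYTICS_WH": (447, 490),
    "DEV_STG_WH": (143, 106),
    "BACKFILL_LOAN_ACCOUNT": (10, 981),
}

for name, (last_7d, prev_7d) in warehouses.items():
    pct = wow_change_pct(last_7d, prev_7d)
    status = "watch" if pct is not None and pct > 20 else "ok"
    print(f"{name}: {pct:+.0f}% ({status})")
```

Only DEV_STG_WH crosses the 20% line with these inputs.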
Snowflake monthly trend:
Feb 2026 (golden): $5,245 (1,387 credits)
Mar 2026: $3,942 ↓ (cleanup)
Apr 2026: $4,421 (normal)
May MTD ($3,468, 15d): proj. ~$6K
May 7 spike ($1,169) inflates — clean baseline ~$155/day
DEV_STG_WH +35% WoW — watch for 2nd week of growth. May indicate dev/analytics heavy workload or an inefficient query pattern.
Service drift from Feb 2026 golden baseline (Apr = last clean full month)
| Service | Feb 2026 base | Apr 2026 (last clean) | Δ vs Feb | Verdict |
| EC2 Compute | ~$5,400 | ~$5,200 | −4% | ✓ on baseline |
| Amazon Neptune | ~$1,200 | ~$4,700 | +292% | in-flight (model-build) |
| Savings Plans Compute | ~$4,200 | ~$4,200 | — | ✓ on baseline |
| RDS | ~$3,600 | ~$3,700 | +3% | ✓ on baseline |
| EC2 - Other | ~$3,500 | ~$3,600 | +3% | ✓ on baseline |
| AmazonCloudWatch | ~$900 | ~$950 | +6% | ✓ on baseline |
| Savings Plans ML | ~$2,000 | ~$1,950 | −3% | ✓ on baseline |
| Amazon EKS | ~$700 | ~$860 | +23% | ⚠ drift +23% |
| Amazon ElastiCache | ~$750 | ~$800 | +7% | ✓ on baseline |
| AWS Transfer Family | ~$580 | ~$570 | −2% | ✓ on baseline |
| Amazon VPC | ~$1,500 | ~$1,500 | — | ✓ on baseline |
| Amazon S3 | ~$1,100 | ~$1,050 | −5% | ✓ on baseline |
| Snowflake (credits) | $5,245 | $4,421 | −16% | ✓ below baseline ↓ |
EKS +23% drift: likely Karpenter node scaling from Neptune graph workload and ZA-prod cluster spin-up. Not alarming alone, but worth correlating with June Neptune downsize timeline.
Neptune is labeled "in-flight" because the ~4× increase is the known graph DB model-build ramp (decision: downsize after model-build completes).
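The verdict column can be approximated by a simple rule on the percentage change vs the Feb baseline. The sketch below assumes a ±10% band, which is not stated in the report but happens to reproduce every verdict in the table. The "in-flight" label is a manual override rather than something derived from the numbers.

```python
def drift_verdict(base, current, in_flight=False, threshold_pct=10.0):
    """Classify a service's change vs the golden baseline.

    threshold_pct is an assumed band, not a documented policy.
    """
    delta_pct = (current - base) / base * 100
    if in_flight:
        label = "in-flight"
    elif delta_pct > threshold_pct:
        label = "drift"
    elif delta_pct < -threshold_pct:
        label = "below baseline"
    else:
        label = "on baseline"
    return round(delta_pct), label

rows = [
    ("Amazon Neptune", 1_200, 4_700, True),   # known model-build ramp
    ("Amazon EKS", 700, 860, False),
    ("Amazon ElastiCache", 750, 800, False),
    ("Snowflake (credits)", 5_245, 4_421, False),
]
for name, base, current, in_flight in rows:
    print(name, *drift_verdict(base, current, in_flight))
```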
In-flight — being tracked, no executive action needed
- LoginFeatureExtraction fix — merged in data-platform PR #567. No recurrence since May 9. Datadog credit request sent May 11 (Roman DM to Dor). Close: credit confirmed or 30d passes with no response.
- Neptune graph DB (model-build phase): backfill complete (May 14). Now running slim model (1-2 feature families → 8). Owner: Dor Segal. Channel: #graph-db-special-force. Close: compute-heavy phase done → downsize to smaller instance class.
- ZA-prod cost ramp — $42/day and growing ($289 this week vs $212 prior week, +37%). Expected spin-up churn for June 2026 launch. No action — label as expected. Re-evaluate at launch.
- c5.18xlarge × 3 (alex) — ~$5K/mo. Status: pending Anael decision. In-flight until terminated or approved. This is a lever decision (see action #2 above).
- DEV_STG_WH Snowflake +35% WoW — $143 vs $106 last week. Single week, may be transient. Watch for persistence next week before escalating.
Tomorrow's automatic checks (May 17)
- AWS daily spend — flag if May 16 > $1,300 (1.2× clean baseline)
- Snowflake daily credits — flag if May 16 > 60 credits ($227) without a known backfill
- DEV_STG_WH 2nd week growth — flag if WoW continues above 20%
- ZA-prod cost trajectory — flag if daily rate exceeds $80/day (today $42)
- Neptune spend: flag if weekly spend rises vs prior week (model-build should be stable)
- LoginFeatureExtraction recurrence — flag any day where CloudWatch spend in gh-stg+gh-prod > $200 (today ~$45)
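The checks above are simple threshold rules, so they can be sketched as one function over a dictionary of daily metrics. This is an illustrative sketch only: the metric key names are hypothetical, and how the real checks are wired up is not described in the report.

```python
def run_checks(m):
    """Return the names of the checks that would fire for metrics dict m."""
    flags = []
    if m["aws_daily"] > 1_300:
        flags.append("aws_daily_spend")
    if m["snowflake_credits"] > 60 and not m.get("known_backfill", False):
        flags.append("snowflake_daily_credits")
    if m["dev_stg_wow_pct"] > 20:
        flags.append("dev_stg_wh_growth")
    if m["za_prod_daily"] > 80:
        flags.append("za_prod_trajectory")
    if m["neptune_7d"] > m["neptune_prev_7d"]:
        flags.append("neptune_spend")
    if m["gh_cloudwatch_daily"] > 200:
        flags.append("login_feature_extraction_recurrence")
    return flags

# May 15 values from this report; dev_stg_wow_pct is a hypothetical
# next-week reading, since the rule targets a second week of growth.
quiet_day = {
    "aws_daily": 1_055, "snowflake_credits": 34.3, "dev_stg_wow_pct": 12,
    "za_prod_daily": 42, "neptune_7d": 1_099, "neptune_prev_7d": 1_951,
    "gh_cloudwatch_daily": 45,
}
# Same day with the May 8 incident figures substituted in.
spike_day = dict(quiet_day, aws_daily=9_220, gh_cloudwatch_daily=5_610)

print(run_checks(quiet_day))  # []
print(run_checks(spike_day))  # ['aws_daily_spend', 'login_feature_extraction_recurrence']
```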
30-day AWS daily spend sparkline (Apr 16 – May 15)
[Sparkline chart not reproduced. X-axis ticks: Apr 16, Apr 23, Apr 30, May 7, May 15. Legend: Normal day · Monthly settle (May 1) · Spike (LoginFeatureExtraction) · Yesterday (May 15)]
Top services — last 7d vs prior 7d
| Service | 7d | Prev 7d | Δ% |
| EC2 Compute | $1,177 | $2,536 | ▼ −54% |
| Amazon Neptune | $1,099 | $1,951 | ▼ −44% |
| Savings Plans | $995 | $995 | — |
| RDS | $865 | $858 | +1% |
| EC2 - Other | $771 | $841 | ▼ −8% |
| ML Savings Plans | $555 | $456 | ▲ +22% |
| CloudWatch | $248 | $8,512 | ▼ −97% (spike resolved) |
| S3 | $233 | $1,338 | ▼ −83% (spike resolved) |
| EKS | $265 | $201 | ▲ +32% |
Top accounts — last 7d vs prior 7d
| Account | 7d | Prev 7d | Δ% |
| gh-stg + gh-prod | $2,204 | $14,247 | ▼ −85% (spike resolved ✓) |
| org / management | $1,901 | $2,211 | ▼ −14% |
| shared dev | $1,076 | $2,310 | ▼ −53% |
| ug-prod | $903 | $1,199 | ▼ −25% |
| global (ECR/CA) | $375 | $410 | ▼ −9% |
| za-prod | $289 | $212 | ▲ +37% (ZA launch ramp) |
| ug-stg | $250 | $264 | −5% |
| zm-stg+prod | $249 | $249 | — |
Monthly trend (6 months)
| Month | AWS | Snowflake | Combined |
| Nov 2025 | $43,088 | — | ~$43K |
| Dec 2025 | $42,938 | — | ~$43K |
| Jan 2026 | $36,013 | $5,479 | ~$41K |
| Feb 2026 ★ | $33,891 | $5,245 | $39,136 |
| Mar 2026 | $77,110 | $3,942 | $81K ⚠ |
| Apr 2026 | $43,851 | $4,421 | $48K |
| May 2026 MTD | $39,180 | $3,468 | ~$43K (15d) |
★ Feb 2026 = golden baseline ($33,891 AWS / $5,245 Snowflake). Mar spike was a one-off event.
Standing cleanup — slow-moving backlog
| Item | Status | Est. savings |
| c5.18xlarge × 3 (alex): continuously running, no current workload | weeks pending | ~$5K/mo |
| Neptune: downsize after model-build completes (Dor Segal tracking) | in-flight | ~$1.5–2K/mo |
| CloudWatch log retention: Lambda log groups without 14d retention policy | Platform ticket open | ~$200/mo |
| Idle SageMaker endpoints (if any): audit quarterly | no recent report | variable |