Dev Spotlight
April 01, 2026 at 10:26 AM · v0.8 · e8a843f3-8b7
Acme APIs
DX PULSE

About This Report

1 What is this?

The DX Pulse is a fine-grained and detailed exploration of Acme APIs's developer experience.

This dashboard offers summaries, details, and exploratory tools so that you can:

  • quickly understand your strengths and weaknesses
  • dive into the fine details you're most interested in
  • explore your own questions and assumptions against the raw data
2 How does this work?

The benchmark is built on industry best practices, published reports, and Dev Spotlight's combined decades of experience in DX.

Over 455 individual metrics produce two scores:

DX Health Score — what your team directly controls:

  • Documentation Quality (32%)
  • Developer Onboarding (27%)
  • Ecosystem Health (17%)
  • Content Coverage (14%)
  • AI Readiness (10%)

AI Lens Score — the downstream outcome:

  • LLM Accuracy (56%)
  • AI Discoverability (44%)

Your DX Health drives your AI Lens score. Improving documentation, filling content gaps, and building ecosystem presence directly improves how AI represents your product. Each metric is analyzed, measured, scored, and weighted using your program details, allowing us to track scores over time and compare to competitors.

3 The new world of LLMs

The benchmark includes heavily-weighted measurements of how each leading LLM sees your product.

If an LLM hallucinates about your product, gives developers wrong answers when they ask questions, or shows no awareness that Acme APIs exists as a solution, your developer experience may have a gap.

Understanding how LLMs learn from your docs is a new lens on developer experience that has quickly become an important metric.

Methodology

Audit Scope

This benchmark analyzed 157 documentation pages and 381 code samples, verified findings against 1,598 ground truth facts, and tested 124 questions across 3 AI models. The full audit ran in 64 minutes. Each audit is manually reviewed by a member of the Dev Spotlight DX Team for accuracy and nuance.

Multi-Model Cross-Validation

AI accuracy is tested against three leading models: Claude (Anthropic), ChatGPT (OpenAI), and Gemini (Google). Each model receives the same questions and their answers are independently verified against ground truth extracted from the product's own documentation. Cross-model agreement strengthens confidence; disagreement flags areas where documentation is ambiguous or incomplete.
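To make the cross-validation step concrete, the sketch below shows one way answers from multiple models could be checked against documented facts and compared with each other. It is a minimal illustration, not the actual Dev Spotlight pipeline: the contradiction check is a naive placeholder, and the example answers reuse the HTTP 422 vs. error code 400 finding discussed later in this report.

    # Illustrative sketch of multi-model cross-validation; not the production pipeline.
    # The contradiction check is a naive placeholder for the real fact verification step.
    from dataclasses import dataclass

    @dataclass
    class ModelResult:
        model: str
        answer: str
        contradicted: bool  # True if the answer conflicts with a documented fact

    def contradicts(answer: str, forbidden_claims: list[str]) -> bool:
        """Placeholder check: flag answers repeating claims the docs explicitly contradict."""
        return any(claim.lower() in answer.lower() for claim in forbidden_claims)

    def cross_validate(question: str, answers: dict[str, str], forbidden_claims: list[str]) -> dict:
        results = [ModelResult(m, a, contradicts(a, forbidden_claims)) for m, a in answers.items()]
        models_agree = len({r.contradicted for r in results}) == 1
        return {
            "question": question,
            "results": results,
            # Cross-model disagreement flags documentation that is ambiguous or incomplete.
            "flag_for_review": (not models_agree) or any(r.contradicted for r in results),
        }

    # Example: a claim of HTTP 422 contradicts the documented error code 400 behavior.
    print(cross_validate(
        "What does Acme APIs return for a missing sender signature?",
        {"claude": "It returns HTTP 422.", "chatgpt": "It returns API error code 400."},
        forbidden_claims=["HTTP 422"],
    ))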

Scoring Framework

All scores use a 1–10 scale. Two independent top-level scores measure fundamentally different things:

DX Health Score is a weighted composite of the dimensions a company directly controls: documentation quality, developer onboarding, ecosystem health, and content coverage. The aggregation method penalizes extreme weakness — a product scoring 9/10 on docs but 2/10 on onboarding will not receive a passing composite score.

AI Lens Score is a weighted composite of outcome metrics: how accurately AI models answer questions about your product, and how easily AI systems discover your content. A company can have excellent docs (high DX Health) but poor AI accuracy (low AI Lens) if AI models were not trained on their content, or vice versa.
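The exact aggregation formula is not disclosed in this report, but a weighted geometric mean is one common way to build a composite that penalizes extreme weakness: a single very low dimension pulls the result down much harder than a plain weighted average would. A minimal sketch, using the DX Health weights listed above and assuming this style of aggregation:

    # Illustrative only: weighted geometric mean as one possible "penalize extreme weakness" aggregation.
    # Weights are the published DX Health weights; the aggregation method itself is an assumption.
    import math

    DX_HEALTH_WEIGHTS = {
        "documentation_quality": 0.32,
        "developer_onboarding": 0.27,
        "ecosystem_health": 0.17,
        "content_coverage": 0.14,
        "ai_readiness": 0.10,
    }

    def weighted_geometric_mean(scores: dict[str, float], weights: dict[str, float]) -> float:
        """Aggregate 1-10 dimension scores; one very low score drags the composite down sharply."""
        return math.exp(sum(w * math.log(scores[k]) for k, w in weights.items()))

    # Example from the text: 9/10 docs but 2/10 onboarding does not earn a passing composite.
    example = {"documentation_quality": 9, "developer_onboarding": 2,
               "ecosystem_health": 7, "content_coverage": 7, "ai_readiness": 8}
    print(round(weighted_geometric_mean(example, DX_HEALTH_WEIGHTS), 1))  # ~5.5, vs 6.4 for a plain weighted average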

Score Breakdown

Overall Score: 7.2/10

Research Foundation

Scoring dimensions and weights are grounded in published empirical research on developer experience and API documentation quality, including:

  • Robillard & DeLine (Empirical Software Engineering, 2011) — field study of 440+ developers on API learning resources
  • Uddin & Robillard (IEEE Software, 2015) — 323-developer survey identifying documentation blockers
  • Meng, Steinhardt & Schubert (Communication Design Quarterly, 2019; ACM SIGDOC, 2020) — developers spend 49% of development time consulting documentation
  • SPACE framework (Forsgren et al., ACM Queue, 2021) and DevEx framework (Noda et al., ACM Queue, 2023) — multi-dimensional developer productivity measurement
  • DX Core 4 research (40,000+ developers, 300+ organizations) — each 1-point DX improvement correlates to 13 minutes saved per developer per week
  • CHAOSS project (Linux Foundation) and Jansen's ecosystem health framework (Information and Software Technology, 2014) — open source community health metrics
  • DORA program (31,000+ survey responses) — developer experience metrics predict delivery outcomes
  • Stack Overflow Developer Survey (2025) — technical documentation is the #1 learning resource (68% usage)

Quality Controls

  • Ground truth verification: Every AI response is verified against facts extracted directly from the product's own documentation. Only claims that directly contradict documented facts are flagged — unverifiable claims are tracked but not penalized.
  • Calibrated scoring: LLM-judged dimensions use a grade-based system to counteract central tendency bias, producing well-distributed scores across the full 1–10 range.
  • Deterministic results: Content-addressed caching ensures re-runs produce identical results unless source data has changed.
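A minimal sketch of what content-addressed caching can look like in practice, assuming the usual pattern of hashing the inputs to form the cache key (the audit's actual implementation is not shown here):

    # Illustrative sketch of content-addressed caching; not the audit's actual implementation.
    import hashlib
    import json

    _cache: dict[str, object] = {}

    def cache_key(*inputs) -> str:
        """Derive a deterministic key from the content of the inputs themselves."""
        blob = json.dumps(inputs, sort_keys=True, default=str).encode("utf-8")
        return hashlib.sha256(blob).hexdigest()

    def cached(fn):
        """Re-running with byte-identical inputs returns the stored result instead of recomputing."""
        def wrapper(*args):
            key = cache_key(fn.__name__, *args)
            if key not in _cache:
                _cache[key] = fn(*args)
            return _cache[key]
        return wrapper

    @cached
    def score_page(page_content: str) -> float:
        ...  # expensive analysis; unchanged source content yields the identical cached result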

Known Limitations

  • No automated signup testing: The benchmark evaluates documentation and content quality, not the actual signup experience. CAPTCHAs, phone verification, or approval workflows are not captured.
  • Temporal snapshot: The benchmark captures a point-in-time snapshot. Documentation changes, AI model updates, and community activity evolve continuously.
  • Financial estimates are directional: The financial impact model applies configurable assumptions to module findings. Estimates should be treated as order-of-magnitude indicators, not precise forecasts.
DX Quadrant
Executive Summary
Your DX Health Score (x-axis) measures the developer experience (DX) you control: documentation, onboarding, content coverage, ecosystem, and more. Your AI Lens Score (y-axis) measures the outcome: how accurately AI represents your platform to developers. Combined, they reveal how developers experience your platform today, both through your owned DX and through the AI tools they rely on.
Chart: DX Health (x-axis) vs. AI Lens (y-axis); quadrants: Fragile, Leading, Emerging, Untapped. Acme APIs is plotted within the quadrant.
What each quadrant means
Top-left (Fragile): AI happens to be accurate despite weak DX. Likely coasting on pre-training data that will go stale.
Top-right (Leading): Strong DX, accurate AI. The ideal position.
Bottom-left (Emerging): Weak DX, inaccurate AI. Developers may struggle to find answers from your DX or from AI.
Bottom-right (Untapped): Good DX, but AI still gets it wrong. Your content isn't reaching AI models effectively.
Acme APIs scores 7.2/10 overall, landing in solid but vulnerable territory. Its biggest strength is onboarding: all 14 first-success paths have documentation, and the getting-started flow earns consistent B+ to A grades. Its biggest weakness is ecosystem health, where community response times average over 800 hours, 13 reference app repos haven't been updated in a year, and there's effectively no visible DevRel presence.

The practical result: a developer gets through setup and first send without much trouble, then hits a wall when something breaks. Error messages lack response payloads, GitHub issues go unanswered for weeks, and SDK depth varies wildly (JavaScript and PHP get production-quality examples; Java gets three lines). Developers in that situation open a support ticket, build workarounds from Stack Overflow fragments, or start evaluating SendGrid and Mailgun. Meanwhile, AI models are fabricating error codes and misidentifying community libraries as official SDKs, which means developers using AI assistants get confidently wrong answers about Acme APIs's API behavior. The 70% of code blocks missing language tags and the poor front-loading of page content feed directly into these AI accuracy problems.

The highest-leverage fix: assign someone to own community response across your top GitHub repos and tag every code block in your docs. Those two changes address ecosystem abandonment signals and AI accuracy failures simultaneously.
Acme APIs delivers a strong onboarding experience and solid documentation, but its weak developer ecosystem health and repeated failures to earn AI-driven recommendations in key scenarios mean developers are being directed toward competitors at the point of decision.
DX Health and AI Lens: Key Findings
DX Health: 7.2 (what you control)
AI Lens: 7.1 (what developers see)
AI models learn about your product from your public documentation. When docs are incomplete, poorly structured, or missing code samples, AI models give developers wrong answers or no answers at all. Every DX Health improvement in this report also improves your AI Lens score.

DX Health

Docs Quality: 7.2/10
Key Finding: 70% of code blocks lack language tags
Acme APIs's documentation scores 7.2/10 across 157 crawled pages, with strong freshness (9.6/10) and page structure (8.1/10). The weak point is information architecture at 6.2/10, and 70% of the 381 code samples lack language tags.
Developer Onboarding: 7.7/10
Key Finding: Quickstart not found: Sending email with the Acme APIs API
Your Getting Started score is 7.7/10. Developers reach first success at 8.0/10 and integration at 8.1/10, which means the core "send an email" path works. But only 3 of 14 essential paths have actual tutorials; the other 11 have documentation without step-by-step guidance.
Ecosystem: 5.7/10
Key Finding: Weak ecosystem signal: Community Response
Ecosystem Health scores 5.7/10, and the breakdown shows why. Developer Reputation is strong at 8.1/10 and Pricing Clarity hits 8.6/10. But Community Response is 1.0/10, Reference Apps score 3.4/10, and DevRel Visibility sits at 4.5/10.
Content Gaps: 7.5/10
Key Finding: Strong documentation coverage for Acme APIs
Of 14 real community questions found on Stack Overflow and GitHub, Acme APIs's docs answer 6 of them. The other 8 go unanswered. Theoretical coverage is better at 76%, meaning the docs cover most standard topics but miss the specific edge cases developers actually run into.
AI Readiness: 8.2/10
Key Finding: Technical AI readiness scores 8.2/10 based on llms.txt, OpenAPI specs, sitemaps, Schema.org markup, and AI crawler access.
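These surface signals are straightforward to probe for any docs site. A rough sketch follows; the endpoints and crawler user-agent names (GPTBot, ClaudeBot) are common conventions used here as assumptions, not the audit's exact checks:

    # Illustrative readiness probe; endpoints and crawler names are assumptions, not the audit's checks.
    import urllib.request
    import urllib.robotparser

    def exists(url: str) -> bool:
        """Return True if the URL answers with a 2xx status."""
        try:
            req = urllib.request.Request(url, method="HEAD")
            with urllib.request.urlopen(req, timeout=10) as resp:
                return 200 <= resp.status < 300
        except Exception:
            return False

    def probe_ai_readiness(base: str) -> dict[str, bool]:
        robots = urllib.robotparser.RobotFileParser(base + "/robots.txt")
        try:
            robots.read()
        except Exception:
            pass
        return {
            "llms_txt": exists(base + "/llms.txt"),
            "sitemap": exists(base + "/sitemap.xml"),
            # Can well-known AI crawlers fetch the docs root?
            "gptbot_allowed": robots.can_fetch("GPTBot", base + "/"),
            "claudebot_allowed": robots.can_fetch("ClaudeBot", base + "/"),
        }

    print(probe_ai_readiness("https://docs.example.com"))  # placeholder domain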

AI Lens

AI Accuracy: 6.8/10
Key Finding: Claude cited unverified claims about Acme APIs
ChatGPT returns Acme APIs answers with a 2% hallucination rate. Gemini and Claude each hit 17%, with 17 contradicted claims apiece. The fabrications are specific and dangerous: Claude tells developers that a missing sender signature returns HTTP 422, when Acme APIs actually returns API error code 400. Gemini lists Python as an official SDK when it's community-maintained.
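For context on how those percentages relate to the claim counts: a hallucination rate is typically contradicted claims divided by total verified claims, so 17 contradicted claims at a 17% rate implies roughly 100 verified claims per model. The per-model denominator below is an assumption for illustration:

    # Illustrative arithmetic; the ~100-claim denominator is inferred, not stated in the report.
    def hallucination_rate(contradicted_claims: int, verified_claims: int) -> float:
        return contradicted_claims / verified_claims

    print(f"{hallucination_rate(17, 100):.0%}")  # 17%, matching the Gemini and Claude figures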
AI Discoverability: 7.5/10
Key Finding: Claude didn't recommend Acme APIs in this scenario
All three major AI models show 97% feature awareness for Acme APIs, and recommendation strength ranges from 7.9 (Gemini) to 8.6 (Claude). Claude's getting-started quality hits 9.6/10. The product is well-represented in AI outputs for general queries.
Actions: Priorities

Details available with full DX Pulse Benchmark.

Financial Impact

Details available with full DX Pulse Benchmark.

Your DX Health drives your AI Lens score. Improving documentation, filling content gaps, and building ecosystem presence directly improves how AI represents your product. Below is a summary of each component, with detailed findings and evidence organized by the same components further down the page.

Documentation Quality: 7.2 (weight 32%)
Developer Onboarding: 7.7 (weight 27%)
Ecosystem Health: 5.7 (weight 17%)
Content Coverage: 7.5 (weight 14%)
AI Readiness: 8.2 (weight 10%)
1. Documentation Quality 7.2/10

Details available with full DX Pulse Benchmark.

2. Developer Onboarding 7.7/10

Details available with full DX Pulse Benchmark.

3. Ecosystem Health 5.7/10

Details available with full DX Pulse Benchmark.

4. Content Coverage 7.5/10

Details available with full DX Pulse Benchmark.

5. AI Readiness 8.2/10

Details available with full DX Pulse Benchmark.

Maturity Grid

Code Samples: 7.3
Page Structure: 8.1
Freshness: 9.6
Error Documentation: 7.0
AI Readability: 7.8
Developer Tone: 7.0
Information Architecture: 6.2
Completeness: 7.0
Time to First Success: 8.5
Tutorial Quality: 8.1
Theoretical Coverage: 8.1
Community Gaps: 5.0
Blog Quality: 6.5
AI/SEO Visibility: 7.8
Third-Party Content: 7.3
Community Response: 1.0
Developer Tooling: 6.5
Error Message Quality: 4.9
DevRel Visibility: 4.5
Reference Apps: 3.4
Pricing Clarity: 8.6
Technical Depth: 5.0
Developer Portal: 6.2
Developer Reputation: 8.1
Wikipedia Presence: 8.0
Brand Mention Breadth: 4.8

Maturity: Low (0–3) · Medium (4–6) · High (7–10)
Top 10 Priorities
1. Fix the 820-hour community response time before developers abandon your SDKs (+0.5 on Ecosystem)
2. Create comparison content for the 6 scenarios where AI models completely ignore Acme APIs (+0.5 on AI Readiness, +0.3 on AI Accuracy)
3. Tag all 266 untagged code blocks with language identifiers (+0.5 on Docs Quality)

Priorities 4–10 available with full DX Pulse Benchmark.

Priority Details (10 actions)

Full descriptions and recommended actions for each priority.

1. Fix the 820-hour community response time before developers abandon your SDKs (+0.5 on Ecosystem)
Community Response scores 1.0/10 and Advocacy sits at 2.2/10, the lowest scores in the entire benchmark. Median response time on GitHub issues is 819.8 hours (over 34 days) with 28 open issues across 9 repos. Developers who file a bug or ask a question and hear nothing for a month stop using the library. They switch to a competitor SDK, or worse, they tell other developers not to bother. The 1.0 Community Response score is dragging Ecosystem Health (5.7/10) down harder than any other sub-metric.
Assign a rotating on-call engineer to triage GitHub issues across all 9 SDK repos weekly. Set a public SLA of 48-hour first response in each repo's CONTRIBUTING.md. For the top 3 repos by stars (acme-rails at 395, acme.js at 357, acme-gem at 228), target sub-24-hour responses. Close or label all 28 open issues within 2 weeks. Add issue templates with labels for bug, question, and feature request.
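To support that triage rotation, a small script can surface every open issue that has gone past the proposed SLA without a first response. A hedged sketch using the public GitHub REST API is below; the organization name and token handling are placeholders, and treating zero comments as "no response yet" is a simplification.

    # Sketch: list open issues older than the proposed 48-hour first-response SLA.
    # ORG and GITHUB_TOKEN are placeholders; "no comments yet" is used as a proxy for "no response".
    import os
    from datetime import datetime, timezone

    import requests

    ORG = "acme"                                    # placeholder organization
    REPOS = ["acme-rails", "acme.js", "acme-gem"]   # top repos by stars, per the finding above
    SLA_HOURS = 48

    def stale_issues(repo: str) -> list[dict]:
        resp = requests.get(
            f"https://api.github.com/repos/{ORG}/{repo}/issues",
            params={"state": "open", "per_page": 100},
            headers={"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"},
            timeout=30,
        )
        resp.raise_for_status()
        now = datetime.now(timezone.utc)
        stale = []
        for issue in resp.json():
            if "pull_request" in issue:             # the issues endpoint also returns pull requests
                continue
            opened = datetime.fromisoformat(issue["created_at"].replace("Z", "+00:00"))
            waiting = (now - opened).total_seconds() / 3600
            if issue["comments"] == 0 and waiting > SLA_HOURS:
                stale.append({"repo": repo, "number": issue["number"],
                              "title": issue["title"], "hours_waiting": round(waiting)})
        return stale

    for repo in REPOS:
        for item in stale_issues(repo):
            print(f"{item['repo']}#{item['number']} waiting {item['hours_waiting']}h: {item['title']}")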
2. Create comparison content for the 6 scenarios where AI models completely ignore Acme APIs (+0.5 on AI Readiness, +0.3 on AI Accuracy)
Six critical findings show AI models scoring Acme APIs at 1.0/10 recommendation strength for features it actually supports: suppression list management, sandbox mode, email templates, message retention, multi-server API management, and dedicated SMTP endpoints. Competitors like SendGrid (mentioned 52 times), Mailgun (59 times), and Amazon SES (48 times) dominate these queries. A developer asking any AI assistant about these use cases gets sent straight to a competitor.
Create 6 dedicated landing pages, one per missed scenario, structured as 'How to [do X] with Acme APIs' with working code examples. Specifically: (1) 'Automatic suppression list management in Acme APIs', (2) 'Testing with Acme APIs sandbox mode', (3) 'Server-side email templates in Acme APIs', (4) 'Message retention and compliance in Acme APIs', (5) 'Managing multiple servers and domains via the Acme APIs API', (6) 'Sending email via Acme APIs's SMTP endpoint'. Each page should include a brief comparison section noting how Acme APIs handles this differently from SendGrid and Mailgun. Link these pages from llms.txt.
3. Tag all 266 untagged code blocks with language identifiers (+0.5 on Docs Quality)
Only 115 of 381 code samples (30%) have language tags. The remaining 266 blocks lack identifiers, breaking syntax highlighting for developers and degrading AI parsing accuracy. Many are clearly identifiable JSON response bodies or HTTP request examples. When an AI model encounters an untagged code block, it has to guess the language, which introduces errors in code generation. Developers copying untagged blocks into their editors lose syntax highlighting and auto-completion context.
Run a script across all 157 crawled pages to identify untagged code blocks. Tag JSON response bodies as 'json', HTTP examples as 'http', curl commands as 'bash', and XML payloads as 'xml'. Prioritize the Templates API, Domains API, Sender Signatures API, Stats API, and Message Streams API pages flagged in the major finding. This is a mechanical fix that can be done in a single sprint. Properly tagged code blocks produce richer ground truth for AI models, directly improving how accurately they generate Acme APIs integration code.
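A hedged sketch of such a script, assuming the docs source is a set of Markdown files with ``` fences (if the source is HTML or another generator, the matching changes but the heuristics carry over). The docs/ path and the guessing heuristics are assumptions, and the output is meant for human review rather than blind application.

    # Sketch: find untagged fenced code blocks in Markdown sources and suggest a language tag.
    # The docs/ path and the heuristics are assumptions; review suggestions before applying them.
    import json
    import re
    from pathlib import Path

    BLOCK = re.compile(r"^```([^\n`]*)\n(.*?)^```[ \t]*$", re.S | re.M)

    def guess_language(code: str) -> str:
        stripped = code.strip()
        if stripped.startswith("curl "):
            return "bash"
        if re.match(r"^(GET|POST|PUT|PATCH|DELETE)\s+/", stripped):
            return "http"
        if stripped.startswith("<"):
            return "xml"
        try:
            json.loads(stripped)
            return "json"
        except ValueError:
            return ""  # leave uncertain blocks for a human to tag

    def untagged_blocks(path: Path):
        text = path.read_text(encoding="utf-8")
        for match in BLOCK.finditer(text):
            tag, body = match.group(1).strip(), match.group(2)
            if not tag:
                line_no = text.count("\n", 0, match.start()) + 1
                yield line_no, guess_language(body)

    for md_file in Path("docs").rglob("*.md"):
        for line_no, suggestion in untagged_blocks(md_file):
            print(f"{md_file}:{line_no} untagged code block, suggested tag: {suggestion or 'unknown'}")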

Details for priorities 4–10 available with full DX Pulse Benchmark.

All Recommended Actions

Details available with full DX Pulse Benchmark.

Your Content Plan

Details available with full DX Pulse Benchmark.

Your AI Lens Plan

Details available with full DX Pulse Benchmark.

Focused view on Generative Engine Optimization (GEO) and AI readiness.

Where AI gets Acme APIs wrong
Gemini: "Could not answer: Does Acme APIs support sending batch emails in a single API call?"
Claude: "Could not answer: Does Acme APIs support sending batch emails in a single API call?"
ChatGPT: "Could not answer: What webhook event types does Acme APIs support?"
LLM Accuracy: 6.8 (Does AI hallucinate about Acme APIs?)
AI Discoverability: 7.5 (Does AI recommend Acme APIs?)
Does AI Hallucinate About Acme APIs? 6.8/10

Details available with full DX Pulse Benchmark.

Does AI Recommend Acme APIs? 7.5/10

Details available with full DX Pulse Benchmark.

Competitive Landscape as Seen by AI

How does each model position Acme APIs against competitors?

Competitive Positioning by Model
ChatGPT: 7.8
Claude: 8.1
Gemini: 7.5
Competitors Mentioned by AI
Mailgun (59)
SendGrid (52)
Amazon SES (48)
Mailtrap (21)
Resend (18)
Brevo (17)
MailerSend (17)
SparkPost (13)
Twilio SendGrid (10)
Mailchimp (7)
SMTP2GO (7)
Mailjet (6)
Zoho ZeptoMail (5)
Customer.io (4)
AWS SES (3)
Lettermint (3)
Mandrill (3)
MessageBird (3)
Braze (2)
Global Relay (2)
Iterable (2)
MailSlurp (2)
Microsoft 365 (2)
Mimecast (2)
SMTP.com (2)
SendPulse (2)
ZeptoMail (2)
AhaSend (1)
ClickSend (1)
CloudMailin (1)
Cloudflare Email Routing (1)
Dynamics 365 Customer Insights (1)
Freshdesk (1)
Gmail API (1)
Google Workspace (1)
Groundhogg (1)
Help Scout (1)
HubSpot Service Hub (1)
Jitbit Helpdesk (1)
LiveAgent (1)
Mailcheap (1)
Mailchimp Transactional (1)
Mailchimp Transactional (Mandrill) (1)
Maileroo (1)
MessageGears (1)
Paubox (1)
Proofpoint (1)
Qboxmail (1)
Retarus (1)
SMTPeter (1)
SendLayer (1)
Vuture (1)
Zendesk (1)
Competitive DX Quadrant
Chart: DX Health (x-axis) vs. AI Lens (y-axis); quadrants: Fragile, Leading, Emerging, Untapped. 13 products plotted by number; see the key below.
1 Product A
2 Product B
3 Product C
4 Acme APIs
5 Product D
6 Product E
7 Product F
8 Product G
9 Product H
10 Product I
11 Product J
12 Product K
13 Product L
Acme APIs vs. Average
Docs Quality 7.2 / 6.7 avg
Getting Started 7.7 / 6.5 avg
Ecosystem 5.7 / 5.6 avg
Content Gaps 7.5 / 7.1 avg
AI Accuracy 6.8 / 6.8 avg
AI Discoverability 7.5 / 6.9 avg
Bar = your score · ▲ = average across 14 products
DX Health Ranking #4 of 14

Details available with full DX Pulse Benchmark.

AI Lens Ranking #5 of 13

Details available with full DX Pulse Benchmark.

Named competitor comparisons are available. Reach out for a side-by-side breakdown against specific companies.

Select recommendations and risk scenarios to project their combined impact on Acme APIs's DX score.

DX Health Score: 7.2
AI Lens Score: 7.1/10 (read-only — driven by DX inputs)
Dimension Breakdown
Select Actions You Plan to Implement (20 available)

Details available with full DX Pulse Benchmark.

Select Risks You Want to Model (10 available)

Details available with full DX Pulse Benchmark.