Metrics Hub
The systems-at-scale story
What
Every month, a deck lands on the desks of the C-suite and the Board telling the story of every key customer metric. What that meeting hides is a coordination nightmare: dozens of contributors, cascading handoffs, metric definitions that exist only in one analyst's head, and a review process held together by email.
Why
When someone left, their metrics vanished with them. Every new metric or report was a custom project. The process was distributed across enough people that any one departure created real gaps.
How
I built a platform that replaced the entire fragmented process: a three-layer architecture that formalizes the metric definition, defines how data is captured against those definitions, and presents the result through slides, workflows, and decks that inherit from the layers below.
Overview
Every month, Amazon leadership receives a deck where each slide tells the story of a customer metric. Business analysts define what gets measured, data scientists supply the numbers, program managers write the narratives, team leads unite the pieces, and leadership reviews everything. What that meeting hides is a coordination problem at scale: the process was distributed across enough people that any one departure created real gaps. The design challenge was decomposition: break the workflow into clean, modular layers with clear ownership and clear interfaces. Get the architecture right and the system becomes extensible. Get it wrong and every new metric is a custom project.
Pain Points
When an analyst left, the knowledge of how their metrics were defined, where data came from, and what caveats applied left with them. Onboarding meant weeks of reverse-engineering spreadsheets.
Producing a single deck required synchronizing dozens of contributors. Status tracking happened in spreadsheets, review cycles ran through email, and nobody had a single view of what was on track.
Metric definitions, data, and narratives were scattered across spreadsheets, documents, and slide decks. The same metric could have different definitions in different reports.
Review happened through email threads and ad-hoc meetings. No structured workflow tracked which slides had been reviewed or what feedback was outstanding. Missed reviews surfaced only at deck assembly.
The Entities
Four entities, because the reporting pipeline has four distinct lifecycle shapes.
Metric
The foundational definition: name, customer question, formula, directionality, segmentation, and data contract. Defined once, inherited everywhere.
Slide
A metric rendered for a specific reporting period. Inherits structure from the metric definition, adds period-specific data and narrative. Owned by a contributor, reviewed through a workflow.
Deck
An assembly of slides into a reviewable package. Manages ordering, theming, and the review workflow that gates publication. A deck orchestrates slides; it doesn't own content.
Report
The recurring cadence (monthly, quarterly) that drives slide creation, data deadlines, and review cycles. Defines which metrics participate, who is responsible, and when each phase is due.
Report (cadence + scheduling)
→ triggers Slide instances per Metric (data + narrative)
→ Slides assemble into Deck (review + publication)
→ Metric definitions flow down: formula, segmentation, chart type

The End-to-End Journey
How a metric moves from definition to the leadership deck, and who is involved at each stage.
Roles: POC = Point of Contact, Mgr = Manager, DA = Data Analyst, PM = Program Manager, Dir = Director, LD = Leadership.
| Phase | Stage | Task | Milestone |
|---|---|---|---|
| Non-workflow | Onboarding | Metric definition created | M1 |
| | | Data contract & segmentation agreed | |
| | | Data pipeline set up | |
| | | Permissions configured | |
| | | POCs identified for data & narratives | |
| Workflow | Data Update | Slide instance created | M2 |
| | | Monthly data uploaded | |
| | | Data sanity checked | |
| | Collaboration | Inline comments on slide sections | M3 |
| | | Discussion threads resolved | |
| | Narratives & Review | Narrative drafted on slide | M4 |
| | | Host review completed | |
| | | All slides finalized | |
| | | Director review | |
| | | Leadership review | |
| | Publication | Deck assembled & published | M5 |
| | | Deck reviewed | |
Milestone 1: The Metric Library
A metric in the library is a defined object, not a row in a spreadsheet. It carries a name, taxonomy, Customer Question, measurement formula with directionality, and a data contract. Every field is load-bearing downstream: the Customer Question becomes slide copy, directionality determines trend-line rendering, and segmentation determines ownership and reporting. Beyond these structural fields, each metric also captures measurement details such as statistic type, units, and sampling method, which drive data collection and goal evaluation.
The data model
Intervals
Two intervals govern every metric: measurement (how often data is captured) and reporting (how often it surfaces in review). A metric measured daily might report monthly.
Segmentation
Partitions how data is reported, by geography, speed tier, or business attribute. Each partition carries its own goal, justification, and operational context.
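As a concrete sketch, a metric definition carrying these fields might look like the following. Every field name here is an illustrative assumption, not the platform's actual schema:

```javascript
// Hypothetical metric definition record; field names are illustrative,
// not the platform's real schema.
const deliverySpeed = {
  name: "Delivery Speed",
  customerQuestion: "Are orders arriving when we promised?",
  formula: "on_time_deliveries / total_deliveries",
  directionality: "high-is-better", // drives trend rendering and goal evaluation
  measurementInterval: "daily",     // how often data is captured
  reportingInterval: "monthly",     // how often it surfaces in review
  segments: [
    { key: "US 2-Day", goal: 95.0 }, // each partition carries its own goal
    { key: "EU 2-Day", goal: 95.0 }
  ],
  dataContract: { columns: ["segment", "actual", "goal"], units: "%" }
};
```

Because the definition is a single object, slides and decks can inherit from it rather than restating it.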
The permission model
The system enforces a clear boundary between who maintains the platform and who uses it to produce reports, and every action in the system maps to one side of this line.
Host (Platform team)
│
├─ Define taxonomy & data contracts
├─ Configure intervals & segmentation
├─ Design slide templates
└─ Set workflow stages & gates

Client (Business teams)
│
├─ Upload data for segments
├─ Write narratives & context
├─ Review & approve slides
└─ View & present decks
Milestone 2: Metric Presentation
The slide is the presentation unit of a metric. The slide designer lets teams define the anatomy of their slide: what sections exist, what elements each contains, and which are system-controlled versus team-authored. In practice, every slide is divided into sections, each containing elements such as headings, text, data tables, and charts, so two teams can have slides that look different while following the same structural contract.
Slide structure
Slide
├─ Header
│ ├─ Heading ← metric name synced to slide settings
│ ├─ Text ← description synced to slide settings
│ └─ Text ← why it matters synced to metric metadata
├─ Data
│ ├─ Table ← values by segment synced to metric data
│ └─ Chart ← trend line synced to metric data
├─ Narrative
│ ├─ Text ← performance summary authored by user
│ └─ Text ← planned actions authored by user
├─ Supplemental
│ ├─ Text ← follow-up question authored by user
│ └─ Text ← answer authored by user
└─ Footer
└─ Footer
    └─ Details ← measurement details synced to metric metadata

Fields marked [synced] pull directly from the metric library and data pipeline. When the upstream definition changes, the slide reflects it automatically.
Fields marked [authored] are written by the team each cycle: the narrative, planned actions, and supplemental context. The designer defines where these appear; content is produced fresh each period.
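A minimal sketch of how the [synced] / [authored] split could resolve at render time; the shapes here are assumptions for illustration, not the system's real data model:

```javascript
// Hypothetical element resolution: synced fields re-read the metric
// library, so upstream edits propagate automatically; authored fields
// come from content written fresh each reporting period.
function resolveElement(element, metricDefinition, authoredContent) {
  if (element.source === "synced") {
    return metricDefinition[element.field];
  }
  return authoredContent[element.field] ?? "";
}
```

Under this sketch, a synced heading always reflects the library's current name for the metric, while an authored summary is empty until the team writes it this cycle.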
Slide canvas
A key requirement was WYSIWYG fidelity: the slide as seen in the web editor must match the slide when printed. Every slide is defined by a single structured specification (the DIP). Two parallel rendering paths consume it: the interactive path renders for the web canvas with editing controls and commenting anchors; the document path generates print-ready output through a docx pipeline. Both read the same DIP; neither modifies it.
Milestone 3: Metric Collaboration
Slides move through multiple reviewers, each leaving feedback on specific content sections. The collaboration layer provides inline commenting and threaded discussions anchored directly to slide content.
- Inline commenting: Comments anchored to specific content sections within a slide. Reviewers mark exactly which part they're responding to. Comments persist across sessions and survive content changes.
- Discussion threads: Threaded conversations with nested replies, quote references, and status tracking (resolved, unresolved, pinned). Any comment can escalate into a formal task with assignment and due dates.
Milestone 4: Metric Reporting
The reporting workflow moves slides through production each month. Every slide follows a defined cycle (Kick Off, Content, Review, Finalization, Closeout), each with its own participants and sign-off requirements.
Kicking off the workflow
Each period begins with the host team opening all metric slides, setting deadlines, and notifying responsible parties. From that point, the activity tracker makes the pipeline visible, surfacing every metric, every stage, and every outstanding action. When fifty slides need to reach leadership in four weeks, the central question becomes "where is everything right now?"
Data upload flow
Data owners upload metric values against the data contract defined in the library, and the system validates format, completeness, and alignment before committing anything to the reporting period.
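A sketch of what that validation gate could look like. The contract shape and checks below are illustrative assumptions; the real validation is richer:

```javascript
// Hypothetical upload validation: rows are committed only when the
// error list comes back empty.
function validateUpload(contract, rows) {
  const errors = [];
  // Format: every contracted column must be present on every row.
  rows.forEach((row, i) => {
    for (const col of contract.columns) {
      if (!(col in row)) errors.push(`row ${i}: missing "${col}"`);
    }
  });
  // Completeness: every contracted segment needs data this period.
  const seen = new Set(rows.map(r => r.segment));
  for (const seg of contract.segments) {
    if (!seen.has(seg)) errors.push(`no data for segment "${seg}"`);
  }
  return errors;
}
```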
Data owner
→ Select metric and reporting period
→ Upload data file
→ Validation against data contract
→ Metric data committed for that period
→ Slide data table and chart updated
→ Stakeholders notified

Narrative authoring flow
Once data is committed, program managers write the narrative: performance summary, planned actions, and supplemental context.
Program manager
→ Open slide for the reporting period
→ Write performance summary
→ Write planned actions
→ Author supplemental Q&A
→ Mark narrative as complete
→ Slide advances to review

Review flow
Review proceeds along two parallel tracks: the business team lead collects stakeholder sign-offs on narrative accuracy while the host team validates numbers and leaves comments. Both tracks must fully resolve before a slide can advance to finalization.
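The gate can be sketched as a pure predicate over the slide's review state; the state shape is an assumption for illustration:

```javascript
// Hypothetical two-track gate: a slide advances to finalization only
// when the business track (stakeholder sign-offs) and the host track
// (validation comments) are both fully resolved.
function canAdvanceToFinalization(slide) {
  const businessDone = slide.stakeholderApprovals.every(a => a.approved);
  const hostDone = slide.hostComments.every(c => c.status === "resolved");
  return businessDone && hostDone;
}
```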
Business team lead
→ Collect approvals from business stakeholders
→ Verify narrative accuracy
→ Respond to host team comments
→ All approvals received
→ Slide advances to finalization
→ Host team validates final numbers
→ Slide ready for deck

Milestone 5: The Deck
A deck is a collection of slides for a specific reporting month, curated for a particular leadership audience. By the time a deck exists, every layer below it has done its work: metrics are defined, data is uploaded, narratives are written, and slides have passed through the review workflow. Because the deck is curated for reading rather than editing, the slide surface strips editing controls and presents each slide in its final, print-ready form, rendering as a WYSIWYG document that can be exported to PDF for leadership distribution.
Slide ordering
Before a deck is published, the curator sets the slide ordering, choosing from three available modes:
Data uploaded
→ Narratives finalized
→ Reviews complete
→ Ordering configured
→ Deck published

Default ordering
Slides appear in the order they were added
Performance ordering
Sorts by delta from goal using each metric's high-is-better / low-is-better property; worst-performing metrics surface first
Custom ordering
User-defined drag-and-drop sequence
Lets curators build a specific narrative arc

Ordering is computed once, not live: by the time a deck reaches the ordering stage, the reporting month's data is locked and narratives are finalized. Since the underlying data does not change after publication, the computed order is stable, which cleanly separates deck ordering from dashboard ordering, where live data drives dynamic sorting.
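Performance ordering can be sketched as a one-shot sort at publication time; the field names below are illustrative assumptions:

```javascript
// Hypothetical performance ordering: compute each slide's shortfall
// against goal, flipping the sign for high-is-better metrics, and sort
// so the worst performers surface first. Computed once, then stored.
function performanceOrder(slides) {
  const shortfall = s =>
    s.directionality === "high-is-better"
      ? s.goal - s.actual // positive when the goal was missed
      : s.actual - s.goal;
  return [...slides].sort((a, b) => shortfall(b) - shortfall(a));
}
```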
Reflections
- Push harder for the declarative path: The slide designer shipped with a visual configurator. An early proposal for a JSON-based declarative config didn't gain enough traction. With the rise of AI tooling, a declarative config would have made slide creation scriptable and composable.
{
"sections": [
{
"type": "header",
"elements": [
{ "kind": "text",
"source": "synced",
"label": "Metric name" },
{ "kind": "text",
"source": "synced",
"label": "Description" }
]
},
{
"type": "data",
"elements": [
{ "kind": "table",
"source": "synced" },
{ "kind": "chart",
"source": "synced" }
]
},
{
"type": "narrative",
"elements": [
{ "kind": "text",
"source": "authored",
"label": "Summary" },
{ "kind": "text",
"source": "authored",
"label": "Actions" }
]
}
]
}
- Design for people who wear multiple hats: The system modeled eight to nine distinct roles. In practice, people wore multiple hats: a program manager might also be the narrative author and a reviewer. Role models should reflect how people actually work, not how org charts describe them.
Adoption
Osprey rolled out at the end of 2024 across Seller, Consumer, and AWS.
Non-value-add effort includes standardized communication, data validation, formatting, minor content edits, and slide management such as copy-pasting and uploading content. Over the course of one year, the pilot team, a 12-person Product and Customer Insights Management (PCIM) group, transitioned from fully manual operations to a workflow that two program managers can run end-to-end.
Appendix
What are the detailed host vs. client permissions across each milestone?
Every milestone in the system has a distinct set of actions split between Host (platform team) and Client (business teams). The Host side governs configuration, governance, and infrastructure, while the Client side governs content authoring, review participation, and consumption.
Host (Platform team)
│
├─ Metric Onboarding
│  ├─ Define taxonomy
│  ├─ Create metric definitions
│  └─ Set data contracts
├─ Updating Data
│  ├─ Configure intervals
│  ├─ Manage segmentation
│  └─ Validate uploads
├─ Metric Presentation
│  ├─ Design slide templates
│  └─ Manage synced fields
├─ Metric Reporting
│  ├─ Set workflow stages & gates
│  └─ Assign reviewers
└─ The Deck
   ├─ Publish deck configurations
   └─ Set audience access

Client (Business teams)
│
├─ Metric Onboarding
│  ├─ Browse metric library
│  └─ Request new metrics
├─ Updating Data
│  └─ Upload data for segments
├─ Metric Presentation
│  ├─ Write narratives
│  └─ Author supplemental context
├─ Metric Reporting
│  ├─ Advance stages
│  ├─ Review & approve slides
│  └─ Add discussion comments
└─ The Deck
   └─ View & present decks
How does the WYSIWYG split rendering work?
Two parallel rendering paths consume the same DIP: the interactive path renders for the web canvas with editing controls and commenting anchors; the document path generates print-ready output through a docx pipeline. Both read the same specification; neither modifies it.
{
"metric": "Delivery Speed",
"period": "Dec 2024",
"sections": [
{
"type": "header",
"title": "Delivery Speed",
"desc": "On-time delivery %",
"why": "Customer promise"
},
{
"type": "data",
"cols": ["Segment","Actual","Goal"],
"rows": [
["US 2-Day", 96.5, 95.0],
["US 1-Day", 98.4, 97.0],
["EU 2-Day", 94.2, 95.0]
]
},
{
"type": "narrative",
"body": "US tiers improved.
EU missed by 0.8pp."
}
]
}

function renderDocx(dip) {
  const doc = new Document();
  for (const sec of dip.sections) {
    if (sec.type === "header") {
      doc.addHeading(sec.title);
      doc.addSubtitle(sec.desc); // the DIP carries "desc", not "subtitle"
    }
    if (sec.type === "data") {
      const tbl = doc.addTable(
        sec.rows.length, sec.cols.length
      );
      sec.rows.forEach((r, i) => {
        r.forEach((val, j) => {
          tbl.cell(i, j).text(String(val));
        });
      });
    }
    if (sec.type === "narrative") {
      doc.addParagraph(sec.body);
    }
  }
  return doc.toBuffer();
}

How did we emulate role-based access?
A global role switcher reconfigured the entire application, including permissions, visible metrics, available actions, and review workflows, across eight to nine distinct roles.
How did we emulate different canvas layout variants?
The application coexisted with a host app that had its own navbar and sidebar, so we tested three integration strategies: inserting within the host, taking over parts of the host shell, and completely replacing it. In editing modes, the host navigation was hidden entirely to provide a focused workspace.
How did we integrate Cloudscape theming?
The host app used Cloudscape design tokens, so we built a test bench that loaded default token configurations, allowed live editing, and persisted changes via Lambdas, S3, and DynamoDB.
How do the AI integrations serve different workflow moments?
Two AI surfaces target distinct moments in the reporting workflow, and in both cases AI-generated content is always visually distinguished so that no AI output reaches other users without explicit human review.
- Inline AI nodes: Integrates directly into the TipTap editor for narrative authoring. AI-generated text inherits all editor capabilities. The same inline AI pattern is explored in Writing Canvas: Inline AI Nodes.
- Companion AI panel: A multi-turn conversational AI in the split panel alongside discussions. Normalized as a workflow participant, not a separate tool. The pattern is covered in Writing Canvas: Companion AI Panel.
How do task types map to the reporting lifecycle?
The same three action types used in Inquiry Hub apply here, mapped to the reporting lifecycle:
| Action type | Reporting tasks | Trigger |
|---|---|---|
| Implementation | Data update, slide data refresh, narrative update | Manual / Automated |
| Approval | Slide review, slide approval | Manual / Automated |
| Change request | Slide audit | Manual |
Any discussion comment can generate a formal task that inherits the comment's context (which slide, which section, what the issue is). The task enters the management system with a pending status and appears in the assignee's hub dashboard.
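As an illustrative sketch (the task shape here is an assumption, not the system's real schema), escalation might look like:

```javascript
// Hypothetical escalation: the task inherits the comment's context and
// enters the management system with a pending status.
function escalateToTask(comment, assignee, dueDate) {
  return {
    type: "change-request", // one of the three action types
    status: "pending",      // appears in the assignee's hub dashboard
    assignee,
    dueDate,
    context: {              // inherited from the comment
      slideId: comment.slideId,
      sectionId: comment.sectionId,
      issue: comment.body
    }
  };
}
```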