Designed an AI system that surfaces trends, anomalies, and comparisons the moment a chart is inserted in Excel. Increased chart retention by 15%, validated Copilot’s value for non-coders, and repositioned Excel as an analytical assistant—not just a calculation tool—with 65% positive sentiment.
Introduction
Copilot feature surfacing auto-generated chart insights. 15% higher chart retention, 2x Copilot engagement lift, 65% positive feedback rate, >95% factual accuracy.
Results Overview
60% of Excel charts get deleted within minutes of being inserted. Not because the data is bad. Because the chart doesn't explain itself.
Users were creating charts, staring at them, writing bullet points about them by hand, and then sending those bullet points to managers who asked "so what?" anyway. Translating a visual into a narrative was entirely manual. Excel handed people a chart and stepped back.
Tableau had "Explain Data." Google Sheets with Gemini surfaced "This category is declining." ChatGPT Data Analyst wrote natural language summaries of uploaded CSVs. Excel, the most widely used data tool on the planet with 400 million monthly active users, was still in the old paradigm: render the chart, walk away.
I was Lead Designer for Copilot Chart Intelligence at Microsoft. My job was to close that gap.
The Problem: Charts Without Context Are Decoration
| Metric | Result |
|---|---|
| Chart retention | +15% |
| Copilot engagement | 2x lift |
| Positive feedback rate | 65% (goal was 50%) |
| Factual accuracy | >95% on manual eval of 500+ insights |
| Interaction rate with panel | 40% (hover, scroll, click) |
What Users Were Telling Us
Verbatim signals from research and OCV — the qualitative layer that made the quantitative data undeniable.
"The chart shows the data, but I still don't know what it means."
— NPS Feedback
"I spend more time writing bullet points about the chart than creating it."
— Enterprise User
"My manager asks 'so what?' and I have to explain manually."
— Analyst
Key Observations
Patterns across all research tracks — the moments where the data started telling a consistent story.
“Charts should tell a story – ours don’t.” This succinct user insight underscored how Excel charts lacked narrative value.
Many users deleted charts soon after inserting them, a sign that they did not find them useful or attractive: an estimated 60% of inserted charts were deleted.
On Excel Web, charting drew a disproportionate share of negative feedback (22% of all Excel Web frown feedback was chart-related), with users citing missing features and difficulty “getting charts to tell me anything.”
CORE INSIGHT: Users weren't struggling to make charts — they were struggling to extract meaning from them. Excel gave them the 'what' but never the 'so what?'
Why Competitors Were Winning
30+ tools analyzed to understand the intelligence gap Excel hadn't closed — and where the whitespace was.
In our competitive analysis, tools like Tableau, Power BI, and AI-native platforms were delivering automatic insights:
Tableau: 'Explain Data' feature surfaced statistical anomalies
Google Sheets + Gemini: Proactive suggestions like 'This category is declining'
ChatGPT Data Analyst: Generated natural language summaries of uploaded CSVs
Napkin AI: Auto-generated visual explainers from text descriptions
CRITICAL GAP: Competitors understood that modern data viz isn't just rendering pixels — it's helping humans think. Excel was stuck in the old paradigm.
The Business Case for AI Insights
Not just a user request — a strategic imperative tied directly to Copilot adoption and platform retention.
This wasn't just feature parity. It was a strategic necessity.
The Opportunity
The gap between what Excel showed users and what they actually needed to understand their data.
Why Chart Insights, Why Now: These competitive insights underscored that to stay relevant and delight users, Excel had to infuse intelligence directly into charting. It wasn’t enough to improve the UI or add new chart types; the next logical step was a Copilot-driven experience where the moment a user creates a chart, the software adds value by explaining the data.
| Value | Ratio | Denominator | Meaning |
|---|---|---|---|
| 2% | 8M / 400M | Total Excel MAU | Overall market penetration |
| 16.5% | 4M / 24M | Copilot-enabled users | Chart usage among Copilot users |
| ~11% | 2.6M / 24M | Copilot-enabled users | Copilot usage among enabled users |
Among Copilot-enabled users, 16.5% create charts but only ~11% actively use Copilot features (November 2024 data). This 5.5pp activation gap represented a clear opportunity — users with Copilot access who chart frequently weren't leveraging AI features.
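The arithmetic behind the activation gap can be reproduced from the counts in the table. This is a minimal sketch, not the team's actual analysis: the variable names are illustrative, and because the counts are rounded, the computed shares (about 16.7% and 10.8%) differ slightly from the reported 16.5% and ~11%.

```python
# Rounded user counts taken from the table above (as reported, November 2024).
COPILOT_ENABLED = 24_000_000
CHART_CREATORS = 4_000_000   # Copilot-enabled users who create charts
COPILOT_ACTIVE = 2_600_000   # Copilot-enabled users who use Copilot features


def share(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


chart_share = share(CHART_CREATORS, COPILOT_ENABLED)
copilot_share = share(COPILOT_ACTIVE, COPILOT_ENABLED)

# The gap is a difference of two percentages, so it is in percentage points.
gap_pp = chart_share - copilot_share

print(f"chart share:   {chart_share:.1f}%")
print(f"copilot share: {copilot_share:.1f}%")
print(f"gap:           {gap_pp:.1f} pp")
```

The gap is expressed in percentage points rather than percent because both shares use the same denominator (Copilot-enabled users).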
Competitors were turning charts into “visual narratives” by combining charts with insights and even action suggestions.
JTBDs
Understanding what users are actually trying to accomplish when they create and analyze charts in Excel.
| When I... | Insert a chart to visualize quarterly revenue data across product lines |
| I want to... | Immediately understand what patterns, trends, and anomalies exist without manually calculating statistics or staring at the chart for 10 minutes |
| So I can... | Confidently present findings to my manager, make data-driven decisions faster, and avoid missing critical business insights |
| Without... | Spending 20+ minutes manually analyzing every data point, second-guessing my interpretation, or relying on my manager to spot issues I missed |
| When I... | Need to present analysis to stakeholders who don't have time to dig into raw data |
| I want to... | Have ready-made narrative bullets that explain what the chart shows in plain language, with the option to copy them directly into emails or presentations |
| So I can... | Save hours writing explanatory text, ensure I'm communicating the most important findings, and look like a data expert even if I'm not |
| Without... | Spending 30+ minutes writing bullet points, worrying I'm focusing on the wrong metrics, or having my manager ask 'what about X?' that I completely missed |
| When I... | Create a chart for a high-stakes presentation (board meeting, executive review) |
| I want to... | Get a second opinion from AI to confirm my interpretation is correct, or alert me to patterns I might have missed |
| So I can... | Present with confidence, avoid embarrassing mistakes, and discover insights that make me look smart rather than missing obvious trends |
| Without... | Asking my manager to double-check every chart, staying up late rechecking numbers, or getting called out in a meeting for missing something obvious |
| When I... | Work with data regularly but don't have formal analytics training |
| I want to... | See examples of how experienced analysts interpret data, so I can learn what questions to ask and what patterns matter |
| So I can... | Develop my analytical skills over time, become less dependent on others, and eventually spot these patterns myself |
| Without... | Taking a formal data analytics course, bothering my analyst colleagues with basic questions, or relying on trial-and-error that wastes time |
My Role: Designing Intelligence, Not Just Interfaces
The qualitative layer mattered here. These weren't edge cases.
From NPS feedback: "The chart shows the data, but I still don't know what it means."
From an enterprise user: "I spend more time writing bullet points about the chart than creating it."
From an analyst: "My manager asks 'so what?' and I have to explain manually."
The pattern was consistent across every research track. Users weren't struggling to make charts. They were struggling to extract meaning from them. Excel gave them the "what." It never gave them the "so what."
There was a specific data point that made the business case undeniable: among Copilot-enabled users, 16.5% created charts but only ~11% actively used any Copilot feature. That 5.5 percentage point gap was the opportunity. People with access to AI weren't using it because nothing triggered them to. Chart Insights was the trigger.
Also worth noting: 22% of all Excel Web frown feedback was chart-related. A disproportionate share. Charts were a pain point the product hadn't addressed.
MVP — Prove the Core Value
Show static insights on chart insertion to validate whether automatic, immediate analysis delivers value.
| Success criteria | 40% click rate achieved vs. 20% target |
| Validation | A/B test for chart retention, Copilot activation |
| Performance target | P95 <20s generation |
| Interaction | Click → popover → thumbs feedback → Ask Copilot |
| Constraints | Creator-only, no persistence, native charts only |
| Scope | Skittle button on chart insert (Web first), 1-3 insights, manual refresh |
Make It Interactive & Contextual
Enrich insights with responsiveness to chart edits and expand access to chart consumers.
| Responsive insights | Auto-update when chart type/data changes |
| On-demand access | Right-click any chart → Generate Insights |
| Persistent insights | Save as chart property, visible to collaborators |
| Enhanced interaction | Copy text, "Explain why" button, hide individual insights |
| Platform parity | Win32, Mac support, multi-chart scenarios |
| Edge cases | Trendlines, empty charts, error recovery |
Deep AI Analysis & Storytelling
Transform insights into a conversational analytical assistant with cross-chart narratives and M365 integration.
| Conversational analysis | Embedded Copilot chat with context maintenance |
| Cross-chart insights | Multi-chart narratives and high-level summaries |
| Visual highlights | Link insight text to chart elements (hover to highlight) |
| M365 integration | Send to PowerPoint with auto-generated slides |
| Advanced analytics | Predictive trends, correlations, diagnostic analysis, benchmarking |
| Vision | Excel as AI-driven analysis platform with intelligent partnership |
The MVP: Version 1
If Copilot surfaces insights the moment a chart is inserted, users will keep the chart, understand their data faster, and trust Excel as an analysis tool. Not just a calculation tool.
The solution: when you insert a chart, Copilot immediately answers "What does this show?" Plain language. Specific. Something like "North region is the top contributor with 35% of total sales." Right next to the chart, before you've had to think about it.
If insights are dismissed, users can bring them back from the right-click menu or the chart ribbon menu.
Another version explored the Copilot pane instead of an on-canvas dialog.
The Collision: When Reality Punched Back
We designed for <5s latency, got >30s reality. Next time: prototype with artificial delays from day one. Assume performance will be worse than promised.
"I inserted the chart and just... waited. It felt broken. I thought Excel crashed."
Users clearly preferred on-canvas overlay, but technical and real estate constraints made it impossible in its original form. The skittle button preserved the contextual proximity users loved while solving latency and space problems.
Version 2: Introducing the Skittle
We didn't try to ship everything at once. We broke it into Crawl, Walk, Run.
Crawl (MVP): Show static insights on chart insertion. Validate the core value. One to three insights, a skittle button on the chart, manual refresh. Web-only, creator-only, no persistence. Success criteria: a 20% click rate, a chart retention lift, and Copilot activation. We cleared it: a 40% interaction rate against the 20% target.
Walk: Make it interactive and contextual. Auto-update when chart type or data changes. Right-click access for consumers, not just creators. Persistent insights visible to collaborators. Add "copy text," "explain why," the ability to hide individual insights. Expand to Win32 and Mac.
Run: Full analytical assistant. Embedded Copilot chat with context maintenance. Cross-chart narratives. Visual highlights linking insight text to chart elements. Send to PowerPoint. Predictive trends, correlations, benchmarking.
The Crawl phase is what shipped and what these results reflect.
We also added a pop-over toast to highlight
Key Design Decisions & Trade-offs
Two things broke.
Problem 1: Latency. We designed for under five seconds. The LLM gave us over thirty. A chart appears in under half a second. Then thirty seconds of "Analyzing..." A user in dogfood said it plainly: "I inserted the chart and just waited. It felt broken. I thought Excel crashed."
We pivoted. The skittle button approach let users choose when to load insights rather than forcing them to wait. That one constraint change fixed the perception problem without scrapping the feature.
The lesson: I should have prototyped with artificial delays from day one. Assume performance will be 10x worse than the engineering team's best-case estimate. Have a backup interaction pattern ready before you need it.
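One way to prototype with artificial delays is to put a fake insight service behind a wait budget and see which interaction pattern each delay forces. The sketch below is an assumption-laden illustration, not the shipped implementation: `fake_generate_insights`, `load_with_budget`, the delay profiles, and the 10-second budget are all hypothetical.

```python
import asyncio

# Hypothetical delay profiles in seconds: designed-for, plausible, and observed.
DELAY_PROFILES = {"best": 5.0, "plausible": 15.0, "worst": 30.0}

# Hypothetical wait budget: how long an auto-loading panel may keep a user waiting.
BUDGET_S = 10.0


async def fake_generate_insights(delay_s: float) -> list:
    """Stand-in for the real insight service; only the delay is simulated."""
    await asyncio.sleep(delay_s)
    return ["North region is the top contributor with 35% of total sales."]


async def load_with_budget(delay_s: float, budget_s: float) -> str:
    """Return the UI pattern a given delay would force under the wait budget."""
    try:
        await asyncio.wait_for(fake_generate_insights(delay_s), timeout=budget_s)
        return "inline"      # insights arrived in time, auto-show them
    except asyncio.TimeoutError:
        return "on_demand"   # too slow, fall back to a click-to-load button


async def main(scale: float = 1.0) -> dict:
    # `scale` shrinks every wait so the prototype can be run quickly.
    return {
        name: await load_with_budget(delay * scale, BUDGET_S * scale)
        for name, delay in DELAY_PROFILES.items()
    }


if __name__ == "__main__":
    print(asyncio.run(main(scale=0.02)))
```

Running the same flow at each profile makes the fallback pattern visible before any real service exists, which is the point of having the backup ready early.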
Problem 2: Real estate. We designed the insight panel on clean screens at standard resolutions. Enterprise runs on 1366x768 laptops. A 300px-wide insight panel next to a chart, with Chart Design Recommendations also running in a sidebar, on a 1366x768 screen: there's nothing left for actual work.
We didn't know about the Design Recs feature conflict until mid-sprint. Both features trigger on "Insert Chart." Both compete for the same screen space.
The fix was the skittle button pattern. Small, anchored to the chart, doesn't expand until clicked. Users who want insights see them. Users who need the real estate don't lose it.
Choice: Auto-trigger on insert (with easy dismiss)
Why: Testing showed users didn't know to ASK for insights. Making it proactive was key to discovery.
Choice: No auto-refresh on data changes
Why: Performance risk, user distraction, and testing showed users preferred control. They'll hit refresh when ready.
Choice: Floating panel anchored to chart
Why: Sidebars compete with other UI. Inline feels contextual, less like 'another feature' and more like 'the chart explaining itself'
Choice: Binary thumbs up/down, not 5-star scale
Why: Lower friction, higher response rate. We cared more about volume of feedback than granularity.
Impact & Results
Auto-trigger vs. on-demand. Insights auto-trigger on insert with an easy dismiss. Testing showed users didn't know to ask for insights. Discovery only works if it's proactive. Users who had to opt in never found it.
Manual refresh only. No auto-refresh when data changes. Performance risk, user distraction, and testing showed users want control. They'll hit refresh when they're ready. Auto-refresh felt noisy.
Floating panel vs. sidebar. Floating panel anchored to the chart. Sidebars compete with everything else already in the sidebar. Anchored to the chart, it felt like the chart was explaining itself, not like another panel had appeared.
Thumbs over ratings. Binary thumbs up/down, not a five-star scale. Lower friction, higher response rate. We cared more about volume than granularity. 65% positive rate on that binary signal.
Key Learnings
What three design iterations, a 15% retention lift, and a latency crisis taught me about shipping AI features that earn trust.
Built for <5s latency, got >30s reality. Had to pivot mid-sprint.
Next time: Design for 10x slower than best case. Prototype with artificial delays (5s, 15s, 30s). Have backup pattern ready from day one.
Designed in isolation, learned about Design Recs conflict late.
Next time: Audit all features that trigger on same event. Test on 1366x768 screens from day one. Involve PM in multi-feature roadmap alignment earlier.
Defined metrics after design was done. Had to retrofit event tracking.
Next time: Create telemetry schema during wireframing. Every interaction state = logged event. Treat telemetry as a design deliverable.
Tested with clean datasets (5 columns, 100 rows), shipped to messy reality (500 columns, formulas, merged cells).
Next time: Start with messiest data first. Create 'data chaos test suite' for prompt validation. Design failure states as prominently as success states.
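The telemetry lesson above ("every interaction state = logged event") can be sketched as a schema written alongside the wireframes. This is a hypothetical example: the state names, the `InsightEvent` fields, and the helper functions are assumptions for illustration, not the schema that shipped.

```python
import json
import time
from dataclasses import asdict, dataclass, field
from enum import Enum
from typing import Iterable, Optional


class InsightState(Enum):
    """Hypothetical interaction states; each one must map to a logged event."""
    SKITTLE_SHOWN = "skittle_shown"
    SKITTLE_CLICKED = "skittle_clicked"
    INSIGHTS_LOADING = "insights_loading"
    INSIGHTS_SHOWN = "insights_shown"
    INSIGHTS_FAILED = "insights_failed"
    THUMBS_UP = "thumbs_up"
    THUMBS_DOWN = "thumbs_down"
    REFRESHED = "refreshed"
    DISMISSED = "dismissed"


@dataclass(frozen=True)
class InsightEvent:
    state: InsightState
    chart_type: str
    latency_ms: Optional[int] = None
    timestamp: float = field(default_factory=time.time)


def to_log_line(event: InsightEvent) -> str:
    """Serialize an event as one JSON log line."""
    payload = asdict(event)
    payload["state"] = event.state.value  # enums are not JSON-serializable
    return json.dumps(payload, sort_keys=True)


def unlogged_states(events: Iterable[InsightEvent]) -> set:
    """Return schema states that never appear in a captured session."""
    seen = {event.state for event in events}
    return set(InsightState) - seen
```

A check like `unlogged_states` can be run against a recorded walkthrough of the prototype to catch states that were designed but never instrumented.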
The Bigger Lesson
15% increase in chart retention versus the control group, alongside a 2x lift in Copilot engagement.
65% thumbs-up rate. Our goal was 50%. Users found insights genuinely helpful, not annoying or obvious.
40% interaction rate. Not just impressions. Hovering, scrolling, clicking. Real engagement.
>95% factual accuracy on manual evaluation of 500+ insights. No hallucinated numbers. No false statements. This matters: one wrong insight destroys ten good ones.
Qualitative feedback from dogfood and early users:
Critical feedback we addressed: some insights felt obvious for simple datasets, so we tuned the LLM to focus on non-trivial patterns. The panel sometimes covered data, so we adjusted positioning logic. Users wanted to save insights to cell comments, which went to the Walk phase roadmap.
Strategically: this validated Copilot's value for people who aren't Power Query experts. It differentiated Excel as the only tool with native, editable charts plus AI insights. Competitors had either/or. And it green-lit the Walk and Run phases.
Prototype latency variations earlier. Design for 10x slower than best case. Have the backup interaction pattern in the design file before the first engineering conversation.
Map real estate conflicts earlier. Audit every feature that triggers on the same event. Test on 1366x768 from day one. Get PM alignment on multi-feature roadmap conflicts during discovery, not mid-sprint.
Build instrumentation into design from day one. I defined metrics after the design was done. That meant retrofitting event tracking. Every interaction state should have a corresponding logged event. Treat the telemetry schema as a design deliverable.
Test with messy data from day one. We tested with clean datasets: five columns, 100 rows. We shipped to 500 columns, formulas, merged cells. Start with the messiest possible data. Build a failure state design that's as prominent as the success state.
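A "data chaos" fixture of the kind described here could start as small as the sketch below. Everything in it is an assumption made for illustration: the mix of blanks, formula text, and stray strings, the 50% numeric threshold, and the state names are hypothetical rather than taken from the project.

```python
import random


def make_messy_table(rows: int, cols: int, seed: int = 0) -> list:
    """Build a table with blanks, formula text, and stray strings mixed into numbers."""
    rng = random.Random(seed)
    table = []
    for _ in range(rows):
        row = []
        for _ in range(cols):
            roll = rng.random()
            if roll < 0.10:
                row.append(None)              # blank, as a merged cell leaves behind
            elif roll < 0.15:
                row.append("=SUM(A1:A10)")    # unevaluated formula text
            elif roll < 0.20:
                row.append("n/a")             # text inside a numeric column
            else:
                row.append(round(rng.uniform(0, 1000), 2))
        table.append(row)
    return table


def numeric_ratio(table: list) -> float:
    """Fraction of cells that hold a usable number."""
    cells = [value for row in table for value in row]
    if not cells:
        return 0.0
    return sum(isinstance(value, (int, float)) for value in cells) / len(cells)


def choose_state(table: list, min_numeric: float = 0.5) -> str:
    """Pick the UI state to design for: insights or an explicit failure state."""
    if not table or not table[0]:
        return "empty_chart"
    if numeric_ratio(table) < min_numeric:
        return "not_enough_numeric_data"
    return "insights"


# A wide, messy sheet in the spirit of "500 columns, formulas, merged cells".
wide = make_messy_table(rows=100, cols=500)
print(len(wide), len(wide[0]), choose_state(wide))
```

Giving each failure state a name in the fixture makes it harder to leave that state undesigned.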
Conclusion
AI features don't earn trust by being impressive. They earn it by being right.
Chart Insights worked because it solved a specific job someone actually had (understand my chart right now), appeared at the exact moment they needed it (chart insertion), told the truth more than 95% of the time, and left the person in charge of the analysis. The AI helped them see faster. It didn't replace what they were doing.
That's the model. Not AI for its own sake. AI that has a clear job and does it reliably.