Designed a Copilot feature that analyzes data shape, infers user intent, and recommends optimal chart configs with story-first titles like “Quarterly Trends.” Collaborated with ML engineers on RAG model tuning, eliminating chart type decision fatigue and making great visualizations accessible to all.
Introduction
How I designed an AI system that bridges the gap between chart creation and chart communication — for 400M Excel users.
Creating charts is easy. Making them GOOD is hard. Users struggled with chart type selection, styling decisions, and best practices—resulting in suboptimal visualisations even when data was correct.
As Lead Designer for Copilot Chart Design Recommendations, I designed an LLM-powered system that analyses data shape, infers user intent, and suggests optimal chart configurations. This required deep collaboration with ML engineers to train and tune the RAG models with data visualisation principles—essentially encoding expert knowledge into AI prompts.
The result bridges the gap between 'chart exists' and 'chart communicates effectively,' democratising data visualisation expertise for 400M users.
Results Overview
The feature shipped, scaled, and proved that AI-assisted design guidance moves real product needles.
Execution Success
User Effort Saved
Clicks Eliminated
Shipping Timeline
The Problem: The Chart Design Expertise Gap
Most users could create a chart. Almost none could create a good one. The expertise gap was the product gap.
"No suggestions... I have to try different charts and hope one communicates my idea." — Usability Study 2024
"I spend hours tweaking charts to look professional." — Power User
"It looks boring. How do I make it presentation-ready?" — Enterprise Analyst
"I created a chart but don't know if I picked the right type." — Intermediate User
| Pain Point | User Behaviour | Business Impact |
|---|---|---|
| Chart Type Uncertainty | Try multiple types, delete, start over. 5-10 minute cycle. | Wasted time, user frustration, suboptimal final choices |
| Styling Paralysis | Don't know which formatting options matter. Either over-style or under-style. | Charts look unprofessional or cluttered |
| Best Practice Ignorance | Unaware of data viz principles (e.g., start axis at zero, use direct labels) | Misleading visuals, poor communication |
The pattern was clear: users could INSERT charts (thanks to our P0 improvements), but they couldn't optimise them.
Why Competitors Had the Advantage
Competitive analysis revealed that rival tools already offered sophisticated design assistance.
Tools that handle complex data are hard to use. User-friendly tools handle simple data.
40% of Excel charts were deleted in the same session. Canva and Flourish users kept their charts because they looked presentation-ready on first insert.
Google Explore, Napkin.ai, Tableau Show Me — every modern tool reduces data→chart to 1-2 clicks. Excel required 5+ steps with no guidance.
Pitch, Miro, Figma use click-to-format context menus. Excel used ribbon + dialog boxes — 3+ clicks to format a single element.
In our AI compete benchmark, Copilot in Excel scored 48/100 — below ChatGPT (85), Gemini (72), even Gemini Sheets (56). Task success was 40%.
BI tools provide insights alongside charts. Google Gemini explains trends. Excel charts were 'purely graphical: static visuals, with no story.'
Synthesis
Across all 30+ tools, three truths emerged:
Reduce friction at the start — suggestions, templates, one-click creation.
Make the default output impressive — users keep charts that look good on first insert.
Add intelligence: the tool should explain the data, not just display it.
Excel had the data capability. It needed the ease and the storytelling.
As a User
Functional & Emotional JTBDs
What users wanted to accomplish — and how they wanted to feel — when reaching for charts in Excel.
"When I insert a chart, help me create a visualization that tells my story effectively — without needing to be a data viz expert."
| Job Category | User Statement | Pain Point It Solves |
|---|---|---|
| Chart Type Selection | "Help me figure out which chart best represents my data" | 5-10 minute trial-and-error cycles; users try multiple types, delete, start over |
| Visual Design | "Make my chart look professional and presentation-ready" | Charts described as "boring," "old-fashioned," "embarrassing to present" |
| Best Practice Application | "Tell me what I don't know about good data visualization" | Users unaware of principles like "start Y-axis at zero" or "use direct labels" |
| JTBD | Description | Copilot Intent Share |
|---|---|---|
| Comparative Analysis | Compare values across categories, geographies, or periods to uncover insights | Part of 83% "Create Chart" intents |
| Presentation & Storytelling | Make complex information clear, engaging, persuasive in meetings/reports | 9.6% of explicit intents |
| Trend Analysis | Visualize how metrics change over time to identify patterns | Primary use case for Line charts |
| Answering Business Questions | Create ad-hoc visuals to answer specific questions quickly | Core Excel workflow |
The "Magic Wand" Quote
from User Research
The single question that unlocked what users truly needed — and reframed the entire design brief.
"If you could wave a magic wand, what would you change?"
Users wanted three things:
Automatic chart creation
"Based on my specific goal and storytelling needs, help me tell my story"
Automatic beautification
"Make my charts look beautiful without me having to figure it out"
Natural language customization
"Let me ask for customizations in plain English"
As a Business
Strategic JTBDs
The commercial imperatives driving investment in chart intelligence — retention, ecosystem depth, and competitive parity.
"Increase chart adoption and retention to keep users within the M365 ecosystem for their data visualization needs — preventing defection to competitors."
| Metric | Baseline Problem | Target Impact |
|---|---|---|
| Chart Kept Rate | ~45% of charts deleted in same session | Push toward >70% retention |
| Chart Create MAU | Only 2% of MAU on web create charts | Increase top-of-funnel creation |
| Net Chart Creation | Inserts minus deletes was too low | Increase net positive |
| Data Viz NPS | Charting issues dragging down Excel NPS | Measurable improvement |
| Copilot Tried/Enabled | Design Recommendations as gateway | Lift adoption rate |
| Business Job | Why It Matters | How Design Recommendations Solves It |
|---|---|---|
| Compete Defence | Tableau, Power BI, ChatGPT Code Interpreter, Napkin AI democratizing design expertise | Embed expertise IN the tool; no learning curve required |
| Copilot Adoption | Only ~9% of Copilot users engaged with chart-related prompts | Proactive recommendations at insert = gateway to Copilot |
| User Retention | Users looking outside M365 for data viz needs | "Wow moment" on first chart = sticky behavior |
| Unlock Latent Demand | 33% of commercial users want to create charts but don't | Remove friction to convert intent → action |
The Business Funnel Problem
The massive drop from awareness → creation is where AI Design Recommendations lives. It attacks the -98% conversion gap.
My Role
Designing AI as Design Partner
I wasn't just designing a UI — I was co-designing the intelligence behind it, working across ML, data science, engineering, and research simultaneously.
Systems thinking: considering charts across the complete M365 ecosystem
RAG model training & tuning: defined the data viz properties that inform chart type recommendations
Recommendation interaction patterns: preview, apply, and undo flows
AI prompt engineering collaboration: co-designed LLM prompts for chart analysis with the ML team and supplied examples of visually stunning data viz
End-to-end UX strategy for Copilot-powered design recommendations
Multi-recommendation handling: how to present 3-5 LLM-suggested improvements without overwhelming the user
Trust-building mechanisms: explainability, rationale, and "learn more" links
Design recommendations should feel like:
The design philosophy that shaped every interaction pattern, recommendation format, and piece of copy in the system.
A helpful colleague, not a know-it-all boss
Educational — explain WHY, don't just say WHAT
Suggestions, not mandates — users always have final say
Confidence-building — help users become better designers over time
Phase 1:
Analyzed telemetry revealing a -98% funnel drop from chart awareness to creation. Synthesized OCV feedback — users called charts "boring" and "embarrassing." Defined the core JTBD: help users tell data stories without being viz experts.
The OCV analysis showed 38% of chart complaints were about poor quality — tables instead of charts, wrong grouping, blank outputs. That's what we were solving.
Phase 2:
Explored 3 directions: auto-apply magic, inline tooltips, side-by-side preview. User testing rejected "AI takeover" — they wanted to see options first. Landed on story-first titles and preview-before-commit as guiding principles.
Phase 3:
Partnered with engineering to map hard limits: 2-4s LLM latency, ~85% preview fidelity, single Copilot pane. Made key tradeoffs — 4 recommendations, refresh button, dropdown for placement. Designed around constraints, not against them.
Phase 4:
Built a golden dataset of 50+ data scenarios with ideal chart recommendations. Defined statistical signals for preprocessing — time-series, part-to-whole, category comparison. Reviewed model outputs weekly to catch and correct bad patterns.
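The statistical signals named above could be computed roughly as follows. This is a minimal sketch, not the shipped preprocessing: the function name `detect_signals`, the signal labels, and both heuristics (an all-date label column implies a time series; a single non-negative series over categories implies a part-to-whole candidate) are assumptions made for illustration.

```python
from datetime import date


def _is_iso_date(value: str) -> bool:
    """Return True if value parses as an ISO date such as 2024-03-31."""
    try:
        date.fromisoformat(value)
        return True
    except ValueError:
        return False


def detect_signals(labels: list[str], series: list[list[float]]) -> set[str]:
    """Hypothetical preprocessing step: tag a table with data-shape signals.

    labels: the first column (category names or dates, as strings).
    series: one list of numbers per value column, aligned with labels.
    """
    signals: set[str] = set()
    if labels and all(_is_iso_date(label) for label in labels):
        # Assumption: an all-date label column means the data is a time series.
        signals.add("time_series")
        return signals

    signals.add("category_comparison")
    single = series[0] if len(series) == 1 else []
    if single and all(v >= 0 for v in single) and sum(single) > 0:
        # Assumption: one non-negative series over categories can be read
        # as shares of a total.
        signals.add("part_to_whole")
    return signals
```

Signals like these would be passed to the model alongside the data, so the prompt does not have to infer the shape from raw cell values alone.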
Phase 5:
Working with the ML team, I co-created prompts optimized for chart type selection. The key was encoding data visualization best practices into the prompt structure — story-first titles, rationale text, diverse recommendations.
I defined the decision tree that the model uses to recommend chart types:
| User Intent / Data Shape | Recommended Chart | Why This Works |
|---|---|---|
| Time series with trend | Line Chart | Shows change over time; eye follows the trajectory |
| Categorical comparison | Clustered Bar/Column | Easy side-by-side comparison; clear value differences |
| Part-to-whole (<7 categories) | Pie/Donut Chart | Intuitive percentage representation; limited categories |
| Part-to-whole (>7 categories) | Stacked Bar/Area | Handles many categories; shows composition |
| Correlation/distribution | Scatter/Bubble Chart | Reveals relationships; shows outliers clearly |
| Actual vs. Target | Combo Chart | Different visual encoding for different data types |
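
Read as code, the table above is a small lookup with one branch on category count. The sketch below is illustrative only: the `data_shape` labels are hypothetical names for the table's rows, and because the table covers fewer than 7 and more than 7 categories, treating exactly 7 as "many" is an assumption.

```python
def recommend_chart(data_shape: str, category_count: int = 0) -> str:
    """Map a data shape (or inferred intent) to a chart type, per the table."""
    if data_shape == "time_series":
        return "Line Chart"
    if data_shape == "categorical_comparison":
        return "Clustered Bar/Column"
    if data_shape == "part_to_whole":
        # Assumption: exactly 7 categories falls on the "many categories" side.
        return "Pie/Donut Chart" if category_count < 7 else "Stacked Bar/Area"
    if data_shape == "correlation":
        return "Scatter/Bubble Chart"
    if data_shape == "actual_vs_target":
        return "Combo Chart"
    raise ValueError(f"Unknown data shape: {data_shape!r}")
```

For example, `recommend_chart("part_to_whole", 5)` returns `"Pie/Donut Chart"`, while the same shape with 12 categories returns `"Stacked Bar/Area"`.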
Through iterative testing on 20+ sample datasets across industries (Telecom, Finance, Manufacturing, Retail), we tuned the prompts to:
**Prompt Architecture**
Given a chart with [data structure], current type [X], analyze if a better visualization exists.
Consider:
1) Data relationships,
2) Storytelling intent,
3) Visual clarity
Return top 4 recommendations with executable chart config and brief rationale.
Phase 6:
Designed the List → Detail two-panel flow. Specified card anatomy: thumbnail, story-first title, rationale, one-click apply. Added "Review changes" section for transparency. Created interaction specs for hover, dropdown, and back navigation.
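The card anatomy above maps naturally onto a small data structure. The sketch below assumes the model returns a JSON array of recommendations; the field names (`thumbnail_url`, `title`, `rationale`, `chart_config`) and the `parse_cards` helper are hypothetical, chosen to mirror the four parts of the card rather than the actual response schema.

```python
import json
from dataclasses import dataclass


@dataclass
class RecommendationCard:
    """One card in the list panel. Field names are illustrative."""
    thumbnail_url: str  # preview image shown in the pane
    title: str          # story-first title, e.g. "Quarterly Trends"
    rationale: str      # short explanation of why this chart fits
    chart_config: dict  # configuration applied by one-click apply


def parse_cards(raw_response: str, limit: int = 4) -> list[RecommendationCard]:
    """Parse a JSON array of recommendations, dropping incomplete entries."""
    cards = []
    for item in json.loads(raw_response):
        title = item.get("title", "").strip()
        rationale = item.get("rationale", "").strip()
        if not title or not rationale:
            # A card without a title or rationale cannot explain itself.
            continue
        cards.append(RecommendationCard(
            thumbnail_url=item.get("thumbnail_url", ""),
            title=title,
            rationale=rationale,
            chart_config=item.get("chart_config", {}),
        ))
    return cards[:limit]
```

Dropping entries that lack a title or rationale keeps the "always show WHY" principle enforceable at the data layer, before anything reaches the pane.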
Phase 7:
Ran usability sessions validating story-first titles. A/B tested model versions tracking Kept rate. Iterated on "Show details" for power users. Shipped to 10% Fastfood — poor quality dropped 20pp, satisfaction hit 64%.
Initial direction, design and concepts
Three directions explored before converging: auto-apply magic, inline guidance, and story-first previews. User testing killed option one fast.
The MVP
The first version that shipped — one recommendation, one action, one rationale. Intentionally narrow scope to reduce risk and build user trust before scaling.
Key Design Decisions & Trade-offs
The four biggest calls I made — and the reasoning, constraints, and user evidence that shaped each one.
Choice: Generate NATIVE Excel charts, not PNG images
Why: Editable, data-bound, refreshable. Competitors' AI-generated images look good but can't be tweaked.
Choice: Show 1-4 recommendations, prioritized by confidence
Why: Balance guidance with choice. 1 felt prescriptive, 5+ overwhelmed. 4 was sweet spot.
Choice: Thumbnail preview in pane, NOT live chart manipulation on hover
Why: Live preview felt overwhelming. Thumbnails gave control without distraction.
Choice: Always show WHY, not just WHAT to change
Why: Builds user understanding over time. Trust through transparency.
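The "1-4 recommendations, prioritized by confidence" rule can be sketched as a sort followed by a cap. This is a hedged illustration: the `(title, confidence)` tuple shape and the `select_recommendations` name are assumptions, and how confidence is actually scored is not specified here.

```python
def select_recommendations(
    candidates: list[tuple[str, float]], max_shown: int = 4
) -> list[tuple[str, float]]:
    """Return the highest-confidence candidates, at most max_shown of them.

    candidates: (story-first title, confidence score) pairs.
    """
    if max_shown < 1:
        raise ValueError("max_shown must be at least 1")
    # Highest confidence first; sorted() is stable, so ties keep input order.
    ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    return ranked[:max_shown]
```

With five candidates, the lowest-scoring one is simply not shown, which keeps the pane at the four-card limit without hiding the strongest options.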
Impact & Results
From controlled testing to 25% public rollout — every metric moved in the right direction.
Kept/Tried Rate
Users who apply recommendation keep the chart
Error-Free Load Rate
Down from 38% baseline
Copilot Tried/Enabled Lift
Uplift in users who try Copilot after seeing recommendations
Chart Retention Improvement
Reduction in same-session chart deletions
The feature is successfully reaching Novice/Intermediate users who traditionally DON'T use Copilot for charts (0.3%-3.7% baseline). Their engagement rates (54-61%) far exceed their typical Copilot usage.
| Metric | Target | Month 1 | Month 2 | Trend |
|---|---|---|---|---|
| Kept/Tried Rate | ≥80% | 71% | 79% | 📈 +8pp |
| Error-Free Load Rate | ≥80% | 74% | 85% | 📈 +11pp |
| Poor Quality Rate | <20% | 32% | 18% | 📉 -14pp |
| Pane Dismiss Rate | <30% | 41% | 28% | 📉 -13pp |
| Net Satisfaction (👍-👎) | >60% | 48% | 64% | 📈 +16pp |
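
The trend column and target status can be re-derived from the table's own numbers. The snippet below is a small worked check, with the `METRICS` list transcribed from the table above; the `summarize` helper is a hypothetical name used only for this illustration.

```python
# (metric, comparison, target %, month 1 %, month 2 %), taken from the table.
METRICS = [
    ("Kept/Tried Rate", ">=", 80, 71, 79),
    ("Error-Free Load Rate", ">=", 80, 74, 85),
    ("Poor Quality Rate", "<", 20, 32, 18),
    ("Pane Dismiss Rate", "<", 30, 41, 28),
    ("Net Satisfaction", ">", 60, 48, 64),
]


def summarize(metrics):
    """Return {metric: (change in percentage points, target met in month 2)}."""
    checks = {
        ">=": lambda value, target: value >= target,
        ">": lambda value, target: value > target,
        "<": lambda value, target: value < target,
    }
    return {
        name: (month2 - month1, checks[op](month2, target))
        for name, op, target, month1, month2 in metrics
    }
```

On these figures, every metric moves in the intended direction, and all meet their Month 2 targets except Kept/Tried Rate, which at 79% sits one point below the ≥80% target.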
Feedback from Users
Key finding: Users didn't just apply recommendations — they LEARNED from them. Over time, they started making better initial choices.
"This is like having a data viz expert sitting next to me."
Power User, Internal Preview
"Finally! I don't have to guess if my chart is good."
Intermediate User
"I learned more about charting from these suggestions than from any tutorial."
Novice User, Usability Study
"This is like having a data analyst whispering in my ear when I make a chart."
Financial Analyst, Early Adopter
Internal Testing & Strategic Impact
Pre-launch benchmarks that validated both the technical approach and design decisions before a single user saw it.
✅ 100% execution success rate
✅ LLM-generated chart code worked every time in controlled tests
✅ 9 mins, 172 clicks saved
✅ Measured against manual chart optimization workflow
✅ All common chart types supported
✅ Column, bar, line, scatter, pie, combo — full MVP coverage
Key Learnings: Designing for AI Collaboration
What I'd do the same, what I'd change, and what this project taught me about designing with — not around — AI.
The Bigger Lesson for AI Product Design