Case Study 05 — Datadog Platform
Carlos Diaz
Staff Product Designer

Natural Language Queries

Redesigning how users interact with Datadog's query language — letting anyone describe what they need in plain language and receive a structured, editable, executable query.

335 Beta users
43 Organizations
6+ Data sources
GA Target Q2 2025
Project overview video
01 Context

Datadog's query language is powerful — Lucene for logs, a proprietary syntax for metrics, DDSQL for analytics. But for a growing number of users, that power comes at a cost.

New developers, business users, product managers, and support teams increasingly need to query observability data without mastering Datadog's syntax. Each product has its own query structure — Logs, RUM, Traces, Metrics, CCM — and knowledge doesn't transfer between them.

When a query returns unexpected results, users can't tell if the syntax is wrong, the field name is off, or the time range is bad. The result: trial-and-error, reliance on technical teammates, and eroding confidence in a tool they're supposed to use independently.

NLQ changes this. Users describe what they need in plain language — "Show me errors for the payment service in the last 24 hours" — and receive a structured, editable query they can run, refine, and trust.
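As an illustration of the idea, here is a minimal sketch of the kind of structured result such a translation might produce. The type and function names are hypothetical, not Datadog's actual API; only the example prompt and the idea of an editable, executable query come from the case study.

```typescript
// Hypothetical shape for an NLQ translation result — illustrative only,
// not Datadog's actual API.
interface TranslatedQuery {
  source: "logs" | "rum" | "traces" | "metrics";
  query: string;      // generated Datadog-style search syntax
  timeframe: string;  // time range understood from the prompt
  editable: boolean;  // the user can refine it before running
}

// Stub: a real implementation would call a translation model. This only
// shows the output shape for the example prompt from the case study.
function translate(prompt: string): TranslatedQuery {
  return {
    source: "logs",
    query: "service:payment status:error",
    timeframe: "past 24 hours",
    editable: true,
  };
}
```

The key property is that the output is a structured object the user can inspect and edit, not an opaque answer.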

Role: Design Lead
Timeline: Q4 2024 – Q2 2025
Shipped to: Logs Explorer (expanding to RUM, Traces, Metrics)
Teams: NLQ, Search, Logs, AI Platform
02 The Problem

NLQ existed in private beta with 40+ organizations — but retention was only 12%. Three issues were blocking adoption.

👁️

Invisible to most users

The NLQ entry point was buried inside the search bar as a secondary option. Most users never discovered it existed. "Ask a question" felt hidden — like an afterthought, not a feature.

🐌

Too slow to trust

Average query translation latency was 1.93 seconds. In a search-first tool where users expect instant feedback, two seconds feels broken. The target: under one second.

🔇

No feedback loop

Users had no way to tell the system if a translation was right or wrong. Bad queries stayed bad. The model couldn't learn from mistakes, and users couldn't help it improve.

03 Process

From research to design review in four months. Started with Logs — the highest-usage surface (66% of all NLQ queries) — with a framework designed to scale to every explorer.

Phase 1

Research + Benchmarking

Conducted user interviews across new users, business users, and power users. Benchmarked against ThoughtSpot, Splunk NL, and internal Bits AI patterns. Identified three blockers: discoverability, latency perception, and feedback absence. Built a kick-off deck mapping personas to pain points.

Phase 2

Design Exploration

Explored multiple directions: a focus-mode overlay (discarded — redundant with Explorer), a heavily AI-branded experience (discarded — too invasive for frequent use), and the final approach: seamless in-Explorer integration where NLQ lives inside the existing query layout. Built Figma prototypes and interactive demos for stakeholder review.

Phase 3

Design Review + Ship

Ran a formal design review with PM (Bharadwaj Tanikella), Engineering (Tim Brown), and cross-functional stakeholders. Landed on the "Eager Translation + Submit" model over eager execution. Shipped the recommended version to Logs Explorer with feedback loop, expanding to RUM and Traces next.

04 Solution

NLQ v2.0 — not a separate mode, but a layer woven into the Explorer. Five design decisions that make natural language feel native.

Discoverable Entry Point

An "Ask" CTA embedded inside the search bar — not next to it, not above it, inside it. This ensures consistency across every Datadog product that adopts NLQ. A tooltip explains the feature on hover. Power users can hide it via the Search Bar Configuration Hub. We also explored a "Space" shortcut to trigger NLQ from the keyboard, aligning with Datadog's broader AI shortcut strategy.

Entry point

Eager Translation

As the user types in natural language, the query editor below shows a real-time translation into Datadog syntax. The timeframe selection updates simultaneously. Users see exactly what the system understood before committing — no black box, no guessing. The NLQ component enters a highlighted state with a visible Submit button, making it clear that Enter or click is required to execute.

Real-time translation
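The eager-translation interaction can be sketched as a debounced call: translate only after the user pauses typing, so the preview stays live without firing a request on every keystroke. This is a generic sketch under that assumption — the debounce interval and the `updatePreview` handler are illustrative, not the shipped implementation.

```typescript
// Minimal debounce: the wrapped function runs only after the caller has
// been quiet for `waitMs` milliseconds.
function debounce<T extends (...args: string[]) => void>(fn: T, waitMs: number) {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: Parameters<T>) => {
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(() => fn(...args), waitMs);
  };
}

// Hypothetical usage: refresh the query preview as the user types.
const updatePreview = debounce((prompt: string) => {
  // In the real product this would render the translated query and the
  // updated timeframe beneath the input; here we just log the prompt.
  console.log(`translating: ${prompt}`);
}, 300);
```

A short debounce keeps the preview feeling instant while avoiding a translation request per keystroke — directly relevant to the sub-second latency target.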

Three Distinct States

The NLQ flow has three clear phases: Intro (type your prompt, see dynamic placeholder suggestions based on your org's data), Eager Translation (query preview generated, Submit enabled, refinement still possible), and Outcome + Feedback (query executed in Explorer, results visible, feedback prompt appears). Each state has distinct visual treatment so users always know where they are.

Three states
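The three phases above behave like a small state machine. The state names come from the case study; the transition logic is an illustrative sketch, not the shipped code.

```typescript
// The three NLQ phases as a tiny state machine.
type NlqState = "intro" | "eagerTranslation" | "outcome";
type NlqEvent = "type" | "submit" | "clear";

function next(state: NlqState, event: NlqEvent): NlqState {
  switch (state) {
    case "intro":
      // Typing a prompt generates a preview and enables Submit.
      return event === "type" ? "eagerTranslation" : "intro";
    case "eagerTranslation":
      if (event === "submit") return "outcome"; // run the query in the Explorer
      if (event === "clear") return "intro";    // start over
      return "eagerTranslation";                // keep refining the prompt
    case "outcome":
      // Results are shown; typing again returns to refinement.
      return event === "type" ? "eagerTranslation" : "outcome";
  }
}
```

Modeling the flow this way is what makes "users always know where they are" enforceable: each state maps to one distinct visual treatment.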

Seamless Explorer Integration

NLQ doesn't take you somewhere else — it opens directly inside the Explorer. Users stay in their workflow. After submission, they can modify the generated query in the standard query editor, adjust visualization settings, tweak facets — exactly like a manually written query. The transition between NLQ and traditional editing is invisible.

In-Explorer flow

Feedback Loop

After every query execution, a feedback prompt appears in the bottom-right of the NLQ component. Users rate the translation quality, and corrections feed back into the model. Aggregated feedback trains the translation engine on real-world usage patterns — the system gets smarter with every query. The absence of this loop was the missing piece that held retention at 12%.

Feedback system
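One way to capture this signal is to treat a user's post-translation edit as an implicit correction. The payload shape and function below are hypothetical, sketched to show the idea, not Datadog's actual feedback schema.

```typescript
// Hypothetical feedback payload — field names are illustrative.
interface NlqFeedback {
  prompt: string;          // what the user asked in plain language
  generatedQuery: string;  // what the translation produced
  rating: "correct" | "incorrect";
  correction?: string;     // the user's edited query, if they fixed it
}

// If the user edited the generated query before running it, record the
// edit as a correction the translation model can learn from.
function buildFeedback(prompt: string, generated: string, finalQuery: string): NlqFeedback {
  const edited = finalQuery !== generated;
  return {
    prompt,
    generatedQuery: generated,
    rating: edited ? "incorrect" : "correct",
    correction: edited ? finalQuery : undefined,
  };
}
```

Pairing explicit ratings with implicit corrections like this gives the model both a label and the ground-truth query for every miss.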
05 Outcomes

Shipped to Logs Explorer. Expanding to RUM, Traces, CCM, and Metrics — with a path to becoming the default query interface across Datadog.

335

Beta Users

Active across 43 organizations in private beta. 66% of usage concentrated in Logs — validating the launch surface choice.

v2.0

Redesigned Experience

From buried entry point to embedded CTA. From 1.93s latency perception to eager real-time translation. From zero feedback to structured rating system.

6+

Data Sources

Logs, RUM, CCM, REDAPL already supported. Traces and Metrics in the pipeline. Framework designed to scale to every explorer search bar.

Bits AI

AI Foundation

NLQ is the query translation layer for Bits AI agents, Cmd+K natural language, and autonomous workflows. Every improvement to NLQ compounds across the platform.

GA

General Availability

Targeting Q2 2025. Expanding from 40 to 200+ preview organizations, then full rollout with pricing aligned to Datadog's AI product strategy.

What's Next

Complex contextual queries ("errors for my team"), RBAC-aware filtering, cold-start handling for new services, and DDSQL support for REDAPL and Workspaces.
