EngineeringAI

How Natural Language to SQL Actually Works

A technical deep dive into how warpcore translates plain English questions into accurate SQL queries — and why we always show you the SQL before running it.

Alex ChenApril 12, 2026

When someone types /ask How many orders shipped last week? in Discord, warpcore needs to turn that into valid SQL against your specific database. Here's how that works under the hood.

The Problem Is Harder Than It Looks

Consider a simple question: "What's our revenue this month?"

Depending on your schema, the correct SQL could be any of:

-- Simple sum
SELECT SUM(amount) FROM payments WHERE created_at >= date_trunc('month', now());

-- With currency conversion
SELECT SUM(amount * exchange_rate) FROM transactions
WHERE status = 'completed' AND timestamp >= date_trunc('month', now());

-- From an invoices table
SELECT SUM(total) FROM invoices
WHERE paid_at IS NOT NULL AND paid_at >= date_trunc('month', now());

The "right" answer depends entirely on your schema — table names, column names, relationships, and business logic. Generic SQL generation doesn't cut it.

Our Approach: Schema-Aware Translation

warpcore doesn't guess. It reads your database schema and uses it as context for every translation.

Step 1: Schema Introspection

When you connect a database, warpcore catalogs:

Tables and columns — names, types, nullable flags
Relationships — foreign keys, join paths
Indexes — to suggest performant queries
Comments — if you've documented your schema, we use that context

Well-documented schemas produce dramatically better natural language results. A column named amt with no comment is ambiguous. A column named amount_cents with a comment "Order total in USD cents" is crystal clear.

Step 2: Query Intent Classification

Before generating SQL, we classify the type of question:

Intent	Example	SQL Pattern
Count	"How many users signed up?"	`SELECT COUNT(*)`
Aggregation	"What's the average order value?"	`SELECT AVG(column)`
Lookup	"What's the status of order #1234?"	`SELECT ... WHERE id = ?`
Trend	"Show signups by day this week"	`GROUP BY date_trunc(...)`
Comparison	"Which plan has more users?"	`GROUP BY ... ORDER BY ...`

Step 3: SQL Generation

With schema context and classified intent, we generate the SQL. The key principles:

Always qualify columns — no ambiguous references
Use the simplest valid query — no unnecessary joins or subqueries
Respect types — don't compare strings to integers
Add sensible limits — default to LIMIT 100 to prevent runaway queries

Step 4: Human Review

This is the most important step: we show you the SQL before executing it.

> /ask How many users signed up this week?

Generated SQL:
SELECT COUNT(*) AS signups
FROM users
WHERE created_at >= date_trunc('week', now())

Run this query? [✓ Execute] [✏️ Edit] [✗ Cancel]

The AI is a draft. You're the editor. If the SQL looks wrong, edit it or cancel. If it looks right, run it.

When It Works Well

Natural language queries work best when:

Table and column names are descriptive — user_signups beats tbl_us
Schema has comments — we use COMMENT ON metadata
Questions are specific — "revenue this month" vs "how's business?"
The schema is normalized — clean relationships = clean joins

When It Struggles

Be aware of the limitations:

Ambiguous schemas — if you have three tables that could contain "revenue," we might pick the wrong one
Complex business logic — "active users" might mean different things to different teams
Multi-step calculations — questions that require CTEs or window functions can be hit-or-miss
Very large schemas — 500+ tables makes the context window crowded

For these cases, just write SQL directly with /query. That's always an option and always accurate.

Tips for Better Results

If you want natural language mode to work well, invest in your schema:

Name things clearly — future humans (and AI) will thank you
Add column comments — COMMENT ON COLUMN orders.total IS 'Total in USD cents, before tax'
Document relationships — foreign keys aren't just for integrity, they're for context
Keep a glossary — pin a message in your warpcore channel explaining team-specific terms

The best natural language interface is a well-named schema. Everything else is optimization.

All posts