How Natural Language to SQL Actually Works
A technical deep dive into how warpcore translates plain English questions into accurate SQL queries — and why we always show you the SQL before running it.
When someone types /ask How many orders shipped last week? in Discord, warpcore needs to turn that into valid SQL against your specific database. Here's how that works under the hood.
The Problem Is Harder Than It Looks
Consider a simple question: "What's our revenue this month?"
Depending on your schema, the correct SQL could be any of:
-- Simple sum
SELECT SUM(amount) FROM payments WHERE created_at >= date_trunc('month', now());
-- With currency conversion
SELECT SUM(amount * exchange_rate) FROM transactions
WHERE status = 'completed' AND timestamp >= date_trunc('month', now());
-- From an invoices table
SELECT SUM(total) FROM invoices
WHERE paid_at IS NOT NULL AND paid_at >= date_trunc('month', now());
The "right" answer depends entirely on your schema — table names, column names, relationships, and business logic. Generic SQL generation doesn't cut it.
Our Approach: Schema-Aware Translation
warpcore doesn't guess. It reads your database schema and uses it as context for every translation.
Step 1: Schema Introspection
When you connect a database, warpcore catalogs:
- Tables and columns — names, types, nullable flags
- Relationships — foreign keys, join paths
- Indexes — to suggest performant queries
- Comments — if you've documented your schema, we use that context
Well-documented schemas produce dramatically better natural language results. A column named
amtwith no comment is ambiguous. A column namedamount_centswith a comment "Order total in USD cents" is crystal clear.
Step 2: Query Intent Classification
Before generating SQL, we classify the type of question:
| Intent | Example | SQL Pattern |
|---|---|---|
| Count | "How many users signed up?" | SELECT COUNT(*) |
| Aggregation | "What's the average order value?" | SELECT AVG(column) |
| Lookup | "What's the status of order #1234?" | SELECT ... WHERE id = ? |
| Trend | "Show signups by day this week" | GROUP BY date_trunc(...) |
| Comparison | "Which plan has more users?" | GROUP BY ... ORDER BY ... |
Step 3: SQL Generation
With schema context and classified intent, we generate the SQL. The key principles:
- Always qualify columns — no ambiguous references
- Use the simplest valid query — no unnecessary joins or subqueries
- Respect types — don't compare strings to integers
- Add sensible limits — default to
LIMIT 100to prevent runaway queries
Step 4: Human Review
This is the most important step: we show you the SQL before executing it.
> /ask How many users signed up this week?
Generated SQL:
SELECT COUNT(*) AS signups
FROM users
WHERE created_at >= date_trunc('week', now())
Run this query? [✓ Execute] [✏️ Edit] [✗ Cancel]
The AI is a draft. You're the editor. If the SQL looks wrong, edit it or cancel. If it looks right, run it.
When It Works Well
Natural language queries work best when:
- Table and column names are descriptive —
user_signupsbeatstbl_us - Schema has comments — we use
COMMENT ONmetadata - Questions are specific — "revenue this month" vs "how's business?"
- The schema is normalized — clean relationships = clean joins
When It Struggles
Be aware of the limitations:
- Ambiguous schemas — if you have three tables that could contain "revenue," we might pick the wrong one
- Complex business logic — "active users" might mean different things to different teams
- Multi-step calculations — questions that require CTEs or window functions can be hit-or-miss
- Very large schemas — 500+ tables makes the context window crowded
For these cases, just write SQL directly with /query. That's always an option and always accurate.
Tips for Better Results
If you want natural language mode to work well, invest in your schema:
- Name things clearly — future humans (and AI) will thank you
- Add column comments —
COMMENT ON COLUMN orders.total IS 'Total in USD cents, before tax' - Document relationships — foreign keys aren't just for integrity, they're for context
- Keep a glossary — pin a message in your warpcore channel explaining team-specific terms
The best natural language interface is a well-named schema. Everything else is optimization.