User Guide
This page is for VeloDB users. Your administrator has already deployed the VeloDB MCP Service for you; you only need to know:
- How to connect the service to an AI agent (Claude Code / Cursor / Codex)
- How to query data through an AI conversation
For defining and managing the semantic model (administrators), see Semantic Model.
Connect an AI Agent (Claude Code / Cursor / Codex)
Before you start, get the following:
-
MCP Connection URL: open the Connection page → Connection Methods → copy the MCP Connection URL.
-
VeloDB username and password: used for authentication (format
username:password, e.g.admin:admin) -
Workspace name: the workspace that holds the semantic model (default
example)
Connect Claude Code
Step 1: Add the MCP Server
claude mcp add --transport http velodb https://<warehouse-id>.<region>.aws.velodb.io/mcp \
--header "Authorization: Bearer admin:admin"
Step 2: Verify the connection
In Claude Code, type:
Check service health
The AI automatically calls check_service_health and returns the VeloDB connection status and the status of every workspace. If you see "velodb": "connected", the connection succeeded.
Step 3: Start querying
List all available metrics
The AI follows a three-step flow: list_metrics → list_dimensions_for_metric → query_metric.
Connect Cursor / VS Code
Create a .cursor/mcp.json in your project root, or the equivalent MCP configuration for VS Code:
{
"mcpServers": {
"velodb": {
"url": "https://<warehouse-id>.<region>.aws.velodb.io/mcp",
"transport": "streamable-http",
"headers": {
"Authorization": "Bearer admin:admin"
}
}
}
}
Available Tools (10 total)
Startup & health check (call these first when a conversation begins)
| Tool | Parameters | Purpose |
|---|---|---|
get_query_guide | — | Call this first. Returns the complete workflow guide: when to use the semantic layer vs. raw SQL, the order of tool calls, and query syntax. Skipping it makes the agent take detours. |
check_service_health | detail (optional) | Returns VeloDB connectivity plus the status of each workspace (healthy / no_models / not_ready) and its metric count. Call it at the start of every conversation. |
Semantic queries (preferred — generates correct SQL automatically)
| Tool | Parameters | Purpose |
|---|---|---|
list_metrics | workspace (required), page_size, page_token | Lists all available metrics (name + description). The agent matches a metric to the user's intent. |
list_dimensions_for_metric | workspace (required), metric_name (required) | Returns the metric's available group_by dimensions (name + type + description), so group_by does not fail. |
query_metric | workspace (required), metrics (required), group_by, where, order_by, limit, having, database, max_rows | The core query tool. Compiles and executes a semantic query. Anything involving counts, sums, ratios, ranking, or trends must go through here. Supports HAVING to filter aggregated results. database sets the target database; max_rows hard-caps the returned rows (0 = default 10000). |
reload_semantic_layer | workspace (required) | Manually triggers a semantic-layer reload. Use it when an administrator has changed a YAML model and committed it and you need it to take effect immediately; otherwise the layer polls every 60s. |
Low-level metadata & raw SQL (use only in the two cases below)
Two raw-SQL cases:
-
Case A — semantic layer unavailable:
check_service_healthreports that no workspace ishealthy(all are no_models / not_ready). All queries must then go throughlist_databases→list_tables→describe_table→execute_query. -
Case B — no matching metric: the semantic layer is healthy, but no metric in
list_metricsmatches the user's intent. Fall back to the raw-SQL path in the same way.
| Tool | Parameters | Purpose |
|---|---|---|
list_databases | page_size, page_token | Lists all databases. Useful while exploring, but not a substitute for list_metrics. |
list_tables | database (required), like, page_size, page_token | Lists table names in a database. Returns table names only, no column information. Use describe_table to see columns. |
describe_table | database (required), table (required), detail_level | Shows the physical table structure (column names, types). detail_level: names/summary/full. |
execute_query | sql (required), database, max_rows | Raw-SQL fallback. Read-only only (SELECT/SHOW/DESCRIBE/EXPLAIN). If the semantic layer has a matching metric, prefer query_metric. |
Recommended Agent Workflow
Standard flow for every conversation:
-
get_query_guide()— get the full guide (so the agent knows which tool to use when) -
check_service_health()— confirm VeloDB connectivity and see the available workspaces -
Routing decision:
If the user asks "how many", "calculate", "trend", or "ratio" → use the semantic layer (4-5-6)
-
If the user asks "show the table structure" or "what databases are there" → use the metadata tools
-
Semantic layer unavailable (no workspace is healthy) → skip 4-6 and go straight to 7
-
list_metrics(workspace)— list metrics and match the user's intent -
list_dimensions_for_metric(workspace, metric_name)— get the available dimensions -
query_metric(workspace, metrics, group_by, ...)— run the query -
Fallback: if step 4 finds no matching metric, use
list_databases→list_tables→describe_table→execute_queryfor the raw-SQL path
Hard rules:
-
Semantic layer healthy + a matching metric → never bypass
query_metricto write SQL directly withexecute_query. -
Semantic layer healthy + no matching metric → first tell the user "the semantic layer does not cover this query", then use
execute_query, and attach a warning to the result.
Conversation Example
User: What was the total order amount by channel last month?
AI reasoning:
1. get_query_guide → got the guide ✓
2. check_service_health → workspace "example" healthy, 5 metrics ✓
3. Involves "total amount" and "channel" → use the semantic layer
4. list_metrics(example) → total_amount (total order amount) ✓
5. list_dimensions_for_metric(example, total_amount) → channel ✓
6. query_metric(example, metrics=[total_amount], group_by=[channel],
where="order_date >= '2026-04-30'")
AI reply:
channel total_amount
WEB 1,096.00
APP 2,396.00
MINI 298.00
Query Capabilities
Once connected, you ask questions in plain language in Claude Code, and the AI calls the matching MCP tool automatically.
Metadata Discovery
| What you want | Just say | Tool called |
|---|---|---|
| See the databases | "List all databases" | list_databases |
| See the tables in a database | "List the tables in the dw database" | list_tables |
| See a table's structure | "Show the columns of the orders table" | describe_table |
SQL Queries
| What you want | Just say | Tool called |
|---|---|---|
| Query data | "Show the top 10 orders" | execute_query |
| Aggregate analysis | "Count orders by status" | execute_query |
execute_query allows read-only SQL only (SELECT / SHOW / DESCRIBE / EXPLAIN).
Semantic-Layer Queries (requires an administrator-configured semantic model)
If your VeloDB cluster has a semantic layer configured, the agent completes a query in three steps:
-
list_metrics— list all available metrics (name + description) -
list_dimensions_for_metric— see the metric's available dimensions (name + type + description) -
query_metric— run the semantic query (supportshavingto filter aggregated results)
| What you want | Just say | Tool called |
|---|---|---|
| See the metrics | "List all metrics" | list_metrics |
| Check a revenue trend | "Show revenue for the past 7 days, grouped by day" | query_metric |
| See available dimensions | "Which dimensions can total_amount be analyzed by?" | list_dimensions_for_metric |
| Check service status | "Check service health" | check_service_health |
Every semantic tool requires the workspace parameter ("example" is available by default).
Query Examples
Example 1: Explore the data structure
You: List all databases
AI: [calls list_databases]
dw
You: What tables are in the dw database?
AI: [calls list_tables]
orders, users, products, dim_date
You: Show the structure of the orders table
AI: [calls describe_table]
Column Type Comment
order_id BIGINT Order ID
user_id BIGINT User ID
amount DECIMAL Amount
channel VARCHAR Channel
status VARCHAR Status
order_date DATE Order date
Example 2: Query data
You: Show the order count and total amount by status for the past week
AI: [calls execute_query]
status count total_amount
completed 1234 567890.00
pending 567 89012.00
cancelled 89 12345.00
Example 3: Semantic metrics
You: List all available metrics
AI: [calls list_metrics]
total_amount Total order amount
order_count Order count
avg_amount Average order value
unique_users Distinct ordering users
user_count User count
You: Show total order amount by channel for the past 7 days
AI: [calls list_dimensions_for_metric] → channel
[calls query_metric]
channel total_amount
Online 12345.00
Offline 15678.00
...
FAQ
Q: The connection fails with a 401
Make sure the token format is correct: username:password (separated by an ASCII colon).
Q: A semantic query returns "workspace not found"
Make sure the workspace name is correct; "example" is available by default. Sign in to the WebUI to see the available workspaces.
Q: Can it perform writes?
No. execute_query allows read-only SQL only (SELECT / SHOW / DESCRIBE / EXPLAIN).
Q: Queries are slow
For large data volumes, ask the AI to add LIMIT or narrow the time range.
Q: Which metrics can the semantic layer use?
Ask the AI to "list all metrics" (it calls list_metrics), or sign in to the WebUI to see the metrics configured for each workspace.
Next Steps
- Semantic Model — Administrators: define entities, dimensions, measures, and metrics in YAML.