Version: 4.x

Monitoring

VeloDB Cloud provides monitoring and alerting so you can track the health and performance of your warehouse and clusters and react when something changes.

Open Monitoring from the Manage group in the left navigation. Monitoring has four sub-pages: Metrics, Alerts, Query Audit, and Usage.

Metrics

On the Metrics page you can:

  • View metrics by warehouse or by cluster.
  • Use Starred to pin the metrics you care about across warehouses and clusters so they display together.
  • Adjust the time selector to view historical data from the past 15 days.
  • Enable auto-refresh for near-real-time updates (5-second interval).

Metrics are split into two categories: Basic Metrics (physical resource utilization) and Service Metrics (query / workload performance).

Basic metrics

Basic metrics track the physical resource utilization of each node in the cluster. They help you judge whether the cluster was healthy in a given time range and whether historical or current queries are affecting performance. Both are useful inputs when planning to scale up, scale down, or optimize SQL.

Metric | What it shows
CPU Utilization | CPU utilization percentage across all nodes. Useful for finding quiet windows for scaling or other resource-heavy operations.
Memory Usage | Memory consumed by all nodes. Consistently high usage is a signal to scale up.
Memory Utilization | Memory utilization of all nodes. Consistently high utilization is a signal to scale up.
I/O Utilization | Disk I/O utilization. Consistently high values suggest adding more nodes to improve query performance.
Network Outbound Throughput | Average outbound throughput per node (MB/s). Queries that read data over the network are slower; configure caching to reduce network reads.
Network Inbound Throughput | Average inbound throughput per node (MB/s).
Cache Read Throughput | Cache read throughput (MB/s).
Cache Write Throughput | Cache write throughput (MB/s).

Support range:

Metric | Warehouse | Cluster
CPU Utilization
Memory Usage
Memory Utilization
I/O Utilization
Network Outbound Throughput
Network Inbound Throughput
Cache Read Throughput
Cache Write Throughput

Service metrics

Service metrics track query and workload behavior: how fast queries run, how many succeed, how write paths behave.

Metric | What it shows
Query Per Second (QPS) | Number of query requests per second. Peak QPS is a useful input when sizing a cluster.
Query Success Rate | Percentage of successful queries, updated per minute. Abnormal drops may indicate a cluster or node failure.
Dead Nodes | Number of dead nodes in the cluster.
Average Query Runtime | Average query time, updated per minute. Abnormal rises are a signal to investigate.
Query 99th Latency | Response time of the 99th-percentile query. Reflects the speed of slow queries.
Cache Hit Rate | Percentage of I/O operations served from cache. Low values suggest reviewing the cache policy or increasing cache space.
Remote Storage Read Throughput | Amount of data read from remote storage per unit time.
Sessions | Number of sessions for the warehouse (not split per cluster).
Load Rows Per Second | Rate at which records are being successfully written.
Load Bytes Per Second | Rate at which data volume is being written.
Finished Load Tasks | Number of load tasks completed in the recent period. Sharp changes may indicate a business anomaly.
Compaction Score | Data-file merging pressure. Higher score means more merging pressure.
Transaction Latency | Transaction latency of write tasks. Lower means data becomes queryable sooner.
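To make the relationship between these metrics concrete, the following minimal sketch derives QPS, success rate, average runtime, and 99th-percentile latency from a list of query records for one time window. The QueryRecord shape, the summarize function, and the nearest-rank percentile method are assumptions made for illustration; they are not how VeloDB Cloud computes or exports its metrics.

```python
from dataclasses import dataclass


@dataclass
class QueryRecord:
    # Hypothetical record shape, not a VeloDB Cloud export format.
    runtime_ms: float
    succeeded: bool


def summarize(records, window_seconds):
    """Compute QPS, success rate, average runtime, and p99 latency for one window."""
    n = len(records)
    if n == 0:
        return {"qps": 0.0, "success_rate": None, "avg_runtime_ms": None, "p99_ms": None}
    runtimes = sorted(r.runtime_ms for r in records)
    # Nearest-rank percentile: ceil(0.99 * n), computed with integers to avoid float rounding.
    rank = (99 * n + 99) // 100
    return {
        "qps": n / window_seconds,
        "success_rate": 100.0 * sum(r.succeeded for r in records) / n,
        "avg_runtime_ms": sum(runtimes) / n,
        "p99_ms": runtimes[rank - 1],
    }


# 100 queries in a 60-second window: 97 fast (10 ms) and 3 slow (900 ms), one of which failed.
records = [QueryRecord(10.0, True) for _ in range(97)]
records += [QueryRecord(900.0, True), QueryRecord(900.0, True), QueryRecord(900.0, False)]
stats = summarize(records, window_seconds=60)
print(stats)
```

In this sample the average runtime stays low because most queries are fast, while the 99th-percentile value is set by the slow queries. This is why the two metrics are worth reading side by side.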

Support range:

Metric | Warehouse | Cluster
Query Per Second
Query Success Rate
Dead Nodes
Average Query Runtime
Query 99th Latency
Cache Hit Rate
Remote Storage Read Throughput
Sessions
Load Rows Per Second
Load Bytes Per Second
Finished Load Tasks
Compaction Score
Transaction Latency

Alerts

VeloDB Cloud provides monitoring and alerting at no additional charge, except for SMS alert notifications. You can configure alert rules so that you are notified when cluster monitoring metrics change.

View alert rules

The Alerts list shows existing alert rules and their current alerting status: a red dot means the rule is firing; a green dot means the rule is not triggered.

One-click alerts

Click Enable One-Click Alert to set up a basic rule set automatically. The rule set applies to existing warehouses and clusters as well as any you create later.

Create or edit an alert rule

Click New Alert Rule or copy an existing rule.

An alert rule has four parts:

Part | Description
Rule Name | A name unique within the warehouse.
Cluster | Clusters the rule applies to. When a cluster is deleted, its rules are not deleted, but they are invalidated.
Conditions | One or more metric thresholds, combined with and / or.
In Last | How long the conditions must hold before the rule fires. Balance timeliness and noise.
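The interaction between Conditions and In Last can be sketched as follows. This is one plausible reading, assuming one metric sample per evaluation interval and that the combined condition must hold for every sample in the window. The Condition class, rule_fires function, and metric names are hypothetical, and the actual evaluation logic in VeloDB Cloud may differ.

```python
from dataclasses import dataclass
import operator

# Comparison operators a threshold may use (assumed set, for illustration only).
OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


@dataclass
class Condition:
    metric: str       # hypothetical metric key, e.g. "cpu"
    op: str           # one of the keys in OPS
    threshold: float

    def holds(self, sample):
        return OPS[self.op](sample[self.metric], self.threshold)


def rule_fires(conditions, combine, samples, in_last):
    """Return True if the combined conditions held for each of the last `in_last` samples."""
    if len(samples) < in_last:
        return False  # not enough history yet
    join = all if combine == "and" else any
    return all(join(c.holds(s) for c in conditions) for s in samples[-in_last:])


# One sample per minute, oldest first.
samples = [
    {"cpu": 70, "success": 99.9},
    {"cpu": 92, "success": 99.9},
    {"cpu": 95, "success": 97.0},
    {"cpu": 96, "success": 96.5},
]
conditions = [Condition("cpu", ">", 90), Condition("success", "<", 98)]

print(rule_fires(conditions, "and", samples, in_last=2))  # True
print(rule_fires(conditions, "and", samples, in_last=3))  # False
print(rule_fires(conditions, "or", samples, in_last=3))   # True
```

With "and", the rule fires for a 2-sample window but not for a 3-sample window, because the success rate had not yet dropped three samples ago. A longer In Last value suppresses short spikes but also delays the alert.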

Notification channels

Every alert rule can push to one or more channels; each channel pushes the alert message independently.

In-site notification and Email — pick the users to notify.

SMS — pick users or enter phone numbers directly.

WeCom — add a group robot and paste its webhook URL.

  1. On WeCom for PC, open the group that should receive alerts.
  2. Right-click the group and click Add Group Bot, then Create a Bot.
  3. Give the bot a name and click Add.
  4. Copy the webhook URL into the alert channel configuration.

Note To restrict message sources, configure a webhook IP allowlist. The VeloDB Cloud server IP is 3.222.235.198.
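Before relying on the channel, it can help to confirm that the webhook URL works. The sketch below builds a text message in the JSON shape that WeCom group bots accept (msgtype and text.content). The key in the URL is a placeholder and the build_wecom_request helper is hypothetical. The request is built but not sent unless you uncomment the last line.

```python
import json
import urllib.request


def build_wecom_request(webhook_url, text):
    """Build (but do not send) a POST request carrying a WeCom group-bot text message."""
    payload = {"msgtype": "text", "text": {"content": text}}
    return urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder URL: substitute the webhook URL copied from the bot.
req = build_wecom_request(
    "https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=YOUR-KEY",
    "Test message before enabling VeloDB Cloud alerts",
)
print(req.get_method(), req.full_url)

# To actually send the test message:
# urllib.request.urlopen(req, timeout=10)
```

If the bot already restricts senders to the VeloDB Cloud server IP, a test sent from your own machine is likely to be rejected, so run the test before configuring the allowlist.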

Lark — add a custom bot and paste its webhook URL.

  1. In the target group, click Settings → BOTs → Add Bot → Custom Bot.
  2. Give the bot a name and description, click Next.
  3. Copy the webhook URL into the alert channel configuration.

Note To restrict message sources, configure a webhook IP allowlist. The VeloDB Cloud server IP is 3.222.235.198.

DingTalk — add a custom robot and paste its webhook URL. See DingTalk's guide for the full procedure. In summary:

  1. In the target DingTalk group, open Group Settings → Group Assistant → Add Robot → Custom.
  2. Set a profile picture, name, and security settings (use Custom Keywords and enter alert).
  3. Accept the terms and click Finished.
  4. Copy the webhook URL into the alert channel configuration.

Note To restrict message sources, configure a webhook IP allowlist. The VeloDB Cloud server IP is 3.222.235.198.
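With the Custom Keywords security setting, DingTalk delivers a message only if its text contains at least one configured keyword, which is why step 2 uses alert. The sketch below shows that check together with the text-message payload shape used by DingTalk custom robots. The function names are hypothetical, and the case-sensitive match is a conservative assumption rather than documented DingTalk behavior.

```python
def passes_keyword_check(message, keywords=("alert",)):
    """Return True if the message contains at least one configured keyword.

    Case-sensitive matching is assumed here as the stricter interpretation.
    """
    return any(keyword in message for keyword in keywords)


def build_dingtalk_payload(text):
    """Build the JSON body for a DingTalk custom-robot text message."""
    return {"msgtype": "text", "text": {"content": text}}


ok_message = "CPU alert: utilization above 90% for 5 minutes"
bad_message = "CPU utilization above 90% for 5 minutes"

print(passes_keyword_check(ok_message))   # True
print(passes_keyword_check(bad_message))  # False
print(build_dingtalk_payload(ok_message))
```

When you send your own test message to the webhook, include the keyword in the text, or the robot will not deliver it.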

Alert history

You can view the alert firing history on the Alerts page and filter it by time, rule, or cluster.

Query Audit

Query Audit is a one-stop tool for auditing and analyzing queries that have run in the warehouse. Use it to find poor-performing queries, investigate trends, and diagnose individual problems.

Filter historical queries in the list view, and use the List Selection to add more dimensions to your filter.

Click a Query ID to open the query detail page. If profile capture was enabled, the profile is available there.

Note Non-query statements and failed statements do not have a Query ID.

Usage

Usage shows how compute, cache, and storage are being consumed inside the current warehouse, so you can see where the cost is going.

Open Monitoring → Usage from the left navigation.

For organization-level billing and payment channels, see Billing.