Data Integration
VeloDB Cloud supports three approaches to working with external data, depending on your use case.
| Approach | Use when | Documentation |
|---|---|---|
| Import | You want a managed pipeline — continuous or scheduled ingestion from databases, object storage, or event streams | Import |
| Catalog | You want to query external data in place without moving it | Add Catalog |
| Migration | You are moving an entire existing warehouse from Apache Doris or VeloDB in one shot | Migration |
Import
Import provides a managed ingestion pipeline with a visual interface. It supports:
- Databases — MySQL and PostgreSQL via CDC, with automatic table creation and offset tracking.
- Object storage — batch load from S3.
- Event streams — continuous ingestion from Kafka, Confluent, and Amazon MSK.
Use Import when data needs to arrive on a schedule or in real time, and you want VeloDB to manage the pipeline state.
Add Catalog
Catalog lets you register external data sources, such as Apache Hive, Iceberg, or object storage, and query them directly from VeloDB Cloud without physically moving the data. Use Catalog when you need federated queries across systems or want to avoid keeping a second copy of the data.
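As a rough illustration of what registering and querying a catalog can look like, the sketch below builds the two SQL statements involved. It assumes VeloDB Cloud accepts Apache Doris-style `CREATE CATALOG ... PROPERTIES (...)` syntax and three-part `catalog.database.table` names. The catalog name `hive_lake`, the metastore URI, and the `sales.orders` table are hypothetical placeholders, and the helper functions are not part of any VeloDB API.

```python
import re
from typing import Mapping

# Conservative identifier check so names can be embedded in SQL unquoted.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _check_identifier(name: str) -> str:
    if not _IDENTIFIER.match(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def _quote(value: str) -> str:
    # Reject quotes and backslashes instead of guessing at escaping rules.
    if "'" in value or "\\" in value:
        raise ValueError(f"unsupported character in property: {value!r}")
    return f"'{value}'"


def create_catalog_sql(name: str, properties: Mapping[str, str]) -> str:
    """Build a Doris-style CREATE CATALOG statement (assumed syntax)."""
    props = ",\n    ".join(
        f"{_quote(key)} = {_quote(value)}" for key, value in properties.items()
    )
    return f"CREATE CATALOG {_check_identifier(name)} PROPERTIES (\n    {props}\n);"


def qualified_table(catalog: str, database: str, table: str) -> str:
    """Return the catalog.database.table name used to query data in place."""
    return ".".join(_check_identifier(part) for part in (catalog, database, table))


# Hypothetical Hive Metastore catalog; replace the URI with your own endpoint.
ddl = create_catalog_sql(
    "hive_lake",
    {
        "type": "hms",
        "hive.metastore.uris": "thrift://metastore.example.internal:9083",
    },
)
query = f"SELECT COUNT(*) FROM {qualified_table('hive_lake', 'sales', 'orders')};"

print(ddl)
print(query)
```

The statements can then be run from any MySQL-protocol client connected to the warehouse. The query reads the external table where it lives, and no data is copied into VeloDB.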
Migration
Migration is a one-time, full-data move from an existing Apache Doris or VeloDB warehouse. It stages data in a temporary object-storage area and cleans up after itself once the migration completes.
Use Migration when you are onboarding from an existing Apache Doris or VeloDB warehouse and need to bring the full dataset across before going live.