Integration Overview

VeloDB integrations are grouped into the following categories: Lakehouse, Observability, Data Processing, BI, SQL Client, Data Source, and Data Ingestion. Additional tools are listed under More.

This list of VeloDB / Apache Doris integrations is updated continuously and is not yet complete. We welcome contributions of relevant integrations to help expand it. Contact Us to update the integration list.

Lakehouse

| Name | Description | Resources |
| --- | --- | --- |
| Apache Iceberg | Doris accesses Iceberg table data through various metadata services. In addition to reading data, Doris can also write to Iceberg tables. | Documentation |
| Apache Hudi | By connecting to the Hive Metastore, or to a metadata service compatible with it, Doris automatically obtains Hudi database and table information and can query the data. | Documentation |
| AWS Glue | Doris uses the AWS Glue Catalog to access Iceberg tables or Hive tables through a catalog created with CREATE CATALOG. | Documentation |
| Apache Paimon | Doris accesses Paimon table metadata through various metadata services and queries Paimon data. | Documentation |
| Apache Hive | By connecting to the Hive Metastore, or to a metadata service compatible with it, Doris automatically retrieves Hive database and table information for querying. | Documentation |
| BigQuery | BigQuery Catalog uses the Trino Connector compatibility framework to access BigQuery tables through the BigQuery Connector. | Documentation |
| Apache Kudu | Kudu Catalog uses the Trino Connector compatibility framework to access Kudu tables through the Kudu Connector. | Documentation |
| LakeSoul | Doris accesses and reads LakeSoul table data using metadata stored in PostgreSQL. | Documentation |
| MaxCompute | MaxCompute is an enterprise-level SaaS (Software as a Service) cloud data warehouse on Alibaba Cloud. | Documentation |
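
Several entries above rely on a catalog created with CREATE CATALOG. As a rough illustration, the sketch below renders such a statement for a Hive Metastore catalog. The catalog name and metastore address are hypothetical, and the property keys (`type` set to `hms`, and `hive.metastore.uris`) are given as an assumption about the Doris syntax; check the linked Documentation for the exact properties your metadata service needs.

```python
def render_create_catalog(name: str, properties: dict[str, str]) -> str:
    """Render a CREATE CATALOG statement from a catalog name and a property map."""

    def quote(value: str) -> str:
        # Escape backslashes and single quotes so the value stays inside the literal.
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"

    props = ",\n".join(
        f"    {quote(key)} = {quote(value)}" for key, value in properties.items()
    )
    return f"CREATE CATALOG {name} PROPERTIES (\n{props}\n);"


# Hypothetical catalog name and metastore address; the property keys are assumptions.
statement = render_create_catalog(
    "hive_demo",
    {"type": "hms", "hive.metastore.uris": "thrift://127.0.0.1:9083"},
)
print(statement)
```

The rendered statement can then be run from any SQL client connected to the cluster. The helper only builds text; it does not validate property names.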

Observability

| Name | Description | Resources |
| --- | --- | --- |
| OpenTelemetry | OpenTelemetry is an open-source observability framework for collecting logs, traces, and metrics. The OpenTelemetry Collector can export this data to Doris. | Documentation |
| Logstash | Logstash is a log ETL framework (collect, preprocess, send to storage systems) that supports custom output plugins to write data into storage systems. | Documentation |
| Beats | Beats is a family of lightweight data shippers, such as Filebeat, that supports custom output plugins to write collected data into storage systems such as Doris. | Documentation |
| Fluent Bit | Fluent Bit is a lightweight log processor and forwarder that supports output plugins to write data into storage systems such as Doris. | Documentation |

Data Processing

| Name | Description | Resources |
| --- | --- | --- |
| Apache Spark | The Spark Doris Connector reads data stored in Doris and writes data to Doris through Spark. | GitHub, Documentation |
| Apache Flink | The Flink Doris Connector reads data from and writes data to a Doris cluster through Flink. | GitHub, Documentation |
| dbt | The dbt-doris adapter is built on dbt-core and relies on the mysql-connector-python driver to connect to Doris. | Documentation |

BI


| Name | Description | Resources |
| --- | --- | --- |
| Tableau | Tableau is interactive data visualization software focused on business intelligence. | Documentation |
| Power BI | Power BI is an interactive data visualization product developed by Microsoft with a primary focus on business intelligence. | Documentation |
| QuickSight | Amazon QuickSight powers data-driven organizations with unified business intelligence (BI). | Documentation |
| Apache Superset | Apache Superset is an open-source data exploration platform. It supports a rich variety of data source connections and numerous visualization methods. | Documentation |
| FineBI | FineBI supports a wide range of data source connections, as well as multi-view analysis and management of tables. | Documentation |
| Smartbi | Smartbi is a collection of software services and application connectors that can connect to a variety of data sources, including Oracle, SQL Server, MySQL, and Doris, enabling users to integrate and cleanse their data easily. | Documentation |
| Quick BI | Quick BI is a data warehouse-based business intelligence tool that helps enterprises build visual analyses quickly. | Documentation |

SQL Client


| Name | Description | Resources |
| --- | --- | --- |
| DBeaver | DBeaver is a cross-platform database tool for developers, database administrators, analysts, and anyone who works with data. | Documentation |
| DataGrip | DataGrip is a powerful cross-platform database tool for relational and NoSQL databases from JetBrains. | Documentation |

Data Source


| Name | Description | Resources |
| --- | --- | --- |
| Apache Kafka | Doris integrates with Kafka via its efficient Routine Load for real-time streaming (CSV/JSON, Exactly-Once) and the Doris Kafka Connector for advanced formats. | GitHub, Documentation |
| Doris Kafka Connector | The Doris Kafka Connector writes data from Kafka topics into Doris through Kafka Connect and handles the advanced formats that Routine Load does not cover. | GitHub, Documentation |
| MySQL | Doris JDBC Catalog supports connecting to MySQL databases via the standard JDBC interface. | Documentation |
| PostgreSQL | Doris JDBC Catalog supports connecting to PostgreSQL databases via the standard JDBC interface. | Documentation |
| Amazon S3 | Doris supports loading S3 files using both asynchronous (S3 Load) and synchronous (TVF) methods. | Documentation |
| Azure | Doris supports loading Azure Storage files using both asynchronous (S3 Load) and synchronous (TVF) methods. | Documentation |
| Google Cloud Storage | Doris supports loading Google Cloud Storage files using both asynchronous (S3 Load) and synchronous (TVF) methods. | Documentation |
| MinIO | Doris supports loading MinIO files using both asynchronous (S3 Load) and synchronous (TVF) methods. | Documentation |
| HDFS | Doris supports loading HDFS files using both asynchronous (HDFS Load) and synchronous (TVF) methods. | Documentation |

Data Ingestion

| Name | Description | Resources |
| --- | --- | --- |
| Doris Streamloader | Doris Streamloader is a client tool designed for loading data into Apache Doris. Compared with single-threaded loading using curl, its concurrent loading reduces the load latency of large datasets. | Documentation |
| Apache SeaTunnel | SeaTunnel is an easy-to-use, high-performance distributed data integration platform that supports real-time synchronization of massive data. | Documentation |
| BladePipe | BladePipe is a real-time end-to-end data replication tool that moves data between 30+ databases, message queues, search engines, caches, real-time data warehouses, data lakes, and more, with ultra-low latency of less than 3 seconds. | Documentation |
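
The Doris Streamloader entry above contrasts concurrent loading with a single-threaded curl upload. The sketch below illustrates only that general idea, not how Doris Streamloader is actually implemented: rows are split into chunks and each chunk is sent on its own worker thread. The `send_chunk` callable, the label scheme, and `fake_send_chunk` are hypothetical; in a real setup `send_chunk` would issue an HTTP load request and return the number of rows the server reports as loaded.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence


def split_into_chunks(rows: Sequence[str], chunk_size: int) -> list[list[str]]:
    """Split rows into consecutive chunks of at most chunk_size rows."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [list(rows[i:i + chunk_size]) for i in range(0, len(rows), chunk_size)]


def load_concurrently(
    rows: Sequence[str],
    send_chunk: Callable[[str, bytes], int],
    chunk_size: int = 2,
    workers: int = 4,
) -> int:
    """Send every chunk in parallel and return the total rows reported as loaded."""
    chunks = split_into_chunks(rows, chunk_size)

    def send(indexed_chunk: tuple[int, list[str]]) -> int:
        index, chunk = indexed_chunk
        label = f"demo_load_{index}"  # hypothetical label scheme, one label per chunk
        body = ("\n".join(chunk) + "\n").encode("utf-8")
        return send_chunk(label, body)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(send, enumerate(chunks)))


def fake_send_chunk(label: str, body: bytes) -> int:
    # Stand-in for a real HTTP request: it only counts the lines it received.
    return body.count(b"\n")


csv_rows = ["1,alice", "2,bob", "3,carol", "4,dave", "5,erin"]
total = load_concurrently(csv_rows, fake_send_chunk)
print(total)
```

Passing the sender in as a callable keeps the chunking and concurrency logic testable without a running cluster. A production loader would also need retries and error handling for failed chunks.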

More

| Name | Description | Resources |
| --- | --- | --- |
| Doris Operator | Doris Operator implements the configuration, management, and scheduling of Doris on Kubernetes, based on Kubernetes CustomResourceDefinitions (CRD). | GitHub, Documentation |
| AutoMQ | AutoMQ is a cloud-native fork of Kafka that moves storage to object storage such as S3. | Documentation |
| DataX | The DataX Doriswriter plugin synchronizes data from various data sources, such as MySQL, Oracle, and SQL Server, into Doris using the Stream Load method. | Documentation |
| Kettle | The Kettle Doris Plugin writes data from other data sources to Doris through Stream Load in Kettle. | Documentation |
| Apache Kyuubi | Apache Kyuubi is a distributed, multi-tenant gateway that provides serverless SQL on data warehouses and lakehouses. | Documentation |