Integration Overview

VeloDB integrations are grouped into the following categories: BI, Lakehouse, Observability, SQL Client, Data Source, Data Ingestion, and Data Processing.

This list of VeloDB / Apache Doris integrations is continuously updated and is not yet complete. Contributions of relevant VeloDB / Apache Doris integrations are welcome; contact us to update the integration list.

Lakehouse

Name	Description	Resources
Apache Iceberg	Doris supports accessing Iceberg table data through various metadata services. In addition to reading data, Doris also supports writing to Iceberg tables.	Documentation
Apache Hudi	By connecting to the Hive Metastore, or a metadata service compatible with the Hive Metastore, Doris can automatically obtain Hudi's database and table information and perform data queries.	Documentation
AWS Glue	Doris can use the AWS Glue Catalog to access Iceberg tables or Hive tables through CREATE CATALOG.	Documentation
Apache Paimon	Doris currently supports accessing Paimon table metadata through various metadata services and querying Paimon data.	Documentation
Apache Hive	By connecting to Hive Metastore or metadata services compatible with Hive Metastore, Doris can automatically retrieve Hive database and table information for data querying.	Documentation
BigQuery	BigQuery Catalog uses the Trino Connector compatibility framework to access BigQuery tables through the BigQuery Connector.	Documentation
Apache Kudu	Kudu Catalog uses the Trino Connector compatibility framework to access Kudu tables through the Kudu Connector.	Documentation
LakeSoul	Doris supports accessing and reading LakeSoul table data using metadata stored in PostgreSQL.	Documentation
MaxCompute	MaxCompute is an enterprise-level SaaS (Software as a Service) cloud data warehouse on Alibaba Cloud.	Documentation
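
Most of the entries above are reached through a CREATE CATALOG statement. As a rough illustration only, the following Python sketch renders such a statement from a map of properties. The helper name `render_create_catalog`, the catalog name `hive_demo`, and the metastore address are hypothetical, and the property keys `type` and `hive.metastore.uris` are assumed here for a Hive Metastore catalog; check the linked documentation for the exact properties your Doris version expects.

```python
def render_create_catalog(name: str, properties: dict[str, str]) -> str:
    """Render a CREATE CATALOG statement from a catalog name and a property map."""
    # Only plain identifiers are accepted as catalog names in this sketch.
    if not name.isidentifier():
        raise ValueError(f"unsafe catalog name: {name!r}")
    rendered = []
    for key, value in properties.items():
        # This sketch does no escaping, so refuse characters that would need it.
        if any(ch in key + value for ch in "'\\"):
            raise ValueError("single quotes and backslashes are not supported here")
        rendered.append(f"    '{key}' = '{value}'")
    body = ",\n".join(rendered)
    return f"CREATE CATALOG {name} PROPERTIES (\n{body}\n);"


# Hypothetical catalog name and metastore address; property keys are assumptions.
hive_catalog_sql = render_create_catalog(
    "hive_demo",
    {
        "type": "hms",
        "hive.metastore.uris": "thrift://metastore.example.com:9083",
    },
)
print(hive_catalog_sql)
```

The generated statement can then be run from any SQL client connected to Doris. Other catalog types would differ only in the property map passed in.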

Observability

Name	Description	Resources
OpenTelemetry	OpenTelemetry is an open-source observability framework for collecting and exporting logs, traces, and metrics, which can be written to storage systems such as Doris.	Documentation
Logstash	Logstash is a log ETL framework (collect, preprocess, send to storage systems) that supports custom output plugins to write data into storage systems.	Documentation
Beats	Beats is a family of lightweight data shippers from Elastic that collect data such as logs and metrics and send it to storage systems such as Doris.	Documentation
Fluent Bit	Fluent Bit is a lightweight log processor and forwarder that collects logs and metrics and sends them to storage systems such as Doris.	Documentation

Data Processing

Name	Description	Resources
Apache Spark	The Spark Doris Connector supports reading data stored in Doris and writing data to Doris through Spark.	GitHub Documentation
Apache Flink	The Flink Doris Connector is used to read data from and write data to a Doris cluster through Flink.	GitHub Documentation
dbt	The dbt-doris adapter is built on dbt-core and relies on the mysql-connector-python driver to convert data to Doris.	Documentation

BI

Name	Description	Resources
Tableau	Tableau is interactive data visualization software focused on business intelligence.	Documentation
Power BI	Power BI is an interactive data visualization product developed by Microsoft, with a primary focus on business intelligence.	Documentation
QuickSight	Amazon QuickSight powers data-driven organizations with unified business intelligence (BI).	Documentation
Apache Superset	Apache Superset is an open-source data exploration platform. It supports a rich variety of data source connections and numerous visualization methods.	Documentation
FineBI	FineBI supports connections to a wide range of data sources, as well as the analysis and management of tables across multiple views.	Documentation
SmartBI	Smartbi is a collection of software services and application connectors that can connect to a variety of data sources, including Oracle, SQL Server, MySQL, and Doris, enabling users to integrate and cleanse their data easily.	Documentation
QuickBI	Quick BI is a data warehouse-based business intelligence tool that helps enterprises set up impressive visual analyses quickly.	Documentation

SQL Client

Name	Description	Resources
DBeaver	DBeaver is a cross-platform database tool for developers, database administrators, analysts, and anyone who works with data.	Documentation
DataGrip	DataGrip is a powerful cross-platform database tool for relational and NoSQL databases from JetBrains.	Documentation

Data Source

Name	Description	Resources
Apache Kafka	Doris integrates with Kafka via its efficient Routine Load for real-time streaming (CSV/JSON, Exactly-Once) and the Doris Kafka Connector for advanced formats.	GitHub Documentation
Doris Kafka Connector	The Doris Kafka Connector is a Kafka Connect sink that reads data from Kafka topics and writes it to Doris, including advanced formats that Routine Load does not cover.	GitHub Documentation
MySQL	Doris JDBC Catalog supports connecting to MySQL databases via the standard JDBC interface.	Documentation
PostgreSQL	Doris JDBC Catalog supports connecting to PostgreSQL databases via the standard JDBC interface.	Documentation
Amazon S3	Doris supports loading S3 files using both asynchronous (S3 Load) and synchronous (TVF) methods.	Documentation
Azure	Doris supports loading Azure Storage files using both asynchronous (S3 Load) and synchronous (TVF) methods.	Documentation
Google Cloud Storage	Doris supports loading Google Cloud Storage files using both asynchronous (S3 Load) and synchronous (TVF) methods.	Documentation
MinIO	Doris supports loading MinIO files using both asynchronous (S3 Load) and synchronous (TVF) methods.	Documentation
HDFS	Doris supports loading HDFS files using both asynchronous (HDFS Load) and synchronous (TVF) methods.	Documentation
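
Several rows above mention the synchronous TVF method for object storage. A minimal sketch of how such a load statement might be assembled is shown below: an INSERT INTO ... SELECT that reads from the S3 table-valued function. The helper name `render_s3_tvf_load`, the table name, the bucket path, and the credentials are hypothetical, and the property keys (`uri`, `s3.endpoint`, `s3.region`, `s3.access_key`, `s3.secret_key`, `format`) are assumptions that may differ between Doris versions, so treat the linked documentation as the reference.

```python
def render_s3_tvf_load(target_table: str, tvf_properties: dict[str, str]) -> str:
    """Render an INSERT INTO ... SELECT statement that reads from the S3 TVF."""
    # Accept only plain identifiers, optionally qualified as db.table.
    if not all(part.isidentifier() for part in target_table.split(".")):
        raise ValueError(f"unsafe table name: {target_table!r}")
    rendered = []
    for key, value in tvf_properties.items():
        # This sketch does no escaping, so refuse characters that would need it.
        if any(ch in key + value for ch in '"\\'):
            raise ValueError("double quotes and backslashes are not supported here")
        rendered.append(f'    "{key}" = "{value}"')
    arguments = ",\n".join(rendered)
    return f"INSERT INTO {target_table}\nSELECT * FROM S3(\n{arguments}\n);"


# All values are placeholders; the property keys are assumptions.
load_sql = render_s3_tvf_load(
    "demo_db.events",
    {
        "uri": "s3://example-bucket/events/part-0001.csv",
        "s3.endpoint": "s3.us-east-1.amazonaws.com",
        "s3.region": "us-east-1",
        "s3.access_key": "EXAMPLE_ACCESS_KEY",
        "s3.secret_key": "EXAMPLE_SECRET_KEY",
        "format": "csv",
    },
)
print(load_sql)
```

Because the statement runs synchronously, the client that submits it learns the outcome when the statement returns, which is the main contrast with the asynchronous S3 Load path.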

Data Ingestion

Name	Description	Resources
Doris Streamloader	Doris Streamloader is a client tool designed for loading data into Apache Doris. Compared with single-threaded loading using curl, it reduces the load latency of large datasets through concurrent loading.	Documentation
Apache SeaTunnel	SeaTunnel is an easy-to-use, high-performance distributed data integration platform that supports real-time synchronization of massive data.	Documentation
BladePipe	BladePipe is a real-time end-to-end data replication tool that moves data between 30+ databases, message queues, search engines, caches, real-time data warehouses, data lakes, and more, with a latency of less than 3 seconds.	Documentation
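
For context on what a single Stream Load request looks like (the kind of request a tool such as Doris Streamloader issues concurrently), here is a minimal Python sketch that builds one HTTP request without sending it. The host, port, database, table, credentials, and label are placeholders, the helper name `build_stream_load_request` is hypothetical, and the header names are assumptions based on the Stream Load HTTP interface; verify them against the linked documentation.

```python
import base64
import urllib.request


def build_stream_load_request(fe_host: str, http_port: int, database: str,
                              table: str, user: str, password: str,
                              payload: bytes, label: str) -> urllib.request.Request:
    """Build (but do not send) an HTTP PUT request for one Stream Load job."""
    url = f"http://{fe_host}:{http_port}/api/{database}/{table}/_stream_load"
    # HTTP basic authentication: base64 of "user:password".
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    headers = {
        "Authorization": f"Basic {token}",
        "Expect": "100-continue",
        "label": label,            # assumed header: a unique label per load job
        "format": "csv",           # assumed header: payload format
        "column_separator": ",",   # assumed header: CSV delimiter
    }
    return urllib.request.Request(url, data=payload, headers=headers, method="PUT")


# Placeholder connection details for illustration only.
request = build_stream_load_request(
    "fe.example.com", 8030, "demo_db", "demo_table",
    "load_user", "secret", b"1,alice\n2,bob\n", "demo_label_001",
)
print(request.get_method(), request.full_url)
```

Sending is deliberately left out. With curl, Stream Load is usually run with redirect following enabled, so a real client would likely need to handle a redirect to another node while resending credentials; that behavior is an assumption to confirm for your deployment.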

More

Name	Description	Resources
AutoMQ	AutoMQ is a cloud-native fork of Kafka that separates storage into object storage such as S3.	Documentation
DataX	The DataX Doriswriter plugin supports synchronizing data from various data sources, such as MySQL, Oracle, and SQL Server, into Doris using the Stream Load method.	Documentation
Kettle	The Kettle Doris Plugin is used to write data from other data sources to Doris through Stream Load in Kettle.	Documentation
Apache Kyuubi	Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on Data Warehouses and Lakehouses.	Documentation