VeloDB Cloud 26.x · Apache Doris 4.x (≤ 4.0 supported) · "Since X.Y" tags refer to Doris versions
Efficient Deduplication
Deduplication is one of the most resource-intensive operations in analytical workloads. Apache Doris provides two dedicated data types as alternatives to COUNT DISTINCT, completing deduplication with lower memory and latency cost: choose BITMAP when you need exact results, and choose HLL when you can accept a 1%–2% error in exchange for smaller storage.
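As a rough illustration of the trade-off, the sketch below stores both a BITMAP and an HLL column in one aggregate table and queries each. The table name `page_uv`, its columns, and the sample values are hypothetical, and the single-bucket, single-replica settings are assumptions suited only to a local test cluster.

```sql
-- Hypothetical table: one row per (dt, page_id), with visitors pre-aggregated
-- into an exact BITMAP column and an approximate HLL column.
CREATE TABLE page_uv (
    dt          DATE,
    page_id     INT,
    user_bitmap BITMAP BITMAP_UNION,  -- exact distinct set
    user_hll    HLL    HLL_UNION      -- approximate sketch, smaller on disk
)
AGGREGATE KEY (dt, page_id)
DISTRIBUTED BY HASH(page_id) BUCKETS 1
PROPERTIES ("replication_num" = "1");  -- assumption: single-node test setup

-- User 1001 visits twice; both columns absorb the duplicate at load time.
INSERT INTO page_uv VALUES
    ('2024-01-01', 1, TO_BITMAP(1001), HLL_HASH(1001)),
    ('2024-01-01', 1, TO_BITMAP(1002), HLL_HASH(1002)),
    ('2024-01-01', 1, TO_BITMAP(1001), HLL_HASH(1001)),
    ('2024-01-01', 1, TO_BITMAP(1003), HLL_HASH(1003));

-- Exact distinct count from the BITMAP column,
-- approximate distinct count from the HLL column.
SELECT
    dt,
    page_id,
    BITMAP_UNION_COUNT(user_bitmap) AS uv_exact,
    HLL_UNION_AGG(user_hll)         AS uv_approx
FROM page_uv
GROUP BY dt, page_id;
```

Here `uv_exact` is the exact number of distinct users, while `uv_approx` is an estimate whose error only becomes noticeable at high cardinalities. Keeping both columns side by side is purely for comparison; in practice a table would usually carry only the one that matches its accuracy requirement.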