CREATE-MATERIALIZED-VIEW
Name
CREATE MATERIALIZED VIEW
Description
This statement is used to create a materialized view.
This operation is an asynchronous operation. After the submission is successful, you need to view the job progress through SHOW ALTER TABLE MATERIALIZED VIEW. After displaying FINISHED, you can use the desc [table_name] all
command to view the schema of the materialized view.
Syntax:
CREATE MATERIALIZED VIEW [MV name] as [query]
[PROPERTIES ("key" = "value")]
Note:
-
MV name
: The name of the materialized view, required. Materialized view names for the same table cannot be repeated. -
query
: The query statement used to construct the materialized view, the result of the query statement is the data of the materialized view. Currently supported query formats are:SELECT select_expr[, select_expr ...] FROM [Base view name] GROUP BY column_name[, column_name ...] ORDER BY column_name[, column_name ...]
The syntax is the same as the query syntax.
select_expr
: All columns in the schema of the materialized view.- Only supports single column without expression calculation, aggregate column.
- The aggregate function currently only supports three types of SUM, MIN, and MAX, and the parameter of the aggregate function can only be a single column without expression calculation.
- Contains at least one single column.
- All involved columns can only appear once.
base view name
: The original table name of the materialized view, required.- Must be a single table and not a subquery
group by
: The grouping column of the materialized view, optional.- If not filled, the data will not be grouped.
order by
: the sorting column of the materialized view, optional.- The declaration order of the sort column must be the same as the column declaration order in select_expr.
- If order by is not declared, the sorting column is automatically supplemented according to the rules. If the materialized view is an aggregate type, all grouping columns are automatically supplemented as sort columns. If the materialized view is of a non-aggregate type, the first 36 bytes are automatically supplemented as the sort column.
- If the number of auto-supplemented sorts is less than 3, the first three are used as the sort sequence. If query contains a grouping column, the sorting column must be the same as the grouping column.
-
Properties
Declare some configuration of the materialized view, optional.
PROPERTIES ("key" = "value", "key" = "value" ...)
The following configurations can be declared here:
short_key: The number of sorting columns. timeout: The timeout for materialized view construction.
Example
The Base table structure is
mysql> desc duplicate_table;
+-------+--------+------+------+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+--------+------+------+---------+-------+
| k1 | INT | Yes | true | N/A | |
| k2 | INT | Yes | true | N/A | |
| k3 | BIGINT | Yes | true | N/A | |
| k4 | BIGINT | Yes | true | N/A | |
+-------+--------+------+------+---------+-------+
create table duplicate_table(
k1 int null,
k2 int null,
k3 bigint null,
k4 bigint null
)
duplicate key (k1,k2,k3,k4)
distributed BY hash(k4) buckets 3
properties("replication_num" = "1");
Attention: If the materialized view contains partitioned and distributed columns of the Base table, these columns must be used as key columns in the materialized view
-
Create a materialized view that contains only the columns of the original table (k1, k2)
create materialized view k1_k2 as select k1, k2 from duplicate_table;
The schema of the materialized view is as follows, the materialized view contains only two columns k1, k2 without any aggregation
+-------+-------+--------+------+------+ ---------+-------+ | IndexName | Field | Type | Null | Key | Default | Extra | +-------+-------+--------+------+------+ ---------+-------+ | k1_k2 | k1 | INT | Yes | true | N/A | | | | k2 | INT | Yes | true | N/A | | +-------+-------+--------+------+------+ ---------+-------+
-
Create a materialized view with k2 as the sort column
create materialized view k2_order as select k2, k1 from duplicate_table order by k2;
The schema of the materialized view is shown in the figure below. The materialized view contains only two columns k2, k1, where k2 is the sorting column without any aggregation.
+-------+-------+--------+------+------- +---------+-------+ | IndexName | Field | Type | Null | Key | Default | Extra | +-------+-------+--------+------+------- +---------+-------+ | k2_order | k2 | INT | Yes | true | N/A | | | | k1 | INT | Yes | false | N/A | NONE | +-------+-------+--------+------+------- +---------+-------+
-
Create a materialized view with k1, k2 grouping and k3 column aggregated by SUM
create materialized view k1_k2_sumk3 as select k1, k2, sum(k3) from duplicate_table group by k1, k2;
The schema of the materialized view is shown in the figure below. The materialized view contains two columns k1, k2, sum(k3) where k1, k2 are the grouping columns, and sum(k3) is the sum value of the k3 column grouped by k1, k2.
Since the materialized view does not declare a sorting column, and the materialized view has aggregated data, the system defaults to supplement the grouping columns k1 and k2 as sorting columns.
+-------+-------+--------+------+------- +---------+-------+ | IndexName | Field | Type | Null | Key | Default | Extra | +-------+-------+--------+------+------- +---------+-------+ | k1_k2_sumk3 | k1 | INT | Yes | true | N/A | | | | k2 | INT | Yes | true | N/A | | | | k3 | BIGINT | Yes | false | N/A | SUM | +-------+-------+--------+------+------- +---------+-------+
-
Create a materialized view that removes duplicate rows
create materialized view deduplicate as select k1, k2, k3, k4 from duplicate_table group by k1, k2, k3, k4;
The materialized view schema is as shown below. The materialized view contains columns k1, k2, k3, and k4, and there are no duplicate rows.
+-------+-------+--------+------+------- +---------+-------+ | IndexName | Field | Type | Null | Key | Default | Extra | +-------+-------+--------+------+------- +---------+-------+ | deduplicate | k1 | INT | Yes | true | N/A | | | | k2 | INT | Yes | true | N/A | | | | k3 | BIGINT | Yes | true | N/A | | | | k4 | BIGINT | Yes | true | N/A | | +-------+-------+--------+------+------- +---------+-------+
-
Create a non-aggregate materialized view that does not declare a sort column
The schema of all_type_table is as follows
+-------+--------------+------+-------+---------+- ------+ | Field | Type | Null | Key | Default | Extra | +-------+--------------+------+-------+---------+- ------+ | k1 | TINYINT | Yes | true | N/A | | | k2 | SMALLINT | Yes | true | N/A | | | k3 | INT | Yes | true | N/A | | | k4 | BIGINT | Yes | true | N/A | | | k5 | DECIMAL(9,0) | Yes | true | N/A | | | k6 | DOUBLE | Yes | false | N/A | NONE | | k7 | VARCHAR(20) | Yes | false | N/A | NONE | +-------+--------------+------+-------+---------+- ------+
The materialized view contains k3, k4, k5, k6, k7 columns, and does not declare a sort column, the creation statement is as follows:
create materialized view mv_1 as select k3, k4, k5, k6, k7 from all_type_table;
The default added sorting column of the system is k3, k4, k5 three columns. The sum of the bytes of these three column types is 4(INT) + 8(BIGINT) + 16(DECIMAL) = 28 < 36. So the addition is that these three columns are used as sorting columns. The schema of the materialized view is as follows, you can see that the key field of the k3, k4, k5 columns is true, that is, the sorting column. The key field of the k6, k7 columns is false, which is a non-sorted column.
+----------------+-------+--------------+------+-- -----+---------+-------+ | IndexName | Field | Type | Null | Key | Default | Extra | +----------------+-------+--------------+------+-- -----+---------+-------+ | mv_1 | k3 | INT | Yes | true | N/A | | | | k4 | BIGINT | Yes | true | N/A | | | | k5 | DECIMAL(9,0) | Yes | true | N/A | | | | k6 | DOUBLE | Yes | false | N/A | NONE | | | k7 | VARCHAR(20) | Yes | false | N/A | NONE | +----------------+-------+--------------+------+-- -----+---------+-------+
Keywords
CREATE, MATERIALIZED, VIEW