Batch importing MySQL data into ClickHouse
ClickHouse preparation
Local table
create table student on cluster luopc_mpp_cluster (
    id UInt8,
    name String,
    age UInt8,
    create_time DateTime
) engine = ReplicatedMergeTree('/clickhouse/tables/{shard}/student', '{replica}')
primary key (id)
order by (id, age);
Distributed table
create table student_all on cluster luopc_mpp_cluster (
    id UInt8,
    name String,
    age UInt8,
    create_time DateTime
) engine = Distributed(luopc_mpp_cluster, default, student, rand());
Insert data
insert into student_all values
    (1, 'a', 17, '2021-05-08 12:00:00'),
    (2, 'b', 25, '2021-05-08 12:00:00'),
    (3, 'c', 20, '2021-05-08 12:00:00'),
    (4, 'd', 22, '2021-05-08 12:00:00'),
    (5, 'e', 30, '2021-05-08 12:00:00');
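Notes
Because the local table is created with on cluster, it can be queried on every node in the cluster once the statement has run. The distributed table is built on top of the local table and works much like a view: it provides cluster-wide queries and writes, while the actual data is stored in the local tables.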
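The relationship between the two tables can be checked with a few queries. This is a minimal sketch, assuming it is run in clickhouse-client on any node of the cluster; the counts per shard depend on how rand() happened to route the inserted rows, so no expected output is given.

```sql
-- Rows stored in the local table on the node you are connected to
select count() from student;

-- Rows across all shards, read through the distributed table
select count() from student_all;

-- Per-host breakdown; one replica per shard answers the query
select hostName() as host, count() as cnt
from student_all
group by host;
```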
MySQL preparation
Create the table
CREATE TABLE `student` (
    `id` int(11) NOT NULL,
    `name` varchar(100) NOT NULL,
    `age` int(11) NOT NULL,
    `create_time` datetime NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Insert data
INSERT INTO test.student VALUES (6, 'f', 25, '2021-06-28 12:00:00');
Run DataX
python datax/bin/datax.py mysqltoclickhousedemo.json
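The contents of mysqltoclickhousedemo.json are not shown in this post. The following is only a sketch of what such a job file might look like, using DataX's mysqlreader and clickhousewriter plugins. Everything in angle brackets is a placeholder, and the choice of student_all as the target table, the single channel, and the batch size are assumptions rather than the settings actually used here.

```json
{
  "job": {
    "setting": {
      "speed": { "channel": 1 }
    },
    "content": [
      {
        "reader": {
          "name": "mysqlreader",
          "parameter": {
            "username": "<mysql-user>",
            "password": "<mysql-password>",
            "column": ["id", "name", "age", "create_time"],
            "connection": [
              {
                "table": ["student"],
                "jdbcUrl": ["jdbc:mysql://<mysql-host>:3306/test"]
              }
            ]
          }
        },
        "writer": {
          "name": "clickhousewriter",
          "parameter": {
            "username": "<clickhouse-user>",
            "password": "<clickhouse-password>",
            "column": ["id", "name", "age", "create_time"],
            "connection": [
              {
                "jdbcUrl": "jdbc:clickhouse://<clickhouse-host>:8123/default",
                "table": ["student_all"]
              }
            ]
          }
        }
      }
    ]
  }
}
```

The column lists on the reader and writer sides are matched by position, so they should be kept in the same order. The clickhousewriter plugin also accepts a batchSize parameter that may be worth adding for larger imports.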
Data before the import
Data after the import
The DataX execution log is as follows