Flink Hive Source Parallelism Inference: A Source Code Walkthrough
Batch Reading from Hive
HiveOptions defines two relevant options:
public static final ConfigOption<Boolean> TABLE_EXEC_HIVE_INFER_SOURCE_PARALLELISM =
        key("table.exec.hive.infer-source-parallelism")
                .defaultValue(true)
                .withDescription(
                        "If is false, parallelism of source are set by config. "
                                + "If is true, source parallelism is inferred according to splits number. ");

public static final ConfigOption<Integer> TABLE_EXEC_HIVE_INFER_SOURCE_PARALLELISM_MAX =
        key("table.exec.hive.infer-source-parallelism.max")
                .defaultValue(1000)
                .withDescription("Sets max infer parallelism for source operator.");
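- table.exec.hive.infer-source-parallelism: defaults to true, meaning the source parallelism is inferred from the number of partitions and files being read (that is, from the number of splits). When set to false, the parallelism comes from the configuration instead.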
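To make the interaction between the two options concrete, here is a minimal sketch of the decision they describe. It is not Flink's actual implementation: the class InferParallelismSketch, the method infer, and all parameter names are hypothetical, and the lower bound of 1 is an assumption added so that an empty table still yields a valid parallelism.

```java
// Hypothetical sketch, not Flink's source code: illustrates how the two
// HiveOptions settings could combine into a final source parallelism.
class InferParallelismSketch {

    /**
     * @param inferEnabled          value of table.exec.hive.infer-source-parallelism
     * @param configuredParallelism parallelism taken from the job configuration
     * @param maxInferred           value of table.exec.hive.infer-source-parallelism.max
     * @param splitCount            number of splits found for the Hive table
     */
    static int infer(boolean inferEnabled, int configuredParallelism,
                     int maxInferred, int splitCount) {
        if (!inferEnabled) {
            // Inference disabled: use the configured parallelism unchanged.
            return configuredParallelism;
        }
        // Inference enabled: follow the split count, capped by the max option.
        int capped = Math.min(splitCount, maxInferred);
        // Assumption: never return less than 1, even when there are no splits.
        return Math.max(1, capped);
    }

    public static void main(String[] args) {
        System.out.println(infer(true, 4, 1000, 2500));  // capped at the max
        System.out.println(infer(true, 4, 1000, 37));    // follows the split count
        System.out.println(infer(false, 4, 1000, 2500)); // uses the configured value
    }
}
```

With the default max of 1000, a table that produces 2500 splits would be read with a parallelism of 1000 under this sketch, while a table with 37 splits would be read with a parallelism of 37.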