Spark accessing ES: the HttpConnectionManager class cannot be found
1. Add the es-spark dependencies
<!-- elasticsearch-spark-20 -->
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch-spark-20_2.11</artifactId>
<version>7.7.0</version>
</dependency>
<!-- elasticsearch -->
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch</artifactId>
<version>${elasticsearch.version}</version>
</dependency>
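The Spark job that writes a DataFrame to ES:
import org.apache.spark.sql.SparkSession
import org.elasticsearch.spark.sql.EsSparkSQL

val spark: SparkSession = SparkSession.builder()
  .master("local[*]")
  // Strip the trailing '$' that Scala appends to an object's class name
  .appName(this.getClass.getSimpleName.filter(!_.equals('$')))
  .config("spark.debug.maxToStringFields", 5000)
  .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .config("es.nodes", "1xx.xxx.xxx.225,1xx.xxx.xxx.226,1xx.xxx.xxx.227")
  .config("es.port", "9200")
  .config("es.index.auto.create", "false")
  // With es.nodes.wan.only enabled, reads go only through the client nodes,
  // so the load on the master node becomes very high and performance is poor.
  // .config("es.nodes.wan.only", "true")
  .config("es.mapping.date.rich", "false") // do not convert date fields
  .getOrCreate()
// dataFrame is an existing DataFrame; its "id" column is used as the document ID
val map = Map("es.mapping.id" -> "id")
EsSparkSQL.saveToEs(dataFrame, "cell_link_highway_wea", map)
spark.close()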
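2. Running the program fails with an error reporting that the HttpConnectionManager class cannot be found
3. Solution:
Add the following dependency: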
<dependency>
<groupId>commons-httpclient</groupId>
<artifactId>commons-httpclient</artifactId>
<version>3.1</version>
</dependency>
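To confirm that the dependency actually ended up on the classpath before re-running the job, a small check like the one below may help. It is only a sketch: the object name ClasspathCheck and the helper isOnClasspath are made up for illustration, and the fully qualified class name assumes the missing class is the one shipped in commons-httpclient 3.x.

```scala
object ClasspathCheck {
  // Hypothetical helper: returns true if the named class can be loaded
  // from the current classpath, without initializing it.
  def isOnClasspath(className: String): Boolean =
    try {
      Class.forName(className, false, getClass.getClassLoader)
      true
    } catch {
      case _: ClassNotFoundException | _: NoClassDefFoundError => false
    }

  def main(args: Array[String]): Unit = {
    // Assumption: this is the class the error message refers to
    val name = "org.apache.commons.httpclient.HttpConnectionManager"
    println(s"$name on classpath: ${isOnClasspath(name)}")
  }
}
```

If the check prints false after adding the dependency, the build has probably not been refreshed, or the jar is excluded when the application is packaged.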
