Spark data stream output
1、save to HDFS as a JSON file
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[2]").getOrCreate()
spark.sparkContext.setLogLevel("error")
val data = spark.read.json("hdfs://cslcdip/home/dip/lzm/sparkdata/people.json")
data.write.mode("overwrite").json("hdfs://cslcdip/home/dip/lzm/sparkdata/result")
- SaveMode specifies the behavior when the output location already exists:
Overwrite: overwrite the existing data
Append: append to the existing data
ErrorIfExists: throw an error if the data already exists (the default)
Ignore: skip the write if the data already exists
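The mode can also be passed as the SaveMode enum instead of a string; a minimal sketch, reusing the result path from the example above:

import org.apache.spark.sql.SaveMode

// Append to the existing JSON output instead of overwriting it.
data.write.mode(SaveMode.Append).json("hdfs://cslcdip/home/dip/lzm/sparkdata/result")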
2、save to HDFS as a Parquet file
val spark: SparkSession = SparkSession.builder().master("local[2]").enableHiveSupport().getOrCreate()
spark.sparkContext.setLogLevel("error")
val data = spark.read.json("hdfs://cslcdip/home/dip/lzm/sparkdata/people.json")
data.write.mode("overwrite").parquet("hdfs://cslcdip/home/dip/lzm/sparkdata/result")
val data2 = spark.read.parquet("hdfs://cslcdip/home/dip/lzm/sparkdata/result/*")
data2.show()
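Parquet output can also be partitioned by a column; a sketch assuming people.json contains an age field (the column name and output path here are illustrative):

// Writes one subdirectory per distinct age value, e.g. result_partitioned/age=30/.
data.write.mode("overwrite").partitionBy("age").parquet("hdfs://cslcdip/home/dip/lzm/sparkdata/result_partitioned")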
3、save to MySQL
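The notes leave this step empty; a minimal sketch of a JDBC write, where the host, database, table, and credentials are all placeholder assumptions:

import java.util.Properties
import org.apache.spark.sql.SaveMode

val props = new Properties()
props.setProperty("user", "root")          // placeholder credentials
props.setProperty("password", "secret")
props.setProperty("driver", "com.mysql.cj.jdbc.Driver")
// Writes the DataFrame into table `people` of a hypothetical `bd` database.
data.write.mode(SaveMode.Overwrite).jdbc("jdbc:mysql://localhost:3306/bd", "people", props)

The MySQL connector JAR (mysql-connector-j) must be on the driver and executor classpath for this to run.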
4、save to Hive
spark.sql("use bd")
spark.sql("insert overwrite table b_province_center select * from tiger.b_province_center")
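A DataFrame can also be written to a Hive table directly through the DataFrameWriter; a sketch assuming the session was built with enableHiveSupport() as in section 2 (the table name bd.people_copy is illustrative):

// Creates or replaces the Hive table using the DataFrame's schema and data.
data.write.mode("overwrite").saveAsTable("bd.people_copy")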