Spark data output


1. Save to HDFS JSON file

// Quiet the Spark logs, read the JSON source, then write it back out as JSON
spark.sparkContext.setLogLevel("error")
val data = spark.read.json("hdfs://cslcdip/home/dip/lzm/sparkdata/people.json")
data.write.mode("overwrite").json("hdfs://cslcdip/home/dip/lzm/sparkdata/result")
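
To confirm the write, the output directory can be read back with the same session (the same pattern as the Parquet read-back in section 2 below); the val name check is just for illustration:

    // Read the JSON files written above back into a DataFrame and print them
    val check = spark.read.json("hdfs://cslcdip/home/dip/lzm/sparkdata/result")
    check.show()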

  • SaveMode specifies the behavior when saving to a target that may already exist. The four modes are listed below, with a usage sketch after the list.

Overwrite: overwrite any existing data at the target

Append: append to the existing data

ErrorIfExists: raise an error if the target already exists (the default)

Ignore: skip the write silently if the target already exists
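
The mode can be passed as a string, as in step 1, or as a SaveMode enum value. A minimal sketch reusing the data DataFrame from step 1:

    import org.apache.spark.sql.SaveMode

    // Equivalent to .mode("append"): adds new files next to the existing output
    data.write.mode(SaveMode.Append).json("hdfs://cslcdip/home/dip/lzm/sparkdata/result")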

2. Save to HDFS Parquet

    import org.apache.spark.sql.SparkSession

    // Build a Hive-enabled local SparkSession
    val spark: SparkSession = SparkSession.builder().master("local[2]").enableHiveSupport().getOrCreate()
    spark.sparkContext.setLogLevel("error")
    val data = spark.read.json("hdfs://cslcdip/home/dip/lzm/sparkdata/people.json")
    // Write as Parquet, then read the result back and show it to verify
    data.write.mode("overwrite").parquet("hdfs://cslcdip/home/dip/lzm/sparkdata/result")
    val data2 = spark.read.parquet("hdfs://cslcdip/home/dip/lzm/sparkdata/result/*")
    data2.show()

3. Save to MySQL
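
Spark writes to MySQL through the JDBC data source. A minimal sketch, assuming a reachable MySQL instance; the URL, database (bd), table (people), user, and password below are all placeholders, and data is the DataFrame from step 1:

    import java.util.Properties

    // JDBC connection properties; all values here are placeholders
    val props = new Properties()
    props.setProperty("user", "dipuser")
    props.setProperty("password", "******")
    props.setProperty("driver", "com.mysql.jdbc.Driver")

    // Overwrite the target table with the DataFrame contents
    data.write.mode("overwrite").jdbc("jdbc:mysql://localhost:3306/bd", "people", props)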

4. Save to Hive

  spark.sql("use bd")
  spark.sql("insert overwrite table b_province_center select * from tiger.b_province_center ")
