10.4 Interaction Between DataFrame and Dataset
1. From DataFrame to Dataset
scala> val df = spark.read.json("examples/src/main/resources/people.json")
df: org.apache.spark.sql.DataFrame = [age: bigint, name: string]
scala> case class People(name: String, age: Long)
defined class People
scala> val ds = df.as[People]
ds: org.apache.spark.sql.Dataset[People] = [age: bigint, name: string]
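The transcript above runs in spark-shell, which already provides a `spark` session and imports `spark.implicits._`. Outside the shell, the same conversion might look like the sketch below. The object name `DataFrameToDataset`, the helper `convert`, and the inline sample row are assumptions made for illustration; the row stands in for people.json so that the example is self-contained. Note that `as[People]` matches columns to case class fields by name rather than by position, and that `age` is declared as `Long` because Spark reads JSON integers as bigint.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

// Declared at top level (not inside a method) so that Spark can derive an Encoder for it.
case class People(name: String, age: Long)

// Hypothetical wrapper object, used only for this illustration.
object DataFrameToDataset {

  // Builds a small DataFrame and converts it to a typed Dataset[People].
  def convert(spark: SparkSession): Dataset[People] = {
    // Brings toDF and the implicit Encoder[People] into scope.
    import spark.implicits._

    // Column order (age, name) mirrors the schema inferred from people.json.
    val df: DataFrame = Seq((32L, "Andy")).toDF("age", "name")

    // Columns are resolved by name, so the order does not have to match the case class.
    df.as[People]
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("df-to-ds")
      .master("local[*]")
      .getOrCreate()
    try {
      convert(spark).show()
    } finally {
      spark.stop()
    }
  }
}
```

If the case class declared `age: Int` instead, the conversion would be rejected at analysis time, because Spark does not narrow bigint to int implicitly.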
2. From Dataset to DataFrame
scala> case class Person(name: String, age: Long)
defined class Person
scala> val ds = Seq(Person("Andy", 32)).toDS()
ds: org.apache.spark.sql.Dataset[Person] = [name: string, age: bigint]
scala> val df = ds.toDF
df: org.apache.spark.sql.DataFrame = [name: string, age: bigint]
scala> df.show
+----+---+
|name|age|
+----+---+
|Andy| 32|
+----+---+
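As a complement to the transcript above, the following sketch shows the reverse direction in a standalone program. `toDF` also accepts new column names, which can be handy when the case class field names are not the ones wanted downstream. The object name `DatasetToDataFrame`, the helper `convert`, and the column names `person_name` and `person_age` are assumptions chosen for illustration.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

// Declared at top level so that Spark can derive an Encoder for it.
case class Person(name: String, age: Long)

// Hypothetical wrapper object, used only for this illustration.
object DatasetToDataFrame {

  // Builds a typed Dataset[Person] and converts it to an untyped DataFrame.
  def convert(spark: SparkSession): DataFrame = {
    // Brings toDS and the implicit Encoder[Person] into scope.
    import spark.implicits._

    val ds: Dataset[Person] = Seq(Person("Andy", 32)).toDS()

    // toDF() with no arguments keeps the field names (name, age);
    // passing names renames the columns positionally.
    ds.toDF("person_name", "person_age")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("ds-to-df")
      .master("local[*]")
      .getOrCreate()
    try {
      convert(spark).show()
    } finally {
      spark.stop()
    }
  }
}
```

A DataFrame is an alias for `Dataset[Row]`, so after the conversion the fields are accessed by column name at run time instead of through the compile-time checked members of `Person`.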