10.7.1 自定义 UDF 函数

scala> val df = spark.read.json("examples/src/main/resources/people.json")
df: org.apache.spark.sql.DataFrame = [age: bigint, name: string]

scala> df.show
+----+-------+
| age|   name|
+----+-------+
|null|Michael|
|  30|   Andy|
|  19| Justin|
+----+-------+
// 注册一个 udf 函数: toUpper是函数名, 第二个参数是函数的具体实现
scala> spark.udf.register("toUpper", (s: String) => s.toUpperCase)
res1: org.apache.spark.sql.expressions.UserDefinedFunction = UserDefinedFunction(<function1>,StringType,Some(List(StringType)))

scala> df.createOrReplaceTempView("people")

scala> spark.sql("select toUpper(name), age from people").show
+-----------------+----+
|UDF:toUpper(name)| age|
+-----------------+----+
|          MICHAEL|null|
|             ANDY|  30|
|           JUSTIN|  19|
+-----------------+----+
Copyright © 尚硅谷大数据 2019 all right reserved,powered by Gitbook
该文件最后修订时间: 2019-02-08 21:40:45

results matching ""

    No results matching ""