spark-user mailing list archives

From Richard Xin <>
Subject need help to have a Java version of this scala script
Date Sat, 17 Dec 2016 07:54:53 GMT
What I am trying to do: I need to add a column (possibly a complicated transformation based on
the value of another column) to a given DataFrame.
Scala script:
val hContext = new HiveContext(sc)
import hContext.implicits._
val df = hContext.sql("select x,y,cluster_no from test.dc")
val len = udf((str: String) => str.length)
val twice = udf { (x: Int) => println(s"Computed: twice($x)"); x * 2 }
val triple = udf { (x: Int) => println(s"Computed: triple($x)"); x * 3}
val df1 = df.withColumn("name-len", len($"x"))
val df2 = df1.withColumn("twice", twice($"cluster_no"))
val df3 = df2.withColumn("triple", triple($"cluster_no"))
The Scala script above seems to work fine, but I am having trouble doing the same thing in Java
(note that the transformation based on a column's value could be complicated, not limited to
simple addition, subtraction, etc.). Is there a way to do this in Java? Thanks.
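
For reference, here is a rough sketch of what a Java equivalent might look like. It is not a confirmed answer, and it rests on several assumptions: Spark 2.x with Java 8 lambdas, a SparkSession in place of HiveContext, the spark-hive module on the classpath, and cluster_no being an integer column. The class name AddColumns and the helper withDerivedColumns are made up for illustration. Each UDF is registered by name with an explicit return type and then applied through callUDF.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.api.java.UDF1;
import org.apache.spark.sql.types.DataTypes;

import static org.apache.spark.sql.functions.callUDF;
import static org.apache.spark.sql.functions.col;

public class AddColumns {

    // Hypothetical helper: registers the three UDFs by name and appends the
    // derived columns, mirroring df1/df2/df3 in the Scala script.
    static Dataset<Row> withDerivedColumns(SparkSession spark, Dataset<Row> df) {
        // The cast to UDF1 selects the Java overload of register(); the third
        // argument is the return type, which Java UDFs must state explicitly.
        // Null inputs are passed through as null to avoid a NullPointerException.
        spark.udf().register("len",
                (UDF1<String, Integer>) s -> s == null ? null : s.length(),
                DataTypes.IntegerType);
        spark.udf().register("twice",
                (UDF1<Integer, Integer>) x -> x == null ? null : x * 2,
                DataTypes.IntegerType);
        spark.udf().register("triple",
                (UDF1<Integer, Integer>) x -> x == null ? null : x * 3,
                DataTypes.IntegerType);

        return df
                .withColumn("name-len", callUDF("len", col("x")))
                .withColumn("twice", callUDF("twice", col("cluster_no")))
                .withColumn("triple", callUDF("triple", col("cluster_no")));
    }

    public static void main(String[] args) {
        // Assumption: Hive support is available, as with HiveContext in the Scala script.
        SparkSession spark = SparkSession.builder()
                .appName("AddColumns")
                .enableHiveSupport()
                .getOrCreate();

        Dataset<Row> df = spark.sql("select x, y, cluster_no from test.dc");
        withDerivedColumns(spark, df).show();
        spark.stop();
    }
}
```

The lambda bodies can hold arbitrarily complicated Java logic, so the approach is not limited to simple arithmetic. If cluster_no is actually a bigint, the type parameters would need to be Long rather than Integer.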
