spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From unk1102 <>
Subject How to use registered Hive UDF in Spark DataFrame?
Date Fri, 02 Oct 2015 11:25:26 GMT
Hi I have registed my hive UDF using the following code:

hiveContext.udf().register("MyUDF",new UDF1(String,String)) {
public String call(String o) throws Execption {
//bla bla

Now I want to use above MyUDF in DataFrame. How do we use it? I know how to
use it in a sql and it works fine

hiveContext.sql(select MyUDF("test") from myTable);

My hiveContext.sql() query involves group by on multiple columns so for
scaling purpose I am trying to convert this query into DataFrame APIs"col1","col2","coln").groupby(""col1","col2","coln").count();

Can we do the follwing"col1"))??? Please guide.

View this message in context:
Sent from the Apache Spark User List mailing list archive at

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message