hadoop-common-user mailing list archives

From Baron Tsai <tsaize...@gmail.com>
Subject How can Hive handle the complex data Type through SerDe and UDF/GenericUDF?
Date Fri, 06 Dec 2013 05:57:41 GMT
------------------------------------Table Definition------------------------------------
CREATE TABLE kvpair (
  id STRING,
  arrstr ARRAY<STRING>,
  arrmap ARRAY<MAP<STRING, STRING>>
 )
ROW FORMAT SERDE "com.cloudera.hive.serde.JSONSerDe";

## com.cloudera.hive.serde.JSONSerDe is a SerDe that can handle complex JSON data.

------------------------------------Sample Data------------------------------------
{
    "id": "I001",
    "arrstr": [
        "stringA",
        "stringB",
        "stringC"
    ],
    "arrmap": [
        {
            "t0000": "android",
            "t0001": "ca"
        },
        {
            "t0000": "ios",
            "t0001": "us"
        }
    ]
}
------------------------------------CLI------------------------------------
Signature of ArrayIterateUDF's evaluate method:
public String evaluate(List<Map<String,String>> jsonStr, String key, String value);

create temporary function kv as 'com.demo.udf.ArrayIterateUDF';
SELECT kv(tb.arrmap,"t0000","android") from kvpair tb;
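
For reference, here is a minimal plain-Java sketch of the kind of iteration the UDF is meant to perform over the arrmap column. It runs outside Hive and does not use any Hive classes. The class name ArrayIterateSketch and the return semantics ("true" if any map holds the given key with the given value, "false" otherwise, null on null input) are assumptions made for illustration; only the parameter types come from the signature above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the UDF body; not the real com.demo.udf.ArrayIterateUDF.
final class ArrayIterateSketch {

    // Assumed semantics (not stated in the post): report whether any map
    // in the array holds the given key with the given value.
    static String evaluate(List<Map<String, String>> arrmap, String key, String value) {
        if (arrmap == null || key == null || value == null) {
            return null; // propagate missing input as null, SQL-style
        }
        for (Map<String, String> entry : arrmap) {
            if (entry != null && value.equals(entry.get(key))) {
                return "true";
            }
        }
        return "false";
    }

    public static void main(String[] args) {
        // Rebuild the arrmap value from the sample data by hand.
        Map<String, String> first = new HashMap<>();
        first.put("t0000", "android");
        first.put("t0001", "ca");
        Map<String, String> second = new HashMap<>();
        second.put("t0000", "ios");
        second.put("t0001", "us");
        List<Map<String, String>> arrmap = Arrays.asList(first, second);

        System.out.println(evaluate(arrmap, "t0000", "android")); // prints "true"
        System.out.println(evaluate(arrmap, "t0000", "windows")); // prints "false"
    }
}
```

This only shows the intended logic on ordinary Java collections; it says nothing about how Hive actually hands the deserialized column to the function.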

------------------------------------Problem------------------------------------
I think the data passed into the UDF's evaluate method has already been
processed by the JSONSerDe, so in this demo the value should be an object
deserialized by JSONSerDe with the type List<Map<String,String>>.
However, the query failed.
I don't know how my UDF can receive the input data in evaluate and parse
it (for example, into a JSON object). What is the relationship between
the SerDe and the UDF, and how do they work together?
Thank you all.
