hive-user mailing list archives

From Zheng Shao <zsh...@gmail.com>
Subject Re: NullPointerException when using join on table with map column
Date Tue, 17 Mar 2009 17:56:12 GMT
Hi,

It seems DynamicSerDe has problems with null values somewhere (the
NullPointerException in your trace comes from DynamicSerDeTypeMap.serialize).
We are working on replacing it with LazySimpleSerDe.

In the meantime, please use a subquery and the problem should disappear.

Select count(1), mykey from (Select id, mymap['mykey'] as mykey from
tablea) tablea join tableb on tablea.id = tableb.id group by mykey;
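
Applied to your query, that would look roughly like the following. This is
an untested sketch: the alias r is just a name I picked, and I am assuming
recipient_id is a column of request, as your join condition implies.

select count(1), r.category_selection_method from (select recipient_id,
attributes['category_selection_method'] as category_selection_method from
request) r left outer join test on r.recipient_id = test.recipient_id
group by r.category_selection_method;

The idea is that the subquery pulls the single string value out of the map
before the join, so only strings (not the whole map column) have to be
serialized on the way to the reducer.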

Zheng


On 3/17/09, Stephen Corona <scorona@adknowledge.com> wrote:
> Hey guys. I am trying to run a query like:
>
> select count(1), request.attributes['category_selection_method'] from
> request left outer join test on request.recipient_id = test.recipient_id
> group by request.attributes['category_selection_method'];
>
> I can run the query WITHOUT the join and it works fine. I can also run the
> query with the join if I don't reference the attributes column (i.e, the
> map). Is there a problem with doing joins when a table has a map column?
>
> However, when I run this query I get the following error:
>
> java.lang.RuntimeException: org.apache.hadoop.hive.serde2.SerDeException: java.lang.NullPointerException
> 	at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:75)
> 	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
> 	at org.apache.hadoop.mapred.Child.main(Child.java:155)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.serde2.SerDeException: java.lang.NullPointerException
> 	at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.process(ReduceSinkOperator.java:191)
> 	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:315)
> 	at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:89)
> 	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:315)
> 	at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:49)
> 	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:315)
> 	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:178)
> 	at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:71)
> 	... 3 more
> Caused by: org.apache.hadoop.hive.serde2.SerDeException: java.lang.NullPointerException
> 	at org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDe.serialize(DynamicSerDe.java:178)
> 	at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.process(ReduceSinkOperator.java:189)
> 	... 10 more
> Caused by: java.lang.NullPointerException
> 	at org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDeTypeMap.serialize(DynamicSerDeTypeMap.java:136)
> 	at org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDeFieldList.serialize(DynamicSerDeFieldList.java:249)
> 	at org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDeStructBase.serialize(DynamicSerDeStructBase.java:81)
> 	at org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDe.serialize(DynamicSerDe.java:174)
> 	... 11 more
>
>
> describe request-
>
> request_id      string
> attributes	map<string,string>
> day	string
>
> create table for request:
>
> create table request (
> request_id string,
> attributes map<string, string>
> )
> row format delimited fields terminated by '\001' collection items terminated
> by '\003' map keys terminated by '\002';
>
> describe test-
> recipient_id string
>
> explain query
>
> ABSTRACT SYNTAX TREE:
>   (TOK_QUERY (TOK_FROM (TOK_LEFTOUTERJOIN (TOK_TABREF request) (TOK_TABREF
> test) (= (TOK_COLREF request recipient_id) (TOK_COLREF test recipient_id))))
> (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT
> (TOK_SELEXPR (TOK_FUNCTION count 1)) (TOK_SELEXPR ([ (TOK_COLREF request
> attributes) 'category_selection_method'))) (TOK_GROUPBY ([ (TOK_COLREF
> request attributes) 'category_selection_method'))))
>
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-2 depends on stages: Stage-1
>   Stage-0 is a root stage
>
> STAGE PLANS:
>   Stage: Stage-1
>     Map Reduce
>       Alias -> Map Operator Tree:
>         test
>             Reduce Output Operator
>               key expressions:
>                     expr: recipient_id
>                     type: string
>               sort order: +
>               Map-reduce partition columns:
>                     expr: recipient_id
>                     type: string
>               tag: 1
>               value expressions:
>                     expr: recipient_id
>                     type: string
>         request
>             Select Operator
>               expressions:
>                     expr: recipient_id
>                     type: string
>                     expr: attributes
>                     type: map<string,string>
>               Reduce Output Operator
>                 key expressions:
>                       expr: 0
>                       type: string
>                 sort order: +
>                 Map-reduce partition columns:
>                       expr: 0
>                       type: string
>                 tag: 0
>                 value expressions:
>                       expr: 0
>                       type: string
>                       expr: 1
>                       type: map<string,string>
>       Reduce Operator Tree:
>         Join Operator
>           condition map:
>                Left Outer Join0 to 1
>           condition expressions:
>             0 {VALUE.0} {VALUE.1}
>             1 {VALUE.0}
>           Group By Operator
>             aggregations:
>                   expr: count(1)
>             keys:
>                   expr: 1['category_selection_method']
>                   type: string
>             mode: hash
>             File Output Operator
>               compressed: false
>               GlobalTableId: 0
>               table:
>                   input format: org.apache.hadoop.mapred.SequenceFileInputFormat
>                   output format: org.apache.hadoop.mapred.SequenceFileOutputFormat
>                   name: binary_table
>
>   Stage: Stage-2
>     Map Reduce
>       Alias -> Map Operator Tree:
>         /tmp/hive-root/532094399/571818099.10002
>           Reduce Output Operator
>             key expressions:
>                   expr: 0
>                   type: string
>             sort order: +
>             Map-reduce partition columns:
>                   expr: 0
>                   type: string
>             tag: -1
>             value expressions:
>                   expr: 1
>                   type: bigint
>       Reduce Operator Tree:
>         Group By Operator
>           aggregations:
>                 expr: count(VALUE.0)
>           keys:
>                 expr: KEY.0
>                 type: string
>           mode: mergepartial
>           Select Operator
>             expressions:
>                   expr: 1
>                   type: bigint
>                   expr: 0
>                   type: string
>             File Output Operator
>               compressed: false
>               GlobalTableId: 0
>               table:
>                   input format: org.apache.hadoop.mapred.TextInputFormat
>                   output format: org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat
>
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>
>
>

Yours,
Zheng
