hive-user mailing list archives

From Zheng Shao <zsh...@gmail.com>
Subject Re: NullPointerException when using join on table with map column
Date Wed, 18 Mar 2009 08:46:38 GMT
Hi Stephen,

Maps are not expected to contain null values in Hive, because we use Java
HashMap internally to store them.
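
For reference, here is how the subquery workaround quoted below might look
when applied to the original query. This is an untested sketch: the alias r
is made up for illustration, and it assumes request really has a
recipient_id column (the original query joins on it, although the describe
output does not list it).

```sql
-- Extract the single map value inside the subquery, so that only string
-- columns (not the whole map<string,string>) are passed into the join.
-- "r" is a hypothetical alias; recipient_id is assumed to exist on request.
select count(1), category_selection_method
from (
  select recipient_id,
         attributes['category_selection_method'] as category_selection_method
  from request
) r
left outer join test on r.recipient_id = test.recipient_id
group by category_selection_method;
```

Presumably this helps because the map column no longer has to be serialized
between the map and reduce phases, which is where the stack trace below
shows DynamicSerDeTypeMap.serialize failing.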

Zheng

On Tue, Mar 17, 2009 at 11:09 AM, Stephen Corona <scorona@adknowledge.com> wrote:

> Are maps allowed to have a key with a null value? Or does Hive expect every
> key to have a non-null value?
>
> Steve
> ________________________________________
> From: Zheng Shao [zshao9@gmail.com]
> Sent: Tuesday, March 17, 2009 1:56 PM
> To: hive-user@hadoop.apache.org
> Subject: Re: NullPointerException when using join on table with map column
>
> Hi,
>
> It seems DynamicSerDe has a problem handling null values somewhere. We
> are working on replacing it with LazySimpleSerDe.
>
> In the meantime, please use a subquery and the problem should disappear.
>
> Select count(1), mykey from (Select id, mymap['mykey'] as mykey from
> tablea) tablea join tableb on tablea.id = tableb.id group by mykey;
>
> Zheng
>
>
> On 3/17/09, Stephen Corona <scorona@adknowledge.com> wrote:
> > Hey guys. I am trying to run a query like:
> >
> > select count(1), request.attributes['category_selection_method'] from
> > request left outer join test on request.recipient_id = test.recipient_id
> > group by request.attributes['category_selection_method'];
> >
> > I can run the query WITHOUT the join and it works fine. I can also run the
> > query with the join if I don't reference the attributes column (i.e., the
> > map). Is there a problem with doing joins when a table has a map column?
> >
> > However, when I run this query I get the following error:
> >
> > java.lang.RuntimeException: org.apache.hadoop.hive.serde2.SerDeException: java.lang.NullPointerException
> >       at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:75)
> >       at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
> >       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
> >       at org.apache.hadoop.mapred.Child.main(Child.java:155)
> > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.serde2.SerDeException: java.lang.NullPointerException
> >       at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.process(ReduceSinkOperator.java:191)
> >       at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:315)
> >       at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:89)
> >       at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:315)
> >       at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:49)
> >       at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:315)
> >       at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:178)
> >       at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:71)
> >       ... 3 more
> > Caused by: org.apache.hadoop.hive.serde2.SerDeException: java.lang.NullPointerException
> >       at org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDe.serialize(DynamicSerDe.java:178)
> >       at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.process(ReduceSinkOperator.java:189)
> >       ... 10 more
> > Caused by: java.lang.NullPointerException
> >       at org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDeTypeMap.serialize(DynamicSerDeTypeMap.java:136)
> >       at org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDeFieldList.serialize(DynamicSerDeFieldList.java:249)
> >       at org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDeStructBase.serialize(DynamicSerDeStructBase.java:81)
> >       at org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDe.serialize(DynamicSerDe.java:174)
> >       ... 11 more
> >
> >
> > describe request-
> >
> > request_id      string
> > attributes    map<string,string>
> > day   string
> >
> > create table for request:
> >
> > create table request (
> >   request_id string,
> >   attributes map<string,string>
> > )
> > row format delimited fields terminated by '\001' collection items terminated
> > by '\003' map keys terminated by '\002';
> >
> > describe test-
> > recipient_id string
> >
> > explain query
> >
> > ABSTRACT SYNTAX TREE:
> >   (TOK_QUERY (TOK_FROM (TOK_LEFTOUTERJOIN (TOK_TABREF request) (TOK_TABREF
> > test) (= (TOK_COLREF request recipient_id) (TOK_COLREF test recipient_id))))
> > (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT
> > (TOK_SELEXPR (TOK_FUNCTION count 1)) (TOK_SELEXPR ([ (TOK_COLREF request
> > attributes) 'category_selection_method'))) (TOK_GROUPBY ([ (TOK_COLREF
> > request attributes) 'category_selection_method'))))
> >
> > STAGE DEPENDENCIES:
> >   Stage-1 is a root stage
> >   Stage-2 depends on stages: Stage-1
> >   Stage-0 is a root stage
> >
> > STAGE PLANS:
> >   Stage: Stage-1
> >     Map Reduce
> >       Alias -> Map Operator Tree:
> >         test
> >             Reduce Output Operator
> >               key expressions:
> >                     expr: recipient_id
> >                     type: string
> >               sort order: +
> >               Map-reduce partition columns:
> >                     expr: recipient_id
> >                     type: string
> >               tag: 1
> >               value expressions:
> >                     expr: recipient_id
> >                     type: string
> >         request
> >             Select Operator
> >               expressions:
> >                     expr: recipient_id
> >                     type: string
> >                     expr: attributes
> >                     type: map<string,string>
> >               Reduce Output Operator
> >                 key expressions:
> >                       expr: 0
> >                       type: string
> >                 sort order: +
> >                 Map-reduce partition columns:
> >                       expr: 0
> >                       type: string
> >                 tag: 0
> >                 value expressions:
> >                       expr: 0
> >                       type: string
> >                       expr: 1
> >                       type: map<string,string>
> >       Reduce Operator Tree:
> >         Join Operator
> >           condition map:
> >                Left Outer Join0 to 1
> >           condition expressions:
> >             0 {VALUE.0} {VALUE.1}
> >             1 {VALUE.0}
> >           Group By Operator
> >             aggregations:
> >                   expr: count(1)
> >             keys:
> >                   expr: 1['category_selection_method']
> >                   type: string
> >             mode: hash
> >             File Output Operator
> >               compressed: false
> >               GlobalTableId: 0
> >               table:
> >                   input format: org.apache.hadoop.mapred.SequenceFileInputFormat
> >                   output format: org.apache.hadoop.mapred.SequenceFileOutputFormat
> >                   name: binary_table
> >
> >   Stage: Stage-2
> >     Map Reduce
> >       Alias -> Map Operator Tree:
> >         /tmp/hive-root/532094399/571818099.10002
> >           Reduce Output Operator
> >             key expressions:
> >                   expr: 0
> >                   type: string
> >             sort order: +
> >             Map-reduce partition columns:
> >                   expr: 0
> >                   type: string
> >             tag: -1
> >             value expressions:
> >                   expr: 1
> >                   type: bigint
> >       Reduce Operator Tree:
> >         Group By Operator
> >           aggregations:
> >                 expr: count(VALUE.0)
> >           keys:
> >                 expr: KEY.0
> >                 type: string
> >           mode: mergepartial
> >           Select Operator
> >             expressions:
> >                   expr: 1
> >                   type: bigint
> >                   expr: 0
> >                   type: string
> >             File Output Operator
> >               compressed: false
> >               GlobalTableId: 0
> >               table:
> >                   input format: org.apache.hadoop.mapred.TextInputFormat
> >                   output format: org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat
> >
> >   Stage: Stage-0
> >     Fetch Operator
> >       limit: -1
> >
> >
> >
>
> Yours,
> Zheng
>



-- 
Yours,
Zheng
