hive-dev mailing list archives

From "Ning Zhang (JIRA)" <>
Subject [jira] Commented: (HIVE-819) Add lazy decompress ability to RCFile
Date Mon, 21 Sep 2009 04:12:16 GMT


Ning Zhang commented on HIVE-819:

Hi Yongqiang, the tests look good and should cover most cases. Other queries such as map-reduce
joins, map-side joins, UDFs, UDAFs, etc. may fall into the same code path. Namit and Zheng,
please correct me if I'm wrong.

As for where the double-decompression check should go, I prefer putting it in LazyDecompressionCallbackImpl.
This integrity check is introduced by lazy decompression, so it is part of that class's responsibility.
There may also be more callers of LazyDecompressionCallbackImpl besides BytesRefWritable. If we
put the check in LazyDecompressionCallbackImpl, we don't need to implement it in each
of its callers.
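
To illustrate the placement argument, here is a minimal sketch in which the check lives inside the callback itself. It is not the code from the patch: GuardedLazyDecompressor, its runs() counter, and the Supplier-based constructor are hypothetical names, and the sketch assumes the "check" means that a second call returns the already-decompressed bytes rather than decompressing again (throwing on a second call would be another reasonable reading).

```java
import java.util.function.Supplier;

// Hypothetical stand-in for a lazy decompression callback; not Hive's actual class.
final class GuardedLazyDecompressor {
    private final Supplier<byte[]> decompressor; // performs the real decompression
    private byte[] cached;                       // result of the first decompression
    private boolean done;                        // true once the decompressor has run
    private int runs;                            // how often the decompressor actually ran

    GuardedLazyDecompressor(Supplier<byte[]> decompressor) {
        this.decompressor = decompressor;
    }

    // Every caller goes through this method, so the double-decompression
    // check is written once here instead of once per caller.
    byte[] decompress() {
        if (!done) {
            cached = decompressor.get();
            done = true;
            runs++;
        }
        return cached;
    }

    int runs() {
        return runs;
    }
}

class GuardDemo {
    public static void main(String[] args) {
        GuardedLazyDecompressor cb =
            new GuardedLazyDecompressor(() -> new byte[] {1, 2, 3});
        byte[] first = cb.decompress();
        byte[] second = cb.decompress(); // served from the cache, no second run
        System.out.println("same array: " + (first == second));
        System.out.println("decompressor runs: " + cb.runs());
    }
}
```

With this shape, any additional caller besides BytesRefWritable gets the protection for free, because it cannot reach the decompressor except through decompress().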

> Add lazy decompress ability to RCFile
> -------------------------------------
>                 Key: HIVE-819
>                 URL:
>             Project: Hadoop Hive
>          Issue Type: Improvement
>          Components: Query Processor, Serializers/Deserializers
>            Reporter: He Yongqiang
>            Assignee: He Yongqiang
>             Fix For: 0.5.0
>         Attachments: hive-819-2009-9-12.patch
> This is especially useful for filter scans.
> For example, for the query 'select a, b, c from table_rc_lazydecompress where a>1;' we only
> need to decompress the block data of columns b and c when at least one row's column 'a' in
> that block satisfies the filter condition.
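
The filter-scan case in the description can be sketched as follows. This is only an illustration under stated assumptions, not RCFile's on-disk format or reader: LazyColumn and FilterScan are hypothetical names, column 'a' is assumed to be already available as an int array, and columns b and c are assumed to be stored as comma-separated strings compressed with java.util.zip.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical column holder: keeps compressed bytes and inflates them on first use.
final class LazyColumn {
    private final byte[] compressed;
    private String[] values;      // filled on first access
    private boolean decompressed;

    LazyColumn(String commaSeparatedValues) {
        byte[] raw = commaSeparatedValues.getBytes(StandardCharsets.UTF_8);
        Deflater d = new Deflater();
        d.setInput(raw);
        d.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!d.finished()) {
            out.write(buf, 0, d.deflate(buf));
        }
        d.end();
        this.compressed = out.toByteArray();
    }

    boolean isDecompressed() {
        return decompressed;
    }

    String[] values() {
        if (!decompressed) {
            Inflater inf = new Inflater();
            inf.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[256];
            try {
                while (!inf.finished()) {
                    int n = inf.inflate(buf);
                    if (n == 0 && inf.needsInput()) break; // truncated input: stop
                    out.write(buf, 0, n);
                }
            } catch (DataFormatException e) {
                throw new IllegalStateException("corrupt column data", e);
            } finally {
                inf.end();
            }
            values = new String(out.toByteArray(), StandardCharsets.UTF_8).split(",");
            decompressed = true;
        }
        return values;
    }
}

final class FilterScan {
    // select a, b, c ... where a > 1, over one block of rows.
    // b and c are only inflated once some row in the block passes the filter.
    static List<String> scan(int[] a, LazyColumn b, LazyColumn c) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < a.length; i++) {
            if (a[i] > 1) {
                rows.add(a[i] + "," + b.values()[i] + "," + c.values()[i]);
            }
        }
        return rows;
    }
}

class FilterScanDemo {
    public static void main(String[] args) {
        LazyColumn b = new LazyColumn("x,y");
        LazyColumn c = new LazyColumn("p,q");
        System.out.println(FilterScan.scan(new int[] {0, 1}, b, c)); // no row matches
        System.out.println("b decompressed: " + b.isDecompressed());
        System.out.println(FilterScan.scan(new int[] {0, 5}, b, c)); // second row matches
        System.out.println("b decompressed: " + b.isDecompressed());
    }
}
```

In the first scan no value of 'a' exceeds 1, so neither b nor c is ever inflated; that skipped work is the saving the issue is after.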

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
