hadoop-hive-dev mailing list archives

From "He Yongqiang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-756) performance improvement for RCFile and ColumnarSerDe in Hive
Date Mon, 17 Aug 2009 12:24:14 GMT

    [ https://issues.apache.org/jira/browse/HIVE-756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744035#action_12744035 ]

He Yongqiang commented on HIVE-756:

       if (!currentValue.inited) {
+        ret.resetValid(columnNumber); // do this only when not initialized
       // we do not use BytesWritable here to avoid the byte-copy from
       // DataOutputStream to BytesWritable
-      ret.resetValid(columnNumber);

-        if (skippedColIDs[i]) {
-          if (ref != BytesRefWritable.ZeroBytesRefWritable)
-            ret.set(i, BytesRefWritable.ZeroBytesRefWritable);
-          continue;
-        }

This code can be used by non-Hive code, and since getCurrentRow is a public method, we cannot
guarantee that the argument ret passed in is the same object as in previous calls. So we need
to do the "resetValid" and set(.., BytesRefWritable.ZeroBytesRefWritable) on every call.
What do you think?
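
To illustrate the concern, here is a minimal sketch. It does not use the real Hive classes: ResetSketch, Row, Reader, fillOnce and fillAlways are hypothetical names made up for this example, and a plain String stands in for BytesRefWritable. It only shows what can go wrong when the per-call preparation is skipped and the caller hands in a different object the second time.

```java
// Hypothetical stand-ins; these are NOT the real Hive classes.
class ResetSketch {
    // Plays the role of BytesRefWritable.ZeroBytesRefWritable (a shared empty marker).
    static final String ZERO = "";

    // Simplified row container that the caller owns and passes in as "ret".
    static class Row {
        private final String[] cols;
        private int valid = 0; // number of columns currently considered valid

        Row(int capacity) { cols = new String[capacity]; }

        void resetValid(int n) { valid = n; }
        void set(int i, String v) { cols[i] = v; }
        String get(int i) { return cols[i]; }
        int size() { return valid; }
    }

    // Simplified reader with a per-reader "inited" flag (an assumption of this sketch).
    static class Reader {
        private final boolean[] skipped;
        private boolean inited = false;

        Reader(boolean[] skipped) { this.skipped = skipped; }

        // Variant A: prepare ret only on the first call. Cheaper, but it assumes
        // the caller always passes the same ret object.
        void fillOnce(Row ret, String[] data) {
            if (!inited) {
                prepare(ret);
                inited = true;
            }
            copy(ret, data);
        }

        // Variant B: prepare ret on every call. Safe for any ret object.
        void fillAlways(Row ret, String[] data) {
            prepare(ret);
            copy(ret, data);
        }

        // Mark all columns valid and point skipped columns at the shared marker.
        private void prepare(Row ret) {
            ret.resetValid(skipped.length);
            for (int i = 0; i < skipped.length; i++) {
                if (skipped[i]) ret.set(i, ZERO);
            }
        }

        // Copy only the columns that are not skipped.
        private void copy(Row ret, String[] data) {
            for (int i = 0; i < skipped.length; i++) {
                if (!skipped[i]) ret.set(i, data[i]);
            }
        }
    }

    public static void main(String[] args) {
        boolean[] skipped = {false, true};
        String[] data = {"a", "ignored"};

        Reader once = new Reader(skipped);
        Row first = new Row(2), second = new Row(2);
        once.fillOnce(first, data);
        once.fillOnce(second, data); // a different object on the second call
        System.out.println("once:   size=" + second.size() + " col1=" + second.get(1));

        Reader always = new Reader(skipped);
        Row third = new Row(2), fourth = new Row(2);
        always.fillAlways(third, data);
        always.fillAlways(fourth, data);
        System.out.println("always: size=" + fourth.size() + " col1=" + fourth.get(1));
    }
}
```

In this sketch, the second Row passed to fillOnce is never prepared, so its valid count stays at 0 and its skipped column stays null, while fillAlways leaves every Row in a consistent state. Whether the extra per-call work matters for performance is a separate question that this example does not measure.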

> performance improvement for RCFile and ColumnarSerDe in Hive
> ------------------------------------------------------------
>                 Key: HIVE-756
>                 URL: https://issues.apache.org/jira/browse/HIVE-756
>             Project: Hadoop Hive
>          Issue Type: Improvement
>            Reporter: Ning Zhang
>            Assignee: Ning Zhang
>         Attachments: hive-756.patch
> There are some easy performance improvements in the columnar storage in Hive I found
> during Hackathon.
