hive-dev mailing list archives

From "jiraposter@reviews.apache.org (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-2065) RCFile issues
Date Wed, 06 Apr 2011 17:17:05 GMT

    [ https://issues.apache.org/jira/browse/HIVE-2065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016443#comment-13016443 ]

jiraposter@reviews.apache.org commented on HIVE-2065:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/529/
-----------------------------------------------------------

(Updated 2011-04-06 17:13:30.910168)


Review request for hive and Yongqiang He.


Changes
-------

Updated patch: it addresses the first two issues (the synchronized modifiers and the overstated record length) but not sequence file compliance.



Summary
-------

Patch for HIVE-2065


This addresses bug HIVE-2065.
    https://issues.apache.org/jira/browse/HIVE-2065


Diffs (updated)
-----

  build-common.xml 9f21a69 
  data/files/test_v6dot0_compressed.rc PRE-CREATION 
  data/files/test_v6dot0_uncompressed.rc PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/RCFile.java eb5305b 
  ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileBlockMergeRecordReader.java 20d1f4e 
  ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileKeyBufferWrapper.java f7eacdc 
  ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileMergeMapper.java bb1e3c9 
  ql/src/test/org/apache/hadoop/hive/ql/io/TestRCFile.java 8bb6f3a 
  ql/src/test/results/clientpositive/alter_merge.q.out 25f36c0 
  ql/src/test/results/clientpositive/alter_merge_stats.q.out 243f7cc 
  ql/src/test/results/clientpositive/partition_wise_fileformat.q.out cee2e72 
  ql/src/test/results/clientpositive/partition_wise_fileformat3.q.out 067ab43 
  ql/src/test/results/clientpositive/sample10.q.out 50406c3 

Diff: https://reviews.apache.org/r/529/diff


Testing
-------

Tests added, existing tests updated


Thanks,

Krishna



> RCFile issues
> -------------
>
>                 Key: HIVE-2065
>                 URL: https://issues.apache.org/jira/browse/HIVE-2065
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Krishna Kumar
>            Assignee: Krishna Kumar
>            Priority: Minor
>         Attachments: HIVE.2065.patch.0.txt, HIVE.2065.patch.1.txt, Slide1.png, proposal.png
>
>
> Some potential issues with RCFile
> 1. Remove unwanted synchronized modifiers on the methods of RCFile. As per Yongqiang
> He, the class is not meant to be thread-safe (and it is not). Might as well get rid of the
> confusing and performance-impacting lock acquisitions.
> 2. Record length is overstated for compressed files. IIUC, the key compression happens after
> we have written the record length.
> {code}
>       int keyLength = key.getSize();
>       if (keyLength < 0) {
>         throw new IOException("negative length keys not allowed: " + key);
>       }
>       out.writeInt(keyLength + valueLength); // total record length
>       out.writeInt(keyLength); // key portion length
>       if (!isCompressed()) {
>         out.writeInt(keyLength);
>         key.write(out); // key
>       } else {
>         keyCompressionBuffer.reset();
>         keyDeflateFilter.resetState();
>         key.write(keyDeflateOut);
>         keyDeflateOut.flush();
>         keyDeflateFilter.finish();
>         int compressedKeyLen = keyCompressionBuffer.getLength();
>         out.writeInt(compressedKeyLen);
>         out.write(keyCompressionBuffer.getData(), 0, compressedKeyLen);
>       }
> {code}
> 3. For sequence file compatibility, the field that immediately follows the record length
> should be the compressed key length, not the uncompressed key length.
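
To illustrate issue 2, here is a minimal sketch of one way to avoid overstating the record length: compress the key first, then write the lengths. This is not the code from the attached patch. The class name RecordHeaderSketch and the helpers compress and writeRecordHeader are hypothetical, and plain java.util.zip deflate stands in for whatever codec RCFile is configured with. The field order is kept as in the quoted snippet, so issue 3 is deliberately left unaddressed.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.zip.DeflaterOutputStream;

// Hypothetical illustration only; not part of RCFile or of the HIVE-2065 patch.
public class RecordHeaderSketch {

  /** Fully deflates the key bytes so that the stored size is known up front. */
  static byte[] compress(byte[] raw) throws IOException {
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    try (DeflaterOutputStream deflate = new DeflaterOutputStream(buf)) {
      deflate.write(raw);
    } // closing the stream finishes the deflater and flushes into buf
    return buf.toByteArray();
  }

  /**
   * Writes the header fields and the key for one record. The total record
   * length is computed from the key size as stored, not the uncompressed size.
   */
  static byte[] writeRecordHeader(byte[] key, int valueLength, boolean compressed)
      throws IOException {
    // Compress before any length is written (the reverse of the quoted order).
    byte[] storedKey = compressed ? compress(key) : key;

    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bytes);
    out.writeInt(storedKey.length + valueLength); // total record length as stored
    out.writeInt(key.length);                     // uncompressed key length
    out.writeInt(storedKey.length);               // stored (possibly compressed) key length
    out.write(storedKey);                         // key bytes
    out.flush();
    return bytes.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    byte[] key = new byte[1000]; // highly compressible sample key
    byte[] record = writeRecordHeader(key, 10, true);
    System.out.println("bytes written for header and key: " + record.length);
  }
}
```

Buffering the compressed key before writing anything costs one extra in-memory copy, but it lets the first field describe the bytes that actually follow it.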

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
