hadoop-hive-dev mailing list archives

From "Ning Zhang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-1088) RCFile RecordReader's first split will read duplicate rows if the split end is < the first SYNC mark
Date Mon, 25 Jan 2010 18:37:34 GMT

    [ https://issues.apache.org/jira/browse/HIVE-1088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12804639#action_12804639 ]

Ning Zhang commented on HIVE-1088:
----------------------------------

Another suggestion: in RCFileRecordReader.java, next(LongWritable, BytesRefArrayWritable)
shares a lot of code with next(LongWritable). Can you refactor this function to something
like:

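public boolean next(LongWritable key, BytesRefArrayWritable value)
    throws IOException {
  // Reuse the key-only next() so both overloads advance identically.
  more = next(key);
  if (more) {
    // Fetch the row contents only when a row is actually available.
    in.getCurrentRow(value);
  }
  return more;
}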

This keeps the logic of the two next() functions consistent, because the code that advances
to the next row lives in only one place.
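
To illustrate the delegation idea in isolation, here is a minimal, self-contained sketch. It
does not use the real Hive or Hadoop classes: FakeRowSource, SketchRecordReader and readAll
are hypothetical stand-ins, a long[] plays the role of LongWritable, and a StringBuilder
plays the role of BytesRefArrayWritable. It only shows that the two-argument next() can be
written entirely in terms of the key-only next().

```java
import java.util.ArrayList;
import java.util.List;

public class DelegatingNextSketch {

    /** Hypothetical stand-in for the underlying row reader; not the Hive class. */
    static class FakeRowSource {
        private final String[] rows;
        private int pos = -1;

        FakeRowSource(String... rows) {
            this.rows = rows;
        }

        /** Moves to the next row; returns false once the rows are exhausted. */
        boolean advance() {
            if (pos + 1 >= rows.length) {
                return false;
            }
            pos++;
            return true;
        }

        int currentIndex() {
            return pos;
        }

        String currentRow() {
            return rows[pos];
        }
    }

    /** Hypothetical reader with two next() overloads that share one code path. */
    static class SketchRecordReader {
        private final FakeRowSource in;
        private boolean more = true;

        SketchRecordReader(FakeRowSource in) {
            this.in = in;
        }

        /** Key-only variant: all of the advancing logic lives here. */
        boolean next(long[] key) {
            if (!more) {
                return false;
            }
            more = in.advance();
            if (more) {
                key[0] = in.currentIndex();
            }
            return more;
        }

        /** Key-and-value variant: delegates, then copies the current row. */
        boolean next(long[] key, StringBuilder value) {
            more = next(key);
            if (more) {
                value.setLength(0);
                value.append(in.currentRow());
            }
            return more;
        }
    }

    /** Reads every row through the two-argument next() and returns "key:value" strings. */
    public static List<String> readAll(String... rows) {
        SketchRecordReader reader = new SketchRecordReader(new FakeRowSource(rows));
        long[] key = new long[1];
        StringBuilder value = new StringBuilder();
        List<String> out = new ArrayList<>();
        while (reader.next(key, value)) {
            out.add(key[0] + ":" + value);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(readAll("a", "b", "c"));
    }
}
```

Any fix to how the reader decides whether more rows remain then only has to be made in the
key-only method.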

> RCFile RecordReader's first split will read duplicate rows if the split end is < the first SYNC mark
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-1088
>                 URL: https://issues.apache.org/jira/browse/HIVE-1088
>             Project: Hadoop Hive
>          Issue Type: Bug
>            Reporter: He Yongqiang
>            Assignee: He Yongqiang
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: hive-rcfile-reader-branch-0.5.patch, hive-rcfile-reader-trunk.2.patch, hive-rcfile-reader-trunk.patch
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

