hive-dev mailing list archives

From "jiraposter@reviews.apache.org (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-2404) Allow RCFile Reader to tolerate corruptions
Date Fri, 02 Sep 2011 21:21:11 GMT

    [ https://issues.apache.org/jira/browse/HIVE-2404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13096310#comment-13096310 ]

jiraposter@reviews.apache.org commented on HIVE-2404:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1671/#review1740
-----------------------------------------------------------



trunk/ql/src/java/org/apache/hadoop/hive/ql/io/RCFile.java
<https://reviews.apache.org/r/1671/#comment3962>

    ok, will do that



trunk/ql/src/java/org/apache/hadoop/hive/ql/io/RCFile.java
<https://reviews.apache.org/r/1671/#comment3961>

    The difference is that ret.resetValid(columnNumber) should be called when tolerateCorruptions is true.


- Ramkumar


On 2011-08-27 23:22:07, Ramkumar Vadali wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/1671/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-08-27 23:22:07)
bq.  
bq.  
bq.  Review request for Yongqiang He and Paul Yang.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Sometimes it is useful to tolerate corruptions during a query and return results based on the files that can be processed. A single corrupt block of data should not prevent reading the rest of the data.
bq.  
bq.  We need a way to gracefully ignore errors while reading an RCFile.
bq.  
bq.  
bq.  This addresses bug HIVE-2404.
bq.      https://issues.apache.org/jira/browse/HIVE-2404
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    trunk/ql/src/java/org/apache/hadoop/hive/ql/io/RCFile.java 1161660 
bq.    trunk/ql/src/test/org/apache/hadoop/hive/ql/io/TestRCFile.java 1161660 
bq.  
bq.  Diff: https://reviews.apache.org/r/1671/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Manual test with corrupt RC file, added unit-test
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Ramkumar
bq.  
bq.



> Allow RCFile Reader to tolerate corruptions
> -------------------------------------------
>
>                 Key: HIVE-2404
>                 URL: https://issues.apache.org/jira/browse/HIVE-2404
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>    Affects Versions: 0.7.1
>            Reporter: Ramkumar Vadali
>            Assignee: Ramkumar Vadali
>            Priority: Minor
>         Attachments: toleratecorruptions.2.patch, toleratecorruptions.patch
>
>
> Sometimes it is useful to tolerate corruptions during a query and return results based on the files that can be processed. A single corrupt block of data should not prevent reading the rest of the data.
> We need a way to gracefully ignore errors while reading an RCFile.
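
For illustration only, here is a minimal sketch of the behaviour the issue asks for: when a tolerate-corruptions flag is set, a block that fails to read is skipped and the remaining blocks are still returned. This is not the attached patch and does not use the RCFile API. Every name in it (TolerantReaderSketch, Block, readAll, the skipped counter) is a hypothetical stand-in chosen for the example.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TolerantReaderSketch {

    /** Hypothetical stand-in for one block of a file; not a Hive class. */
    public interface Block {
        List<String> readRows() throws IOException;
    }

    /**
     * Reads the rows of every block in order. If tolerateCorruptions is true,
     * a block whose read throws IOException is skipped and counted in
     * skipped[0]; otherwise the first failure is rethrown to the caller.
     */
    public static List<String> readAll(List<Block> blocks,
                                       boolean tolerateCorruptions,
                                       int[] skipped) throws IOException {
        List<String> rows = new ArrayList<>();
        for (Block block : blocks) {
            try {
                rows.addAll(block.readRows());
            } catch (IOException e) {
                if (!tolerateCorruptions) {
                    throw e; // strict mode: one bad block fails the whole read
                }
                skipped[0]++; // tolerant mode: drop this block, keep going
            }
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        Block good1 = () -> Arrays.asList("a", "b");
        Block corrupt = () -> { throw new IOException("simulated corrupt block"); };
        Block good2 = () -> Arrays.asList("c");

        int[] skipped = new int[1];
        List<String> rows = readAll(Arrays.asList(good1, corrupt, good2), true, skipped);
        System.out.println("rows=" + rows + " skippedBlocks=" + skipped[0]);
    }
}
```

Counting the skipped blocks is a design choice of this sketch, so that a caller can tell a clean read from a partial one. Whether and how the real reader reports skipped data is not stated in this message.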

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
