hawq-dev mailing list archives

From "Shivram Mani (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HAWQ-1075) Make checksum verification configurable in PXF HdfsTextSimple profile
Date Sun, 25 Sep 2016 15:59:20 GMT
Shivram Mani created HAWQ-1075:
----------------------------------

             Summary: Make checksum verification configurable in PXF HdfsTextSimple profile
                 Key: HAWQ-1075
                 URL: https://issues.apache.org/jira/browse/HAWQ-1075
             Project: Apache HAWQ
          Issue Type: Improvement
          Components: PXF
            Reporter: Shivram Mani
            Assignee: Goden Yao


Currently, the HdfsTextSimple profile, which is the optimized profile for reading Text/CSV,
uses ChunkRecordReader to read chunks of records (as opposed to individual records). In this
reader, dfs.client.read.shortcircuit.skip.checksum is explicitly set to true to avoid the delay
of checksum verification while opening/reading the file/block.
This setting needs to be exposed as a configurable option, and by default the client-side
checksum check must be performed, so that reads are resilient to data corruption that is not
caught internally by the datanode block reporting mechanism (even fsck doesn't catch certain
block corruption issues).
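
A minimal sketch of the requested behavior, for illustration only. It is not the PXF
implementation: a plain Map stands in for the Hadoop configuration object, and the
user-facing option name SKIP-CHECKSUM and the helper applyChecksumOption are hypothetical.
Only the property name dfs.client.read.shortcircuit.skip.checksum comes from the issue text.
The point it shows is the default: checksum verification stays on unless the user explicitly
opts out.

```java
import java.util.HashMap;
import java.util.Map;

public class ChecksumOption {
    // HDFS client property named in the issue.
    static final String SKIP_CHECKSUM_KEY = "dfs.client.read.shortcircuit.skip.checksum";

    // Hypothetical user-facing option name (an assumption, not taken from PXF).
    static final String USER_OPTION = "SKIP-CHECKSUM";

    /**
     * Copies the user's choice into the configuration map.
     * A missing or unparsable option means "do not skip", so checksum
     * verification is performed by default.
     */
    static Map<String, String> applyChecksumOption(Map<String, String> userOptions,
                                                   Map<String, String> conf) {
        String raw = userOptions.get(USER_OPTION);
        // Boolean.parseBoolean returns true only for "true" (case-insensitive).
        boolean skip = raw != null && Boolean.parseBoolean(raw.trim());
        conf.put(SKIP_CHECKSUM_KEY, Boolean.toString(skip));
        return conf;
    }

    public static void main(String[] args) {
        // No option supplied: verification stays enabled (skip = false).
        Map<String, String> byDefault = applyChecksumOption(new HashMap<>(), new HashMap<>());
        System.out.println(byDefault.get(SKIP_CHECKSUM_KEY));

        // Explicit opt-out: the user accepts the risk in exchange for faster reads.
        Map<String, String> options = new HashMap<>();
        options.put(USER_OPTION, "true");
        Map<String, String> optedOut = applyChecksumOption(options, new HashMap<>());
        System.out.println(optedOut.get(SKIP_CHECKSUM_KEY));
    }
}
```

With this shape, the current behavior (skipping the check) remains available to users who
want it, while the safe setting becomes the default.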



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
