hadoop-common-issues mailing list archives

From "Aaron Kimball (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-6708) New file format for very large records
Date Fri, 16 Apr 2010 19:12:26 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-6708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857937#action_12857937 ]

Aaron Kimball commented on HADOOP-6708:


It sounds like TFile could be adapted to my needs. Thanks for explaining all of that so
thoroughly. Is this block offset/length index already maintained? It sounds like the skip-to-the-next-block
optimization is not yet implemented. Do you know what next steps are required to make
that happen?
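
To make the question concrete, here is a rough sketch of what I mean by skipping to the next block. It assumes the reader has an in-memory array of block start offsets in ascending order. BlockIndex and nextBlockOffset are hypothetical names for illustration only, not TFile's actual index structure or API, and block lengths are left out for brevity.

```java
import java.util.Arrays;

// Hypothetical illustration; not TFile's real index or reader API.
final class BlockIndex {
    // Start offset of each block in the file, in ascending order.
    private final long[] offsets;

    BlockIndex(long[] offsets) {
        this.offsets = offsets.clone();
    }

    /**
     * Returns the start offset of the first block that begins at or after
     * pos, or -1 if no such block exists.
     */
    long nextBlockOffset(long pos) {
        int i = Arrays.binarySearch(offsets, pos);
        if (i < 0) {
            // Not found: binarySearch returns -(insertionPoint) - 1, and the
            // insertion point is the index of the first offset greater than pos.
            i = -i - 1;
        }
        return i < offsets.length ? offsets[i] : -1L;
    }
}
```

A reader that wants to abandon the rest of a large record could then seek straight to nextBlockOffset(currentPosition) instead of reading and discarding the intervening bytes.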

The other thing that needs to happen is an API that supports long-valued lengths. Should I submit
a patch that modifies the existing method signatures, or provide additional methods (e.g.,
as in http://commons.apache.org/io/api-1.4/org/apache/commons/io/input/CountingInputStream.html)?
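
A minimal sketch of the "additional methods" option, modeled on how I understand CountingInputStream to handle it (an int-valued getCount() kept alongside a long-valued getByteCount()). LengthTracker, getLength, and getByteLength are hypothetical names for illustration, not existing Hadoop or TFile methods.

```java
// Hypothetical illustration; LengthTracker is not a Hadoop or TFile class.
final class LengthTracker {
    private long length;

    /** Records n more bytes; rejects negative values and detects long overflow. */
    void add(long n) {
        if (n < 0) {
            throw new IllegalArgumentException("negative length: " + n);
        }
        length = Math.addExact(length, n);
    }

    /**
     * Existing-style int accessor, kept so current callers still compile.
     * Throws ArithmeticException rather than silently truncating when the
     * length does not fit in an int.
     */
    int getLength() {
        return Math.toIntExact(length);
    }

    /** Added long-valued accessor for records larger than 2 GB. */
    long getByteLength() {
        return length;
    }
}
```

The trade-off is that the existing signatures stay source compatible, at the cost of two accessors for the same value. Changing the existing signatures to return long would be cleaner but would break current callers.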

> New file format for very large records
> --------------------------------------
>                 Key: HADOOP-6708
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6708
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: io
>            Reporter: Aaron Kimball
>            Assignee: Aaron Kimball
>         Attachments: lobfile.pdf
> A file format that handles multi-gigabyte records efficiently, with lazy disk access

This message is automatically generated by JIRA.
If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

