hadoop-common-dev mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3315) New binary file format
Date Wed, 24 Sep 2008 00:11:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633977#action_12633977 ]

stack commented on HADOOP-3315:
-------------------------------

Oh, I see what you are saying about "not directly supporting random lookup".  I'd say that's
a bit of a hole in TFile, especially if you want to replace MapFile (MapFile.Reader.get(key)).

On "indexing the region by the endKey", pardon me, I'm not sure I follow.  IIUC, the index is
currently block-based, not key-based, so can I even make an index that has all keys?  Or can
you make an index that is key-based?  (Even if I could index all keys, small keys/values would
make for a big index, so I might need something like the MapFile index interval.)
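
To make the index-size trade-off concrete, here is a minimal sketch of a key-based sparse index that keeps one entry per `interval` records, in the spirit of the MapFile index interval. Everything in it (the class `SparseKeyIndex`, its methods, the in-memory array standing in for a sorted file) is a hypothetical illustration, not TFile or MapFile code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of an interval-based key index; not TFile or MapFile API. */
final class SparseKeyIndex {
    private final String[] sortedKeys;  // stand-in for the records of a sorted file
    private final int interval;         // one index entry per `interval` records
    private final List<String> indexKeys = new ArrayList<>();
    private final List<Integer> indexPositions = new ArrayList<>();

    SparseKeyIndex(String[] sortedKeys, int interval) {
        if (interval < 1) {
            throw new IllegalArgumentException("interval must be >= 1");
        }
        this.sortedKeys = sortedKeys;
        this.interval = interval;
        // Sample every interval-th key; interval == 1 indexes all keys.
        for (int i = 0; i < sortedKeys.length; i += interval) {
            indexKeys.add(sortedKeys[i]);
            indexPositions.add(i);
        }
    }

    /** Number of index entries held in memory. */
    int indexSize() {
        return indexKeys.size();
    }

    /** Returns the record position of the key, or -1 if it is absent. */
    int find(String key) {
        // Binary search for the last index entry whose key is <= the target.
        int lo = 0;
        int hi = indexKeys.size() - 1;
        int slot = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (indexKeys.get(mid).compareTo(key) <= 0) {
                slot = mid;
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        if (slot < 0) {
            return -1;  // target sorts before the first key
        }
        // Scan forward through at most `interval` records from the indexed position.
        int start = indexPositions.get(slot);
        int end = Math.min(start + interval, sortedKeys.length);
        for (int i = start; i < end; i++) {
            int cmp = sortedKeys[i].compareTo(key);
            if (cmp == 0) {
                return i;
            }
            if (cmp > 0) {
                break;  // passed the place where the key would be
            }
        }
        return -1;
    }
}
```

With `interval` set to 1 this degenerates into an index of all keys; a larger interval shrinks the index at the cost of a short scan on each lookup.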

When you say attribute, do you mean an attribute of the key, or something else?

Thanks for entertaining my questions Hong.

Any remarks on my questions in 'stack - 23/Sep/08 02:26 PM'?  Thanks.

> New binary file format
> ----------------------
>
>                 Key: HADOOP-3315
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3315
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: io
>            Reporter: Owen O'Malley
>            Assignee: Amir Youssefi
>         Attachments: HADOOP-3315_20080908_TFILE_PREVIEW_WITH_LZO_TESTS.patch, HADOOP-3315_20080915_TFILE.patch, TFile Specification Final.pdf
>
>
> SequenceFile's block compression format is too complex and requires 4 codecs to compress
> or decompress. It would be good to have a file format that only needs 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

