hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3315) New binary file format
Date Mon, 12 May 2008 19:37:56 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12596175#action_12596175 ]

Doug Cutting commented on HADOOP-3315:

> So the writer's constructor should have Serializer<K> and Serializer<V>
> parameters [ ... ]

On second thought, we should just use a SerializationFactory to construct serializers and
deserializers for the key and value classes.

> It is useful to be able to get the key/value class names without the class.

We should have constants defined for metadata keys.  So getting the key class should look something


With such constants, we may not need methods like getKeyClass() at all...
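
As a rough illustration of the idea, here is a minimal sketch of reading the key class name through a metadata constant. Everything in it is an assumption made for illustration: the holder class MetaKeys, the constant names and their string values, and the stand-in MetaReaderSketch are hypothetical and are not part of the proposed format or of any existing Hadoop API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder for well-known metadata keys.
// The names and string values are illustrative assumptions only.
final class MetaKeys {
    static final String KEY_CLASS = "key.class";
    static final String VALUE_CLASS = "value.class";

    private MetaKeys() {}
}

// Hypothetical stand-in for a file reader that exposes string metadata.
final class MetaReaderSketch {
    private final Map<String, String> meta;

    MetaReaderSketch(Map<String, String> meta) {
        // Defensive copy so later changes to the caller's map are not visible.
        this.meta = new HashMap<>(meta);
    }

    // Returns the metadata value for the key, or null if it is absent.
    String getMeta(String key) {
        return meta.get(key);
    }
}
```

A caller would then write something like reader.getMeta(MetaKeys.KEY_CLASS) to obtain the class name as a String, and only call Class.forName on it when the class itself is actually needed.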

> Does the metadata value have to be a String?

I'd vote for supporting both String and byte[] as metadata values, perhaps with methods like:

Writer#setMeta(String key, String value);
Writer#setMeta(String key, byte[] value);

String Reader#getMeta(String key);
byte[] Reader#getMetaBytes(String key);
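
One way these four methods could fit together is sketched below. It is only a sketch under assumptions: the writer and reader sides are folded into a single hypothetical class, MetaStoreSketch, backed by an in-memory map; values are stored as byte[] internally; and String values are encoded as UTF-8, which is a choice made here and not something the proposal specifies.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in combining the Writer and Reader metadata methods.
final class MetaStoreSketch {
    // All values are kept as bytes; strings are one encoding of those bytes.
    private final Map<String, byte[]> meta = new HashMap<>();

    // Stores a String value, encoded as UTF-8 (an assumption of this sketch).
    void setMeta(String key, String value) {
        setMeta(key, value.getBytes(StandardCharsets.UTF_8));
    }

    // Stores a raw byte[] value; the array is copied so callers cannot mutate it later.
    void setMeta(String key, byte[] value) {
        meta.put(key, value.clone());
    }

    // Returns the value decoded as UTF-8, or null if the key is absent.
    String getMeta(String key) {
        byte[] raw = meta.get(key);
        return raw == null ? null : new String(raw, StandardCharsets.UTF_8);
    }

    // Returns a copy of the raw bytes, or null if the key is absent.
    byte[] getMetaBytes(String key) {
        byte[] raw = meta.get(key);
        return raw == null ? null : raw.clone();
    }
}
```

Storing everything as bytes keeps a single representation in the file, with the String methods acting as a convenience layer on top of it.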

> New binary file format
> ----------------------
>                 Key: HADOOP-3315
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3315
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: io
>            Reporter: Owen O'Malley
>            Assignee: Srikanth Kakani
>         Attachments: Tfile-1.pdf
> SequenceFile's block compression format is too complex and requires 4 codecs to compress
> or decompress. It would be good to have a file format that only needs 
