hadoop-hdfs-dev mailing list archives

From 张春玮 <zcwe...@gmail.com>
Subject help:2 problems in using hadoop sequencefile
Date Wed, 19 Jan 2011 02:15:29 GMT

I am an HDFS beginner. I use Hadoop 0.20.2, and my system needs to store
many small files, whose number grows every day. I therefore adopted
SequenceFile to work around the "large number of small files" problem.
The problem appears in the following situation:

    public static void testSequenceFileWrite(String path, int fileCount,
            SequenceFile.CompressionType type) throws Throwable {
      // fs (FileSystem) and conf (Configuration) are fields initialized elsewhere
      Writer w = null;
      try {
        w = SequenceFile.createWriter(fs, conf, new Path(path),
              BytesWritable.class, BytesWritable.class, type);
        for (int i = 0; i < fileCount; i++) {
           // each record simulates one small file of (4097 + i) bytes
           byte bs[] = new byte[i + 1 + 4096];
           for (int j = 0; j < bs.length; j++) {
              bs[j] = (byte) i;
           }
           // the end of this line is cut off in the archive; getBytes() is assumed
           BytesWritable key = new BytesWritable(String.valueOf(i + 4000).getBytes());
           BytesWritable value = new BytesWritable(bs);
           System.out.printf("%d %d\n", i, w.getLength());
           w.append(key, value);
        }
      } catch (Throwable t) {
        // handler body is missing from the archive; rethrow so errors are not hidden
        throw t;
      } finally {
        if (w != null) {
          // body is missing from the archive; closing the writer is assumed
          w.close();
        }
      }
    }

    public static void main(String args[]) throws Throwable {
      testSequenceFileWrite("/test", 100, SequenceFile.CompressionType.RECORD);
      testSequenceFileWrite("/test", 100, SequenceFile.CompressionType.RECORD);
    }

When I invoke this function twice in main, the second call overwrites the
file "/test" in HDFS instead of appending to it. Can you tell me how to
append data when reopening an existing SequenceFile in HDFS?
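
To make the desired behavior concrete, here is a minimal sketch of the same overwrite-versus-append distinction using plain java.io on the local filesystem. It is an analogy only: it does not use the Hadoop API, and the class name AppendVsOverwrite and the helper write() are made up for illustration.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

public class AppendVsOverwrite {
    // Hypothetical helper: writes data to f. With append == false the file
    // is truncated first; with append == true the bytes go to the end.
    static void write(File f, byte[] data, boolean append) throws IOException {
        try (FileOutputStream out = new FileOutputStream(f, append)) {
            out.write(data);
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("seq-demo", ".bin");
        f.deleteOnExit();
        byte[] batch = new byte[4096];

        // Two writes in overwrite mode: the second replaces the first,
        // which is the behavior observed with "/test" above.
        write(f, batch, false);
        write(f, batch, false);
        System.out.println("overwrite: " + f.length()); // prints "overwrite: 4096"

        // A write in append mode keeps the existing bytes, which is the
        // behavior being asked for.
        write(f, batch, true);
        System.out.println("append: " + f.length());    // prints "append: 8192"
    }
}
```

The question is whether reopening a SequenceFile in HDFS can behave like the second mode.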

A second question:
Are append operations supported for HAR files?
