hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Lucene-hadoop Wiki] Trivial Update of "SequenceFile" by Arun C Murthy
Date Wed, 16 Aug 2006 10:17:10 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.

The following page has been changed by Arun C Murthy:
http://wiki.apache.org/lucene-hadoop/SequenceFile

------------------------------------------------------------------------------
  
  Essentially there are 3 different file formats for !SequenceFiles depending on whether ''compression'' and ''block compression'' are active.
  
- 
+ [[BR]]
- However any of the above formats share a common ''header'' (which is used by the !SequenceFile.Reader to return the appropriate key/value pairs). The next section summarises the header:
+ However all of the above formats share a common ''header'' (which is used by the !SequenceFile.Reader to return the appropriate key/value pairs). The next section summarises the header:
+ [[Anchor(SeqFileHeader)]]
- [[Anchor(SeqFileHeader)]]===== SequenceFile Common Header =====
+ ===== SequenceFile Common Header =====
   * version - A byte array: SEQ<version no.>
   * keyClassName - String
   * valueClassName - String
@@ -30, +31 @@

   * blockCompression - A boolean which specifies if ''block compression'' is turned on for keys/values in this file.
   * sync - A sync marker to denote end of the header.
  
- 
+ [[BR]]
  The formats for Uncompressed/!RecordCompressed Writers are very similar:
  ===== Uncompressed/RecordCompressed Writer Format =====
   * [#SeqFileHeader Header]
@@ -38, +39 @@

     * Key
     * (Compressed?) Value
   * A sync-marker every 100 bytes or so to help in seeking to a random point in the file and then seeking to next ''record''.
- <br>
  
+ [[BR]]
  The format for the !BlockCompressedWriter is as follows:
  ===== BlockCompressed Writer Format =====
   * [#SeqFileHeader Header]
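
The sync-marker bullet in the Uncompressed/RecordCompressed format above says the marker lets a reader jump to an arbitrary offset and then find the next ''record''. A minimal sketch of that idea follows. It is a toy illustration, not the actual !SequenceFile.Reader code: the class name `SyncScanSketch`, the method `nextSync`, and the 4-byte marker are all hypothetical, and the real on-disk encoding of the marker is not described in this message.

```java
import java.util.Arrays;

// Toy illustration only: scans an in-memory byte array for a sync marker.
// Names and the marker bytes are assumptions made for this example.
public class SyncScanSketch {

    /**
     * Returns the index of the first occurrence of {@code sync} that starts
     * at or after {@code from}, or -1 if no complete marker is found.
     */
    public static int nextSync(byte[] data, byte[] sync, int from) {
        if (sync.length == 0) {
            throw new IllegalArgumentException("sync marker must not be empty");
        }
        for (int i = Math.max(from, 0); i + sync.length <= data.length; i++) {
            byte[] window = Arrays.copyOfRange(data, i, i + sync.length);
            if (Arrays.equals(window, sync)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Hypothetical 4-byte marker; a real file would define its own.
        byte[] sync = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE};
        // Payload bytes with the marker embedded at two positions.
        byte[] data = {
            1, 2,
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE,
            3, 4, 5,
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE,
            6
        };
        // Seek to an arbitrary offset (here, inside the first marker),
        // then scan forward to the next complete marker.
        int markerAt = nextSync(data, sync, 3);
        if (markerAt >= 0) {
            int recordStart = markerAt + sync.length;
            System.out.println("next record region starts at offset " + recordStart);
        }
    }
}
```

A linear scan is used here for clarity. A reader that lands in the middle of a record cannot tell where that record ends, so it skips forward to the next marker and resumes from the bytes that follow it.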
