hadoop-common-issues mailing list archives

From "Chris Douglas (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-6196) sync(0); next() breaks SequenceFile
Date Tue, 01 Sep 2009 04:03:32 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-6196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Chris Douglas updated HADOOP-6196:

    Attachment: C6196-test.patch

bq. So, it turns out that SequenceFile.Writer generates its sync block and headers statically
at classloader time, so if it generates the 'right' binary data, it'll evade this bug.

The unit test doesn't validate the data it reads. Examining the records read shows that when
SequenceFile doesn't fail after a call to sync(0), it returns invalid records or skips records.
It doesn't look like it evades the bug so much as it fails silently. The fix to the header will
work, but we must also fix the Reader and backport that fix.

The random behavior comes from the sync marker. After a {{sync(long)}}, the reader backs up
to what it expects to be the start of the sync entry and reads the word there as the record
length. Since that position is the word before the marker, it reads the last word written by
the metadata (00 00 00 00 if the metadata are empty) as the record length, then the first bytes
of the sync marker as the key. Whether this causes OOM, EOF, or no exception is as random as
the sync marker and dependent on the key type. I've attached a patch that adds some trace
debugging to illustrate (and a modified version of Jay's test).
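
To make the misread above concrete, here is a minimal, self-contained sketch. It is not the real SequenceFile.Reader code and the byte layout is deliberately simplified; the class name {{HeaderMisreadSketch}}, its methods, and the one-word key are assumptions made only for illustration. It shows that reading from the word before an unescaped header sync marker yields the metadata's trailing zero word as the "length" and the marker's first bytes as the "key".

```java
import java.nio.ByteBuffer;

// Toy model of the bytes around the end of a SequenceFile header.
// Simplified for illustration; this is NOT the real on-disk format.
final class HeaderMisreadSketch {
    static final int SYNC_SIZE = 16; // sync marker length assumed for this sketch

    /** Builds [4-byte empty-metadata word][sync marker][one record: length, key]. */
    static ByteBuffer buildFile(byte[] syncMarker, int recordLength, int key) {
        ByteBuffer buf = ByteBuffer.allocate(4 + SYNC_SIZE + 8);
        buf.putInt(0);            // empty metadata: last word is 00 00 00 00
        buf.put(syncMarker);      // header sync marker, with no escape word before it
        buf.putInt(recordLength); // first real record starts here
        buf.putInt(key);
        buf.flip();
        return buf;
    }

    /** Interprets the two words at the given offset as (record length, key). */
    static int[] readLengthAndKeyAt(ByteBuffer file, int position) {
        ByteBuffer view = file.duplicate(); // independent position, same bytes
        view.position(position);
        int length = view.getInt();
        int key = view.getInt();
        return new int[] {length, key};
    }
}
```

In this sketch, reading at offset 0 (the word before the marker) returns a length of 0 and a key made of the marker's first four bytes, while reading at offset 20 returns the real record. Since the marker is random, so is the bogus key.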

Since this can cause silent failures and data loss, and because the SequenceFile format has
been stable for over 2 years, I don't know that an incompatible change fixing this for future
SequenceFiles is practical. In discussing this with Arun, the solution that seemed most appropriate
and feasible was to record the end of the header in init() and adjust sync(long) calls within
that range to the first record boundary. Tentatively, setting seenSync to true when syncing
to within this range may also be necessary, but we'll need to study it.
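
The adjustment described above could look roughly like the following sketch. This is a hypothetical helper, not a patch against SequenceFile.Reader: the class {{SyncClampSketch}}, the field {{headerEnd}}, and both method names are assumptions, and whether seenSync should be set for such calls is still the open question noted above.

```java
// Hypothetical helper illustrating the proposed adjustment; not the actual Reader code.
final class SyncClampSketch {
    // Offset of the first record, assumed to be recorded once after the header is read.
    private final long headerEnd;

    SyncClampSketch(long headerEnd) {
        if (headerEnd < 0) {
            throw new IllegalArgumentException("headerEnd must be non-negative");
        }
        this.headerEnd = headerEnd;
    }

    /** True if the requested sync position falls inside the header. */
    boolean landsInHeader(long requested) {
        return requested < headerEnd;
    }

    /** Moves positions inside the header to the first record boundary; others are unchanged. */
    long adjust(long requested) {
        return Math.max(requested, headerEnd);
    }
}
```

With a header ending at offset 100, {{adjust(0)}} yields 100 and {{adjust(150)}} stays at 150, so a sync(0) would begin at the first record instead of backing into the header.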

Very good catch, Jay.

> sync(0); next() breaks SequenceFile
> -----------------------------------
>                 Key: HADOOP-6196
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6196
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Jay Booth
>         Attachments: C6196-test.patch, hadoop-6196-multipatch.txt, sync-bug.patch
> Currently, the end of the SequenceFile header is a sync block that isn't prefaced with
> SYNC_ESCAPE.  This means that sync(0) followed by next() fails.  Patch w/ test attached, bumps
> VERSION from 6 to 7.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
