hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: data loss when splitLog()
Date Wed, 19 Oct 2011 18:00:27 GMT
Mmm ok, how did you kill the master exactly? kill -9 or a normal shutdown? I
think I could see how it would happen in the case of a normal shutdown, but
even then it would *really really* help to see the logs of what's going on.

J-D

On Tue, Oct 18, 2011 at 6:37 PM, Mingjian Deng <koven2049@gmail.com> wrote:

> @J-D: I use Cloudera CDH3. The loss can't be reproduced every time, but it
> can happen when the following log appears:
> "2011-10-19 04:44:09,065 DEBUG
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Used 134218288 bytes
> of buffered edits, waiting for IO threads..."
> This log was printed many times and the value 134218288 never changed. I
> killed the master and restarted it, and the data was lost. So I think those
> 134218288 bytes were the last entries held in memory. In the following code:
> " synchronized (dataAvailable) {
>        totalBuffered += incrHeap;
>        while (totalBuffered > maxHeapUsage
>            && (thrown == null || thrown.get() == null)) {
>          LOG.debug("Used " + totalBuffered
>              + " bytes of buffered edits, waiting for IO threads...");
>          dataAvailable.wait(3000);
>        }
>        dataAvailable.notifyAll();
>      }"
> If totalBuffered <= maxHeapUsage and there are no more entries in the .logs
> dir, archiveLogs would execute even before the writer threads finish.
>
> 2011/10/19 Jean-Daniel Cryans <jdcryans@apache.org>
>
> > Even if the files aren't closed properly, the fact that you are appending
> > should persist them.
> >
> > Are you using a version of Hadoop that supports sync?
> >
> > Do you have logs that show the issue where the logs were moved but not
> > written?
> >
> > Thx,
> >
> > J-D
> >
> > On Tue, Oct 18, 2011 at 7:40 AM, Mingjian Deng <koven2049@gmail.com>
> > wrote:
> >
> > > Hi:
> > >    One case causes data loss in our cluster. We were blocked in splitLog
> > > because of an error in our HDFS, so we killed the master. Some hlog files
> > > were moved from .logs to .oldlogs before they were written to
> > > .recovered.edits, so the region servers couldn't replay these files.
> > >    In HLogSplitter.java, we found:
> > >    ...
> > >    archiveLogs(srcDir, corruptedLogs, processedLogs, oldLogDir, fs,
> > >        conf);
> > >    } finally {
> > >      LOG.info("Finishing writing output logs and closing down.");
> > >      splits = outputSink.finishWritingAndClose();
> > >    }
> > >    Why is archiveLogs called before outputSink.finishWritingAndClose()?
> > > If the write threads fail but archiveLogs succeeds, won't these hlog
> > > files be moved to .oldlogs and be impossible to split on the next startup?
> > >
> >
>
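
The ordering concern raised in this thread can be sketched in Java. This is a minimal simulation, not HBase code: `SplitOrderingSketch`, `FakeOutputSink`, and `splitThenArchive` are hypothetical names, and the only identifier borrowed from the thread is `finishWritingAndClose`. It shows one possible way to make archiving conditional on the writers having finished, assuming a failed writer surfaces as an `IOException`.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in; this is not the real HLogSplitter or its output sink.
class SplitOrderingSketch {

    /** Simulated sink whose writer threads may fail before all edits are flushed. */
    static class FakeOutputSink {
        private final boolean writerFails;

        FakeOutputSink(boolean writerFails) {
            this.writerFails = writerFails;
        }

        /** Pretends to flush buffered edits; throws if a writer thread failed. */
        List<String> finishWritingAndClose() throws IOException {
            if (writerFails) {
                throw new IOException("writer thread failed before flushing edits");
            }
            List<String> splits = new ArrayList<>();
            splits.add("recovered.edits/0000001"); // illustrative path only
            return splits;
        }
    }

    /** Returns true only if the source logs were archived. */
    static boolean splitThenArchive(boolean writerFails) {
        FakeOutputSink sink = new FakeOutputSink(writerFails);
        try {
            // 1. Wait until every buffered edit has been written and closed.
            List<String> splits = sink.finishWritingAndClose();
            // 2. Only now is it safe to move the source logs away.
            System.out.println("archiving source logs after "
                + splits.size() + " split file(s)");
            return true;
        } catch (IOException e) {
            // Source logs stay in place so a later run can split them again.
            System.out.println("not archiving: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(splitThenArchive(false));
        System.out.println(splitThenArchive(true));
    }
}
```

In this sketch a writer failure leaves the source logs untouched, which is the behavior the original question suggests is missing when archiving happens first.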
