hadoop-zookeeper-dev mailing list archives

From "Erwin Tam (JIRA)" <j...@apache.org>
Subject [jira] Commented: (ZOOKEEPER-464) Need procedure to garbage collect ledgers
Date Thu, 08 Apr 2010 22:06:36 GMT

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855183#action_12855183 ]

Erwin Tam commented on ZOOKEEPER-464:

There was an intermittent problem with the ledger delete JUnit tests prior to the last patch
uploaded (which resolved it).  I'll document the bug and the fix here.

The ledger delete JUnit tests were failing intermittently, and the cause was related to an issue I
saw earlier when running unit tests with a very small entry log size limit (2K).  When
an entry log rolls over, we create a new one by first writing the 1024-byte "BKLO" header
to the beginning of the file.  The problem is that this header byte buffer is statically defined.
In our JUnit tests, we have multiple Bookie servers (and thus EntryLogger instances) in the
same JVM.  If more than one EntryLogger rolls over its current log and starts the next
one at the same time, they all access the same entry log header buffer.  This creates problems
since the static header isn't accessed in a synchronized way.

The header byte buffer is cleared before it is written to the log file.  Since it is static,
one thread could clear it, and then another thread (from a second Bookie server) could clear
it at about the same time.  The first thread then writes the header, but when it is done, the
buffer's internal position is left pointing at the end and is not reset.  The second
thread then writes from a header buffer that is no longer cleared/reset, so it has nothing
left to write.  What ends up happening is that the entry logs in the second Bookie are created
without the header.  When we read through those files later on to figure out which ledgers
they contain, we read incorrect values and try to allocate byte buffers based on an incorrect
length field (basically junk random bytes).  This causes the Java heap space error.
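
The interleaving described above can be replayed deterministically in a single thread.
The following is only a minimal sketch, not the actual EntryLogger code: the class name
SharedHeaderDemo, its methods, and the in-memory "log files" are hypothetical stand-ins.
Only the 1024-byte "BKLO" header and the shared buffer come from the description above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;

public class SharedHeaderDemo {
    static final int HEADER_SIZE = 1024;

    // Hypothetical stand-in for the header buffer shared by all loggers.
    static ByteBuffer newHeader() {
        ByteBuffer header = ByteBuffer.allocate(HEADER_SIZE);
        header.put("BKLO".getBytes(StandardCharsets.US_ASCII));
        return header;
    }

    // Writes whatever remains in the buffer to a fresh in-memory "log file"
    // and returns the number of bytes that ended up in it.
    static int writeHeader(ByteBuffer header) {
        ByteArrayOutputStream log = new ByteArrayOutputStream();
        WritableByteChannel channel = Channels.newChannel(log);
        try {
            while (header.hasRemaining()) {
                channel.write(header); // advances the buffer's position
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return log.size();
    }

    // Replays the problematic ordering: both loggers clear the shared buffer,
    // then each writes it in turn.
    static int[] simulateRace() {
        ByteBuffer shared = newHeader();
        shared.clear();                   // logger A clears
        shared.clear();                   // logger B clears "at the same time"
        int first = writeHeader(shared);  // logger A writes the full header
        int second = writeHeader(shared); // logger B finds the position at the end
        return new int[] {first, second};
    }

    public static void main(String[] args) {
        int[] written = simulateRace();
        // Logger A's file gets 1024 header bytes, logger B's file gets 0.
        System.out.println("first=" + written[0] + " second=" + written[1]);
    }
}
```

In this replay the second "log file" receives no header bytes at all, which matches the
headerless entry logs seen in the second Bookie.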

The fix is simple: make this log file header a non-static variable and initialize
it in the EntryLogger constructor.  In practice, we shouldn't be running multiple Bookies
within the same JVM, so we wouldn't run into this problem.
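
A rough sketch of the shape of that fix follows.  It is not the patch itself: the class
EntryLoggerSketch, the field name logfileHeader, and the helper methods are hypothetical
and exist only to illustrate a per-instance header buffer created in the constructor.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class EntryLoggerSketch {
    static final int LOGFILE_HEADER_SIZE = 1024;

    // Per-instance header: each logger owns its own buffer and its own
    // position/limit state, so loggers in one JVM cannot interfere.
    private final ByteBuffer logfileHeader;

    public EntryLoggerSketch() {
        logfileHeader = ByteBuffer.allocate(LOGFILE_HEADER_SIZE);
        logfileHeader.put("BKLO".getBytes(StandardCharsets.US_ASCII));
    }

    // Returns a copy of the full header, as it would be written to a new log.
    // Uses a duplicate so the instance buffer's position is left untouched.
    byte[] headerBytes() {
        ByteBuffer view = logfileHeader.duplicate();
        view.clear(); // position = 0, limit = capacity; contents are kept
        byte[] out = new byte[view.remaining()];
        view.get(out);
        return out;
    }

    // True only if both loggers use the very same buffer object.
    boolean sharesHeaderWith(EntryLoggerSketch other) {
        return logfileHeader == other.logfileHeader;
    }

    public static void main(String[] args) {
        EntryLoggerSketch a = new EntryLoggerSketch();
        EntryLoggerSketch b = new EntryLoggerSketch();
        System.out.println("shared buffer: " + a.sharesHeaderWith(b));
        System.out.println("header length: " + b.headerBytes().length);
    }
}
```

With one buffer per instance, a rollover in one logger no longer affects what another
logger writes, without needing any synchronization on the header.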

> Need procedure to garbage collect ledgers
> -----------------------------------------
>                 Key: ZOOKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-464
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: contrib-bookkeeper
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Erwin Tam
>             Fix For: 3.4.0
>         Attachments: zookeeper-464-log.txt, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch,
> An application using BookKeeper is likely to use a large number of ledgers over time.
> Such an application might not need all ledgers created over time and might want to delete
> some of these ledgers to free up some space on bookies. The idea of this jira is to implement
> a procedure that enables an application to garbage-collect unwanted ledgers.
> To garbage-collect a ledger, we need to delete the ledger metadata on ZooKeeper, and
> delete the ledger data on corresponding bookies.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
