hbase-dev mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-1880) DeleteColumns are not recovered properly from the write-ahead-log
Date Fri, 02 Oct 2009 04:02:23 GMT

    [ https://issues.apache.org/jira/browse/HBASE-1880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761466#action_12761466 ]

stack commented on HBASE-1880:
------------------------------

Patch looks good.  Moving writeToWAL till after the timestamp gets set seems like a no duh
kinda thing.  (Can you fix the formatting?  It's anti-easy-read at the moment, obviously the
product of a machine formatter.)
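
To illustrate why the ordering matters, here is a minimal, self-contained sketch. It is not the HBase code path: the Edit class, appendToWal, resolveTimestamp and the list standing in for the WAL are all hypothetical names made up for this example, and the "not yet assigned" timestamp is modeled as Long.MAX_VALUE (the issue description calls it MAX_INT).

```java
import java.util.ArrayList;
import java.util.List;

public class WalOrderingSketch {
    // Assumption: placeholder meaning "timestamp not yet assigned".
    static final long LATEST_TIMESTAMP = Long.MAX_VALUE;

    // Hypothetical edit with a mutable timestamp.
    static final class Edit {
        final String row;
        final String column;
        long timestamp;

        Edit(String row, String column, long timestamp) {
            this.row = row;
            this.column = column;
            this.timestamp = timestamp;
        }
    }

    // Hypothetical stand-in for the WAL: records the timestamp each edit
    // carried at the moment it was appended.
    static final List<Long> wal = new ArrayList<>();

    static void appendToWal(Edit e) {
        wal.add(e.timestamp);
    }

    // Replace the placeholder with the current time.
    static void resolveTimestamp(Edit e, long now) {
        if (e.timestamp == LATEST_TIMESTAMP) {
            e.timestamp = now;
        }
    }

    // Buggy order: log first, then resolve. The log keeps the placeholder.
    static long buggyOrder(long now) {
        wal.clear();
        Edit e = new Edit("r1", "cf:q", LATEST_TIMESTAMP);
        appendToWal(e);
        resolveTimestamp(e, now);
        return wal.get(0);
    }

    // Fixed order: resolve first, then log. The log keeps the real time.
    static long fixedOrder(long now) {
        wal.clear();
        Edit e = new Edit("r1", "cf:q", LATEST_TIMESTAMP);
        resolveTimestamp(e, now);
        appendToWal(e);
        return wal.get(0);
    }

    public static void main(String[] args) {
        System.out.println("buggy order logged placeholder: "
                + (buggyOrder(1000L) == LATEST_TIMESTAMP));
        System.out.println("fixed order logged: " + fixedOrder(1000L));
    }
}
```

In the buggy order, anything that replays the log sees the placeholder rather than the time the edit was actually applied, which matches the "effectively lost" symptom in the report.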

So, no more reconstructionCache, just insert straight into memstore?  That solves another problem
I was worried about, in that we keep an eye on the size of the memstore but not on this
reconstructionCache.  Removing it looks like a good move.

So, these new edits are not going into a WAL at all?  They should?  (Make a new issue?  My
sense is that when we have a working flush, all of our recovery will need to come out of the
dark and get a spotlight shone on it.)

> DeleteColumns are not recovered properly from the write-ahead-log
> -----------------------------------------------------------------
>
>                 Key: HBASE-1880
>                 URL: https://issues.apache.org/jira/browse/HBASE-1880
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.20.0, 0.20.1, 0.21.0
>            Reporter: Clint Morgan
>            Priority: Critical
>         Attachments: 1880-v2.patch, 1880.patch
>
>
> I found a couple of issues:
>  - The timestamp is being set to now after it has been written to the wal. So if the
> WAL was flushed on that write, it gets in with ts of MAX_INT and is effectively lost.
>  - Even after that fix, I had issues getting the delete to apply properly. In my case,
> the WAL had a put to a column, then a DeleteColumn for the same column. The DeleteColumn KV
> had a later timestamp, but it was still lost on recovery. I traced around a bit, and it looks
> like the current approach of just using an HFile.writer to write the set of KVs read from
> the log will not work. There is special logic in MemStore for deletes that needs to happen
> before writing. I got around this by just adding to memstore in the log recovery process.
> Not sure if there are other implications of this.
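
The workaround described above (replaying log edits through the same in-memory path as live writes, so that delete handling runs before anything is written out) can be sketched as a toy model. This is an illustration under stated assumptions, not HBase's MemStore: ToyMemStore, KV, replay and the rule "a DeleteColumn marker covers puts to the same row and column with an equal or earlier timestamp" are all simplifications invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class ReplaySketch {
    enum Type { PUT, DELETE_COLUMN }

    // Hypothetical key-value edit as it might be read back from a log.
    static final class KV {
        final String row;
        final String column;
        final long ts;
        final Type type;

        KV(String row, String column, long ts, Type type) {
            this.row = row;
            this.column = column;
            this.ts = ts;
            this.type = type;
        }
    }

    // Toy in-memory store; NOT HBase's MemStore.
    static final class ToyMemStore {
        final List<KV> cells = new ArrayList<>();

        void add(KV kv) {
            cells.add(kv);
        }

        // Delete path: drop the puts the marker covers, then keep the marker.
        void delete(KV marker) {
            cells.removeIf(c -> c.type == Type.PUT
                    && c.row.equals(marker.row)
                    && c.column.equals(marker.column)
                    && c.ts <= marker.ts);
            cells.add(marker);
        }

        // Number of puts still present for the given row and column.
        long visiblePuts(String row, String column) {
            return cells.stream()
                    .filter(c -> c.type == Type.PUT
                            && c.row.equals(row)
                            && c.column.equals(column))
                    .count();
        }
    }

    // Replay log edits in order, routing deletes through the delete path
    // instead of dumping every KV to a file unprocessed.
    static ToyMemStore replay(List<KV> walEdits) {
        ToyMemStore ms = new ToyMemStore();
        for (KV kv : walEdits) {
            if (kv.type == Type.DELETE_COLUMN) {
                ms.delete(kv);
            } else {
                ms.add(kv);
            }
        }
        return ms;
    }

    public static void main(String[] args) {
        List<KV> log = new ArrayList<>();
        log.add(new KV("r1", "cf:q", 10L, Type.PUT));
        log.add(new KV("r1", "cf:q", 20L, Type.DELETE_COLUMN));
        System.out.println("puts visible after replay: "
                + replay(log).visiblePuts("r1", "cf:q"));
    }
}
```

The point of the sketch is only the routing: recovery reuses the store's own add and delete entry points, so whatever delete logic the store applies to live writes also applies to replayed ones.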

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

