hbase-issues mailing list archives

From "Aravind Menon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-2338) log recovery: deleted items may be resurrected
Date Tue, 30 Mar 2010 03:43:27 GMT

    [ https://issues.apache.org/jira/browse/HBASE-2338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851233#action_12851233 ]

Aravind Menon commented on HBASE-2338:
--------------------------------------

I think we found the issue. testDelete fails when we try to delete the two latest versions
of the same column within a column family. The specific test that fails is the following (TestClient.java,
line 1585, simplified):

put = new Put(ROWS[2]);
put.add(FAMILIES[1], QUALIFIER, ts[0], VALUES[0]);
put.add(FAMILIES[1], QUALIFIER, ts[1], VALUES[1]);
ht.put(put);

delete = new Delete(ROWS[2]);
delete.deleteColumn(FAMILIES[1], QUALIFIER);
delete.deleteColumn(FAMILIES[1], QUALIFIER);
ht.delete(delete);

get = new Get(ROWS[2]);
get.addFamily(FAMILIES[1]);
get.setMaxVersions(Integer.MAX_VALUE);
result = ht.get(get);
assertTrue("Expected 0 key but received " + result.size(),
        result.size() == 0);

Previously, when deleting specific column versions, the memstore was updated immediately
after each keyvalue was deleted, before the next delete in the loop was processed (HRegion.java,
line 1215). Thus, when a subsequent keyvalue was deleted, the "get" that retrieves the latest
timestamp returned the correct value.

With the patch, we have changed the memstore update order. The timestamps for all keyvalues
are updated first, before the keyvalues are written to log and memstore. So, if there are
two deletes, they both see the same latest version number for that column, and both
delete the same version. The older version is left in place, which is why the test still
finds a key.
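
To make the ordering difference concrete, here is a minimal sketch in plain Java. It is a toy model and not HBase code: the class name, the method names, and the use of a TreeSet of timestamps to stand in for the versions of one column are all assumptions made for illustration. Each "delete" resolves the latest timestamp and removes that version.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Toy model (hypothetical, not HBase code): one column as a set of version timestamps. */
public class DeleteOrderingSketch {

    /** Old order: resolve "latest" and apply each delete before handling the next one. */
    public static TreeSet<Long> resolveAndApplyOneByOne(TreeSet<Long> versions, int deletes) {
        TreeSet<Long> store = new TreeSet<>(versions);
        for (int i = 0; i < deletes && !store.isEmpty(); i++) {
            long latest = store.last();  // lookup sees the effect of earlier deletes
            store.remove(latest);
        }
        return store;
    }

    /** New order: resolve "latest" for every delete first, then apply them all. */
    public static TreeSet<Long> resolveAllThenApply(TreeSet<Long> versions, int deletes) {
        TreeSet<Long> store = new TreeSet<>(versions);
        List<Long> resolved = new ArrayList<>();
        for (int i = 0; i < deletes && !store.isEmpty(); i++) {
            resolved.add(store.last());  // store is unchanged, so each delete gets the same timestamp
        }
        for (long ts : resolved) {
            store.remove(ts);            // the second removal of the same timestamp is a no-op
        }
        return store;
    }

    public static void main(String[] args) {
        // Two versions of one column, standing in for ts[0] and ts[1] in the test.
        TreeSet<Long> versions = new TreeSet<>(List.of(1000L, 2000L));
        System.out.println("one by one:        " + resolveAndApplyOneByOne(versions, 2));
        System.out.println("resolve all first: " + resolveAllThenApply(versions, 2));
    }
}
```

In this model the first method leaves no versions behind, while the second leaves the older version (1000) in place, which matches the failure seen in testDelete.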


> log recovery: deleted items may be resurrected
> ----------------------------------------------
>
>                 Key: HBASE-2338
>                 URL: https://issues.apache.org/jira/browse/HBASE-2338
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.4
>            Reporter: Kannan Muthukkaruppan
>            Assignee: stack
>             Fix For: 0.20.4, 0.21.0
>
>         Attachments: delete.patch
>
>
> While working on HBASE-2283, noticed that if you do a put followed by a delete, and then
> crash the RS, and trigger log recovery to happen, then deleted entries may be resurrected.
> Surprisingly, the issue only affected delete of a specific column. Full row delete didn't
> run into this issue.
> ---
> Code inspection revealed that we might have an issue with timestamps & WAL stuff
> for delete that come in with "LATEST" timestamp. [Note: The "LATEST" timestamp is syntax sugar/hint
> to the RS to convert it to "now".]
> Basically, in:
> {code}
> delete(byte [] family, List<KeyValue> kvs, boolean writeToWAL)
> {code}
> the "kv.updateLatestStamp(byteNow);" time stamp massaging (from LATEST to now) happens
> *after* the WAL log.append() call. So the KeyValue entries written to the HLog do not have
> the massaged timestamp. On recovery, when these entries are replayed, we add them back to
> reconstructionCache but don't do anything with timestamps.
> The above could be the potential source of the problem. But there could be more to the
> problem than my simple analysis. For instance, we still don't know why full row delete worked
> fine, but delete of a specific column didn't work ok. Forking this off as a separate issue
> from HBASE-2283.
> [Note: Aravind is starting to take a look at this issue.]

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

