hbase-dev mailing list archives

From "Clint Morgan (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-1682) IndexedRegion does not properly handle deletes
Date Fri, 30 Oct 2009 22:04:59 GMT

    [ https://issues.apache.org/jira/browse/HBASE-1682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12772117#action_12772117 ]

Clint Morgan commented on HBASE-1682:

+1 reviewed, looks good: it's a simpler and correct way to fall back on the old versions.

TestIndexTable passes, and my indexing tests on top of hbase pass.

> IndexedRegion does not properly handle deletes
> ----------------------------------------------
>                 Key: HBASE-1682
>                 URL: https://issues.apache.org/jira/browse/HBASE-1682
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: Andrew McCall
>            Assignee: Clint Morgan
>         Attachments: HBASE-1682-1.patch, hbase-1682-2.patch, hbase-1682-3.patch
> I've been using the IndexedTable stuff from contrib and have come across a bit of an issue.
> When I delete a column, my indexes are removed for that column. I've run through the code
> in IndexedRegion and used very similar code in my own classes to recreate the index after
> I've run the delete.
> I've also noticed that if I run a Put after the Delete then the index will be re-created.
> Neither the Delete nor the subsequent Put in the second example uses any of the columns
> that are part of the index (either indexed or additional columns).
> {code:title=org.apache.hadoop.hbase.regionserver.tableindexed.IndexedRegion.java}
> @Override
>  public void delete(Delete delete, final Integer lockid, boolean writeToWAL)
>      throws IOException {
>    if (!getIndexes().isEmpty()) {
>      // Need all columns
>      NavigableSet<byte[]> neededColumns = getColumnsForIndexes(getIndexes());
>      Get get = new Get(delete.getRow());
>      for (byte [] col : neededColumns) {
>       get.addColumn(col);
>      }
>      Result oldRow = super.get(get, null);
>      SortedMap<byte[], byte[]> oldColumnValues = convertToValueMap(oldRow);
>      for (IndexSpecification indexSpec : getIndexes()) {
>        removeOldIndexEntry(indexSpec, delete.getRow(), oldColumnValues);
>      }
>      // Handle if there is still a version visible.
>      if (delete.getTimeStamp() != HConstants.LATEST_TIMESTAMP) {
>        get.setTimeRange(1, delete.getTimeStamp());
>        oldRow = super.get(get, null);
>        SortedMap<byte[], byte[]> currentColumnValues = convertToValueMap(oldRow);
>        LOG.debug("There are " + currentColumnValues + " entries to re-index");
>        for (IndexSpecification indexSpec : getIndexes()) {
>          if (IndexMaintenanceUtils.doesApplyToIndex(indexSpec, currentColumnValues)) {
>            updateIndex(indexSpec, delete.getRow(), currentColumnValues);
>          }
>        }
>      }
>    }
>    super.delete(delete, lockid, writeToWAL);
>  }
> {code}
> It seems that any delete will remove the indexes, but they will only be rebuilt if the
> delete is of a previous version of the row, and the index will then be built using data
> from the version prior to the one you've just deleted, which seems to mean it would, more
> often than not, be out of date.
> More broadly, it also occurs to me that it may make sense not to delete the indexes at
> all unless the Delete would otherwise affect them. In my case there isn't really any reason
> to remove the indexes, since the column I'm deleting is completely unrelated.
> Will follow with a patch shortly to resolve at least the first part of the issue. 
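
The second suggestion in the description (leave the indexes alone unless the Delete touches a column an index depends on) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from any of the attached patches: the class IndexDeleteCheck and the method affectsIndexes are hypothetical names, columns are modelled as plain "family:qualifier" strings rather than HBase byte arrays, an empty column set is assumed to stand for a whole-row delete, and family-wide deletes are not modelled.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

/**
 * Hypothetical sketch: decide whether a delete needs any index maintenance.
 * Plain Java collections stand in for the HBase types.
 */
final class IndexDeleteCheck {

  /**
   * @param deletedColumns columns named by the delete, as "family:qualifier";
   *                       an empty set is assumed to mean a whole-row delete
   * @param indexColumns   every column any index reads (indexed or additional)
   * @return true if index entries may need to be removed or rebuilt
   */
  static boolean affectsIndexes(Set<String> deletedColumns, Set<String> indexColumns) {
    if (deletedColumns.isEmpty()) {
      // A whole-row delete removes the indexed values as well.
      return true;
    }
    // disjoint(...) is true when the two sets share no element, so the
    // delete matters to the indexes only when the sets overlap.
    return !Collections.disjoint(deletedColumns, indexColumns);
  }

  public static void main(String[] args) {
    Set<String> indexColumns = new TreeSet<>(Arrays.asList("info:name", "info:email"));

    Set<String> unrelated = new TreeSet<>(Arrays.asList("stats:visits"));
    Set<String> indexed = new TreeSet<>(Arrays.asList("info:name"));
    Set<String> wholeRow = new TreeSet<>();

    System.out.println(affectsIndexes(unrelated, indexColumns)); // false
    System.out.println(affectsIndexes(indexed, indexColumns));   // true
    System.out.println(affectsIndexes(wholeRow, indexColumns));  // true
  }
}
```

With a guard of this kind at the top of delete(), the case described above (deleting a column unrelated to any index) would skip index removal entirely and go straight to the underlying delete.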

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
