hbase-user mailing list archives

From Andrew McCall <andrew.mcc...@goroam.net>
Subject Re: IndexedTable and Delete
Date Tue, 21 Jul 2009 22:11:45 GMT
Cool, will do.

Andrew

On 21 Jul 2009, at 22:13, Clint Morgan wrote:

> Yeah, you've basically got it right.  It's a bug.
>
> Please open a JIRA (and perhaps take a stab at a patch). It's low on my
> priority list as we mostly just do updates or delete whole rows.
>
> -clint
>
> On Tue, Jul 21, 2009 at 1:04 PM, Andrew McCall <andrew.mccall@goroam.net> wrote:
>
>> Hi,
>>
>> I've been using the IndexedTable stuff from contrib and have come across
>> a bit of an issue.
>>
>> When I delete a column, my indexes are removed for that column. I've run
>> through the code in IndexedRegion and used very similar code in my own
>> classes to recreate the index after I've run the delete.
>>
>> I've also noticed that if I run a Put after the Delete then the index
>> will be re-created.
>>
>> Neither the Delete nor the subsequent Put in the second example uses any
>> of the columns that are part of the index (either indexed or additional
>> columns).
>>
>> If I'm not mistaken, the problem lies in the code to rebuild the index
>> from org.apache.hadoop.hbase.regionserver.tableindexed.IndexedRegion:
>>
>> @Override
>> public void delete(Delete delete, final Integer lockid, boolean writeToWAL)
>>     throws IOException {
>>
>>   if (!getIndexes().isEmpty()) {
>>     // Need all columns
>>     NavigableSet<byte[]> neededColumns = getColumnsForIndexes(getIndexes());
>>
>>     Get get = new Get(delete.getRow());
>>     for (byte [] col : neededColumns) {
>>       get.addColumn(col);
>>     }
>>
>>     Result oldRow = super.get(get, null);
>>     SortedMap<byte[], byte[]> oldColumnValues = convertToValueMap(oldRow);
>>
>>     for (IndexSpecification indexSpec : getIndexes()) {
>>       removeOldIndexEntry(indexSpec, delete.getRow(), oldColumnValues);
>>     }
>>
>>     // Handle if there is still a version visible.
>>     if (delete.getTimeStamp() != HConstants.LATEST_TIMESTAMP) {
>>       get.setTimeRange(1, delete.getTimeStamp());
>>       oldRow = super.get(get, null);
>>       SortedMap<byte[], byte[]> currentColumnValues = convertToValueMap(oldRow);
>>       LOG.debug("There are " + currentColumnValues + " entries to re-index");
>>
>>       for (IndexSpecification indexSpec : getIndexes()) {
>>         if (IndexMaintenanceUtils.doesApplyToIndex(indexSpec, currentColumnValues)) {
>>           updateIndex(indexSpec, delete.getRow(), currentColumnValues);
>>         }
>>       }
>>     }
>>   }
>>   super.delete(delete, lockid, writeToWAL);
>> }
>>
>>
>> I'm not sure if I've got this right, but it seems that any delete will
>> remove the indexes, and they will only be rebuilt if the delete is of a
>> previous version of the row. Even then, the index is rebuilt using data
>> from the version prior to the one you've just deleted, which seems to
>> mean it would usually be out of date.
>>
>> More broadly, it also occurs to me that it may make sense not to delete
>> the indexes at all unless the Delete would otherwise affect them. In my
>> case there isn't really any reason to remove the indexes, since the
>> column I'm deleting is completely unrelated.
>>
>> Cheers,
>> Andrew
>>
>>
>>
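
The suggestion quoted above, leaving the indexes alone unless the Delete
actually touches a column they depend on, could be sketched roughly as
follows. This is only an illustration in plain Java collections, not the
IndexedRegion code: DeleteIndexCheck and affectsIndex are hypothetical
names, columns are written as "family:qualifier" Strings rather than
byte[], and an empty set of deleted columns is assumed to mean "delete the
whole row". Family-level deletes are not modelled.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of HBase: decides whether a delete
// needs any index maintenance at all.
class DeleteIndexCheck {

  // Returns true if the delete could change an index entry.
  // Assumption: an empty deletedColumns set means the whole row is deleted.
  static boolean affectsIndex(Set<String> deletedColumns, Set<String> indexColumns) {
    if (indexColumns.isEmpty()) {
      return false; // no indexes defined, nothing to maintain
    }
    if (deletedColumns.isEmpty()) {
      return true; // whole-row delete always removes indexed data
    }
    for (String col : deletedColumns) {
      if (indexColumns.contains(col)) {
        return true; // the delete touches an indexed or additional column
      }
    }
    return false; // unrelated columns only, the index can be left as-is
  }

  public static void main(String[] args) {
    // Columns the (made-up) index depends on.
    Set<String> indexColumns = new HashSet<>(Arrays.asList("info:name", "info:email"));

    Set<String> unrelated = new HashSet<>(Arrays.asList("stats:visits"));
    Set<String> related = new HashSet<>(Arrays.asList("info:email"));
    Set<String> wholeRow = new HashSet<>();

    System.out.println(affectsIndex(unrelated, indexColumns)); // false
    System.out.println(affectsIndex(related, indexColumns));   // true
    System.out.println(affectsIndex(wholeRow, indexColumns));  // true
  }
}
```

With a check like this in front of the index handling, a delete of an
unrelated column would skip both the removal and the rebuild, so the
existing index entries would simply stay in place.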

