kudu-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Paul Brannan <paul.bran...@thesystech.com>
Subject Re: Delete row by partial key
Date Thu, 16 Feb 2017 16:36:09 GMT
Thank you for the quick response!

I do understand the desire to not present in the API the appearance of a
feature that isn't really there.  As an example, I was reading KUDU-1291
yesterday and realized that since predicates are specified by name and not
column index, it's easy to accidentally construct an inefficient scan by
omitting the first component of the key.  It's not surprising given how
kudu works, but it's not obvious just by looking at the API.

I would like to contribute a patch, though I would like to learn more about
kudu before I feel comfortable.  In this case, for example, my naive
solution may work, but isn't very robust -- since it has to switch on the
data type of each cell, it will fail when new data types are added.  I have
a hunch it's possible to copy row_data_ directly somehow, but I have no
idea how to correctly set isset_bitmap_ or owned_strings_bitmap_.

On Wed, Feb 15, 2017 at 1:32 PM, Todd Lipcon <todd@cloudera.com> wrote:

> Hi Paul,
>
> You're correct that there is no predicate-based delete currently
> available, so you have to scan and then feed the results of the scan back
> into your desired mutations/deletes. This is intentional, since right now
> we don't have multi-row transactional capabilities, and a "delete by
> predicate" API would probably give the false impression that it is
> transactional.
>
> I also think you're right that there isn't a nice way of propagating a
> RowResult into a PartialRow as you need to do here. It seems you're a C++
> programmer -- any interest in contributing a patch to make this
> transformation a bit easier?
>
> -Todd
>
>
>
> On Wed, Feb 15, 2017 at 10:27 AM, Paul Brannan <
> paul.brannan@thesystech.com> wrote:
>
>> I want to delete all rows that match a particular partial key.  For
>> example, if my schema includes columns "foo", "bar", and "baz" in its
>> primary key, I want to be able to delete all rows with "foo=16" and
>> "bar=32", regardless of the value of baz.  If I attempt to apply a
>> KuduDelete without specifying "baz", I get an error "Illegal state: Key not
>> specified".
>>
>> The best I have come up with so far is to do a scan and copy the data
>> cell-by-cell from the RowPtr returned by the scan into the KuduPartialRow
>> used by the delete; I don't see any good way in the interface to copy row
>> data from one to the other without copying cell-by-cell.  The code looks
>> something like:
>>
>>   for (auto idx : primary_key_column_indexes) {
>>     switch(schema.Column(idx).type()) {
>>     case KuduColumnSchema::INT16: // GetInt16/SetInt16
>>     case KuduColumnSchema::INT32: // GetInt32/SetInt32
>>     case KuduColumnSchema::STRING: // GetString/SetString
>>     // and so on...
>>     }
>>   }
>>
>> Is there a better way?
>>
>>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>

Mime
View raw message