accumulo-user mailing list archives

From Josh Elser <josh.el...@gmail.com>
Subject Re: Scan vs Filter performance.
Date Tue, 29 Sep 2015 13:47:51 GMT
I think it's hard to say definitively whether or not you would see a 
performance gain in using a custom iterator over a locality group. If 
you have many other columns in each row, it would likely be more 
efficient to operate only over the single column. For a small number of 
columns with more rows, I would guess you wouldn't see a large 
performance gain. It would be an interesting experiment.

Does that make sense?
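
To make the trade-off concrete, here is a minimal sketch that counts how many keys a filter-style pass and a seek-style pass touch. It does not use the Accumulo API: a java.util.TreeMap stands in for the sorted key/value data, and the "row|family" key layout, the class name FilterVsSeek, and its methods are all hypothetical, chosen only for illustration. Real timings also depend on seek cost, block caching, and locality group configuration, none of which is modeled here.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Illustration only: a TreeMap stands in for sorted tablet data.
// Keys use a hypothetical "row|family" layout; this is not the Accumulo API.
class FilterVsSeek {

    // Build a sorted table with the given number of rows and families per row.
    static NavigableMap<String, String> buildTable(int rows, int familiesPerRow) {
        NavigableMap<String, String> table = new TreeMap<>();
        for (int r = 0; r < rows; r++) {
            for (int f = 0; f < familiesPerRow; f++) {
                table.put(String.format("row%04d|cf%03d", r, f), "v");
            }
        }
        return table;
    }

    // Filter style: visit every key and test it. Returns {keysRead, keysMatched}.
    static long[] filterScan(NavigableMap<String, String> table, String family) {
        long read = 0, matched = 0;
        for (String key : table.keySet()) {
            read++;
            if (key.endsWith("|" + family)) {
                matched++;
            }
        }
        return new long[] {read, matched};
    }

    // Seek style: in each row, jump to the wanted family, then jump to the next row.
    // Returns {keysRead, keysMatched}; each landing of a jump counts as one read.
    static long[] seekScan(NavigableMap<String, String> table, String family) {
        long read = 0, matched = 0;
        String key = table.isEmpty() ? null : table.firstKey();
        while (key != null) {
            read++; // landed on the first key of a row
            String row = key.substring(0, key.indexOf('|'));
            String target = row + "|" + family;
            String hit = table.ceilingKey(target); // "seek" to the wanted family
            if (hit != null) {
                read++;
                if (hit.equals(target)) {
                    matched++;
                }
            }
            // "seek" past everything left in this row
            key = table.higherKey(row + "|\uffff");
        }
        return new long[] {read, matched};
    }

    public static void main(String[] args) {
        // Many columns per row: seeking touches far fewer keys.
        NavigableMap<String, String> wide = buildTable(100, 50);
        System.out.println(filterScan(wide, "cf025")[0]); // prints 5000
        System.out.println(seekScan(wide, "cf025")[0]);   // prints 200

        // Few columns per row: both approaches touch the same number of keys.
        NavigableMap<String, String> narrow = buildTable(1000, 2);
        System.out.println(filterScan(narrow, "cf001")[0]); // prints 2000
        System.out.println(seekScan(narrow, "cf001")[0]);   // prints 2000
    }
}
```

In this toy model the seeking pass only wins when each row carries many columns that can be skipped, which matches the guess above. A locality group would shrink the data a scan of that family has to read in the first place, so any additional gain from seeking inside it is the part worth measuring.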

mohit.kaushik wrote:
> Hi Keith,
>
> When we fetch a column or column family, it seems it does not seek and
> only scans, filtering the key/value pairs. But as you said, if I design
> a custom iterator to fetch a column family, it may work faster.
>
> But I want to know what the scenario would be if I define a locality
> group for the column family and run the same custom iterator on it, one
> that both scans and seeks. What would be the impact on performance (gain or loss)?
>
> Thanks
> Mohit Kaushik
>
> On 09/28/2015 10:49 PM, Moises Baly wrote:
>> Hi Keith,
>>
>> No, I wasn't aware of that. So I'll move forward with the custom iterator.
>>
>> Thank you for your time,
>>
>> Moises
>>
>> On Mon, Sep 28, 2015 at 12:35 PM, Keith Turner <keith@deenlo.com
>> <mailto:keith@deenlo.com>> wrote:
>>
>>     On Mon, Sep 28, 2015 at 12:19 PM, Moises Baly
>>     <moises@spatially.com <mailto:moises@spatially.com>> wrote:
>>
>>         Hi all:
>>
>>         I would like to perform a range scan on a table, tweaking the
>>         definition of what goes into a particular key range. One way I
>>         can think of is writing a filter on the key, and that would
>>         work fine. But I think it would be slow compared to a scan /
>>         seek custom iterator. How does the underlying logic work?
>>         Does a Filter go through all records, or, since the data is
>>         sorted, does it follow the same underlying logic as a scan?
>>         Would a custom iterator perform better?
>>
>>
>>     Yes, a filter will read all data.  A custom iterator that seeks
>>     may be faster.
>>
>>     Are you aware of the following?
>>
>>     https://issues.apache.org/jira/browse/ACCUMULO-3961
>>     https://github.com/apache/accumulo/pull/42
>>
>>
>>         Thank you for your time,
>>
>>         Moises
>>
>>
>>
