accumulo-notifications mailing list archives

From "Keith Turner (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-1471) SortedMapIterator.seek() doesn't respect columnFamilies
Date Wed, 29 May 2013 20:56:20 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13669732#comment-13669732 ]

Keith Turner commented on ACCUMULO-1471:
----------------------------------------

There is actually a reason you would not want to filter in SortedMapIterator. If you want
to read from multiple SortedMapIterators, you provide them as inputs to a MultiIterator.
Column family filtering should be done after the MultiIterator, like the following.

ColumnFamilySkippingIterator(MultiIterator(SortedMapIterator(map1),SortedMapIterator(map2)))

If every data source below the MultiIterator does column family filtering, then it's possible
that multiple data sources could unnecessarily read and filter a lot of data for each seek.
They could do this even though another data source has a visible key that sorts before the
data they are filtering. This could lead to O(N^2) seek performance.
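
As a rough illustration of the per-seek cost described above, here is a small self-contained
Java sketch. It does not use Accumulo's classes. Source, firstFilterAbove, firstFilterBelow,
the "row:family" key format and the sample data are all made up for this example, and the
merge is a simple linear scan rather than whatever MultiIterator really does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class FilterPlacementSketch {
    // Hypothetical stand-in for a column family filter: keep keys ending in ":cf1".
    static final Predicate<String> CF1 = k -> k.endsWith(":cf1");

    /** Toy sorted data source (NOT Accumulo's SortedMapIterator); counts keys it moves past. */
    static final class Source {
        private final List<String> keys; // sorted, in the made-up form "row:family"
        private int pos = 0;
        int reads = 0;

        Source(List<String> keys) { this.keys = keys; }
        boolean hasTop() { return pos < keys.size(); }
        String top() { return keys.get(pos); }
        void next() { pos++; reads++; }
    }

    /** Returns the source whose top key sorts first, or null if all are exhausted. */
    static Source smallest(List<Source> sources) {
        Source best = null;
        for (Source s : sources) {
            if (s.hasTop() && (best == null || s.top().compareTo(best.top()) < 0)) {
                best = s;
            }
        }
        return best;
    }

    /** Filter above the merge: only the source holding the smallest key is ever advanced. */
    static String firstFilterAbove(List<Source> sources, Predicate<String> wanted) {
        for (Source s = smallest(sources); s != null; s = smallest(sources)) {
            if (wanted.test(s.top())) return s.top();
            s.next();
        }
        return null;
    }

    /** Filter below the merge: every source skips to its own first match before merging. */
    static String firstFilterBelow(List<Source> sources, Predicate<String> wanted) {
        for (Source s : sources) {
            while (s.hasTop() && !wanted.test(s.top())) s.next();
        }
        Source s = smallest(sources);
        return s == null ? null : s.top();
    }

    static int totalReads(List<Source> sources) {
        int total = 0;
        for (Source s : sources) total += s.reads;
        return total;
    }

    /** Made-up data: one source has a visible key first, the other starts with filtered keys. */
    static List<Source> sample() {
        List<String> second = new ArrayList<>();
        for (int row = 2; row <= 8; row++) second.add("row" + row + ":cf2");
        second.add("row9:cf1");
        List<Source> sources = new ArrayList<>();
        sources.add(new Source(List.of("row1:cf1")));
        sources.add(new Source(second));
        return sources;
    }

    public static void main(String[] args) {
        List<Source> above = sample();
        List<Source> below = sample();
        System.out.println(firstFilterAbove(above, CF1) + " reads=" + totalReads(above));
        System.out.println(firstFilterBelow(below, CF1) + " reads=" + totalReads(below));
    }
}
```

With this sample data both variants return row1:cf1 as the first key. Filtering above the
merge gets there without skipping anything, while filtering below makes the second source
skip its seven cf2 keys first, even though the first source already has a visible key that
sorts earlier.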

The reason ColumnFamilySkippingIterator passes the column families through to its source is
so that lower-level data sources like rfile can possibly optimize which locality groups are read.

So one possible fix for this is to document the behavior in the javadocs.



                
> SortedMapIterator.seek() doesn't respect columnFamilies
> -------------------------------------------------------
>
>                 Key: ACCUMULO-1471
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-1471
>             Project: Accumulo
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 1.4.3, 1.5.0
>            Reporter: Michael Berman
>            Assignee: Michael Berman
>            Priority: Minor
>             Fix For: 1.5.1, 1.6.0
>
>
> If you specify columnFamilies in a seek() on a SortedMapIterator, it will happily return
> results from other column families.  The arguments are never even read.

