accumulo-user mailing list archives

From Keith Turner <ke...@deenlo.com>
Subject Re: data miss when use rowiterator
Date Fri, 10 Feb 2017 15:39:04 GMT
On Thu, Feb 9, 2017 at 11:39 PM, Josh Elser <elserj@apache.org> wrote:
> Just to be clear, Lu, for now stick to using a Scanner with the RowIterator
> :)
>
> It sounds like we might have to re-think how the RowIterator works with the
> BatchScanner...

I opened: https://issues.apache.org/jira/browse/ACCUMULO-4586

>
> Christopher wrote:
>>
>> I suspected that was the case. BatchScanner does not guarantee ordering
>> of entries, which is needed for the behavior you're expecting with
>> RowIterator. This means that the RowIterator could see the same row
>> multiple times with different subsets of the row's columns. This is
>> probably affecting your count.
>>
>> On Thu, Feb 9, 2017 at 10:29 PM Lu Q <luq.java@gmail.com
>> <mailto:luq.java@gmail.com>> wrote:
>>
>>     I am using a BatchScanner.
>>
>>>     On Feb 10, 2017, at 11:24, Christopher <ctubbsii@apache.org
>>>     <mailto:ctubbsii@apache.org>> wrote:
>>>
>>>     Does it matter if your scanner is a BatchScanner or a Scanner?
>>>     I wonder if this is due to the way BatchScanner could split rows up.
>>>
>>>     On Thu, Feb 9, 2017 at 9:50 PM Lu Q <luq.java@gmail.com
>>>     <mailto:luq.java@gmail.com>> wrote:
>>>
>>>
>>>         I use Accumulo 1.8.0, and I am developing an ORM framework that
>>>         converts scan results into objects.
>>>
>>>         Previously, I used RowIterator because it is faster than using
>>>         the scan directly:
>>>
>>>         RowIterator rows = new RowIterator(scan);
>>>         rows.forEachRemaining(rowIterator -> {
>>>             while (rowIterator.hasNext()) {
>>>                 Map.Entry<Key, Value> entry = rowIterator.next();
>>>                 ...
>>>             }
>>>         });
>>>
>>>         It worked fine until I queried 1000+ at once. I found that when
>>>         the range size is bigger than 1000, some data is missing.
>>>         I thought my conversion might be wrong, so I changed it to a map
>>>         structure, with the row_id as the map key and the rest as the map
>>>         value, but the problem still exists.
>>>
>>>         When I do not use RowIterator, it works fine:
>>>         for (Map.Entry<Key, Value> entry : scan) {
>>>         ...
>>>         }
>>>
>>>
>>>         Is this a bug, or an error in my program?
>>>         Thanks.
>>>
>>>     --
>>>     Christopher
>>
>>
>> --
>> Christopher
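
To illustrate the point made in the thread (grouping by consecutive row only works when entries arrive in sorted order), here is a minimal sketch in plain Java. It does not use any Accumulo classes: entries are modeled as hypothetical `{row, column}` string pairs, and `countConsecutiveGroups` and `mergeByRow` are illustrative helper names, not part of the Accumulo API. The first helper imitates a consumer that starts a new row whenever the row id changes. The second merges columns by row id, so it does not depend on ordering.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simulation only: no Accumulo types are used here.
class RowGroupingSketch {

    // Counts "rows" the way a consecutive-row grouper would: a new group
    // starts every time the row id differs from the previous entry's row id.
    static int countConsecutiveGroups(List<String[]> entries) {
        int groups = 0;
        String previousRow = null;
        for (String[] entry : entries) {
            String row = entry[0];
            if (!row.equals(previousRow)) {
                groups++;
                previousRow = row;
            }
        }
        return groups;
    }

    // Order-independent alternative: collect the columns of each row id into
    // one list, appending rather than overwriting when a row shows up again.
    static Map<String, List<String>> mergeByRow(List<String[]> entries) {
        Map<String, List<String>> rows = new LinkedHashMap<>();
        for (String[] entry : entries) {
            rows.computeIfAbsent(entry[0], k -> new ArrayList<>()).add(entry[1]);
        }
        return rows;
    }

    // Entries as an ordered scan would return them (hypothetical data).
    static List<String[]> sortedEntries() {
        return Arrays.asList(new String[][] {
            {"row1", "a"}, {"row1", "b"}, {"row2", "a"}, {"row2", "b"}
        });
    }

    // The same entries in an interleaved order, as an unordered scan might
    // return them (hypothetical data).
    static List<String[]> interleavedEntries() {
        return Arrays.asList(new String[][] {
            {"row1", "a"}, {"row2", "a"}, {"row1", "b"}, {"row2", "b"}
        });
    }

    public static void main(String[] args) {
        System.out.println(countConsecutiveGroups(sortedEntries()));      // 2
        System.out.println(countConsecutiveGroups(interleavedEntries())); // 4
        System.out.println(mergeByRow(interleavedEntries()).size());      // 2
    }
}
```

With the interleaved input, the consecutive grouper reports four partial rows instead of two. Code that builds one object per reported row, or that overwrites a map entry each time a row id reappears, would then keep only a subset of that row's columns. Merging by row id avoids this, but it buffers every row in memory, so it is a trade-off rather than a drop-in replacement.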
