accumulo-user mailing list archives

From madhvi <madhvi.gu...@orkash.com>
Subject Re: Abnormal behaviour of custom iterator in getting entries
Date Thu, 18 Jun 2015 04:45:40 GMT
Hi,

Thanks for the blog post you shared. I found it quite useful for my requirement.
"How are you passing these IDs to the batch scanner?"
I take the row IDs returned by an earlier query against another table, 
wrap each one as 'new Range(entry.getKey().getRow())', collect them in a 
list of Range objects, and pass that list to the BatchScanner.

"Are you trying to sum across all rows that you queried? "
Yes, we need to sum a particular column qualifier across the row IDs 
passed to the batch scanner. How can the summation be done across the 
rows, as you said: "you can put a second iterator "above" the first"?
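
For reference, the quoted reply below notes that with a BatchScanner a final summation at the client is still needed, whatever the server-side iterators do. A minimal sketch of that last step in plain Java follows. It assumes (this is an assumption, not something stated in the thread) that each entry returned by the scan carries a partial sum encoded as a decimal string in its value bytes. `ClientSideSum`, `decode` and `total` are hypothetical names and are not part of the Accumulo API.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of the Accumulo API.
final class ClientSideSum {

    // Decodes one partial sum. Assumption: the server-side iterator wrote
    // the sum as a decimal string into the entry's value bytes.
    static long decode(byte[] value) {
        return Long.parseLong(new String(value, StandardCharsets.UTF_8).trim());
    }

    // Adds up all partial sums handed back by the scan. The order of the
    // entries does not matter, which suits a BatchScanner because it
    // returns results in no guaranteed order.
    static long total(Iterable<byte[]> partialSums) {
        long sum = 0L;
        for (byte[] value : partialSums) {
            // addExact throws ArithmeticException on overflow instead of
            // silently wrapping around.
            sum = Math.addExact(sum, decode(value));
        }
        return sum;
    }

    public static void main(String[] args) {
        // Stand-in for the value bytes of entries read from a BatchScanner.
        List<byte[]> partials = new ArrayList<>();
        partials.add("10".getBytes(StandardCharsets.UTF_8));
        partials.add("32".getBytes(StandardCharsets.UTF_8));
        System.out.println(total(partials));
    }
}
```

In a real client the list would be filled from the value bytes of each entry the BatchScanner returns. The addition is the same whether the server sends one partial sum per row or one per tablet.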

Thanks
Madhvi
On Wednesday 17 June 2015 08:43 PM, Josh Elser wrote:
> Madhvi,
>
> Understood. A few more questions..
>
> How are you passing these IDs to the batch scanner? Are you providing 
> individual Ranges for each ID (e.g. `new Range(new Key("row1", "", 
> "id1"), true, new Key("row1", "", "id1\0"), false)`)? Or are you 
> providing an entire row (or set of rows) and using the 
> fetchColumns(Text,Text) method (or similar) on the BatchScanner?
>
> Are you trying to sum across all rows that you queried? Or is your sum 
> per-row? If the former, that is going to cause you problems. The quick 
> explanation is that you can't reliably know the tablet boundaries so 
> you should try to perform an initial sum, per row. If you want, you 
> can put a second iterator "above" the first and do a summation across 
> all rows to reduce the amount of data sent to a client. However, if 
> you use a BatchScanner, you will still have to perform a final 
> summation at the client.
>
> Check out 
> https://blogs.apache.org/accumulo/entry/thinking_about_reads_over_accumulo 
> for more details on that..
>
> madhvi wrote:
>> Hi Josh,
>>
>> Sorry, my company policy doesn't allow me to share the full source. What
>> we are trying to do is sum over a unique field stored in the column
>> qualifier for the IDs passed to the batch scanner. Can you suggest how
>> it can be done in Accumulo?
>>
>> Thanks
>> Madhvi
>> On Wednesday 17 June 2015 10:32 AM, Josh Elser wrote:
>>> You put random values in the family and qualifier? Do I misunderstand
>>> you?
>>>
>>> Also, if you can put up the full source for the iterator, that will be
>>> much easier if you need help debugging it. It's hard for us to guess
>>> at why your code might not be working as you expect.
>>>
>>> madhvi wrote:
>>>> Hi Josh,
>>>>
>>>> I have changed the HashMap to a TreeMap, which sorts lexicographically,
>>>> and I have inserted random values in the column family and qualifier.
>>>> The TreeMap's value goes in the Value.
>>>> I used both a Scanner and a BatchScanner but get results only with
>>>> the Scanner.
>>>>
>>>> Thanks
>>>> Madhvi
>>>>
>>>> On Tuesday 16 June 2015 08:42 PM, Josh Elser wrote:
>>>>> Additionally, you're placing the Value into the ColumnQualifier and
>>>>> dropping the ColumnFamily completely. Granted, that may not be a
>>>>> problem for the specific data in your table, but it's not going to
>>>>> work for any data.
>>>>>
>>>>> Christopher wrote:
>>>>>> You're iterating over a HashMap. That's not sorted.
>>>>>>
>>>>>> -- 
>>>>>> Christopher L Tubbs II
>>>>>> http://gravatar.com/ctubbsii
>>>>>>
>>>>>>
>>>>>> On Tue, Jun 16, 2015 at 1:58 AM, madhvi<madhvi.gupta@orkash.com>
>>>>>> wrote:
>>>>>>> Hi Josh,
>>>>>>> Thanks for replying. I will enable remote debugger on my Accumulo
>>>>>>> server.
>>>>>>>
>>>>>>> However, I am slightly confused by your statement "you are not
>>>>>>> returning your data in sorted order". Can you point out the part
>>>>>>> of my iterator code that seems inappropriate, and any possible
>>>>>>> solution for it?
>>>>>>>
>>>>>>> Thanks
>>>>>>> Madhvi
>>>>>>>
>>>>>>>
>>>>>>> On Tuesday 16 June 2015 11:07 AM, Josh Elser wrote:
>>>>>>>> //matched the condition and put values to holder map.
>>>>>>>
>>>>
>>

