accumulo-user mailing list archives

From Yamini Joshi <yamini.1...@gmail.com>
Subject Re: Accumulo Equivalent of Mongo Aggr Query
Date Mon, 26 Sep 2016 12:41:08 GMT
Hi Dylan

This is what I'm trying to do:
# group by id and create 2 new columns: np2 and shared
query = {'$group': {'_id': '$student_id',
                    'np2': {'$first': '$count'},
                    'shared': {'$sum': 1}}}

The statement above is one of the stages in a Mongo aggregate query. The
results of all the stages are computed on the server side, and only the
final result is returned to the user.
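
For readers who do not use MongoDB, the effect of that one stage can be
sketched in plain Python. This is only an illustration: the sample documents
and the helper name group_stage are made up, and only the field names
student_id and count come from the query above. MongoDB does not promise any
particular order for the output groups; the sketch simply keeps first-seen
order.

```python
from collections import OrderedDict


def group_stage(docs):
    """Mimic the $group stage: one output document per student_id.

    np2    -> 'count' of the first document seen for that id ($first)
    shared -> number of documents seen for that id ($sum: 1)
    """
    groups = OrderedDict()
    for doc in docs:
        key = doc.get('student_id')
        if key not in groups:
            # First document of this group: remember its count as np2.
            groups[key] = {'_id': key, 'np2': doc.get('count'), 'shared': 0}
        groups[key]['shared'] += 1
    return list(groups.values())


# Hypothetical input documents, for illustration only.
docs = [
    {'student_id': 'a', 'count': 3},
    {'student_id': 'b', 'count': 5},
    {'student_id': 'a', 'count': 7},
]
result = group_stage(docs)
```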

My problem is that I can't figure out two things:
1. How to add new columns while writing a Combiner/iterator
2. How to do a group by (based on a condition, since data in Accumulo is
always stored in a group).
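
To make the two questions concrete, here is a rough sketch of the logic in
Python, not Accumulo code. It assumes a hypothetical table layout in which
the row ID is the student_id, the column family is a document ID, and the
column qualifier 'count' holds the count. Because entries arrive sorted by
key, all entries of one row are adjacent, so the group by is a single pass
that emits two new entries per row. The function name aggregate_rows and the
output family 'agg' are invented for this example; a real server-side
version would be a Java iterator.

```python
from itertools import groupby


def aggregate_rows(sorted_entries):
    """Yield (row, family, qualifier, value) entries for np2 and shared.

    sorted_entries must already be sorted by (row, family, qualifier),
    which is the order a scan over the assumed layout would deliver.
    """
    for row, entries in groupby(sorted_entries, key=lambda e: e[0]):
        counts = [value for (_, _, qualifier, value) in entries
                  if qualifier == 'count']
        if not counts:
            continue  # row has no 'count' entries, nothing to aggregate
        # "New columns" are just new entries under the same row.
        yield (row, 'agg', 'np2', counts[0])        # first count in sort order
        yield (row, 'agg', 'shared', len(counts))   # number of grouped entries


# Hypothetical entries: (row=student_id, family=doc id, qualifier, value).
entries = sorted([
    ('a', 'doc1', 'count', 3),
    ('b', 'doc2', 'count', 5),
    ('a', 'doc3', 'count', 7),
])
result = list(aggregate_rows(entries))
```

Note that "first" here means first in key sort order, which may differ from
the document order a Mongo pipeline would see.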


Best regards,
Yamini Joshi

On Sun, Sep 25, 2016 at 5:18 PM, Dylan Hutchison
<dhutchis@cs.washington.edu> wrote:

> Hi Yamini,
>
> Could you further describe the computation you have in mind, for those of
> us not familiar with MongoDB's "Aggr" function?  You may want to look at
> Accumulo's built-in Combiner iterators
> <https://accumulo.apache.org/1.8/accumulo_user_manual#_combiners>.  They
> seem more relevant than Filters.
>
> I don't know what you mean when you write that your output is not visible
> to "the complete Database".
>
> Regards, Dylan
>
> On Sun, Sep 25, 2016 at 11:34 AM, Yamini Joshi <yamini.1691@gmail.com>
> wrote:
>
>>
>> Hello everyone
>>
>> I wanted to know if there is any equivalent of Mongo Aggr queries in
>> Accumulo. I have a complex query in the form of a Mongo aggregate
>> (multi-staged) query. I'm trying to model the same in Accumulo. As of now,
>> with the limited knowledge that I have, I have created a class extending
>> the Filter class. My question is: since my queries depend on an input, is
>> there any other way of using the iterators/filters only for one query, or
>> of changing their input with every single query? As of now, my filter is
>> getting attached to the table on 'SCAN', which means the output will be
>> visible to the subsequent queries and not the complete Database.
>>
>> Best regards,
>> Yamini Joshi
>>
>>
>
