apex-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From AJAY GUPTA <ajaygit...@gmail.com>
Subject Re: Sort Accumulation
Date Tue, 14 Mar 2017 06:00:51 GMT
Hi Apex Dev community,

Kindly let me know if implementing this accumulation using TreeMultiSet is
fine.


Ajay

On Thu, Mar 9, 2017 at 12:24 PM, AJAY GUPTA <ajaygit158@gmail.com> wrote:

> Hi Bright,
>
> I couldnot completely understand the bucketing approach you mentioned. How
> would we bucket the data considering we have no idea what the data will be?
>
> How about using a TreeMultiSet?
>
>
> Thanks,
> Ajay
>
> On Wed, Mar 8, 2017 at 11:24 PM, Bright Chen <bright@datatorrent.com>
> wrote:
>
>> Hi Ajay,
>> I think sort at getOutput() probably will get this method stuck due to
>> very
>> high volume of computation.
>> And as we still need to persistent the data, it will not very helpful to
>> increase the performance of processing tuple. Probably we can bucket the
>> data with range of value. Such as following:
>> - process tuple in one window: sort data of current window in memory
>> - end window: merge the sorted memory data into buckets.
>>
>> thanks
>> Bright
>>
>> On Wed, Mar 8, 2017 at 8:51 AM, AJAY GUPTA <ajaygit158@gmail.com> wrote:
>>
>> > Hi Thomas,
>> >
>> > I looked at TopN. The accumulate() of TopN is an O(n*k). Using similar
>> > approach for Sort will lead to an O(n^2) complexity.
>> > Since we have to sort all elements, we can do it in a single sort call
>> in
>> > getOutput().
>> >
>> >
>> > On Wed, Mar 8, 2017 at 10:09 PM, Thomas Weise <thw@apache.org> wrote:
>> >
>> > > Look at the existing topN accumulation. It should be a generalization,
>> > > where you don't have a limit.
>> > >
>> > >
>> > > On Wed, Mar 8, 2017 at 8:05 AM, AJAY GUPTA <ajaygit158@gmail.com>
>> wrote:
>> > >
>> > > > Hi,
>> > > >
>> > > > I would like to propose the Sort Accumulation. The accumulation
>> will be
>> > > > responsible for sorting the input POJO stream. The accumulation will
>> > > > require a comparator to compare and sort the input tuples. Another
>> > > boolean
>> > > > parameter "sortDesc" will be used to decide sorting order.
>> > > >
>> > > > Let me know your views.
>> > > >
>> > > > Thanks,
>> > > > Ajay
>> > > >
>> > >
>> >
>>
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message