druid-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Gian Merlino <g...@apache.org>
Subject Re: ItemsSketch Aggregator in druid-datasketches extension
Date Fri, 23 Jul 2021 20:39:02 GMT
Btw, one additional comment on suggestion (2) above. It's also a hack, of
course, and it's more of a hack than (1) is. At least (1) is using the
proper off-heap storage almost all of the time, which means there is not
much risk of using excessive heap memory. Approach (2) has a higher risk of
using too much heap memory. The only advantage (2) has is that you don't
need a Direct version of the ItemsSketch for it to work.

On Fri, Jul 23, 2021 at 1:35 PM Gian Merlino <gian@apache.org> wrote:

> Hey Michael,
>
> Very cool!
>
> To answer your question: it is critical to have a BufferAggregator. Some
> context; there are 3 kinds of aggregators:
>
> - Aggregator: stores intermediate state on heap; is used during ingestion
> and by the non-vectorized timeseries query engine. Required, or else some
> queries won't work properly.
> - BufferAggregator: stores intermediate state off-heap; is used by the
> non-vectorized topN and groupBy engines. Required, or else some queries
> won't work properly.
> - VectorAggregator: stores intermediate state off-heap; is used by
> vectorized query engines. Optional, but if there are any aggregators in a
> query that don't have a VectorAggregator implementation, then the whole
> query will run non-vectorized, which will slow it down.
>
> The main reason that BufferAggregators exist is to support off-heap
> aggregation, which minimizes GC pressure and prevents out-of-memory errors
> during the aggregation process.
>
> > Assuming it is, we can begin talking with the datasketches team about
> the possibility of a Direct implementation.
>
> With ItemsSketch, the biggest roadblock you're likely to run into is the
> fact that the items may be of variable size. Currently in Druid each
> BufferAggregator (and VectorAggregator) must have a fixed-size amount of
> space allocated off heap. The way this is done is by returning the number
> of bytes you need from AggregatorFactory.getMaxIntermediateSize. It's all
> allocated upfront. It can depend on aggregator configuration (for example,
> the HLL aggregators need more space if they're configured to have higher
> accuracy) but it can't depend on the data that is encountered during the
> query. So if you do have variable size items, you get into quite a
> conundrum!
>
> In order to have a "proper" implementation of an off-heap variable-sized
> sketch, we'd need an aggregator API that allows them to allocate additional
> memory at runtime. The DataSketches folks have asked us for this before but
> we haven't built it yet. There's a couple ways you could solve it in the
> meantime:
>
> 1) Allocate a fixed-size amount of memory that is enough to store most
> data you reasonably expect to encounter, update that off-heap, and allocate
> additional memory on heap if needed, without Druid "knowing" about it. This
> is something that'd require a Direct version of the ItemsSketch. It's a
> technique used by the quantiles sketches too. It's a hack but it works. You
> can see it in action by looking at DirectUpdateDoublesSketch
> <https://github.com/apache/datasketches-java/blob/master/src/main/java/org/apache/datasketches/quantiles/DirectUpdateDoublesSketch.java>
> in DataSketches (check out the spots where "mem_" is reassigned: it's
> allocating new on-heap memory that Druid doesn't know about) and
> DoublesSketchBuildBufferAggregator in Druid (check out the "sketches" map
> and "relocate" method in DoublesSketchBuildBufferAggregatorHelper: it's
> making sure to retain the references just in case one of them grew into the
> heap).
>
> 2) Allocate some arbitrary amount of memory, but don't use it, use a
> technique like the one in DoublesSketchBuildBufferAggregatorHelper to
> simply maintain references to on-heap sketches. This would work but you
> would run the risk of running out of heap memory. (There's nothing that
> explicitly controls how much heap you'll use.) To a degree you can mitigate
> this by allocating a larger amount of unused off-heap memory via
> AggregatorFactory.getMaxIntermediateSize, which will get Druid to flush
> aggregation state more often (it does this when it runs out of off-heap
> buffer space), which will sort of limit how many sketches are going to be
> in flight at once. It's quite indirect but it's the best I can think of.
>
> > I am also thinking of finishing the implementation by explicitly
> serializing the entire sketch on each update, but this would only be for
> experimentation as I doubt this is the intended behavior for
> implementations of BufferedAggregator.
>
> You're right, that's not the intended sort of implementation. It works but
> it's usually too slow to be practical. (It incurs a full serialization
> roundtrip for every row.)
>
> On Fri, Jul 23, 2021 at 12:18 PM Michael Schiff <michaelschiff@apache.org>
> wrote:
>
>> I am looking into implementing a new Aggregator in the datasketches
>> extension using the ItemSketch in the frequencies package:
>>
>> https://datasketches.apache.org/docs/Frequency/FrequentItemsOverview.html
>>
>> https://github.com/apache/datasketches-java/tree/master/src/main/java/org/apache/datasketches/frequencies
>>
>> Ive started on a partial implementation here (still a WIP, lots of TODOs):
>>
>> https://github.com/apache/druid/compare/master...michaelschiff:fis-aggregator?expand=1
>>
>> From everything I've seen, it's critical that there is an efficient
>> implementation of BufferAggregator. The existing aggregators take advantage
>> of other sketch types providing "Direct" implementations that are
>> implemented directly against a ByteBuffer.  This leads to fairly
>> transparent implementation of BufferAggregator.  ItemSketch is able to
>> serialize itself and to wrap ByteBuffer for instantiation, but the actual
>> interactions are all on heap (core of the implementation is
>> https://github.com/apache/datasketches-java/blob/27ecce938555d731f29df97f12f4744a0efb663d/src/main/java/org/apache/datasketches/frequencies/ReversePurgeItemHashMap.java
>> ).
>>
>> Can anyone confirm that it is critical (i.e. Aggregator will not
>> function) to have an implementation of BufferAggregator? Assuming it is, we
>> can begin talking with the datasketches team about the possibility of a
>> Direct implementation.  I am also thinking of finishing the implementation
>> by explicitly serializing the entire sketch on each update, but this would
>> only be for experimentation as I doubt this is the intended behavior for
>> implementations of BufferedAggregator.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@druid.apache.org
>> For additional commands, e-mail: dev-help@druid.apache.org
>>
>>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message