kudu-user mailing list archives

From Jason Heo <jason.heo....@gmail.com>
Subject Re: Any plans for "Aggregation Push down" or integrating Impala + Kudu more tightly?
Date Fri, 30 Jun 2017 02:16:47 GMT
Hi Todd,

Thank you for your answer.

>> That said, it hasn't been a high priority relative to other performance
>> areas we're exploring.

Could you share the main ideas behind those other performance areas? I
tried to search for them on the Kudu Jira but couldn't find anything.

Last question. In your answer:

>> Regarding sharing memory, the first step which is already implemented
>> is to share a common in-memory layout.
>> ...
>> Currently, as mentioned above, the data (in the common format) is
>> transferred from Kudu to Impala via a localhost TCP socket.

So that means Kudu and Impala exchange the same in-memory format over a TCP
socket, not via memcpy(). Am I right?

I had thought Impala was just a client program of Kudu that uses the Kudu
client library. How can "sharing an in-memory layout" be achieved through
the Kudu client library? Could you explain the conceptual idea? We actually
don't use Impala for our service; instead, we've built a Java program on the
Kudu client library to serve it, and I wanted to know whether the same idea
can be applied to our client program.
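
For context, our Java program currently reads rows with a plain scan,
roughly like the sketch below (a minimal example against the
org.apache.kudu.client API; the master address, table, and column names are
made up):

    import java.util.Arrays;
    import org.apache.kudu.client.*;

    public class ScanExample {
        public static void main(String[] args) throws Exception {
            // "kudu-master:7051", "metrics", "host", and "value" are made-up names.
            KuduClient client =
                new KuduClient.KuduClientBuilder("kudu-master:7051").build();
            KuduTable table = client.openTable("metrics");
            KuduScanner scanner = client.newScannerBuilder(table)
                .setProjectedColumnNames(Arrays.asList("host", "value"))
                .build();
            while (scanner.hasMoreRows()) {
                // Each nextRows() call returns one batch of rows from the server.
                RowResultIterator batch = scanner.nextRows();
                while (batch.hasNext()) {
                    RowResult row = batch.next();
                    System.out.println(row.getString("host") + " " + row.getDouble("value"));
                }
            }
            scanner.close();
            client.shutdown();
        }
    }

In other words, I'd like to know whether the "no parse/convert" benefit you
describe also applies to a client that materializes rows through RowResult
like this.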

Thanks.

Regards,

Jason

2017-06-30 3:53 GMT+09:00 Todd Lipcon <todd@cloudera.com>:

> Hey Jason,
>
> Answers inline below
>
> On Thu, Jun 29, 2017 at 2:52 AM, Jason Heo <jason.heo.sde@gmail.com>
> wrote:
>
>> Hi,
>>
>> Q1.
>>
>> After reading Druid vs Kudu
>> <http://druid.io/docs/latest/comparisons/druid-vs-kudu.html>, I noticed
>> that Druid has aggregation push down:
>>
>>> Druid includes its own query layer that allows it to push down
>>> aggregations and computations directly to data nodes for faster query
>>> processing.
>>
>>
>> If I understand "Aggregation Push down" correctly, it seems that partial
>> aggregation is done by data node side, so that only small amount of result
>> set can be transferred to a client which could lead to great performance
>> gain. (Am I right?)
>>
>
> That's right.
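>
> To make that concrete: without push-down, the client pulls every matching
> row over the scanner and does the partial aggregation itself, along these
> lines (a sketch only; the master address, table, and column names are made
> up):
>
>     import java.util.Arrays;
>     import java.util.HashMap;
>     import java.util.Map;
>     import org.apache.kudu.client.*;
>
>     public class ClientSideAggregation {
>         public static void main(String[] args) throws Exception {
>             KuduClient client =
>                 new KuduClient.KuduClientBuilder("kudu-master:7051").build();
>             KuduTable table = client.openTable("metrics");
>             KuduScanner scanner = client.newScannerBuilder(table)
>                 .setProjectedColumnNames(Arrays.asList("host", "value"))
>                 .build();
>             // Every matching row crosses the wire; the aggregation happens
>             // here in the client, not inside the tablet server.
>             Map<String, Double> sumByHost = new HashMap<>();
>             while (scanner.hasMoreRows()) {
>                 RowResultIterator batch = scanner.nextRows();
>                 while (batch.hasNext()) {
>                     RowResult row = batch.next();
>                     sumByHost.merge(row.getString("host"),
>                                     row.getDouble("value"), Double::sum);
>                 }
>             }
>             System.out.println(sumByHost);
>             scanner.close();
>             client.shutdown();
>         }
>     }
>
> With push-down, the tablet server would compute those per-host partial sums
> and return only the sums instead of every row.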
>
>
>>
>> So I wanted to know whether Apache Kudu plans to add an aggregation push
>> down scan feature (or already has one).
>>
>
> It currently doesn't have it, and there aren't any current plans to do so.
>
> Usually, we assume that Kudu tablet servers are collocated with either
> Impala daemons or Spark executors. So, it's less important to provide
> pushdown into the tablet server itself, since even without it, we are
> typically avoiding any network transfer from the TS into the execution
> environment which does the aggregation.
>
> The above would be less true if there were some way in which Kudu itself
> could perform the aggregation more efficiently based on knowledge of its
> underlying storage. For example, one could imagine doing a GROUP BY
> 'foo_col' more efficiently within Kudu if the column is dictionary-encoded
> by aggregating on the code-words rather than the resulting decoded strings,
> since the integer code words are fixed length and faster to hash, compare,
> etc.
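>
> As a rough illustration of why that would help (plain Java, not code that
> exists in Kudu; the dictionary and values are made up):
>
>     import java.util.HashMap;
>     import java.util.Map;
>
>     public class DictionaryGroupBy {
>         public static void main(String[] args) {
>             // Hypothetical dictionary block: code word -> decoded string.
>             String[] dictionary = {"us-east", "us-west", "eu-central"};
>             // Hypothetical column values stored as fixed-length code words.
>             int[] codeWords = {0, 2, 0, 1, 0, 2, 2};
>
>             // GROUP BY on the code words: hash and compare small fixed-length
>             // integers rather than variable-length strings.
>             Map<Integer, Long> countsByCode = new HashMap<>();
>             for (int code : codeWords) {
>                 countsByCode.merge(code, 1L, Long::sum);
>             }
>
>             // Decode each distinct group exactly once, at the end.
>             for (Map.Entry<Integer, Long> e : countsByCode.entrySet()) {
>                 System.out.println(dictionary[e.getKey()] + " = " + e.getValue());
>             }
>         }
>     }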
>
> That said, it hasn't been a high priority relative to other performance
> areas we're exploring.
>
>
>>
>> Q2.
>>
>> One thing that concerns me about using Impala+Kudu is that all matching
>> rows have to be transferred from the Kudu tserver to the Impala process.
>> Usually Impala and the Kudu tserver run on the same node, so it would be
>> great if Impala could read Kudu tablets directly. Are there any plans for
>> this kind of feature?
>>
>> How-to: Use Impala and Kudu Together for Analytic Workloads
>> <https://blog.cloudera.com/blog/2016/04/how-to-use-impala-and-kudu-together-for-analytic-workloads/>
>> says that:
>>
>>> we intend to implement the Apache Arrow in-memory data format and to
>>> share memory between Kudu and Impala, which we expect will help with
>>> performance and resource usage.
>>>
>>
>> What does "share memory between Kudu and Impala"? Does this already
>> implemented?
>>
>
> Yes, currently all matching rows are transferred from the Kudu TS to the
> Impala daemon. Impala schedules scanners for locality, though, so this is a
> localhost-scoped connection which is quite fast. To give you a sense of the
> speed, I just tested a localhost TCP connection using 'iperf' and measured
> ~6GB/sec on a single core. Although this is significantly slower than a
> within-process memcpy, it's still fast enough that it usually represents a
> small fraction of the overall CPU consumption of a query.
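>
> If you want to reproduce that kind of measurement without iperf, a
> quick-and-dirty sketch (plain Java, nothing Kudu-specific; results will
> vary a lot by machine and kernel):
>
>     import java.io.InputStream;
>     import java.io.OutputStream;
>     import java.net.ServerSocket;
>     import java.net.Socket;
>
>     public class LocalhostTcpBench {
>         public static void main(String[] args) throws Exception {
>             final long totalBytes = 8L << 30;    // push 8 GiB through the socket
>             final byte[] chunk = new byte[1 << 20];
>
>             // Bind an ephemeral port and discard whatever the sender writes.
>             ServerSocket server = new ServerSocket(0);
>             Thread reader = new Thread(() -> {
>                 try (Socket s = server.accept();
>                      InputStream in = s.getInputStream()) {
>                     byte[] buf = new byte[1 << 20];
>                     while (in.read(buf) != -1) { /* discard */ }
>                 } catch (Exception e) {
>                     e.printStackTrace();
>                 }
>             });
>             reader.start();
>
>             long start = System.nanoTime();
>             try (Socket s = new Socket("127.0.0.1", server.getLocalPort());
>                  OutputStream out = s.getOutputStream()) {
>                 for (long sent = 0; sent < totalBytes; sent += chunk.length) {
>                     out.write(chunk);
>                 }
>             }
>             reader.join();
>             server.close();
>             double secs = (System.nanoTime() - start) / 1e9;
>             System.out.printf("%.2f GB/sec over a localhost TCP socket%n",
>                               totalBytes / 1e9 / secs);
>         }
>     }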
>
> Regarding sharing memory, the first step, which is already implemented, is
> to share a common in-memory layout. That is to say, the format in which
> Kudu returns rows over the wire to the client matches the format in which
> Impala expects its rows in memory. So, when Impala receives a block of rows in
> the scanner, it doesn't have to "parse" or "convert" them to some other
> format. It can simply interpret the data in place. This saves a lot of CPU.
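>
> As a rough illustration of "interpret in place" (this is not Kudu's actual
> wire format, just the general idea): if the sender lays rows out exactly
> the way the reader wants them, the reader can address fields at fixed
> offsets in the received buffer instead of copying them into intermediate
> objects.
>
>     import java.nio.ByteBuffer;
>     import java.nio.ByteOrder;
>
>     // Hypothetical fixed-width row block: (long key, double value), 16 bytes
>     // per row. Assumes the buffer's position is 0.
>     final class RowBlockView {
>         private static final int ROW_SIZE = 16;
>         private final ByteBuffer block;
>
>         RowBlockView(ByteBuffer receivedBlock) {
>             this.block = receivedBlock.order(ByteOrder.LITTLE_ENDIAN);
>         }
>
>         int numRows()           { return block.remaining() / ROW_SIZE; }
>         long keyAt(int row)     { return block.getLong(row * ROW_SIZE); }
>         double valueAt(int row) { return block.getDouble(row * ROW_SIZE + 8); }
>     }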
>
> Using something like Arrow would be even more efficient than the current
> format since it is columnar rather than row-oriented. However, Impala
> currently does not use a columnar format for its operator pipeline, so we
> can't currently make use of Arrow to optimize the interchange.
>
> Currently, as mentioned above, the data (in the common format) is
> transferred from Kudu to Impala via a localhost TCP socket. A few years ago
> we had an intern who experimented with using a Unix domain socket and found
> some small speedup. He also experimented with setting up a shared memory
> region and also found another small speedup over the domain socket.
> However, there was a lot of complexity involved in this code (particularly
> the shared memory approach) relative to the gain that we saw, so we didn't
> end up merging it before his internship ended :)
>
> -Todd
> --
> Todd Lipcon
> Software Engineer, Cloudera
>
