impala-user mailing list archives

From Henry Robinson <he...@apache.org>
Subject Re: SendersBlockedTimer
Date Fri, 26 May 2017 18:16:45 GMT
On 26 May 2017 at 01:23, Evo Eftimov <evo.eftimov@isecc.com> wrote:

> Hi Henry,
>
>
>
> The Parquet table being exported in full via JDBC is about 800 MB
> compressed when stored as a Parquet file; extracted as CSV or via JDBC,
> i.e. uncompressed, it is 2.6 GB.
>
>
>
> I had already tried impala-shell too, and its performance was more or less
> identical to that of the Cloudera / Simba driver. impala-shell was run
> with the -B option and its output redirected to /dev/null for maximum
> performance during fetch.
>
>
>
> If I extract the Parquet table to CSV on HDFS with INSERT SELECT from
> within Impala, the result is a 2.6 GB CSV file. Downloading that file
> from HDFS with “hdfs dfs -get” takes only 30 sec. That is a staggering
> difference from the performance of impala-shell and the Cloudera/Simba
> JDBC driver. Are these drivers / tools really that poorly optimized, if
> they are optimized / designed / implemented properly at all?
>


They're not optimized for large extracts, on the order of GB of data. 'hdfs
dfs -get' has a number of significant advantages: it doesn't have to
understand the results, so there are no serialization steps; it just copies
the bytes.
Since it's just copying a file, and not trying to present the abstraction
of an ordered sequence of rows, it can take advantage of the parallelism
available in HDFS and read different blocks from different datanodes. These
options aren't so easily available to a query engine or its client.

There are definitely improvements that we can make to the speed of result
retrieval, but getting close to HDFS speeds would probably require an
architectural overhaul of the way clients interact with Impala. As I said,
large data extracts are not a use case that the clients, or Impala, are
optimized for right now.

You might try experimenting with the 'fetch size' parameter in the JDBC
driver - larger batches might reduce overheads due to repeated RPC calls.
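
As a rough illustration of the fetch-size suggestion, here is a minimal
sketch that uses the standard JDBC Statement.setFetchSize() hint. Whether
the Cloudera/Simba driver honors this hint, or needs a driver-specific
setting instead, is an assumption to check against the driver
documentation. The class name, method names and the row counts in the
usage note are made up for illustration.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

final class FetchSizeSketch {

    // Number of fetch requests needed to drain `rows` rows when each
    // request returns at most `fetchSize` rows (ceiling division).
    static long roundTrips(long rows, int fetchSize) {
        if (rows < 0 || fetchSize <= 0) {
            throw new IllegalArgumentException("rows must be >= 0 and fetchSize > 0");
        }
        return (rows + fetchSize - 1) / fetchSize;
    }

    // Runs `sql` and iterates over every row without touching the columns,
    // mirroring the "nothing but ResultSet.next()" test from the thread.
    // Returns the number of rows seen.
    static long drain(Connection conn, String sql, int fetchSize) throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            // Standard JDBC hint; a driver is free to ignore it (assumption:
            // the driver in use maps it to its per-request batch size).
            stmt.setFetchSize(fetchSize);
            try (ResultSet rs = stmt.executeQuery(sql)) {
                long rows = 0;
                while (rs.next()) {
                    rows++;
                }
                return rows;
            }
        }
    }
}
```

For a hypothetical result of 10,000,000 rows, roundTrips(10_000_000L, 1024)
gives 9766 requests while roundTrips(10_000_000L, 10_000) gives 1000, which
is the kind of reduction in per-request overhead that larger batches aim at.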

Henry


>
>
> PS: I run the JDBC client in a JVM with a 4 GB heap to minimize any
> impact of garbage collection.
>
>
>
> Regards,
>
> Evo
>
>
>
> *From:* Henry Robinson [mailto:henry@apache.org]
> *Sent:* Thursday, May 25, 2017 10:32 PM
> *To:* user@impala.incubator.apache.org
> *Subject:* Re: SendersBlockedTimer
>
>
>
>
>
>
>
> On 25 May 2017 at 12:19, Evo Eftimov <evo.eftimov@isecc.com> wrote:
>
> Hi Henry,
>
>
>
> I was referring specifically to the EXCHANGE_NODE section of the
> Coordinator Fragment – doesn’t that pin it down specifically to the
> Coordinator Node ie the node to which the JDBC Client is connected directly
> ?
>
>
>
> Also, how can the streaming of records from a simple full table scan query
> like “select * from table” be accelerated so that the SendersBlockedTimer
> value does not account for 95% of the overall time of the query?
> Basically, imagine you have a 3 GB Parquet table in Impala and a JDBC Driver
> Client connected to the Coordinator ImpalaD, trying to stream out all of
> the data in the table (3 GB) as quickly as possible.
>
> The execution part of the query completes blindingly fast and the data is
> streamed out of HDFS within 30 seconds. However, the Fetch phase of the full
> table scan query takes 15 min, and 14 min 30 sec of that time is the
> value in the SendersBlockedTimer.
>
>
>
> The JDBC Client uses the latest Cloudera JDBC driver for Impala (which is
> actually the Simba driver) and does nothing but call ResultSet.next(),
> i.e. no parsing or data transformation of the columns of each row, no
> output to screen or disk etc. The network between the JDBC Client and
> Coordinator is 10 GB and “hdfs client get” of the CSV version of the same
> table takes only 30 sec.
>
>
>
> Out of the above 15 min total time, Client Fetch Wait Time is 35% or about
> 6 min. Then we also have a SendersBlockedTimer of 14 min and 30 sec. So
> what is to blame here for the slow streaming of records compared to hdfs
> get: a) an inefficient implementation of the JDBC Client, or b) the
> Coordinator Node needing more resources, like more parallel threads and
> therefore CPU cores etc.?
>
>
>
> How do we interpret the above two figures and what do they point to: the
> JDBC driver or the Coordinator Node?
>
>
>
> Most likely the driver, as the query takes 6 minutes, per the Client Fetch
> Wait Time. SendersBlockedTimer tracks the amount of time for which at least
> one sender was blocked. Since it is high, we know that the coordinator is
> moving slower than the results are being sent to it. The coordinator does
> very little in a SELECT * query, so the likelihood is that it is serving
> rows to the client as fast as it can consume them. Therefore I'd expect the
> client to be the bottleneck.
>
>
>
> Try using the impala-shell, and setting -B (and redirecting the output to
> /dev/null); this is about as fast as a single client can go right now and
> should give you a feeling for a lower bound on the query performance.
>
>
>
> How much data does this query return? The client API and driver are not
> really optimized for large ETL-style retrieval - for that you might be
> better off using INSERT to write some files to HDFS, and then downloading
> them in parallel from HDFS.
>
>
>
> Best,
>
> Henry
>
>
>
>
>
> Regards,
>
> Evo
>
>
>
> *From:* Henry Robinson [mailto:henry@apache.org]
> *Sent:* Thursday, May 25, 2017 7:23 PM
> *To:* user@impala.incubator.apache.org; evo.eftimov@isecc.com
> *Subject:* Re: SendersBlockedTimer
>
>
>
> Hi Evo -
>
>
>
> Just to clarify: the EXCHANGE_NODE is the operator in the plan tree which
> mediates communication between workers, not between the client and the
> coordinator.
>
>
>
> The SendersBlockedTimer measures the amount of time that senders have row
> batches to deliver to an exchange node, but the exchange is busy delivering
> a previously sent row batch. That is, the senders are sending faster than
> the exchange node (and the upstream plan) processes those rows.
>
>
>
> In a select * from table query, there'll be one exchange on the
> coordinator, but that's not generally true - exchanges connect all the
> fragment instances. Having the senders blocked in this case is typical,
> because there'll be lots of senders sending at a high rate, fanning in to a
> single receiver, serving a single client.
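
The fan-in described above can be imitated with a small stand-alone
simulation. This is only a sketch and not Impala code: a bounded Java
queue stands in for the exchange, several sender threads push "row
batches" into it, and one deliberately slow receiver drains it. The time
senders spend stuck in put() plays the role of SendersBlockedTimer. Note
one simplification: the sketch sums the blocked time of all senders,
which is not necessarily how Impala aggregates the counter. All names and
parameters are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

final class SendersBlockedSketch {

    // Returns {batches received, total nanoseconds senders spent inside put()}.
    static long[] run(int senders, int batchesPerSender, int queueCapacity,
                      long receiverDelayMillis) {
        BlockingQueue<Integer> exchangeQueue = new ArrayBlockingQueue<>(queueCapacity);
        AtomicLong blockedNanos = new AtomicLong();
        List<Thread> threads = new ArrayList<>();

        for (int s = 0; s < senders; s++) {
            Thread t = new Thread(() -> {
                try {
                    for (int b = 0; b < batchesPerSender; b++) {
                        long start = System.nanoTime();
                        exchangeQueue.put(b); // blocks while the queue is full
                        blockedNanos.addAndGet(System.nanoTime() - start);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads.add(t);
            t.start();
        }

        long total = (long) senders * batchesPerSender;
        long received = 0;
        try {
            while (received < total) {
                exchangeQueue.take();
                received++;
                // Stand-in for a slow downstream consumer of the rows.
                Thread.sleep(receiverDelayMillis);
            }
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("simulation interrupted", e);
        }
        return new long[] {received, blockedNanos.get()};
    }
}
```

Calling SendersBlockedSketch.run(4, 5, 1, 5) delivers all 20 batches, and
the senders accumulate blocked time because the single receiver is the
slowest stage. A large blocked figure therefore says the senders are
waiting on whatever is downstream of the queue; it does not by itself say
which downstream component is slow.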
>
>
>
> The delivery of rows to the client is managed by the coordinator fragment
> instance through a different part of the code to the exchange node.
>
>
>
> Henry
>
>
>
> On 25 May 2017 at 08:31, Evo Eftimov <evo.eftimov@isecc.com> wrote:
>
> What is the purpose of the SendersBlockedTimer attribute in the
> EXCHANGE_NODE section of the Coordinator Fragment, part of the PROFILE of a
> SQL statement executed by Impala?
>
>
>
> I have reviewed the Impala source code and know that the Exchange Node
> uses a Blocking Queue as part of “Stream Manager” module which it
> instantiates
>
>
>
> In the specific context I am interested in, the Exchange Node returns the
> rows of a result set to a JDBC driver client. The result set is produced
> by a simple full-table-scan-only query of the type “select * from table”.
>
>
>
> The “Sender” Parallel Threads (presumably with the Exchange Node) publish
> rows to the Blocking Queue also in the Exchange Node and the JDBC client
> reads rows from the same queue via remote JDBC session / connection over
> TCP/IP – is that a correct description of how the Exchange Node mediates
> between JDBC client on the one hand and ImpalaD workers on the other? Btw
> the Exchange Node is part of the Coordinator Node in terms of terminology –
> right?
>
>
>
> My specific question is: what is the purpose/meaning of
> SendersBlockedTimer? E.g. does it mean that the Sender Threads WITHIN
> the Exchange Node have been in a blocked state for the time shown in the
> value of the attribute? And if this is correct, does that mean that
> they have been blocked because the JDBC Client couldn’t keep up with
> draining the Blocking Queue during the aggregated time duration in
> SendersBlockedTimer?
>
>
