hawq-dev mailing list archives

From Hubert Zhang <hzh...@pivotal.io>
Subject Re: Planner invoked twice on master
Date Wed, 31 Aug 2016 05:30:47 GMT
Hi Lirong,
Thanks for sharing the history of the resource negotiator.
But I think your suggestion to "embed the logic into the planner" is not
suitable for HAWQ.
IMO, the logic in your context is HAWQ-specific: it is the policy that
determines how many resources are needed for the current query. The policy
is application-dependent; for example, in HAWQ the resources are mainly
determined by the HDFS block count and the hash table bucket number, which
does not apply to other databases.
HAWQ uses both the legacy planner and ORCA, but ORCA is a general
optimizer, also used by other MPP databases such as Greenplum.
Embedding HAWQ-specific logic into ORCA is not appropriate.

On Wed, Aug 31, 2016 at 1:14 PM, Hubert Zhang <hzhang@pivotal.io> wrote:

> Hi Kavinder,
> If your problem of "multiple REST requests" is really caused by the
> planner being invoked twice, my suggestion is to avoid the related
> function call (maybe a call to the REST service) during the first
> planning stage.
> You can refer to src/backend/optimizer/util/clauses.c:2834, which uses
> is_in_planning_phase() to determine whether the legacy planner is at the
> first or the second stage.
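[The guard described above could be sketched roughly as follows; everything here except the name is_in_planning_phase() is a hypothetical stub standing in for the real backend code.]

```c
#include <assert.h>
#include <stdbool.h>

/* Stub for the real backend state: true while the legacy planner is in
 * its first (resource-estimation) pass. */
static bool planning_phase = true;

static bool
is_in_planning_phase(void)
{
    return planning_phase;
}

/* Hypothetical wrapper around the PXF fragmenter REST call. */
static int rest_calls_made = 0;

static void
maybe_fetch_fragments(void)
{
    if (is_in_planning_phase())
        return;              /* first pass: skip the external call */
    rest_calls_made++;       /* second pass: issue the single REST request */
}
```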
> Thanks
> On Wed, Aug 31, 2016 at 12:31 AM, Lirong Jian <jianlirong@gmail.com>
> wrote:
>> Hi Kavinder,
>> If you are looking for some guy "to blame" for the decision to invoke
>> the planner twice on master, unfortunately, that guy would be me, :).
>> First of all, I left the company (Pivotal Inc.) about half a year ago
>> and am no longer working on the Apache HAWQ project full time. I have
>> had little time to keep track of the code changes these days, so what I
>> am going to say may be inconsistent with the latest code.
>> *Problem*
>> To improve the scalability and throughput of the system, we wanted to
>> implement a dynamic resource allocation mechanism for HAWQ 2.0. In other
>> words, we wanted to allocate exactly the needed resources (the number of
>> virtual segments), rather than a fixed amount of resources (as in HAWQ
>> 1.x), to run the underlying query. In order to achieve that, we need to
>> calculate the cost of the query before allocating the resources, which
>> means we need to figure out the execution plan first. However, in the
>> framework of the current planner (either the legacy planner or the new
>> optimizer ORCA), the optimal execution plan is generated given the
>> to-be-run query and the number of segments. Thus, this is a
>> chicken-and-egg problem.
>> *Solution*
>> IMO, the ideal solution for the above problem is to use an iterative
>> algorithm: given a default number of segments, calculate the optimal
>> execution plan; based on the output optimal execution plan, figure out the
>> appropriate number of segments needed to run this query; and calculate the
>> optimal execution plan again, and again, until the result is stable.
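[The iteration described above could be sketched as a toy loop; all function names and cost numbers below are hypothetical, not HAWQ code.]

```c
#include <assert.h>

#define MAX_ITERATIONS 2     /* the loop is capped at two passes */

/* Toy stand-ins: the plan cost depends only weakly on the segment count,
 * and the resource policy derives a segment count from that cost. */
static int
plan_cost(int nsegments)
{
    return 900 + nsegments;
}

static int
segments_for_cost(int cost)
{
    return cost / 100;
}

static int
negotiate_segments(int default_segments)
{
    int nseg = default_segments;

    for (int i = 0; i < MAX_ITERATIONS; i++)
    {
        int next = segments_for_cost(plan_cost(nseg));

        if (next == nseg)    /* stable: plan and resources agree */
            break;
        nseg = next;
    }
    return nseg;
}
```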
>> *Implementation*
>> In the actual implementation, we set the number of iterations to 2 for
>> two major reasons: (1) two iterations are enough to give a good result;
>> (2) there is some cost associated with invoking the planner, especially
>> the new optimizer ORCA.
>> After implementing the first version, we found that determining the
>> number of virtual segments based on the cost of the query sometimes gave
>> very bad results (although this is really an issue of the planner,
>> because the cost the planner provides doesn't correctly reflect the
>> actual running cost of the query). So, borrowing the idea from Hadoop
>> MapReduce, we calculate the cost based on the total size of all tables
>> that need to be scanned for the underlying query. It seemed we no longer
>> needed to invoke the planner before allocating resources. However, in
>> our current resource manager, the allocated resource is segment-based,
>> not process-based. For example, if an execution plan consists of three
>> slices, we need to set up three processes on each segment to run the
>> query, and one allocated resource unit (virtual segment) covers all
>> three processes. In order to avoid the case where too many processes are
>> started on one physical host, we need to know how many processes (the
>> number of slices of the execution plan) are going to start on each
>> virtual segment when we request resources from the resource manager.
>> Thus, the execution plan is still needed. We could write a new function
>> to calculate the number of slices of the plan rather than invoking the
>> planner, but after some investigation, we found that the new function
>> did almost the same thing as the planner. So, why bother writing more
>> duplicated code?
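[The per-host sizing concern above reduces to a simple product; the name below is hypothetical, not HAWQ code.]

```c
#include <assert.h>

/* With segment-based allocation, each slice of the plan spawns one process
 * per virtual segment, so a host running vsegs_on_host virtual segments
 * starts plan_slices * vsegs_on_host processes. */
static int
processes_on_host(int plan_slices, int vsegs_on_host)
{
    return plan_slices * vsegs_on_host;
}
```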
>> *Engineering Consideration*
>> IMO, in the long term, the best solution is probably to embed the logic
>> of resource negotiation into the planner. In that case, the output of
>> the planner would consist of the needed number of virtual segments and
>> the associated optimal execution plan, and the planner could be invoked
>> just once on master.
>> However, back then, we decided to separate the functionalities of
>> resource negotiation and the planner completely. Although it may look a
>> little ugly from an architectural point of view, it saved us a lot of
>> code refactoring effort and communication cost among different teams. We
>> did have a release deadline, :).
>> Above is just my 2 cents.
>> Best,
>> Lirong
>> Lirong Jian
>> HashData Inc.
>> 2016-08-30 1:42 GMT+08:00 Goden Yao <godenyao@apache.org>:
>> > Some background info:
>> >
>> > HAWQ 2.0 currently doesn't do dynamic resource allocation for PXF
>> > queries (external tables).
>> > It was a compromise we made, as PXF used to have its own allocation
>> > logic and we didn't get a chance to converge that logic with HAWQ 2.0.
>> > So to keep performance compatible with HAWQ 1.x, the current logic
>> > assumes external table queries need 8 segments per node to execute
>> > (e.g., with 3 nodes in the cluster, it will need 24 segments).
>> > If that allocation fails, the query fails and the user sees an error
>> > message like "do not have sufficient resources" or "segments" to
>> > execute the query.
>> >
>> > As I understand it, the 1st call is to get fragment info; the 2nd call
>> > optimizes the allocation of fragments to segments based on the info
>> > from the 1st call and generates the optimized plan.
>> >
>> > -Goden
>> >
>> > On Mon, Aug 29, 2016 at 10:31 AM Kavinder Dhaliwal <
>> kdhaliwal@pivotal.io>
>> > wrote:
>> >
>> > > Hi,
>> > >
>> > > Recently I was looking into the issue of PXF receiving multiple REST
>> > > requests to the fragmenter. Based on offline discussions, I have a
>> > > rough idea that this is happening because HAWQ plans every query
>> > > twice on the master. I understand that this is to allow the resource
>> > > negotiation that was a feature of HAWQ 2.0. I'd like to know if
>> > > anyone on the mailing list can give more background on the history
>> > > of the decision making behind this change for HAWQ 2.0, and whether
>> > > this is only a short-term solution.
>> > >
>> > > Thanks,
>> > > Kavinder
>> > >
>> >
> --
> Thanks
> Hubert Zhang


Hubert Zhang
