tajo-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Hyunsik Choi <hyun...@apache.org>
Subject Re: Big big query
Date Sat, 06 Sep 2014 12:01:41 GMT
Thank you for sharing your query. I'll dig into this problem. I'll response
soon.

Best regards,
Hyunsik


On Fri, Sep 5, 2014 at 11:33 PM, Christian Schwabe <
Christian.Schwabe@gmx.com> wrote:

>
> Hello Hyunsik,
>
> thanks for your response.
> Here is the anonymized query.
> If you need any further information feel free to ask ;)
>
> >>QUERY:
> drop table if exists anonym1;
> create table anonym1 using parquet
> as
> select
> table1.a,
> table1.b,
> table1.c,
> case when table1.ag = '1800-01-01' and table3.ah = '' then table3.ai else
> table1.ai end      as ai,
> table1.d,
> table1.e,
> table1.f,
> table1.g,
> case when table1.ag = '1800-01-01' and table3.ah = ''  then table3.aj
> else table1.aj end      as aj,
> case when table1.ag = '1800-01-01' and table3.ah = ''  then table3.ak
> else table1.ak end      as ak,
> table1.h,
> table1.i,
> table1.j,
> table1.k,
> table1.l,
> table1.m,
> table1.n,
> table1.o,
> table1.p,
> table1.q,
> table1.r,
> table1.s,
> table1.t,
> table1.u,
> case when table1.ag = '1800-01-01' and table3.ah = ''  then table3.ag
> else table1.ag end      as ag,
> case when table1.ag = '1800-01-01' and table3.ah = ''  then table3.al
> else table1.al end      as al,
> case when table1.ag = '1800-01-01' and table3.ah = ''  then table3.am
> else table1.am end      as am,
> case when table1.ag = '1800-01-01' and table3.ah = ''  then table3.an
> else table1.an end      as an,
> case when table1.ag = '1800-01-01' and table3.ah = ''  then table3.ao
> else table1.ao end      as ao,
> case when table1.ag = '1800-01-01' and table3.ah = ''  then table3.ap
> else table1.ap end      as ap,
> table1.v,
> table2.w,
> table2.x,
> table2.y,
> table2.z,
> table2.aa,
> table2.ab,
> table2.ac,
> table2.ad,
> table2.ae,
> table2.af
> from anonym2 table1
> left join dfkkopw_hist table3 on (
> table1.a = table3.a
> and table1.b = table3.b
> and table1.c = table3.c
> and table1.aq = table3.aq
> )
> inner join anonym3 table2 on (
> table1.a = table2.a
> and table1.b = table2.b
> and table1.c = table2.c
> )
> ;
>
>
>
>
> Warm regards,
> Chris
>
> Am 05.09.2014 13:46:44, schrieb Hyunsik Choi:
>
> Hi Chris,
>
> That's ok. I'd like suggest the following manners. I've been used those
> manners to fix special user query cases. I'm expecting that you can do at
> least one of them.
>
> - The first manner is to change column names and table names in order to
> hide your business logic if possible.
> - The second manner is to share only join conditions with changing field
> names.
>
> Best regards,
> Hyunsik
>
>
>
>
> On Fri, Sep 5, 2014 at 3:15 AM, Christian Schwabe <
> Christian.Schwabe@gmx.com> wrote:
>
> Hello Hyunsik,
>
> are the more detailed information been helpful? I could offer you attend a
> TeamViewer session to examine the problem more closely. Would that be a
> proposal on which you would let you in? So at least ensured that I must not
> publish queries.
> Due to the time difference in Germany would have to find only a suitable
> date and time.
>
>
> Warm regards,
> Chris
>
>
> Am 01.09.2014 20:26:35, schrieb Christian Schwabe:
>
> Hello Hyunsik,
>
> the late reply does not matter. I'm glad you support me so much. That's worth
> a lot! Unfortunately I can not post the query for privacy reasons. Can I
> send you other information to identify the problem in more detail?
>
> I can try to describe the general query: The projection refers to 43
> fields, it is a left join combined with an inner join, the join conditions are
> made for up to four fields.
> The tables are saved in parquet-format.
> First table (12,853,555 rows 544,6MB)
> Second table (17,896,966 rows 361,1MB)
> Third table (18,825,153 rows 943,2MB)
> Is there a way to force Tajo to a specific algorithm?
>
> You've talked about to give another advice, in the case of the nested join
> is the only choice. What would this mean excactly?
>
> Warm regards,
> Chris
>
> Am 01.09.2014 19:09:59, schrieb Hyunsik Choi:
>
> Hi Chris,
>
> Could you share your query? I can see a part of your query from your
> screenshot, but the query is not entire one.
>
> I think that the first problem is that the query planner chooses nested
> loop join which is very inefficient join algorithm. In some cases, there is
> no alternative if there is no join condition or join condition is theta
> join. But, I'm suspecting that it may be a bug because most of joins are
> equi-joins.
>
> I'll give another advice If block nested-loop join is only choice.
>
> Best regards,
> Hyunsik
>
>
> On Mon, Sep 1, 2014 at 8:54 PM, Christian Schwabe <
> Christian.Schwabe@gmx.com> wrote:
>
>
> Hello guys,
>
> Does anyone have an idea what I could do against the failure of a "nested
> loop join"?
>
>
> Best regards,
> Chris
>
> Am 27.08.2014 10:55:27, schrieb Christian Schwabe:
>
> Hello Jinhang,
>
> Thank you for your helpful hints. In fact, he makes an impact. You can see
> it on the attached screenshot. Now he seems to hang when merging the two
> tables. This helpful hint you should definitely include in your
> documentation. I have previously read this parameter still nowhere …
>
> The log show the following (partial list):
>
>
> 2014-08-27 10:13:09,583 INFO org.apache.tajo.worker.Task: * Local task
> dir: file:/tmp/tajo-chris/tmpdir/q_1409126149411_0001/output/5/1_0
> 2014-08-27 10:13:09,583 INFO org.apache.tajo.worker.Task:
> ==================================
> 2014-08-27 10:13:09,584 INFO org.apache.tajo.worker.Task: the directory is
> created
>  file:/tmp/tajo-chris/tmpdir/q_1409126149411_0001/in/eb_1409126149411_0001_000005/1/0/eb_1409126149411_0001_000003
> 2014-08-27 10:13:09,584 INFO org.apache.tajo.worker.Task: the directory is
> created
>  file:/tmp/tajo-chris/tmpdir/q_1409126149411_0001/in/eb_1409126149411_0001_000005/1/0/eb_1409126149411_0001_000004
> 2014-08-27 10:13:09,600 INFO org.apache.tajo.worker.TaskAttemptContext:
> Query status of ta_1409126149411_0001_000005_000001_00 is changed to
> TA_RUNNING
> 2014-08-27 10:13:09,602 INFO org.apache.tajo.worker.Fetcher: Status:
> FETCH_FETCHING, URI:
> http://192.168.178.101:52334/?qid=q_1409126149411_0001&sid=4&p=1&type=h&ta=68_0,17_0,50_0,32_0,61_0,25_0,55_0,60_0,47_0,69_0,16_0,38_0,2_0,24_0,33_0,11_0,5_0,41_0,44_0,66_0,52_0,19_0,8_0,30_0,63_0,27_0,21_0,26_0,72_0,67_0,36_0,57_0,18_0,49_0,31_0,13_0,29_0,62_0,10_0,0_0,46_0,65_0,3_0,1_0,54_0,39_0,15_0,37_0,23_0,6_0,64_0,59_0,73_0,51_0,28_0,42_0,34_0,12_0,20_0,9_0,56_0,35_0,45_0,48_0,53_0,4_0,70_0,71_0,58_0,40_0,22_0,7_0,14_0,43_0
> 2014-08-27 10:13:09,602 INFO org.apache.tajo.worker.Fetcher: Status:
> FETCH_FETCHING, URI:
> http://christians-mbp.fritz.box:52334/?qid=q_1409126149411_0001&sid=3&p=1&type=h&ta=1_0
> 2014-08-27 10:13:09,602 INFO
> org.apache.tajo.pullserver.TajoPullServerService: Current number of shuffle
> connections (2)
> 2014-08-27 10:13:09,602 INFO
> org.apache.tajo.pullserver.TajoPullServerService: Current number of shuffle
> connections (3)
> 2014-08-27 10:13:09,603 INFO
> org.apache.tajo.pullserver.TajoPullServerService: PullServer request param:
> shuffleType=h, sid=3, partId=1, taskIds=[1_0]
> 2014-08-27 10:13:09,603 INFO
> org.apache.tajo.pullserver.TajoPullServerService: PullServer baseDir:
> /tmp/tajo-chris/tmpdir/q_1409126149411_0001/output
> 2014-08-27 10:13:09,604 INFO
> org.apache.tajo.pullserver.TajoPullServerService: PullServer request param:
> shuffleType=h, sid=4, partId=1,
> taskIds=[68_0,17_0,50_0,32_0,61_0,25_0,55_0,60_0,47_0,69_0,16_0,38_0,2_0,24_0,33_0,11_0,5_0,41_0,44_0,66_0,52_0,19_0,8_0,30_0,63_0,27_0,21_0,26_0,72_0,67_0,36_0,57_0,18_0,49_0,31_0,13_0,29_0,62_0,10_0,0_0,46_0,65_0,3_0,1_0,54_0,39_0,15_0,37_0,23_0,6_0,64_0,59_0,73_0,51_0,28_0,42_0,34_0,12_0,20_0,9_0,56_0,35_0,45_0,48_0,53_0,4_0,70_0,71_0,58_0,40_0,22_0,7_0,14_0,43_0]
> 2014-08-27 10:13:09,604 INFO
> org.apache.tajo.pullserver.TajoPullServerService: PullServer baseDir:
> /tmp/tajo-chris/tmpdir/q_1409126149411_0001/output
> 2014-08-27 10:13:15,313 INFO org.apache.tajo.worker.Fetcher: Data fetch is
> done (total received bytes: 991342257)
> 2014-08-27 10:13:15,317 INFO org.apache.tajo.worker.Fetcher: Status:
> FETCH_FINISHED, URI:
> http://192.168.178.101:52334/?qid=q_1409126149411_0001&sid=4&p=1&type=h&ta=68_0,17_0,50_0,32_0,61_0,25_0,55_0,60_0,47_0,69_0,16_0,38_0,2_0,24_0,33_0,11_0,5_0,41_0,44_0,66_0,52_0,19_0,8_0,30_0,63_0,27_0,21_0,26_0,72_0,67_0,36_0,57_0,18_0,49_0,31_0,13_0,29_0,62_0,10_0,0_0,46_0,65_0,3_0,1_0,54_0,39_0,15_0,37_0,23_0,6_0,64_0,59_0,73_0,51_0,28_0,42_0,34_0,12_0,20_0,9_0,56_0,35_0,45_0,48_0,53_0,4_0,70_0,71_0,58_0,40_0,22_0,7_0,14_0,43_0
> 2014-08-27 10:13:18,685 INFO org.apache.tajo.worker.Fetcher: Data fetch is
> done (total received bytes: 2225872857)
> 2014-08-27 10:13:18,685 INFO org.apache.tajo.worker.Fetcher: Status:
> FETCH_FINISHED, URI:
> http://christians-mbp.fritz.box:52334/?qid=q_1409126149411_0001&sid=3&p=1&type=h&ta=1_0
> 2014-08-27 10:13:18,686 INFO org.apache.tajo.worker.Task:
> ta_1409126149411_0001_000005_000001_00 All fetches are done!
> 2014-08-27 10:13:18,688 INFO
> org.apache.tajo.engine.planner.PhysicalPlannerImpl: Left Outer Join (9)
> chooses [Nested Loop Join].
>
>
> What means „Nested Loop Join“? The last entry in the above log is already
> passed since 40 minutes. Nothing happens since that point.
>
>
> Best regards,
> Chris
>
>
> Am 26.08.2014 um 08:17 schrieb Jinhang Choi <jinhang@linewalks.com>:
>
> At first, you can take a try of changing resource-tracker's heartbeat
> timeout before investigating the actual problem. worker log shows
> on-going fetcher operations even though acknowledging heartbeat responses
> are delayed.
>
> besides, master log indicates Worker's deactivation with
> LivelinessMonitor's timeout as follow:
>
> ====
>
> 2014-08-25 21:18:47,968 INFO
> org.apache.hadoop.yarn.util.AbstractLivelinessMonitor:
> Expired:christians-mbp.fritz.box:28093:28091 Timed out after 120 secs
>
> 2014-08-25 21:18:47,978 INFO org.apache.tajo.master.rm.Worker:
> Deactivating Node christians-mbp.fritz.box:28093:28091 as it is now LOST
> ====
>
>
> How about testing tajo.resource-tracker.heartbeat.timeout-secs in
> tajo-site.xml like this?
>
>
> <configuration>
>
> ....
>
>     <property>
>
>         <name>tajo.resource-tracker.heartbeat.timeout-secs</name>
>
>         <value>240000</value>  // or, your own longer value. default is
> 120*1000 (2 minutes)
>
>     </property>
>
> </configuration>
>
>
> Sincerely,
>
> ----------------------------------------------
> Jinhang Choi, CTO.
> Linewalks Inc. Seoul 137-860, Korea
> Office: +82 70 7702 3043
> FAX: +82 2 2055 0612
>
> -----Original Message-----
> *From:* "Christian Schwabe"<Christian.Schwabe@gmx.com>
> *To:* <user@tajo.apache.org>; "Jinhang Choi"<jinhang@linewalks.com>;
> *Cc:*
> *Sent:* 2014-08-26 (화) 12:09:37
> *Subject:* Re: Big big query
>
> Hello everyone,
>
> I've tested a lot again today. I want to share my experiences and discuss
> it here. I have attached again a log of the Worker for the query which runs
> seemingly endless.
> As requested by you I have assigned to the worker 4GB. 8GB has my MacBook
> available.
> Also assign more memory is no solution.
> Joins on small tables with windows_functions an little content seem to be
> no problem.
> Joins with many columns and large tables, as it is a table in this example
> seems to be a real problem. My guess at this point is an incorrect memory
> management of Tajo.
> I have a video made by you better show the WebUI and to get a better
> picture of the situation. In combination with the submitted logs I hope
> here together to find a solution.
> Here is the video: http://youtu.be/_TKzRluzg38
>
>
> Best regards,
> Chris
>
>
>
>
> Am 25.08.2014 um 08:52 schrieb Jinhang Choi <jinhang@linewalks.com>:
>
> Dear Christian,
>
> worker log indicates that "GC overhead limit exceeded."
>
> would you mind to extend worker's heap memory size at tajo-env.sh?
>
> (please refer to
> http://tajo.apache.org/docs/current/configuration/worker_configuration.html)
>
> Sincerely,
> ----------------------------------------------
> Jinhang Choi, CTO.
> Linewalks Inc. Seoul 137-860, Korea
> Office: +82 70 7702 3043
> FAX: +82 2 2055 0612
>
> -----Original Message-----
> *From:* "Christian Schwabe"<Christian.Schwabe@gmx.com>
> *To:* <user@tajo.apache.org>;
> *Cc:*
> *Sent:* 2014-08-25 (월) 15:33:15
> *Subject:* Re: Big big query
>
>
> Hello Hyunsik,
> sorry. My fault. I will send you another email with the attached logs.
>
> Best regards,
> Chris
>
> Am 25.08.2014 08:28:17, schrieb Hyunsik Choi:
>
> Hi Chris,
>
> As Jihoon mentioned, it would be better to find the problems if you attach
> master and worker logs.
>
> Thanks,
> hyunsik
>
>
> On Sun, Aug 24, 2014 at 4:17 PM, Christian Schwabe <
> Christian.Schwabe@gmx.com> wrote:
>
> Hello guys
>
> i started following query last night and this morning have seen that still
> ran the query with the fact that has the display of procent not changed and
> only ran on time. So I have to start again this morning reproduce the query
> to the error. The result you see in the table below:
> Running QueriesQueryIdQuery Master Started ProgressTimeStatus sql Kill
> Queryq_1408862421753_0001
> <http://christians-mbp.fritz.box:28080/querydetail.jsp?queryId=q_1408862421753_0001>
> christians-mbp.fritz.box2014-08-24 08:46:33 45% 15 mins, 48 sec
> QUERY_RUNNINGINSERT INTO TMP_DFKKKO_DFKKOP select pos.validthru,
> pos.mandt, pos.opbel, pos.opupk, pos.opupz, pos.bukrs, pos.augrs,
> pos.abwkt, pos.hvorg, pos.tvorg, pos.kofiz, pos.hkont, pos.mwskz,
> pos.mwszkz, pos.xanza, pos.stakz, pos.astkz, pos.opsta, pos.infoz,
> pos.inkps, pos.betrh, pos.studt, ko.fikey, ko.blart, ko.herkf, ko.stbel,
> ko.storb, ko.ernam, ko.cpudt, ko.cputm, ko.bldat, ko.budat from dfkkop_hist
> pos left join dfkkopw_hist wdh on ( pos.validthru = wdh.validthru and
> pos.mandt = wdh.mandt and pos.opbel = wdh.opbel and pos.whgrp = wdh.whgrp )
> inner join dfkkko_hist ko on ( pos.validthru = ko.validthru and pos.mandt =
> ko.mandt and pos.opbel = ko.opbel )
>
> Second Screenshot:
> Running QueriesQueryIdQuery Master Started ProgressTimeStatus sql Kill
> Queryq_1408862421753_0001
> <http://christians-mbp.fritz.box:28080/querydetail.jsp?queryId=q_1408862421753_0001>
> christians-mbp.fritz.box2014-08-24 08:46:33 43% 23 mins, 21 sec
> QUERY_RUNNINGINSERT INTO TMP_DFKKKO_DFKKOP select pos.validthru,
> pos.mandt, pos.opbel, pos.opupk, pos.opupz, pos.bukrs, pos.augrs,
> pos.abwkt, pos.hvorg, pos.tvorg, pos.kofiz, pos.hkont, pos.mwskz,
> pos.mwszkz, pos.xanza, pos.stakz, pos.astkz, pos.opsta, pos.infoz,
> pos.inkps, pos.betrh, pos.studt, ko.fikey, ko.blart, ko.herkf, ko.stbel,
> ko.storb, ko.ernam, ko.cpudt, ko.cputm, ko.bldat, ko.budat from dfkkop_hist
> pos left join dfkkopw_hist wdh on ( pos.validthru = wdh.validthru and
> pos.mandt = wdh.mandt and pos.opbel = wdh.opbel and pos.whgrp = wdh.whgrp )
> inner join dfkkko_hist ko on ( pos.validthru = ko.validthru and pos.mandt =
> ko.mandt and pos.opbel = ko.opbel )
>
> Third Screenshot:
>
> Finished QueriesQueryIdQuery Master Started FinishedTimeStatus sql
> q_1408862421753_0001
> <http://christians-mbp.fritz.box:28080/querydetail.jsp?queryId=q_1408862421753_0001>
> christians-mbp.fritz.box 2014-08-24 08:46:33-- QUERY_RUNNINGINSERT INTO
> TMP_DFKKKO_DFKKOP select pos.validthru, pos.mandt, pos.opbel, pos.opupk,
> pos.opupz, pos.bukrs, pos.augrs, pos.abwkt, pos.hvorg, pos.tvorg,
> pos.kofiz, pos.hkont, pos.mwskz, pos.mwszkz, pos.xanza, pos.stakz,
> pos.astkz, pos.opsta, pos.infoz, pos.inkps, pos.betrh, pos.studt, ko.fikey,
> ko.blart, ko.herkf, ko.stbel, ko.storb, ko.ernam, ko.cpudt, ko.cputm,
> ko.bldat, ko.budat from dfkkop_hist pos left join dfkkopw_hist wdh on (
> pos.validthru = wdh.validthru and pos.mandt = wdh.mandt and pos.opbel =
> wdh.opbel and pos.whgrp = wdh.whgrp ) inner join dfkkko_hist ko on (
> pos.validthru = ko.validthru and pos.mandt = ko.mandt and pos.opbel =
> ko.opbel )
>
> As you can see, the query is still running but there is no image data and
> progress more.
>
> I attached a log in which you can see the output in the console. Striking
> here is the display of the percentage jumps sometimes and not further
> altered towards the end and remains constant.
> The data size of the tables to which I JOINS by lead is for dfkkop_hist
> 5,83GB, dfkkopw_hist 2,47GB and dfkkko_hist 2.35 GB. As I write this, the
> description of the query is still running.
> I know these are large amounts of data, but I would have expected which
> thus constitutes the colloquial no problem. Can you imagine why it comes
> here to this problem?
>
>
>
>
> Best regards,
> Chris
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>

Mime
View raw message