Return-Path:
X-Original-To: apmail-aurora-reviews-archive@minotaur.apache.org
Delivered-To: apmail-aurora-reviews-archive@minotaur.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
 by minotaur.apache.org (Postfix) with SMTP id 605541894D
 for ; Fri, 12 Feb 2016 01:09:35 +0000 (UTC)
Received: (qmail 85393 invoked by uid 500); 12 Feb 2016 01:09:35 -0000
Delivered-To: apmail-aurora-reviews-archive@aurora.apache.org
Received: (qmail 85343 invoked by uid 500); 12 Feb 2016 01:09:35 -0000
Mailing-List: contact reviews-help@aurora.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: reviews@aurora.apache.org
Delivered-To: mailing list reviews@aurora.apache.org
Received: (qmail 85315 invoked by uid 99); 12 Feb 2016 01:09:35 -0000
Received: from reviews-vm.apache.org (HELO reviews.apache.org) (140.211.11.40)
 by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 12 Feb 2016 01:09:35 +0000
Received: from reviews.apache.org (localhost [127.0.0.1])
 by reviews.apache.org (Postfix) with ESMTP id 093622A60A5;
 Fri, 12 Feb 2016 01:09:34 +0000 (UTC)
Content-Type: multipart/alternative; boundary="===============8637582314504276925=="
MIME-Version: 1.0
Subject: Re: Review Request 43457: Increase throughput of DbTaskStore
From: Maxim Khutornenko
To: Maxim Khutornenko, John Sirois
Cc: Bill Farner, Aurora ReviewBot, Zameer Manji, Aurora
Date: Fri, 12 Feb 2016 01:09:34 -0000
Message-ID: <20160212010934.24149.10793@reviews.apache.org>
X-ReviewBoard-URL: https://reviews.apache.org/
Auto-Submitted: auto-generated
Sender: Maxim Khutornenko
X-ReviewGroup: Aurora
X-Auto-Response-Suppress: DR, RN, OOF, AutoReply
X-ReviewRequest-URL: https://reviews.apache.org/r/43457/
X-Sender: Maxim Khutornenko
References: <20160211005719.24150.75135@reviews.apache.org>
In-Reply-To: <20160211005719.24150.75135@reviews.apache.org>
X-ReviewBoard-Diff-For: src/main/java/org/apache/aurora/scheduler/storage/db/views/DbAssginedPort.java
Reply-To:
 Maxim Khutornenko
X-ReviewRequest-Repository: aurora

--===============8637582314504276925==
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit

> On Feb. 11, 2016, 12:57 a.m., Bill Farner wrote:
> > It would be nice to hear how this change jibes with the opposite change made in https://reviews.apache.org/r/42882
>
> Maxim Khutornenko wrote:
>     I thought about that too. I think there are 2 major differences: the total number of rows generated by the multi-join select statement and the number of required subselects. In that RB, lowering the row count from 500k to just under 100, plus the low number of required subselects, helped to unlock perf gains.
>
>     In this particular scenario, it appears that the frequency of subselects trumps everything else.
>
>     Zameer, what's the overall number of rows returned by the select statement for a single task in your case?
>
> Zameer Manji wrote:
>     Running:
>     ````
>     SELECT
>       t.id AS row_id,
>       t.task_config_row_id AS task_config_row_id,
>       t.task_id AS task_id,
>       t.instance_id AS instance_id,
>       t.status AS status,
>       t.failure_count AS failure_count,
>       t.ancestor_task_id AS ancestor_id,
>       j.role AS c_j_role,
>       j.environment AS c_j_environment,
>       j.name AS c_j_name,
>       h.slave_id AS slave_id,
>       h.host AS slave_host,
>       tp.name as tp_name,
>       tp.port as tp_port,
>       te.timestamp_ms as te_timestamp,
>       te.status as te_status,
>       te.message as te_message,
>       te.scheduler_host as te_scheduler
>     FROM tasks AS t
>     INNER JOIN task_configs as c ON c.id = t.task_config_row_id
>     INNER JOIN job_keys AS j ON j.id = c.job_key_id
>     LEFT OUTER JOIN task_ports as tp ON tp.task_row_id = t.id
>     LEFT OUTER JOIN task_events as te ON te.task_row_id = t.id
>     LEFT OUTER JOIN host_attributes AS h ON h.id = t.slave_row_id
>     WHERE task_id = '1454546771388-zmanji-devel-labrat-237-0e52b4a9-a8da-4958-997f-7bbe3db6b5d2'
>     ````
>
>     On a test cluster, this returns 4 rows where the task is in the RUNNING state.
>
>     Consider that a job typically does not allocate that many ports, and a task will have fewer than 8 events.
>
>     Further, running:
>     ````
>     SELECT
>       c.id AS id,
>       c.creator_user AS creator_user,
>       c.service AS is_service,
>       c.num_cpus AS num_cpus,
>       c.ram_mb AS ram_mb,
>       c.disk_mb AS disk_mb,
>       c.priority AS priority,
>       c.max_task_failures AS max_task_failures,
>       c.production AS production,
>       c.contact_email AS contact_email,
>       c.executor_name AS executor_name,
>       c.executor_data AS executor_data,
>       c.tier AS tier,
>       j.role AS j_role,
>       j.environment AS j_environment,
>       j.name AS j_name,
>       p.port_name AS p_port_name,
>       d.id AS c_id,
>       d.image AS c_image,
>       m.id AS m_id,
>       m.key AS m_key,
>       m.value AS m_value,
>       tc.id AS constraint_id,
>       tc.name AS constraint_name,
>       tlc.id AS constraint_l_id,
>       tlc.value AS constraint_l_limit,
>       tvc.id AS constraint_v_id,
>       tvc.negated AS constraint_v_negated,
>       tvcv.value as constraint_v_v_value
>     FROM task_configs AS c
>     INNER JOIN job_keys AS j ON j.id = c.job_key_id
>     LEFT OUTER JOIN task_config_requested_ports AS p ON p.task_config_id = c.id
>     LEFT OUTER JOIN task_config_docker_containers AS d ON d.task_config_id = c.id
>     LEFT OUTER JOIN task_config_metadata AS m ON m.task_config_id = c.id
>     LEFT OUTER JOIN task_constraints AS tc ON tc.task_config_id = c.id
>     LEFT OUTER JOIN limit_constraints as tlc ON tlc.constraint_id = tc.id
>     LEFT OUTER JOIN value_constraints as tvc ON tvc.constraint_id = tc.id
>     LEFT OUTER JOIN value_constraint_values AS tvcv ON tvcv.value_constraint_id = tvc.id
>     WHERE c.id = 1
>     ````
>
>     This returns 2 rows for a task in the above job.
>
>     I think this is because a typical job doesn't have that many constraints.

Thanks Zameer. This confirms my assumptions about row count vs. sub-select chattiness.


- Maxim


-----------------------------------------------------------
This is an automatically generated e-mail.
To reply, visit: https://reviews.apache.org/r/43457/#review118790
-----------------------------------------------------------


On Feb. 11, 2016, 8:03 p.m., Zameer Manji wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/43457/
> -----------------------------------------------------------
>
> (Updated Feb. 11, 2016, 8:03 p.m.)
>
>
> Review request for Aurora, John Sirois and Maxim Khutornenko.
>
>
> Repository: aurora
>
>
> Description
> -------
>
> Profiling master indicated that the bottleneck was MyBatis populating ResultSets and populating the resulting objects. This patch removes subselects, which reduces the number of ResultSets, and stops populating objects via a constructor, which is slower than populating them via setters.
>
>
> Diffs
> -----
>
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbAssginedPort.java PRE-CREATION
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbAssignedTask.java 93722395ed9fcd22dcb12e34e648e6e410952d43
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbScheduledTask.java 502a1fa6fc141df498f0f09af292ce24e269731d
>   src/main/resources/org/apache/aurora/scheduler/storage/db/TaskConfigMapper.xml b1394cf44b7ddafcbc47bb1968306d0b33293380
>   src/main/resources/org/apache/aurora/scheduler/storage/db/TaskMapper.xml ea469cce31544221c34ae05a1c65f71271985655
>
> Diff: https://reviews.apache.org/r/43457/diff/
>
>
> Testing
> -------
>
> Master:
> Benchmark                                      (numTasks)   Mode  Cnt   Score    Error  Units
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  44.052 ± 14.689  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   0.179 ±  0.052  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   0.087 ±  0.022  ops/s
>
> This Patch:
> Benchmark                                      (numTasks)   Mode  Cnt   Score    Error  Units
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  51.531 ±  7.236  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   7.370 ±  1.320  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   2.143 ±  1.234  ops/s
>
>
> Thanks,
>
> Zameer Manji
>

--===============8637582314504276925==--
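
The trade-off discussed in this thread (the row count of one multi-join SELECT versus the number of per-task subselects) can be illustrated with a small sketch. It is not the Aurora or MyBatis code: it uses Python's sqlite3 module with a hypothetical three-table schema that borrows only a few column names from the queries quoted above, and the helper names (build_db, fetch_joined, fetch_with_subselects) are made up for illustration. The counts it prints are properties of the toy data, not measurements of the scheduler.

```python
import sqlite3


def build_db(num_tasks, ports_per_task, events_per_task):
    """Create an in-memory database with a simplified, hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tasks (id INTEGER PRIMARY KEY, task_id TEXT);
        CREATE TABLE task_ports (task_row_id INTEGER, name TEXT, port INTEGER);
        CREATE TABLE task_events (task_row_id INTEGER, timestamp_ms INTEGER, status TEXT);
    """)
    for t in range(num_tasks):
        conn.execute("INSERT INTO tasks (id, task_id) VALUES (?, ?)", (t, f"task-{t}"))
        for p in range(ports_per_task):
            conn.execute("INSERT INTO task_ports VALUES (?, ?, ?)", (t, f"port{p}", 31000 + p))
        for e in range(events_per_task):
            conn.execute("INSERT INTO task_events VALUES (?, ?, ?)", (t, e, "RUNNING"))
    return conn


def fetch_joined(conn):
    """Single statement: each task contributes (ports x events) rows,
    because the two LEFT OUTER JOINs multiply against each other."""
    rows = conn.execute("""
        SELECT t.id, tp.name, tp.port, te.timestamp_ms, te.status
        FROM tasks AS t
        LEFT OUTER JOIN task_ports AS tp ON tp.task_row_id = t.id
        LEFT OUTER JOIN task_events AS te ON te.task_row_id = t.id
    """).fetchall()
    return 1, len(rows)  # (statements issued, rows returned)


def fetch_with_subselects(conn):
    """One statement for the tasks, then two more per task (ports, events)."""
    statements = 1
    task_ids = [row[0] for row in conn.execute("SELECT id FROM tasks").fetchall()]
    total_rows = len(task_ids)
    for task_row_id in task_ids:
        ports = conn.execute(
            "SELECT name, port FROM task_ports WHERE task_row_id = ?",
            (task_row_id,)).fetchall()
        events = conn.execute(
            "SELECT timestamp_ms, status FROM task_events WHERE task_row_id = ?",
            (task_row_id,)).fetchall()
        statements += 2
        total_rows += len(ports) + len(events)
    return statements, total_rows


# Toy data: 100 tasks, each with 1 port and 4 events (assumed values).
conn = build_db(num_tasks=100, ports_per_task=1, events_per_task=4)
print("joined:    ", fetch_joined(conn))           # (1, 400)
print("subselects:", fetch_with_subselects(conn))  # (201, 600)
```

With few ports and events per task, the joined query returns a modest number of rows in one statement, while the subselect style issues a number of statements that grows linearly with the task count. If ports_per_task and events_per_task were both large, the joined row count (their product per task) would grow quickly instead, which is the other side of the trade-off raised in the review.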