Date: Tue, 24 Dec 2013 07:19:50 +0000 (UTC)
From: "Jihoon Son (JIRA)"
To: dev@tajo.incubator.apache.org
Subject: [jira] [Updated] (TAJO-385) Refactoring TaskScheduler to assign multiple fragments

     [ https://issues.apache.org/jira/browse/TAJO-385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jihoon Son updated TAJO-385:
----------------------------

    Attachment: TAJO-385_5.patch

I uploaded a new patch. In this patch, I changed the configuration properties as follows.

{noformat}
tajo.querymaster.lazy-task-scheduler.algorithm    task_scheduling_algorithm_for_lazy_task_scheduler
tajo.task.size-mb                                 task_size_in_mb
{noformat}

I also added a new scheduling algorithm for LazyTaskScheduler, called GreedyFragmentSchedulingAlgorithm. When I tested DefaultFragmentSchedulingAlgorithm *after clearing the disk cache*, query processing performance dropped significantly because the number of remote tasks increased. I tried to find the cause of the increase in remote tasks, but I couldn't. So I created GreedyFragmentSchedulingAlgorithm, which reduces the number of remote tasks even when the disk cache is cleared.

I'm going to upload this patch to the RB. Please review the patch.

Thanks and Happy Christmas!!

> Refactoring TaskScheduler to assign multiple fragments
> ------------------------------------------------------
>
>                 Key: TAJO-385
>                 URL: https://issues.apache.org/jira/browse/TAJO-385
>             Project: Tajo
>          Issue Type: Improvement
>          Components: query master
>    Affects Versions: 0.8-incubating
>            Reporter: Jihoon Son
>            Assignee: Jihoon Son
>         Attachments: TAJO-385.patch, TAJO-385_2.patch, TAJO-385_3.patch, TAJO-385_4.patch, TAJO-385_5.patch
>
>
> In the current implementation, each task processes only one fragment.
> However, processing multiple fragments in a task can improve query processing performance, depending on the storage layout and the user queries.
> In this issue, TaskScheduler is refactored to enable assigning multiple fragments to each task.
> The following should be included.
> * Schedule Fragments instead of QueryUnits in TaskScheduler
> ** The QueryUnit creation is postponed until TaskScheduler receives task requests from workers.
> ** When TaskScheduler receives task requests from workers, it dynamically creates a QueryUnit and assigns one or more fragments.
> ** The fragment scheduling should take disk load balancing into account.
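
The scheduling behaviour described in the issue (tasks created only when a worker asks for one, one or more fragments per task, fewer remote tasks, some attention to disk load) can be illustrated with a minimal Java sketch. This is not the code from TAJO-385_5.patch and not Tajo's actual API: the class LazyFragmentSchedulerSketch, the Fragment fields, and the onTaskRequest method are hypothetical names chosen for illustration, and the taskSizeMb limit only mirrors the idea behind tajo.task.size-mb.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of lazy, greedy fragment scheduling; not Tajo code.
final class LazyFragmentSchedulerSketch {

    // Simplified fragment model (assumption): one host, one disk, a size in MB.
    static final class Fragment {
        final String host;
        final int diskId;
        final long sizeMb;

        Fragment(String host, int diskId, long sizeMb) {
            this.host = host;
            this.diskId = diskId;
            this.sizeMb = sizeMb;
        }
    }

    private final List<Fragment> pending = new ArrayList<>();
    private final long taskSizeMb;

    LazyFragmentSchedulerSketch(long taskSizeMb) {
        this.taskSizeMb = taskSizeMb;
    }

    void addFragment(Fragment fragment) {
        pending.add(fragment);
    }

    // Called when a worker requests work. The task (a list of fragments) is
    // built only at this point, not ahead of time.
    List<Fragment> onTaskRequest(String workerHost) {
        List<Fragment> task = new ArrayList<>();
        fill(task, workerHost, true);        // prefer fragments local to the worker
        if (task.isEmpty()) {
            fill(task, workerHost, false);   // fall back to a remote task
        }
        return task;
    }

    // Greedily adds pending fragments (local or remote, per the flag) until
    // the size limit is reached, picking the least-used disk at each step.
    private void fill(List<Fragment> task, String workerHost, boolean local) {
        long assignedMb = 0;
        Map<Integer, Integer> perDisk = new HashMap<>();
        while (assignedMb < taskSizeMb) {
            Fragment best = null;
            for (Fragment f : pending) {
                if (f.host.equals(workerHost) != local) {
                    continue;
                }
                // Always allow the first fragment, even if it exceeds the limit.
                if (!task.isEmpty() && assignedMb + f.sizeMb > taskSizeMb) {
                    continue;
                }
                if (best == null
                        || perDisk.getOrDefault(f.diskId, 0) < perDisk.getOrDefault(best.diskId, 0)) {
                    best = f;
                }
            }
            if (best == null) {
                return;
            }
            pending.remove(best);
            task.add(best);
            assignedMb += best.sizeMb;
            perDisk.merge(best.diskId, 1, Integer::sum);
        }
    }
}
```

For example, with a 128 MB limit and three 64 MB fragments local to a worker (two on disk 0, one on disk 1), the first request from that worker receives one fragment from each disk, and a remote fragment is handed out only once no local fragment remains.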