tajo-dev mailing list archives

From "Jihoon Son (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (TAJO-367) Separate the locality information from Fragment
Date Tue, 03 Dec 2013 07:21:35 GMT

     [ https://issues.apache.org/jira/browse/TAJO-367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jihoon Son updated TAJO-367:
----------------------------

    Description: 
Fragment is designed to represent a portion of an abstracted input source.
However, because it is currently used for both task scheduling and task allocation, it carries
locality information in addition to the abstraction of the input data.
Locality information is used only in task scheduling, so it
should be separated from Fragment.

Task scheduling uses the locality information to assign tasks to the workers closest
to the data, regardless of the kind of storage layer.
To let the task scheduler consider both the input data and its locality, we need to design a new class,
such as FragmentWithHost, that contains a Fragment and its locality information.

This issue covers the following work items:
* Removing the host information from FileFragment
* Creating a new class FragmentWithHost that contains an instance of the Fragment interface
and the locality information, consisting of hosts and disk IDs
* Refactoring SubQuery, StorageManager and TaskScheduler to use FragmentWithHost
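
The separation described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual Tajo code: the one-method Fragment interface, the SimpleFileFragment class, the LocalityAwarePicker helper, and all method names are hypothetical, and the sketch assumes that disk IDs line up with hosts by index.

```java
import java.util.List;
import java.util.Objects;
import java.util.Optional;

// Hypothetical stand-in for the Fragment interface; the real one may differ.
interface Fragment {
    String getTableName();
}

// Hypothetical file-backed fragment that no longer carries host information.
final class SimpleFileFragment implements Fragment {
    private final String tableName;
    private final String path;
    private final long startOffset;
    private final long length;

    SimpleFileFragment(String tableName, String path, long startOffset, long length) {
        this.tableName = tableName;
        this.path = path;
        this.startOffset = startOffset;
        this.length = length;
    }

    @Override public String getTableName() { return tableName; }
    String getPath() { return path; }
    long getStartOffset() { return startOffset; }
    long getLength() { return length; }
}

// Pairs a Fragment with its locality information (hosts and disk IDs).
final class FragmentWithHost {
    private final Fragment fragment;
    private final String[] hosts;
    private final int[] diskIds;

    FragmentWithHost(Fragment fragment, String[] hosts, int[] diskIds) {
        // Assumption: diskIds[i] is the disk on hosts[i].
        if (hosts.length != diskIds.length) {
            throw new IllegalArgumentException("hosts and diskIds must have the same length");
        }
        this.fragment = Objects.requireNonNull(fragment);
        this.hosts = hosts.clone();
        this.diskIds = diskIds.clone();
    }

    Fragment getFragment() { return fragment; }
    String[] getHosts() { return hosts.clone(); }
    int[] getDiskIds() { return diskIds.clone(); }

    // True if a replica of the fragment's data resides on the given host.
    boolean isLocalTo(String host) {
        for (String h : hosts) {
            if (h.equals(host)) return true;
        }
        return false;
    }
}

// Hypothetical scheduler helper: prefer a fragment local to the worker,
// otherwise fall back to the first pending fragment.
final class LocalityAwarePicker {
    static Optional<FragmentWithHost> pickFor(String workerHost, List<FragmentWithHost> pending) {
        for (FragmentWithHost candidate : pending) {
            if (candidate.isLocalTo(workerHost)) return Optional.of(candidate);
        }
        return pending.isEmpty() ? Optional.empty() : Optional.of(pending.get(0));
    }
}
```

With this shape, only the scheduler needs to see FragmentWithHost; a worker that executes a task can be handed the plain Fragment obtained from getFragment().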

  was:
Fragment is designed to represent a portion of an abstracted input source.
However, because it is currently used for both task scheduling and task allocation, it carries
locality information in addition to the abstraction of the input data.
Locality information is used only in task scheduling, so it
should be separated from Fragment.

Task scheduling uses the locality information to assign tasks to the workers closest
to the data, regardless of the kind of storage layer.
To let the task scheduler consider both the input data and its locality, we need to design a new class,
such as FragmentWithHost, that contains a Fragment and its locality information.


> Separate the locality information from Fragment
> -----------------------------------------------
>
>                 Key: TAJO-367
>                 URL: https://issues.apache.org/jira/browse/TAJO-367
>             Project: Tajo
>          Issue Type: Improvement
>          Components: master, storage, worker
>            Reporter: Jihoon Son
>             Fix For: 0.8-incubating
>
>
> Fragment is designed to represent a portion of an abstracted input source.
> However, because it is currently used for both task scheduling and task allocation,
> it carries locality information in addition to the abstraction of the input data.
> Locality information is used only in task scheduling, so it
> should be separated from Fragment.
> Task scheduling uses the locality information to assign tasks to the workers closest
> to the data, regardless of the kind of storage layer.
> To let the task scheduler consider both the input data and its locality, we need to design a
> new class, such as FragmentWithHost, that contains a Fragment and its locality information.
> This issue covers the following work items:
> * Removing the host information from FileFragment
> * Creating a new class FragmentWithHost that contains an instance of the Fragment interface
> and the locality information, consisting of hosts and disk IDs
> * Refactoring SubQuery, StorageManager and TaskScheduler to use FragmentWithHost



--
This message was sent by Atlassian JIRA
(v6.1#6144)
