tajo-dev mailing list archives

From "Hyunsik Choi (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (TAJO-178) Implements StorageManager for scanning asynchronously
Date Thu, 12 Sep 2013 09:54:53 GMT

    [ https://issues.apache.org/jira/browse/TAJO-178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13765308#comment-13765308 ]

Hyunsik Choi edited comment on TAJO-178 at 9/12/13 9:53 AM:
------------------------------------------------------------

The current issue title better represents the submitted patch.

Also, I agree with the main purpose of this feature, and it's promising. Even though the current
throughput of this patch is similar to that of the existing implementation, it will be a key
feature for efficient I/O once we fully optimize it.

Later, we could also extend this to something like cooperative scans across multiple queries.

I'll take a look at this patch tonight.
                
                  
> Implements StorageManager for scanning asynchronously
> -----------------------------------------------------
>
>                 Key: TAJO-178
>                 URL: https://issues.apache.org/jira/browse/TAJO-178
>             Project: Tajo
>          Issue Type: Improvement
>          Components: storage
>    Affects Versions: 0.2-incubating
>            Reporter: hyoungjunkim
>            Assignee: hyoungjunkim
>         Attachments: TAJO-178_1.path, TAJO-178.patch_2, TAJO-178.path, tajo_storage_manager.png
>
>
> The current StorageManager does not provide a scan scheduling function, so all scan
> operations run concurrently. This causes random disk access and poor disk read performance.
> The proposed StorageManager is based on double buffering. Each disk has a scheduler to
> schedule by order of scanned adjust. Each Scanner has an InputStream and a Tuple pool. The
> next() operation of ScanNode blocks until the Tuple pool is filled. The Scanner assigned by
> the scheduler reads data (x MB), fills the Tuple pool, and notifies the next() operation.
> After scanning, the Scanner re-enters the DiskScanQueue.
> In this way, a Scanner can pass a column vector to the Vectorized Query Engine.
> See the attached file.
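
The flow described above (a per-disk queue, a Scanner that fills a Tuple pool, and a next() call that blocks until data is ready) could be sketched roughly as follows. This is only an illustration and is not taken from the attached patch: every class and method name here (SketchScanner, SketchDiskScheduler, AsyncScanSketch, fillOnce, runDemo) is hypothetical, a list of strings stands in for the InputStream and its tuples, the scheduler serves scanners in plain FIFO order, and a Scanner re-enters the queue when the reader picks up a chunk so that the next chunk is read while the current one is consumed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** One scheduler per disk; it serves queued scanners one at a time (FIFO is an assumption). */
class SketchDiskScheduler implements Runnable {
    private final BlockingQueue<SketchScanner> diskScanQueue = new LinkedBlockingQueue<>();

    void submit(SketchScanner scanner) { diskScanQueue.add(scanner); }

    @Override public void run() {
        try {
            while (true) diskScanQueue.take().fillOnce(); // one chunk per turn
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();           // stop when interrupted
        }
    }
}

/** Hypothetical scanner: a list of strings stands in for the InputStream and its tuples. */
class SketchScanner {
    private static final List<String> EOF = new ArrayList<>(); // end marker, compared by identity
    private final SketchDiskScheduler scheduler;
    private final List<String> source;
    private final int chunkSize;                               // stands in for the "x MB" read unit
    private final BlockingQueue<List<String>> tuplePool = new LinkedBlockingQueue<>();
    private int offset = 0;                                    // touched only by the scheduler thread
    private List<String> current = new ArrayList<>();          // touched only by the reader thread
    private int pos = 0;

    SketchScanner(SketchDiskScheduler scheduler, List<String> source, int chunkSize) {
        this.scheduler = scheduler;
        this.source = source;
        this.chunkSize = chunkSize;
        scheduler.submit(this);                                // request the first chunk
    }

    /** Runs on the scheduler thread: read one chunk (or the end marker) into the pool. */
    void fillOnce() {
        if (offset >= source.size()) { tuplePool.add(EOF); return; }
        int end = Math.min(offset + chunkSize, source.size());
        tuplePool.add(new ArrayList<>(source.subList(offset, end)));
        offset = end;
    }

    /** Runs on the reader thread: blocks until a chunk is available; returns null at the end. */
    String next() throws InterruptedException {
        while (pos >= current.size()) {
            if (current == EOF) return null;
            current = tuplePool.take();                        // wait for the scheduler to fill the pool
            pos = 0;
            if (current != EOF) scheduler.submit(this);        // re-enter the queue to prefetch the next chunk
        }
        return current.get(pos++);
    }
}

class AsyncScanSketch {
    static List<String> drain(SketchScanner scanner) {
        List<String> out = new ArrayList<>();
        try {
            for (String t = scanner.next(); t != null; t = scanner.next()) out.add(t);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return out;
    }

    /** Two scanners share one disk scheduler; returns every tuple that was read. */
    static List<String> runDemo() {
        SketchDiskScheduler disk = new SketchDiskScheduler();
        Thread worker = new Thread(disk, "disk-0-scheduler");
        worker.setDaemon(true);
        worker.start();
        SketchScanner a = new SketchScanner(disk, Arrays.asList("a1", "a2", "a3"), 2);
        SketchScanner b = new SketchScanner(disk, Arrays.asList("b1"), 2);
        List<String> out = drain(a);
        out.addAll(drain(b));
        worker.interrupt();
        return out;
    }
}
```

Calling AsyncScanSketch.runDemo() reads both scanners through the single scheduler thread. Because only that thread touches the simulated disk, reads never interleave within a chunk, which is the property the proposal relies on to reduce random access.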

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira