cassandra-commits mailing list archives

From "Johan Oskarsson (JIRA)" <j...@apache.org>
Subject [jira] Updated: (CASSANDRA-890) Get Hadoop input format sub splits in parallel
Date Tue, 16 Mar 2010 13:39:27 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Johan Oskarsson updated CASSANDRA-890:
--------------------------------------

    Attachment: CASSANDRA-890.patch

Uses an Executor to run all the getSubSplits calls in parallel, which speeds up job startup
when there are many calls.
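
As an illustration of the approach described above, here is a minimal sketch in modern Java. It is not the code from CASSANDRA-890.patch: the class name ParallelSubSplits, the method fetchAll, and the use of plain strings for ranges and splits are all assumptions made for the example. The caller supplies the per-range lookup, standing in for getSubSplits, as a function.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper, not taken from the patch.
final class ParallelSubSplits {

    // Runs one getSubSplits-style lookup per range on a fixed thread pool
    // and concatenates the results in the order of the input ranges.
    static List<String> fetchAll(List<String> ranges,
                                 Function<String, List<String>> getSubSplits,
                                 int threads) {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        try {
            // One task per range; each task performs a single lookup.
            List<Callable<List<String>>> tasks = new ArrayList<>();
            for (String range : ranges) {
                tasks.add(() -> getSubSplits.apply(range));
            }

            // invokeAll blocks until every task has finished and returns
            // the futures in the same order as the submitted tasks.
            List<String> splits = new ArrayList<>();
            for (Future<List<String>> future : executor.invokeAll(tasks)) {
                splits.addAll(future.get());
            }
            return splits;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while fetching sub splits", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a sub split lookup failed", e.getCause());
        } finally {
            // Release the pool threads whether or not the lookups succeeded.
            executor.shutdown();
        }
    }
}
```

With a pool of this kind, the total wait is closer to the slowest single lookup than to the sum of all lookups, provided the pool has enough threads for the number of ranges.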

> Get Hadoop input format sub splits in parallel
> ----------------------------------------------
>
>                 Key: CASSANDRA-890
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-890
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Contrib
>            Reporter: Johan Oskarsson
>         Attachments: CASSANDRA-890.patch
>
>
> To improve Hadoop job startup time we can multithread parts of the input format. Specifically,
> the fetching of "sub splits" from many nodes can be run in parallel.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

