hbase-dev mailing list archives

From "Billy Pearson (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (HBASE-675) Report correct server hosting a table split for assignment to for MR Jobs
Date Sat, 30 Aug 2008 21:29:44 GMT

    [ https://issues.apache.org/jira/browse/HBASE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627256#action_12627256 ]

viper799 edited comment on HBASE-675 at 8/30/08 2:28 PM:
--------------------------------------------------------------

One of the main benefits I can see us getting from this patch is not faster job completion
but reduced network traffic. This will improve performance on large clusters that
have limited bandwidth between racks.
In theory you should see lower bandwidth use between servers on a cluster of any size; it
just might not be noticeable on a small cluster.

      was (Author: viper799):
    One of the main benefits I can see getting from this patch is not speed of job completion
but reduced network traffic. This will improve performance on large clusters that have limited
bandwidth between racks etc..
But in theory you should have seen lower bandwidth used between the servers of any size just
might not be noticeable on small cluster.
  
> Report correct server hosting a table split for assignment to for MR Jobs
> -------------------------------------------------------------------------
>
>                 Key: HBASE-675
>                 URL: https://issues.apache.org/jira/browse/HBASE-675
>             Project: Hadoop HBase
>          Issue Type: Improvement
>    Affects Versions: 0.2.0
>            Reporter: Billy Pearson
>            Priority: Minor
>             Fix For: 0.19.0
>
>         Attachments: hbase-675-v1.patch
>
>
> Currently we return a null String array to the MR framework, so it uses a random node for
> MR job assignment.
> class: org.apache.hadoop.hbase.mapred.TableSplit
> function: getLocations()
> We should now be able to query the meta table for the current host name of the server hosting
> the region in question.
> This will help with scaling, as there will be less cross-server communication, removing
> bandwidth as a bottleneck.
> A side effect of this fix is that it will help keep region servers from being overloaded by many
> MR clients all pulling from the same region server while there's local work for them to do.
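
As a rough illustration of the change described above, here is a minimal Java sketch of a split that reports its hosting server from getLocations(). It is not the attached patch and does not use the real HBase or Hadoop classes: the class name LocalitySplitSketch, the fromMeta helper, and the Map standing in for a meta-table lookup are all assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a table split; NOT the real HBase TableSplit class.
final class LocalitySplitSketch {
    private final String regionName;
    private final String regionHost; // null when no host is known for the region

    LocalitySplitSketch(String regionName, String regionHost) {
        this.regionName = regionName;
        this.regionHost = regionHost;
    }

    String getRegionName() {
        return regionName;
    }

    // Report the server hosting the region so a scheduler can prefer a
    // task slot on that host. An empty array means "no preference".
    String[] getLocations() {
        return regionHost == null ? new String[0] : new String[] { regionHost };
    }

    // Assumption: the map stands in for a query against the meta table,
    // keyed by region name with the hosting server's host name as value.
    static LocalitySplitSketch fromMeta(Map<String, String> meta, String regionName) {
        return new LocalitySplitSketch(regionName, meta.get(regionName));
    }

    public static void main(String[] args) {
        Map<String, String> meta = new HashMap<>();
        meta.put("region-1", "rs1.example.com");
        for (String host : fromMeta(meta, "region-1").getLocations()) {
            System.out.println(host);
        }
    }
}
```

With a host name available, a scheduler can try to place the map task on the same machine as the region; when the lookup finds nothing, the empty array leaves placement up to the framework.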

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

