helix-user mailing list archives

From Vinoth Chandar <vin...@uber.com>
Subject Re: Balancing out skews in FULL_AUTO mode with built-in rebalancer
Date Wed, 16 Mar 2016 00:52:03 GMT
Hi Kishore,
I think the changes I made are exercised when computing the preferred assignment, but they
do not seem to take effect later, when reconciliation happens with the existing assignment,
orphaned partitions, etc.
The effective assignment I saw had all partitions (2 per resource) assigned to the first
2 servers. I have started digging into the parts of the code mentioned above and will report
back tomorrow when I pick this back up.
Thanks,
Vinoth

    _____________________________
From: kishore g <g.kishore@gmail.com>
Sent: Tuesday, March 15, 2016 2:01 PM
Subject: Re: Balancing out skews in FULL_AUTO mode with built-in rebalancer
To:  <user@helix.apache.org>


    1) I am guessing it gets overridden by other logic in computePartitionAssignment(..),
    the end assignment is still skewed.

What is the logic you are referring to?
Can you print the assignment count for your use case?

thanks,
Kishore G

On Tue, Mar 15, 2016 at 1:45 PM, Vinoth Chandar <vinoth@uber.com> wrote:
   
Hi guys,

We are hitting a fairly well-known issue: we have 100s of resources, each with < 8
partitions, spread across 10 servers, and the built-in assignment always assigns partitions
from the first server to the last, resulting in heavy skew on a few nodes.

Chatted with Kishore offline and made a patch as here. Tested with 5 resources with 2
partitions each across 8 servers; logging the nodeShift and the ultimate index picked
indicates that we choose servers other than the first two, which is good.

But
1) I am guessing it gets overridden by other logic in computePartitionAssignment(..), the end
assignment is still skewed.
2) Even with murmur hash, there is some skew in the nodeShift, which needs to be ironed out.

I will keep chipping at this. Any feedback appreciated.

Thanks
Vinoth
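
The skew described in this thread, and the effect of a per-resource node shift, can be
illustrated with a small Python sketch. It is not Helix code: the function names
(assign_first_to_last, node_shift, assign_with_shift) are hypothetical, MD5 from the
standard library stands in for murmur hash, and the sketch ignores the later reconciliation
step in computePartitionAssignment(..) that is suspected above of overriding the preferred
assignment.

```python
import hashlib
from collections import Counter


def assign_first_to_last(resources, servers):
    """Count partitions per server when every resource starts at server index 0."""
    counts = Counter({s: 0 for s in servers})
    for _name, num_partitions in resources.items():
        for p in range(num_partitions):
            counts[servers[p % len(servers)]] += 1
    return counts


def node_shift(resource_name, num_servers):
    """Hypothetical start offset: a stable hash of the resource name, modulo server count.

    MD5 is used only as a deterministic stand-in for murmur hash.
    """
    digest = hashlib.md5(resource_name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_servers


def assign_with_shift(resources, servers):
    """Count partitions per server when each resource starts at its own shifted index."""
    counts = Counter({s: 0 for s in servers})
    for name, num_partitions in resources.items():
        shift = node_shift(name, len(servers))
        for p in range(num_partitions):
            counts[servers[(shift + p) % len(servers)]] += 1
    return counts


if __name__ == "__main__":
    # Same shape as the test in the thread: 5 resources, 2 partitions each, 8 servers.
    resources = {f"resource_{i}": 2 for i in range(5)}
    servers = [f"server_{i}" for i in range(8)]

    naive = assign_first_to_last(resources, servers)
    # first-to-last: [5, 5, 0, 0, 0, 0, 0, 0]
    print("first-to-last:", [naive[s] for s in servers])

    shifted = assign_with_shift(resources, servers)
    print("with shift:   ", [shifted[s] for s in servers])
```

With the thread's test setup, the first-to-last placement puts 5 partitions on each of the
first two servers and none on the rest. The shifted variant gives each resource its own
start index, though with only a handful of resources some residual skew from colliding
shifts can be expected.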
                
    


  