incubator-cassandra-user mailing list archives

From Yan Chunlu <springri...@gmail.com>
Subject Re: how to solve one node is in heavy load in unbalanced cluster
Date Thu, 04 Aug 2011 15:39:39 GMT
Hi, any help? Thanks!

On Thu, Aug 4, 2011 at 5:02 AM, Yan Chunlu <springrider@gmail.com> wrote:

> Forgot to mention: I am using Cassandra 0.7.4.
>
>
> On Thu, Aug 4, 2011 at 5:00 PM, Yan Chunlu <springrider@gmail.com> wrote:
>
>> Also, nothing is happening with the streaming:
>>
>> nodetool -h node3 netstats
>> Mode: Normal
>> Not sending any streams.
>>  Nothing streaming from /10.28.53.11
>> Pool Name                    Active   Pending      Completed
>> Commands                        n/a         0      165086750
>> Responses                       n/a         0       99372520
>>
>>
>>
>> On Thu, Aug 4, 2011 at 4:56 PM, Yan Chunlu <springrider@gmail.com> wrote:
>>
>>> Sorry, the ring info should be this:
>>>
>>> nodetool -h node3 ring
>>> Address  Status State   Load      Owns    Token
>>>                                           84944475733633104818662955375549269696
>>> node1    Up     Normal  13.18 GB  81.09%  52773518586096316348543097376923124102
>>> node2    Up     Normal  22.85 GB  10.48%  70597222385644499881390884416714081360
>>> node3    Up     Leaving 25.44 GB   8.43%  84944475733633104818662955375549269696
>>>
>>>
>>>
>>> On Thu, Aug 4, 2011 at 4:55 PM, Yan Chunlu <springrider@gmail.com> wrote:
>>>
>>>> I tried nodetool move but got the following error:
>>>>
>>>> node3:~# nodetool -h node3 move 0
>>>> Exception in thread "main" java.lang.IllegalStateException: replication factor (3) exceeds number of endpoints (2)
>>>>     at org.apache.cassandra.locator.SimpleStrategy.calculateNaturalEndpoints(SimpleStrategy.java:60)
>>>>     at org.apache.cassandra.service.StorageService.calculatePendingRanges(StorageService.java:930)
>>>>     at org.apache.cassandra.service.StorageService.calculatePendingRanges(StorageService.java:896)
>>>>     at org.apache.cassandra.service.StorageService.startLeaving(StorageService.java:1596)
>>>>     at org.apache.cassandra.service.StorageService.move(StorageService.java:1734)
>>>>     at org.apache.cassandra.service.StorageService.move(StorageService.java:1709)
>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>>     at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
>>>>     at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
>>>>     at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
>>>>     at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120)
>>>>     at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262)
>>>>     at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836)
>>>>     at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761)
>>>>     at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1427)
>>>>     at javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
>>>>     at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
>>>>     at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
>>>>     at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788)
>>>>     at sun.reflect.GeneratedMethodAccessor108.invoke(Unknown Source)
>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>>     at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
>>>>     at sun.rmi.transport.Transport$1.run(Transport.java:159)
>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>     at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
>>>>     at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
>>>>     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
>>>>     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>>     at java.lang.Thread.run(Thread.java:662)
>>>>
>>>>
>>>>
>>>>
>>>> Then nodetool shows the node as leaving:
>>>>
>>>>
>>>> nodetool -h node3 ring
>>>> Address  Status State   Load      Owns    Token
>>>>                                           84944475733633104818662955375549269696
>>>> node1    Up     Normal  13.18 GB  81.09%  52773518586096316348543097376923124102
>>>> node2    Up     Normal  22.85 GB  10.48%  70597222385644499881390884416714081360
>>>> node3    Up     Leaving 25.44 GB   8.43%  84944475733633104818662955375549269696
>>>>
>>>> The log didn't show any error message nor anything abnormal. Is
>>>> there something wrong?
>>>>
>>>>
>>>> I used to have RF=2, and changed it to RF=3 using cassandra-cli.
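The RF change is what makes the move fail: SimpleStrategy must place RF replicas on RF distinct live endpoints, and a node that is leaving (which `move` marks node3 as) is excluded from the candidates, so RF=3 cannot be satisfied by the 2 remaining nodes. A minimal illustrative sketch of that invariant (this is not Cassandra's actual code, just the shape of the check raising the exception above):

```python
# Minimal illustrative sketch (NOT Cassandra's actual code) of the invariant
# behind "replication factor (3) exceeds number of endpoints (2)":
# SimpleStrategy must place RF replicas on RF distinct nodes, and a node that
# is leaving/moving is excluded from the candidate endpoints.

def calculate_natural_endpoints(candidate_tokens, rf):
    """candidate_tokens: tokens of nodes still eligible to hold replicas."""
    if rf > len(candidate_tokens):
        raise RuntimeError(
            "replication factor (%d) exceeds number of endpoints (%d)"
            % (rf, len(candidate_tokens))
        )
    # Placeholder for walking the ring clockwise from the key's token:
    return sorted(candidate_tokens)[:rf]

# After `nodetool move` marks node3 as leaving, only node1 and node2 remain
# as candidates, so RF=3 cannot be satisfied (tokens from the ring above):
node1 = 52773518586096316348543097376923124102
node2 = 70597222385644499881390884416714081360
try:
    calculate_natural_endpoints([node1, node2], rf=3)
except RuntimeError as e:
    print(e)  # replication factor (3) exceeds number of endpoints (2)
```

By this logic, with RF=3 on a three-node ring any move or decommission would trip the same check; reverting the keyspace to RF=2 or adding a fourth node first would presumably be needed.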
>>>>
>>>>
>>>> On Mon, Aug 1, 2011 at 10:22 AM, Yan Chunlu <springrider@gmail.com> wrote:
>>>>
>>>>> thanks a lot! I will try the "move".
>>>>>
>>>>>
>>>>> On Mon, Aug 1, 2011 at 7:07 AM, mcasandra <mohitanchlia@gmail.com> wrote:
>>>>>
>>>>>>
>>>>>> springrider wrote:
>>>>>> >
>>>>>> > is that okay to do nodetool move before a complete repair?
>>>>>> >
>>>>>> > using this equation?
>>>>>> > def tokens(nodes):
>>>>>> >     for x in xrange(nodes):
>>>>>> >         print 2 ** 127 / nodes * x
>>>>>> >
>>>>>>
>>>>>> Yes, use that logic to get the tokens. I think it's safe to run move
>>>>>> first and repair later. You are moving some node's data as-is, so it's
>>>>>> no worse than what you have right now.
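The tokens() snippet quoted above is Python 2 (xrange, print statement); an equivalent Python 3 sketch, with the evenly spaced tokens it yields for this three-node cluster:

```python
# Python 3 version of the tokens() helper quoted above: evenly spaced
# initial tokens on RandomPartitioner's 0..2**127 ring.
def tokens(nodes):
    return [2 ** 127 // nodes * x for x in range(nodes)]

for t in tokens(3):
    print(t)
# 0
# 56713727820156410577229101238628035242
# 113427455640312821154458202477256070484
```

Moving node1, node2, and node3 to these three tokens would give each node roughly 33.3% ownership instead of the 81/10/8 split shown in the ring output.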
>>>>>>
>>>>>> --
>>>>>> View this message in context:
>>>>>> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/how-to-solve-one-node-is-in-heavy-load-in-unbalanced-cluster-tp6630827p6639317.html
>>>>>> Sent from the cassandra-user@incubator.apache.org mailing list
>>>>>> archive at Nabble.com.
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
