cassandra-user mailing list archives

From Jon Graham <sjclou...@gmail.com>
Subject Re: Adjusting Token Spaces and Rebalancing Data
Date Mon, 01 Mar 2010 23:39:26 GMT
Hello,

I did find these exceptions. I issued the loadbalance command on node
192.168.2.10.

 INFO [MESSAGING-SERVICE-POOL:3] 2010-03-01 10:34:40,764 TcpConnection.java (line 315) Closing errored connection java.nio.channels.SocketChannel[connected local=/192.168.2.10:55973 remote=/192.168.2.13:7000]
 WARN [MESSAGE-DESERIALIZER-POOL:1] 2010-03-01 10:34:40,964 MessagingService.java (line 555) Running on default stage - beware
 WARN [MESSAGING-SERVICE-POOL:1] 2010-03-01 10:34:40,964 TcpConnection.java (line 484) Problem reading from socket connected to : java.nio.channels.SocketChannel[connected local=/192.168.2.10:40758 remote=/192.168.2.13:7000]
 WARN [MESSAGING-SERVICE-POOL:1] 2010-03-01 10:34:40,964 TcpConnection.java (line 485) Exception was generated at : 03/01/2010 10:34:40 on thread MESSAGING-SERVICE-POOL:1
Reached an EOL or something bizzare occured. Reading from: /192.168.2.13BufferSizeRemaining: 16
java.io.IOException: Reached an EOL or something bizzare occured. Reading from: /192.168.2.13 BufferSizeRemaining: 16
    at org.apache.cassandra.net.io.StartState.doRead(StartState.java:44)
    at org.apache.cassandra.net.io.ProtocolState.read(ProtocolState.java:39)
    at org.apache.cassandra.net.io.TcpReader.read(TcpReader.java:95)
    at org.apache.cassandra.net.TcpConnection$ReadWorkItem.run(TcpConnection.java:445)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
 INFO [MESSAGING-SERVICE-POOL:1] 2010-03-01 10:34:40,964 TcpConnection.java (line 315) Closing errored connection java.nio.channels.SocketChannel[connected local=/192.168.2.10:40758 remote=/192.168.2.13:7000]
 INFO [MESSAGE-STREAMING-POOL:1] 2010-03-01 10:35:23,171 TcpConnection.java (line 315) Closing errored connection java.nio.channels.SocketChannel[connected local=/192.168.2.10:56728 remote=/192.168.2.13:7000]
 INFO [MESSAGE-STREAMING-POOL:1] 2010-03-01 10:35:23,221 FileStreamTask.java (line 79) Exception was generated at : 03/01/2010 10:35:23 on thread MESSAGE-STREAMING-POOL:1
Value too large for defined data type
java.io.IOException: Value too large for defined data type
    at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
    at sun.nio.ch.FileChannelImpl.transferToDirectly(Unknown Source)
    at sun.nio.ch.FileChannelImpl.transferTo(Unknown Source)
    at org.apache.cassandra.net.TcpConnection.stream(TcpConnection.java:226)
    at org.apache.cassandra.net.FileStreamTask.run(FileStreamTask.java:55)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

I can certainly upgrade to 0.6 and try a loadbalance there. Do you
still think that is advisable?

All of my key/value entries are well under 1024 bytes, but I have millions of
them.

Do you think I have a data corruption problem?

Thanks,
Jon
On Mon, Mar 1, 2010 at 2:54 PM, Jonathan Ellis <jbellis@gmail.com> wrote:

> On Mon, Mar 1, 2010 at 3:18 PM, Jon Graham <sjcloud22@gmail.com> wrote:
> > Thanks Jonathan.
> >
> > It seems like the load balance operation has stalled. I haven't seen any
> > data file timestamp changes in 2 hours, and no location file timestamp
> > changes in over an hour.
> >
> > I can see a tcp port # 7000 opened on the node where I ran the loadbalance
> > command. It is connected to port 39033 on the node receiving the data.
> > The CPU usage on both systems is very low. There are about 10 million
> > records on the node where the load balance command was issued.
>
> Did you check logs for exceptions?
>
> > My six-node Cassandra ring uses these tokens for nodes 1-6:
> > 0 (ASCII 0x30), 6, B, H, O (the letter O), T
> >
> > The load balance target node initially had a token of 'H' (using ordered
> > partitioning). The source node has a token of 0 (ASCII 0x30). Most of the
> > data on the source node has keys starting with '/'. Slash (ASCII 0x2F)
> > falls between tokens T and 0 in my ring, so most of the data landed on the
> > node with token 0, with replicas on the next 2 nodes. My token space is
> > badly divided for the data I have already inserted.
> >
> > Does the initial token value of the load balance target node selected by
> > Cassandra need to be cleared or set to a specific value beforehand to
> > accommodate the load balance data transfer?
>
> No.
>
> > Would I have better luck decommissioning nodes 4, 5, 6 and then
> > bootstrapping these nodes one at a time with better initial token values?
>
> LoadBalance is basically sugar for decommission + bootstrap, so no.
>
> > I am looking for a good way to move/split/re-balance data from nodes
> > 1, 2, 3 to nodes 4, 5, 6 while achieving a better token space distribution.
>
> I would upgrade to the 0.6 beta and try loadbalance again.
>
> -Jonathan
>
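
The token layout described in the quoted message can be illustrated with a minimal Python sketch. It is not Cassandra code: the function names `replicas` and `suggest_tokens` are hypothetical, and it assumes that a node owns the keys in the range (previous token, own token], that keys above the highest token wrap around to the lowest one, that the replication factor is 3 with replicas on the next two nodes in ring order (as the message describes), and that plain string comparison approximates the ordering of the order-preserving partitioner for ASCII keys. The second function shows one possible way to choose better-spread tokens, by taking evenly spaced quantiles from a sample of existing keys.

```python
from bisect import bisect_left

# Tokens for nodes 1-6 as listed in the quoted message.
TOKENS = ["0", "6", "B", "H", "O", "T"]


def replicas(key, tokens=TOKENS, rf=3):
    """Return the tokens of the nodes assumed to hold `key`.

    Assumption: a node owns the keys in (previous token, its own token],
    keys above the highest token wrap to the lowest token, and the other
    replicas sit on the next rf - 1 nodes around the ring.
    """
    ring = sorted(tokens)
    # First token that is >= key; wrap to index 0 if there is none.
    start = bisect_left(ring, key) % len(ring)
    return [ring[(start + n) % len(ring)] for n in range(rf)]


def suggest_tokens(sample_keys, nodes):
    """Pick one token per node at evenly spaced quantiles of a key sample."""
    keys = sorted(set(sample_keys))
    if len(keys) < nodes:
        raise ValueError("need at least one distinct sample key per node")
    return [keys[(i + 1) * len(keys) // nodes - 1] for i in range(nodes)]


if __name__ == "__main__":
    # Keys starting with '/' (0x2F) sort just below token "0" (0x30).
    print(replicas("/logs/2010/03/01"))
    # A key starting with a letter between 'B' and 'H' lands elsewhere.
    print(replicas("Customer42"))
    # Made-up sample keys, for illustration only.
    sample = ["/a/%03d" % n for n in range(600)]
    print(suggest_tokens(sample, 6))
```

Under these assumptions, every key that starts with '/' maps to the node with token 0 and its two successors, which matches the imbalance described in the message. Tokens drawn from the actual key distribution would spread the same keys across all six nodes.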
