cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jonathan Ellis (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CASSANDRA-1896) Improve throughput by adding buffering to the inter-node TCP communication
Date Mon, 27 Dec 2010 22:12:46 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-1896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12975355#action_12975355
] 

Jonathan Ellis commented on CASSANDRA-1896:
-------------------------------------------

at first glance the ByteBuffer code on write looks buggy, it should be using remaining instead
of limit and may not be including offsets correctly.

after dealing w/ BB issues we can commit this to 0.6, but for 0.7 I'd prefer to finish CASSANDRA-1788
before muddying the water further.  We basically write three ints and a byte array; i wouldn't
be surprised if buffering is a win on the ints, but a lose on the byte[].  I.e., this demonstrates
that a copy + syscall is faster than four syscalls, but two syscalls [and still no copy] may
be faster yet.

> Improve throughput by adding buffering to the inter-node TCP communication
> --------------------------------------------------------------------------
>
>                 Key: CASSANDRA-1896
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1896
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.6
>            Reporter: Tsiki
>            Assignee: Brandon Williams
>             Fix For: 0.6.9, 0.7.1
>
>         Attachments: 1896.txt
>
>
> The inbound and outbound TCP implementation under org.apache.cassandra.net  does not
buffer the socket streams. A simple change in IncomingTcpConnection and OutboundTcpConnection
may give a rather big throughput increase. In my tests, I got up t o 30% more out of my cluster.
Below is the diff of these two files with buffering included. The diff is over release 0.6.5
but can be quite simply applied also to 0.7. I suggest perhaps to limit the buffered input
stream I added in IncomingTcpConnection to 4K. The Outbound implementation can surely be implemented
a bit better (remove some of the code I duplicated there).
> diff -r apache-cassandra-0.6.5-src/src/java/org/apache/cassandra/net/IncomingTcpConnection.java
fix/apache-cassandra-0.6.5-src/src/java/org/apache/cassandra/net/IncomingTcpConnection.java
> 44c44
> <             input = new DataInputStream(new BufferedInputStream(socket.getInputStream()));
> ---
> >             input = new DataInputStream(socket.getInputStream());
> diff -r apache-cassandra-0.6.5-src/src/java/org/apache/cassandra/net/OutboundTcpConnection.java
fix/apache-cassandra-0.6.5-src/src/java/org/apache/cassandra/net/OutboundTcpConnection.java
> 77d76
> <     	byte[] buf = new byte[4096];
> 80,112c79,89
> <         	int l = 0;
> <         	ByteBuffer bb;
> <         	while ((bb = queue.peek()) != null && l+bb.limit() < buf.length)
{ 
> <         		bb = take();
> <         		System.arraycopy(bb.array(), 0, buf, l, bb.limit());
> <         		l += bb.limit();
> <         	}
> <         	if (l == 0) {
> < 	        	bb = take();
> < 	            if (bb == CLOSE_SENTINEL)
> < 	            {
> < 	                disconnect();
> < 	                continue;
> < 	            }
> < 	            if (socket != null || connect())
> < 	                writeConnected(bb);
> < 	            else
> < 	                // clear out the queue, else gossip messages back up.
> < 	                queue.clear();
> <         	} else {
> <         		if (socket != null || connect()) {
> < 	        		try {
> < 		        		output.write(buf, 0, l);
> < 		                if (queue.peek() == null)
> < 		                    output.flush();
> < 	        		} catch (IOException e) {
> < 	                    logger.info("error writing to " + endpoint);
> < 	                    disconnect();
> < 	        		}
> <         		} else {
> <         			queue.clear();
> <         		}
> <         	}
> ---
> >             ByteBuffer bb = take();
> >             if (bb == CLOSE_SENTINEL)
> >             {
> >                 disconnect();
> >                 continue;
> >             }
> >             if (socket != null || connect())
> >                 writeConnected(bb);
> >             else
> >                 // clear out the queue, else gossip messages back up.
> >                 queue.clear();            

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message