cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Vijay (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-3838) Repair Streaming hangs between multiple regions
Date Sat, 04 Feb 2012 21:51:53 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-3838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200566#comment-13200566
] 

Vijay commented on CASSANDRA-3838:
----------------------------------

Hi Sylvain,
My observation on this is that... when there is network congestion the Routers will start
to drop the packets and which will cause the write on the socket to hang.... Until we write
again to the socket we will not know if the socket is closed or not... hence it will be better
to have it in both the sides... 

I will add streaming_socket_timeout and add documentation in the next patch... if you are
ok with the above Thanks!
                
> Repair Streaming hangs between multiple regions
> -----------------------------------------------
>
>                 Key: CASSANDRA-3838
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3838
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.7
>            Reporter: Vijay
>            Assignee: Vijay
>            Priority: Minor
>             Fix For: 1.0.8
>
>         Attachments: 0001-Add-streaming-socket-timeouts.patch
>
>
> Streaming hangs between datacenters, though there might be multiple reasons for this,
a simple fix will be to add the Socket timeout so the session can retry.
> The following is the netstat of the affected node (the below output remains this way
for a very long period).
> [test_abrepairtest@test_abrepair--euwest1c-i-1adfb753 ~]$ nt netstats
> Mode: NORMAL
> Streaming to: /50.17.92.159
>    /mnt/data/cassandra070/data/abtests/cust_allocs-hc-2221-Data.db sections=7002 progress=1523325354/2475291786
- 61%
>    /mnt/data/cassandra070/data/abtests/cust_allocs-hc-2233-Data.db sections=4581 progress=0/595026085
- 0%
>    /mnt/data/cassandra070/data/abtests/cust_allocs-g-2235-Data.db sections=6631 progress=0/2270344837
- 0%
>    /mnt/data/cassandra070/data/abtests/cust_allocs-hc-2239-Data.db sections=6266 progress=0/2190197091
- 0%
>    /mnt/data/cassandra070/data/abtests/cust_allocs-hc-2230-Data.db sections=7662 progress=0/3082087770
- 0%
>    /mnt/data/cassandra070/data/abtests/cust_allocs-hc-2240-Data.db sections=7874 progress=0/587439833
- 0%
>    /mnt/data/cassandra070/data/abtests/cust_allocs-g-2226-Data.db sections=7682 progress=0/2933920085
- 0%
> "Streaming:1" daemon prio=10 tid=0x00002aaac2060800 nid=0x1676 runnable [0x000000006be85000]
>    java.lang.Thread.State: RUNNABLE
>         at java.net.SocketOutputStream.socketWrite0(Native Method)
>         at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
>         at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>         at com.sun.net.ssl.internal.ssl.OutputRecord.writeBuffer(OutputRecord.java:297)
>         at com.sun.net.ssl.internal.ssl.OutputRecord.write(OutputRecord.java:286)
>         at com.sun.net.ssl.internal.ssl.SSLSocketImpl.writeRecordInternal(SSLSocketImpl.java:743)
>         at com.sun.net.ssl.internal.ssl.SSLSocketImpl.writeRecord(SSLSocketImpl.java:731)
>         at com.sun.net.ssl.internal.ssl.AppOutputStream.write(AppOutputStream.java:59)
>         - locked <0x00000006afea1bd8> (a com.sun.net.ssl.internal.ssl.AppOutputStream)
>         at com.ning.compress.lzf.ChunkEncoder.encodeAndWriteChunk(ChunkEncoder.java:133)
>         at com.ning.compress.lzf.LZFOutputStream.writeCompressedBlock(LZFOutputStream.java:203)
>         at com.ning.compress.lzf.LZFOutputStream.flush(LZFOutputStream.java:117)
>         at org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:152)
>         at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:91)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> Streaming from: /46.51.141.51
>    abtests: /mnt/data/cassandra070/data/abtests/cust_allocs-hc-2241-Data.db sections=7231
progress=0/1548922508 - 0%
>    abtests: /mnt/data/cassandra070/data/abtests/cust_allocs-hc-2231-Data.db sections=4730
progress=0/296474156 - 0%
>    abtests: /mnt/data/cassandra070/data/abtests/cust_allocs-hc-2244-Data.db sections=7650
progress=0/1580417610 - 0%
>    abtests: /mnt/data/cassandra070/data/abtests/cust_allocs-hc-2217-Data.db sections=7682
progress=0/196689250 - 0%
>    abtests: /mnt/data/cassandra070/data/abtests/cust_allocs-hc-2220-Data.db sections=7149
progress=0/478695185 - 0%
>    abtests: /mnt/data/cassandra070/data/abtests/cust_allocs-hc-2171-Data.db sections=443
progress=0/78417320 - 0%
>    abtests: /mnt/data/cassandra070/data/abtests/cust_allocs-g-2235-Data.db sections=6631
progress=0/2270344837 - 0%
>    abtests: /mnt/data/cassandra070/data/abtests/cust_allocs-hc-2222-Data.db sections=4590
progress=0/1310718798 - 0%
>    abtests: /mnt/data/cassandra070/data/abtests/cust_allocs-hc-2233-Data.db sections=4581
progress=0/595026085 - 0%
>    abtests: /mnt/data/cassandra070/data/abtests/cust_allocs-g-2226-Data.db sections=7682
progress=0/2933920085 - 0%
>    abtests: /mnt/data/cassandra070/data/abtests/cust_allocs-hc-2213-Data.db sections=7876
progress=0/3308781588 - 0%
>    abtests: /mnt/data/cassandra070/data/abtests/cust_allocs-hc-2216-Data.db sections=7386
progress=0/2868167170 - 0%
>    abtests: /mnt/data/cassandra070/data/abtests/cust_allocs-hc-2240-Data.db sections=7874
progress=0/587439833 - 0%
>    abtests: /mnt/data/cassandra070/data/abtests/cust_allocs-hc-2254-Data.db sections=4618
progress=0/215989758 - 0%
>    abtests: /mnt/data/cassandra070/data/abtests/cust_allocs-hc-2221-Data.db sections=7002
progress=1542191546/2475291786 - 62%
>    abtests: /mnt/data/cassandra070/data/abtests/cust_allocs-hc-2239-Data.db sections=6266
progress=0/2190197091 - 0%
>    abtests: /mnt/data/cassandra070/data/abtests/cust_allocs-hc-2210-Data.db sections=6698
progress=0/2304563183 - 0%
>    abtests: /mnt/data/cassandra070/data/abtests/cust_allocs-hc-2230-Data.db sections=7662
progress=0/3082087770 - 0%
>    abtests: /mnt/data/cassandra070/data/abtests/cust_allocs-hc-2229-Data.db sections=7386
progress=0/1324787539 - 0%
> "Thread-198896" prio=10 tid=0x00002aaac0e00800 nid=0x4710 runnable [0x000000004251b000]
>    java.lang.Thread.State: RUNNABLE
>         at java.net.SocketInputStream.socketRead0(Native Method)
>         at java.net.SocketInputStream.read(SocketInputStream.java:129)
>         at com.sun.net.ssl.internal.ssl.InputRecord.readFully(InputRecord.java:293)
>         at com.sun.net.ssl.internal.ssl.InputRecord.readV3Record(InputRecord.java:405)
>         at com.sun.net.ssl.internal.ssl.InputRecord.read(InputRecord.java:360)
>         at com.sun.net.ssl.internal.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:798)
>         - locked <0x00000005e220a170> (a java.lang.Object)
>         at com.sun.net.ssl.internal.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:755)
>         at com.sun.net.ssl.internal.ssl.AppInputStream.read(AppInputStream.java:75)
>         - locked <0x00000005e220a1b8> (a com.sun.net.ssl.internal.ssl.AppInputStream)
>         at com.ning.compress.lzf.LZFDecoder.readFully(LZFDecoder.java:392)
>         at com.ning.compress.lzf.LZFDecoder.decompressChunk(LZFDecoder.java:190)
>         at com.ning.compress.lzf.LZFInputStream.readyBuffer(LZFInputStream.java:254)
>         at com.ning.compress.lzf.LZFInputStream.read(LZFInputStream.java:129)
>         at java.io.DataInputStream.readFully(DataInputStream.java:178)
>         at java.io.DataInputStream.readLong(DataInputStream.java:399)
>         at org.apache.cassandra.utils.BytesReadTracker.readLong(BytesReadTracker.java:115)
>         at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:119)
>         at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:37)
>         at org.apache.cassandra.io.sstable.SSTableWriter.appendFromStream(SSTableWriter.java:244)
>         at org.apache.cassandra.streaming.IncomingStreamReader.streamIn(IncomingStreamReader.java:148)
>         at org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:90)
>         at org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:185)
>         at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:81)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message