cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Thomas Steinmaurer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-14691) Cassandra 2.1 backport - The JVM should exit if jmx fails to bind
Date Wed, 05 Sep 2018 13:21:00 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-14691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16604394#comment-16604394
] 

Thomas Steinmaurer commented on CASSANDRA-14691:
------------------------------------------------

Could be a source for corruption.
{noformat}
ERROR [main] 2018-08-05 21:10:24,046 CassandraDaemon.java:120 - Error starting local jmx server:

java.rmi.server.ExportException: Port already in use: 7199; nested exception is: 
                java.net.BindException: Address already in use (Bind failed)
…
 
But then continuing with stuff like opening system and even user tables:
 
INFO  [main] 2018-08-05 21:10:24,060 CacheService.java:110 - Initializing key cache with capacity
of 100 MBs.
INFO  [main] 2018-08-05 21:10:24,067 CacheService.java:132 - Initializing row cache with capacity
of 0 MBs
INFO  [main] 2018-08-05 21:10:24,073 CacheService.java:149 - Initializing counter cache with
capacity of 50 MBs
INFO  [main] 2018-08-05 21:10:24,074 CacheService.java:160 - Scheduling counter cache save
to every 7200 seconds (going to save all keys).
INFO  [main] 2018-08-05 21:10:24,161 ColumnFamilyStore.java:365 - Initializing system.sstable_activity
INFO  [SSTableBatchOpen:2] 2018-08-05 21:10:24,692 SSTableReader.java:475 - Opening /var/opt/xxx-managed/cassandra/system/sstable_activity-5a1ff267ace03f128563cfae6103c65e/system-sstable_activity-ka-165
(2023 bytes)
INFO  [SSTableBatchOpen:3] 2018-08-05 21:10:24,692 SSTableReader.java:475 - Opening /var/opt/xxx-managed/cassandra/system/sstable_activity-5a1ff267ace03f128563cfae6103c65e/system-sstable_activity-ka-167
(2336 bytes)
INFO  [SSTableBatchOpen:1] 2018-08-05 21:10:24,692 SSTableReader.java:475 - Opening /var/opt/xxx-managed/cassandra/system/sstable_activity-5a1ff267ace03f128563cfae6103c65e/system-sstable_activity-ka-166
(2686 bytes)
INFO  [main] 2018-08-05 21:10:24,755 ColumnFamilyStore.java:365 - Initializing system.hints
INFO  [SSTableBatchOpen:1] 2018-08-05 21:10:24,758 SSTableReader.java:475 - Opening /var/opt/xxx-managed/cassandra/system/hints-2666e20573ef38b390fefecf96e8f0c7/system-hints-ka-377
(46210621 bytes)
INFO  [main] 2018-08-05 21:10:24,766 ColumnFamilyStore.java:365 - Initializing system.compaction_history
INFO  [SSTableBatchOpen:1] 2018-08-05 21:10:24,768 SSTableReader.java:475 - Opening /var/opt/xxx-managed/cassandra/system/compaction_history-b4dbb7b4dc493fb5b3bfce6e434832ca/system-compaction_history-ka-129
(91269 bytes)
…
 
Replaying commit logs:
 
…
INFO  [main] 2018-08-05 21:10:25,896 CommitLogReplayer.java:267 - Replaying /var/opt/xxx-managed/cassandra/commitlog/CommitLog-4-1533133668366.log
INFO  [main] 2018-08-05 21:10:25,896 CommitLogReplayer.java:270 - Replaying /var/opt/xxx-managed/cassandra/commitlog/CommitLog-4-1533133668366.log
(CL version 4, messaging version 8)
…
 
Even writing memtables already (below just pasted system tables, but also user tables):
 
…
INFO  [MemtableFlushWriter:4] 2018-08-05 21:11:52,524 Memtable.java:347 - Writing Memtable-size_estimates@1941663179(2.655MiB
serialized bytes, 325710 ops, 2%/0% of on/off-heap limit)
INFO  [MemtableFlushWriter:3] 2018-08-05 21:11:52,552 Memtable.java:347 - Writing Memtable-peer_events@1474667699(0.199KiB
serialized bytes, 4 ops, 0%/0% of on/off-heap limit)
…
 
Until it comes to a point where it can’t bind ports like the storage port 7000:
 
ERROR [main] 2018-08-05 21:11:54,350 CassandraDaemon.java:395 - Fatal configuration error
org.apache.cassandra.exceptions.ConfigurationException: /XXX:7000 is in use by another process.
 Change listen_address:storage_port in cassandra.yaml to values that do not conflict with
other services
                at org.apache.cassandra.net.MessagingService.getServerSockets(MessagingService.java:495)
~[apache-cassandra-2.1.18.jar:2.1.18]
…
 
Until Cassandra stops:
 
…
INFO  [StorageServiceShutdownHook] 2018-08-05 21:11:54,361 Gossiper.java:1454 - Announcing
shutdown
…
 
 
So, we have around 2 minutes where Cassandra is mangling with existing data, although it shouldn’t.
 
Sounds like a potential candidate for data corruption, right? E.g. later on we then see things
like (still while being in progress to shutdown?):
 
WARN  [SharedPool-Worker-1] 2018-08-05 21:11:58,181 AbstractTracingAwareExecutorService.java:169
- Uncaught exception on thread Thread[SharedPool-Worker-1,5,main]: {}
java.lang.RuntimeException: java.io.FileNotFoundException: /var/opt/xxx-managed/cassandra/xxx/xxx-fdc68b70950611e8ad7179f2d5bfa3cf/xxx-xxx-ka-15-Data.db
(No such file or directory)
                at org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:52)
~[apache-cassandra-2.1.18.jar:2.1.18]
                at org.apache.cassandra.io.util.CompressedPoolingSegmentedFile.createPooledReader(CompressedPoolingSegmentedFile.java:95)
~[apache-cassandra-2.1.18.jar:2.1.18]
                at org.apache.cassandra.io.util.PoolingSegmentedFile.getSegment(PoolingSegmentedFile.java:62)
~[apache-cassandra-2.1.18.jar:2.1.18]
…

{noformat}

> Cassandra 2.1 backport - The JVM should exit if jmx fails to bind
> -----------------------------------------------------------------
>
>                 Key: CASSANDRA-14691
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14691
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Thomas Steinmaurer
>            Assignee: Alex Petrov
>            Priority: Major
>              Labels: lhf
>             Fix For: 2.1.x
>
>
> If you are already running a cassandra instance, but for some reason try to start another
one, this happens:
> {noformat}
> INFO  20:57:09 JNA mlockall successful
> WARN  20:57:09 JMX is not enabled to receive remote connections. Please see cassandra-env.sh
for more info.
> ERROR 20:57:10 Error starting local jmx server:
> java.rmi.server.ExportException: Port already in use: 7199; nested exception is:
>         java.net.BindException: Address already in use
>         at sun.rmi.transport.tcp.TCPTransport.listen(TCPTransport.java:340) ~[na:1.7.0_76]
>         at sun.rmi.transport.tcp.TCPTransport.exportObject(TCPTransport.java:248) ~[na:1.7.0_76]
>         at sun.rmi.transport.tcp.TCPEndpoint.exportObject(TCPEndpoint.java:411) ~[na:1.7.0_76]
>         at sun.rmi.transport.LiveRef.exportObject(LiveRef.java:147) ~[na:1.7.0_76]
>         at sun.rmi.server.UnicastServerRef.exportObject(UnicastServerRef.java:207) ~[na:1.7.0_76]
>         at sun.rmi.registry.RegistryImpl.setup(RegistryImpl.java:122) ~[na:1.7.0_76]
>         at sun.rmi.registry.RegistryImpl.<init>(RegistryImpl.java:98) ~[na:1.7.0_76]
>         at java.rmi.registry.LocateRegistry.createRegistry(LocateRegistry.java:239) ~[na:1.7.0_76]
>         at org.apache.cassandra.service.CassandraDaemon.maybeInitJmx(CassandraDaemon.java:100)
[main/:na]
>         at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:222)
[main/:na]
>         at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:564)
[main/:na]
>         at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:653)
[main/:na]
> Caused by: java.net.BindException: Address already in use
>         at java.net.PlainSocketImpl.socketBind(Native Method) ~[na:1.7.0_76]
>         at java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:376) ~[na:1.7.0_76]
>         at java.net.ServerSocket.bind(ServerSocket.java:376) ~[na:1.7.0_76]
>         at java.net.ServerSocket.<init>(ServerSocket.java:237) ~[na:1.7.0_76]
>         at javax.net.DefaultServerSocketFactory.createServerSocket(ServerSocketFactory.java:231)
~[na:1.7.0_76]
>         at org.apache.cassandra.utils.RMIServerSocketFactoryImpl.createServerSocket(RMIServerSocketFactoryImpl.java:13)
~[main/:na]
>         at sun.rmi.transport.tcp.TCPEndpoint.newServerSocket(TCPEndpoint.java:666) ~[na:1.7.0_76]
>         at sun.rmi.transport.tcp.TCPTransport.listen(TCPTransport.java:329) ~[na:1.7.0_76]
>         ... 11 common frames omitted
> {noformat}
> However the startup continues, and ends up replaying commitlogs, which is probably not
a good thing.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org


Mime
View raw message