incubator-cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From aaron morton <aa...@thelastpickle.com>
Subject Re: Repair fails with java.io.IOError: java.io.EOFException
Date Thu, 21 Jul 2011 23:30:47 GMT
The default init.d script will direct std out/err to that file, how are you starting brisk
/ cassandra ?

Check the syslog and other logs in /var/log to see if the OS killed cassandra. 

Also, what was the last thing in the casandra log before INFO [main] 2011-07-21 15:48:07,233
AbstractCassandraDaemon.java (line 78) Logging initialised ?


Cheers
 
-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 22 Jul 2011, at 10:50, Sameer Farooqui wrote:

> Hey Aaron,
> 
> I don't have any output.log files in that folder:
> 
> ubuntu@ip-10-2-x-x:~$ cd /var/log/cassandra
> ubuntu@ip-10-2-x-x:/var/log/cassandra$ ls
> system.log     system.log.11  system.log.4  system.log.7
> system.log.1   system.log.2   system.log.5  system.log.8
> system.log.10  system.log.3   system.log.6  system.log.9
> 
> 
> 
> On Thu, Jul 21, 2011 at 3:40 PM, aaron morton <aaron@thelastpickle.com> wrote:
> Check /var/log/cassandra/output.log (assuming the default init scripts)
> 
> A
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 22 Jul 2011, at 10:13, Sameer Farooqui wrote:
> 
>> Hmm. Just looked at the log more closely.
>> 
>> So, what actually happened is while Repair was running on this specific node, the
Cassandra java process terminated itself automatically. The last entries in the log are:
>> 
>>  INFO [ScheduledTasks:1] 2011-07-21 13:00:20,285 GCInspector.java (line 128) GC for
ParNew: 214 ms, 162748656 reclaimed leaving 1845274888 used; max is 4030726144
>>  INFO [ScheduledTasks:1] 2011-07-21 13:00:27,375 GCInspector.java (line 128) GC for
ParNew: 266 ms, 158835624 reclaimed leaving 1864471688 used; max is 4030726144
>>  INFO [ScheduledTasks:1] 2011-07-21 13:00:57,658 GCInspector.java (line 128) GC for
ParNew: 251 ms, 148861328 reclaimed leaving 1931111120 used; max is 4030726144
>>  INFO [ScheduledTasks:1] 2011-07-21 13:01:19,358 GCInspector.java (line 128) GC for
ParNew: 260 ms, 157638152 reclaimed leaving 1955746368 used; max is 4030726144
>>  INFO [ScheduledTasks:1] 2011-07-21 13:01:22,729 GCInspector.java (line 128) GC for
ParNew: 325 ms, 154157352 reclaimed leaving 1969361176 used; max is 4030726144
>>  INFO [ScheduledTasks:1] 2011-07-21 13:01:51,187 GCInspector.java (line 128) GC for
ParNew: 202 ms, 153219160 reclaimed leaving 2040879600 used; max is 4030726144
>>  
>> When we came in this morning, nodetool ring from another node showed the 1st node
as down and OpsCenter also reported it as down.
>> 
>> Next we ran "sudo netstat -anp | grep 7199" from the 1st node to see the status of
the Cassandra PID and it was not running.
>> 
>> We then started Cassandra:
>> 
>> INFO [main] 2011-07-21 15:48:07,233 AbstractCassandraDaemon.java (line 78) Logging
initialized
>>  INFO [main] 2011-07-21 15:48:07,266 AbstractCassandraDaemon.java (line 96) Heap
size: 3894411264/3894411264
>>  INFO [main] 2011-07-21 15:48:11,678 CLibrary.java (line 106) JNA mlockall successful
>>  INFO [main] 2011-07-21 15:48:11,702 DatabaseDescriptor.java (line 121) Loading settings
from file:/home/ubuntu/brisk/resources/cassandra/conf/cassandra.yaml
>> 
>> 
>> It was during this start process that the java.io.EOFException was seen, but yes,
like you said Jonathan, the Cassandra process started back up and joined the ring. 
>> 
>> We're now wondering why the Repair failed and why Cassandra crashed in the first
place. We only had default level logging enabled. Is there something else I can check or that
you suspect?
>> 
>> Should we turn the logging up to debug and retry the Repair?
>> 
>> 
>> - Sameer
>> 
>> 
>> On Thu, Jul 21, 2011 at 12:37 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
>> Looks harmless to me.
>> 
>> On Thu, Jul 21, 2011 at 1:41 PM, Sameer Farooqui
>> <cassandralabs@gmail.com> wrote:
>> > While running Repair on a 0.8.1 node, we got this error in the system.log:
>> >
>> > ERROR [Thread-23] 2011-07-21 15:48:43,868 AbstractCassandraDaemon.java (line
>> > 113) Fatal exception in thread Thread[Thread-23,5,main]
>> > java.io.IOError: java.io.EOFException
>> > at
>> > org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:78)
>> > Caused by: java.io.EOFException
>> > at java.io.DataInputStream.readInt(DataInputStream.java:375)
>> > at
>> > org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:66)
>> >
>> > There's just a bunch of informational messages about Gossip before this.
>> >
>> > Looks like the file or stream unexpectedly ended?
>> > http://download.oracle.com/javase/1.4.2/docs/api/java/io/EOFException.html
>> >
>> > Is this a bug or something wrong in our environment?
>> >
>> >
>> > - Sameer
>> >
>> 
>> 
>> 
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of DataStax, the source for professional Cassandra support
>> http://www.datastax.com
>> 
> 
> 


Mime
View raw message