activemq-users mailing list archives

From Octavian Covalschi <octavian.covals...@gmail.com>
Subject Re: Apollo 1.7 - LevelDB corruption
Date Sun, 15 Jun 2014 16:37:16 GMT
Yes, I thought that too, but even after multiple restarts it still wasn't
working. I was getting a ClosedConnectionException (or something like
that) on the consumer side, so the recovery message is misleading.

PS: I have an SSD, though without much OS tuning...
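
For anyone else hitting the same error: the condition LevelDB complains about
("CURRENT file does not end with newline") can be checked from outside the
broker. Below is only a rough sketch in Python, not part of Apollo. It relies
on the standard LevelDB layout, where each database directory has a CURRENT
file holding the name of the active MANIFEST file followed by a newline. The
function names check_current_file and scan are made up for illustration, and
the script only reads files; it does not repair anything. Run it against a
stopped broker or a copy of the data directory.

```python
import os
import sys


def check_current_file(path):
    """Return (ok, reason) for one LevelDB CURRENT file (read-only check)."""
    with open(path, "rb") as f:
        data = f.read()
    if not data:
        return False, "empty"
    if not data.endswith(b"\n"):
        return False, "does not end with newline"
    # CURRENT should name a MANIFEST file in the same directory.
    manifest = data[:-1].decode("ascii", errors="replace")
    if not os.path.isfile(os.path.join(os.path.dirname(path), manifest)):
        return False, "names missing manifest " + manifest
    return True, "ok"


def scan(data_dir):
    """Walk data_dir and check every file named CURRENT found below it."""
    results = {}
    for root, _dirs, files in os.walk(data_dir):
        if "CURRENT" in files:
            path = os.path.join(root, "CURRENT")
            results[path] = check_current_file(path)
    return results


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python check_current.py /opt/apollo/brokers/local/data
    for path, (ok, reason) in sorted(scan(sys.argv[1]).items()):
        print(("OK   " if ok else "BAD  ") + path + ": " + reason)
```

A "BAD" line would at least show which index directory is affected, which
could help decide whether the whole data dir really needs to be wiped.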


On Sun, Jun 15, 2014 at 12:28 PM, Johan Edstrom <seijoed@gmail.com> wrote:

> That looks like it was corrupted and then repaired.
> Since it's running on a laptop, suspended disks, write caches
> and a lot of other things come to mind as possible causes of the corruption.
>
>
> On Jun 15, 2014, at 10:23 AM, Octavian Covalschi <octavian.covalschi@gmail.com> wrote:
>
> > Hello,
> >
> > I'm evaluating Apollo for our current project and so far it has worked well
> > for our use case. However, today I encountered a problem: a corrupted
> > LevelDB...
> >
> > What could possibly corrupt it? I have Apollo 1.7 installed on my laptop
> > and everything was fine until today, when I discovered the problem (see
> > some log messages below). Nothing unusual happened, so I cannot link this
> > with any incident...
> >
> > So I was wondering:
> >
> > What's the proper fix for this? This time I deleted all files in the data
> > dir and restarted Apollo, but I'm not sure that's the way to go in a
> > production environment...
> >
> > What can be done to avoid disasters? A message queue is a critical
> > component of our architecture, so if it stops, most of our functionality
> > stops too...
> >
> > Is there an easy way to replicate persisted messages?
> >
> > Is there more verbose logging that I could use to monitor/find the
> > problem in the future?
> >
> > Should I use a different storage engine?
> >
> > Is there a best practice/pattern to recover from this kind of situation?
> > Like re-publishing messages, although that increases the complexity of our app.
> >
> > Thank you in advance.
> >
> >
> > [Log messages]
> >
> > 2014-06-15 11:57:16,866 | INFO  | OS     : Linux 3.14.5-200.fc20.x86_64 ("Fedora release 20 (Heisenbug)") |
> > 2014-06-15 11:57:16,870 | INFO  | JVM    : Java HotSpot(TM) 64-Bit Server VM 1.7.0_51 (Oracle Corporation) |
> > 2014-06-15 11:57:16,870 | INFO  | Apollo : 1.7 (at: /opt/apollo/home) |
> > 2014-06-15 11:57:16,871 | INFO  | OS is restricting the open file limit to: 100000 |
> > 2014-06-15 11:57:17,077 | INFO  | Starting store: leveldb store at /opt/apollo/brokers/local/data |
> > 2014-06-15 11:57:17,139 | INFO  | Accepting connections at: tcp://0.0.0.0:60013 |
> > 2014-06-15 11:57:17,144 | INFO  | Opening the log file took: 27.97 ms |
> > 2014-06-15 11:57:17,196 | WARN  | DB operation failed. (entering recovery mode): org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: CURRENT file does not end with newline | 146a03f386f
> > 2014-06-15 11:57:18,081 | INFO  | virtual host startup is waiting on store startup |
> > 2014-06-15 11:57:18,199 | INFO  | DB recovered from failure. |
> > 2014-06-15 11:57:18,200 | ERROR | Store startup failure: org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: CURRENT file does not end with newline | 146a03f3870
> > 2014-06-15 11:57:18,201 | INFO  | virtual host startup is no longer waiting.  It waited a total of 1 seconds. |
> > 2014-06-15 11:57:18,305 | INFO  | Administration interface available at: http://127.0.0.1:60080/
>
>
