accumulo-user mailing list archives

From Dave Mullins <padraigdida...@gmail.com>
Subject Re: Accumulo Upgrade from 1.4.2 to 1.5.0 Issues
Date Mon, 04 Nov 2013 21:58:22 GMT
It begins by printing various txid lines, then the errors begin:

Thread "org.apache.accumulo.server.fate.Admin" died null

It then goes into a series of errors. At the bottom of each one, they all
stem from a deserialization error, in a section starting with a
"caused by:java.io.EOFException" error.

I will attempt to get a printout of the errors and type them in by hand if
need be.
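
To make transcribing easier, the output could be captured to a file and the first EOFException located. This is only a rough sketch: the helper name first_eof and the file name fate-print.log are made up for illustration, and only the Admin print command itself comes from this thread.

```shell
# Hypothetical helper (not part of Accumulo): print the first line of a
# captured log that mentions java.io.EOFException, prefixed by its line number.
first_eof() {
    grep -n -m 1 'java.io.EOFException' "$1"
}

# Assumed usage: capture stdout and stderr of the command from this thread,
# then look for the earliest deserialization failure.
#   ./bin/accumulo org.apache.accumulo.server.fate.Admin print > fate-print.log 2>&1
#   first_eof fate-print.log
```

The line number it reports shows where in the capture to start reading, which should keep the hand-typed excerpt short.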





On Mon, Nov 4, 2013 at 1:51 PM, Eric Newton <eric.newton@gmail.com> wrote:

> These symptoms would appear to be caused by problems with table
> operations, which are heavily dependent on the master being able to
> use data in zookeeper.
>
> So, try to find the first errors, especially those related to
> serialization or deserialization closest to when the master first
> started.
>
> What do you get when you run:
>
> $ ./bin/accumulo org.apache.accumulo.server.fate.Admin print
>
> ?
>
> -Eric
>
>
> On Fri, Nov 1, 2013 at 4:17 PM, Dave Mullins <padraigdidanan@gmail.com>
> wrote:
> > Hadoop version 0.20.2-cdh3u5
> > This was installed from the cdh rpms but is not controlled by a cloudera
> > manager.
> >
> > I read what documentation I could find on the upgrade.
> > I installed from the tarball version of 1.5.0.
> > I made sure to include the commons collection in the accumulo library
> > path.
> > I made sure to set dfs.support.append to true in the hdfs-site files.
> > I did a complete restart (including a reboot) of the system.
> >
> > All of the tablet servers come online.
> > All of the master's services come online and seem to be working. (The
> > monitor does show the correct number of tablets, tablet servers, and so
> > forth.)
> >
> > I am able to use some of the features of the accumulo shell
> > I can display the contents of a table.
> > I can't create or delete a table without getting the following error:
> > [impl.ThriftTransportPool] WARN: Thread "shell" stuck on io to
> > x.x.x.x:9999:9999 (0) for at least 120040 ms
> >
> > When I go digging in the logs I find very few errors. (These systems are
> > not on a net I can cut and paste to here so I am trying to represent the
> > issue as best I can.)
> >
> > There are 4 errors saying that the Repo runner [0-3] threads died.
> >
> > Another error that springs up occasionally is: WARN: Thread "GC" stuck
> > on io to x.x.x.x:9999:9999 (0) for at least 120040 ms
> >
> > A netstat run before I start the master up shows nothing running on port
> > 9999 nor any connections to that port.
> > A netstat shortly after accumulo starts shows about 16 connections in a
> > TIME_WAIT state in the 35k-36k port range from the master. It also shows
> > an established state for 1 in both directions (36783) and inbound from
> > port 9999 to port 47636, also from the master.
> >
> > It seems that after this point anything that tries to connect to port
> > 9999 goes into a TIME_WAIT and never does anything.
> >
> > I have checked all the permissions I can think of and everything seems
> > to be correct.
> > HDFS is running correctly and jobs not associated with accumulo all seem
> > to be working.
>
