accumulo-user mailing list archives

From "Ott, Charles H." <CHARLES.H....@saic.com>
Subject RE: Thread "shell" Stuck on IO
Date Thu, 18 Oct 2012 14:46:50 GMT
I apologize for not giving more information from the start.

I am running a single instance on a single virtual server.  Zookeeper
shows a single server ssdev:2181 in 'standalone' mode.

This is a development system and there are no tables at this time.  The
IP conflict issue was noticed when I tried to create a table for the
first time and the shell started to hang.

I have tried restarting the system but have been seeing the message:
"Recovery of 192.168.0.130:11224:[some UUID] failed." The shell
still hangs when performing a scan or createtable.

I will look into "re-initializing" the server.

 

From: user-return-1496-CHARLES.H.OTT=saic.com@accumulo.apache.org
[mailto:user-return-1496-CHARLES.H.OTT=saic.com@accumulo.apache.org] On
Behalf Of Eric Newton
Sent: Thursday, October 18, 2012 7:41 AM
To: user@accumulo.apache.org
Subject: Re: Thread "shell" Stuck on IO

 

The reference to 192.168.0.130 is in zookeeper or the metadata table.

Unfortunately, this is a known problem with 1.3 and 1.4.  You can't
change your IP addresses.  You can incrementally shut down servers and
change the IP address one-at-a-time, but not all at once.

If this is a dev system and you don't need the data, the fastest thing
to do is to reset the system and re-load your test data.

If you can't reload your data, you will have to move your data in hdfs,
re-initialize and bulk-import the existing tables.

-Eric

 

On Wed, Oct 17, 2012 at 5:40 PM, Ott, Charles H.
<CHARLES.H.OTT@saic.com> wrote:

I believe you have already helped me get on the right track...

First, 192.168.0.130 is the IP that the VM came with preconfigured.
I changed the IP for this new environment in RHEL5 and "most" everything
seems to be running... however, the fact that it is reporting
192.168.0.130 tells me that somewhere in the logger configuration it's
still using the old IP?

All of the properties files I have looked at specify the hostname, not
IP... I checked the hosts file and the hostname is resolving the proper
IP, so that shouldn't be an issue.
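
The hostname check described above can also be scripted. What follows is a minimal sketch, assuming Python is available on the host. The stale address comes from this thread; the helper names `resolved_ips` and `still_points_at` and the injectable `resolver` parameter are illustrative assumptions, not part of Cloudbase or Accumulo.

```python
import socket

# The preconfigured address that keeps showing up in the error messages.
STALE_IP = "192.168.0.130"


def resolved_ips(hostname, resolver=socket.gethostbyname_ex):
    """Return the sorted, de-duplicated IPv4 addresses for hostname.

    resolver is a hypothetical hook so the check can be exercised
    without touching the real hosts file or DNS.
    """
    _name, _aliases, addresses = resolver(hostname)
    return sorted(set(addresses))


def still_points_at(hostname, stale_ip, resolver=socket.gethostbyname_ex):
    """Return True if hostname still resolves to the stale address."""
    return stale_ip in resolved_ips(hostname, resolver)
```

Calling `still_points_at("ssdev", STALE_IP)` on the server should return False if the hosts file is correct. A clean result only covers name resolution; it does not rule out the old address being stored somewhere else.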

When I try to start the logger with:

# ./cloudbase.sh logger

 I see:
Failed to initialize log service args=[]
        java.io.IOException: Failed to acquire lock file
                at cloudbase.server.logger.LogService.<init>(LogService.java:122)
                at cloudbase.server.logger.LogService.main(LogService.java:83)
                at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
                at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
                at java.lang.reflect.Method.invoke(Method.java:597)
                at cloudbase.start.Main$1.run(Main.java:73)
                at java.lang.Thread.run(Thread.java:662)


-----Original Message-----
From: user-return-1492-CHARLES.H.OTT=saic.com@accumulo.apache.org
[mailto:user-return-1492-CHARLES.H.OTT=saic.com@accumulo.apache.org] On
Behalf Of Keith Turner
Sent: Wednesday, October 17, 2012 5:09 PM
To: user@accumulo.apache.org
Subject: Re: Thread "shell" Stuck on IO

Is the logger at 192.168.0.130 running?  The stack trace indicates
that the master was attempting to contact the logger at 192.168.0.130 to
initiate log recovery.
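
Whether anything answers at that address can be tested with a plain TCP connect. This is a minimal sketch, assuming Python on the master host; the function name `can_connect` and the 5-second default timeout are illustrative choices, and the port 11224 is the one shown in the recovery error.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout seconds."""
    try:
        # create_connection resolves the host and applies the timeout to the connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, `can_connect("192.168.0.130", 11224)` returning False would be consistent with the "Connection timed out" in the stack trace, although it cannot tell a stopped logger apart from an address that no longer exists on the network.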

On Wed, Oct 17, 2012 at 4:58 PM, Ott, Charles H.
<CHARLES.H.OTT@saic.com> wrote:
> I am using a VMware ESXi 4.1 server with Cloudbase (Accumulo) on RHEL5.
>
> I cannot start with a fresh install because I am somewhat required to
> use the preconfigured image on the vm. (business rules out of my
> hands)
>
> Unfortunately the support for this preconfigured instance is not
> available and I am tasked with getting it working anyway...
>
>
>
> I am able to log into the shell and view the tables; however, if I
> attempt to create a table or perform a scan, a line return is shown
> and then it just hangs there until finally throwing the following
> error:
>
> WARN thread "shell" stuck on IO to ssdev:9999:9999 (0) for at least
> 120044 ms.
>
>
>
> I did also discover that 9999 is the property: master.port.client in
> my conf/accumulo-site.xml
>
>
>
> There is also an event log that was added to the VM with web based UI
> reporting:
>
> Unable to recover 192.168.0.130:11224/b4da830b-8ecb-4868-a480-35a39f4af17a(java.io.IOException: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection timed out)
>
>          java.io.IOException: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection timed out
>                  at cloudbase.server.tabletserver.log.RemoteLogger.<init>(RemoteLogger.java:75)
>                  at cloudbase.server.master.CoordinateRecoveryTask$RecoveryJob.startCopy(CoordinateRecoveryTask.java:109)
>                  at cloudbase.server.master.CoordinateRecoveryTask$RecoveryJob.access$400(CoordinateRecoveryTask.java:93)
>                  at cloudbase.server.master.CoordinateRecoveryTask.recover(CoordinateRecoveryTask.java:279)
>                  at cloudbase.server.master.Master$TabletGroupWatcher.run(Master.java:1155)
>          Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection timed out
>                  at cloudbase.core.client.impl.ThriftTransportPool.createNewTransport(ThriftTransportPool.java:428)
>                  at cloudbase.core.client.impl.ThriftTransportPool.getTransport(ThriftTransportPool.java:415)
>                  at cloudbase.core.client.impl.ThriftTransportPool.getTransport(ThriftTransportPool.java:392)
>                  at cloudbase.core.util.ThriftUtil.getClient(ThriftUtil.java:58)
>                  at cloudbase.server.tabletserver.log.RemoteLogger.<init>(RemoteLogger.java:73)
>                  ... 4 more
>          Caused by: java.net.ConnectException: Connection timed out
>                  at sun.nio.ch.Net.connect(Native Method)
>                  at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:500)
>                  at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:81)
>                  at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:65)
>                  at cloudbase.core.util.TTimeoutTransport.create(TTimeoutTransport.java:23)
>                  at cloudbase.core.client.impl.ThriftTransportPool.createNewTransport(ThriftTransportPool.java:426)
>                  ... 8 more
>
>
>
>
>
> I have seen posts relating this to the walogs folder not being
> available, but I have checked that and the .lock file is being created
> automatically.
>
> A #netstat | grep 9999 shows no processes using 9999 before logging
> into the shell... so I'm not sure there is a port conflict either.

>
>
>
> Any thoughts on the matter would be greatly appreciated.
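
The port check mentioned above (nothing holding 9999 before the shell is started) can be cross-checked by trying to bind the port directly. This is a minimal sketch, assuming Python on the server; `port_is_free` is an illustrative name, and the check only covers IPv4 TCP listeners on the given interface.

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if an IPv4 TCP socket can be bound to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use"; also raised for permission errors.
            return False
        return True
```

`port_is_free(9999)` returning True before the master is started would agree with the netstat result. The socket is closed again as soon as the function returns, so the check does not hold the port.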

 

