accumulo-dev mailing list archives

From Sean Busbey <bus...@cloudera.com>
Subject Re: Connection Refused Reading Instance Id From HDFS
Date Wed, 09 Jul 2014 18:03:00 GMT
sounds like it. does the system already wait for the namenode to come up
before starting the Accumulo related daemons?
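
One way to add such a wait, as a minimal sketch: poll the NameNode RPC port until it accepts a TCP connection, and only then launch the Accumulo processes. The function name wait_for_port and its timeout and interval parameters are hypothetical, not part of Accumulo or Hadoop. The host "grail" and port 8020 would come from the logs quoted below.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=0.5):
    """Return True once host:port accepts a TCP connection, False on timeout.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: the listener is not up yet.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A launcher script could call wait_for_port("grail", 8020) and start the gc, master, monitor, and tserver only when it returns True. An open port only shows that the RPC server is listening. It does not show that the NameNode has left safe mode, which `hdfs dfsadmin -safemode wait` can block on separately.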


On Wed, Jul 9, 2014 at 12:34 PM, David Medinets <david.medinets@gmail.com>
wrote:

> Line 116 of namenode log:
> 14/07/09 12:33:44 INFO namenode.NameNode: NameNode RPC up at:
> grail/172.17.0.40:8020
>
>
> Last modified timestamp of gc log:
> $ date -r accumulo-gc-stderr---supervisor-6DKWec.log
> Wed Jul  9 12:33:44 EDT 2014
>
> They correspond to each other. So, perhaps, the current situation is good
> enough? The error message can simply be ignored?
>
>
>
> On Wed, Jul 9, 2014 at 12:49 PM, Sean Busbey <busbey@cloudera.com> wrote:
>
> > if you look at line 116 of the namenode log, it took ~3 seconds for it to
> > get through start up, out of safe mode, and ready to deal with RPC.
> >
> > if the gc / master / whatever is trying to grab things prior to that, it
> > would explain the failure you're seeing.
> >
> > What's the last modified timestamp on that GC log?
> >
> >
> > On Wed, Jul 9, 2014 at 11:43 AM, David Medinets <david.medinets@gmail.com>
> > wrote:
> >
> > > I see the same connection refused message in master, monitor, tserver,
> > > and gc logs. Maybe it's a coordination issue where one process takes a
> > > bit of time to fully start?
> > >
> > > GIST of ./make_image.sh
> > > https://gist.github.com/medined/c1677c278bd72cd2ee64
> > >
> > > $ ./make_container.sh grail grail
> > > db73a147ef89a683435475ac9fcba7ef31219710e39759afd87b0742f6cc7490
> > >
> > > GIST of file diff from base image
> > > https://gist.github.com/medined/fe7d04bd4aaa902a66ed
> > >
> > > Use ./enter_image.sh grail to investigate container
> > >
> > > GIST of namenode log
> > > https://gist.github.com/medined/46ce499603d75888a118
> > >
> > > GIST of datanode log
> > > https://gist.github.com/medined/1e99bb6db3466427013d
> > >
> > > GIST of accumulo gc log
> > > https://gist.github.com/medined/c1abf7eaee876057e38f
> > >
> > >
> > >
> > >
> > > On Wed, Jul 9, 2014 at 12:29 PM, Sean Busbey <busbey@cloudera.com>
> > > wrote:
> > >
> > > > can you put your namenode, datanode, and gc logs into a gist?
> > > >
> > > >
> > > > On Wed, Jul 9, 2014 at 11:20 AM, David Medinets <david.medinets@gmail.com>
> > > > wrote:
> > > >
> > > > > I updated my accumulo-env.sh so that Accumulo uses IPv4. And now the
> > > > > shell starts, I can create tables and insert entries. I can even view
> > > > > the monitor page. However, the logs still show the connection refused
> > > > > error.
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Wed, Jul 9, 2014 at 10:38 AM, Sean Busbey <busbey@cloudera.com>
> > > > > wrote:
> > > > >
> > > > > > This might be the same IPv6 issue that's causing your monitor
> > > > > > failures.
> > > > > >
> > > > > > --
> > > > > > Sean
> > > > > > On Jul 9, 2014 9:13 AM, "David Medinets" <david.medinets@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > I'm hoping someone has a few minutes to help debug this
> > > > > > > networking issue. I'm running a single-node Accumulo 1.5.1
> > > > > > > instance (using Hadoop 2.4) inside Docker. I can find the
> > > > > > > Accumulo instance id manually:
> > > > > > >
> > > > > > > -bash-4.1# hdfs dfs -ls /accumulo/instance_id
> > > > > > > Found 1 items
> > > > > > > -rw-r--r--   1 accumulo accumulo          0 2014-07-09 08:22 /accumulo/instance_id/9421cd33-5f37-4f6d-b645-372feb431cae
> > > > > > >
> > > > > > > But when the gc tries to find the instance id, I see this message
> > > > > > > in the log file:
> > > > > > >
> > > > > > > -bash-4.1# cat /var/log/supervisor/accumulo-gc-stdout---supervisor-nwDCZU.log
> > > > > > > 2014-07-09 09:04:07,006 [client.ZooKeeperInstance] ERROR: Problem reading instance id out of hdfs at /accumulo/instance_id
> > > > > > > java.net.ConnectException: Call From grail/172.17.0.2 to grail:8020 failed on connection exception: java.net.ConnectException: Connection refused;
> > > > > > >
> > > > > > > The hostname is 'grail' which resolves to 172.17.0.2 via
> > > > > > > /etc/hosts and it can ping itself:
> > > > > > >
> > > > > > > david@zareason-verix545:~/projects/docker-builds/accumulo$ ./enter_image.sh grail
> > > > > > > -bash-4.1# ping grail
> > > > > > > PING grail (172.17.0.2) 56(84) bytes of data.
> > > > > > > 64 bytes from grail (172.17.0.2): icmp_seq=1 ttl=64 time=0.083 ms
> > > > > > > 64 bytes from grail (172.17.0.2): icmp_seq=2 ttl=64 time=0.054 ms
> > > > > > >
> > > > > > > Do I need to use IP addresses in the 'gc' and other configuration
> > > > > > > files?
> > > > > > >
> > > > > > > Any ideas for me to try?
> > > > > > >
> > > > > > > When the NameNode starts, the log file has no errors. It starts
> > > > > > > with:
> > > > > > >
> > > > > > > STARTUP_MSG: Starting NameNode
> > > > > > > STARTUP_MSG:   host = grail/172.17.0.2
> > > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Sean
> > > >
> > >
> >
> >
> >
> > --
> > Sean
> >
>



-- 
Sean
