Date: Thu, 12 Sep 2013 14:27:59 -0400
Subject: My Accumulo 1.5.0 instance has no tablet servers
From: Pete Carlson
To: "user@accumulo.apache.org"
Cc: josh.elser@gmail.com, Eric Newton, keith.turner@deenlo.com

Ok, so now that I have an Accumulo monitor I discovered that my Accumulo instance doesn't have any tablet servers.

Here is what I tried so far to resolve the issue:

1) Looked in the tserver_localhost.localdomain.log file, and found this FATAL message:

2013-09-12 08:09:42,273 [tabletserver.TabletServer] FATAL: Must set dfs.durable.sync OR dfs.support.append to true. Which one needs to be set depends on your version of HDFS. See ACCUMULO-623.
HADOOP RELEASE     VERSION        SYNC NAME             DEFAULT
Apache Hadoop      0.20.205       dfs.support.append    false
Apache Hadoop      0.23.x         dfs.support.append    true
Apache Hadoop      1.0.x          dfs.support.append    false
Apache Hadoop      1.1.x          dfs.durable.sync      true
Apache Hadoop      2.0.0-2.0.2    dfs.support.append    true
Cloudera CDH       3u0-3u3        ????                  true
Cloudera CDH       3u4            dfs.support.append    true
Hortonworks HDP    `1.0           dfs.support.append    false
Hortonworks HDP    `1.1           dfs.support.append    false
2013-09-12 11:54:00,752 [server.Accumulo] INFO : tserver starting
2013-09-12 11:54:00,768 [server.Accumulo] INFO : Instance d57cdc38-8ceb-4192-9da3-1ce2664df33b
2013-09-12 11:54:00,771 [server.Accumulo] INFO : Data Version 5
2013-09-12 11:54:00,771 [server.Accumulo] INFO : Attempting to talk to zookeeper
2013-09-12 11:54:00,952 [server.Accumulo] INFO : Zookeeper connected and initialized, attemping to talk to HDFS
2013-09-12 11:54:00,956 [server.Accumulo] INFO : Connected to HDFS
2013-09-12 11:54:00,969 [server.Accumulo] INFO : gc.cycle.delay = 5m
2013-09-12 11:54:00,969 [server.Accumulo] INFO : gc.cycle.start = 30s
2013-09-12 11:54:00,969 [server.Accumulo] INFO : gc.port.client = 50091
2013-09-12 11:54:00,969 [server.Accumulo] INFO : gc.threads.delete = 16
2013-09-12 11:54:00,969 [server.Accumulo] INFO : gc.trash.ignore = false

I saw this same FATAL message 8 times in the tserver_localhost.localdomain.log between blocks of INFO messages, but no other fatal or warn messages. Btw, this FATAL message also appears in my tserver_localhost.localdomain.debug.log file.

When I googled this Fatal message I found this page:

http://mail-archives.apache.org/mod_mbox/accumulo-user/201304.mbox/%3C515F5518.1090703@gmail.com%3E

with the same "WARN: There are no tablet servers: check that zookeeper and accumulo are running." message.

I checked http://127.0.0.1:50095/tservers, and it showed that there were no tablet servers online. I looked at http://127.0.0.1:50095/log, and saw the following messages:

FATAL: Must set dfs.durable.sync or dfs.support.append to true. Which one needs to be set depends on your version of HDFS. See Accumulo-623.

WARN: There are no tablet servers: check that zookeeper and accumulo are running.

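For anyone hitting the same FATAL message, one quick check is whether hdfs-site.xml explicitly sets either of the two properties it names. The following is only a minimal sketch under stated assumptions: it assumes the usual Hadoop <configuration>/<property>/<name>/<value> layout, the function names and the SAMPLE text are hypothetical, and it only sees values written in the file it is given, not Hadoop's built-in defaults from the table above. Which hdfs-site.xml the tablet server actually reads is not something this message establishes.

```python
import xml.etree.ElementTree as ET

# The two property names quoted in the FATAL message.
SYNC_PROPERTIES = ("dfs.durable.sync", "dfs.support.append")


def sync_settings(hdfs_site_xml: str) -> dict:
    """Return {property name: value} for the sync properties set in the XML text."""
    root = ET.fromstring(hdfs_site_xml)
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name in SYNC_PROPERTIES:
            found[name] = (prop.findtext("value") or "").strip()
    return found


def sync_enabled(hdfs_site_xml: str) -> bool:
    """True if at least one sync property is explicitly set to true.

    The case-insensitive comparison is an assumption made for leniency.
    """
    return any(v.lower() == "true" for v in sync_settings(hdfs_site_xml).values())


# Hypothetical file content used only for illustration.
SAMPLE = """<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    print("sync properties found:", sync_settings(SAMPLE))
    print("sync explicitly enabled:", sync_enabled(SAMPLE))
```

To try it against a real file, read the file's text and pass it to sync_settings; an empty result means neither property is set explicitly, so whatever default the Hadoop version ships with applies.
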
Using the info from the page I referenced above, I checked my $ACCUMULO_HOME path and realized that I hadn't set it in conf/accumulo-env.sh.

So, I set it to the following:

test -z "$ACCUMULO_HOME" && export ACCUMULO_HOME=/home/accumulo/accumulo-1.5.0

When I did an echo of $ACCUMULO_HOME it didn't return anything, so I also tried setting it in my bash profile to see if that made any difference (it didn't).

I also looked in the lib directory but didn't see any stray jars.

In my tracer_localhost_localdomain.log I saw the following exception with Zookeeper:

2013-09-11 16:09:48,649 [impl.ServerClient] WARN : There are no tablet servers: check that zookeeper and accumulo are running.
2013-09-11 18:02:23,385 [zookeeper.ZooCache] WARN : Zookeeper error, will retry
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /accumulo/d57cdc38-8ceb-4192-9da3-1ce2664df33b/tservers
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1468)
    at org.apache.accumulo.fate.zookeeper.ZooCache$1.run(ZooCache.java:167)
    at org.apache.accumulo.fate.zookeeper.ZooCache.retry(ZooCache.java:130)
    at org.apache.accumulo.fate.zookeeper.ZooCache.getChildren(ZooCache.java:178)
    at org.apache.accumulo.core.client.impl.ServerClient.getConnection(ServerClient.java:140)
    at org.apache.accumulo.core.client.impl.ServerClient.getConnection(ServerClient.java:128)
    at org.apache.accumulo.core.client.impl.ServerClient.getConnection(ServerClient.java:123)
    at org.apache.accumulo.core.client.impl.ServerClient.executeRaw(ServerClient.java:105)
    at org.apache.accumulo.core.client.impl.ServerClient.execute(ServerClient.java:71)
    at org.apache.accumulo.core.client.impl.ConnectorImpl.<init>(ConnectorImpl.java:64)
    at org.apache.accumulo.server.client.HdfsZooInstance.getConnector(HdfsZooInstance.java:154)
    at org.apache.accumulo.server.client.HdfsZooInstance.getConnector(HdfsZooInstance.java:149)
    at org.apache.accumulo.server.trace.TraceServer.<init>(TraceServer.java:185)
    at org.apache.accumulo.server.trace.TraceServer.main(TraceServer.java:260)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.accumulo.start.Main$1.run(Main.java:101)
    at java.lang.Thread.run(Thread.java:724)
2013-09-12 08:09:44,861 [server.Accumulo] INFO : tracer starting
2013-09-12 08:09:44,926 [server.Accumulo] INFO : Instance d57cdc38-8ceb-4192-9da3-1ce2664df33b
2013-09-12 08:09:44,929 [server.Accumulo] INFO : Data Version 5
2013-09-12 08:09:44,929 [server.Accumulo] INFO : Attempting to talk to zookeeper
2013-09-12 08:09:45,114 [server.Accumulo] INFO : Zookeeper connected and initialized, attemping to talk to HDFS
2013-09-12 08:09:45,130 [server.Accumulo] INFO : Connected to HDFS
2013-09-12 08:09:45,150 [server.Accumulo] INFO : gc.cycle.delay = 5m
2013-09-12 08:09:45,150 [server.Accumulo] INFO : gc.cycle.start = 30s

but then it appeared to reconnect with Zookeeper.

2) I looked at the ACCUMULO-623 Jira ticket from the FATAL message above, i.e., https://issues.apache.org/jira/browse/ACCUMULO-623, but this ticket indicates the issue is fixed in Accumulo 1.5.0, although it references Hadoop 1.0.3 and Zookeeper 3.3.3 (I'm using Hadoop 1.2.1 and Zookeeper 3.4.5). I noticed that a fix was added to Hadoop 1.1 for a related Hadoop Jira ticket.

3) Next, I went to the Accumulo Jira page, i.e., https://issues.apache.org/jira/browse/accumulo, to look for this issue.
Besides ACCUMULO-623, the following tickets are similar but not quite the same:

- ACCUMULO-327 (but I don't have any tablet servers to begin with to be killed)
- ACCUMULO-1235 (I only have the default !METADATA table)

4) Looked again at the user manual to see if there was information about configuring the tablet server, but didn't see anything.

Any suggestions on what I should try next?

Thanks,

Pete