accumulo-user mailing list archives

From Eric Newton <eric.new...@gmail.com>
Subject Re: My Accumulo 1.5.0 instance has no tablet servers
Date Tue, 01 Oct 2013 19:10:43 GMT
Yep.
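
For anyone finding this thread later, here is a minimal sketch for checking that an hdfs-site.xml carries both settings under the corrected names. It is only an illustration: the helper names (read_properties, missing_or_false) and the SAMPLE string are made up for this note and are not part of Hadoop or Accumulo. It assumes the usual <configuration>/<property> layout of Hadoop config files.

```python
import xml.etree.ElementTree as ET

# Property names as corrected in this thread; both are expected to be "true".
REQUIRED = ("dfs.support.append", "dfs.datanode.synconclose")


def read_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style config."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def missing_or_false(xml_text, required=REQUIRED):
    """List the required properties that are absent or not set to true."""
    props = read_properties(xml_text)
    return [n for n in required if props.get(n, "").lower() != "true"]


# Hypothetical sample: the fragment from this thread with the corrected name.
SAMPLE = """<configuration>
  <property>
    <name>dfs.support.append</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.datanode.synconclose</name>
    <value>true</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    # An empty list means both properties are present and set to true.
    print(missing_or_false(SAMPLE))
```

To run it against a real file, read the file's text and pass it to missing_or_false; a misspelled name such as dfs.data.synconclose shows up as dfs.datanode.synconclose being reported missing.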


On Tue, Oct 1, 2013 at 2:48 PM, Adam Fuchs <afuchs@apache.org> wrote:

> To follow up on this, I think maybe the config should be
> <name>dfs.datanode.synconclose</name>, not <name>dfs.data.synconclose</name>.
> Was that a typo, Eric?
>
> Thanks,
> Adam
>
>
>
> On Thu, Sep 12, 2013 at 2:31 PM, Eric Newton <eric.newton@gmail.com> wrote:
>
>> Add:
>>
>>   <property>
>>       <name>dfs.support.append</name>
>>       <value>true</value>
>>   </property>
>>   <property>
>>       <name>dfs.data.synconclose</name>
>>       <value>true</value>
>>   </property>
>>
>> To hdfs-site.xml in your hadoop configuration.
>>
>> -Eric
>>
>>
>>
>> On Thu, Sep 12, 2013 at 2:27 PM, Pete Carlson <pgcarlson@gmail.com> wrote:
>>
>>> Ok, so now that I have an Accumulo monitor I discovered that my Accumulo
>>> instance doesn't have any tablet servers.
>>>
>>> Here is what I tried so far to resolve the issue:
>>>
>>> 1) Looked in the tserver_localhost.localdomain.log file, and found this
>>> FATAL message:
>>>
>>> 2013-09-12 08:09:42,273 [tabletserver.TabletServer] FATAL: Must set
>>> dfs.durable.sync OR dfs.support.append to true.  Which one needs to be set
>>> depends on your version of HDFS.  See ACCUMULO-623.
>>> HADOOP RELEASE          VERSION           SYNC NAME             DEFAULT
>>> Apache Hadoop           0.20.205          dfs.support.append    false
>>> Apache Hadoop            0.23.x           dfs.support.append    true
>>> Apache Hadoop             1.0.x           dfs.support.append    false
>>> Apache Hadoop             1.1.x           dfs.durable.sync      true
>>> Apache Hadoop          2.0.0-2.0.2        dfs.support.append    true
>>> Cloudera CDH             3u0-3u3             ????               true
>>> Cloudera CDH               3u4            dfs.support.append    true
>>> Hortonworks HDP           `1.0            dfs.support.append    false
>>> Hortonworks HDP           `1.1            dfs.support.append    false
>>> 2013-09-12 11:54:00,752 [server.Accumulo] INFO : tserver starting
>>> 2013-09-12 11:54:00,768 [server.Accumulo] INFO : Instance
>>> d57cdc38-8ceb-4192-9da3-1ce2664df33b
>>> 2013-09-12 11:54:00,771 [server.Accumulo] INFO : Data Version 5
>>> 2013-09-12 11:54:00,771 [server.Accumulo] INFO : Attempting to talk to
>>> zookeeper
>>> 2013-09-12 11:54:00,952 [server.Accumulo] INFO : Zookeeper connected and
>>> initialized, attemping to talk to HDFS
>>> 2013-09-12 11:54:00,956 [server.Accumulo] INFO : Connected to HDFS
>>> 2013-09-12 11:54:00,969 [server.Accumulo] INFO : gc.cycle.delay = 5m
>>> 2013-09-12 11:54:00,969 [server.Accumulo] INFO : gc.cycle.start = 30s
>>> 2013-09-12 11:54:00,969 [server.Accumulo] INFO : gc.port.client = 50091
>>> 2013-09-12 11:54:00,969 [server.Accumulo] INFO : gc.threads.delete = 16
>>> 2013-09-12 11:54:00,969 [server.Accumulo] INFO : gc.trash.ignore = false
>>>
>>> I saw this same FATAL message 8 times in the tserver_localhost.localdomain.log
>>> between blocks of INFO messages, but no other fatal or warn messages.
>>> Btw, this FATAL message also appears in my
>>> tserver_localhost.localdomain.debug.log file.
>>>
>>> When I googled this Fatal message I found this page:
>>>
>>> http://mail-archives.apache.org/mod_mbox/accumulo-user/201304.mbox/%3C515F5518.1090703@gmail.com%3E
>>> with the same "WARN: There are no tablet servers: check that zookeeper and
>>> accumulo are running." message.
>>>
>>> I checked http://127.0.0.1:50095/tservers, and it showed that there
>>> were no tablet servers online. I looked at http://127.0.0.1:50095/log,
>>> and saw the following messages:
>>>
>>> FATAL: Must set dfs.durable.sync or dfs.support.append to true. Which
>>> one needs to be set depends on your version of HDFS. See Accumulo-623.
>>>
>>> WARN: There are no tablet servers: check that zookeeper and accumulo are
>>> running.
>>>
>>> Using the info from the page I referenced above, I checked my
>>> $ACCUMULO_HOME path and realized that I hadn't set that in the
>>> conf/accumulo-env.sh
>>>
>>> So, I set it to the following:
>>>
>>> test -z "$ACCUMULO_HOME" && export
>>> ACCUMULO_HOME=/home/accumulo/accumulo-1.5.0
>>>
>>> When I did an echo of $ACCUMULO_HOME it didn't return anything, so I
>>> also tried setting it in my bash profile to see if that made any difference
>>> (it didn't).
>>>
>>> I also looked in the lib directory but didn't see any stray jars.
>>>
>>> In my tracer_localhost_localdomain.log I saw the following Exception
>>> with Zookeeper:
>>>
>>> 2013-09-11 16:09:48,649 [impl.ServerClient] WARN : There are no tablet
>>> servers: check that zookeeper and accumulo are running.
>>> 2013-09-11 18:02:23,385 [zookeeper.ZooCache] WARN : Zookeeper error,
>>> will retry
>>> org.apache.zookeeper.KeeperException$SessionExpiredException:
>>> KeeperErrorCode = Session expired for
>>> /accumulo/d57cdc38-8ceb-4192-9da3-1ce2664df33b/tservers
>>> at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
>>> at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>>> at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1468)
>>> at org.apache.accumulo.fate.zookeeper.ZooCache$1.run(ZooCache.java:167)
>>> at org.apache.accumulo.fate.zookeeper.ZooCache.retry(ZooCache.java:130)
>>> at org.apache.accumulo.fate.zookeeper.ZooCache.getChildren(ZooCache.java:178)
>>> at org.apache.accumulo.core.client.impl.ServerClient.getConnection(ServerClient.java:140)
>>> at org.apache.accumulo.core.client.impl.ServerClient.getConnection(ServerClient.java:128)
>>> at org.apache.accumulo.core.client.impl.ServerClient.getConnection(ServerClient.java:123)
>>> at org.apache.accumulo.core.client.impl.ServerClient.executeRaw(ServerClient.java:105)
>>> at org.apache.accumulo.core.client.impl.ServerClient.execute(ServerClient.java:71)
>>> at org.apache.accumulo.core.client.impl.ConnectorImpl.<init>(ConnectorImpl.java:64)
>>> at org.apache.accumulo.server.client.HdfsZooInstance.getConnector(HdfsZooInstance.java:154)
>>> at org.apache.accumulo.server.client.HdfsZooInstance.getConnector(HdfsZooInstance.java:149)
>>> at org.apache.accumulo.server.trace.TraceServer.<init>(TraceServer.java:185)
>>> at org.apache.accumulo.server.trace.TraceServer.main(TraceServer.java:260)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> at java.lang.reflect.Method.invoke(Method.java:606)
>>> at org.apache.accumulo.start.Main$1.run(Main.java:101)
>>> at java.lang.Thread.run(Thread.java:724)
>>> 2013-09-12 08:09:44,861 [server.Accumulo] INFO : tracer starting
>>> 2013-09-12 08:09:44,926 [server.Accumulo] INFO : Instance
>>> d57cdc38-8ceb-4192-9da3-1ce2664df33b
>>> 2013-09-12 08:09:44,929 [server.Accumulo] INFO : Data Version 5
>>> 2013-09-12 08:09:44,929 [server.Accumulo] INFO : Attempting to talk to
>>> zookeeper
>>> 2013-09-12 08:09:45,114 [server.Accumulo] INFO : Zookeeper connected and
>>> initialized, attemping to talk to HDFS
>>> 2013-09-12 08:09:45,130 [server.Accumulo] INFO : Connected to HDFS
>>> 2013-09-12 08:09:45,150 [server.Accumulo] INFO : gc.cycle.delay = 5m
>>> 2013-09-12 08:09:45,150 [server.Accumulo] INFO : gc.cycle.start = 30s
>>>
>>> but then it appeared to reconnect with Zookeeper.
>>>
>>> 2) I looked at the ACCUMULO-623 Jira ticket from the FATAL message above,
>>> i.e., https://issues.apache.org/jira/browse/ACCUMULO-623 , but this
>>> Jira ticket indicates this issue is fixed in Accumulo 1.5.0, although that
>>> ticket references Hadoop 1.0.3 and Zookeeper 3.3.3 (I'm using Hadoop
>>> 1.2.1 and Zookeeper 3.4.5). I noticed that a fix was added to Hadoop 1.1
>>> for a related Hadoop Jira ticket.
>>>
>>> 3) Next, I went to the Accumulo Jira page i.e.,
>>> https://issues.apache.org/jira/browse/accumulo to look for this issue.
>>> Besides ACCUMULO-623, the following tickets are similar but not quite the
>>> same:
>>>
>>>    - ACCUMULO-327 (but I don't have any tablet servers to begin with
>>>    to be killed)
>>>    - ACCUMULO-1235 (I only have the default !METADATA table)
>>>
>>> 4) Looked again at the User manual to see if there was information about
>>> configuring the tablet server, but didn't see anything.
>>>
>>> Any suggestions on what I should try next?
>>>
>>> Thanks,
>>>
>>> Pete
>>>
>>
>>
>
