Yes, it is a single-node "all-in-one" Accumulo server at the moment, really just a toy server for me to get familiar with the ecosystem needed to run the GeoMesa iterators (my main interest).

I tracked it down when I noticed that there was an /accumulo directory in the regular file system instead of in HDFS. I suppose the code that loads the value of instance.dfs.dir could first check whether it is a relative path or an absolute URI?
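
Something like this is what I had in mind (just a sketch on my part, not actual Accumulo code; the helper is made up, and only the property values come from my config):

    import java.net.URI;

    public class DfsDirCheck {
        // Hypothetical check: if instance.dfs.dir has no scheme, resolve it
        // against instance.dfs.uri; otherwise use it as-is.
        static URI resolveDfsDir(String instanceDfsUri, String instanceDfsDir) {
            URI dir = URI.create(instanceDfsDir);
            if (dir.getScheme() == null) {
                return URI.create(instanceDfsUri).resolve(instanceDfsDir);
            }
            return dir;
        }

        public static void main(String[] args) {
            // relative dir -> hdfs://localhost:9000/accumulo
            System.out.println(resolveDfsDir("hdfs://localhost:9000/", "/accumulo"));
            // already a full URI -> used as-is
            System.out.println(resolveDfsDir("hdfs://localhost:9000/", "hdfs://localhost:9000/accumulo"));
        }
    }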

Anyway, I'm thinking of going into production for real on EMR using the available bootstrap actions. I realize there are some performance implications to using S3 as the backing file store, but durability and low management overhead are higher priorities for me at the moment.

Thanks again for the responses.

Mike


On Wed, Jan 7, 2015 at 5:19 PM, Christopher <ctubbsii@apache.org> wrote:
Honestly, I'm surprised that worked. This might cause problems elsewhere, because we may assume we can simply concatenate instance.dfs.uri with instance.dfs.dir. If you encounter any issues, please let us know.
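
To illustrate the concern (purely a sketch of the failure mode, not the actual code path in Accumulo):

    public class ConcatExample {
        public static void main(String[] args) {
            String instanceDfsUri = "hdfs://localhost:9000/";          // instance.dfs.uri
            String instanceDfsDir = "hdfs://localhost:9000/accumulo";  // instance.dfs.dir as a full URI
            // Naive concatenation produces a nonsense path when the dir is already a URI:
            System.out.println(instanceDfsUri + instanceDfsDir);
            // prints hdfs://localhost:9000/hdfs://localhost:9000/accumulo
        }
    }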

Also, is this a single-node instance? If not, you should replace "localhost" with your actual hostname, else your tservers won't be able to communicate with the namenode.


--
Christopher L Tubbs II
http://gravatar.com/ctubbsii

On Wed, Jan 7, 2015 at 2:49 PM, Mike Atlas <mike@weft.io> wrote:
I was able to narrow the problem down to my accumulo-site.xml file.

I had the following:

    <property>
      <name>instance.dfs.uri</name>
      <value>hdfs://localhost:9000/</value>
    </property>
    <property>
      <name>instance.dfs.dir</name>
      <value>/accumulo</value>
    </property>

I changed the instance.dfs.dir value to be a full URI, and the "Mkdirs" failure no longer happens, even on recovery bootup.

    <property>
      <name>instance.dfs.dir</name>
      <value>hdfs://localhost:9000/accumulo</value>
    </property>

Thanks for the suggestions everyone.

-Mike


On Tue, Jan 6, 2015 at 9:48 PM, Josh Elser <josh.elser@gmail.com> wrote:
Is HDFS actually healthy? Have you checked the namenode status page (http://$hostname:50070 by default) to make sure the NN is up and out of safemode, expected number of DNs have reported in, hdfs reports available space, etc?
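
If it helps, you can also sanity-check connectivity and free space programmatically (just a sketch, assuming your client's core-site.xml points fs.defaultFS at the namenode; the web UI is still the quickest way to see safemode and live DNs):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.FsStatus;

    public class HdfsHealthCheck {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            System.out.println("Connected to: " + fs.getUri());
            // Basic capacity numbers reported by the filesystem
            FsStatus status = fs.getStatus();
            System.out.println("Capacity:  " + status.getCapacity());
            System.out.println("Used:      " + status.getUsed());
            System.out.println("Remaining: " + status.getRemaining());
        }
    }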

Any other Hadoop details (version, etc) would be helpful too!

Mike Atlas wrote:
Well, I caught the same error again after terminating my machine with a
hard stop, which isn't a normal way to do things, but I had fat-fingered
saving an AMI image of it, thinking I could boot up just fine afterward.

The only workaround I found was to blow away the HDFS /accumulo
directory and re-init my Accumulo instance, which is fine for playing
around, but I'm wondering what exactly is going on? I don't want that
to happen if I go to production with real data.

Thoughts on how to debug?


On Tue, Jan 6, 2015 at 10:40 AM, Keith Turner <keith@deenlo.com> wrote:



    On Mon, Jan 5, 2015 at 6:50 PM, Mike Atlas <mike@weft.io> wrote:

        Hello,

        I'm running Accumulo 1.5.2, trying to test out the GeoMesa
        <http://www.geomesa.org/2014/05/28/geomesa-quickstart/> family
        of spatio-temporal iterators using their quickstart
        demonstration tool. I think my lack of progress is due to my
        Accumulo setup, though, so can someone validate that everything
        looks good from here?

        start-all.sh output:

        hduser@accumulo:~$ $ACCUMULO_HOME/bin/start-all.sh
        Starting monitor on localhost
        Starting tablet servers .... done
        Starting tablet server on localhost
        2015-01-05 21:37:18,523 [server.Accumulo] INFO : Attempting to talk to zookeeper
        2015-01-05 21:37:18,772 [server.Accumulo] INFO : Zookeeper connected and initialized, attemping to talk to HDFS
        2015-01-05 21:37:19,028 [server.Accumulo] INFO : Connected to HDFS
        Starting master on localhost
        Starting garbage collector on localhost
        Starting tracer on localhost

        hduser@accumulo:~$


        I do believe my HDFS is set up correctly:

        hduser@accumulo:/home/ubuntu/geomesa-quickstart$ hadoop fs -ls /accumulo
        Found 5 items
        drwxrwxrwx   - hduser supergroup          0 2014-12-10 01:04 /accumulo/instance_id
        drwxrwxrwx   - hduser supergroup          0 2015-01-05 21:22 /accumulo/recovery
        drwxrwxrwx   - hduser supergroup          0 2015-01-05 20:14 /accumulo/tables
        drwxrwxrwx   - hduser supergroup          0 2014-12-10 01:04 /accumulo/version
        drwxrwxrwx   - hduser supergroup          0 2014-12-10 01:05 /accumulo/wal


        However, when I check the Accumulo monitor logs, I see these
        errors post-startup:

        java.io.IOException: Mkdirs failed to create directory /accumulo/recovery/15664488-bd10-4d8d-9584-f88d8595a07c/part-r-00000
                java.io.IOException: Mkdirs failed to create directory /accumulo/recovery/15664488-bd10-4d8d-9584-f88d8595a07c/part-r-00000
                        at org.apache.hadoop.io.MapFile$Writer.<init>(MapFile.java:264)
                        at org.apache.hadoop.io.MapFile$Writer.<init>(MapFile.java:103)
                        at org.apache.accumulo.server.tabletserver.log.LogSorter$LogProcessor.writeBuffer(LogSorter.java:196)
                        at org.apache.accumulo.server.tabletserver.log.LogSorter$LogProcessor.sort(LogSorter.java:166)
                        at org.apache.accumulo.server.tabletserver.log.LogSorter$LogProcessor.process(LogSorter.java:89)
                        at org.apache.accumulo.server.zookeeper.DistributedWorkQueue$1.run(DistributedWorkQueue.java:101)
                        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
                        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
                        at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
                        at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
                        at java.lang.Thread.run(Thread.java:745)


        I don't really understand: I started Accumulo as hduser, which
        is the same user that has access to the HDFS directory
        /accumulo/recovery, and it looks like the directory actually
        was created, except for the last part (part-r-00000):

        hduser@accumulo:~$ hadoop fs -ls /accumulo/recovery/
        Found 1 items
        drwxr-xr-x   - hduser supergroup          0 2015-01-05 22:11 /accumulo/recovery/87fb7aac-0274-4aea-8014-9d53dbbdfbbc


        I'm not out of physical disk space:

        hduser@accumulo:~$ df -h
        Filesystem      Size  Used Avail Use% Mounted on
        /dev/xvda1     1008G  8.5G  959G   1% /


        What could be going on here? Any ideas on something simple I
        could have missed?


    One possibility is that the tserver where the exception occurred had a
    bad or missing HDFS config. In that case the Hadoop code may try to
    create /accumulo/recovery/.../part-r-00000 on the local filesystem,
    which would fail.
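
    A minimal sketch of what I mean, assuming a Hadoop client Configuration
    that never picked up fs.defaultFS (fs.default.name on Hadoop 1):

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.FileSystem;
        import org.apache.hadoop.fs.Path;

        public class DefaultFsDemo {
            public static void main(String[] args) throws Exception {
                // With no fs.defaultFS configured, Hadoop falls back to file:///,
                // so a scheme-less path like /accumulo/recovery/... resolves to
                // the tserver's *local* filesystem.
                Configuration conf = new Configuration();
                FileSystem fs = FileSystem.get(conf);
                System.out.println(fs.getUri()); // file:/// when the config is missing
                System.out.println(fs.makeQualified(new Path("/accumulo/recovery")));
                // -> file:/accumulo/recovery, where Mkdirs fails unless the
                //    user can write under the local root directory
            }
        }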


        Thanks,
        Mike