From David Patterson <>
Subject Re: Accessing Accumulo from a different machine
Date Thu, 19 Feb 2015 21:21:45 GMT
Josh and anyone else interested,

More data on this problem.

I have tried debugging the code in Eclipse (running it on my Windows
machine). The ZooKeeperInstance is working fine in this remote mode. I can
query the instance and get the instanceID, instance name, zookeepers
string, and session timeout.

I've also tried creating a ZooCache and a UUID object with the long string
value of my actual instance identification.  If I do
String instanceName = ZooKeeperInstance.lookupInstance( zooCache, uuid);
It is able to return the string name of the instance. So, that part of the
communication seems to be fine.
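
For comparing instance IDs by hand, here is a minimal JDK-only sketch. It
is not Accumulo API, and the class and method names are made up for
illustration. It assumes both values are plain UUID strings (for example
the one stored in the ZooKeeper node and the one shown in the Accumulo
shell) and parses them before comparing, so stray whitespace or letter
case does not produce a false mismatch.

```java
import java.util.UUID;

// Hypothetical helper for illustration only; not part of Accumulo.
class InstanceIdCheck {

    // Parses both strings as UUIDs and compares the parsed values.
    // Returns false if either string is not a well-formed UUID.
    static boolean sameInstanceId(String expected, String actual) {
        try {
            UUID a = UUID.fromString(expected.trim());
            UUID b = UUID.fromString(actual.trim());
            return a.equals(b);
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Usage: java InstanceIdCheck <id-from-zookeeper> <id-from-shell>
        System.out.println("same instance ID: " + sameInstanceId(args[0], args[1]));
    }
}
```

Both arguments are placeholders for whatever two IDs are being compared.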

The hang-up is still coming on the instance.getConnector( username, new
PasswordToken( password));

It hangs, and when I ran my code in debug mode on Eclipse, I interrupted it
while it was doing nothing.

I see a long string of calls going from ZooKeeperInstance.getConnector
to ConnectorImpl constructor
to ServerClient.execute
to ServerClient.executeRaw
to ServerClient.getConnection(Instance)
to ServerClient.getConnection(Instance, boolean)
to ServerClient.getConnection(Instance, boolean, long)
to ThriftTransportPool.getAnyTransport(List<ThriftTransportKey>, boolean)

At this point, I see that the ThriftTransportKey has a host name:
"localhost" and a port of "9997".

From there, it goes to ThriftUtil.createClientTransport,
TTimeoutTransport.create(HostAndPort), TTimeoutTransport(SocketAddress,
SocketAdapter.connect(SocketAddress), SocketAdapter.connect(SocketAddress,
int), SocketChannelImpl.connect( SocketAddress),
Net.connect(FileDescriptor, InetAddress,int),
Net.connect(ProtocolFamily,FileDescriptor, InetAddress, int) and finally
Net.connect0(boolean, FileDescriptor, InetAddress, int)
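
Because the transport key names "localhost", two things can be checked
from the client machine with nothing but the JDK. The sketch below is not
Accumulo code, and the class and method names are hypothetical. It reports
whether a host name resolves to a loopback address on the machine running
it, and whether a TCP connection to host:port succeeds within a timeout
instead of blocking. The defaults in main are the host and port seen in
the debugger.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.UnknownHostException;

// Hypothetical diagnostic helper for illustration only; not part of Accumulo.
class EndpointCheck {

    // True if the name resolves to a loopback address on THIS machine.
    static boolean resolvesToLoopback(String host) {
        try {
            return InetAddress.getByName(host).isLoopbackAddress();
        } catch (UnknownHostException e) {
            return false; // an unresolvable name is not loopback
        }
    }

    // True if a TCP connection can be opened within timeoutMillis.
    // Refused connections, timeouts, and unknown hosts all return false.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 9997;
        System.out.println(host + " resolves to loopback: " + resolvesToLoopback(host));
        System.out.println(host + ":" + port + " reachable within 5s: "
                + canConnect(host, port, 5000));
    }
}
```

If "localhost" resolves to loopback on the client and nothing there is
listening on 9997, the client would be connecting to itself rather than
to the Accumulo server.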

I guess I don't understand why this is going into Thrift code.

Is there some authorization I need to provide to let me do a remote
connection into Accumulo (Zookeeper seems happy to work, but is Accumulo
stopping me?)?

If anyone wants line numbers, etc. I can supply more info.

Dave Patterson

On Wed, Feb 18, 2015 at 10:20 AM, Josh Elser <> wrote:

> > a)  a copy of Zookeeper running on the machine from which I'm calling
> > for data
> > b) call the "local" zookeeper for data and let it connect to the remote
> > node for the data?
> No, a ZooKeeper server does not have to be machine local for you to use
> it. It just has to be reachable on the network.
> I'm sorry to say, I'm kind of at a loss. I'm not sure what you are running
> into. You could try remote debugging your application on the "other" cloud
> machine to see how exactly your code is converting the instance name into
> the instanceID (and confirm that the value in the TCredentials object is,
> in fact, different than what you expect it to be).
> As for your local windows machine, I know some people have connected to
> Accumulo from Windows before, but it is a YMMV platform. Hopefully it works
> just fine because it's Java under the hood, but we have no tests to
> guarantee that this does work.
> David Patterson wrote:
>> Josh, thanks for your help.
>> 1) Running on the machine that has the accumulo/hadoop/zookeeper code,
>> in the accumulo shell for the user name "dave" I see the UUID for my
>> instance.
>> 2) Running on the "other" machine, launching the zookeeper client,
>> pointing to the ip address of the server and issuing the get
>> /accumulo/instances/{my-instance-name}, I see the same UUID for the
>> instance.
>> 3) Running on the "other" machine, when I run my java code to connect to
>> the remote machine with the proper instance name, userid and password, I
>> get the INVALID_INSTANCEID as described in detail above.
>> 4) Running on my normal machine (Windows) running eclipse where I've
>> developed the code, if I run the code as a Java Application, it hangs.
>> 5) Running on my windows machine, if I debug the application, I can
>> interrupt it when it hangs up and it is waiting on the line with
>>       Connector connector = instance.getConnector( acUserName,
>>           new PasswordToken( acPassword));
>> Can my application create a connector to a remote machine's
>> ZookeeperInstance and reference it from "afar"? Do I have to have:
>> a)  a copy of Zookeeper running on the machine from which I'm calling
>> for data
>> b) call the "local" zookeeper for data and let it connect to the remote
>> node for the data?
>> The code I'm writing receives a row identifier as a String parameter,
>> creates a Scanner, sets the range to a single row (same value for both
>> ends of the range) and iterates over the (one and only) row.
>> I'm using Accumulo 1.6.1, Hadoop 2.6.0, and zookeeper 3.4.6, Java 7
>> (Oracle). The two cloud machines are running Ubuntu 14.04.
>> Thanks.
>> Dave
>> On Tue, Feb 17, 2015 at 5:24 PM, Josh Elser <> wrote:
>>     Oops, sorry. I used '>' to denote the shell prompt. The bits below
>>     where it converted them to a quote is just meant to denote commands
>>     that are run inside the zkCli :)
>>     Josh Elser wrote:
>>         If you're using the same exact code on both machines, it sounds
>>         like you might have something unexpected going on with your
>>         networking.
>>         Accumulo can share ZooKeeper and HDFS instances -- it uses the
>>         notion of an InstanceID to do this. The InstanceID is a UUID
>>         assigned to an Accumulo instance during `accumulo init`. Because
>>         a UUID is hard to memorize, and you need to identify the Accumulo
>>         instance you want to connect to in the client API, there is also
>>         a mapping of some 'easy-to-remember' name to that UUID. For
>>         example 'daves_accumulo' maps to
>>         '12345678-1234-1234-123456789012'.
>>         The error you're seeing is because the UUID your client found
>>         from the `instanceName` is different than the instanceID the
>>         Accumulo server has.
>>         A quick sanity check is to look at ZooKeeper:
>> -server your_zk_host:2181
>>             get /accumulo/instances/your_instance_name
>>         Compare the value of that node (first line of output) with the
>>         instance ID displayed on the Accumulo monitor (top of the page).
>>         They should be the same.
>>         I don't think I've ever seen this personally, so I can only
>>         guess at how it happened. It's possible you might have
>>         networking messed up and are talking to a different ZooKeeper
>>         than you think you are (common problem if you have misconfigured
>>         a quorum and each ZK node is acting independently instead of
>>         together). A quick fix would be to change the node in ZK to the
>>         correct instance ID.
>> -server your_zk_host:2181
>>             delete /accumulo/instances/your_instance_name
>>             create /accumulo/instances/your_instance_name instance_id_from_monitor
>>         If that doesn't help, please give us some more information
>>         (versions you're using, how you set up the system, anything
>>         special you did).
>>         David Patterson wrote:
>>             I'm running a very simple test configuration on an Ubuntu 14
>>             machine. If I run code on that machine I can read the data
>>             I've added.
>>             I'm only using a column family name, (empty_text for the
>>             qualifier) and a value -- no authorizations.
>>             When I run the exact same program (identical jar) on another
>>             Ubuntu 14 machine, I get
>>             org.apache.accumulo.core.client.AccumuloSecurityException: Error
>>             INVALID_INSTANCEID for user dave - Unknown security exception
>>             at org.apache.accumulo.core.client.impl.ServerClient.execute(
>>             at org.apache.accumulo.core.client.impl.ConnectorImpl.<init>(
>>             at org.apache.accumulo.core.client.ZooKeeperInstance.getConnector(
>>             at<init>(
>>             at
>>             Caused by: ThriftSecurityException(user:dave,
>>             code:INVALID_INSTANCEID)
>>             The error occurs on the instance.getConnector call (the
>>             second line below)
>>             instance = new ZooKeeperInstance(instanceName, zooServers);
>>             connector = instance.getConnector( acUserName, new
>>             PasswordToken( acPassword));
>>             One possible source for strangeness is that both of these
>>             machines are on a cloud server. Each of them has 2 ip
>>             addresses -- one that is available from the outside, and one
>>             that is available only inside the cloud. I'm using the
>>             outside-the-cloud ip address in the zooServers string.
>>             The /etc/hosts file on the machine with the Accumulo data
>>             has the
>>             external ip address as the name of the machine. It also has
>>             defined as localhost.
>>             Any suggestions?
>>             Dave Patterson

