accumulo-notifications mailing list archives

From "Bill Havanki (JIRA)" <>
Subject [jira] [Commented] (ACCUMULO-2876) Unexpected looping from HdfsZooInstance.getInstanceID()
Date Mon, 09 Jun 2014 19:31:03 GMT


Bill Havanki commented on ACCUMULO-2876:

I think the best way to resolve this is to change the use of {{VolumeManagerImpl.get()}} (the
no-argument form) in {{HdfsZooInstance._getInstanceID()}}. That form uses the Accumulo "system"
configuration, which is ZooKeeper-based, as the basis for setup. However, the lookup
probably needs only the site configuration (accumulo-site.xml + defaults). Here's how {{HdfsZooInstance}}
figures out its instance ID under 1.5.x:

String instanceIdFromFile = ZooKeeperInstance.getInstanceIDFromHdfs(ServerConstants.getInstanceIdLocation());

Although this is a static call to the {{ZooKeeperInstance}} class, the method there doesn't
consult ZooKeeper at all. It uses the site configuration. So, I imagine that under 1.6.x the
instance ID can still be found with just the site configuration.

I see two options:
# Change the call to {{VolumeManagerImpl.get(ServerConfiguration.getSiteConfiguration())}}.
# Change {{VolumeManagerImpl.get()}} to always use the site configuration instead of the
system configuration.
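
To illustrate why either option would break the cycle, here is a minimal sketch of the first one. It is a toy model, not Accumulo code: {{SiteConfigSketch}}, its methods, and the {{example.volume.dir}} key are all hypothetical stand-ins. The assumption it encodes is the one stated above, that the site configuration can be built without an instance, so the volume manager never has to ask for the instance ID.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only: every name here is a hypothetical stand-in, not an Accumulo class. */
class SiteConfigSketch {
    /** Counts how often the instance ID lookup is entered. */
    static int instanceIdCalls = 0;

    /** Stand-in for the site configuration: built from local settings, no instance needed. */
    static Map<String, String> siteConfiguration() {
        Map<String, String> conf = new HashMap<>();
        conf.put("example.volume.dir", "/accumulo"); // hypothetical key and value
        return conf;
    }

    /** Stand-in for a volume manager factory that takes its configuration as an argument. */
    static String volumeManagerFor(Map<String, String> conf) {
        return "volume-manager[" + conf.get("example.volume.dir") + "]";
    }

    /** Stand-in for the instance ID lookup: it passes the site configuration explicitly. */
    static String getInstanceID() {
        instanceIdCalls++;
        String volumeManager = volumeManagerFor(siteConfiguration());
        return "instance-id-read-via-" + volumeManager;
    }

    public static void main(String[] args) {
        System.out.println(getInstanceID());
        System.out.println("lookups entered: " + instanceIdCalls);
    }
}
```

Because nothing on this path asks for the system configuration, the lookup is entered once and returns without re-entering itself. The second option would have the same effect in this model, with the site configuration chosen inside the factory rather than by the caller.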

Any thoughts from those more familiar with {{VolumeManager}}? [~ecn][~elserj]

> Unexpected looping from HdfsZooInstance.getInstanceID()
> -------------------------------------------------------
>                 Key: ACCUMULO-2876
>                 URL:
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: client
>    Affects Versions: 1.6.0
>            Reporter: Bill Havanki
>            Assignee: Bill Havanki
>            Priority: Minor
> While working on ACCUMULO-2615, I encountered a weird looping behavior rooted at {{HdfsZooInstance.getInstanceID()}} that seems to accidentally avoid a stack overflow. I'm a bit blocked at the moment on -2615 since my work exposed the loop.
> Here's a summary of the loop. The unit test {{SystemCredentialsTest}} in the tserver module exercises it.
> * Start at {{HdfsZooInstance.getInstanceID()}}, which calls to an internal {{_getInstanceID()}} the first time.
> * A volume manager is needed to find the instance ID in HDFS, so {{VolumeManagerImpl.get()}} is called.
> * That call needs the "system" configuration, so a call to {{ServerConfiguration.getSystemConfiguration()}} is made, passing the {{HdfsZooInstance}} object.
> * The system configuration is a {{ZooConfiguration}} object, and that is created from
> * That factory method creates a {{ZooConfiguration}} object, saved as a static field. The code then tries to get the instance ID for the passed-in instance, which is the {{HdfsZooInstance}} object. So we're back at the top of the loop.
> In the last step of the second iteration of the loop, the factory method sees that the static field for the singleton instance of {{ZooConfiguration}} was set in the first iteration, so it returns it and doesn't look for the instance ID. That stops the looping and the call stack unwinds.
> (My refactoring work has trouble with this because it gets rid of the single static field in favor of a map of objects keyed by instance ID.)
> This loop indicates a mutual dependency between configurations, the volume manager, and {{HdfsZooInstance}} that should be resolved.
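
The loop described in the issue can be modeled with a small self-contained sketch. This is not the Accumulo source: {{LoopSketch}} and its methods are hypothetical stand-ins named after the steps in the summary, and the only behavior modeled is the order of calls and the static field that ends the recursion.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the call cycle; all names are stand-ins, not Accumulo classes. */
class LoopSketch {
    /** Records each call so the order can be inspected afterwards. */
    static final List<String> trace = new ArrayList<>();

    /** Stand-in for the static singleton configuration field. */
    static Object zooConfigSingleton = null;

    /** Cached instance ID, filled in by the first completed lookup. */
    static String instanceId = null;

    static void reset() {
        trace.clear();
        zooConfigSingleton = null;
        instanceId = null;
    }

    /** Stand-in for the public instance ID getter. */
    static String getInstanceID() {
        trace.add("getInstanceID");
        if (instanceId == null) {
            instanceId = readInstanceId();
        }
        return instanceId;
    }

    /** Stand-in for the internal lookup, which needs a volume manager. */
    static String readInstanceId() {
        trace.add("_getInstanceID");
        getVolumeManager();
        return "instance-id-from-hdfs";
    }

    /** Stand-in for the no-argument volume manager getter. */
    static void getVolumeManager() {
        trace.add("VolumeManager.get");
        getSystemConfiguration();
    }

    /** Stand-in for the factory: sets the static field, then asks for the instance ID. */
    static Object getSystemConfiguration() {
        trace.add("getSystemConfiguration");
        if (zooConfigSingleton == null) {
            zooConfigSingleton = new Object(); // set before the ID lookup
            getInstanceID();                   // back to the top of the loop
        }
        return zooConfigSingleton;
    }

    public static void main(String[] args) {
        reset();
        System.out.println(getInstanceID());
        for (String call : trace) {
            System.out.println("  " + call);
        }
    }
}
```

In this model each of the four steps appears twice in the trace: the second pass finds the static field already set and returns without asking for the instance ID again. If the field were only assigned after the ID lookup returned, as could happen with a map keyed by instance ID, the same sketch would recurse until the stack overflowed.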

This message was sent by Atlassian JIRA
