accumulo-notifications mailing list archives

From "Josh Elser (JIRA)" <>
Subject [jira] [Commented] (ACCUMULO-2234) Cannot run offline mapreduce over non-default instance.dfs.dir value
Date Thu, 23 Jan 2014 21:49:38 GMT


Josh Elser commented on ACCUMULO-2234:

bq. Implementation should not add a dependency on server configuration files which cannot
be assumed to be known by the launching process. It should use conn.instanceOperations().getSiteConfiguration()
to get the configuration via thrift, without additional classpath dependencies on server configuration.

They *can* be assumed to be known by the launching process, as ACCUMULO_CONF_DIR is expected
to be set for other continuous ingest tests. Can instance.dfs.dir be pulled from a Connector/Instance
without having it specified by the accumulo-site.xml? If so, I was unaware of this.
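
For illustration, a minimal sketch of the lookup the quoted comment suggests. It assumes the map returned by conn.instanceOperations().getSiteConfiguration() contains an instance.dfs.dir entry, and that the default value is /accumulo. DfsDirLookup and resolveDfsDir are hypothetical names, not part of Accumulo, and the sketch takes a plain Map so it can be tried without a running instance.

```java
import java.util.Map;

/** Hypothetical helper for illustration; not an Accumulo class. */
final class DfsDirLookup {
    // Property name as used in the ticket.
    static final String DFS_DIR_KEY = "instance.dfs.dir";
    // Assumed default value of instance.dfs.dir.
    static final String DEFAULT_DFS_DIR = "/accumulo";

    private DfsDirLookup() {}

    /**
     * Returns the configured instance.dfs.dir, or the assumed default when
     * the key is missing or blank. In real use, siteConfig would be the map
     * obtained from conn.instanceOperations().getSiteConfiguration().
     */
    static String resolveDfsDir(Map<String, String> siteConfig) {
        String value = siteConfig.get(DFS_DIR_KEY);
        if (value == null || value.trim().isEmpty()) {
            return DEFAULT_DFS_DIR;
        }
        return value.trim();
    }
}
```

If the server does report the property this way, the launching process would not need accumulo-site.xml on its classpath for this one value.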

However, IMO, there is still no downside to this, given that this is a system test that already
expects accumulo-site.xml to be present. This works as is. I would be inclined to say that you
should open a different ticket for this, as these changes do successfully address the missing
functionality in a way that I do not see any issue with.

bq. Also, is this really a blocker?

I could not run a required test for a release as I should have been able to. So, yes, this is
a blocker to me. If we say you should be able to do something that affects a release, but
it is impossible to do so, that's a blocker. If you don't agree, you have the ability to change
the priority of this ticket.

> Cannot run offline mapreduce over non-default instance.dfs.dir value
> --------------------------------------------------------------------
>                 Key: ACCUMULO-2234
>                 URL:
>             Project: Accumulo
>          Issue Type: Bug
>    Affects Versions: 1.4.4, 1.5.0
>            Reporter: Josh Elser
>            Assignee: Josh Elser
>            Priority: Blocker
>             Fix For: 1.4.5, 1.5.1, 1.6.0
> The javadoc for setting up offline scans over RFiles (InputFormatBase.setScanOffline
in 1.4 or InputFormatBase.setOfflineTableScan in 1.5) includes a nice little comment to the
effect that if a "non-standard" directory is used for Accumulo in HDFS (read as, if the default
value for instance.dfs.dir is not used), accumulo-site.xml may need to be on the classpath for the mappers.
> Best as I can tell, even if accumulo-site.xml is on the classpath, it makes no difference
as InputFormatBase is creating a new ZooKeeperInstance which, in turn, will only ever make
a DefaultConfiguration and never try to check if an accumulo-site.xml file is available. This
would make it impossible for a non-default value for instance.dfs.dir to ever be used.
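
To make the missing step concrete, here is a rough sketch of reading instance.dfs.dir from an accumulo-site.xml found on the classpath. It assumes the file uses Hadoop-style property XML (configuration/property/name/value). SiteXmlReader and its methods are hypothetical names for illustration only; this is not how InputFormatBase or ZooKeeperInstance behave, and it is not necessarily how the ticket was resolved.

```java
import java.io.IOException;
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical reader for illustration; not an Accumulo class. */
final class SiteXmlReader {
    private SiteXmlReader() {}

    /** Returns the value of the named property in the XML, or fallback if absent. */
    static String readProperty(InputStream xml, String key, String fallback) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().parse(xml);
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element prop = (Element) props.item(i);
                NodeList names = prop.getElementsByTagName("name");
                NodeList values = prop.getElementsByTagName("value");
                if (names.getLength() == 0 || values.getLength() == 0) {
                    continue; // skip malformed property entries
                }
                if (key.equals(names.item(0).getTextContent().trim())) {
                    return values.item(0).getTextContent().trim();
                }
            }
            return fallback;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse site configuration", e);
        }
    }

    /** Looks for accumulo-site.xml on the classpath; returns fallback if it is not there. */
    static String readFromClasspath(String key, String fallback) {
        try (InputStream in = SiteXmlReader.class.getResourceAsStream("/accumulo-site.xml")) {
            return in == null ? fallback : readProperty(in, key, fallback);
        } catch (IOException e) {
            throw new IllegalStateException("Could not close site configuration", e);
        }
    }
}
```

A caller could use readFromClasspath("instance.dfs.dir", "/accumulo"), where the /accumulo fallback is an assumed default, so that a site file on the mapper classpath actually changes the directory used.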

