accumulo-notifications mailing list archives

From "Josh Elser (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (ACCUMULO-3704) Localize client configuration for MapReduce
Date Wed, 01 Apr 2015 18:32:55 GMT

     [ https://issues.apache.org/jira/browse/ACCUMULO-3704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Josh Elser updated ACCUMULO-3704:
---------------------------------
    Issue Type: Bug  (was: Improvement)

> Localize client configuration for MapReduce
> -------------------------------------------
>
>                 Key: ACCUMULO-3704
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3704
>             Project: Accumulo
>          Issue Type: Bug
>          Components: client, mapreduce
>            Reporter: Josh Elser
>            Assignee: Josh Elser
>            Priority: Blocker
>             Fix For: 1.7.0
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Backstory is that I had a Kerberized Hadoop node and was running ContinuousVerify on it.
> The job launched successfully, but the mappers hung, unable to authenticate with the TabletServers. I knew that I had the configuration (mostly) right, because the Tool (client code) was able to fetch the split points for the job: the mappers were just unable to read from Accumulo.
> The Tool was able to talk to Accumulo because ACCUMULO_CONF_DIR was correctly set by config.sh (called from tool.sh). However, environment variables from the Tool are not passed into the child mappers/reducers. As such, the mappers could only guess at a few locations where the client configuration file might be. In my case, they did not guess correctly. This boils down to the following:
> 1. Client launches job with correct environment
> 2. Mappers reliably fail to talk to Accumulo
> [~billie.rinaldi] suggested that we localize the client configuration in the Job itself. I think the easiest way to do this is to construct a ClientConfiguration in the Tool, serialize it as a properties file, and add it to the distributed cache.
> Then, when we construct the RecordReader, we can search for that file first, and then fall back to loading the default. This should make for a seamless experience for users and removes the need for Accumulo configuration on all YARN nodes.
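
The proposal above can be sketched with plain JDK classes. This is a minimal illustration only, not the patch attached to the issue: the class name, the file name `localized-client.conf`, and the method names are hypothetical, and `java.util.Properties` stands in for Accumulo's ClientConfiguration. The property keys are used purely as sample data.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

public class LocalizedClientConf {
    // Hypothetical name for the localized copy; not an Accumulo constant.
    public static final String LOCALIZED_NAME = "localized-client.conf";

    /** Tool side: write the already-resolved client settings as a properties file. */
    public static Path serialize(Properties clientConf, Path dir) throws IOException {
        Path out = dir.resolve(LOCALIZED_NAME);
        try (OutputStream os = Files.newOutputStream(out)) {
            clientConf.store(os, "client configuration shipped with the job");
        }
        return out;
    }

    /** Task side: prefer the localized file, otherwise fall back to the defaults. */
    public static Properties load(Path taskWorkingDir, Properties defaults) throws IOException {
        Path localized = taskWorkingDir.resolve(LOCALIZED_NAME);
        if (!Files.isReadable(localized)) {
            return defaults; // no localized copy found: behave as before
        }
        Properties props = new Properties();
        try (InputStream is = Files.newInputStream(localized)) {
            props.load(is);
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        // Settings the Tool resolved from its own (correct) environment.
        Properties fromTool = new Properties();
        fromTool.setProperty("instance.zookeeper.host", "zk1:2181");
        fromTool.setProperty("instance.rpc.sasl.enabled", "true");

        // A temp directory stands in for the task's working directory.
        Path workDir = Files.createTempDirectory("job-cache");
        serialize(fromTool, workDir);

        // What a mapper would see after localization.
        Properties seenByTask = load(workDir, new Properties());
        System.out.println(seenByTask.getProperty("instance.rpc.sasl.enabled"));
    }
}
```

In a real job, the serialized file would have to be shipped to the tasks, for example through Hadoop's distributed cache (`Job.addCacheFile(URI)`), rather than written to a local temp directory as in this sketch.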



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
