incubator-blur-dev mailing list archives

From "Tim Williams (JIRA)" <>
Subject [jira] [Commented] (BLUR-397) Improve data loading from M/R
Date Tue, 16 Dec 2014 13:31:13 GMT


Tim Williams commented on BLUR-397:

Regarding the permissions question, I think making it configurable would be best, maybe even
configurable per-table.  Essentially, let the user define a POSIX-style permission short for
the key paths, plus a default for the system?
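
To make the idea concrete, here is a minimal sketch of per-table permission shorts with a
system-wide fallback. The class and method names (TablePermissionConfig, setTablePermission,
permissionFor) are hypothetical and are not part of Blur's API; the sketch only assumes that
permissions are supplied as octal strings such as "750".

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: per-table POSIX permission shorts with a system default. */
final class TablePermissionConfig {
    private final short systemDefault;
    private final Map<String, Short> perTable = new HashMap<>();

    TablePermissionConfig(String defaultOctal) {
        this.systemDefault = parseOctal(defaultOctal);
    }

    /** Parses an octal string such as "750" into a permission short (0..0777). */
    static short parseOctal(String octal) {
        short mode = Short.parseShort(octal, 8);
        if (mode < 0 || mode > 0777) {
            throw new IllegalArgumentException("permission out of range: " + octal);
        }
        return mode;
    }

    /** Overrides the system default for a single table. */
    void setTablePermission(String table, String octal) {
        perTable.put(table, parseOctal(octal));
    }

    /** Returns the table's own permission, or the system default if none is set. */
    short permissionFor(String table) {
        return perTable.getOrDefault(table, systemDefault);
    }

    /** Renders a permission short symbolically, e.g. 0750 becomes "rwxr-x---". */
    static String toSymbolic(short mode) {
        char[] flags = {'r', 'w', 'x'};
        StringBuilder sb = new StringBuilder(9);
        for (int bit = 8; bit >= 0; bit--) {
            boolean set = (mode & (1 << bit)) != 0;
            sb.append(set ? flags[(8 - bit) % 3] : '-');
        }
        return sb.toString();
    }
}
```

A table with no explicit setting falls back to the system default, so an operator would only
need to configure the exceptions.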

I think not letting the M/R process write directly into Blur's FS space is good, and I wonder
if we should make that a general constraint within the system: the shard server is the only
component allowed to write to the table-space of the filesystem; all other components should
delegate/go through the front door?

> Improve data loading from M/R
> -----------------------------
>                 Key: BLUR-397
>                 URL:
>             Project: Apache Blur
>          Issue Type: Improvement
>          Components: Blur, Blur MapReduce
>            Reporter: Tim Williams
> There's an awkward permissions dilemma when writing data into Blur from Map/Reduce.
> A job would typically create a table, then load the data.  The challenge is that the
> table itself is created through the controller, which means it's written to DFS as the user
> actually running the controller daemon - typically 'blur'.  The Map/Reduce job may be run
> as some other user entirely, and it may be a user that you don't want to have write access
> inside blur's directory paths. In other words, you'd like arbitrary user(s) to be able to
> create/populate table data without necessarily having write access to blur's internals.
> One approach is to have the user's job write to any location they have access to, then
> "tell" Blur to 'import' it - at which time, Blur would literally move the data into its control.

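The 'import' approach described in the issue could look roughly like the following sketch.
It uses the local filesystem via java.nio as a stand-in for the DFS, and the names
(ImportSketch, importInto) are hypothetical rather than anything in Blur. The assumption is
that the job writes to a staging directory it owns, and a Blur-side process, running as the
'blur' user, then moves that directory under the table path.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical sketch of "write elsewhere, then ask Blur to import". */
final class ImportSketch {
    /**
     * Moves a staged output directory into the table's directory.
     * Intended to run as the service user, so the job's user never
     * needs write access to the table space itself.
     */
    static Path importInto(Path stagedDir, Path tableDir) throws IOException {
        if (!Files.isDirectory(stagedDir)) {
            throw new IllegalArgumentException("not a directory: " + stagedDir);
        }
        Files.createDirectories(tableDir);
        Path target = tableDir.resolve(stagedDir.getFileName());
        // A rename-style move; requires both paths on the same filesystem.
        return Files.move(stagedDir, target, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

Because the move is a rename rather than a copy, the staged data either lands under the
table path as a whole or stays where the job wrote it.
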
