incubator-blur-dev mailing list archives

From "Aaron McCurry (JIRA)" <>
Subject [jira] [Commented] (BLUR-397) Improve data loading from M/R
Date Mon, 15 Dec 2014 19:25:14 GMT


Aaron McCurry commented on BLUR-397:

The first phase of this has been committed here:;a=commit;h=22200a3a9008f614d7216c8ffaef55fe3e43c7c5

The only remaining issue is that the table file/directory permissions need to be set up correctly
by Blur itself.  I propose the following permission assignments:

rwxr-xr-x **/<table root>
rwxr-xr-x **/<table root>/<tablename>
rwxr-xr-x **/<table root>/<tablename>/types
rwx------ **/<table root>/<tablename>/shard-N
rwx------ **/<table root>/<tablename>/shard-N/*
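
For reference, here is a small sketch of how the symbolic permissions above translate to numeric modes. It is illustrative only: the helper name symbolic_to_mode is hypothetical and not part of Blur, and the paths are copied verbatim from the list above as plain labels.

```python
# Hypothetical helper, not part of Blur: converts the 9-character
# rwx strings from the proposal into numeric (octal) modes.
PROPOSED = [
    ("rwxr-xr-x", "**/<table root>"),
    ("rwxr-xr-x", "**/<table root>/<tablename>"),
    ("rwxr-xr-x", "**/<table root>/<tablename>/types"),
    ("rwx------", "**/<table root>/<tablename>/shard-N"),
    ("rwx------", "**/<table root>/<tablename>/shard-N/*"),
]


def symbolic_to_mode(perm: str) -> int:
    """Convert a string such as 'rwxr-xr-x' into a mode such as 0o755."""
    if len(perm) != 9:
        raise ValueError(f"expected 9 characters, got {perm!r}")
    mode = 0
    for i, ch in enumerate(perm):
        if ch == "rwx"[i % 3]:
            # Position 0 is the owner read bit (0o400), position 8 is other execute (0o001).
            mode |= 1 << (8 - i)
        elif ch != "-":
            raise ValueError(f"unexpected character {ch!r} in {perm!r}")
    return mode


if __name__ == "__main__":
    for perm, path in PROPOSED:
        print(f"{symbolic_to_mode(perm):04o} {path}")
```

In numeric terms the first three entries are 0755 (owner writes, everyone can read and traverse) and the two shard entries are 0700 (owner only).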

> Improve data loading from M/R
> -----------------------------
>                 Key: BLUR-397
>                 URL:
>             Project: Apache Blur
>          Issue Type: Improvement
>          Components: Blur, Blur MapReduce
>            Reporter: Tim Williams
> There's an awkward permissions dilemma when writing data into Blur from Map/Reduce. 

> A job would typically create a table, then load the data.  The challenge is that the
> table itself is created through the controller, which means it's written to DFS as the user
> actually running the controller daemon - typically 'blur'.  The Map/Reduce job may be run
> as a different user entirely, and it may be a user that you don't want to have write access
> inside blur's directory paths. In other words, you'd like arbitrary user(s) to be able to
> create/populate table data without necessarily having write access to blur's internals.
> One approach is to have the user's job write to any location they have access to, then
> "tell" Blur to 'import' it - at which time, Blur would literally move the data into its control.

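The import approach described in the issue can be sketched roughly as follows. This is not Blur's implementation: it uses the local filesystem as a stand-in for DFS, and the function name import_into_shard and both directory arguments are hypothetical. The idea is only that the job writes to a staging location it owns, and the process running as the 'blur' user later moves the data into the shard directory and restricts it to the owner.

```python
import os
import shutil
import tempfile


def import_into_shard(staging_dir: str, shard_dir: str) -> None:
    """Move every entry from staging_dir into shard_dir, then restrict to the owner.

    Hypothetical sketch: a real version would run against DFS as the user
    that owns the table directories, not against the local filesystem.
    """
    os.makedirs(shard_dir, exist_ok=True)
    for name in os.listdir(staging_dir):
        dest = os.path.join(shard_dir, name)
        shutil.move(os.path.join(staging_dir, name), dest)
        os.chmod(dest, 0o700)  # rwx------ on shard-N/*
    os.chmod(shard_dir, 0o700)  # rwx------ on shard-N


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        staging = os.path.join(root, "staging")
        shard = os.path.join(root, "table", "shard-0")
        os.makedirs(staging)
        with open(os.path.join(staging, "part-00000"), "w") as f:
            f.write("data")
        import_into_shard(staging, shard)
        print(sorted(os.listdir(shard)))
```
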
This message was sent by Atlassian JIRA
