hadoop-mapreduce-issues mailing list archives

From "Koji Noguchi (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1839) HadoopArchives should provide a way to configure replication
Date Thu, 03 Jun 2010 17:58:58 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12875222#action_12875222 ]

Koji Noguchi commented on MAPREDUCE-1839:

bq. I tested it yesterday on Hadoop 0.20 and it doesn't work. 

Could you clarify what didn't work?
If the mapreduce archive job failed with an unknown parameter, maybe you don't have MAPREDUCE-826,
which makes the archive tool run through ToolRunner.

I tried just now and got:

% hadoop dfs -lsr mytest1.har                  
-rw-------   5 knoguchi users        947 2010-06-03 17:47 /user/knoguchi/mytest1.har/_index
-rw-------   5 knoguchi users         23 2010-06-03 17:47 /user/knoguchi/mytest1.har/_masterindex
-rw-------   2 knoguchi users      68064 2010-06-03 17:46 /user/knoguchi/mytest1.har/part-0

Replication of the part file (part-0) was successfully set to 2; the two index files show 5.
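
For anyone checking their own archive: in the listing above, the second column is the replication factor. A minimal Python sketch, assuming the column layout shown in that output, pulls the factor out per path. The helper name replication_by_path is hypothetical and is not part of Hadoop.

```python
# Sample listing copied from the -lsr output above.
LISTING = """\
-rw-------   5 knoguchi users        947 2010-06-03 17:47 /user/knoguchi/mytest1.har/_index
-rw-------   5 knoguchi users         23 2010-06-03 17:47 /user/knoguchi/mytest1.har/_masterindex
-rw-------   2 knoguchi users      68064 2010-06-03 17:46 /user/knoguchi/mytest1.har/part-0
"""


def replication_by_path(listing):
    """Map each file path to its replication factor (hypothetical helper)."""
    result = {}
    for line in listing.splitlines():
        # Assumed columns: perms, replication, owner, group, size, date, time, path
        fields = line.split(None, 7)
        if len(fields) != 8 or not fields[1].isdigit():
            continue  # skip blank lines and entries without a numeric replication
        result[fields[7]] = int(fields[1])
    return result


for path, repl in sorted(replication_by_path(LISTING).items()):
    print(repl, path)
```

Piping the real output of hadoop dfs -lsr into the same function should work as long as the listing format matches the one shown here.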

Maybe you're talking about the replication shown when doing listStatus on the files inside
the har?

When I do 
hadoop dfs -lsr har:///user/knoguchi/mytest1.har , it shows 
-rw-------   5 knoguchi users      17018 2010-06-03 17:47 /user/knoguchi/mytest1.har/tmptmp/abc

This is because the permission and replication factor are simply taken from the _index file.  This
is fixed in MAPREDUCE-1628.

> HadoopArchives should provide a way to configure replication
> ------------------------------------------------------------
>                 Key: MAPREDUCE-1839
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1839
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: harchive
>    Affects Versions: 0.20.1
>            Reporter: Ramkumar Vadali
>            Priority: Minor
> When creating HAR archives, the part files use the default replication of the filesystem.
> This should be made configurable through either the configuration file or command line.

