hadoop-common-dev mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-171) need standard API to set dfs replication = high
Date Thu, 27 Apr 2006 00:31:03 GMT
    [ http://issues.apache.org/jira/browse/HADOOP-171?page=comments#action_12376616 ] 

Konstantin Shvachko commented on HADOOP-171:
--------------------------------------------

We can implement
    short highReplicationHint()
which would ask the namenode up front what it considers an appropriate
replication level for highly accessed files.
copyFromLocalFile() would then use that value when creating files.
The size of the cluster does not vary often. A 10% variation from nodes
failing and coming back is not a big deal for a value computed as the
sqrt of, or one tenth of, the number of datanodes.
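
To make the sqrt and /10 options concrete, here is a minimal sketch of how such a hint might be computed from the number of live datanodes. It is not Hadoop code: the class ReplicationHintSketch, the methods sqrtHint, tenthHint and clamp, and the idea of passing the datanode count in directly (rather than asking the namenode) are all assumptions made for illustration.

```java
// Hypothetical sketch only; none of these names exist in Hadoop.
public final class ReplicationHintSketch {
    private ReplicationHintSketch() {}

    // Keep the hint between 1 and the number of live datanodes,
    // and within the range of a short.
    static short clamp(long value, int liveDatanodes) {
        long upper = Math.max(1, Math.min(liveDatanodes, Short.MAX_VALUE));
        return (short) Math.max(1, Math.min(value, upper));
    }

    // Hint proportional to the square root of the cluster size.
    static short sqrtHint(int liveDatanodes) {
        long rounded = Math.round(Math.sqrt(Math.max(0, liveDatanodes)));
        return clamp(rounded, liveDatanodes);
    }

    // Hint equal to one tenth of the cluster size (integer division).
    static short tenthHint(int liveDatanodes) {
        return clamp(liveDatanodes / 10, liveDatanodes);
    }

    public static void main(String[] args) {
        // Compare the hints for a cluster that shrinks or grows by 10%.
        for (int nodes : new int[] {900, 1000, 1100}) {
            System.out.println(nodes + " datanodes: sqrt hint = " + sqrtHint(nodes)
                    + ", /10 hint = " + tenthHint(nodes));
        }
    }
}
```

Running main for 900, 1000 and 1100 datanodes shows how little either hint moves when the cluster size changes by 10%, which is the point made above.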

> need standard API to set dfs replication = high
> -----------------------------------------------
>
>          Key: HADOOP-171
>          URL: http://issues.apache.org/jira/browse/HADOOP-171
>      Project: Hadoop
>         Type: New Feature

>   Components: dfs
>     Versions: 0.2
>     Reporter: Doug Cutting
>     Assignee: Konstantin Shvachko

>
> There should be a standard way to indicate that files should be highly replicated, appropriate
> for files that all nodes will read.  This should be settable both on file creation and for
> already-existing files.  Perhaps specifying a particular replication value, like Short.MAX_VALUE,
> or zero, can be used to signal this.  The level should not be constant, but should be relative
> to the cluster size and network topology.  If more nodes are added or if nodes are deleted,
> the actual replication count should increase or decrease.
> Initially, all that is needed is an API to specify this.  It could initially be implemented
> with a constant (e.g., 10) or with something related to the number of datanodes (sqrt?), and
> needn't auto-adjust as the cluster size changes.  That is only the long-term goal.
> When JobClient copies job files (job.xml & job.jar) into the job's filesystem, it
> should specify this replication level.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

