hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "HdfsFutures" by SanjayRadia
Date Thu, 27 Mar 2008 21:18:44 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by SanjayRadia:
http://wiki.apache.org/hadoop/HdfsFutures

------------------------------------------------------------------------------
+ = HDFS Futures =
+ 
- = HDFS Futures: Categorized List and Descriptions of HDFS Future Features =
+ Below is a categorized list and descriptions of HDFS Future Features
  
  
  '''The following page is under development - it is being converted from TWiki to Wiki'''
@@ -38, +40 @@

     * E.g. Partitioning the name space can also improve performance of each NN slave<br />
  
  === Summary of various options that scale name space and performance (details below) ===
-  (Also see [[http://twiki.corp.yahoo.com/pub/Grid/HdfsFeaturePlanning/ScaleNN_Sea_of_Options.pdf][Scaling NN: Sea of Options]])
+  (Also see [http://twiki.corp.yahoo.com/pub/Grid/HdfsFeaturePlanning/ScaleNN_Sea_of_Options.pdf/ Scaling NN: Sea of Options])
     * Grow memory
        * Scales name space but not performance
        * Issue: GC and Java scaling for large memories
@@ -96, +98 @@

     * Move more functionality to data node
           * Distributed replica creation  - not simple
  
-    *  Improve Block report processing [[https://issues.apache.org/jira/browse/HADOOP-2448][HADOOP-2448]]
+    *  Improve Block report processing [https://issues.apache.org/jira/browse/HADOOP-2448/ HADOOP-2448]
             2K nodes mean a block report every 3 sec.<br />
        * Currently: each DN sends a full BR as an array of longs every hour. The initial BR has a random backoff (configurable)
-       * Incremental and Event based B-reports - [[https://issues.apache.org/jira/browse/HADOOP-1079][HADOOP-1079]]
+       * Incremental and Event based B-reports - [https://issues.apache.org/jira/browse/HADOOP-1079/ HADOOP-1079]
           * E.g. when a disk is lost, or blocks are deleted, etc.
           * DN can determine what, if anything, has changed and send only if there are changes
        * Send only checksums
           * NN recalculates the checksum, OR has rolling checksum<br />
-       * Make the initial block report's random backoff dynamically set by the NN when DNs register. - [[https://issues.apache.org/jira/browse/HADOOP-2444][HADOOP-2444]]
+       * Make the initial block report's random backoff dynamically set by the NN when DNs register. - [https://issues.apache.org/jira/browse/HADOOP-2444/ HADOOP-2444]
  
  
     * <br />
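
The incremental and checksum-only block report ideas listed above can be illustrated with a minimal Java sketch. It is not taken from HDFS: the class `BlockReportSketch`, its `Delta` type, and the methods `diff`, `checksum`, and `roll` are hypothetical names chosen for illustration, and block IDs are assumed to be plain `long` values. The sketch shows a DN computing which blocks changed since its last report, and an order-independent checksum that can either be recalculated from scratch or rolled forward from a delta.

```java
import java.util.HashSet;
import java.util.Set;

/** Illustrative only; these names are hypothetical and not part of the HDFS code base. */
final class BlockReportSketch {

    /** Block IDs added and removed since the previous report. */
    static final class Delta {
        final Set<Long> added;
        final Set<Long> removed;

        Delta(Set<Long> added, Set<Long> removed) {
            this.added = added;
            this.removed = removed;
        }

        /** True when nothing changed, so the DN has nothing to send. */
        boolean isEmpty() {
            return added.isEmpty() && removed.isEmpty();
        }
    }

    /** Incremental report: compare the previously reported set with the current one. */
    static Delta diff(Set<Long> previous, Set<Long> current) {
        Set<Long> added = new HashSet<>(current);
        added.removeAll(previous);
        Set<Long> removed = new HashSet<>(previous);
        removed.removeAll(current);
        return new Delta(added, removed);
    }

    /** Full recalculation: XOR of mixed block IDs, independent of iteration order. */
    static long checksum(Set<Long> blocks) {
        long sum = 0L;
        for (long id : blocks) {
            sum ^= mix(id);
        }
        return sum;
    }

    /** Rolling update: XOR is its own inverse, so adds and removes are applied the same way. */
    static long roll(long checksum, Delta delta) {
        for (long id : delta.added) {
            checksum ^= mix(id);
        }
        for (long id : delta.removed) {
            checksum ^= mix(id);
        }
        return checksum;
    }

    /** Bit mixer (SplitMix64-style finalizer) so that nearby IDs do not cancel trivially. */
    private static long mix(long id) {
        long z = id + 0x9E3779B97F4A7C15L;
        z = (z ^ (z >>> 30)) * 0xBF58476D1CE4E5B9L;
        z = (z ^ (z >>> 27)) * 0x94D049BB133111EBL;
        return z ^ (z >>> 31);
    }
}
```

Under these assumptions, a DN would send a report only when `diff(...)` is non-empty, and the NN could compare a DN-supplied checksum against either `checksum(...)` over its own view or a value maintained with `roll(...)`. An XOR-based checksum is a simplification; a real design would need to consider collisions and generation stamps.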
@@ -171, +173 @@

     * Open files are NOT accessible by readers in the event of deletion or renaming
     * Growable Files
        *    via atomic append with multiple writers
-       * Via append with 1 writer [[http://issues.apache.org/jira/secure/QuickSearch.jspa][Hadoop-1700]]
+       * Via append with 1 writer [http://issues.apache.org/jira/browse/HADOOP-1700/ Hadoop-1700]
     * Truncate files
        * Use case for this?
        * Note: truncate and append need to be designed together
@@ -191, +193 @@

  == File IO Performance ==
     * In memory checksum caching (full or partial) on Datanodes (What is this Sameer?)
     * Reduce CPU utilization on IO
-       * Remove double buffering [[http://issues.apache.org/jira/browse/HADOOP-1702][Hadoop-1702]]
+       * Remove double buffering [http://issues.apache.org/jira/browse/HADOOP-1702/ Hadoop-1702]
        * Take advantage of send file
-    * Random access performance [[http://issues.apache.org/jira/browse/HADOOP-2736][Hadoop-2736]]
+    * Random access performance [http://issues.apache.org/jira/browse/HADOOP-2736/ Hadoop-2736]
  
  
  == Namespace Features ==
@@ -245, +247 @@

  
  === RPC Timeouts, Connection handling, Q handling, threading ===
     * When load spikes occur, clients time out and a spiral of death follows<br />
-    * Remove Timeout, Instead Ping to detect server failures [[http://issues.apache.org/jira/browse/HADOOP-2188][HADOOP-2188]]
+    * Remove Timeout, Instead Ping to detect server failures [http://issues.apache.org/jira/browse/HADOOP-2188/ HADOOP-2188]
     * Improve Connection handling, idle connections etc
  
  === Client-side recovery from NN restarts and failovers ===
