Modified: hadoop/core/branches/branch-0.17/docs/quickstart.html URL: http://svn.apache.org/viewvc/hadoop/core/branches/branch-0.17/docs/quickstart.html?rev=656523&r1=656522&r2=656523&view=diff ============================================================================== --- hadoop/core/branches/branch-0.17/docs/quickstart.html (original) +++ hadoop/core/branches/branch-0.17/docs/quickstart.html Thu May 15 00:03:48 2008 @@ -150,7 +150,10 @@ Mailing Lists
+ Added: hadoop/core/branches/branch-0.17/docs/releasenotes.html URL: http://svn.apache.org/viewvc/hadoop/core/branches/branch-0.17/docs/releasenotes.html?rev=656523&view=auto ============================================================================== --- hadoop/core/branches/branch-0.17/docs/releasenotes.html (added) +++ hadoop/core/branches/branch-0.17/docs/releasenotes.html Thu May 15 00:03:48 2008 @@ -0,0 +1,888 @@ + + +| Issue | +Component | +Notes | +
| + HADOOP-2828 + | ++ conf + | +
+ Removed these deprecated methods in
+ org.apache.hadoop.conf.Configuration:
|
+
| + HADOOP-2410 + | ++ contrib/ec2 + | ++ The command hadoop-ec2 + run has been replaced by hadoop-ec2 launch-cluster + <group> <number of instances>, and hadoop-ec2 + start-hadoop has been removed since Hadoop is started on instance + start up. See http://wiki.apache.org/hadoop/AmazonEC2 + for details. + | +
| + HADOOP-2796 + | ++ contrib/hod + | ++ Added a provision to reliably detect a + failing script's exit code. When the HOD script option + returns a non-zero exit code, look for a script.exitcode + file written to the HOD cluster directory. If this file is present, it + means the script failed with the exit code given in the file. + | +
| + HADOOP-2775 + | ++ contrib/hod + | ++ Added a unit testing framework based on + pyunit to HOD. Developers contributing patches to HOD should now + contribute unit tests along with the patches when possible. + | +
| + HADOOP-3137 + | ++ contrib/hod + | ++ The HOD version is now the same as the Hadoop version. + | +
| + HADOOP-2855 + | ++ contrib/hod + | ++ HOD now handles relative + paths correctly for important HOD options such as the cluster directory, + tarball option, and script file. + | +
| + HADOOP-2899 + | ++ contrib/hod + | ++ HOD now cleans up the HOD generated mapred system directory + at cluster deallocation time. + | +
| + HADOOP-2982 + | ++ contrib/hod + | ++ The number of free nodes in the cluster + is computed using a better algorithm that filters out inconsistencies in + node status as reported by Torque. + | +
| + HADOOP-2947 + | ++ contrib/hod + | ++ The stdout and stderr streams of + daemons are redirected to files that are created under the hadoop log + directory. Users can now send a kill 3 signal to the daemons to get stack traces + and thread dumps for debugging. + | +
| + HADOOP-3168 + | ++ contrib/streaming + | ++ Decreased the frequency of logging + in Hadoop streaming (from every 100 records to every 10,000 records). + | +
| + HADOOP-3040 + | ++ contrib/streaming + | ++ Fixed a critical bug to restore important functionality in Hadoop streaming. If the first character on a line is + the separator, then an empty key is assumed and the whole line is the value. + | +
| + HADOOP-2820 + | ++ contrib/streaming + | +
+ Removed these deprecated classes:
|
+
| + HADOOP-3280 + | ++ contrib/streaming + | ++ Added the + mapred.child.ulimit configuration variable to limit the maximum virtual memory allocated to processes launched by the +Map-Reduce framework. This can be used to control both the Mapper/Reducer +tasks and applications using Hadoop pipes, Hadoop streaming etc. + | +
| + HADOOP-2657 + | ++ dfs + | +Added the new API DFSOutputStream.flush() to + flush all outstanding data to DataNodes. + | +
| + HADOOP-2219 + | ++ dfs + | +
+ Added a new fs -count command for
+ counting the number of bytes, files, and directories under a given path. + + Added a new RPC getContentSummary(String path) to ClientProtocol. + |
+
| + HADOOP-2559 + | ++ dfs + | ++ Changed DFS block placement to + allocate the first replica locally, the second off-rack, and the third + intra-rack from the second. + | +
| + HADOOP-2758 + | ++ dfs + | ++ Improved DataNode CPU usage by 50% while serving data to clients. + | +
| + HADOOP-2634 + | ++ dfs + | ++ Deprecated ClientProtocol's exists() method. Use getFileInfo(String) instead. + | +
| + HADOOP-2423 + | ++ dfs + | ++ Improved FSDirectory.mkdirs(...) performance by about 50% as measured by the NNThroughputBenchmark. + | +
| + HADOOP-3124 + | ++ dfs + | ++ Made the DataNode socket write timeout configurable; however, the configuration variable is undocumented. + | +
| + HADOOP-2470 + | ++ dfs + | +
+ Removed open() and isDir() methods from ClientProtocol without first deprecating them. + + Removed deprecated getContentLength() from ClientProtocol. + + Deprecated isDirectory in DFSClient. Use getFileStatus() instead. + |
+
| + HADOOP-2854 + | ++ dfs + | ++ Removed deprecated method org.apache.hadoop.ipc.Server.getUserInfo(). + | +
| + HADOOP-2239 + | ++ dfs + | ++ Added a new FileSystem, HftpsFileSystem, that allows access to HDFS data over HTTPS. + | +
| + HADOOP-771 + | ++ dfs + | ++ Added a new method to FileSystem API, delete(path, boolean), + and deprecated the previous delete(path) method. + The new method recursively deletes files only if boolean is set to true. + | +
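The recursive flag described for HADOOP-771 can be illustrated on the local file system. The following is only a sketch using java.nio.file, not the Hadoop FileSystem API: the class name RecursiveDeleteSketch is hypothetical, and refusing a non-recursive delete of a non-empty directory with an IOException is an assumption made for the illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical local-file-system analogue of delete(path, boolean).
// This is NOT the Hadoop FileSystem API; it only mirrors the flag's meaning.
class RecursiveDeleteSketch {

    // Returns false if the path does not exist, true once it has been deleted.
    static boolean delete(Path path, boolean recursive) throws IOException {
        if (!Files.exists(path)) {
            return false;
        }
        if (Files.isDirectory(path)) {
            boolean empty;
            try (Stream<Path> children = Files.list(path)) {
                empty = !children.findAny().isPresent();
            }
            // Assumption: a non-empty directory is refused unless recursive is true.
            if (!empty && !recursive) {
                throw new IOException(path + " is a non-empty directory");
            }
            try (Stream<Path> walk = Files.walk(path)) {
                // Reverse order visits children before their parent directories.
                Iterable<Path> deepestFirst =
                        (Iterable<Path>) walk.sorted(Comparator.reverseOrder())::iterator;
                for (Path p : deepestFirst) {
                    Files.delete(p);
                }
            }
            return true;
        }
        Files.delete(path);
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("delete-sketch");
        Files.createFile(dir.resolve("part-00000"));
        try {
            delete(dir, false);
        } catch (IOException expected) {
            System.out.println("non-recursive delete refused: " + expected.getMessage());
        }
        System.out.println("recursive delete succeeded: " + delete(dir, true));
    }
}
```

Making the caller pass the flag explicitly means a directory tree is never removed by accident, which is presumably why the single-argument delete(path) was deprecated.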
| + HADOOP-3239 + | ++ dfs + | ++ Modified org.apache.hadoop.dfs.FSDirectory.getFileInfo(String) to return null when a file is not + found instead of throwing FileNotFoundException. + | +
| + HADOOP-3091 + | ++ dfs + | ++ Enhanced hadoop dfs -put command to accept multiple + sources when destination is a directory. + | +
| + HADOOP-2192 + | ++ dfs + | ++ Modified hadoop dfs -mv to be closer in functionality to + the Linux mv command by removing unnecessary output and returning + an error message when moving nonexistent files/directories. + | +
|
+ |
+
+ dfs + mapred + |
+
+ Added rack awareness for map tasks and moved the rack resolution logic to the
+ NameNode and JobTracker. The administrator can use
+ topology.node.switch.mapping.impl to specify a loadable class that implements
+ the rack resolution logic. The class must implement
+ a method, resolve(List<String> names), where names is the list of
+ DNS names/IP addresses to be resolved. The return value is a list of
+ resolved network paths of the form /foo/rack, where rack is the rackID
+ of the rack the node belongs to and foo is the switch to which multiple racks are
+ connected, and so on. The default implementation, packaged
+ with Hadoop, is org.apache.hadoop.net.ScriptBasedMapping,
+ which loads a script that can be used for rack resolution. The
+ script location is configurable: it is specified by
+ topology.script.file.name and defaults to an empty script. When
+ the script name is empty, /default-rack is returned for all
+ DNS names/IP addresses. The loadable topology.node.switch.mapping.impl gives
+ administrators the flexibility to define how their site's node resolution
+ should happen. |
+
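The resolve(List<String> names) contract described above can be sketched in plain Java. This is a minimal illustration, not the Hadoop interface or ScriptBasedMapping itself: the class name RackMappingSketch is hypothetical, a static lookup table stands in for the topology script, and mapping unknown hosts to /default-rack is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a rack resolution class; it does not implement
// any Hadoop interface and only mirrors the resolve(List<String>) shape.
class RackMappingSketch {
    static final String DEFAULT_RACK = "/default-rack";

    // Assumption: a fixed host-to-path table plays the role of the script.
    private final Map<String, String> table;

    RackMappingSketch(Map<String, String> table) {
        this.table = table;
    }

    // Returns one network path per input name, in the same order.
    List<String> resolve(List<String> names) {
        List<String> paths = new ArrayList<>(names.size());
        for (String name : names) {
            // Assumption: hosts missing from the table fall back to the default rack.
            paths.add(table.getOrDefault(name, DEFAULT_RACK));
        }
        return paths;
    }

    public static void main(String[] args) {
        Map<String, String> table = new HashMap<>();
        table.put("node1.example.com", "/switch1/rack1");
        table.put("10.0.0.2", "/switch1/rack2");

        RackMappingSketch configured = new RackMappingSketch(table);
        System.out.println(configured.resolve(
                Arrays.asList("node1.example.com", "10.0.0.2", "unknown-host")));

        // An empty table behaves like the empty-script case: every name
        // resolves to /default-rack.
        RackMappingSketch empty = new RackMappingSketch(new HashMap<>());
        System.out.println(empty.resolve(Arrays.asList("node1.example.com", "10.0.0.2")));
    }
}
```

Resolving a whole list per call, rather than one name at a time, lets an implementation batch its lookups.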
| + HADOOP-2063 + | ++ fs + | ++ Added a new option -ignoreCrc to fs -get and fs -copyToLocal. The option causes CRC checksums to be + ignored for this command so that corrupt files may be downloaded. + | +
| + HADOOP-3001 + | ++ fs + | ++ Added new Map/Reduce framework + counters that track the number of bytes read and written to HDFS, local, + KFS, and S3 file systems. + | +
| + HADOOP-2027 + | ++ fs + | ++ Added a new FileSystem method getFileBlockLocations to return the number of bytes in each block in a file + via a single RPC to the NameNode. Deprecated getFileCacheHints. + | +
| + HADOOP-2839 + | ++ fs + | ++ Removed deprecated method org.apache.hadoop.fs.FileSystem.globPaths(). + | +
| + HADOOP-2563 + | ++ fs + | ++ Removed deprecated method org.apache.hadoop.fs.FileSystem.listPaths(). + | +
| + HADOOP-1593 + | ++ fs + | ++ Modified FSShell commands to accept non-default paths. Now you can run commands like hadoop dfs -ls hdfs://remotehost1:port/path + and hadoop dfs -ls hdfs://remotehost2:port/path without changing your Hadoop config. + | +
| + HADOOP-3048 + | ++ io + | ++ Added a new API and a default + implementation to convert and restore serializations of objects to strings. + | +
| + HADOOP-3152 + | ++ io + | ++ Added a static method + MapFile.setIndexInterval(Configuration, int interval) so that Map/Reduce + jobs using MapFileOutputFormat can set the index interval. + | +
| + HADOOP-3073 + | ++ ipc + | ++ SocketOutputStream.close() now closes the + underlying channel. This increases compatibility with + java.net.Socket.getOutputStream. + | +
| + HADOOP-3041 + | ++ mapred + | +
+ Deprecated JobConf.setOutputPath and JobConf.getOutputPath. + Deprecated OutputFormatBase. Added FileOutputFormat. Existing output + formats extending OutputFormatBase now extend FileOutputFormat.
+ Added the following methods to FileOutputFormat:
+ |
+
| + HADOOP-3204 + | ++ mapred + | ++ Fixed ReduceTask.LocalFSMerger to handle errors and exceptions better. Prior to this all + exceptions except IOException would be silently ignored. + | +
| + HADOOP-1986 + | ++ mapred + | +
+ Programs that implement the raw
+ Mapper or Reducer interfaces will need modification to compile with this
+ release. For example, +
+ class MyMapper implements Mapper {
+ public void map(WritableComparable key, Writable val,
+ OutputCollector out, Reporter reporter) throws IOException {
+ // ...
+ }
+ // ...
+ }
+
+ will need to be changed to refer to the parameterized type. For example: +
+ class MyMapper implements Mapper<WritableComparable, Writable, WritableComparable, Writable> {
+ public void map(WritableComparable key, Writable val,
+ OutputCollector<WritableComparable, Writable>
+ out, Reporter reporter) throws IOException {
+ // ...
+ }
+ // ...
+ }
+
+ Similarly, implementations of the following raw interfaces will need
+ modification:
+
|
+
| + HADOOP-910 + | ++ mapred + | ++ Reducers now perform merges of + shuffle data (both in-memory and on disk) while fetching map outputs. + Earlier, during shuffle they used to merge only the in-memory outputs. + | +
| + HADOOP-2822 + | ++ mapred + | ++ Removed the deprecated classes org.apache.hadoop.mapred.InputFormatBase + and org.apache.hadoop.mapred.PhasedFileSystem. + | +
| + HADOOP-2817 + | ++ mapred + | ++ Removed the deprecated method + org.apache.hadoop.mapred.ClusterStatus.getMaxTasks() + and the deprecated configuration property mapred.tasktracker.tasks.maximum. + | +
| + HADOOP-2825 + | ++ mapred + | ++ Removed the deprecated method + org.apache.hadoop.mapred.MapOutputLocation.getFile(FileSystem fileSys, Path + localFilename, int reduce, Progressable pingee, int timeout). + | +
| + HADOOP-2818 + | ++ mapred + | ++ Removed the deprecated methods + org.apache.hadoop.mapred.Counters.getDisplayName(String counter) and + org.apache.hadoop.mapred.Counters.getCounterNames(). + Undeprecated the method + org.apache.hadoop.mapred.Counters.getCounter(String counterName). + | +
| + HADOOP-2826 + | ++ mapred + | +
+ Changed the signature of the method
+ public org.apache.hadoop.streaming.UTF8ByteArrayUtils.readLine(InputStream) to
+ UTF8ByteArrayUtils.readLine(LineReader, Text). Since the old
+ signature was removed without being deprecated, any code using the old method must be changed
+ to use the new method.
+ + Removed the deprecated methods org.apache.hadoop.mapred.FileSplit.getFile() + and org.apache.hadoop.mapred.LineRecordReader.readLine(InputStream in, + OutputStream out). + + Made the constructor org.apache.hadoop.mapred.LineRecordReader.LineReader(InputStream in, Configuration + conf) public. + |
+
| + HADOOP-2819 + | ++ mapred + | +
+ Removed these deprecated methods from org.apache.hadoop.mapred.JobConf:
+
|
+
| + HADOOP-3093 + | ++ mapred + | +
+ Added the following public methods to org.apache.hadoop.conf.Configuration:
+
|
+
| + HADOOP-2399 + | ++ mapred + | ++ The key and value objects that are given + to the Combiner and Reducer are now reused between calls. This is much more + efficient, but the user cannot assume the objects remain unchanged after the call returns. + | +
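The practical consequence of the object reuse in HADOOP-2399 is that code which keeps a reference to a value across calls will see it overwritten. The following plain-Java sketch shows the pitfall and the copy-based fix. It does not use the Hadoop API: Box is a hypothetical mutable holder standing in for a Writable value, and reusingIterator is a hypothetical stand-in for the framework's value iterator.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration of value-object reuse; no Hadoop classes involved.
class ReusedValueSketch {

    // Mutable holder standing in for a Writable value.
    static final class Box {
        int value;

        Box copy() {
            Box b = new Box();
            b.value = value;
            return b;
        }
    }

    // Hands out the SAME Box on every call, overwriting its contents each time.
    static Iterator<Box> reusingIterator(final int[] data) {
        final Box shared = new Box();
        return new Iterator<Box>() {
            private int i = 0;

            public boolean hasNext() {
                return i < data.length;
            }

            public Box next() {
                shared.value = data[i++];
                return shared;
            }
        };
    }

    // Buggy pattern: every list entry is the same object, so all entries
    // end up showing the last value seen.
    static List<Integer> collectByReference(int[] data) {
        List<Box> kept = new ArrayList<>();
        for (Iterator<Box> it = reusingIterator(data); it.hasNext();) {
            kept.add(it.next());
        }
        return values(kept);
    }

    // Safe pattern: copy the object before keeping it.
    static List<Integer> collectByCopy(int[] data) {
        List<Box> kept = new ArrayList<>();
        for (Iterator<Box> it = reusingIterator(data); it.hasNext();) {
            kept.add(it.next().copy());
        }
        return values(kept);
    }

    private static List<Integer> values(List<Box> boxes) {
        List<Integer> out = new ArrayList<>();
        for (Box b : boxes) {
            out.add(b.value);
        }
        return out;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3};
        System.out.println(collectByReference(data)); // prints [3, 3, 3]
        System.out.println(collectByCopy(data));      // prints [1, 2, 3]
    }
}
```

Reducers or combiners that buffer values for later use need the copying pattern. Code that consumes each value immediately is unaffected.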
| + HADOOP-3162 + | ++ mapred + | +
+ Deprecated the public methods org.apache.hadoop.mapred.JobConf.setInputPath(Path) and
+ org.apache.hadoop.mapred.JobConf.addInputPath(Path).
+
+ Added the following public methods to org.apache.hadoop.mapred.FileInputFormat:
+ |
+
| + HADOOP-2178 + | ++ mapred + | +
+ Provided a new facility to
+ store job history on DFS. Cluster administrators can now provide either a localFS
+ location or a DFS location using the configuration property
+ mapred.job.history.location to store job history. History will also
+ be logged in a user-specified location if the configuration property
+ mapred.job.history.user.location is specified.
+
+ Removed these classes and method:
+ + Changed the signature of the public method + org.apache.hadoop.mapred.DefaultJobHistoryParser.parseJobTasks(File + jobHistoryFile, JobHistory.JobInfo job) to + DefaultJobHistoryParser.parseJobTasks(String jobHistoryFile, + JobHistory.JobInfo job, FileSystem fs). + Changed the signature of the public method + org.apache.hadoop.mapred.JobHistory.parseHistory(File path, Listener l) + to JobHistory.parseHistoryFromFS(String path, Listener l, FileSystem fs). + |
+
| + HADOOP-2055 + | ++ mapred + | +
+ Users are now provided the ability to specify what paths to ignore when processing the job input directory
+ (apart from the filenames that start with "_" and ".").
+ To do this, two new methods were defined:
+
|
+
| + HADOOP-2116 + | ++ mapred + | ++ Restructured the local job directory on the tasktracker. Users are provided with a job-specific shared directory + (mapred-local/taskTracker/jobcache/$jobid/work) for use as scratch + space, through configuration property and system property + job.local.dir. The directory ../work is no longer available from the task's current working directory. + | +
| + HADOOP-1622 + | ++ mapred + | +
+ Added new command line options for hadoop jar command:
+ hadoop jar -files <comma separated list of files> -libjars <comma + separated list of jars> -archives <comma separated list of + archives> + + where the options have these meanings: + +
|
+
| + HADOOP-2823 + | ++ record + | +
+ Removed the deprecated methods in
+ org.apache.hadoop.record.compiler.generated.SimpleCharStream:
+
|
+
| + HADOOP-2551 + | ++ scripts + | ++ Introduced new environment variables to allow finer grained control of Java options passed to server and + client JVMs. See the new *_OPTS variables in conf/hadoop-env.sh. + | +
| + HADOOP-3099 + | ++ util + | +
+ Added a new -p option to distcp for preserving file and directory status:
+ + -p[rbugp] Preserve status + r: replication number + b: block size + u: user + g: group + p: permission ++ The -p option alone is equivalent to -prbugp + |
+
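The -p[rbugp] syntax described for HADOOP-3099 can be read as a set of attribute flags in which a bare -p selects all of them. The following sketch parses the option that way. It is not distcp's actual parser: the class PreserveFlagsSketch, the Attr enum, and rejecting unknown letters with an exception are all assumptions made for the illustration.

```java
import java.util.EnumSet;

// Hypothetical parser for the -p[rbugp] option string; not distcp's own code.
class PreserveFlagsSketch {

    enum Attr {
        REPLICATION('r'), BLOCK_SIZE('b'), USER('u'), GROUP('g'), PERMISSION('p');

        final char symbol;

        Attr(char symbol) {
            this.symbol = symbol;
        }

        static Attr fromSymbol(char c) {
            for (Attr a : values()) {
                if (a.symbol == c) {
                    return a;
                }
            }
            // Assumption: unknown letters are treated as an error.
            throw new IllegalArgumentException("unknown preserve flag: " + c);
        }
    }

    static EnumSet<Attr> parse(String option) {
        if (!option.startsWith("-p")) {
            throw new IllegalArgumentException("not a -p option: " + option);
        }
        String letters = option.substring(2);
        // "-p" alone is equivalent to "-prbugp": preserve everything.
        if (letters.isEmpty()) {
            return EnumSet.allOf(Attr.class);
        }
        EnumSet<Attr> result = EnumSet.noneOf(Attr.class);
        for (char c : letters.toCharArray()) {
            result.add(Attr.fromSymbol(c));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("-p"));   // all five attributes
        System.out.println(parse("-pug")); // user and group only
        System.out.println(parse("-pp"));  // permission only
    }
}
```

Note that the first p is the option name and any later p is the permission flag, so -pp preserves only permissions.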
| + HADOOP-2821 + | ++ util + | ++ Removed the deprecated classes org.apache.hadoop.util.ShellUtil and org.apache.hadoop.util.ToolBase. + | +