hadoop-common-commits mailing list archives

From ma...@apache.org
Subject svn commit: r1302216 - in /hadoop/common/branches/branch-1: CHANGES.txt src/docs/releasenotes.html
Date Sun, 18 Mar 2012 20:17:48 GMT
Author: mattf
Date: Sun Mar 18 20:17:48 2012
New Revision: 1302216

URL: http://svn.apache.org/viewvc?rev=1302216&view=rev
Log:
preparing for release 1.0.2

Modified:
    hadoop/common/branches/branch-1/CHANGES.txt   (contents, props changed)
    hadoop/common/branches/branch-1/src/docs/releasenotes.html

Modified: hadoop/common/branches/branch-1/CHANGES.txt
URL: http://svn.apache.org/viewvc/hadoop/common/branches/branch-1/CHANGES.txt?rev=1302216&r1=1302215&r2=1302216&view=diff
--- hadoop/common/branches/branch-1/CHANGES.txt (original)
+++ hadoop/common/branches/branch-1/CHANGES.txt Sun Mar 18 20:17:48 2012
@@ -161,7 +161,7 @@ Release 1.1.0 - unreleased
     MAPREDUCE-2835. Make per-job counter limits configurable. (tomwhite)
-Release 1.0.2 - unreleased
+Release 1.0.2 - 2012.03.18

Propchange: hadoop/common/branches/branch-1/CHANGES.txt
  Merged /hadoop/common/branches/branch-1.0/CHANGES.txt:r1302214

Modified: hadoop/common/branches/branch-1/src/docs/releasenotes.html
URL: http://svn.apache.org/viewvc/hadoop/common/branches/branch-1/src/docs/releasenotes.html?rev=1302216&r1=1302215&r2=1302216&view=diff
--- hadoop/common/branches/branch-1/src/docs/releasenotes.html (original)
+++ hadoop/common/branches/branch-1/src/docs/releasenotes.html Sun Mar 18 20:17:48 2012
@@ -2,7 +2,7 @@
 <META http-equiv="Content-Type" content="text/html; charset=UTF-8">
-<title>Hadoop 1.0.0 Release Notes</title>
+<title>Hadoop 1.0.2 Release Notes</title>
 <STYLE type="text/css">
 		H1 {font-family: sans-serif}
 		H2 {font-family: sans-serif; margin-left: 7mm}
@@ -10,10 +10,231 @@
-<h1>Hadoop 1.0.0 Release Notes</h1>
+<h1>Hadoop 1.0.2 Release Notes</h1>
 		These release notes include new developer and user-facing incompatibilities, features,
and major improvements. 
 <a name="changes"/>
+<h2>Changes since Hadoop 1.0.1</h2>
+<h3>Jiras with Release Notes (describe major or incompatible changes)</h3>
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-1722">HADOOP-1722</a>.
+     Major improvement reported by runping and fixed by klbostee <br>
+     <b>Make streaming to handle non-utf8 byte array</b><br>
+     <blockquote>                                              Streaming allows binary
(or other non-UTF8) streams.
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-3851">MAPREDUCE-3851</a>.
+     Major bug reported by kihwal and fixed by tgraves (tasktracker)<br>
+     <b>Allow more aggressive action on detection of the jetty issue</b><br>
+     <blockquote>                    added new configuration variables to control when
TT aborts if it sees a certain number of exceptions:
+&nbsp;&nbsp;&nbsp;&nbsp;// Percent of shuffle exceptions (out of sample size)
seen before it&#39;s
+&nbsp;&nbsp;&nbsp;&nbsp;// fatal - acceptable values are from 0 to 1.0, 0
disables the check.
+&nbsp;&nbsp;&nbsp;&nbsp;// ie. 0.3 = 30% of the last X number of requests
matched the exception,
+&nbsp;&nbsp;&nbsp;&nbsp;// so abort.
+&nbsp;&nbsp;&nbsp;&nbsp;// The number of trailing requests we track, used
for the fatal
+&nbsp;&nbsp;&nbsp;&nbsp;// limit calculation
+<h3>Other Jiras (describe bug fixes and minor changes)</h3>
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-5450">HADOOP-5450</a>.
+     Blocker improvement reported by klbostee and fixed by klbostee <br>
+     <b>Add support for application-specific typecodes to typed bytes</b><br>
+     <blockquote>For serializing objects of types that are not supported by typed bytes
serialization, applications might want to use a custom serialization format. Right now, typecode
0 has to be used for the bytes resulting from this custom serialization, which could lead
to problems when deserializing the objects because the application cannot know if a byte sequence
following typecode 0 is a customly serialized object or just a raw sequence of bytes. Therefore,
a range of typecodes that are treated as ali...</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-7206">HADOOP-7206</a>.
+     Major new feature reported by eli and fixed by tucu00 <br>
+     <b>Integrate Snappy compression</b><br>
+     <blockquote>Google released Zippy as an open source (APLv2) project called Snappy
(http://code.google.com/p/snappy). This tracks integrating it into Hadoop.<br><br>{quote}<br>Snappy
is a compression/decompression library. It does not aim for maximum compression, or compatibility
with any other compression library; instead, it aims for very high speeds and reasonable compression.
For instance, compared to the fastest mode of zlib, Snappy is an order of magnitude faster
for most inputs, but the resulting compressed ...</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-8050">HADOOP-8050</a>.
+     Major bug reported by kihwal and fixed by kihwal (metrics)<br>
+     <b>Deadlock in metrics</b><br>
+     <blockquote>The metrics serving thread and the periodic snapshot thread can deadlock.<br>It
happened a few times on one of the namenodes we have. When it happens RPC works but the web UI
and hftp stop working. I haven&apos;t looked at the trunk too closely, but it might happen
there too.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-8088">HADOOP-8088</a>.
+     Major bug reported by kihwal and fixed by  (security)<br>
+     <b>User-group mapping cache incorrectly does negative caching on transient failures</b><br>
+     <blockquote>We&apos;ve seen a case where some getGroups() calls fail when
the ldap server or the network is having transient failures. Looking at the code, the shell-based
and the JNI-based implementations swallow exceptions and return an empty or partial list.
The caller, Groups#getGroups() adds this likely empty list into the mapping cache for the
user. This will function as negative caching until the cache expires. I don&apos;t think
we want negative caching here, but even if we do, it should be intelligent eno...</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-8090">HADOOP-8090</a>.
+     Major improvement reported by gkesavan and fixed by gkesavan <br>
+     <b>rename hadoop 64 bit rpm/deb package name</b><br>
+     <blockquote>change hadoop rpm/deb name from hadoop-&lt;version&gt;.amd64.rpm/deb
to hadoop-&lt;version&gt;.x86_64.rpm/deb   </blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-8132">HADOOP-8132</a>.
+     Major bug reported by arpitgupta and fixed by arpitgupta <br>
+     <b>64bit secure datanodes do not start as the jsvc path is wrong</b><br>
+     <blockquote>64bit secure datanodes were looking for /usr/libexec/../libexec/jsvc.
instead of /usr/libexec/../libexec/jsvc.amd64</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2701">HDFS-2701</a>.
+     Major improvement reported by eli and fixed by eli (name-node)<br>
+     <b>Cleanup FS* processIOError methods</b><br>
+     <blockquote>Let&apos;s rename the various &quot;processIOError&quot;
methods to be more descriptive. The current code makes it difficult to identify and reason
about bug fixes. While we&apos;re at it let&apos;s remove &quot;Fatal&quot;
from the &quot;Unable to sync the edit log&quot; log since it&apos;s not actually
a fatal error (this is confusing to users). And 2NN &quot;Checkpoint done&quot; should
be info, not a warning (also confusing to users).<br><br>Thanks to HDFS-1073 these
issues don&apos;t exist on trunk or 23.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2702">HDFS-2702</a>.
+     Critical bug reported by eli and fixed by eli (name-node)<br>
+     <b>A single failed name dir can cause the NN to exit </b><br>
+     <blockquote>There&apos;s a bug in FSEditLog#rollEditLog which results in the
NN process exiting if a single name dir has failed. Here&apos;s the relevant code:<br><br>{code}<br>close()
 // So editStreams.size() is 0 <br>foreach edits dir {<br>  ..<br>  eStream
= new ...  // Might get an IOE here<br>  editStreams.add(eStream);<br>} catch
(IOException ioe) {<br>  removeEditsForStorageDir(sd);  // exits if editStreams.size()
&lt;= 1  <br>}<br>{code}<br><br>If we get an IOException before
we&apos;ve added two edits streams to the list we&apos;ll exit, eg if there&apos;s
an ...</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2703">HDFS-2703</a>.
+     Major bug reported by eli and fixed by eli (name-node)<br>
+     <b>removedStorageDirs is not updated everywhere we remove a storage dir</b><br>
+     <blockquote>There are a number of places (FSEditLog#open, purgeEditLog, and rollEditLog)
where we remove a storage directory but don&apos;t add it to the removedStorageDirs list.
This means a storage dir may have been removed but we don&apos;t see it in the log or
Web UI. This doesn&apos;t affect trunk/23 since the code there is totally different.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2978">HDFS-2978</a>.
+     Major new feature reported by atm and fixed by atm (name-node)<br>
+     <b>The NameNode should expose name dir statuses via JMX</b><br>
+     <blockquote>We currently display this info on the NN web UI, so users who wish
to monitor this must either do it manually or parse HTML. We should publish this information
via JMX.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-3006">HDFS-3006</a>.
+     Major bug reported by bcwalrus and fixed by szetszwo (name-node)<br>
+     <b>Webhdfs &quot;SETOWNER&quot; call returns incorrect content-type</b><br>
+     <blockquote>The SETOWNER call returns an empty body. But the header has &quot;Content-Type:
application/json&quot;, which is a contradiction (empty string is not valid json). This
appears to happen for SETTIMES and SETPERMISSION as well.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-3075">HDFS-3075</a>.
+     Major improvement reported by brandonli and fixed by brandonli (name-node)<br>
+     <b>Backport HADOOP-4885 to branch-1</b><br>
+     <blockquote>When a storage directory is inaccessible, namenode removes it from
the valid storage dir list to a removedStorageDirs list. Those storage directories will not
be restored when they become healthy again. <br><br>The proposed solution is to
restore the previous failed directories at the beginning of checkpointing, say, rollEdits,
by copying necessary metadata files from healthy directory to unhealthy ones. In this way,
whenever a failed storage directory is recovered by the administrator, he/she can ...</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-3101">HDFS-3101</a>.
+     Major bug reported by wangzw and fixed by szetszwo (hdfs client)<br>
+     <b>cannot read empty file using webhdfs</b><br>
+     <blockquote>STEP:<br>1, create a new EMPTY file<br>2, read it using
webhdfs.<br><br>RESULT:<br>expected: get an empty file<br>I got: {&quot;RemoteException&quot;:{&quot;exception&quot;:&quot;IOException&quot;,&quot;javaClassName&quot;:&quot;java.io.IOException&quot;,&quot;message&quot;:&quot;Offset=0
out of the range [0, 0); OPEN, path=/testFile&quot;}}<br><br>First of all,
[0, 0) is not a valid range, and I think reading an empty file should be OK.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-764">MAPREDUCE-764</a>.
+     Blocker bug reported by klbostee and fixed by klbostee (contrib/streaming)<br>
+     <b>TypedBytesInput&apos;s readRaw() does not preserve custom type codes</b><br>
+     <blockquote>The typed bytes format supports byte sequences of the form {{&lt;custom
type code&gt; &lt;length&gt; &lt;bytes&gt;}}. When reading such a sequence
via {{TypedBytesInput}}&apos;s {{readRaw()}} method, however, the returned sequence currently
is {{0 &lt;length&gt; &lt;bytes&gt;}} (0 is the type code for a bytes array),
which leads to bugs such as the one described [here|http://dumbo.assembla.com/spaces/dumbo/tickets/54].</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-3583">MAPREDUCE-3583</a>.
+     Critical bug reported by zhihyu@ebaysf.com and fixed by zhihyu@ebaysf.com <br>
+     <b>ProcfsBasedProcessTree#constructProcessInfo() may throw NumberFormatException</b><br>
+     <blockquote>HBase PreCommit builds frequently gave us NumberFormatException.<br><br>From
01:44:01,180 WARN  [main] mapred.JobClient(784): No job jar file set.  User classes may not
be found. See JobConf(Class) or JobConf#setJar(String).<br>java.lang.NumberFormatException:
For input string: &quot;18446743988060683582&quot;<br>	at java.lang.NumberFormatException.fo...</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-3773">MAPREDUCE-3773</a>.
+     Major new feature reported by owen.omalley and fixed by owen.omalley (jobtracker)<br>
+     <b>Add queue metrics with buckets for job run times</b><br>
+     <blockquote>It would be nice to have queue metrics that reflect the number of
jobs in each queue that have been running for different ranges of time.<br><br>Reasonable
time ranges are probably 0-1 hr, 1-5 hr, 5-24 hr, 24+ hrs; but they should be configurable.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-3824">MAPREDUCE-3824</a>.
+     Critical bug reported by aw and fixed by tgraves (distributed-cache)<br>
+     <b>Distributed caches are not removed properly</b><br>
+     <blockquote>Distributed caches are not being properly removed by the TaskTracker
when they are expected to be expired. </blockquote></li>
+<h2>Changes since Hadoop 1.0.0</h2>
+<h3>Jiras with Release Notes (describe major or incompatible changes)</h3>
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-8009">HADOOP-8009</a>.
+     Critical improvement reported by tucu00 and fixed by tucu00 (build)<br>
+     <b>Create hadoop-client and hadoop-minicluster artifacts for downstream projects
+     <blockquote>                    Generate integration artifacts &quot;org.apache.hadoop:hadoop-client&quot;
and &quot;org.apache.hadoop:hadoop-minicluster&quot; containing all the jars needed
to use Hadoop client APIs, and to run Hadoop MiniClusters, respectively.  Push these artifacts
to the maven repository when mvn-deploy, along with existing artifacts. 
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-8037">HADOOP-8037</a>.
+     Blocker bug reported by mattf and fixed by gkesavan (build)<br>
+     <b>Binary tarball does not preserve platform info for native builds, and RPMs
fail to provide needed symlinks for libhadoop.so</b><br>
+     <blockquote>                    This fix is marked &quot;incompatible&quot;
only because it changes the bin-tarball directory structure to be consistent with the source
tarball directory structure. The source tarball is unchanged. RPMs and DEBs now use an intermediate
bin-tarball with an &quot;${os.arch}&quot; tag (like the packages themselves). The
un-tagged bin-tarball is now multi-platform and retains the structure of the source tarball;
it is in fact generated by target &quot;tar&quot;, not by target &quot;binary&quot;.
Finally, in the 64-bit RPMs and DEBs, the native libs go in the &quot;lib64&quot;
directory instead of &quot;lib&quot;.
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-3184">MAPREDUCE-3184</a>.
+     Major improvement reported by tlipcon and fixed by tlipcon (jobtracker)<br>
+     <b>Improve handling of fetch failures when a tasktracker is not responding on
+     <blockquote>                    The TaskTracker now has a thread which monitors
for a known Jetty bug in which the selector thread starts spinning and map output can no longer
be served. If the bug is detected, the TaskTracker will shut itself down. This feature can
be disabled by setting mapred.tasktracker.jetty.cpu.check.enabled to false.
+<h3>Other Jiras (describe bug fixes and minor changes)</h3>
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-7470">HADOOP-7470</a>.
+     Minor improvement reported by stevel@apache.org and fixed by enis (util)<br>
+     <b>move up to Jackson 1.8.8</b><br>
+     <blockquote>I see that hadoop-core still depends on Jackson 1.0.1 -but that project
is now up to 1.8.2 in releases. Upgrading will make it easier for other Jackson-using apps
that are more up to date to keep their classpath consistent.<br><br>The patch
would be updating the ivy file to pull in the later version; no test</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-7960">HADOOP-7960</a>.
+     Major bug reported by gkesavan and fixed by mattf <br>
+     <b>Port HADOOP-5203 to branch-1, build version comparison is too restrictive</b><br>
+     <blockquote>hadoop services should not be using the build timestamp to verify
version difference in the cluster installation. Instead it should use the source checksum
as in HADOOP-5203.<br>  </blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-7964">HADOOP-7964</a>.
+     Blocker bug reported by kihwal and fixed by daryn (security, util)<br>
+     <b>Deadlock in class init.</b><br>
+     <blockquote>After HADOOP-7808, client-side commands hang occasionally. There are
cyclic dependencies in NetUtils and SecurityUtil class initialization. Upon initial look at
the stack trace, two threads deadlock when each hits one of the class initializers at the same time.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-7987">HADOOP-7987</a>.
+     Major improvement reported by devaraj and fixed by jnp (security)<br>
+     <b>Support setting the run-as user in unsecure mode</b><br>
+     <blockquote>Some applications need to be able to perform actions (such as launch
MR jobs) from map or reduce tasks. In earlier unsecure versions of hadoop (20.x), it was possible
to do this by setting user.name in the configuration. But in 20.205 and 1.0, when running
in unsecure mode, this does not work. (In secure mode, you can do this using the kerberos
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-7988">HADOOP-7988</a>.
+     Major bug reported by jnp and fixed by jnp <br>
+     <b>Upper case in hostname part of the principals doesn&apos;t work with kerberos.</b><br>
+     <blockquote>Kerberos doesn&apos;t like upper case in the hostname part of
the principals.<br>This issue has been seen in 23 as well as 1.0.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-8010">HADOOP-8010</a>.
+     Minor bug reported by rvs and fixed by rvs (scripts)<br>
+     <b>hadoop-config.sh spews error message when HADOOP_HOME_WARN_SUPPRESS is set
to true and HADOOP_HOME is present</b><br>
+     <blockquote>Running hadoop daemon commands when HADOOP_HOME_WARN_SUPPRESS is set
to true and HADOOP_HOME is present produces:<br>{noformat}<br>  [: 76: true: unexpected
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-8052">HADOOP-8052</a>.
+     Major bug reported by reznor and fixed by reznor (metrics)<br>
+     <b>Hadoop Metrics2 should emit Float.MAX_VALUE (instead of Double.MAX_VALUE) to
avoid making Ganglia&apos;s gmetad core</b><br>
+     <blockquote>Ganglia&apos;s gmetad converts the doubles emitted by Hadoop&apos;s
Metrics2 system to strings, and the buffer it uses is 256 bytes wide.<br><br>When
the SampleStat.MinMax class (in org.apache.hadoop.metrics2.util) emits its default min value
(currently initialized to Double.MAX_VALUE), it ends up causing a buffer overflow in gmetad,
which causes it to core, effectively rendering Ganglia useless (for some, the core is continuous;
for others who are more fortunate, it&apos;s only a one-time Hadoop-startup-time thi...</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2379">HDFS-2379</a>.
+     Critical bug reported by tlipcon and fixed by tlipcon (data-node)<br>
+     <b>0.20: Allow block reports to proceed without holding FSDataset lock</b><br>
+     <blockquote>As disks are getting larger and more plentiful, we&apos;re seeing
DNs with multiple millions of blocks on a single machine. When page cache space is tight,
block reports can take multiple minutes to generate. Currently, during the scanning of the
data directories to generate a report, the FSVolumeSet lock is held. This causes writes and
reads to block, timeout, etc, causing big problems especially for clients like HBase.<br><br>This
JIRA is to explore some of the ideas originally discussed in HADOOP-458...</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2814">HDFS-2814</a>.
+     Minor improvement reported by hitesh and fixed by hitesh <br>
+     <b>NamenodeMXBean does not account for svn revision in the version information</b><br>
+     <blockquote>Unlike the jobtracker where both the UI and jmx information report
the version as &quot;x.y.z, r&lt;svn revision&gt;&quot;, in case of the namenode,
the UI displays x.y.z and svn revision info but the jmx output only contains the x.y.z version.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-3343">MAPREDUCE-3343</a>.
+     Major bug reported by ahmed.radwan and fixed by zhaoyunjiong (mrv1)<br>
+     <b>TaskTracker Out of Memory because of distributed cache</b><br>
+     <blockquote>This Out of Memory happens when you run a large number of jobs (using
the distributed cache) on a TaskTracker. <br><br>Seems the basic issue is with
the distributedCacheManager (instance of TrackerDistributedCacheManager in TaskTracker.java),
this gets created during TaskTracker.initialize(), and it keeps references to TaskDistributedCacheManager
for every submitted job via the jobArchives Map, also references to CacheStatus via cachedArchives
map. I am not seeing these cleaned up between jobs, so th...</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-3607">MAPREDUCE-3607</a>.
+     Major improvement reported by tomwhite and fixed by tomwhite (client)<br>
+     <b>Port missing new API mapreduce lib classes to 1.x</b><br>
+     <blockquote>There are a number of classes under mapreduce.lib that are not present
in the 1.x series. Including these would help users and downstream projects using the new
MapReduce API migrate to later versions of Hadoop in the future.<br><br>A few
examples of where this would help:<br>* Sqoop uses mapreduce.lib.db.DBWritable and mapreduce.lib.input.CombineFileInputFormat
(SQOOP-384).<br>* Mahout uses mapreduce.lib.output.MultipleOutputs (MAHOUT-822).<br>*
HBase has a backport of mapreduce.lib.partition.InputSampler ...</blockquote></li>
 <h2>Changes since Hadoop</h2>
 <h3>Jiras with Release Notes (describe major or incompatible changes)</h3>
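
The NumberFormatException quoted under MAPREDUCE-3583 in the notes above can be checked arithmetically. The following is a minimal sketch in Python, assuming the failing field was being parsed as a signed 64-bit Java long; the truncated stack trace does not show which call was involved, and the helper name fits_signed_64 is hypothetical and not part of Hadoop.

```python
# Range limits for 64-bit integers.
LONG_MAX = 2**63 - 1      # largest signed 64-bit value (Java Long.MAX_VALUE)
ULONG_MAX = 2**64 - 1     # largest unsigned 64-bit value


def fits_signed_64(text: str) -> bool:
    """Return True if the decimal string fits in a signed 64-bit integer.

    Hypothetical helper for illustration only; it is not Hadoop code.
    """
    value = int(text)
    return -(2**63) <= value <= LONG_MAX


# Input string copied from the exception message in the release note.
reported = "18446743988060683582"

in_signed_range = fits_signed_64(reported)
in_unsigned_range = 0 <= int(reported) <= ULONG_MAX

# The same 64 bits reinterpreted as a signed two's-complement value.
as_signed = int(reported) - 2**64
```

The reported string is too large for a signed 64-bit long but fits in an unsigned 64-bit integer, which is consistent with a parser for signed longs rejecting it. How the actual patch handles such values is not stated in the note and is not assumed here.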
