<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<title>core-dev@hadoop.apache.org Archives</title>
<link rel="self" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/?format=atom"/>
<link href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/"/>
<id>http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/</id>
<updated>2009-06-30T11:27:20Z</updated>
<entry>
<title>[jira] Commented: (HADOOP-6109) Handle large (several MB) text input lines in a reasonable amount of time</title>
<author><name>&quot;Chris Douglas (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3c1288951475.1246240007561.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c1288951475-1246240007561-JavaMail-jira@brutus%3e</id>
<updated>2009-06-29T01:46:47Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

    [ https://issues.apache.org/jira/browse/HADOOP-6109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12725017#action_12725017
] 

Chris Douglas commented on HADOOP-6109:
---------------------------------------

{noformat}
     [exec] -1 overall.  
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     -1 tests included.  The patch doesn't appear to include any new or modified tests.
     [exec]                         Please justify why no new tests are needed for this patch.
     [exec]                         Also please list what manual steps were performed to verify this patch.
     [exec] 
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec] 
     [exec]     -1 javac.  The applied patch generated 64 javac compiler warnings (more than the trunk's current 124 warnings).
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec] 
     [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
{noformat}

The javac objection is spurious. The patch adds no warnings.

&gt; Handle large (several MB) text input lines in a reasonable amount of time
&gt; -------------------------------------------------------------------------
&gt;
&gt;                 Key: HADOOP-6109
&gt;                 URL: https://issues.apache.org/jira/browse/HADOOP-6109
&gt;             Project: Hadoop Common
&gt;          Issue Type: Improvement
&gt;          Components: io
&gt;    Affects Versions: 0.19.0
&gt;         Environment: Linux 2.6 kernel, java 1.6 AMD Dual-Core Opteron 2.6GHz with 1M L1/L2 cache 1.8G RAM
&gt;            Reporter: thushara wijeratna
&gt;            Assignee: thushara wijeratna
&gt;         Attachments: HADOOP-1234.patch, HADOOP-1234.patch
&gt;
&gt;
&gt; problem:
&gt; =======
&gt; hadoop was timing out on a simple pass-through job (with the default 10 min timeout)
&gt; cause:
&gt; =====
&gt; i hunted this down to how Text lines are being processed inside org.apache.hadoop.util.LineReader.
&gt; i have a fix, a task that took more than 20 minutes and still failed to complete, completes with this fix in under 30 s.
&gt; i attach the patch (for trunk)
&gt; the problem traces:
&gt; ================
&gt; hadoop version: 0.19.0
&gt; userlogs on slave node:
&gt; 2009-05-29 13:57:33,551 WARN org.apache.hadoop.mapred.TaskRunner: Parent died.  Exiting attempt_200905281652_0013_m_000006_1
&gt; [root@domU-12-31-38-01-7C-92 attempt_200905281652_0013_m_000006_1]#
&gt; tellingly, the last input line processed right before this WARN is 19K. (i log the full input line in the map function for debugging)
&gt; output on map-reduce task:
&gt; Task attempt_200905281652_0013_m_000006_2 failed to report status for 600 seconds. Killing!
&gt; 09/05/29 14:08:01 INFO mapred.JobClient:  map 99% reduce 32%
&gt; 09/05/29 14:18:05 INFO mapred.JobClient:  map 98% reduce 32%
&gt; java.io.IOException: Job failed!
&gt;     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1217)
&gt;     at com.adxpose.data.mr.DailyHeatmapAggregator.run(DailyHeatmapAggregator.java:547)
&gt;     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
&gt;     at com.adxpose.data.mr.DailyHeatmapAggregator.main(DailyHeatmapAggregator.java:553)
&gt;     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
&gt;     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
&gt;     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
&gt;     at java.lang.reflect.Method.invoke(Method.java:597)
&gt;     at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
&gt;     at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
&gt;     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
&gt;     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
&gt;     at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
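
One way a line reader can become slow on very long lines is by re-copying
everything read so far each time another buffer is appended. The issue text
quoted above does not say that this is what LineReader did, so the following is
only a minimal Python sketch under that assumption. ChunkedLineReader,
CHUNK_SIZE and the use of io.BytesIO are hypothetical names for illustration
and are not Hadoop code. The sketch accumulates chunks in a growable bytearray,
so the total copying stays roughly linear in the line length.

```python
import io

CHUNK_SIZE = 64 * 1024  # hypothetical buffer size, not a Hadoop default


class ChunkedLineReader:
    """Read newline-terminated lines from a binary stream, one chunk at a time."""

    def __init__(self, stream, chunk_size=CHUNK_SIZE):
        self.stream = stream
        self.chunk_size = chunk_size
        self.buffer = b""  # bytes read from the stream but not yet consumed
        self.pos = 0       # index of the first unconsumed byte in buffer

    def read_line(self):
        """Return the next line without its newline, or None at end of input."""
        line = bytearray()  # grows in place; appends are amortized constant time
        found_any = False
        while True:
            if self.pos == len(self.buffer):
                # Current chunk is used up: fetch the next one.
                self.buffer = self.stream.read(self.chunk_size)
                self.pos = 0
                if not self.buffer:
                    # End of input: return a final unterminated line, if any.
                    return bytes(line) if found_any else None
            found_any = True
            newline = self.buffer.find(b"\n", self.pos)
            if newline == -1:
                # No newline in this chunk: append it once and keep reading.
                line += self.buffer[self.pos:]
                self.pos = len(self.buffer)
            else:
                line += self.buffer[self.pos:newline]
                self.pos = newline + 1
                return bytes(line)
```

Each byte of a long line is appended once, however many chunks the line spans.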

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



</pre>
</div>
</content>
</entry>
<entry>
<title>Apache git mirrors, post project split</title>
<author><name>Chris Douglas &lt;cdouglas@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3c1267dd3b0906281829x18f24a5ye3b94dc36aa55a6f@mail.gmail.com%3e"/>
<id>urn:uuid:%3c1267dd3b0906281829x18f24a5ye3b94dc36aa55a6f@mail-gmail-com%3e</id>
<updated>2009-06-29T01:29:22Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
The new git repositories for HDFS and Map/Reduce,
git://git.apache.org/hadoop-{mapreduce,hdfs}.git are up.

There are two open issues:
(1) What to do with the defunct git mirror that points to core
(2) What to call the common mirror

For (1), I'd advocate simply deleting it. There doesn't seem to be a
compelling case for maintaining an archive of a mirror next to its
active replacements.

For (2), the git mirror is currently being built as hadoop-common.git.
Since it will contain the pre-0.21 tagged releases/branches and
developers pulling from the old mirror will error out either way, I'd
lean toward calling it hadoop.git, but am mostly ambivalent on this.

The ticket for this is INFRA-2108. -C


</pre>
</div>
</content>
</entry>
<entry>
<title>Re: Info about Cluster Rebalancing</title>
<author><name>Ted Dunning &lt;ted.dunning@gmail.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3cc7d45fc70906281118u7d61cc2fx31a82584cf202cea@mail.gmail.com%3e"/>
<id>urn:uuid:%3cc7d45fc70906281118u7d61cc2fx31a82584cf202cea@mail-gmail-com%3e</id>
<updated>2009-06-28T18:18:26Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
Rebalancing is already built into hadoop.

http://hadoop.apache.org/core/docs/r0.17.2/hdfs_user_guide.html#Rebalancer

On Sun, Jun 28, 2009 at 9:01 AM, arun kumar &lt;arunkumar_skcet@yahoo.com&gt; wrote:

&gt; I am a beginner and I am interested in implementing a Cluster Re-balancing
&gt; scheme. I will keep searching on the topic; meanwhile, if anybody is working
&gt; on it or knows anything about it, please let me know.
&gt;


</pre>
</div>
</content>
</entry>
<entry>
<title>Info about Cluster Rebalancing</title>
<author><name>arun kumar &lt;arunkumar_skcet@yahoo.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3c347698.40998.qm@web50302.mail.re2.yahoo.com%3e"/>
<id>urn:uuid:%3c347698-40998-qm@web50302-mail-re2-yahoo-com%3e</id>
<updated>2009-06-28T16:01:58Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
Hello,

I am a beginner and I am interested in implementing a Cluster Re-balancing scheme. I will
keep searching on the topic; meanwhile, if anybody is working on it or knows anything about
it, please let me know.

Thank you,
Arun



      

</pre>
</div>
</content>
</entry>
<entry>
<title>Re: 0.19.2 release needed</title>
<author><name>Thibaut_ &lt;tbritz@blue.lu&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3c24242623.post@talk.nabble.com%3e"/>
<id>urn:uuid:%3c24242623-post@talk-nabble-com%3e</id>
<updated>2009-06-28T15:44:56Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

Hi,

I would also be interested in having a 0.19.2 release, or even more
releases. 

If you are starting to run your cluster, you don't know which version you
should take. Some releases were unstable (e.g. tasktrackers running out of
memory, or executing maps sequentially after a certain number of jobs,
multifileoutputformat not working with threads, ...). Whenever we upgrade to
a new major version, we nearly always hit a bug that makes our
system unreliable.

Also now with Yahoo's release of their own version, what happens if I use
that version? Will there be more releases after that one from Yahoo? Can I
migrate my cluster back to the official branch afterwards without losing
data?

Thibaut





jason hadoop wrote:
&gt; 
&gt; I want to feed some mapside join patches in, time to get off my duff and
&gt; submit them :)
&gt; 
&gt; 
&gt; On Mon, Jun 15, 2009 at 12:33 PM, Scott Carey
&gt; &lt;scott@richrelevance.com&gt; wrote:
&gt; 
&gt;&gt; I am still interested in seeing a 0.19.2 release soon.  The 0.19.2-dev and
&gt;&gt; 0.18.4-dev branches have been close to dormant for over a month, yet contain
&gt;&gt; many critical fixes.
&gt;&gt;
&gt;&gt;
&gt;&gt; With the Yahoo distribution out there, 0.20.1 is a slightly lower priority
&gt;&gt; for me.  Additionally, several other components are not yet 0.20.x
&gt;&gt; compatible anyway.
&gt;&gt;
&gt;&gt;
&gt;&gt; What does the rest of the community think?
&gt;&gt;
&gt;&gt; -Scott
&gt;&gt;
&gt;&gt; On 5/27/09 4:08 PM, "Owen O'Malley" &lt;omalley@apache.org&gt; wrote:
&gt;&gt;
&gt;&gt; If no one else does, I'll roll 0.18.4, 0.19.2, and 0.20.1 next week.
&gt;&gt;
&gt;&gt; -- Owen
&gt;&gt;
&gt;&gt;
&gt; 
&gt; 
&gt; -- 
&gt; Pro Hadoop, a book to guide you from beginner to hadoop mastery,
&gt; http://www.amazon.com/dp/1430219424?tag=jewlerymall
&gt; www.prohadoopbook.com a community for Hadoop Professionals
&gt; 
&gt; 

-- 
View this message in context: http://www.nabble.com/0.19.2-release-needed-tp23748571p24242623.html
Sent from the Hadoop core-dev mailing list archive at Nabble.com.



</pre>
</div>
</content>
</entry>
<entry>
<title>Re: FileStatus.getLen(): bug in documentation or implememtation?</title>
<author><name>Dima Rzhevskiy &lt;dima@rzhevskiy.info&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3c7ec9beea0906280636r366f0be3rda554b2d84b38204@mail.gmail.com%3e"/>
<id>urn:uuid:%3c7ec9beea0906280636r366f0be3rda554b2d84b38204@mail-gmail-com%3e</id>
<updated>2009-06-28T13:36:03Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
I created a bug report at https://issues.apache.org/jira/browse/HADOOP-6114
with a patch for trunk attached.
Dmitry.


2009/6/28 Dhruba Borthakur &lt;dhruba@gmail.com&gt;

&gt; From my understanding, getLen() always returned the length of the file in
&gt; bytes.
&gt;
&gt; thanks,
&gt; dhruba
&gt;
&gt;
&gt; On Fri, Jun 26, 2009 at 9:20 AM, Dima Rzhevskiy &lt;dima@rzhevskiy.info&gt;
&gt; wrote:
&gt;
&gt; &gt; Hi all,
&gt; &gt; I am trying to get the length of a file in Hadoop (raw file system or HDFS).
&gt; &gt; The javadoc for org.apache.hadoop.fs.FileStatus.getLen() says that
&gt; &gt; this method "return the length of this file, in blocks",
&gt; &gt; but the method returns the size in bytes.
&gt; &gt;
&gt; &gt; Is this a bug in the documentation or in the implementation?
&gt; &gt; I use hadoop-0.18.3.
&gt; &gt;
&gt; &gt;
&gt; &gt; Dmitry Rzhevskiy.
&gt; &gt;
&gt;



-- 
Best regards,
Dmitry Rzhevskiy,
www.rzhevskiy.info
tel. +7(926)426-55-55


</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Updated: (HADOOP-6109) Handle large (several MB) text input lines in a reasonable amount of time</title>
<author><name>&quot;Chris Douglas (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3c631753662.1246180729105.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c631753662-1246180729105-JavaMail-jira@brutus%3e</id>
<updated>2009-06-28T09:18:49Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

     [ https://issues.apache.org/jira/browse/HADOOP-6109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Chris Douglas updated HADOOP-6109:
----------------------------------

    Component/s:     (was: util)
                 io

&gt; Handle large (several MB) text input lines in a reasonable amount of time
&gt; -------------------------------------------------------------------------
&gt;
&gt;                 Key: HADOOP-6109
&gt;                 URL: https://issues.apache.org/jira/browse/HADOOP-6109
&gt;             Project: Hadoop Common
&gt;          Issue Type: Improvement
&gt;          Components: io
&gt;    Affects Versions: 0.19.0
&gt;         Environment: Linux 2.6 kernel, java 1.6 AMD Dual-Core Opteron 2.6GHz with 1M L1/L2 cache 1.8G RAM
&gt;            Reporter: thushara wijeratna
&gt;         Attachments: HADOOP-1234.patch, HADOOP-1234.patch
&gt;
&gt;
&gt; problem:
&gt; =======
&gt; hadoop was timing out on a simple pass-through job (with the default 10 min timeout)
&gt; cause:
&gt; =====
&gt; i hunted this down to how Text lines are being processed inside org.apache.hadoop.util.LineReader.
&gt; i have a fix, a task that took more than 20 minutes and still failed to complete, completes with this fix in under 30 s.
&gt; i attach the patch (for trunk)
&gt; the problem traces:
&gt; ================
&gt; hadoop version: 0.19.0
&gt; userlogs on slave node:
&gt; 2009-05-29 13:57:33,551 WARN org.apache.hadoop.mapred.TaskRunner: Parent died.  Exiting attempt_200905281652_0013_m_000006_1
&gt; [root@domU-12-31-38-01-7C-92 attempt_200905281652_0013_m_000006_1]#
&gt; tellingly, the last input line processed right before this WARN is 19K. (i log the full input line in the map function for debugging)
&gt; output on map-reduce task:
&gt; Task attempt_200905281652_0013_m_000006_2 failed to report status for 600 seconds. Killing!
&gt; 09/05/29 14:08:01 INFO mapred.JobClient:  map 99% reduce 32%
&gt; 09/05/29 14:18:05 INFO mapred.JobClient:  map 98% reduce 32%
&gt; java.io.IOException: Job failed!
&gt;     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1217)
&gt;     at com.adxpose.data.mr.DailyHeatmapAggregator.run(DailyHeatmapAggregator.java:547)
&gt;     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
&gt;     at com.adxpose.data.mr.DailyHeatmapAggregator.main(DailyHeatmapAggregator.java:553)
&gt;     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
&gt;     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
&gt;     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
&gt;     at java.lang.reflect.Method.invoke(Method.java:597)
&gt;     at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
&gt;     at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
&gt;     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
&gt;     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
&gt;     at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)




</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Updated: (HADOOP-6109) Handle large (several MB) text input lines in a reasonable amount of time</title>
<author><name>&quot;Chris Douglas (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3c753391925.1246180729111.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c753391925-1246180729111-JavaMail-jira@brutus%3e</id>
<updated>2009-06-28T09:18:49Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

     [ https://issues.apache.org/jira/browse/HADOOP-6109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Chris Douglas updated HADOOP-6109:
----------------------------------

    Assignee: thushara wijeratna
      Status: Patch Available  (was: Open)

&gt; Handle large (several MB) text input lines in a reasonable amount of time
&gt; -------------------------------------------------------------------------
&gt;
&gt;                 Key: HADOOP-6109
&gt;                 URL: https://issues.apache.org/jira/browse/HADOOP-6109
&gt;             Project: Hadoop Common
&gt;          Issue Type: Improvement
&gt;          Components: io
&gt;    Affects Versions: 0.19.0
&gt;         Environment: Linux 2.6 kernel, java 1.6 AMD Dual-Core Opteron 2.6GHz with 1M L1/L2 cache 1.8G RAM
&gt;            Reporter: thushara wijeratna
&gt;            Assignee: thushara wijeratna
&gt;         Attachments: HADOOP-1234.patch, HADOOP-1234.patch
&gt;
&gt;
&gt; problem:
&gt; =======
&gt; hadoop was timing out on a simple pass-through job (with the default 10 min timeout)
&gt; cause:
&gt; =====
&gt; i hunted this down to how Text lines are being processed inside org.apache.hadoop.util.LineReader.
&gt; i have a fix, a task that took more than 20 minutes and still failed to complete, completes with this fix in under 30 s.
&gt; i attach the patch (for trunk)
&gt; the problem traces:
&gt; ================
&gt; hadoop version: 0.19.0
&gt; userlogs on slave node:
&gt; 2009-05-29 13:57:33,551 WARN org.apache.hadoop.mapred.TaskRunner: Parent died.  Exiting attempt_200905281652_0013_m_000006_1
&gt; [root@domU-12-31-38-01-7C-92 attempt_200905281652_0013_m_000006_1]#
&gt; tellingly, the last input line processed right before this WARN is 19K. (i log the full input line in the map function for debugging)
&gt; output on map-reduce task:
&gt; Task attempt_200905281652_0013_m_000006_2 failed to report status for 600 seconds. Killing!
&gt; 09/05/29 14:08:01 INFO mapred.JobClient:  map 99% reduce 32%
&gt; 09/05/29 14:18:05 INFO mapred.JobClient:  map 98% reduce 32%
&gt; java.io.IOException: Job failed!
&gt;     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1217)
&gt;     at com.adxpose.data.mr.DailyHeatmapAggregator.run(DailyHeatmapAggregator.java:547)
&gt;     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
&gt;     at com.adxpose.data.mr.DailyHeatmapAggregator.main(DailyHeatmapAggregator.java:553)
&gt;     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
&gt;     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
&gt;     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
&gt;     at java.lang.reflect.Method.invoke(Method.java:597)
&gt;     at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
&gt;     at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
&gt;     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
&gt;     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
&gt;     at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)




</pre>
</div>
</content>
</entry>
<entry>
<title>Re: FileStatus.getLen(): bug in documentation or implememtation?</title>
<author><name>Dhruba Borthakur &lt;dhruba@gmail.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3c4aa34eb70906271753u58da79e3tf6f48971647044db@mail.gmail.com%3e"/>
<id>urn:uuid:%3c4aa34eb70906271753u58da79e3tf6f48971647044db@mail-gmail-com%3e</id>
<updated>2009-06-28T00:53:52Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
From my understanding, getLen() always returned the length of the file in
bytes.
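
As a small illustration of the difference between the two units discussed
here, the following Python sketch converts a length in bytes into a count of
blocks. BLOCK_SIZE and blocks_for_length are hypothetical names chosen for the
example, and the 64 MB block size is an assumption rather than a value taken
from this thread; none of this is the FileStatus API.

```python
BLOCK_SIZE = 64 * 1024 * 1024  # assumed block size in bytes (illustrative only)


def blocks_for_length(length_in_bytes, block_size=BLOCK_SIZE):
    """Return how many blocks are needed to hold length_in_bytes bytes."""
    # Ceiling division: a partially filled last block still counts as one block.
    return (length_in_bytes + block_size - 1) // block_size
```

Under this assumption a 100 MB file is 104857600 bytes but only 2 blocks, which
is why the unit named in the javadoc matters.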

thanks,
dhruba


On Fri, Jun 26, 2009 at 9:20 AM, Dima Rzhevskiy &lt;dima@rzhevskiy.info&gt; wrote:

&gt; Hi all,
&gt; I am trying to get the length of a file in Hadoop (raw file system or HDFS).
&gt; The javadoc for org.apache.hadoop.fs.FileStatus.getLen() says that
&gt; this method "return the length of this file, in blocks",
&gt; but the method returns the size in bytes.
&gt;
&gt; Is this a bug in the documentation or in the implementation?
&gt; I use hadoop-0.18.3.
&gt;
&gt;
&gt; Dmitry Rzhevskiy.
&gt;


</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Updated: (HADOOP-6116) Sqoop depends on commons-cli, which is not in its ivy.xml.</title>
<author><name>&quot;Kevin Weil (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3c1972006308.1246148687168.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c1972006308-1246148687168-JavaMail-jira@brutus%3e</id>
<updated>2009-06-28T00:24:47Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

     [ https://issues.apache.org/jira/browse/HADOOP-6116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Kevin Weil updated HADOOP-6116:
-------------------------------

    Status: Patch Available  (was: Open)

&gt; Sqoop depends on commons-cli, which is not in its ivy.xml.
&gt; ----------------------------------------------------------
&gt;
&gt;                 Key: HADOOP-6116
&gt;                 URL: https://issues.apache.org/jira/browse/HADOOP-6116
&gt;             Project: Hadoop Common
&gt;          Issue Type: Bug
&gt;          Components: contrib/sqoop
&gt;    Affects Versions: 0.21.0
&gt;            Reporter: Kevin Weil
&gt;            Priority: Minor
&gt;         Attachments: HADOOP-6116.patch
&gt;
&gt;
&gt; Sqoop's ivy.xml needs commons-cli in order to build from scratch.




</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Updated: (HADOOP-6116) Sqoop depends on commons-cli, which is not in its ivy.xml.</title>
<author><name>&quot;Kevin Weil (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3c2086500637.1246148567219.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c2086500637-1246148567219-JavaMail-jira@brutus%3e</id>
<updated>2009-06-28T00:22:47Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

     [ https://issues.apache.org/jira/browse/HADOOP-6116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Kevin Weil updated HADOOP-6116:
-------------------------------

    Attachment: HADOOP-6116.patch

The new dependency in ivy.xml.  Aaron Kimball actually made this code change on my computer
while we were together; I'm just submitting it here.

&gt; Sqoop depends on commons-cli, which is not in its ivy.xml.
&gt; ----------------------------------------------------------
&gt;
&gt;                 Key: HADOOP-6116
&gt;                 URL: https://issues.apache.org/jira/browse/HADOOP-6116
&gt;             Project: Hadoop Common
&gt;          Issue Type: Bug
&gt;          Components: contrib/sqoop
&gt;    Affects Versions: 0.21.0
&gt;            Reporter: Kevin Weil
&gt;            Priority: Minor
&gt;         Attachments: HADOOP-6116.patch
&gt;
&gt;
&gt; Sqoop's ivy.xml needs commons-cli in order to build from scratch.




</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Updated: (HADOOP-6116) Sqoop depends on commons-cli, which is not in its ivy.xml.</title>
<author><name>&quot;Kevin Weil (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3c838707250.1246148447202.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c838707250-1246148447202-JavaMail-jira@brutus%3e</id>
<updated>2009-06-28T00:20:47Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

     [ https://issues.apache.org/jira/browse/HADOOP-6116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Kevin Weil updated HADOOP-6116:
-------------------------------

    Status: Patch Available  (was: Open)

Adding the dependency to ivy.xml.  Aaron Kimball actually made this change on my computer
while we were together; I'm just committing it.

&gt; Sqoop depends on commons-cli, which is not in its ivy.xml.
&gt; ----------------------------------------------------------
&gt;
&gt;                 Key: HADOOP-6116
&gt;                 URL: https://issues.apache.org/jira/browse/HADOOP-6116
&gt;             Project: Hadoop Common
&gt;          Issue Type: Bug
&gt;          Components: contrib/sqoop
&gt;    Affects Versions: 0.21.0
&gt;            Reporter: Kevin Weil
&gt;            Priority: Minor
&gt;
&gt; Sqoop's ivy.xml needs commons-cli in order to build from scratch.




</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Updated: (HADOOP-6116) Sqoop depends on commons-cli, which is not in its ivy.xml.</title>
<author><name>&quot;Kevin Weil (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3c1841796570.1246148447267.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c1841796570-1246148447267-JavaMail-jira@brutus%3e</id>
<updated>2009-06-28T00:20:47Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

     [ https://issues.apache.org/jira/browse/HADOOP-6116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Kevin Weil updated HADOOP-6116:
-------------------------------

    Status: Open  (was: Patch Available)

&gt; Sqoop depends on commons-cli, which is not in its ivy.xml.
&gt; ----------------------------------------------------------
&gt;
&gt;                 Key: HADOOP-6116
&gt;                 URL: https://issues.apache.org/jira/browse/HADOOP-6116
&gt;             Project: Hadoop Common
&gt;          Issue Type: Bug
&gt;          Components: contrib/sqoop
&gt;    Affects Versions: 0.21.0
&gt;            Reporter: Kevin Weil
&gt;            Priority: Minor
&gt;
&gt; Sqoop's ivy.xml needs commons-cli in order to build from scratch.




</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Updated: (HADOOP-6116) Sqoop depends on commons-cli, which is not in its ivy.xml.</title>
<author><name>&quot;Kevin Weil (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3c581556061.1246147967227.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c581556061-1246147967227-JavaMail-jira@brutus%3e</id>
<updated>2009-06-28T00:12:47Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

     [ https://issues.apache.org/jira/browse/HADOOP-6116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Kevin Weil updated HADOOP-6116:
-------------------------------

    Summary: Sqoop depends on commons-cli, which is not in its ivy.xml.  (was: Sqoop depends
on commons-cli, but it is not in its ivy.xml.)

&gt; Sqoop depends on commons-cli, which is not in its ivy.xml.
&gt; ----------------------------------------------------------
&gt;
&gt;                 Key: HADOOP-6116
&gt;                 URL: https://issues.apache.org/jira/browse/HADOOP-6116
&gt;             Project: Hadoop Common
&gt;          Issue Type: Bug
&gt;          Components: contrib/sqoop
&gt;    Affects Versions: 0.21.0
&gt;            Reporter: Kevin Weil
&gt;            Priority: Minor
&gt;
&gt; Sqoop's ivy.xml needs commons-cli in order to build from scratch.




</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Created: (HADOOP-6116) Sqoop depends on commons-cli, but it is not in its ivy.xml.</title>
<author><name>&quot;Kevin Weil (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3c1110660990.1246147848446.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c1110660990-1246147848446-JavaMail-jira@brutus%3e</id>
<updated>2009-06-28T00:10:48Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
Sqoop depends on commons-cli, but it is not in its ivy.xml.
-----------------------------------------------------------

                 Key: HADOOP-6116
                 URL: https://issues.apache.org/jira/browse/HADOOP-6116
             Project: Hadoop Common
          Issue Type: Bug
          Components: contrib/sqoop
    Affects Versions: 0.21.0
            Reporter: Kevin Weil
            Priority: Minor


Sqoop's ivy.xml needs commons-cli in order to build from scratch.




</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Commented: (HADOOP-6115) Sqoop should allow a &quot;where&quot; clause to avoid having to export entire tables</title>
<author><name>&quot;Kevin Weil (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3c228916814.1246145207178.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c228916814-1246145207178-JavaMail-jira@brutus%3e</id>
<updated>2009-06-27T23:26:47Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

    [ https://issues.apache.org/jira/browse/HADOOP-6115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12724906#action_12724906
] 

Kevin Weil commented on HADOOP-6115:
------------------------------------

I have this working except in the case of optimized local importing.  I'm not sure why that
case is failing at the moment, since the same command-line statement succeeds, but I expect
to have a patch by the end of the weekend.

&gt; Sqoop should allow a "where" clause to avoid having to export entire tables
&gt; ---------------------------------------------------------------------------
&gt;
&gt;                 Key: HADOOP-6115
&gt;                 URL: https://issues.apache.org/jira/browse/HADOOP-6115
&gt;             Project: Hadoop Common
&gt;          Issue Type: Improvement
&gt;          Components: contrib/sqoop
&gt;    Affects Versions: 0.21.0
&gt;            Reporter: Kevin Weil
&gt;            Priority: Minor
&gt;
&gt; Sqoop currently only exports at the granularity of a table.  This doesn't work well on
&gt; systems with large tables, where the overhead of performing a full dump each time is significant.
&gt; Allowing the user to specify a where clause is a relatively simple task which will give Sqoop
&gt; a lot more flexibility.




</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Created: (HADOOP-6115) Sqoop should allow a where clause in its export statements</title>
<author><name>&quot;Kevin Weil (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3c247895648.1246144967626.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c247895648-1246144967626-JavaMail-jira@brutus%3e</id>
<updated>2009-06-27T23:22:47Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
Sqoop should allow a where clause in its export statements
----------------------------------------------------------

                 Key: HADOOP-6115
                 URL: https://issues.apache.org/jira/browse/HADOOP-6115
             Project: Hadoop Common
          Issue Type: Improvement
          Components: contrib/sqoop
    Affects Versions: 0.21.0
            Reporter: Kevin Weil
            Priority: Minor


Sqoop currently only exports at the granularity of a table.  This doesn't work well on systems
with large tables, where the overhead of performing a full dump each time is significant.
 Allowing the user to specify a where clause is a relatively simple task which will give Sqoop
a lot more flexibility.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Updated: (HADOOP-6115) Sqoop should allow a &quot;where&quot; clause to avoid having to export entire tables</title>
<author><name>&quot;Kevin Weil (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3c1414008347.1246144967659.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c1414008347-1246144967659-JavaMail-jira@brutus%3e</id>
<updated>2009-06-27T23:22:47Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

     [ https://issues.apache.org/jira/browse/HADOOP-6115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Kevin Weil updated HADOOP-6115:
-------------------------------

    Summary: Sqoop should allow a "where" clause to avoid having to export entire tables 
(was: Sqoop should allow a where clause in its export statements)

&gt; Sqoop should allow a "where" clause to avoid having to export entire tables
&gt; ---------------------------------------------------------------------------
&gt;
&gt;                 Key: HADOOP-6115
&gt;                 URL: https://issues.apache.org/jira/browse/HADOOP-6115
&gt;             Project: Hadoop Common
&gt;          Issue Type: Improvement
&gt;          Components: contrib/sqoop
&gt;    Affects Versions: 0.21.0
&gt;            Reporter: Kevin Weil
&gt;            Priority: Minor
&gt;
&gt; Sqoop currently only exports at the granularity of a table.  This doesn't work well on
systems with large tables, where the overhead of performing a full dump each time is significant.
 Allowing the user to specify a where clause is a relatively simple task which will give Sqoop
a lot more flexibility.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Updated: (HADOOP-6114) bug in documentation: org.apache.hadoop.fs.FileStatus.getLen()</title>
<author><name>&quot;Dmitry Rzhevskiy (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3c1292864064.1246142447218.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c1292864064-1246142447218-JavaMail-jira@brutus%3e</id>
<updated>2009-06-27T22:40:47Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

     [ https://issues.apache.org/jira/browse/HADOOP-6114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Dmitry Rzhevskiy updated HADOOP-6114:
-------------------------------------

    Comment: was deleted

(was: project compiled but some tests failed (tests also failed before the modifications)
)

&gt; bug in documentation: org.apache.hadoop.fs.FileStatus.getLen() 
&gt; ---------------------------------------------------------------
&gt;
&gt;                 Key: HADOOP-6114
&gt;                 URL: https://issues.apache.org/jira/browse/HADOOP-6114
&gt;             Project: Hadoop Common
&gt;          Issue Type: Bug
&gt;    Affects Versions: 0.18.3, 0.20.0
&gt;            Reporter: Dmitry Rzhevskiy
&gt;         Attachments: HADOOP-6114.patch
&gt;
&gt;
&gt; The javadoc for org.apache.hadoop.fs.FileStatus.getLen() says that this method
"return the length of this file, in blocks"
&gt; But the method returns the size in bytes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Updated: (HADOOP-6114) bug in documentation: org.apache.hadoop.fs.FileStatus.getLen()</title>
<author><name>&quot;Dmitry Rzhevskiy (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3c982745897.1246142327209.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c982745897-1246142327209-JavaMail-jira@brutus%3e</id>
<updated>2009-06-27T22:38:47Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

     [ https://issues.apache.org/jira/browse/HADOOP-6114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Dmitry Rzhevskiy updated HADOOP-6114:
-------------------------------------

    Attachment: HADOOP-6114.patch

Trunk compiled, but some tests failed (tests also failed before applying the patch).

&gt; bug in documentation: org.apache.hadoop.fs.FileStatus.getLen() 
&gt; ---------------------------------------------------------------
&gt;
&gt;                 Key: HADOOP-6114
&gt;                 URL: https://issues.apache.org/jira/browse/HADOOP-6114
&gt;             Project: Hadoop Common
&gt;          Issue Type: Bug
&gt;    Affects Versions: 0.18.3, 0.20.0
&gt;            Reporter: Dmitry Rzhevskiy
&gt;         Attachments: HADOOP-6114.patch
&gt;
&gt;
&gt; The javadoc for org.apache.hadoop.fs.FileStatus.getLen() says that this method
"return the length of this file, in blocks"
&gt; But the method returns the size in bytes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Commented: (HADOOP-5925) EC2 scripts should exit on error</title>
<author><name>&quot;Hudson (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3c867037118.1246101047302.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c867037118-1246101047302-JavaMail-jira@brutus%3e</id>
<updated>2009-06-27T11:10:47Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

    [ https://issues.apache.org/jira/browse/HADOOP-5925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12724841#action_12724841
] 

Hudson commented on HADOOP-5925:
--------------------------------

Integrated in Hadoop-Common-trunk #9 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk/9/])
    . EC2 scripts should exit on error.


&gt; EC2 scripts should exit on error
&gt; --------------------------------
&gt;
&gt;                 Key: HADOOP-5925
&gt;                 URL: https://issues.apache.org/jira/browse/HADOOP-5925
&gt;             Project: Hadoop Common
&gt;          Issue Type: Improvement
&gt;          Components: contrib/ec2
&gt;            Reporter: Tom White
&gt;            Assignee: Tom White
&gt;             Fix For: 0.21.0
&gt;
&gt;         Attachments: hadoop-5925.patch
&gt;
&gt;
&gt; For example, if an ec2-authorize command fails the script should stop so that the problem
is easier to debug.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Commented: (HADOOP-5897) Add more Metrics to Namenode to capture heap usage</title>
<author><name>&quot;Hudson (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3c842646745.1246101047261.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c842646745-1246101047261-JavaMail-jira@brutus%3e</id>
<updated>2009-06-27T11:10:47Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

    [ https://issues.apache.org/jira/browse/HADOOP-5897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12724840#action_12724840
] 

Hudson commented on HADOOP-5897:
--------------------------------

Integrated in Hadoop-Common-trunk #9 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk/9/])
    . Promote new name-node metrics to  branch 0.20.


&gt; Add more Metrics to Namenode to capture heap usage
&gt; --------------------------------------------------
&gt;
&gt;                 Key: HADOOP-5897
&gt;                 URL: https://issues.apache.org/jira/browse/HADOOP-5897
&gt;             Project: Hadoop Common
&gt;          Issue Type: Improvement
&gt;          Components: dfs, metrics
&gt;            Reporter: Suresh Srinivas
&gt;            Assignee: Suresh Srinivas
&gt;             Fix For: 0.21.0
&gt;
&gt;         Attachments: 5897.rel20.patch, stats.1.patch, stats.patch, stats.patch
&gt;
&gt;
&gt; Recently we had GC issues, where Namenode used more heap than usual. There was no growth
indicated by the data in current Metrics to justify the heap usage. Adding more stats such
as:
&gt; - Counter to track blocks that are pending deletion
&gt; - BlocksMap hashmap capacity
&gt; - Counter to track excess number of blocks 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Commented: (HADOOP-6112) to fix hudsonPatchQueueAdmin for different projects</title>
<author><name>&quot;Hudson (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3c1792558406.1246101047201.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c1792558406-1246101047201-JavaMail-jira@brutus%3e</id>
<updated>2009-06-27T11:10:47Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

    [ https://issues.apache.org/jira/browse/HADOOP-6112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12724839#action_12724839
] 

Hudson commented on HADOOP-6112:
--------------------------------

Integrated in Hadoop-Common-trunk #9 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk/9/])
    . Fix hudsonPatchQueueAdmin for different projects.


&gt; to fix hudsonPatchQueueAdmin for different projects
&gt; ---------------------------------------------------
&gt;
&gt;                 Key: HADOOP-6112
&gt;                 URL: https://issues.apache.org/jira/browse/HADOOP-6112
&gt;             Project: Hadoop Common
&gt;          Issue Type: Bug
&gt;          Components: build
&gt;            Reporter: Giridharan Kesavan
&gt;            Assignee: Giridharan Kesavan
&gt;         Attachments: HADOOP-6112.patch
&gt;
&gt;
&gt; To fix hudsonPatchQueueAdmin process for different hadoop projects.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



</pre>
</div>
</content>
</entry>
<entry>
<title>Re: PROJECTS SPLIT</title>
<author><name>&quot;Owen O'Malley&quot; &lt;owen.omalley@gmail.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3c5f7f740b0906262118l684864a4u37a775cd93e23785@mail.gmail.com%3e"/>
<id>urn:uuid:%3c5f7f740b0906262118l684864a4u37a775cd93e23785@mail-gmail-com%3e</id>
<updated>2009-06-27T04:18:51Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
On Fri, Jun 26, 2009 at 11:10 AM, Aaron Kimball&lt;aaron@cloudera.com&gt; wrote:
&gt; Any progress on this over the last few days? Did the
&gt; infrastructure-dev team provide an ETA when git repos can be set up?
&gt; I've got some work in the pipeline, would be great to clear the dust
&gt; out of my development process and get back on track ;)

I sent email to Jukka and he has been on vacation this week. He will
try and set us up soon.

-- Owen


</pre>
</div>
</content>
</entry>
<entry>
<title>Re: more information about project split</title>
<author><name>&quot;Owen O'Malley&quot; &lt;owen.omalley@gmail.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3c5f7f740b0906262020me756413wfe9d7447221b35d6@mail.gmail.com%3e"/>
<id>urn:uuid:%3c5f7f740b0906262020me756413wfe9d7447221b35d6@mail-gmail-com%3e</id>
<updated>2009-06-27T03:20:14Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
Of course the 4' is much closer to my choice #2, which makes me happy too. +1


</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Updated: (HADOOP-6114) bug in documentation: org.apache.hadoop.fs.FileStatus.getLen()</title>
<author><name>&quot;Dmitry Rzhevskiy (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3c1599205969.1246064387196.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c1599205969-1246064387196-JavaMail-jira@brutus%3e</id>
<updated>2009-06-27T00:59:47Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

     [ https://issues.apache.org/jira/browse/HADOOP-6114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Dmitry Rzhevskiy updated HADOOP-6114:
-------------------------------------

    Attachment:     (was: HADOOP-6114.patch)

&gt; bug in documentation: org.apache.hadoop.fs.FileStatus.getLen() 
&gt; ---------------------------------------------------------------
&gt;
&gt;                 Key: HADOOP-6114
&gt;                 URL: https://issues.apache.org/jira/browse/HADOOP-6114
&gt;             Project: Hadoop Common
&gt;          Issue Type: Bug
&gt;    Affects Versions: 0.18.3, 0.20.0
&gt;            Reporter: Dmitry Rzhevskiy
&gt;
&gt; The javadoc for org.apache.hadoop.fs.FileStatus.getLen() says that this method
"return the length of this file, in blocks"
&gt; But the method returns the size in bytes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Updated: (HADOOP-6114) bug in documentation: org.apache.hadoop.fs.FileStatus.getLen()</title>
<author><name>&quot;Dmitry Rzhevskiy (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3c2003330061.1246064267241.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c2003330061-1246064267241-JavaMail-jira@brutus%3e</id>
<updated>2009-06-27T00:57:47Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

     [ https://issues.apache.org/jira/browse/HADOOP-6114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Dmitry Rzhevskiy updated HADOOP-6114:
-------------------------------------

    Attachment: HADOOP-6114.patch

The project compiled, but some tests failed (tests also failed before the modifications).


&gt; bug in documentation: org.apache.hadoop.fs.FileStatus.getLen() 
&gt; ---------------------------------------------------------------
&gt;
&gt;                 Key: HADOOP-6114
&gt;                 URL: https://issues.apache.org/jira/browse/HADOOP-6114
&gt;             Project: Hadoop Common
&gt;          Issue Type: Bug
&gt;    Affects Versions: 0.18.3, 0.20.0
&gt;            Reporter: Dmitry Rzhevskiy
&gt;         Attachments: HADOOP-6114.patch
&gt;
&gt;
&gt; The javadoc for org.apache.hadoop.fs.FileStatus.getLen() says that this method
"return the length of this file, in blocks"
&gt; But the method returns the size in bytes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Created: (HADOOP-6114) bug in documentation: org.apache.hadoop.fs.FileStatus.getLen()</title>
<author><name>&quot;Dmitry Rzhevskiy (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3c1160221570.1246062827743.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c1160221570-1246062827743-JavaMail-jira@brutus%3e</id>
<updated>2009-06-27T00:33:47Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
bug in documentation: org.apache.hadoop.fs.FileStatus.getLen() 
---------------------------------------------------------------

                 Key: HADOOP-6114
                 URL: https://issues.apache.org/jira/browse/HADOOP-6114
             Project: Hadoop Common
          Issue Type: Bug
    Affects Versions: 0.20.0, 0.18.3
            Reporter: Dmitry Rzhevskiy


The javadoc for org.apache.hadoop.fs.FileStatus.getLen() says that this method "return
the length of this file, in blocks"
But the method returns the size in bytes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



</pre>
</div>
</content>
</entry>
<entry>
<title>RE: New subproject logos</title>
<author><name>&quot;Jim Kellerman (POWERSET)&quot; &lt;Jim.Kellerman@microsoft.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3cE9C6F5B134FDA3468879CB4E533CF1C804BEC2@TK5EX14MBXC118.redmond.corp.microsoft.com%3e"/>
<id>urn:uuid:%3cE9C6F5B134FDA3468879CB4E533CF1C804BEC2@TK5EX14MBXC118-redmond-corp-microsoft-com%3e</id>
<updated>2009-06-27T00:30:03Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
So will you post the new ones?

---
Jim Kellerman, Powerset (Live Search, Microsoft Corporation)


&gt; -----Original Message-----
&gt; From: Nigel Daley [mailto:ndaley@yahoo-inc.com]
&gt; Sent: Friday, June 26, 2009 4:51 PM
&gt; To: core-dev@hadoop.apache.org
&gt; Subject: Re: New subproject logos
&gt;
&gt; I just spent some time with the graphic designer.  The lower case
&gt; version didn't look good because the tail on the hadoop "p" meant
&gt; mapreduce had to be too small.
&gt;
&gt; I'll check in cleaned-up vector CMYK and RGB versions of the 3 logos.
&gt;
&gt; Nige
&gt;
&gt;
&gt; On Jun 26, 2009, at 10:33 AM, Doug Cutting wrote:
&gt;
&gt; &gt; I like them except I think they should be all lowercase, to be
&gt; &gt; consistent with the style of the existing Hadoop logo.
&gt; &gt;
&gt; &gt; Doug
&gt; &gt;
&gt; &gt; Nigel Daley wrote:
&gt; &gt;&gt; Here are some logos for the new subprojects
&gt; &gt;&gt; http://www.flickr.com/photos/88199325@N00/3661433605/
&gt; &gt;&gt; Please vote +1 if you like 'em and -1 if you don't.
&gt; &gt;&gt; Cheers,
&gt; &gt;&gt; Nige
&gt;



</pre>
</div>
</content>
</entry>
<entry>
<title>Re: New subproject logos</title>
<author><name>Nigel Daley &lt;ndaley@yahoo-inc.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3cF0D57C51-7E92-4754-B1A7-7061F87F6CF5@yahoo-inc.com%3e"/>
<id>urn:uuid:%3cF0D57C51-7E92-4754-B1A7-7061F87F6CF5@yahoo-inc-com%3e</id>
<updated>2009-06-26T23:50:54Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
I just spent some time with the graphic designer.  The lower case  
version didn't look good because the tail on the hadoop "p" meant
mapreduce had to be too small.

I'll check in cleaned-up vector CMYK and RGB versions of the 3 logos.

Nige


On Jun 26, 2009, at 10:33 AM, Doug Cutting wrote:

&gt; I like them except I think they should be all lowercase, to be  
&gt; consistent with the style of the existing Hadoop logo.
&gt;
&gt; Doug
&gt;
&gt; Nigel Daley wrote:
&gt;&gt; Here are some logos for the new subprojects
&gt;&gt; http://www.flickr.com/photos/88199325@N00/3661433605/
&gt;&gt; Please vote +1 if you like 'em and -1 if you don't.
&gt;&gt; Cheers,
&gt;&gt; Nige



</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Resolved: (HADOOP-5775) HdfsProxy Unit Test should not depend on HDFSPROXY_CONF_DIR environment</title>
<author><name>&quot;zhiyong zhang (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3c706775340.1246057667257.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c706775340-1246057667257-JavaMail-jira@brutus%3e</id>
<updated>2009-06-26T23:07:47Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

     [ https://issues.apache.org/jira/browse/HADOOP-5775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

zhiyong zhang resolved HADOOP-5775.
-----------------------------------

       Resolution: Fixed
    Fix Version/s: 0.21.0

solved in HDFS-447

&gt; HdfsProxy Unit Test should not depend on HDFSPROXY_CONF_DIR environment
&gt; -----------------------------------------------------------------------
&gt;
&gt;                 Key: HADOOP-5775
&gt;                 URL: https://issues.apache.org/jira/browse/HADOOP-5775
&gt;             Project: Hadoop Common
&gt;          Issue Type: Bug
&gt;          Components: contrib/hdfsproxy
&gt;            Reporter: zhiyong zhang
&gt;            Assignee: zhiyong zhang
&gt;             Fix For: 0.21.0
&gt;
&gt;         Attachments: HADOOP-5775.patch
&gt;
&gt;
&gt; The war target reads user-certs.xml and user-permissions.xml from $HDFSPROXY_CONF_DIR.
If a user sets this environment variable and has some files in that directory, the unit
test can fail when the conf files do not match what the unit test needs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Commented: (HADOOP-5897) Add more Metrics to Namenode to capture heap usage</title>
<author><name>&quot;Konstantin Shvachko (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3c2074775216.1246056947175.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c2074775216-1246056947175-JavaMail-jira@brutus%3e</id>
<updated>2009-06-26T22:55:47Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

    [ https://issues.apache.org/jira/browse/HADOOP-5897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12724755#action_12724755
] 

Konstantin Shvachko commented on HADOOP-5897:
---------------------------------------------

+1
Committed to branch 0.20.

&gt; Add more Metrics to Namenode to capture heap usage
&gt; --------------------------------------------------
&gt;
&gt;                 Key: HADOOP-5897
&gt;                 URL: https://issues.apache.org/jira/browse/HADOOP-5897
&gt;             Project: Hadoop Common
&gt;          Issue Type: Improvement
&gt;          Components: dfs, metrics
&gt;            Reporter: Suresh Srinivas
&gt;            Assignee: Suresh Srinivas
&gt;             Fix For: 0.21.0
&gt;
&gt;         Attachments: 5897.rel20.patch, stats.1.patch, stats.patch, stats.patch
&gt;
&gt;
&gt; Recently we had GC issues, where Namenode used more heap than usual. There was no growth
indicated by the data in current Metrics to justify the heap usage. Adding more stats such
as:
&gt; - Counter to track blocks that are pending deletion
&gt; - BlocksMap hashmap capacity
&gt; - Counter to track excess number of blocks 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Updated: (HADOOP-6109) Handle large (several MB) text input lines in a reasonable amount of time</title>
<author><name>&quot;thushara wijeratna (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3c1550958974.1246055627515.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c1550958974-1246055627515-JavaMail-jira@brutus%3e</id>
<updated>2009-06-26T22:33:47Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

     [ https://issues.apache.org/jira/browse/HADOOP-6109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

thushara wijeratna updated HADOOP-6109:
---------------------------------------

    Attachment: HADOOP-1234.patch

&gt; Handle large (several MB) text input lines in a reasonable amount of time
&gt; -------------------------------------------------------------------------
&gt;
&gt;                 Key: HADOOP-6109
&gt;                 URL: https://issues.apache.org/jira/browse/HADOOP-6109
&gt;             Project: Hadoop Common
&gt;          Issue Type: Improvement
&gt;          Components: util
&gt;    Affects Versions: 0.19.0
&gt;         Environment: Linux 2.6 kernel, java 1.6 AMD Dual-Core Opteron 2.6GHz with 1M
L1/L2 cache 1.8G RAM
&gt;            Reporter: thushara wijeratna
&gt;         Attachments: HADOOP-1234.patch, HADOOP-1234.patch
&gt;
&gt;
&gt; problem:
&gt; =======
&gt; hadoop was timing out on a simple pass-through job (with the default 10 min timeout)
&gt; cause:
&gt; =====
&gt; i hunted this down to how Text lines are being processed inside org.apache.hadoop.util.LineReader.
&gt; i have a fix, a task that took more than 20 minutes and still failed to complete, completes
with this fix in under 30 s.
&gt; i attach the patch (for trunk)
&gt; the problem traces:
&gt; ================
&gt; hadoop version: 0.19.0
&gt; userlogs on slave node:
&gt; 2009-05-29 13:57:33,551 WARN org.apache.hadoop.mapred.TaskRunner: Parent died.  Exiting
attempt_200905281652_0013_m_000006_1
&gt; [root@domU-12-31-38-01-7C-92 attempt_200905281652_0013_m_000006_1]#
&gt; tellingly, the last input line processed right before this WARN is 19K. (i log the full
input line in the map function for debugging)
&gt; output on map-reduce task:
&gt; Task attempt_200905281652_0013_m_000006_2 failed to report status for 600 seconds. Killing!
&gt; 09/05/29 14:08:01 INFO mapred.JobClient:  map 99% reduce 32%
&gt; 09/05/29 14:18:05 INFO mapred.JobClient:  map 98% reduce 32%
&gt; java.io.IOException: Job failed!
&gt;     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1217)
&gt;     at com.adxpose.data.mr.DailyHeatmapAggregator.run(DailyHeatmapAggregator.java:547)
&gt;     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
&gt;     at com.adxpose.data.mr.DailyHeatmapAggregator.main(DailyHeatmapAggregator.java:553)
&gt;     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
&gt;     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
&gt;     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
&gt;     at java.lang.reflect.Method.invoke(Method.java:597)
&gt;     at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
&gt;     at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
&gt;     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
&gt;     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
&gt;     at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Commented: (HADOOP-6109) Handle large (several MB) text input lines in a reasonable amount of time</title>
<author><name>&quot;thushara wijeratna (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3c1056663345.1246054907191.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c1056663345-1246054907191-JavaMail-jira@brutus%3e</id>
<updated>2009-06-26T22:21:47Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

    [ https://issues.apache.org/jira/browse/HADOOP-6109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12724741#action_12724741
] 

thushara wijeratna commented on HADOOP-6109:
--------------------------------------------

correct Chris - the speculated over-allocation is the only significant difference between
ByteArrayOutputStream and Text class, with regard to increasing capacity.
i have tested this and can confirm the perf improvements. after running the hadoop tests i
will attach the patch.
thanks, 

&gt; Handle large (several MB) text input lines in a reasonable amount of time
&gt; -------------------------------------------------------------------------
&gt;
&gt;                 Key: HADOOP-6109
&gt;                 URL: https://issues.apache.org/jira/browse/HADOOP-6109
&gt;             Project: Hadoop Common
&gt;          Issue Type: Improvement
&gt;          Components: util
&gt;    Affects Versions: 0.19.0
&gt;         Environment: Linux 2.6 kernel, java 1.6 AMD Dual-Core Opteron 2.6GHz with 1M
L1/L2 cache 1.8G RAM
&gt;            Reporter: thushara wijeratna
&gt;         Attachments: HADOOP-1234.patch
&gt;
&gt;
&gt; problem:
&gt; =======
&gt; hadoop was timing out on a simple pass-through job (with the default 10 min timeout)
&gt; cause:
&gt; =====
&gt; i hunted this down to how Text lines are being processed inside org.apache.hadoop.util.LineReader.
&gt; i have a fix, a task that took more than 20 minutes and still failed to complete, completes
with this fix in under 30 s.
&gt; i attach the patch (for trunk)
&gt; the problem traces:
&gt; ================
&gt; hadoop version: 0.19.0
&gt; userlogs on slave node:
&gt; 2009-05-29 13:57:33,551 WARN org.apache.hadoop.mapred.TaskRunner: Parent died.  Exiting
attempt_200905281652_0013_m_000006_1
&gt; [root@domU-12-31-38-01-7C-92 attempt_200905281652_0013_m_000006_1]#
&gt; tellingly, the last input line processed right before this WARN is 19K. (i log the full
input line in the map function for debugging)
&gt; output on map-reduce task:
&gt; Task attempt_200905281652_0013_m_000006_2 failed to report status for 600 seconds. Killing!
&gt; 09/05/29 14:08:01 INFO mapred.JobClient:  map 99% reduce 32%
&gt; 09/05/29 14:18:05 INFO mapred.JobClient:  map 98% reduce 32%
&gt; java.io.IOException: Job failed!
&gt;     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1217)
&gt;     at com.adxpose.data.mr.DailyHeatmapAggregator.run(DailyHeatmapAggregator.java:547)
&gt;     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
&gt;     at com.adxpose.data.mr.DailyHeatmapAggregator.main(DailyHeatmapAggregator.java:553)
&gt;     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
&gt;     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
&gt;     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
&gt;     at java.lang.reflect.Method.invoke(Method.java:597)
&gt;     at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
&gt;     at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
&gt;     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
&gt;     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
&gt;     at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Commented: (HADOOP-6109) Handle large (several MB) text input lines in a reasonable amount of time</title>
<author><name>&quot;Chris Douglas (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3c1910495108.1246047709596.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c1910495108-1246047709596-JavaMail-jira@brutus%3e</id>
<updated>2009-06-26T20:21:49Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

    [ https://issues.apache.org/jira/browse/HADOOP-6109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12724695#action_12724695
] 

Chris Douglas commented on HADOOP-6109:
---------------------------------------

You might want to try the same experiment with a larger io.file.buffer.size, say 1 or 2MB.
Though the growth remains linear, at least it grows by more than 4k per read.

Rather than growing a separate buffer and copying that into Text, replacing the current code
in Text::setCapacity with {{bytes = Arrays.copyOf(bytes, Math.max(len, length &lt;&lt; 1))}}
should improve Text's performance. I don't think there's any reason why Text needs to be exactly
the length of the largest value it's held.
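
A minimal sketch of the doubling idea above, assuming a hypothetical GrowableBuffer class
that stands in for Text (this is not the actual Text code, and the field and method names
are illustrative). It multiplies the current length by 2 when a reallocation is needed and
counts how often the array is copied while appending 4k chunks.

```java
import java.util.Arrays;

// Hypothetical stand-in for Text; only the buffer-growth logic is sketched.
final class GrowableBuffer {
    private byte[] bytes = new byte[0];
    private int length = 0;   // number of valid bytes held
    int reallocations = 0;    // illustration only: how many times the array was copied

    // Make room for at least len bytes; when growing, at least double the valid length.
    void setCapacity(int len) {
        if (Math.max(len, bytes.length) == bytes.length) {
            return; // current array is already large enough
        }
        bytes = Arrays.copyOf(bytes, Math.max(len, length * 2));
        reallocations++;
    }

    // Append n bytes of src, starting at start, to the end of the buffer.
    void append(byte[] src, int start, int n) {
        setCapacity(length + n);
        System.arraycopy(src, start, bytes, length, n);
        length += n;
    }

    int size() {
        return length;
    }

    public static void main(String[] args) {
        GrowableBuffer buf = new GrowableBuffer();
        byte[] chunk = new byte[4096];        // one 4k read
        for (int i = 0; i != 1024; i++) {     // 4 MB in total
            buf.append(chunk, 0, chunk.length);
        }
        System.out.println(buf.size() + " bytes, " + buf.reallocations + " reallocations");
    }
}
```

Because the capacity at least doubles on every reallocation, the number of array copies
grows with the logarithm of the line length rather than with the number of reads.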

&gt; Handle large (several MB) text input lines in a reasonable amount of time
&gt; -------------------------------------------------------------------------
&gt;
&gt;                 Key: HADOOP-6109
&gt;                 URL: https://issues.apache.org/jira/browse/HADOOP-6109
&gt;             Project: Hadoop Common
&gt;          Issue Type: Improvement
&gt;          Components: util
&gt;    Affects Versions: 0.19.0
&gt;         Environment: Linux 2.6 kernel, java 1.6 AMD Dual-Core Opteron 2.6GHz with 1M
L1/L2 cache 1.8G RAM
&gt;            Reporter: thushara wijeratna
&gt;         Attachments: HADOOP-1234.patch
&gt;
&gt;
&gt; problem:
&gt; =======
&gt; hadoop was timing out on a simple pass-through job (with the default 10 min timeout)
&gt; cause:
&gt; =====
&gt; i hunted this down to how Text lines are being processed inside org.apache.hadoop.util.LineReader.
&gt; i have a fix: a task that ran for more than 20 minutes and still failed to complete now completes
in under 30 s with this fix.
&gt; i attach the patch (for trunk)
&gt; the problem traces:
&gt; ================
&gt; hadoop version: 0.19.0
&gt; userlogs on slave node:
&gt; 2009-05-29 13:57:33,551 WARN org.apache.hadoop.mapred.TaskRunner: Parent died.  Exiting
attempt_200905281652_0013_m_000006_1
&gt; [root@domU-12-31-38-01-7C-92 attempt_200905281652_0013_m_000006_1]#
&gt; tellingly, the last input line processed right before this WARN is 19K. (i log the full
input line in the map function for debugging)
&gt; output on map-reduce task:
&gt; Task attempt_200905281652_0013_m_000006_2 failed to report status for 600 seconds. Killing!
&gt; 09/05/29 14:08:01 INFO mapred.JobClient:  map 99% reduce 32%
&gt; 09/05/29 14:18:05 INFO mapred.JobClient:  map 98% reduce 32%
&gt; java.io.IOException: Job failed!
&gt;     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1217)
&gt;     at com.adxpose.data.mr.DailyHeatmapAggregator.run(DailyHeatmapAggregator.java:547)
&gt;     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
&gt;     at com.adxpose.data.mr.DailyHeatmapAggregator.main(DailyHeatmapAggregator.java:553)
&gt;     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
&gt;     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
&gt;     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
&gt;     at java.lang.reflect.Method.invoke(Method.java:597)
&gt;     at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
&gt;     at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
&gt;     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
&gt;     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
&gt;     at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)




</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Commented: (HADOOP-2366) Space in the value for dfs.data.dir can cause great problems</title>
<author><name>&quot;Tsz Wo (Nicholas), SZE (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3c1975202233.1246040387464.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c1975202233-1246040387464-JavaMail-jira@brutus%3e</id>
<updated>2009-06-26T18:19:47Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

    [ https://issues.apache.org/jira/browse/HADOOP-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12724644#action_12724644
] 

Tsz Wo (Nicholas), SZE commented on HADOOP-2366:
------------------------------------------------

After the project split, this problem needs two separate patches, one for Common and the
other for HDFS.  I think we should change Configuration and StringUtils in this issue
and create a new HDFS issue for the rest.

&gt; Space in the value for dfs.data.dir can cause great problems
&gt; ------------------------------------------------------------
&gt;
&gt;                 Key: HADOOP-2366
&gt;                 URL: https://issues.apache.org/jira/browse/HADOOP-2366
&gt;             Project: Hadoop Common
&gt;          Issue Type: Bug
&gt;          Components: conf
&gt;            Reporter: Ted Dunning
&gt;            Assignee: Michele (aka pirroh) Catasta
&gt;         Attachments: HADOOP-2366-trimmed.patch, HADOOP-2366.patch
&gt;
&gt;
&gt; The following configuration causes problems:
&gt; &lt;property&gt;
&gt;   &lt;name&gt;dfs.data.dir&lt;/name&gt;
&gt;   &lt;value&gt;/mnt/hstore2/hdfs, /home/foo/dfs&lt;/value&gt;  
&gt;   &lt;description&gt;
&gt;   Determines where on the local filesystem a DFS data node should store its
&gt; blocks.  If this is a comma-delimited list of directories, then data will be
&gt; stored in all named directories, typically on different devices.  Directories
&gt; that do not exist are ignored.
&gt;   &lt;/description&gt;
&gt; &lt;/property&gt;
&gt; The problem is that the space after the comma causes the second storage directory
to be " /home/foo/dfs", which resolves to a directory named &lt;SPACE&gt; containing a sub-dir
named "home" in the hadoop datanode's default directory.  This will typically cause the user's
home partition to fill, and the cause will be very hard for the user to track down, since a
directory with a whitespace name is hard to spot.
&gt; My proposed solution is to trimLeft all path names in this and similar properties
after splitting on commas.  This still allows spaces in file and directory names but avoids
this problem.
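
A minimal sketch of the split-then-trimLeft proposal above. PathListParser and
splitAndTrimLeft are hypothetical names chosen for illustration; this is not the actual
Configuration or StringUtils API, and it is not taken from either attached patch.

```java
// Hypothetical helper, not part of Hadoop: split a comma-delimited property
// value and strip only the leading whitespace of each entry.
final class PathListParser {

    static String[] splitAndTrimLeft(String value) {
        String[] parts = value.split(",");
        for (int i = 0; i != parts.length; i++) {
            // Remove leading whitespace only; spaces inside or after a name are kept.
            parts[i] = parts[i].replaceFirst("^\\s+", "");
        }
        return parts;
    }

    public static void main(String[] args) {
        String value = "/mnt/hstore2/hdfs, /home/foo/dfs";
        for (String dir : splitAndTrimLeft(value)) {
            System.out.println("[" + dir + "]");
        }
    }
}
```

Stripping only the leading whitespace keeps names that contain spaces intact, which is
the property the proposal asks for.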




</pre>
</div>
</content>
</entry>
<entry>
<title>Re: New subproject logos</title>
<author><name>Konstantin Shvachko &lt;shv@yahoo-inc.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3c4A45103A.9020505@yahoo-inc.com%3e"/>
<id>urn:uuid:%3c4A45103A-9020505@yahoo-inc-com%3e</id>
<updated>2009-06-26T18:15:22Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
It looks better when the elephant stands on the ground, imo.
Common should be moved down. Right now C overlaps with upper H.
It should be below to match the other 2 logos.
Enlargement of the elephant should be really small if needed at all.

+1

--Konstantin

Doug Cutting wrote:
&gt; Enis Soztutar wrote:
&gt;&gt; +1 with a minor suggestion: how about we enlarge the elephants so that 
&gt;&gt; the height matches both lines, and she can stand on the ground : )
&gt; 
&gt; +1 That would probably look better, if someone has the time to make this 
&gt; change.
&gt; 
&gt; Doug
&gt; 


</pre>
</div>
</content>
</entry>
<entry>
<title>Re: PROJECTS SPLIT</title>
<author><name>Aaron Kimball &lt;aaron@cloudera.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3cd6d7c4410906261110o5d61bd91j1a89764ceed60991@mail.gmail.com%3e"/>
<id>urn:uuid:%3cd6d7c4410906261110o5d61bd91j1a89764ceed60991@mail-gmail-com%3e</id>
<updated>2009-06-26T18:10:16Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
Any progress on this over the last few days? Did the
infrastructure-dev team provide an ETA when git repos can be set up?
I've got some work in the pipeline, would be great to clear the dust
out of my development process and get back on track ;)

Thanks!
- Aaron

On Mon, Jun 22, 2009 at 10:52 AM, Owen O'Malley&lt;omalley@apache.org&gt; wrote:
&gt;
&gt; On Jun 19, 2009, at 8:11 PM, Todd Lipcon wrote:
&gt;
&gt;&gt; Any git users here know how this will affect git.apache.org? Will there
&gt;&gt; now
&gt;&gt; be three repos there?
&gt;
&gt; I sent email to infrastructure-dev asking for the repositories to be set up.
&gt; And yes, there will be 3 repositories there.
&gt;
&gt; -- Owen
&gt;


</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Updated: (HADOOP-5921) JobTracker does not come up because of NotReplicatedYetException</title>
<author><name>&quot;Amar Kamat (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3c959544274.1246039307599.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c959544274-1246039307599-JavaMail-jira@brutus%3e</id>
<updated>2009-06-26T18:01:47Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

     [ https://issues.apache.org/jira/browse/HADOOP-5921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Amar Kamat updated HADOOP-5921:
-------------------------------

    Release Note: The JobTracker fails if it cannot create jobtracker.info (if sufficient datanodes
are not up). With this patch it waits indefinitely for the file to be created.

&gt; JobTracker does not come up because of NotReplicatedYetException
&gt; ----------------------------------------------------------------
&gt;
&gt;                 Key: HADOOP-5921
&gt;                 URL: https://issues.apache.org/jira/browse/HADOOP-5921
&gt;             Project: Hadoop Common
&gt;          Issue Type: Bug
&gt;          Components: mapred
&gt;    Affects Versions: 0.20.0
&gt;            Reporter: Amareshwari Sriramadasu
&gt;            Assignee: Amar Kamat
&gt;             Fix For: 0.20.1
&gt;
&gt;         Attachments: HADOOP-5921-v1.0.patch, HADOOP-5921-v2.4.patch, HADOOP-5921-v2.5.patch
&gt;
&gt;
&gt; Sometimes (on a big cluster) the JobTracker does not come up because the info file could not
be replicated on DFS.




</pre>
</div>
</content>
</entry>
<entry>
<title>Re: more information about project split</title>
<author><name>Amr Awadallah &lt;aaa@cloudera.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200906.mbox/%3c4A450C9F.2080807@cloudera.com%3e"/>
<id>urn:uuid:%3c4A450C9F-2080807@cloudera-com%3e</id>
<updated>2009-06-26T17:59:59Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
+1 for 4' :)

-- amr

Ted Dunning wrote:
&gt; On Fri, Jun 26, 2009 at 10:49 AM, Doug Cutting &lt;cutting@apache.org&gt; wrote:
&gt;
&gt;   
&gt;&gt; Oops, I added a new option that I called (4), but should have called (5):
&gt;&gt;
&gt;&gt;  (4) Send create/resolve to -dev and all others to -issues (a new
&gt;&gt;     
&gt;&gt;&gt; list) plus prohibit all comment edits and permit comment deletion
&gt;&gt;&gt; only by admins.  (Closing is not generally interesting, since it's
&gt;&gt;&gt; only done to seal an issue once it's included in a release.)
&gt;&gt;&gt;
&gt;&gt;&gt;       
&gt;&gt; Lots of folks +1'd (4) after this, and I thought they were voting for my
&gt;&gt; (4), not Owen's.
&gt;&gt;
&gt;&gt; If anyone meant to vote for Owen's, not mine, please speak up.
&gt;&gt;
&gt;&gt; Sorry for the confusion!
&gt;&gt;
&gt;&gt;     
&gt;
&gt;
&gt; Ahhh....
&gt;
&gt; I *didn't* vote for (4) because I missed the overload.
&gt;
&gt; +1 for 4'
&gt;
&gt;   


</pre>
</div>
</content>
</entry>
</feed>
