hadoop-hdfs-user mailing list archives

From Azuryy Yu <azury...@gmail.com>
Subject Re:
Date Sat, 01 Jun 2013 16:35:32 GMT
Just to add more, continuing the thread above:

  protected synchronized long getEstimatedTotalMapOutputSize()  {
    if(completedMapsUpdates < threshholdToUse) {
      return 0;
    } else {
      long inputSize = job.getInputLength() + job.desiredMaps();
      //add desiredMaps() so that randomwriter case doesn't blow up
      //the multiplication might lead to overflow, casting it with
      //double prevents it
      long estimate = Math.round(((double)inputSize *
          completedMapsOutputSize * 2.0)/completedMapsInputSize);
      if (LOG.isDebugEnabled()) {
        LOG.debug("estimate total map output will be " + estimate);
      }
      return estimate;
    }
  }
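
For reference, a minimal sketch of a guard against the zero divisor (my
own illustration, not necessarily the patch committed for
MAPREDUCE-5288):

  protected synchronized long getEstimatedTotalMapOutputSize() {
    // also bail out while completedMapsInputSize is still zero, since
    // the division below would otherwise produce Infinity, and
    // Math.round(Infinity) == Long.MAX_VALUE
    if (completedMapsUpdates < threshholdToUse
        || completedMapsInputSize == 0) {
      return 0;
    } else {
      long inputSize = job.getInputLength() + job.desiredMaps();
      long estimate = Math.round(((double) inputSize *
          completedMapsOutputSize * 2.0) / completedMapsInputSize);
      if (LOG.isDebugEnabled()) {
        LOG.debug("estimate total map output will be " + estimate);
      }
      return estimate;
    }
  }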


On Sun, Jun 2, 2013 at 12:34 AM, Azuryy Yu <azuryyyu@gmail.com> wrote:

> This should be fixed in the hadoop-1.1.2 stable release.
> If completedMapsInputSize is zero, then the job's map tasks MUST be
> zero, so the estimated output size is zero.
> Below is the code:
>
>   long getEstimatedMapOutputSize() {
>     long estimate = 0L;
>     if (job.desiredMaps() > 0) {
>       estimate = getEstimatedTotalMapOutputSize()  / job.desiredMaps();
>     }
>     return estimate;
>   }
>
>
>
> On Sat, Jun 1, 2013 at 11:49 PM, Harsh J <harsh@cloudera.com> wrote:
>
>> Does smell like a bug, as that number you get is simply Long.MAX_VALUE,
>> or 8 exbibytes.
>>
>> Looking at the sources, this turns out to be a rather funny Java issue
>> (there's a divide by zero happening, and [1] says Math.round returns
>> Long.MAX_VALUE in such a case). I've logged a bug report for this at
>> https://issues.apache.org/jira/browse/MAPREDUCE-5288 with a
>> reproducible case.
>>
>> Does this happen consistently for you?
>>
>> [1]
>> http://docs.oracle.com/javase/6/docs/api/java/lang/Math.html#round(double)
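>>
>> A minimal standalone reproduction of just the Java behaviour (my own
>> snippet, plain JDK):
>>
>>   public class RoundDemo {
>>     public static void main(String[] args) {
>>       // double division by zero yields Infinity, not an exception
>>       double estimate = ((double) 614400 * 1024 * 2.0) / 0;
>>       System.out.println(Math.round(estimate));  // 9223372036854775807
>>     }
>>   }
>>
>> Math.round(double) maps positive infinity to Long.MAX_VALUE, which is
>> the 8-exbibyte figure above.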
>>
>> On Sat, Jun 1, 2013 at 7:27 PM, Lanati, Matteo <Matteo.Lanati@lrz.de>
>> wrote:
>> > Hi all,
>> >
>> > I stumbled upon this problem as well while trying to run the default
>> > wordcount shipped with Hadoop 1.2.0. My testbed is made up of 2 virtual
>> > machines: Debian 7, Oracle Java 7, 2 GB RAM, 25 GB hard disk. One node
>> > is used as JT+NN, the other as TT+DN. Security is enabled. The input
>> > file is about 600 kB and the error is
>> >
>> > 2013-06-01 12:22:51,999 WARN org.apache.hadoop.mapred.JobInProgress: No
>> > room for map task. Node 10.156.120.49 has 22854692864 bytes free; but
>> > we expect map to take 9223372036854775807
>> >
>> > The logfile is attached, together with the configuration files. The
>> > version I'm using is
>> >
>> > Hadoop 1.2.0
>> > Subversion
>> > https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1479473
>> > Compiled by hortonfo on Mon May  6 06:59:37 UTC 2013
>> > From source with checksum 2e0dac51ede113c1f2ca8e7d82fb3405
>> > This command was run using
>> > /home/lu95jib/hadoop-exmpl/hadoop-1.2.0/hadoop-core-1.2.0.jar
>> >
>> > If I run the default configuration (i.e. no security), then the job
>> > succeeds.
>> >
>> > Is there something missing in how I set up my nodes? How is it possible
>> > that the envisaged value for the needed space is so big?
>> >
>> > Thanks in advance.
>> >
>> > Matteo
>> >
>> >
>> >
>> >>Which version of Hadoop are you using? A quick search shows me a bug
>> >>https://issues.apache.org/jira/browse/HADOOP-5241 that seems to show
>> >>similar symptoms. However, that was fixed a long while ago.
>> >>
>> >>
>> >>On Sat, Mar 23, 2013 at 4:40 PM, Redwane belmaati cherkaoui <
>> >>reduno1985@googlemail.com> wrote:
>> >>
>> >>> This is the content of the jobtracker log file:
>> >>> 2013-03-23 12:06:48,912 INFO org.apache.hadoop.mapred.JobInProgress: Input
>> >>> size for job job_201303231139_0001 = 6950001. Number of splits = 7
>> >>> 2013-03-23 12:06:48,925 INFO org.apache.hadoop.mapred.JobInProgress:
>> >>> tip:task_201303231139_0001_m_000000 has split on
>> >>> node:/default-rack/hadoop0.novalocal
>> >>> 2013-03-23 12:06:48,927 INFO org.apache.hadoop.mapred.JobInProgress:
>> >>> tip:task_201303231139_0001_m_000001 has split on
>> >>> node:/default-rack/hadoop0.novalocal
>> >>> 2013-03-23 12:06:48,930 INFO org.apache.hadoop.mapred.JobInProgress:
>> >>> tip:task_201303231139_0001_m_000002 has split on
>> >>> node:/default-rack/hadoop0.novalocal
>> >>> 2013-03-23 12:06:48,931 INFO org.apache.hadoop.mapred.JobInProgress:
>> >>> tip:task_201303231139_0001_m_000003 has split on
>> >>> node:/default-rack/hadoop0.novalocal
>> >>> 2013-03-23 12:06:48,933 INFO org.apache.hadoop.mapred.JobInProgress:
>> >>> tip:task_201303231139_0001_m_000004 has split on
>> >>> node:/default-rack/hadoop0.novalocal
>> >>> 2013-03-23 12:06:48,934 INFO org.apache.hadoop.mapred.JobInProgress:
>> >>> tip:task_201303231139_0001_m_000005 has split on
>> >>> node:/default-rack/hadoop0.novalocal
>> >>> 2013-03-23 12:06:48,939 INFO org.apache.hadoop.mapred.JobInProgress:
>> >>> tip:task_201303231139_0001_m_000006 has split on
>> >>> node:/default-rack/hadoop0.novalocal
>> >>> 2013-03-23 12:06:48,950 INFO org.apache.hadoop.mapred.JobInProgress:
>> >>> job_201303231139_0001 LOCALITY_WAIT_FACTOR=0.5
>> >>> 2013-03-23 12:06:48,978 INFO org.apache.hadoop.mapred.JobInProgress: Job
>> >>> job_201303231139_0001 initialized successfully with 7 map tasks and 1
>> >>> reduce tasks.
>> >>> 2013-03-23 12:06:50,855 INFO org.apache.hadoop.mapred.JobTracker: Adding
>> >>> task (JOB_SETUP) 'attempt_201303231139_0001_m_000008_0' to tip
>> >>> task_201303231139_0001_m_000008, for tracker
>> >>> 'tracker_hadoop0.novalocal:hadoop0.novalocal/127.0.0.1:44879'
>> >>> 2013-03-23 12:08:00,340 INFO org.apache.hadoop.mapred.JobInProgress: Task
>> >>> 'attempt_201303231139_0001_m_000008_0' has completed
>> >>> task_201303231139_0001_m_000008 successfully.
>> >>> 2013-03-23 12:08:00,538 WARN org.apache.hadoop.mapred.JobInProgress: No
>> >>> room for map task. Node hadoop0.novalocal has 8791543808 bytes free; but
>> >>> we expect map to take 1317624576693539401
>> >>> 2013-03-23 12:08:00,543 WARN org.apache.hadoop.mapred.JobInProgress: No
>> >>> room for map task. Node hadoop0.novalocal has 8791543808 bytes free; but
>> >>> we expect map to take 1317624576693539401
>> >>> 2013-03-23 12:08:00,544 WARN org.apache.hadoop.mapred.JobInProgress: No
>> >>> room for map task. Node hadoop0.novalocal has 8791543808 bytes free; but
>> >>> we expect map to take 1317624576693539401
>> >>> 2013-03-23 12:08:00,544 WARN org.apache.hadoop.mapred.JobInProgress: No
>> >>> room for map task. Node hadoop0.novalocal has 8791543808 bytes free; but
>> >>> we expect map to take 1317624576693539401
>> >>> 2013-03-23 12:08:01,264 WARN org.apache.hadoop.mapred.JobInProgress: No
>> >>> room for map task. Node hadoop1.novalocal has 8807518208 bytes free; but
>> >>> we expect map to take 1317624576693539401
>> >>>
>> >>>
>> >>> The value in "we expect map to take" is absurdly large:
>> >>> 1317624576693539401 bytes!
>> >>>
>> >>> On Sat, Mar 23, 2013 at 11:37 AM, Redwane belmaati cherkaoui <
>> >>> reduno1985@googlemail.com> wrote:
>> >>>
>> >>>> The estimated value that Hadoop computes is far too large for the
>> >>>> simple example that I am running.
>> >>>>
>> >>>> ---------- Forwarded message ----------
>> >>>> From: Redwane belmaati cherkaoui <reduno1985@googlemail.com>
>> >>>>  Date: Sat, Mar 23, 2013 at 11:32 AM
>> >>>> Subject: Re: About running a simple wordcount mapreduce
>> >>>> To: Abdelrahman Shettia <ashettia@hortonworks.com>
>> >>>> Cc: user@hadoop.apache.org, reduno1985 <reduno1985@gmail.com>
>> >>>>
>> >>>>
>> >>>> This is the output that I get. I am running two machines, as you
>> >>>> can see. Do you see anything suspicious?
>> >>>> Configured Capacity: 21145698304 (19.69 GB)
>> >>>> Present Capacity: 17615499264 (16.41 GB)
>> >>>> DFS Remaining: 17615441920 (16.41 GB)
>> >>>> DFS Used: 57344 (56 KB)
>> >>>> DFS Used%: 0%
>> >>>> Under replicated blocks: 0
>> >>>> Blocks with corrupt replicas: 0
>> >>>> Missing blocks: 0
>> >>>>
>> >>>> -------------------------------------------------
>> >>>> Datanodes available: 2 (2 total, 0 dead)
>> >>>>
>> >>>> Name: 11.1.0.6:50010
>> >>>> Decommission Status : Normal
>> >>>> Configured Capacity: 10572849152 (9.85 GB)
>> >>>> DFS Used: 28672 (28 KB)
>> >>>> Non DFS Used: 1765019648 (1.64 GB)
>> >>>> DFS Remaining: 8807800832(8.2 GB)
>> >>>> DFS Used%: 0%
>> >>>> DFS Remaining%: 83.31%
>> >>>> Last contact: Sat Mar 23 11:30:10 CET 2013
>> >>>>
>> >>>>
>> >>>> Name: 11.1.0.3:50010
>> >>>> Decommission Status : Normal
>> >>>> Configured Capacity: 10572849152 (9.85 GB)
>> >>>> DFS Used: 28672 (28 KB)
>> >>>> Non DFS Used: 1765179392 (1.64 GB)
>> >>>> DFS Remaining: 8807641088(8.2 GB)
>> >>>> DFS Used%: 0%
>> >>>> DFS Remaining%: 83.3%
>> >>>> Last contact: Sat Mar 23 11:30:08 CET 2013
>> >>>>
>> >>>>
>> >>>> On Fri, Mar 22, 2013 at 10:19 PM, Abdelrahman Shettia <
>> >>>> ashettia@hortonworks.com> wrote:
>> >>>>
>> >>>>> Hi Redwane,
>> >>>>>
>> >>>>> Please run the following command as the hdfs user on any datanode.
>> >>>>> The output will be something like this. Hope this helps:
>> >>>>>
>> >>>>> hadoop dfsadmin -report
>> >>>>> Configured Capacity: 81075068925 (75.51 GB)
>> >>>>> Present Capacity: 70375292928 (65.54 GB)
>> >>>>> DFS Remaining: 69895163904 (65.09 GB)
>> >>>>> DFS Used: 480129024 (457.89 MB)
>> >>>>> DFS Used%: 0.68%
>> >>>>> Under replicated blocks: 0
>> >>>>> Blocks with corrupt replicas: 0
>> >>>>> Missing blocks: 0
>> >>>>>
>> >>>>> Thanks
>> >>>>> -Abdelrahman
>> >>>>>
>> >>>>>
>> >>>>> On Fri, Mar 22, 2013 at 12:35 PM, reduno1985 <
>> >>>>> reduno1985@googlemail.com> wrote:
>> >>>>>
>> >>>>>>
>> >>>>>> I have my hosts running on OpenStack virtual machine instances;
>> >>>>>> each instance has a 10 GB hard disk. Is there a way to see how
>> >>>>>> much space is in HDFS without the web UI?
>> >>>>>>
>> >>>>>>
>> >>>>>> Sent from Samsung Mobile
>> >>>>>>
>> >>>>>> Serge Blazhievsky <hadoop.ca@gmail.com> wrote:
>> >>>>>> Check in the web UI how much space you have on HDFS.
>> >>>>>>
>> >>>>>> Sent from my iPhone
>> >>>>>>
>> >>>>>> On Mar 22, 2013, at 11:41 AM, Abdelrahman Shettia <
>> >>>>>> ashettia@hortonworks.com> wrote:
>> >>>>>>
>> >>>>>> Hi Redwane,
>> >>>>>>
>> >>>>>> It is possible that the hosts which are running the tasks do not
>> >>>>>> have enough space. Those dirs are configured in mapred-site.xml.
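>> >>>>>>
>> >>>>>> A minimal sketch of the relevant mapred-site.xml entry (the path
>> >>>>>> is illustrative; point it at a disk with enough free space):
>> >>>>>>
>> >>>>>>   <property>
>> >>>>>>     <name>mapred.local.dir</name>
>> >>>>>>     <value>/data/mapred/local</value>
>> >>>>>>   </property>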
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>> On Fri, Mar 22, 2013 at 8:42 AM, Redwane belmaati cherkaoui <
>> >>>>>> reduno1985@googlemail.com> wrote:
>> >>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> ---------- Forwarded message ----------
>> >>>>>>> From: Redwane belmaati cherkaoui <reduno1985@googlemail.com>
>> >>>>>>> Date: Fri, Mar 22, 2013 at 4:39 PM
>> >>>>>>> Subject: About running a simple wordcount mapreduce
>> >>>>>>> To: mapreduce-issues@hadoop.apache.org
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> Hi,
>> >>>>>>> I am trying to run a wordcount mapreduce job on several files
>> >>>>>>> (<20 MB) using two machines. I get stuck on 0% map 0% reduce.
>> >>>>>>> The jobtracker log file shows the following warning:
>> >>>>>>> WARN org.apache.hadoop.mapred.JobInProgress: No room for map
>> >>>>>>> task. Node hadoop0.novalocal has 8791384064 bytes free; but we
>> >>>>>>> expect map to take 1317624576693539401
>> >>>>>>>
>> >>>>>>> Please help me.
>> >>>>>>> Best Regards,
>> >>>>>>>
>> >>>>>>>
>> >>>>>>
>> >>>>>
>> >>>>
>> >>>>
>> >>>
>> >
>> >
>> > Matteo Lanati
>> > Distributed Resources Group
>> > Leibniz-Rechenzentrum (LRZ)
>> > Boltzmannstrasse 1
>> > 85748 Garching b. München (Germany)
>> > Phone: +49 89 35831 8724
>>
>>
>>
>> --
>> Harsh J
>>
>
>
