hadoop-common-user mailing list archives

From Varun Saxena <vsaxena.va...@gmail.com>
Subject Re: Yarn ResourceManager web UI does not show job
Date Tue, 22 Sep 2015 14:39:30 GMT
The Job History Server runs its cleaner 30 seconds after a restart and then
once every day, if the cleaner is enabled. That is why jobs older than 7 days
were deleted.
Regarding your second question:
No, you cannot recover the deleted files.
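
To illustrate the age check, here is a minimal Python sketch. It is not the
History Server's actual code: the function name, job IDs and finish times are
hypothetical, and the only value taken from Hadoop is the 7-day default of
mapreduce.jobhistory.max-age-ms. It assumes a job counts as expired once its
finish time is older than the configured maximum age.

```python
from datetime import datetime, timedelta, timezone

# Default of mapreduce.jobhistory.max-age-ms: 7 days, in milliseconds.
DEFAULT_MAX_AGE_MS = 7 * 24 * 60 * 60 * 1000


def expired_jobs(finish_times, now, max_age_ms=DEFAULT_MAX_AGE_MS):
    """Return the sorted IDs of jobs that finished before the age cutoff."""
    cutoff = now - timedelta(milliseconds=max_age_ms)
    return sorted(job for job, finished in finish_times.items() if finished < cutoff)


# Hypothetical job IDs and finish times, for illustration only.
now = datetime(2015, 9, 22, 14, 0, tzinfo=timezone.utc)
jobs = {
    "job_old": datetime(2015, 9, 10, 9, 0, tzinfo=timezone.utc),
    "job_recent": datetime(2015, 9, 20, 9, 0, tzinfo=timezone.utc),
}
print(expired_jobs(jobs, now))
```

Raising mapreduce.jobhistory.max-age-ms before the cleaner's next run widens
the window in the same way as passing a larger max_age_ms here.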

Regards,
Varun Saxena

On Tue, Sep 22, 2015 at 7:08 AM, Boyu Zhang <boyuzhang35@gmail.com> wrote:

> Thanks a lot for the answer!
>
> If you don't mind helping more with this, here is what I am seeing.
>
> - The NameNode/DataNode and ResourceManager/NodeManager were running for 6
> months before I discovered that the job history server was not running.
> After bringing up the job history server, I saw 2k+ jobs showing up
> in the history server web UI. But then the job history server got
> restarted, and I don't see any jobs more than 7 days old showing up in the
> history web UI.
>
> - I've disabled the cleaner in the config file.
>
> My question is, is there a way to find/recover the job history files more
> than 7 days old? I read that the container logs are stored locally in the
> NodeManager user log dir, and there are files there (I have not dug through
> them yet). I am not sure whether the job history files deleted by the history
> cleaner can be recovered.
>
> Thanks in advance,
> Boyu
>
>
> On Mon, Sep 21, 2015 at 4:35 PM, Varun Saxena <vsaxena.varun@gmail.com>
> wrote:
>
>> MR jobs will write history files to the path given by the config
>> mapreduce.jobhistory.intermediate-done-dir.
>> The History Server will then move them to the done dir, which is given by
>> the config mapreduce.jobhistory.done-dir.
>>
>> By default these config values
>> are ${yarn.app.mapreduce.am.staging-dir}/history/done_intermediate
>> and ${yarn.app.mapreduce.am.staging-dir}/history/done respectively.
>>
>> The 7 days is also configurable (the config being
>> mapreduce.jobhistory.max-age-ms). You can set this value according to your
>> cluster.
>>
>> I hope this answers your question.
>>
>> Regards,
>> Varun Saxena.
>>
>> On Tue, Sep 22, 2015 at 1:39 AM, Boyu Zhang <boyuzhang35@gmail.com>
>> wrote:
>>
>>> Thanks a lot for the clarification!
>>>
>>> I tried to find the log and history information about finished jobs. But
>>> they are not in hdfs://xxx/user/myusername/output/_SUCCESS (0B). Can you
>>> please give some pointers on where the statistical/job history files are
>>> located? The hdfs://xxxx/history/done only stores history files up to 7 days.
>>>
>>> Thanks,
>>> Boyu
>>>
>>> On Mon, Sep 21, 2015 at 1:23 PM, Varun Saxena <vsaxena.varun@gmail.com>
>>> wrote:
>>>
>>>> No, you can't show them in the RM UI then.
>>>>
>>>> However, if you can start another daemon, you can consider using the YARN
>>>> Application History/Timeline Server or the MR Job History Server (only for
>>>> MR jobs) to see information about completed jobs.
>>>> You can look up Hadoop documentation to learn more about them and how
>>>> to configure them.
>>>>
>>>> Just to clarify though, the apps themselves are not lost, as in, the
>>>> output is not lost. It's just the information about them which is no longer
>>>> present on RM restart.
>>>>
>>>> Regards,
>>>> Varun Saxena.
>>>>
>>>> On Mon, Sep 21, 2015 at 10:31 PM, Boyu Zhang <boyuzhang35@gmail.com>
>>>> wrote:
>>>>
>>>>> Thanks for the answer Varun.
>>>>>
>>>>> It is the case that yarn.resourcemanager.recovery.enabled is set to
>>>>> false. Is there a way to show the jobs that were submitted before the
>>>>> restart? We don't want to lose that data.
>>>>>
>>>>> Thanks,
>>>>> Boyu
>>>>>
>>>>>
>>>>> On Mon, Sep 21, 2015 at 12:53 PM, Varun Saxena <
>>>>> vsaxena.varun@gmail.com> wrote:
>>>>>
>>>>>> Hi Boyu,
>>>>>>
>>>>>> RM stores apps in state store if recovery is enabled. Only then they
>>>>>> will be available on restart.
>>>>>> Otherwise they are kept in memory and hence lost on restart.
>>>>>>
>>>>>> You may not have it enabled. Check the value of the config below. By
>>>>>> default it's false.
>>>>>> yarn.resourcemanager.recovery.enabled
>>>>>>
>>>>>> Regards,
>>>>>> Varun.
>>>>>>
>>>>>> On Mon, Sep 21, 2015 at 10:01 PM, Boyu Zhang <boyuzhang35@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hello Everyone,
>>>>>>>
>>>>>>> I have a strange error regarding the ResourceManager web UI (
>>>>>>> http://xx.xx:8088).
>>>>>>>
>>>>>>> Someone before me set up the hadoop + yarn cluster using Pivotal HD,
>>>>>>> and it was running fine. Then today, the resource manager and node
>>>>>>> manager disappeared, and the logs did not record this. I restarted them;
>>>>>>> they are up and running, but the resource manager web UI does not show
>>>>>>> any jobs. We had 700+ jobs in the past, and they were showing before.
>>>>>>>
>>>>>>> If I submit MapReduce jobs, the newly submitted ones show up. But they
>>>>>>> disappear again after restarting the resource manager and node manager.
>>>>>>>
>>>>>>> Can anyone give any hint on where to look?
>>>>>>>
>>>>>>> Thanks in advance,
>>>>>>> Boyu
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
