mesos-user mailing list archives

From "Tom Arnfeld" <...@duedil.com>
Subject Re: Debugging hadoop-mesos
Date Thu, 07 May 2015 11:05:40 GMT
Hi Brian,

At this point you should see Mesos attempting to launch the TT (TaskTracker). The "launched
but no heartbeat yet" count tells us that the framework has accepted resources for 4 slots but
the TT hasn't actually come up yet.
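
As a rough illustration of reading that status block, here is a minimal Python sketch that
pulls the counters out of the "JobTracker Status" text and adds up the slots that were launched
but have not sent a heartbeat. The sample text is copied from the log quoted further down in
this thread (including the log's own "hearbeat" spelling); the function names parse_status and
stuck_slots are made up for this example and are not part of hadoop-mesos.

```python
import re

# Sample copied from the JobTracker log quoted in this thread.
SAMPLE_STATUS = """\
      Pending Map Tasks: 4
   Pending Reduce Tasks: 1
      Running Map Tasks: 0
   Running Reduce Tasks: 0
         Idle Map Slots: 0
      Idle Reduce Slots: 0
     Inactive Map Slots: 4 (launched but no hearbeat yet)
  Inactive Reduce Slots: 1 (launched but no hearbeat yet)
       Needed Map Slots: 0
    Needed Reduce Slots: 0
     Unhealthy Trackers: 0
"""


def parse_status(text):
    """Return {label: count} for every 'Label: N' line in a status block."""
    counts = {}
    for line in text.splitlines():
        # Leading whitespace, a label made of letters and spaces, a colon, then an integer.
        match = re.match(r"\s*([A-Za-z ]+?):\s*(\d+)", line)
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


def stuck_slots(counts):
    """Slots accepted from Mesos whose TaskTracker has not reported in yet."""
    return counts.get("Inactive Map Slots", 0) + counts.get("Inactive Reduce Slots", 0)


status = parse_status(SAMPLE_STATUS)
print(stuck_slots(status))
```

If that number stays above zero for more than a minute or so, the TaskTracker task itself is
the place to look rather than the JobTracker.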

Do you see the task in your Mesos cluster UI, and is there anything interesting in the task
logs?

--
Tom Arnfeld
Developer // DueDil
(+44) 7525940046
25 Christopher Street, London, EC2A 2BS

On Thu, May 7, 2015 at 12:01 PM, Brian Topping <brian.topping@gmail.com> wrote:

> Thanks guys, this was helpful. I started the job tracker as a service, but apparently
> I never started the task tracker (or it failed to start and I didn't notice). I started it
> after Haosdent's message, but wasn't able to see any difference and I kept poking around.
> After I made some changes the VM wouldn't boot, so my OCD got the better of me and I
> reinstalled everything from scratch. There are just too many moving parts to hassle you guys
> with an imperfect install on my end.
> This time through, I felt a lot more confident using the Mesosphere RPMs, but I couldn't
> find the best way to get things launched. https://docs.mesosphere.com/reference/packages/
> has a Last-Modified of Fri, 01 May 2015 18:46:10 GMT (one week ago), but the RHEL 6 RPMs
> don't have any init.d service descriptions as the packages page would indicate. For now, I
> just launched them manually, but would like to get the machine to completely load on boot
> as services.
> At this point, I have tested Mesos with:
> 	mesos-execute --master="localhost:5050" --name="test-exec" --command="sleep 10"
> The only problem there is that "localhost" doesn't seem to be good enough for my install;
> it needs to be the FQDN. But it works and the job flows through the UI.
> Now, back to a hadoop job. When I try the job now, the logs show the following stream
> of repeated messages:
>> 2015-05-07 17:52:53,124 INFO org.apache.hadoop.mapred.ResourcePolicy: Satisfied map and reduce slots needed.
>> 2015-05-07 17:52:53,340 INFO org.apache.hadoop.mapred.MesosScheduler: Unknown/exited TaskTracker: http://10.211.55.16:50060.
>> [Repeated a few times a second for five seconds]
>> 2015-05-07 17:49:08,914 INFO org.apache.hadoop.mapred.ResourcePolicy: JobTracker Status
>>       Pending Map Tasks: 4
>>    Pending Reduce Tasks: 1
>>       Running Map Tasks: 0
>>    Running Reduce Tasks: 0
>>          Idle Map Slots: 0
>>       Idle Reduce Slots: 0
>>      Inactive Map Slots: 4 (launched but no hearbeat yet)
>>   Inactive Reduce Slots: 1 (launched but no hearbeat yet)
>>        Needed Map Slots: 0
>>     Needed Reduce Slots: 0
>>      Unhealthy Trackers: 0
> This looks close.
> What's the best way to get a JDWP port set up to break in this code (i.e. learning to
> fish...)?
> best, Brian
>> On May 7, 2015, at 12:11 PM, Adam Bordelon <adam@mesosphere.io> wrote:
>> 
>> From the mesos-master log and the JT log, it doesn't look like the MesosScheduler
>> ever registered with Mesos, which should mean that it wouldn't start any TTs or map/reduce
>> tasks. However, your `ps` output does seem to show a tasktracker running. Did you start that
>> yourself (or automatically as a system service)?
>> 
>> On Wed, May 6, 2015 at 9:32 AM, haosdent <haosdent@gmail.com> wrote:
>> Did you start the tasktracker successfully?
>>
>> On Wed, May 6, 2015 at 11:32 PM, Brian Topping <brian.topping@gmail.com> wrote:
>> Hi all, I'm happy to report that I'm very close to getting 2.6.0-cdh5.4.0 integrated
>> against Mesos 0.22.1 with the hadoop-mesos 0.10 code on Github. Hoping someone might have
>> a few minutes to parse what I've got here and suggest something to try.
>>
>> https://gist.github.com/briantopping/0dfd0777ff4ce5a81219
>> hopefully has all the data necessary between the console output of the client run, the mesos
>> master and slave console, the XML configuration of the JT and the output that was generated
>> by it. Please let me know if I've left something out.
>>
>> I iterated a few times getting all the errors from missing paths or libraries sorted
>> out, but the example client ultimately just sits waiting forever at "map 0% reduce 0%".
>> 
>> Any input kindly appreciated!
>> 
>> Brian
>> 
>> 
>> 
>> --
>> Best Regards,
>> Haosdent Huang
>> 