mesos-user mailing list archives

From Brian Topping <brian.topp...@gmail.com>
Subject Re: Debugging hadoop-mesos
Date Thu, 07 May 2015 11:00:09 GMT
Thanks guys, this was helpful. I started the job tracker as a service, but apparently I never
started the task tracker (or it failed to start and I didn't notice). I started it after Haosdent's
message, but wasn't able to see any difference and I kept poking around.

After I made some changes, the VM wouldn't boot, so my OCD got the better of me and I reinstalled
everything from scratch. There are just too many moving parts to hassle you guys with an imperfect
install on my end.

This time through, I felt a lot more confident using the Mesosphere RPMs, but I couldn't
find the best way to get things launched. https://docs.mesosphere.com/reference/packages/
has a Last-Modified of Fri, 01 May 2015 18:46:10 GMT (one week ago), but the RHEL 6 RPMs
don't include the init.d service scripts that the packages page describes. For now, I just
launched the daemons manually, but I would like the machine to bring everything up as
services on boot.

At this point, I have tested Mesos with:

	mesos-execute --master="localhost:5050" --name="test-exec" --command="sleep 10"

The only problem there is that "localhost" doesn't seem to be good enough for my install; the
master address needs to be the FQDN. With the FQDN it works and the job flows through the UI.
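
For reference, here is a minimal sketch of the same test with the master address built from the
host's FQDN instead of "localhost". It assumes that `hostname -f` returns the name the master is
actually reachable at and that the master listens on port 5050 as above; the MASTER variable is
only an illustrative name.

```shell
#!/bin/sh
# Assumption: `hostname -f` prints the FQDN the Mesos master is reachable at;
# fall back to the short hostname if -f is not supported on this system.
MASTER="$(hostname -f 2>/dev/null || hostname):5050"
echo "Using master: ${MASTER}"

# Only run the test task if the Mesos CLI tools are installed on this machine.
if command -v mesos-execute >/dev/null 2>&1; then
    mesos-execute --master="${MASTER}" --name="test-exec" --command="sleep 10"
fi
```

If the FQDN that `hostname -f` reports is not the one the master advertises, the address would
need to be set by hand instead.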

Now, back to a hadoop job. When I try the job now, the logs show the following stream of repeated
messages:

> 2015-05-07 17:52:53,124 INFO org.apache.hadoop.mapred.ResourcePolicy: Satisfied map and reduce slots needed.
> 2015-05-07 17:52:53,340 INFO org.apache.hadoop.mapred.MesosScheduler: Unknown/exited TaskTracker: http://10.211.55.16:50060.
> [Repeated a few times a second for five seconds]
> 2015-05-07 17:49:08,914 INFO org.apache.hadoop.mapred.ResourcePolicy: JobTracker Status
>       Pending Map Tasks: 4
>    Pending Reduce Tasks: 1
>       Running Map Tasks: 0
>    Running Reduce Tasks: 0
>          Idle Map Slots: 0
>       Idle Reduce Slots: 0
>      Inactive Map Slots: 4 (launched but no hearbeat yet)
>   Inactive Reduce Slots: 1 (launched but no hearbeat yet)
>        Needed Map Slots: 0
>     Needed Reduce Slots: 0
>      Unhealthy Trackers: 0

This looks close.

What's the best way to get a JDWP port set up to break in this code (i.e. learning to fish...)?
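
One possibility, as a sketch rather than a confirmed recipe: since the MesosScheduler runs inside
the JobTracker JVM, the standard JDWP agent option could be passed to the JobTracker. This assumes
the JobTracker start script honors HADOOP_JOBTRACKER_OPTS (for example via hadoop-env.sh) and that
port 8000 is free; both are assumptions about this install, and JDWP_PORT and JDWP_OPTS are just
illustrative names.

```shell
#!/bin/sh
# Assumption: port 8000 is unused on the JobTracker host.
JDWP_PORT=8000

# Standard JVM debug agent string: listen on a socket, do not block startup.
# Use suspend=y instead to make the JVM wait until a debugger attaches.
JDWP_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=${JDWP_PORT}"

# Assumption: the JobTracker launcher reads HADOOP_JOBTRACKER_OPTS.
export HADOOP_JOBTRACKER_OPTS="${HADOOP_JOBTRACKER_OPTS:-} ${JDWP_OPTS}"
echo "${HADOOP_JOBTRACKER_OPTS}"
```

After restarting the JobTracker with this in its environment, a remote debugger should be able to
attach to that port and break in org.apache.hadoop.mapred.MesosScheduler.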

best, Brian


> On May 7, 2015, at 12:11 PM, Adam Bordelon <adam@mesosphere.io> wrote:
> 
> From the mesos-master log and the JT log, it doesn't look like the MesosScheduler ever
> registered with Mesos, which should mean that it wouldn't start any TTs or map/reduce tasks.
> However, your `ps` output does seem to show a tasktracker running. Did you start that yourself
> (or automatically as a system service)?
> 
> On Wed, May 6, 2015 at 9:32 AM, haosdent <haosdent@gmail.com> wrote:
> Did you start the tasktracker successfully?
> 
> On Wed, May 6, 2015 at 11:32 PM, Brian Topping <brian.topping@gmail.com> wrote:
> Hi all, I'm happy to report that I'm very close to getting 2.6.0-cdh5.4.0 integrated
> against Mesos 0.22.1 with the hadoop-mesos 0.10 code on Github. Hoping someone might have
> a few minutes to parse what I've got here and suggest something to try.
> 
> https://gist.github.com/briantopping/0dfd0777ff4ce5a81219
> hopefully has all the data necessary between the console output of the client run, the mesos
> master and slave console, the XML configuration of the JT and the output that was generated
> by it. Please let me know if I've left something out.
> 
> I iterated a few times getting all the errors from missing paths or libraries sorted
> out, but the example client ultimately just sits waiting forever at "map 0% reduce 0%".
> 
> Any input kindly appreciated!
> 
> Brian
> 
> 
> 
> --
> Best Regards,
> Haosdent Huang
> 

