From: Brian Topping
To: user@mesos.apache.org
Subject: Re: Debugging hadoop-mesos
Date: Thu, 7 May 2015 18:00:09 +0700
Message-Id: <49F121A1-16B4-424E-985B-AB6F885A8F76@gmail.com>

Thanks guys, this was helpful. I started the job tracker as a service, but apparently I never started the task tracker (or it failed to start and I didn't notice). I started it after Haosdent's message, but wasn't able to see any difference, so I kept poking around.

After I made some changes and the VM wouldn't boot, my OCD got the better of me and I reinstalled everything from scratch. There are just too many moving parts to hassle you guys with an imperfect install on my end.

This time through, I felt a lot more confident using the Mesosphere RPMs, but I couldn't find the best way to get things launched. https://docs.mesosphere.com/reference/packages/ has a Last-Modified of Fri, 01 May 2015 18:46:10 GMT (one week ago), but the RHEL 6 RPMs don't include any init.d service scripts, contrary to what the packages page indicates. For now, I just launched them manually, but I would like the machine to bring everything up as services on boot.

At this point, I have tested Mesos with:

mesos-execute --master="localhost:5050" --name="test-exec" --command="sleep 10"

The only problem there is that "localhost" doesn't seem to be good enough for my install; it needs to be the FQDN. With that change it works, and the job flows through the UI.
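One way to avoid hard-coding the hostname is to derive the master address from the machine's FQDN. This is only a sketch: the `master_addr` helper is hypothetical, it assumes `hostname -f` returns a name the master is actually reachable under, and the default port 5050 is simply the one used in the command above.

```shell
# Hypothetical helper: print "<host>:<port>" for the Mesos master.
#   $1 = host name (defaults to this machine's FQDN via `hostname -f`)
#   $2 = port      (defaults to 5050, the port used in the command above)
master_addr() {
  host="${1:-$(hostname -f)}"
  port="${2:-5050}"
  printf '%s:%s\n' "$host" "$port"
}
```

The earlier test would then be run as `mesos-execute --master="$(master_addr)" --name="test-exec" --command="sleep 10"`.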

Now, back to the Hadoop job. When I run it now, the logs show the following stream of repeated messages:

> 2015-05-07 17:52:53,124 INFO org.apache.hadoop.mapred.ResourcePolicy: Satisfied map and reduce slots needed.
> 2015-05-07 17:52:53,340 INFO org.apache.hadoop.mapred.MesosScheduler: Unknown/exited TaskTracker: http://10.211.55.16:50060.
> [Repeated a few times a second for five seconds]
> 2015-05-07 17:49:08,914 INFO org.apache.hadoop.mapred.ResourcePolicy: JobTracker Status
>       Pending Map Tasks: 4
>    Pending Reduce Tasks: 1
>       Running Map Tasks: 0
>    Running Reduce Tasks: 0
>          Idle Map Slots: 0
>       Idle Reduce Slots: 0
>      Inactive Map Slots: 4 (launched but no hearbeat yet)
>   Inactive Reduce Slots: 1 (launched but no hearbeat yet)
>        Needed Map Slots: 0
>     Needed Reduce Slots: 0
>      Unhealthy Trackers: 0

This looks close.

What's the best way to get a JDWP port set up so I can break into this code (i.e. learning to fish...)?
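For context, a JVM is normally opened for remote debugging with the standard `-agentlib:jdwp` agent option. The sketch below only builds that option string. The `jdwp_opts` helper and port 5005 are hypothetical, and passing the result through `HADOOP_JOBTRACKER_OPTS` is an assumption about how this JobTracker is launched, not something confirmed for this install.

```shell
# Hypothetical helper: build the standard JDWP agent option for a JVM.
#   $1 = port the JVM should listen on (5005 is an arbitrary choice)
#   $2 = "y" to make the JVM wait for a debugger before running, else "n"
jdwp_opts() {
  port="${1:-5005}"
  suspend="${2:-n}"
  printf '%s\n' "-agentlib:jdwp=transport=dt_socket,server=y,suspend=${suspend},address=${port}"
}

# Assumption: the JobTracker launcher honors HADOOP_JOBTRACKER_OPTS
# (for example when it is set in hadoop-env.sh before the service starts).
HADOOP_JOBTRACKER_OPTS="${HADOOP_JOBTRACKER_OPTS:-} $(jdwp_opts 5005 n)"
export HADOOP_JOBTRACKER_OPTS
```

A debugger's remote-attach configuration would then connect to that port on the JobTracker host; `suspend=y` is the useful variant when the breakpoint sits in startup code. TaskTrackers launched through Mesos are separate JVMs, so this would not cover them.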

best, Brian


> On May 7, 2015, at 12:11 PM, Adam Bordelon <adam@mesosphere.io> wrote:
>
> From the mesos-master log and the JT log, it doesn't look like the MesosScheduler ever registered with Mesos, which should mean that it wouldn't start any TTs or map/reduce tasks. However, your `ps` output does seem to show a tasktracker running. Did you start that yourself (or automatically as a system service)?
>
> On Wed, May 6, 2015 at 9:32 AM, haosdent <haosdent@gmail.com> wrote:
> Do you start tasktracker successfully?
>
> On Wed, May 6, 2015 at 11:32 PM, Brian Topping <brian.topping@gmail.com> wrote:
> Hi all, I'm happy to report that I'm very close to getting 2.6.0-cdh5.4.0 integrated against Mesos 0.22.1 with the hadoop-mesos 0.10 code on Github. Hoping someone might have a few minutes to parse what I've got here and suggest something to try.
>
> https://gist.github.com/briantopping/0dfd0777ff4ce5a81219 hopefully has all the data necessary between the console output of the client run, the mesos master and slave console, the XML configuration of the JT and the output that was generated by it. Please let me know if I've left something out.
>
> I iterated a few times getting all the errors from missing paths or libraries sorted out, but the example client ultimately just sits waiting forever at "map 0% reduce 0%".
>
> Any input kindly appreciated!
>
> Brian
>
> --
> Best Regards,
> Haosdent Huang

