giraph-user mailing list archives

From Steven Harenberg <sdhar...@ncsu.edu>
Subject Re: [SOLVED] Re: Giraph job never ends
Date Sun, 15 Mar 2015 16:03:32 GMT
Figured out the issue via the container log file:
container_1426433168188_0001_01_000001/gam-stdout.log. The container was
trying to use too much virtual memory (I am using a micro instance on EC2,
so there is not much to work with), which caused an "exitCode: 143".
Apparently, YARN limits a container's virtual memory based on its physical
memory, but you can disable this check by adding the following to
yarn-site.xml:

<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
  <description>Whether virtual memory limits will be enforced for
containers.</description>
</property>

source:
http://stackoverflow.com/questions/14110428/am-container-is-running-beyond-virtual-memory-limits
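
A less drastic option than disabling the check, sketched here as an
assumption rather than something tested in this thread, is to raise the
virtual-to-physical ratio that YARN enforces. The property
yarn.nodemanager.vmem-pmem-ratio defaults to 2.1; the value 4 below is only
an illustrative guess and would need tuning for a given instance size:

```xml
<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <!-- Assumption: 4 is an illustrative value, not one tested in this thread; the default is 2.1 -->
  <value>4</value>
  <description>Ratio of virtual memory to physical memory used when setting
memory limits for containers.</description>
</property>
```

With this ratio, a container allocated 1024 MB of physical memory would be
allowed up to 4096 MB of virtual memory before the NodeManager kills it. The
NodeManager most likely needs a restart to pick up the change.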

Everything seems to be working for me now.

On Fri, Mar 13, 2015 at 10:24 PM, Steven Harenberg <sdharenb@ncsu.edu>
wrote:

> Thanks Phil, I appreciate the help. Your posts over the past couple days
> have already been quite helpful.
>
> There were a few things I was going to play with as well; perhaps it is
> some configuration issue, as you mentioned earlier. I had some issues with
> EC2 today and I will look at it again tomorrow.
>
> Thanks for letting me know about your talk, it sounds interesting. I will
> try and go as long as I can get there in time.
>
> --Steve
>
> On Fri, Mar 13, 2015 at 3:37 PM, Phillip Rhodes <motley.crue.fan@gmail.com>
> wrote:
>
>> Steve:
>>
>> I'm not 100% sure what to tell you, and I don't have access to my
>> cluster right this minute.  But later this evening I can log in and
>> see if I can find anything that might be
>> useful to you.
>>
>> Also, as an FYI, I'll be doing a presentation on Giraph at the
>> Triangle Java User's Group meeting this coming Monday... if you're in
>> the area (I see you have an @ncsu.edu address), and you can come by, I
>> might be able to help you then.   Part of my presentation will be
>> walking through how to set up a Giraph / YARN cluster, based on my
>> experiences over the past few days...
>>
>>
>> Phil
>>
>> This message optimized for indexing by NSA PRISM
>>
>>
>> On Fri, Mar 13, 2015 at 3:30 PM, Steven Harenberg <sdharenb@ncsu.edu>
>> wrote:
>> > Hey Phil,
>> >
>> > I have been having the exact same problems as you (I am also setting up
>> > Giraph on EC2), but this solution did not work for me.
>> >
>> > Do you recall what error you saw in resourcemanager logs? I am also
>> > looking at these logs, but nothing is standing out to me. In fact, it
>> > almost seems like the application should have successfully finished. The
>> > log stops updating and I see a lot of "COMPLETED", "RESULT=SUCCESS",
>> > "FINISHED" at the end of the log. Though, it does look like one of the
>> > containers is not transitioning to these states.
>> >
>> > Thanks,
>> > Steve
>> >
>> >
>> > On Wed, Mar 11, 2015 at 11:54 PM, Phillip Rhodes
>> > <motley.crue.fan@gmail.com> wrote:
>> >>
>> >> OK, this was easy enough to fix, once I understood what
>> >> was actually happening.  Since I'm running on EC2 nodes on
>> >> AWS, it is not the case that any given node can talk to any other
>> >> node on any port (at least not by default).  I had tried to
>> >> cherry-pick which ports to whitelist in the security group,
>> >> but I missed one or more that YARN needed for internal
>> >> communication.   I discovered this when examining the
>> >> resourcemanager logs.
>> >>
>> >>
>> >> For now, instead of trying to enumerate exactly which ports
>> >> to allow, I added a rule to allow "all traffic" for address
>> >> 10.0.0.0/24 and that solved this.
>> >>
>> >>
>> >> Cheers,
>> >>
>> >>
>> >> Phil
>> >>
>> >>
>> >> On Wed, Mar 11, 2015 at 1:39 PM, Phillip Rhodes
>> >> <motley.crue.fan@gmail.com> wrote:
>> >> > Interesting... It totally did not work for me when built using the
>> >> > hadoop_2 profile, but with the hadoop_yarn profile everything at least
>> >> > starts up.  I'm pretty baffled right now... my cluster is essentially
>> >> > working, and I can run, for example, the WordCount example just fine.
>> >> > And the Giraph job starts and shows no apparent errors, but I get no
>> >> > output and it seems to run forever.
>> >> >
>> >> > It's probably some really small detail of my Hadoop configuration, or
>> >> > some environmental issue.  The problem is, I don't even know where to
>> >> > start looking right now.  :-(
>> >> >
>> >> >
>> >> > Phil
>> >> >
>> >> >
>> >> > On Wed, Mar 11, 2015 at 3:16 AM, Martin Junghanns
>> >> > <martin.junghanns@gmx.net> wrote:
>> >> >> Hi Phillip,
>> >> >>
>> >> >> I am using Hadoop 2.5.2 with Giraph 1.1.0 and it runs fine with
>> >> >> -Phadoop2 (from scratch) and -Phadoop_yarn (after removing
>> >> >> STATIC_SASL_SYMBOL from munge.symbols in pom.xml).
>> >> >>
>> >> >> Maybe you can also try the stable Giraph
>> >> >> version and report your problem as an issue?
>> >> >>
>> >> >> Cheers,
>> >> >> Martin
>> >> >>
>> >> >> On 11.03.2015 04:03, Phillip Rhodes wrote:
>> >> >>> Giraph crew:
>> >> >>>
>> >> >>> I'm trying to run the SimpleShortestPathsComputation example using
>> >> >>> the latest Giraph code and Hadoop 2.5.2.  My command line looks
>> >> >>> like this:
>> >> >>>
>> >> >>> hadoop jar
>> >> >>> /home/prhodes/giraph/giraph-examples/target/giraph-examples-1.2.0-SNAPSHOT-for-hadoop-2.5.2-jar-with-dependencies.jar
>> >> >>> org.apache.giraph.GiraphRunner
>> >> >>> org.apache.giraph.examples.SimpleShortestPathsComputation -vif
>> >> >>> org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat
>> >> >>> -vip /user/prhodes/input/tiny_graph.txt -vof
>> >> >>> org.apache.giraph.io.formats.IdWithValueTextOutputFormat -op
>> >> >>> /user/prhodes/giraph_output/shortestpaths -w 4
>> >> >>>
>> >> >>>
>> >> >>> and the job appears to start OK.  But then it starts outputting
>> >> >>> these kinds of messages, and this just continues (seemingly)
>> >> >>> forever until you ctrl+c it.
>> >> >>>
>> >> >>> 15/03/11 02:54:31 INFO yarn.GiraphYarnClient: Giraph: org.apache.giraph.examples.SimpleShortestPathsComputation, Elapsed: 305.43 secs
>> >> >>> 15/03/11 02:54:31 INFO yarn.GiraphYarnClient: appattempt_1426041786848_0002_000001, State: ACCEPTED, Containers used: 1
>> >> >>> 15/03/11 02:54:35 INFO yarn.GiraphYarnClient: Giraph: org.apache.giraph.examples.SimpleShortestPathsComputation, Elapsed: 309.44 secs
>> >> >>> 15/03/11 02:54:35 INFO yarn.GiraphYarnClient: appattempt_1426041786848_0002_000001, State: ACCEPTED, Containers used: 1
>> >> >>> 15/03/11 02:54:39 INFO yarn.GiraphYarnClient: Giraph: org.apache.giraph.examples.SimpleShortestPathsComputation, Elapsed: 313.45 secs
>> >> >>> 15/03/11 02:54:39 INFO yarn.GiraphYarnClient: appattempt_1426041786848_0002_000001, State: ACCEPTED, Containers used: 1
>> >> >>> 15/03/11 02:54:43 INFO yarn.GiraphYarnClient: Giraph: org.apache.giraph.examples.SimpleShortestPathsComputation, Elapsed: 317.45 secs
>> >> >>> 15/03/11 02:54:43 INFO yarn.GiraphYarnClient: appattempt_1426041786848_0002_000001, State: ACCEPTED, Containers used: 1
>> >> >>> ^C15/03/11 02:54:47 INFO yarn.GiraphYarnClient: Giraph: org.apache.giraph.examples.SimpleShortestPathsComputation, Elapsed: 321.46 secs
>> >> >>> 15/03/11 02:54:47 INFO yarn.GiraphYarnClient: appattempt_1426041786848_0002_000001, State: ACCEPTED, Containers used: 1
>> >> >>>
>> >> >>> Any idea what is going on here?
>> >> >>>
>> >> >>>
>> >> >>> Thanks,
>> >> >>>
>> >> >>>
>> >> >>> Phil ---
>> >> >>>
>> >> >>>
>> >> >>>
>> >
>> >
>>
>
>
