hama-user mailing list archives

From Kostas Xirog <k0stasx1...@gmail.com>
Subject Re: Memory issue or what?...
Date Sun, 14 Jul 2013 07:28:07 GMT
Thanks for your reply,

I don't know what I can actually show you that would be of any help (apart
from my code, which is about 1000 lines), but I'll try to give you guys the
basic idea.
Of course, I'm using Hama's graph API (its implementation of Pregel) for this.

My program creates a graph whose nodes and edges both carry large sets of
data as values (such as the recordIds on the nodes and the edge values for
each record). The basic idea is that I run a query on this graph in the form
of a path (or subgraph), and the program returns the records that contain
this path, together with the values from each of those records.

When the compute function executes, only the nodes that are part of the query
wake up at first; all others halt. As the query proceeds, I collect the
recordIds from the node values and the edge values from the edges. Once the
end nodes have been reached, the program terminates, and I collect the result
from the end nodes and write it to the result file...
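
To make the description above concrete, here is a minimal single-JVM sketch of the same idea. It is not the actual code and does not use the Hama API: the `Node` class, `runQuery`, `demoGraph`, and the sample data are all hypothetical, and each loop iteration merely stands in for one superstep in which the next node on the query path "wakes up".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class PathQuerySketch {

    // Hypothetical stand-in for a vertex: the records it appears in, plus its
    // outgoing edges (target node id -> record id -> edge value in that record).
    static final class Node {
        final Set<Integer> recordIds;
        final Map<String, Map<Integer, Double>> edges = new HashMap<>();

        Node(Integer... recordIds) {
            this.recordIds = new HashSet<>(Arrays.asList(recordIds));
        }

        void link(String target, int recordId, double value) {
            edges.computeIfAbsent(target, k -> new HashMap<>()).put(recordId, value);
        }
    }

    // Walks the query path node by node. A record survives a step only if it is
    // present on the edge being followed and on the node that edge leads to.
    // Returns record id -> edge values collected along the path.
    static Map<Integer, List<Double>> runQuery(Map<String, Node> graph, List<String> path) {
        Map<Integer, List<Double>> matches = new TreeMap<>();
        if (path.isEmpty()) {
            return matches;
        }
        Node current = graph.get(path.get(0));
        if (current == null) {
            return matches;
        }
        for (int recordId : current.recordIds) {
            matches.put(recordId, new ArrayList<>());
        }
        for (int step = 1; step < path.size(); step++) {  // one iteration ~ one superstep
            Node next = graph.get(path.get(step));
            Map<Integer, Double> edge = current.edges.get(path.get(step));
            if (next == null || edge == null) {
                matches.clear();  // the path does not exist in the graph
                return matches;
            }
            Iterator<Map.Entry<Integer, List<Double>>> it = matches.entrySet().iterator();
            while (it.hasNext()) {
                Map.Entry<Integer, List<Double>> entry = it.next();
                Double value = edge.get(entry.getKey());
                if (value == null || !next.recordIds.contains(entry.getKey())) {
                    it.remove();  // this record does not contain the path
                } else {
                    entry.getValue().add(value);
                }
            }
            current = next;
        }
        return matches;
    }

    // Made-up sample data: three nodes, three records.
    static Map<String, Node> demoGraph() {
        Node a = new Node(1, 2, 3);
        Node b = new Node(1, 2);
        Node c = new Node(2);
        a.link("B", 1, 0.5);
        a.link("B", 2, 1.5);
        a.link("B", 3, 9.0);
        b.link("C", 2, 2.5);
        Map<String, Node> graph = new HashMap<>();
        graph.put("A", a);
        graph.put("B", b);
        graph.put("C", c);
        return graph;
    }

    public static void main(String[] args) {
        // Only record 2 contains the whole path A -> B -> C.
        System.out.println(runQuery(demoGraph(), Arrays.asList("A", "B", "C")));  // prints {2=[1.5, 2.5]}
    }
}
```

Note that in this sketch the per-record state carried from step to step can only shrink as records drop out. If the real job instead copies whole record sets into messages or vertex values at every superstep, memory would grow with the number of records, which may be worth checking.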

Is there some way I can access a memory mapping or something similar? After
execution with 400.000 records, the log is:

>13/07/14 10:17:57 INFO bsp.BSPJobClient: The total number of supersteps: 48
>13/07/14 10:17:57 INFO bsp.BSPJobClient: Counters: 12
>13/07/14 10:17:57 INFO bsp.BSPJobClient:   org.apache.hama.graph.GraphJobRunner$GraphJobCounter
>13/07/14 10:17:57 INFO bsp.BSPJobClient:     ITERATIONS=42
>13/07/14 10:17:57 INFO bsp.BSPJobClient:     MULTISTEP_PARTITIONING=4
>13/07/14 10:17:57 INFO bsp.BSPJobClient:     INPUT_VERTICES=1001
>13/07/14 10:17:57 INFO bsp.BSPJobClient:   org.apache.hama.bsp.JobInProgress$JobCounter
>13/07/14 10:17:57 INFO bsp.BSPJobClient:     SUPERSTEPS=48
>13/07/14 10:17:57 INFO bsp.BSPJobClient:     LAUNCHED_TASKS=6
>13/07/14 10:17:57 INFO bsp.BSPJobClient:   org.apache.hama.bsp.BSPPeerImpl$PeerCounter
>13/07/14 10:17:57 INFO bsp.BSPJobClient:     SUPERSTEP_SUM=294
>13/07/14 10:17:57 INFO bsp.BSPJobClient:     IO_BYTES_READ=344290795
>13/07/14 10:17:57 INFO bsp.BSPJobClient:     TIME_IN_SYNC_MS=411231
>13/07/14 10:17:57 INFO bsp.BSPJobClient:     TOTAL_MESSAGES_SENT=1592
>13/07/14 10:17:57 INFO bsp.BSPJobClient:     TASK_INPUT_RECORDS=1001
>13/07/14 10:17:57 INFO bsp.BSPJobClient:     TOTAL_MESSAGES_RECEIVED=1580
>13/07/14 10:17:57 INFO bsp.BSPJobClient:     TASK_OUTPUT_RECORDS=1001
Job 1 Finished in 3559.706 seconds
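
The counters above say nothing about memory, and `top` only shows the size of the whole process. One way to see the JVM's own view of its heap is the standard `java.lang.management` API. The sketch below is only an illustration: the `HeapProbe` class and its method names are hypothetical, and where to call it (for example at the start of each superstep) is an assumption. It would not by itself confirm that memory is the cause of the slowdown.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class HeapProbe {

    private static final long MB = 1024L * 1024L;

    // Fraction of the heap limit currently in use. getMax() returns -1 when no
    // limit is defined, in which case the committed size is used instead.
    static double heapUsedFraction() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long limit = heap.getMax() > 0 ? heap.getMax() : heap.getCommitted();
        return (double) heap.getUsed() / (double) limit;
    }

    // One-line summary suitable for a log statement.
    static String heapSummary() {
        MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = bean.getHeapMemoryUsage();
        return "heap used=" + (heap.getUsed() / MB) + "MB"
                + " committed=" + (heap.getCommitted() / MB) + "MB"
                + " max=" + (heap.getMax() > 0 ? (heap.getMax() / MB) + "MB" : "undefined");
    }

    public static void main(String[] args) {
        System.out.println(heapSummary());
        System.out.printf("fraction of limit in use: %.2f%n", heapUsedFraction());
    }
}
```

If the used fraction stays close to 1.0 for most of the run while the CPU is busy, that would be consistent with the JVM spending its time in garbage collection; raising the heap limit with the standard `-Xmx` JVM option would then be one thing to try.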


Any ideas?
Thanks in advance,
Kostas X.


On Sun, Jul 14, 2013 at 10:08 AM, Chia-Hung Lin <clin4j@googlemail.com> wrote:

> Any chance to show how the code, logic, log, etc. is executed? Others
> might be able to help spot the issue in underlying infrastructure or
> somewhere else.
>
> On 14 July 2013 15:00, Kostas Xirog <k0stasx1rog@gmail.com> wrote:
> > Hello,
> >
> > I'm running my program with 400.000 records as data and the execution takes
> > 50 minutes, whereas the execution of the same query on 200.000 records
> > takes 70 seconds. Any idea why that might be? I've been monitoring my
> > system with the 'top' command, and I see that for these 50 minutes the
> > memory usage is 75.5% and the CPU is at 100 almost constantly...
> >
> > I'm running hama in local mode on one machine with 8GB of RAM and 8 CPUs.
> > Any idea why that might be? Any ideas of how I can fix it?
> >
> > Thanks in advance,
> > Kostas X.
>
