hama-user mailing list archives

From "Edward J. Yoon" <edwardy...@apache.org>
Subject Re: Memory issue or what?...
Date Sun, 14 Jul 2013 11:58:49 GMT
Disabling message compression by deleting
'hama.messenger.compression.class' property in hama-default.xml might
be helpful.

However, the fundamental problems won't be fixed by client-side
configuration. You will need to wait for the next release.
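
For anyone who prefers to script the edit rather than delete the property by hand, here is a minimal Python sketch. It assumes hama-default.xml follows the Hadoop-style layout (a <configuration> root holding <property> elements with <name> and <value> children). The helper name remove_property, the sample value, and the second sample property are made up for illustration; only 'hama.messenger.compression.class' comes from this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the assumed Hadoop-style layout. The <value> entries
# and the second property are placeholders, not real Hama defaults.
SAMPLE = """<configuration>
  <property>
    <name>hama.messenger.compression.class</name>
    <value>placeholder.CompressorClass</value>
  </property>
  <property>
    <name>hama.example.other</name>
    <value>42</value>
  </property>
</configuration>"""


def remove_property(xml_text, name):
    """Return (new_xml, removed_count) with every <property> called `name` dropped."""
    root = ET.fromstring(xml_text)
    removed = 0
    # Iterate over a copy of the child list so removal is safe.
    for prop in list(root.findall("property")):
        if (prop.findtext("name") or "").strip() == name:
            root.remove(prop)
            removed += 1
    return ET.tostring(root, encoding="unicode"), removed


new_xml, removed = remove_property(SAMPLE, "hama.messenger.compression.class")
print("removed", removed, "property element(s)")
print(new_xml)
```

To apply it to a real file, read the file into a string, call remove_property, and write the result back. Keep a backup first: ElementTree's default parser drops XML comments, so the rewritten file will lose them.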

On Sun, Jul 14, 2013 at 8:24 PM, Kostas Xirog <k0stasx1rog@gmail.com> wrote:
> I installed 0.6.2 and ran my program, and in 0.6.2 it actually uses more
> memory than it did before...plus the program needs more time to execute...
>
> What could I be doing wrong?...The input file that hama takes in is only
>  172.233.579 Bytes ...Any ideas anyone?
>
> Also I'm getting this warning in 0.6.2 : " WARN
> message.MessageTransferQueueFactory: Message queue is configured on
> deprecated parameter:hama.messenger.queue.class"
>
> Does anyone know how I can fix it and if it might be conflicting with my
> program's execution  in some way?
>
> Thank you in advance,
> Kostas X.
>
>
> On Sun, Jul 14, 2013 at 11:51 AM, Edward J. Yoon <edwardyoon@apache.org> wrote:
>
>> Please use the latest version.
>>
>> On Sun, Jul 14, 2013 at 4:28 PM, Kostas Xirog <k0stasx1rog@gmail.com> wrote:
>> > Thanks for your reply,
>> >
>> >   I don't know what I can actually show you that will be of any help (apart
>> > from my code, which is about 1000 lines), but I'll try to give you guys the
>> > basic idea.
>> > Of course I'm using Hama's graph (its implementation of Pregel) for this.
>> >
>> >  My program creates a graph whose nodes and edges both have big sets of
>> > data (such as recordIds and edge values in each record) as values. The
>> > basic idea is that I'm running a query on this graph in the form of a path
>> > (or subgraph), and the program returns the records that contain this path,
>> > as well as the values of each of the records that contain this path.
>> >
>> > The compute function executes and only the nodes that are part of the query
>> > wake up at first; all others halt. As this happens, I collect the recordIds
>> > from the node values and the edge values from the edges, and when the end
>> > nodes have been reached, the program terminates. I then collect the result
>> > from the end nodes and write it to the result file...
>> >
>> > Is there some way I can access a memory mapping or something?... After
>> > execution with 400.000 records, the log is:
>> >
>> >>13/07/14 10:17:57 INFO bsp.BSPJobClient: The total number of supersteps: 48
>> >>13/07/14 10:17:57 INFO bsp.BSPJobClient: Counters: 12
>> >>13/07/14 10:17:57 INFO bsp.BSPJobClient:   org.apache.hama.graph.GraphJobRunner$GraphJobCounter
>> >>13/07/14 10:17:57 INFO bsp.BSPJobClient:     ITERATIONS=42
>> >>13/07/14 10:17:57 INFO bsp.BSPJobClient:     MULTISTEP_PARTITIONING=4
>> >>13/07/14 10:17:57 INFO bsp.BSPJobClient:     INPUT_VERTICES=1001
>> >>13/07/14 10:17:57 INFO bsp.BSPJobClient:   org.apache.hama.bsp.JobInProgress$JobCounter
>> >>13/07/14 10:17:57 INFO bsp.BSPJobClient:     SUPERSTEPS=48
>> >>13/07/14 10:17:57 INFO bsp.BSPJobClient:     LAUNCHED_TASKS=6
>> >>13/07/14 10:17:57 INFO bsp.BSPJobClient:   org.apache.hama.bsp.BSPPeerImpl$PeerCounter
>> >>13/07/14 10:17:57 INFO bsp.BSPJobClient:     SUPERSTEP_SUM=294
>> >>13/07/14 10:17:57 INFO bsp.BSPJobClient:     IO_BYTES_READ=344290795
>> >>13/07/14 10:17:57 INFO bsp.BSPJobClient:     TIME_IN_SYNC_MS=411231
>> >>13/07/14 10:17:57 INFO bsp.BSPJobClient:     TOTAL_MESSAGES_SENT=1592
>> >>13/07/14 10:17:57 INFO bsp.BSPJobClient:     TASK_INPUT_RECORDS=1001
>> >>13/07/14 10:17:57 INFO bsp.BSPJobClient:     TOTAL_MESSAGES_RECEIVED=1580
>> >>13/07/14 10:17:57 INFO bsp.BSPJobClient:     TASK_OUTPUT_RECORDS=1001
>> > Job 1 Finished in 3559.706 seconds
>> >
>> >
>> > Any ideas?
>> > Thanks in advance,
>> > Kostas X.
>> >
>> >
>> > On Sun, Jul 14, 2013 at 10:08 AM, Chia-Hung Lin <clin4j@googlemail.com> wrote:
>> >
>> >> Any chance to show how the code, logic, log, etc. is executed? Others
>> >> might be able to help spot the issue in underlying infrastructure or
>> >> somewhere else.
>> >>
>> >> On 14 July 2013 15:00, Kostas Xirog <k0stasx1rog@gmail.com> wrote:
>> >> > Hello,
>> >> >
>> >> > I'm running my program with 400.000 records as data and the execution takes
>> >> > 50 minutes, whereas the execution of the same query on 200.000 records
>> >> > takes 70 seconds. Any idea why that might be? I've been monitoring my
>> >> > system with the 'top' command, and I see that for these 50 minutes the
>> >> > memory usage is 75.5% and the CPU is at 100% almost constantly...
>> >> >
>> >> > I'm running Hama in local mode on one machine with 8GB of RAM and 8 CPUs.
>> >> > Any idea why that might be? Any ideas of how I can fix it?
>> >> >
>> >> > Thanks in advance,
>> >> > Kostas X.
>> >>
>>
>>
>>
>> --
>> Best Regards, Edward J. Yoon
>> @eddieyoon
>>
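
To put the figures quoted above in proportion, here is a rough Python sketch that pulls the NAME=VALUE counters out of the quoted log and relates them to the two reported runtimes (about 70 seconds for 200.000 records, 3559.706 seconds for 400.000). Dividing TIME_IN_SYNC_MS by LAUNCHED_TASKS assumes that counter is summed over all tasks, which nothing in this thread confirms; the helper name parse_counters is made up for illustration.

```python
import re

# A few counter lines copied from the quoted job log.
LOG = """\
13/07/14 10:17:57 INFO bsp.BSPJobClient:     SUPERSTEPS=48
13/07/14 10:17:57 INFO bsp.BSPJobClient:     LAUNCHED_TASKS=6
13/07/14 10:17:57 INFO bsp.BSPJobClient:     TIME_IN_SYNC_MS=411231
13/07/14 10:17:57 INFO bsp.BSPJobClient:     TOTAL_MESSAGES_SENT=1592
13/07/14 10:17:57 INFO bsp.BSPJobClient:     TOTAL_MESSAGES_RECEIVED=1580
"""


def parse_counters(log_text):
    """Collect integer NAME=VALUE counters from the log text into a dict."""
    pattern = re.compile(r"\b([A-Z_]+)=(\d+)\b")
    return {m.group(1): int(m.group(2)) for m in pattern.finditer(log_text)}


counters = parse_counters(LOG)

job_seconds = 3559.706      # "Job 1 Finished in 3559.706 seconds"
small_run_seconds = 70.0    # reported runtime for 200.000 records

# Runtime ratio for twice the input: far more than the 2x a linear scale-up would give.
slowdown = job_seconds / small_run_seconds

# ASSUMPTION: TIME_IN_SYNC_MS is a sum across tasks, so this is a per-task average.
sync_ms_per_task = counters["TIME_IN_SYNC_MS"] / counters["LAUNCHED_TASKS"]

# Upper bound on the share of wall-clock time spent in sync (if the counter is NOT a sum).
sync_share_upper = counters["TIME_IN_SYNC_MS"] / 1000.0 / job_seconds

print(f"slowdown for 2x records: {slowdown:.1f}x")
print(f"sync time per task (assumed average): {sync_ms_per_task / 1000.0:.1f} s")
print(f"sync share of runtime, at most: {sync_share_upper:.1%}")
```

Whichever way the counter is read, most of the runtime falls outside the sync barrier and the message counts are small, which fits the memory and CPU pressure reported from 'top' better than a messaging bottleneck. That reading is an inference from these numbers, not something measured here.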



-- 
Best Regards, Edward J. Yoon
@eddieyoon
