hama-dev mailing list archives

From "Edward J. Yoon" <edwardy...@apache.org>
Subject Re: Cutting a 0.7 release
Date Tue, 25 Feb 2014 01:14:06 GMT
1) The MapReduce model uses file-based communication, so each mapper
can run independently. For example, to run an MR job on 1 GB of input
data, 5 mappers will be scheduled. Even if there are only 2 task
slots (a single machine), the MR job is slow but it works: 2 running
map tasks, 3 pending map tasks.

However, unlike MapReduce, BSP uses network-based communication. This
means that all BSP tasks must run at the same time. And the number of
BSP tasks is determined by the number of blocks of input. So you CANNOT
process 1 GB of input data on a single machine. It's not a memory
issue; it's a matter of available task slots.
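
To make the scheduling difference concrete, here is a toy sketch in
Python. It is not Hama or Hadoop code; the function names are made up
for illustration, and it only models task slots, nothing else.

```python
import math

def mapreduce_schedule(num_tasks, slots):
    """Independent tasks can run in waves: some run now, the rest wait."""
    running = min(num_tasks, slots)
    pending = num_tasks - running
    waves = math.ceil(num_tasks / slots)
    return running, pending, waves

def bsp_can_start(num_tasks, slots):
    """BSP tasks talk to each other over the network, so every task
    needs a slot at the same time."""
    return num_tasks <= slots

# 5 tasks on a machine with 2 task slots, as in the example above.
print(mapreduce_schedule(5, 2))  # (running, pending, waves)
print(bsp_can_start(5, 2))
```

With 5 tasks and 2 slots, the MR side reports 2 running and 3 pending
tasks spread over 3 waves, while the BSP check fails because all 5
tasks would need a slot at once.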

> throw OOM exceptions, instead it may eventually process items slower (with
> caches / queues) but never throw an exception for that but that's just my

I hope so too, but I think you are talking about iterative MapReduce.

2) The normal HDFS block size is 64 to 256 MB. If we can assume that
the split size = block size, I feel that the current system is enough.
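
As a rough illustration of what split size = block size implies for
the task count, here is a small Python calculation. It assumes one
task per block and nothing more; the helper name is hypothetical and
this is not how Hama itself computes splits.

```python
MB = 1024 * 1024

def tasks_for_input(input_bytes, block_bytes):
    """One task per block, rounding up for a partial last block."""
    return -(-input_bytes // block_bytes)  # integer ceiling division

# 1 GB of input at a few common HDFS block sizes.
for block_mb in (64, 128, 256):
    print(block_mb, "MB blocks ->", tasks_for_input(1024 * MB, block_mb * MB), "tasks")
```

Under that assumption, 1 GB of input needs 16 tasks at 64 MB, 8 at
128 MB, and 4 at 256 MB, and each task holds at most one block of
input.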

I don't think we have to spend time implementing something disk-based.

WDYT?

On Tue, Feb 25, 2014 at 12:19 AM, Anastasis Andronidis
<andronat_asf@hotmail.com> wrote:
> On 24 Feb 2014, at 3:32 PM, Tommaso Teofili <tommaso.teofili@gmail.com> wrote:
>
>>>
>>> According to my personal evaluations, current system is fairly
>>> respectable. As I mentioned before, I believe we should stick to
>>> in-memory style since the today's machines can be equipped with up to
>>> 128 GB. Disk (or disk hybrid) based queue is a optional, not a
>>> must-have.
>>>
>>
>> right, the only thing that I think we need to address before 0.7.0 is
>> related to the OutOfMemory errors (especially when dealing with large
>> graphs); for example IMHO even if the memory is not enough to store all the
>> graph vertices assigned to a certain peer, a scalable system should never
>> throw OOM exceptions, instead it may eventually process items slower (with
>> caches / queues) but never throw an exception for that but that's just my
>> opinion.
>>
>
> I like and agree with this.
>
> Cheers,
> Anastasis
>



-- 
Edward J. Yoon (@eddieyoon)
Chief Executive Officer
DataSayer, Inc.
