giraph-user mailing list archives

From Claudio Martella <claudio.marte...@gmail.com>
Subject Re: Resource Allocation Model Of Apache Giraph
Date Fri, 24 Oct 2014 12:42:22 GMT
giraph.userPartitionCount is the way to go, but giraph.maxPartitionsInMemory
is not: that option is for the out-of-core graph functionality.
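
For illustration, here is a minimal sketch of how the settings discussed in this thread might be assembled, assuming the job is launched through GiraphRunner with its -w (workers) and -ca (custom argument) options. The helper name partition_args and its parameters are made up for this example. It also assumes partitions end up spread evenly across workers, which the thread implies but does not state outright. giraph.maxPartitionsInMemory is deliberately left unset, since it only concerns out-of-core execution.

```python
def partition_args(num_workers, partitions_per_worker=1):
    """Build GiraphRunner-style arguments for a target partition layout.

    Hypothetical helper: giraph.userPartitionCount is taken to be the total
    partition count for the job, so it is set to workers * partitions each.
    """
    if num_workers < 1 or partitions_per_worker < 1:
        raise ValueError("worker and partition counts must be positive")
    total_partitions = num_workers * partitions_per_worker
    return [
        "-w", str(num_workers),
        "-ca", f"giraph.userPartitionCount={total_partitions}",
    ]


# One partition per worker on 4 workers, as asked in the thread.
print(" ".join(partition_args(4)))  # -w 4 -ca giraph.userPartitionCount=4
```

Passing partitions_per_worker greater than 1 keeps several partitions on each worker, which is what lets multiple compute threads stay busy.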

On Fri, Oct 24, 2014 at 1:23 PM, Matthew Saltz <saltzm@gmail.com> wrote:

> You may set giraph.userPartitionCount=<number of workers> and giraph.maxPartitionsInMemory=1.
> As Avery said, though, since parallelism occurs at the partition level (each
> thread processes a different partition), if you only have one partition per
> worker you cannot take advantage of multithreading.
>
> Best,
> Matthew
>
> On Fri, Oct 24, 2014 at 3:53 AM, Zhang, David (Paypal Risk) <
> pengzhang@ebay.com> wrote:
>
>>  I don't think there is a good solution. You can try running a Java application
>> that calls FileInputFormat.getSplits and use the size of the returned array as
>> the number of Giraph workers.
>>
>> Or run a simple MapReduce job with IdentityMapper to see how many
>> mappers there are.
>>
>>
>>
>> Thanks,
>>
>> Zhang, David (Paypal Risk)
>>
>> *From:* Charith Wickramarachchi [mailto:charith.dhanushka@gmail.com]
>> *Sent:* October 24, 2014 5:37
>> *To:* user
>> *Subject:* Re: Resource Allocation Model Of Apache Giraph
>>
>>
>>
>> Thanks  Claudio and Avery,
>>
>>
>>
>> I found a way to configure Hadoop to have the desired number of mappers
>> per machine, as Claudio mentioned.
>>
>>
>>
>> Avery,
>>
>> Could you please tell me how I can configure giraph to make each worker
>> handle only a single partition?
>>
>>
>>
>> Thanks,
>> Charith
>>
>>
>>
>> On Thu, Oct 23, 2014 at 2:26 PM, Avery Ching <aching@apache.org> wrote:
>>
>> Regarding your second point, partitions are decoupled from workers.  A
>> worker can handle zero or more partitions.  You can make each worker handle
>> one partition, but we typically like multiple partitions since we can use
>> multi-threading per machine.
>>
>>
>>
>> On 10/23/14, 9:04 AM, Claudio Martella wrote:
>>
>>  The way mappers (or containers), and hence workers, are assigned to
>> machines is not under the control of Giraph but of the underlying Hadoop
>> environment (with different responsibilities depending on the Hadoop
>> version, e.g. YARN). You'll have to tweak your Hadoop configuration to
>> control the maximum number of workers assigned to one machine (optimally
>> one, with multiple threads).
>>
>>
>>
>> On Thu, Oct 23, 2014 at 5:53 PM, Charith Wickramarachchi <
>> charith.dhanushka@gmail.com> wrote:
>>
>> Hi Folks,
>>
>>
>>
>> I'm wondering what the resource allocation model for Apache Giraph is.
>>
>>
>>
>> As I understand it, each worker is mapped one-to-one to a mapper, and a
>> worker can process multiple partitions with a user-defined number of
>> threads.
>>
>>
>>
>> Is it possible to make sure that one worker processes only a single
>> partition? Also, is it possible to control the worker assignment across the
>> cluster nodes? (E.g., make sure only N workers run on a single machine,
>> assuming we have enough resources.)
>>
>>
>>
>> Thanks,
>>
>> Charith
>>
>>
>>
>> --
>>
>> Charith Dhanushka Wickramaarachchi
>>
>>
>>
>> Tel  +1 213 447 4253
>>
>> Web  http://apache.org/~charith <http://www-scf.usc.edu/%7Ecwickram/>
>>
>> Blog  http://charith.wickramaarachchi.org/
>> <http://charithwiki.blogspot.com/>
>>
>> Twitter  @charithwiki <https://twitter.com/charithwiki>
>>
>>
>>
>> This communication may contain privileged or other
>> confidential information and is intended exclusively for the addressee/s.
>> If you are not the intended recipient/s, or believe that you may have
>> received this communication in error, please reply to the
>> sender indicating that fact and delete the copy you received and in
>> addition, you should not print, copy, retransmit, disseminate, or otherwise
>> use the information contained in this communication.
>> Internet communications cannot be guaranteed to be timely, secure, error
>> or virus-free. The sender does not accept liability for any errors
>> or omissions
>>
>>
>>
>>
>>
>> --
>>
>>    Claudio Martella
>>
>>
>>
>
>
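
The point made in the quoted messages, that parallelism happens per partition with each thread processing a different partition, can be illustrated with a toy model. This is not Giraph code: the function name busy_threads and the thread count of 8 are assumptions chosen only for the example.

```python
def busy_threads(partitions_on_worker, compute_threads):
    """Toy model: each thread works on one whole partition at a time,
    so a worker can never keep more threads busy than it has partitions."""
    return min(partitions_on_worker, compute_threads)


# Compare a single partition per worker against several, with 8 threads.
for partitions in (1, 4, 16):
    print(partitions, busy_threads(partitions, compute_threads=8))
# prints:
# 1 1
# 4 4
# 16 8
```

Under this model, one partition per worker leaves all but one thread idle, which is why several partitions per worker are usually preferred.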


-- 
   Claudio Martella
