giraph-user mailing list archives

From Claudio Martella <claudio.marte...@gmail.com>
Subject Re: On pre/post Application/Superstep contract
Date Sat, 01 Oct 2011 13:13:12 GMT
Yep, it looks good from Italy as well :)

On Sat, Oct 1, 2011 at 11:29 AM, Hyunsik Choi <hyunsik@apache.org> wrote:
> That approach looks good for now. Later we could probably improve it along
> the lines of MapReduce's Context.
>
> --
> Hyunsik Choi
> Database Lab, Korea University
>
> On Sat, Oct 1, 2011 at 3:01 AM, Avery Ching <aching@apache.org> wrote:
>> It isn't visible (purposefully) since it is internal state.
>>
>> That being said, I believe this type of functionality would be useful.
>>  Right now there are a lot of ugly static variables stored in Vertex
>> implementations because of it.  Perhaps we should add another method in
>> GiraphJob
>>
>> final public void setWorkerObjectClass(Class<? extends Configurable>
>> workerObjectClass);
>>
>> Then in BasicVertex
>>
>> public void preApplication(Configurable workerObject);
>> public void postApplication(Configurable workerObject);
>> public void preSuperstep(Configurable workerObject);
>> public void postSuperstep(Configurable workerObject);
>> public Configurable getWorkerObject();
>>
>> Can anyone else think of a cleaner way to do it?
>>
>> Avery
>>
>> On 9/30/11 8:42 AM, Claudio Martella wrote:
>>>
>>> AFAIK getGraphState() is not visible to my object. Or is it?
>>>
>>> On Fri, Sep 30, 2011 at 5:23 PM, Jake Mannix <jake.mannix@gmail.com> wrote:
>>>>
>>>> Remember that there's already a "singleton"-like object available to all
>>>> vertices: the GraphState object, which has a handle on the GraphMapper.
>>>> Maybe this is the right place to get your handle on the
>>>> FSDataOutputStream?
>>>>   -jake
>>>> On Fri, Sep 30, 2011 at 7:25 AM, Claudio Martella
>>>> <claudio.martella@gmail.com>  wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> To my understanding, the pre/post Application/Superstep methods are
>>>>> called ONCE on a "fake" vertex on each worker (the so-called
>>>>> representativeVertex). This means that these methods should not depend
>>>>> on any vertex-specific data.
>>>>>
>>>>> As I'm trying to sort out my Emitter, I thought I could create one
>>>>> FSDataOutputStream per worker, which each Vertex belonging to that
>>>>> worker could share (this would even be thread-safe, as each worker is
>>>>> not parallel).
>>>>>
>>>>> The questions are:
>>>>>
>>>>> 1) How can the FSDataOutputStream object that this
>>>>> representativeVertex creates at preApplication() (and closes at
>>>>> postApplication()) be shared?
>>>>>
>>>>> 2) About the filename: I'd be happy to have access to the Worker Id, so
>>>>> as to create an output filename as happens with reducers and part
>>>>> files in FileOutputFormat (i.e. <userdefinedfilename>-workerid).
>>>>>
>>>>>
>>>>> The "best" idea I have in mind right now is to use the hashCode of the
>>>>> calling vertex (the representativeVertex) as the id, and to create an
>>>>> external Singleton where I can register and request the output files,
>>>>> similarly to what happens with Aggregators now, passing the *this*
>>>>> reference as an index into this map. Any better idea?
>>>>> :)
>>>>>
>>>>>
>>>>> --
>>>>>     Claudio Martella
>>>>>     claudio.martella@gmail.com
>>>>
>>>
>>>
>>
>>
>

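For reference, the worker-object idea quoted above could look roughly like the sketch below. It is only an illustration under stated assumptions: EmitterState and SketchVertex are hypothetical names, not Giraph classes, and an in-memory StringWriter stands in for the FSDataOutputStream so that the example runs without Hadoop. The point is just the lifecycle: the object is created once per worker, every vertex on that worker receives the same instance, and it is closed once at the end.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

class WorkerObjectSketch {

    /** Hypothetical worker object: owns the one output shared by a worker's vertices. */
    static final class EmitterState {
        private final PrintWriter out;

        EmitterState(StringWriter sink) {
            // Assumption: a StringWriter stands in for the real FSDataOutputStream.
            this.out = new PrintWriter(sink);
        }

        void emit(String line) {
            out.print(line + "\n");
        }

        void close() {
            out.close();
        }
    }

    /** Hypothetical stand-in for a vertex; it is not Giraph's BasicVertex. */
    static final class SketchVertex {
        private final long id;
        private final EmitterState workerObject;

        SketchVertex(long id, EmitterState workerObject) {
            this.id = id;
            this.workerObject = workerObject;
        }

        void compute() {
            // No static field needed: the shared state is handed to the vertex.
            workerObject.emit("vertex " + id);
        }
    }

    /** Simulates one worker: create the object once, share it, close it once. */
    static String runWorker(long[] vertexIds) {
        StringWriter sink = new StringWriter();
        EmitterState state = new EmitterState(sink);   // would happen in preApplication()
        for (long id : vertexIds) {
            new SketchVertex(id, state).compute();     // all vertices share one instance
        }
        state.close();                                 // would happen in postApplication()
        return sink.toString();
    }

    public static void main(String[] args) {
        System.out.print(runWorker(new long[] {1, 2, 3}));
    }
}
```

Passing the object in explicitly is what would remove the static variables mentioned in the thread; how Giraph would actually instantiate and hand over the object is left open here.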

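The external-Singleton workaround from the original question can also be sketched in a few lines. Everything here is an assumption made for illustration: OutputRegistry and its methods are hypothetical, the key is an arbitrary object (the thread suggests the representativeVertex reference), and a StringWriter again replaces the real output stream. The fileName helper only mirrors the <userdefinedfilename>-workerid pattern mentioned in the thread.

```java
import java.io.StringWriter;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical process-wide registry of per-worker outputs. */
final class OutputRegistry {
    private static final Map<Object, StringWriter> STREAMS = new ConcurrentHashMap<>();

    private OutputRegistry() {
        // Static access only.
    }

    /** Creates the output for this key on first use and returns the same one afterwards. */
    static StringWriter register(Object key) {
        return STREAMS.computeIfAbsent(key, k -> new StringWriter());
    }

    /** Returns the registered output, or null if the key is unknown. */
    static StringWriter lookup(Object key) {
        return STREAMS.get(key);
    }

    /** Removes and returns the output so the caller can close it. */
    static StringWriter unregister(Object key) {
        return STREAMS.remove(key);
    }

    /** Builds a name following the <userdefinedfilename>-workerid pattern. */
    static String fileName(String base, int workerId) {
        return base + "-" + workerId;
    }
}
```

In this sketch register() would be called from preApplication() and unregister() from postApplication(). One caveat: the other vertices on the worker still need some way to learn the key, which is exactly the gap the worker-object proposal in the thread tries to close.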

-- 
    Claudio Martella
    claudio.martella@gmail.com
