giraph-user mailing list archives

From Hyunsik Choi <hyun...@apache.org>
Subject Re: On pre/post Application/Superstep contract
Date Sat, 01 Oct 2011 09:29:35 GMT
That approach looks good for now. Later, we could probably refine it into
something like the Context of MapReduce.
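
To make the idea concrete, here is a rough sketch of how the worker object
proposed below might be used to share one output stream per worker. It is
only an illustration under assumptions: EmitterWorkerObject, SketchVertex
and WorkerObjectSketch are made-up names that do not exist in Giraph, the
in-memory buffer merely stands in for an FSDataOutputStream, and the worker
ID is assumed to be handed in by the framework.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical per-worker object: one instance per worker, shared by its vertices. */
class EmitterWorkerObject {
    // In-memory stand-in for the FSDataOutputStream discussed in the thread.
    private final ByteArrayOutputStream sink = new ByteArrayOutputStream();
    private final String fileName;
    private boolean isOpen;

    EmitterWorkerObject(String baseName, int workerId) {
        // Mirrors the <userdefinedfilename>-workerid naming asked for in the thread.
        this.fileName = baseName + "-" + workerId;
    }

    void open() { isOpen = true; }

    void close() { isOpen = false; }

    void emit(String line) {
        if (!isOpen) {
            throw new IllegalStateException("emit() called while the stream is closed");
        }
        byte[] bytes = (line + "\n").getBytes(StandardCharsets.UTF_8);
        sink.write(bytes, 0, bytes.length);
    }

    String fileName() { return fileName; }

    String contents() { return new String(sink.toByteArray(), StandardCharsets.UTF_8); }
}

/** Hypothetical vertex whose hooks mirror the signatures proposed below. */
class SketchVertex {
    private final long id;
    private final EmitterWorkerObject workerObject;

    SketchVertex(long id, EmitterWorkerObject workerObject) {
        this.id = id;
        this.workerObject = workerObject;
    }

    EmitterWorkerObject getWorkerObject() { return workerObject; }

    // Called once per worker on a representative vertex, so these hooks
    // touch only worker-level state, never vertex-specific data.
    void preApplication(EmitterWorkerObject w) { w.open(); }

    void postApplication(EmitterWorkerObject w) { w.close(); }

    // Every real vertex writes through the shared worker object.
    void compute() { getWorkerObject().emit("vertex " + id); }
}

/** Stand-in for the framework: drives the hooks for a single worker. */
class WorkerObjectSketch {
    static EmitterWorkerObject runWorker(int workerId, long[] vertexIds) {
        EmitterWorkerObject shared = new EmitterWorkerObject("emitted", workerId);
        List<SketchVertex> vertices = new ArrayList<>();
        for (long id : vertexIds) {
            vertices.add(new SketchVertex(id, shared));
        }
        SketchVertex representative = new SketchVertex(-1L, shared);
        representative.preApplication(shared);
        for (SketchVertex v : vertices) {
            v.compute();
        }
        representative.postApplication(shared);
        return shared;
    }

    public static void main(String[] args) {
        EmitterWorkerObject result = runWorker(3, new long[] {10L, 11L});
        System.out.println(result.fileName());
        System.out.print(result.contents());
    }
}
```

The point of the sketch is that no static variable is needed: the stream
lives in the worker object, and each vertex reaches it via getWorkerObject().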

--
Hyunsik Choi
Database Lab, Korea University

On Sat, Oct 1, 2011 at 3:01 AM, Avery Ching <aching@apache.org> wrote:
> It isn't visible (purposefully) since it is internal state.
>
> That being said, I believe this type of functionality would be useful.
>  Right now there are a lot of ugly static variables stored in Vertex
> implementations because of it.  Perhaps we should add another method in
> GiraphJob
>
> public final void setWorkerObjectClass(Class<? extends Configurable>
> workerObjectClass);
>
> Then in BasicVertex
>
> public void preApplication(Configurable workerObject);
> public void postApplication(Configurable workerObject);
> public void preSuperstep(Configurable workerObject);
> public void postSuperstep(Configurable workerObject);
> public Configurable getWorkerObject();
>
> Anyone else think of a cleaner way to do it?
>
> Avery
>
> On 9/30/11 8:42 AM, Claudio Martella wrote:
>>
>> AFAIK getGraphState() is not visible to my object. Or is it?
>>
>> On Fri, Sep 30, 2011 at 5:23 PM, Jake Mannix<jake.mannix@gmail.com>
>>  wrote:
>>>
>>> Remember that there's already a "singleton"-like object available to all
>>> vertices: the GraphState object, which has a handle on the GraphMapper.
>>> Maybe this is the right place to get your handle on the
>>> FSDataOutputStream?
>>>   -jake
>>> On Fri, Sep 30, 2011 at 7:25 AM, Claudio Martella
>>> <claudio.martella@gmail.com>  wrote:
>>>>
>>>> Hello,
>>>>
>>>> To my understanding, the pre/post Application/Superstep methods are
>>>> called ONCE on a "fake" vertex on each worker (the so-called
>>>> representativeVertex). This means that these methods should not depend
>>>> on any vertex-specific data.
>>>>
>>>> As I'm trying to sort out my Emitter, I thought I could create one
>>>> FSDataOutputStream per worker, which every Vertex belonging to that
>>>> worker could share (this would even be thread-safe, as each worker is
>>>> not parallel).
>>>>
>>>> The questions are:
>>>>
>>>> 1) How to share the FSDataOutputStream object created in
>>>> preApplication() (and closed in postApplication()) by this
>>>> representativeVertex?
>>>>
>>>> 2) About the filename: I'd be happy to have access to the worker ID so
>>>> as to create an output filename the way FileOutputFormat does for
>>>> reducers and part files (i.e. <userdefinedfilename>-workerid).
>>>>
>>>>
>>>> The "best" idea I have in mind right now is to use the hashCode of the
>>>> calling vertex (the representativeVertex) as the ID, and to create an
>>>> external Singleton where I can register and request the output files,
>>>> similar to what happens with Aggregators now, passing the *this*
>>>> reference as an index into this map. Any better idea?
>>>> :)
>>>>
>>>>
>>>> --
>>>>     Claudio Martella
>>>>     claudio.martella@gmail.com
>>>
>>
>>
>
>
