giraph-user mailing list archives

From Sergey Edunov <edu...@gmail.com>
Subject Re: InMemoryVertex Format(s)
Date Fri, 05 Jun 2015 01:06:29 GMT
Hi Khaled,

Unfortunately, there is no official way to read data between supersteps
right now. You can read all the data in the beginning and just keep it in
memory. And you can write output between supersteps. But reading between
supersteps is not supported yet. It has been on our TODO list for a while
but has never been implemented.

Regards,
Sergey Edunov


On Thu, Jun 4, 2015 at 9:44 AM, Khaled Ammar <khaled.ammar@gmail.com> wrote:

> Hi Sergey,
>
> Thank you for your clarification. I noticed Igor's email the other day; it
> sounds exciting, and I am looking forward to it.
>
> *One more question, please*: what is the best approach to load data
> between supersteps?
>
> I understand that it is possible to create a FileSystem object, open a
> file, read it as a stream, and then create vertices and edges using
> Factories. However, I think it is more efficient to (1) do this in parallel,
> so that each worker reads a part of the file, and (2) process the file
> contents using the reader classes identified in the job.
>
> Thanks,
> -Khaled
>
>
>
> On Wed, Jun 3, 2015 at 2:23 PM, Sergey Edunov <edunov@gmail.com> wrote:
>
>> You can switch the computation class in MasterCompute by calling
>> setComputation(Class clazz). That means you'll have to provide your own
>> master compute class that extends MasterCompute. Implement the convergence
>> criteria there and switch the computation class upon convergence; you will
>> no longer rely on voteToHalt().
>>
>> Or you can wait for the Blocks Framework to roll out and simply use it. See
>> the announcement here:
>> http://mail-archives.apache.org/mod_mbox/giraph-dev/201506.mbox/%3CCABJ-n3v-24YLzgNmrT3TZT6R8t4Vw1hrBcWWTghG_XgaC%3DYqrg%40mail.gmail.com%3E
>>
>> In my experience, the Blocks Framework is much easier to use, and it
>> naturally suits your needs.
>>
>>
>>
>>
>> On Wed, Jun 3, 2015 at 11:02 AM, Khaled Ammar <khaled.ammar@gmail.com>
>> wrote:
>>
>>> Thank you Sergey,
>>>
>>> This is exactly what I am looking for. I would like to run multiple
>>> computation classes following each other, such that each computation class
>>> will execute until convergence.
>>>
>>> I think the GiraphJob class may help. Can I use
>>> setComputationClass to change the computation class in the configuration
>>> object then construct a new GiraphJob and run it? I am not certain when
>>> exactly all vertex and edge data are going to be deleted from memory.
>>>
>>> Thanks,
>>> -Khaled
>>>
>>>
>>> *setComputationClass
>>> <https://giraph.apache.org/apidocs/org/apache/giraph/conf/GiraphConfiguration.html#setComputationClass(java.lang.Class)>*
>>>
>>> , but I am not certain how to pass multiple computation classes there.
>>>
>>>
>>>
>>> On Wed, Jun 3, 2015 at 1:34 PM, Sergey Edunov <edunov@gmail.com> wrote:
>>>
>>>> Hi Khaled,
>>>>
>>>> As far as I know, InMemory input and output formats are only used in
>>>> test cases.
>>>> Can you elaborate more on why you want to use I/O formats? You can
>>>> use different computation classes within one application and you don't need
>>>> to do I/O between them. All intermediate results can be kept in vertex and
>>>> edge data.
>>>>
>>>> Regards,
>>>> Sergey Edunov
>>>>
>>>> On Tue, Jun 2, 2015 at 12:46 PM, Khaled Ammar <khaled.ammar@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> There are InMemory input and output formats for Giraph. These could be
>>>>> useful when a specific computation should be executed until convergence
>>>>> and then another computation is needed. Instead of writing intermediate
>>>>> results to HDFS and reading them again, the InMemoryVertex format sounds
>>>>> very convenient. However, I could not figure out how to use it in a proper
>>>>> Computation or GiraphBenchmark class.
>>>>>
>>>>> I would appreciate it if anyone could share their experience using this
>>>>> format.
>>>>>
>>>>> *Link to the InMemoryVertexOutputFormat class :*
>>>>>
>>>>> https://giraph.apache.org/apidocs/org/apache/giraph/io/formats/InMemoryVertexOutputFormat.html#InMemoryVertexOutputFormat()
>>>>>
>>>>> --
>>>>> Thanks,
>>>>> -Khaled
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Thanks,
>>> -Khaled
>>>
>>
>>
>
>
> --
> Thanks,
> -Khaled
>
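
The approach suggested earlier in this thread (provide your own master compute class, check a convergence criterion there, and switch the computation class upon convergence) can be modeled in plain Java. The sketch below is not Giraph code and does not use any Giraph classes: PhaseSwitcher, the phase names, and the "number of changed vertices" criterion are all hypothetical, chosen only to illustrate the control flow. In a real job the equivalent logic would presumably live in a MasterCompute subclass and call setComputation(...) where the comments indicate, with the changed-vertex count coming from whatever convergence signal the application defines.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Plain-Java model of per-superstep phase switching.
 * Hypothetical helper: it only mimics the decision a custom
 * MasterCompute could make; no Giraph API is used here.
 */
final class PhaseSwitcher {
    private final List<String> phases; // stand-ins for computation classes
    private int current = 0;           // index of the phase now running

    PhaseSwitcher(List<String> phases) {
        if (phases.isEmpty()) {
            throw new IllegalArgumentException("at least one phase is required");
        }
        this.phases = new ArrayList<>(phases);
    }

    /** Name of the phase that should run in the next superstep. */
    String currentPhase() {
        return phases.get(current);
    }

    /**
     * Call once per superstep with the number of vertices that changed
     * in the previous superstep (assumed convergence criterion).
     *
     * @return true if the whole job should stop
     */
    boolean onSuperstep(long changedVertices) {
        if (changedVertices > 0) {
            return false;              // current phase has not converged yet
        }
        if (current == phases.size() - 1) {
            return true;               // last phase converged: halt the job
        }
        current++;                     // here a master would switch computation class
        return false;
    }
}
```

With phases "A" and "B", calling onSuperstep(5), onSuperstep(0), onSuperstep(3), onSuperstep(0) in order keeps phase A running, then switches to B, keeps B running, and finally signals a halt. Vertex and edge data are untouched by the switch, which matches the point made in the thread that intermediate results can stay in memory between computations.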
