giraph-dev mailing list archives

From Claudio Martella <claudio.marte...@gmail.com>
Subject Re: GIRAPH-684: Remove Writable Requirement?
Date Wed, 14 Aug 2013 14:28:08 GMT
Hi Nitay,

I'm +1 for (2). It would remove another explicit leakage of the Hadoop API
from Giraph. Although it makes clear that the Vertex signature does need to
support serialization, I agree that it is kind of all over the place (with
good and bad results, like checkpointing and out-of-core coming more easily).
At this point though, Giraph is mature enough to take a different path,
despite the initial advantage of having the Writable API.


On Tue, Aug 13, 2013 at 11:25 PM, Nitay Joffe <nitayj@gmail.com> wrote:

> Hello Friends,
>
> I have a diff up that substantially changes our API,
> https://issues.apache.org/jira/browse/GIRAPH-684, which I would like to
> get people's vote on.
>
> Basically the question is whether we think that forcing the graph types
> (I,V,E,M1,M2) to be Writable/Comparable is the right thing to do.
> This requirement means we cannot easily externalize how a type gets
> serialized (for example if you wanted to test out different ways of
> serializing an integer).
> It also makes it more difficult to implement things like the Jython
> integration because every Jython object must be constantly
> wrapped/unwrapped in a Writable wrapper in order to conform to Giraph.
> Personally I have never liked the fact that we have serialization tied to
> the object in terms of code design patterns, but that is just me.
> The diff I have up removes the requirement of IVEMM being Writable, and
> allows you to specify, via a separate parameter, the serializer to use for
> each type. Note that it is completely backwards compatible. That is, if we
> detect that you are actually using a Writable then we stick in an internal
> WritableSerializer (which just calls readFields()/write()) and you do not
> need to specify anything.
>
> The major con of removing the Writable interface is it makes things less
> clear for our users. So, the potential solutions to vote on here are:
>
> 1) Leave things as they are. Let Jython be ugly. Serialization stays tied
> to the object. No external serializers.
> 2) Remove Writable everywhere (as diff currently does). Explain to users
> that they can use Writable or define their own Serializer.
> 3) Remove Writable _internally_ only. Keep the outermost-facing Java API
> (Computation, VertexInputFormat, etc) still allowing Writable only, but
> internally no Writable required. This allows Jython and any expert users to
> work with the more internal types that have no type restrictions yet leaves
> our Java API Writable. Don't mention external serializers to average users
> so our story stays the same. Essentially this option makes things a bit
> more confusing for developers instead of users.
>
> We have discussed this at Facebook (Avery, Alessandro, Maja and myself)
> for a while, and would like to get your opinions as well since this is a
> large change.
> What do folks think? Please weigh in.
>
> Thanks,
> - Nitay
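
To make the proposal in the quoted message concrete, here is a minimal sketch of an external serializer with a Writable fallback. It is not the code from the GIRAPH-684 diff. The `Serializer` interface, its method names, `FixedIntSerializer`, `LongValue`, and `SerializerDemo` are all hypothetical. `WritableSerializer` is named in the message, but its shape here is assumed. The local `Writable` interface is a stand-in that mirrors the two methods of org.apache.hadoop.io.Writable so the example compiles without Hadoop.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.Supplier;

// Stand-in for org.apache.hadoop.io.Writable (same two methods), so no Hadoop dependency.
interface Writable {
  void write(DataOutput out) throws IOException;
  void readFields(DataInput in) throws IOException;
}

// Hypothetical external serializer: the serialization logic lives outside the type.
interface Serializer<T> {
  void write(T value, DataOutput out) throws IOException;
  T read(DataInput in) throws IOException;
}

// Assumed shape of the fallback: it only delegates to write()/readFields().
final class WritableSerializer<T extends Writable> implements Serializer<T> {
  private final Supplier<T> factory; // creates an empty instance to read into

  WritableSerializer(Supplier<T> factory) {
    this.factory = factory;
  }

  @Override
  public void write(T value, DataOutput out) throws IOException {
    value.write(out);
  }

  @Override
  public T read(DataInput in) throws IOException {
    T value = factory.get();
    value.readFields(in);
    return value;
  }
}

// Hypothetical serializer for a plain Integer: no Writable wrapper needed.
final class FixedIntSerializer implements Serializer<Integer> {
  @Override
  public void write(Integer value, DataOutput out) throws IOException {
    out.writeInt(value);
  }

  @Override
  public Integer read(DataInput in) throws IOException {
    return in.readInt();
  }
}

// Hypothetical Writable type, handled by the fallback without extra configuration.
final class LongValue implements Writable {
  long value;

  @Override
  public void write(DataOutput out) throws IOException {
    out.writeLong(value);
  }

  @Override
  public void readFields(DataInput in) throws IOException {
    value = in.readLong();
  }
}

final class SerializerDemo {
  // The backwards-compatibility check: is this type a Writable?
  static boolean usesWritableFallback(Class<?> type) {
    return Writable.class.isAssignableFrom(type);
  }

  // Serializes a value to bytes and reads it back with the same serializer.
  static <T> T roundTrip(Serializer<T> serializer, T value) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      serializer.write(value, new DataOutputStream(bytes));
      return serializer.read(
          new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

With this shape, trying a different integer encoding means swapping `FixedIntSerializer` for another `Serializer<Integer>`, while existing Writable types keep working through `WritableSerializer`.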




-- 
   Claudio Martella
   claudio.martella@gmail.com
