ignite-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Dmitriy Setrakyan <dsetrak...@apache.org>
Subject Re: CacheEntry serialization failure
Date Mon, 21 Dec 2015 09:10:33 GMT
Vova,

We cannot return null in case of a conflict, as user won’t be able to
differentiate between a conflict and missing field. We should throw an
exception.

Also, I don’t like parsing field names looking for a dot for every field.
It will introduce a performance overhead for the cases that do not have
conflicts. Instead, we should add another API for this use case, something
like field(typeName, fieldName).

D.

On Mon, Dec 21, 2015 at 1:01 AM, Vladimir Ozerov <vozerov@gridgain.com>
wrote:

> Denis,
> Yes, as we do not know which field to pick, we return null.
>
> On Mon, Dec 21, 2015 at 12:00 PM, Denis Magda <dmagda@gridgain.com> wrote:
>
> > Sounds good for me. I would go for this approach.
> >
> > In addition if to consider your example below and the user decides to
> look
> > up a field by its simple name then he/she will get nothing or exception
> > (depending on the API), correct?
> > As an example for this case the method will return null
> >
> > BinaryObject obj = ...;
> > Object val = obj.field("x"); // null will be returned cause we don't know
> > what particular 'x' we have to return
> >
> >
> > --
> > Denis
> >
> > On 12/21/2015 11:48 AM, Vladimir Ozerov wrote:
> >
> >> Folks,
> >>
> >> I thought about the solution a bit more and came to the following
> design.
> >>
> >> *Scenario:*
> >> class A {
> >>      int x;
> >> }
> >> class B : extends A {
> >>      int y;
> >> }
> >>
> >> *Solution:*
> >> 1) Field A.x is written as *"A.x"*, field B.x is written as *"B.x"*.
> I.e.
> >> *both
> >> conflicting fields are prefixed* with simple name of the owning class.
> >> 2) API is unchanged. User manipulates these fields on all public methods
> >> in
> >> exactly the same way: "A.x" and "B.x".
> >>
> >> *Rationale:*
> >> 1) We cannot prefix only some of conflicting fields. E.g. if decide to
> >> prefix only A.x, then it is not clear how to handle this case:
> >>
> >> class B extends A implements Binarylizable {
> >>      void write(BinaryWriter writer) {
> >>          writer.writeInt("B.x", x); // User intentionally written field
> as
> >> "B.x".
> >>      }
> >> }
> >>
> >> BinaryObject obj = ...;
> >> Object val = obj.field("B.x"); // Should we lookup for "B.x" as user
> asked
> >> us, or just for "x"?
> >>
> >> Prefixing all conflicting fields with class name resolves the problem.
> >>
> >> 2) If we add methods to manipulate fields not only by name, but by
> >> (typeName + fieldName) as well, then we will have to add *9 new methods*
> >> to
> >>
> >> API:
> >> BinaryType.fieldTypeName(String typeName, String fieldName);
> >> BinaryType.field(String typeName, String fieldName);
> >> BinaryObject.field(String typeName, String fieldName);
> >> BinaryObject.hasField(String  typeName, String fieldName);
> >> BinaryObjectBuilder.getField(String typeName, String fieldName);
> >> BinaryObjectBuilder.setField(String typeName, String fieldName, ...);
> // 3
> >> overloads
> >> BinaryObjectBuilder.removeField(String typeName, String fieldName);
> >>
> >> This is definitely an overkill for such a rare scenario.
> >>
> >> Thoughts?
> >>
> >> On Mon, Dec 21, 2015 at 11:22 AM, Vladimir Ozerov <vozerov@gridgain.com
> >
> >> wrote:
> >>
> >> Agree, prefixing parent class fields sound like a better option.
> >>> Regarding aliases - I need some time to understand internal mechanics.
> >>> Will answer this a bit later.
> >>>
> >>> On Mon, Dec 21, 2015 at 11:18 AM, Dmitriy Setrakyan <
> >>> dsetrakyan@apache.org
> >>>
> >>>> wrote:
> >>>> Vova,
> >>>>
> >>>> Shouldn’t it be the other way around? Class B writes the field as
“a”,
> >>>> and
> >>>> class A writes it with a prefix (possibly the hash code of the class
> >>>> name).
> >>>>
> >>>> Also, we should clearly document how the SQL queries are affected by
> >>>> this.
> >>>> AFAIK, we should be using field aliases here, no?
> >>>>
> >>>> D.
> >>>>
> >>>> On Sun, Dec 20, 2015 at 10:08 AM, Vladimir Ozerov <
> vozerov@gridgain.com
> >>>> >
> >>>> wrote:
> >>>>
> >>>> May be we can use normal field names by default and add some
> >>>>>
> >>>> prefix/suffix
> >>>>
> >>>>> if conflict is found? E.g.:
> >>>>>
> >>>>> class A {
> >>>>>      int a; // Write as "a";
> >>>>> }
> >>>>>
> >>>>> class B extends A {
> >>>>>      int a; // Write as "B_a";
> >>>>> }
> >>>>>
> >>>>> On Fri, Dec 18, 2015 at 10:34 PM, Valentin Kulichenko <
> >>>>> valentin.kulichenko@gmail.com> wrote:
> >>>>>
> >>>>> Folks,
> >>>>>>
> >>>>>> It seems to me that issue here is not with 3rd-party libraries.
We
> >>>>>>
> >>>>> just
> >>>>
> >>>>> don't properly support class hierarchy in binary format. Any class
> >>>>>>
> >>>>> that
> >>>>
> >>>>> extends another class and has the field with the same name as parent
> >>>>>>
> >>>>> has
> >>>>
> >>>>> will fail unless user provides custom serialization logic that will
> >>>>>>
> >>>>> handle
> >>>>>
> >>>>>> it.
> >>>>>>
> >>>>>> What if we prepend the field name with the simple class name
in this
> >>>>>>
> >>>>> case?
> >>>>>
> >>>>>> Say, we have two classes:
> >>>>>>
> >>>>>> class A {
> >>>>>>    private int id;
> >>>>>> }
> >>>>>>
> >>>>>> class B extends A {
> >>>>>>    private int id;
> >>>>>> }
> >>>>>>
> >>>>>> In this case we will get two fields: "A.id" and "B.id". The
only
> >>>>>>
> >>>>> issue is
> >>>>
> >>>>> that if there are no name conflict, we should be able to resolve
by
> >>>>>>
> >>>>> both
> >>>>
> >>>>> names - with or without prepended type name. I.e., if A is
> serialized,
> >>>>>>
> >>>>> you
> >>>>>
> >>>>>> can get the field value by "id" or "A.id". This is similar to
how it
> >>>>>>
> >>>>> works
> >>>>>
> >>>>>> if you join two SQL tables with the same column names.
> >>>>>>
> >>>>>> Any thoughts on whether it's doable or not?
> >>>>>>
> >>>>>> -Val
> >>>>>>
> >>>>>> On Fri, Dec 18, 2015 at 10:05 AM, Andrey Kornev <
> >>>>>>
> >>>>> andrewkornev@hotmail.com>
> >>>>>
> >>>>>> wrote:
> >>>>>>
> >>>>>> In this particular case, the class that fails is a non-static
inner
> >>>>>>>
> >>>>>> class
> >>>>>
> >>>>>> that extends another non-static inner class, so they both end
up
> >>>>>>>
> >>>>>> having
> >>>>
> >>>>> the
> >>>>>>
> >>>>>>> compiler-generated "this$0" field.
> >>>>>>>
> >>>>>>> Regards
> >>>>>>> Andrey
> >>>>>>>
> >>>>>>> Date: Fri, 18 Dec 2015 20:44:12 +0300
> >>>>>>>> Subject: Re: CacheEntry serialization failure
> >>>>>>>> From: vozerov@gridgain.com
> >>>>>>>> To: dev@ignite.apache.org
> >>>>>>>>
> >>>>>>>> The most straightforward solution which comes to my
mind - *do not
> >>>>>>>>
> >>>>>>> ever
> >>>>>
> >>>>>> use
> >>>>>>>
> >>>>>>>> BinaryMarshaller by default*. Always fallback to
> >>>>>>>>
> >>>>>>> OptimizedMarshaller
> >>>>
> >>>>> unless
> >>>>>>>
> >>>>>>>> user explicitly asked us to use binary format (e.g.
through
> >>>>>>>>
> >>>>>>> package
> >>>>
> >>>>> wildcards).
> >>>>>>>>
> >>>>>>>> BTW, we already do this for Externalizable and
> >>>>>>>>
> >>>>>>> readObject/writeObject.
> >>>>>
> >>>>>> On Fri, Dec 18, 2015 at 8:41 PM, Vladimir Ozerov <
> >>>>>>>>
> >>>>>>> vozerov@gridgain.com
> >>>>>
> >>>>>> wrote:
> >>>>>>>>
> >>>>>>>> Ah, I saw your problem with DirectedSpecifics. We need
to think
> >>>>>>>>>
> >>>>>>>> about
> >>>>>
> >>>>>> how
> >>>>>>>
> >>>>>>>> to solve it. Here is the case:
> >>>>>>>>> 1) Class is Serilzable and cannot be changed;
> >>>>>>>>> 2) There are several duplicated field names;
> >>>>>>>>> => BinaryMarshaller cannot handle it.
> >>>>>>>>>
> >>>>>>>>> Any thoughts?
> >>>>>>>>>
> >>>>>>>>> On Fri, Dec 18, 2015 at 8:34 PM, Vladimir Ozerov
<
> >>>>>>>>>
> >>>>>>>> vozerov@gridgain.com
> >>>>>>
> >>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>> I fixed the problem, it was a bug actually.
> >>>>>>>>>>
> >>>>>>>>>> By default classes which has some custom Java
logic (e.g.
> >>>>>>>>>>
> >>>>>>>>> Externalizable,
> >>>>>>>
> >>>>>>>> or with writeObject/readObject methods) will be written
using
> >>>>>>>>>> OptimizedMarshaller, so similar field names
is not a problem.
> >>>>>>>>>> If you want to serialize such class in binary
format and have
> >>>>>>>>>>
> >>>>>>>>> duplicate
> >>>>>>>
> >>>>>>>> field names, you should provide your own BinarySerializer,
> >>>>>>>>>>
> >>>>>>>>> which
> >>>>
> >>>>> will
> >>>>>>
> >>>>>>> write
> >>>>>>>
> >>>>>>>> these fields with different names.
> >>>>>>>>>>
> >>>>>>>>>> On Fri, Dec 18, 2015 at 8:07 PM, Andrey Kornev
<
> >>>>>>>>>>
> >>>>>>>>> andrewkornev@hotmail.com>
> >>>>>>>
> >>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>> How am I supposed to handle this situation if
the class comes
> >>>>>>>>>>>
> >>>>>>>>>> from
> >>>>>
> >>>>>> a
> >>>>>>
> >>>>>>> 3d
> >>>>>>>
> >>>>>>>> party I can't modify?
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks
> >>>>>>>>>>> Andrey
> >>>>>>>>>>>
> >>>>>>>>>>> Date: Fri, 18 Dec 2015 09:12:22 +0300
> >>>>>>>>>>>> Subject: Re: CacheEntry serialization
failure
> >>>>>>>>>>>> From: vozerov@gridgain.com
> >>>>>>>>>>>> To: dev@ignite.apache.org
> >>>>>>>>>>>>
> >>>>>>>>>>>> I'll take a look.
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Fri, Dec 18, 2015 at 4:37 AM, Valentin
Kulichenko <
> >>>>>>>>>>>> valentin.kulichenko@gmail.com> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Folks,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> It looks like CacheEntry implementation
(i.e., the entry
> >>>>>>>>>>>>>
> >>>>>>>>>>>> that
> >>>>
> >>>>> contains
> >>>>>>>>>>>
> >>>>>>>>>>>> version) can't be properly serialized
with the
> >>>>>>>>>>>>>
> >>>>>>>>>>>> BinaryMarshaller.
> >>>>>>
> >>>>>>> I
> >>>>>>>
> >>>>>>>> created
> >>>>>>>>>>>
> >>>>>>>>>>>> the test and the ticket:
> >>>>>>>>>>>>>
> >>>>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-2203
> >>>>>>>>>>>
> >>>>>>>>>>>> Can someone take a look?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> -Val
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>
> >>>
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message