ignite-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Vladimir Ozerov <voze...@gridgain.com>
Subject Re: CacheEntry serialization failure
Date Mon, 21 Dec 2015 15:03:19 GMT
Well, if having only aliases as a way to resolve such conflicts is fine,
then there is not need for the things I described.

On Mon, Dec 21, 2015 at 5:49 PM, Dmitriy Setrakyan <dsetrakyan@gridgain.com>
wrote:

> Are you talking about SQL queries? I thought that we agreed to use field
> aliases in case of name conflicts, no?
>
> D.
>
> > On Dec 21, 2015, at 4:57 AM, Vladimir Ozerov <vozerov@gridgain.com>
> wrote:
> >
> > Several additional problems appeared:
> > 1) We must use fully-qualified class names because simple class names
> might
> > also be the same. E.g: class package1.A extends package2.A
> > 2) Our query engine doesn't like dots and "$" from class names. Because
> of
> > this fields with these characters cannot be queried.
> >
> > To workaorund these problems I did the following:
> > 1) Fully qualified class names are used;
> > 2) "." is replaced with "_" and "$" is replaced with "__". E.g. field
> > org.my.package.MyClass$MyInnerClass.x is written as "
> > org_my_package_MyClass__MyInnerClass_x";
> > 3) If user would like to query the field, he can call special helper
> > method BinaryUtils.qualifiedFieldName(Class
> > cls, String fieldName) returning correct field name.
> >
> > As this problem is not likely to occur in real quering scenarios, I think
> > this solution is more or less fine.
> >
> > Thoughts?
> >
> >
> > On Mon, Dec 21, 2015 at 12:28 PM, Sergey Kozlov <skozlov@gridgain.com>
> > wrote:
> >
> >> From my standpoint Vova's appoach very close to SQL behavior if two
> joined
> >> tables have same column names.
> >>
> >> On Mon, Dec 21, 2015 at 12:23 PM, Dmitriy Setrakyan <
> dsetrakyan@apache.org
> >>>
> >> wrote:
> >>
> >>> Got it. I understand the reasoning now for not throwing an exception.
> >> Let’s
> >>> make sure we document this behavior.
> >>>
> >>> If there is no additional field-name-parsing overhead, then the
> proposed
> >>> API looks very nice.
> >>>
> >>> D.
> >>>
> >>> On Mon, Dec 21, 2015 at 1:19 AM, Vladimir Ozerov <vozerov@gridgain.com
> >
> >>> wrote:
> >>>
> >>>> Dima,
> >>>>
> >>>> Here is how proposed design works:
> >>>>
> >>>> class A {
> >>>>    int x = 1;
> >>>> }
> >>>> class B {
> >>>>    int x = 2;
> >>>> }
> >>>>
> >>>> BinaryObject obj = ...;
> >>>>
> >>>> Object val = obj.field("A.x"); // Returns "1";
> >>>> Object val = obj.field("B.x"); // Returns "2";
> >>>> Object val = obj.field("x"); // Returns null;
> >>>>
> >>>> boolean exists = obj.hasField("A.x"); // Returns "true";
> >>>> boolean exists = obj.hasField("B.x"); // Returns "true";
> >>>> boolean exists = obj.hasField("x"); // Returns "false";
> >>>>
> >>>> Looks clean and consistent for me. Remember that we are talking about
> >>> very
> >>>> specific use case. It is very unlikely that user will operate on
> >> objects
> >>>> conflicting fields in binary form.
> >>>> Also, there will be no parsing at all. We use field name passed by
> user
> >>>> directly.
> >>>>
> >>>> Vladimir.
> >>>>
> >>>> On Mon, Dec 21, 2015 at 12:10 PM, Dmitriy Setrakyan <
> >>> dsetrakyan@apache.org
> >>>>>
> >>>> wrote:
> >>>>
> >>>>> Vova,
> >>>>>
> >>>>> We cannot return null in case of a conflict, as user won’t be
able to
> >>>>> differentiate between a conflict and missing field. We should throw
> >> an
> >>>>> exception.
> >>>>>
> >>>>> Also, I don’t like parsing field names looking for a dot for every
> >>> field.
> >>>>> It will introduce a performance overhead for the cases that do not
> >> have
> >>>>> conflicts. Instead, we should add another API for this use case,
> >>>> something
> >>>>> like field(typeName, fieldName).
> >>>>>
> >>>>> D.
> >>>>>
> >>>>> On Mon, Dec 21, 2015 at 1:01 AM, Vladimir Ozerov <
> >> vozerov@gridgain.com
> >>>>
> >>>>> wrote:
> >>>>>
> >>>>>> Denis,
> >>>>>> Yes, as we do not know which field to pick, we return null.
> >>>>>>
> >>>>>> On Mon, Dec 21, 2015 at 12:00 PM, Denis Magda <dmagda@gridgain.com
> >>>
> >>>>> wrote:
> >>>>>>
> >>>>>>> Sounds good for me. I would go for this approach.
> >>>>>>>
> >>>>>>> In addition if to consider your example below and the user
> >> decides
> >>> to
> >>>>>> look
> >>>>>>> up a field by its simple name then he/she will get nothing
or
> >>>> exception
> >>>>>>> (depending on the API), correct?
> >>>>>>> As an example for this case the method will return null
> >>>>>>>
> >>>>>>> BinaryObject obj = ...;
> >>>>>>> Object val = obj.field("x"); // null will be returned cause
we
> >>> don't
> >>>>> know
> >>>>>>> what particular 'x' we have to return
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> Denis
> >>>>>>>
> >>>>>>>> On 12/21/2015 11:48 AM, Vladimir Ozerov wrote:
> >>>>>>>>
> >>>>>>>> Folks,
> >>>>>>>>
> >>>>>>>> I thought about the solution a bit more and came to
the
> >> following
> >>>>>> design.
> >>>>>>>>
> >>>>>>>> *Scenario:*
> >>>>>>>> class A {
> >>>>>>>>     int x;
> >>>>>>>> }
> >>>>>>>> class B : extends A {
> >>>>>>>>     int y;
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>> *Solution:*
> >>>>>>>> 1) Field A.x is written as *"A.x"*, field B.x is written
as
> >>> *"B.x"*.
> >>>>>> I.e.
> >>>>>>>> *both
> >>>>>>>> conflicting fields are prefixed* with simple name of
the owning
> >>>> class.
> >>>>>>>> 2) API is unchanged. User manipulates these fields on
all public
> >>>>> methods
> >>>>>>>> in
> >>>>>>>> exactly the same way: "A.x" and "B.x".
> >>>>>>>>
> >>>>>>>> *Rationale:*
> >>>>>>>> 1) We cannot prefix only some of conflicting fields.
E.g. if
> >>> decide
> >>>> to
> >>>>>>>> prefix only A.x, then it is not clear how to handle
this case:
> >>>>>>>>
> >>>>>>>> class B extends A implements Binarylizable {
> >>>>>>>>     void write(BinaryWriter writer) {
> >>>>>>>>         writer.writeInt("B.x", x); // User intentionally
> >> written
> >>>>> field
> >>>>>> as
> >>>>>>>> "B.x".
> >>>>>>>>     }
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>> BinaryObject obj = ...;
> >>>>>>>> Object val = obj.field("B.x"); // Should we lookup for
"B.x" as
> >>> user
> >>>>>> asked
> >>>>>>>> us, or just for "x"?
> >>>>>>>>
> >>>>>>>> Prefixing all conflicting fields with class name resolves
the
> >>>> problem.
> >>>>>>>>
> >>>>>>>> 2) If we add methods to manipulate fields not only by
name, but
> >> by
> >>>>>>>> (typeName + fieldName) as well, then we will have to
add *9 new
> >>>>> methods*
> >>>>>>>> to
> >>>>>>>>
> >>>>>>>> API:
> >>>>>>>> BinaryType.fieldTypeName(String typeName, String fieldName);
> >>>>>>>> BinaryType.field(String typeName, String fieldName);
> >>>>>>>> BinaryObject.field(String typeName, String fieldName);
> >>>>>>>> BinaryObject.hasField(String  typeName, String fieldName);
> >>>>>>>> BinaryObjectBuilder.getField(String typeName, String
fieldName);
> >>>>>>>> BinaryObjectBuilder.setField(String typeName, String
fieldName,
> >>>> ...);
> >>>>>> // 3
> >>>>>>>> overloads
> >>>>>>>> BinaryObjectBuilder.removeField(String typeName, String
> >>> fieldName);
> >>>>>>>>
> >>>>>>>> This is definitely an overkill for such a rare scenario.
> >>>>>>>>
> >>>>>>>> Thoughts?
> >>>>>>>>
> >>>>>>>> On Mon, Dec 21, 2015 at 11:22 AM, Vladimir Ozerov <
> >>>>> vozerov@gridgain.com
> >>>>>>>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>> Agree, prefixing parent class fields sound like a better
option.
> >>>>>>>>> Regarding aliases - I need some time to understand
internal
> >>>>> mechanics.
> >>>>>>>>> Will answer this a bit later.
> >>>>>>>>>
> >>>>>>>>> On Mon, Dec 21, 2015 at 11:18 AM, Dmitriy Setrakyan
<
> >>>>>>>>> dsetrakyan@apache.org
> >>>>>>>>>
> >>>>>>>>>> wrote:
> >>>>>>>>>> Vova,
> >>>>>>>>>>
> >>>>>>>>>> Shouldn’t it be the other way around? Class
B writes the field
> >>> as
> >>>>> “a”,
> >>>>>>>>>> and
> >>>>>>>>>> class A writes it with a prefix (possibly the
hash code of the
> >>>> class
> >>>>>>>>>> name).
> >>>>>>>>>>
> >>>>>>>>>> Also, we should clearly document how the SQL
queries are
> >>> affected
> >>>> by
> >>>>>>>>>> this.
> >>>>>>>>>> AFAIK, we should be using field aliases here,
no?
> >>>>>>>>>>
> >>>>>>>>>> D.
> >>>>>>>>>>
> >>>>>>>>>> On Sun, Dec 20, 2015 at 10:08 AM, Vladimir Ozerov
<
> >>>>>> vozerov@gridgain.com
> >>>>>>>>>>>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>> May be we can use normal field names by default
and add some
> >>>>>>>>>>>
> >>>>>>>>>> prefix/suffix
> >>>>>>>>>>
> >>>>>>>>>>> if conflict is found? E.g.:
> >>>>>>>>>>>
> >>>>>>>>>>> class A {
> >>>>>>>>>>>     int a; // Write as "a";
> >>>>>>>>>>> }
> >>>>>>>>>>>
> >>>>>>>>>>> class B extends A {
> >>>>>>>>>>>     int a; // Write as "B_a";
> >>>>>>>>>>> }
> >>>>>>>>>>>
> >>>>>>>>>>> On Fri, Dec 18, 2015 at 10:34 PM, Valentin
Kulichenko <
> >>>>>>>>>>> valentin.kulichenko@gmail.com> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> Folks,
> >>>>>>>>>>>>
> >>>>>>>>>>>> It seems to me that issue here is not
with 3rd-party
> >>> libraries.
> >>>> We
> >>>>>>>>>>>>
> >>>>>>>>>>> just
> >>>>>>>>>>
> >>>>>>>>>>> don't properly support class hierarchy in
binary format. Any
> >>>> class
> >>>>>>>>>>>>
> >>>>>>>>>>> that
> >>>>>>>>>>
> >>>>>>>>>>> extends another class and has the field
with the same name as
> >>>>> parent
> >>>>>>>>>>>>
> >>>>>>>>>>> has
> >>>>>>>>>>
> >>>>>>>>>>> will fail unless user provides custom serialization
logic
> >> that
> >>>> will
> >>>>>>>>>>>>
> >>>>>>>>>>> handle
> >>>>>>>>>>>
> >>>>>>>>>>>> it.
> >>>>>>>>>>>>
> >>>>>>>>>>>> What if we prepend the field name with
the simple class name
> >>> in
> >>>>> this
> >>>>>>>>>>>>
> >>>>>>>>>>> case?
> >>>>>>>>>>>
> >>>>>>>>>>>> Say, we have two classes:
> >>>>>>>>>>>>
> >>>>>>>>>>>> class A {
> >>>>>>>>>>>>   private int id;
> >>>>>>>>>>>> }
> >>>>>>>>>>>>
> >>>>>>>>>>>> class B extends A {
> >>>>>>>>>>>>   private int id;
> >>>>>>>>>>>> }
> >>>>>>>>>>>>
> >>>>>>>>>>>> In this case we will get two fields:
"A.id" and "B.id". The
> >>> only
> >>>>>>>>>>>>
> >>>>>>>>>>> issue is
> >>>>>>>>>>
> >>>>>>>>>>> that if there are no name conflict, we should
be able to
> >>> resolve
> >>>> by
> >>>>>>>>>>>>
> >>>>>>>>>>> both
> >>>>>>>>>>
> >>>>>>>>>>> names - with or without prepended type name.
I.e., if A is
> >>>>>> serialized,
> >>>>>>>>>>>>
> >>>>>>>>>>> you
> >>>>>>>>>>>
> >>>>>>>>>>>> can get the field value by "id" or "A.id".
This is similar
> >> to
> >>>> how
> >>>>> it
> >>>>>>>>>>>>
> >>>>>>>>>>> works
> >>>>>>>>>>>
> >>>>>>>>>>>> if you join two SQL tables with the
same column names.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Any thoughts on whether it's doable
or not?
> >>>>>>>>>>>>
> >>>>>>>>>>>> -Val
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Fri, Dec 18, 2015 at 10:05 AM, Andrey
Kornev <
> >>>>>>>>>>>>
> >>>>>>>>>>> andrewkornev@hotmail.com>
> >>>>>>>>>>>
> >>>>>>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> In this particular case, the class that
fails is a
> >> non-static
> >>>>> inner
> >>>>>>>>>>>>>
> >>>>>>>>>>>> class
> >>>>>>>>>>>
> >>>>>>>>>>>> that extends another non-static inner
class, so they both
> >> end
> >>> up
> >>>>>>>>>>>>>
> >>>>>>>>>>>> having
> >>>>>>>>>>
> >>>>>>>>>>> the
> >>>>>>>>>>>>
> >>>>>>>>>>>>> compiler-generated "this$0" field.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Regards
> >>>>>>>>>>>>> Andrey
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Date: Fri, 18 Dec 2015 20:44:12
+0300
> >>>>>>>>>>>>>> Subject: Re: CacheEntry serialization
failure
> >>>>>>>>>>>>>> From: vozerov@gridgain.com
> >>>>>>>>>>>>>> To: dev@ignite.apache.org
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> The most straightforward solution
which comes to my mind -
> >>> *do
> >>>>> not
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> ever
> >>>>>>>>>>>
> >>>>>>>>>>>> use
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> BinaryMarshaller by default*.
Always fallback to
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> OptimizedMarshaller
> >>>>>>>>>>
> >>>>>>>>>>> unless
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> user explicitly asked us to
use binary format (e.g.
> >> through
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> package
> >>>>>>>>>>
> >>>>>>>>>>> wildcards).
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> BTW, we already do this for
Externalizable and
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> readObject/writeObject.
> >>>>>>>>>>>
> >>>>>>>>>>>> On Fri, Dec 18, 2015 at 8:41 PM, Vladimir
Ozerov <
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> vozerov@gridgain.com
> >>>>>>>>>>>
> >>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Ah, I saw your problem with
DirectedSpecifics. We need to
> >>>> think
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>> about
> >>>>>>>>>>>
> >>>>>>>>>>>> how
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> to solve it. Here is the case:
> >>>>>>>>>>>>>>> 1) Class is Serilzable and
cannot be changed;
> >>>>>>>>>>>>>>> 2) There are several duplicated
field names;
> >>>>>>>>>>>>>>> => BinaryMarshaller cannot
handle it.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Any thoughts?
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Fri, Dec 18, 2015 at
8:34 PM, Vladimir Ozerov <
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>> vozerov@gridgain.com
> >>>>>>>>>>>>
> >>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I fixed the problem, it
was a bug actually.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> By default classes which
has some custom Java logic
> >> (e.g.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Externalizable,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> or with writeObject/readObject
methods) will be written
> >>> using
> >>>>>>>>>>>>>>>> OptimizedMarshaller,
so similar field names is not a
> >>>> problem.
> >>>>>>>>>>>>>>>> If you want to serialize
such class in binary format and
> >>>> have
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> duplicate
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> field names, you should provide
your own BinarySerializer,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> which
> >>>>>>>>>>
> >>>>>>>>>>> will
> >>>>>>>>>>>>
> >>>>>>>>>>>>> write
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> these fields with different
names.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On Fri, Dec 18, 2015
at 8:07 PM, Andrey Kornev <
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> andrewkornev@hotmail.com>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> How am I supposed to
handle this situation if the class
> >>>> comes
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> from
> >>>>>>>>>>>
> >>>>>>>>>>>> a
> >>>>>>>>>>>>
> >>>>>>>>>>>>> 3d
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> party I can't modify?
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Thanks
> >>>>>>>>>>>>>>>>> Andrey
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Date: Fri, 18 Dec
2015 09:12:22 +0300
> >>>>>>>>>>>>>>>>>> Subject: Re:
CacheEntry serialization failure
> >>>>>>>>>>>>>>>>>> From: vozerov@gridgain.com
> >>>>>>>>>>>>>>>>>> To: dev@ignite.apache.org
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> I'll take a
look.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On Fri, Dec
18, 2015 at 4:37 AM, Valentin Kulichenko <
> >>>>>>>>>>>>>>>>>> valentin.kulichenko@gmail.com>
wrote:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Folks,
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> It looks
like CacheEntry implementation (i.e., the
> >>> entry
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> that
> >>>>>>>>>>
> >>>>>>>>>>> contains
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> version) can't
be properly serialized with the
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> BinaryMarshaller.
> >>>>>>>>>>>>
> >>>>>>>>>>>>> I
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> created
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> the test and
the ticket:
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-2203
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Can someone
take a look?
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> -Val
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
> >>
> >>
> >> --
> >> Sergey Kozlov
> >>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message