ignite-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Dmitriy Setrakyan <dsetrak...@gridgain.com>
Subject Re: CacheEntry serialization failure
Date Mon, 21 Dec 2015 14:49:48 GMT
Are you talking about SQL queries? I thought that we agreed to use field aliases in case of
name conflicts, no?

D.

> On Dec 21, 2015, at 4:57 AM, Vladimir Ozerov <vozerov@gridgain.com> wrote:
> 
> Several additional problems appeared:
> 1) We must use fully-qualified class names because simple class names might
> also be the same. E.g: class package1.A extends package2.A
> 2) Our query engine doesn't like dots and "$" from class names. Because of
> this fields with these characters cannot be queried.
> 
> To workaorund these problems I did the following:
> 1) Fully qualified class names are used;
> 2) "." is replaced with "_" and "$" is replaced with "__". E.g. field
> org.my.package.MyClass$MyInnerClass.x is written as "
> org_my_package_MyClass__MyInnerClass_x";
> 3) If user would like to query the field, he can call special helper
> method BinaryUtils.qualifiedFieldName(Class
> cls, String fieldName) returning correct field name.
> 
> As this problem is not likely to occur in real quering scenarios, I think
> this solution is more or less fine.
> 
> Thoughts?
> 
> 
> On Mon, Dec 21, 2015 at 12:28 PM, Sergey Kozlov <skozlov@gridgain.com>
> wrote:
> 
>> From my standpoint Vova's appoach very close to SQL behavior if two joined
>> tables have same column names.
>> 
>> On Mon, Dec 21, 2015 at 12:23 PM, Dmitriy Setrakyan <dsetrakyan@apache.org
>>> 
>> wrote:
>> 
>>> Got it. I understand the reasoning now for not throwing an exception.
>> Let’s
>>> make sure we document this behavior.
>>> 
>>> If there is no additional field-name-parsing overhead, then the proposed
>>> API looks very nice.
>>> 
>>> D.
>>> 
>>> On Mon, Dec 21, 2015 at 1:19 AM, Vladimir Ozerov <vozerov@gridgain.com>
>>> wrote:
>>> 
>>>> Dima,
>>>> 
>>>> Here is how proposed design works:
>>>> 
>>>> class A {
>>>>    int x = 1;
>>>> }
>>>> class B {
>>>>    int x = 2;
>>>> }
>>>> 
>>>> BinaryObject obj = ...;
>>>> 
>>>> Object val = obj.field("A.x"); // Returns "1";
>>>> Object val = obj.field("B.x"); // Returns "2";
>>>> Object val = obj.field("x"); // Returns null;
>>>> 
>>>> boolean exists = obj.hasField("A.x"); // Returns "true";
>>>> boolean exists = obj.hasField("B.x"); // Returns "true";
>>>> boolean exists = obj.hasField("x"); // Returns "false";
>>>> 
>>>> Looks clean and consistent for me. Remember that we are talking about
>>> very
>>>> specific use case. It is very unlikely that user will operate on
>> objects
>>>> conflicting fields in binary form.
>>>> Also, there will be no parsing at all. We use field name passed by user
>>>> directly.
>>>> 
>>>> Vladimir.
>>>> 
>>>> On Mon, Dec 21, 2015 at 12:10 PM, Dmitriy Setrakyan <
>>> dsetrakyan@apache.org
>>>>> 
>>>> wrote:
>>>> 
>>>>> Vova,
>>>>> 
>>>>> We cannot return null in case of a conflict, as user won’t be able
to
>>>>> differentiate between a conflict and missing field. We should throw
>> an
>>>>> exception.
>>>>> 
>>>>> Also, I don’t like parsing field names looking for a dot for every
>>> field.
>>>>> It will introduce a performance overhead for the cases that do not
>> have
>>>>> conflicts. Instead, we should add another API for this use case,
>>>> something
>>>>> like field(typeName, fieldName).
>>>>> 
>>>>> D.
>>>>> 
>>>>> On Mon, Dec 21, 2015 at 1:01 AM, Vladimir Ozerov <
>> vozerov@gridgain.com
>>>> 
>>>>> wrote:
>>>>> 
>>>>>> Denis,
>>>>>> Yes, as we do not know which field to pick, we return null.
>>>>>> 
>>>>>> On Mon, Dec 21, 2015 at 12:00 PM, Denis Magda <dmagda@gridgain.com
>>> 
>>>>> wrote:
>>>>>> 
>>>>>>> Sounds good for me. I would go for this approach.
>>>>>>> 
>>>>>>> In addition if to consider your example below and the user
>> decides
>>> to
>>>>>> look
>>>>>>> up a field by its simple name then he/she will get nothing or
>>>> exception
>>>>>>> (depending on the API), correct?
>>>>>>> As an example for this case the method will return null
>>>>>>> 
>>>>>>> BinaryObject obj = ...;
>>>>>>> Object val = obj.field("x"); // null will be returned cause we
>>> don't
>>>>> know
>>>>>>> what particular 'x' we have to return
>>>>>>> 
>>>>>>> 
>>>>>>> --
>>>>>>> Denis
>>>>>>> 
>>>>>>>> On 12/21/2015 11:48 AM, Vladimir Ozerov wrote:
>>>>>>>> 
>>>>>>>> Folks,
>>>>>>>> 
>>>>>>>> I thought about the solution a bit more and came to the
>> following
>>>>>> design.
>>>>>>>> 
>>>>>>>> *Scenario:*
>>>>>>>> class A {
>>>>>>>>     int x;
>>>>>>>> }
>>>>>>>> class B : extends A {
>>>>>>>>     int y;
>>>>>>>> }
>>>>>>>> 
>>>>>>>> *Solution:*
>>>>>>>> 1) Field A.x is written as *"A.x"*, field B.x is written
as
>>> *"B.x"*.
>>>>>> I.e.
>>>>>>>> *both
>>>>>>>> conflicting fields are prefixed* with simple name of the
owning
>>>> class.
>>>>>>>> 2) API is unchanged. User manipulates these fields on all
public
>>>>> methods
>>>>>>>> in
>>>>>>>> exactly the same way: "A.x" and "B.x".
>>>>>>>> 
>>>>>>>> *Rationale:*
>>>>>>>> 1) We cannot prefix only some of conflicting fields. E.g.
if
>>> decide
>>>> to
>>>>>>>> prefix only A.x, then it is not clear how to handle this
case:
>>>>>>>> 
>>>>>>>> class B extends A implements Binarylizable {
>>>>>>>>     void write(BinaryWriter writer) {
>>>>>>>>         writer.writeInt("B.x", x); // User intentionally
>> written
>>>>> field
>>>>>> as
>>>>>>>> "B.x".
>>>>>>>>     }
>>>>>>>> }
>>>>>>>> 
>>>>>>>> BinaryObject obj = ...;
>>>>>>>> Object val = obj.field("B.x"); // Should we lookup for "B.x"
as
>>> user
>>>>>> asked
>>>>>>>> us, or just for "x"?
>>>>>>>> 
>>>>>>>> Prefixing all conflicting fields with class name resolves
the
>>>> problem.
>>>>>>>> 
>>>>>>>> 2) If we add methods to manipulate fields not only by name,
but
>> by
>>>>>>>> (typeName + fieldName) as well, then we will have to add
*9 new
>>>>> methods*
>>>>>>>> to
>>>>>>>> 
>>>>>>>> API:
>>>>>>>> BinaryType.fieldTypeName(String typeName, String fieldName);
>>>>>>>> BinaryType.field(String typeName, String fieldName);
>>>>>>>> BinaryObject.field(String typeName, String fieldName);
>>>>>>>> BinaryObject.hasField(String  typeName, String fieldName);
>>>>>>>> BinaryObjectBuilder.getField(String typeName, String fieldName);
>>>>>>>> BinaryObjectBuilder.setField(String typeName, String fieldName,
>>>> ...);
>>>>>> // 3
>>>>>>>> overloads
>>>>>>>> BinaryObjectBuilder.removeField(String typeName, String
>>> fieldName);
>>>>>>>> 
>>>>>>>> This is definitely an overkill for such a rare scenario.
>>>>>>>> 
>>>>>>>> Thoughts?
>>>>>>>> 
>>>>>>>> On Mon, Dec 21, 2015 at 11:22 AM, Vladimir Ozerov <
>>>>> vozerov@gridgain.com
>>>>>>> 
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>> Agree, prefixing parent class fields sound like a better
option.
>>>>>>>>> Regarding aliases - I need some time to understand internal
>>>>> mechanics.
>>>>>>>>> Will answer this a bit later.
>>>>>>>>> 
>>>>>>>>> On Mon, Dec 21, 2015 at 11:18 AM, Dmitriy Setrakyan <
>>>>>>>>> dsetrakyan@apache.org
>>>>>>>>> 
>>>>>>>>>> wrote:
>>>>>>>>>> Vova,
>>>>>>>>>> 
>>>>>>>>>> Shouldn’t it be the other way around? Class B writes
the field
>>> as
>>>>> “a”,
>>>>>>>>>> and
>>>>>>>>>> class A writes it with a prefix (possibly the hash
code of the
>>>> class
>>>>>>>>>> name).
>>>>>>>>>> 
>>>>>>>>>> Also, we should clearly document how the SQL queries
are
>>> affected
>>>> by
>>>>>>>>>> this.
>>>>>>>>>> AFAIK, we should be using field aliases here, no?
>>>>>>>>>> 
>>>>>>>>>> D.
>>>>>>>>>> 
>>>>>>>>>> On Sun, Dec 20, 2015 at 10:08 AM, Vladimir Ozerov
<
>>>>>> vozerov@gridgain.com
>>>>>>>>>>> 
>>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>> May be we can use normal field names by default and
add some
>>>>>>>>>>> 
>>>>>>>>>> prefix/suffix
>>>>>>>>>> 
>>>>>>>>>>> if conflict is found? E.g.:
>>>>>>>>>>> 
>>>>>>>>>>> class A {
>>>>>>>>>>>     int a; // Write as "a";
>>>>>>>>>>> }
>>>>>>>>>>> 
>>>>>>>>>>> class B extends A {
>>>>>>>>>>>     int a; // Write as "B_a";
>>>>>>>>>>> }
>>>>>>>>>>> 
>>>>>>>>>>> On Fri, Dec 18, 2015 at 10:34 PM, Valentin Kulichenko
<
>>>>>>>>>>> valentin.kulichenko@gmail.com> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Folks,
>>>>>>>>>>>> 
>>>>>>>>>>>> It seems to me that issue here is not with
3rd-party
>>> libraries.
>>>> We
>>>>>>>>>>>> 
>>>>>>>>>>> just
>>>>>>>>>> 
>>>>>>>>>>> don't properly support class hierarchy in binary
format. Any
>>>> class
>>>>>>>>>>>> 
>>>>>>>>>>> that
>>>>>>>>>> 
>>>>>>>>>>> extends another class and has the field with
the same name as
>>>>> parent
>>>>>>>>>>>> 
>>>>>>>>>>> has
>>>>>>>>>> 
>>>>>>>>>>> will fail unless user provides custom serialization
logic
>> that
>>>> will
>>>>>>>>>>>> 
>>>>>>>>>>> handle
>>>>>>>>>>> 
>>>>>>>>>>>> it.
>>>>>>>>>>>> 
>>>>>>>>>>>> What if we prepend the field name with the
simple class name
>>> in
>>>>> this
>>>>>>>>>>>> 
>>>>>>>>>>> case?
>>>>>>>>>>> 
>>>>>>>>>>>> Say, we have two classes:
>>>>>>>>>>>> 
>>>>>>>>>>>> class A {
>>>>>>>>>>>>   private int id;
>>>>>>>>>>>> }
>>>>>>>>>>>> 
>>>>>>>>>>>> class B extends A {
>>>>>>>>>>>>   private int id;
>>>>>>>>>>>> }
>>>>>>>>>>>> 
>>>>>>>>>>>> In this case we will get two fields: "A.id"
and "B.id". The
>>> only
>>>>>>>>>>>> 
>>>>>>>>>>> issue is
>>>>>>>>>> 
>>>>>>>>>>> that if there are no name conflict, we should
be able to
>>> resolve
>>>> by
>>>>>>>>>>>> 
>>>>>>>>>>> both
>>>>>>>>>> 
>>>>>>>>>>> names - with or without prepended type name.
I.e., if A is
>>>>>> serialized,
>>>>>>>>>>>> 
>>>>>>>>>>> you
>>>>>>>>>>> 
>>>>>>>>>>>> can get the field value by "id" or "A.id".
This is similar
>> to
>>>> how
>>>>> it
>>>>>>>>>>>> 
>>>>>>>>>>> works
>>>>>>>>>>> 
>>>>>>>>>>>> if you join two SQL tables with the same
column names.
>>>>>>>>>>>> 
>>>>>>>>>>>> Any thoughts on whether it's doable or not?
>>>>>>>>>>>> 
>>>>>>>>>>>> -Val
>>>>>>>>>>>> 
>>>>>>>>>>>> On Fri, Dec 18, 2015 at 10:05 AM, Andrey
Kornev <
>>>>>>>>>>>> 
>>>>>>>>>>> andrewkornev@hotmail.com>
>>>>>>>>>>> 
>>>>>>>>>>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> In this particular case, the class that fails
is a
>> non-static
>>>>> inner
>>>>>>>>>>>>> 
>>>>>>>>>>>> class
>>>>>>>>>>> 
>>>>>>>>>>>> that extends another non-static inner class,
so they both
>> end
>>> up
>>>>>>>>>>>>> 
>>>>>>>>>>>> having
>>>>>>>>>> 
>>>>>>>>>>> the
>>>>>>>>>>>> 
>>>>>>>>>>>>> compiler-generated "this$0" field.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Regards
>>>>>>>>>>>>> Andrey
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Date: Fri, 18 Dec 2015 20:44:12 +0300
>>>>>>>>>>>>>> Subject: Re: CacheEntry serialization
failure
>>>>>>>>>>>>>> From: vozerov@gridgain.com
>>>>>>>>>>>>>> To: dev@ignite.apache.org
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> The most straightforward solution
which comes to my mind -
>>> *do
>>>>> not
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> ever
>>>>>>>>>>> 
>>>>>>>>>>>> use
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> BinaryMarshaller by default*. Always
fallback to
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> OptimizedMarshaller
>>>>>>>>>> 
>>>>>>>>>>> unless
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> user explicitly asked us to use binary
format (e.g.
>> through
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> package
>>>>>>>>>> 
>>>>>>>>>>> wildcards).
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> BTW, we already do this for Externalizable
and
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> readObject/writeObject.
>>>>>>>>>>> 
>>>>>>>>>>>> On Fri, Dec 18, 2015 at 8:41 PM, Vladimir
Ozerov <
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> vozerov@gridgain.com
>>>>>>>>>>> 
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Ah, I saw your problem with DirectedSpecifics.
We need to
>>>> think
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> about
>>>>>>>>>>> 
>>>>>>>>>>>> how
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> to solve it. Here is the case:
>>>>>>>>>>>>>>> 1) Class is Serilzable and cannot
be changed;
>>>>>>>>>>>>>>> 2) There are several duplicated
field names;
>>>>>>>>>>>>>>> => BinaryMarshaller cannot
handle it.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Any thoughts?
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On Fri, Dec 18, 2015 at 8:34
PM, Vladimir Ozerov <
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> vozerov@gridgain.com
>>>>>>>>>>>> 
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> I fixed the problem, it was a
bug actually.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> By default classes which
has some custom Java logic
>> (e.g.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Externalizable,
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> or with writeObject/readObject methods)
will be written
>>> using
>>>>>>>>>>>>>>>> OptimizedMarshaller, so similar
field names is not a
>>>> problem.
>>>>>>>>>>>>>>>> If you want to serialize
such class in binary format and
>>>> have
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> duplicate
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> field names, you should provide your
own BinarySerializer,
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> which
>>>>>>>>>> 
>>>>>>>>>>> will
>>>>>>>>>>>> 
>>>>>>>>>>>>> write
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> these fields with different names.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> On Fri, Dec 18, 2015 at 8:07
PM, Andrey Kornev <
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> andrewkornev@hotmail.com>
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> How am I supposed to handle
this situation if the class
>>>> comes
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> from
>>>>>>>>>>> 
>>>>>>>>>>>> a
>>>>>>>>>>>> 
>>>>>>>>>>>>> 3d
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> party I can't modify?
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>> Andrey
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Date: Fri, 18 Dec 2015
09:12:22 +0300
>>>>>>>>>>>>>>>>>> Subject: Re: CacheEntry
serialization failure
>>>>>>>>>>>>>>>>>> From: vozerov@gridgain.com
>>>>>>>>>>>>>>>>>> To: dev@ignite.apache.org
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> I'll take a look.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> On Fri, Dec 18, 2015
at 4:37 AM, Valentin Kulichenko <
>>>>>>>>>>>>>>>>>> valentin.kulichenko@gmail.com>
wrote:
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Folks,
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> It looks like
CacheEntry implementation (i.e., the
>>> entry
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> that
>>>>>>>>>> 
>>>>>>>>>>> contains
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> version) can't be
properly serialized with the
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> BinaryMarshaller.
>>>>>>>>>>>> 
>>>>>>>>>>>>> I
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> created
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> the test and the
ticket:
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-2203
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Can someone take
a look?
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> -Val
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 
>> 
>> 
>> --
>> Sergey Kozlov
>> 

Mime
View raw message