Mailing-List: contact users-help@openjpa.apache.org; run by ezmlm
Precedence: bulk
Reply-To: users@openjpa.apache.org
Received-SPF: pass (athena.apache.org: domain of michael.d.dick@gmail.com
 designates 209.85.218.223 as permitted sender)
DomainKey-Signature: a=rsa-sha1; c=nofws;
        d=gmail.com; s=gamma;
        h=mime-version:in-reply-to:references:from:date:message-id:subject:to
         :content-type;
        b=L2P84mdvnRUZsgiRv9jyutHjZnFHTBpKIVY8Rr6ao+55gPUYU2GiWY+8Y1JApsT9LS
         ZVer4SiDULpoHvSq03IFjZpgMz5SVx1l+dJmDpe7drHgeW2+xYKdVtYRlN0OPUZysP6O
         6gc50r97B08XOo/O4PYmiZDR6xlT4xsv1zVn4=
MIME-Version: 1.0
In-Reply-To: <f62a888.7bb16b45.4b61d317.c1dff@o2.pl>
References: 
 <1CFFBC30691805488AD718FA2CBA01F86CA991@DOIT-EX502.exec.ds.state.ct.us>
	<d6e53b2e1001260947m1ebac92i17f3b8e808a75ff3@mail.gmail.com>
	<1CFFBC30691805488AD718FA2CBA01F86CAA1B@DOIT-EX502.exec.ds.state.ct.us>
	<f62a888.7bb16b45.4b61d317.c1dff@o2.pl>
From: Michael Dick <michael.d.dick@gmail.com>
Date: Mon, 8 Feb 2010 16:44:26 -0600
Message-ID: <72c1350f1002081444g40a88693w2be30772e91ed80e@mail.gmail.com>
Subject: Re: EntityManager fetch and query by key behave differently?
To: users@openjpa.apache.org
Content-Type: multipart/alternative; boundary=00032555415a4abdc1047f1e8cc6

--00032555415a4abdc1047f1e8cc6
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

Hi all,

It seems to me that there are really two scenarios here.

1.) The results from executing a query, or find should contain _the_ values
as returned by JDBC. So for DB2, if a column is returned as 'test     ' we
should store that value in the entity. Ie :
EntityManager em =3D emf.createEntityManager();
MyEntity e =3D em.find(MyEntity.class, 'test     ') ;
e.getId().equals('test      ')   // the same would apply to a nonID field.

2.) When an entity is already in the persistence context we should use the
values from in memory (at least until em.refresh is called). Ie.
EntityManager em =3D emf.createEntityManager();
em.persist(new MyEntity('test');  // no whitespace

MyEntity e =3D em.find(MyEntity.class, 'test     ') ;  // we know the datab=
ase
will trim trailing whitespace.
e.getId().equals('test')   // 'test' is still in memory, and we must return
a single instance for each row.

2.b)
em.refresh(e);
e.getId().equals('test     ')  // since we've been told to update with the
latest values from the database this would be a time to add trailing
whitespace.

Of these, it's #2 which concerns me the most. Having two instances which
correspond to the same row can lead to interesting problems (that and I
think I know how to fix it).

I've opened JIRA issue OPENJPA-1501 for #2 in particular, but I'll take a
look at the other aspects too..

Like Milosz mentioned the behavior might be different across databases. In
some 'raw' JDBC tests I noticed that DB2 and Oracle would include the
trailing spaces, but MySQL truncated them. Some further investigation is
probably needed, but ultimately I think we'd want to rely on the JDBC drive=
r
returning the 'right' value..

-mike

On Thu, Jan 28, 2010 at 12:10 PM, Mi=C5=82osz Tylenda <mtylenda@o2.pl> wrot=
e:

> Hi All,
>
> That interesting mismatch reminds me of a few side-notes.
>
> 1. I remember Oracle treating "test2    " as equal to "test2" in WHERE
> clause when using constants. In other words, if a table has a row with va=
lue
> "test2    " the following query will return it:
>
> SELECT * FROM table WHERE column=3D'test2';
>
> On the other hand, if the query is written using ?-parameter
>
> SELECT * FROM table WHERE column=3D?;
>
> you have to right-pad the parameter ("test2    "). Otherwise, no match wi=
ll
> be found. It was at Oracle 8/9 times, I don't know if anything changed
> recently.
>
> 2. I am wondering whether the JDBC spec defines the behaviour of
> ?-parameters in our context.
>
> 3. OpenJPA uses ?-parameters in this case, even if the JPQL query uses
> constants.
>
> 4. I can imagine a similar problem can arise if one tries to use
> approximate types as primary keys, e.g. float. However, this
> approximate-problem is clearer since the JPA spec explicitly discourages
> such usages.
>
> Regards,
> Milosz
>
>
> > Heath,
> >
> > Thanks for confirming the behavior, I was beginning to think I was
> > imagining things :^)
> >
> > From my limited experience, here's my two cents: make find() and JPQL
> > work as much like SQL as possible (for this case).
> >
> > To me this means:
> >
> > 1. Always return a padded string when loading an attribute mapped to a
> > fixed-width column from the database [1].
> >
> > 2. A query result should never return the query parameter used, only th=
e
> > actual value of the column.
> >
> > In SQL, if you query:
> >
> >         SELECT e.ent1pk FROM Ent1 e WHERE e.ent1pk =3D 'test2'
> >
> > you expect to get 'test2    ' if that is what is stored in the ent1pk
> > column.  Yes, the COMPARISON of 'test2' and 'test2   ' is a little
> > complicated, as you describe.  However, fundamentally, 'test2' is NOT
> > STORED IN THE DATABASE.  So why would JPA return it to me in ANY query
> > result?
> >
> > If we were to follow this approach, then all entities Ent1 would have
> > padded primary keys ('test2    ').  The existing default equals()
> > implementation then would work as desired.  And, because of that, I
> > assume the JPA machinery would no longer load two entities for one
> > database key.
> >
> > I realize that this approach only treats the query problem.  You also
> > describe a problem upon entity creation -- where you create an Ent1 wit=
h
> > ent1pk=3D'test2' and later query it other ways and find it's not equal =
to
> > itself. You wrote:
> >
> >       // The type of the Primary Key column here is CHAR(10)
> >       Ent1 ent1 =3D new Ent1("test2");
> >       em.persist(ent1);
> >       em.flush();
> >       em.refresh(ent1);
> >
> > Extending the above "like SQL" rule, one COULD expect ent1's primary ke=
y
> > value to change from 'test2' to 'test2      ', perhaps after persist or
> > upon refresh.  In SQL inserting 'test2' into a fixed-width column will
> > pad the value [1].
> >
> > I see this as an unavoidable artifact of the SQL layer.  So, I'm willin=
g
> > to put up with this one.  JPA, by it's nature as a mapping layer, is
> > bound to be a slightly leaky abstraction.  The only alternative I can
> > see is to throw an exception if the ent1pk field is not the correct
> > (database) length.  I'm not sure that's better though.
> >
> >   --Rich
> >
> > [1] My knowledge of SQL is limited to practical experience, primarily
> > with DB2.  I'm not sure if these behaviors are actually in the SQL
> > specification or not.
> >
> >
> >
> > -----Original Message-----
> > From: Heath Thomann [mailto:jpaheath@gmail.com]
> > Sent: Tuesday, January 26, 2010 12:47 PM
> > To: users@openjpa.apache.org
> > Subject: Re: EntityManager fetch and query by key behave differently?
> >
> > Richard,
> >     Ironically, I'm working a customer problem where similar behavior i=
s
> > exhibited.  I have findings similar to yours, and then some.  I'd like
> > to detail my findings, but ultimately we have a couple bugs with the us=
e
> > of Strings as primary keys when mapped to CHARs or VARCHARs.  I will
> > focus only on CHARs but I have noticed some similar results with
> > VARCHARs.  To set the stage, lets take this entity:
> >
> > @Entity
> > public class Ent1 {
> >     @Id
> >     @Column(columnDefinition=3D"CHAR(10)")
> >     private String ent1pk;
> >    ..........
> >     public String toString() {
> >         return this.getClass().getName() + "[PK: " +
> > this.getPk().toString()
> > + "]";
> >     }
> >
> >
> > With this entity, lets perform the following tests:
> >
> >
> > EntityManager em;
> > em.getTransaction().begin();
> >
> > // The type of the Primary Key column here is CHAR(10)
> > Ent1 ent1 =3D new Ent1("test2");
> > em.persist(ent1);
> > em.flush();
> > em.refresh(ent1);
> >
> > // case a: perform a query using a key padded with spaces...note the
> > trailing spaces in //the Primary key in the second query:
> > Ent1 ent1FromQuery_noPads =3D (Ent1) em.createQuery("SELECT e FROM Ent1=
 e
> > WHERE e.ent1pk =3D 'test2'").getSingleResult();
> > Ent1 ent1FromQuery_wPads =3D (Ent1) em.createQuery("SELECT e FROM Ent1 =
e
> > WHERE
> > e.ent1pk =3D 'test2        '").getSingleResult();
> >
> > System.out.println("ent1 =3D " + ent1);
> > System.out.println("ent1FromQuery_noPads =3D " + ent1FromQuery_noPads);
> > System.out.println("ent1 =3D=3D ent1FromQuery_noPads : " + (ent1 =3D=3D
> > ent1FromQuery_noPads));
> > System.out.println("ent1.equals(ent1FromQuery_noPads) : " +
> > (ent1.equals(ent1FromQuery_noPads)));
> > System.out.println("ent1FromQuery_wPads =3D " + ent1FromQuery_wPads);
> > System.out.println("ent1 =3D=3D ent1FromQuery_wPads : " + (ent1 =3D=3D
> > ent1FromQuery_wPads));
> > System.out.println("ent1.equals(ent1FromQuery_wPads) : " +
> > (ent1.equals(ent1FromQuery_wPads)));
> >
> >
> > // case b: perform a find using a key padded with spaces...note the
> > trailing spaces in //the Primary key in the second find:
> > Ent1 ent1FromFind_noPads =3D em.find(Ent1.class, "test2");
> > Ent1 ent1FromFind_wPads =3D em.find(Ent1.class, "test2        ");
> >
> > System.out.println("ent1FromFind_noPads =3D " + ent1FromFind_noPads);
> > System.out.println("ent1 =3D=3D ent1FromFind_noPads : " + (ent1 =3D=3D
> > ent1FromFind_noPads));
> > System.out.println("ent1.equals(ent1FromFind_noPads) : " +
> > (ent1.equals(ent1FromFind_noPads)));
> > System.out.println("ent1FromFind_wPads =3D " + ent1FromFind_wPads);
> > System.out.println("ent1 =3D=3D ent1FromFind_wPads : " + (ent1 =3D=3D
> > ent1FromFind_wPads));
> > System.out.println("ent1.equals(ent1FromFind_wPads) : " +
> > (ent1.equals(ent1FromFind_wPads)));
> >
> >
> > With this test, let me show the output (println's) and then we can
> > discuss what is wrong with the output and why.....the output is as
> > follows:
> >
> >
> > 1)  ent1 =3D Ent1[PK: test2]
> > 2)  ent1FromQuery_noPads =3D Ent1[PK: test2     ]
> > 3)  ent1 =3D=3D ent1FromQuery_noPads : false
> > 4)  ent1.equals(ent1FromQuery_noPads) : false
> > 5)  ent1FromQuery_wPads =3D Ent1[PK: test2     ]
> > 6)  ent1 =3D=3D ent1FromQuery_wPads : false
> > 7)  ent1.equals(ent1FromQuery_wPads) : false
> >
> > 8)  ent1FromFind_noPads =3D Ent1[PK: test2]
> > 9)  ent1 =3D=3D ent1FromFind_noPads : true
> > 10) ent1.equals(ent1FromFind_noPads) : true
> > 11) ent1FromFind_wPads =3D Ent1[PK: test2        ]
> > 12) ent1 =3D=3D ent1FromFind_wPads : false
> > 13) ent1.equals(ent1FromFind_wPads) : false
> >
> >
> > A study of this output shows a couple problems:
> > 1) Entity Ent1 was created with a PK of "test2" which contains no
> > pads/trailing spaces.  However, as we can see in line 2 and 5, the PK o=
f
> > the entities returned from a 'find', regardless of whether we use a
> > 'padded' key or not, yields an entity with a padded PK.  That is, the P=
K
> > contains 5 pads, for a total of 10 chars (i.e. the column length).
> > Lines 3, 4, 6, and 7 yield incorrect results given the PK is padded.
> > Essentially, after the query we have two entities which correspond to
> > the same row (which is not correct per JPA spec).
> > 2) When we do a 'find', rather than a 'query', we get different results=
.
> > When we do a query using a PK with no pads, we get back an Entity which
> > contains a PK with no pads.  When we do a query with a PK which has bee=
n
> > padded with 8 pads/spaces, the entity we get back contains a PK with 8
> > pads as well.  In this case, line 9 and 10 shows correct results, but
> > line 12 and
> > 13 shows incorrect results.
> >
> > In speaking to my colleague Mike Dick, he found the following in the SQ=
L
> > spec which would indicate that we have a bug, and why (paragraph 3.b
> > being the pertinent info):
> >
> > 3) The comparison of two character strings is determined as follows:
> > a) Let CS be the collation as determined by Subclause 9.13, "Collation
> > determination", for the declared types of the two character strings.
> > b) If the length in characters of X is not equal to the length in
> > characters of Y, then the shorter string is effectively replaced, for
> > the purposes of comparison, with a copy of itself that has been extende=
d
> > to the length of the longer string by concatenation on the right of one
> > or more pad characters, where the pad character is chosen based on CS.
> > If CS has the NO PAD characteristic, then the pad character is an
> > implementation-dependent character different from any character in the
> > character set of X and Y that collates less than any string under CS.
> > Otherwise, the pad character is a <space>.
> > c) The result of the comparison of X and Y is given by the collation CS=
.
> > d) Depending on the collation, two strings may compare as equal even if
> > they are of different lengths or contain different sequences of
> > characters. When any of the operations MAX, MIN, and DISTINCT reference
> > a grouping column, and the UNION, EXCEPT, and INTERSECT operators refer
> > to character strings, the specific value selected by these operations
> > from a set of such equal values is implementationdependent.
> >
> >
> > This indicates that the database should treat 'test' =3D=3D 'test ' =3D=
=3D
> > 'test             '.   By extension we'd assume that JPA should also
> > treat
> > them as equal.
> >
> > I'd love to hear from the communtiy on this matter to determine what th=
e
> > correct behaviour should be.
> >
> > Thanks,
> >
> > Heath
> >
> >
> > On Mon, Dec 14, 2009 at 11:17 AM, Landers, Richard
> > <Richard.Landers@ct.gov>wrote:
> >
> > > Hello everyone,
> > >
> > > Can anyone clarify my understanding:
> > >
> > > I have an entity ("Organization") that I often fetch by it's key (a
> > > String).
> > >
> > > In DB2 (v9.5 community) the ID field is a CHAR(10).
> > >
> > > I've been using EntityManger.find(Class, String) to do the job for
> > > months.
> > >
> > > I was prettying-up URLs in my web layer when I noticed this behavior:
> > >
> > > (1) When I pass in: "ORG-10    " (note the trailing spaces) I get an
> > > entity with "ORG-10    " (with trailing spaces) as it's ID.
> > >
> > > (2) When I pass in "ORG-10" (trimmed) I get an entity with "ORG-10"
> > > (trimmed) as it's ID.
> > >
> > > Behavior #2 surprised me! I expected "ORG-10    ", with trailing
> > spaces,
> > > just as it is in the database.
> > >
> > > As an experiment, I changed my service implementation to use a query
> > > on the key instead of find() and the behavior goes away.  That is, I
> > get
> > > "ORG-10    " as the ID of the Organization regardless of whether the
> > > input was trimmed or not.
> > >
> > > Am I misunderstanding how it should work?
> > >
> > >  --Rich
> > >
> > >
> >
> >
> >
> >
> >
>

--00032555415a4abdc1047f1e8cc6--