openjpa-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Landers, Richard" <Richard.Land...@ct.gov>
Subject RE: EntityManager fetch and query by key behave differently?
Date Wed, 27 Jan 2010 16:53:10 GMT
Heath,

Thanks for confirming the behavior, I was beginning to think I was
imagining things :^)

>From my limited experience, here's my two cents: make find() and JPQL
work as much like SQL as possible (for this case).

To me this means:

1. Always return a padded string when loading an attribute mapped to a
fixed-width column from the database [1].

2. A query result should never return the query parameter used, only the
actual value of the column.

In SQL, if you query:

        SELECT e.ent1pk FROM Ent1 e WHERE e.ent1pk = 'test2'

you expect to get 'test2    ' if that is what is stored in the ent1pk
column.  Yes, the COMPARISON of 'test2' and 'test2   ' is a little
complicated, as you describe.  However, fundamentally, 'test2' is NOT
STORED IN THE DATABASE.  So why would JPA return it to me in ANY query
result?

If we were to follow this approach, then all entities Ent1 would have
padded primary keys ('test2    ').  The existing default equals()
implementation then would work as desired.  And, because of that, I
assume the JPA machinery would no longer load two entities for one
database key.

I realize that this approach only treats the query problem.  You also
describe a problem upon entity creation -- where you create an Ent1 with
ent1pk='test2' and later query it other ways and find it's not equal to
itself. You wrote:

	// The type of the Primary Key column here is CHAR(10)
	Ent1 ent1 = new Ent1("test2");
	em.persist(ent1);
	em.flush();
	em.refresh(ent1);

Extending the above "like SQL" rule, one COULD expect ent1's primary key
value to change from 'test2' to 'test2      ', perhaps after persist or
upon refresh.  In SQL inserting 'test2' into a fixed-width column will
pad the value [1].

I see this as an unavoidable artifact of the SQL layer.  So, I'm willing
to put up with this one.  JPA, by it's nature as a mapping layer, is
bound to be a slightly leaky abstraction.  The only alternative I can
see is to throw an exception if the ent1pk field is not the correct
(database) length.  I'm not sure that's better though.

  --Rich

[1] My knowledge of SQL is limited to practical experience, primarily
with DB2.  I'm not sure if these behaviors are actually in the SQL
specification or not.



-----Original Message-----
From: Heath Thomann [mailto:jpaheath@gmail.com]
Sent: Tuesday, January 26, 2010 12:47 PM
To: users@openjpa.apache.org
Subject: Re: EntityManager fetch and query by key behave differently?

Richard,
    Ironically, I'm working a customer problem where similar behavior is
exhibited.  I have findings similar to yours, and then some.  I'd like
to detail my findings, but ultimately we have a couple bugs with the use
of Strings as primary keys when mapped to CHARs or VARCHARs.  I will
focus only on CHARs but I have noticed some similar results with
VARCHARs.  To set the stage, lets take this entity:

@Entity
public class Ent1 {
    @Id
    @Column(columnDefinition="CHAR(10)")
    private String ent1pk;
   ..........
    public String toString() {
        return this.getClass().getName() + "[PK: " +
this.getPk().toString()
+ "]";
    }


With this entity, lets perform the following tests:


EntityManager em;
em.getTransaction().begin();

// The type of the Primary Key column here is CHAR(10)
Ent1 ent1 = new Ent1("test2");
em.persist(ent1);
em.flush();
em.refresh(ent1);

// case a: perform a query using a key padded with spaces...note the
trailing spaces in //the Primary key in the second query:
Ent1 ent1FromQuery_noPads = (Ent1) em.createQuery("SELECT e FROM Ent1 e
WHERE e.ent1pk = 'test2'").getSingleResult();
Ent1 ent1FromQuery_wPads = (Ent1) em.createQuery("SELECT e FROM Ent1 e
WHERE
e.ent1pk = 'test2        '").getSingleResult();

System.out.println("ent1 = " + ent1);
System.out.println("ent1FromQuery_noPads = " + ent1FromQuery_noPads);
System.out.println("ent1 == ent1FromQuery_noPads : " + (ent1 ==
ent1FromQuery_noPads));
System.out.println("ent1.equals(ent1FromQuery_noPads) : " +
(ent1.equals(ent1FromQuery_noPads)));
System.out.println("ent1FromQuery_wPads = " + ent1FromQuery_wPads);
System.out.println("ent1 == ent1FromQuery_wPads : " + (ent1 ==
ent1FromQuery_wPads));
System.out.println("ent1.equals(ent1FromQuery_wPads) : " +
(ent1.equals(ent1FromQuery_wPads)));


// case b: perform a find using a key padded with spaces...note the
trailing spaces in //the Primary key in the second find:
Ent1 ent1FromFind_noPads = em.find(Ent1.class, "test2");
Ent1 ent1FromFind_wPads = em.find(Ent1.class, "test2        ");

System.out.println("ent1FromFind_noPads = " + ent1FromFind_noPads);
System.out.println("ent1 == ent1FromFind_noPads : " + (ent1 ==
ent1FromFind_noPads));
System.out.println("ent1.equals(ent1FromFind_noPads) : " +
(ent1.equals(ent1FromFind_noPads)));
System.out.println("ent1FromFind_wPads = " + ent1FromFind_wPads);
System.out.println("ent1 == ent1FromFind_wPads : " + (ent1 ==
ent1FromFind_wPads));
System.out.println("ent1.equals(ent1FromFind_wPads) : " +
(ent1.equals(ent1FromFind_wPads)));


With this test, let me show the output (println's) and then we can
discuss what is wrong with the output and why.....the output is as
follows:


1)  ent1 = Ent1[PK: test2]
2)  ent1FromQuery_noPads = Ent1[PK: test2     ]
3)  ent1 == ent1FromQuery_noPads : false
4)  ent1.equals(ent1FromQuery_noPads) : false
5)  ent1FromQuery_wPads = Ent1[PK: test2     ]
6)  ent1 == ent1FromQuery_wPads : false
7)  ent1.equals(ent1FromQuery_wPads) : false

8)  ent1FromFind_noPads = Ent1[PK: test2]
9)  ent1 == ent1FromFind_noPads : true
10) ent1.equals(ent1FromFind_noPads) : true
11) ent1FromFind_wPads = Ent1[PK: test2        ]
12) ent1 == ent1FromFind_wPads : false
13) ent1.equals(ent1FromFind_wPads) : false


A study of this output shows a couple problems:
1) Entity Ent1 was created with a PK of "test2" which contains no
pads/trailing spaces.  However, as we can see in line 2 and 5, the PK of
the entities returned from a 'find', regardless of whether we use a
'padded' key or not, yields an entity with a padded PK.  That is, the PK
contains 5 pads, for a total of 10 chars (i.e. the column length).
Lines 3, 4, 6, and 7 yield incorrect results given the PK is padded.
Essentially, after the query we have two entities which correspond to
the same row (which is not correct per JPA spec).
2) When we do a 'find', rather than a 'query', we get different results.
When we do a query using a PK with no pads, we get back an Entity which
contains a PK with no pads.  When we do a query with a PK which has been
padded with 8 pads/spaces, the entity we get back contains a PK with 8
pads as well.  In this case, line 9 and 10 shows correct results, but
line 12 and
13 shows incorrect results.

In speaking to my colleague Mike Dick, he found the following in the SQL
spec which would indicate that we have a bug, and why (paragraph 3.b
being the pertinent info):

3) The comparison of two character strings is determined as follows:
a) Let CS be the collation as determined by Subclause 9.13, "Collation
determination", for the declared types of the two character strings.
b) If the length in characters of X is not equal to the length in
characters of Y, then the shorter string is effectively replaced, for
the purposes of comparison, with a copy of itself that has been extended
to the length of the longer string by concatenation on the right of one
or more pad characters, where the pad character is chosen based on CS.
If CS has the NO PAD characteristic, then the pad character is an
implementation-dependent character different from any character in the
character set of X and Y that collates less than any string under CS.
Otherwise, the pad character is a <space>.
c) The result of the comparison of X and Y is given by the collation CS.
d) Depending on the collation, two strings may compare as equal even if
they are of different lengths or contain different sequences of
characters. When any of the operations MAX, MIN, and DISTINCT reference
a grouping column, and the UNION, EXCEPT, and INTERSECT operators refer
to character strings, the specific value selected by these operations
from a set of such equal values is implementationdependent.


This indicates that the database should treat 'test' == 'test ' ==
'test             '.   By extension we'd assume that JPA should also
treat
them as equal.

I'd love to hear from the communtiy on this matter to determine what the
correct behaviour should be.

Thanks,

Heath


On Mon, Dec 14, 2009 at 11:17 AM, Landers, Richard
<Richard.Landers@ct.gov>wrote:

> Hello everyone,
>
> Can anyone clarify my understanding:
>
> I have an entity ("Organization") that I often fetch by it's key (a
> String).
>
> In DB2 (v9.5 community) the ID field is a CHAR(10).
>
> I've been using EntityManger.find(Class, String) to do the job for
> months.
>
> I was prettying-up URLs in my web layer when I noticed this behavior:
>
> (1) When I pass in: "ORG-10    " (note the trailing spaces) I get an
> entity with "ORG-10    " (with trailing spaces) as it's ID.
>
> (2) When I pass in "ORG-10" (trimmed) I get an entity with "ORG-10"
> (trimmed) as it's ID.
>
> Behavior #2 surprised me! I expected "ORG-10    ", with trailing
spaces,
> just as it is in the database.
>
> As an experiment, I changed my service implementation to use a query
> on the key instead of find() and the behavior goes away.  That is, I
get
> "ORG-10    " as the ID of the Organization regardless of whether the
> input was trimmed or not.
>
> Am I misunderstanding how it should work?
>
>  --Rich
>
>





Mime
View raw message