openjpa-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Shelley <>
Subject SELECT DISTINCT returns duplicates
Date Mon, 10 Dec 2007 22:25:57 GMT
The JPA spec (4.8) indicates that duplicate values must be eliminated from
the query result when the DISTINCT keyword is used; however, I've run into
situations with OpenJPA where this does not seem to be working.  Using the
DISTINCT keyword on a SELECT JOIN query returns duplicate values.

Here is a simple example that illustrates the problem. There are two
entities, one which defines a OneToMany relationship with the other entity.

@Entity(name = "ENTITY_ONE")
public class EntityOne {

    @TableGenerator(name = "ONE_GEN", pkColumnName = "GEN_NAME",
pkColumnValue = "ONE_GEN")
    @GeneratedValue(strategy = GenerationType.TABLE, generator = "ONE_GEN")
    private Long id;

    @OneToMany(cascade = CascadeType.ALL)
    private Set<EntityTwo> entityTwos;

    public EntityOne() {
        this.entityTwos = new HashSet<EntityTwo>();

    public Set<EntityTwo> getEntityTwos() {
        return entityTwos;

    public Long getId() {
        return id;

@Entity(name = "ENTITY_TWO")
public class EntityTwo {

    @TableGenerator(name = "TWO_GEN", pkColumnName = "GEN_NAME",
pkColumnValue = "TWO_GEN")
    @GeneratedValue(strategy = GenerationType.TABLE, generator = "TWO_GEN")
    private Long id;

    private String text;

    public EntityTwo() {

    public EntityTwo(String text) {
        this.text = text;

    public String getText() {
        return text;

    public Long getId() {
        return id;


Two entity instances are persisted (eOne1 and eOne2):

    EntityOne eOne1 = new EntityOne();
    Set<EntityTwo> eTwos1 = eOne1.getEntityTwos();
    eTwos1.add(new EntityTwo("test1"));
    eTwos1.add(new EntityTwo("test2"));

    EntityOne eOne2 = new EntityOne();
    Set<EntityTwo> eTwos2 = eOne2.getEntityTwos();
    eTwos2.add(new EntityTwo("test1"));
    eTwos2.add(new EntityTwo("test3"));

When the following EJB Query is executed:
   "SELECT DISTINCT e FROM ENTITY_ONE e LEFT JOIN FETCH e.entityTwos WHERE IN (" + eOne1.getId() + "," + eOne2.getId() + ")"

Four results are returned in the ResultList, even though there are only two
distinct results. For example:
  [id=200] com.test.EntityOne@23d08b
  [id=200] com.test.EntityOne@23d08b
  [id=201] com.test.EntityOne@1a32ea4
  [id=201] com.test.EntityOne@1a32ea4

This is likely occurring because a result is being returned for each unique
EntityOne + EntityTwo; however, this seems to violate the spec. Is there
something incorrect in my code or is this a bug?

(Note: If I run this same code using a different persistence provider, such
as TopLink or Hibernate, only 2 results are returned.)

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message