db-derby-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "David Tarico" <dav...@mits.com>
Subject Order of Rows Changing
Date Fri, 16 Apr 2010 23:02:40 GMT
Hi,

I have a situation where I run a query multiple times in succession, and
the order of the rows in the result set is changing, even though I am
the only user connected to the database and I am not changing the data
in any way.  

I realize that that except for columns that are explicitly sorted, the
order of rows in the ResultSet is arbitrary, but it's surprising to me
that the arbitrary ordering has an element of randomness that will
produce a different order from one moment to the next.  I'm wondering if
someone can explain why this randomness happens and if there is any way
to prevent it.

My full query is below, but it is simply: Select [columns...] from table
where Region = 'West' order by SalesRep

There is an index on both Region and SalesRep.  According to the query
plan, Derby is using the Region index to filter, and then doing an
external sort on disk involving merge runs.  (See below for full query
plan.)

I'm just guessing, but my theory is that when the results are written
out to files on disk, the order that the files are merged together is
random, producing differing order of rows within a SalesRep.

Here's an example of results I'm seeing:

SaleRep 1, customer FOO
SaleRep 1: customer BAR
SaleRep 1: customer BAZ
... 
SaleRep 2, customer FOO
SaleRep 2: customer BAR
SaleRep 2: customer BAZ
...

Then I run it again and I get:

SaleRep 1, customer FOO
SaleRep 1: customer BAR
SaleRep 1: customer BAZ
... 
SaleRep 2, customer XXX
SaleRep 2: customer YYY
SaleRep 2: customer ZZZ
...

Running the query a few more times sometimes produces the first results,
sometimes the second results, and sometimes one of several additional
orderings of rows within SalesRep 2.  Another very strange thing is that
for SaleRep 1, the order of rows stays the same.  The order is only
changing for rows within SaleRep 2 (and maybe other SalesReps).

My usecase is that I want users to be able to page through the data, and
for each page, I run the same SQL query and then pull out the rows for
that page, expecting the ResultSet to be the same.  For this query,
obviously I could force a consistent order of rows by order by SalesRep
and then by primary key.  

But what I'd really like to understand is what the conditions are that
trigger the random behavior in the ordering.  If my query was just
"Select [columns...] from table where Region = 'West'", and there was no
external merge sort, would the order of the rows fluctuate? If I don't
already need to sort the data for other reasons, I'd rather not pay the
performance penalty of adding a sort on the primary key.  

It boils down to this.  I want to run arbitrary queries, and I want to
ensure a consistent order of results, but I don't care what that order
is, and I don't want to pay the penalty of always doing a sort on
primary key.  Thoughts?

I'm running Derby embedded using JDBC driver 10.2.1.6, and the derby
database internal data structures are version 10.1.  My SQL and query
plan are below. Any tips or explanations are much appreciated!

Thanks
David.



SELECT 	RPT_PARENT_1195016960126.EXTRACT_ID,
	RPT_PARENT_1195016960126.MCUST_NUM_1387820363_648703,
	RPT_PARENT_1195016960126.M_AR_TERM_NUM,
	RPT_PARENT_1195016960126.MAR_TERM_DE__1467979434_85937,
	RPT_PARENT_1195016960126.M_CRED_LIMIT,
	RPT_PARENT_1195016960126.MREFERENCE__422035331_13607,
	RPT_PARENT_1195016960126.MCNAME_64264526_571575,
	RPT_PARENT_1195016960126.M_ADDRESS1, 
	RPT_PARENT_1195016960126.M_CITY, 
	RPT_PARENT_1195016960126.M_STATE, 
	RPT_PARENT_1195016960126.M_ZIP, 
	RPT_PARENT_1195016960126.MSTART_DATE__1104237094_24220,
	RPT_PARENT_1195016960126.MLAST_SALE_489198751_2515,
	RPT_PARENT_1195016960126.MLOYAL_REGI_64663127_29517,
	RPT_PARENT_1195016960126.PREV14629895,
	RPT_PARENT_1195016960126.M_SLSM_NUM,
	RPT_PARENT_1195016960126.MSELL_WHSE___2061051145_35424,
	RPT_PARENT_1195016960126.MSLSM_DESC__1782850260_16931,
	RPT_PARENT_1195016960126.MPY2_TOT___451324362_81337,
	RPT_PARENT_1195016960126.MPY_TOT___766274698_2014,
	RPT_PARENT_1195016960126.MCY_TOT____415423177_70471,
	RPT_PARENT_1195016960126.LIFE14629712,
	RPT_PARENT_1195016960126.M_CONTACT, 
	RPT_PARENT_1195016960126.M_PHONE, 
	RPT_PARENT_1195016960126.MFAX_69373_43504,
	RPT_PARENT_1195016960126.M_EMAIL_ADDR 
FROM RPT_PARENT_1195016960126 
WHERE  (  (  (  RPT_PARENT_1195016960126.MLOYAL_REGI_64663127_29517 =
'West'  )  )  )  
ORDER BY RPT_PARENT_1195016960126.MSLSM_DESC__1782850260_16931 ASC


Statement Name: 
	null
Statement Text: 
	SELECT RPT_PARENT_1195016960126.EXTRACT_ID,
RPT_PARENT_1195016960126.MCUST_NUM_1387820363_648703,
RPT_PARENT_1195016960126.M_AR_TERM_NUM,
RPT_PARENT_1195016960126.MAR_TERM_DE__1467979434_85937,
RPT_PARENT_1195016960126.M_CRED_LIMIT,
RPT_PARENT_1195016960126.MREFERENCE__422035331_13607,
RPT_PARENT_1195016960126.MCNAME_64264526_571575,
RPT_PARENT_1195016960126.M_ADDRESS1, 
RPT_PARENT_1195016960126.M_CITY, 
RPT_PARENT_1195016960126.M_STATE, 
RPT_PARENT_1195016960126.M_ZIP, 
RPT_PARENT_1195016960126.MSTART_DATE__1104237094_24220,
RPT_PARENT_1195016960126.MLAST_SALE_489198751_2515,
RPT_PARENT_1195016960126.MLOYAL_REGI_64663127_29517,
RPT_PARENT_1195016960126.PREV14629895,
RPT_PARENT_1195016960126.M_SLSM_NUM,
RPT_PARENT_1195016960126.MSELL_WHSE___2061051145_35424,
RPT_PARENT_1195016960126.MSLSM_DESC__1782850260_16931,
RPT_PARENT_1195016960126.MPY2_TOT___451324362_81337,
RPT_PARENT_1195016960126.MPY_TOT___766274698_2014,
RPT_PARENT_1195016960126.MCY_TOT____415423177_70471,
RPT_PARENT_1195016960126.LIFE14629712,
RPT_PARENT_1195016960126.M_CONTACT, 
RPT_PARENT_1195016960126.M_PHONE, 
RPT_PARENT_1195016960126.MFAX_69373_43504,
RPT_PARENT_1195016960126.M_EMAIL_ADDR 
FROM RPT_PARENT_1195016960126 WHERE  (  (  (
RPT_PARENT_1195016960126.MLOYAL_REGI_64663127_29517 = 'West'  )  )  )  
ORDER BY RPT_PARENT_1195016960126.MSLSM_DESC__1782850260_16931 ASC
Parse Time: 0
Bind Time: 15
Optimize Time: 110
Generate Time: 0
Compile Time: 125
Execute Time: 1078
Begin Compilation Timestamp : 2010-04-16 13:26:24.046
End Compilation Timestamp : 2010-04-16 13:26:24.171
Begin Execution Timestamp : 2010-04-16 13:32:25.289
End Execution Timestamp : 2010-04-16 13:32:26.382
Statement Execution Plan Text: 
Sort ResultSet:
Number of opens = 1
Rows input = 12189
Rows returned = 160
Eliminate duplicates = false
In sorted order = false
Sort information: 
	Number of merge runs=16
	Number of rows input=12189
	Number of rows output=12189
	Size of merge runs=[731, 731, 731, 731, 731, 731, 731, 731, 731,
731, 731, 731, 731, 731, 731, 731]
	Sort type=external
	constructor time (milliseconds) = 0
	open time (milliseconds) = 1062
	next time (milliseconds) = 16
	close time (milliseconds) = 0
	optimizer estimated row count:         9391.00
	optimizer estimated cost:        55759.64

Source result set:
	Project-Restrict ResultSet (3):
	Number of opens = 1
	Rows seen = 12189
	Rows filtered = 0
	restriction = false
	projection = true
		constructor time (milliseconds) = 0
		open time (milliseconds) = 0
		next time (milliseconds) = 720
		close time (milliseconds) = 0
		restriction time (milliseconds) = 0
		projection time (milliseconds) = 0
		optimizer estimated row count:         9391.00
		optimizer estimated cost:        55759.64

	Source result set:
		Index Row to Base Row ResultSet for
RPT_PARENT_1195016960126:
		Number of opens = 1
		Rows seen = 12189
		Columns accessed from heap = {2, 3, 4, 9, 12, 20, 21,
29, 31, 39, 45, 50, 52, 58, 59, 65, 68, 69, 74, 82, 85, 86, 87, 94, 96,
102}
			constructor time (milliseconds) = 0
			open time (milliseconds) = 0
			next time (milliseconds) = 720
			close time (milliseconds) = 0
			optimizer estimated row count:         9391.00
			optimizer estimated cost:        55759.64

			Index Scan ResultSet for
RPT_PARENT_1195016960126 using index
DIX_MLOYAL_REGI_64663127_29517_1271197022544 at read committed isolation
level using instantaneous share row locking chosen by the optimizer
			Number of opens = 1
			Rows seen = 12189
			Rows filtered = 0
			Fetch Size = 16
				constructor time (milliseconds) = 0
				open time (milliseconds) = 0
				next time (milliseconds) = 48
				close time (milliseconds) = 0
				next time in milliseconds/row = 0

			scan information: 
				Bit set of columns fetched=All
				Number of columns fetched=2
				Number of deleted rows visited=0
				Number of pages visited=73
				Number of rows qualified=12189
				Number of rows visited=12190
				Scan type=btree
				Tree height=3
				start position: 
	>= on first 1 column(s).
	Ordered null semantics on the following columns: 

				stop position: 
	> on first 1 column(s).
	Ordered null semantics on the following columns: 

				qualifiers:
None
				optimizer estimated row count:
9391.00
				optimizer estimated cost:
55759.64



Mime
View raw message