db-derby-dev mailing list archives

From "Harshvardhan Gupta (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DERBY-6940) Enhance derby statistics for more accurate selectivity estimates.
Date Fri, 23 Jun 2017 16:47:00 GMT

    [ https://issues.apache.org/jira/browse/DERBY-6940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16061206#comment-16061206 ]

Harshvardhan Gupta commented on DERBY-6940:

Hi Bryan,

I thought of a workaround, and it worked. I compare maxVal and minVal; if they are
equal, I first write an indicator boolean and then write only one DataValueDescriptor
object. In all other cases, I write maxVal first and then minVal. This way the problematic
object is always written last, and only once.

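public void writeExternal(ObjectOutput out)
	throws IOException
{
	FormatableHashtable fh = new FormatableHashtable();
	fh.putLong("numRows", numRows);
	fh.putLong("numUnique", numUnique);
	fh.putLong("nullCount", nullCount);
	try {
		if (maxVal.equals(maxVal, minVal).getBoolean()) {
			// [statements not preserved in the archive: write the indicator
			//  boolean, then a single DataValueDescriptor]
		} else {
			// [statements not preserved in the archive: write maxVal, then minVal]
		}
	}
	catch(StandardException e){
		// [body not preserved in the archive]
	}
	finally {
		// [body not preserved in the archive]
	}
}
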
public void readExternal(ObjectInput in)
	throws IOException, ClassNotFoundException
{
	FormatableHashtable fh = (FormatableHashtable)in.readObject();
	numRows = fh.getLong("numRows");
	numUnique = fh.getLong("numUnique");
	nullCount = fh.getLong("nullCount");
	// [indicator check not preserved in the archive]
	// Case 1: maxVal and minVal were equal, so only one object was written.
		maxVal = (DataValueDescriptor)in.readObject();
		minVal = maxVal.cloneValue(true);
	// Case 2: the values differed, so both were written, maxVal first.
		maxVal = (DataValueDescriptor) in.readObject();
		minVal = (DataValueDescriptor) in.readObject();
}
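
For readers who want to try the idea outside Derby, here is a minimal, self-contained sketch of the same indicator-boolean scheme. It is not the code from the patch: the class name MinMaxStats, the roundTrip helper, and the use of plain Object values in place of DataValueDescriptor are all assumptions made for illustration, and the indicator is written directly to the stream, which may differ from where the patch stores it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.util.Objects;

// Hypothetical stand-in for the statistics class; not Derby code.
final class MinMaxStats implements Externalizable {
    long numRows;
    Object minVal; // stand-in for a DataValueDescriptor
    Object maxVal;

    public MinMaxStats() { } // no-arg constructor required by Externalizable

    MinMaxStats(long numRows, Object minVal, Object maxVal) {
        this.numRows = numRows;
        this.minVal = minVal;
        this.maxVal = maxVal;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeLong(numRows);
        // Indicator: true means min and max are equal and only one object follows.
        boolean same = Objects.equals(maxVal, minVal);
        out.writeBoolean(same);
        out.writeObject(maxVal);
        if (!same) {
            out.writeObject(minVal);
        }
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
        numRows = in.readLong();
        boolean same = in.readBoolean();
        maxVal = in.readObject();
        // Sharing the reference is fine for immutable stand-ins; a mutable
        // value type would need a copy here instead.
        minVal = same ? maxVal : in.readObject();
    }

    // Serializes and deserializes a copy in memory by calling the two methods directly.
    static MinMaxStats roundTrip(MinMaxStats s) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                s.writeExternal(out);
            }
            MinMaxStats copy = new MinMaxStats();
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                copy.readExternal(in);
            }
            return copy;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling MinMaxStats.roundTrip on an instance whose minimum and maximum are equal exercises the single-object branch; any other instance exercises the two-object branch.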

> Enhance derby statistics for more accurate selectivity estimates.
> -----------------------------------------------------------------
>                 Key: DERBY-6940
>                 URL: https://issues.apache.org/jira/browse/DERBY-6940
>             Project: Derby
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Harshvardhan Gupta
>            Assignee: Harshvardhan Gupta
>            Priority: Minor
>         Attachments: DERBY-6940_2.diff, DERBY-6940_3.diff, derby-6940.diff, EOFException_derby.log,
> Derby should collect extra statistics at index build time and statistics refresh time,
> which will help the optimizer make more precise selectivity estimates and choose better execution
> We eventually want to utilize the new statistics to make better selectivity estimates
> / cost estimates that will help find the best query plan. Currently Derby keeps two types of
> stats - the total row count and the number of unique values.
> We are initially extending the stats to include null count, the minimum value and maximum
> value associated with each of the columns of an index. This would be useful in selectivity
> estimates for operators such as [ IS NULL, <, <=, >, >= ], all of which currently
> rely on hardwired selectivity estimates.
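
To illustrate how a null count, minimum, and maximum could feed selectivity estimates for IS NULL and a range operator, here is a rough sketch. It is not Derby's optimizer code: the class SelectivitySketch and its methods are hypothetical, the column is assumed to be numeric, and non-null values are assumed to be uniformly distributed between the minimum and maximum.

```java
// Hypothetical illustration only; not part of Derby.
final class SelectivitySketch {

    // Estimated fraction of rows satisfying "col IS NULL".
    static double isNull(long numRows, long nullCount) {
        if (numRows <= 0) {
            return 0.0;
        }
        return (double) nullCount / numRows;
    }

    // Estimated fraction of rows satisfying "col < c", assuming non-null
    // values are spread uniformly over [min, max]. NULLs never qualify.
    static double lessThan(long numRows, long nullCount,
                           double min, double max, double c) {
        if (numRows <= 0) {
            return 0.0;
        }
        double nonNullFraction = (double) (numRows - nullCount) / numRows;
        if (c <= min) {
            return 0.0;             // nothing lies below the minimum
        }
        if (c > max) {
            return nonNullFraction; // every non-null value qualifies
        }
        // Here min < c <= max, so max - min is strictly positive.
        return nonNullFraction * (c - min) / (max - min);
    }
}
```

With 100 rows, 20 of them NULL, and values ranging from 0 to 10, this sketch estimates that "col < 5" selects about 40% of the rows, rather than falling back on a fixed constant.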
