db-derby-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Daniel Noll <dan...@nuix.com>
Subject Re: Filtering using the sort order on two columns
Date Tue, 04 Aug 2009 23:57:51 GMT
On Wed, Aug 5, 2009 at 00:16, Bryan Pendleton<bpendleton@amberpoint.com> wrote:
>
> What happens if you do two separate queries:
>
>    SELECT field, text, docfreq    FROM Terms
>    WHERE field > ?
>    ORDER BY field ASC, text ASC
>
>    SELECT field, text, docfreq    FROM Terms
>    WHERE (field = ? AND text > ?)
>    ORDER BY field ASC, text ASC
>
> Sometimes the "OR" will really throw off the optimizer.
>
> Does each query, separately, use the index as you expect?

Indeed, both of these queries (even the one with only >) do an index
scan instead of a table scan.  I think you're right; the optimiser
sees the OR and goes "oops, this condition is too hard", because it
doesn't know that the two ranges are adjacent (which would be hard to
detect, I'm guessing) and then proceeds to use the table scan instead.

I am considering using two separate queries as a workaround.  I can
look in the same field and then if that returns nothing, look for the
next field.  As far as caching goes, yes... I want to do that, but I
aimed for correctness first.  I can't cache *all* entries because
there might be an extremely large number, but I can at least cache
some 100 or 1,000 and then avoid the next 99 queries.

The generated column idea offered by Rick isn't bad either, but it
will double storage requirements versus doing the concatenation
manually, or versus this trick.

Daniel


-- 
Daniel Noll                            Forensic and eDiscovery Software
Senior Developer                              The world's most advanced
Nuix                                                email data analysis
http://nuix.com/                                and eDiscovery software

Mime
View raw message