db-derby-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Rick Hillegas <Richard.Hille...@Sun.COM>
Subject Re: Need Advice on Indexes
Date Wed, 24 Dec 2008 13:49:45 GMT
Hi Sai,

It's hard to answer your question without seeing your queries. However, 
for an index to be useful, some subset of the restricted columns in your 
query must form a leading prefix of the indexed columns. So for 
instance, your index would not be useful for the following query:

  select * from job where proj_name  = 'Plan 9'

because the leading column of the index, proj_id, is not restricted by 
the query. If the timestamp ranges are always necessary in your queries, 
then I would make them the leading columns in the index.

Hope this helps,
-Rick

Sai Pullabhotla wrote:
> Dear Developers,
>
> I currently have the following table and index in a Derby database.
>
> create table job(
> 	job_id bigint not null,
> 	proj_id integer not null,
> 	proj_name varchar(50) not null,
> 	proj_folder varchar(256) not null,
> 	submit_user varchar(20) not null,
> 	run_user varchar(20) not null,
> 	queue_priority integer not null,
> 	run_priority integer not null,
> 	submit_time timestamp not null,
> 	start_time timestamp not null,
> 	end_time timestamp,
> 	time_taken bigint,
> 	job_log varchar(256) not null,
> 	status char(1) default 'A' not null,
> 	remarks varchar(512),
> 	sched_job_id integer,
> 	
> 	primary key(job_id),
> 	check(status in('A', 'S', 'F', 'C'))
> );
>
> create index idx_job on job (proj_id desc, proj_name desc, proj_folder
> desc, start_time desc, end_time desc, run_user desc, time_taken desc,
> status desc, sched_job_id desc);
>
> This table would typically have a few hundred thousand records
> (~500,000). Records should be searchable by a combination of date
> range (start_time), project name, project folder, run user, status
> (one or more status codes using IN clause). To further clarify this
> search criteria, date range is always required and the rest of the
> parameters are optional.
>
> There is also a need to search records by job_id alone as well as
> sched_job_id alone. In the first case, there will be at most one
> result, but in the second case there could be several.
>
> The search results should be sortable pretty much by any column except
> remarks and job log.
>
> We also have a need to display the job count based on the status code
> for today, yesterday, last week and last month. In other words, the
> last week statistics would show how many jobs were successful, how
> many were failed etc. in the past one week.
>
> I'm wondering if the above index I've is good enough to perform all
> these functions fast enough. Please advise.
>
> Regards,
>
> Sai Pullabhotla
> Phone: (402) 408-5753
> Fax: (402) 408-6861
> www.jMethods.com
>   


Mime
View raw message