db-derby-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From <de...@segel.com>
Subject RE: Need Advice on Indexes
Date Wed, 24 Dec 2008 22:10:45 GMT
Sai, please see my comments in yours...

> -----Original Message-----
> From: Sai Pullabhotla [mailto:sai.pullabhotla@jmethods.com]
> Sent: Wednesday, December 24, 2008 1:54 PM
> To: Derby Discussion; msegel@segel.com
> Subject: Re: Need Advice on Indexes
> Rick and Mike,
> Thanks for taking time to reply to my question.
> As far as normalizing the table that Mike suggested -
> this jobs table is nothing more than a history table. So, a project or
> user exists today but may not exist tomorrow. However, we want to
> maintain the history of every project that was executed (this is
> considered a job and hence the name). Since, a user may not exist in a
> future time, I can not maintain a foreign key to my user table even
> though I've a user table and an ID column in it. Same with projects. I
> hope it makes sense now why this table is a flat table. However, I
> think I can move the remarks column into its own table and use a
> longvarchar or clob field.
That's not a good design. Your user information should be stored separately
and have an id associated with it. You don't want to purge this information;
however you can associate a status field as part of this table. So you
retain and don't reuse the ids and you can still keep historical data

One of the problems you may have is how you populate the field with the
employee's name. What happens if you have two employees with the same name?
(Yes it happens. ;-)
The other problem is what happens when the person entering the name
misspells the name? (Segel != Segal,Siegel,Siegal,Seagull) :-)

In addition, your table isn't normalized when you clearly have a 1 to many
relationship of data. Conceptually there are multiple jobs to a project. So
when you store jobs and projects in the same data, its not normalized.

Note: There are times when you break from using 3rd normal form, however it
requires a lot of practical experience to justify it.

> Now, as far as the queries go, Most of the times, it is -
> select  job_id, proj_id, proj_name, proj_folder, submit_user,
> run_user, queue_priority, run_priority, submit_time, start_time,
> end_time, time_taken, job_log, status from dpa_job where start_time>=?
> and start_time<? and status in(?, ?, ?, ?) order by start_time desc
Not to nitpick, but I'm a bit old school and when writing a query I tend to
want to see it formatted to be easier to read. Taking your query...

SELECT  job_id, proj_id, proj_name, proj_folder, 
        submit_user, run_user, queue_priority, 
        run_priority, submit_time, start_time, end_time, 
        time_taken, job_log, status 
FROM    dpa_job 
WHERE   start_time>=?
AND     start_time<? 
AND     status IN(?, ?, ?, ?) 
ORDER BY start_time desc

It's a bit easier to read. The key words are capitalized and you can see
your queries.

Now looking at your common query example, the only thing you need to index
on is the start time column. You can add status to the query, however its
not going to really impact the performance of the query... unless you really
have a lot of rows. (Like more than 10,000 rows in the result set if you
ignore the status filter.)

Does the IN clause effect the performance? There was an issue earlier in
Derby, but I think it has been fixed for a couple of versions. In your
example query with only 4 values, it is not that big of an issue. 

Based on this query, your index would never be used.

You've indicated that this is a production database with customers using it.
I'm going to strongly suggest that your company hire a competent DBA. You've
said a couple of things that are major red flags.  When I say DBA, I mean
someone who could be considered a logical DBA and has both data modeling and
app development expertise. If not a FTE, get a consultant in to do a health

> Sai Pullabhotla
> Phone: (402) 408-5753
> Fax: (402) 408-6861
> www.jMethods.com

View raw message