db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Rodrigo Madera" <rodrigo.mad...@gmail.com>
Subject Re: New "segmented" StorageFactory Development
Date Fri, 05 May 2006 18:07:14 GMT
Non taken =o)

I'm making sure everything is okay and then I'll proceed.

Thanks for you time,
Rodrigo

On 5/5/06, David Van Couvering <David.Vancouvering@sun.com> wrote:
> Hi, Rodrigo.  Hopefully you took no offense.  It was a tease at the
> debate going on the derby-user alias.  It's *great* to have you working
> on this, absolutely, I'm quite excited, this is something people
> regularly ask for.
>
> David
>
> Rodrigo Madera wrote:
> > Well, please, I'm not related to the Derby project at all.
> >
> > I work at IBM Brazil, on a client services project for Medco. Nothing
> > to do with Derby.
> >
> > Rodrigo
> >
> > On 5/5/06, David Van Couvering <David.Vancouvering@sun.com> wrote:
> >> Oh, wait, how could you be working for IBM and adding a feature?  I
> >> thought you guys were only doing bugfixes :)
> >>
> >> David
> >>
> >> Rodrigo Madera wrote:
> >> > Oh, just a technical detail... I work for IBM, but on a whole
> >> > different project...
> >> >
> >> > Is this a problem??
> >> >
> >> > Thanks
> >> >
> >> > On 5/5/06, Rodrigo Madera <rodrigo.madera@gmail.com> wrote:
> >> >> On 5/5/06, Mike Matrigali <mikem_app@sbcglobal.net> wrote:
> >> >> > Do you have any more details on your requirements, such as the
> >> >> following:
> >> >> > 1) do you need for a single table and/or index to be spread across
> >> >> >     multiple disk?
> >> >>
> >> >> It would be terrific and the absolute glory of the requirement,
> >> >> however, it depends.
> >> >>
> >> >> Is Derby based on a table/index-is-a-single-file architecture? If so,
> >> >> it's too much trouble to change this. Making the tables/indexes
> >> >> segmented would only be viable (in my opinion) if Derby already
> >> >> supports this.
> >> >>
> >> >> I vote to get the "divider" in place that routes the new tables/etc
to
> >> >> the different directories, and only then, when it's mature, begin a
> >> >> table segmentation engine.
> >> >>
> >> >> > 2) do you want control when you create each table/index where
it
> >> >> >     goes and how?
> >> >>
> >> >> Yes. I'm planing on doing this automagicaly based on the specified
> >> >> directory/capacity pairs.
> >> >>
> >> >> > 3) Are you looking to limit the absolute size of tables/indexes
> >> >> >     in each directory to a fixed size?
> >> >>
> >> >> Absolutely. This is very important for the approach I'm thinking of
in
> >> >> #1.
> >> >>
> >> >> > The existing storage system had some support for spreading data
> >> >> > across disks built into the interfaces, but was never used.  Data
> >> >> > is currently stored in the seg0 directory.  The idea was that
> >> >> > support could be added to store data also in a seg1 directory
> >> >> > located on another device.  If one were interested in this approach
> >> >> > they would first have to fix the code to pass around correctly
> >> >> > the seg argument (It has been observed that some code got lazy
and
> >> >> > just used 0 rather than proprogating the argument).
> >> >>
> >> >> I'm in. I'll co the latest version and check it out. Is it still
> >> there?
> >> >>
> >> >> > The next decision is how the tables are to spread across the disks.
> >> >> > If putting whole tables or indexes fits your plan then I would
use
> >> >> > the existing table metadata catalogs to track where a file is
(these
> >> >> > may have to be upgraded to hold the new info - not sure).
> >> >>
> >> >> IMO: This is the way to go for now.
> >> >>
> >> >> >  If one
> >> >> > wants to spread a single file across multiple segments then you
need
> >> >> > to decide if you want to do it by key or by some mathematical
block
> >> >> > number approach:
> >> >> >
> >> >> > partition by key
> >> >> >     o would pave the road for future interesting parallel query
> >> >> >       execution work.
> >> >> >     o would recommend again top down implementation, having the
> >> >> >       existing database metadata catalogs do the work.
> >> >> >
> >> >> > partition by block number
> >> >> >     o If there is any per table/index control again use the existing
> >> >> >       database metadata catalogs and pass the info down into
> >> >> >       store. partitioning by block number probably would best
be
> >> done
> >> >> >       with some new module as Dan suggested with alternate storage
> >> >> >       factory implementations.
> >> >>
> >> >> Too messy for now... Guess #1 is better for now...
> >> >>
> >> >> > If you want per table/index control I think the segX approach
is the
> >> >> > best, since the obvious input would be from the create table
> >> command.
> >> >>
> >> >> Ok. But I preffer to have the array of {path, capacity} tuples (or
> >> >> table, or meta info, or ...).
> >> >>
> >> >> > If you rather do the bottom up approach, I would first start at
> >> looking
> >> >> > at the in memory patch that was done.  If you don't need much
per
> >> >> > file control it may be possible to only override the
> >> StorageFactory as
> >> >> > Dan described.
> >> >>
> >> >> I'll take a look at it immediately.
> >> >>
> >> >> > Whatever approach you pick a couple of issues come to mind:
> >> >> > o how do you config the new segements into the db (currently just
> >> >> > automatically done a db creation time).
> >> >>
> >> >> Via the configuration tuples.
> >> >>
> >> >> > o how do back up a multiple segment database
> >> >>
> >> >> Transversing the repositories.
> >> >>
> >> >> > o how do handle allocation of disk space to files, current model
> >> >> >    is the db just uses all the disk space available on that disk
and
> >> >> >    fails if table allocation gets and out of disk space.
> >> >>
> >> >> DB uses all ${capacity} on ${path}.
> >> >>
> >> >>
> >> >>
> >> >> This is only my initial vision of the model, so please give your
> >> >> opinions here to make it better.
> >> >>
> >> >> Thanks,
> >> >> Rodrigo Madera
> >> >>
> >>
>

Mime
View raw message