From: "Rodrigo Madera" <rodrigo.madera@gmail.com>
To: derby-dev@db.apache.org, mikem_app@sbcglobal.net
Date: Fri, 5 May 2006 14:35:14 -0300
Subject: Re: New "segmented" StorageFactory Development

Oh, just a technical detail... I work for IBM, but on a whole
different project... Is this a problem?

Thanks

On 5/5/06, Rodrigo Madera wrote:
> On 5/5/06, Mike Matrigali wrote:
> > Do you have any more details on your requirements, such as the following:
> > 1) do you need a single table and/or index to be spread across
> > multiple disks?
>
> It would be terrific, the absolute glory of the requirement;
> however, it depends.
>
> Is Derby based on a table/index-is-a-single-file architecture? If so,
> it's too much trouble to change this. Making the tables/indexes
> segmented would only be viable (in my opinion) if Derby already
> supports this.
>
> I vote to get the "divider" in place that routes the new tables etc.
> to the different directories, and only then, when it's mature, begin
> a table segmentation engine.
>
> > 2) do you want control when you create each table/index where it
> > goes and how?
>
> Yes. I'm planning on doing this automagically based on the specified
> directory/capacity pairs.
>
> > 3) Are you looking to limit the absolute size of tables/indexes
> > in each directory to a fixed size?
>
> Absolutely. This is very important for the approach I'm thinking of in #1.
>
> > The existing storage system had some support for spreading data
> > across disks built into the interfaces, but it was never used. Data
> > is currently stored in the seg0 directory. The idea was that
> > support could be added to store data also in a seg1 directory
> > located on another device. If one were interested in this approach
> > they would first have to fix the code to pass the seg argument
> > around correctly (it has been observed that some code got lazy and
> > just used 0 rather than propagating the argument).
>
> I'm in. I'll co the latest version and check it out. Is it still there?
>
> > The next decision is how the tables are to be spread across the disks.
> > If putting whole tables or indexes fits your plan then I would use
> > the existing table metadata catalogs to track where a file is (these
> > may have to be upgraded to hold the new info - not sure).
>
> IMO: This is the way to go for now.
>
> > If one wants to spread a single file across multiple segments then
> > you need to decide if you want to do it by key or by some
> > mathematical block number approach:
> >
> > partition by key
> >   o would pave the road for future interesting parallel query
> >     execution work.
> >   o would recommend a top-down implementation again, having the
> >     existing database metadata catalogs do the work.
> >
> > partition by block number
> >   o If there is any per table/index control, again use the existing
> >     database metadata catalogs and pass the info down into the
> >     store. Partitioning by block number would probably best be done
> >     with some new module, as Dan suggested, with alternate storage
> >     factory implementations.
>
> Too messy for now... Guess #1 is better for now...
>
> > If you want per table/index control I think the segX approach is the
> > best, since the obvious input would be from the create table command.
>
> OK. But I prefer to have the array of {path, capacity} tuples (or a
> table, or meta info, or ...).
>
> > If you would rather do the bottom-up approach, I would first start by
> > looking at the in-memory patch that was done. If you don't need much
> > per-file control it may be possible to only override the
> > StorageFactory as Dan described.
>
> I'll take a look at it immediately.
>
> > Whatever approach you pick, a couple of issues come to mind:
> > o how do you configure the new segments into the db (currently this
> >   is just done automatically at db creation time)?
>
> Via the configuration tuples.
>
> > o how do you back up a multiple-segment database?
>
> Traversing the repositories.
>
> > o how do you handle allocation of disk space to files? The current
> >   model is that the db just uses all the disk space available on
> >   that disk and fails if a table allocation runs out of disk space.
>
> The DB uses all ${capacity} on ${path}.
>
> This is only my initial vision of the model, so please give your
> opinions here to make it better.
>
> Thanks,
> Rodrigo Madera
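
To make the {path, capacity} model above concrete, here is a minimal
sketch in Java of the configuration tuples and the "divider" that routes
new files to segment directories. All names here (SegmentConfig,
SegmentRouter) are hypothetical illustrations, not existing Derby classes:

    import java.io.File;
    import java.util.List;

    // One {path, capacity} tuple: a segment directory and the maximum
    // number of bytes the database may consume inside it.
    class SegmentConfig {
        final File path;
        final long capacityBytes;

        SegmentConfig(File path, long capacityBytes) {
            this.path = path;
            this.capacityBytes = capacityBytes;
        }

        // Bytes already used under this segment directory.
        long usedBytes() {
            return sizeOf(path);
        }

        private static long sizeOf(File f) {
            if (f.isFile()) return f.length();
            long total = 0;
            File[] children = f.listFiles();
            if (children != null)
                for (File c : children) total += sizeOf(c);
            return total;
        }
    }

    // The "divider": routes each new table/index file to the first
    // segment that still has room, failing only when every configured
    // segment has reached its capacity.
    class SegmentRouter {
        private final List<SegmentConfig> segments;

        SegmentRouter(List<SegmentConfig> segments) {
            this.segments = segments;
        }

        File directoryFor(long estimatedFileSize) {
            for (SegmentConfig s : segments)
                if (s.usedBytes() + estimatedFileSize <= s.capacityBytes)
                    return s.path;
            throw new IllegalStateException("all segments are full");
        }
    }

This also answers the last question in the thread: the database uses at
most capacityBytes on each path and fails cleanly once every segment is
full, instead of consuming whole disks.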
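
For the partition-by-block-number alternative Mike describes, the routing
math could be as simple as a fixed number of blocks per segment. Again a
hypothetical sketch, not Derby code:

    // Block k of a table lives in segment k / blocksPerSegment, at
    // offset (k % blocksPerSegment) * blockSize within that segment.
    class BlockPartitioner {
        private final int blocksPerSegment;

        BlockPartitioner(int blocksPerSegment) {
            this.blocksPerSegment = blocksPerSegment;
        }

        int segmentFor(long blockNumber) {
            return (int) (blockNumber / blocksPerSegment);
        }

        long offsetWithinSegment(long blockNumber, int blockSize) {
            return (blockNumber % blocksPerSegment) * (long) blockSize;
        }
    }

Partitioning by key instead would tie the split to the index structure
and the metadata catalogs, which is why it pairs naturally with the
top-down approach.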
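
If only the StorageFactory needs to be overridden, as Dan suggested, a
delegating factory could consult the router whenever a file is created.
The interface below is a deliberately simplified stand-in; Derby's real
org.apache.derby.io.StorageFactory has a much larger surface:

    import java.io.File;

    // Simplified stand-in for the real factory interface.
    interface SimpleStorageFactory {
        File newDataFile(String name, long estimatedSize);
    }

    // Places each new file in whichever segment the router selects,
    // instead of a single fixed seg0 directory.
    class SegmentedStorageFactory implements SimpleStorageFactory {
        private final SegmentRouter router;

        SegmentedStorageFactory(SegmentRouter router) {
            this.router = router;
        }

        public File newDataFile(String name, long estimatedSize) {
            return new File(router.directoryFor(estimatedSize), name);
        }
    }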
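
And "traversing the repositories" for backup could amount to copying each
configured segment directory in turn. This is a sketch only; a real
backup would first have to freeze the database so the copied files are
transactionally consistent:

    import java.io.*;
    import java.util.List;

    class SegmentBackup {
        // Copy every segment directory into backupRoot/segN,
        // preserving the relative layout within each segment.
        static void backup(List<SegmentConfig> segments, File backupRoot)
                throws IOException {
            for (int i = 0; i < segments.size(); i++)
                copyTree(segments.get(i).path, new File(backupRoot, "seg" + i));
        }

        private static void copyTree(File src, File dst) throws IOException {
            if (src.isDirectory()) {
                dst.mkdirs();
                String[] names = src.list();
                if (names == null) return;
                for (String n : names)
                    copyTree(new File(src, n), new File(dst, n));
            } else {
                copyFile(src, dst);
            }
        }

        private static void copyFile(File src, File dst) throws IOException {
            InputStream in = new FileInputStream(src);
            OutputStream out = new FileOutputStream(dst);
            try {
                byte[] buf = new byte[8192];
                int n;
                while ((n = in.read(buf)) > 0) out.write(buf, 0, n);
            } finally {
                in.close();
                out.close();
            }
        }
    }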