db-derby-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Michael Segel" <mse...@mycingular.blackberry.net>
Subject Re: maximum file size
Date Fri, 10 Nov 2006 17:04:31 GMT
Let me clarify.
My suggestion was that
You are essentially creating a poor man's mpp and the local temp table is on a "control" node.
 The temp table holds the result set returned from each query. This assumes that while the
table may exceed the 4 gig limit, a query should not. True if this assumption is wrong, then
clearly derby is not the right tool in the first place.

Using. Multiple copies of the same table doesn' really solve the problem.

What you end up having is a set of tabls foo_n where n is the table number.  Then to do your
query, you would have to create a view foo_v that you query from to hide the detail from the
app and user.  This would be an implementation of a poor man's table partitioning.  

It could be done , however you now have to write a pre-insert trigger to determine which table
to store the data.  Then what happens when you overload the table? ( Suppose you partition
on a time period and the amount of data exceeds 4gb?)
( Talk about a set of complex stored procedures)
Yes it could be done using derby today.
So too could you do a poor man's mpp version as well. ( Although it would be a bit more work
on the controlling app's side.)

The point of my post is that while these solutions are possible, it would be better to use
a database more suited to the task, or actually enhance derby to improve their storage options.
However this would mean that you would have to rethink derby, and the design may change radically
enough you would end up with a new product aka "stormcloud".

Note: while this isn't rocket science it would require a concerted effort that goes beyond
what you could do under Apache. 
Hint: in theory, Sun could start with Derby and create their own database aka "stormcloud"
and then run with it.
(Look at Apache's license.)

Actually anyone could do this, but it would take some deep pockets to support a multiple man
year effort.

the original poster is trying to use a yugo to pull a loaded semi up a steep hill.

Sent via BlackBerry.

-Mike Segel
312 952 8175

-----Original Message-----
From: dmclean62@comcast.net
Date: Fri, 10 Nov 2006 16:16:55 
To:"Derby Discussion" <derby-user@db.apache.org>, msegel@segel.com
Subject: Re: maximum file size

If the OS won't allow a large enough DB file then a local temp file won't work either.

I am  curious about one thing though: Why would it not be a good idea to use multiple tables?
Other than the fact that it would be a practical solution that could actually be implemented
with the current version of Derby that is.

It doesn't seem like it would be very helpful to dismiss a possible solution out of hand without
at least some sort of vague and reasonable explanation.

 -------------- Original message ----------------------
From: "Michael Segel" <msegel@mycingular.blackberry.net>
> Not a good idea to use multiple files.
> Why not go all the way and make derby in to an mpp db? All you would have to do 
> is to preprocess the inbound query then send it off to all of the nodes, taking 
> the result set(s) in to a local temp file and then post process and return the 
> results...  :-)
> -----Original Message-----
> From: dmclean62@comcast.net
> Date: Fri, 10 Nov 2006 14:12:35 
> To:"Derby Discussion" <derby-user@db.apache.org>
> Subject: Re: maximum file size
> You could use multiple tables to get around the file size limit.
> Decide how many rows would go in each table and then use some mechanism for 
> assigning a unique ID to each row. You would then be able to determine which 
> table a specific row is in with an integer division.
> table # = <global row #> / <rows per table>
> local row # = <global row #> % <rows per table>
> The Telemetry Data Warehouse for the Hubble Space Telescope divides the data up 
> by time - each data table contains one calendar year's worth of telemetry data.
> Just a couple of ideas.
>  -------------- Original message ----------------------
> From: Suresh Thalamati <suresh.thalamati@gmail.com>
> > redcloud wrote:
> > > Hi! I need to build a SQL table containing 1000000000 (!!!) rows. But i 
> > > filled up a table with 20000000 rows (file size of table 4GB) and my 
> > > filesystem denied to go on filling up the table. My question is: can 
> > > derby db build "infinitive" size table by chunking in multiple files?
> > 
> > No. Currently a table maps to a single file in Derby. Table size is 
> > limited by the size of the file that can be created on a file system.

View raw message