db-derby-dev mailing list archives

From Luigi Lauro <luigi.la...@gmail.com>
Subject javax.jnlp.PersistenceService discoveries and questions about Derby StorageFactory/StorageFile - help me out pls :)
Date Fri, 23 Mar 2007 08:42:24 GMT
Hello back,

I've worked some more on the dreaded DERBY-2469 RFE, and I'm here to
share my discoveries and ask some key questions that will help me
understand whether this is really doable or not.

Discoveries and thoughts

1) It's very hard to test JNLP applications, and so it's very hard to
test my JNLP StorageFactory. This is because the JNLP services such
as PersistenceServiceImpl and BasicServiceImpl only get loaded at JVM
startup time when java is launched from the javaws executable, which
needs a jnlp file. This makes testing JNLP-based classes very hard,
since automated testing or debugging with an IDE is out of the
question.

I've tried manually including my Mac OS X Java6 javaws.jar (which  
contains the system-dependent implementation classes for the JNLP  
interfaces) and manually initializing and loading the services, but  
it all fails horribly because of several missing things (mostly  
missing system properties, which impl classes expect, and are  
probably set up by javaws or some other yet-to-be-found class).

It can _PROBABLY_ be done by tinkering with it, going exception
after exception and manually putting in the tidbits the impl classes
expect, but it's for sure long and tedious work, and it would be
strongly platform-dependent (for example, I've noticed that the impl
classes in javaws vary considerably across the various platforms,
and so do the system properties JNLP expects).
Or, maybe, if we are REALLY lucky, we can find a
JNLPInitializerWhatever class, with a 'startmeup' method for setting
everything up. Dreaming is not a crime you know? :P

Any help here is SERIOUSLY appreciated. Yes, I know I should post on
the java ML; I'm still figuring out WHERE exactly the right place to
post this is. Again, any help here is appreciated.

2) PersistenceService has one big limitation which wasn't apparent
from reading the javadoc, but which I discovered after some heavy
testing with it.

You can only create storage entities directly under your
application's codebase, or under one of its parent codebases. If, for
example, your application codebase is http://db.apache.org/derby, it
is legal to create persistent entities as
http://db.apache.org/derby/FILE or http://db.apache.org/FILE (this is
done to allow sharing between applications from the same
vendor/domain), but IT IS NOT LEGAL to create
http://db.apache.org/derby/MYSUBDIR/FILE, just as of course it is
not legal to create http://www.otherdomain.com/FILE.
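
A minimal sketch of the rule above, written as a plain check on URLs.
It only encodes the behaviour observed in testing, not anything quoted
from the JNLP specification; the class and method names are
hypothetical, and it assumes absolute URLs that carry a host.

```java
import java.net.URI;

// Hypothetical helper: it encodes the rule as observed in the tests described
// above, not a rule quoted from the JNLP specification.
final class CodebaseRule {

    /**
     * Returns true if the directory part of entityUrl is the codebase itself
     * or one of its parents. Assumes absolute URLs that carry a host.
     */
    static boolean isLegal(String codebase, String entityUrl) {
        URI base = URI.create(codebase);
        URI entity = URI.create(entityUrl);
        if (!base.getScheme().equalsIgnoreCase(entity.getScheme())
                || !base.getHost().equalsIgnoreCase(entity.getHost())
                || base.getPort() != entity.getPort()) {
            return false; // a different site is never legal
        }
        String basePath = base.getPath();
        String baseDir = basePath.endsWith("/") ? basePath : basePath + "/";
        String entityPath = entity.getPath();
        // Directory part of the entity, including the trailing slash.
        String entityDir = entityPath.substring(0, entityPath.lastIndexOf('/') + 1);
        // Legal only when the entity sits in the codebase or in one of its parents.
        return baseDir.startsWith(entityDir);
    }

    public static void main(String[] args) {
        String codebase = "http://db.apache.org/derby";
        System.out.println(isLegal(codebase, "http://db.apache.org/derby/FILE"));
        System.out.println(isLegal(codebase, "http://db.apache.org/derby/MYSUBDIR/FILE"));
    }
}
```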

This effectively means I won't be able to use the JNLP URL as the
file path as I initially planned (because I can't create a hierarchy,
I can only save entities at the root level), so I will have to use a
flat namespace for saving entities and transparently translate it
into a full-fledged hierarchy.

I plan to do this by using a 'name' hierarchy such as
http://db.apache.org/derby/directory,subdir1,subdir2,filename. I've
checked the valid separator chars for URLs and the best choice seems
to be ',' since it's more readable than, for example, '|', and it's
still VERY VERY rare to find in common file names (I seriously hope
derby doesn't create my,file,name files).
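
The name scheme above could be sketched roughly like this. It is only
an illustration under assumptions: the class and method names are
hypothetical, and it simply rejects path segments that already contain
',' or '/' rather than trying to escape them.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for the flat naming scheme; not part of Derby or JNLP.
final class FlatNames {
    static final String SEPARATOR = ",";

    /** Joins directory/file segments into one flat name under the codebase. */
    static String toFlatName(String codebase, List<String> segments) {
        for (String segment : segments) {
            if (segment.isEmpty() || segment.contains(SEPARATOR) || segment.contains("/")) {
                throw new IllegalArgumentException("unsupported path segment: " + segment);
            }
        }
        return withSlash(codebase) + String.join(SEPARATOR, segments);
    }

    /** Splits a flat name back into its segments. */
    static List<String> toSegments(String codebase, String flatName) {
        String base = withSlash(codebase);
        if (!flatName.startsWith(base)) {
            throw new IllegalArgumentException("not under codebase: " + flatName);
        }
        return Arrays.asList(flatName.substring(base.length()).split(SEPARATOR));
    }

    private static String withSlash(String codebase) {
        return codebase.endsWith("/") ? codebase : codebase + "/";
    }

    public static void main(String[] args) {
        String flat = toFlatName("http://db.apache.org/derby",
                Arrays.asList("directory", "subdir1", "subdir2", "filename"));
        System.out.println(flat);
        System.out.println(toSegments("http://db.apache.org/derby", flat));
    }
}
```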

At the start I won't add any supporting data structure, but will do
everything with this flat structure, parsing the entire list each
time to find, for example, the files available in a given directory
(scan the entire list, find the names that begin with
"codebase/dir,"). Later, if performance becomes an issue, I could
easily throw in a Tree for fast seek/list operations, generated at
startup time and kept up-to-date with each operation, but I prefer to
start with a simpler approach and deal with performance problems
later, if they come up.

Moreover, I don't really think derby creates THAT many files that  
even a full list scan for a simple list operation would make the  
StorageFactory so damn slow. But you know: first make it work, then  
make it fast.
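
A rough sketch of that full-scan listing, assuming the flat names are
already available as a String array (for example from whatever the
persistence service returns for the codebase). The class and method
names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists the direct children of a directory by scanning
// every flat name, with no supporting index.
final class FlatListing {

    static List<String> list(String[] allNames, String dirFlatName) {
        String prefix = dirFlatName + ",";
        List<String> children = new ArrayList<>();
        for (String name : allNames) {
            if (!name.startsWith(prefix)) {
                continue; // not under this directory at all
            }
            String rest = name.substring(prefix.length());
            // Another separator means the entry lives in a deeper subdirectory.
            if (!rest.isEmpty() && rest.indexOf(',') < 0) {
                children.add(rest);
            }
        }
        return children;
    }

    public static void main(String[] args) {
        String[] names = {"db,seg0,c10.dat", "db,seg0", "db,service.properties", "other,x"};
        System.out.println(list(names, "db"));
    }
}
```

Every call is a linear scan over all names, which is the "make it
work first" trade-off described above.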

3) PersistenceServiceImpl, which is Sun's standard implementation of
the PersistenceService interface, only allows 255 storage entities
for a given codebase. After some decompiling (thank god JAD exists,
since downloading the JVM sources from Sun's site is presently a
nightmare) I found out it keeps an internal array, size 255. Very
very dumb, I know. If you try to register a 256th entity, it fails
horribly with an ArrayIndexOutOfBoundsException.

This can't be overcome easily, since swapping that implementation
for a more capable one would be a nightmare of finding out how to
replicate the 'sandbox friendly' persistence of the impl class. And
anyway, we don't really want to replace Sun's implementation: the
main focus here is to make derby work with the tools Sun gives us,
instead of working around them.

Another approach to get past this limit would be to save SEVERAL
'files' into a single storage entity, to cut down the number of
entities (especially since there are no size limitations; the
implementation just warns/asks the user when more than the default
maximum size is requested), but this would complicate the
StorageFactory/StorageFile implementations by several orders of
magnitude, since I would have to keep 'pointers' to the start and end
of each file inside each persistent entity, etc... This isn't easy,
and it isn't something I want to do unless there is no other viable
way.

Also, maybe opening a bug regarding this limitation would make Sun
replace the default implementation with a proper one (hey, use an
ArrayList man!). I think this could be easily done and would have no
downsides, especially since we already have size limits, so there's
no use for a 'number of files' limit.

4) Luckily, there are no size limitations, both for single entity and  
the whole storage. There is just a default maximum size threshold,  
and the user gets a request popup when the storage gets past it.

OK, now on to the questions/doubts

A) StorageFactory.shutdown() - should a proper implementation delete
the 'temporary' files if they are made persistent? I'm currently
planning to implement temp files as standard storage entities, just
'tagged' with PersistenceService.TEMPORARY, but since this tag does
nothing automatically, I think I will have to manually delete
temporary entities at startup/shutdown. Another approach would be to
keep the temp files in memory instead of in the persistent storage:
this would save size/number of files and would make the manual
cleanup unnecessary.
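
The manual cleanup could look roughly like the sketch below. It is an
assumption-heavy illustration: EntityStore is a hypothetical stand-in
for the few persistence operations needed (it is not the javax.jnlp
API), and MapStore exists only to make the example runnable.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TempCleanup {

    /** Hypothetical stand-in for the few persistence operations needed here. */
    interface EntityStore {
        List<String> names();
        boolean isTemporary(String name);
        void delete(String name);
    }

    /** Deletes every entity tagged as temporary and returns how many were removed. */
    static int deleteTemporaries(EntityStore store) {
        int deleted = 0;
        // Iterate over a copy so that deleting does not disturb the iteration.
        for (String name : new ArrayList<>(store.names())) {
            if (store.isTemporary(name)) {
                store.delete(name);
                deleted++;
            }
        }
        return deleted;
    }

    /** Tiny in-memory store, only here to make the sketch runnable. */
    static final class MapStore implements EntityStore {
        private final Map<String, Boolean> temporaryByName = new LinkedHashMap<>();

        void put(String name, boolean temporary) {
            temporaryByName.put(name, temporary);
        }

        @Override public List<String> names() {
            return new ArrayList<>(temporaryByName.keySet());
        }

        @Override public boolean isTemporary(String name) {
            return Boolean.TRUE.equals(temporaryByName.get(name));
        }

        @Override public void delete(String name) {
            temporaryByName.remove(name);
        }
    }

    public static void main(String[] args) {
        MapStore store = new MapStore();
        store.put("db,seg0,c10.dat", false);
        store.put("tmp,sort1.tmp", true);
        System.out.println(deleteTemporaries(store) + " temporary entities deleted");
    }
}
```

A real implementation would call something like this from both init()
and shutdown(), so that entities left over from a crash are also
removed.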

B) StorageFactory.init() - Let me see if I got this right.

Home is the derby home directory (where all databases are stored),
but it can be ignored.
DatabaseName is the subdir in home for the given database.
TempDirName is the home for temp files (can temporary files also be
created outside of it, like in the database directory? Is this
possible?)
UniqueName is the database-specific subdirectory inside tempDirName.

Home shouldn't be created but provided (though in my case I have to
create it since the user can't).
DatabaseName can be null when you are using the StorageFactory just
to access the database directories (will this happen with my factory
as well?), but if it's not null I have to create home/databaseName.
TempDirName can be null, in which case a default should be used and
the directory created, but ONLY if uniqueName is not null. If
uniqueName is null, then no temp dir is available and, from what I
can guess, it means Derby won't use ANY TEMPORARY FILE AT ALL. Right?

Also, please use better names next time. IMHO home, databaseHome,  
temp, databaseTemp would have been a wiser choice, more easily  
understood by non-derbiers. Anyway.

C) StorageFactory.newStorageFile(String path) - should this method
also create a temporary file, if the given path is under the temp
dir? This isn't clear IMHO: if they hand me a tempDir path, do I have
to just wrap it in a StorageFile, or do I also have to create a
temporary file with a unique name and return it? I really don't have
a clear picture of all the newStorageFile/createTemporaryFile
methods, and even after reading the javadocs and looking at
BaseStorageFactory, I still feel puzzled.

Can someone help me understand this? I'd love that :)

D) This is hard to explain, but I will try my best. Since the
PersistenceService API only gives me a 'name' as metadata for a given
storage entity, and I prefer not to save metadata inside the file
contents itself (this would make things harder to implement), I was
asking myself what file metadata I have to keep.

Surely, first of all, I would need to tell whether a storage entity
is temporary or not. If temp files get created ONLY under the
tempDir, I could use the name/path itself to tell this. But I'm still
wondering whether derby also creates temporary files outside tempDir
or not. I think and hope not.

Also, I surely have to tell whether a given storage entity is a
directory or a file. I could tell by the name/path as well (if it
ends with SEPARATOR it's a dir, if it doesn't it's a file), but this
would work ONLY if derby doesn't build directory/file paths on its
own by manipulating path strings instead of using the
StorageFactory/StorageFile methods. If derby does, then it could
produce a directory URL without the trailing separator, which would
mess things up.

Another approach would be to use a zero maximum length as the
directory tag. A directory would have a zero length (which is also
good, since I'll create directory storage entities just as
placeholders, to tell whether a directory was created or not), while
a file would start with a default maximum length of 1024/whatever
(remember that this can easily grow when needed, when I write to the
entity).

Then there is also a readonly bit on the files. Do I have to
persist this as well? Would it be a problem if I 'lose' this
information and don't persist it? Also: are there any other file
metadata bits which I'm forgetting and that I should save in the
persistent storage?

Thanks again for your help. If I get some answers that shed some
light on my doubts, I will fix a couple of things in the next few
hours and post an 'alpha' JNLPStorageFactory/StorageFile patch in the
jira issue.

Also, while working on it I've found out there are SEVERAL
similarities between JNLP storage and a hypothetical memory storage
using a simple Map<Path, File>, so I'm implementing things around an
abstract base class which delegates to the extending implementation
classes only a couple of CRUD methods (create/delete/rename/etc...)
and builds all the StorageFactory/StorageFile logic on top of these.
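
A minimal sketch of that layering, with a memory-backed subclass. All
names here are hypothetical, and which operations end up as primitives
is a guess: in this sketch rename and list are derived from
create/read/delete, and paths use the ',' separator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical base class: subclasses supply a few primitives, everything
// else is built on top of them.
abstract class FlatStorage {
    protected abstract void create(String path, byte[] content);
    protected abstract byte[] read(String path);
    protected abstract void delete(String path);
    protected abstract List<String> allPaths();

    boolean exists(String path) {
        return allPaths().contains(path);
    }

    /** Rename expressed as copy plus delete; refuses to overwrite. */
    boolean rename(String from, String to) {
        if (!exists(from) || exists(to)) {
            return false;
        }
        create(to, read(from));
        delete(from);
        return true;
    }

    /** Direct children of dir, found by scanning every path. */
    List<String> list(String dir) {
        String prefix = dir + ",";
        List<String> children = new ArrayList<>();
        for (String path : allPaths()) {
            if (path.startsWith(prefix) && path.indexOf(',', prefix.length()) < 0) {
                children.add(path.substring(prefix.length()));
            }
        }
        return children;
    }
}

// Memory-backed implementation: a sorted map from path to file content.
final class MemoryStorage extends FlatStorage {
    private final Map<String, byte[]> files = new TreeMap<>();

    @Override protected void create(String path, byte[] content) {
        files.put(path, content.clone());
    }

    @Override protected byte[] read(String path) {
        return files.get(path);
    }

    @Override protected void delete(String path) {
        files.remove(path);
    }

    @Override protected List<String> allPaths() {
        return new ArrayList<>(files.keySet());
    }
}
```

A JNLP-backed subclass would implement the same four primitives on top
of the persistence service and inherit the rest unchanged.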

This means I can probably get a working MemoryStorageFactory done
along with the JNLP one, since doing the former would be only about
5% more work than doing my own JNLP storage, given how I'm currently
planning to do things.

Also, I think this could potentially lead to a massive
StorageFactory/StorageFile redesign for easier storage
implementations, or to some higher-level abstract class wrappers
around the present StorageFactory/StorageFile interfaces, such as
the one I'm doing, that depend only on a FEW storage methods, instead
of the 30+ methods one presently needs to implement to get a Storage
working. Of course, I can't really tell whether this is something
needed and good for real or not, given my very basic knowledge of
derby internals.

Thanks again for any help/criticism/hint/whatever you may provide,
and forgive me for my messy English, and my very direct and yet very
verbose way of writing :P

Luigi