gora-dev mailing list archives

From Andrzej Bialecki ...@getopt.org>
Subject Re: HSQLDB woes...
Date Mon, 08 Nov 2010 21:48:09 GMT
On 2010-11-08 19:26, Enis Söztutar wrote:
> From my experience the SQL backend is a major headache. Writing a SQL backend
> is actually much much harder than
> the HBase or Cassandra backend since we need very custom code for each SQL
> server. Plus, there is some code for
> dealing with HSQL embedded mode.
> I completely agree to switch to another zero-conf backend for tests and for
> nutch. However, I am not sure about BerkeleyDB.
> If we can implement a data store easily that would be great.

I started working on three implementations, but none of them works yet in
a satisfactory way... ;)

* Solr-based (GORA-9). Tests all pass, but when I tried to use this
store with Nutch it would lose all data at some point... probably I
misunderstood the way schemas are created / truncated / deleted, or this
could be related to GORA-12. More investigation needed. Another
complication is that Solr needs physical config files that specify
schema in order to create new cores (schema containers, so to speak - I
modeled one schema per Solr core), and there is no REST API in Solr to
create these files on the fly.

* Based on Ehcache. Again, all tests passed, and the store seemed to
work ok in tests, but... it turned out that Ehcache is _very_
sensitive to a proper shutdown procedure. If the cache is not properly
shut down, it leads to complete data loss (!). Ouch. Pity I didn't
discover this before I started the implementation... On the off chance
that I missed something obvious, I'm perfectly happy to share this work
in progress.

* Based on JDBM2 (jdbm2.googlecode.com). This is partially done: simple
put/get works ok, and I think this would be our best bet for a simple
self-contained store... but I'm probably missing something about the
key-space traversal, because I can't make the queries work. Again, I'll
create an issue and attach the code.
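
To make the key-space traversal problem concrete, here is a minimal sketch of
a range query over sorted keys. It assumes the store exposes its records as a
sorted map; java.util.TreeMap is only a stand-in here, not the JDBM2 API. The
names RangeScan and scan are hypothetical, and treating both bounds as
inclusive (with null meaning "unbounded") is an assumption about the query
semantics.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical helper; TreeMap stands in for a sorted, disk-backed map. */
final class RangeScan {

    /**
     * Returns the values whose keys fall in [startKey, endKey], in key order.
     * A null bound means "unbounded" on that side (an assumption).
     */
    static <V> List<V> scan(NavigableMap<String, V> store,
                            String startKey, String endKey) {
        // An inverted range matches nothing; checking it up front avoids the
        // IllegalArgumentException that nested sub-map views would throw.
        if (startKey != null && endKey != null
                && startKey.compareTo(endKey) > 0) {
            return Collections.emptyList();
        }
        NavigableMap<String, V> view = store;
        if (startKey != null) {
            view = view.tailMap(startKey, true);  // keys >= startKey
        }
        if (endKey != null) {
            view = view.headMap(endKey, true);    // keys <= endKey
        }
        // Copy the values so the result does not depend on the live view.
        return new ArrayList<>(view.values());
    }

    public static void main(String[] args) {
        NavigableMap<String, String> store = new TreeMap<>();
        store.put("com.example/a", "page A");
        store.put("com.example/b", "page B");
        store.put("org.example/a", "page C");
        // Scan every key from "com.example/" up to and including "com.example/b".
        System.out.println(scan(store, "com.example/", "com.example/b"));
    }
}
```

The sub-map views do the traversal, so the only store-specific part would be
obtaining a sorted view of the keys in the first place.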

So I guess we are probably close to having some of them working the way
they should (if possible at all). Unfortunately, for the next couple of
weeks I won't be able to spend much time on this, so I welcome any help,
or code review (once I upload it).

Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
