couchdb-user mailing list archives

From Nils Breunese <N.Breun...@vpro.nl>
Subject RE: clucene and couchdb
Date Sat, 05 Jun 2010 21:09:26 GMT
We had some serious performance problems with couchdb-lucene on a busy site recently. It turned
out the problem wasn't couchdb-lucene itself (queries were fast!), but the fact that communication
between CouchDB and external processes uses stdin/stdout, which AFAIK doesn't allow for concurrency.
This turned out to be a major bottleneck in our setup. We're currently setting up caching
for couchdb-lucene URLs, hoping this will help. We even tried redirecting traffic for couchdb-lucene
URLs directly to couchdb-lucene, thus avoiding the stdin/stdout serialization, but apparently
the current stable release of couchdb-lucene doesn't handle concurrency well yet (I believe
rnewsom already fixed some bugs in that area). At least it has the potential to do so.
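
To illustrate the serialization described above, here is a minimal sketch of an external process that talks to its parent over stdin/stdout, one JSON line per request and one JSON line per response. The handler name, the "query" and "q" fields, and the shape of the response are assumptions made for illustration, not a definitive description of the CouchDB protocol. The point is only that a single pipe is read in order, so a slow request holds up every request queued behind it.

```python
import io
import json


def handle_request(request):
    """Build a response for one request object (hypothetical handler)."""
    query = request.get("query", {})
    # Assumed response shape: a status code plus a JSON body.
    return {"code": 200, "json": {"q": query.get("q", ""), "rows": []}}


def serve(stdin, stdout):
    """Answer requests strictly one at a time: one JSON line in, one out."""
    for line in stdin:
        line = line.strip()
        if not line:
            continue
        response = handle_request(json.loads(line))
        stdout.write(json.dumps(response) + "\n")
        # The caller waits for this line before it can send the next request.
        stdout.flush()


# Demo with in-memory streams standing in for the real pipes.
requests = io.StringIO(
    '{"query": {"q": "couch"}}\n'
    '{"query": {"q": "lucene"}}\n'
)
responses = io.StringIO()
serve(requests, responses)
print(responses.getvalue(), end="")
```

In a real deployment the streams would be sys.stdin and sys.stdout. Talking to the search process directly over HTTP removes this single-pipe ordering, but only helps if that process itself handles concurrent requests.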

Let us know when you have some numbers for couchdb-clucene versus couchdb-lucene. (We're mainly
a Java shop though, so we're not afraid of running JVMs.)

Nils.
________________________________________
From: Norman Barker [norman.barker@gmail.com]
Sent: Friday, 4 June 2010 23:31
To: user@couchdb.apache.org
Subject: clucene and couchdb

Hi,

I am writing a CLucene indexer for CouchDB. I have
update_notifications and _fti as a db handler working. I am using
stdout/stdin for the communication and it is looking good.

Looking at http://wiki.apache.org/couchdb/Full_text_search I see that
the index property in the design document is a JavaScript function and
I am wondering why? For views I can understand why you would want to
do an evaluation but for Lucene could we just use a JSON Path
reference?

Thoughts appreciated, since I am in C++ and SpiderMonkey is available
I could do an eval of the JavaScript, but it might be easier just to
parse the JSON path.
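
As a rough illustration of the path-based alternative raised above, the sketch below resolves a dotted path against a document and returns the values that would be handed to the indexer. The dotted syntax and the function name resolve_path are hypothetical and much simpler than the full JSONPath notation. It is shown in Python rather than C++ only to keep it short.

```python
def resolve_path(doc, path):
    """Return the values found at a dotted path, descending into lists."""
    current = [doc]
    for key in path.split("."):
        next_level = []
        for node in current:
            # A list contributes each of its elements as a candidate.
            candidates = node if isinstance(node, list) else [node]
            for item in candidates:
                if isinstance(item, dict) and key in item:
                    next_level.append(item[key])
        current = next_level

    # Flatten a final list value so each element is indexed separately.
    values = []
    for node in current:
        if isinstance(node, list):
            values.extend(node)
        else:
            values.append(node)
    return values


doc = {
    "title": "CouchDB",
    "author": {"name": "Norman"},
    "tags": ["search", "lucene"],
    "comments": [{"text": "first"}, {"text": "second"}],
}

print(resolve_path(doc, "author.name"))
print(resolve_path(doc, "tags"))
print(resolve_path(doc, "comments.text"))
print(resolve_path(doc, "missing.field"))
```

A path like this can only select existing values. A function can also skip documents, combine fields, or compute derived values, which is one possible reason for the index property being a function.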

We will be putting this CLucene implementation in the public domain
once I have cleared the necessary internal paperwork.

CLucene is dual-licensed (Apache and LGPL) and I am using Cajun (BSD)
for the JSON parsing, so should I host this separately or open a
JIRA ticket to have it included in CouchDB?

thanks,

Norman

