Mailing-List: contact couchdb-dev-help@incubator.apache.org; run by ezmlm
Reply-To: couchdb-dev@incubator.apache.org
Message-ID: <154599939.1212344624996.JavaMail.jira@brutus>
Date: Sun, 1 Jun 2008 11:23:44 -0700 (PDT)
From: "Paul Joseph Davis (JIRA)"
To: couchdb-dev@incubator.apache.org
Subject: [jira] Updated: (COUCHDB-74) CouchDB full text searching
In-Reply-To: <601011174.1212183285060.JavaMail.jira@brutus>

     [ https://issues.apache.org/jira/browse/COUCHDB-74?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Joseph Davis updated COUCHDB-74:
-------------------------------------

    Attachment:
(was: couchdb-xapian.tar.gz)

> CouchDB full text searching
> ---------------------------
>
>                 Key: COUCHDB-74
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-74
>             Project: CouchDB
>          Issue Type: New Feature
>          Components: Full-Text Search, HTTP Interface
>         Environment: branches/lucene-search
>            Reporter: Paul Joseph Davis
>         Attachments: json_communication.patch
>
>
> I've managed to piece enough together to get full text search working with xapian and python.
> You'll need to have a relatively recent version of xapian-core and xapian-bindings installed. I've got 1.0.6 installed. The version in Gutsy Gibbon's repositories was too old. Specifically, there's a requirement for the xapian.WritableDatabase to have set_metadata. If your bindings have that, you should be golden.
> Basic install instructions:
> Download a fresh copy of the lucene-search branch:
> davisp@nebula:/usr/local/src/tmp$ svn co http://svn.apache.org/repos/asf/incubator/couchdb/branches/lucene-search couchdb-lucene
> Unpack the tarball attached to this ticket:
> davisp@nebula:/usr/local/src/tmp/couchdb-lucene$ tar -xzvf couchdb-xapian.tar.gz
> Apply the patch to the lucene branch:
> $ cd couchdb-lucene/
> $ svn mv src/couchdb/couch_ft_query.erl src/couchdb/couch_query.erl
> $ patch -p0 -i ../couchdb-xapian/json_external_query_server.diff
> Build the patched version of CouchDB:
> $ ./bootstrap && ./configure && make && sudo make install
> Edit the xapian.conf:
> [couchdb]
> base_url = url of your couchdb instance. Probably http://localhost:5984/
> index_directory = directory where the user running couchdb can write to, e.g. /usr/local/var/lib/couchdb/xapian/
> [index:dbname] #Dbname is the name of the database you want to index
> view=_all_docs_by_seq #The index sections aren't going to stay like this. This is the only view that will work correctly.
> attributes=list of document attributes to index #At the moment, you probably just want to stick with string args.
> This will be improved in the future.
> Copy the xapian.conf next to your couch.ini:
> $ sudo cp xapian.conf /usr/local/etc/couchdb/
> Edit your couch.ini:
> $ sudo vi /usr/local/etc/couchdb/couch.ini
> In the [Couch] section, add the following:
> DbUpdateNotificationProcess=/path/to/couch-xapian/xapian-indexer
> ExternalQueryServer=/path/to/couch-xapian/xapian-query-server
> If your xapian.conf doesn't live at /usr/local/etc/couchdb/xapian.conf, add " -c /path/to/xapian.conf" to the end of both those options in couch.ini
> Lastly, fire up couchdb as usual.
> Each database/view in xapian.conf will be indexed when couchdb starts up and again whenever the db changes.
> To issue a query, try a url like:
> http://localhost:5984/dbname/_search?query=hello+mom
> _search parameters for xapian querying include: query, offset, limit, and view. (Remember we can't index views other than _all_docs_by_seq yet, so just ignore that for now)
> =============
> Hopefully I didn't forget any steps in that. So the current state of things is that the indexing and searching work. I haven't tested things like languages etc. yet. I have plans to make the xapian.conf file control all of the multitude of stuff that xapian supports. For now I was just going for bare bones, so don't fret if your favorite xapian feature isn't there yet.
> Also, the view indexing still needs work. This will probably have to wait until Damien gets a method to support finding only changed documents in a view. I don't know if I have enough erlang fu to make this work on my own. (I'm doubting I do)
> Also also, I'm planning on adding support for the neat xapian features like suggested search terms etc. I'll get to things of this nature as more people start using the package and I've ironed out the more major things.
> Other than that, please send feedback.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
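
For anyone wanting to script queries against the _search URL described in the ticket, here is a minimal Python sketch of how such a URL might be assembled. The helper name build_search_url is made up for illustration and is not part of the attached package; it only uses the query, offset and limit parameters the ticket mentions, and assumes the base URL and database name shown in the example.

```python
from urllib.parse import urlencode, urljoin


def build_search_url(base_url, dbname, query, offset=None, limit=None):
    """Assemble a _search URL (hypothetical helper, not from the ticket)."""
    params = {"query": query}
    # Only include the optional paging parameters when they are given.
    if offset is not None:
        params["offset"] = offset
    if limit is not None:
        params["limit"] = limit
    # urlencode turns spaces into '+', matching the hello+mom example.
    return urljoin(base_url, f"{dbname}/_search") + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_search_url("http://localhost:5984/", "dbname", "hello mom"))
```

The function only builds the string; actually fetching it requires a CouchDB built from the patched branch with the query server configured as described in the ticket.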