couchdb-dev mailing list archives

From "Chris Anderson (JIRA)" <>
Subject [jira] Commented: (COUCHDB-53) Incorporating JSearch to CouchDB
Date Thu, 22 Jul 2010 17:46:50 GMT


Chris Anderson commented on COUCHDB-53:

I'll say I think this is important.

It would allow "dynamic queries". That is, it makes it possible to run queries you didn't
think of, without running an entire map reduce job.

The absence of this is a big reason people shy away from CouchDB.

Just sayin'

> Incorporating JSearch to CouchDB
> --------------------------------
>                 Key: COUCHDB-53
>                 URL:
>             Project: CouchDB
>          Issue Type: New Feature
>          Components: Full-Text Search
>         Environment: JSearch is developed in Java
>            Reporter: Jun Rao
>            Assignee: Paul Joseph Davis
>            Priority: Minor
>         Attachments: jsearch_full.tgz
> JSearch is a prototype that we developed for indexing and searching JSON documents, and
> we are enthusiastic about contributing it to CouchDB. JSearch converts a given JSON document
> to a Lucene document for indexing. The conversion is lossless and preserves all structural
> information in the original JSON document. We achieve that by storing the encoding of JSON
> structures in the payload of the posting list in a Lucene index. JSearch has a simple query
> language that combines full-text search and structural querying. To qualify as a match, a document
> has to match both the JSON structures and the Boolean constraints specified in the
> query. Suppose that we have indexed the following two JSON documents:
>    d1={ A: [ { B: "b1",  C: "c1" },
>              { B: "b2",  C: "c2" },
>            ]
>       }
>    d2={ A: [ { B: "b1",  C: "c2" },
>              { B: "b2",  C: "c1" },
>            ]
>       }
> One can issue the following two JSearch queries:
>    P={ A: [ { B: "b1" && C: "c1" } ] }
>    Q={ A: [ { B: "b1"} && {C: "c1" } ] }
> Query P ("&&" specifies conjunction) matches d1, but not d2. The reason is that
> d2 doesn't have the proper B and C fields within the same JSON object. On the other hand,
> query Q matches both d1 and d2, since it doesn't require the B field and the C field to be
> in the same JSON object.
> Here is a summary of the querying features in JSearch
> 1. arbitrary conjunctive and disjunctive constraints
> 2. text search on atomic values of string type
> 3. range constraints on atomic values (only those of string and long types are currently supported)
> 4. document level matching
> The easiest way to learn more about JSearch is to give it a try. Download the attached
> tgz file. Follow the readme file in it and try some of the examples. The attachment also includes
> all Java source code (I can provide more technical details if needed). I am very interested
> in your feedback. Does JSearch fit into CouchDB? What other features are needed? How should
> JSearch be integrated (from Jan's mail, it seems that some infrastructure is already in place)?
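
The difference between queries P and Q in the quoted description can be sketched in a few lines of Python. This is only an illustration of the matching semantics as described there, not JSearch's implementation or query syntax: the helper names `match_same_object` and `match_any_object` are hypothetical, the documents are held as plain Python dicts rather than in a Lucene index, and only exact equality on string values is modeled.

```python
# Hypothetical helpers illustrating the P/Q distinction; this is not JSearch's API.

# The two example documents from the issue description, written as Python dicts.
d1 = {"A": [{"B": "b1", "C": "c1"}, {"B": "b2", "C": "c2"}]}
d2 = {"A": [{"B": "b1", "C": "c2"}, {"B": "b2", "C": "c1"}]}


def match_same_object(doc, array_key, constraints):
    """Query P style: a single array element must satisfy every constraint."""
    return any(
        all(elem.get(key) == value for key, value in constraints.items())
        for elem in doc.get(array_key, [])
    )


def match_any_object(doc, array_key, constraints):
    """Query Q style: each constraint may be satisfied by a different element."""
    elems = doc.get(array_key, [])
    return all(
        any(elem.get(key) == value for elem in elems)
        for key, value in constraints.items()
    )


constraints = {"B": "b1", "C": "c1"}
results = {
    "P": [match_same_object(d, "A", constraints) for d in (d1, d2)],
    "Q": [match_any_object(d, "A", constraints) for d in (d1, d2)],
}
print(results)  # {'P': [True, False], 'Q': [True, True]}
```

The P-style check fails on d2 because no single element of A carries both B: "b1" and C: "c1", while the Q-style check succeeds because the two constraints are met by different elements.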

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
