Try:
http://server/database/_fti/_design/lucene/all?q=attachment:%22It+is+generally+recognized+that+ablation+of+VT+associated+with+structural+heart+disease%22
The attachments are indexed into a field called 'attachment',
according to your index function, so you need to select that field
when querying.
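
As a rough sketch of how that field-qualified query can be put together: the helper name buildFieldQuery below is made up for illustration and is not part of couchdb-lucene, and the base URL is the same placeholder as above. It only assumes that the field name matches the one passed to ret.attachment() in your index function and that the phrase itself contains no double quotes.

```javascript
// Hypothetical helper, not part of couchdb-lucene: builds a query URL
// that restricts a Lucene phrase search to a single named field.
function buildFieldQuery(baseUrl, field, phrase) {
  // field:"phrase" is Lucene query syntax for a phrase search within one field.
  const query = field + ':"' + phrase + '"';
  // encodeURIComponent percent-encodes the quotes, colon and spaces.
  return baseUrl + '?q=' + encodeURIComponent(query);
}

const url = buildFieldQuery(
  'http://server/database/_fti/_design/lucene/all',
  'attachment', // must match the name given to ret.attachment() in the index function
  'It is generally recognized that ablation of VT associated with structural heart disease'
);
console.log(url);
```

Here spaces are encoded as %20 rather than + and the colon as %3A, which should decode to the same query string on the server side.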
B.
On Wed, Jun 23, 2010 at 3:04 AM, Christopher Utley
<chris.utley@citationpoint.com> wrote:
> Thank you for the response. I'll provide some additional information
> that might help to illuminate where I've gone wrong.
>
> Just to be clear... After 'mvn' successfully completes and I add the
> 3 lines to my CouchDB local.ini - then I...
>
> 1) cd target
> 2) tar xvf couchdb-lucene-0.6-SNAPSHOT-dist.tar.gz
> 3) cd couchdb-lucene-0.6-SNAPSHOT
> 4) cd bin
> 5) sudo ./run &
>
> Then I test ... My fulltext queries run great except for indexing
> attached PDFs. Is there a way to tell if Tika is being run at all? I
> don't see any errors, so I'm not sure where to begin to look.
>
> My query string looks like this:
>
> http://server/database/_fti/_design/lucene/all?q=%22It+is+generally+recognized+that+ablation+of+VT+associated+with+structural+heart+disease%22
>
>
> My attachment shows up like this:
>
> _attachments
> example.pdf
> 0.6 MB, application/pdf
>
>
> My design doc looks like this:
>
> {
>   "_id": "_design/lucene",
>   "_rev": "8-adbd1b56b459d9ec391ceb4cacc5f61f",
>   "fulltext": {
>     "all": {
>       "defaults": {
>         "store": "no"
>       },
>       "index": "function(doc) {var ret = new Document();function idx(obj) {for (var key in obj) {switch (typeof obj[key]) {case 'object':idx(obj[key]);break;case 'function':break;default:ret.add(obj[key]);break;}}};idx(doc);if (doc._attachments) {for (var i in doc._attachments) {ret.attachment(\"attachment\", i);}}return ret;}"
>     }
>   }
> }
>
>
>
>
> On Tue, Jun 22, 2010 at 6:43 PM, Robert Newson <robert.newson@gmail.com> wrote:
>>
>> Tika is fully integrated into couchdb-lucene. You've likely omitted
>> one or more steps in the README, but you should have built a zip file
>> with 'mvn', unzipped it, and run couchdb-lucene from there. The
>> startup scripts to put all of Tika on the classpath are included.
>>
>> B.
>>
>> On Tue, Jun 22, 2010 at 11:26 PM, Christopher Utley
>> <chris.utley@citationpoint.com> wrote:
>> > Greetings. I was wondering if someone on the list might have experience
>> > with CouchDB-Lucene, and more specifically Tika.
>> >
>> > My environment is as follows:
>> >
>> > Ubuntu 9.10
>> > CouchDB 0.11.0
>> > couchdb-lucene
>> > Tika 0.7
>> >
>> > I have CouchDB-Lucene working fine. Now I want to index (search) PDF
>> > attachment contents. Apparently the tool to do this (Tika) is not part of
>> > the CouchDB Lucene package, so I had to build that separately. Now I have
>> > this jar file in the target directory where I built Tika. I have no idea
>> > how to tell CouchDB Lucene where Tika is installed, or how to get it to use
>> > Tika now that it's installed.
>> >
>> > Would setting the CLASSPATH in /etc/environment be part of the puzzle?
>> >
>> > Any ideas, suggestions, guesses, wild !#$ guesses, etc - would be greatly
>> > appreciated.
>> >
>> > Regards,
>> > Chris
>> >
>