lucene-solr-user mailing list archives

From "Jack Krupansky" <j...@basetechnology.com>
Subject Re: Solr search – Tika extracted text from PDF not return highlighting snippet
Date Tue, 07 Aug 2012 20:58:05 GMT
The out-of-the-box example for SolrCell/Tika redirects the Tika "content" to 
the "text" field, which is not stored/highlighted, so the Tika content is 
indexed but not retrievable/highlightable.

What field are you highlighting for your database text?

You should direct your Tika "content" to a stored field, and then copy it to 
"text" for indexing and to whatever field you are highlighting.

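A rough sketch of the query side of this advice, in Python. It assumes the Tika content has been redirected to a stored field (for example through the extract handler's fmap.content mapping) and that the field is named "content" with a uniqueKey of "id"; both names are placeholders, not taken from the poster's schema. The helper names are hypothetical too. The sketch requests highlighting on that field explicitly through hl.fl, then pairs each document with its snippets, falling back to the start of the stored text when Solr returns none:

```python
from urllib.parse import urlencode

# Assumptions (not from the original thread): the stored field is named
# "content" and the schema's uniqueKey is "id". Adjust to the real schema.
STORED_FIELD = "content"
UNIQUE_KEY = "id"


def build_select_url(base_url, query, hl_field=STORED_FIELD, start=0, rows=120):
    """Build a select URL that asks Solr to highlight one stored field."""
    params = [
        ("q", query),
        ("start", start),
        ("rows", rows),
        ("indent", "on"),
        ("wt", "json"),
        ("hl", "true"),
        ("hl.fl", hl_field),  # highlight this field instead of the default
    ]
    return base_url + "?" + urlencode(params)


def snippets_by_doc(response, hl_field=STORED_FIELD, fallback_chars=200):
    """Map each document id to its snippets.

    When the "highlighting" section has no snippet for a document, fall
    back to the first characters of the stored field so callers never get
    an empty row for a document that does have stored text.
    """
    highlighting = response.get("highlighting", {})
    result = {}
    for doc in response.get("response", {}).get("docs", []):
        doc_id = doc[UNIQUE_KEY]
        snippets = highlighting.get(doc_id, {}).get(hl_field, [])
        if not snippets:
            stored = doc.get(hl_field, "")
            if isinstance(stored, list):  # multi-valued field
                stored = " ".join(stored)
            snippets = [stored[:fallback_chars]] if stored else []
        result[doc_id] = snippets
    return result
```

The fallback only hides the empty rows on the client; the snippets themselves still depend on the highlighted field being stored.
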
-- Jack Krupansky

-----Original Message----- 
From: anarchos78
Sent: Tuesday, August 07, 2012 4:28 PM
To: solr-user@lucene.apache.org
Subject: Solr search – Tika extracted text from PDF not return highlighting 
snippet

Greetings friends,
I have successfully indexed PDFs (using Tika) and plain text (fetched from a
database) in one single collection. Now I am trying to implement
highlighting. When querying Solr, I use the following URL:
http://localhost:8090/solr/ktimatologio/select/?q=BlahBlah&start=0&rows=120&indent=on&hl=true&wt=json
Everything is OK: the response has the original (not highlighted) content
under “docs” and the highlighted snippets under “highlighting”. However, I
have noticed that the documents extracted by Tika have no “highlighting”
snippet. That kind of response causes me a lot of trouble (zero-length
rows). Is there any workaround for this? I have already tried copyField (at
index time), but the response comes back blank *({“highlighting”:{}})*. I
really need help on this.

With honor,

Tom

Greece




