lucene-solr-user mailing list archives

From "Jack Krupansky" <j...@basetechnology.com>
Subject Re: PDF indexing
Date Mon, 07 May 2012 19:35:42 GMT
Try SolrCell (ExtractingRequestHandler).

See:
http://wiki.apache.org/solr/ExtractingRequestHandler
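
As a rough sketch of what that looks like in practice: the handler accepts the raw file in the body of an HTTP POST and runs Tika inside Solr, so it should not be necessary to build Tika separately. The Python below assumes the stock example configuration, with Solr at http://localhost:8983/solr and the handler registered at /update/extract. The helper names build_extract_url and index_pdf, the document id, and the file name are made up for illustration.

```python
import urllib.parse
import urllib.request


def build_extract_url(base_url, doc_id, commit=True):
    """Build the request URL for the ExtractingRequestHandler.

    literal.id sets the unique key of the new document; commit=true asks
    Solr to commit right away so the document becomes searchable.
    """
    params = {"literal.id": doc_id}
    if commit:
        params["commit"] = "true"
    return base_url.rstrip("/") + "/update/extract?" + urllib.parse.urlencode(params)


def index_pdf(base_url, doc_id, pdf_path):
    """POST the raw PDF bytes to Solr and return the HTTP status code."""
    with open(pdf_path, "rb") as f:
        body = f.read()
    request = urllib.request.Request(
        build_extract_url(base_url, doc_id),
        data=body,  # supplying data makes this a POST
        headers={"Content-Type": "application/pdf"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# Example call (assumed URL and file name; needs a running Solr):
# index_pdf("http://localhost:8983/solr", "doc1", "manual.pdf")
```

If the handler is mapped to a different path in solrconfig.xml, only the URL built in build_extract_url needs to change.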

-- Jack Krupansky

-----Original Message----- 
From: Tolga 
Sent: Monday, May 07, 2012 3:24 PM 
To: solr-user@lucene.apache.org 
Subject: PDF indexing 

Hi,

From what I have read, I think I have to use Tika (?) to index PDF,
xls, doc, etc. files. How do I start? Do I run mvn clean install in the
source directory to get all the jar files? CentOS doesn't
provide mvn, so how do I build Tika after getting it from
http://maven.apache.org ?

Sorry for the noob questions, I'm just beginning.
