lucene-solr-user mailing list archives

From rajan chandi <chandi.ra...@gmail.com>
Subject Re: Using SolrJ with Tika
Date Wed, 02 Sep 2009 14:13:22 GMT
Laurent,

Check out Solr 1.4.

You can download the trunk and build it on your machine.

Solr 1.4 does this out of the box; no configuration is required.

You can send the document with an HTTP POST using a command-line utility such
as curl, and PDF, Word, RTF, PPT, XLS and similar files will be indexed. We
tested this last week.
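
If you would rather stay in Java than shell out to curl, the same POST can be
sketched with the JDK's built-in HTTP client (JDK 11 or later). This is only a
rough sketch, not the SolrJ route: the base URL http://localhost:8983/solr, the
/update/extract handler path, and the literal.id and commit parameters are
assumptions about your setup, and the class and method names are made up for
illustration. Check them against your solrconfig.xml before relying on it.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;

// Hypothetical helper class; none of these names come from Solr itself.
class ExtractPost {

    // Builds the request URL. "/update/extract", "literal.id" and "commit"
    // are assumed to match how the extracting handler is set up on the server.
    static URI buildExtractUri(String solrBase, String docId, boolean commit) {
        String query = "literal.id=" + URLEncoder.encode(docId, StandardCharsets.UTF_8)
                + "&commit=" + commit;
        return URI.create(solrBase + "/update/extract?" + query);
    }

    // Streams the raw file as the POST body and returns the HTTP status code.
    static int postFile(String solrBase, String docId, Path file, String contentType)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(buildExtractUri(solrBase, docId, true))
                .header("Content-Type", contentType)
                .POST(HttpRequest.BodyPublishers.ofFile(file))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: ExtractPost <file> <doc-id>");
            return;
        }
        // Assumed local Solr instance and a PDF input; adjust both as needed.
        int status = postFile("http://localhost:8983/solr", args[1],
                Path.of(args[0]), "application/pdf");
        System.out.println("HTTP status: " + status);
    }
}
```

Run it with a file path and a document id as arguments; without arguments it
only prints the usage line and makes no network call.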

Tika is already included in Solr 1.4.

Cheers
Rajan

On Wed, Sep 2, 2009 at 5:26 PM, Angel Ice <lbil_fr@yahoo.fr> wrote:

> Hi everybody.
>
> I hope this is the right place for questions; if not, sorry.
>
> I'm trying to index rich documents (PDF, MS docs, etc.) in Solr/Lucene.
> I have seen a few examples explaining how to use Tika to solve this, but
> most of these examples use curl to send documents to Solr, or an HTML
> POST with an input file.
> I'd like to do it in pure Java.
> Is there a way to use SolrJ to index the documents with the
> ExtractingRequestHandler of Solr, or at least to get the extracted XML back
> (with the extract.only option)?
>
> Many thanks.
>
> Laurent.
