lucene-solr-user mailing list archives

From Alexei Martchenko <ale...@martchenko.com.br>
Subject Re: Apache Solr.
Date Mon, 03 Feb 2014 13:04:59 GMT
That's right, Solr doesn't import PDFs the way it imports XML files. You'll
need to use Tika to import binary or other specific file types.

http://tika.apache.org/1.4/formats.html
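
As a rough illustration of that route: assuming the Solr instance has the ExtractingRequestHandler (Solr Cell, which wraps Tika) mapped at /update/extract in solrconfig.xml, a PDF could be posted along these lines. The base URL, the document id, and the helper names build_extract_url and index_pdf are made up for this sketch and are not part of Solr.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_extract_url(solr_base, doc_id, commit=True):
    """Build the URL for Solr's extracting handler (assumed at /update/extract)."""
    # literal.id sets the document's unique key; commit=true makes it searchable at once.
    params = {"literal.id": doc_id}
    if commit:
        params["commit"] = "true"
    return solr_base.rstrip("/") + "/update/extract?" + urlencode(params)


def index_pdf(solr_base, doc_id, pdf_path):
    """POST the raw PDF bytes; Solr hands them to Tika for text extraction."""
    with open(pdf_path, "rb") as fh:
        body = fh.read()
    req = Request(
        build_extract_url(solr_base, doc_id),
        data=body,
        headers={"Content-Type": "application/pdf"},
    )
    with urlopen(req) as resp:
        return resp.status  # HTTP status code returned by Solr


if __name__ == "__main__":
    # Hypothetical local instance and file name, shown only as a usage example.
    print(build_extract_url("http://localhost:8983/solr", "doc1"))
```

Which fields the extracted text ends up in depends on the handler's field mapping in solrconfig.xml and on the schema, so that part has to be checked against your own configuration.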


alexei martchenko


2014-02-03 Siegfried Goeschl <sgoeschl@gmx.at>:

> Hi Vignesh,
>
> a few keywords for further investigation:
>
> * Solr Data Import Handler
> * Apache Tika
> * Apache PDFBox
>
> Cheers,
>
> Siegfried Goeschl
>
>
> On 03.02.14 09:15, vignesh wrote:
>
>> Hi Team,
>>
>> I am Vignesh. I am using Apache Solr 3.6 and am able to index XML files.
>> I am now trying to index PDF files but have not been able to. Can you
>> give me the steps to carry out PDF indexing? It would be very useful.
>> Kindly guide me through this process.
>>
>> Thanks & Regards.
>>
>> Vignesh.V
>>
>> Ninestars Information Technologies Limited.,
>>
>> 72, Greams Road, Thousand Lights, Chennai - 600 006. India.
>>
>> Landline : +91 44 2829 4226 / 36 / 56   X: 144
>>
>> www.ninestars.in
>
