lucene-solr-user mailing list archives

From Kamuela Lau <>
Subject Re: Indexing PDF file in Apache SOLR via Apache TIKA
Date Tue, 30 Oct 2018 07:21:13 GMT

Hi there,

Here are a couple of ways I'm aware of:

1. Extract handler / post tool
You can send a single document to the extract handler (the
ExtractingRequestHandler, also known as Solr Cell) with curl, or upload it
with bin/post.

2. DataImportHandler
This can be used to index multiple documents in one run, with Tika doing
the extraction.
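
For option 1, here is a minimal Python sketch of the same request that curl
would send to the extract handler. The base URL, the core name "mycore", the
document id "doc1", and the file name are assumptions for illustration, and
the sketch assumes the handler is registered at /update/extract in your
solrconfig.xml. The helper names are hypothetical, not part of Solr.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_extract_url(solr_base, core, doc_id, commit=True):
    """Build the extract-handler URL for one document.

    Assumes the handler is mapped to /update/extract (check solrconfig.xml).
    """
    params = {"literal.id": doc_id}  # literal.id sets the document's id field
    if commit:
        params["commit"] = "true"  # make the document searchable right away
    return f"{solr_base.rstrip('/')}/{core}/update/extract?{urlencode(params)}"


def post_pdf(url, pdf_path):
    """Send the raw PDF bytes to Solr and return the HTTP status code."""
    with open(pdf_path, "rb") as fh:
        body = fh.read()
    req = Request(
        url,
        data=body,
        headers={"Content-Type": "application/pdf"},
        method="POST",
    )
    with urlopen(req) as resp:
        return resp.status


# Example usage (needs a running Solr and a real file, so it is left commented):
# url = build_extract_url("http://localhost:8983/solr", "mycore", "doc1")
# print(post_pdf(url, "sample.pdf"))
```

bin/post should do roughly the same thing from the command line, for example
bin/post -c mycore sample.pdf (again, the core and file names are placeholders).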

You should also be able to do it via the admin page, as long as the extract
handler is defined and configured in solrconfig.xml.

Hope this helps!

On Tue, Oct 30, 2018 at 3:40 PM adiyaksa kevin <> wrote:

> Hello there, let me introduce myself. My name is Mohammad Kevin Putra (you
> can call me Kevin), from Indonesia. I am a beginner backend developer. I
> use Linux Mint, Apache Solr 7.5.0, and Apache Tika 1.91.0.
> I have a small problem with indexing a PDF file via Apache Tika. I
> understand how Solr and Tika each work, but I don't know how the two are
> integrated.
> As far as I know, Tika can extract the PDF file I upload and parse it
> into data/metadata automatically, and I just have to copy and paste that
> into the "Documents" tab of the Solr core.
> My questions are:
> 1. Can I upload a PDF file to Solr via Tika in GUI mode, or is it only
> possible in CLI mode? If it is CLI only, could you please explain how?
> 2. Is it possible to show a text snippet in the results in the "Query" tab?
> The background to my question: I want to index PDFs on my local
> system by simply uploading them to Solr, e.g. by drag and drop (is that
> possible?). Then, when I type something in the search box, the result
> should look like this:
> (Title of doc)
> blablablabla (match highlighted in yellow) blablabla.
> The blablabla text would be a couple of sentences. That's all I need.
> Sorry for my bad English.
> Thanks for reading and replying; it will be very helpful to me.
> Thanks a lot
