lucene-solr-user mailing list archives

From "Allison, Timothy B." <talli...@mitre.org>
Subject RE: Bypassing ExtractingRequestHandler
Date Mon, 13 Jun 2016 14:21:34 GMT



> Two things: Here's a sample bit of SolrJ code; pulling out the DB stuff should be straightforward:
> http://searchhub.org/2012/02/14/indexing-with-solrj/

+1

> We tend to prefer running Tika externally as it's entirely possible 
> that Tika will crash or hang with certain files - and that will bring 
> down Solr if you're running Tika within it.

+1
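
To make the isolation point concrete, here is a minimal sketch of one way to keep Tika outside the Solr JVM: run the tika-app jar as a child process and give up on it after a timeout. The class name, the helper names, and the jar path are hypothetical; `--text` is tika-app's plain-text output option. Treat this as an illustration under those assumptions, not a recommended production setup.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: extracts text by running tika-app in a child JVM,
// so a crash or hang in Tika cannot take the calling process down with it.
class ExternalTikaSketch {

    // Builds the child-JVM command line. The jar location is an assumption;
    // point it at wherever tika-app lives on your machine.
    static List<String> buildCommand(String tikaAppJar, String inputFile) {
        return List.of("java", "-jar", tikaAppJar, "--text", inputFile);
    }

    // Runs the command and returns its stdout, or Optional.empty() if the
    // process exceeds the timeout or exits with a non-zero status.
    static Optional<String> runWithTimeout(List<String> command, long timeoutSeconds)
            throws IOException, InterruptedException {
        // Write stdout to a temp file so a large document cannot fill the pipe buffer.
        Path out = Files.createTempFile("tika-out", ".txt");
        try {
            Process p = new ProcessBuilder(command)
                    .redirectError(ProcessBuilder.Redirect.DISCARD)
                    .redirectOutput(out.toFile())
                    .start();
            if (!p.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
                // Hung parse: kill the child and report failure to the caller.
                p.destroyForcibly().waitFor();
                return Optional.empty();
            }
            if (p.exitValue() != 0) {
                return Optional.empty();
            }
            return Optional.of(Files.readString(out, StandardCharsets.UTF_8));
        } finally {
            Files.deleteIfExists(out);
        }
    }
}
```

A caller would pass the result of `buildCommand(...)` to `runWithTimeout(...)`, index the returned text through SolrJ, and log or skip any file for which the result is empty.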

>> I want to make a small modification 
>> to Tika to get and save additional data from my PDFs

What info do you need? If it is common enough, could you ask over on Tika's JIRA, and we'll
try to add it directly?


