lucene-solr-user mailing list archives

From "Allison, Timothy B." <talli...@mitre.org>
Subject RE: Can I instruct the Tika Entity Processor to skip the first page using the DIH?
Date Wed, 08 Jul 2015 19:39:09 GMT
Unfortunately, no. Tika itself can't skip pages at the moment, so the DIH's Tika Entity
Processor can't either. I imagine this is for PDF files? If you'd like to see this added as
a feature, please open a ticket over on the Tika project.
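
As a possible workaround outside the DIH (a sketch under assumptions, not a Tika or Solr feature): when Tika emits XHTML for a PDF, each page is wrapped in a <div class="page"> element, so the first page can be dropped in a post-processing step before the text is indexed. The function name text_without_first_page and the sample markup below are hypothetical and only illustrate the idea.

```python
import xml.etree.ElementTree as ET

# Namespace used by XHTML documents such as Tika's XHTML output.
XHTML_NS = "http://www.w3.org/1999/xhtml"


def text_without_first_page(xhtml: str) -> str:
    """Return the text of every <div class="page"> except the first.

    Assumes the input is well-formed XHTML in which each page is wrapped
    in <div class="page">. If no such divs exist, all text is returned.
    """
    root = ET.fromstring(xhtml)
    pages = [
        el for el in root.iter()
        if el.tag in ("div", f"{{{XHTML_NS}}}div") and el.get("class") == "page"
    ]
    if not pages:
        # No page markers (for example, a non-PDF source): keep everything.
        return "".join(root.itertext()).strip()
    # Skip pages[0] and join the text of the remaining pages.
    kept = ("".join(page.itertext()).strip() for page in pages[1:])
    return "\n".join(kept)


# Hypothetical sample standing in for Tika's XHTML output.
sample = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<div class="page"><p>Common cover sheet</p></div>'
    '<div class="page"><p>Real content, page 2</p></div>'
    '<div class="page"><p>Real content, page 3</p></div>'
    '</body></html>'
)

if __name__ == "__main__":
    print(text_without_first_page(sample))
```

The resulting text would then have to be sent to Solr by your own indexing code, since this step runs outside the DIH.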

-----Original Message-----
From: Paden [mailto:rumsey.pr@gmail.com] 
Sent: Wednesday, July 08, 2015 12:14 PM
To: solr-user@lucene.apache.org
Subject: Can I instruct the Tika Entity Processor to skip the first page using the DIH?

Hello, I'm using the DIH to import some files from one of my local
directories. However, every single one of these files has the same first
page. So I want to skip that first page in order to optimize search. 

Can this be accomplished by an instruction within the DataImportHandler or,
if not, how else could I do this?



