lucene-solr-user mailing list archives

From Praveen Agrawal <pkal...@gmail.com>
Subject Re: Problem with pdf, upgrading Cell
Date Fri, 30 Apr 2010 17:44:13 GMT
Grant,
You can try any of the sample PDFs that ship in the /docs folder of the Solr 1.4
distribution. I tried 'Installing Solr in Tomcat.pdf', 'index.pdf', etc. Only the
metadata (i.e. stream_size and content_type), apart from my own literals, is
indexed; the content is missing.
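
For anyone trying to reproduce this, here is a minimal sketch of how the request URL for the extracting handler could be built. It assumes the handler is mapped at /update/extract on a local Solr at http://localhost:8983/solr, and that the schema has a "text" field to receive the body; the helper name build_extract_url and the id "doc1" are made up for illustration. With extractOnly=true the handler should return what Tika extracted without indexing anything, which may help tell an extraction problem apart from a field-mapping problem.

```python
from urllib.parse import urlencode


def build_extract_url(base_url, doc_id, extract_only=True):
    """Build a request URL for Solr's extracting handler (Solr Cell).

    base_url and doc_id are placeholders; adjust them to your setup.
    """
    # literal.<field> sets a field value directly on the document.
    params = {"literal.id": doc_id}
    if extract_only:
        # Ask Solr to return the extracted content instead of indexing it.
        params["extractOnly"] = "true"
    else:
        # Assumption: the schema has a "text" field for the extracted body.
        params["fmap.content"] = "text"
        params["commit"] = "true"
    return base_url.rstrip("/") + "/update/extract?" + urlencode(params)


url = build_extract_url("http://localhost:8983/solr", "doc1")
print(url)
```

The PDF itself would then be sent to that URL as the request body or as a multipart file upload. If the extractOnly response already has an empty body, the mapping parameters are probably not the cause.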


On Fri, Apr 30, 2010 at 8:52 PM, Grant Ingersoll <gsingers@apache.org> wrote:

> Praveen and Marc,
>
> Can you share the PDF (feel free to email my private email) that fails in
> Solr?
>
> Thanks,
> Grant
>
>
> On Apr 30, 2010, at 7:55 AM, Marc Ghorayeb wrote:
>
> >
> > Hi
> > No, I didn't get it to work. Just as in your case, the command-line
> > version of Tika extracts the content correctly, but once it is included
> > in Solr, no content is extracted.
> > What I have tried so far:
> > - Updating the Tika libraries inside the Solr 1.4 public release: no luck.
> > - Downloading the latest SVN version, compiling it, and starting from a
> >   simple schema: still no luck.
> > - Getting other versions compiled on Hudson (nightly builds) and testing
> >   them as well: still no extraction.
> > I sent a mail to the developers' mailing list, but they told me I should
> > just mail here. I hope some developer reads this, because it's quite an
> > important feature of Solr and it somehow got broken between the 1.4
> > release and the latest version in SVN.
> > Marc
>
> --------------------------
> Grant Ingersoll
> http://www.lucidimagination.com/
>
> Search the Lucene ecosystem using Solr/Lucene:
> http://www.lucidimagination.com/search
>
>
