lucene-solr-user mailing list archives

From Sandhya Agarwal <sagar...@opentext.com>
Subject RE: Problem with pdf, upgrading Cell
Date Tue, 04 May 2010 07:28:09 GMT
Hello,



But I see that the libraries are being loaded:



INFO: Adding specified lib dirs to ClassLoader

May 4, 2010 12:49:59 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader

INFO: Adding 'file:/C:/apache-solr-1.4.0/contrib/extraction/lib/asm-3.1.jar' to classloader

May 4, 2010 12:49:59 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader

INFO: Adding 'file:/C:/apache-solr-1.4.0/contrib/extraction/lib/bcmail-jdk15-1.45.jar' to classloader

May 4, 2010 12:49:59 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader

INFO: Adding 'file:/C:/apache-solr-1.4.0/contrib/extraction/lib/bcprov-jdk15-1.45.jar' to classloader

May 4, 2010 12:49:59 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader

INFO: Adding 'file:/C:/apache-solr-1.4.0/contrib/extraction/lib/commons-compress-1.0.jar' to classloader

May 4, 2010 12:49:59 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader

INFO: Adding 'file:/C:/apache-solr-1.4.0/contrib/extraction/lib/commons-logging-1.1.1.jar' to classloader

May 4, 2010 12:49:59 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader

INFO: Adding 'file:/C:/apache-solr-1.4.0/contrib/extraction/lib/dom4j-1.6.1.jar' to classloader

May 4, 2010 12:49:59 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader

INFO: Adding 'file:/C:/apache-solr-1.4.0/contrib/extraction/lib/fontbox-1.1.0.jar' to classloader

May 4, 2010 12:49:59 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader

INFO: Adding 'file:/C:/apache-solr-1.4.0/contrib/extraction/lib/geronimo-stax-api_1.0_spec-1.0.1.jar' to classloader

May 4, 2010 12:49:59 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader

INFO: Adding 'file:/C:/apache-solr-1.4.0/contrib/extraction/lib/jempbox-1.1.0.jar' to classloader

May 4, 2010 12:49:59 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader

INFO: Adding 'file:/C:/apache-solr-1.4.0/contrib/extraction/lib/log4j-1.2.14.jar' to classloader

May 4, 2010 12:49:59 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader

INFO: Adding 'file:/C:/apache-solr-1.4.0/contrib/extraction/lib/metadata-extractor-2.4.0-beta-1.jar' to classloader

May 4, 2010 12:49:59 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader

INFO: Adding 'file:/C:/apache-solr-1.4.0/contrib/extraction/lib/pdfbox-1.1.0.jar' to classloader

May 4, 2010 12:49:59 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader

INFO: Adding 'file:/C:/apache-solr-1.4.0/contrib/extraction/lib/poi-3.6.jar' to classloader

May 4, 2010 12:49:59 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader

INFO: Adding 'file:/C:/apache-solr-1.4.0/contrib/extraction/lib/poi-ooxml-3.6.jar' to classloader

May 4, 2010 12:49:59 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader

INFO: Adding 'file:/C:/apache-solr-1.4.0/contrib/extraction/lib/poi-ooxml-schemas-3.6.jar' to classloader

May 4, 2010 12:49:59 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader

INFO: Adding 'file:/C:/apache-solr-1.4.0/contrib/extraction/lib/poi-scratchpad-3.6.jar' to classloader

May 4, 2010 12:49:59 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader

INFO: Adding 'file:/C:/apache-solr-1.4.0/contrib/extraction/lib/tagsoup-1.2.jar' to classloader

May 4, 2010 12:49:59 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader

INFO: Adding 'file:/C:/apache-solr-1.4.0/contrib/extraction/lib/tika-core-0.7.jar' to classloader

May 4, 2010 12:49:59 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader

INFO: Adding 'file:/C:/apache-solr-1.4.0/contrib/extraction/lib/tika-parsers-0.7.jar' to classloader

May 4, 2010 12:49:59 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader

INFO: Adding 'file:/C:/apache-solr-1.4.0/contrib/extraction/lib/xercesImpl-2.8.1.jar' to classloader

May 4, 2010 12:49:59 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader

INFO: Adding 'file:/C:/apache-solr-1.4.0/contrib/extraction/lib/xml-apis-1.0.b2.jar' to classloader

May 4, 2010 12:49:59 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader

INFO: Adding 'file:/C:/apache-solr-1.4.0/contrib/extraction/lib/xmlbeans-2.3.0.jar' to classloader

May 4, 2010 12:50:16 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader

INFO: Adding 'file:/C:/apache-solr-1.4.0/dist/apache-solr-cell-1.4.0.jar' to classloader

May 4, 2010 12:50:20 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader

INFO: Adding 'file:/C:/apache-solr-1.4.0/dist/apache-solr-clustering-1.4.0.jar' to classloader

May 4, 2010 12:51:52 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader

INFO: Adding 'file:/C:/apache-solr-1.4.0/contrib/clustering/lib/carrot2-mini-3.1.0.jar' to classloader

May 4, 2010 12:51:52 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader

INFO: Adding 'file:/C:/apache-solr-1.4.0/contrib/clustering/lib/commons-lang-2.4.jar' to classloader

May 4, 2010 12:51:52 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader

INFO: Adding 'file:/C:/apache-solr-1.4.0/contrib/clustering/lib/ehcache-1.6.2.jar' to classloader

May 4, 2010 12:51:52 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader

INFO: Adding 'file:/C:/apache-solr-1.4.0/contrib/clustering/lib/google-collections-1.0-rc2.jar' to classloader

May 4, 2010 12:51:52 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader

INFO: Adding 'file:/C:/apache-solr-1.4.0/contrib/clustering/lib/jackson-core-asl-0.9.9-6.jar' to classloader

May 4, 2010 12:51:52 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader

INFO: Adding 'file:/C:/apache-solr-1.4.0/contrib/clustering/lib/jackson-mapper-asl-0.9.9-6.jar' to classloader

May 4, 2010 12:51:52 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader

INFO: Adding 'file:/C:/apache-solr-1.4.0/contrib/clustering/lib/log4j-1.2.14.jar' to classloader
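A log this long is easier to check mechanically than by eye. The following is only a minimal sketch in Python, assuming the startup log has been saved as text; the function names and the list of jar prefixes are illustrative and not part of Solr. Note that it can only show which jars were handed to the classloader. It says nothing about whether Tika then finds its parsers through that loader, which is the suspicion raised later in this thread.

```python
import re

# Matches the "Adding '<url>' to classloader" messages shown in the log above.
# \s+ also matches a line break, so entries wrapped by a mail client still match.
ADDING_RE = re.compile(r"Adding '([^']+\.jar)'\s+to\s+classloader")


def jars_added(log_text):
    """Return the jar file names (without directories) in the order they were logged."""
    return [url.rsplit("/", 1)[-1] for url in ADDING_RE.findall(log_text)]


def missing_prefixes(log_text, required_prefixes):
    """Return the required name prefixes that no logged jar starts with."""
    names = jars_added(log_text)
    return [p for p in required_prefixes
            if not any(n.startswith(p) for n in names)]


if __name__ == "__main__":
    # Three entries copied from the log above; the last one is deliberately wrapped.
    sample = (
        "INFO: Adding 'file:/C:/apache-solr-1.4.0/contrib/extraction/lib/pdfbox-1.1.0.jar' to classloader\n"
        "INFO: Adding 'file:/C:/apache-solr-1.4.0/contrib/extraction/lib/tika-core-0.7.jar' to classloader\n"
        "INFO: Adding 'file:/C:/apache-solr-1.4.0/contrib/extraction/lib/tika-parsers-0.7.jar' to\n"
        "classloader\n"
    )
    print(jars_added(sample))
    # The prefix list is an assumption about which jars matter for PDF extraction.
    print(missing_prefixes(sample, ["tika-core", "tika-parsers", "pdfbox", "fontbox"]))
```

Run against the full log above, `missing_prefixes` should come back empty for those four prefixes, since all of them appear in the list.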



Thanks,

Sandhya



-----Original Message-----
From: Grant Ingersoll [mailto:gsiasf@gmail.com] On Behalf Of Grant Ingersoll
Sent: Tuesday, May 04, 2010 6:13 AM
Cc: solr-user@lucene.apache.org
Subject: Re: Problem with pdf, upgrading Cell



Little more info... Seems to be a classloading issue. The tests pass, but they aren't loading the Tika libraries via the Solr ResourceLoader, whereas the example is. Marc, one thing to try is to unjar the Solr WAR file and put the Tika libs in there, as I bet it will then work. Note, however, I haven't tried this.
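For anyone who wants to try that experiment, here is a rough sketch in Python of adding the jars to a copy of the WAR. It is as untested against a real Solr install as the suggestion itself. It assumes the jars belong under `WEB-INF/lib/`, the standard location for a web application's libraries; the function name and the paths in the usage comment are hypothetical.

```python
import shutil
import zipfile
from pathlib import Path


def add_jars_to_war(war_path, jar_dir, out_path):
    """Copy war_path to out_path and add every *.jar in jar_dir under WEB-INF/lib/.

    Jars whose entry name already exists in the WAR are skipped, so the archive
    never ends up with duplicate entries. Returns the entry names that were added.
    """
    # Work on a copy so the original WAR stays untouched.
    shutil.copyfile(war_path, out_path)
    added = []
    with zipfile.ZipFile(out_path, "a") as war:
        existing = set(war.namelist())
        for jar in sorted(Path(jar_dir).glob("*.jar")):
            entry = "WEB-INF/lib/" + jar.name
            if entry in existing:
                continue
            war.write(str(jar), arcname=entry)
            added.append(entry)
    return added


# Hypothetical usage; adjust the paths to your own layout:
# add_jars_to_war("solr.war", "contrib/extraction/lib", "solr-with-tika.war")
```

If the repackaged WAR is deployed, the jars should presumably no longer also be loaded from the lib directories, so that each class is found in one place only.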



On May 3, 2010, at 6:24 PM, Grant Ingersoll wrote:



> I've opened https://issues.apache.org/jira/browse/SOLR-1902 to track this. It is indeed a bug somewhere (still investigating). It seems that Tika is now picking an EmptyParser implementation when trying to determine which parser to use, despite the fact that it properly identifies the MIME type.

>

> -Grant

>

> On May 3, 2010, at 5:36 PM, Grant Ingersoll wrote:

>

>> I'm investigating.

>>

>> On May 3, 2010, at 5:17 AM, Marc Ghorayeb wrote:

>>

>>>

>>> Hi,

>>> Grant, I confirm what Praveen has said: any PDF I try does not work with the new Tika and SVN versions. :(

>>> Marc

>>>

>>>> From: sagarwal@opentext.com

>>>> To: solr-user@lucene.apache.org

>>>> Date: Mon, 3 May 2010 13:05:24 +0530

>>>> Subject: RE: Problem with pdf, upgrading Cell

>>>>

>>>> Hello,

>>>>

>>>> Please let me know if anybody figured out a way out of this issue.

>>>>

>>>> Thanks,

>>>> Sandhya

>>>>

>>>> -----Original Message-----

>>>> From: Praveen Agrawal [mailto:pkalwar@gmail.com]

>>>> Sent: Friday, April 30, 2010 11:14 PM

>>>> To: solr-user@lucene.apache.org

>>>> Subject: Re: Problem with pdf, upgrading Cell

>>>>

>>>> Grant,

>>>> You can try any of the sample PDFs that come in the /docs folder of the Solr 1.4

>>>> distribution. I tried 'Installing Solr in Tomcat.pdf', 'index.pdf', etc. Only

>>>> metadata (i.e. stream_size and content_type, apart from my own literals) is

>>>> indexed; the content is missing.

>>>>

>>>>

>>>> On Fri, Apr 30, 2010 at 8:52 PM, Grant Ingersoll <gsingers@apache.org> wrote:

>>>>

>>>>> Praveen and Marc,

>>>>>

>>>>> Can you share the PDF (feel free to email my private email) that fails in Solr?

>>>>>

>>>>> Thanks,

>>>>> Grant

>>>>>

>>>>>

>>>>> On Apr 30, 2010, at 7:55 AM, Marc Ghorayeb wrote:

>>>>>

>>>>>>

>>>>>> Hi

>>>>>> Nope, I didn't get it to work... Just like you, the command line version of Tika extracts the content correctly, but once included in Solr, no content is extracted.

>>>>>> What I have tried so far:

>>>>>> - Updating the Tika libraries inside the public Solr 1.4 release: no luck there.

>>>>>> - Downloading the latest SVN version, compiling it, and starting from a simple schema: still no luck.

>>>>>> - Getting other versions compiled on Hudson (nightly builds) and testing them too: still no extraction.

>>>>>> I sent a mail to the developers mailing list, but they told me I should just mail here. I hope some developer reads this, because it's quite an important feature of Solr and somehow it got broken between the 1.4 release and the latest version in SVN.

>>>>>> Marc


>>>>>

>>>>> --------------------------

>>>>> Grant Ingersoll

>>>>> http://www.lucidimagination.com/

>>>>>

>>>>> Search the Lucene ecosystem using Solr/Lucene:

>>>>> http://www.lucidimagination.com/search

>>>>>

>>>>>

>>>




