manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: Indexing Solr with the web crawler
Date Fri, 21 Jan 2011 16:38:18 GMT
I will not be talking about ManifoldCF at this year's conference, most
likely, because the conference conflicts with my daughter's college
graduation.  Sorry about that!

I hadn't heard that they removed the extracting update request handler
from Solr.  That's unfortunate.  Please let me know how hard you find
it to install the jar, and I'll update the instructions accordingly.
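
One quick way to confirm the jar landed in the right place is to scan the lib folder for the class named in the stack trace. This is a minimal sketch, not part of the ManifoldCF or Solr instructions: the default `lib` path and the helper name `jars_containing` are assumptions for illustration. It only checks that some jar contains the class; it does not prove that Solr's class loader picks the jar up, or that the handler's own dependencies are present.

```python
import sys
import zipfile
from pathlib import Path

# Class name taken from the ClassNotFoundException quoted in this thread.
CLASS_NAME = "org.apache.solr.handler.extraction.ExtractingRequestHandler"


def jars_containing(lib_dir, class_name=CLASS_NAME):
    """Return the jar files directly under lib_dir that contain class_name."""
    # A class a.b.C is stored in a jar as the zip entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(lib_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(jar)
        except zipfile.BadZipFile:
            # Skip files that have a .jar suffix but are not valid archives.
            continue
    return hits


if __name__ == "__main__":
    # "lib" is a hypothetical default; pass the real Solr lib folder instead.
    lib_dir = sys.argv[1] if len(sys.argv) > 1 else "lib"
    found = jars_containing(lib_dir)
    if found:
        for jar in found:
            print("found in", jar)
    else:
        print("no jar in", lib_dir, "contains", CLASS_NAME)
```

If the script reports no match, the handler's jar is not in that folder; if it reports a match and Solr still throws the exception, the folder is probably not one that Solr loads from.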

Karl

On Fri, Jan 21, 2011 at 10:32 AM, Erlend Garåsen
<e.f.garasen@usit.uio.no> wrote:
>
> I knew that I had heard your name before, Karl. You held an LCF presentation
> in Prague. Unfortunately, I attended the other presentation at track 2, so I
> missed it.
>
> I hope there will be similar presentations at this year's conference.
>
> Anyway, I figured out that it is the commit request that causes the problems.
> I entered the following URL, which I saw in Resin's access_log:
> http://hoppalong.uio.no:8081/solr/update/extract?commit=true
>
> I'm not going to bother you with the complete stack trace, but here's the
> relevant line:
> Caused by: java.lang.ClassNotFoundException:
> org.apache.solr.handler.extraction.ExtractingRequestHandler
>
> Jack sent me a link about the ExtractingRequestHandler, and after I read
> this document I found the reason:
> "The ExtractingRequestHandler is not incorporated into the solr war file,
> you have to install it separately."
>
> So I will try to place the missing jar file into my lib folder next week.
>
> Erlend
>
>
> On 20.01.11 16.23, Erlend Garåsen wrote:
>>
>> On 20.01.11 16.15, Jack Krupansky wrote:
>>>
>>> Here's one email thread that details at least one cause of the lazy
>>> loading error:
>>>
>>>
>>> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200910.mbox/%3C4AD5EC8C.6000308@gmail.com%3E
>>>
>>
>> Thanks. Now I can see that I have the following lines in Resin's access
>> log:
>> 127.0.0.1 - - [20/Jan/2011:16:19:09 +0100] "GET
>> /solr/update/extract?commit=true HTTP/1.0" 500 5598 "-" "-"
>>
>> I run Solr on Resin, so maybe there is something more I need to
>> configure. I'll take a deeper look at this right now.
>>
>> Erlend
>>
>
>
> --
> Erlend Garåsen
> Center for Information Technology Services
> University of Oslo
> P.O. Box 1086 Blindern, N-0317 OSLO, Norway
> Ph: (+47) 22840193, Fax: (+47) 22852970, Mobile: (+47) 91380968, VIP: 31050
>
