manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: Ingestion API socket timeout exception waiting for response code
Date Mon, 07 May 2012 12:18:21 GMT
Thanks for the update!
Karl

On Mon, May 7, 2012 at 7:15 AM, Erlend Garåsen <e.f.garasen@usit.uio.no> wrote:
>
> Document deletion works perfectly after I reinstalled the SSL certificate
> and reentered the username and password to our Solr server. So I think this
> issue has been solved.
>
> Erlend
>
> On 27.04.12 12.11, Erlend Garåsen wrote:
>>
>>
>> Many thanks for your suggestions and help, Karl. Using a filesystem
>> crawl was actually a good idea for debugging/testing. To install a new
>> version of Solr is not that easy on our test server for many reasons,
>> generally because it is under control of another division dealing with
>> servers at the uni, even though I can get root access. Anyway, according
>> to the logs on our Solr 3.2 server, it seems that MCF successfully
>> managed to delete one test document I removed:
>> [2012-04-27 11:18:33.092] {delete=[file:/tmp/mcf/docs/app_lasso.pdf]} 0 7
>> [2012-04-27 11:18:33.092] [] webapp=/solr path=/update params={} status=0 QTime=7
>>
>> The result code is 200 according to Simple History in MCF.
>>
>> I entered the passwords once again for the Solr servers into the Solr
>> output configuration, deleted and uploaded our SSL certificate once
>> again before I did the filesystem test. I should have performed the
>> tests prior to the password updates.
>>
>> The crawl will start again later today at 6 pm on our production server,
>> so I will try to figure out whether we still have problems later. I'm
>> going to Scotland later this evening for some days without my laptop, so
>> I cannot check the status of my crawl before I'm back, but I'll let my
>> colleague watch the logs.
>>
>> Erlend
>>
>> On 26.04.12 21.14, Karl Wright wrote:
>>>
>>> Hi Erlend,
>>>
>>> I had some time today and was able to verify that everything worked
>>> fine against what I have currently on my laptop, which is Solr 3.2.
>>> The second job run looks like this:
>>>
>>> 04-26-2012 15:11:44.154 job end 1335467343879(test) 0 1
>>> 04-26-2012 15:11:34.159 document deletion (solr) file:/C:/testcrawl/there.txt 200 0 117
>>> 04-26-2012 15:11:24.690 read document C:\testcrawl OK 0 1
>>> 04-26-2012 15:11:24.494 job start 1335467343879(test) 0 1
>>>
>>> So it appears that either something changed in Solr, or SSL support is
>>> broken, or your network is not permitting a valid HTTP response for
>>> some reason.
>>>
>>> Karl
>>>
>>>
>>> On Thu, Apr 26, 2012 at 11:10 AM, Karl Wright <daddywri@gmail.com> wrote:
>>>>
>>>> Hi Erlend,
>>>>
>>>> Can you try the following:
>>>>
>>>> (1) Make a fresh Solr checkout of 3.6 or whatever Solr version you are
>>>> using, and build it
>>>> (2) Start it
>>>> (3) Run a simple filesystem crawl using a Solr connection that is
>>>> created with the default values
>>>> (4) Delete a file in your filesystem that was crawled
>>>> (5) Crawl again
>>>>
>>>> Does the deletion happen OK?
>>>>
>>>> AFAIK, nothing has changed in the Solr connector that should affect
>>>> the ability to delete. This test will confirm that it is still
>>>> working.
>>>>
>>>> Thanks,
>>>> Karl
>>>>
>>>>
>>>> On Thu, Apr 26, 2012 at 10:19 AM, Erlend Garåsen
>>>> <e.f.garasen@usit.uio.no> wrote:
>>>>>
>>>>> It seems that MCF cannot delete documents from Solr. A timeout occurs,
>>>>> and the job stops after a while.
>>>>>
>>>>> This is what I can see from the log:
>>>>> WARN 2012-04-20 18:24:30,373 (Worker thread '16') - Service interruption
>>>>> reported for job 1327930125433 connection 'Web crawler': Ingestion API
>>>>> socket timeout exception waiting for response code: Read timed out;
>>>>> ingestion will be retried again later
>>>>>
>>>>> If I take a further look in Simple History, it seems that this error
>>>>> is related to document deletion.
>>>>>
>>>>> I have tried to delete the document manually by using curl from the
>>>>> same server MCF is installed on, in case we have some access
>>>>> restrictions, but curl succeeded.
>>>>>
>>>>> We do not have any problems with adding documents; the timeout only
>>>>> occurs while deleting them.
>>>>>
>>>>> I have checked our Solr configuration. MCF does use the correct path
>>>>> for document deletion, i.e. /update.
>>>>>
>>>>> The realm, username and password for our Solr server are entered
>>>>> correctly, and the SSL certificate is valid as well.
>>>>>
>>>>> Erlend
>>>>>
>>>>> --
>>>>> Erlend Garåsen
>>>>> Center for Information Technology Services
>>>>> University of Oslo
>>>>> P.O. Box 1086 Blindern, N-0317 OSLO, Norway
>>>>> Ph: (+47) 22840193, Fax: (+47) 22852970, Mobile: (+47) 91380968,
>>>>> VIP: 31050
>>
>>
>>
>
>
> --
> Erlend Garåsen
> Center for Information Technology Services
> University of Oslo
> P.O. Box 1086 Blindern, N-0317 OSLO, Norway
> Ph: (+47) 22840193, Fax: (+47) 22852970, Mobile: (+47) 91380968, VIP: 31050
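
For readers reproducing the manual check from the original report (deleting a document through Solr's /update handler from the MCF host), here is a minimal Python sketch. It is not the ManifoldCF connector's code and not the curl command used in the thread, which was not posted. It assumes a Solr XML update handler at /update, HTTP basic authentication, and a server certificate the local trust store already accepts. The URL, the credentials and the function names are hypothetical.

```python
import base64
import socket
import urllib.error
import urllib.request
from xml.sax.saxutils import escape


def build_delete_xml(doc_id: str) -> bytes:
    """Build a Solr XML delete-by-id message for a single document."""
    # Escape &, < and > so that unusual document ids stay well-formed XML.
    return f"<delete><id>{escape(doc_id)}</id></delete>".encode("utf-8")


def build_delete_request(update_url, doc_id, username=None, password=None):
    """Prepare (but do not send) a POST to the Solr update handler."""
    req = urllib.request.Request(
        update_url, data=build_delete_xml(doc_id), method="POST"
    )
    req.add_header("Content-Type", "text/xml; charset=utf-8")
    if username is not None:
        # Assumption: the server uses HTTP basic authentication.
        raw = f"{username}:{password}".encode("utf-8")
        req.add_header("Authorization", "Basic " + base64.b64encode(raw).decode("ascii"))
    return req


def send_delete(req, timeout_seconds=10.0):
    """Send the request. Return the HTTP status, or None on a timeout."""
    try:
        with urllib.request.urlopen(req, timeout=timeout_seconds) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 401 if the credentials are rejected
    except urllib.error.URLError as err:
        if isinstance(err.reason, socket.timeout):
            return None  # timed out while connecting
        raise
    except socket.timeout:
        return None  # timed out while waiting for the response


if __name__ == "__main__":
    # Hypothetical host and credentials; replace before running.
    request = build_delete_request(
        "https://solr.example.invalid/solr/update",
        "file:/tmp/mcf/docs/app_lasso.pdf",
        username="user",
        password="secret",
    )
    print(send_delete(request))
```

A `None` result here would correspond to the "Read timed out" symptom in the log above, while a 401 would point at the credentials, which is where the fix in this thread turned out to lie.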
