manifoldcf-dev mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: [VOTE] Release Apache ManifoldCF 1.5, RC7
Date Thu, 06 Feb 2014 14:53:36 GMT
Hi Erlend,

Please go into the Simple History, and change the start time of the query
to be one day earlier than the default.  By default, Simple History only
reports the last hour's worth of events.
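
To illustrate why a one-hour default window can hide events that really happened, here is a minimal Python sketch. It is not ManifoldCF code: the function names, the regular expression, and the report time are assumptions made for illustration. The two log entries are taken from the Solr output quoted below.

```python
import re
from datetime import datetime, timedelta

# Matches the timestamp and URL of a Solr "add" entry; \s* tolerates the
# line wrap between "{add=[" and the URL seen in the quoted log.
ADD_ENTRY = re.compile(
    r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+\].*?\{add=\[\s*(\S+?)\]\}"
)

def parse_add_entries(log_text):
    """Return (timestamp, url) pairs for each add entry in the log text."""
    return [
        (datetime.strptime(ts, "%Y-%m-%d %H:%M:%S"), url)
        for ts, url in ADD_ENTRY.findall(log_text)
    ]

def in_window(entries, end, window=timedelta(hours=1)):
    """Keep only entries whose timestamp lies in [end - window, end]."""
    start = end - window
    return [(ts, url) for ts, url in entries if start <= ts <= end]

sample = """[2014-02-06 15:21:00.321] INFO [uio] OP crawl {add=[
http://www.ibsen.uio.no/brevmottakere.xhtml?bokstav=B]} 0 38
[2014-02-06 15:22:21.657] INFO [uio] OP crawl {add=[
http://www.ibsen.uio.no/variakronologi.xhtml]} 0 73
"""

entries = parse_add_entries(sample)
report_time = datetime(2014, 2, 6, 16, 22, 0)  # hypothetical time the report is run

last_hour = in_window(entries, report_time)
last_day = in_window(entries, report_time, window=timedelta(days=1))
```

With the hypothetical report time above, the first entry is already more than an hour old and drops out of `last_hour`, while `last_day` still contains both. Moving the start time back one day has the same effect in the report.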

Thanks,
Karl



On Thu, Feb 6, 2014 at 9:50 AM, Erlend Garåsen <e.f.garasen@usit.uio.no> wrote:

> On 06.02.14 15:25, Karl Wright wrote:
>
>> So I conclude that simple history is working fine, but since it is only
>> returning indexing results within the last hour by default it is confusing
>> you.  I also think it is likely that documents are getting skipped because
>> you've crawled this set before with the same job and many of the documents
>> have not changed.
>>
>
> Karl, we are indexing these documents:
>
> I have tail -F opened up from our Solr test server at the moment:
> [2014-02-06 15:21:00.321] INFO [uio] OP crawl {add=[
> http://www.ibsen.uio.no/brevmottakere.xhtml?bokstav=B]} 0 38
> [2014-02-06 15:21:00.359] INFO [uio] OP crawl {add=[
> http://www.ibsen.uio.no/brevmottakere.xhtml?bokstav=N]} 0 23
> [2014-02-06 15:21:29.732] INFO [uio] OP crawl {add=[
> http://www.ibsen.uio.no/brevmottakere.xhtml?bokstav=G]} 0 29
> [2014-02-06 15:22:11.954] INFO [uio] OP crawl {add=[
> http://www.ibsen.uio.no/brevmottakere.xhtml?bokstav=S]} 0 38
> [2014-02-06 15:22:15.752] INFO [uio] OP crawl {add=[
> http://www.ibsen.uio.no/brevmottakere.xhtml?bokstav=D]} 0 28
> [2014-02-06 15:22:18.323] INFO [uio] OP crawl {add=[
> http://www.ibsen.uio.no/brevmottakere.xhtml?bokstav=H]} 0 34
> [2014-02-06 15:22:21.657] INFO [uio] OP crawl {add=[
> http://www.ibsen.uio.no/variakronologi.xhtml]} 0 73
>
> How could these log entries show up on our Solr server if the documents
> were skipped?
>
> And why did I get entries like this earlier today:
>
> DEBUG 2014-02-06 10:28:06,609 (Worker thread '29') - WEB: Decided to
> ingest 'http://www.ibsen.uio.no/varia.xhtml'
>
> (I have changed the log level back to INFO right now, so I cannot see
> these entries for the last crawl, but I will re-enable DEBUG again).
>
> I have re-ingested all documents several times today to be sure that all
> documents were crawled all over again.
>
> Of course, I can try to remove all jobs, delete all tables in PostgreSQL
> and try to create everything from scratch in case the old settings did not
> get upgraded successfully. Unfortunately MCF will delete all tables in my
> index as well.
>
> Erlend
>
