manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: Rules of excluding specific files in Windows file server are not recognized
Date Tue, 11 Sep 2012 12:13:32 GMT
I am wondering if there might be another locale-specific toLowerCase()
issue like we saw in Turkey...
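
For readers unfamiliar with the Turkish case mentioned above, here is a minimal Java sketch of how a locale-sensitive toLowerCase() can break a case-insensitive comparison. The class and method names are hypothetical and are not taken from the ManifoldCF source; whether a Japanese default locale triggers anything comparable is not established in this thread.

```java
import java.util.Locale;

// Hypothetical illustration only; not ManifoldCF code.
public class LocaleLowerCaseDemo {
    // Lower-case with an explicit locale so the result does not depend on the JVM default.
    static String lower(String s, Locale locale) {
        return s.toLowerCase(locale);
    }

    // Compare a rule and a file name after lower-casing both under the same locale.
    static boolean sameIgnoringCase(String rule, String fileName, Locale locale) {
        return lower(rule, locale).equals(lower(fileName, locale));
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");
        String rule = "file.txt";
        String onDisk = "FILE.TXT";

        // Locale.ROOT maps 'I' to 'i', so the two names compare equal.
        System.out.println(sameIgnoringCase(rule, onDisk, Locale.ROOT));

        // The Turkish locale maps 'I' to dotless 'ı' (U+0131), so they differ.
        System.out.println(sameIgnoringCase(rule, onDisk, turkish));
    }
}
```

Passing Locale.ROOT (or another fixed locale) to toLowerCase() is the usual way to keep such comparisons independent of the machine's default locale.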

I've asked Shigeki to turn on connector debugging and send us the log.
That should demonstrate if the rule is not matching due to case
reasons.

Karl

On Tue, Sep 11, 2012 at 7:44 AM, Ahmet Arslan <iorixxx@yahoo.com> wrote:
> Hi Shigeki
>
> Can you try entering "*text.txt" in the text box?
>
> Ahmet
> --- On Tue, 9/11/12, Shigeki Kobayashi <shigeki.kobayashi3@g.softbank.co.jp> wrote:
>
> From: Shigeki Kobayashi <shigeki.kobayashi3@g.softbank.co.jp>
> Subject: Rules of excluding specific files in Windows file server are not recognized
> To: user@manifoldcf.apache.org
> Date: Tuesday, September 11, 2012, 1:46 PM
>
> Hi guys.
> I need some help excluding specific files from crawling.
> I am trying to crawl a Windows file server using the Windows shares connector and index it to Solr.
>
> There are some files I do not want to index so I set paths to exclude them from crawling, but the job crawls them.
> For example, I do NOT want to index "text.txt" in a directory D which is a root path.
>
>
> In the "Paths" tab:
> - Set D as the root path.
> - To create crawling rules, choose "exclude" and "file" from the pulldown, and enter "text.txt" in the text box.
>
> - The list of crawling rules is created as follows:
>   1. Exclude file(s) matching text.txt
>   2. Include indexable file(s) matching *
>   3. Include directory(s) matching *
>
>
> - Save the job settings.
> As a result, the job still tries to crawl the file. I wonder why "text.txt" does not match in the crawling rule.
>
>
> Does anyone know what I did wrong?
> Version:  MCF 0.5  Solr 3.5  MySql 5.5
>
> Regards,
> Shigeki
>
>
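
As a rough illustration of the rule list quoted above and of why "*text.txt" was suggested, here is a small Java sketch of ordered include/exclude rules. It rests on assumptions that this thread does not confirm: that rules are tried top to bottom with the first match deciding, that "*" matches any run of characters, and that the rule may be compared against a longer path string rather than the bare file name. The class name, method names, and the sample path "D/text.txt" are hypothetical, and this is not the connector's actual matching code.

```java
import java.util.regex.Pattern;

// Hypothetical sketch; the real Windows shares connector may behave differently.
public class RuleOrderSketch {
    // Treat '*' as "any run of characters" and everything else as a literal.
    static boolean globMatches(String glob, String candidate) {
        StringBuilder regex = new StringBuilder();
        boolean first = true;
        for (String literal : glob.split("\\*", -1)) {
            if (!first) {
                regex.append(".*");
            }
            regex.append(Pattern.quote(literal));
            first = false;
        }
        return Pattern.matches(regex.toString(), candidate);
    }

    // Each rule is {action, glob}; the first rule whose glob matches decides (assumed).
    static String decide(String[][] rules, String candidate) {
        for (String[] rule : rules) {
            if (globMatches(rule[1], candidate)) {
                return rule[0];
            }
        }
        return "no match";
    }

    public static void main(String[] args) {
        String[][] original = { {"exclude", "text.txt"}, {"include", "*"} };
        String[][] suggested = { {"exclude", "*text.txt"}, {"include", "*"} };

        // If the rule sees only the bare file name, the original rule excludes it.
        System.out.println(decide(original, "text.txt"));
        // If the rule sees a longer path, "text.txt" no longer matches and "*" includes it.
        System.out.println(decide(original, "D/text.txt"));
        // The leading wildcard matches in both situations.
        System.out.println(decide(suggested, "D/text.txt"));
    }
}
```

Under those assumptions the original rule would fail only when the compared string is longer than the file name, which is one possible explanation alongside the case-sensitivity theory; the connector debug log would be needed to tell them apart.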
