manifoldcf-user mailing list archives

From Shigeki Kobayashi <shigeki.kobayas...@g.softbank.co.jp>
Subject Rules of excluding specific files in Windows file server are not recognized
Date Tue, 11 Sep 2012 10:46:52 GMT
Hi guys.

I need some help excluding specific files from crawling.

I am trying to crawl a Windows file server using the Windows shares connector
and index the content into Solr.
There are some files I do not want to index, so I set rules to exclude them
from crawling, but the job crawls them anyway.

For example, I do NOT want to index "text.txt" in directory D, which is the
root path.

In "Paths" tab:
- Set D as the root path.
- To create a crawling rule, choose "exclude" and "file" from the pulldowns,
and enter "text.txt" in the text box.
- The resulting list of crawling rules is as follows:

  1. Exclude file(s) matching text.txt
  2. Include indexable file(s) matching *
  3. Include directory(s) matching *

- Save the job settings.

As a result, the job still tries to crawl the file.
I wonder why "text.txt" is not matched by the exclude rule.
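
To make the question concrete, here is a minimal Python sketch of how such a rule list might be evaluated. It is not ManifoldCF code. It assumes two things that this message does not confirm: that rules are checked top to bottom with the first match winning, and that a pattern is compared against the whole path relative to the root rather than against the bare file name. The names RULES and decide are hypothetical, and "indexable file(s)" is simplified to plain "file".

```python
from fnmatch import fnmatchcase

# Hypothetical model of the rule list shown in the Paths tab:
# (action, kind, pattern), in the same order as the job lists them.
RULES = [
    ("exclude", "file", "text.txt"),
    ("include", "file", "*"),
    ("include", "directory", "*"),
]


def decide(path, kind, rules=RULES):
    """Return the action of the first rule whose kind and pattern match."""
    for action, rule_kind, pattern in rules:
        # Assumption: the pattern must match the entire path string.
        if rule_kind == kind and fnmatchcase(path, pattern):
            return action
    # Assumption: a path that matches no rule is not crawled.
    return "exclude"


# Same list, but with the exclude pattern allowing a leading path portion.
ALT_RULES = [("exclude", "file", "*/text.txt")] + RULES[1:]

print(decide("/text.txt", "file"))             # include
print(decide("/text.txt", "file", ALT_RULES))  # exclude
```

Under those assumptions, rule 1 never matches "/text.txt" because of the leading slash, so the file falls through to rule 2 and is included. Whether the connector really matches this way is exactly what I am unsure about.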

Does anyone know what I did wrong?

Version:
  MCF 0.5
  Solr 3.5
  MySQL 5.5


Regards,

Shigeki
