lucene-solr-dev mailing list archives

From "Fergus McMenemie (JIRA)" <j...@apache.org>
Subject [jira] Updated: (SOLR-1000) DIH FileListEntityProcessor fileName filters directory names and stops recursion
Date Tue, 03 Feb 2009 10:07:59 GMT

     [ https://issues.apache.org/jira/browse/SOLR-1000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fergus McMenemie updated SOLR-1000:
-----------------------------------

    Attachment: SOLR-1000.patch

Fixed bugs in the JUnit test and made a few other improvements to it.

> DIH FileListEntityProcessor fileName filters directory names and stops recursion 
> ---------------------------------------------------------------------------------
>
>                 Key: SOLR-1000
>                 URL: https://issues.apache.org/jira/browse/SOLR-1000
>             Project: Solr
>          Issue Type: Improvement
>          Components: contrib - DataImportHandler
>    Affects Versions: 1.3
>            Reporter: Fergus McMenemie
>            Assignee: Shalin Shekhar Mangar
>         Attachments: SOLR-1000.patch, SOLR-1000.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> I have been trying to find out why DIH in FileListEntityProcessor mode did not appear
to be recursing into subdirectories. Going through FileListEntityProcessor.java, I eventually
tumbled to the fact that my fileName filter setting from data-config.xml was also being applied
to directory names.
> I feel that the fileName filter should be applied only to the files fed into the parser;
it should not be applied to the names of the directories we are recursing through. I bodged the code
to adjust the behavior so that the "fileName" and "excludes" attributes of "entity" apply only
to file names and not to directory names. It now recurses through my directory tree, indexing only
the appropriate files! I think the new behavior is closer to what users would expect.
> I will submit a patch once I have constructed one!
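
The behavior requested above can be illustrated with a minimal sketch. This is not
the code from SOLR-1000.patch or from FileListEntityProcessor.java; the class name
FileNameFilterSketch and the method listMatching are hypothetical, and the use of
Matcher.find() rather than matches() is an assumption. The only point shown is that
the fileName and excludes patterns are tested against plain files, while directories
are always descended into when recursion is on.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only; names and structure are assumptions,
// not the actual DataImportHandler implementation.
class FileNameFilterSketch {

    // Returns absolute paths of files under dir whose names match fileName
    // and do not match excludes. Either pattern may be null (no filtering).
    static List<String> listMatching(File dir, Pattern fileName,
                                     Pattern excludes, boolean recursive) {
        List<String> matches = new ArrayList<>();
        collect(dir, fileName, excludes, recursive, matches);
        return matches;
    }

    private static void collect(File dir, Pattern fileName, Pattern excludes,
                                boolean recursive, List<String> out) {
        File[] children = dir.listFiles();
        if (children == null) {
            return; // not a directory, or an I/O error occurred
        }
        for (File child : children) {
            if (child.isDirectory()) {
                // Directory names are never tested against the patterns,
                // so a filter such as ".*\.xml" cannot stop the recursion.
                if (recursive) {
                    collect(child, fileName, excludes, recursive, out);
                }
                continue;
            }
            String name = child.getName();
            if (fileName != null && !fileName.matcher(name).find()) {
                continue; // file name does not match the include pattern
            }
            if (excludes != null && excludes.matcher(name).find()) {
                continue; // file name matches the exclude pattern
            }
            out.add(child.getAbsolutePath());
        }
    }
}
```

With a fileName pattern of ".*\.xml$", a subdirectory named "docs" does not match
the pattern, yet any XML files inside it are still returned. Applying the pattern
to directory names as well would skip "docs" and everything beneath it, which is
the symptom described in the report.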

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.