lucene-solr-user mailing list archives

From ZHANG Liang F
Subject how to store file path in Solr when using TikaEntityProcessor
Date Tue, 27 Mar 2012 07:55:34 GMT

I am using DIH to index the local file system, but the file path, size, and last-modified
fields are not stored. In schema.xml I defined:

   <field name="title" type="string" indexed="true" stored="true"/>
   <field name="author" type="string" indexed="true" stored="true" />
   <!--<field name="text" type="text" indexed="true" stored="true" />
    liang added-->
   <field name="path" type="string" indexed="true" stored="true" />
   <field name="size" type="long" indexed="true" stored="true" />
   <field name="lastmodified" type="date" indexed="true" stored="true" />

I also defined tika-data-config.xml:

    <dataSource name="bin" type="BinFileDataSource" />
        <entity name="f" dataSource="null" rootEntity="false"
            fileName=".*\.(DOC)|(PDF)|(pdf)|(doc)|(docx)|(ppt)" onError="skip">
            <entity name="tika-test" dataSource="bin" processor="TikaEntityProcessor"
            url="${f.fileAbsolutePath}" format="text" onError="skip">
                <field column="Author" name="author" meta="true"/>
                <field column="title" name="title" meta="true"/>
                <!-- <field column="text" name="text"/> -->
                <field column="fileAbsolutePath" name="path" />
                <field column="fileSize" name="size" />
                <field column="fileLastModified" name="lastmodified" />

The Solr version is 3.5. Any ideas?

Thanks in advance.
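
One arrangement that may be worth checking: in DIH, the fileAbsolutePath, fileSize, and fileLastModified columns are produced by FileListEntityProcessor, not by TikaEntityProcessor, so mapping them inside the Tika entity likely has no effect. The sketch below moves those three field mappings up to the outer entity. It is a minimal sketch under assumptions, not a confirmed fix: the processor="FileListEntityProcessor" attribute and the baseDir value (/path/to/docs is a placeholder) do not appear in the snippet above and are filled in only for illustration, as are the enclosing dataConfig and document elements.

```xml
<dataConfig>
  <dataSource name="bin" type="BinFileDataSource" />
  <document>
    <!-- Outer entity: lists the files. The processor attribute and the
         baseDir placeholder are assumptions, not taken from the post. -->
    <entity name="f" dataSource="null" rootEntity="false"
            processor="FileListEntityProcessor"
            baseDir="/path/to/docs"
            fileName=".*\.(DOC)|(PDF)|(pdf)|(doc)|(docx)|(ppt)" onError="skip">
      <!-- File-level columns come from this entity, so they are mapped here. -->
      <field column="fileAbsolutePath" name="path" />
      <field column="fileSize" name="size" />
      <field column="fileLastModified" name="lastmodified" />
      <!-- Inner entity: extracts metadata from each file with Tika. -->
      <entity name="tika-test" dataSource="bin" processor="TikaEntityProcessor"
              url="${f.fileAbsolutePath}" format="text" onError="skip">
        <field column="Author" name="author" meta="true"/>
        <field column="title" name="title" meta="true"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

Because the outer entity has rootEntity="false", each row of the inner entity becomes a Solr document, and the fields mapped on the outer entity should be carried into that same document.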

