lucene-solr-user mailing list archives

From Noble Paul നോബിള്‍ नोब्ळ् <noble.p...@corp.aol.com>
Subject Re: PlainTextEntityProcessor not putting any text into a field in index
Date Thu, 18 Jun 2009 17:20:35 GMT
Can you log the row and check what the plainText field actually contains
(using LogTransformer)?
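
Something along these lines might do it. This is only a sketch, assuming the LogTransformer's logTemplate and logLevel attributes as available in Solr 1.4; the entity is copied from your config, and the message text and log level are just placeholders:

```xml
<!-- Sketch only: LogTransformer appended to the existing transformer chain
     of the "pt" entity. The logTemplate wording and logLevel are illustrative. -->
<entity
    name="pt"
    processor="PlainTextEntityProcessor"
    url="${f.fileAbsolutePath}"
    transformer="RegexTransformer,TemplateTransformer,LogTransformer"
    logTemplate="plainText for ${f.fileAbsolutePath}: ${pt.plainText}"
    logLevel="info">
  <field column="plainText" name="text"/>
  <field column="datasource" template="textfiles" />
</entity>
```

If the logged value is empty, the problem is upstream of the field mapping; if it has the file contents, the mapping to the schema field is the place to look.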

On Thu, Jun 18, 2009 at 8:54 PM, Jay Hill<jayallenhill@gmail.com> wrote:
> I'm having some trouble getting the PlainTextEntityProcessor to populate a
> field in an index. I'm using the TemplateTransformer to fill 2 fields, and
> have a timestamp field in schema.xml, and these fields make it into the
> index. Only the plainText data is missing. Here is my configuration:
>
> <dataConfig>
>   <dataSource type="FileDataSource" encoding="UTF-8" />
>   <document>
>     <entity
>         name="f"
>         processor="FileListEntityProcessor"
>         baseDir="/Users/jayhill/test/dir"
>         fileName=".*txt"
>         recursive="true"
>         rootEntity="true">
>
>       <entity
>           name="pt"
>           processor="PlainTextEntityProcessor"
>           url="${f.fileAbsolutePath}"
>           transformer="RegexTransformer,TemplateTransformer">
>         <field column="plainText" name="text"/>
>         <field column="datasource" template="textfiles" />
>       </entity>
>
>     </entity>
>   </document>
> </dataConfig>
>
> I've tried adding "plainText" as a field in schema.xml, but that didn't work
> either.
>
> When I look at what the PlainTextEntityProcessor class is doing I see that
> it has correctly parsed the file and has the text in a StringWriter:
>    row.put(PLAIN_TEXT, sw.toString());
> I just don't know how to get that text into a field in the index.
>
> Any pointers appreciated.
>
> -Jay
>



-- 
-----------------------------------------------------
Noble Paul | Principal Engineer| AOL | http://aol.com
