mahout-dev mailing list archives

From "Sebastian Schelter (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAHOUT-1521) lucene2seq - Error trying to load data from stored field (when non-indexed)
Date Sun, 27 Apr 2014 08:04:14 GMT

    [ https://issues.apache.org/jira/browse/MAHOUT-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13982239#comment-13982239 ]

Sebastian Schelter commented on MAHOUT-1521:
--------------------------------------------

Can we have a lucene hacker look at this?

> lucene2seq - Error trying to load data from stored field (when non-indexed)
> ----------------------------------------------------------------------------
>
>                 Key: MAHOUT-1521
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1521
>             Project: Mahout
>          Issue Type: Bug
>    Affects Versions: 0.9
>            Reporter: Terry Blankers
>            Assignee: Frank Scholten
>              Labels: lucene2seq
>             Fix For: 1.0
>
>
> When using lucene2seq to load data from a field that is stored but not indexed I receive the following error:
> {noformat}IllegalArgumentException: Field 'body' does not exist in the index{noformat}
> Field is described in schema.xml as:
> {noformat}<field name="body" type="string" stored="true" indexed="false"/>{noformat}
> BTW, the field is copied to the 'content' field for searching; schema.xml snippet:
> {noformat}<copyField source="body" dest="content" />{noformat}
> Copy field is described in schema.xml as:
> {noformat}<field name="content" type="text" stored="false" indexed="true" multiValued="true"/>{noformat}
> If I try to load data from the copy field, lucene2seq runs with no errors but I receive empty data for each key/doc:
> {noformat}Key class: class org.apache.hadoop.io.Text Value Class: class org.apache.hadoop.io.Text
> Key: 96C4C76CF9D7449C724CA77CB8F650EAFD33E31C: Value:
> Key: D6842B81B8D09733B50BEDB4767C2A5C49E43B20: Value:{noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
