lucene-dev mailing list archives

From "James Dyer (Commented) (JIRA)" <>
Subject [jira] [Commented] (SOLR-2549) DIH LineEntityProcessor support for delimited & fixed-width files
Date Tue, 13 Dec 2011 19:58:30 GMT


James Dyer commented on SOLR-2549:

The dependency here on SOLR-2943 is only for the "DIHCacheTypes" enum, which defines the data
type of each flat-file column.  This is particularly helpful when joining to SQL data sources,
because DIH requires the join keys to be of the same type.  It might be beneficial to rename
the enum to "DIHType" or something more generic, should either issue become a candidate for
> DIH LineEntityProcessor support for delimited & fixed-width files
> -----------------------------------------------------------------
>                 Key: SOLR-2549
>                 URL:
>             Project: Solr
>          Issue Type: Improvement
>          Components: contrib - DataImportHandler
>    Affects Versions: 4.0
>            Reporter: James Dyer
>            Priority: Minor
>         Attachments: SOLR-2549.patch, SOLR-2549.patch, SOLR-2549.patch
> Provides support for Fixed Width and Delimited Files without needing to write a Transformer.

> The following XML properties are supported with this version of LineEntityProcessor:
> For fixed width files:
>  - colDef[#]
> For Delimited files:
>  - fieldDelimiterRegex
>  - firstLineHasFieldnames
>  - delimitedFieldNames
>  - delimitedFieldTypes
> These properties are described in the API documentation; see the patch.
> When combined with the cache improvements from SOLR-2382, this allows you to join a flat
> file entity with other entities (SQL, etc.).
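
To illustrate what the delimited and fixed-width properties listed above are meant to achieve, here is a minimal Python sketch. It is not the patch's Java code and does not use any Solr API. The function names, the type names in CONVERTERS, and the (name, start, end, type) column layout are assumptions made for this example; the message does not show the actual colDef syntax or the DIHCacheTypes values.

```python
import re

# Hypothetical type names; the real DIHCacheTypes enum values are not shown in this message.
CONVERTERS = {"STRING": str, "INTEGER": int, "FLOAT": float}


def parse_delimited(lines, delimiter_regex, field_names=None,
                    field_types=None, first_line_has_fieldnames=False):
    """Split each line on a regex and return one typed dict per data row."""
    splitter = re.compile(delimiter_regex)
    rows = iter(lines)
    if first_line_has_fieldnames:
        # The header row supplies the column names.
        field_names = splitter.split(next(rows).rstrip("\n"))
    if field_names is None:
        raise ValueError("field names are required when there is no header row")
    # Default every column to STRING when no types are given.
    types = field_types or ["STRING"] * len(field_names)
    result = []
    for line in rows:
        values = splitter.split(line.rstrip("\n"))
        result.append({name: CONVERTERS[type_name](value)
                       for name, type_name, value in zip(field_names, types, values)})
    return result


def parse_fixed_width(lines, col_defs):
    """Slice each line by (name, start, end, type) tuples; an assumed layout."""
    return [{name: CONVERTERS[type_name](line[start:end].strip())
             for name, start, end, type_name in col_defs}
            for line in lines]


delimited_rows = parse_delimited(["id|name", "7|widget"], r"\|",
                                 field_types=["INTEGER", "STRING"],
                                 first_line_has_fieldnames=True)
fixed_rows = parse_fixed_width(["0007widget    "],
                               [("id", 0, 4, "INTEGER"), ("name", 4, 14, "STRING")])
```

Converting the id column to an integer is what would let it line up with an integer key from a SQL entity, since the join keys need to be of the same type.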
