lucene-dev mailing list archives

From "david babits (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-3434) CSVRequestHandler does not trim header when using header=true&trim=true
Date Fri, 04 May 2012 15:36:48 GMT

    [ https://issues.apache.org/jira/browse/SOLR-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268456#comment-13268456 ]

david babits commented on SOLR-3434:
------------------------------------

Yes, specifying fieldnames works, and it worked yesterday too; I forgot to mention it.

To close this out:
My goal is to accept an arbitrary file, generated by an extract from a database, and load it
into Solr.
The database extract comes with its fields aligned, hence the whitespace in the header and values.
I do not know the fieldnames ahead of time, so I was hoping to specify header=true&trim=true
and have Solr take care of the parsing.
This did not work.
Since I have to massage the data anyway to remove spaces, I might as well parse out the header
line at the same time using sed and construct the fieldnames variable.
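
As a rough illustration of that header-parsing step, here is a minimal Python sketch (standing in for the sed approach) that trims the column names from a header line and joins them into a value for fieldnames. The function name build_fieldnames, the sample header, and the comma separator are assumptions for illustration only; the actual extract format may differ.

```python
import csv
import io


def build_fieldnames(header_line: str, separator: str = ",") -> str:
    """Return a comma-separated list of trimmed column names.

    Hypothetical helper: the separator default is an assumption,
    not something stated in the original report.
    """
    # csv.reader splits the single header line on the separator.
    reader = csv.reader(io.StringIO(header_line), delimiter=separator)
    columns = next(reader)
    # Strip the padding that the aligned database extract adds.
    return ",".join(name.strip() for name in columns)


# Sample header with alignment padding (made-up column names).
fieldnames = build_fieldnames("id     , name        , price  ")
print(fieldnames)
```

The resulting string can then be passed as the fieldnames parameter in place of relying on header=true.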

I also found that I need <dynamicField name="*" type="string" multiValued="true" />
since I do not know the header up front and can't rely on suffixes such as _s; it would not work otherwise.

So, trim=true&header=false&skipLines=2&fieldnames=$fieldnames
This is the workaround. 
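
For illustration, the workaround parameters could be assembled into a query string along these lines. This is only a sketch: the function name build_csv_params and the sample field names are hypothetical, and the parameter values are simply the ones quoted above.

```python
from urllib.parse import urlencode


def build_csv_params(fieldnames: str) -> str:
    """Build the query string for the workaround described above.

    Hypothetical helper; the parameter names and values are taken
    from the comment, the field names are supplied by the caller.
    """
    params = {
        "trim": "true",      # trim whitespace around data values
        "header": "false",   # do not read field names from the file
        "skipLines": "2",    # value used in the workaround above
        "fieldnames": fieldnames,
    }
    # Keep commas unescaped so the fieldnames list stays readable.
    return urlencode(params, safe=",")


# Made-up field names, already trimmed.
print(build_csv_params("id,name,price"))
```

The query string would be appended to whatever CSV update URL the Solr installation exposes.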

My opinion is that 'trim' should be true by default and should certainly apply to both data
and header, although I understand this would break backward compatibility.

Thanks again for your help.
                
> CSVRequestHandler does not trim header when using header=true&trim=true
> -----------------------------------------------------------------------
>
>                 Key: SOLR-3434
>                 URL: https://issues.apache.org/jira/browse/SOLR-3434
>             Project: Solr
>          Issue Type: Improvement
>    Affects Versions: 3.6
>         Environment: Linux
>            Reporter: david babits
>              Labels: CSV, header, separator
>
> When using {{header=true&trim=true}}, the field names in the header row are not trimmed.
> This is consistent with the documentation, but that doesn't mean it makes sense.
> It would be good to change this so that trim=true also applies to the header row (at least
> by default).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org