lucene-solr-user mailing list archives

From Henrique Oliveira <hensan...@gmail.com>
Subject Re: CSV entry as multiple documents
Date Wed, 18 Feb 2015 01:37:07 GMT
Yes, Alexandre is right about my question. To make it clear, a CSV that looks like:
t1,v1,v2,v3
2015-01-01T01:59:00Z,0.3,0.5,0.7
2015-01-01T02:00:00Z,0.4,0.5,0.8

would be the same as indexing
t1,v
2015-01-01T01:59:00Z,0.3
2015-01-01T01:59:00Z,0.5
2015-01-01T01:59:00Z,0.7
2015-01-01T02:00:00Z,0.4
2015-01-01T02:00:00Z,0.5
2015-01-01T02:00:00Z,0.8

I don’t know if a multiValued field would do the trick, since I need separate documents
rather than one document with several values. Do you have more info on that split command?
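
If it turns out the file has to be reshaped before posting, the conversion described
above could be sketched roughly as follows. This is only a minimal sketch and not a Solr
feature: melt_csv is a hypothetical helper name, and it assumes the key column is called
t1, that the remaining column names are unique (t1,v1,v2,v3), and that the output value
column should be called v.

```python
import csv
import io


def melt_csv(text, key="t1", value_name="v"):
    """Turn each wide row into one (key, value) row per non-key column.

    Hypothetical helper for illustration; column names must be unique.
    """
    reader = csv.DictReader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([key, value_name])
    for row in reader:
        # Emit one output row for every value column, repeating the key.
        for column in reader.fieldnames:
            if column != key:
                writer.writerow([row[key], row[column]])
    return out.getvalue()


wide = (
    "t1,v1,v2,v3\n"
    "2015-01-01T01:59:00Z,0.3,0.5,0.7\n"
    "2015-01-01T02:00:00Z,0.4,0.5,0.8\n"
)
long_csv = melt_csv(wide)
print(long_csv)
```

The resulting text could then be posted to the CSV update handler like any other CSV
file. Since documents that share the same uniqueKey value replace each other, the schema
would presumably need a unique id that is not t1 for all of these rows to be kept.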

Henrique

> On Feb 17, 2015, at 7:57 PM, Alexandre Rafalovitch <arafalov@gmail.com> wrote:
> 
> I think the question asked was a bit different. It was about having
> one row/document split into multiple with some fields replicated and
> some mapped.
> 
> JSON (single-document format) has a split command which might be
> similar to what's being asked. CSV has a split command as well, but I
> think it is more about creating a multiValued field.
> 
> Or did I miss a different parameter?
> 
> Regards,
>   Alex.
> ----
> Sign up for my Solr resources newsletter at http://www.solr-start.com/
> 
> 
> On 17 February 2015 at 19:41, Anshum Gupta <anshum@anshumgupta.net> wrote:
>> Hi Henrique,
>> 
>> Solr supports posting a csv with multiple rows. Have a look at the
>> documentation in the ref. guide here:
>> https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Index+Handlers#UploadingDatawithIndexHandlers-CSVFormattedIndexUpdates
>> 
>> 
>> 
>> On Tue, Feb 17, 2015 at 2:44 PM, Henrique Oliveira <hensantos@gmail.com>
>> wrote:
>> 
>>> Hi all,
>>> 
>>> I was wondering if there is a way to tell Solr to treat a CSV entry as
>>> multiple documents instead of one document. For instance, suppose that a
>>> CSV file has 4 fields and a single entry:
>>> t1,v1,v2,v3
>>> 2015-01-01T01:00:59Z,0.3,0.5,0.7
>>> 
>>> I want Solr to update its index as if it were 3 different documents:
>>> t1,v
>>> 2015-01-01T01:00:59Z,0.3
>>> 2015-01-01T01:00:59Z,0.5
>>> 2015-01-01T01:00:59Z,0.7
>>> 
>>> Is that possible, or do I have to create a different CSV for it?
>>> 
>>> Many thanks,
>>> Henrique.
>> 
>> 
>> 
>> 
>> --
>> Anshum Gupta
>> http://about.me/anshumgupta

