asterixdb-dev mailing list archives

From "Till Westmann" <ti...@apache.org>
Subject Re: loading CSV records with comma in the value
Date Sun, 26 Jul 2015 00:13:35 GMT
I would guess (not having access to the code right now) that we also 
have a quote character in addition to the delimiter character. Maybe 
that needs to be specified?
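
To illustrate the difference a quote character makes, here is a minimal
sketch in Python using the standard csv module. It is not AsterixDB's
delimited data parser, and the function names are made up for this
example. Splitting on the delimiter alone reproduces the truncated
authors value reported below, while a quote-aware reader keeps the
field whole.

```python
import csv

# The single record from the original report.
LINE = '14, "John Smith, Mary Reeve"'


def naive_split(line, delimiter=","):
    """Split on every delimiter, ignoring quotes (hypothetical helper)."""
    return line.split(delimiter)


def quote_aware_split(line, delimiter=",", quote='"'):
    """Split with Python's csv reader, which honors the quote character.

    skipinitialspace=True drops the space after each delimiter, so the
    opening quote is seen at the start of the field.
    """
    reader = csv.reader(
        [line],
        delimiter=delimiter,
        quotechar=quote,
        skipinitialspace=True,
    )
    return next(reader)


if __name__ == "__main__":
    print(naive_split(LINE))        # ['14', ' "John Smith', ' Mary Reeve"']
    print(quote_aware_split(LINE))  # ['14', 'John Smith, Mary Reeve']
```

In Python's reader the quote is only recognized here because
skipinitialspace=True removes the space after the comma. Whether the
space before the opening quote also matters to AsterixDB's parser is an
open question that this sketch does not settle.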

Also, I think that we should have regression tests for this that could 
serve as an example.

Cheers,
Till

On 25 Jul 2015, at 15:27, Taewoo Kim wrote:

> Can you try to load it into an internal dataset? I think I implemented
> handling for a comma inside a field value (that is, the delimiter
> appearing within the data) when modifying the delimited data parser.
> Chris also modified that part. If it doesn't work, I can look at the
> issue.
>
> Best,
> Taewoo
>
> On Sat, Jul 25, 2015 at 1:51 PM, Chen Li <chenli@gmail.com> wrote:
>
>> Not sure if this topic was discussed before.  I was trying to load an
>> external CSV file using "," as the delimiter.  But the engine failed
>> to read a file with the following single record:
>>
>> 14, "John Smith, Mary Reeve"
>>
>>
>> use dataverse pubs;
>>
>> create type PaperType as open {
>>     id: int32,
>>     authors: string
>> }
>>
>> create external dataset Papers(PaperType)
>> using localfs
>> (("path"="127.0.01:///Users/chenli/tmp/asterix-data/papers.csv"),
>>  ("format"="delimited-text"),
>>  ("delimiter"=","));
>>
>> for $paper in dataset('Papers')
>> return $paper;
>>
>> The following is the output, which shows that the comma inside the
>> authors field was incorrectly treated as a field separator.  Any idea
>> about how to fix it?
>>
>> Output
>> Results:
>>
>> { "id": 14, "authors": " \"John Smith" }
>>
>> Duration of all jobs: 0.091 sec
>>
>> Success: Query Complete
>>
