asterixdb-dev mailing list archives

From Chen Li <che...@gmail.com>
Subject Re: loading CSV records with comma in the value
Date Sun, 26 Jul 2015 06:30:36 GMT
@Taewoo: I tried it and it has the same problem.  Do you have a test
case for this feature?  Also do we have documentation for this syntax?

Chen

On Sat, Jul 25, 2015 at 10:52 PM, Taewoo Kim <wangsaeu@gmail.com> wrote:
> The URL is https://asterixdb.ics.uci.edu/documentation/aql/primer.html.
>
>
> It should look like this:
>
> ////
> use dataverse pubs;
>
> create type PaperType as open {
>    id: int32,
>    authors: string
> }
>
> create dataset Papers(PaperType) primary key id;
>
> load dataset Papers
>      using localfs
> (("path"="127.0.01:///Users/chenli/tmp/asterix-data/papers.csv"),
>    ("format"="delimited-text"),
>    ("delimiter"=","));
>
> for $paper in dataset('Papers')
> return $paper;
>
>
>
> Best,
> Taewoo
>
> On Sat, Jul 25, 2015 at 10:47 PM, Chen Li <chenli@gmail.com> wrote:
>
>> @Taewoo: can you send me the syntax or the documentation URL to show the
>> syntax?
>>
>> Chen
>>
>> On Sat, Jul 25, 2015 at 3:27 PM, Taewoo Kim <wangsaeu@gmail.com> wrote:
>> > Can you try to load it into an internal dataset? I think I have implemented
>> > the "comma between the comma (delimiter)" when modifying the delimited data
>> > parser. And Chris also modified that part, too. If it doesn't work, I can
>> > look at the issue.
>> >
>> > Best,
>> > Taewoo
>> >
>> > On Sat, Jul 25, 2015 at 1:51 PM, Chen Li <chenli@gmail.com> wrote:
>> >
>> >> Not sure if this topic was discussed before.  I was trying to load an
>> >> external CSV file using "," as the delimiter.  But the engine failed to
>> >> read a file with the following single record:
>> >>
>> >> 14, "John Smith, Mary Reeve"
>> >>
>> >>
>> >> use dataverse pubs;
>> >>
>> >>    create type PaperType as open {
>> >>       id: int32,
>> >>        authors: string
>> >>    }
>> >>
>> >> create external dataset Papers(PaperType)
>> >>    using localfs
>> >> (("path"="127.0.01:///Users/chenli/tmp/asterix-data/papers.csv"),
>> >>    ("format"="delimited-text"),
>> >>    ("delimiter"=","));
>> >>
>> >> for $paper in dataset('Papers')
>> >> return $paper;
>> >>
>> >> The following is the output, which shows that the comma in the authors
>> >> field was incorrectly used to break the field.  Any idea about how to fix
>> >> it?
>> >>
>> >> Output
>> >> Results:
>> >>
>> >> { "id": 14, "authors": " \"John Smith" }
>> >>
>> >> Duration of all jobs: 0.091 sec
>> >>
>> >> Success: Query Complete
>> >>
>>
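
The symptom reported in the original message can be reproduced outside
AsterixDB. The following is a minimal Python sketch, not AsterixDB's
delimited-text parser: it only contrasts splitting on every comma with
quote-aware parsing of the same record. Using Python's csv module and its
skipinitialspace option is an assumption made for illustration; the thread
does not say how the AsterixDB parser treats the space before the opening
quote.

```python
import csv
import io

# The single record from the original message.
record = '14, "John Smith, Mary Reeve"'

# Naive approach: split on every comma and ignore quoting.
# The comma inside the quoted authors value breaks the field in two.
naive_fields = record.split(",")

# Quote-aware approach: a comma inside double quotes is data, not a
# delimiter. skipinitialspace=True makes the reader ignore the space
# after the delimiter so the quote is seen at the start of the field.
quoted_fields = next(csv.reader(io.StringIO(record), skipinitialspace=True))

print(naive_fields)   # three fields, the second is ' "John Smith'
print(quoted_fields)  # two fields, the second is 'John Smith, Mary Reeve'
```

The second field of the naive split, ' "John Smith', has the same shape as
the "authors" value in the reported output, which is consistent with the
comma inside the quotes being treated as a delimiter.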
