asterixdb-dev mailing list archives

From Taewoo Kim <wangs...@gmail.com>
Subject Re: loading CSV records with comma in the value
Date Sun, 26 Jul 2015 20:25:01 GMT
We have test cases for this. They are located in
asterix-app/src/test/resources/runtimets/queries/load/. The documentation
is in /asterix-doc/src/site/markdown/csv.md. The additional syntax for
CSV is fairly simple: there are just two additional parameters, "quote" and
"header". Refer to the file for more details.
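
To illustrate in general terms what a quote character changes, here is a minimal sketch using Python's standard csv module. It is not AsterixDB's parser, and the function names are hypothetical. It contrasts splitting on every delimiter with quote-aware parsing of the record from the original report. The skipinitialspace setting is an assumption added here because that record has a space before the opening quote.

```python
import csv
import io

# The single record from the original report.
RECORD = '14, "John Smith, Mary Reeve"'


def naive_split(line, delimiter=","):
    """Split on every delimiter, ignoring quotes entirely."""
    return line.split(delimiter)


def quote_aware_split(line, delimiter=",", quote='"'):
    """Parse one line, treating delimiters inside quotes as data.

    skipinitialspace=True drops the space after the delimiter so that the
    quote character is seen at the start of the field.
    """
    reader = csv.reader(
        io.StringIO(line),
        delimiter=delimiter,
        quotechar=quote,
        skipinitialspace=True,
    )
    return next(reader)


if __name__ == "__main__":
    print(naive_split(RECORD))
    print(quote_aware_split(RECORD))
```

The first two fields of the naive split correspond to the id and authors values in the reported output, while the quote-aware version keeps both author names in one field.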



Best,
Taewoo

On Sat, Jul 25, 2015 at 11:30 PM, Chen Li <chenli@gmail.com> wrote:

> @Taewoo: I tried it and it has the same problem.  Do you have a test
> case for this feature?  Also do we have documentation for this syntax?
>
> Chen
>
> On Sat, Jul 25, 2015 at 10:52 PM, Taewoo Kim <wangsaeu@gmail.com> wrote:
> > The URL is https://asterixdb.ics.uci.edu/documentation/aql/primer.html.
> >
> >
> > It should look like this:
> >
> > ////
> > use dataverse pubs;
> >
> > create type PaperType as open {
> >    id: int32,
> >    authors: string
> > }
> >
> > create dataset Papers(PaperType) primary key id;
> >
> > load dataset Papers using localfs
> > (("path"="127.0.01:///Users/chenli/tmp/asterix-data/papers.csv"),
> >    ("format"="delimited-text"),
> >    ("delimiter"=","));
> >
> > for $paper in dataset('Papers')
> > return $paper;
> >
> >
> >
> > Best,
> > Taewoo
> >
> > On Sat, Jul 25, 2015 at 10:47 PM, Chen Li <chenli@gmail.com> wrote:
> >
> >> @Taewoo: can you send me the syntax or the documentation URL to show the
> >> syntax?
> >>
> >> Chen
> >>
> >> On Sat, Jul 25, 2015 at 3:27 PM, Taewoo Kim <wangsaeu@gmail.com> wrote:
> >> > Can you try to load it into an internal dataset? I think I implemented
> >> > support for a comma (the delimiter) inside a quoted field when modifying
> >> > the delimited data parser. Chris also modified that part. If it doesn't
> >> > work, I can look at the issue.
> >> >
> >> > Best,
> >> > Taewoo
> >> >
> >> > On Sat, Jul 25, 2015 at 1:51 PM, Chen Li <chenli@gmail.com> wrote:
> >> >
> >> >> Not sure if this topic was discussed before.  I was trying to load an
> >> >> external CSV file using "," as the delimiter.  But the engine failed to
> >> >> read a file with the following single record:
> >> >>
> >> >> 14, "John Smith, Mary Reeve"
> >> >>
> >> >>
> >> >> use dataverse pubs;
> >> >>
> >> >>    create type PaperType as open {
> >> >>       id: int32,
> >> >>        authors: string
> >> >>    }
> >> >>
> >> >> create external dataset Papers(PaperType)
> >> >>    using localfs
> >> >> (("path"="127.0.01:///Users/chenli/tmp/asterix-data/papers.csv"),
> >> >>    ("format"="delimited-text"),
> >> >>    ("delimiter"=","));
> >> >>
> >> >> for $paper in dataset('Papers')
> >> >> return $paper;
> >> >>
> >> >> The following is the output, which shows that the comma in the authors
> >> >> field was incorrectly used to break the field.  Any idea about how to fix
> >> >> it?
> >> >>
> >> >> Output
> >> >> Results:
> >> >>
> >> >> { "id": 14, "authors": " \"John Smith" }
> >> >>
> >> >> Duration of all jobs: 0.091 sec
> >> >>
> >> >> Success: Query Complete
> >> >>
> >>
>
