asterixdb-dev mailing list archives

From Ian Maxon <ima...@uci.edu>
Subject Re: ADM parser question
Date Thu, 02 Jun 2016 10:17:09 GMT
I think it's just because almost everything is a string. I actually got it
to parse now, so it's "solved". I just had to use a way more complex regex
for fixing the decimal suffix issue (we also output floats in scientific
notation :) )
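
For anyone who hits the same thing: the actual regex I ended up with isn't reproduced in this thread, so the following is only a rough sketch of the idea in Python, not the real fix. It assumes the only suffixes that matter are i32, i64 and d (the ones mentioned below), assumes scientific notation looks like 1.5E-4d, and the name strip_numeric_suffixes is made up for illustration. Since almost everything in the file is a string, it skips over quoted strings so that text inside them is left alone.

```python
import re

# Match either a double-quoted string literal (kept as-is) or a numeric
# literal followed by a type suffix (the suffix is dropped).
# Assumption: only the i32, i64 and d suffixes need handling.
_TOKEN = re.compile(
    r'''
    (?P<string>"(?:\\.|[^"\\])*")                  # quoted string, escapes allowed
    |
    (?<![\w.])                                     # do not start inside a word or number
    (?P<number>-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)   # integer, decimal, or scientific form
    (?P<suffix>i32|i64|d)                          # type suffix to remove
    \b
    ''',
    re.VERBOSE,
)


def strip_numeric_suffixes(line: str) -> str:
    """Return the line with numeric type suffixes removed outside of strings."""
    def repl(match: re.Match) -> str:
        if match.group("string") is not None:
            return match.group("string")  # leave string contents untouched
        return match.group("number")      # keep the number, drop the suffix
    return _TOKEN.sub(repl, line)


if __name__ == "__main__":
    sample = '{ "id": "728286376236593152", "count": 12i32, "lat": 1.5E-4d }'
    print(strip_numeric_suffixes(sample))
```

Doing it in one pass with a "string or number" alternation is what keeps a blanket s/i32//g from chewing on text inside string values; other suffixes could be added to the suffix group if the dump contains them.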

On Wed, Jun 1, 2016 at 10:25 PM, Wail Alkowaileet <wael.y.k@gmail.com>
wrote:

> The file looks weird to me ... why is everything stringified?
>
> On Thu, Jun 2, 2016 at 4:30 AM, Ian Maxon <imaxon@uci.edu> wrote:
>
> > Also, it seems like the line # was wrong somehow, or at least it was not
> > leading to the right part of the file. I was just stepping through the
> > lexer in the debugger, and I saw that it was failing on "id" :
> > "728286376236593152", which is line 310. Line 619 has no problems, nor do
> > any of the lines adjacent to it.
> >
> > On Wed, Jun 1, 2016 at 6:21 PM, Ian Maxon <imaxon@uci.edu> wrote:
> >
> > > Aha! Think I found it. The regular expression for the decimal replacement
> > > was just deficient in the case that the field was something besides 0.0 :)
> > > It should be [0-9]*.
> > >
> > > On Wed, Jun 1, 2016 at 12:30 PM, Ian Maxon <imaxon@uci.edu> wrote:
> > >
> > >> I just did something as minimal/open as possible, like:
> > >>
> > >> "create type Tweet as open {id: string}"
> > >>
> > >> I'm not actually sure what the original type was.
> > >>
> > >> On Wed, Jun 1, 2016 at 12:21 PM, abdullah alamoudi <bamousaa@gmail.com>
> > >> wrote:
> > >>
> > >>> Ian,
> > >>> Can you share the data type? I am trying to reproduce this.
> > >>>
> > >>> ~Abdullah.
> > >>>
> > >>> On Wed, Jun 1, 2016 at 8:49 AM, Ian Maxon <imaxon@uci.edu> wrote:
> > >>>
> > >>> > Oh, I forgot the list strips attachments. Here's the snippet of the
> > >>> > data that's being troublesome:
> > >>> >
> > >>> > https://drive.google.com/file/d/0B9fobkjZFASiRXAybS1BUXZvR1V6akE3VlhGTkVFU2ZkYzlB/view?usp=sharing
> > >>> >
> > >>> > On Tue, May 31, 2016 at 10:36 PM, Mike Carey <dtabass@gmail.com> wrote:
> > >>> >
> > >>> > > We desperately need to make roundtripping work!!
> > >>> > >
> > >>> > >
> > >>> > >
> > >>> > > On 5/31/16 7:52 PM, Ian Maxon wrote:
> > >>> > >
> > >>> > >> Hi all,
> > >>> > >>
> > >>> > >> I have a question about something I am trying to coax the ADM parser into
> > >>> > >> accepting. I have a file that I dumped from the SDSC testbed that has a
> > >>> > >> bunch of tweets in it, just using curl and a dataset scan. The issue is
> > >>> > >> that currently this doesn't work round-trip. However, in this case the
> > >>> > >> modifications don't seem like they should be terribly severe, so I just
> > >>> > >> tried my hand at using sed to fix it. The two things I think should
> > >>> > >> make this hack work are: replacing the i32/i64 suffixes (so just s/i32//g)
> > >>> > >> and removing decimal suffixes (s/\([0-9]\.[0-9]\)d/\1/g). This gives
> > >>> > >> output that, to me, seems like it is "correct". But the parser is still
> > >>> > >> complaining and I don't understand why. It fails at line 619, column 228.
> > >>> > >> The tweet on that line, and the one above it, work fine if I just use an
> > >>> > >> insert statement.
> > >>> > >>
> > >>> > >> Does anyone have any thoughts as to maybe what's causing it to not take
> > >>> > >> this input? I'm hoping it's just something silly I am too tired to see...
> > >>> > >> Thanks in advance for any thoughts/suggestions.
> > >>> > >>
> > >>> > >> -Ian
> > >>> > >>
> > >>> > >
> > >>> > >
> > >>> >
> > >>>
> > >>
> > >>
> > >
> >
>
>
>
> --
>
> *Regards,*
> Wail Alkowaileet
>
