fleece-dev mailing list archives

From Romain Manni-Bucau <rmannibu...@gmail.com>
Subject Re: JsonLocation.getStreamOffset() return value unclear
Date Thu, 24 Jul 2014 12:02:02 GMT
Yes.

But users and frameworks will look for the same thing I guess, at least in the short term.
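
For illustration, a minimal sketch of the default discussed in this thread: line and column start at 1, and the offset is simply the number of chars read so far. LocationTracker and its methods are hypothetical names, not Fleece or RI code. The sketch assumes '\n' is the only line terminator and counts UTF-16 chars as a Reader delivers them, so a surrogate pair counts as two.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical helper (not Fleece or RI code): tracks line, column and char offset. */
final class LocationTracker {
    private long line = 1;    // first line is 1, as in text editors
    private long column = 1;  // column of the next char to read; leftmost column is 1
    private long offset = 0;  // number of chars consumed so far

    /** Call once for every char consumed from the input. */
    void consume(char c) {
        offset++;
        if (c == '\n') {      // assumption: '\n' alone ends a line
            line++;
            column = 1;
        } else {
            column++;
        }
    }

    long line() { return line; }
    long column() { return column; }
    long offset() { return offset; }

    /** Reads at most count chars from the reader and returns the position reached. */
    static LocationTracker advance(Reader in, long count) {
        LocationTracker t = new LocationTracker();
        try {
            int c;
            while (t.offset < count && (c = in.read()) != -1) {
                t.consume((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return t;
    }

    public static void main(String[] args) {
        // The third line is missing a ':' after "b"; suppose the parser stops after 18 chars.
        String json = "{\n  \"a\": 1,\n  \"b\" 2\n}";
        LocationTracker t = advance(new StringReader(json), 18);
        // prints: line 3, column 7, offset 18
        System.out.println("line " + t.line() + ", column " + t.column() + ", offset " + t.offset());
    }
}
```

With these conventions the reported line and column can be typed straight into an editor's "go to" dialog, and the offset equals the sum of the chars on all previous lines (newlines included) plus the chars already read on the current line.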



Romain Manni-Bucau
Twitter: @rmannibucau
Blog: http://rmannibucau.wordpress.com/
LinkedIn: http://fr.linkedin.com/in/rmannibucau
Github: https://github.com/rmannibucau


2014-07-24 13:56 GMT+02:00 Hendrik Dev <hendrikdev22@gmail.com>:

> Yes, let's make this the default.
>
> But besides exception reporting there is another use case for the location:
>
> "Provides the location information of a JSON event in an input source.
> The JsonLocation information can be used to identify incorrect JSON or
> can be used by higher frameworks to know about the processing location."
>
> So maybe high-level processing code will rely on this?
>
> On the other hand, the API says that it's perfectly valid to always
> return -1,-1,-1:
> "All the information provided by a JsonLocation is optional. For
> example, a provider may only report line numbers. Also, there may not
> be any location information for an input source."
> Crazy, isn't it?
>
> Thanks
> Hendrik
>
> On Thu, Jul 24, 2014 at 1:46 PM, Romain Manni-Bucau
> <rmannibucau@gmail.com> wrote:
> > Reviewing quickly, JsonLocation is only useful when there is an exception:
> > "you suck at line 3, column 6, offset 18". So we need to be able to open
> > the file, go there in gedit/notepad++/other and check the syntax error.
> > Otherwise, however clever the counting is, it is really useless.
> >
> > If we need to break this to pass the TCKs we will, but I'm sure we'll keep
> > this as the default, no?
> >
> > 2014-07-24 13:42 GMT+02:00 Hendrik Dev <hendrikdev22@gmail.com>:
> >
> >> On Thu, Jul 24, 2014 at 1:35 PM, Romain Manni-Bucau
> >> <rmannibucau@gmail.com> wrote:
> >> > I think we start with 1. But for the column I don't really care, we can
> >> > align on the RI.
> >> >
> >> > For the offset, I'm not sure what is complicated, but we should ensure the
> >> > offset corresponds to the sum of previously parsed columns for all lines.
> >> > As long as this is consistent, the global system works.
> >>
> >> I agree, but IMHO the API says different things (the column is always in
> >> chars, the offset can be in bytes or chars according to the JSR).
> >>
> >> My proposal is to keep things easy for now until we have the TCK. I will
> >> start the column at 1 (it's common and expected IMHO) and defer the
> >> byte/char counting until the TCK arrives.
> >> The RI also does not count differently for bytes and chars.
> >>
> >> Kind regards
> >> Hendrik
> >>
> >>
> >> >
> >> > 2014-07-24 12:39 GMT+02:00 Hendrik Dev <hendrikdev22@gmail.com>:
> >> >
> >> >> Doing this efficiently is more complicated than I thought. Can we not
> >> >> simply count 2 bytes for each char ;-)
> >> >>
> >> >> BTW, it seems the JsonLocation column value also leaves room for
> >> >> interpretation:
> >> >>
> >> >> Is the leftmost column 0 or 1? Text editors, for example, start with
> >> >> column 1 (there is never a column 0), but the RI starts with 0.
> >> >>
> >> >> Regards
> >> >> Hendrik
> >> >>
> >> >>
> >> >> On Wed, Jul 23, 2014 at 1:49 PM, Hendrik Dev <hendrikdev22@gmail.com>
> >> >> wrote:
> >> >> > agree, will make it so
> >> >> >
> >> >> > On Wed, Jul 23, 2014 at 1:28 PM, Romain Manni-Bucau
> >> >> > <rmannibucau@gmail.com> wrote:
> >> >> >> Hi
> >> >> >>
> >> >> >> I agree the wording is wrong, but IMO it is not ambiguous: we get an
> >> >> >> InputStream or a Reader (and we *don't* want to check whether it is a
> >> >> >> file or not), so we just count the chars or bytes we read. Any other
> >> >> >> implementation would lead to confusion IMO (the goal is to stay
> >> >> >> friendly to a default text file reader).
> >> >> >>
> >> >> >> We can start this way and go further if we have issues, but I really
> >> >> >> doubt we will need to.
> >> >> >>
> >> >> >> What's your opinion?
> >> >> >>
> >> >> >> 2014-07-23 13:21 GMT+02:00 Hendrik Dev <hendrikdev22@gmail.com>:
> >> >> >>
> >> >> >>> Hi,
> >> >> >>>
> >> >> >>> the JSR 353 API says about JsonLocation.getStreamOffset()
> >> >> >>>
> >> >> >>> "long getStreamOffset()
> >> >> >>>
> >> >> >>> Return the stream offset into the input source this location is
> >> >> >>> pointing to. If the input source is a file or a byte stream then this
> >> >> >>> is the byte offset into that stream, but if the input source is a
> >> >> >>> character media then the offset is the character offset. Returns -1 if
> >> >> >>> there is no offset available."
> >> >> >>>
> >> >> >>> There are IMHO two issues here:
> >> >> >>>
> >> >> >>> 1) How can we know that the input source is a file (stream)? We can
> >> >> >>> only know whether the parser reads from an InputStream (= byte stream)
> >> >> >>> or from a Reader (= character stream). The wording here is
> >> >> >>> unclear/ambiguous.
> >> >> >>>
> >> >> >>> 2) Since a UTF-8 or UTF-16 character can map to one, two, three or four
> >> >> >>> bytes, the output can be very confusing (especially if the user doesn't
> >> >> >>> know whether the parser was constructed from a byte or character
> >> >> >>> stream and which charset is used).
> >> >> >>>
> >> >> >>> It seems that the RI does not implement these distinctions; if I read
> >> >> >>> it correctly, it always returns character offsets.
> >> >> >>>
> >> >> >>> So what do we want to do?
> >> >> >>>
> >> >> >>> Thanks
> >> >> >>> Hendrik
> >> >> >>>
> >> >> >>>
> >> >> >>> --
> >> >> >>> Hendrik Saly (salyh, hendrikdev22)
> >> >> >>> @hendrikdev22
> >> >> >>> PGP: 0x22D7F6EC
> >> >> >>>
>
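
To illustrate the byte/char ambiguity raised at the start of the quoted thread, here is a small sketch showing how one character offset maps to different byte offsets depending on the charset. OffsetDemo and byteOffset are made-up names for this example, not part of JSR 353 or Fleece, and the sketch assumes the char offset does not split a surrogate pair.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical demo (not JSR 353 or Fleece code): char offset versus byte offset. */
final class OffsetDemo {

    /**
     * Number of bytes needed to encode the first charOffset chars of text.
     * Assumes charOffset does not fall between the two halves of a surrogate pair.
     */
    static long byteOffset(String text, int charOffset, Charset charset) {
        return text.substring(0, charOffset).getBytes(charset).length;
    }

    public static void main(String[] args) {
        // {"k":"é€"} written with escapes so the source file encoding does not matter.
        String json = "{\"k\":\"\u00e9\u20ac\"}";
        int charOffset = json.indexOf('}'); // 9 chars precede the closing brace

        // UTF-8: 7 ASCII bytes + 2 bytes for é + 3 bytes for € = 12
        System.out.println("UTF-8:    " + byteOffset(json, charOffset, StandardCharsets.UTF_8));
        // UTF-16BE: every char here is in the BMP, so 2 bytes per char = 18
        System.out.println("UTF-16BE: " + byteOffset(json, charOffset, StandardCharsets.UTF_16BE));
    }
}
```

The same position is reported as 9, 12 or 18 depending on what is counted, which is why a user who does not know how the parser was constructed cannot interpret a bare offset. Counting "2 bytes for one char" only matches UTF-16 input without supplementary characters.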
