commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Benedikt Ritter <brit...@apache.org>
Subject Re: [CSV] Prohibit creation of more than one iterator over a CSVParser?
Date Fri, 20 Sep 2013 12:57:59 GMT
Hi,

sorry for the late reply. This is still an unresolved issue. What do we
want to do about it?

I like Emmanuels idea of having only one entry point into the API. Exposing
a parser that uses formats to parse and at the same time having a format
that can create parsers to parse input may be confusing.

However I'm not sure if the single entry point should be the format.
Following SRP I'd say the responsibility of formats is to describe the
structure of CSV data, while the responsibility of a parser is to parse
data with a given format. As I said a format that can parse something feels
strange to me.

Going one step back: We started with the problem that users could call
iterator() more than once and would have two iterator instances that eat up
each others data. This will always be the problem if we return
Iterable<CSVRecord> anywhere. I would expect that each call to iterator()
returns a new iterator instance that starts a the beginning of the set of
elements. I had a look at the JavaDoc [1] of Iterable and was surprised the
see that there is no such guarantee. So maybe creating just one iterator
and returning it on subsequent calls is the solution we should go with?
I think we agree that returning new instances that affect each other is the
worst option.

Benedikt

[1] http://docs.oracle.com/javase/6/docs/api/java/lang/Iterable.html


2013/8/14 Paul Benedict <pbenedict@apache.org>

> Emmanuel, yes. Both the CVSParser and Iterator are two distinct designs.
> Both are types of parsers here. So it's really a question of design. Do you
> want your users to use the parser directly or work through the iterator or
> both? But they shouldn't interfere with each other.
>
> Example:
> CVSParser p = record.parser();
> Iterator i = record.iterator();
>
> In the second example, Iterator is probably hiding the CVSParser
> underneath. This render's the original question moot. You can have as many
> concurrent parsers as you want with proper encapsulation.
>
> Paul
>
>
> On Wed, Aug 14, 2013 at 4:05 PM, Emmanuel Bourg <ebourg@apache.org> wrote:
>
> > Le 14/08/2013 20:22, Benedikt Ritter a écrit :
> >
> >> Thanks for the input. Now is the time to talk about this kind of stuff.
> >> I understand Matt's proposal and it should be relatively easy to
> >> implement.
> >>
> >> However I see Paul's point but don't know yet how to develop the API the
> >> way he suggests.
> >> Paul can your point be summed up as: Either expose an iterator or a
> parser
> >> but not both?
> >>
> >> What do others think? Sebb? Gary? Emmanuel?
> >> This seems like one of the last issues on the road to 1.0
> >>
> >
> > Throwing an IllegalArgumentException if iterator() is called more than
> > once would make sense.
> >
> > Following Paul's idea, we could probably hide the parser completely and
> > make it package private. The main entry point of the API would be
> > CSVFormat.parse(Reader) returning an Iterable<CSVRecord>.
> >
> > The public methods of CSVParser aren't that useful:
> >
> > * close()/isClosed() : resource management can be done outside [csv]
> >
> > * getCurrentLineNumber() : to be moved into CSVRecord?
> >
> > * getHeaderMap() : removed or moved into CSVRecord?
> >
> > * getRecordNumber() : already available in CSVRecord
> >
> > * getRecords() : to be removed. I guess [collections] or [lang] already
> > has a method turning an Iterable into a List?
> >
> >
> > Emmanuel Bourg
> >
> > ------------------------------**------------------------------**---------
> > To unsubscribe, e-mail: dev-unsubscribe@commons.**apache.org<
> dev-unsubscribe@commons.apache.org>
> > For additional commands, e-mail: dev-help@commons.apache.org
> >
> >
>
>
> --
> Cheers,
> Paul
>



-- 
http://people.apache.org/~britter/
http://www.systemoutprintln.de/
http://twitter.com/BenediktRitter
http://github.com/britter

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message