commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From sebb <seb...@gmail.com>
Subject Re: [csv] CSVFormat API names
Date Tue, 16 Oct 2012 15:33:01 GMT
On 16 October 2012 16:29, Gary Gregory <garydgregory@gmail.com> wrote:
> On Tue, Oct 16, 2012 at 11:04 AM, Matt Benson <gudnabrsam@gmail.com> wrote:
>
>> Random thoughts--no real context here, so no way to inline:
>>
>> - "line separator" concept, while harmonizing with the line.separator
>> system property, might be better represented as "row separator" so as
>> not to imply that the parameter should be in any way limited to \r or
>> \n .  I would think the default for this would be the line.separator
>> property, however, and thus should take a String or CharSequence
>> (perhaps it already does, but there's been so much talk about char
>> parameters...).
>>
>
> Now that you mention it, this should have been obvious as soon as we wrote
> the test cases where a record is split over more than one line.
>
> There is a difference between line number and record number which the API
> tracks.
>
> I propose to change "line separator" to "record separator". The default can
> be line.separator.

OK.
I prefer record to row.

> Gary
>
>
>>
>> - with* methods:  just something to think about here, but while we're
>> creating a fluent API, would e.g. #delimitedBy('\t') read more
>> fluently than #withDelimiter('\t') ?  #escapingWith('\\') vs.
>> #withEscape('\\') ?
>>
>> $0.02,
>> Matt
>>
>> On Tue, Oct 16, 2012 at 8:53 AM, Jörg Schaible
>> <Joerg.Schaible@scalaris.com> wrote:
>> > Gary Gregory wrote:
>> >
>> >> On Tue, Oct 16, 2012 at 9:14 AM, Jörg Schaible
>> >> <Joerg.Schaible@scalaris.com>wrote:
>> >>
>> >>> Hi Gary,
>> >>>
>> >>> Gary Gregory wrote:
>> >>>
>> >>> > Hi All:
>> >>> >
>> >>> > The format object can configure various aspects of input and output
>> >>> > formatting.
>> >>> >
>> >>> > With my recent addition of the Quote enum for [CSV-53], there are
now
>> >>> > two aspects of quoting to configure: the quote character and the
>> quote
>> >>> > policy (minimal, all, non-numeric, and none.) FYI, 'none' is
>> currently
>> >>> > not implemented.
>> >>> >
>> >>> > First, I changed (without consulting this list, and please accept
my
>> >>> > apologies for this) the - IMO - cryptic and burdensome terminology
of
>> >>> > "encapsulator" to "quote char", and added "quote policy":
>> >>> >
>> >>> > - withQuoteChar(char)
>> >>> > - withQuotePolicy(Quote)
>> >>> >
>> >>> > My intention here is that all Quote APIs start with "withQuote"
>> >>> > followed by what aspect of quoting is being configured.
>> >>> >
>> >>> > Alternatively, we could have:
>> >>> >
>> >>> > - withQuote(char)
>> >>> > - withQuotePolicy(Quote)
>> >>>
>> >>> or
>> >>>
>> >>> - withQuote(char)
>> >>> - withQuote(Quote)
>> >>>
>> >>> ;-)
>> >>>
>> >>
>> >> Darn, I wish I knew you better to know if you were joking! :)
>> >>
>> >> This would not be good IMO because you are configuring two different
>> >> aspects of the behavior. When I see the same API name with different
>> >> parameters, I think that they are the same and that the API just does
>> >> conversions.
>> >>
>> >> We could consider making Quote a class instead of an enum and have it
>> >> carry a char and an enum, such that one object defines all quoting
>> >> aspects. This might be too normalized a design for something so simple
>> >> though.
>> >
>> > Actually I did not had a closer look to the API. You're definitely right
>> to
>> > use different names for different aspects. It does not make sense to
>> > overload just for fun.
>> >
>> >>
>> >>
>> >>>
>> >>> > Which makes the API more consistent with the other char/Character
>> based
>> >>> > properties:
>> >>> >
>> >>> > - withEscape
>> >>> > - withDelimiter
>> >>> > - withLineSeparator
>> >>> > - withCommentStart
>> >>> >
>> >>> > none of the above are post-fixed with a "Char" in the name.
>> >>> >
>> >>> > As far as reading, for me, the "-r" names are OK because the they
are
>> >>> > nouns (things): "a delimiter", "a line separator." But I do not
talk
>> >>> about
>> >>> > "an escape" because that would be an act (think Alcatraz) as opposed
>> to
>> >>> > what we have here: a character used to /perform/ escapes.
>> >>> >
>> >>> > So I propose to change "escape" to "escape char" because "escaper"
is
>> >>> > not a word.
>> >>> >
>> >>> > The name "comment start" is not great also because it implies (to
me)
>> >>> that
>> >>> > there is a "comment end" missing. So plain "comment" or "comment
>> char"
>> >>> > would be better.
>> >>>
>> >>> Who said it has to be a single char?
>> >>>
>> >>
>> >> The current implementation does. ;)
>> >>
>> >> Are comments even in any RFC?
>> >
>> > Not that I am aware of.
>> >
>> >>> .withEOLComment("//")
>> >>>
>> >>>
>> >>> Same applies to the line separator:
>> >>>
>> >>> .withLineSeparator("\n\r")
>> >>>
>> >>> > Circling back to "quote char" which I have the way it is now for
the
>> >>> > same reason as for the "escape" property.
>> >>> >
>> >>> > In summary, using *Char names is better IMO.
>> >>>
>> >>> Only if it can be a single char only. If it can either be a single char
>> >>> or a
>> >>> String, I normally tend to use overloaded methods:
>> >>>
>> >>> - withEOLComment(char)
>> >>> - withEOLComment(CharSequence)
>> >>>
>> >>
>> >> If you want to add // to the mix, please start a different thread. I'm
>> not
>> >> sure this is really needed. Do you have a real life use case?
>> >
>> > People come up with all kind of "solutions" they are used to. CSV is
>> brittle
>> > anyway, just because there is no "real" standard.
>> >
>> > Cheers,
>> > Jörg
>> >
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> > For additional commands, e-mail: dev-help@commons.apache.org
>> >
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> For additional commands, e-mail: dev-help@commons.apache.org
>>
>>
>
>
> --
> E-Mail: garydgregory@gmail.com | ggregory@apache.org
> JUnit in Action, 2nd Ed: <http://goog_1249600977>http://bit.ly/ECvg0
> Spring Batch in Action: <http://s.apache.org/HOq>http://bit.ly/bqpbCK
> Blog: http://garygregory.wordpress.com
> Home: http://garygregory.com/
> Tweet! http://twitter.com/GaryGregory

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Mime
View raw message