commons-dev mailing list archives

From Gary Gregory <garydgreg...@gmail.com>
Subject Re: [CSV] Headers and the first record
Date Wed, 31 Jul 2013 15:41:21 GMT
On Wed, Jul 31, 2013 at 10:48 AM, Gary Gregory <garydgregory@gmail.com> wrote:

> On Wed, Jul 31, 2013 at 9:34 AM, Emmanuel Bourg <ebourg@apache.org> wrote:
>
>> On 31/07/2013 15:08, Gary Gregory wrote:
>>
>> > But that is exactly what _was_ happening! ;)
>> >
>> > If I called withHeader("A", "B", "C") the header was not skipped.
>>
>> Sounds good. The header is defined in the code, we don't expect to see
>> the header in the file so nothing is skipped.
>>
>
> NOT good! ;) This is where we disagree. The parser used to behave
> differently depending on the contents of the String[].
> - From an API design standpoint, it's smelly to me.
> - The feature is hard to understand. If we want that, we need two APIs for
> two behaviors.
>
> Using the withHeader API, I can tell the parser one of two things:
> - Ignore the header record in the file; I am overriding it with my own
> names.
> - There is no header record, so I am telling you what the header names are.
>
> These two features clash because in one case the file has a header line
> and in the other the file does not. This is why we need settings with
> different names.
>
> That or a setting that says 'skip the first record, it's the header, I do
> not want to see it as a data record'
>
> I see three scenarios:
>
> 1) I set the headers (the file does not have one), do not skip the first
> record
> 2) I override the existing header record, skip the first record
> 3) The parser reads the header names from the first record, so the first
> record is not returned as a data record
>
> This can be accommodated with a skipHeaderRecord boolean setting.
>
> I do not care what the default behavior is as long as I can say "this file
> has headers, guess them please, and skip record 0" and "this file has a
> header record, but I'm telling you to call them A, B, and C, so skip record
> 0"
>
> 1) withHeader("A", "B", "C").skipHeaderRecord(false);
> 2) withHeader("A", "B", "C").skipHeaderRecord(true);
> 3) withHeader()
>
> Is there a better name for skipHeaderRecord? Maybe:
>
> 1b) withHeader("A", "B", "C").firstRecordIsHeader(false);
> 2b) withHeader("A", "B", "C").firstRecordIsHeader(true);
>
> Here the difference is that the API does not describe behavior, instead it
> describes the data, and behavior is implied.
>
> There is also:
>
> 1c) withHeader("A", "B", "C")
> 2c) withHeaderOverride("A", "B", "C")
>
> Thoughts?
>

I reverted to NOT skipping a record when withHeader is called with a
non-empty array, and added a skipHeaderRecord setting to CSVFormat to use
when headers are initialized.
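
To make the three scenarios concrete, here is a rough sketch of the behavior.
HeaderSketch is a hypothetical stand-in written for this mail, not the
CSVFormat/CSVParser code, and it splits on commas naively with no quoting
support:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the format/parser pair; NOT the Commons CSV classes.
final class HeaderSketch {
    private final String[] header;          // empty array: read names from the first line
    private final boolean skipHeaderRecord; // only matters when names are given explicitly

    private HeaderSketch(String[] header, boolean skipHeaderRecord) {
        this.header = header;
        this.skipHeaderRecord = skipHeaderRecord;
    }

    static HeaderSketch withHeader(String... names) {
        // Default: an explicit header does not skip anything.
        return new HeaderSketch(names.clone(), false);
    }

    HeaderSketch skipHeaderRecord(boolean skip) {
        return new HeaderSketch(header, skip);
    }

    /** Header names in effect for the given input lines. */
    List<String> headerNames(List<String> lines) {
        if (header.length > 0) {
            return Arrays.asList(header);
        }
        return lines.isEmpty() ? Collections.<String>emptyList() : split(lines.get(0));
    }

    /** Data records: the first line is dropped when it is treated as the header. */
    List<List<String>> records(List<String> lines) {
        boolean firstLineIsHeader = header.length == 0 || skipHeaderRecord;
        int start = firstLineIsHeader && !lines.isEmpty() ? 1 : 0;
        List<List<String>> out = new ArrayList<>();
        for (String line : lines.subList(start, lines.size())) {
            out.add(split(line));
        }
        return out;
    }

    // Naive split, assumed good enough for illustration only.
    private static List<String> split(String line) {
        return Arrays.asList(line.split(",", -1));
    }
}
```

Given the two lines "x,y,z" and "1,2,3", scenario 1 returns both lines as data
records, while scenarios 2 and 3 return only the second line.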

Gary


>
> Gary
>
>
>>
>> > If I called withHeader(new String[]{}) the header was skipped.
>>
>> Correct. The header is not defined in the code, the parser uses the
>> first record as header and doesn't return it when iterating.
>>
>> > If I called withHeader() the header was skipped (same as line above).
>>
>> Sounds good too.
>>
>>
>> What was the issue again? ;)
>>
>>
>> > What I am asking is: should we have a saveHeader setting such that IF you
>> > ask for headers, then we save that record in the parser? It is currently
>> > "lost", or rather transformed into the header map.
>>
>> Keeping the header around might be useful, I wouldn't create a format
>> parameter for this though. It could be made available at the record
>> level, much like ResultSet.getMetaData().
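
For what it's worth, the record-level idea could look roughly like the sketch
below. RecordSketch, headerMapOf, and getHeaderMap are hypothetical names
invented for illustration, not an existing API; the only assumption is that
every record from one parse shares the same name-to-index map:

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical record type; names are illustrative, not the Commons CSV API.
final class RecordSketch {
    private final Map<String, Integer> headerMap; // shared by all records of one parse
    private final List<String> values;

    RecordSketch(Map<String, Integer> headerMap, List<String> values) {
        this.headerMap = headerMap;
        this.values = values;
    }

    /** Builds an ordered, read-only map from column name to column index. */
    static Map<String, Integer> headerMapOf(String... names) {
        Map<String, Integer> map = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            map.put(names[i], i);
        }
        return Collections.unmodifiableMap(map);
    }

    /** Metadata access at the record level, in the spirit of ResultSet.getMetaData(). */
    Map<String, Integer> getHeaderMap() {
        return headerMap;
    }

    /** Looks a value up by column name. */
    String get(String name) {
        Integer index = headerMap.get(name);
        if (index == null) {
            throw new IllegalArgumentException("No column named " + name);
        }
        return values.get(index);
    }
}
```

This keeps the header reachable from each record without adding a format
parameter.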
>>
>> Emmanuel Bourg
>>
>>
>>
>
>
> --
> E-Mail: garydgregory@gmail.com | ggregory@apache.org
> Java Persistence with Hibernate, Second Edition<http://www.manning.com/bauer3/>
> JUnit in Action, Second Edition <http://www.manning.com/tahchiev/>
> Spring Batch in Action <http://www.manning.com/templier/>
> Blog: http://garygregory.wordpress.com
> Home: http://garygregory.com/
> Tweet! http://twitter.com/GaryGregory
>



