commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Gilles <gil...@harfang.homelinux.org>
Subject Re: [CSV] CSVMutableRecord
Date Thu, 17 Aug 2017 00:28:40 GMT
On Wed, 16 Aug 2017 11:27:53 -0600, Gary Gregory wrote:
> Let's summarize the options:
>
> 0) do nothing
> 1) Add two put methods to CVSRecord making the class mutable
> 2) Add a "mutableRecord" boolean option to CVSRecord and CSVFormat 
> such
> that a new boolean in CVSRecord allow method from 1) above to either 
> work
> or throw an exception.
> 3) Add a "mutableRecord" boolean option to CVSRecord and CSVFormat 
> such
> that subclass of CVSRecord called CVSMutableRecord is created which
> contains two new put methods
>
> What else?

4) The factory method:
  /**
   * @param orig Original to be copied.
   * @param replace Fields to be replaced.
   * @return a copy of "orig", except for the fields in "replace".
   */
  public static CSVRecord createRecord(CSVRecord orig,
                                       Pair<Integer, String> ... 
replace)

Could also be:
  public static CSVRecord createRecord(CSVRecord orig,
                                       int[] replaceIndices,
                                       String[] replaceValues)

Gilles

>
> I like the simplest option: 1.
>
> Gary
>
> On Tue, Aug 15, 2017 at 6:01 PM, Gilles 
> <gilles@harfang.homelinux.org>
> wrote:
>
>> On Tue, 15 Aug 2017 17:43:26 -0600, Gary Gregory wrote:
>>
>>> On Tue, Aug 15, 2017 at 5:32 PM, Gilles 
>>> <gilles@harfang.homelinux.org>
>>> wrote:
>>>
>>> On Tue, 15 Aug 2017 22:52:32 +0000, nitin mahendru wrote:
>>>>
>>>> On Tue, 15 Aug 2017 12:02:20 -0600, Gary Gregory wrote:
>>>>>
>>>>> On Tue, Aug 15, 2017 at 10:38 AM, nitin mahendru
>>>>>> <nitin.mahendru88@gmail.com
>>>>>>
>>>>>> wrote:
>>>>>>>
>>>>>>>
>>>>>> How about having a state in the class itself which says that 
>>>>>> it's
>>>>>>
>>>>>>> mutable
>>>>>>> or not.
>>>>>>> If we call a setter on an immutable then it throws an 
>>>>>>> exception.
>>>>>>> By default the records are immutable and you need to make them
>>>>>>> mutable
>>>>>>> using a new API.
>>>>>>>
>>>>>>>
>>>>>> A code example would be useful...
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Below is the pull request I added.
>>>>> https://github.com/apache/commons-csv/pull/21
>>>>>
>>>>>
>>>> As I indicated in the previous message, this is functionally
>>>> breaking. [I'm diverting this discussion over to the "dev"
>>>> mailing list.]
>>>>
>>>>
>>> Saying that making record mutable is "breaking" is a bit unfair 
>>> when we do
>>> NOT document the mutability of the class in the first place.
>>>
>>
>> I'm stating a fact: class is currently immutable, change
>> would make it mutable; it is functionally breaking.
>> I didn't say that you are forbidden to do it; just that
>> it would be unwise, particularly if it would be to save
>> a few bytes.
>>
>> Gilles
>>
>>
>>
>>> Gary
>>>
>>>
>>>
>>>> The following should be an interesting read:
>>>>   http://markmail.org/message/6ytvmxvy2ndsfp7h
>>>>
>>>>
>>>> Regards,
>>>> Gilles
>>>>
>>>>
>>>>
>>>>
>>>> On Tue, Aug 15, 2017 at 11:17 AM Gilles 
>>>> <gilles@harfang.homelinux.org>
>>>>> wrote:
>>>>>
>>>>> On Tue, 15 Aug 2017 12:02:20 -0600, Gary Gregory wrote:
>>>>>
>>>>>> > On Tue, Aug 15, 2017 at 10:38 AM, nitin mahendru
>>>>>> > <nitin.mahendru88@gmail.com
>>>>>> >> wrote:
>>>>>> >
>>>>>> >> How about having a state in the class itself which says
that 
>>>>>> it's
>>>>>> >> mutable
>>>>>> >> or not.
>>>>>> >> If we call a setter on an immutable then it throws an 
>>>>>> exception.
>>>>>> >> By default the records are immutable and you need to make

>>>>>> them
>>>>>> >> mutable
>>>>>> >> using a new API.
>>>>>>
>>>>>> A code example would be useful...
>>>>>>
>>>>>> >> pros: Saves memory, Keeps the immutability benefits
>>>>>>
>>>>>> What kind of usage are you considering that a single transient
>>>>>> record matters (as compared to the ~300 MB of the JVM 
>>>>>> itself...)?
>>>>>>
>>>>>> >> cons: people using "mutable" records need to be 
>>>>>> careful.(While
>>>>>> >> threading
>>>>>> >> maybe)
>>>>>> >>
>>>>>> >
>>>>>> > Interesting idea!
>>>>>> >
>>>>>> > But I think I like the idea of a subclass better if we are 
>>>>>> going to
>>>>>> > split
>>>>>> > the behavior b/w mutable and immutable.
>>>>>>
>>>>>> Once you have a subclass that is able to modify the state of
>>>>>> its parent, it's a mutable object. Period.
>>>>>> There is no such thing as a "split".
>>>>>>
>>>>>> >
>>>>>> > For my money and the KISS principle, I would just add the put

>>>>>> method
>>>>>> > in
>>>>>> > CSVRecord.
>>>>>>
>>>>>> Then, any use that assumes immutability will be broken.
>>>>>>
>>>>>>
>>>>>> Gilles
>>>>>>
>>>>>>
>>>>>> > Gary
>>>>>> >
>>>>>> >>
>>>>>> >> -Nitin
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> On Tue, Aug 15, 2017 at 9:01 AM Gilles
>>>>>> >> <gilles@harfang.homelinux.org>
>>>>>> >> wrote:
>>>>>> >>
>>>>>> >> > On Tue, 15 Aug 2017 09:49:04 -0600, Gary Gregory wrote:
>>>>>> >> > > That looks odd to me. What comes up for me is
the use 
>>>>>> case where
>>>>>> >> I
>>>>>> >> > > want to
>>>>>> >> > > ETL a file of 10,000,000 records and update, say,
one 
>>>>>> column. If
>>>>>> >> am
>>>>>> >> > > forced
>>>>>> >> > > to create a brand new record for every record
read, that 
>>>>>> would
>>>>>> >> be a
>>>>>> >> > > shame.
>>>>>> >> >
>>>>>> >> > Why?
>>>>>> >> >
>>>>>> >> > > If I had a mutable record, I could just keep on
updating 
>>>>>> it and
>>>>>> >> using
>>>>>> >> > > it to
>>>>>> >> > > write each row. Read record, update it, write
record. No 
>>>>>> extra
>>>>>> >> memory
>>>>>> >> > > needed.
>>>>>> >> >
>>>>>> >> > How is the size of 1 additional record going to matter

>>>>>> compared to
>>>>>> >> the
>>>>>> >> > size of the whole program?
>>>>>> >> >
>>>>>> >> > > Either we can make the current record mutable
(what's the 
>>>>>> harm?)
>>>>>> >> or
>>>>>> >> > > we can
>>>>>> >> > > make the parser serve out mutable records based
on a 
>>>>>> config
>>>>>> >> setting.
>>>>>> >> > > This
>>>>>> >> > > could be a subclass of CSVRecord with the extra
method I
>>>>>> >> proposed.
>>>>>> >> >
>>>>>> >> > The harm is that you loose all the promises of 
>>>>>> immutability.
>>>>>> >> >
>>>>>> >> > Regards,
>>>>>> >> > Gilles
>>>>>> >> >
>>>>>> >> > >
>>>>>> >> > > Thoughts?
>>>>>> >> > >
>>>>>> >> > > Gary
>>>>>> >> > >
>>>>>> >> > > On Tue, Aug 15, 2017 at 8:33 AM, Gilles
>>>>>> >> > > <gilles@harfang.homelinux.org>
>>>>>> >> > > wrote:
>>>>>> >> > >
>>>>>> >> > >> On Tue, 15 Aug 2017 08:01:53 -0600, Gary Gregory
wrote:
>>>>>> >> > >>
>>>>>> >> > >>> How does that work when you want to change
more than 
>>>>>> one
>>>>>> >> value?
>>>>>> >> > >>>
>>>>>> >> > >>
>>>>>> >> > >> How about a "vararg" argument:
>>>>>> >> > >>
>>>>>> >> > >> /**
>>>>>> >> > >>  * @param orig Original to be copied.
>>>>>> >> > >>  * @param replace Fields to be replaced.
>>>>>> >> > >>  */
>>>>>> >> > >> public static CSVRecord createRecord(CSVRecord
orig,
>>>>>> >> > >>                                      Pair<Integer,

>>>>>> String> ...
>>>>>> >> > >> replace) {
>>>>>> >> > >>     // ...
>>>>>> >> > >> }
>>>>>> >> > >>
>>>>>> >> > >>
>>>>>> >> > >> Gilles
>>>>>> >> > >>
>>>>>> >> > >>
>>>>>> >> > >>
>>>>>> >> > >>> Gary
>>>>>> >> > >>>
>>>>>> >> > >>> On Aug 15, 2017 00:17, "Benedikt Ritter"

>>>>>> <britter@apache.org>
>>>>>> >> > >>> wrote:
>>>>>> >> > >>>
>>>>>> >> > >>> Hi,
>>>>>> >> > >>>>
>>>>>> >> > >>>> I very much like that CSVRecord is
unmodifiable. So 
>>>>>> I’d
>>>>>> >> suggest an
>>>>>> >> > >>>> API,
>>>>>> >> > >>>> that creates a new record instead
of mutating the 
>>>>>> existing
>>>>>> >> one:
>>>>>> >> > >>>>
>>>>>> >> > >>>> CSVRecord newRecord = myRecord.put(1,
„value")
>>>>>> >> > >>>>
>>>>>> >> > >>>> I’m not sure about „put“ as
a method name since it 
>>>>>> clashes
>>>>>> >> with
>>>>>> >> > >>>> java.util.Map#put, which is mutation
based...
>>>>>> >> > >>>>
>>>>>> >> > >>>> Regards,
>>>>>> >> > >>>> Benedikt
>>>>>> >> > >>>>
>>>>>> >> > >>>> > Am 15.08.2017 um 02:54 schrieb
Gary Gregory
>>>>>> >> > >>>> <garydgregory@gmail.com>:
>>>>>> >> > >>>> >
>>>>>> >> > >>>> > Feel free to provide a PR on
GitHub :-)
>>>>>> >> > >>>> >
>>>>>> >> > >>>> > Gary
>>>>>> >> > >>>> >
>>>>>> >> > >>>> > On Aug 14, 2017 15:29, "Gary
Gregory"
>>>>>> >> <garydgregory@gmail.com>
>>>>>> >> > >>>> wrote:
>>>>>> >> > >>>> >
>>>>>> >> > >>>> >> I think we've kept the design
as YAGNI as 
>>>>>> possible... :-)
>>>>>> >> > >>>> >>
>>>>>> >> > >>>> >> Gary
>>>>>> >> > >>>> >>
>>>>>> >> > >>>> >> On Mon, Aug 14, 2017 at 3:25
PM, nitin mahendru <
>>>>>> >> > >>>> >> nitin.mahendru88@gmail.com>
wrote:
>>>>>> >> > >>>> >>
>>>>>> >> > >>>> >>> Yeah that also is OK.
I though there is a reason 
>>>>>> to keep
>>>>>> >> the
>>>>>> >> > >>>> CSVRecord
>>>>>> >> > >>>> >>> without setters. But
maybe not!
>>>>>> >> > >>>> >>>
>>>>>> >> > >>>> >>> Nitin
>>>>>> >> > >>>> >>>
>>>>>> >> > >>>> >>>
>>>>>> >> > >>>> >>>
>>>>>> >> > >>>> >>>
>>>>>> >> > >>>> >>> On Mon, Aug 14, 2017
at 2:22 PM Gary Gregory
>>>>>> >> > >>>> <garydgregory@gmail.com
>>>>>> >> > >>>> >
>>>>>> >> > >>>> >>> wrote:
>>>>>> >> > >>>> >>>
>>>>>> >> > >>>> >>>> Hi All:
>>>>>> >> > >>>> >>>>
>>>>>> >> > >>>> >>>> Should we consider
adding put(int,Object) and
>>>>>> >> put(String,
>>>>>> >> > >>>> Object) to
>>>>>> >> > >>>> the
>>>>>> >> > >>>> >>>> current CSVRecord
class?
>>>>>> >> > >>>> >>>>
>>>>>> >> > >>>> >>>> Gary
>>>>>> >> > >>>> >>>>
>>>>>> >> > >>>> >>>> On Mon, Aug 14, 2017
at 2:54 PM, nitin mahendru <
>>>>>> >> > >>>> >>>> nitin.mahendru88@gmail.com>
>>>>>> >> > >>>> >>>> wrote:
>>>>>> >> > >>>> >>>>
>>>>>> >> > >>>> >>>>> Hi Everyone,
>>>>>> >> > >>>> >>>>>
>>>>>> >> > >>>> >>>>> I recently pushed
a change(pull request 20) to 
>>>>>> get the
>>>>>> >> line
>>>>>> >> > >>>> ending
>>>>>> >> > >>>> >>> from
>>>>>> >> > >>>> >>>> the
>>>>>> >> > >>>> >>>>> parser.
>>>>>> >> > >>>> >>>>>
>>>>>> >> > >>>> >>>>> Now I want to
push another change which I feel 
>>>>>> will
>>>>>> >> also be
>>>>>> >> > >>>> useful
>>>>>> >> > >>>> for
>>>>>> >> > >>>> >>>> the
>>>>>> >> > >>>> >>>>> community. I
want to add a CSVRecordMutable 
>>>>>> class which
>>>>>> >> had
>>>>>> >> > >>>> a
>>>>>> >> > >>>> >>> constructor
>>>>>> >> > >>>> >>>>> which accepts
a CSVRecord object. So when we 
>>>>>> have a
>>>>>> >> > >>>> CSVRecordMutable
>>>>>> >> > >>>> >>>> object
>>>>>> >> > >>>> >>>>> from it then
we can edit individual columns 
>>>>>> using it.
>>>>>> >> > >>>> >>>>>
>>>>>> >> > >>>> >>>>> I would be using
this to write back my edited 
>>>>>> CSV file.
>>>>>> >> My
>>>>>> >> > >>>> use case
>>>>>> >> > >>>> >>> is to
>>>>>> >> > >>>> >>>>> read a csv, mangle
some columns, write back a 
>>>>>> new csv.
>>>>>> >> > >>>> >>>>>
>>>>>> >> > >>>> >>>>> I could have
directly raised a pull request but 
>>>>>> I just
>>>>>> >> > >>>> wanted to
>>>>>> >> > >>>> float
>>>>>> >> > >>>> >>>> the
>>>>>> >> > >>>> >>>>> idea before and
see the reaction.
>>>>>> >> > >>>> >>>>>
>>>>>> >> > >>>> >>>>> Thanks
>>>>>> >> > >>>> >>>>>
>>>>>> >> > >>>> >>>>> Nitin
>>>>>> >> > >>>> >>>>>
>>>>>> >> > >>>> >>>>
>>>>>> >> > >>>> >>>
>>>>>> >> > >>>> >>
>>>>>> >> > >>>> >>
>>>>>> >> > >>>>
>>>>>> >> > >>>>
>>>>>> >> > >>>>
>>>>>> >> > >>
>>>>>>
>>>>>>
>>


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Mime
View raw message