commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Gary Gregory <garydgreg...@gmail.com>
Subject Re: [CSV] CSVMutableRecord
Date Wed, 16 Aug 2017 17:27:53 GMT
Let's summarize the options:

0) do nothing
1) Add two put methods to CVSRecord making the class mutable
2) Add a "mutableRecord" boolean option to CVSRecord and CSVFormat such
that a new boolean in CVSRecord allow method from 1) above to either work
or throw an exception.
3) Add a "mutableRecord" boolean option to CVSRecord and CSVFormat such
that subclass of CVSRecord called CVSMutableRecord is created which
contains two new put methods

What else?

I like the simplest option: 1.

Gary

On Tue, Aug 15, 2017 at 6:01 PM, Gilles <gilles@harfang.homelinux.org>
wrote:

> On Tue, 15 Aug 2017 17:43:26 -0600, Gary Gregory wrote:
>
>> On Tue, Aug 15, 2017 at 5:32 PM, Gilles <gilles@harfang.homelinux.org>
>> wrote:
>>
>> On Tue, 15 Aug 2017 22:52:32 +0000, nitin mahendru wrote:
>>>
>>> On Tue, 15 Aug 2017 12:02:20 -0600, Gary Gregory wrote:
>>>>
>>>> On Tue, Aug 15, 2017 at 10:38 AM, nitin mahendru
>>>>> <nitin.mahendru88@gmail.com
>>>>>
>>>>> wrote:
>>>>>>
>>>>>>
>>>>> How about having a state in the class itself which says that it's
>>>>>
>>>>>> mutable
>>>>>> or not.
>>>>>> If we call a setter on an immutable then it throws an exception.
>>>>>> By default the records are immutable and you need to make them
>>>>>> mutable
>>>>>> using a new API.
>>>>>>
>>>>>>
>>>>> A code example would be useful...
>>>>
>>>>
>>>>
>>>>
>>>> Below is the pull request I added.
>>>> https://github.com/apache/commons-csv/pull/21
>>>>
>>>>
>>> As I indicated in the previous message, this is functionally
>>> breaking. [I'm diverting this discussion over to the "dev"
>>> mailing list.]
>>>
>>>
>> Saying that making record mutable is "breaking" is a bit unfair when we do
>> NOT document the mutability of the class in the first place.
>>
>
> I'm stating a fact: class is currently immutable, change
> would make it mutable; it is functionally breaking.
> I didn't say that you are forbidden to do it; just that
> it would be unwise, particularly if it would be to save
> a few bytes.
>
> Gilles
>
>
>
>> Gary
>>
>>
>>
>>> The following should be an interesting read:
>>>   http://markmail.org/message/6ytvmxvy2ndsfp7h
>>>
>>>
>>> Regards,
>>> Gilles
>>>
>>>
>>>
>>>
>>> On Tue, Aug 15, 2017 at 11:17 AM Gilles <gilles@harfang.homelinux.org>
>>>> wrote:
>>>>
>>>> On Tue, 15 Aug 2017 12:02:20 -0600, Gary Gregory wrote:
>>>>
>>>>> > On Tue, Aug 15, 2017 at 10:38 AM, nitin mahendru
>>>>> > <nitin.mahendru88@gmail.com
>>>>> >> wrote:
>>>>> >
>>>>> >> How about having a state in the class itself which says that
it's
>>>>> >> mutable
>>>>> >> or not.
>>>>> >> If we call a setter on an immutable then it throws an exception.
>>>>> >> By default the records are immutable and you need to make them
>>>>> >> mutable
>>>>> >> using a new API.
>>>>>
>>>>> A code example would be useful...
>>>>>
>>>>> >> pros: Saves memory, Keeps the immutability benefits
>>>>>
>>>>> What kind of usage are you considering that a single transient
>>>>> record matters (as compared to the ~300 MB of the JVM itself...)?
>>>>>
>>>>> >> cons: people using "mutable" records need to be careful.(While
>>>>> >> threading
>>>>> >> maybe)
>>>>> >>
>>>>> >
>>>>> > Interesting idea!
>>>>> >
>>>>> > But I think I like the idea of a subclass better if we are going
to
>>>>> > split
>>>>> > the behavior b/w mutable and immutable.
>>>>>
>>>>> Once you have a subclass that is able to modify the state of
>>>>> its parent, it's a mutable object. Period.
>>>>> There is no such thing as a "split".
>>>>>
>>>>> >
>>>>> > For my money and the KISS principle, I would just add the put method
>>>>> > in
>>>>> > CSVRecord.
>>>>>
>>>>> Then, any use that assumes immutability will be broken.
>>>>>
>>>>>
>>>>> Gilles
>>>>>
>>>>>
>>>>> > Gary
>>>>> >
>>>>> >>
>>>>> >> -Nitin
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> On Tue, Aug 15, 2017 at 9:01 AM Gilles
>>>>> >> <gilles@harfang.homelinux.org>
>>>>> >> wrote:
>>>>> >>
>>>>> >> > On Tue, 15 Aug 2017 09:49:04 -0600, Gary Gregory wrote:
>>>>> >> > > That looks odd to me. What comes up for me is the
use case where
>>>>> >> I
>>>>> >> > > want to
>>>>> >> > > ETL a file of 10,000,000 records and update, say,
one column. If
>>>>> >> am
>>>>> >> > > forced
>>>>> >> > > to create a brand new record for every record read,
that would
>>>>> >> be a
>>>>> >> > > shame.
>>>>> >> >
>>>>> >> > Why?
>>>>> >> >
>>>>> >> > > If I had a mutable record, I could just keep on updating
it and
>>>>> >> using
>>>>> >> > > it to
>>>>> >> > > write each row. Read record, update it, write record.
No extra
>>>>> >> memory
>>>>> >> > > needed.
>>>>> >> >
>>>>> >> > How is the size of 1 additional record going to matter
compared to
>>>>> >> the
>>>>> >> > size of the whole program?
>>>>> >> >
>>>>> >> > > Either we can make the current record mutable (what's
the harm?)
>>>>> >> or
>>>>> >> > > we can
>>>>> >> > > make the parser serve out mutable records based on
a config
>>>>> >> setting.
>>>>> >> > > This
>>>>> >> > > could be a subclass of CSVRecord with the extra method
I
>>>>> >> proposed.
>>>>> >> >
>>>>> >> > The harm is that you loose all the promises of immutability.
>>>>> >> >
>>>>> >> > Regards,
>>>>> >> > Gilles
>>>>> >> >
>>>>> >> > >
>>>>> >> > > Thoughts?
>>>>> >> > >
>>>>> >> > > Gary
>>>>> >> > >
>>>>> >> > > On Tue, Aug 15, 2017 at 8:33 AM, Gilles
>>>>> >> > > <gilles@harfang.homelinux.org>
>>>>> >> > > wrote:
>>>>> >> > >
>>>>> >> > >> On Tue, 15 Aug 2017 08:01:53 -0600, Gary Gregory
wrote:
>>>>> >> > >>
>>>>> >> > >>> How does that work when you want to change
more than one
>>>>> >> value?
>>>>> >> > >>>
>>>>> >> > >>
>>>>> >> > >> How about a "vararg" argument:
>>>>> >> > >>
>>>>> >> > >> /**
>>>>> >> > >>  * @param orig Original to be copied.
>>>>> >> > >>  * @param replace Fields to be replaced.
>>>>> >> > >>  */
>>>>> >> > >> public static CSVRecord createRecord(CSVRecord
orig,
>>>>> >> > >>                                      Pair<Integer,
String> ...
>>>>> >> > >> replace) {
>>>>> >> > >>     // ...
>>>>> >> > >> }
>>>>> >> > >>
>>>>> >> > >>
>>>>> >> > >> Gilles
>>>>> >> > >>
>>>>> >> > >>
>>>>> >> > >>
>>>>> >> > >>> Gary
>>>>> >> > >>>
>>>>> >> > >>> On Aug 15, 2017 00:17, "Benedikt Ritter" <britter@apache.org>
>>>>> >> > >>> wrote:
>>>>> >> > >>>
>>>>> >> > >>> Hi,
>>>>> >> > >>>>
>>>>> >> > >>>> I very much like that CSVRecord is unmodifiable.
So I’d
>>>>> >> suggest an
>>>>> >> > >>>> API,
>>>>> >> > >>>> that creates a new record instead of mutating
the existing
>>>>> >> one:
>>>>> >> > >>>>
>>>>> >> > >>>> CSVRecord newRecord = myRecord.put(1,
„value")
>>>>> >> > >>>>
>>>>> >> > >>>> I’m not sure about „put“ as a method
name since it clashes
>>>>> >> with
>>>>> >> > >>>> java.util.Map#put, which is mutation based...
>>>>> >> > >>>>
>>>>> >> > >>>> Regards,
>>>>> >> > >>>> Benedikt
>>>>> >> > >>>>
>>>>> >> > >>>> > Am 15.08.2017 um 02:54 schrieb Gary
Gregory
>>>>> >> > >>>> <garydgregory@gmail.com>:
>>>>> >> > >>>> >
>>>>> >> > >>>> > Feel free to provide a PR on GitHub
:-)
>>>>> >> > >>>> >
>>>>> >> > >>>> > Gary
>>>>> >> > >>>> >
>>>>> >> > >>>> > On Aug 14, 2017 15:29, "Gary Gregory"
>>>>> >> <garydgregory@gmail.com>
>>>>> >> > >>>> wrote:
>>>>> >> > >>>> >
>>>>> >> > >>>> >> I think we've kept the design
as YAGNI as possible... :-)
>>>>> >> > >>>> >>
>>>>> >> > >>>> >> Gary
>>>>> >> > >>>> >>
>>>>> >> > >>>> >> On Mon, Aug 14, 2017 at 3:25
PM, nitin mahendru <
>>>>> >> > >>>> >> nitin.mahendru88@gmail.com>
wrote:
>>>>> >> > >>>> >>
>>>>> >> > >>>> >>> Yeah that also is OK. I though
there is a reason to keep
>>>>> >> the
>>>>> >> > >>>> CSVRecord
>>>>> >> > >>>> >>> without setters. But maybe
not!
>>>>> >> > >>>> >>>
>>>>> >> > >>>> >>> Nitin
>>>>> >> > >>>> >>>
>>>>> >> > >>>> >>>
>>>>> >> > >>>> >>>
>>>>> >> > >>>> >>>
>>>>> >> > >>>> >>> On Mon, Aug 14, 2017 at 2:22
PM Gary Gregory
>>>>> >> > >>>> <garydgregory@gmail.com
>>>>> >> > >>>> >
>>>>> >> > >>>> >>> wrote:
>>>>> >> > >>>> >>>
>>>>> >> > >>>> >>>> Hi All:
>>>>> >> > >>>> >>>>
>>>>> >> > >>>> >>>> Should we consider adding
put(int,Object) and
>>>>> >> put(String,
>>>>> >> > >>>> Object) to
>>>>> >> > >>>> the
>>>>> >> > >>>> >>>> current CSVRecord class?
>>>>> >> > >>>> >>>>
>>>>> >> > >>>> >>>> Gary
>>>>> >> > >>>> >>>>
>>>>> >> > >>>> >>>> On Mon, Aug 14, 2017
at 2:54 PM, nitin mahendru <
>>>>> >> > >>>> >>>> nitin.mahendru88@gmail.com>
>>>>> >> > >>>> >>>> wrote:
>>>>> >> > >>>> >>>>
>>>>> >> > >>>> >>>>> Hi Everyone,
>>>>> >> > >>>> >>>>>
>>>>> >> > >>>> >>>>> I recently pushed
a change(pull request 20) to get the
>>>>> >> line
>>>>> >> > >>>> ending
>>>>> >> > >>>> >>> from
>>>>> >> > >>>> >>>> the
>>>>> >> > >>>> >>>>> parser.
>>>>> >> > >>>> >>>>>
>>>>> >> > >>>> >>>>> Now I want to push
another change which I feel will
>>>>> >> also be
>>>>> >> > >>>> useful
>>>>> >> > >>>> for
>>>>> >> > >>>> >>>> the
>>>>> >> > >>>> >>>>> community. I want
to add a CSVRecordMutable class which
>>>>> >> had
>>>>> >> > >>>> a
>>>>> >> > >>>> >>> constructor
>>>>> >> > >>>> >>>>> which accepts a CSVRecord
object. So when we have a
>>>>> >> > >>>> CSVRecordMutable
>>>>> >> > >>>> >>>> object
>>>>> >> > >>>> >>>>> from it then we can
edit individual columns using it.
>>>>> >> > >>>> >>>>>
>>>>> >> > >>>> >>>>> I would be using
this to write back my edited CSV file.
>>>>> >> My
>>>>> >> > >>>> use case
>>>>> >> > >>>> >>> is to
>>>>> >> > >>>> >>>>> read a csv, mangle
some columns, write back a new csv.
>>>>> >> > >>>> >>>>>
>>>>> >> > >>>> >>>>> I could have directly
raised a pull request but I just
>>>>> >> > >>>> wanted to
>>>>> >> > >>>> float
>>>>> >> > >>>> >>>> the
>>>>> >> > >>>> >>>>> idea before and see
the reaction.
>>>>> >> > >>>> >>>>>
>>>>> >> > >>>> >>>>> Thanks
>>>>> >> > >>>> >>>>>
>>>>> >> > >>>> >>>>> Nitin
>>>>> >> > >>>> >>>>>
>>>>> >> > >>>> >>>>
>>>>> >> > >>>> >>>
>>>>> >> > >>>> >>
>>>>> >> > >>>> >>
>>>>> >> > >>>>
>>>>> >> > >>>>
>>>>> >> > >>>>
>>>>> >> > >>
>>>>>
>>>>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message