commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Gary Gregory <>
Subject Re: [CSV] CSVMutableRecord
Date Fri, 18 Aug 2017 17:34:54 GMT
To be clear, you are then OK with simply adding the two put() methods to
CSVRecord? Option 1 in my eariler message.


On Fri, Aug 18, 2017 at 10:05 AM, Gilles <>

> On Fri, 18 Aug 2017 09:36:06 -0600, Gary Gregory wrote:
>> Please see branch CSV-216 for a KISS implementation that uses a
>> CSVMutableRecord subclass.
> I don't think anyone gains anything through this subclassing.
> A client can no longer assume that an instance of "CSVRecord" is
> immutable and will have to make a defensive copy or an "instanceof"
> check (that will be obsolete/broken whenever the hierarchy is
> modified).
> Better assume a functionally breaking change and add the methods
> to class "CSVRecord"...
> Gilles
>> I do not believe this feature warrants creating interfaces or
>> framework-like code. I do not believe we need to start leaning the
>> JDBC-way.
>> Gary
>> On Thu, Aug 17, 2017 at 3:04 PM, Simon Spero <> wrote:
>> On Aug 15, 2017 8:01 PM, "Gilles" <> wrote:
>>> Saying that making record mutable is "breaking" is a bit unfair when we
>>> do
>>> > NOT document the mutability of the class in the first place.
>>> >
>>> I'm stating a fact: class is currently immutable, change would make it
>>> mutable; it is functionally breaking.
>>> I didn't say that you are forbidden to do it; just that it would be
>>> unwise,
>>> particularly if it would be to save a few bytes.
>>> Exactly.
>>> TL;DR. This is almost always a breaking semantic change; the safest  ways
>>> of implementing it are binary breaking; it's unlikely to have a major
>>> performance impact; it might be better to create a new API module for
>>> enhancements, with current package as legacy or implementation.
>>> If a class previously exposed no mutators, adding one is usually a major
>>> change. This is especially true for final classes, but it still affects
>>> use
>>> cases where an instance is owned by another class, which may rely on the
>>> lack of mutability to avoid making defensive copies.
>>> Of course, a final class that has a  package-private getter  to a shared
>>> copy of its backing array could be considered to be sending mixed
>>> messages...
>>> It is possible that a mutable class might have significant performance
>>> advantages over an immutable one beyond saving a few bytes. For example,
>>> if
>>> the updates are simple, and depend on the previous value of the cell,
>>> then
>>> a mutable version might have better cache behavior. If there's other
>>> sources of cache pressure this might have a higher than expected impact.
>>> The costs of copying the original values might also be relatively
>>> significant.
>>> For an ETL use case these issues are unlikely to be limiting factors;
>>> for a
>>> start, there's a non-zero chance that a  CSVRecord was extracted  by
>>> parsing a CSV file. Also a transform will require conversion to some sort
>>> of Number (or String allocation).
>>> The current API doesn't easily support adding alternate implementations
>>> of
>>> the relevant types. Implementation classes are final, and important
>>> return
>>> types are concrete.
>>> One solution might be to treat the current code as almost an
>>> implementation
>>> module, define a separate API module, and add extra interfaces and
>>> alternate  implementations to support  the target use case (mutable
>>> records, streams, reactivex, transform functions or what have you).
>>> Simon
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message