orc-dev mailing list archives

From majetideepak <...@git.apache.org>
Subject [GitHub] orc issue #120: ORC-184: Refactor ColumnStatistics classes for writer
Date Fri, 12 May 2017 21:35:19 GMT
GitHub user majetideepak commented on the issue:

    https://github.com/apache/orc/pull/120
  
    To answer your concerns:
    1) Functions like increase, merge, and reset can also be added to InternalStatisticsImpl, so I don't see the need for a base class. The implementation for non-primitive types must be specialized anyway.
    2) I don't see how the types can differ between the ColumnWriter and the ColumnStatistics. Can you give an example? If your [Type]ColumnWriter design is similar to the existing [Type]ColumnReader design, then we can easily plug in an InternalStatisticsImpl<Type> instance.
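
A minimal sketch of what points 1) and 2) could look like, assuming a templated statistics holder for primitive types. The class and method names follow the ones mentioned in this thread, but the bodies, the `update` method, the getters, and `IntegerColumnWriterSketch` are hypothetical and are not taken from the ORC source.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical sketch: one templated holder carries the mutating
// operations (increase, merge, reset), so no common base class is needed.
template <typename T>
class InternalStatisticsImpl {
 public:
  // Bump the value count by n without touching min/max.
  void increase(uint64_t n) { valueCount_ += n; }

  // Record a single value (assumed helper, not named in the thread).
  void update(T value) {
    if (!hasMinMax_) {
      min_ = max_ = value;
      hasMinMax_ = true;
    } else {
      min_ = std::min(min_, value);
      max_ = std::max(max_, value);
    }
    ++valueCount_;
  }

  // Fold another instance of the same type into this one.
  void merge(const InternalStatisticsImpl& other) {
    valueCount_ += other.valueCount_;
    if (!other.hasMinMax_) return;
    if (!hasMinMax_) {
      min_ = other.min_;
      max_ = other.max_;
      hasMinMax_ = true;
    } else {
      min_ = std::min(min_, other.min_);
      max_ = std::max(max_, other.max_);
    }
  }

  // Return to the empty state.
  void reset() { *this = InternalStatisticsImpl(); }

  uint64_t getNumberOfValues() const { return valueCount_; }
  bool hasMinMax() const { return hasMinMax_; }
  T getMinimum() const { return min_; }
  T getMaximum() const { return max_; }

 private:
  T min_{};
  T max_{};
  uint64_t valueCount_ = 0;
  bool hasMinMax_ = false;
};

// Hypothetical writer: the statistics type is fixed by the writer's own
// value type, so the two cannot disagree.
class IntegerColumnWriterSketch {
 public:
  void add(const int64_t* data, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) stats_.update(data[i]);
  }
  const InternalStatisticsImpl<int64_t>& statistics() const { return stats_; }

 private:
  InternalStatisticsImpl<int64_t> stats_;
};
```

Non-primitive types (strings, decimals, and so on) would need their own specializations, as noted in 1).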
    
    I opened ORC-185 as a pattern we could use to overcome the lack of a base class with [Type]ColumnStatisticsImpl. Ultimately, we should do whatever makes it easiest to get the Writer code in with the fewest code changes. However, without the ColumnWriter design, it is hard to enumerate and discuss other options for ColumnStatistics.
    That said, putting methods that should be internal in the public API is definitely not the right approach.
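
To illustrate the concern about the public API, here is one possible way to keep mutators internal: expose a read-only interface publicly and keep the mutating methods on an implementation class that only the writer sees. This is a generic sketch and not necessarily what ORC-185 proposes; `PublicColumnStatistics`, `CountingStatisticsImpl`, and `WriterSketch` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Public header (hypothetical): users only get read access.
class PublicColumnStatistics {
 public:
  virtual ~PublicColumnStatistics() = default;
  virtual uint64_t getNumberOfValues() const = 0;
};

// Internal header (hypothetical): mutators live here, out of the public API.
class CountingStatisticsImpl : public PublicColumnStatistics {
 public:
  uint64_t getNumberOfValues() const override { return count_; }
  void increase(uint64_t n) { count_ += n; }
  void reset() { count_ = 0; }

 private:
  uint64_t count_ = 0;
};

// The writer owns the implementation and hands out the read-only view.
class WriterSketch {
 public:
  void addRows(uint64_t n) { stats_.increase(n); }
  void startNewStripe() { stats_.reset(); }
  const PublicColumnStatistics& getStatistics() const { return stats_; }

 private:
  CountingStatisticsImpl stats_;
};
```

Callers holding a `const PublicColumnStatistics&` cannot call `increase` or `reset`, because those methods are not declared on the public type.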

