orc-dev mailing list archives

From wgtmac <...@git.apache.org>
Subject [GitHub] orc issue #120: ORC-184: Refactor ColumnStatistics classes for writer
Date Fri, 12 May 2017 22:16:44 GMT
Github user wgtmac commented on the issue:

    https://github.com/apache/orc/pull/120
  
    Yes, our current ColumnWriter design follows the approach of ColumnReader. One base class,
ColumnWriter, takes care of all common work, and the specific derived ColumnWriters, such as
IntegerColumnWriter, StructColumnWriter, and BooleanColumnWriter, manage their own logic. To integrate
ColumnStatistics, we have two options (correct me if I'm wrong):
    1. Put a base ColumnStatistics class in the base ColumnWriter, as I said previously,
and in each TypeColumnWriter use the dynamic type of that ColumnStatistics object to do the
type-specific logic. This way, polymorphism in the base ColumnWriter class avoids a lot of
duplicate code (our current code is written in this fashion);
    2. Keep no ColumnStatistics member in the base class ColumnWriter. Instead, each
derived TypeColumnWriter class uses one TypeColumnStatisticsImpl (not TypeColumnStatistics),
e.g. IntegerColumnWriter uses IntegerColumnStatisticsImpl. This is feasible, at the cost of a
little more duplicated code than the 1st approach.
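
    To make the trade-off concrete, here is a minimal sketch of both options. Everything in it is a simplified stand-in invented for illustration (the `sketch` namespace, the `add` and `recordCount` methods, the statistics fields); it is not the actual ORC code and only assumes the class names mentioned above.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-ins for illustration; not the real ORC classes.
namespace sketch {

// Type-independent statistics: only the value count in this sketch.
class ColumnStatistics {
public:
  virtual ~ColumnStatistics() = default;
  void increase(uint64_t n) { valueCount_ += n; }
  uint64_t getNumberOfValues() const { return valueCount_; }
private:
  uint64_t valueCount_ = 0;
};

// Type-specific statistics for integer columns.
class IntegerColumnStatisticsImpl : public ColumnStatistics {
public:
  void update(int64_t v) {
    if (v < min_) min_ = v;
    if (v > max_) max_ = v;
  }
  int64_t getMinimum() const { return min_; }
  int64_t getMaximum() const { return max_; }
private:
  int64_t min_ = std::numeric_limits<int64_t>::max();
  int64_t max_ = std::numeric_limits<int64_t>::min();
};

// Option 1: the base writer owns a base-class statistics pointer.
namespace approach1 {
class ColumnWriter {
public:
  explicit ColumnWriter(std::unique_ptr<ColumnStatistics> stats)
      : stats_(std::move(stats)) {}
  virtual ~ColumnWriter() = default;
  const ColumnStatistics& statistics() const { return *stats_; }
protected:
  // Common bookkeeping lives here once, for every typed writer.
  void recordCount(uint64_t n) { stats_->increase(n); }
  std::unique_ptr<ColumnStatistics> stats_;
};

class IntegerColumnWriter : public ColumnWriter {
public:
  IntegerColumnWriter()
      : ColumnWriter(std::unique_ptr<ColumnStatistics>(
            new IntegerColumnStatisticsImpl())) {}
  void add(const std::vector<int64_t>& values) {
    recordCount(values.size());
    // Recover the dynamic type to run the integer-specific logic.
    auto* intStats = dynamic_cast<IntegerColumnStatisticsImpl*>(stats_.get());
    assert(intStats != nullptr);
    for (int64_t v : values) intStats->update(v);
  }
};
}  // namespace approach1

// Option 2: no statistics in the base writer; each typed writer owns its Impl.
namespace approach2 {
class ColumnWriter {
public:
  virtual ~ColumnWriter() = default;
};

class IntegerColumnWriter : public ColumnWriter {
public:
  void add(const std::vector<int64_t>& values) {
    stats_.increase(values.size());  // repeated in every typed writer
    for (int64_t v : values) stats_.update(v);
  }
  const IntegerColumnStatisticsImpl& statistics() const { return stats_; }
private:
  IntegerColumnStatisticsImpl stats_;
};
}  // namespace approach2

}  // namespace sketch
```

    In this sketch, option 1 pays for the shared `recordCount` with a downcast in each typed writer, while option 2 needs no cast but repeats the counting call per writer.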
    
    Your InternalColumnStatisticsImpl design leans toward the 2nd approach, since it is
still not a base class that the base class ColumnWriter could use. We can discuss ColumnWriter
later, as that is difficult without actual code. To save time and effort for both of us and to
avoid significant changes to our current ColumnWriter implementation, I will update this PR
based on approach 2 after your PR is merged. @majetideepak


