orc-dev mailing list archives

From wgtmac <...@git.apache.org>
Subject [GitHub] orc issue #120: ORC-184: Refactor ColumnStatistics classes for writer
Date Fri, 12 May 2017 17:28:18 GMT
Github user wgtmac commented on the issue:

    We are progressively working on moving all the code here. As there is a big gap between
the current code and our code, it is not possible to create a PR for ColumnWriter right now.

    This PR is not a final decision, but a piece of code to show our ideas and start the
discussion, as we need some sort of ColumnStatistics class to act as the base class
in the ColumnWriter implementation. On using your InternalStatisticsImpl<Type>, I have
the following thoughts:
    1) InternalStatisticsImpl<Type> is an implementation class, so the interface functions
like increase, merge, reset, etc. should still be defined in a base class somewhere;
    2) We could implement the ColumnWriter class with templates as well, in order to adopt
InternalStatisticsImpl<Type>; but ColumnWriter and ColumnStatistics are slightly different:
ColumnWriter has more types and less common code, so I doubt templates are a good choice
for the ColumnWriter class.
    3) BTW, I really appreciate the refactoring in your PR @majetideepak. I would like to
add our code on top of your changes. Before that, we have to reach consensus on which class
is the best choice for the base class in the ColumnWriter implementation (like ColumnStatisticsImpl
on the Java side). In our design, it is ColumnStatistics itself.
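
    To make points 1) and 2) concrete, here is a minimal sketch of the layering I have in mind.
All names in it (StatsBase, TypedStats, WriterSketch) are hypothetical and are not the actual
ORC classes or either PR's code; it only assumes that the writer holds a non-template base
pointer while the typed min/max logic lives in a templated implementation class.

```cpp
#include <algorithm>
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <utility>

// Hypothetical non-template base: the interface functions from point 1.
class StatsBase {
public:
  virtual ~StatsBase() = default;
  virtual void increase(uint64_t count) = 0;
  virtual void merge(const StatsBase& other) = 0;
  virtual void reset() = 0;
  virtual uint64_t getNumberOfValues() const = 0;
};

// Hypothetical typed implementation, standing in for a templated stats class.
template <typename T>
class TypedStats : public StatsBase {
public:
  void increase(uint64_t count) override { valueCount_ += count; }

  // Track min/max only; the value count is maintained via increase().
  void update(T value) {
    if (!hasMinMax_) {
      min_ = max_ = value;
      hasMinMax_ = true;
    } else {
      min_ = std::min(min_, value);
      max_ = std::max(max_, value);
    }
  }

  void merge(const StatsBase& other) override {
    // Merging is only defined between statistics of the same value type.
    const auto* typed = dynamic_cast<const TypedStats<T>*>(&other);
    if (typed == nullptr) {
      throw std::invalid_argument("cannot merge statistics of different types");
    }
    valueCount_ += typed->valueCount_;
    if (typed->hasMinMax_) {
      update(typed->min_);
      update(typed->max_);
    }
  }

  void reset() override {
    valueCount_ = 0;
    hasMinMax_ = false;
    min_ = T();
    max_ = T();
  }

  uint64_t getNumberOfValues() const override { return valueCount_; }
  bool hasMinMax() const { return hasMinMax_; }
  T min() const { return min_; }
  T max() const { return max_; }

private:
  uint64_t valueCount_ = 0;
  bool hasMinMax_ = false;
  T min_ = T();
  T max_ = T();
};

// Hypothetical writer: not a template, it only sees the base interface.
class WriterSketch {
public:
  explicit WriterSketch(std::unique_ptr<StatsBase> stats)
      : stats_(std::move(stats)) {}

  void addRows(uint64_t numValues) { stats_->increase(numValues); }
  void mergeFrom(const StatsBase& other) { stats_->merge(other); }
  const StatsBase& stats() const { return *stats_; }

private:
  std::unique_ptr<StatsBase> stats_;
};
```

    With this shape the writer hierarchy stays free of templates, and a type-specific writer
can still keep a typed pointer to its own statistics object when it needs to call update().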

