hive-dev mailing list archives

From Lefty Leverenz <leftylever...@gmail.com>
Subject Re: Interesting claims that seem untrue
Date Tue, 17 Sep 2013 20:22:29 GMT
> Whatever you count, you get more of :)

Then let's count lines of documentation!  ;)

-- Lefty


On Tue, Sep 17, 2013 at 12:15 PM, Edward Capriolo <edlinuxguru@gmail.com> wrote:

> Whatever you count, you get more of :)
>
>
> On Tue, Sep 17, 2013 at 1:57 PM, Konstantin Boudnik <cos@apache.org> wrote:
>
>> Carter,
>>
>> what you are doing essentially contradicts the ASF policy of "community
>> over code".
>>
>> Perhaps your intentions are good. However, LOC calculations or other silly
>> contests essentially drive a wedge between developers who happen to draw
>> their paychecks from different commercial entities. The Hadoop community
>> went through this already, and it caused nothing but despair and
>> bitterness among the people involved.
>>
>> Unlike some other popular contests, the number of lines contributed
>> doesn't matter for most. Seriously.
>>
>> Regards,
>>   Cos
>>
>> On Mon, Sep 16, 2013 at 01:58PM, Carter Shanklin wrote:
>> > Ed,
>> >
>> > If nothing else I'm glad it was interesting enough to generate some
>> > discussion. These sorts of stats are always subjects of a lot of
>> > controversy. I have seen a lot of these sorts of charts float around in
>> > confidential slide decks and I think it's good to have them out in the
>> > open where anyone can critique and correct them.
>> >
>> > In this case, Ed, you've pointed out a legitimate flaw in my analysis.
>> > Doing the analysis again, I found that previously, due to a bug in my
>> > scripts, JIRAs that didn't have Hudson comments in them were not counted
>> > (this was one way the scripts identified SVN commit IDs, which I have
>> > since removed due to flakiness). Brock's patch was the single largest
>> > victim of this bug but not the only one: there were some from Cloudera,
>> > NexR, Hortonworks, Facebook, and even 2 from you, Ed. The interested can
>> > see a full list of exclusions here:
>> > https://docs.google.com/spreadsheet/ccc?key=0ArmXd5zzNQm5dDJTMkFtaUk2d0dyU3hnWGJCcUczbXc#gid=0
>> > I apologize to those under-represented; there wasn't any intent on my
>> > part to minimize anyone's work. The impact on final totals is Cloudera
>> > +5.4%, NexR +0.8%, Facebook -2.7%, Hortonworks -3.3%. I will be updating
>> > the blog later today with relevant corrections.
>> >
>> > There is going to be continued interest in seeing charts like these, for
>> > example when Hive 12 is officially done. Sanjay suggested that LoC counts
>> > may not be the best way to represent true contribution. I agree that not
>> > all lines of code are created equal; for example, a few monster patches
>> > recently went in re-arranging HCatalog namespaces and, I think, also
>> > indentation style. This (hopefully) mechanical work is not on the same
>> > footing as adding new query language features. Still, it is work, and it
>> > wouldn't be fair to pretend it didn't happen. If anyone has ideas on
>> > better ways to fairly capture contribution, I'm open to suggestions.
>> >
>> >
>> >
>> > On Thu, Sep 12, 2013 at 7:19 AM, Edward Capriolo <edlinuxguru@gmail.com> wrote:
>> >
>> > > I was reading the Hortonworks blog and found an interesting article.
>> > >
>> > > http://hortonworks.com/blog/stinger-phase-2-the-journey-to-100x-faster-hive/#comment-160753
>> > >
>> > > There is a very interesting graphic which attempts to demonstrate lines
>> > > of code in the 12 release.
>> > > http://hortonworks.com/wp-content/uploads/2013/09/hive4.png
>> > >
>> > > I do not know how the numbers are calculated. They are probably
>> > > counting code generated by test output, but even setting that aside,
>> > > they are wrong.
>> > >
>> > > One claim is that Cloudera contributed 4,244 lines of code.
>> > >
>> > > So to debunk that claim:
>> > >
>> > > In https://issues.apache.org/jira/browse/HIVE-4675 Brock Noland from
>> > > Cloudera created the ptest2 testing framework. He did all the work for
>> > > ptest2 in Hive 12, and it is clearly more than 4,244 lines.
>> > >
>> > > This consists of 84 java files
>> > > [edward@desksandra ptest2]$ find . -name "*.java" | wc -l
>> > > 84
>> > > and by itself is 8001 lines of code.
>> > > [edward@desksandra ptest2]$ find . -name "*.java" | xargs cat | wc -l
>> > > 8001
>> > >
>> > > [edward@desksandra hive-trunk]$ wc -l HIVE-4675.patch
>> > > 7902 HIVE-4675.patch
>> > >
>> > > This is not the only feature from Cloudera in Hive 12.
>> > >
>> > > There is also a section of the article that talks of a "ROAD MAP" for
>> > > Hive features. I did not know we (Hive) had a road map. I have advocated
>> > > switching to feature-based releases and having a road map before, but it
>> > > was suggested that might limit people from itch-scratching.
>> > >
>> > >
>> > >
>> > >
>> > >
>> >
>> >
>> > --
>> > Carter Shanklin
>> > Director, Product Management
>> > Hortonworks
>> > (M): +1.650.644.8795 (T): @cshanklin <http://twitter.com/cshanklin>
>> >
>> > --
>> > CONFIDENTIALITY NOTICE
>> > NOTICE: This message is intended for the use of the individual or entity
>> > to which it is addressed and may contain information that is
>> > confidential, privileged and exempt from disclosure under applicable law.
>> > If the reader of this message is not the intended recipient, you are
>> > hereby notified that any printing, copying, dissemination, distribution,
>> > disclosure or forwarding of this communication is strictly prohibited. If
>> > you have received this communication in error, please contact the sender
>> > immediately and delete it from your system. Thank You.
>>
>
>
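
The `wc -l HIVE-4675.patch` check quoted above counts every line in the patch file, including file headers and unchanged context lines, so it only approximates the number of lines contributed. As a rough sketch (not the script used for the blog's chart, which is not shown in this thread), added and removed lines in a unified diff could be tallied separately as below. The function name `count_patch_lines` is hypothetical, and the sketch assumes a well-formed unified diff with standard `@@` hunk headers.

```python
import re

# Matches a unified-diff hunk header such as "@@ -1,3 +1,4 @@".
# A missing line count (e.g. "@@ -5 +5 @@") means a count of 1.
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,(\d+))? \+\d+(?:,(\d+))? @@")


def count_patch_lines(patch_text):
    """Return (added, removed) line counts for a unified diff.

    Hypothetical helper for illustration. It relies on the hunk headers
    to know how many lines belong to each hunk, so file headers such as
    "--- Foo.java" and "+++ Foo.java" are never mistaken for changes.
    """
    added = removed = 0
    old_left = new_left = 0  # lines still expected in the current hunk
    for line in patch_text.splitlines():
        if old_left <= 0 and new_left <= 0:
            # Outside a hunk: skip everything except the next hunk header.
            match = HUNK_HEADER.match(line)
            if match:
                old_left = int(match.group(1) or 1)
                new_left = int(match.group(2) or 1)
            continue
        if line.startswith("+"):
            added += 1
            new_left -= 1
        elif line.startswith("-"):
            removed += 1
            old_left -= 1
        elif line.startswith("\\"):
            continue  # "\ No newline at end of file" marker
        else:
            # Unchanged context line: present in both old and new file.
            old_left -= 1
            new_left -= 1
    return added, removed
```

Calling `count_patch_lines(open("HIVE-4675.patch").read())` would give counts comparable to the per-company totals discussed in the thread. Whether generated test output or mechanical reformatting should be included is a separate question that this sketch does not address.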
