hbase-dev mailing list archives

From ramkrishna vasudevan <ramkrishna.s.vasude...@gmail.com>
Subject Re: Can you help us Hbase Community
Date Tue, 15 Dec 2015 12:17:54 GMT
I had a look at the reports.  The prediction model looks good.

A few questions: what is the idea behind the tool that you plan to build
for the community? Are you planning to provide a tool that, for a given
issue description, says which files it may impact?

For example, if an interface is changed, all the implementations of that
interface have to change as well (I am just taking a very simple example).
Does your tool take this type of compile-time rule into account too?

The reason for asking is that I would like to know which parameters you
derive from the discussions, comments and social activity to ascertain the
related changes.


On Tue, Dec 15, 2015 at 5:36 PM, Igor Wiese <igor.wiese@gmail.com> wrote:

> Hi, HBase Community.
> My name is Igor Wiese, a PhD student from Brazil. I sent an email a week
> ago about my research. We received some visits to inspect the results,
> but no feedback was provided.
> I am investigating two important questions: What makes two files
> change together? Can we predict when they are going to co-change
> again?
> I've tried to investigate these questions on the HBase project. I've
> collected data from issue reports, discussions and commits, and used
> some machine learning techniques to build a prediction model.
> I collected a total of 8492 commits in which a pair of files changed
> together and could correctly predict 71% of the commits. These were the
> most useful pieces of information for predicting co-changes of files:
> - sum of number of lines of code added, modified and removed,
> - number of words used to describe and discuss the issues,
> - median value of closeness, a social network measure obtained from
> issue comments,
> - median value of effective size, a social network measure obtained
> from issue comments, and
> - median value of hierarchy, a social network measure obtained from
> issue comments.
> To illustrate, consider the following example from our analysis. For
> release 1.1, the files "util/HBaseFsck.java" and
> "hbase/util/HBaseFsckRepair.java" changed together in 13 commits. In
> another 40 commits, only the first file changed, but not the second.
> Collecting contextual information for each commit made to the first file
> in the previous release, we were able to predict 9 commits in which both
> files changed together in release 1.1, and we only issued two false
> positives and two wrong predictions. For this pair of files, the most
> important contextual information was the number of developers that
> commented on each issue and the social network metric (efficiency)
> obtained from issue comments.
> - Do these results surprise you? Can you think of any explanation for
> the results?
> - Do you think that our rate of prediction is good enough to be used
> for building tool support for the software community?
> - Do you have any suggestions on what can be done to improve the change
> recommendation?
> You can visit a webpage to inspect the results in detail:
> http://flosscoach.com/index.php/17-cochanges/71-hbase
> All the best,
> Igor Wiese
> PhD Candidate
> --
> =================================
> Igor Scaliante Wiese
> PhD Candidate - Computer Science @ IME/USP
> Faculty in Dept. of Computing at Universidade Tecnológica Federal do Paraná
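
As a point of reference for the HBaseFsck example in the quoted message, the
raw co-change counts can be turned into a simple support/confidence figure.
The sketch below is only a history-based baseline, not the prediction model
described in the message (which also uses issue and social network metrics).
The function name cochange_stats and the synthetic commit list are
assumptions made for illustration; only the two counts (13 commits changing
both files, 40 changing only the first) come from the message.

```python
def cochange_stats(commits, file_a, file_b):
    """Return (support, confidence) for the rule "file_a changes -> file_b changes".

    commits: iterable of sets, each holding the file paths touched by one commit.
    support: number of commits in which both files changed.
    confidence: share of the commits touching file_a that also touched file_b.
    """
    changed_a = 0
    changed_both = 0
    for files in commits:
        if file_a in files:
            changed_a += 1
            if file_b in files:
                changed_both += 1
    confidence = changed_both / changed_a if changed_a else 0.0
    return changed_both, confidence


# Synthetic history that only mirrors the counts quoted in the message:
# 13 commits changed both files, 40 commits changed only the first one.
A = "util/HBaseFsck.java"
B = "hbase/util/HBaseFsckRepair.java"
commits = [{A, B}] * 13 + [{A}] * 40

support, confidence = cochange_stats(commits, A, B)
print(support, round(confidence, 3))  # 13 0.245
```

With these counts, only about a quarter of the commits to the first file also
touched the second, so always predicting a co-change from history alone would
be wrong most of the time for this pair.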
