accumulo-notifications mailing list archives

From "Keith Turner (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
Date Tue, 04 Feb 2014 02:36:11 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890300#comment-13890300
] 

Keith Turner commented on ACCUMULO-118:
---------------------------------------

bq. But it was a pretty massive change, and maintaining it as a patch set, even with git's
help, would have been very hard.

There is a cost there.  There is also a cost to having incomplete features in master when
we declare feature freeze.  1.5.0 and 1.6.0 were both delayed because of incomplete features.
One possible solution to this for 1.7.0 development is to merge only complete features
into master.

There are lots of commits related to all of the 1.6.0 features, many of them reworking code
changed by previous commits related to the same feature.  If someone were working on another
feature in a branch, this would introduce a lot of unnecessary noise for them to deal with.
Ideally they would only have to merge and resolve conflicts once per feature.  Of course we
will never achieve the ideal ratio of 1, but I think we can easily make the ratio of commits
per feature/bug much lower than it currently is.  I know [~ctubbsii] worked on namespaces in
a branch for a long period of time, merging in changes and resolving conflicts; I am not sure
how painful this was.

For a feature like this that touches a lot of existing code, there is the option of refactoring
w/o changing functionality and merging that into master.  Of course the refactoring would
be done to make the functionality changes in the branch easier.  So it's a multi-step process
w/ the goal of always leaving master in a state where it's ready for release testing and
minimizing the number of merges for other feature branches.  This approach would also make
code reviews of commits to master much easier.  I am going to try doing this for the 1.7
features I work on.

bq.  but was certainly not anything I was thinking about when I was changing thousands of
lines 

I was trying to determine how we can flush out more potential issues before changing thousands
of lines.  If we can get as many people as possible to carefully review the design, that would
probably do the trick.  I wonder if voting on design docs for new features would help.  Voting
would motivate me to carefully review a design because I would not want to vote until I had
done so.

> accumulo could work across HDFS instances, which would help it to scale past a single namenode
> ----------------------------------------------------------------------------------------------
>
>                 Key: ACCUMULO-118
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-118
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: master, tserver
>            Reporter: Eric Newton
>            Assignee: Eric Newton
>            Priority: Blocker
>             Fix For: 1.6.0
>
>         Attachments: ACCUMULO-118-01.txt, ACCUMULO-118-02.txt
>
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> Consider using full path names to files, which would allow the servers to access the files on any HDFS file system.
> Work may exist elsewhere to run HDFS using a number of NameNode instances to break up the namespace.
> We may need a pluggable strategy to determine namespace for new files.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
