drill-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From cwestin <...@git.apache.org>
Subject [GitHub] drill pull request: DRILL-3987: Componentize Drill, extracting vec...
Date Wed, 11 Nov 2015 23:58:23 GMT
Github user cwestin commented on the pull request:

    https://github.com/apache/drill/pull/250#issuecomment-155948627
  
    On 2.2 above: This description of events is not entirely accurate. We have never held
up releases for the allocator work. Not once, never mind numerous occasions. For two releases
during which it was not ready, I constructed special extract patches that included bug fixes
only, because we believed it
    would help the project. I did ask for people to hold off commits exactly once, for 24
hours. And once I found a problem (the rebase didn't go well), I took the hold off after about
18 hours.
    
    As for merging patches in the order they are ready: the original allocator patch was ready
about 3 months ago now. All the tests passed. The problem was that you refused to review it
because it was too large. It was 213 files, less than half the size of this patch. Much like
this one, there were "virtually no functional changes in this patch and its purpose is extremely
narrow in scope." A large number of files had minor fixes that removed warnings, because these
were helpful in finding bugs (biggest example: making a number of things AutoCloseables because
then the compiler tells you when you leak resources -- this found a number of problems). The
core changes are in about a dozen files, mainly the rewrite of the allocator itself.
    
    You asked, and I pointed you at those core changes. I heard nothing back for a while,
until you eventually asked that I break it up into smaller patches -- but still no comments
on the allocator itself. Because of the layers of changes and dependencies, that was painful,
but I did break things up. The first four were ready for review in mid-August. We exchanged
messages and I told you where they were; When I came back from vacation in September, they
had still not been reviewed. I made a stink at MapR, and some people finally reviewed them.
I continued making patches (several more) which I continued to have a hard time getting people
to review. When you finally did look at one, you immediately rejected it just because it didn't
have its own JIRA. I submitted a JIRA for it within an hour of that, and it, like the others,
continued to be ignored. I finally badgered more folks at MapR to review them.
    
    Exactly once (not "numerous" times), at the time we ran the performance regression (a
magical thing that only folks at MapR can do), we found that there is a performance penalty.
I wouldn't call it a "concurrency design challenge." Yes, when you synchronize variables to
get transacted behavior (memory allocation is really a balance transfer), instead of just
depending on volatile variables that can change under your feet if you refer to them more
than once, things are going to be slower. But you can do more.
    
    In order to avoid this continual rebasing to track the project, I went back and made it
possible for the new and old allocators to co-exist, so that the code would at least be committed
while the performance penalty could be addressed (in order to avoid more rebasing in the future).
This would also allow the use of the new allocator when assertions are enabled (or another
debug flag), so that we could find any new problems; many of my rebases took time because
I kept finding new problems introduced by ongoing work because the new allocator is much stricter
than the old one.
    
    You insisted that if others could just look at it, you were sure that a faster solution
would be found; you asked for pointers to the code. I gave them to you. I never heard anything
back. In any case, it is set up to only be used in debug mode for the moment, and can stay
that way until someone can find a way to make it faster (and I explained the multiple implementations
I tried before I finally got it down to the ~10% performance penalty it currently carries)
    
    I continued breaking things down in an attempt to get them in. We are finally down to
the last two patches.
    
    Now I see a larger patch with the same characteristics, and it feels like it is being
bulldozed through, even though there aren't clear benefits to having it. Your diagrams doesn't
justify it, nor explain why it will be better. The new allocator, even if only used in debug
mode for now, will help engineers avoid creating new leaks that the old allocator doesn't
detect. Given some other suggestions I've seen to change the vector class interfaces, it seems
like they're not stable enough to be broken out into a separate project yet; that will complicate
changes.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

Mime
View raw message