couchdb-dev mailing list archives

From Damien Katz <dam...@apache.org>
Subject Re: Bringing automatic compaction into trunk
Date Tue, 16 Aug 2011 21:10:20 GMT
Filipe is addressing Paul's concerns. As for scanning vs. an evented architecture, I'd prefer
to see Filipe's working code in place, and later replaced with a better alternative. We need
to push the project forward; we value useful, correct code first. It's easier to improve on
it once it's in place.

Also, I have no objections to a more modular architecture; I very much welcome it. But that
work can happen concurrently with pushing forward the code and adding features the user community
cares about.

-Damien


On Aug 16, 2011, at 7:00 AM, Robert Newson wrote:

> Ok, let's see Paul's code concerns addressed first; it needs that
> cleanup before it can hit trunk.
> 
> I'd still prefer to see an event-driven rather than polling approach,
> e.g., hook into update_notifier and build a queue of databases that are
> actively being written to (and therefore growing). A much lazier
> background thing could compact databases that are inactive.
> 
> B.
> 
> On 16 August 2011 14:48, Jan Lehnardt <jan@apache.org> wrote:
>> 
>> On Aug 16, 2011, at 3:44 PM, Robert Newson wrote:
>> 
>>> All good points Jan, thanks.
>>> 
>>> Having large numbers of databases is one thing, but I'm focused on the
>>> impact on ongoing operations with this running in the background. What
>>> does it do to the users' experience to have all dbs scanned
>>> periodically, etc?
>>> 
>>> The reason I suggest doing it after the move, and in its own app, is
>>> to reduce the work needed to not use this code in some circumstances
>>> (Cloudant hosting, for example). Yes, it's a separate module and
>>> disabled by default, but putting it in its own application will make
>>> the separation much more explicit and preclude unintended
>>> entanglements with core over time.
>> 
>> I think this is a valid concern, but I don't think it outweighs the
>> disadvantage. I'm happy to spend time to make sure this is properly
>> modular after srcmv.
>> 
>> Cheers
>> Jan
>> --
>> 
>> 
>>> 
>>> B.
>>> 
>>> On 16 August 2011 14:31, Jan Lehnardt <jan@apache.org> wrote:
>>>> 
>>>> On Aug 16, 2011, at 2:59 PM, Robert Newson wrote:
>>>> 
>>>>> I'm -1 on the approach (as I understand it) taken by the scheduler as
>>>>> it will be problematic in precisely the circumstance when you'd most
>>>>> want auto compaction (large numbers of databases and views).
>>>> 
>>>> As Filipe mentions in the ticket, this was tested with large numbers of
>>>> databases.
>>>> 
>>>> In addition, your "most want" assumption doesn't hold for the average
>>>> user, I'd wager (no numbers, alas). I'd say it's a basic user-experience
>>>> plus that software doesn't start wasting a system resource without
>>>> cleaning up after itself. But this isn't even suggesting to enable this by
>>>> default. We have plenty of other features that need proper documentation
>>>> to be used correctly and that we are improving over time to make them
>>>> more obvious by removing common errors or odd behaviour.
>>>> 
>>>>> To this point "Just curious, would it make a big difference to commit
>>>>> the patch before srcmv and migrate it with the rest of the code base
>>>>> rather than letting it rot in JIRA and leave it all to Filipe to keep
>>>>> it updated." -- I'm -∞ on any suggestion that code should be put in
>>>>> trunk to stop it from rotting. Code should land when it's ready. I
>>>>> hope we're all agreed on that and that this paragraph was redundant.
>>>> 
>>>> I was suggesting that the patch is ready enough for trunk and that
>>>> the level of readiness should not be "solves all possible cases". Especially
>>>> for something that is disabled by default. If we take this to the extreme,
>>>> we'd never add any new features.
>>>> 
>>>> I'm not suggesting "it compiles for me, let's throw it into trunk".
>>>> 
>>>>> After srcmv, and then some work to OTP-ify each of the resultant
>>>>> subdirs, we should add this as a separate application. We might also
>>>>> mark it as beta in the first release to gather feedback from the
>>>>> community.
>>>> 
>>>> I don't see how that is any different from adding it before srcmv and
>>>> avoiding leaving the front-porting effort to a single person.
>>>> 
>>>> Ideally we'd already have srcmv done, but we don't and I don't want
>>>> to hold off progress for an architecture change.
>>>> 
>>>>> I'll be accused of 'stop energy' within nanoseconds of this post so I
>>>>> should end by saying I'm +1 on couchdb gaining the ability to
>>>>> automatically compact its databases and views in principle.
>>>> 
>>>> :)
>>>> 
>>>> Cheers
>>>> Jan
>>>> --
>>>> 
>>>> 
>>>>> 
>>>>> B.
>>>>> 
>>>>> On 16 August 2011 13:19, Jan Lehnardt <jan@apache.org> wrote:
>>>>>> Good points Robert,
>>>>>> 
>>>>>> I replied inline and then hijacked the thread for a more general
>>>>>> discussion, sorry about that :)
>>>>>> 
>>>>>> On Aug 16, 2011, at 2:08 PM, Robert Dionne wrote:
>>>>>> 
>>>>>>> Filipe,
>>>>>>> 
>>>>>>>  This is neat, I can definitely see the utility of the approach.
>>>>>>> I do share the concerns expressed in other comments with respect to
>>>>>>> the use of the config file for per db compaction specs and the use of
>>>>>>> a compact_loop that waits on config change messages when the ets table
>>>>>>> is empty. I don't think it fully takes into account the use case of
>>>>>>> large numbers of small dbs and/or some very large dbs interspersed with
>>>>>>> a lot of mid-size dbs.
>>>>>> 
>>>>>> As I said in the ticket, per-db config is desirable, but I think
>>>>>> it is outside the scope of the ticket.
>>>>>> 
>>>>>>>  Anyway I like it a lot though I've only read the code for half
>>>>>>> an hour or so. I also agree with others that the code base is reaching
>>>>>>> a point of being a bit crufty and it might be a good time, with the git
>>>>>>> migration etc., to take a breath and commit to making some of these
>>>>>>> OTP-compliant changes and design changes we've talked about.
>>>>>> 
>>>>>> Just curious, would it make a big difference to commit the patch
>>>>>> before srcmv and migrate it with the rest of the code base rather than
>>>>>> letting it rot in JIRA and leave it all to Filipe to keep it updated?
>>>>>> 
>>>>>> I also fear that a srcmv'd release is still out a bit and I'd really
>>>>>> like to see this one (and a few others) go into 1.2 (as per my previous
>>>>>> mail to this list in another thread). While it isn't the absolute
>>>>>> perfect solution in all cases, it is disabled by default and manual
>>>>>> compaction strategies work as they did before. In the meantime, we can
>>>>>> refine the rest of the system to make it more fully fledged and maybe
>>>>>> even enable it by default a few versions down when we are all
>>>>>> comfortable with it. I'm not very comfortable keeping good patches in
>>>>>> JIRA and not trunk until they solve every little edge case. We haven't
>>>>>> worked like this in the past and I don't think it is worth doing.
>>>>>> 
>>>>>> Cheers
>>>>>> Jan
>>>>>> --
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> 
>>>>>>> Regards,
>>>>>>> 
>>>>>>> Bob
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On Aug 15, 2011, at 9:29 PM, Filipe David Manana wrote:
>>>>>>> 
>>>>>>>> Developers, users,
>>>>>>>> 
>>>>>>>> It's been a while now since I opened a Jira ticket for it
>>>>>>>> ( https://issues.apache.org/jira/browse/COUCHDB-1153 ).
>>>>>>>> I won't describe it here in detail since it's already done
>>>>>>>> in the Jira ticket.
>>>>>>>> 
>>>>>>>> Unless there are objections, I would like to get this moving
>>>>>>>> soon.
>>>>>>>> 
>>>>>>>> Thanks
>>>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>>> Filipe David Manana,
>>>>>>>> fdmanana@gmail.com, fdmanana@apache.org
>>>>>>>> 
>>>>>>>> "Reasonable men adapt themselves to the world.
>>>>>>>> Unreasonable men adapt the world to themselves.
>>>>>>>> That's why all progress depends on unreasonable men."
>>>>>>> 
>>>>>> 
>>>>>> 
>>>> 
>>>> 
>> 
>> 

