accumulo-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Josh Elser <josh.el...@gmail.com>
Subject Re: [DISCUSS] Rethinking monitor log receiver
Date Mon, 06 Feb 2017 22:32:15 GMT
Avoiding trying to respond inline as your last reply didn't format well 
(not sure if my side or your side). So, sorry for a top-post.

I have no strong opinions on what technology is used to improve log 
aggregation. I would encourage whoever wants to take up this work to: do 
the research, present some facts about what exists, and present an plan 
for which software/approach is "best" for what we have to do. Many of 
the points you raised on "things bad in current implementation" can be 
re-stated as "could be better" and I would agreed with them.

No offense, but further discussion at this level just feels fruitless to 
me as I don't think anyone has ever argued *against* not owning 
log-aggregation (but I could be wrong). IMO to move this forward, it 
requires concrete suggestions.

Christopher wrote:
> On Mon, Feb 6, 2017 at 12:15 PM Josh Elser<josh.elser@gmail.com>  wrote:
>
> -1 for removal of the log-forwarding which does not include an
> out-of-the-box replacement. I'm going to avoid saying any more on this
> unless necessary.
>
>
> If we were to opt for removal (perhaps by agreeing that it were out of
> scope), I'd personally prefer it be phased out gradually, as users gained
> more experience integrating with superior tooling for distributed log
> aggregation. I think too many people rely on it right now to remove it
> abruptly.
>
>
> Most of the other things you bring up seem like you poking holes in how
> the log aggregation works presently. I don't think anyone would disagree
> that this could be made better. Instead of focusing on this, why not
> approach your issues by instead suggesting how you'd like to see it work
> and weigh the pros/cons?
>
>
> Yes, this discussion is about both whether, and how, we might be able to
> make it better. I haven't proposed a solution, because I don't have one at
> this time. Hence, my desire to [DISCUSS] it.
>
>
> Christopher wrote:
>> Currently, the monitor has an HA-standby behavior as a side-effect of the
>> way it handles logging.
>>
>> The way this works is that the other services read the monitor's lock in
> ZK
>> to get the current active monitor's host and port. They use this to set
>> some system properties, which are referenced in the default example
>> generic_logger.xml config file. They set these system properties and reset
>> the log4j system whenever they detect a change in the active monitor. This
>> has the effect of forcing all the logging to go to a particular socket
> open
>> on the monitor, wherever it is currently running.
>
> If you'd like to change the implementation, great.
>
>
> Right. So, this is a discuss thread about how we might want to change it. I
> don't have a good sense about what avenues are going to cause headaches.
>
>
> As a point of reference, the Master's ZK lock is also used for service
> discovery as well as distributed exclusion (a lock). I'm not sure why
> you find an issue with this.
>
>
> I don't find an issue with it. I recognize it does both. I stated that. I
> also recognize that it must do both for certain services. My point was that
> only one of those aspects (service discovery), rather than both, was needed
> for the monitor service. The other aspect (distributed exclusion) is an
> unnecessary side-effect which cripples redundant monitors because of this
> (optional) log receiving feature.
>
>
>> The ZK lock is currently being used to restrict monitor functionality to a
>> single monitor instance. But, this isn't really necessary. There isn't any
>> reason to restrict concurrent monitors. The real purpose of the ZK lock,
> as
>> I've described, is to hijack the ZK lock mechanism because it's also a
>> service-advertisement feature.
>
> The use of the ZK lock is to prevent users from accessing an instance of
> the Monitor which is not actually receiving the forwarded logs from the
> server. It would be terrible if half of an Ops team saw no errors on the
> monitor they went to while the other half saw the real errors.
>
>
> Yes. One solution would be to limit the distributed exclusion to just the
> log feature (along with any relevant messages), and leave the rest of the
> redundant monitor functional.
>
>
>> This is a bit convoluted and makes a lot of assumptions, and has a lot of
>> issues. It is also could be impeding some possible avenues of
>> simplification under ACCUMULO-3005.
>>
>> 1. It locks us in to using Log4J (probably a specific range of versions).
>> 2. It sends logs across the network insecurely.
>
> These seem like implementation details to me.
>
>
> Yes, but it's an inherent problem with the asynchronous log appender, which
> speaks directly to its suitability for running a custom log receiver.
>
>
>
>> 3. It assumes that you only want a single monitor service running.
>
> Again, an incorrect assumption.
>
>
> It prevents you from running more than one concurrently. The incorrect
> assumption that you want this is occurring on the part of the current
> implementation, based on its fundamental design.
>
>
>> 4. Code assumes particular configuration with particular variable's
>> embedded in them.
>
> Implementation detail
>
>
> Well, yes... this thread is a discuss thread about changing implementation
> in some way... so....
>
>
>> 5. Extra threads needed to track changes
>
>
>
> Implementation detail
>
>
> Yep.
>
>
>
>> 6. It adds code complexity.
>
> ??? I don't even know how to address this one.
>
>
> Responding to every point is optional. ;)
> But, seriously, my point here is that we often seem to throw the kitchen
> sink into our code, introducing unnecessary complexity, rather than
> encourage integration. This complexity makes integration harder, and the
> code less maintainable. While I don't yet know what my preferred solution
> is... I strongly suspect it is one that is both workable and less complex.
>
>
>> 7. It assumes the user wishes to use the monitor to watch logs, when other
>> tools are better suited for log aggregation, monitoring, and alerting.
>
> Your argument assumes that *all* users have such a capability set up.
> This feature does not preclude users from doing whatever they'd like.
>
>
> It's not an argument. I'm not championing a position. That point was simply
> an observation of the current default behavior, and the recognition that
> dedicated tools for this exist. I'm not assuming *any* users have any
> alternate capabilities set up, never mind "*all*". Whether or not users
> have such capabilities set up says nothing about whether we'd prefer to
> roll our own or help users integrate our product into alternatives. We need
> not know about their capabilities in order to adopt a preferred philosophy
> on this matter.
>
> And, it actually does preclude users from doing whatever they like. It
> imposes at least the following limitations on users: It's not possible
> (currently) for users to run the monitor without also listing on an extra,
> insecure, port, awaiting receipt of arbitrary log data. It's not possible
> for users to run two monitors concurrently and get functionality out of
> both. It's not possible for users to avoid log system resets, which could
> be expensive (performance) and could cause log messages to be dropped while
> it's reloading. It's not possible for users to be able to predict when the
> logging configuration will be reloaded (and changes take effect) on the
> basis of some new instance of the monitor being started somewhere.
>
>
>> This whole thing would be simpler if we just eliminated the monitor
>> log-watching feature entirely. However, we also have some options short of
>> that. For example, we could (1) use a different service-advertisement
>> mechanism that doesn't lock.
>
> Are you talking about using a persistent ZNode to advertise the Monitor
> location? What happens when there would be multiple monitors?
>
>
> Not necessarily a persistent ZNode. We can advertise as many monitors as we
> like, just as we currently do with tservers. This would make monitors
> available for log receipt, but the decision about who and how many to send
> to could be left to the user, perhaps with a custom log appender which is
> configurable.
>
>
> (2). We could stop doing this variable
>> injection thing, and still leave the socket appender running, so that
> users
>> will have to configure their destination in their own configuration files,
>> if they want to emit logs to that socket. (3) We could use an Accumulo RPC
>> to send logs, rather than the log4j API. Lots of options.
>
> Instead of #3, aren't there other "standard" log collection APIs that
> would make more sense than us rolling our own?
>
>
>
> Certainly. This is not an exhaustive list of options, nor is it a proposal
> to go with any one of them in particular. It's just a discussion. I'm not
> necessarily a fan of this option. It just occurred to me because I thought
> we could provide the feature at the same level of security as the rest of
> our RPCs (for instance, when using TLS or KRB).
>

Mime
View raw message