openwhisk-dev mailing list archives

From Tyson Norris <>
Subject Re: LogStore SPI
Date Mon, 25 Sep 2017 16:12:22 GMT
Just a follow-up on this LogStore PR, since a couple of side conversations have occurred, and
it may help to clarify a few things:

- There are 3 touchpoints of this SPI: 
	- log collection process (where ContainerProxy currently calls “docker logs” to acquire
log output from an activation)
	- log retrieval process (where APIs return activation log lines in response to a request
for either the activation details or specifically the logs)
	- container run parameters (to cause containers to launch with docker log driver options)
- AFAIK with these you can support any scenario of log management for activations; the extreme
and middle possibilities are:
	- collect no logs, expose no logs to developers (not sure this is useful, but just pointing
it out)
	- collect logs, but DO NOT expose them via APIs (just using log drivers, and users must visit
an external system to browse logs)
	- collect logs, and DO expose them via APIs (offering same behavior as currently exists,
while storing logs in the external system)
- depending on the LogStore impl, action containers may need to emit logs in a particular
format, but this PR doesn’t address that. For example, it is much
simpler to surface log events associated with an activation if the activation ID is embedded
in a structured log event, instead of random console.log or println messages emitted from
the action. This can either be supported by requiring action devs to use special logging hooks,
or overriding handling of stdout/stderr streams in the container (we’ve tested the latter
with good results in nodejs actions). But this all depends on how operators are planning to
expose logs back to action developers. 
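
To make the three touchpoints above a bit more concrete, here is a rough Python sketch. It is not the Scala SPI from the PR: the class and method names (LogStore, container_parameters, collect_logs, fetch_logs, docker_run_args) are hypothetical and only mirror the roles described above. The only real-world names used are docker's `--log-driver` and `--log-opt` flags and the fluentd driver's `tag` option.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class LogStore(ABC):
    """Hypothetical stand-in for the three SPI touchpoints (not the PR's API)."""

    @abstractmethod
    def container_parameters(self) -> Dict[str, List[str]]:
        """Extra `docker run` options, e.g. log driver settings."""

    @abstractmethod
    def collect_logs(self, activation_id: str, raw_lines: List[str]) -> List[str]:
        """Called after an activation; returns the lines to store with it."""

    @abstractmethod
    def fetch_logs(self, activation_id: str, stored_lines: List[str]) -> List[str]:
        """Called when an API request asks for an activation's logs."""


class DockerLogsStore(LogStore):
    """Roughly the existing behavior: collect, store, and return the lines."""

    def container_parameters(self) -> Dict[str, List[str]]:
        return {}

    def collect_logs(self, activation_id: str, raw_lines: List[str]) -> List[str]:
        return list(raw_lines)

    def fetch_logs(self, activation_id: str, stored_lines: List[str]) -> List[str]:
        return list(stored_lines)


class LogDriverOnlyStore(LogStore):
    """Logs leave via a docker log driver; nothing is exposed through the APIs."""

    def __init__(self, driver: str, opts: Dict[str, str]) -> None:
        self.driver = driver
        self.opts = opts

    def container_parameters(self) -> Dict[str, List[str]]:
        log_opts = [f"{key}={value}" for key, value in sorted(self.opts.items())]
        return {"--log-driver": [self.driver], "--log-opt": log_opts}

    def collect_logs(self, activation_id: str, raw_lines: List[str]) -> List[str]:
        return []  # the log driver already shipped the output elsewhere

    def fetch_logs(self, activation_id: str, stored_lines: List[str]) -> List[str]:
        return []  # developers browse logs in the external system instead


def docker_run_args(store: LogStore) -> List[str]:
    """Flatten the store's container parameters into `docker run` arguments."""
    args: List[str] = []
    for flag, values in store.container_parameters().items():
        for value in values:
            args += [flag, value]
    return args


if __name__ == "__main__":
    store = LogDriverOnlyStore("fluentd", {"tag": "openwhisk-action"})
    print(docker_run_args(store))
```

The third scenario (store externally, still expose via APIs) would be a variant where fetch_logs queries the external system; that query is left out here because it depends entirely on the backend.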

I’m happy to go over these details more on the call, and as always feel free to follow up
on Slack or the dev list with questions.
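
For reference, the idea of overriding stdout/stderr so that every line becomes a structured event carrying the activation ID can be sketched as below. This is a Python illustration only (the testing mentioned above was in nodejs actions), and the class name ActivationLogStream and the JSON field names are assumptions, not an agreed format.

```python
import contextlib
import io
import json


class ActivationLogStream(io.TextIOBase):
    """Wraps a target stream and emits one JSON event per completed line.

    Hypothetical helper: the field names below are illustrative only.
    """

    def __init__(self, target, activation_id: str, stream_name: str) -> None:
        self._target = target
        self._activation_id = activation_id
        self._stream_name = stream_name
        self._buffer = ""

    def write(self, text: str) -> int:
        # print() may call write() several times per line, so buffer until
        # a newline arrives and only then emit a complete event.
        self._buffer += text
        while "\n" in self._buffer:
            line, self._buffer = self._buffer.split("\n", 1)
            event = {
                "activationId": self._activation_id,
                "stream": self._stream_name,
                "message": line,
            }
            self._target.write(json.dumps(event) + "\n")
        return len(text)


if __name__ == "__main__":
    captured = io.StringIO()
    wrapper = ActivationLogStream(captured, "abc123", "stdout")
    with contextlib.redirect_stdout(wrapper):
        print("hello", "world")  # ordinary action code, no special logging hook
    print(captured.getvalue(), end="")
```

With something like this in place the action developer keeps using plain print/console.log style calls, and the log store can group events by activation ID without guessing at line boundaries.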


> On Sep 4, 2017, at 6:02 PM, Tyson Norris <> wrote:
> Hi All -
> I just created this LogStore SPI PR[1], and mentioned in the meeting last week I would
solicit feedback on the dev list.
> A couple of things to note:
> - our approach is using Splunk for log storage; we use fluentd in front of the Splunk forwarder,
so with minor additions, you can replace Splunk with anything that fluentd can talk to
> - there is also a generic provision for specifying your docker log driver choices, in
case you don’t want to use fluentd at all (the example config is fluentd, but there is no
code that is related specifically to fluentd)
> - as mentioned in the PR, there is some assumption about the format of stdout/stderr
from the action containers - we are working on a separate PR for this, but the approach is
to allow a configuration to be passed to the container that indicates a preferred log output.
Of course people can also deploy their own action containers, but I think this flexibility
should be exposed in the OOTB containers.
> One issue that comes from decoupling log collection from the activation execution is
the delay between when logs are generated and when the logs are available to developers. We
haven’t become attached to a specific approach for this, but some options (besides tuning
the log forwarders to lower latency or just polling till logs are available) are:
> - use the existing approach of adding a sentinel log to indicate the end of the activation
- this makes it possible to distinguish between “logs not collected yet” and “logs
collected but none were generated”; then a developer can be given a message like “logs
not available yet” in case the collection has not made any progress yet.
> - don’t use controller APIs at all, just use the log store (splunk, ELK, etc); this
has some effect on the usefulness of the CLI for debugging.
> Someone mentioned using syslog on the call, but I didn’t quite follow the entire workflow,
so please chime in here if this SPI interface would meet your needs?
> Finally, the changes in this PR are dependent on the ContainerFactory PR [2] since, in
our testing, using an alternative container provider (e.g. Mesos) is a real test case for delegating
container creation (which implies log collection as well) to an external system.
> Thanks
> Tyson
> [1]
> [2]
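
The sentinel option from the quoted message can be sketched roughly as follows. This is a Python illustration under assumptions: the sentinel text, the LogState names, and the classify function are all hypothetical, and the real marker and lookup would depend on the log store in use.

```python
from enum import Enum
from typing import List, Tuple

# Hypothetical marker appended to the log stream when an activation ends.
SENTINEL = "__ACTIVATION_END__"


class LogState(Enum):
    PENDING = "logs not available yet"
    EMPTY = "logs collected but none were generated"
    COMPLETE = "logs collected"


def classify(lines: List[str]) -> Tuple[LogState, List[str]]:
    """Decide what to tell the developer, given what the log store has so far."""
    if SENTINEL not in lines:
        # Collection has not reached the end of the activation yet;
        # return whatever has arrived so far (possibly nothing).
        return LogState.PENDING, list(lines)
    user_lines = lines[: lines.index(SENTINEL)]
    if not user_lines:
        return LogState.EMPTY, []
    return LogState.COMPLETE, user_lines


if __name__ == "__main__":
    for sample in ([], ["starting"], [SENTINEL], ["starting", "done", SENTINEL]):
        state, visible = classify(sample)
        print(state.value, visible)
```

Without the sentinel, an empty result from the log store is ambiguous; with it, an empty result before the marker clearly means the action produced no output.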
