From dev-return-103192-archive-asf-public=cust-asf.ponee.io@kafka.apache.org Fri Apr 12 22:24:39 2019 Return-Path: X-Original-To: archive-asf-public@cust-asf.ponee.io Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [207.244.88.153]) by mx-eu-01.ponee.io (Postfix) with SMTP id D87FF180621 for ; Sat, 13 Apr 2019 00:24:38 +0200 (CEST) Received: (qmail 69155 invoked by uid 500); 12 Apr 2019 22:24:37 -0000 Mailing-List: contact dev-help@kafka.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@kafka.apache.org Delivered-To: mailing list dev@kafka.apache.org Received: (qmail 69125 invoked by uid 99); 12 Apr 2019 22:24:37 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd3-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 12 Apr 2019 22:24:37 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd3-us-west.apache.org (ASF Mail Server at spamd3-us-west.apache.org) with ESMTP id 9CF80181044 for ; Fri, 12 Apr 2019 22:24:36 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd3-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 1.8 X-Spam-Level: * X-Spam-Status: No, score=1.8 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, HTML_MESSAGE=2, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=disabled Authentication-Results: spamd3-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com Received: from mx1-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd3-us-west.apache.org [10.40.0.10]) (amavisd-new, port 10024) with ESMTP id xXvHMUvn8eSR for ; Fri, 12 Apr 2019 22:24:34 +0000 (UTC) Received: from mail-pg1-f175.google.com (mail-pg1-f175.google.com [209.85.215.175]) by mx1-lw-eu.apache.org (ASF Mail Server at mx1-lw-eu.apache.org) with ESMTPS id A524E5F48B for ; Fri, 12 Apr 2019 22:24:33 +0000 (UTC) Received: by mail-pg1-f175.google.com with SMTP id q1so5813901pgv.13 for ; Fri, 12 Apr 2019 15:24:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to; bh=5RoBuJ5C9wKCiWvdS+BCr66df8ZqeOGp359Duzi+TO8=; b=JI4lqkUS3SRnLRdbLjuAeiFW0g5Fc81MhjPooENcbYWsGO3LWGqR3jfWRQ3+RXgd4b yEMzgOr9ixNG4ePZvGUuqF4q3MupB34zfNC0LrSEGcT83nZc9MFgDyn1Ra9kShdrOswO CIzBImxfGrIGEclU2CiMLIL55UN/ytCVvgoHjwTEVpgw96c9v5+3WeOPzs5IoMiC8Bfz hNoEbkAoyEw/A+NNTGM4F5Lg+vJYgEjlsRFJCDfKxNyVmnuX7CE6BYs2/tqBoQeOsyc9 cdzd/H5Iy6EbF/+FS7kAUD6pQDapCiYBHTrmCG4u07NvNemhwutHdY1VgdmrgGeD0hsv u2kw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to; bh=5RoBuJ5C9wKCiWvdS+BCr66df8ZqeOGp359Duzi+TO8=; b=gpb0ILagV6lruDR9UvfWVGdct6LplBmTjE4LLMCo1v/0x1lqRbsyXXtisnm5o2h3MC TkpHffj4H664OX45TlokkHhaarJdC3bVdVCR0I08ZeXRkFz5oEhuX5ywbvYCWOkjc2ZX bMw3KUNFfM495wnidpWS8Ezthh9BpVSOvrS+/Zv8JaBKrO3B40pxSGBT0n2eMC7BrTmY pMXFURs30ISl20ldoCQ7LG1+r+S7i7vM4gwGUhx8CoWNLKVKI8YkCl7ixn0F74MFBcW6 Y2AApxjpCqnlczeA8UjMfE7DjtMBWiOtJ/RdS+uwZRJdtFbHild7/H2T/87BHVCzKs/J 14yg== X-Gm-Message-State: APjAAAWxDeVWKD448BB69VPVGkELvXeAQua3g+396O3bApsFaEMAftn1 rtYw+nTXs6ZD8iRzKhL333ZGKYcAkR12Kb1cfuvLXTOpwb0= X-Google-Smtp-Source: APXvYqzajheI34jAWv6YBqew3eF+315ENjCc1V8FBBq/ZskIjL2aUvmTMDJO0LYxelWjJqKjc+wszI5jkZcPTqZ7Jv4= X-Received: by 2002:a63:1203:: with SMTP id h3mr56914904pgl.164.1555107871844; Fri, 12 Apr 2019 15:24:31 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Randall Hauch Date: Fri, 12 Apr 2019 17:24:20 -0500 Message-ID: Subject: Re: [DISCUSS] KIP-449: Add connector contexts to Connect worker logs To: dev Content-Type: multipart/alternative; boundary="00000000000049ecdf05865cc46d" --00000000000049ecdf05865cc46d Content-Type: text/plain; charset="UTF-8" Thanks for the review and feedback, Konstantine. 1. Great suggestion. I've updated the KIP to hopefully make it more clear that the uncommented line is unchanged from the existing Log4J configuration file. 2. Regarding including a `-` before the task number is acceptable if it makes it easier to, read and filter. I've updated the KIP and PR to incorporate your suggestion. 3. Task numbers do start at 0, as seen by the DistributedHerder code that creates the tasks ( https://github.com/apache/kafka/blob/02221bd907a23041c95ce6446986bff631652b3a/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/distributed/DistributedHerder.java#L608-L611). I've updated the KIP to highlight that task numbers are 0-based. As you can see in the sample log included in the KIP, "task-0" corresponds to the first task, not to the connector. In fact, the following are examples of the log messages with the "worker" scope where the Connector implementation is called: [2019-04-02 17:01:38,315] INFO [local-file-source|worker] Creating connector local-file-source of type FileStreamSource (org.apache.kafka.connect.runtime.Worker:227) [2019-04-02 17:01:38,317] INFO [local-file-source|worker] Instantiated connector local-file-source with version 2.3.0-SNAPSHOT of type class org.apache.kafka.connect.file.FileStreamSourceConnector (org.apache.kafka.connect.runtime.Worker:230) ... [2019-04-02 17:11:46,408] INFO [local-file-sink|worker] Stopping connector local-file-sink (org.apache.kafka.connect.runtime.Worker:334) [2019-04-02 17:11:46,408] INFO [local-file-sink|worker] Stopped connector local-file-sink (org.apache.kafka.connect.runtime.Worker:350) The FileStreamSourceConnector class does not actually log any messages itself, but if it did those would be with the "[local-file-source|worker]" context and would include the name of the class and line number in parentheses (instead of e.g., "org.apache.kafka.connect.runtime.Worker:230"). 4. I thought about doing a more complex example, but IMO the extra log lines made the *sample* in the KIP quite a bit longer harder to understand. I thought it important to keep the KIP a bit more readable while showing which scope appear for the different log messages. A second task would essentially have the nearly the same log messages, just with a task number in the scope. 5. The PR already changes the `runtime/src/test/resources/log4j.properties` to include the connector context MDC parameter. Because that's not a public API, I've not mentioned it in the KIP. 6. I guess I understand your goal - it's not always clear which worker is being referenced. However, I'm not sure whether it's that valuable to include the "client.id" (if set) just in the "worker" scope. That means that it's maybe more useful to introduce a second MDC parameter (e.g., "%{ worker.id}") and optionally include that on all log messages. We'd have to set the MDC context in the code in each thread, which isn't too much effort. The other advantage of this approach is that it doesn't have to be configurable: you can control it via your own logging configuration file (rather than optionally including it in the "worker" scope on some of the log messages). Thoughts? What would the "%{worker.id}" MDC value be set to if "client.id" is not set? Final notes: Removed the placeholders, and corrected the typos and grammar. Thanks again for the detailed review! On Fri, Apr 12, 2019 at 2:05 PM Konstantine Karantasis < konstantine@confluent.io> wrote: > Thanks for the KIP Randall. > > It might not be obvious right away, but this is a great improvement when > running Connect with multiple connectors or when debugging Connect, for > instance in integration tests or system tests. KIP looks good to me > overall, I have just a few comments below: > > 1. In the snippet of the config, by including an uncommented line, maybe > it's not immediately clear that this line is an existing line in > connect-log4j.properties and not an addition. Should this be mentioned in a > separate code block or in a different way? > > 2. Currently when adding the taskId to a connector or connector task, we > precede it with a dash (-). I feel this addition would make it easier to > parse the taskId visually as well as with parsing tools (for example I can > use `-` as a delimiter easier than 'k' or even a multi-character delimiter > such as 'task'). Do you think it's an overhead? > > 3. I think a specific mention to the convention used on the base-index for > taskIds would be useful. For example, taskId equal to zero in most cases > represents the (source/sink) connector, while taskId > 0 corresponds to > (source/sink) tasks. By reading the KIP, I assume that task0 (or task-0 if > you agree with my previous comment) will be the connector. Is this the > case? > > 4. Should the example include a snippet for taskId > 0? > > 5. Do you intend to enable connector MDC in tests? In this case how would > you distinguish between multiple workers in integrations tests? Do you > intend to address this here, in the PR corresponding to this KIP or follow > up later? > > 6. Related to my previous comment (5). The property 'client.id' is already > used to identify a Connect worker. Have you thought of offering the ability > to identify the worker MDC if this property is set, keeping at the same > time with MDC the ability to not include it as identifier? > > Finally, I'd suggest removing text placeholders. I think it will be easier > to follow the KIP. Currently I see placeholder text below Contents and in > Rejected alternatives. > > And my least favorite section, but I can't help it :) > Typos/grammar: > to understand what (is) happening within the worker, > Kafka Connect use(s) ... > quotes maybe not needed, text already verbatim in > `config/connect-log4j.properties` > > Best, > Konstantine > > On Tue, Apr 2, 2019 at 4:20 PM Randall Hauch wrote: > > > I've been working on https://github.com/apache/kafka/pull/5743 for a > > while, > > but there were a number of comment, suggestions, and mild concerns on the > > PR. One of those comments was that maybe changing the Connect log content > > in this way probably warrants a KIP. So here it is: > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-449%3A+Add+connector+contexts+to+Connect+worker+logs > > > > I've also updated my PR reflect the KIP. Please reply with comments > and/or > > feedback. > > > > Best regards, > > > > Randall > > > --00000000000049ecdf05865cc46d--