flink-user mailing list archives

From Li Peng <li.p...@doordash.com>
Subject Re: Flink on Kubernetes seems to ignore log4j.properties
Date Thu, 12 Dec 2019 22:30:58 GMT
Ok, I think I identified the issue:

1. I accidentally bundled another version of slf4j in my job jar, which
conflicts with the slf4j jar bundled with flink/bin. Apparently slf4j in
that case falls back to a binding that ignores the configuration. Once I
removed slf4j from my job jar, the logger properties were properly
consumed.
2. The line log4j.appender.file.file=${log.file} in the default
properties didn't work (it resulted in log4j null errors); it
started working after I just set it manually to opt/flink/logs/output.log.
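For reference, the workaround in point 2 amounts to replacing the ${log.file} placeholder with a hard-coded path; a minimal sketch of the appender section, assuming the path described above:

```properties
# Default Flink config relies on -Dlog.file being set by the launch scripts:
#   log4j.appender.file.file=${log.file}
# Hard-coding the path avoids the null error when log.file is never defined:
log4j.appender.file=org.apache.log4j.FileAppender
log4j.appender.file.file=opt/flink/logs/output.log
log4j.appender.file.append=false
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
```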

Thanks for your guidance!
Li

On Thu, Dec 12, 2019 at 12:09 PM Li Peng <li.peng@doordash.com> wrote:

> Hey ouywl, interesting, I figured something like that would happen. I
> actually replaced all the log4j-x files with the same config I originally
> posted, including log4j-console, but that didn't change the behavior either.
>
> Hey Yang, yes I verified the properties files are as I configured, and
> that the logs don't match up with it. Here are the JVM arguments, if that's
> what you were looking for:
>
> [main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  JVM:
> OpenJDK 64-Bit Server VM - Oracle Corporation - 1.8/25.232-b09
> [main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
>  Maximum heap size: 989 MiBytes
> [main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
>  JAVA_HOME: /usr/local/openjdk-8
> [main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
>  Hadoop version: 2.8.3
> [main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  JVM
> Options:
> [main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
> -Xms1024m
> [main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
> -Xmx1024m
> [main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
> -Dlog4j.configuration=file:/opt/flink/conf/log4j-console.properties
> [main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
> -Dlogback.configurationFile=file:/opt/flink/conf/logback-console.xml
> [main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
>  Program Arguments:
> [main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
> --configDir
> [main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
> /opt/flink/conf
> [main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
> -Djobmanager.rpc.address=myjob-manager
> [main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
> -Dparallelism.default=2
> [main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
> -Dblob.server.port=6124
> [main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
> -Dqueryable-state.server.ports=6125
> [main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
> -Djobmanager.heap.size=3000m
> [main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
> -Dtaskmanager.heap.size=3000m
> [main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
> -Dmetrics.reporter.stsd.class=org.apache.flink.metrics.statsd.StatsDReporter
> [main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
> -Dmetrics.reporter.stsd.host=myhost.com
> [main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
> -Dmetrics.reporter.stsd.port=8125
> [main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
> -Dmetrics.system-resource=true
>
> Thanks,
> Li
>
> On Thu, Dec 12, 2019 at 4:40 AM ouywl <ouywl@139.com> wrote:
>
>>  @Li Peng
>>    I found your problem. Your start command uses the argument
>> "start-foreground", which runs `exec "${FLINK_BIN_DIR}"/flink-console.sh
>> ${ENTRY_POINT_NAME} "${ARGS[@]}"`. In flink-console.sh, the code sets
>> log_setting=(
>> "-Dlog4j.configuration=file:${FLINK_CONF_DIR}/log4j-console.properties"
>> "-Dlogback.configurationFile=file:${FLINK_CONF_DIR}/logback-console.xml").
>> So log4j.properties does not take effect; you need log4j-console.properties
>> and logback-console.xml instead.
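In other words, the launcher always points the JVM at the -console logging configs, so a plain log4j.properties is never read. A condensed sketch of that behavior (paraphrased, not the full flink-console.sh):

```shell
# Condensed sketch of the log_setting logic described above: the start
# script builds JVM flags that reference only the -console config files.
FLINK_CONF_DIR="/opt/flink/conf"
log_setting=(
  "-Dlog4j.configuration=file:${FLINK_CONF_DIR}/log4j-console.properties"
  "-Dlogback.configurationFile=file:${FLINK_CONF_DIR}/logback-console.xml"
)
# The real script passes these flags to the java command; here we just print them.
echo "${log_setting[@]}"
```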
>>
>> ouywl
>> ouywl@139.com
>>
>>
>> On 12/12/2019 15:35, ouywl <ouywl@139.com> wrote:
>>
>> Hi Yang,
>>    Could you give more detail? The log4j.properties content, and the k8s
>> yaml. Do you use the Dockerfile in flink-container? When I test with the
>> default per-job yaml in flink-container, it only shows logs in the docker
>> output, and there are no logs in /opt/flink/log.
>>
>>
>> On 12/12/2019 13:47, Yang Wang <danrtsey.wy@gmail.com> wrote:
>>
>> Hi Peng,
>>
>> What I mean is to use `docker exec` to get into the running pod and `ps`
>> to see the actual command that is running the jobmanager.
>> Have you checked that /opt/flink/conf/log4j.properties is correct?
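The inspection described above can be sketched as follows; the pod name is hypothetical, and for a self-contained illustration the same `ps` technique is shown against the current shell:

```shell
# On a real cluster you would inspect the jobmanager container, e.g.:
#   kubectl exec -it <jobmanager-pod> -- ps -ef | grep java
# The point is to read the full command line, including the
# -Dlog4j.configuration flag. Locally, the same technique against this
# shell's own process:
ps -o args= -p $$
```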
>>
>> I have tested standalone per-job on my kubernetes cluster, the logs show
>> up as expected.
>>
>>
>> Best,
>> Yang
>>
>> Li Peng <li.peng@doordash.com> wrote on Thu, Dec 12, 2019 at 2:59 AM:
>>
>>> Hey Yang, here are the commands:
>>>
>>> "/opt/flink/bin/taskmanager.sh",
>>> "start-foreground",
>>> "-Djobmanager.rpc.address={{ .Chart.Name }}-job-manager",
>>> "-Dtaskmanager.numberOfTaskSlots=1"
>>>
>>> "/opt/flink/bin/standalone-job.sh",
>>> "start-foreground",
>>> "-Djobmanager.rpc.address={{ .Chart.Name }}-job-manager",
>>> "-Dparallelism.default={{ .Values.task.replicaCount }}"
>>>
>>> Yes it's very curious that I don't see any logs actually written to
>>> /opt/flink/log.
>>>
>>> On Tue, Dec 10, 2019 at 11:17 PM Yang Wang <danrtsey.wy@gmail.com>
>>> wrote:
>>>
>>>> Could you find the logs under /opt/flink/log/jobmanager.log? If not,
>>>> could you share the commands the JobManager and TaskManager are started
>>>> with? If the command is correct and the log4j config under
>>>> /opt/flink/conf is as expected, it is curious why we cannot get the
>>>> logs.
>>>>
>>>>
>>>> Best,
>>>> Yang
>>>>
>>>> Li Peng <li.peng@doordash.com> wrote on Wed, Dec 11, 2019 at 1:24 PM:
>>>>
>>>>> Ah I see. I think the Flink app is reading files from /opt/flink/conf
>>>>> correctly as it is, since changes I make to flink-conf are picked up as
>>>>> expected; it's just the log4j properties that are either not being
>>>>> used, or don't apply to stdout or whatever source k8s uses for its
>>>>> logs. Given that the pods don't seem to have logs written to file
>>>>> anywhere, contrary to the properties, I'm inclined to say it's the
>>>>> former and that the log4j properties just aren't being picked up. Still
>>>>> have no idea why, though.
>>>>>
>>>>> On Tue, Dec 10, 2019 at 6:56 PM Yun Tang <myasuka@live.com> wrote:
>>>>>
>>>>>> Sure, /opt/flink/conf is mounted as a volume from the configmap.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Best
>>>>>>
>>>>>> Yun Tang
>>>>>>
>>>>>>
>>>>>>
>>>>>> *From: *Li Peng <li.peng@doordash.com>
>>>>>> *Date: *Wednesday, December 11, 2019 at 9:37 AM
>>>>>> *To: *Yang Wang <danrtsey.wy@gmail.com>
>>>>>> *Cc: *vino yang <yanghua1127@gmail.com>, user <user@flink.apache.org>
>>>>>> *Subject: *Re: Flink on Kubernetes seems to ignore log4j.properties
>>>>>>
>>>>>>
>>>>>>
>>>>>> 1. Hey Yun, I'm calling /opt/flink/bin/standalone-job.sh and
>>>>>> /opt/flink/bin/taskmanager.sh on my job and task managers respectively.
>>>>>> It's based on the setup described here:
>>>>>> http://shzhangji.com/blog/2019/08/24/deploy-flink-job-cluster-on-kubernetes/.
>>>>>> I haven't tried the configmap approach yet; does it also replace the
>>>>>> conf files in /opt/flink/conf?
>>>>>>
>>>>>> 2. Hey Vino, here's a sample of the kubernetes logs:
>>>>>> https://pastebin.com/fqJrgjZu  I didn't change any patterns from the
>>>>>> default, so the string patterns should look the same, but as you can
>>>>>> see it's full of INFO checkpoint logs that I originally was trying to
>>>>>> suppress. Based on my log4j.properties, the level should be set to
>>>>>> WARN. I couldn't actually find any .out files on the pod; this is from
>>>>>> the kubectl logs command. I also didn't see any files in
>>>>>> /opt/flink/log, which I thought my log4j was specified to do, hence me
>>>>>> thinking that the properties weren't actually being consumed. I also
>>>>>> have the same properties in my src/main/resources folder.
>>>>>>
>>>>>> 3. Hey Yang, yes this is a standalone session cluster. I did specify
>>>>>> in the docker file to copy the log4j.properties to the /opt/flink/conf
>>>>>> folder on the image, and I confirmed that the properties are correct
when I
>>>>>> bash'd into the pod and viewed them manually.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Incidentally, I also tried passing the -Dlog4j.configuration argument
>>>>>> to the programs, and it doesn't work either. And based on what I'm
>>>>>> reading on jira, that option is not really supported anymore?
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thanks for your responses, folks!
>>>>>>
>>>>>> Li
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Dec 9, 2019 at 7:10 PM Yang Wang <danrtsey.wy@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>> Hi Li Peng,
>>>>>>
>>>>>>
>>>>>>
>>>>>> You are running a standalone session cluster or a per-job cluster on
>>>>>> kubernetes, right?
>>>>>>
>>>>>> If so, I think you need to check the log4j.properties in the image,
>>>>>> not the local one. The log is stored in /opt/flink/log/jobmanager.log
>>>>>> by default.
>>>>>>
>>>>>>
>>>>>>
>>>>>> If you are running the active Kubernetes integration, the following
>>>>>> CLI option could be used to remove the redirect:
>>>>>>
>>>>>> -Dkubernetes.container-start-command-template="%java% %classpath%
>>>>>> %jvmmem% %jvmopts% %logging% %class% %args%"
>>>>>>
>>>>>>
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Yang
>>>>>>
>>>>>>
>>>>>>
>>>>>> vino yang <yanghua1127@gmail.com> wrote on Tue, Dec 10, 2019 at
>>>>>> 10:55 AM:
>>>>>>
>>>>>> Hi Li,
>>>>>>
>>>>>>
>>>>>>
>>>>>> A potential reason could be conflicting logging frameworks. Can you
>>>>>> share the log in your .out file and let us know whether the print
>>>>>> format of the log matches the configuration file you gave?
>>>>>>
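One common way such a conflict arises is bundling a second slf4j binding into the fat job jar, which was indeed the cause reported at the top of this thread. A hypothetical Maven fragment (artifact names and versions are illustrative, not taken from the poster's build) that keeps the logging stack out of the shaded jar:

```xml
<!-- Illustrative sketch: mark the logging dependencies "provided" so the
     shaded job jar does not bundle a second slf4j binding that clashes
     with the one shipped in flink/lib. -->
<dependency>
  <groupId>org.slf4j</groupId>
  <artifactId>slf4j-log4j12</artifactId>
  <version>1.7.15</version>
  <scope>provided</scope>
</dependency>
<dependency>
  <groupId>log4j</groupId>
  <artifactId>log4j</artifactId>
  <version>1.2.17</version>
  <scope>provided</scope>
</dependency>
```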
>>>>>>
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Vino
>>>>>>
>>>>>>
>>>>>>
>>>>>> Li Peng <li.peng@doordash.com> wrote on Tue, Dec 10, 2019 at 10:09 AM:
>>>>>>
>>>>>> Hey folks, I noticed that my kubernetes flink logs (reached via
>>>>>> *kubectl logs <pod-name>*) completely ignore any of the configurations
>>>>>> I put into /flink/conf/. I set the logger level to WARN, yet I still
>>>>>> see INFO level logging from flink loggers like
>>>>>> org.apache.flink.runtime.checkpoint.CheckpointCoordinator. I even
>>>>>> copied the same properties to /flink/conf/log4j-console.properties
>>>>>> and log4j-cli.properties.
>>>>>>
>>>>>>
>>>>>>
>>>>>> From what I can tell, kubernetes just listens to stdout and stderr,
>>>>>> so shouldn't the log4j.properties control output to them? Has anyone
>>>>>> seen this issue before?
>>>>>>
>>>>>>
>>>>>>
>>>>>> Here is my log4j.properties:
>>>>>>
>>>>>>
>>>>>> # This affects logging for both user code and Flink
>>>>>> log4j.rootLogger=WARN, file, console, stdout
>>>>>>
>>>>>> # Uncomment this if you want to _only_ change Flink's logging
>>>>>> log4j.logger.org.apache.flink=WARN
>>>>>>
>>>>>> # The following lines keep the log level of common libraries/connectors on
>>>>>> # log level INFO. The root logger does not override this. You have to manually
>>>>>> # change the log levels here.
>>>>>> log4j.logger.akka=INFO
>>>>>> log4j.logger.org.apache.kafka=INFO
>>>>>> log4j.logger.org.apache.hadoop=INFO
>>>>>> log4j.logger.org.apache.zookeeper=INFO
>>>>>>
>>>>>> # Log all infos in the given file
>>>>>> log4j.appender.file=org.apache.log4j.FileAppender
>>>>>> log4j.appender.file.file=${log.file}
>>>>>> log4j.appender.file.append=false
>>>>>> log4j.appender.file.layout=org.apache.log4j.PatternLayout
>>>>>> log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
>>>>>>
>>>>>> # Log all infos to the console
>>>>>> log4j.appender.console=org.apache.log4j.ConsoleAppender
>>>>>> log4j.appender.console.layout=org.apache.log4j.PatternLayout
>>>>>> log4j.appender.console.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
>>>>>>
>>>>>> # Suppress the irrelevant (wrong) warnings from the Netty channel handler
>>>>>> log4j.logger.org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline=ERROR, file, console
>>>>>> log4j.logger.org.apache.flink.streaming.api.functions.sink.TwoPhaseCommitSinkFunction=WARN
>>>>>> log4j.logger.org.apache.flink.runtime.checkpoint=WARN
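Since flink-console.sh loads the -console config files when started with "start-foreground" (as ouywl points out later in the thread), one way to make a config like the above take effect is to install it under both names when building the image. A hypothetical Dockerfile fragment (the base image tag is an assumption):

```dockerfile
# Illustrative sketch: ship the same log4j config under both names, since
# "start-foreground" mode reads the -console variant.
FROM flink:1.9
COPY log4j.properties /opt/flink/conf/log4j.properties
COPY log4j.properties /opt/flink/conf/log4j-console.properties
```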
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Li
>>>>>>
>>>>>>
