Hey ouywl, interesting, I figured something like that would happen. I actually replaced all the log4j-x files with the same config I originally posted, including log4j-console, but that didn't change the behavior either.

Hey Yang, yes I verified the properties files are as I configured, and that the logs don't match up with it. Here are the JVM arguments, if that's what you were looking for: 

[main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  JVM: OpenJDK 64-Bit Server VM - Oracle Corporation - 1.8/25.232-b09
[main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  Maximum heap size: 989 MiBytes
[main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  JAVA_HOME: /usr/local/openjdk-8
[main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  Hadoop version: 2.8.3
[main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  JVM Options:
[main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -     -Xms1024m
[main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -     -Xmx1024m
[main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -     -Dlog4j.configuration=file:/opt/flink/conf/log4j-console.properties
[main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -     -Dlogback.configurationFile=file:/opt/flink/conf/logback-console.xml
[main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  Program Arguments:
[main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -     --configDir
[main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -     /opt/flink/conf
[main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -     -Djobmanager.rpc.address=myjob-manager
[main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -     -Dparallelism.default=2
[main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -     -Dblob.server.port=6124
[main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -     -Dqueryable-state.server.ports=6125
[main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -     -Djobmanager.heap.size=3000m
[main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -     -Dtaskmanager.heap.size=3000m
[main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -     -Dmetrics.reporter.stsd.class=org.apache.flink.metrics.statsd.StatsDReporter
[main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -     -Dmetrics.reporter.stsd.host=myhost.com
[main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -     -Dmetrics.reporter.stsd.port=8125
[main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint -     -Dmetrics.system-resource=true

Thanks,
Li

On Thu, Dec 12, 2019 at 4:40 AM ouywl <ouywl@139.com> wrote:
 @Li Peng 
   I found your problems.  Your start cmd use args “start-foreground”, It will run “exec "${FLINK_BIN_DIR}"/flink-console.sh ${ENTRY_POINT_NAME} "${ARGS[@]}”, and In ' flink-console.sh’, the code is “log_setting=("-Dlog4j.configuration=file:${FLINK_CONF_DIR}/log4j-console.properties" "-Dlogback.configurationFile=file:${FLINK_CONF_DIR}/logback-console.xml")” . So the log4j.properties not work. It need log4j-console.properties and logback-console.xml.


On 12/12/2019 15:35ouywl<ouywl@139.com> wrote:
HI yang,
   Could you give more info detail? log4j.properties content, and The k8s yaml. Is use the dockerfile in flink-container? When I test it use the default per-job yaml in flick-container? It is only show logs in docker infos. And not logs in /opt/flink/log.


On 12/12/2019 13:47Yang Wang<danrtsey.wy@gmail.com> wrote:
Hi Peng,

What i mean is to use `docker exec` into the running pod and `ps` to get the real
command that is running for jobmanager. 
Do you have checked the /opt/flink/conf/log4j.properties is right?

I have tested standalone per-job on my kubernetes cluster, the logs show up as expected.


Best,
Yang

Li Peng <li.peng@doordash.com> 于2019年12月12日周四 上午2:59写道:
Hey Yang, here are the commands:
"/opt/flink/bin/taskmanager.sh",
"start-foreground",
"-Djobmanager.rpc.address={{ .Chart.Name }}-job-manager",
"-Dtaskmanager.numberOfTaskSlots=1"
"/opt/flink/bin/standalone-job.sh",
"start-foreground",
"-Djobmanager.rpc.address={{ .Chart.Name }}-job-manager",
"-Dparallelism.default={{ .Values.task.replicaCount }}"
Yes it's very curious that I don't see any logs actually written to /opt/flink/log. 

On Tue, Dec 10, 2019 at 11:17 PM Yang Wang <danrtsey.wy@gmail.com> wrote:
Could you find the logs under /opt/flink/log/jobmanager.log? If not, please share the
commands the JobManager and TaskManager are using? If the command is correct
and the log4j under /opt/flink/conf is expected, it is so curious why we could not get the logs.


Best,
Yang

Li Peng <li.peng@doordash.com> 于2019年12月11日周三 下午1:24写道:
Ah I see. I think the Flink app is reading files from /opt/flink/conf correctly as it is, since changes I make to flink-conf are picked up as expected, it's just the log4j properties that are either not being used, or don't apply to stdout or whatever source k8 uses for its logs? Given that the pods don't seem to have logs written to file anywhere, contrary to the properties, I'm inclined to say it's the former and that the log4j properties just aren't being picked up. Still have no idea why though.

On Tue, Dec 10, 2019 at 6:56 PM Yun Tang <myasuka@live.com> wrote:

Sure, /opt/flink/conf is mounted as a volume from the configmap.

 

Best

Yun Tang

 

From: Li Peng <li.peng@doordash.com>
Date: Wednesday, December 11, 2019 at 9:37 AM
To: Yang Wang <danrtsey.wy@gmail.com>
Cc: vino yang <yanghua1127@gmail.com>, user <user@flink.apache.org>
Subject: Re: Flink on Kubernetes seems to ignore log4j.properties

 

1. Hey Yun, I'm calling /opt/flink/bin/standalone-job.sh and /opt/flink/bin/taskmanager.sh on my job and task managers respectively. It's based on the setup described here: http://shzhangji.com/blog/2019/08/24/deploy-flink-job-cluster-on-kubernetes/ . I haven't tried the configmap approach yet, does it also replace the conf files in /opt/flink/conf?

2. Hey Vino, here's a sample of the kubernetes: https://pastebin.com/fqJrgjZu  I didn't change any patterns from the default, so the string patterns should look the same, but as you can see it's full of info checkpoint logs that I originally was trying to suppress. Based on my log4j.properties, the level should be set to WARN. I couldn't actually find any .out files on the pod, this is from the kubectl logs command. I also didn't see any files in /opt/flink/log, which I thought my log4j was specified to do, hence me thinking that the properties weren't actually being consumed. I also have the same properties in my src/main/resources folder.

3. Hey Yang, yes this is a standalone session cluster. I did specify in the docker file to copy the log4j.properties to the /opt/flink/conf folder on the image, and I confirmed that the properties are correct when I bash'd into the pod and viewed them manually.

 

Incidentally, I also tried passing the -Dlog4j.configuration argument to the programs, and it doesn't work either. And based on what I'm reading on jira, that option is not really supported anymore?

 

Thanks for your responses, folks!

Li

 

On Mon, Dec 9, 2019 at 7:10 PM Yang Wang <danrtsey.wy@gmail.com> wrote:

Hi Li Peng,

 

You are running standalone session cluster or per-job cluster on kubernetes. Right?

If so, i think you need to check your log4j.properties in the image, not local. The log is

stored to /opt/flink/log/jobmanager.log by default. 

 

If you are running active Kubernetes integration for a fresh taste. The following cli option

could be used to remove the redirect.

-Dkubernetes.container-start-command-template="%java% %classpath% %jvmmem% %jvmopts% %logging% %class% %args%"

 

Best,

Yang

 

vino yang <yanghua1127@gmail.com> 20191210日周二 上午10:55写道:

Hi Li,

 

A potential reason could be conflicting logging frameworks. Can you share the log in your .out file and let us know if the print format of the log is the same as the configuration file you gave.

 

Best,

Vino

 

Li Peng <li.peng@doordash.com> 20191210日周二 上午10:09写道:

Hey folks, I noticed that my kubernetes flink logs (reached via kubectl logs <pod-name>) completely ignore any of the configurations I put into /flink/conf/. I set the logger level to WARN, yet I still see INFO level logging from flink loggers like org.apache.flink.runtime.checkpoint.CheckpointCoordinator. I even made copied the same properties to /flink/conf/log4j-console.properties and log4j-cli.properties. 

 

From what I can tell, kubernetes just listens to stdout and stderr, so shouldn't the log4j.properties control output to them? Anyone seen this issue before?

 

Here is my log4j.properties:


# This affects logging for both user code and Flink
log4j.rootLogger=WARN, file, console, stdout

# Uncomment this if you want to _only_ change Flink's logging
log4j.logger.org.apache.flink=WARN

# The following lines keep the log level of common libraries/connectors on
# log level INFO. The root logger does not override this. You have to manually
# change the log levels here.
log4j.logger.akka=INFO
log4j.logger.org.apache.kafka=INFO
log4j.logger.org.apache.hadoop=INFO
log4j.logger.org.apache.zookeeper=INFO

# Log all infos in the given file
log4j.appender.file=org.apache.log4j.FileAppender
log4j.appender.file.file=${log.file}
log4j.appender.file.append=false
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n

# Log all infos to the console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n

# Suppress the irrelevant (wrong) warnings from the Netty channel handler
log4j.logger.org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline=ERROR, file, console
log4j.logger.org.apache.flink.streaming.api.functions.sink.TwoPhaseCommitSinkFunction=WARN
log4j.logger.org.apache.flink.runtime.checkpoint=WARN

Thanks,

Li