flink-user mailing list archives

From Jacky D <jacky.du0...@gmail.com>
Subject Re: Flink Memory analyze on AWS EMR
Date Tue, 12 May 2020 18:44:56 GMT
Hi Arvid,

Thanks for the advice. I removed the quotes and it did create a YARN
session on EMR, but I didn't find any JIT log file generated.

The config with quotes works on a standalone cluster. I also tried to
pass the property dynamically within the yarn-session command:

flink-yarn-session -n 1 -d -nm testSession -yD
env.java.opts="-XX:+UnlockDiagnosticVMOptions
-XX:+TraceClassLoading -XX:+LogCompilation
-XX:LogFile=${FLINK_LOG_PREFIX}.jit -XX:+PrintAssembly"


but got the same result: the session was created, but I cannot find any JIT
log file under the container logs.
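
As a sanity check on the -yD form, here is a rough Python sketch of what I
believe the submitting shell does to that argument before Flink ever sees it
(just an illustration of POSIX quoting, not Flink code, and only my assumption
about why no .jit file shows up):

```python
import os
import shlex

# The -yD argument as typed on the command line (abbreviated).
raw = ('-yD env.java.opts="-XX:+UnlockDiagnosticVMOptions -XX:+LogCompilation '
       '-XX:LogFile=${FLINK_LOG_PREFIX}.jit"')

# The shell strips the double quotes during word splitting, so Flink receives
# the option value unquoted -- quoting is not the problem with the -yD form.
tokens = shlex.split(raw)
value = tokens[1].split("=", 1)[1]
print(value)

# ${FLINK_LOG_PREFIX} sits inside double quotes, so the *submitting* shell also
# expands it. The variable is only set by Flink's standalone launch scripts, so
# on the EMR master it is typically empty and the JIT log path collapses to a
# bare ".jit" in the container's working directory, not under the container
# log directory.
os.environ["FLINK_LOG_PREFIX"] = ""   # simulate: variable unset on the client
print(os.path.expandvars(value))      # ends with -XX:LogFile=.jit
```

If that is right, escaping the dollar sign (or single-quoting the value) so the
expansion happens inside the container might be worth a try.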


Thanks

Jacky
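
P.S. For reference, a rough sketch (again Python, just illustrating
tokenization, not Flink code) of why I think the quoted form in
flink-conf.yaml produced the earlier "Usage: java" output: the quotes survive
into the Application Master start command, so the whole option group reaches
the JVM as a single argument.

```python
import shlex

# Abbreviated form of the AM start command from the debug log: the quotes are
# still present around the option group.
cmd = ('java -Xmx424m '
       '"-XX:+UnlockDiagnosticVMOptions -XX:+TraceClassLoading '
       '-XX:+LogCompilation -XX:+PrintAssembly" Main')

# POSIX word splitting keeps the quoted group together as ONE argv element.
argv = shlex.split(cmd)
print(len(argv))  # 4: java, -Xmx424m, the fused option group, Main
print(argv[2])

# A single argument containing spaces is not a valid JVM option, so java
# prints its usage help and the container exits with code 1.
```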

Arvid Heise <arvid@ververica.com> wrote on Tue, May 12, 2020 at 12:57 PM:

> Hi Jacky,
>
> I suspect that the quotes are the actual issue. Could you try to remove
> them? See also [1].
>
> [1]
> http://blogs.perl.org/users/tinita/2018/03/strings-in-yaml---to-quote-or-not-to-quote.html
>
> On Tue, May 12, 2020 at 4:03 PM Jacky D <jacky.du0314@gmail.com> wrote:
>
>> Hi Xintong,
>>
>> Thanks for the reply. I attached the lines below for the application
>> master start command:
>>
>>
>> 2020-05-11 21:16:16,635 DEBUG org.apache.hadoop.util.PerformanceAdvisory
>>                   - Crypto codec
>> org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec is not available.
>> 2020-05-11 21:16:16,635 DEBUG org.apache.hadoop.util.PerformanceAdvisory
>>                   - Using crypto codec
>> org.apache.hadoop.crypto.JceAesCtrCryptoCodec.
>> 2020-05-11 21:16:16,636 DEBUG org.apache.hadoop.hdfs.DataStreamer
>>                    - DataStreamer block
>> BP-1519523618-98.94.65.144-1581106168138:blk_1073745139_4315 sending packet
>> packet seqno: 0 offsetInBlock: 0 lastPacketInBlock: false
>> lastByteOffsetInBlock: 1697
>> 2020-05-11 21:16:16,637 DEBUG org.apache.hadoop.hdfs.DataStreamer
>>                    - DFSClient seqno: 0 reply: SUCCESS
>> downstreamAckTimeNanos: 0 flag: 0
>> 2020-05-11 21:16:16,637 DEBUG org.apache.hadoop.hdfs.DataStreamer
>>                    - DataStreamer block
>> BP-1519523618-98.94.65.144-1581106168138:blk_1073745139_4315 sending packet
>> packet seqno: 1 offsetInBlock: 1697 lastPacketInBlock: true
>> lastByteOffsetInBlock: 1697
>> 2020-05-11 21:16:16,638 DEBUG org.apache.hadoop.hdfs.DataStreamer
>>                    - DFSClient seqno: 1 reply: SUCCESS
>> downstreamAckTimeNanos: 0 flag: 0
>> 2020-05-11 21:16:16,638 DEBUG org.apache.hadoop.hdfs.DataStreamer
>>                    - Closing old block
>> BP-1519523618-98.94.65.144-1581106168138:blk_1073745139_4315
>> 2020-05-11 21:16:16,641 DEBUG org.apache.hadoop.ipc.Client
>>                   - IPC Client (1954985045) connection to
>> ip-98-94-65-144.ec2.internal/98.94.65.144:8020 from hadoop sending #70
>> org.apache.hadoop.hdfs.protocol.ClientProtocol.complete
>> 2020-05-11 21:16:16,643 DEBUG org.apache.hadoop.ipc.Client
>>                   - IPC Client (1954985045) connection to
>> ip-98-94-65-144.ec2.internal/98.94.65.144:8020 from hadoop got value #70
>> 2020-05-11 21:16:16,643 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine
>>                    - Call: complete took 2ms
>> 2020-05-11 21:16:16,643 DEBUG org.apache.hadoop.ipc.Client
>>                   - IPC Client (1954985045) connection to
>> ip-98-94-65-144.ec2.internal/98.94.65.144:8020 from hadoop sending #71
>> org.apache.hadoop.hdfs.protocol.ClientProtocol.setTimes
>> 2020-05-11 21:16:16,645 DEBUG org.apache.hadoop.ipc.Client
>>                   - IPC Client (1954985045) connection to
>> ip-98-94-65-144.ec2.internal/98.94.65.144:8020 from hadoop got value #71
>> 2020-05-11 21:16:16,645 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine
>>                    - Call: setTimes took 2ms
>> 2020-05-11 21:16:16,647 DEBUG org.apache.hadoop.ipc.Client
>>                   - IPC Client (1954985045) connection to
>> ip-98-94-65-144.ec2.internal/98.94.65.144:8020 from hadoop sending #72
>> org.apache.hadoop.hdfs.protocol.ClientProtocol.setPermission
>> 2020-05-11 21:16:16,648 DEBUG org.apache.hadoop.ipc.Client
>>                   - IPC Client (1954985045) connection to
>> ip-98-94-65-144.ec2.internal/98.94.65.144:8020 from hadoop got value #72
>> 2020-05-11 21:16:16,648 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine
>>                    - Call: setPermission took 2ms
>> 2020-05-11 21:16:16,654 DEBUG
>> org.apache.flink.yarn.AbstractYarnClusterDescriptor           - Application
>> Master start command: $JAVA_HOME/bin/java -Xmx424m
>> "-XX:+UnlockDiagnosticVMOptions -XX:+TraceClassLoading -XX:+LogCompilation
>> -XX:LogFile=${FLINK_LOG_PREFIX}.jit -XX:+PrintAssembly"
>> -Dlog.file="<LOG_DIR>/jobmanager.log"
>> -Dlog4j.configuration=file:log4j.properties
>> org.apache.flink.yarn.entrypoint.YarnSessionClusterEntrypoint  1>
>> <LOG_DIR>/jobmanager.out 2> <LOG_DIR>/jobmanager.err
>> 2020-05-11 21:16:16,654 DEBUG org.apache.hadoop.ipc.Client
>>                   - stopping client from cache:
>> org.apache.hadoop.ipc.Client@28194a50
>> 2020-05-11 21:16:16,656 DEBUG
>> org.apache.flink.yarn.AbstractYarnClusterDescriptor$ApplicationSubmissionContextReflector
>> - org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext supports
>> method setApplicationTags.
>> 2020-05-11 21:16:16,656 DEBUG
>> org.apache.flink.yarn.AbstractYarnClusterDescriptor$ApplicationSubmissionContextReflector
>> - org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext supports
>> method setAttemptFailuresValidityInterval.
>> 2020-05-11 21:16:16,656 DEBUG
>> org.apache.flink.yarn.AbstractYarnClusterDescriptor$ApplicationSubmissionContextReflector
>> - org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext supports
>> method setKeepContainersAcrossApplicationAttempts.
>> 2020-05-11 21:16:16,656 DEBUG
>> org.apache.flink.yarn.AbstractYarnClusterDescriptor$ApplicationSubmissionContextReflector
>> - org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext supports
>> method setNodeLabelExpression.
>>
>> Xintong Song <tonysong820@gmail.com> 于2020年5月11日周一 下午10:11写道:
>>
>>> Hi Jacky,
>>>
>>> Could you search for "Application Master start command:" in the debug
>>> log and post the result and a few lines before & after it? This is not
>>> included in the clip of the attached log file.
>>>
>>> Thank you~
>>>
>>> Xintong Song
>>>
>>>
>>>
>>> On Tue, May 12, 2020 at 5:33 AM Jacky D <jacky.du0314@gmail.com> wrote:
>>>
>>>> Hi Robert,
>>>>
>>>> Thanks so much for the quick reply. I changed the log level to debug and
>>>> attached the log file.
>>>>
>>>> Thanks
>>>> Jacky
>>>>
>>>> Robert Metzger <rmetzger@apache.org> wrote on Mon, May 11, 2020 at 4:14 PM:
>>>>
>>>>> Thanks a lot for posting the full output.
>>>>>
>>>>> It seems that Flink is passing an invalid list of arguments to the
>>>>> JVM.
>>>>> Can you
>>>>> - set the root log level in conf/log4j-yarn-session.properties to DEBUG
>>>>> - then launch the YARN session
>>>>> - share the log file of the yarn session on the mailing list?
>>>>>
>>>>> I'm particularly interested in the line printed here, as it shows the
>>>>> JVM invocation.
>>>>>
>>>>> https://github.com/apache/flink/blob/release-1.6/flink-yarn/src/main/java/org/apache/flink/yarn/AbstractYarnClusterDescriptor.java#L1630
>>>>>
>>>>>
>>>>> On Mon, May 11, 2020 at 9:56 PM Jacky D <jacky.du0314@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Robert,
>>>>>>
>>>>>> Yes, I tried to retrieve more log info from the YARN UI; the full logs
>>>>>> are shown below. This happens when I try to create a Flink YARN session
>>>>>> on EMR with the JITWatch configuration set up.
>>>>>>
>>>>>> 2020-05-11 19:06:09,552 ERROR
>>>>>> org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - Error while
>>>>>> running the Flink Yarn session.
>>>>>> java.lang.reflect.UndeclaredThrowableException
>>>>>> at
>>>>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1862)
>>>>>> at
>>>>>> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
>>>>>> at
>>>>>> org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:813)
>>>>>> Caused by:
>>>>>> org.apache.flink.client.deployment.ClusterDeploymentException: Couldn't
>>>>>> deploy Yarn session cluster
>>>>>> at
>>>>>> org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploySessionCluster(AbstractYarnClusterDescriptor.java:429)
>>>>>> at
>>>>>> org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:610)
>>>>>> at
>>>>>> org.apache.flink.yarn.cli.FlinkYarnSessionCli.lambda$main$2(FlinkYarnSessionCli.java:813)
>>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>>> at javax.security.auth.Subject.doAs(Subject.java:422)
>>>>>> at
>>>>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
>>>>>> ... 2 more
>>>>>> Caused by:
>>>>>> org.apache.flink.yarn.AbstractYarnClusterDescriptor$YarnDeploymentException:
>>>>>> The YARN application unexpectedly switched to state FAILED during
>>>>>> deployment.
>>>>>> Diagnostics from YARN: Application application_1584459865196_0165
>>>>>> failed 1 times (global limit =2; local limit is =1) due to AM Container for
>>>>>> appattempt_1584459865196_0165_000001 exited with  exitCode: 1
>>>>>> Failing this attempt.Diagnostics: Exception from container-launch.
>>>>>> Container id: container_1584459865196_0165_01_000001
>>>>>> Exit code: 1
>>>>>> Exception message: Usage: java [-options] class [args...]
>>>>>>            (to execute a class)
>>>>>>    or  java [-options] -jar jarfile [args...]
>>>>>>            (to execute a jar file)
>>>>>> where options include:
>>>>>>     -d32   use a 32-bit data model if available
>>>>>>     -d64   use a 64-bit data model if available
>>>>>>     -server   to select the "server" VM
>>>>>>                   The default VM is server,
>>>>>>                   because you are running on a server-class machine.
>>>>>>
>>>>>>
>>>>>>     -cp <class search path of directories and zip/jar files>
>>>>>>     -classpath <class search path of directories and zip/jar files>
>>>>>>                   A : separated list of directories, JAR archives,
>>>>>>                   and ZIP archives to search for class files.
>>>>>>     -D<name>=<value>
>>>>>>                   set a system property
>>>>>>     -verbose:[class|gc|jni]
>>>>>>                   enable verbose output
>>>>>>     -version      print product version and exit
>>>>>>     -version:<value>
>>>>>>                   Warning: this feature is deprecated and will be
>>>>>> removed
>>>>>>                   in a future release.
>>>>>>                   require the specified version to run
>>>>>>     -showversion  print product version and continue
>>>>>>     -jre-restrict-search | -no-jre-restrict-search
>>>>>>                   Warning: this feature is deprecated and will be
>>>>>> removed
>>>>>>                   in a future release.
>>>>>>                   include/exclude user private JREs in the version
>>>>>> search
>>>>>>     -? -help      print this help message
>>>>>>     -X            print help on non-standard options
>>>>>>     -ea[:<packagename>...|:<classname>]
>>>>>>     -enableassertions[:<packagename>...|:<classname>]
>>>>>>                   enable assertions with specified granularity
>>>>>>     -da[:<packagename>...|:<classname>]
>>>>>>     -disableassertions[:<packagename>...|:<classname>]
>>>>>>                   disable assertions with specified granularity
>>>>>>     -esa | -enablesystemassertions
>>>>>>                   enable system assertions
>>>>>>     -dsa | -disablesystemassertions
>>>>>>                   disable system assertions
>>>>>>     -agentlib:<libname>[=<options>]
>>>>>>                   load native agent library <libname>, e.g.
>>>>>> -agentlib:hprof
>>>>>>                   see also, -agentlib:jdwp=help and
>>>>>> -agentlib:hprof=help
>>>>>>     -agentpath:<pathname>[=<options>]
>>>>>>                   load native agent library by full pathname
>>>>>>     -javaagent:<jarpath>[=<options>]
>>>>>>                   load Java programming language agent, see
>>>>>> java.lang.instrument
>>>>>>     -splash:<imagepath>
>>>>>>                   show splash screen with specified image
>>>>>> See
>>>>>> http://www.oracle.com/technetwork/java/javase/documentation/index.html
>>>>>> for more details.
>>>>>>
>>>>>> Thanks
>>>>>> Jacky
>>>>>>
>>>>>> Robert Metzger <rmetzger@apache.org> wrote on Mon, May 11, 2020 at 3:42 PM:
>>>>>>
>>>>>>> Hey Jacky,
>>>>>>>
>>>>>>> The error says "The YARN application unexpectedly switched to state
>>>>>>> FAILED during deployment.".
>>>>>>> Have you tried retrieving the YARN application logs?
>>>>>>> Does the YARN UI / resource manager logs reveal anything on the
>>>>>>> reason for the deployment to fail?
>>>>>>>
>>>>>>> Best,
>>>>>>> Robert
>>>>>>>
>>>>>>>
>>>>>>> On Mon, May 11, 2020 at 9:34 PM Jacky D <jacky.du0314@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> ---------- Forwarded message ---------
>>>>>>>> From: Jacky D <jacky.du0314@gmail.com>
>>>>>>>> Date: Mon, May 11, 2020, 3:12 PM
>>>>>>>> Subject: Re: Flink Memory analyze on AWS EMR
>>>>>>>> To: Khachatryan Roman <khachatryan.roman@gmail.com>
>>>>>>>>
>>>>>>>>
>>>>>>>> Hi Roman,
>>>>>>>>
>>>>>>>> Thanks for the quick response. I tried without the LogFile option but
>>>>>>>> failed with the same error. I'm currently using Flink 1.6
>>>>>>>> https://ci.apache.org/projects/flink/flink-docs-release-1.6/monitoring/application_profiling.html,
>>>>>>>> so I can only use JITWatch or JMC. I guess those tools are only
>>>>>>>> available on a standalone cluster? As the document mentions: "Each
>>>>>>>> standalone JobManager, TaskManager, HistoryServer, and ZooKeeper daemon
>>>>>>>> redirects stdout and stderr to a file with a .out filename suffix and
>>>>>>>> writes internal logging to a file with a .log suffix. Java options
>>>>>>>> configured by the user in env.java.opts"?
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Jacky
>>>>>>>>
>>>>>>>
>
> --
>
> Arvid Heise | Senior Java Developer
>
> <https://www.ververica.com/>
>
> Follow us @VervericaData
>
> --
>
> Join Flink Forward <https://flink-forward.org/> - The Apache Flink
> Conference
>
> Stream Processing | Event Driven | Real Time
>
> --
>
> Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
>
> --
> Ververica GmbH
> Registered at Amtsgericht Charlottenburg: HRB 158244 B
> Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji
> (Toni) Cheng
>
