nemo-notifications mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [incubator-nemo] alapha23 commented on issue #187: [NEMO-324] Distinguish Beam's run and waitUntilFinish methods
Date Fri, 15 Mar 2019 06:45:57 GMT
alapha23 commented on issue #187: [NEMO-324] Distinguish Beam's run and waitUntilFinish methods
URL: https://github.com/apache/incubator-nemo/pull/187#issuecomment-473177054
 
 
   @wonook 
   I am afraid the timeout is not working: the driver shutdown was never initiated.
   
   You can refer to the `driver.stderr` and `driver.stdout` files in my [driver_log.zip](https://github.com/apache/incubator-nemo/files/2969718/driver_log.zip).
   
   Interestingly, I happened to implement the timeout in a way similar to yours. This is [my branch](https://github.com/alapha23/incubator-nemo/tree/apache-master-gao). I hit an identical error.
   
   @taegeonum also verified this error on his machine.
   
   ### I started running Nexmark with these parameters
   ```
   #!/bin/bash
   # 
   # Licensed to the Apache Software Foundation (ASF) under one
   # or more contributor license agreements.  See the NOTICE file
   # distributed with this work for additional information
   # regarding copyright ownership.  The ASF licenses this file
   # to you under the Apache License, Version 2.0 (the
   # "License"); you may not use this file except in compliance
   # with the License.  You may obtain a copy of the License at
   #
   #   http://www.apache.org/licenses/LICENSE-2.0
   #
   # Unless required by applicable law or agreed to in writing,
   # software distributed under the License is distributed on an
   # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
   # KIND, either express or implied.  See the License for the
   # specific language governing permissions and limitations
   # under the License.
   #
   # run this by ./bin/generate_javadocs.sh
   
   TIMEOUT=30
   WINDOW=30
   INTERVAL=30
   EVENTS=0
   PARALLELISM=1
   PERIOD=50
   NORMAL=10
   BURSTY=10
   CPU_DELAY=0
   SAMPLING=0.9
   
   ENABLE_OFFLOADING=false
   ENABLE_OFFLOADING_DEBUG=false
   POOL_SIZE=0
   FLUSH_BYTES=$((10 * 1024 * 1024))
   FLUSH_COUNT=10
   
   ./bin/run_nexmark.sh \
    	-job_id nexmark-Q0 \
   	-executor_json `pwd`/examples/resources/executors/beam_test_executor_resources.json \
    	-user_main org.apache.beam.sdk.nexmark.Main \
    	-optimization_policy org.apache.nemo.compiler.optimizer.policy.StreamingPolicy \
    	-scheduler_impl_class_name org.apache.nemo.runtime.master.scheduler.StreamingScheduler \
    	-user_args "--runner=org.apache.nemo.client.beam.NemoRunner --streaming=true --query=$1 --manageResources=false --monitorJobs=true --streamTimeout=$TIMEOUT"
   ```
   
   ### My command line shows
   
   ```
   Powered by
       _   __                   
      / | / /__  ____ ___  ____ 
     /  |/ / _ \/ __ `__ \/ __ \
    / /|  /  __/ / / / / / /_/ /
   /_/ |_/\___/_/ /_/ /_/\____/ 
   
   SLF4J: Class path contains multiple SLF4J bindings.
   SLF4J: Found binding in [jar:file:/usr/local/share/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
   SLF4J: Found binding in [jar:file:/Users/zhiyuangao/Documents/incubator-nemo/examples/nexmark/target/nexmark-0.2-SNAPSHOT-shaded.jar!/org/slf4j/impl/StaticLoggerBinder.class]
   SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
   SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
    INFO 03-15 15:23:52,485 JobLauncher:127 [main] - Launching RPC Server
    INFO 03-15 15:23:52,753 DriverRPCServer:93 [main] - DriverRPCServer running at 15621
    INFO 03-15 15:23:53,058 JobLauncher:163 [main] - Launching driver
   
   Powered by
        ___________  ______  ______  _______
       /  ______  / /  ___/ /  ___/ /  ____/
      /     _____/ /  /__  /  /__  /  /___
     /  /\  \     /  ___/ /  ___/ /  ____/
    /  /  \  \   /  /__  /  /__  /  /
   /__/    \__\ /_____/ /_____/ /__/
   
   Mar 15, 2019 3:23:53 PM org.apache.reef.util.REEFVersion logVersion
   INFO: REEF Version: 0.16.0
   Mar 15, 2019 3:23:53 PM org.apache.reef.client.DriverLauncher$SubmittedJobHandler onNext
   INFO: REEF job submitted: nexmark-Q0.
    INFO 03-15 15:23:53,439 JobLauncher:297 [main] - User program started
   2019-03-15T06:23:54.247Z Running query:0; streamTimeout:30
   2019-03-15T06:23:54.581Z Generating 100000 events in streaming mode
    INFO 03-15 15:23:55,012 JobLauncher:242 [ForkJoinPool.commonPool-worker-1] - Waiting for the driver to be ready
   Mar 15, 2019 3:23:55 PM org.apache.reef.client.DriverLauncher$RunningJobHandler onNext
   INFO: The Job nexmark-Q0 is running.
   Mar 15, 2019 3:23:55 PM org.apache.reef.runtime.local.driver.ResourceManager sendRuntimeStatus
   INFO: Allocated: 1, Outstanding requests: Optional:{0}
   Mar 15, 2019 3:23:55 PM org.apache.reef.runtime.local.driver.ResourceManager sendRuntimeStatus
   INFO: Allocated: 1, Outstanding requests: Optional:{0}
   Mar 15, 2019 3:23:55 PM org.apache.reef.runtime.local.driver.ResourceManager sendRuntimeStatus
   INFO: Allocated: 2, Outstanding requests: Optional:{0}
   Mar 15, 2019 3:23:55 PM org.apache.reef.runtime.local.driver.ResourceManager sendRuntimeStatus
   INFO: Allocated: 2, Outstanding requests: Optional:{0}
   Mar 15, 2019 3:23:55 PM org.apache.reef.runtime.common.driver.evaluator.AllocatedEvaluatorImpl makeRootServiceConfiguration
   INFO: No service configuration given and no ConfigurationProviders set.
   Mar 15, 2019 3:23:55 PM org.apache.reef.runtime.common.driver.evaluator.AllocatedEvaluatorImpl makeRootServiceConfiguration
   INFO: No service configuration given and no ConfigurationProviders set.
   Mar 15, 2019 3:23:55 PM org.apache.reef.runtime.local.process.ReefRunnableProcessObserver onResourceStatus
   INFO: Sending resource status: ResourceStatusEventImpl:{id:Node-59-1552631035581, runtime:Node-59-1552631035581, state:RUNNING, diag:Optional.empty, exit:Optional.empty}
   Mar 15, 2019 3:23:55 PM org.apache.reef.runtime.local.process.ReefRunnableProcessObserver onResourceStatus
   INFO: Sending resource status: ResourceStatusEventImpl:{id:Node-58-1552631035640, runtime:Node-58-1552631035640, state:RUNNING, diag:Optional.empty, exit:Optional.empty}
    INFO 03-15 15:23:57,651 JobLauncher:250 [ForkJoinPool.commonPool-worker-1] - Launching DAG...
    INFO 03-15 15:23:57,744 JobLauncher:263 [ForkJoinPool.commonPool-worker-1] - Waiting for the DAG to finish execution
    INFO 03-15 15:24:24,995 NemoPipelineResult:75 [main] - Job timed out before PT30Sms, while waiting until finish.
    INFO 03-15 15:24:24,996 JobLauncher:181 [main] - Wait for the driver to finish
   ```
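
The last two log lines show the timeout being reported, after which the launcher goes on waiting for the driver. To make the expected behavior concrete, here is a minimal sketch in which a timed-out wait explicitly requests a shutdown. Every name in it (`TimeoutShutdownSketch`, `FakeDriver`, `requestShutdown`, `awaitFinish`) is hypothetical and only stands in for the real components; it is not Nemo's actual API and not the code in either branch.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class TimeoutShutdownSketch {

  /** Hypothetical stand-in for the driver; NOT a Nemo class. */
  static final class FakeDriver {
    private final CountDownLatch finished = new CountDownLatch(1);
    private final AtomicBoolean shutdownRequested = new AtomicBoolean(false);

    /** Records the shutdown request and releases anyone waiting for the job to end. */
    void requestShutdown() {
      shutdownRequested.set(true);
      finished.countDown();
    }

    boolean isShutdownRequested() {
      return shutdownRequested.get();
    }

    /** Returns true if the job ended before the timeout elapsed, false otherwise. */
    boolean awaitFinish(final long timeout, final TimeUnit unit) throws InterruptedException {
      return finished.await(timeout, unit);
    }
  }

  /**
   * Waits up to timeoutMillis for the job to finish. If the wait times out,
   * the driver is asked to shut down instead of being waited on indefinitely.
   *
   * @return true if the job finished on its own within the timeout.
   */
  static boolean waitUntilFinish(final FakeDriver driver, final long timeoutMillis)
      throws InterruptedException {
    final boolean finishedInTime = driver.awaitFinish(timeoutMillis, TimeUnit.MILLISECONDS);
    if (!finishedInTime) {
      driver.requestShutdown(); // the step that appears to be missing in the log above
    }
    return finishedInTime;
  }

  public static void main(final String[] args) throws InterruptedException {
    // A streaming job never finishes by itself, so the wait below times out.
    final FakeDriver driver = new FakeDriver();
    final boolean finishedInTime = waitUntilFinish(driver, 100);
    System.out.println("finished in time: " + finishedInTime);
    System.out.println("shutdown requested: " + driver.isShutdownRequested());
  }
}
```

In this sketch a later, untimed wait on the driver returns immediately because the shutdown request also releases the latch. Whether the real driver can be stopped this way depends on how the client and driver communicate, which the sketch does not model.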

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
