From: GitBox
To: issues@mxnet.apache.org
Subject: [GitHub] [incubator-mxnet] samskalicky opened a new issue #19655: Race condition between loading a model and accessing tensor data
Date: Fri, 11 Dec 2020 01:24:21 -0000

samskalicky opened a new issue #19655:
URL: https://github.com/apache/incubator-mxnet/issues/19655

## Description
There's a race condition between loading a model and the model's params being available to a backend run with `optimize_for`, for example:
```
import mxnet as mx
from mxnet.gluon import SymbolBlock
block = SymbolBlock.imports(symbol_file, input_names, param_file, mx.cpu())
block.optimize_for(inputs, backend='myBackend')
```
When the race is hit, the NDArrays that the backend receives contain garbage data.
### Fix
We need to loop over the NDArrays for args/aux and call `wait_to_read` on each to ensure that any previous operations pushed to the engine are complete:
https://github.com/apache/incubator-mxnet/blob/master/src/c_api/c_api_symbolic.cc#L1309-L1310

### Workaround
In the meantime, we can simply call `mx.nd.waitall()` between loading the model and calling `optimize_for`:
```
import mxnet as mx
from mxnet.gluon import SymbolBlock
block = SymbolBlock.imports(symbol_file, input_names, param_file, mx.cpu())
mx.nd.waitall()  # block until all pending engine operations have finished
block.optimize_for(inputs, backend='myBackend')
```
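A narrower alternative to `mx.nd.waitall()` is to wait only on the block's own parameters, which mirrors the idea of the proposed fix at the Python level. The sketch below is not the C++ change referenced above; `wait_for_params` is a hypothetical helper name. It assumes the block exposes Gluon's `collect_params()` and that each parameter's `data()` returns an NDArray with `wait_to_read()`. Whether this covers every array the backend sees (args and aux) has not been verified here.

```python
def wait_for_params(block, ctx=None):
    """Block until every parameter array of a Gluon block is readable.

    Hypothetical helper: assumes `block.collect_params()` returns a
    dict-like of parameters whose `data()` yields an NDArray.
    Returns the number of arrays that were waited on.
    """
    waited = 0
    for param in block.collect_params().values():
        # Fetch the array on the requested context (or the default one).
        arr = param.data(ctx) if ctx is not None else param.data()
        # wait_to_read() returns once pending writes to this array are done.
        arr.wait_to_read()
        waited += 1
    return waited
```

Under those assumptions, it would be called in place of `mx.nd.waitall()`, that is, `wait_for_params(block)` between `SymbolBlock.imports(...)` and `block.optimize_for(...)`. Unlike `waitall()`, it does not stall on unrelated work queued in the engine.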