From: Pritam Sadhukhan
Date: Fri, 18 Oct 2019 08:29:40 +0530
Subject: Data processing with HDFS local or remote
To: user@flink.apache.org

Hi,

I am trying to process data stored on HDFS using Flink batch jobs.
Our data is split across 16 data nodes.

I would like to know how the data is pulled from the data nodes when the job's parallelism matches the data split on HDFS, i.e. 16.

Is the Flink task executed locally on the data node servers, or does it run on the Flink nodes, with the data pulled remotely?

Any help will be appreciated.

Regards,
Pritam.