Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 4826E200D3C for ; Tue, 14 Nov 2017 14:26:08 +0100 (CET) Received: by cust-asf.ponee.io (Postfix) id 4693B160BF4; Tue, 14 Nov 2017 13:26:08 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 8C8041609EF for ; Tue, 14 Nov 2017 14:26:07 +0100 (CET) Received: (qmail 71584 invoked by uid 500); 14 Nov 2017 13:26:06 -0000 Mailing-List: contact user-help@flink.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Delivered-To: mailing list user@flink.apache.org Received: (qmail 71574 invoked by uid 99); 14 Nov 2017 13:26:06 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd3-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 14 Nov 2017 13:26:06 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd3-us-west.apache.org (ASF Mail Server at spamd3-us-west.apache.org) with ESMTP id 9BCBC1807BC for ; Tue, 14 Nov 2017 13:26:05 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd3-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 1.293 X-Spam-Level: * X-Spam-Status: No, score=1.293 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_PASS=-0.001, URIBL_BLOCKED=0.001, URI_HEX=1.313] autolearn=disabled Authentication-Results: spamd3-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=data-artisans-com.20150623.gappssmtp.com Received: from mx1-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd3-us-west.apache.org [10.40.0.10]) (amavisd-new, port 10024) with ESMTP id KE0DnZDk5GUN for ; Tue, 14 Nov 2017 13:26:03 +0000 (UTC) Received: from mail-wm0-f44.google.com (mail-wm0-f44.google.com [74.125.82.44]) by mx1-lw-eu.apache.org (ASF Mail Server at mx1-lw-eu.apache.org) with ESMTPS id 9F8AD5FB2D for ; Tue, 14 Nov 2017 13:26:03 +0000 (UTC) Received: by mail-wm0-f44.google.com with SMTP id b189so15085316wmd.0 for ; Tue, 14 Nov 2017 05:26:03 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=data-artisans-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:organization:in-reply-to :references:mime-version; bh=Da9vrsTwAyw9TAqAp/ww2eyq9cM6xihRR+OI6f0WLfc=; b=0E2lVTT5gWCDcc80Hhn6JKrKHyRfLTryHjaAd6s+6zdQfNkhNCvVNmKJ6e/DponIfX NOTtiha3GRMujiJX+IuctJD8KVTs2KFS5S5ThkFa8U85d0im0H95fEAB/5x9ISs30id7 uHODDz984YUQJPYmvjnAbx7L8gmtgSWe4JDvvXLzUQ/QQ5JhhzzV6/NqHZCUqCUSqDSh /CasCYgd1fOeDR+wvVP9zHpNa62qBS5v9IoXz15FsaLAdR+21oGcyBm9NbcC6AaS3gwO w9StHcKnwi24zU2BbWx0F1JF+blm/RMraR2z6AmAWr8YOSt9FAL4YlaoxzFzAPioHcpz Kkng== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:organization :in-reply-to:references:mime-version; bh=Da9vrsTwAyw9TAqAp/ww2eyq9cM6xihRR+OI6f0WLfc=; b=Jb4uBbFEjFwYsqsEOQqlgdfdQLwFgjcVM/2Rd/anKh3uDzhM0m4u2/mSzRsTmy1a/k SmONtEuSsfqjN9bPsc00k2itvUXNc8GPdOPvkrZkREwg+uWQDV+UnEPulnxprVRBXb4/ T5r1HG2PD3tUVenBIWNiBVnWPk9JuAPXcguLwaTHHW2xbfxqbPv+D5O3qNYrWM/oCI77 VbLbPCuToAlH6C72ytWeeKnvQpNBBfZDS0HKw4Gmz4CCB2+gR+PfbXpDd8mTdSjItQ1S 8QykAxkWh48mic+GziPZ68lEWKXGLnv29sAb94lzhHYwSkznsF+Jdax66sh4CYtuBYVV SR7w== X-Gm-Message-State: AJaThX63wZQO0xufAribCNj2uGC0cMpuwfTZcsEjPTYA9HibbwbCMTzh BFp7ztKr4eRjvs6Aqq+xuvFZ9w== X-Google-Smtp-Source: AGs4zMaZpeHKL1do8MuSxnglsH6qS7pcxGRBfntm+Pi3GtCRGiptVgIe5vK52tmrJAcJQGBrgL3wRA== X-Received: by 10.80.221.11 with SMTP id t11mr4688177edk.84.1510665963146; Tue, 14 Nov 2017 05:26:03 -0800 (PST) Received: from nico-work.localnet (ip-2-205-80-95.web.vodafone.de. [2.205.80.95]) by smtp.gmail.com with ESMTPSA id c7sm16679761edc.26.2017.11.14.05.26.02 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 14 Nov 2017 05:26:02 -0800 (PST) From: Nico Kruber To: yunfan123 Cc: user@flink.apache.org, Chesnay Schepler Subject: Re: Flink takes too much memory in record serializer. Date: Tue, 14 Nov 2017 14:25:57 +0100 Message-ID: <3266813.ngzCfZ3qxZ@nico-work> Organization: data Artisans In-Reply-To: <448691c1-39ab-a406-b1bb-a39cf954ac2a@apache.org> References: <1510654284121-0.post@n4.nabble.com> <448691c1-39ab-a406-b1bb-a39cf954ac2a@apache.org> MIME-Version: 1.0 Content-Type: multipart/signed; boundary="nextPart2357098.4g5vQYq2T5"; micalg="pgp-sha1"; protocol="application/pgp-signature" archived-at: Tue, 14 Nov 2017 13:26:08 -0000 --nextPart2357098.4g5vQYq2T5 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="us-ascii" We're actually also trying to have the serializer stateless in future and may be able to remove the intermediate serialization buffer which is currently growing on heap before we copy the data into the actual target buffer. This intermediate buffer grows and is pruned after serialization if it is bigger than 5MB (see DataOutputSerializer), and re-used for anything below that threshold. So you may actually have up to 5MB per output channel which sits waiting for data. Please refer to https://issues.apache.org/jira/browse/FLINK-4893 for updates on this. These improvements will certainly reduce some of our memory footprint and help you. Throughput will then, of course, be limited by your network's speed and the number of network buffers to hold this amount of data and to saturate your network connections. The availability of these buffers will then limit your throughput accordingly. Nico On Tuesday, 14 November 2017 11:29:33 CET Chesnay Schepler wrote: > I don't there's anything you can do except reducing the parallelism or > the size of your messages. > > A separate serializer is used for each channel as the serializers are > stateful; they are capable of writing records partially > to a given MemorySegment to better utilize the allocated memory. > > How many messages is each operator instance processing per second? I > would imagine that at this scale > your memory consumption goes through the roof anyway due to the message > size. > Even if every operator instance is only processing 10 records/s you're > already looking at 10TB memory usage > for in-flight data. > > On 14.11.2017 11:11, yunfan123 wrote: > > In the class org.apache.flink.runtime.io.network.api.writer.RecordWriter, > > it has same number of serializers with the numChannels. > > If I first operator has 500 parallels and the next operator has 1000 > > parallels. > > And every message in flink is 2MB. > > The job takes 500 * 1000 * 2MB as 1TB memory in totally!!! > > Can I do anything to reduce the memory usage. > > > > > > > > -- > > Sent from: > > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/ --nextPart2357098.4g5vQYq2T5 Content-Type: application/pgp-signature; name="signature.asc" Content-Description: This is a digitally signed message part. Content-Transfer-Encoding: 7Bit -----BEGIN PGP SIGNATURE----- iF0EABECAB0WIQTIh4KsbsNd3l7wd+cg8nJL2uqeWQUCWgru5QAKCRAg8nJL2uqe WY8tAKCNXCNiZBS0Iaof4rxlFWc4fsbabwCgu8FALZt6XPwqu7Lzcn+2XH+vrwE= =iFrv -----END PGP SIGNATURE----- --nextPart2357098.4g5vQYq2T5--