Date: Tue, 17 Nov 2015 22:39:41 -0800 (PST)
From: diopek
To: user@ignite.apache.org
Message-ID: <1447828781193-1999.post@n6.nabble.com>
In-Reply-To: <1447792418441-1993.post@n6.nabble.com>
Subject: Re: Data Loading Performance Issue

Hi Val,

Please see my comments below.

*Val:* Can you please clarify what you mean by "batch run time"? Is it somehow connected to data loading via the store, or is it a different issue?

*diopek:* Batch run time: the first step is cache initialization (the most time-consuming step), and the second step does some computations and generates outputs using the cache populated in the first step. The cache is loaded via the store. Currently the most important issue is that the overall batch runs faster on my local PC (Windows 7, 8 CPUs, 32 GB RAM) than on a more powerful Linux server (64 CPUs and 1 TB RAM, which also has a faster network connection to the Oracle DB). As a side note, the deployment package for all servers was built on my Windows server with the 64-bit Windows version of JDK 1.8.0_65; the Linux boxes run the Linux version of JDK 1.8.0_65. I am genuinely puzzled about what is causing this delay.

*Val:* I noticed that you put some lists instead of individual entries into the cache. What is the size of these lists? My suspicion is that most of the time is spent on serialization of the values (the JCache spec has pass-by-value semantics, so we have to do this even in a LOCAL cache).

*diopek:* Yes, my cache structure is IgniteCache>, as I needed to process records that have certain common attributes together; these are like trade positions that share attributes such as date, account, currency, etc.
After being read into the cache, during the processing stage each group is split into more granular records (for example, 10 records become 1,000 records), and then I aggregate (group by) them, so the number of records shrinks to 500. Then I write these records directly into a feed file. During the interim processing/computation they don't get stored back into the Ignite cache; they just stay in regular Java memory until they get flushed to the file.

Also, all the grouped records in the cache are read by multiple partition threads: say the cache has 5,000,000 records and there are 5 partition threads, then each thread reads only its own partition of records. If I read all the records into the cache row by row, how can I partition and process the records as groups? Of course I can partition rows, but then I lose the grouping. This is also the reason why I grouped the records and inserted them into the cache in the first place (the DB doesn't have a natural partition key). In my use cases I sometimes store customized objects as well as ArrayList and LinkedHashMap as values. I am aware of the serialization cost but was not sure about its extent. Currently my first priority is just to resolve this Linux deployment slowness; the serialization has some cost, but it lets me group and partition the records (I am open to suggestions on how I can group/partition the records).

*Val:* I would suggest storing one value per DB row to avoid duplicate data and therefore duplicate serialization. It also looks like loading the data in a multithreaded fashion may be helpful: execute the query first and then do DB row parsing and saving into the cache in several parallel threads. You can utilize CacheLoadOnlyStoreAdapter for this.

*diopek:* Is there any working example you can point me to?

Thanks,
Deha

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Data-Loading-Performance-Issue-tp1958p1999.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.
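
For illustration, the multithreaded loading Val describes (one thread reads the query result, several threads parse rows and save them) can be sketched with plain JDK classes. This is not Ignite code and not a working CacheLoadOnlyStoreAdapter example: the class name `ParallelLoadSketch`, the `key|amount` row format, and the `ConcurrentHashMap` standing in for the cache are all assumptions made for the sketch. A real loader would iterate a JDBC result set and write to an IgniteCache instead.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Plain-JDK stand-in for the "one reader, several parser threads" loading pattern. */
final class ParallelLoadSketch {
    /** Hypothetical row format "key|amount"; a real loader would parse a DB row here. */
    static Map.Entry<String, Long> parse(String rawRow) {
        int sep = rawRow.indexOf('|');
        String key = rawRow.substring(0, sep);
        long amount = Long.parseLong(rawRow.substring(sep + 1));
        return new SimpleImmutableEntry<>(key, amount);
    }

    /**
     * Reads rows on the calling thread, hands them to worker threads in batches,
     * and stores one entry per row in a concurrent map (a stand-in for the cache).
     */
    static Map<String, Long> load(Iterator<String> rows, int threads, int batchSize) {
        Map<String, Long> cache = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Future<?>> pending = new ArrayList<>();
        try {
            List<String> batch = new ArrayList<>(batchSize);
            while (rows.hasNext()) {
                batch.add(rows.next());
                // Flush when the batch is full or the input is exhausted.
                if (batch.size() == batchSize || !rows.hasNext()) {
                    List<String> work = batch;
                    batch = new ArrayList<>(batchSize);
                    pending.add(pool.submit(() -> {
                        for (String row : work) {
                            Map.Entry<String, Long> e = parse(row);
                            cache.put(e.getKey(), e.getValue());
                        }
                    }));
                }
            }
            // Wait for every batch so parse failures surface to the caller.
            for (Future<?> f : pending) {
                f.get();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while loading", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Row parsing failed", e.getCause());
        } finally {
            pool.shutdown();
        }
        return cache;
    }
}
```

Calling `ParallelLoadSketch.load(rows.iterator(), 4, 512)` would parse the rows on four worker threads in batches of 512. The sketch keeps every submitted batch in memory; a production loader would likely bound the queue between the reader and the workers so a fast query cannot outrun the parsers.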