From: Claus Ibsen
To: users@camel.apache.org
Date: Sat, 12 Nov 2016 09:52:21 +0100
Subject: Re: Processing VERY large result sets
Hi

I think in the past there have been some threads about how to speed up
this use-case. Not sure how easy it is to search for those; you could,
for example, use Nabble or Markmail to search the archives.

On Fri, Nov 11, 2016 at 8:20 AM, Zoran Regvart wrote:
> Hi Christian,
> I was solving the exact same problem a few years back; here is what I
> did: I created a custom @Handler that performs the JDBC query, the
> purpose of which was to return an Iterator over the records. The
> implementation of the handler used springjdbc-iterable[1] to stream
> the rows as they were consumed by another @Handler that took the
> Iterator from the body and wrote it out item by item using BeanIO.
>
> On a more recent project I had PostgreSQL as the database and could
> use the CopyManager[2], which proved to be very performant; perhaps
> your database offers the same functionality you can use.
>
> So basically I custom-coded the solution.
>
> zoran
>
> [1] https://github.com/apache/cxf
> [2] https://jdbc.postgresql.org/documentation/publicapi/org/postgresql/copy/CopyManager.html
>
> On Thu, Nov 10, 2016 at 10:01 PM, Christian Jacob wrote:
>> Hi there,
>> my task is to execute a JDBC query against a Hive database and
>> produce rows in csv files. The catch is that, depending on the query
>> criteria, the number of rows can range from some dozens to some
>> millions. My first solution was something like this:
>>
>> from("...")
>>     .to("sql:...") // produces a List<Map<String, Object>>
>>     .split(body()).process(myProcessor) // produces a single csv row
>>     .to("file:destination?fileExists=Append");
>>
>> This was awfully slow because the file producer opens the file,
>> appends one single row, and closes it again. I found some posts on
>> how to use an Aggregator before sending the content to the file
>> producer. This really was the desired solution, and the performance
>> was satisfying. In this solution, the aggregator holds the total
>> content of the csv file to be produced. Unfortunately, the files can
>> be so large that I get stuck in "java gc overhead limit exceeded"
>> exceptions. No matter how high I set the heap space, I cannot avoid
>> this. Now I'm looking for a way out, and I don't know how. My ideas
>> are:
>> - Use a splitter that produces sublists - I don't know how I could
>>   do that
>> - Use an aggregator that does not produce the total content of the
>>   files to be created, but only, for example, 1000 lines at a time,
>>   and then collects the next block - I don't know how to do that
>>   either
>> Or maybe someone has a better idea... Kind regards,
>> Christian
>>
>> --
>> View this message in context: http://camel.465427.n5.nabble.com/Processing-VERY-large-result-sets-tp5790018.html
>> Sent from the Camel - Users mailing list archive at Nabble.com.
>
> --
> Zoran Regvart

--
Claus Ibsen
-----------------
http://davsclaus.com
@davsclaus
Camel in Action 2: https://www.manning.com/ibsen2
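[Editor's note] Christian's second idea — flushing in fixed-size blocks instead of aggregating the whole file in memory — can be sketched outside Camel in plain Java. This is a minimal illustration, not anyone's actual implementation from the thread: the class name, the fake row iterator, and the batch size of 1000 are all assumptions made for the example. The point it demonstrates is that only one batch ever lives on the heap, regardless of how many rows the result set yields.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class BatchedCsvWriter {

    /**
     * Streams rows from an iterator to a writer, flushing every
     * batchSize rows so at most one batch is held in memory.
     * Returns the number of rows written.
     */
    static long writeInBatches(Iterator<String> rows, Writer out, int batchSize)
            throws IOException {
        long written = 0;
        StringBuilder batch = new StringBuilder();
        int inBatch = 0;
        while (rows.hasNext()) {
            batch.append(rows.next()).append('\n');
            inBatch++;
            written++;
            if (inBatch == batchSize) {
                out.write(batch.toString()); // one write per batch, not per row
                batch.setLength(0);          // drop the batch from the heap
                inBatch = 0;
            }
        }
        if (inBatch > 0) {
            out.write(batch.toString());     // flush the final partial batch
        }
        out.flush();
        return written;
    }

    public static void main(String[] args) throws IOException {
        // 2500 fake csv rows stand in for the streamed JDBC result set
        List<String> rows = IntStream.range(0, 2500)
                .mapToObj(i -> i + ",row" + i)
                .collect(Collectors.toList());
        StringWriter out = new StringWriter();
        long n = writeInBatches(rows.iterator(), out, 1000);
        System.out.println(n); // prints 2500
    }
}
```

In Camel terms this corresponds roughly to combining a streaming splitter with an aggregator whose completion condition is a fixed size (the aggregate EIP's completionSize option), so each aggregated block is written and then released instead of growing until the whole file is in memory.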