From odf-dev-return-379-apmail-incubator-odf-dev-archive=incubator.apache.org@incubator.apache.org Tue Nov 8 09:13:37 2011 Return-Path: X-Original-To: apmail-incubator-odf-dev-archive@minotaur.apache.org Delivered-To: apmail-incubator-odf-dev-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 7D6699023 for ; Tue, 8 Nov 2011 09:13:37 +0000 (UTC) Received: (qmail 64526 invoked by uid 500); 8 Nov 2011 09:13:37 -0000 Delivered-To: apmail-incubator-odf-dev-archive@incubator.apache.org Received: (qmail 64503 invoked by uid 500); 8 Nov 2011 09:13:37 -0000 Mailing-List: contact odf-dev-help@incubator.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: odf-dev@incubator.apache.org Delivered-To: mailing list odf-dev@incubator.apache.org Received: (qmail 64495 invoked by uid 99); 8 Nov 2011 09:13:37 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 08 Nov 2011 09:13:37 +0000 X-ASF-Spam-Status: No, hits=-0.7 required=5.0 tests=FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (athena.apache.org: domain of daisy.ycguo@gmail.com designates 74.125.82.175 as permitted sender) Received: from [74.125.82.175] (HELO mail-wy0-f175.google.com) (74.125.82.175) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 08 Nov 2011 09:13:30 +0000 Received: by wyh5 with SMTP id 5so244130wyh.6 for ; Tue, 08 Nov 2011 01:13:09 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type:content-transfer-encoding; bh=7zM+TH05ZVaYOMs9a9eY15Q4fGQc2YFBYFtlLvbuYtY=; b=LzVwPQfTDO+AakVSYaaFKzzpqxy4eRD3eynyXqvTLEh+5c6OggICRLvXhs0giNJ5/x TDYx+wkptsIjVwgc+tL4detl4iKNtX/GCZP0mqaWEWOFtDCPYEk9jys6AndSmy+bGkTJ aZEM4A089WftLlfJBCM7LjQUVKKYK/NGBjlTg= MIME-Version: 1.0 Received: by 10.180.92.169 with SMTP id cn9mr11563289wib.62.1320743589274; Tue, 08 Nov 2011 01:13:09 -0800 (PST) Received: by 10.180.86.104 with HTTP; Tue, 8 Nov 2011 01:13:08 -0800 (PST) In-Reply-To: References: Date: Tue, 8 Nov 2011 17:13:08 +0800 Message-ID: Subject: Re: HugeData in a Spreadsheet From: Daisy Guo To: odf-dev@incubator.apache.org Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable 2011/10/29 Rob Weir : > On Fri, Oct 28, 2011 at 12:57 AM, Wilson Barbosa > wrote: >> Hi Devin and Rob, >> >> this is an adaptation of my original code: >> > > If I'm reading this right, you have 14 sheets, each one with 10,000 > rows and 13 columns of data. =A0That comes out to nearly 2 million > cells. > > Aside from increasing your heap size, I'm not sure there is anything > you can do in the current version of the ODF Toolkit to bring this > down, if you're dealing with that amount of data. > > One trick might be to use temp files. =A0So create 14 temporary ODF > documents, each with one sheet. =A0Then write some custom code to > combine the temp files. =A0But don't instantiate an XML DOM for the > final combine step. =A0Just create a new content.xml that concatenates > the markup from the temp files. > > In the next release we might look at doing String.intern() on cell > values, and see if that helps. =A0When a spreadsheet is used as a > database, there are often many duplicate strings. I believe we used String.intern() in the current code base. Maybe we need to figure out a possible way to load a document partially. > > -Rob > >> 8<---- >> public void teste_table() throws Exception { >> =A0 =A0 =A0 =A0int next_sheet =3D 0; >> =A0 =A0 =A0 =A0long next_line =3D 1; >> =A0 =A0 =A0 =A0// Cria objetos da planilha >> =A0 =A0 =A0 =A0SpreadsheetDocument planilha =3D SpreadsheetDocument >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0.newSpreadsheetDocument(); >> >> =A0 =A0 =A0 =A0Table t =3D planilha.getSheetByIndex(0); >> =A0 =A0 =A0 =A0t.setTableName("wks_" + next_sheet); >> =A0 =A0 =A0 =A0for (long i =3D 1; i < 140000; i++) { >> =A0 =A0 =A0 =A0 =A0 =A0t.getCellByPosition("A" + >> String.valueOf(next_line)).setStringValue( >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0"*A******"); >> =A0 =A0 =A0 =A0 =A0 =A0t.getCellByPosition("B" + >> String.valueOf(next_line)).setStringValue( >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0"*B******"); >> =A0 =A0 =A0 =A0 =A0 =A0t.getCellByPosition("C" + >> String.valueOf(next_line)).setStringValue("*C*"); >> =A0 =A0 =A0 =A0 =A0 =A0t.getCellByPosition("D" + >> String.valueOf(next_line)).setStringValue("*D*"); >> =A0 =A0 =A0 =A0 =A0 =A0t.getCellByPosition("E" + >> String.valueOf(next_line)).setStringValue("*E*"); >> =A0 =A0 =A0 =A0 =A0 =A0t.getCellByPosition("F" + >> String.valueOf(next_line)).setStringValue("*F*"); >> =A0 =A0 =A0 =A0 =A0 =A0t.getCellByPosition("G" + >> String.valueOf(next_line)).setStringValue("*G*"); >> =A0 =A0 =A0 =A0 =A0 =A0t.getCellByPosition("H" + >> String.valueOf(next_line)).setStringValue("*H*"); >> =A0 =A0 =A0 =A0 =A0 =A0t.getCellByPosition("I" + >> String.valueOf(next_line)).setStringValue("*I*"); >> =A0 =A0 =A0 =A0 =A0 =A0t.getCellByPosition("J" + >> String.valueOf(next_line)).setStringValue("*J*"); >> =A0 =A0 =A0 =A0 =A0 =A0t.getCellByPosition("K" + >> String.valueOf(next_line)).setStringValue("*K*"); >> =A0 =A0 =A0 =A0 =A0 =A0t.getCellByPosition("L" + >> String.valueOf(next_line)).setStringValue("*L*"); >> =A0 =A0 =A0 =A0 =A0 =A0t.getCellByPosition("M" + >> String.valueOf(next_line)).setStringValue("*M*"); >> =A0 =A0 =A0 =A0 =A0 =A0// >> =A0 =A0 =A0 =A0 =A0 =A0next_line++; >> =A0 =A0 =A0 =A0 =A0 =A0// >> =A0 =A0 =A0 =A0 =A0 =A0System.out.println(i); >> =A0 =A0 =A0 =A0 =A0 =A0// >> =A0 =A0 =A0 =A0 =A0 =A0if(i % 10000 =3D=3D 0){ >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0next_sheet++; >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0next_line =3D 1; >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0t =3D planilha.appendSheet("wks" + next_s= heet); >> =A0 =A0 =A0 =A0 =A0 =A0} >> >> =A0 =A0 =A0 =A0} >> =A0 =A0 =A0 =A0planilha.save("/tmp/big_worksheet.ods"); >> =A0 =A0 =A0 =A0planilha.close(); >> =A0 =A0} >> 8<------- >> >> I had some variations, for example using a Cell object to better work wi= th >> the formatation and data values... >> >> I get a problem similar when using a TreeMap and i discovered a jdbm2 >> implementation of TreeMap in disk, but i dont know how to use a view of = a >> SpreadsheetDocument in disk. >> >> Thanks in advance, >> Wilson. >> >> >> >> >> 2011/10/25 Devin Han >> >>> Hi Wilson, >>> >>> I test two cases. >>> >>> If the <> is >>> =A0 =A0 =A0 =A0 =A0 =A0c =3D t.getCellByPosition(0, i); >>> =A0 =A0 =A0 =A0 =A0 =A0c.setStringValue("i"); >>> There is no exception, everything works well. >>> >>> If change to: >>> =A0 =A0 =A0 =A0 =A0 =A0c =3D t.getCellByPosition(i, i); >>> =A0 =A0 =A0 =A0 =A0 =A0c.setStringValue("i"); >>> =A0 =A0 =A0 =A0 =A0 =A0System.out.println(i); >>> "java.lang.OutOfMemoryError" will be thrown when i =3D 941. >>> >>> So, Logic of Application, please give more info about it. >>> >>> 2011/10/24 Wilson Barbosa >>> >>> > Dear All, >>> > >>> > I'm newbie using the SimpleODS, im getting in trouble when creating a= big >>> > spreadsheet (about 140k rows). When the sheet if growing the performa= nce >>> is >>> > degrated and in a point i got "java.lang.OutOfMemory" exception. Ther= e is >>> a >>> > way of work in the sheet object like a file? >>> > >>> > The example of my code (reduced): >>> > ... >>> > // Cria objetos da planilha >>> > SpreadsheetDocument planilha =3D >>> > SpreadsheetDocument.newSpreadsheetDocument(); >>> > Table t =3D planilha.getSheetByIndex(proxima_sheet); >>> > t.setTableName("WKS_" + proxima_sheet); >>> > // >>> > Cell c =3D null; >>> > for(int i =3D 0; i<65000;i++){ >>> > =A0<> >>> > } >>> > ... >>> > >>> > >>> Many thanks., >>> > Wilson. >>> > >>> >>> >>> >>> -- >>> -Devin >>> >> >