incubator-odf-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Rob Weir <robw...@apache.org>
Subject Re: HugeData in a Spreadsheet
Date Sat, 29 Oct 2011 00:59:43 GMT
On Fri, Oct 28, 2011 at 12:57 AM, Wilson Barbosa
<wilson.barbosa@gmail.com> wrote:
> Hi Devin and Rob,
>
> this is an adaptation of my original code:
>

If I'm reading this right, you have 14 sheets, each one with 10,000
rows and 13 columns of data.  That comes out to nearly 2 million
cells.

Aside from increasing your heap size, I'm not sure there is anything
you can do in the current version of the ODF Toolkit to bring this
down, if you're dealing with that amount of data.

One trick might be to use temp files.  So create 14 temporary ODF
documents, each with one sheet.  Then write some custom code to
combine the temp files.  But don't instantiate an XML DOM for the
final combine step.  Just create a new content.xml that concatenates
the markup from the temp files.

In the next release we might look at doing String.intern() on cell
values, and see if that helps.  When a spreadsheet is used as a
database, there are often many duplicate strings.

-Rob

> 8<----
> public void teste_table() throws Exception {
>        int next_sheet = 0;
>        long next_line = 1;
>        // Cria objetos da planilha
>        SpreadsheetDocument planilha = SpreadsheetDocument
>                .newSpreadsheetDocument();
>
>        Table t = planilha.getSheetByIndex(0);
>        t.setTableName("wks_" + next_sheet);
>        for (long i = 1; i < 140000; i++) {
>            t.getCellByPosition("A" +
> String.valueOf(next_line)).setStringValue(
>                    "*A******");
>            t.getCellByPosition("B" +
> String.valueOf(next_line)).setStringValue(
>                    "*B******");
>            t.getCellByPosition("C" +
> String.valueOf(next_line)).setStringValue("*C*");
>            t.getCellByPosition("D" +
> String.valueOf(next_line)).setStringValue("*D*");
>            t.getCellByPosition("E" +
> String.valueOf(next_line)).setStringValue("*E*");
>            t.getCellByPosition("F" +
> String.valueOf(next_line)).setStringValue("*F*");
>            t.getCellByPosition("G" +
> String.valueOf(next_line)).setStringValue("*G*");
>            t.getCellByPosition("H" +
> String.valueOf(next_line)).setStringValue("*H*");
>            t.getCellByPosition("I" +
> String.valueOf(next_line)).setStringValue("*I*");
>            t.getCellByPosition("J" +
> String.valueOf(next_line)).setStringValue("*J*");
>            t.getCellByPosition("K" +
> String.valueOf(next_line)).setStringValue("*K*");
>            t.getCellByPosition("L" +
> String.valueOf(next_line)).setStringValue("*L*");
>            t.getCellByPosition("M" +
> String.valueOf(next_line)).setStringValue("*M*");
>            //
>            next_line++;
>            //
>            System.out.println(i);
>            //
>            if(i % 10000 == 0){
>                next_sheet++;
>                next_line = 1;
>                t = planilha.appendSheet("wks" + next_sheet);
>            }
>
>        }
>        planilha.save("/tmp/big_worksheet.ods");
>        planilha.close();
>    }
> 8<-------
>
> I had some variations, for example using a Cell object to better work with
> the formatation and data values...
>
> I get a problem similar when using a TreeMap and i discovered a jdbm2
> implementation of TreeMap in disk, but i dont know how to use a view of a
> SpreadsheetDocument in disk.
>
> Thanks in advance,
> Wilson.
>
>
>
>
> 2011/10/25 Devin Han <devinhan@apache.org>
>
>> Hi Wilson,
>>
>> I test two cases.
>>
>> If the <<Logic of Application>> is
>>            c = t.getCellByPosition(0, i);
>>            c.setStringValue("i");
>> There is no exception, everything works well.
>>
>> If change to:
>>            c = t.getCellByPosition(i, i);
>>            c.setStringValue("i");
>>            System.out.println(i);
>> "java.lang.OutOfMemoryError" will be thrown when i = 941.
>>
>> So, Logic of Application, please give more info about it.
>>
>> 2011/10/24 Wilson Barbosa <wilson.barbosa@gmail.com>
>>
>> > Dear All,
>> >
>> > I'm newbie using the SimpleODS, im getting in trouble when creating a big
>> > spreadsheet (about 140k rows). When the sheet if growing the performance
>> is
>> > degrated and in a point i got "java.lang.OutOfMemory" exception. There is
>> a
>> > way of work in the sheet object like a file?
>> >
>> > The example of my code (reduced):
>> > ...
>> > // Cria objetos da planilha
>> > SpreadsheetDocument planilha =
>> > SpreadsheetDocument.newSpreadsheetDocument();
>> > Table t = planilha.getSheetByIndex(proxima_sheet);
>> > t.setTableName("WKS_" + proxima_sheet);
>> > //
>> > Cell c = null;
>> > for(int i = 0; i<65000;i++){
>> >  <<Logic of Application>>
>> > }
>> > ...
>> >
>> >
>> Many thanks.,
>> > Wilson.
>> >
>>
>>
>>
>> --
>> -Devin
>>
>

Mime
View raw message