From: Nick Burch
To: POI Developers List
Date: Tue, 16 Dec 2014 09:31:24 +0000 (GMT)
Subject: Re: Using MapDB to reduce memory footprint of shared strings table in SXSSF

On Tue, 16 Dec 2014, Sumedh wrote:

> 1. For a quick win, is it possible to provide a hook so that we can plug
> in an overridden implementation of SharedStringTable class?
> As far as I saw, there is no clean pluggability available right now
> (but I have very little understanding of the POI codebase).

We'd need to tweak things to allow that. However, is working at the CTRst
level going to be good for you with MapDB or similar? Will serialising then
deserialising those cause you lots of problems / overhead? Would there be
a better "thing" to pass back and forth between XSSF / SXSSF / SAX code
for a shared string?

(There has been discussion lately about trying to reduce the number of
xmlbeans objects on public interfaces, so that a switch to something like
jaxp could be done later if we want to, so this is one case where we can
consider it)

> 2. If that works well, we can explore using MapDB as one of the options
> to be used natively after considering all the other factors (like
> licensing and size)... or maybe some other smaller library focused only
> on this aspect, or Alex's homegrown code. :)
>
> BTW, MapDB is free as in speech and free as in beer under Apache
> License 2.0. :)
> - https://github.com/jankotek/MapDB/blob/master/license.txt

And small too, so I don't see any major issues with making it an option
for people who want lower memory use at the cost of more IO when reading,
assuming we can't find a better one (e.g. from Alex or Lucene!)

Nick
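
For illustration only, here is a rough sketch of what a pluggable hook of the kind discussed above might look like if a plain String were passed around instead of CTRst. The names SharedStringStore and InMemorySharedStringStore are made up for this sketch and are not part of POI; rich-text runs are ignored, and a MapDB-backed or file-backed variant is assumed to implement the same interface.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical hook, not an existing POI interface. */
interface SharedStringStore {
    /** Adds the text if it has not been seen and returns its zero-based index. */
    int addEntry(String text);

    /** Returns the text stored at the given index. */
    String getEntry(int index);

    /** Number of distinct strings stored so far. */
    int getUniqueCount();
}

/** Heap-backed default; a disk-backed variant would implement the same interface. */
final class InMemorySharedStringStore implements SharedStringStore {
    private final Map<String, Integer> indexByText = new HashMap<>();
    private final List<String> textByIndex = new ArrayList<>();

    @Override
    public int addEntry(String text) {
        Integer existing = indexByText.get(text);
        if (existing != null) {
            return existing; // duplicate: reuse the index already assigned
        }
        int index = textByIndex.size();
        textByIndex.add(text);
        indexByText.put(text, index);
        return index;
    }

    @Override
    public String getEntry(int index) {
        return textByIndex.get(index);
    }

    @Override
    public int getUniqueCount() {
        return textByIndex.size();
    }
}
```

Keeping the interface free of xmlbeans types means the calling code would not need to change if the backing store were swapped, at the price of dropping formatting information in this simplified form.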