poi-dev mailing list archives

From Dominik Stadler <dominik.stad...@gmx.at>
Subject Re: [Suggestion] Enhancement for reading big excel files
Date Sun, 22 Jan 2017 20:49:43 GMT
Hi,

sorry for the long delay, I have been busy with other things lately and
none of the other committers stepped in.

Let's see some first code then. I am not sure that adding more complexity to
SXSSFWorkbook is a good thing; maybe we can at least separate out most of
the functionality cleanly so we do not clutter the already large class with
more code.

Please show it whenever you have at least some code in place. It does not
need to be complete, just a first proof of concept, so we can iterate on it
and ensure it fits into the overall code structure from the start.

Thanks... Dominik.

On Wed, Jan 4, 2017 at 4:35 AM, Renjith R <renjith.r.panikar@gmail.com>
wrote:

> Thanks a lot for the comments, Dominik.
>
> To answer your questions:
>
> * How would you ensure feature-parity compared to HSSF/XSSF implementation?
> There are a large number of things that are possible in a workbook, do you
> plan to support all those or only a subset?
>
> Well, I would like to start with the XSSF implementation, as I am not very
> familiar with the HSSF one.
>
> I am not looking to support only a subset, because no one is going to use it
> unless it supports at least the basic functionality.
>
> * The text seems to indicate that there is already some code
> available. Can we take a look? You can start a fork of Apache POI from
> https://github.com/apache/poi/ easily and do the changes there so others
> can take a look and suggest improvements/changes. Or is it a standalone
> piece of code?
>
> No plans for standalone code as long as we can incorporate it into existing functionality.
> Since we already have a class (org.apache.poi.xssf.streaming.SXSSFWorkbook) that is dedicated
> to reducing memory consumption, I would like to start with it and see if this can be added as
> a feature to it. I will also take a look at the code to see if we can leverage any existing
> functionality.
>
> * How would you ensure that the code is maintained over time? As this
> sounds like quite a large chunk of code, are you planning to continue to
> invest some time in the long run? We had some cases where code was
> "donated", but never looked at afterwards, which is bad as it increases the
> code-base, but also increases number of bug-reports and areas that are not
> well covered by tests.
>
> :) I am not looking for a 'code donation' here. I'll be around for a long time.
>
>
> On Sun, Dec 25, 2016 at 4:19 PM, Renjith R <renjith.r.panikar@gmail.com>
> wrote:
>
>> I don't know if you are able to see the screenshot in my previous mail.
>> The following was your comment.
>> I would start working on it if you think it is worth adding.
>>
>> *From: *Dominik Stadler <d...@gmx.at>
>> *Subject: *Re: Suggestion on how to read huge excel files.
>> *Date: *2015-06-20 15:24 (+0530)
>> *List: *user@poi.apache.org
>> <https://lists.apache.org/list.html?user@poi.apache.org>
>>
>> It seems not that many people need similar functionality currently,
>> however it looks useful for handling very large documents.
>>
>> I looked at it and it looks good, some comments:
>>
>> * The finalize() in the Beans looks strange and should not be needed,
>> these members are freed anyway and having to implement finalize()
>> always looks fishy!
>>
>> Thanks... Dominik.
>>
>>
>>
>> On Sun, Dec 25, 2016 at 4:12 PM, Renjith R <renjith.r.panikar@gmail.com>
>> wrote:
>>
>>> Ok. I recall that. It was you who did the code review that time.
>>>
>>>
>>>
>>> On Sun, Dec 25, 2016 at 4:04 PM, Renjith R <renjith.r.panikar@gmail.com>
>>> wrote:
>>>
>>>> Thanks, Dominik. I'll try to resend it.
>>>> Let me know if you can see the attachments.
>>>>
>>>> On Fri, Dec 23, 2016 at 6:57 AM, Renjith R <renjith.r.panikar@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Developers,
>>>>>
>>>>>     A couple of years back I suggested an enhancement to read very large
>>>>> Excel files using the StAX API. The document is attached. Unfortunately,
>>>>> I did not get a chance to work on it. Do you think it would make sense
>>>>> for me to start working on it? Kindly let me know your suggestions.
>>>>>
>>>>> regards,
>>>>> Renjith
>>>>>
>>>>
>>>>
>>>
>>
>
