incubator-odf-dev mailing list archives

From Rob Weir <robw...@apache.org>
Subject Re: Fwd: Google Summer of Code 2013
Date Fri, 22 Mar 2013 14:51:27 GMT
On Fri, Mar 22, 2013 at 10:34 AM, Svante Schubert
<svante.schubert@gmail.com> wrote:
> This time I will pass as mentor, but would like to comment on the SAX
> approach.
>
> Currently we are already using SAX (AFAIK DOM in general) to build up
> our own typed DOM tree, see
> http://svn.apache.org/viewvc/incubator/odf/trunk/odfdom/src/main/java/org/odftoolkit/odfdom/pkg/OdfFileSaxHandler.java?view=markup
>
> The DOM of the XML files will only be created when elements of the
> desired files are accessed - not when you load the overall document.
> If there is a very large document (let's say a presentation) we still have
> to parse the complete document, even if we only want a certain
> slide, as it is all in one content.xml.
> If there is a very large document (let's say a spreadsheet) we still have
> to parse the complete document, even if we only want a certain
> sheet or a range from it, as it is all in one content.xml.
> If there is a very large document (let's say text) we still have to parse
> the complete document, even if we only want a certain chapter or
> table of contents, as it is all in one content.xml.
> Did you have a special scenario in mind, Rob?
>

There is a difference between parsing the entire document and building
a DOM for everything in content.xml.

For example, I might just be looking to extract all hyperlinks from a
document.  Or I might want to replace all instances of "Sun
Microsystems" with "Oracle Corp.".  Instantiating the entire
content.xml DOM for tasks like this is overkill.
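
To illustrate the hyperlink case, here is a minimal sketch that uses only
the JDK's SAX and zip classes, not any ODFDOM API. It assumes hyperlinks
appear as text:a elements carrying an xlink:href attribute, and that the
package stores its body in content.xml. The class name HyperlinkScanner and
the method extractLinks are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: streams content.xml and collects link targets
// without ever building a DOM tree.
class HyperlinkScanner {
    static final String TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";
    static final String XLINK_NS = "http://www.w3.org/1999/xlink";

    // Parses the given content.xml stream and returns every xlink:href
    // found on a text:a element, in document order.
    static List<String> extractLinks(InputStream contentXml)
            throws IOException, SAXException, ParserConfigurationException {
        List<String> links = new ArrayList<>();
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // needed to match on namespace URIs
        factory.newSAXParser().parse(contentXml, new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (TEXT_NS.equals(uri) && "a".equals(localName)) {
                    String href = atts.getValue(XLINK_NS, "href");
                    if (href != null) {
                        links.add(href);
                    }
                }
            }
        });
        return links;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: HyperlinkScanner <file.odt>");
            return;
        }
        // An ODF package is a zip archive; read content.xml straight from it.
        try (ZipFile odf = new ZipFile(args[0])) {
            ZipEntry entry = odf.getEntry("content.xml");
            if (entry == null) {
                System.err.println("no content.xml in package");
                return;
            }
            try (InputStream in = odf.getInputStream(entry)) {
                extractLinks(in).forEach(System.out::println);
            }
        }
    }
}
```

The whole file is still parsed, but the only state held is the list of link
targets, so memory use should depend on the number of links rather than on
the size of the document.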

-Rob

> PS: For instance, operations for real-time collaboration could be
> created by the above SAX Interface.
>
> - Svante
>
> On 22.03.2013 14:23, Rob Weir wrote:
>> If anyone wants to mentor a GSoC student you need to get your idea
>> entered into JIRA now.  It looks like the deadline is this weekend.
>>
>> I wonder whether a streaming/scanning parser could be done in this
>> time frame?  We've seen some cases where our DOM-based solution takes
>> up too much memory and a more SAX-like approach would be better.
>>
>> Any other ideas?
>>
>> -Rob
>>
>>
>> ---------- Forwarded message ----------
>> From: Ulrich Stärk <uli@apache.org>
>> Date: Fri, Mar 22, 2013 at 5:01 AM
>> Subject: Re: Google Summer of Code 2013
>> To: pmcs@apache.org
>>
>>
>> Dear PMCs,
>>
>> I'm going to submit our application to Google this weekend but our ideas
>> list only shows 34 ideas until now. That's a shame considering that we have
>> over a hundred projects and were able to offer potential students 142
>> project ideas to choose from last year and it might also hinder our chances
>> of being accepted.
>>
>> Remember, GSoC is a great way to attract fresh blood to your projects and to
>> get work done that might otherwise go undone. It is in your own interest to
>> participate.
>>
>> Incubator mentors, please also talk to your respective podlings.
>>
>> If there is anything keeping you from participating, or anything that needs
>> clarification, don't hesitate to contact the community development project
>> at dev@community.apache.org or, if you want to keep the discussion private,
>> code-awards@apache.org.
>>
>> Cheers,
>>
>> Uli
>>
>> On 05.03.2013 16:26, Ulrich Stärk wrote:
>>> Hello PMCs,
>>>
>>> Google Summer of Code [1] is the ideal opportunity for you to attract new
>>> contributors to your projects.
>>>
>>> The ASF will apply as a participating organization meaning individual
>>> projects don't have to apply separately.
>>>
>>> If you want to participate with your project you NOW need to
>>>
>>> - understand what it means to be a mentor [2].
>>>
>>> - record your project ideas. Just create issues in JIRA, label them with
>>> gsoc2013, and they will show up at [3]. Please be as specific as possible
>>> when describing your idea. Include the programming language, the tools and
>>> skills required, but try not to scare potential students away. They are
>>> supposed to learn what's required before the program starts. Use labels,
>>> e.g. for the programming language (java, c, c++, erlang, python, brainfuck,
>>> ...) or technology area (cloud, xml, web, foo, bar, ...) and record them at
>>> [5]. Please use the COMDEV JIRA project for recording your ideas if your
>>> project doesn't use JIRA (e.g. httpd, ooo). Contact
>>> dev@community.apache.org if you need assistance.
>>>
>>> - subscribe to code-awards@apache.org (restricted to potential mentors,
>>> meant to be used as a private list - general discussions on the public
>>> dev@community.apache.org list as much as possible please). Use a recognized
>>> address when subscribing (@apache.org or one of your alias addresses on
>>> record).
>>>
>>> Note that the ASF isn't accepted yet, nevertheless you *really* should start
>>> recording your ideas now.
>>>
>>> Over the years we were able to complete hundreds of projects successfully.
>>> Some of our prior students are active contributors now! Let's make this a
>>> success again this year!
>>>
>>>
>>> Uli
>>>
>>> P.S.: Except for the private parts (label spreadsheet mostly), this email is
>>> free to be shared publicly if you want to.
>>>
>>> [1] http://www.google-melange.com/gsoc/homepage/google/gsoc2013
>>> [2] http://community.apache.org/guide-to-being-a-mentor.html
>>> [3] http://s.apache.org/gsoc2013ideas
>>> [4] http://community.apache.org/gsoc.html
>>> [5] http://s.apache.org/gsoclabels
>>>
>
