pdfbox-dev mailing list archives

From Jeremias Maerki <...@jeremias-maerki.ch>
Subject Re: PDFBox Project for GSoC 2012
Date Tue, 20 Mar 2012 08:32:28 GMT
Hey guys

On 18.03.2012 03:16:14 Tharaka Nayanajith Wijebandara wrote:
> Hi,
> 
> 
> Thanks mehdi.
> 
> 
> I have two ideas for a GSoC task, but need all of your help to select
> suitable one.
> 
> 
>    - One project is HTML to PDF and vice versa converter. This feature can
>    be found in JIRA also (https://issues.apache.org/jira/browse/PDFBOX-6,
>    https://issues.apache.org/jira/browse/PDFBOX-9)

HTML to PDF sounds like it requires a full layout engine, which is a big
undertaking. Using (parts of) Apache FOP as a base would seem to me to
be a better choice than PDFBox because of its infrastructure to generate
many output formats, not just PDF. Please note that there are already
tools (like Flying Saucer) that do that, although they have the "wrong"
license. :-) Anyway, having a good HTML/CSS engine @Apache would be a
killer. But it's something that clearly goes beyond a GSoC project.

As an alternative to that, some of you may remember recent discussions
about the desire for an API to create simple layouts with PDFBox. I
think that was coming from the XDocReport and Apache ODF Toolkit corners
(search for "Angelo Zerr").

PDF to HTML is surely a great use case for PDFBox. One thing that could
be very interesting in this context would be to use the structure tree 
(tagged PDF) if it is available to improve the HTML output. Pure text
extraction might also profit from that.
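
To make the structure-tree idea a bit more concrete, here is a minimal
sketch in plain Java. It assumes the tagged-PDF structure elements have
already been read into a simple tree; the Node class and the
StructureTreeToHtml name are hypothetical stand-ins, not PDFBox API, and
the mapping covers only a handful of the standard structure types
(H1 to H6, P, L, LI, Table, TR, TH, TD, Span).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class StructureTreeToHtml {

    /** Hypothetical stand-in for a tagged-PDF structure element (not a PDFBox class). */
    static final class Node {
        final String type;                        // structure type, e.g. "H1", "P", "L", "LI"
        final String text;                        // text content of this element, or null
        final List<Node> kids = new ArrayList<>();

        Node(String type, String text) {
            this.type = type;
            this.text = text;
        }

        Node add(Node kid) {
            kids.add(kid);
            return this;
        }
    }

    // A few standard structure types mapped to HTML tags (assumed, simplified mapping).
    static final Map<String, String> TAGS = new HashMap<>();
    static {
        TAGS.put("Document", "div");
        for (int i = 1; i <= 6; i++) {
            TAGS.put("H" + i, "h" + i);
        }
        TAGS.put("P", "p");
        TAGS.put("L", "ul");
        TAGS.put("LI", "li");
        TAGS.put("Table", "table");
        TAGS.put("TR", "tr");
        TAGS.put("TH", "th");
        TAGS.put("TD", "td");
        TAGS.put("Span", "span");
    }

    /** Recursively renders a node and its children; unknown types fall back to a div. */
    static String toHtml(Node n) {
        String tag = TAGS.getOrDefault(n.type, "div");
        StringBuilder sb = new StringBuilder();
        sb.append('<').append(tag).append('>');
        if (n.text != null) {
            sb.append(escape(n.text));
        }
        for (Node kid : n.kids) {
            sb.append(toHtml(kid));
        }
        sb.append("</").append(tag).append('>');
        return sb.toString();
    }

    /** Escapes the characters that would otherwise break the HTML markup. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        Node doc = new Node("Document", null)
            .add(new Node("H1", "PDFBox & GSoC"))
            .add(new Node("P", "Tagged PDF keeps headings and lists."))
            .add(new Node("L", null)
                .add(new Node("LI", "Viewer"))
                .add(new Node("LI", "PDF to HTML")));
        System.out.println(toHtml(doc));
    }
}
```

The point is only that headings, paragraphs and lists come straight from
the tags instead of being guessed from font sizes and positions. Without
tags, a converter would have to fall back to such heuristics.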

> 
>    - Other one is enhancing features of PDF reader and zooming features,
>    page display features, bookmark navigator, page thumbnail viewer can be
>    very much useful. Since I have previous experience in awt, swing and
>    java2d, it will be easy for me.

Improving the PDF Viewer would be soooo cool! I'm still dreaming of that
Adobe Acrobat Professional analog using PDFBox, i.e. a well-designed
GUI-application base that can easily be extended with plug-ins for more
than just PDF viewing: integrated PDF Debugger (which I use all the time),
type writer feature, object inspection by point and click, page rotation,
insert/remove/move pages, extract images, image to PDF etc. etc. After
all, PDFBox already has so many of the features required but they are
mostly accessible only to developers or from the command-line. If only I
had time to do it, I would already have started with it.

> 
> There might be several other tasks which are important than this. So all of
> you are welcome, to reply with good ideas.
> 
> On Sat, Mar 17, 2012 at 5:01 PM, mehdi houshmand <med1985@gmail.com> wrote:
> 
> > Hi Tharaka,
> >
> > Have you had any more thoughts on a project you'd like to undertake?
> > Have you applied and been through all the admin needed to be accepted
> > into GSoC 2012? Let me know if you need any help.
> >
> > Mehdi
> >
> > On 9 March 2012 06:25, Andreas Lehmkuehler <andreas@lehmi.de> wrote:
> > > Hi,
> > >
> > > Am 07.03.2012 07:40, schrieb mehdi houshmand:
> > >>
> > >> Hi Andreas,
> > >>
> > >> Sorry, maybe I wasn't clear, I am an ASF committer... Just not to
> > >> PDFBox... I do have domain expertise being a full-time FOP developer
> > >> and having dealt with PDFs and fonts quite a bit. Should I pop an
> > >> email to dev-community to see if it's ok? It seems like such a waste
> > >> to have an interested applicant but no mentor...
> > >
> > > I'm not a GSoC expert but that sounds good to me. You may double check
> > > with the dev-community, but IMHO it's not necessary.
> > > I'm glad that you volunteer to help us, thanks in advance. I'll try to
> > > help as much as I can.
> > >
> > >
> > > BR
> > > Andreas Lehmkühler
> > >
> > >
> > >> Mehdi
> > >>
> > >> On 6 March 2012 21:32, Andreas Lehmkuehler<andreas@lehmi.de>  wrote:
> > >>>
> > >>> Hi,
> > >>>
> > >>>
> > >>> Am 06.03.2012 21:24, schrieb mehdi houshmand:
> > >>>
> > >>>> Hi Andreas,
> > >>>>
> > >>>> Does the mentor need to be a PDFBox committer? If not, I wouldn't mind
> > >>>> putting myself forward as a candidate... Of course, that is if no one
> > >>>> else does.
> > >>>
> > >>>
> > >>> Thanks for the offer, but AFAIKT it's not possible. According to [1]
> > >>> the mentor has to be an ASF member or committer.
> > >>>
> > >>>
> > >>>> Mehdi
> > >>>>
> > >>>> On 6 March 2012 18:43, Andreas Lehmkuehler<andreas@lehmi.de> wrote:
> > >>>>>
> > >>>>>
> > >>>>> Hi,
> > >>>>>
> > >>>>> Am 29.02.2012 03:50, schrieb Tharaka Nayanajith Wijebandara:
> > >>>>>
> > >>>>>> Hi,
> > >>>>>>
> > >>>>>>
> > >>>>>> I'm university student in Sri Lanka and a newbie to Open Source
> > >>>>>> Development. I would like to participate for Google Summer of Code
> > >>>>>> 2012 with an Apache Project. Since I'm familiar with Java and I have
> > >>>>>> used PDFBox Library for my academic project, I like to develop new
> > >>>>>> feature for PDFBox as my GSoC project. First of all I want to know
> > >>>>>> that is it possible to participate for GSoC 2012 with PDFBox project?
> > >>>>>>
> > >>>>>>
> > >>>>>> If it is yes, I want help from PDFBox development community to
> > >>>>>> select appropriate PDFBox task for GSoC.
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> There is a lot to do and I'm sure that some of those jobs should
> > >>>>> qualify as GSoC task.
> > >>>>>
> > >>>>>
> > >>>>>> If you have any idea about good project or advice for me, please
> > >>>>>> reply to this.
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> You will need a mentor and I'm not sure if you will find one among
> > >>>>> our ranks. I'd like to support you, but I can't do it due to personal
> > >>>>> reasons.
> > >>>>>
> > >>>>> Anybody else?
> > >>>>>
> > >>>>>
> > >>>>> BR
> > >>>>> Andreas Lehmkühler
> > >>>
> > >>>
> > >>>
> > >>> BR
> > >>> Andreas Lehmkühler
> > >>>
> > >>> [1] http://community.apache.org/guide-to-being-a-mentor.html
> > >
> > >
> >
> 
> 
> 
> -- 
> Thanks & Regards,
> Tharaka Wijebandara,
> Faculty of Information Technology,
> University of Moratuwa.




Jeremias Maerki

