pdfbox-users mailing list archives

From Duane Nickull <du...@technoracle-systems.com>
Subject Re: ANN: AMI2-PDF2SVG conversion of PDF to semantic characters and graphics
Date Wed, 21 Nov 2012 19:03:48 GMT
Peter:

Sorry for the delay.  There are subtle differences between all the FOSS
licenses and there is an overview here:

http://opensource.org/licenses/category


I could not recommend one without knowing the intent but many of these are
likely good.

Duane Nickull
***********************************
Technoracle Advanced Systems Inc.
Consulting and Contracting; Proven Results!
i.  Neo4J, PDF, Java, LiveCycle ES, Flex, AIR, CQ5 & Mobile
b. http://technoracle.blogspot.com
t.  @duanechaos
"Don't fear the Graph!  Embrace Neo4J"

On 2012-11-17 3:20 PM, "Peter Murray-Rust" <pm286@cam.ac.uk> wrote:

>On Sat, Nov 17, 2012 at 7:10 PM, Duane Nickull <
>duane@technoracle-systems.com> wrote:
>
>> Very cool project!  I did not see any EULA on this declaring a GPL or
>> similar style license.
>
>
>Apache 2. I intended to include a LICENSE file but probably left it out by
>mistake. I will add it today. We don't use GPL because we want AMI2 to be
>usable in any sort of application.
>
>> What license are you using?  I would like to
>> introduce this work to some people.
>>
>
>We work in a completely Open manner. Some of us are connected with the
>Open Knowledge Foundation and its work (http://okfn.org), and we shall be
>using it for open Content Mining (though of course it can be used for any
>purpose). The AMI2 project as a whole will probably be coordinated (in a
>loose sense) there, as we are interested in the SciTechMed applications.
>
>We'd like to hear about people's experiences, and if anyone can contribute
>(say) information on fonts and glyphs, that is probably the most
>generically useful thing at present.
>
>Known issues:
>* bitmap images are bypassed just to save time and space in testing.
>  Should be an hour or two to add them.
>* some publishers use fonts in unusual colour maps and these have a
>  serious impact on performance (ten times slower). That probably needs a
>  small filter in PDFBox.
>* output is verbose (each character has a clip path). We can normalize
>  this. There's also quite a lot of debug output (e.g. <svg:title>, which
>  allows mouseover of characters for debugging).
>
>One issue is where we normalize characters. Authors and readers of STM
>documents are not well versed in typesetting and so INCREMENT (U+2206)
>would be replaced immediately by GREEK CAPITAL LETTER DELTA  (U+0394).
>Similarly we will expand ligatures ("ffl") which most people don't even
>know exist!
>
>P.
>
>>
>> Thank you for sharing!
>>
>> So people can share in return.
>
>
>-- 
>Peter Murray-Rust
>Reader in Molecular Informatics
>Unilever Centre, Dep. Of Chemistry
>University of Cambridge
>CB2 1EW, UK
>+44-1223-763069
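
The character normalization Peter describes above (INCREMENT replaced by GREEK CAPITAL LETTER DELTA, ligatures such as "ffl" expanded) might look roughly like the following. This is only a minimal sketch: the class name `CharNormalizer`, the method `normalize`, and the choice of an explicit replacement table are assumptions for illustration and are not taken from AMI2 or PDFBox.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of AMI2 or PDFBox. */
final class CharNormalizer {

    // Explicit replacement table (an assumption, not AMI2's actual table):
    // INCREMENT (U+2206) becomes GREEK CAPITAL LETTER DELTA (U+0394), and
    // the Latin f-ligatures U+FB00 to U+FB04 are expanded to plain letters.
    private static final Map<Character, String> REPLACEMENTS = new LinkedHashMap<>();
    static {
        REPLACEMENTS.put('\u2206', "\u0394");
        REPLACEMENTS.put('\uFB00', "ff");
        REPLACEMENTS.put('\uFB01', "fi");
        REPLACEMENTS.put('\uFB02', "fl");
        REPLACEMENTS.put('\uFB03', "ffi");
        REPLACEMENTS.put('\uFB04', "ffl");
    }

    private CharNormalizer() {
        // static utility, no instances
    }

    /** Returns a copy of text with every listed character replaced. */
    static String normalize(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            String replacement = REPLACEMENTS.get(c);
            if (replacement != null) {
                out.append(replacement);
            } else {
                out.append(c); // everything else passes through unchanged
            }
        }
        return out.toString();
    }
}
```

Calling `CharNormalizer.normalize(...)` on each extracted text run would apply both rules in one pass. An explicit table is used here because Unicode compatibility normalization (`java.text.Normalizer` with `Form.NFKC`) expands the ligatures but leaves U+2206 untouched, and it also folds other compatibility characters such as superscript digits, which may not be wanted in scientific text.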


