pdfbox-users mailing list archives

From "Allison, Timothy B." <talli...@mitre.org>
Subject RE: DomXmpParser: namespace not found
Date Thu, 09 Jul 2015 13:35:19 GMT
From my perspective, it would be great to have a general xmp parser that also allows for
some variance from spec (PDFBOX-2855).  We've been using jempbox for pdfs as well as images
over on Tika, and it has worked well for us. 

I'd prefer to continue using your xmp parser, but I understand if you need to limit what you're
willing to take on.

I'll take a look at xmlgraphics, and I'll discuss with the Tika devs the fallback option
of moving jempbox into Tika.

Thank you.



-----Original Message-----
From: Maruan Sahyoun [mailto:sahyoun@fileaffairs.de] 
Sent: Thursday, July 09, 2015 4:56 AM
To: users@pdfbox.apache.org
Subject: Re: DomXmpParser: namespace not found


> On 08.07.2015 at 22:42, Tilman Hausherr <THausherr@t-online.de> wrote:
> On 08.07.2015 at 17:22, Allison, Timothy B. wrote:
>> All,
>> Apologies for the idiocy I'm about to reveal (well, that won't be a revelation to anyone, really), but is there an obvious solution for this kind of error:
>> Caused by: org.apache.xmpbox.xml.XmpParsingException: Cannot find a definition for the namespace http://ns.adobe.com/lightroom/1.0/
>>                 at org.apache.xmpbox.xml.DomXmpParser.checkPropertyDefinition(DomXmpParser.java:848)
>>                 at org.apache.xmpbox.xml.DomXmpParser.parseChildrenAsProperties(DomXmpParser.java:290)
>>                 at org.apache.xmpbox.xml.DomXmpParser.parseDescriptionRoot(DomXmpParser.java:234)
>>                 at org.apache.xmpbox.xml.DomXmpParser.parse(DomXmpParser.java:198)
>>                 at org.apache.xmpbox.xml.DomXmpParser.parse(DomXmpParser.java:105)
>>                 at org.apache.tika.parser.image.xmp.JempboxExtractor.parse(JempboxExtractor.java:59)
>> On a handful of image files in our test docs on Tika, I'm getting this with:
>> http://ns.adobe.com/lightroom/1.0/
>> http://ns.adobe.com/exif/1.0/aux/
> These namespaces are not supported by xmpbox. We've had this problem with another namespace (I can't remember which one), and it wasn't possible to support it because we couldn't find a schema definition.
> But you say these are image files. So this isn't about pdf xmp.
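
To see which namespaces a given XMP packet declares before handing it to a strict parser, something like the following might help. This is only a minimal sketch using the JDK's own DOM parser, not xmpbox API. The class name, the SAMPLE packet and the KNOWN allow-list are hypothetical and chosen for illustration; KNOWN in particular is not xmpbox's actual schema registry.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class XmpNamespaceScan {

    // Hypothetical allow-list for illustration only; NOT xmpbox's schema registry.
    static final Set<String> KNOWN = Set.of(
            "adobe:ns:meta/",
            "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
            "http://purl.org/dc/elements/1.1/",
            "http://ns.adobe.com/xap/1.0/");

    // Hypothetical XMP packet declaring the two namespaces from the stack trace.
    static final String SAMPLE =
            "<x:xmpmeta xmlns:x=\"adobe:ns:meta/\">"
          + "<rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">"
          + "<rdf:Description rdf:about=\"\""
          + " xmlns:lr=\"http://ns.adobe.com/lightroom/1.0/\""
          + " xmlns:aux=\"http://ns.adobe.com/exif/1.0/aux/\">"
          + "<aux:Lens>50mm</aux:Lens>"
          + "</rdf:Description></rdf:RDF></x:xmpmeta>";

    // Collect every namespace URI declared via xmlns / xmlns:prefix attributes.
    static Set<String> declaredNamespaces(String xmp) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        // Refuse DOCTYPE declarations, since the input may be untrusted.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xmp.getBytes(StandardCharsets.UTF_8)));

        Set<String> found = new LinkedHashSet<>();
        NodeList elements = doc.getElementsByTagName("*"); // all elements, root included
        for (int i = 0; i < elements.getLength(); i++) {
            NamedNodeMap attrs = elements.item(i).getAttributes();
            for (int j = 0; j < attrs.getLength(); j++) {
                Node attr = attrs.item(j);
                String name = attr.getNodeName();
                if (name.equals("xmlns") || name.startsWith("xmlns:")) {
                    found.add(attr.getNodeValue());
                }
            }
        }
        return found;
    }

    // Declared namespaces that are missing from the allow-list.
    static Set<String> unknownNamespaces(String xmp) throws Exception {
        Set<String> unknown = new LinkedHashSet<>(declaredNamespaces(xmp));
        unknown.removeAll(KNOWN);
        return unknown;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Namespaces outside the allow-list: " + unknownNamespaces(SAMPLE));
    }
}
```

A caller could use the result to decide up front whether to try the strict parser or go straight to a more lenient one, rather than relying on the exception.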

xmpbox is targeted at PDF/A-1. So I think we should discuss extending it to support the
metadata requirements of other PDF standards as well as generic XMP use cases, so that we
again have a generic XMP library. OTOH there is org.apache.xmlgraphics.xmp



> Tilman
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
> For additional commands, e-mail: users-help@pdfbox.apache.org
