From: "Allison, Timothy B."
To: "users@pdfbox.apache.org"
CC: "dev@tika.apache.org"
Subject: RE: DomXmpParser: namespace not found
Date: Thu, 9 Jul 2015 13:35:19 +0000

From my perspective, it would be great to have a general XMP parser that also allows for some variance from the spec (PDFBOX-2855). We've been using jempbox for PDFs as well as images over on Tika, and it has worked well for us.

I'd prefer to continue using your XMP parser, but I understand if you need to limit what you're willing to take on.

I'll take a look at xmlgraphics, and I'll discuss with the Tika devs the fallback option of moving jempbox into Tika.

Thank you.

Cheers,

Tim

-----Original Message-----
From: Maruan Sahyoun [mailto:sahyoun@fileaffairs.de]
Sent: Thursday, July 09, 2015 4:56 AM
To: users@pdfbox.apache.org
Subject: Re: DomXmpParser: namespace not found

Hi,

> On 08.07.2015 at 22:42, Tilman Hausherr wrote:
>
> On 08.07.2015 at 17:22, Allison, Timothy B. wrote:
>> All,
>> Apologies for the idiocy I'm about to reveal (well, that won't be a revelation to anyone, really), but is there an obvious solution for this kind of error:
>>
>> Caused by: org.apache.xmpbox.xml.XmpParsingException: Cannot find a definition for the namespace http://ns.adobe.com/lightroom/1.0/
>>         at org.apache.xmpbox.xml.DomXmpParser.checkPropertyDefinition(DomXmpParser.java:848)
>>         at org.apache.xmpbox.xml.DomXmpParser.parseChildrenAsProperties(DomXmpParser.java:290)
>>         at org.apache.xmpbox.xml.DomXmpParser.parseDescriptionRoot(DomXmpParser.java:234)
>>         at org.apache.xmpbox.xml.DomXmpParser.parse(DomXmpParser.java:198)
>>         at org.apache.xmpbox.xml.DomXmpParser.parse(DomXmpParser.java:105)
>>         at org.apache.tika.parser.image.xmp.JempboxExtractor.parse(JempboxExtractor.java:59)
>>
>> On a handful of image files in our test docs on Tika, I'm getting this with:
>>
>> http://ns.adobe.com/lightroom/1.0/
>> http://ns.adobe.com/exif/1.0/aux/
>>
>
> These namespaces are not supported by xmpbox. We've had this problem with another namespace (I can't remember which one), and it wasn't possible to support it because we couldn't find a schema definition.
>
> But you say these are image files. So this isn't about PDF XMP.

xmpbox is targeted around PDF/A-1. So I think we should discuss extending it to support other PDF standard metadata requirements as well as generic XMP use cases, so that we again have a generic XMP library. OTOH there is org.apache.xmlgraphics.xmp

WDYT?

BR
Maruan

>
> Tilman
>

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org
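
To make the "allow some variance from spec" idea in this thread concrete, here is a minimal sketch that assumes only the JDK's DOM parser. It lists the namespace URIs an XMP packet declares that are missing from a caller-supplied set of known namespaces, so that a caller could log or skip them instead of failing the whole parse. The class name XmpNamespaceScan, the method unknownNamespaces, and the sample packet are hypothetical; this is not how xmpbox, jempbox, or Tika handle unknown namespaces.

```java
import java.io.StringReader;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only; not part of xmpbox, jempbox, or Tika.
class XmpNamespaceScan {

    /** Returns the namespace URIs declared in the packet that are absent from the known set. */
    static Set<String> unknownNamespaces(String xmpPacket, Set<String> known) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        // Embedded metadata is untrusted input, so refuse DOCTYPE declarations.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xmpPacket)));

        Set<String> unknown = new LinkedHashSet<>();
        NodeList elements = doc.getElementsByTagName("*"); // every element, root included
        for (int i = 0; i < elements.getLength(); i++) {
            NamedNodeMap attrs = elements.item(i).getAttributes();
            for (int j = 0; j < attrs.getLength(); j++) {
                Node attr = attrs.item(j);
                String name = attr.getNodeName();
                // Namespace declarations appear as xmlns or xmlns:prefix attributes.
                boolean isDeclaration = name.equals("xmlns") || name.startsWith("xmlns:");
                if (isDeclaration && !known.contains(attr.getNodeValue())) {
                    unknown.add(attr.getNodeValue());
                }
            }
        }
        return unknown;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample packet declaring the two namespaces from the error report.
        String packet =
            "<x:xmpmeta xmlns:x=\"adobe:ns:meta/\">"
          + "<rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">"
          + "<rdf:Description rdf:about=\"\""
          + " xmlns:dc=\"http://purl.org/dc/elements/1.1/\""
          + " xmlns:lr=\"http://ns.adobe.com/lightroom/1.0/\""
          + " xmlns:aux=\"http://ns.adobe.com/exif/1.0/aux/\""
          + " aux:Lens=\"example\"/>"
          + "</rdf:RDF></x:xmpmeta>";

        // The known set is an assumption for this example, not xmpbox's actual schema list.
        Set<String> known = new HashSet<>(Arrays.asList(
            "adobe:ns:meta/",
            "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
            "http://purl.org/dc/elements/1.1/"));

        for (String uri : unknownNamespaces(packet, known)) {
            System.out.println("Skipping properties in unknown namespace: " + uri);
        }
    }
}
```

A pre-scan like this only reports what a strict parser would reject. Whether to drop those properties, keep them as untyped text, or fail is still a policy decision for the library or the caller.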