incubator-sanselan-dev mailing list archives

From "Charles Matthew Chen" <charlesmc...@gmail.com>
Subject Re: Metadata use by Apache Java projects
Date Tue, 20 Nov 2007 03:39:08 GMT
Hi Jeremias & Antoine,

   Antoine, it looks like you found it pretty easy to convert
Sanselan's metadata into XMP format.

   Jeremias, it sounds like you are considering a new project which can
translate data from many formats (read by a variety of projects) into
XMP.  That sounds great!

   Sanselan could not use XMP internally to represent metadata,
though.  Sanselan's goal is to read & write metadata (such as EXIF
metadata) preserving not just tag values but directory structure,
field order, field location, etc.  I'm in the process of refactoring
the metadata data structures at the moment, actually, in order to
approach binary compatibility as closely as possible.
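
   To illustrate the point, here is a minimal sketch of what a flat
property model loses. The Directory and Field classes are hypothetical
stand-ins, not Sanselan's actual data structures, and the byte offsets
are invented. Root and Sub each carry their own XResolution and
YResolution, as in the TiffImageMetadata dump quoted below; flattening by
tag name keeps only one of each and drops the directory, order and
location.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical types for illustration only; these are not Sanselan classes.
class OrderedMetadataSketch {

    /** One field as it sits in the file: tag name, value text, byte offset. */
    static final class Field {
        final String tag;
        final String value;
        final long offset; // invented numbers, only to show that location is kept

        Field(String tag, String value, long offset) {
            this.tag = tag;
            this.value = value;
            this.offset = offset;
        }
    }

    /** A directory keeps its fields in file order. */
    static final class Directory {
        final String name;
        final List<Field> fields = new ArrayList<>();

        Directory(String name) {
            this.name = name;
        }

        Directory add(String tag, String value, long offset) {
            fields.add(new Field(tag, value, offset));
            return this;
        }
    }

    /** Flattens all directories into tag -> value, as a plain property bag would. */
    static Map<String, String> flatten(List<Directory> dirs) {
        Map<String, String> props = new LinkedHashMap<>();
        for (Directory dir : dirs) {
            for (Field field : dir.fields) {
                props.put(field.tag, field.value); // a later directory overwrites an earlier one
            }
        }
        return props;
    }

    /** Two directories that both contain resolution fields. */
    static List<Directory> sample() {
        List<Directory> dirs = new ArrayList<>();
        dirs.add(new Directory("Root").add("XResolution", "72", 100).add("YResolution", "72", 108));
        dirs.add(new Directory("Sub").add("XResolution", "72", 9500).add("YResolution", "72", 9508));
        return dirs;
    }

    public static void main(String[] args) {
        List<Directory> dirs = sample();
        int fieldCount = 0;
        for (Directory dir : dirs) {
            fieldCount += dir.fields.size();
        }
        System.out.println("fields in ordered structure: " + fieldCount);       // 4
        System.out.println("properties after flattening: " + flatten(dirs).size()); // 2
    }
}
```

   Writing the file back from the flattened map could not reproduce the
original layout, which is why a layout-preserving structure is needed
internally even if XMP is offered as an export format.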

Charles.




On Nov 19, 2007 8:57 AM, Antoine Moreau de Bellaing <amdb@enst.fr> wrote:
> Thank you for your advice....
> This class might help in converting EXIF metadata into XMP
> by using Adobe's Toolkit
>
>
> Regards,
> Antoine Moreau de Bellaing.
>
> import java.io.File;
> import java.io.IOException;
> import java.util.Vector;
>
> import org.cmc.sanselan.ImageReadException;
> import org.cmc.sanselan.Sanselan;
> import org.cmc.sanselan.common.IImageMetadata;
> import org.cmc.sanselan.formats.jpeg.JpegImageMetadata;
> import org.cmc.sanselan.formats.tiff.TiffDirectory;
> import org.cmc.sanselan.formats.tiff.TiffField;
> import org.cmc.sanselan.formats.tiff.TiffImageMetadata;
>
> import com.adobe.xmp.XMPConst;
> import com.adobe.xmp.XMPException;
> import com.adobe.xmp.XMPMeta;
> import com.adobe.xmp.XMPMetaFactory;
>
> public class XMPMetadataExample
> {
>         public static void metadataExample(File file)
>                         throws ImageReadException, IOException, XMPException
>         {
>                 IImageMetadata metadata = Sanselan.getMetadata(file);
>
>
>                 if (metadata instanceof JpegImageMetadata)
>                 {
>                         JpegImageMetadata jpegMetadata = (JpegImageMetadata) metadata;
>                         XMPMeta meta = xmpMeta(jpegMetadata);
>                         System.out.println(XMPMetaFactory.serializeToString(meta, null));
>
>                 }
>         }
>
>         private static XMPMeta xmpMeta(JpegImageMetadata jpegMetadata)
>                         throws ImageReadException, IOException, XMPException
>         {
>                 XMPMeta meta = XMPMetaFactory.create();
>                 Vector dirs = jpegMetadata.getExif().getDirectories();
>                 for (int i = 0; i < dirs.size(); i++)
>                 {
>                         TiffImageMetadata.Directory dir =
>                                         (TiffImageMetadata.Directory) dirs.get(i);
>
>                         Vector items = dir.getItems();
>                         for (int j = 0; j < items.size(); j++)
>                         {
>                                 Object item = items.get(j);
>                                 TiffImageMetadata.Item tiffItem =
>                                                 (TiffImageMetadata.Item) item;
>                                 TiffField field = tiffItem.getTiffField();
>                                 // Skip directories that have no XMP namespace mapping.
>                                 String ns = namespace(dir.type);
>                                 if (ns != null)
>                                 {
>                                         meta.setProperty(ns, field.getTagName(),
>                                                         field.getValueDescription());
>                                 }
>
>                         }
>                 }
>                 return meta;
>         }
>
>
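>         // Maps a Sanselan directory type to an XMP namespace URI, or null
>         // when this example has no mapping for the directory.
>         public static final String namespace(int type)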
>         {
>                 switch (type)
>                 {
>                         case TiffDirectory.DIRECTORY_TYPE_UNKNOWN :
>                                 return null;
>                         case TiffDirectory.DIRECTORY_TYPE_ROOT :
>                                 return XMPConst.NS_TIFF;
>                         case TiffDirectory.DIRECTORY_TYPE_SUB :
>                                 return null;
>                         case TiffDirectory.DIRECTORY_TYPE_THUMBNAIL :
>                                 return null;
>                         case TiffDirectory.DIRECTORY_TYPE_EXIF :
>                                 return XMPConst.NS_EXIF;
>                         case TiffDirectory.DIRECTORY_TYPE_GPS :
>                                 return null;
>                         case TiffDirectory.DIRECTORY_TYPE_INTEROPERABILITY :
>                                 return null;
>                         default :
>                                 return null;
>                 }
>         }
> }
>
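
One caveat with passing getValueDescription() straight through: comparing
the Sanselan dump with the Bridge output quoted further down, some values
are written differently (ExifVersion "48, 50, 50, 49" vs "0221",
XResolution "72" vs "72/1", DateTimeOriginal with "+0200" vs "+02:00").
The following is only a sketch of such conversions using plain Java; the
class and method names are made up for illustration, and the rules are
inferred from those two dumps rather than taken from the XMP
specification, which remains the authority for the real mappings.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper; not part of Sanselan or Adobe's XMP Toolkit.
class ExifToXmpValueSketch {

    // Input form seen in the Sanselan dump: 2007-10-06T16:47:56.000+0200
    private static final DateTimeFormatter DUMP_DATE =
            DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ss.SSSxx");

    // Output form seen in the Bridge XMP: 2007-10-06T16:47:56+02:00
    private static final DateTimeFormatter XMP_DATE =
            DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ssxxx");

    /** Turns ASCII byte values into text, e.g. 48, 50, 50, 49 -> "0221". */
    static String asciiBytes(int[] bytes) {
        StringBuilder text = new StringBuilder();
        for (int b : bytes) {
            text.append((char) b);
        }
        return text.toString();
    }

    /** Writes a whole number as a rational ("72" -> "72/1"); leaves "1/60" alone. */
    static String rational(String value) {
        return value.indexOf('/') >= 0 ? value : value + "/1";
    }

    /** Drops the milliseconds and adds the colon to the UTC offset. */
    static String xmpDate(String value) {
        return OffsetDateTime.parse(value, DUMP_DATE).format(XMP_DATE);
    }

    public static void main(String[] args) {
        System.out.println(asciiBytes(new int[] {48, 50, 50, 49})); // 0221
        System.out.println(rational("72"));                         // 72/1
        System.out.println(rational("1/60"));                       // 1/60
        System.out.println(xmpDate("2007-10-06T16:47:56.000+0200")); // 2007-10-06T16:47:56+02:00
    }
}
```

Structured values such as exif:Flash or the rdf:Seq for ISOSpeedRatings
would need more than string rewriting and are not covered by this sketch.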
> On 19 Nov. 07 at 12:00, Jeremias Maerki wrote:
>
>
> > Cool, this proves my point that XMP is useful. ;-)
> >
> > AFAIK, JPEG metadata is usually not embedded as XMP but as EXIF/IPTC
> > data. In this case, the EXIF and IPTC chunks would have to be converted
> > into the XMP representation. I guess that's what Adobe's Bridge does.
> > That's exactly what would need to be done if my proposal were
> > implemented.
> >
> > So, if you want to do it now (i.e. before we've reached a conclusion)
> > you'll have to extract every single value from the metadata directory
> > and put it into the structure exposed by Adobe's XMP Toolkit. To get
> > the individual values, see:
> > https://svn.apache.org/repos/asf/incubator/sanselan/trunk/src/main/java/org/cmc/sanselan/sampleUsage/MetadataExample.java
> >
> > The right mappings are easily found in the XMP specification.
> >
> > Jeremias Maerki
> >
> >
> >
> > On 19.11.2007 11:43:48 Antoine Moreau de Bellaing wrote:
> >> Hello.
> >> I'm looking for a way to connect the  Adobe XMP Toolkit to Sanselan.
> >> Especially with JPEG.
> >>
> >> I'm a real newbie, so I apologize if my response doesn't make sense to
> >> you all...
> >>
> >> Here's an output of Sanselan
> >> TiffImageMetadata.toString()
> >>              Root:
> >>                      Make: 'Canon'
> >>                      Model: 'Canon EOS 350D DIGITAL'
> >>                      Orientation: 1
> >>                      XResolution: 72
> >>                      YResolution: 72
> >>                      ResolutionUnit: 2
> >>                      DateTime: 2007-10-06T16:47:56.000+0200
> >>                      WhitePoint: 313/1000, 329/1000
> >>                      PrimaryChromaticities: 64/100, 33/100, 21/100, 71/100, 15/100, 6/100
> >>                      YCbCrCoefficients: 299/1000, 587/1000, 114/1000
> >>                      YCbCrPositioning: 2
> >>                      Exif_IFD_Pointer: 320
> >>
> >>              Exif:
> >>                      ExposureTime: 1/60
> >>                      FNumber: 5
> >>                      ExposureProgram: 0
> >>                      ISOSpeedRatings: 400
> >>                      ExifVersion: 48, 50, 50, 49
> >>                      DateTimeOriginal: 2007-10-06T16:47:56.000+0200
> >>                      DateTimeDigitized: 2007-10-06T16:47:56.000+0200
> >>                      ComponentsConfiguration: 1, 2, 3, 0
> >>                      ShutterSpeedValue: 387114/65536
> >>                      ApertureValue: 304340/65536
> >>                      ExposureBiasValue: 0
> >>                      MeteringMode: 1
> >>                      Flash: 16
> >>                      FocalLength: 41
> >>                      MakerNote: 24, 0, 1, 0, 3, 0, 46, 0, 0, 0, 34, 4, 0, 0, 2, 0, 3, 0,
> >> 4, 0, 0, 0, 126, 4, 0, 0, 3, 0, 3, 0, 4, 0, 0, 0, -122, 4, 0, 0, 4, 0,
> >> 3, 0, 34, 0, 0, 0, -114, 4, 0, 0, 6... (8340)
> >>                      UserComment: 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> >> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> >> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0... (264)
> >>                      FlashpixVersion: 48, 49, 48, 48
> >>                      ColorSpace: 65535
> >>                      PixelXDimension: 3456
> >>                      PixelYDimension: 2304
> >>                      Interoperability_IFD_Pointer: 9366
> >>                      FocalPlaneXResolution: 3456000/874
> >>                      FocalPlaneYResolution: 2304000/582
> >>                      FocalPlaneResolutionUnit: 2
> >>                      CustomRendered: 0
> >>                      ExposureMode: 0
> >>                      WhiteBalance: 0
> >>                      SceneCaptureType: 0
> >>                      Unknown: 22/10
> >>
> >>              Interoperability:
> >>                      GPSLatitudeRef: 'R03'
> >>                      GPSLatitude: 48, 49, 48, 48
> >>
> >>              Sub:
> >>                      Compression: 6
> >>                      XResolution: 72
> >>                      YResolution: 72
> >>                      ResolutionUnit: 2
> >>                      JPEGInterchangeFormat: 9716
> >>                      JPEGInterchangeFormatLength: 9176
> >>
> >> The same file parsed with Adobe Bridge produces this XMP file :
> >>
> >> <?xpacket begin="Ôªø" id="W5M0MpCehiHzreSzNTczkc9d"?>
> >> <x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="Adobe XMP Core 4.1-c037 46.282696, Mon Apr 02 2007 18:36:56        ">
> >>    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
> >>       <rdf:Description rdf:about=""
> >>             xmlns:tiff="http://ns.adobe.com/tiff/1.0/">
> >>          <tiff:Make>Canon</tiff:Make>
> >>          <tiff:Model>Canon EOS 350D DIGITAL</tiff:Model>
> >>          <tiff:Orientation>1</tiff:Orientation>
> >>          <tiff:ImageWidth>3456</tiff:ImageWidth>
> >>          <tiff:ImageLength>2304</tiff:ImageLength>
> >>          <tiff:PhotometricInterpretation>2</tiff:PhotometricInterpretation>
> >>          <tiff:SamplesPerPixel>3</tiff:SamplesPerPixel>
> >>          <tiff:BitsPerSample>
> >>             <rdf:Seq>
> >>                <rdf:li>8</rdf:li>
> >>                <rdf:li>8</rdf:li>
> >>                <rdf:li>8</rdf:li>
> >>             </rdf:Seq>
> >>          </tiff:BitsPerSample>
> >>          <tiff:XResolution>72/1</tiff:XResolution>
> >>          <tiff:YResolution>72/1</tiff:YResolution>
> >>          <tiff:ResolutionUnit>2</tiff:ResolutionUnit>
> >>       </rdf:Description>
> >>       <rdf:Description rdf:about=""
> >>             xmlns:exif="http://ns.adobe.com/exif/1.0/">
> >>          <exif:ExifVersion>0221</exif:ExifVersion>
> >>          <exif:ExposureTime>1/60</exif:ExposureTime>
> >>          <exif:ShutterSpeedValue>5906891/1000000</exif:ShutterSpeedValue>
> >>          <exif:FNumber>5/1</exif:FNumber>
> >>          <exif:ApertureValue>4643856/1000000</exif:ApertureValue>
> >>          <exif:ExposureProgram>0</exif:ExposureProgram>
> >>          <exif:ISOSpeedRatings>
> >>             <rdf:Seq>
> >>                <rdf:li>400</rdf:li>
> >>             </rdf:Seq>
> >>          </exif:ISOSpeedRatings>
> >>          <exif:DateTimeOriginal>2007-10-06T16:47:56+02:00</exif:DateTimeOriginal>
> >>          <exif:DateTimeDigitized>2007-10-06T16:47:56+02:00</exif:DateTimeDigitized>
> >>          <exif:ExposureBiasValue>0/2</exif:ExposureBiasValue>
> >>          <exif:MeteringMode>1</exif:MeteringMode>
> >>          <exif:Flash rdf:parseType="Resource">
> >>             <exif:Fired>False</exif:Fired>
> >>             <exif:Return>0</exif:Return>
> >>             <exif:Mode>2</exif:Mode>
> >>             <exif:Function>False</exif:Function>
> >>             <exif:RedEyeMode>False</exif:RedEyeMode>
> >>          </exif:Flash>
> >>          <exif:FocalLength>41/1</exif:FocalLength>
> >>          <exif:CustomRendered>0</exif:CustomRendered>
> >>          <exif:ExposureMode>0</exif:ExposureMode>
> >>          <exif:WhiteBalance>0</exif:WhiteBalance>
> >>          <exif:SceneCaptureType>0</exif:SceneCaptureType>
> >>          <exif:FocalPlaneXResolution>3456000/874</exif:FocalPlaneXResolution>
> >>          <exif:FocalPlaneYResolution>2304000/582</exif:FocalPlaneYResolution>
> >>          <exif:FocalPlaneResolutionUnit>2</exif:FocalPlaneResolutionUnit>
> >>       </rdf:Description>
> >>       <rdf:Description rdf:about=""
> >>             xmlns:xap="http://ns.adobe.com/xap/1.0/">
> >>          <xap:ModifyDate>2007-10-06T16:47:56+02:00</xap:ModifyDate>
> >>       </rdf:Description>
> >>       <rdf:Description rdf:about=""
> >>             xmlns:dc="http://purl.org/dc/elements/1.1/">
> >>          <dc:creator>
> >>             <rdf:Seq>
> >>                <rdf:li>antoine</rdf:li>
> >>             </rdf:Seq>
> >>          </dc:creator>
> >>       </rdf:Description>
> >>       <rdf:Description rdf:about=""
> >>             xmlns:aux="http://ns.adobe.com/exif/1.0/aux/">
> >>          <aux:SerialNumber>1330734959</aux:SerialNumber>
> >>          <aux:LensInfo>18/1 55/1 0/0 0/0</aux:LensInfo>
> >>          <aux:Lens>18.0-55.0 mm</aux:Lens>
> >>          <aux:ImageNumber>160</aux:ImageNumber>
> >>          <aux:FlashCompensation>0/1</aux:FlashCompensation>
> >>          <aux:OwnerName>antoine</aux:OwnerName>
> >>          <aux:Firmware>1.0.3</aux:Firmware>
> >>       </rdf:Description>
> >>       <rdf:Description rdf:about=""
> >>             xmlns:crs="http://ns.adobe.com/camera-raw-settings/1.0/">
> >>          <crs:AlreadyApplied>True</crs:AlreadyApplied>
> >>       </rdf:Description>
> >>       <rdf:Description rdf:about=""
> >>             xmlns:photoshop="http://ns.adobe.com/photoshop/1.0/">
> >>          <photoshop:ColorMode>3</photoshop:ColorMode>
> >>          <photoshop:ICCProfile>Canon EOS 350D DIGITAL</photoshop:ICCProfile>
> >>       </rdf:Description>
> >>    </rdf:RDF>
> >> </x:xmpmeta>
> >> <?xpacket end="w"?>
> >>
> >>
> >> Root corresponds to the tiff namespace
> >> Exif corresponds to the exif namespace
> >>
> >> In Sanselan those variables are private :
> >> TiffImageMetadata.directory
> >>
> >> Would an accessor to directory be useful to parse XMP with Adobe's
> >> Toolkit?
> >>
> >>
> >> Regards,
> >> Antoine Moreau de Bellaing
> >>
> >>
> >> On 19 Nov. 07 at 10:26, Jeremias Maerki wrote:
> >>
> >>> (I realize this is heavy cross-posting but it's probably the best way
> >>> to reach all the players I want to address.)
> >>>
> >>> As you may know, I've started developing an XMP metadata package inside
> >>> XML Graphics Commons in order to support XMP metadata (and ultimately
> >>> PDF/A) in Apache FOP. Therefore, I have quite an interest in metadata.
> >>>
> >>> What is XMP? XMP, for those who don't know about it, is based on a
> >>> subset of RDF to provide a flexible and extensible way of
> >>> storing/representing document metadata.
> >>>
> >>> Yesterday, I was surprised to discover that Adobe has published an XMP
> >>> Toolkit with Java support under the BSD license. In contrast to my
> >>> effort, Adobe's toolkit is quite complete if maybe a bit more
> >>> complicated to use. That got me thinking:
> >>>
> >>> Every project I'm sending this message to is using document metadata in
> >>> some form:
> >>> - Apache XML Graphics: embeds document metadata in the generated files
> >>> (just FOP at the moment, but Batik is a similar candidate)
> >>> - Tika (in incubation): has as one of its main purposes the extraction
> >>> of metadata
> >>> - Sanselan (in incubation): extracts and embeds metadata from/in bitmap
> >>> images
> >>> - PDFBox (incubation in discussion): extracts and embeds XMP metadata
> >>> from/in PDF files (see also JempBox)
> >>>
> >>> Every one of these projects has its own means to represent metadata in
> >>> memory. Wouldn't it make sense to have a common approach? I've worked
> >>> with XMP for some time now and I can say it's ideal to work with. It
> >>> also defines guidelines to embed XMP metadata in various file formats.
> >>> It's also relatively easy to map metadata between different file
> >>> formats (Dublin Core, EXIF, PDF Info etc.).
> >>>
> >>> Sanselan and Tika have both chosen a very simple approach but is it
> >>> versatile enough for the future? While the simple
> >>> Map<String, String[]> in Tika allows for multiple authors, for
> >>> example, it doesn't support language alternatives for things such as
> >>> dc:title or dc:description.
> >>>
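
The limitation described here can be sketched in a few lines. This is an
illustration only: the class name, the nested-map model and the sample
values are all invented, and it is not how Tika or Adobe's XMP Toolkit
store anything. The only XMP convention assumed is the "x-default"
language tag for the default alternative.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; sample values are invented.
class LangAltSketch {

    /** Simple model: each property name maps to a list of plain strings. */
    static Map<String, String[]> simpleModel() {
        Map<String, String[]> model = new LinkedHashMap<>();
        model.put("dc:creator", new String[] {"First Author", "Second Author"}); // fine
        // Two titles fit, but nothing records which language each one is in.
        model.put("dc:title", new String[] {"A title", "Un titre"});
        return model;
    }

    /** Richer model: each property maps language tag -> text. */
    static Map<String, Map<String, String>> langAltModel() {
        Map<String, String> title = new LinkedHashMap<>();
        title.put("x-default", "A title");
        title.put("en", "A title");
        title.put("fr", "Un titre");
        Map<String, Map<String, String>> model = new LinkedHashMap<>();
        model.put("dc:title", title);
        return model;
    }

    /** Looks up a property in the requested language, falling back to x-default. */
    static String localized(Map<String, Map<String, String>> model, String name, String lang) {
        Map<String, String> alternatives = model.get(name);
        if (alternatives == null) {
            return null;
        }
        String text = alternatives.get(lang);
        return text != null ? text : alternatives.get("x-default");
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> model = langAltModel();
        System.out.println(localized(model, "dc:title", "fr")); // Un titre
        System.out.println(localized(model, "dc:title", "de")); // A title (fallback)
    }
}
```

A shared model would need something at least this expressive for
properties like dc:title and dc:description, on top of plain value lists
for properties like dc:creator.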
> >>> I'm seriously thinking about abandoning most of my XMP package work in
> >>> XML Graphics Commons in favor of Adobe's XMP Toolkit. What it doesn't
> >>> support, though:
> >>> - Metadata merging functionality (which I need for synchronizing the
> >>> PDF Info object and the XMP packet for PDF/A)
> >>> - Schema-specific adapters (for Dublin Core and many other XMP
> >>> Schemas) for easier programming (which both Ben and I have written
> >>> for JempBox and XML Graphics Commons). Adobe's toolkit only allows
> >>> generic access.
> >>>
> >>> Some links:
> >>> Adobe XMP website: http://www.adobe.com/products/xmp/
> >>> Adobe XMP Toolkit: http://www.adobe.com/devnet/xmp/
> >>> JempBox: http://sourceforge.net/projects/jempbox
> >>> Apache XML Graphics Commons:
> >>> http://svn.apache.org/viewvc/xmlgraphics/commons/trunk/src/java/org/apache/xmlgraphics/xmp/
> >>>
> >>> My questions:
> >>> - Any interest in converging on a unified model/approach?
> >>> - If yes, where shall we develop this? As part of Tika (although it's
> >>> still in incubation)? As a separate project (maybe as an Apache Commons
> >>> subproject)? If more than XML Graphics uses this, XML Graphics is
> >>> probably not the right home.
> >>> - Is Adobe's XMP toolkit interesting for adoption (!=incubation)? Is
> >>> the JempBox or XML Graphics Commons approach more interesting?
> >>> - Where's the best place to discuss this? We can't keep posting to
> >>> several mailing lists.
> >>>
> >>> At any rate, I would volunteer to spearhead this effort, especially
> >>> since I have an immediate need for complete XMP functionality. I've
> >>> almost finished mapping all XMP structures in XG Commons but I haven't
> >>> committed my latest changes (for structured properties) and I may still
> >>> not cover all details of XMP.
> >>>
> >>> Thanks for reading this far,
> >>> Jeremias Maerki
> >>>
> >>>
> >>
> >
> >
>
>
