commons-user mailing list archives

From Benedikt Ritter <brit...@apache.org>
Subject Re: Creating EXIF tags (TiffOutputField) the right way
Date Thu, 02 Jun 2016 20:55:32 GMT
Hello Joakim,

Joakim Knudsen <joakim.grahl@gmail.com> wrote on Wed, 1 June 2016 at 15:10:

> Sure! That would also give even more scrutiny to the code. I'm not 100%
> sure this is totally correct, but I got wonderful help from Phil Harvey
> (ExifTool) to get the charset/encoding correct.
> So I'm pretty confident. How do I contribute?
>

Looking at the Commons Imaging website [1], I realised that we currently do
not have a user guide :o) So the best idea would probably be to add it to
the Sample Usage page [2]. The website is built from source in SVN [3]. You
would have to check that out, modify the documentation and then create an
SVN patch file using

svn diff > mypatch.diff

The mypatch.diff file would then have to be attached to a Jira issue [4]. More
information can be found in [5].



> Btw, you wouldn't happen to know anything about IPTC and XMP, would you? It
> seems the EXIF tags I'm writing (UserComment and ImageDescription) are not
> enough for the comment to appear as a caption in image viewer software
> (like Picasa etc). I was wondering (hoping) Sanselan could write the
> following tags:
>
> IPTC:Caption-Abstract
> and
> XMP:Description
>
>
To be honest, I don't know much about how Sanselan/Imaging works. I have
worked on the code for a while, but I don't use it in my current projects.
So the only thing I can do is look through the code for you and try to
find an answer to your questions :-)

Benedikt

[1] http://commons.apache.org/proper/commons-imaging/index.html
[2] http://commons.apache.org/proper/commons-imaging/sampleusage.html
[3] http://svn.apache.org/repos/asf/commons/proper/imaging/trunk
[4] http://issues.apache.org/jira/browse/IMAGING
[5] http://commons.apache.org/patches.html


>
> Joakim
>
> On 1 June 2016 at 14:55, Benedikt Ritter <britter@apache.org> wrote:
>
> > Hello Joakim,
> >
> > glad you found out what to do. This would make for a good addition to the
> > user guide. Would you like to contribute your findings?
> >
> > Benedikt
> >
> > Joakim Knudsen <joakim.grahl@gmail.com> wrote on Tue, 31 May 2016 at 19:21:
> >
> > > Btw, ENCODING_UTF16 is just a String = "UTF-16LE" (Little Endian)
> > >
> > > On 31 May 2016 at 19:20, Joakim Knudsen <joakim.grahl@gmail.com> wrote:
> > >
> > > > Following a post on the User-Commons-Apache log (from 2012), I ended up
> > > > with the following code which seems to work.
> > > > It writes proper Unicode, which I can read back successfully using
> > > > ExifTool. I also see the comment nicely in Windows Explorer, and under
> > > > File > Properties.
> > > > Note I changed the field type from ASCII to FIELD_TYPE_UNDEFINED,
> > > > otherwise (with ASCII) it did not work. At least Windows couldn't make
> > > > sense of the EXIF data.
> > > >
> > > > // http://osdir.com/ml/user-commons-apache/2012-03/msg00046.html
> > > > byte[] unicodeMarker = new byte[]{ 0x55, 0x4E, 0x49, 0x43, 0x4F, 0x44,
> > > >         0x45, 0x00 };
> > > > // OR UTF-16BE if the file is big-endian!
> > > > byte[] comment = textToSet.getBytes(ENCODING_UTF16);
> > > > byte[] bytesComment = new byte[unicodeMarker.length + comment.length];
> > > > System.arraycopy(unicodeMarker, 0, bytesComment, 0,
> > > >         unicodeMarker.length);
> > > > System.arraycopy(comment, 0, bytesComment, unicodeMarker.length,
> > > >         comment.length);
> > > >
> > > > TiffOutputField exif_comment = new
> > > >         TiffOutputField(TiffConstants.EXIF_TAG_USER_COMMENT,
> > > >         TiffFieldTypeConstants.FIELD_TYPE_UNDEFINED,
> > > >         bytesComment.length, bytesComment);
> > > >
> > > >
> > > > I can now write UserComment: "æøå" without problems :)
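
For reference, here is a minimal, self-contained sketch of the byte layout built in the snippet above: the 8-byte "UNICODE" marker followed by the UTF-16 text. It uses only the JDK, so it can be run without Sanselan. The class and method names (UserCommentPayload, build, decode) are made up for illustration and are not part of Sanselan or Commons Imaging, and the littleEndian flag is assumed to match the byte order of the file being written.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class UserCommentPayload {
    // Character-code header: the ASCII letters "UNICODE" plus one NUL byte.
    private static final byte[] UNICODE_MARKER =
            { 0x55, 0x4E, 0x49, 0x43, 0x4F, 0x44, 0x45, 0x00 };

    private static Charset charsetFor(boolean littleEndian) {
        // Neither of these charsets writes a byte-order mark when encoding.
        return littleEndian ? StandardCharsets.UTF_16LE : StandardCharsets.UTF_16BE;
    }

    // Returns marker + encoded text, ready to be used as the field's raw bytes.
    public static byte[] build(String text, boolean littleEndian) {
        byte[] comment = text.getBytes(charsetFor(littleEndian));
        byte[] payload = new byte[UNICODE_MARKER.length + comment.length];
        System.arraycopy(UNICODE_MARKER, 0, payload, 0, UNICODE_MARKER.length);
        System.arraycopy(comment, 0, payload, UNICODE_MARKER.length, comment.length);
        return payload;
    }

    // Reverses build(): skips the 8-byte marker and decodes the rest.
    public static String decode(byte[] payload, boolean littleEndian) {
        int offset = UNICODE_MARKER.length;
        return new String(payload, offset, payload.length - offset,
                charsetFor(littleEndian));
    }

    public static void main(String[] args) {
        // The three letters from the thread, written as escapes so the
        // source file encoding does not matter.
        String text = "\u00E6\u00F8\u00E5";
        byte[] payload = build(text, true);
        System.out.println("payload length: " + payload.length);
        System.out.println("round trip ok: " + text.equals(decode(payload, true)));
    }
}
```

The resulting array is what would be passed as the last argument of the TiffOutputField constructor shown above, with its length as the count.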
> > > >
> > > >
> > > >
> > > > - Joakim
> > > >
> > > >
> > > > On 31 May 2016 at 17:39, Benedikt Ritter <britter@apache.org> wrote:
> > > >
> > > >> Hello Joakim,
> > > >>
> > > >> Joakim Knudsen <joakim.grahl@gmail.com> wrote on Sat, 28 May 2016 at 21:10:
> > > >>
> > > >> > Hi Benedikt, and thanks for replying!
> > > >> >
> > > >> > So, if FieldType is unused, maybe the alternative, simpler constructor
> > > >> > is more appropriate/correct to use?
> > > >> >
> > > >> > // try using the approach given in the example (modified from the GPS tag):
> > > >> > TiffOutputField exif_comment = TiffOutputField.create(
> > > >> >         TiffConstants.EXIF_TAG_USER_COMMENT,
> > > >> >         outputSet.byteOrder, textToSet);
> > > >> >
> > > >> > However, now Sanselan throws an ImageWriteException:
> > > >> > org.apache.sanselan.ImageWriteException: Tag has unexpected data type.
> > > >> >
> > > >> > So are you 100% sure field type should not be set (to ASCII)?
> > > >> >
> > > >>
> > > >> No, I'm just saying that it uses a hard coded encoding anyway :-)
> > > >>
> > > >>
> > > >> >
> > > >> > Next, you're saying the string to set (textToSet) is converted internally
> > > >> > to a byte array, using US-ASCII encoding.
> > > >> > If I try writing "æøåæøå" to a file, I get "쎦쎸쎥쎦쎸쎥" when I copy the JPEG
> > > >> > out and check Properties in Windows Explorer.
> > > >> > If I write only ASCII characters, e.g. "Test", then that comes through just
> > > >> > fine.
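
One possible explanation for the garbled output reported above, which the thread itself does not confirm: the text may have been stored as UTF-8 bytes and then read back as big-endian UTF-16. The JDK-only sketch below reproduces that kind of mix-up; the class and method names are hypothetical, and it makes no claim about what Sanselan or Windows Explorer actually do internally.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Encodes the text as UTF-8, then decodes those same bytes as UTF-16BE.
    public static String utf8ReadAsUtf16Be(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        // The three letters from the thread, written as escapes so the
        // source file encoding does not matter.
        String garbled = utf8ReadAsUtf16Be("\u00E6\u00F8\u00E5");
        // Print the code points so the result is visible on any console.
        for (int i = 0; i < garbled.length(); i++) {
            System.out.printf("U+%04X%n", (int) garbled.charAt(i));
        }
    }
}
```

Each two-byte UTF-8 sequence (C3 A6, C3 B8, C3 A5) collapses into a single 16-bit unit in the Hangul range, which matches the pattern of three characters per "æøå" seen in Windows Explorer. Pure ASCII text would not show this particular symptom when written with a matching encoding, so the observation that "Test" comes through fine does not rule the hypothesis in or out.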
> > > >> >
> > > >> > In summary, here is the code that works for me (except non-ASCII
> > > >> > characters):
> > > >> >
> > > >> >
> > > >> > // http://mail-archives.apache.org/mod_mbox/commons-user/201203.mbox/%3CCAJm2B-mYCXYKuyu=Hs8UAZCpw-B=KWuZ4gsZFUOBvZWUN0LUfA@mail.gmail.com%3E
> > > >> > byte b[] = ExifTagConstants.EXIF_TAG_USER_COMMENT.encodeValue(
> > > >> >         TiffFieldTypeConstants.FIELD_TYPE_ASCII,
> > > >> >         textToSet, outputSet.byteOrder);
> > > >> >
> > > >> > // constructor arguments: taginfo tag fieldtype count bytes
> > > >> > TiffOutputField exif_comment = new
> > > >> >         TiffOutputField(TiffConstants.EXIF_TAG_USER_COMMENT.tag,
> > > >> >         TiffConstants.EXIF_TAG_USER_COMMENT,
> > > >> >         TiffFieldTypeConstants.FIELD_TYPE_UNDEFINED,
> > > >> >         b.length, b);
> > > >> >
> > > >>
> > > >> The provided links indicate to me that it is possible to write non-ASCII
> > > >> characters. Are you sure your code looks like what Damjan suggested?
> > > >>
> > > >> Benedikt
> > > >>
> > > >>
> > > >> >
> > > >> >
> > > >> >
> > > >> > Joakim
> > > >> >
> > > >> >
> > > >> >
> > > >> > On 22 May 2016 at 15:29, Benedikt Ritter <britter@apache.org> wrote:
> > > >> >
> > > >> > > Hello Joakim
> > > >> > >
> > > >> > > Joakim Knudsen <joakim.grahl@gmail.com> wrote on Sat, 21 May 2016 at 19:29:
> > > >> > >
> > > >> > > > Hi List!
> > > >> > > >
> > > >> > > > I'm working on an Android app, where I want to read and write "EXIF tags"
> > > >> > > > to JPEG files on the device. Sanselan 0.97 seems to work perfectly,
> > > >> > > > although it's a bit complicated to work with EXIF tags/directories.
> > > >> > > >
> > > >> > > > The specific tags I'm interested in are EXIF_TAG_USER_COMMENT and
> > > >> > > > EXIF_TAG_IMAGE_DESCRIPTION.
> > > >> > > > According to the documentation I could find, UserComment is of field type
> > > >> > > > "undefined", whereas ImageDescription is of field type ASCII.
> > > >> > > >
> > > >> > > >
> > > >> > > > http://www.awaresystems.be/imaging/tiff/tifftags/privateifd/exif/usercomment.html
> > > >> > > > http://www.awaresystems.be/imaging/tiff/tifftags/imagedescription.html
> > > >> > > >
> > > >> > > > What's the proper way of creating those tags, wrt. charset etc.? I want
> > > >> > > > as wide as possible character support (æøå etc).
> > > >> > > >
> > > >> > > > I find different discussions online, with different advice. It seems two
> > > >> > > > constructors are going around, where the simpler one does not deal with
> > > >> > > > charset/encoding at all. This one uses the .create method:
> > > >> > > >
> > > >> > > > String textToSet = "Some Text æøå";
> > > >> > > >
> > > >> > > > TiffOutputField exif_comment = TiffOutputField.create(
> > > >> > > >                 TiffConstants.EXIF_TAG_USER_COMMENT,
> > > >> > > >                 outputSet.byteOrder, textToSet);
> > > >> > > >
> > > >> > > >
> > > >> > > > while this one uses the standard constructor:
> > > >> > > >
> > > >> > > > byte b[] = ExifTagConstants.EXIF_TAG_USER_COMMENT.encodeValue(
> > > >> > > >         TiffFieldTypeConstants.FIELD_TYPE_ASCII,
> > > >> > > >         textToSet, outputSet.byteOrder
> > > >> > > > );
> > > >> > > >
> > > >> > > > // constructor arguments: taginfo tag fieldtype count bytes
> > > >> > > > TiffOutputField exif_comment2 = new
> > > >> > > >         TiffOutputField(TiffConstants.EXIF_TAG_USER_COMMENT.tag,
> > > >> > > >         TiffConstants.EXIF_TAG_USER_COMMENT,
> > > >> > > >         TiffFieldTypeConstants.FIELD_TYPE_UNDEFINED,
> > > >> > > >         b.length, b);
> > > >> > > >
> > > >> > > > In this last one, the string to set has been converted to a byte array
> > > >> > > > first. But can/should I set the encoding anywhere?
> > > >> > > >
> > > >> > > > Is the field type even ASCII? This information seems to indicate it's
> > > >> > > > not ASCII...
> > > >> > > >
> > > >> > > >
> > > >> > > > http://www.awaresystems.be/imaging/tiff/tifftags/privateifd/exif/usercomment.html
> > > >> > > >
> > > >> > > >
> > > >> > > > Need some help here, as you can see, to get this right. The second
> > > >> > > > approach above does seem to work in my app, but I'd like to be sure
> > > >> > > > I'm not somehow messing up the JPEGs on the device.
> > > >> > > >
> > > >> > >
> > > >> > > I've looked at the code of
> > > >> > > org.apache.commons.imaging.formats.tiff.taginfos.TagInfoGpsText
> > > >> > > (ExifTagConstants.EXIF_TAG_USER_COMMENT is an instance of TagInfoGpsText).
> > > >> > > Here are my observations:
> > > >> > >
> > > >> > > - The FieldType parameter, which you have set to
> > > >> > > TiffFieldTypeConstants.FIELD_TYPE_ASCII, is never used in the implementation
> > > >> > > of encodeValue(FieldType, Object, ByteOrder)
> > > >> > > - When converting the input String to a byte array, String.getBytes(String
> > > >> > > charsetName) is used
> > > >> > > - For charsetName, "US-ASCII" is always used (it cannot be configured by
> > > >> > > the user)
> > > >> > >
> > > >> > > So my guess is that the code will not handle characters not in the
> > > >> > > US-ASCII charset correctly.
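
To illustrate the guess above, here is a small JDK-only sketch of what encoding a string as US-ASCII does to characters outside that charset. It does not call Commons Imaging at all; the class and method names are hypothetical, and it only shows the behaviour of String.getBytes with the US-ASCII charset, which is what the observations above say encodeValue relies on.

```java
import java.nio.charset.StandardCharsets;

public class AsciiLossDemo {
    // Encodes with US-ASCII; unmappable characters are replaced, not rejected.
    public static byte[] toAscii(String text) {
        return text.getBytes(StandardCharsets.US_ASCII);
    }

    // Checks up front whether a string would survive US-ASCII encoding intact.
    public static boolean isPureAscii(String text) {
        return StandardCharsets.US_ASCII.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        // The three letters from the thread, written as escapes so the
        // source file encoding does not matter.
        String text = "\u00E6\u00F8\u00E5";
        for (byte b : toAscii(text)) {
            // Each unmappable character becomes the replacement byte 0x3F ('?').
            System.out.printf("0x%02X%n", b);
        }
        System.out.println("pure ASCII: " + isPureAscii(text));
    }
}
```

A check like isPureAscii could be used to decide whether a comment can go through the ASCII path safely or needs a Unicode-capable encoding instead.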
> > > >> > >
> > > >> > > Benedikt
> > > >> > >
> > > >> > >
> > > >> > > >
> > > >> > > >
> > > >> > > >
> > > >> > > > Joakim
> > > >> > > >
> > > >> > >
> > > >> >
> > > >>
> > > >
> > > >
> > >
> >
>
