incubator-sanselan-dev mailing list archives

From Jonathan Giles ...@jogiles.co.nz>
Subject Re: Sanselan issue, with patch
Date Fri, 15 May 2009 04:48:00 GMT
Hello,

Sorry for the delayed response - I have been away for some time and 
unable to afford the time to respond.

In addition to the patch proposed below, I have a few more suggestions 
for Sanselan. At present I am running my own patched version of 
Sanselan until (hopefully) these can be integrated. The 
suggestions / patches are:

 * Add equals() / hashCode() methods to IPTCType and IPTCRecord.

 * In JpegImageParser, change getPhotoshopMetadata() so that instead of 
doing the following:
        photoshopApp13Data = data;
in the loop, I suggest doing the following:
        records.addAll(data.getRecords());
        rawBlocks.addAll(data.getRawBlocks());
This suggestion requires a little more rejigging of the code, but 
once done, it has the effect of merging the IPTC segments, which works 
for my use cases. Prior to this change, I was in some circumstances not 
getting the results I expected. Since this change, I have noticed no 
regressions in my testing.
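
To make the two suggestions above concrete, here is a minimal sketch. It 
uses hypothetical stand-in classes (SketchRecord, SketchApp13Data, 
IptcMergeSketch), not Sanselan's real IPTCRecord or PhotoshopApp13Data, 
and it assumes that the assignment in the loop keeps only the most 
recently seen segment. The rawBlocks list is left out for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for an IPTC record; NOT Sanselan's actual class.
final class SketchRecord {
    final int type;      // e.g. 25 for a keyword record
    final String value;

    SketchRecord(int type, String value) {
        this.type = type;
        this.value = value;
    }

    // Value-based equality, so two records with the same content compare equal.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof SketchRecord)) return false;
        SketchRecord other = (SketchRecord) o;
        return type == other.type && Objects.equals(value, other.value);
    }

    // Must agree with equals(): equal records produce equal hash codes.
    @Override
    public int hashCode() {
        return Objects.hash(type, value);
    }
}

// Hypothetical stand-in for the data parsed from one App13 segment.
final class SketchApp13Data {
    private final List<SketchRecord> records;

    SketchApp13Data(List<SketchRecord> records) {
        this.records = records;
    }

    List<SketchRecord> getRecords() {
        return records;
    }
}

final class IptcMergeSketch {
    // Assumed current behaviour: each segment overwrites the previous one.
    static List<SketchRecord> lastSegmentOnly(List<SketchApp13Data> segments) {
        SketchApp13Data last = null;
        for (SketchApp13Data data : segments) {
            last = data;
        }
        return last == null ? new ArrayList<SketchRecord>() : last.getRecords();
    }

    // Suggested behaviour: accumulate the records of every segment.
    static List<SketchRecord> merged(List<SketchApp13Data> segments) {
        List<SketchRecord> records = new ArrayList<SketchRecord>();
        for (SketchApp13Data data : segments) {
            records.addAll(data.getRecords());
        }
        return records;
    }

    public static void main(String[] args) {
        List<SketchApp13Data> segments = Arrays.asList(
            new SketchApp13Data(Arrays.asList(new SketchRecord(5, "Title"))),
            new SketchApp13Data(Arrays.asList(new SketchRecord(25, "kw1"))));
        System.out.println("last only: " + lastSegmentOnly(segments).size());
        System.out.println("merged:    " + merged(segments).size());
    }
}
```

With value-based equals() / hashCode() in place, merged records can also 
be compared or looked up with contains(), which is one reason the two 
suggestions fit together.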

Cheers,
Jonathan

Charles Matthew Chen wrote:
> Hi Jonathan,
>
>    Sorry for the delayed reply.  Thanks for looking into this issue and
> contributing a patch.  This is a known issue - there seems to be a
> large amount of variation in how some IPTC data is encoded.  The
> possible problem with continuing after finding an unknown marker in IPTC
> data is that subsequent data might be invalid.  I'll look into this
> issue more when I get a chance.  In the meantime, can you please open
> an issue in JIRA and contribute (by attaching it to the issue) a
> sample image that demonstrates this issue?
>
> https://issues.apache.org/jira/browse/SANSELAN
>
> Thanks,
>    Matthew
>
>
> On Thu, Apr 30, 2009 at 5:10 PM, Jonathan Giles <jo@jogiles.co.nz> wrote:
>   
>> Hi all,
>>
>> I'm using Sanselan to extract IPTC metadata. Recently I was given an image
>> that was causing Sanselan to return an empty set when retrieving the
>> metadata, despite the fact that there is clearly IPTC metadata in the image.
>>
>> I trawled through the Sanselan code and found the problem: it's in
>> IPTCParser.parseIPTCBlock(...), and is the second 'if' statement within the
>> while loop. In particular, instead of the following:
>> if (tagMarker != IPTC_RECORD_TAG_MARKER) {
>>   if (verbose) {
>>       System.out.println("Unexpected record tag marker in IPTC data.");
>>   }
>>   return elements;
>> }
>>
>> You should do the following:
>> if (tagMarker != IPTC_RECORD_TAG_MARKER) {
>>   if (verbose) {
>>       System.out.println("Unexpected record tag marker in IPTC data.");
>>   }
>>   continue;
>> }
>>
>> In my case, this was tripping up the continued parsing of the image, as it
>> was occurring early in the parsing process. If you're interested, the verbose
>> output of an attempt to parse this file once the little patch above is
>> applied is shown below. In particular, look at the 'Unexpected record tag
>> marker in IPTC data.' midway down. I don't know what these values are, and I
>> am slightly concerned that I may be throwing away good data. Do you have any
>> thoughts?
>>
>> Cheers,
>> Jonathan Giles
>>
>> Output of verbose run after patch was applied:
>> ==============================
>> blockType 1002 (0x3ea)
>> blockSize 6143 (0x17ff)
>> blockType 1005 (0x3ed)
>> blockSize 16 (0x10)
>> blockType 1062 (0x426)
>> blockSize 14 (0xe)
>> blockType 1037 (0x40d)
>> blockSize 4 (0x4)
>> blockType 1049 (0x419)
>> blockSize 4 (0x4)
>> blockType 1011 (0x3f3)
>> blockSize 9 (0x9)
>> blockType 1034 (0x40a)
>> blockSize 1 (0x1)
>> blockType 10000 (0x2710)
>> blockSize 10 (0xa)
>> blockType 1013 (0x3f5)
>> blockSize 72 (0x48)
>> blockType 1016 (0x3f8)
>> blockSize 112 (0x70)
>> blockType 1032 (0x408)
>> blockSize 16 (0x10)
>> blockType 1054 (0x41e)
>> blockSize 4 (0x4)
>> blockType 1050 (0x41a)
>> blockSize 837 (0x345)
>> blockType 1064 (0x428)
>> blockSize 12 (0xc)
>> blockType 1044 (0x414)
>> blockSize 4 (0x4)
>> blockType 1036 (0x40c)
>> blockSize 6151 (0x1807)
>> blockType 1057 (0x421)
>> blockSize 85 (0x55)
>> blockType 1030 (0x406)
>> blockSize 7 (0x7)
>> blockType 1028 (0x404)
>> blockSize 1136 (0x470)
>> tagMarker 28 (0x1c)
>> recordNumber 1 (0x1)
>> tagMarker 0 (0x0)
>> Unexpected record tag marker in IPTC data.
>> tagMarker 0 (0x0)
>> Unexpected record tag marker in IPTC data.
>> tagMarker 2 (0x2)
>> Unexpected record tag marker in IPTC data.
>> tagMarker 0 (0x0)
>> Unexpected record tag marker in IPTC data.
>> tagMarker 4 (0x4)
>> Unexpected record tag marker in IPTC data.
>> tagMarker 28 (0x1c)
>> recordNumber 1 (0x1)
>> tagMarker 90 (0x5a)
>> Unexpected record tag marker in IPTC data.
>> tagMarker 0 (0x0)
>> Unexpected record tag marker in IPTC data.
>> tagMarker 3 (0x3)
>> Unexpected record tag marker in IPTC data.
>> tagMarker 27 (0x1b)
>> Unexpected record tag marker in IPTC data.
>> tagMarker 37 (0x25)
>> Unexpected record tag marker in IPTC data.
>> tagMarker 71 (0x47)
>> Unexpected record tag marker in IPTC data.
>> tagMarker 28 (0x1c)
>> recordNumber 2 (0x2)
>> recordType 0 (0x0)
>> ignore record version record! 0
>> tagMarker 28 (0x1c)
>> recordNumber 2 (0x2)
>> recordType 5 (0x5)
>> tagMarker 28 (0x1c)
>> recordNumber 2 (0x2)
>> recordType 25 (0x19)
>> tagMarker 28 (0x1c)
>> recordNumber 2 (0x2)
>> recordType 25 (0x19)
>> tagMarker 28 (0x1c)
>> recordNumber 2 (0x2)
>> recordType 25 (0x19)
>> tagMarker 28 (0x1c)
>> recordNumber 2 (0x2)
>> recordType 25 (0x19)
>> tagMarker 28 (0x1c)
>> recordNumber 2 (0x2)
>> recordType 25 (0x19)
>> tagMarker 28 (0x1c)
>> recordNumber 2 (0x2)
>> recordType 25 (0x19)
>> tagMarker 28 (0x1c)
>> recordNumber 2 (0x2)
>> recordType 25 (0x19)
>> tagMarker 28 (0x1c)
>> recordNumber 2 (0x2)
>> recordType 25 (0x19)
>> tagMarker 28 (0x1c)
>> recordNumber 2 (0x2)
>> recordType 25 (0x19)
>> tagMarker 28 (0x1c)
>> recordNumber 2 (0x2)
>> recordType 25 (0x19)
>> tagMarker 28 (0x1c)
>> recordNumber 2 (0x2)
>> recordType 25 (0x19)
>> tagMarker 28 (0x1c)
>> recordNumber 2 (0x2)
>> recordType 25 (0x19)
>> tagMarker 28 (0x1c)
>> recordNumber 2 (0x2)
>> recordType 25 (0x19)
>> tagMarker 28 (0x1c)
>> recordNumber 2 (0x2)
>> recordType 25 (0x19)
>> tagMarker 28 (0x1c)
>> recordNumber 2 (0x2)
>> recordType 25 (0x19)
>> tagMarker 28 (0x1c)
>> recordNumber 2 (0x2)
>> recordType 25 (0x19)
>> tagMarker 28 (0x1c)
>> recordNumber 2 (0x2)
>> recordType 25 (0x19)
>> tagMarker 28 (0x1c)
>> recordNumber 2 (0x2)
>> recordType 40 (0x28)
>> tagMarker 28 (0x1c)
>> recordNumber 2 (0x2)
>> recordType 55 (0x37)
>> tagMarker 28 (0x1c)
>> recordNumber 2 (0x2)
>> recordType 80 (0x50)
>> tagMarker 28 (0x1c)
>> recordNumber 2 (0x2)
>> recordType 85 (0x55)
>> tagMarker 28 (0x1c)
>> recordNumber 2 (0x2)
>> recordType 103 (0x67)
>> tagMarker 28 (0x1c)
>> recordNumber 2 (0x2)
>> recordType 105 (0x69)
>> tagMarker 28 (0x1c)
>> recordNumber 2 (0x2)
>> recordType 110 (0x6e)
>> tagMarker 28 (0x1c)
>> recordNumber 2 (0x2)
>> recordType 120 (0x78)
>> tagMarker 28 (0x1c)
>> recordNumber 2 (0x2)
>> recordType 122 (0x7a)
>> tagMarker 28 (0x1c)
>> recordNumber 2 (0x2)
>> recordType 22 (0x16)
>>
>>     

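The difference between `return elements;` and `continue;` in the quoted 
patch can be sketched with a simplified scanner. This is not the actual 
IPTCParser.parseIPTCBlock(...) code. The class IptcScanSketch, the method 
parseRecordTypes and the skipUnknownMarkers flag are hypothetical, and the 
record layout is an assumption: a 0x1c marker byte, a record number byte, 
a record type byte, a two-byte big-endian size, then the data.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified illustration only; NOT Sanselan's IPTCParser.
final class IptcScanSketch {
    static final int IPTC_RECORD_TAG_MARKER = 0x1c;
    static final int HEADER_LENGTH = 5; // marker + number + type + 2 size bytes

    // Returns the record types found in record number 2.
    // skipUnknownMarkers=false mimics "return elements;" (stop at the first
    // unexpected byte); true mimics "continue;" (move on to the next byte).
    static List<Integer> parseRecordTypes(byte[] block, boolean skipUnknownMarkers) {
        List<Integer> types = new ArrayList<Integer>();
        int pos = 0;
        while (pos + HEADER_LENGTH <= block.length) {
            int tagMarker = block[pos] & 0xff;
            if (tagMarker != IPTC_RECORD_TAG_MARKER) {
                if (!skipUnknownMarkers) {
                    return types; // give up on the rest of the block
                }
                pos++;            // resynchronise one byte further on
                continue;
            }
            int recordNumber = block[pos + 1] & 0xff;
            int recordType = block[pos + 2] & 0xff;
            int size = ((block[pos + 3] & 0xff) << 8) | (block[pos + 4] & 0xff);
            if (pos + HEADER_LENGTH + size > block.length) {
                break;            // declared size runs past the block: stop
            }
            if (recordNumber == 2) {
                types.add(recordType);
            }
            pos += HEADER_LENGTH + size;
        }
        return types;
    }

    public static void main(String[] args) {
        // Two stray bytes, then a type-25 record ("hi") and a type-5 record ("x").
        byte[] block = {
            0x00, 0x04,
            0x1c, 0x02, 0x19, 0x00, 0x02, 'h', 'i',
            0x1c, 0x02, 0x05, 0x00, 0x01, 'x'
        };
        System.out.println("strict:  " + parseRecordTypes(block, false));
        System.out.println("lenient: " + parseRecordTypes(block, true));
    }
}
```

In this sketch the strict mode finds nothing because the block starts with 
stray bytes, while the lenient mode recovers both records. The same sketch 
also illustrates the concern raised in the reply: a stray 0x1c byte inside 
the junk would be treated as the start of a record, so data read after a 
resync may not be valid.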