harmony-commits mailing list archives

From "Li Jing Qin (JIRA)" <j...@apache.org>
Subject [jira] Created: (HARMONY-6604) [classlib][archive] Deflater behavior different between ri and harmony
Date Tue, 27 Jul 2010 06:59:16 GMT
[classlib][archive] Deflater behavior different between ri and harmony

                 Key: HARMONY-6604
                 URL: https://issues.apache.org/jira/browse/HARMONY-6604
             Project: Harmony
          Issue Type: Bug
            Reporter: Li Jing Qin
            Priority: Minor
         Attachments: arrayfile

Given the test case below:

import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.util.zip.Deflater;

		// Read the space-separated byte values from the attached file (see [0]).
		File arrayfile = new File("arrayfile");
		BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream(arrayfile)));
		String input = null;
		byte[] holder = new byte[262144];
		int cursor = 0;
		byte[] output = new byte[262144];
		while ((input = reader.readLine()) != null) {
			String[] inputs = input.split(" ");
			for (String in : inputs) {
				String code = in;
				holder[cursor++] = Byte.decode(code).byteValue();
			}
		}
		reader.close();

		// Feed the input in chunks of at most 4092 bytes and print every non-empty deflate result.
		Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION);
		int start = 0;
		int end = cursor;
		int offset = 0;
		while (start < end) {
			if (deflater.needsInput()) {
				int size = 4092;
				if (start + size >= end) {
					size = end - start;
				}
				deflater.setInput(holder, start, size);
				start += size;
			} else {
				int def = deflater.deflate(output, offset, output.length - offset);
				if (def != 0) {
					System.out.println("def: " + def);
					offset += def;
				}
			}
		}
		// One more deflate call after all input has been supplied (finish() is not called).
		int def = deflater.deflate(output, offset, output.length - offset);
		System.out.println("end deflate: " + def);

The RI output is:
def: 2
def: 20670
end deflate: 93

The Harmony output is:
def: 2
end deflate: 20757

From the output, it seems that the RI decides it has enough input to emit compressed data earlier than Harmony does.
I guess the RI may use a different flush policy when many input bytes are held in
the deflater, but I have not been able to determine which policy the RI uses.

The difference does not seem large. We use Z_NO_FLUSH as the flush parameter, *which allows
deflate to decide how much data to accumulate before producing output, in order to maximize
the compression*. The result shows that Harmony produces fewer bytes than the RI (20759 versus
20765 in total). But this difference causes 6 JUnit test cases in Hadoop to fail.
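
With Z_NO_FLUSH, how many bytes each individual deflate call returns is up to the deflater, so per-call counts like the ones printed above can legitimately differ between implementations. Here is a minimal sketch (not part of the attached test case) that feeds a deflater in fixed-size chunks, records the count returned by every call, and then checks that the complete stream inflates back to the input. The class name NoFlushBuffering, the helper names and the synthetic sample data are assumptions for illustration only; the attached arrayfile is not used.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class NoFlushBuffering {
    // Hypothetical helper: deterministic sample data standing in for the attachment.
    static byte[] sampleInput(int n) {
        byte[] data = new byte[n];
        for (int i = 0; i < n; i++) {
            data[i] = (byte) ((i * 31 + (i >> 3)) % 97);
        }
        return data;
    }

    // Compress input in chunks of the given size (must be > 0) using the default
    // NO_FLUSH behaviour, appending the byte count of every deflate call to perCall.
    static byte[] compress(byte[] input, int chunk, List<Integer> perCall) {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        for (int start = 0; start < input.length; start += chunk) {
            deflater.setInput(input, start, Math.min(chunk, input.length - start));
            // Drain until the deflater has consumed this chunk.
            while (!deflater.needsInput()) {
                int n = deflater.deflate(buf);
                perCall.add(n);
                sink.write(buf, 0, n);
            }
        }
        // Signal end of input and collect everything that was still buffered.
        deflater.finish();
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            perCall.add(n);
            sink.write(buf, 0, n);
        }
        deflater.end();
        return sink.toByteArray();
    }

    // Inflate a complete zlib stream back into the original bytes.
    static byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // truncated or unexpected stream: stop instead of spinning
                }
                sink.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt deflate stream", e);
        } finally {
            inflater.end();
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        byte[] data = sampleInput(100000);
        List<Integer> perCall = new ArrayList<Integer>();
        byte[] compressed = compress(data, 4092, perCall);
        System.out.println("bytes returned per deflate call: " + perCall);
        System.out.println("round trip ok: " + Arrays.equals(data, decompress(compressed)));
    }
}
```

If the failing tests compare per-call counts, comparing the inflated content instead would not depend on when the deflater chooses to emit output. Whether the Hadoop tests do this is an assumption on my part, not something I have verified.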

Moreover, zlib has other flush parameters (Z_PARTIAL_FLUSH, for example); maybe
we need to add these.
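
For reference, from Java 7 onward java.util.zip.Deflater has a four-argument deflate(byte[], int, int, int) whose last argument is one of Deflater.NO_FLUSH, Deflater.SYNC_FLUSH or Deflater.FULL_FLUSH. That overload is newer than this report, and as far as I know there is no public constant for Z_PARTIAL_FLUSH. The sketch below, which assumes that API is available, shows how much a single deflate call returns for the same input under each mode. The class name FlushModeSketch, the helper names and the sample data are hypothetical.

```java
import java.util.zip.Deflater;

class FlushModeSketch {
    // Hypothetical helper: deterministic sample data.
    static byte[] sampleInput(int n) {
        byte[] data = new byte[n];
        for (int i = 0; i < n; i++) {
            data[i] = (byte) ((i * 31 + (i >> 3)) % 97);
        }
        return data;
    }

    // Number of bytes returned by one deflate call right after one setInput call,
    // using the given flush mode (requires the Java 7+ four-argument overload).
    static int firstCallOutput(byte[] input, int flushMode) {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION);
        byte[] out = new byte[input.length + 1024]; // generous room for this small input
        deflater.setInput(input);
        int n = deflater.deflate(out, 0, out.length, flushMode);
        deflater.end();
        return n;
    }

    public static void main(String[] args) {
        byte[] data = sampleInput(4092);
        System.out.println("NO_FLUSH:   " + firstCallOutput(data, Deflater.NO_FLUSH));
        System.out.println("SYNC_FLUSH: " + firstCallOutput(data, Deflater.SYNC_FLUSH));
        System.out.println("FULL_FLUSH: " + firstCallOutput(data, Deflater.FULL_FLUSH));
    }
}
```

With NO_FLUSH the deflater is free to hold the chunk back, while SYNC_FLUSH and FULL_FLUSH force the pending output out at the cost of some compression, so the first call returns more bytes under the flushing modes.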

Does anyone have any idea about this?

[0] arrayfile will be attached.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
