Hi Alexey!
I had updated the JIRA with new patch [1], please take a look.
There seem to be no way to fix the callers of shiftLeftOneBit, so I
leaved fastpath for zeroes intact.
Thanks,
Aleksey.
[1] https://issues.apache.org/jira/browse/HARMONY5825
On Thu, May 22, 2008 at 7:02 PM, Alexei Fedotov
<alexei.fedotov@gmail.com> wrote:
> Aleksey,
> Thank you for thorough reply. I like and support your suggestion about
> commenting the changes. Let me address the most mysterious thing I
> still don't understand: adding a fast path for zeroes in the shift
> function.
>
> If this optimization helps increasing performance, some code sends
> zeroes to the shift function more frequently than other numbers. Do
> you know where your shift is called with zeroes as a typical argument?
> What if we place the check in the specific caller instead of the
> shift?
>
> Thanks!
>
>
>
> On Thu, May 22, 2008 at 4:07 PM, Aleksey Shipilev
> <aleksey.shipilev@gmail.com> wrote:
>> Hi, Alexei!
>>
>> Thanks for the review! Here is some light on your questions:
>>
>> On Thu, May 22, 2008 at 11:27 AM, Alexei Fedotov
>> <alexei.fedotov@gmail.com> wrote:
>>> 1. I have noticed that you have replaced shiftLeft(1) with
>>> shiftLeftOneBit() in the following checks which appear . Is it
>>> performance critical part or just cleaning?
>> This is performancewise change. "sign == 0" guarantees that shifting
>> is not needed.
>>
>>> // Checking if: remainder * 2 >= scaledDivisor
>>> remainder.abs().shiftLeftOneBit().compareTo(scaledDivisor.abs())
>>>
>>> If this is critical, then can one imagine a way of checking that a
>>> number is twice bigger than another number without copying and
>>> shifting the number? For cases when one number has more binary digits
>>> than another this should be easy.
>> IMO, this way is also fast, though we can think up about something
>> with peak efficiency. I see we will need structural changes to
>> accommodate another design which would allow us to use other checking
>> schemes.
>>
>>> 2. What is the meaning of 2 in the new function name unsignedMultAdd2?
>>> Why not unsignedMultAdd4?
>> That mean "unsigned multiplication and addition of 2 arguments", hence
>> the "2" in signature. We can change it to another if this is
>> confusing.
>>
>>> 2a. What is the purpose of the following fix? How a function call
>>> (even for a magic function), could be quicker than multiplication
>>> inlined by JIT? Does it mean our JIT poorly inlines multiplication?
>>>  long res = (aDigits[0] & 0xFFFFFFFFL) * (factor);
>>> + long res = unsignedMultAdd2(aDigits[0], factor, 0, 0);
>> Well, the original purpose of this was code cleanup (see how original
>> version look like, bloated with 0xF...F constants), so we came up with
>> idea of extracting unsigned operations of (a*b), (a*b + c), (a*b + c +
>> d) as general method. Then you can imagine two situations:
>> 1. JIT compiler inlines the method and eliminated unneeded additions.
>> 2. JIT compiler generates the magic for this method and substituting
>> it with optimal native code (HARMONY5826).
>>
>> *sigh* I guess I need to write the javadoc for this method.
>>
>>> 3. You added a check against shifting zero values. Does adding this
>>> check on a common path add to performance? If zeroes come to shift
>>> function often enough to give a benefit, which caller spams so much
>>> zeroes? Does it worth to patch the caller instead?
>> Nope, the benefit of 0path there is much more than penalty on non0path.
>>
>>> 4. I have an abstract concern with no idea how to address it. An
>>> integer array in memory is very similar to a long array (if endings
>>> are correct). Is there any way to use 64bit operations for speed up?
>> We thought about such the thing. This would require severe changes to
>> fit into existing code. More details would be later :)
>>
>> Thanks,
>> Aleksey.
>>
>
>
>
> 
> With best regards,
> Alexei
>
