harmony-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ray Chen <clrayc...@gmail.com>
Subject Re: [classlib] BufferedReader and FileInputStream.available()
Date Tue, 10 Nov 2009 06:09:54 GMT
Shall we store the return value of out.position()-1 to avoid invoking it
twice?
If it is too tiny for performace improvement, current patch seems good for
me.

On Tue, Nov 10, 2009 at 2:04 PM, Regis <xu.regis@gmail.com> wrote:

> Ray Chen wrote:
>
>> Hi Regis,
>>
>> Maybe add the z/OS newline, on z/OS 0x15 means the newline. You can refer
>> to
>> PrintStream.write(int) for this.
>>
>
> how about this:
>
> if ((out.position() > offset)
>        && (out.get(out.position() - 1) == '\n' || out
>                .get(out.position() - 1) == 0x15)) {
>
>    // we could return the result without blocking read
>    break;
> }
>
>
>> On Tue, Nov 10, 2009 at 12:35 PM, Regis <xu.regis@gmail.com> wrote:
>>
>>  enh wrote:
>>>
>>>  On Fri, Nov 6, 2009 at 02:28, Mark Hindess <mark.hindess@googlemail.com
>>>> >
>>>> wrote:
>>>>
>>>>  In message <4AF3F7DE.3020400@gmail.com>, Tim Ellison writes:
>>>>>
>>>>>  On 06/Nov/2009 03:06, Regis wrote:
>>>>>>
>>>>>>  Mark Hindess wrote:
>>>>>>>
>>>>>>>  In message <
>>>>>>>> 96933a4d0911051045p61431af5ie2cecb850a62267a@mail.gmail.com>,
>>>>>>>> enh writes:
>>>>>>>>
>>>>>>>>  On Wed, Nov 4, 2009 at 23:07, Regis <xu.regis@gmail.com>
wrote:
>>>>>>>>>
>>>>>>>>>  Mark Hindess wrote:
>>>>>>>>>>
>>>>>>>>>>  In message <4AF0DC18.1000504@gmail.com>, Regis
writes:
>>>>>>>>>>>
>>>>>>>>>>> Regis, these are really two different issues
so lets treat them
>>>>>>>>>>> as such.  I actually think fixing the /proc reading
is better
>>>>>>>>>>> done in the java code.  I've attached a patch
that just catches
>>>>>>>>>>> and ignores the IOException from the available
call.  I think
>>>>>>>>>>> this is reasonable since if the exception was
"real" then the
>>>>>>>>>>> read will fail if a similar exception.  In fact,
since the user
>>>>>>>>>>> is doing a read call, having the exception thrown
by the read is
>>>>>>>>>>> more in keeping with the expectations of the
user than having an
>>>>>>>>>>> exception thrown by the available call.
>>>>>>>>>>>
>>>>>>>>>>>  Thanks Mark, I ran luni tests with your patch,
no regression
>>>>>>>>>> found, I'll leave patch to you to apply :)
>>>>>>>>>>
>>>>>>>>>> invoking in.available before read is a little overhead,
in the
>>>>>>>>>> worst case, one read need four system calls, three
seek and one
>>>>>>>>>> read. Actually the condition
>>>>>>>>>>
>>>>>>>>>> if ((in.available() = 0) && (out.position()
> offset))
>>>>>>>>>>
>>>>>>>>>> it's only for tty file, for normal file it's not
necessary. It's
>>>>>>>>>> better if we can remove the check for normal file,
but I have no
>>>>>>>>>> idea how to do this.
>>>>>>>>>>
>>>>>>>>>>  isatty(3). for the price of one JNI call at construction
and a
>>>>>>>>> boolean field to record the result, you could avoid any
later
>>>>>>>>> calls.
>>>>>>>>>
>>>>>>>>>  This might be a good idea, but I'm getting confused...
>>>>>>>>
>>>>>>>> At the moment there seems to be a special case in
>>>>>>>> FileInputStream.available() like this:
>>>>>>>>
>>>>>>>>  // stdin requires special handling
>>>>>>>>  if (fd == FileDescriptor.in) {
>>>>>>>>   return (int) fileSystem.ttyAvailable();
>>>>>>>>  }
>>>>>>>>  ... perform seek based available check ...
>>>>>>>>
>>>>>>>> but I find this confusing because the comment and check implies
>>>>>>>> that stdin is special but the fileSystem method name implies
>>>>>>>> that it is being a tty that is the distinction.  This is
>>>>>>>> confusing because *any* file descriptor can be a tty - e.g.
new
>>>>>>>> FileReader("/dev/tty") - and stdin doesn't have to be a tty
-
>>>>>>>> "java </dev/zero".
>>>>>>>>
>>>>>>>>  Good question, I suppose here should be like this:
>>>>>>> if (istty(fd)) {
>>>>>>>  return (int) fileSystem.ttyAvailable(fd);
>>>>>>> }
>>>>>>>
>>>>>>>  With apologies for being pedantic, but IMHO it should be fd.istty()
>>>>>> since there are likely a few places where this could be useful, and
we
>>>>>> should cache it in the right place.
>>>>>>
>>>>>>  Agreed.  I think that is what elliott was suggesting earlier in
this
>>>>> thread.  Definitely seems like we should fix the semantics of these tty
>>>>> checks so they aren't just about stdin.
>>>>>
>>>>>  yes and no. i was restricting myself to the question "how should we
>>>> query the number of available bytes?". on Unix, the answer is with the
>>>> FIONREAD ioctl. so my claim was really:
>>>>
>>>> * there should be a "static native int available(FileDescriptor fd)"
>>>> method generally available
>>>> * the Unix implementation should use the FIONREAD ioctl.
>>>> * any platform-specific hacks (such as, say, if Windows really needs
>>>> the three-seek hack) should be in the native code for the broken
>>>> platforms.
>>>> * any Java should use the intention-revealing
>>>> "available(FileDescriptor)" method and not concern itself with how
>>>> that works on any given platform.
>>>>
>>>> we've done that for Android. (if Windows forces you to treat files and
>>>> ttys completely differently, your life will be more difficult, but i
>>>> still think that should be in the platform-specific code. as you
>>>> pointed out with the "/dev/tty" example, on Unix it's not just
>>>> inefficient, it's wrong.)
>>>>
>>>> for the bigger question of BufferedReader.readLine, i don't think we
>>>> should be calling "available" at all here. i think somewhere there
>>>> must be the equivalent of a "readFully" rather than a "read", and
>>>> that's the mistake that needs fixing. fix that, and the "available"
>>>> hack can go completely.
>>>>
>>>>  I just found a way can remove the in.available() check here. Instead of
>>> checking  in.available, I check whether the last char read from stream is
>>> '\n'.
>>>
>>> Index: modules/luni/src/main/java/java/io/InputStreamReader.java
>>> =====================================================================
>>> --- modules/luni/src/main/java/java/io/InputStreamReader.java
>>> +++ modules/luni/src/main/java/java/io/InputStreamReader.java
>>> @@ -246,14 +246,9 @@ public class InputStreamReader extends Reader {
>>>            while (out.hasRemaining()) {
>>>                // fill the buffer if needed
>>>                if (needInput) {
>>> -                    try {
>>> -                        if ((in.available() == 0)
>>> -                            && (out.position() > offset)) {
>>> -                            // we could return the result without
>>> blocking
>>> read
>>> -                            break;
>>> -                        }
>>> -                    } catch (IOException e) {
>>> -                        // available didn't work so just try the read
>>> +                    if ((out.position() > offset) &&
>>> out.get(out.position() - 1) == '\n') {
>>> +                        // we could return the result without blocking
>>> read
>>> +                        break;
>>>                    }
>>>
>>>                    int to_read = bytes.capacity() - bytes.limit();
>>>
>>>
>>> --
>>> Best Regards,
>>> Regis.
>>>
>>>
>>>
>>>  try this:
>>>>
>>>> #include <unistd.h>
>>>> int main() {
>>>>  ssize_t n;
>>>>  char buf[1024];
>>>>  while ((n = read(0, buf, sizeof(buf))) > 0) {
>>>>   write(2, ">", 1);
>>>>   write(2, buf, n);
>>>>  }
>>>>  return 0;
>>>> }
>>>>
>>>> if you run that and type "hello" then return, then "world" then
>>>> return, you'll see (mixing your input and its output):
>>>>
>>>> hello
>>>>
>>>>  hello
>>>>>
>>>>>  world
>>>>
>>>>  world
>>>>>
>>>>>  because it's the terminal's job to make sure you get data at the end
>>>> of a line. (or not, if so configured, but if someone has configured
>>>> their terminal that way, that's their decision, they should expect
>>>> programs' behavior to change, and we certainly shouldn't go around
>>>> reconfiguring the terminal.)
>>>>
>>>> by contrast, if you run this (where my program above has been built as
>>>> "read") like this:
>>>>
>>>> echo -e "hello\nworld\n" | ./read
>>>>
>>>> you see:
>>>>
>>>>  hello
>>>> world
>>>>
>>>> i.e. both lines arrive as a single chunk of data.
>>>>
>>>> (you didn't say exactly _how_ the test fails, so i'm just guessing
>>>> it's related to this.)
>>>>
>>>>  --elliott
>>>>
>>>>  Still need to look at the available natives too since elliott suggested
>>>>
>>>>> that we should replace the 3*seek with a single ioctl for all cases
>>>>> (though this might not be as portable).
>>>>>
>>>>> -Mark.
>>>>>
>>>>>  Regis, can you explain a little more about why the check is needed
>>>>>
>>>>>> in the available call?  And why the available check is needed in
the
>>>>>>>> FileInputStream read method?
>>>>>>>>
>>>>>>>>  After I removed the check, everything is fine, except following
>>>>>>> test is
>>>>>>> failed:
>>>>>>>
>>>>>>>  public static void main(String args[])
>>>>>>>  {
>>>>>>>   BufferedReader reader = new BufferedReader(new
>>>>>>> InputStreamReader(System.in));
>>>>>>>   String line = null;
>>>>>>>   try {
>>>>>>>     line = reader.readLine();
>>>>>>>   } catch (IOException e) {
>>>>>>>     e.printStackTrace();
>>>>>>>   }
>>>>>>>   System.out.println(line);
>>>>>>>  }
>>>>>>>
>>>>>>>  In each case, is it really stdin that is special or any file
>>>>>>>
>>>>>>>> descriptor
>>>>>>>> representing a tty?
>>>>>>>>
>>>>>>>> I'd like to understand this as there are a whole bunch of
ttyRead
>>>>>>>> with
>>>>>>>> a similar check that only affects stdin not any tty descriptor.
>>>>>>>>
>>>>>>>>  in the meantime, i wondered also whether it's worth swapping
the
>>>>>>>> two
>>>>>>>>
>>>>>>>>> conjuncts in the if, since the latter is essentially
free while the
>>>>>>>>> former is expensive. (i've no idea how often the cheap
conjunct is
>>>>>>>>> false, though.)
>>>>>>>>>
>>>>>>>>>  I thought about this but held of doing it for the same
reason.
>>>>>>>>  However,
>>>>>>>> I guess it is almost certainly a win.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>  Mark.
>>>>>>>>
>>>>>>>>
>>>>>
>>>>
>>>>  --
>>> Best Regards,
>>> Regis.
>>>
>>>
>>
>>
>>
>
> --
> Best Regards,
> Regis.
>



-- 
Regards,

Ray Chen

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message