Return-Path: Delivered-To: apmail-harmony-dev-archive@www.apache.org Received: (qmail 84199 invoked from network); 5 Nov 2007 08:14:01 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.2) by minotaur.apache.org with SMTP; 5 Nov 2007 08:14:01 -0000 Received: (qmail 81499 invoked by uid 500); 5 Nov 2007 08:13:49 -0000 Delivered-To: apmail-harmony-dev-archive@harmony.apache.org Received: (qmail 80970 invoked by uid 500); 5 Nov 2007 08:13:48 -0000 Mailing-List: contact dev-help@harmony.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@harmony.apache.org Delivered-To: mailing list dev@harmony.apache.org Received: (qmail 80961 invoked by uid 99); 5 Nov 2007 08:13:48 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 05 Nov 2007 00:13:47 -0800 X-ASF-Spam-Status: No, hits=2.0 required=10.0 tests=HTML_MESSAGE,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (athena.apache.org: domain of liyilei1979@gmail.com designates 209.85.128.191 as permitted sender) Received: from [209.85.128.191] (HELO fk-out-0910.google.com) (209.85.128.191) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 05 Nov 2007 08:13:51 +0000 Received: by fk-out-0910.google.com with SMTP id 18so1521058fks for ; Mon, 05 Nov 2007 00:13:29 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:to:subject:in-reply-to:mime-version:content-type:references; bh=myL4Ko7GxzH53S6unevx6CWNuyHO1hzVtr1JQJFKQfo=; b=YeuyNfrsjoA5EtkBXuoo5MBF1MPwAeHkiij/bFAaybfIJ64tkRX9p13uiJWpKDfg1fkfRYpvx6Z0d9RUFkKMEzcGmR3/19zBE4vaQ5R+frKvOiyp93WS7h6PdkSK57PqmawscV05iqZOkfeVw542numo9+WF1clIgH6W96Ywmww= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=beta; h=received:message-id:date:from:to:subject:in-reply-to:mime-version:content-type:references; b=kVxFhYlS1etd+6zNODW7frbr8FJDb8gqn7Ocyo4UGT4UMHtACIE0Ngw5lOOpD1W6PWE0Vf9BDy5FCHV3ZdpEySnbk0R00odnpVi3/U8K2s3447BWMwmENI3kruMQUGMdiIHUfphgPTJooOaQXr5rVQDSOxgmXSJIBmPcDOREycw= Received: by 10.82.124.10 with SMTP id w10mr8757109buc.1194250408236; Mon, 05 Nov 2007 00:13:28 -0800 (PST) Received: by 10.82.159.3 with HTTP; Mon, 5 Nov 2007 00:13:28 -0800 (PST) Message-ID: Date: Mon, 5 Nov 2007 16:13:28 +0800 From: "Leo Li" To: dev@harmony.apache.org Subject: Re: [classlib][nio]FileChannel and direct buffer reallocation(was Re: [jira] Commented: (HARMONY-4081) [classlib][nio] FileChannel.write(ByteBuffer[]) sometimes works incorrectly) In-Reply-To: MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----=_Part_27676_28220933.1194250408226" References: <51abf0750708200937s54f07eafve82fe2a32fbfad6b@mail.gmail.com> <51abf0750709100748y6d3a45b6na6ee6e1607143f20@mail.gmail.com> <51abf0750709110336q1de3c9ddxe16b8d5fb4259a07@mail.gmail.com> X-Virus-Checked: Checked by ClamAV on apache.org ------=_Part_27676_28220933.1194250408226 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline On 11/5/07, Yang Paulex wrote: > > Hi, all > > Any progress in this issue? Is it OK to commit the patch for > FileChannelImpl > at first? At least, this part is a obvious issue to be fixed. I myself > tried > it locally on laptop Win/Desktop Linux based on IBM VME, and it works > fine. I myself think it is OK. Had the modification at r567561 had problem(although I do not think so), it would not hinder us from applying the patch for FileChannelImpl: It releases the bytebuffer too early to complete the operation. 2007/9/11, Leo Li : > > > > On 9/11/07, Mikhail Markov wrote: > > > Leo, > > > > > > I've tried it on my laptop without Hyperthreading - still it crashes. > > > And, btw, it did not crashed before r567561 commit - could that be a > > problem > > > in that patch? > > > > In that patch, I just explicitly invoke System.gc when memory is > > tight. It will aggravate the problem, but I am not sure whether it is > > the root cause. > > > > It will take some time to investigate it. > > > > Good luck! > > > > > > Thanks, > > > Mikhail > > > > > > On 9/11/07, Leo Li wrote: > > > > > > > > On 9/10/07, Mikhail Markov wrote: > > > > > Hi, Leo, > > > > > > > > > > I've just checked only FileChannelImpl changes with the latest svn > > > > snapshot > > > > > and got VM crash (in IBM VME) in threadstart WIN API function. > Could > > you > > > > > repeat this? > > > > > > > > Hi, Mikhail > > > > I have met this problem before. It seems the native block > > > > allocated for the direct byte buffer is released before we expected > so > > > > the WIN API will reference an invalid address although the direct > byte > > > > buffer should have been pinned in the patch. > > > > After some studying, I cannot find obvious problem in the native > > > > block reallocation mechanism in the class library. Actually IBM VME > > > > and DRLVM both encounter the failure. > > > > But what makes me puzzle most is the problem can occur on RI > > > > albeit with a much lower frequency. (Not sure whether it can be > > > > reproduce on every machine.)Seems it is a cross classlib and cross > vm > > > > problem.:) > > > > I have got some hint but I have no proof so I was hesitating to > > > > tell it in public: I have once shutdown the hyper-threading option > and > > > > then everything is ok. I will try to find a multi-processor machine > > > > with hyper-threading shutdown for test to determine whether it is > > > > related to hyper-threading or parallel multi-threading(not the style > > > > of time slice sharing). > > > > Could you please also try this on your server with hyper-threading > > > > closed since the result is always different on each machine? > > > > > > > > Good luck! > > > > > > > > > > > > > > Thanks, > > > > > Mikhail > > > > > > > > > > On 8/22/07, Leo Li wrote: > > > > > > > > > > > > Hi, Mikhail: > > > > > > I have just focused on the problem about how to ensure > direct > > byte > > > > > > buffer to be release in time. And after I applied the patch at > > > > r567561, > > > > > > although the problem of releasing direct byte buffer seems > > resolved, I > > > > > > found > > > > > > one testcase in FileChannel failed. > > > > > > After some studying I found the problem is coincident with > > > > > > HARMONY-4081, in that there is a bug in FileChannel.write() that > > there > > > > is > > > > > > no > > > > > > holder for temporarily allocated direct buffers then the they > > might be > > > > > > gc-ed > > > > > > and the related memory resource reallocated. My patch, > > which might > > > > > > intrigue GC, aggravates this problem and leads to failure in > > normal > > > > > > testsuite. > > > > > > So could you please first commit the part of > FileChannel.write > > () > > > > and > > > > > > let current tests pass? > > > > > > > > > > > > Thanks. > > > > > > > > > > > > On 8/21/07, Mikhail Markov wrote: > > > > > > > > > > > > > > Thanks for detailed comments! > > > > > > > You are right about the memory auto-freeing, so my > modifications > > of > > > > > > > AbstractMemorySpy are not correct. > > > > > > > See my comments for MappedByteBuffer below inlined. > > > > > > > > > > > > > > Still the changes in FileChannelImpl alone do not work: I've > > just > > > > > > > re-tried: > > > > > > > the test still fails and the following messages starts > printing: > > > > > > > ... > > > > > > > Memory Spy! Fixed attempt to free memory that was not > allocated > > > > > > > PlatformAddress[29968352] > > > > > > > ... > > > > > > > I've added debug stack-trace printing and found that these > > messages > > > > are > > > > > > > printed when tried to free DirectBuffers at the end of > > > > > > > FileChannelImpl.write() > > > > > > > method. It's strange at least, that we could not explicitly > free > > > > > > > DirectBuffer which we allocated. > > > > > > > Seems like these buffers were freed in > > > > AbstractMemorySpy.orphanedMemory > > > > > > () > > > > > > > method. > > > > > > > The comment for DirectByteBuffer.free() method says: > > > > > > > "Explicitly free the memory used by this direct byte buffer. > If > > the > > > > > > memory > > > > > > > has already been freed then this is a no-op. > > > > > > > ... > > > > > > > Note this is is possible that the memory is freed by code that > > > > reaches > > > > > > > into > > > > > > > the address and explicitly frees it 'beneith' us -- this is > bad > > > > form." > > > > > > > Does it mean that freeing in AbstractMemorySpy.orphanedMemory > () > > is > > > > "bad > > > > > > > form"? :-) > > > > > > > We should somehow "synchronize" the explicit memory freeing > and > > > > > > > auto-freeing > > > > > > > in AbstractMemorySpy. > > > > > > > Looking into the code, i could propose to add additional > boolean > > > > > > parameter > > > > > > > to AbstractMemorySpy.free() method to indicate if warning > > message > > > > should > > > > > > > be > > > > > > > printed or not but the main problem here is that reproducer > > still > > > > fails > > > > > > if > > > > > > > modify just FileChannelImpl, which means that auto-freeing > does > > not > > > > work > > > > > > > as > > > > > > > expected. And I'm not quite understand why it's so. > > > > > > > Do you have any ideas? > > > > > > > > > > > > > > Thanks, > > > > > > > Mikhail > > > > > > > > > > > > > > > > > > > > > On 8/17/07, Yang Paulex wrote: > > > > > > > > > > > > > > > > I'm forwarding this discussion to dev-list to make the > > discussion > > > > > > > > easier:). > > > > > > > > Please see my comments inline. > > > > > > > > > > > > > > > > 2007/8/16, Mikhail Markov (JIRA) : > > > > > > > > > > > > > > > > > > > > > > > > > > > [ > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://issues.apache.org/jira/browse/HARMONY-4081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12520216 > > > > > > > > ] > > > > > > > > > > > > > > > > > > Mikhail Markov commented on HARMONY-4081: > > > > > > > > > ----------------------------------------- > > > > > > > > > > > > > > > > > > Paulex, > > > > > > > > > > > > > > > > > > Thanks for the patch review! > > > > > > > > > > > > > > > > > > In the beginning, i've created the patch withouth > > > > > > > MappedPlatformAddress > > > > > > > > > and AbstractMemorySpy modifications, but this lead to > > > > exceptions. > > > > > > > Seems > > > > > > > > like > > > > > > > > > after explicit temporary buffers freeing at the end of > > > > > > > > > FileChannelImpl.write() method, another attempt to free > the > > same > > > > > > > > resources > > > > > > > > > is made in RuntimeMemorySpy.alloc() method. > > > > > > > > > > > > > > > > > > About MappedPlatformAddress modification: yes - on Linux > > it's as > > > > you > > > > > > > > > described, but unfortunately on Windows UnmapViewOfFile > > function > > > > is > > > > > > > > used, > > > > > > > > > which does not physically free the memory. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I missed to mention windows implementation last time, but I > > don't > > > > > > catch > > > > > > > > you > > > > > > > > up here on the UnmapViewOfFile, because I cannot find the > > relevant > > > > > > > > explanation in MSDN that this method needs further free() to > > > > release > > > > > > > > physical memory [1], and the sample code of MSDN doesn't add > > any > > > > > > further > > > > > > > > memory free for this[2]. Even if UnmapViewOfFile doesn't > free > > the > > > > > > > memory, > > > > > > > > I > > > > > > > > prefer we modify the unmap() implementation on Windows to > add > > > > memory > > > > > > > free, > > > > > > > > so that the the platform neutral behavior can be kept for > > portlib, > > > > > > > > otherwise > > > > > > > > on Linux we may put the situation in risk that free same > > memory > > > > twice. > > > > > > > how > > > > > > > > do you think? > > > > > > > > > > > > > > > > > > > > > I've just checked the info again and agree with you - no > > explicit > > > > memory > > > > > > > freeing is needed. > > > > > > > > > > > > > > But I did see some potential problems, although not sure > because > > > > MSDN is > > > > > > > not > > > > > > > > very clear here:- > > > > > > > > > > > > > > > > The CloseHandle() needs to be invoked after all file view is > > > > unmapped, > > > > > > > in > > > > > > > > our case, we don't support multi-view for same file mapping > > > > object, so > > > > > > > > it's > > > > > > > > OK to close the handle right after unmap. But for some > unknown > > > > > > reasons, > > > > > > > > currently CloseHandle() is done in windows version's > > mmapImpl(Ln. > > > > 151, > > > > > > > > luni/src/main/native/luni/windows/OSMemoryWin32.c) right > after > > > > > > > > MapViewOfFile, I'm not sure if this is right action or not. > > Some > > > > > > > relevant > > > > > > > > MSDN pages: > > > > > > > > > > > > > > > > "Unmapping a file view invalidates the pointer to the > > process's > > > > > > virtual > > > > > > > > address space. If any of the pages of the file view have > > changed > > > > since > > > > > > > the > > > > > > > > view was mapped, the system writes the changed pages of the > > file > > > > to > > > > > > disk > > > > > > > > using caching. To commit all data to disk before unmapping > the > > > > file > > > > > > > view, > > > > > > > > use the *FlushViewOfFile*< > > > > > > > > http://msdn2.microsoft.com/en-us/library/aa366563.aspx > > >function. > > > > > > > > > > > > > > > > When each process finishes using the file mapping object and > > has > > > > > > > unmapped > > > > > > > > all views, it must close the file mapping object's handle > and > > the > > > > file > > > > > > > on > > > > > > > > disk by calling > > > > > > > > *CloseHandle*< > > > > http://msdn2.microsoft.com/en-us/library/ms724211.aspx>. > > > > > > > > These calls to *CloseHandle* succeed even when there are > file > > > > views > > > > > > that > > > > > > > > are > > > > > > > > still open. However, leaving file views mapped causes memory > > > > leaks." > > > > > > > > In Harmony implementation, we actually call CloseHandle > before > > the > > > > > > only > > > > > > > > mapped file view is unmapped, but from the document above, I > > > > cannot > > > > > > say > > > > > > > > it's > > > > > > > > safe or not. I'll try to find some time to test later. > > > > > > > > > > > > > > > > > > > > > I've read in [1] in Remarks: > > > > > > > "Although an application may close the file handle used to > > create a > > > > file > > > > > > > mapping object, the system holds the corresponding file open > > until > > > > the > > > > > > > last > > > > > > > view of the file is unmapped: Files for which the last view > has > > not > > > > yet > > > > > > > been > > > > > > > unmapped are held open with no sharing restrictions." > > > > > > > > > > > > > > So, seems like immediate closing file handles after mapping > > looks > > > > ok... > > > > > > > > > > > > > > > > > > > > > [1] http://msdn2.microsoft.com/en-us/library/aa366882.aspx > > > > > > > > [2] http://msdn2.microsoft.com/en-us/library/aa366548.aspx > > > > > > > > [3] http://msdn2.microsoft.com/en-us/library/aa366532.aspx > > > > > > > > > > > > > > > > About AbstractMemorySpy modification: the modification is > > related > > > > to > > > > > > the > > > > > > > > one > > > > > > > > > in MappedPlatformAddress. Usually, the following > > construction is > > > > > > used > > > > > > > > for > > > > > > > > > resources explicit freeing: > > > > > > > > > if(memorySpy.free(this)){ > > > > > > > > > osMemory.free(osaddr); > > > > > > > > > } > > > > > > > > > i.e. after removing the address from memoryInUse, physical > > > > freeing > > > > > > > > happens > > > > > > > > > - in this case. The only place where this was not so is > > > > > > > > > MappedPlatformAddress (at least for Windows), so after i > > added > > > > > > > explicit > > > > > > > > > memory freeing in MappedPlatformAddress, the address could > > be > > > > safely > > > > > > > > removed > > > > > > > > > from refToShadow. > > > > > > > > > You're right - is this case the mechanizm of auto-freeing > is > > not > > > > > > > > > necessary. > > > > > > > > > I did not found places where free() method explicity used > > > > > > > > > except *PlatformAddress classes. Do you know any? > > > > > > > > > If not then do we really need this mechanizm if all > physical > > > > freeing > > > > > > > is > > > > > > > > > done in *PlatformAddress classes? > > > > > > > > > > > > > > > > > > > > > > > > PlatformAddress is used not only by MappedDirectBuffer but > by > > > > common > > > > > > > > direct > > > > > > > > buffer, too. We cannot ask applications running on Harmony > to > > > > > > explicitly > > > > > > > > free all direct buffer, so the automatic > > reallocation mechanism > > > > is > > > > > > > still > > > > > > > > necessary. If the number/size of direct buffer used by > > > > > > > FileChannelImpl's > > > > > > > > gather/scatter IO make you uncomfortable so that they are > > expected > > > > to > > > > > > be > > > > > > > > released explicitly and quickly, I prefer we find some way > to > > deal > > > > > > with > > > > > > > > this > > > > > > > > within FileChannelImpl rather than in a method of > > PlatformAddress > > > > or > > > > > > > > AbstractMemorySpy, which may be depended on by other > classlib > > and > > > > in > > > > > > > turn > > > > > > > > by > > > > > > > > applications. How do you think? > > > > > > > > > > > > > > > > -- > > > > > > > > Paulex Yang > > > > > > > > China Software Development laboratory > > > > > > > > IBM > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > Leo Li > > > > > > China Software Development Lab, IBM > > > > > > > > > > > > > > > > > > > > > > > -- > > > > Leo Li > > > > China Software Development Lab, IBM > > > > > > > > > > > > > -- > > Leo Li > > China Software Development Lab, IBM > > > > > > -- > Paulex Yang > China Software Development Lab > IBM > -- Leo Li China Software Development Lab, IBM ------=_Part_27676_28220933.1194250408226--