Message-ID:
<4dd1f3f00610121542m27404082i89dbc365d81ae806@mail.gmail.com>
Date: Thu, 12 Oct 2006 15:42:25 -0700
From: "Weldon Washburn"
To: harmony-dev@incubator.apache.org
Subject: Re: [drlvm] The first GC helper with fast-path implemented in Java: gc_alloc
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="----=_Part_24216_28113696.1160692945152"

------=_Part_24216_28113696.1160692945152
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

All,

This is a good discussion that has surfaced many topics related to writing
inlinable vm helpers in java/vmmagic. I have left out the earlier email
replies to reduce clutter. Ultimately we will need to solve all the problems
that have surfaced, including making changes to the GC/JIT/VM interfaces. I
suggest that for right now we focus only on demonstrating the benefit of
inlining one specific existing API, gc_alloc_fast(). The debate on interface
mods can happen later. How about the following steps?

1) Confirm that Mikhail's translation into java/vmmagic is accurate.
2) Get Jitrino.OPT to inline and optimize this code and generate a correct
   binary image.
3) Show the performance delta for some workloads.

More comments inlined below --

On 10/11/06, Mikhail Fursov wrote:
>
> GC, VM gurus!
> I need your help in implementation of the first our helper written with
> magic.
> I've started with GCv41 allocation helper for objects.
> Please review the way I'm going to implement it and correct me if I have
> misunderstood something or confirm if everything is OK.
>
>
> The native fast path:
>
> Managed_Object_Handle gc_alloc_fast(unsigned in_size, Allocation_Handle ah,
>                                     void *thread_pointer) {
> C1.   assert((in_size % GC_OBJECT_ALIGNMENT) == 0);
> C2.   assert (ah);
> C3.   unsigned char *next;
>
> C4.   GC_Thread_Info *info = (GC_Thread_Info *) thread_pointer;
> C5.   Partial_Reveal_VTable *vtable = ah_to_vtable(ah);
> C6.   GC_VTable_Info *gcvt = vtable->get_gcvt();
> C7.   unsigned char *cleaned = info->tls_current_cleaned;
> C8.   unsigned char *res = info->tls_current_free;
>
> C9.   if (res + in_size <= cleaned) {
> C10.      if (gcvt->is_finalizible()) return 0;
>
> C11.      info->tls_current_free = res + in_size;
> C12.      *(VT32*)res = ah;
>
> C13.      assert(((POINTER_SIZE_INT)res & (GC_OBJECT_ALIGNMENT - 1)) == 0);
> C14.      return res;
> C15.  }
>
> C16.  if (gcvt->is_finalizible()) return 0;
>
> C17.  unsigned char *ceiling = info->tls_current_ceiling;
>
>
> C18.  if (res + in_size <= ceiling) {
>
> C19.      info->tls_current_free = next = info->tls_current_free + in_size;
>
>           // cleaning required
> C20.      unsigned char *cleaned_new = next + THREAD_LOCAL_CLEANED_AREA_SIZE;
> C21.      if (cleaned_new > ceiling) cleaned_new = ceiling;
> C22.      info->tls_current_cleaned = cleaned_new;
> C23.      memset(cleaned, 0, cleaned_new - cleaned);
> C24.      *(VT32*)res = ah;
>
> C25.      assert(((POINTER_SIZE_INT)res & (GC_OBJECT_ALIGNMENT - 1)) == 0);
> C26.      return res;
> C27.  }
>
> C28.  return 0;
> }
>
>
>
> The helper's code:
>
> public static Object gc_alloc(int objSize, int allocationHandle) {
>
> J1.   Address tlsAddr = TLS.getGCThreadLocal();
>
> J2.   Address tlsCurrentFreeFieldAddr = tlsAddr.plus(TLS_CURRENT_FREE_OFFSET);
> J3.   Address tlsCurrentCleanedFieldAddr = tlsAddr.plus(TLS_CURRENT_CLEANED_OFFSET);
>
> J4.   Address tlsCurrentFreeAddr = tlsCurrentFreeFieldAddr.loadAddress();
> J5.   Address tlsCurrentCleanedAddr = tlsCurrentCleanedFieldAddr.loadAddress();
>
> J6.   Address tlsNewFreeAddr = tlsCurrentFreeAddr.plus(objSize);
>
>       // the fast path without cleaning
> J7.   if (tlsNewFreeAddr.LE(tlsCurrentCleanedAddr)) {
> J8.       tlsCurrentFreeFieldAddr.store(tlsNewFreeAddr);
> J9.       tlsCurrentFreeAddr.store(allocationHandle);
> J10.      return tlsCurrentFreeAddr;
> J11.  }
>
> J12.  Address tlsCurrentCeilingFieldAddr = tlsAddr.plus(TLS_CURRENT_CEILING_OFFSET);
> J13.  Address tlsCurrentCeilingAddr = tlsCurrentCeilingFieldAddr.loadAddress();
>
>       // the fast path with cleaning
> J14.  if (tlsNewFreeAddr.LE(tlsCurrentCeilingAddr)) {
> J15.      Address tlsNewCleanedAddr = tlsCurrentCeilingAddr;
> J16.      if (tlsCurrentCeilingAddr.diff(tlsNewFreeAddr) > THREAD_LOCAL_CLEANED_AREA_SIZE) {
> J17.          tlsNewCleanedAddr = tlsNewFreeAddr.plus(THREAD_LOCAL_CLEANED_AREA_SIZE);
> J18.      }
> J19.      int bytesToClean = tlsNewCleanedAddr.diff(tlsNewFreeAddr);
> J20.      org.apache.harmony.vmhelper.native.Utils.memset(tlsNewFreeAddr, bytesToClean, 0);
> J21.      tlsCurrentCleanedFieldAddr.store(tlsNewCleanedAddr);
>
> J22.      tlsCurrentFreeFieldAddr.store(tlsNewFreeAddr);
> J23.      tlsCurrentFreeAddr.store(allocationHandle);
> J24.      return tlsCurrentFreeAddr;
>
>       }
>
>       // the slow path
>       // this call will be replaced by JIT with direct native call as VM magic
>       return org.apache.harmony.vmhelper.native.DRLVMHelper.gc_alloc(objSize, allocationHandle);
>
> }
>
>
> The problems I see:
>
> 1) The problem: GC helper must know GC_Thread_Info struct offsets.

If I understand correctly, you are referring to TLS_CURRENT_FREE_OFFSET and
TLS_CURRENT_CEILING_OFFSET. Can we leave this as an ugly hack for right now?
That is, hardcode the actual offsets. Something like:
"static int TLS_CURRENT_FREE_OFFSET = 0x18;"

> 2) The problem: Where to keep GC magic code? This code is GC specific and
> must be available for bootstrap classloader.
> JIT can know all the details which magic code to inline (the helper type,
> the helper signature) from its properties (see opt.emconf file for
> example)

It's prototype code for now. It's not critical that we identify its final
location at this point. In any case, it definitely belongs to the GC
developers.

> 3) The problem: Is the signature for gc_alloc method : gc_alloc(int objSize,
> int allocationHandle) is universal for all GCs?

Well, gc_alloc(...) is what the GC/VM interface currently supports.
After working with MMTk, I now know this API is *not* universal.

> I think it's not. But we can extend JIT with different signatures support if
> needed.

This is correct. We need to extend Jitrino.JET with the MMTk allocation
signature. Then we need to discuss the impact on the GC/VM/JIT interfaces. I
will restart this discussion soon.

> 4) The new magic method is proposed, line J20:
> org.apache.harmony.vmhelper.native.Utils.memset(tlsNewFreeAddr,
> bytesToClean, 0);

I agree with the previous comments that #4 is not needed.

> 5) The magic code does not contain the 'finalizable' check.
> JIT can do this check during the compilation and do not generate the fast
> path. This is another option to pass to JIT from GC.

#5 is really independent of writing helpers in java/vmmagic. How about
addressing #5 at a later time?

> I've enumerated the lines in code if you want to comment it.
> Please feel free to review the code and to discuss any other problems I
> missed.
>
> --
> Mikhail Fursov
>

--
Weldon Washburn
Intel Middleware Products Division

------=_Part_24216_28113696.1160692945152--
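
For anyone checking the translation against the native code, the two-stage
fast path in C9-C28 can be modeled in plain Java without vmmagic. This is only
a rough sketch under stated assumptions, not the DRLVM or GCv41 code: the
class name GcAllocSketch, the byte[] chunk standing in for the thread-local
area, the -1 slow-path marker, and the value 64 for the cleaned-area size are
all made up for illustration. It follows C23 in zeroing from the old cleaned
mark, and like the proposed helper it omits the finalizable check.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Plain-Java model of the two-stage bump-pointer fast path (C9-C28).
// Offsets into a byte[] stand in for raw addresses; no vmmagic involved.
class GcAllocSketch {
    // Hypothetical value: the real THREAD_LOCAL_CLEANED_AREA_SIZE is not given in the thread.
    static final int CLEANED_AREA_SIZE = 64;
    // Stands in for the "return 0" at C28 that sends the caller to the slow path.
    static final int SLOW_PATH = -1;

    final byte[] chunk;      // the thread-local allocation area
    final ByteBuffer view;   // used only to store the 4-byte allocation handle
    int free;                // models tls_current_free
    int cleaned;             // models tls_current_cleaned
    final int ceiling;       // models tls_current_ceiling

    GcAllocSketch(int chunkSize) {
        chunk = new byte[chunkSize];
        Arrays.fill(chunk, (byte) 0xFF);   // pretend the memory starts out dirty
        view = ByteBuffer.wrap(chunk);
        ceiling = chunkSize;
    }

    // Assumes size >= 4 and suitably aligned (C1); no finalizable check (problem 5).
    int alloc(int size, int allocationHandle) {
        int res = free;
        int newFree = res + size;

        if (newFree <= cleaned) {                 // C9: fits in already-zeroed memory
            free = newFree;                       // C11
            view.putInt(res, allocationHandle);   // C12
            return res;                           // C14
        }
        if (newFree <= ceiling) {                 // C18: fits, but needs cleaning
            free = newFree;                       // C19
            int cleanedNew = Math.min(newFree + CLEANED_AREA_SIZE, ceiling); // C20-C21
            Arrays.fill(chunk, cleaned, cleanedNew, (byte) 0); // C23: zero from the old cleaned mark
            cleaned = cleanedNew;                 // C22
            view.putInt(res, allocationHandle);   // C24
            return res;                           // C26
        }
        return SLOW_PATH;                         // C28
    }

    public static void main(String[] args) {
        GcAllocSketch tla = new GcAllocSketch(128);
        System.out.println(tla.alloc(16, 0xCAFE)); // takes the cleaning path
        System.out.println(tla.alloc(16, 7));      // takes the no-cleaning path
        System.out.println(tla.alloc(200, 1));     // does not fit: slow-path marker
    }
}
```

With a 128-byte chunk, the three calls in main print 0, 16 and -1: the first
allocation has to clean, the second fits inside the area just cleaned, and the
third exceeds the ceiling and falls through to the slow path.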