Message-ID: <53C39ECD.7020509@gmail.com>
Date: Mon, 14 Jul 2014 10:11:41 +0100
From: Sergey Beryozkin
To: dev@cxf.apache.org
Subject: Re: Performance issue with Content-ID computation of multipart MTOM/XOP messages
References: <53C38CA1.7040308@redhat.com>
In-Reply-To: <53C38CA1.7040308@redhat.com>

Hi Alessio

On 14/07/14 08:54, Alessio Soldano wrote:
> Hi,
> while running some performance benchmarks here, we noticed a lot of time
> spent computing the content-id of multipart MTOM/XOP messages, which is
> quite unexpected (at least to me). We have a client consuming a wsdl
> which references an external xsd. That xsd contains a type with base64
> encoded data. The schema declares elementFormDefault="qualified",
> attributeFormDefault="unqualified" and
> targetNamespace="org:foo:PurchaseOrder".
> The problem is in AttachmentUtil's createContentID:
>
> public static String createContentID(String ns)
>         throws UnsupportedEncodingException {
>     String cid = "cxf.apache.org";
>     String name = ATT_UUID + "-" + String.valueOf(++counter);
>     if (ns != null && (ns.length() > 0)) {
>         try {
>             URI uri = new URI(ns);
>             String host = uri.toURL().getHost();
>             cid = host;
>         } catch (Exception e) {
>             cid = ns;
>         }
>     }
>     return URLEncoder.encode(name, "UTF-8") + "@"
>             + URLEncoder.encode(cid, "UTF-8");
> }
>
> If the code inside the 'if' block is executed, a URL is to be created
> from the namespace string, which in my case is something like
> "org:foo:PurchaseOrder" (note, I can't change that, it's part of the
> benchmark sources). Building a URL from a String is potentially very
> expensive, because of the involved URLStreamHandler processing. In my
> case, the method will try to locate a URLStreamHandler named something
> like "xyz.org.Handler", which obviously does not exist; that causes a
> CNFE to be initialized, thrown and caught in the catch block above. That
> badly affects performance.
>
> Now, I have a few questions:
> 1) do we really need that mechanism for computing the content-id from
> the host of the url generated using the namespace? is there a spec
> requiring that?
> 2) if that's required, would you mind me trying to add some preliminary
> checks to avoid the URL generation when that's clearly going to raise an
> exception (for instance by parsing the string using a pre-computed
> regular expression)?

Doing some basic manual checks would be faster indeed. You can simply
try URI.getScheme and/or URI.getAuthority, and do some basic checks
around it, no need to convert to URL for sure...

Thanks, Sergey

> 3) any different idea / solution?
>
> Thanks
> Alessio
>
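
The URI-only check suggested in the reply could look roughly like the sketch below. It is an illustration only, not the change that was made in CXF. The class name ContentIdSketch, the helper domainFor, the AtomicInteger counter and the stand-in ATT_UUID value are assumptions introduced here; only createContentID, the "cxf.apache.org" default and the URL-encoding step come from the quoted method. The sketch reads the host straight from java.net.URI (the host is part of the authority), so no URL is built and no URLStreamHandler lookup is triggered for a namespace such as "org:foo:PurchaseOrder".

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;
import java.util.UUID;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the quoted AttachmentUtil method.
final class ContentIdSketch {
    // Assumption: stand-ins for the ATT_UUID and counter fields of the original class.
    private static final String ATT_UUID = UUID.randomUUID().toString();
    private static final AtomicInteger COUNTER = new AtomicInteger();

    private ContentIdSketch() {
    }

    // Picks the part that goes after '@' without ever converting to java.net.URL.
    public static String domainFor(String ns) {
        if (ns == null || ns.isEmpty()) {
            return "cxf.apache.org";
        }
        try {
            // URI parsing is purely syntactic; no protocol handler is looked up.
            String host = new URI(ns).getHost();
            if (host != null) {
                return host;
            }
        } catch (URISyntaxException e) {
            // Not a valid URI: fall back to the raw namespace below.
        }
        return ns;
    }

    public static String createContentID(String ns) throws UnsupportedEncodingException {
        String name = ATT_UUID + "-" + COUNTER.incrementAndGet();
        return URLEncoder.encode(name, "UTF-8") + "@"
                + URLEncoder.encode(domainFor(ns), "UTF-8");
    }
}
```

With this sketch, "org:foo:PurchaseOrder" parses as an opaque URI with no host, so the namespace itself ends up after the '@' (URL-encoded), while "http://example.com/po" yields example.com. One behavioural difference from the quoted code is worth checking before adopting anything like this: a URI such as file:///tmp/x has no host and falls back to the namespace here, whereas the URL-based version would produce an empty host.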