cxf-dev mailing list archives

From Aki Yoshida <elak...@gmail.com>
Subject Re: Performance issue with Content-ID computation of multipart MTOM/XOP messages
Date Mon, 14 Jul 2014 09:58:24 GMT
I'm not sure whether it is really necessary to make the cid part
depend on the namespace string.

If we only need to guarantee uniqueness within a document, a single
thread calling the createContentID method will get a series of unique
IDs. However, as the update of the static variable counter is not
synchronized, two threads may currently get the same ID value. This
is not a problem as long as the two threads are working on two
different documents. And even if two threads are working on the same
document, using the namespace-dependent value for the cid part won't
reduce the chance of a collision very much, as they are likely to be
using the same namespace value. If we need to guarantee uniqueness
across multiple documents, we will need a different mechanism anyway.
So I don't see much benefit in using the namespace-dependent value
here.
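
As a rough illustration of the point above, here is a minimal sketch
of a namespace-independent variant. It is not the actual CXF code:
the class name ContentIdSketch is hypothetical, ATT_UUID is assumed
to be a per-JVM random UUID string, and the counter is swapped for an
AtomicInteger so that concurrent callers never see the same value.

```java
import java.util.UUID;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch, not the real AttachmentUtil.
class ContentIdSketch {
    // Assumption: a random prefix generated once per JVM.
    private static final String ATT_UUID = UUID.randomUUID().toString();

    // Atomic increment, so two threads cannot obtain the same value.
    private static final AtomicInteger COUNTER = new AtomicInteger();

    // The cid part is a fixed constant; the namespace is not consulted,
    // so no URI or URL is ever built.
    static String createContentID() {
        return ATT_UUID + "-" + COUNTER.incrementAndGet() + "@cxf.apache.org";
    }
}
```

Neither the UUID-plus-counter name nor the fixed cid part contains
characters that URLEncoder would escape, so the encoding step is left
out of the sketch.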

regards, aki

2014-07-14 11:11 GMT+02:00 Sergey Beryozkin <sberyozkin@gmail.com>:
> Hi Alessio
>
> On 14/07/14 08:54, Alessio Soldano wrote:
>>
>> Hi,
>> while running some performance benchmarks here, we noticed a lot of time
>> spent computing the content-id of multipart MTOM/XOP messages, which is
>> quite unexpected (at least to me). We have a client consuming a wsdl
>> which references an external xsd. That xsd contains a type with base64
>> encoded data. The schema declares elementFormDefault="qualified",
>> attributeFormDefault="unqualified" and
>> targetNamespace="org:foo:PurchaseOrder".
>> The problem is in AttachmentUtil's createContentID:
>>
>>      public static String createContentID(String ns) throws UnsupportedEncodingException {
>>          String cid = "cxf.apache.org";
>>          String name = ATT_UUID + "-" + String.valueOf(++counter);
>>          if (ns != null && (ns.length() > 0)) {
>>              try {
>>                  URI uri = new URI(ns);
>>                  String host = uri.toURL().getHost();
>>                  cid = host;
>>              } catch (Exception e) {
>>                  cid = ns;
>>              }
>>          }
>>          return URLEncoder.encode(name, "UTF-8") + "@" + URLEncoder.encode(cid, "UTF-8");
>>      }
>>
>> If the code inside the 'if' block is executed, a URL is to be created
>> from the namespace string, which in my case is something like
>> "org:foo:PurchaseOrder" (note, I can't change that, it's part of the
>> benchmark sources). Building a URL from a String is potentially very
>> expensive, because of the involved URLStreamHandler processing. In my
>> case, the method will try to locate a URLStreamHandler named something
>> like "xyz.org.Handler", which obviously does not exist; that causes a
>> ClassNotFoundException to be initialized, thrown and caught in the catch
>> block above. That badly affects performance.
>>
>> Now, I have a few questions:
>> 1) do we really need that mechanism for computing the content-id from
>> the host of the url generated using the namespace? is there a spec
>> requiring that?
>> 2) if that's required, would you mind me trying to add some preliminary
>> checks to avoid the URL generation when that's clearly going to raise an
>> exception (for instance by parsing the string using a pre-computed
>> regular expression) ?
>
>
> Doing some basic manual checks would be faster indeed. You can simply try
> URI.getScheme and/or URI.getAuthority, and do some basic checks around it,
> no need to convert to URL for sure...
>
> Thanks, Sergey
>
>
>> 3) any different idea / solution?
>>
>> Thanks
>> Alessio
>>
>
>
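
For reference, Sergey's suggestion quoted above (inspect the URI
directly rather than converting it to a URL) might look roughly like
the sketch below. The class and method names are hypothetical, and it
uses URI.getHost instead of getScheme/getAuthority as one possible
form of the check. It only computes the cid part, leaves the final
URLEncoder step to the caller, and is not guaranteed to match the
current behaviour for every input (file: URIs, for example).

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical sketch, not the real AttachmentUtil.
class CidHostSketch {
    // Returns the host of the namespace URI if it has one, otherwise
    // the namespace string itself. No URL is created, so no
    // URLStreamHandler lookup takes place.
    static String cidPart(String ns) {
        if (ns == null || ns.isEmpty()) {
            return "cxf.apache.org";
        }
        try {
            String host = new URI(ns).getHost();
            // Opaque URIs such as "org:foo:PurchaseOrder" have no host.
            return host != null ? host : ns;
        } catch (URISyntaxException e) {
            // Not a valid URI at all: fall back to the raw namespace.
            return ns;
        }
    }
}
```

With this, a namespace such as "org:foo:PurchaseOrder" parses as an
opaque URI with no host, so the method returns the namespace string
without raising an exception.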
