From: Daniel Lescohier <daniel.lescohier@cbsinteractive.com>
To: dev@httpd.apache.org
Date: Tue, 3 Dec 2013 19:14:22 -0500
Subject: Re: time caching in util_time.c and mod_log_config.c

I took a look at apr's configure.in, and its default for all architectures
except for i486, i586, and i686 is to use the real atomic ops, but for those
three architectures the default is to use the "generic" atomic ops.  Any idea
why there is a special rule for those three architectures?  There's nothing
wrong with the atomic operations on those three architectures: otherwise, how
have we had semaphores and mutexes for all these years on those CPUs?  I guess
that is a question for the APR dev mailing list.

I see that some distros override that default.  E.g., the libapr1.spec for
openSUSE has:

%ifarch %ix86
        --enable-nonportable-atomics=yes \
%endif

and in /usr/lib/rpm/macros:

On Tue, Dec 3, 2013 at 12:54 PM, Yann Ylavic <ylavic.dev@gmail.com> wrote:

> I personally like this solution better (IMHO) since it does not rely on
> apr_thread_mutex_trylock() to be wait-free/userspace (e.g. natively
> implements the "compare and swap").
>
> On the other hand, apr_atomic_cas32() may itself be implemented using
> apr_thread_mutex_lock() when USE_ATOMICS_GENERIC is defined (explicitly,
> or with --enable-nonportable-atomics=no, or else forcibly with
> "gcc -std=c89" or Intel CPUs <= i686).
>
> Hence with USE_ATOMICS_GENERIC, apr_thread_mutex_trylock() may be a better
> solution than apr_thread_mutex_lock()...
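
[A minimal sketch of the trylock-based read path discussed above, for readers
following along.  All names here (demo_cache, demo_cache_mutex,
demo_child_init, demo_trylock_explode_lt) are hypothetical illustrations, not
code from this thread; the mutex must be created per process (e.g. in a
child_init hook), and write-back of a freshly computed value is omitted for
brevity.]

/* Sketch only: hypothetical names, not part of any proposed patch. */
#include <string.h>
#include <apr_errno.h>
#include <apr_pools.h>
#include <apr_thread_mutex.h>
#include <apr_time.h>

static apr_thread_mutex_t *demo_cache_mutex;
static struct { apr_int64_t key; apr_time_exp_t value; } demo_cache;

/* Would be called once per process, e.g. from a child_init hook. */
static apr_status_t demo_child_init(apr_pool_t *p)
{
    return apr_thread_mutex_create(&demo_cache_mutex,
                                   APR_THREAD_MUTEX_DEFAULT, p);
}

static apr_status_t demo_trylock_explode_lt(apr_time_exp_t *value, apr_time_t t)
{
    const apr_int64_t seconds = apr_time_sec(t);

    if (apr_thread_mutex_trylock(demo_cache_mutex) == APR_SUCCESS) {
        int hit = (seconds != 0 && seconds == demo_cache.key);
        if (hit) {
            memcpy(value, &demo_cache.value, sizeof(*value));
        }
        apr_thread_mutex_unlock(demo_cache_mutex);
        if (hit) {
            value->tm_usec = (apr_int32_t)apr_time_usec(t);
            return APR_SUCCESS;
        }
    }
    /* Lock busy (APR_EBUSY), trylock not implemented, or cache miss:
     * just compute the exploded time directly rather than blocking. */
    return apr_time_exp_lt(value, t);
}

Note that even a cache hit takes a process-wide mutex here; the per-element
CAS proposal quoted below avoids that shared lock.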



> On Tue, Dec 3, 2013 at 6:01 PM, Daniel Lescohier
> <daniel.lescohier@cbsi.com> wrote:
>
>> If the developers list is OK using apr_atomic in the server core, there
>> would be lots of advantages over trylock:
>>
>>   1. No need for child init.
>>   2. No need for function pointers.
>>   3. Could have a lock per cache element (I deemed it too expensive
>>      memory-wise to have a large mutex structure per cache element).
>>   4. It would avoid the problem of trylock not being implemented on all
>>      platforms.
>>   5. Fewer parameters to the function macro.
>>
>> The code would be like this:
>>
>> #define TIME_CACHE_FUNCTION(VALUE_SIZE, CACHE_T, CACHE_PTR, CACHE_SIZE_POWER,\
>>     CALC_FUNC, AFTER_READ_WORK\
>> )\
>>     const apr_int64_t seconds = apr_time_sec(t);\
>>     apr_status_t status;\
>>     CACHE_T * const cache_element = \
>>         &(CACHE_PTR[seconds & ((1<<CACHE_SIZE_POWER)-1)]);\
>>     /* seconds==0 can be confused with uninitialized cache; don't use cache */\
>>     if (seconds==0) return CALC_FUNC(value, t);\
>>     if (apr_atomic_cas32(&cache_element->lock, 1, 0)==0) {\
>>         if (seconds == cache_element->key) {\
>>             memcpy(value, &cache_element->value, VALUE_SIZE);\
>>             apr_atomic_dec32(&cache_element->lock);\
>>             AFTER_READ_WORK;\
>>             return APR_SUCCESS;\
>>         }\
>>         if (seconds < cache_element->key) {\
>>             apr_atomic_dec32(&cache_element->lock);\
>>             return CALC_FUNC(value, t);\
>>         }\
>>         apr_atomic_dec32(&cache_element->lock);\
>>     }\
>>     status = CALC_FUNC(value, t);\
>>     if (status == APR_SUCCESS) {\
>>         if (apr_atomic_cas32(&cache_element->lock, 1, 0)==0) {\
>>             if (seconds > cache_element->key) {\
>>                 cache_element->key = seconds;\
>>                 memcpy(&cache_element->value, value, VALUE_SIZE);\
>>             }\
>>             apr_atomic_dec32(&cache_element->lock);\
>>         }\
>>     }\
>>     return status;
>>
>> --------------------------------------------------
>>
>> typedef struct {
>>     apr_int64_t key;
>>     apr_uint32_t lock;
>>     apr_time_exp_t value;
>> } explode_time_cache_t;
>>
>> TIME_CACHE(explode_time_cache_t, explode_time_lt_cache,
>>            TIME_CACHE_SIZE_POWER)
>>
>> AP_DECLARE(apr_status_t) ap_explode_recent_localtime(
>>     apr_time_exp_t * value, apr_time_t t)
>> {
>>     TIME_CACHE_FUNCTION(
>>     sizeof(apr_time_exp_t), explode_time_cache_t, explode_time_lt_cache,
>>     TIME_CACHE_SIZE_POWER, apr_time_exp_lt,
>>     value->tm_usec = (apr_int32_t) apr_time_usec(t))
>> }
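
[To make the locking protocol above easier to follow outside the macro:
apr_atomic_cas32() returns the previous value of its target, so
apr_atomic_cas32(&cache_element->lock, 1, 0) == 0 means the lock word was 0
(free) and has just been atomically set to 1, and apr_atomic_dec32() releases
it back to 0.  Below is a condensed, single-element sketch of the same
protocol; the names (demo_entry, demo_cas_explode_lt) are hypothetical, and
the early return for older cached seconds is folded into the write-back
check.]

/* Hypothetical, condensed illustration of the CAS-guarded cache above. */
#include <string.h>
#include <stdio.h>
#include <apr_general.h>
#include <apr_atomic.h>
#include <apr_time.h>

static struct {
    apr_int64_t key;        /* cached whole second; 0 means "uninitialized" */
    apr_uint32_t lock;      /* 0 = free, 1 = held */
    apr_time_exp_t value;   /* cached exploded time for that second */
} demo_entry;

static apr_status_t demo_cas_explode_lt(apr_time_exp_t *value, apr_time_t t)
{
    const apr_int64_t seconds = apr_time_sec(t);
    apr_status_t status;

    /* Acquire: CAS 0 -> 1 succeeds only if the returned old value was 0. */
    if (seconds != 0 && apr_atomic_cas32(&demo_entry.lock, 1, 0) == 0) {
        if (seconds == demo_entry.key) {                  /* cache hit */
            memcpy(value, &demo_entry.value, sizeof(*value));
            apr_atomic_dec32(&demo_entry.lock);           /* release: 1 -> 0 */
            value->tm_usec = (apr_int32_t)apr_time_usec(t);
            return APR_SUCCESS;
        }
        apr_atomic_dec32(&demo_entry.lock);               /* release, slow path */
    }
    status = apr_time_exp_lt(value, t);                   /* compute */
    if (status == APR_SUCCESS
        && apr_atomic_cas32(&demo_entry.lock, 1, 0) == 0) {
        if (seconds > demo_entry.key) {                   /* only move forward */
            demo_entry.key = seconds;
            memcpy(&demo_entry.value, value, sizeof(demo_entry.value));
        }
        apr_atomic_dec32(&demo_entry.lock);
    }
    return status;
}

int main(void)
{
    apr_time_exp_t xt;
    apr_initialize();
    if (demo_cas_explode_lt(&xt, apr_time_now()) == APR_SUCCESS) {
        printf("local time: %02d:%02d:%02d\n", xt.tm_hour, xt.tm_min, xt.tm_sec);
    }
    apr_terminate();
    return 0;
}

On the hit path this touches only the element's own lock word, which is the
memory-footprint argument behind keeping one small lock per cache element
rather than a full mutex structure per element.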
