httpd-dev mailing list archives

From "Konstantin Chuguev" <Chug...@Clickstream.com>
Subject Re: "Better" mod_unique_id
Date Wed, 30 Apr 2008 08:46:21 GMT
Hi Ian,

Shame I wasn't aware of UUIDs. It looks like a very credible solution.  
RFC 4122 even defines a URN namespace for it. And it is provided on  
many platforms straight away. I think I'll stick to it until I find  
someone who convinces me it is not good for some reason.
Thanks a lot for the hint.
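
RFC 4122 registers the "urn:uuid:" namespace, so any 16-byte UUID (for instance the bytes filled in by apr_uuid_get) can be written out as a URN. Below is a minimal sketch in plain C, assuming the 16 raw bytes are already available. uuid_to_urn is a hypothetical helper written for illustration, not a function from APR or libuuid.

```c
#include <stddef.h>
#include <stdio.h>

/* "urn:uuid:" (9 chars) + 32 hex digits + 4 hyphens */
#define UUID_URN_LEN 45

/* Hypothetical helper: formats 16 raw UUID bytes as an RFC 4122 URN.
 * Returns 0 on success, -1 if the output buffer is too small. */
int uuid_to_urn(const unsigned char uuid[16], char *out, size_t out_size)
{
    int n;

    if (out_size < UUID_URN_LEN + 1)
        return -1;

    n = snprintf(out, out_size,
                 "urn:uuid:%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
                 "%02x%02x%02x%02x%02x%02x",
                 uuid[0], uuid[1], uuid[2], uuid[3],
                 uuid[4], uuid[5], uuid[6], uuid[7],
                 uuid[8], uuid[9], uuid[10], uuid[11],
                 uuid[12], uuid[13], uuid[14], uuid[15]);

    return n == UUID_URN_LEN ? 0 : -1;
}
```

Called with a buffer of at least 46 bytes, this writes a string of the form urn:uuid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6. In a real module the bytes would come from the UUID generator rather than from a literal.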

	Konstantin.


On 29 Apr 2008, at 10:53, Ian Holsman wrote:

> Hi Konstantin.
>
> I'm about to look at the same issue for my employer.
>
> For my version I was planning to use apr_uuid_get, which uses the
> uuid_create / uuid_generate functions to generate a unique value.
>
> have you looked at this function?
>
> regards
> Ian
>
> Konstantin Chuguev wrote:
>> Hi,
>>
>> I'm developing a solution generating unique IDs for the requests to  
>> websites that are not only clustered but also geographically  
>> dispersed. This implies the following:
>> - the website's virtual host section on each Apache server has the  
>> same ServerName which is mapped by DNS to different IP addresses  
>> using various methods, geo-proximity, round-robin, etc.
>> - the virtual host's IP address is normally but not necessarily *;
>> - the actual IP address the Apache listens to for this virtual host  
>> is normally, but not necessarily, an intranet address (behind a  
>> load balancer).
>>
>> After analysing the format of the ID generated by mod_unique_id,  
>> and reading the module's source code, I have a feeling that this  
>> module has serious flaws if used in my situation.
>> No offence to the authors, I'm sure the module serves its purpose  
>> just right for the majority of its users. But as it seems that it  
>> doesn't do this in my case, I thought I'd better ask if someone  
>> knows why.
>>
>> I understand that the module is relatively old and likely has been  
>> ported from a pre-2.0 version, when no APR library existed, and  
>> this might explain its design. I'd be glad if someone could either
>> confirm this or explain why it has been done like that.
>>
>> Now to the point of my question. The unique_id_rec structure that  
>> contains the binary representation of the unique ID consists of the  
>> following fields:
>>    unsigned int stamp;
>>    unsigned int in_addr;
>>    unsigned int pid;
>>    unsigned short counter;
>>    unsigned int thread_index;
>>
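
To make the size of that record concrete, here is a small sketch that mirrors the five fields listed above. It assumes unsigned int is 32 bits and unsigned short is 16 bits, hence the fixed-width types. The struct and function names are hypothetical and are not copied from the module's source.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the fields listed above, using fixed-width
 * types on the assumption of a 32-bit int and a 16-bit short. */
struct unique_id_fields {
    uint32_t stamp;         /* seconds-resolution timestamp */
    uint32_t in_addr;       /* server-side IPv4 address */
    uint32_t pid;           /* process ID */
    uint16_t counter;       /* per-second request counter */
    uint32_t thread_index;  /* derived from r->connection->id */
};

/* Sum of the individual field sizes, i.e. the size of the record when
 * the fields are packed back to back with no padding. */
size_t unique_id_packed_size(void)
{
    struct unique_id_fields r;

    return sizeof r.stamp + sizeof r.in_addr + sizeof r.pid +
           sizeof r.counter + sizeof r.thread_index;
}
```

Under these assumptions the packed size is 18 bytes, while sizeof(struct unique_id_fields) may be larger because the compiler is free to insert padding after the 16-bit counter.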
>> 1. Why use an unsigned int timestamp when apr_time_t exists, which is
>> 64 bits wide and seems to have at least 1-microsecond resolution?
>> Granted, the unsigned short counter helps when more than one request
>> reaches the same IP address / PID / thread within one second, but I
>> can still hardly see this as a better design.
>>
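
A 64-bit microsecond timestamp can be stored in network byte order without relying on a 64-bit variant of htonl, which standard C does not provide. The sketch below uses uint64_t as a stand-in for the timestamp so that it compiles without APR. put_be64 and get_be64 are hypothetical helper names.

```c
#include <stdint.h>

/* Hypothetical helper: writes v into 8 bytes, most significant byte
 * first (network byte order), regardless of host endianness. */
void put_be64(unsigned char out[8], uint64_t v)
{
    int i;

    for (i = 0; i < 8; i++)
        out[i] = (unsigned char)(v >> (56 - 8 * i));
}

/* Hypothetical helper: reads 8 big-endian bytes back into a 64-bit
 * value. */
uint64_t get_be64(const unsigned char in[8])
{
    uint64_t v = 0;
    int i;

    for (i = 0; i < 8; i++)
        v = (v << 8) | in[i];
    return v;
}
```

Because the most significant byte comes first, IDs that differ only in their timestamps compare in chronological order when compared byte by byte.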
>> 2. Why use unsigned int pid plus unsigned int thread_index when
>> long r->connection->id exists? thread_index is in fact produced by
>> doing htonl((unsigned int)r->connection->id), but MPMs seem to  
>> ensure the child_id is included there already! While it is just 4  
>> bytes long compared to the 8-byte pid/thread_index combination,  
>> still it is guaranteed to be unique among all worker threads of the  
>> Apache server in the system. And I don't think this particular  
>> field needs converting to the network byte order.
>>
>> 3. Using unsigned int in_addr with the server-side IPv4 address
>> works well only for a single cluster on an IPv4 network. What if
>> only IPv6 is being used in the intranet? What if multiple dispersed  
>> clusters with exactly the same intranet IP addressing schemes serve  
>> the same website? Please correct me if I'm wrong but I think the  
>> following structure would represent the unique website more  
>> correctly:
>> - union {struct in_addr, struct in6_addr} local_ip_addr: the IP  
>> address of the local side of the HTTP connection;
>> - union {struct in_addr, struct in6_addr} dns_ip_addr: one (any?)  
>> of the IP addresses that are mapped to the website's domain name in  
>> DNS. The latter can be omitted if the former IP address is public.
>>
>> Does anyone see any flaws in the design where the following  
>> structure is used?
>>    apr_time_t stamp;      // 8 bytes, converted to network byte order
>>    long connection_id;    // size depends on architecture: normally
>>                           // 4 or 8 bytes, doesn't need htonl
>>    // 4 to 16 bytes:
>>    union {struct in_addr, struct in6_addr} local_ip_addr;
>>    // 0 to 16 bytes (optional):
>>    [union {struct in_addr, struct in6_addr} dns_ip_addr;]
>>
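
The proposed record can be sketched in compilable C roughly as follows. This is an illustration under stated assumptions, not a finished design: the addresses are held as raw bytes with an explicit length instead of an unnamed union, int64_t stands in for apr_time_t, and every name (ip_addr, proposed_id, proposed_id_pack) is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The length doubles as the family tag: 4 for IPv4, 16 for IPv6,
 * 0 when the optional DNS address is omitted. */
struct ip_addr {
    unsigned char len;
    unsigned char bytes[16];  /* already in network byte order */
};

/* Hypothetical record mirroring the proposed fields. */
struct proposed_id {
    int64_t stamp;            /* stand-in for apr_time_t (microseconds) */
    long connection_id;       /* stand-in for r->connection->id */
    struct ip_addr local_ip;
    struct ip_addr dns_ip;    /* len == 0 means "not present" */
};

/* Packs the record into out. Returns the number of bytes written, or 0
 * if out is too small or an address length is invalid. */
size_t proposed_id_pack(const struct proposed_id *id,
                        unsigned char *out, size_t out_size)
{
    size_t need, pos = 0;
    uint64_t stamp = (uint64_t)id->stamp;
    int i;

    if ((id->local_ip.len != 4 && id->local_ip.len != 16) ||
        (id->dns_ip.len != 0 && id->dns_ip.len != 4 &&
         id->dns_ip.len != 16))
        return 0;

    need = 8 + sizeof id->connection_id + id->local_ip.len + id->dns_ip.len;
    if (out_size < need)
        return 0;

    for (i = 0; i < 8; i++)   /* timestamp in network byte order */
        out[pos++] = (unsigned char)(stamp >> (56 - 8 * i));

    /* Connection ID copied in host byte order, as proposed above. */
    memcpy(out + pos, &id->connection_id, sizeof id->connection_id);
    pos += sizeof id->connection_id;

    memcpy(out + pos, id->local_ip.bytes, id->local_ip.len);
    pos += id->local_ip.len;
    memcpy(out + pos, id->dns_ip.bytes, id->dns_ip.len);
    pos += id->dns_ip.len;

    return pos;
}
```

With an IPv4 local address and no DNS address, the packed record occupies 8 + sizeof(long) + 4 bytes. Since the record length then varies with the platform and the address families, a consumer may need the lengths encoded as well in order to split an ID back into its fields.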
>> Comments and suggestions are appreciated.
>>
>> Konstantin Chuguev
>> Software Developer
>>
>> Clickstream Technologies PLC, 58 Davies Street, London, W1K 5JF,  
>> Registered in England No. 3774129
>>
>>
>>
>

Konstantin Chuguev
Software Developer

Clickstream Technologies PLC, 58 Davies Street, London, W1K 5JF,  
Registered in England No. 3774129


