apr-dev mailing list archives

From Rici Lake <r...@ricilake.net>
Subject Re: LP64/P64 model API issue #2
Date Sat, 21 May 2005 15:27:20 GMT

On 21-May-05, at 8:04 AM, Wesley W. Garland wrote:

>> do we want (size_t) members of apr_table nelts values, or are we
>> happy to have them int?
>
> I'll vote size_t -- int suggests that it's possible for us to have
> negative-sized arrays, which strikes me as kind of silly. I've always
> been fond of size_t for nelts-type members, because when I write
> *static* arrays, I use this code pattern frequently:
>
> for (i = 0; i < (sizeof(array) / sizeof(array[0])); i++)
>   do_stuff(array[i]);
>
> Besides which, if it gets made into an unsigned quantity, it will
> clear up a *pile* of casts (to clear up signed/unsigned comparison
> warnings) in my code. ;)

In theory, I agree with you. In practice, I think there is a lot of
code out there which is aware that nelts is an int and would need to
have casts added.
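
As a rough illustration of the kind of caller code in question, here is
a sketch using stand-in types. demo_entry and demo_array_header are
hypothetical; they only mirror the nelts/elts layout under discussion
and are not the real APR structs. With nelts declared int, both loops
compile cleanly. If nelts became size_t, the first comparison would mix
signed and unsigned operands and would typically need a cast or a
change of index type, and the second loop would have to be restructured.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins mirroring the nelts/elts layout; not APR types. */
typedef struct {
    const char *key;
    const char *val;
} demo_entry;

typedef struct {
    int nelts;          /* signed, which is what existing callers assume */
    demo_entry *elts;
} demo_array_header;

/* Typical caller: an int index compared directly against nelts. */
size_t demo_total_value_length(const demo_array_header *arr)
{
    size_t total = 0;
    int i;

    for (i = 0; i < arr->nelts; i++) {
        total += strlen(arr->elts[i].val);
    }
    return total;
}

/* Reverse walk that relies on a signed index: with an unsigned index,
 * "i >= 0" would always be true and the loop would not terminate.
 * Keys are compared case-sensitively here, purely for brevity. */
const char *demo_last_value(const demo_array_header *arr, const char *key)
{
    int i;

    for (i = arr->nelts - 1; i >= 0; i--) {
        if (strcmp(arr->elts[i].key, key) == 0) {
            return arr->elts[i].val;
        }
    }
    return NULL;
}
```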

>> I think int is fine. If apr_table_t were rewritten to be scalable to
>> more than 2^31 elements, it would probably acquire a different
>> iterator interface.
>
> I'm not sure that's a valid concern, unless I'm missing something
> non-obvious. Are you worried only about the cost of running comp() more
> than two billion times? If so -- I would suggest that if you're
> storing that many records in an apr_table_t, and need to
> apr_table_do() over them regularly, that you have simply chosen the
> wrong data storage/abstraction and basically deserve what you get...

That's exactly my point. The expected use case for an apr_table is
storage of a relatively small number of <key, value> pairs where
both the key and the value are strings and moreover the keys are
case-insensitive and can appear multiple times. Furthermore, the
order of entries, at least for multiple matching keys, ought to be
preserved by iteration.

That's a use case which comes up a fair amount in servers, because
of the historical design based on email headers (a relatively
small number of <key: value> assignments where both the key and the
value ... etc.)
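
A minimal sketch of a container with those properties, assuming a
fixed-capacity array and ASCII keys. All of the demo_ names and the
DEMO_MAX_ENTRIES bound are made up for illustration; this is not how
apr_table_t is actually implemented.

```c
#include <ctype.h>
#include <stddef.h>

#define DEMO_MAX_ENTRIES 16   /* arbitrary small bound for the sketch */

typedef struct {
    const char *key;
    const char *val;
} demo_pair;

typedef struct {
    int nelts;
    demo_pair elts[DEMO_MAX_ENTRIES];
} demo_table;

/* Case-insensitive equality for ASCII keys. */
int demo_key_equal(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b)) {
            return 0;
        }
        a++;
        b++;
    }
    return *a == *b;   /* equal only if both strings ended together */
}

/* Append without replacing, so repeated keys are kept in insertion
 * order. Returns 0 on success, -1 if the table is full. The strings
 * are not copied. */
int demo_table_add(demo_table *t, const char *key, const char *val)
{
    if (t->nelts >= DEMO_MAX_ENTRIES) {
        return -1;
    }
    t->elts[t->nelts].key = key;
    t->elts[t->nelts].val = val;
    t->nelts++;
    return 0;
}

/* Linear scan; returns the first matching value, or NULL. */
const char *demo_table_get(const demo_table *t, const char *key)
{
    int i;

    for (i = 0; i < t->nelts; i++) {
        if (demo_key_equal(t->elts[i].key, key)) {
            return t->elts[i].val;
        }
    }
    return NULL;
}
```

A linear scan is perfectly adequate at header-sized counts, and an int
counter is nowhere near its limit.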

That could be done in other ways than simply storing the
<key, value> pairs as an array, but given that the expected
number of elements is small, it doesn't appear to be a priority.
If it were reimplemented using some variant of a hash table, it
may well be more efficient and capable of scaling to more than
2^31 elements, but:

1) The expected use case for apr_tables doesn't have that
requirement;

2) No-one, least of all me, seems to be clamouring for a datatype
which does have exactly those requirements; and

3) Such a datatype would probably have an iteration protocol
which was not based on extracting internal members (nelts,
elts).
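
To make point 3 concrete, here is a sketch of an iteration protocol
that never exposes the element count to callers. It is loosely modelled
on the callback style of apr_table_do() mentioned above, but every
sketch_ name and the exact signature are assumptions made for
illustration, not an existing or proposed APR interface.

```c
#include <stddef.h>

/* Hypothetical container. In a real library the struct body would live
 * in the .c file, so callers could not reach the count or the storage. */
typedef struct {
    const char *key;
    const char *val;
} sketch_pair;

typedef struct {
    size_t count;               /* internal; free to change type later */
    const sketch_pair *pairs;
} sketch_table;

/* Callback type: return nonzero to continue, zero to stop early. */
typedef int (*sketch_visit_fn)(void *rec, const char *key, const char *val);

/* Visits every pair in order. Returns 1 if all pairs were visited,
 * 0 if the callback stopped the walk. */
int sketch_table_each(const sketch_table *t, sketch_visit_fn visit, void *rec)
{
    size_t i;

    for (i = 0; i < t->count; i++) {
        if (!visit(rec, t->pairs[i].key, t->pairs[i].val)) {
            return 0;
        }
    }
    return 1;
}

/* Example callback: counts the pairs it is shown. */
int sketch_count_pairs(void *rec, const char *key, const char *val)
{
    (void)key;
    (void)val;
    ++*(size_t *)rec;
    return 1;
}
```

With an interface like this, the width and signedness of the counter
are a private detail, so the int versus size_t question never reaches
caller code.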

So my conclusion is that int is fine, because it is in wide use
and the theoretical nicety of using an unsigned and possibly more
capacious datatype does not reflect any real need.

