lucy-dev mailing list archives

From Marvin Humphrey <mar...@rectangular.com>
Subject Re: [lucy-dev] Str_Find return type
Date Thu, 22 Oct 2015 21:21:10 GMT
On Thu, Oct 22, 2015 at 8:06 AM, Nick Wellnhofer <wellnhofer@aevum.de> wrote:
> On 18/10/2015 14:01, Nick Wellnhofer wrote:

>>>> - Make `Find` return a size_t.
>>>>   Requires special value for "not found".
>>>
>>> Hmm, that's a toughie.  If the sentinel is SIZE_MAX, that might not fit in
>>> all host numeric types.
>>
>>
>> Good point. The problem is not so much the byte size of the return type but
>> the fact that a host language might not support unsigned integers. Maybe we
>> should limit string sizes to SSIZE_MAX and make `Find` return an ssize_t.
>> But this requires emulating the ssize_t type on non-POSIX platforms.
>
> Here's another idea. Most of the time, users of Str_Find don't care about
> the position of the substring and only want to know whether the substring is
> contained or not. For this use case, a method like Str_Contains returning a
> bool is a more appropriate interface.
>
> If someone is interested in the exact position of the substring, it might
> make more sense to return a string iterator pointing to the first occurrence
> of the substring. So what about:
>
>     public bool
>     Contains(String *self, String *substring);
>
>     public incremented nullable StringIterator*
>     Find(String *self, String *substring);

+1

You're absolutely right: avoiding an index which counts code points is
consistent with our iterator-centric model for string processing.  Good
insight, and nice API proposal!
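
To make the shape of the proposal concrete, here is a minimal sketch in plain C. It is not the Clownfish implementation: `SketchStr`, `SketchIter`, `Sketch_Find` and `Sketch_Contains` are hypothetical stand-ins, the iterator is reduced to a byte offset, and ownership is modelled with malloc/free instead of refcounting. It only illustrates that a nullable iterator return removes the need for a "not found" sentinel in an integer type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a string: UTF-8 bytes plus a byte length. */
typedef struct {
    const char *ptr;
    size_t      size;
} SketchStr;

/* Hypothetical stand-in for a string iterator: a position in a string. */
typedef struct {
    const SketchStr *string;
    size_t           byte_offset;
} SketchIter;

/* Return a newly allocated iterator at the first occurrence of
 * `substring`, or NULL if there is none.  The caller owns the result
 * and releases it with free(). */
static SketchIter*
Sketch_Find(const SketchStr *self, const SketchStr *substring) {
    assert(self != NULL && substring != NULL);
    if (substring->size > self->size) { return NULL; }

    size_t last_start = self->size - substring->size;
    for (size_t i = 0; i <= last_start; i++) {
        if (memcmp(self->ptr + i, substring->ptr, substring->size) == 0) {
            SketchIter *iter = malloc(sizeof(SketchIter));
            if (iter == NULL) { abort(); }  /* out of memory */
            iter->string      = self;
            iter->byte_offset = i;
            return iter;
        }
    }
    return NULL;
}

/* The yes/no query, expressed here in terms of Find for brevity. */
static bool
Sketch_Contains(const SketchStr *self, const SketchStr *substring) {
    SketchIter *iter  = Sketch_Find(self, substring);
    bool        found = (iter != NULL);
    free(iter);  /* free(NULL) is a no-op */
    return found;
}
```

Callers who only need a yes/no answer use the bool, and callers who need the position test the returned pointer against NULL, so no host language ever sees SIZE_MAX or a negative index.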

Marvin Humphrey
