From: Nick Wellnhofer
Date: Fri, 26 Apr 2013 13:10:28 +0200
To: dev@lucy.apache.org
Message-ID: <517A60A4.7040608@aevum.de>
References: <04F42080-EB5B-417D-BD19-A97540DA4F55@aevum.de>
Subject: Re: [lucy-dev] Proposal for implementation of immutable strings

On 26/04/2013 01:48, Marvin Humphrey wrote:
> Substrings of zombie strings are dangerous, because the buffer belonging to
> the parent object may not outlive the substring.

Right. This even applies to zombie strings alone. Consider assigning a zombie
string to a member var. This is currently done with

    self->var = CB_Clone(zstr);

With immutable strings, we don't have to create a copy and can simply write

    self->var = (String*)INCREF(zstr);

This will break with zombie strings.

> User-defined procedures will encounter ZombieStrings via wrapped callbacks --
> if a parameter is `String*` they'll get a real String with copied content from
> host argument, but if it's `const String*`, they'll get a ZombieString*
> wrapping the host string content.

That's a great solution. But this isn't implemented yet, right? It would also
require that String methods can be invoked on const Strings (like const member
functions in C++). Would this work without further changes?

> Unless we want to require that `SubString`
> operate on non-const String* (like we will for `Inc_RefCount`),
> ZStr_SubString() will have to return a fully independent String object which
> owns its own buffer.

That shouldn't be a problem. BTW, the INCREF macro should be changed so it
doesn't work with const objects, see example above.

>> For zombie strings, it's assumed that they don't have to care about the
>> lifetime of the character buffer. So there are two cases left out:
>
>> stack-allocated strings that own a buffer
>
> Can we make that an invalid state and avoid it?

Yes, we'll simply make the assumption that zombie strings never own a buffer.

> I think that's a good approach, but I have an ulterior motive -- I'm hoping
> that ultimately we end up with one class handling all encodings, a la
> .

That would only require a member var to store the encoding. But I don't quite
understand the rationale behind this. Does it have to do with the Python
bindings?

> PS: Is it now true that ZombieStrings can only ever be allocated on the stack,
> rather than in static memory? Because if that's the case, I'd favor the
> name StackString instead.

In ZombieKeyedHash, they're allocated from a MemPool. Otherwise, all allocations
seem to be from the stack.

Nick
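
To make the member-var assignment discussed above concrete, here is a minimal
sketch of why retaining a zombie string by reference breaks while cloning it
does not. It uses toy types, not the real Clownfish API: String, Str_wrap,
Str_clone, Str_incref, Str_decref, Holder and both setters are hypothetical
names invented for illustration. The sketch assumes the invariant agreed on in
this thread (a stack string never owns its buffer), and it assumes that the
incref function takes a non-const pointer so that a `const String*` argument
draws a compiler diagnostic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a string object; every name here is hypothetical. */
typedef struct String {
    unsigned    refcount;
    const char *ptr;          /* character data */
    size_t      size;         /* length in bytes */
    bool        owns_buffer;  /* true only for heap strings (assumed invariant) */
} String;

/* Wrap a caller-owned buffer without copying: the zombie/stack case. */
String
Str_wrap(const char *buf, size_t size) {
    String s = { 1, buf, size, false };
    return s;
}

/* Heap-allocate an independent copy that owns its buffer. */
String*
Str_clone(const String *src) {
    String *copy = malloc(sizeof(String));
    char   *buf  = malloc(src->size + 1);
    if (copy == NULL || buf == NULL) {
        free(copy);
        free(buf);
        return NULL;
    }
    memcpy(buf, src->ptr, src->size);
    buf[src->size] = '\0';
    copy->refcount    = 1;
    copy->ptr         = buf;
    copy->size        = src->size;
    copy->owns_buffer = true;
    return copy;
}

/* Deliberately takes a non-const pointer, so const strings cannot be retained. */
String*
Str_incref(String *self) {
    self->refcount++;
    return self;
}

/* Frees heap strings when the count drops to zero; stack strings are left alone. */
void
Str_decref(String *self) {
    if (--self->refcount == 0 && self->owns_buffer) {
        free((void*)self->ptr);
        free(self);
    }
}

typedef struct Holder {
    String *var;
} Holder;

/* A const parameter may be a stack string whose buffer dies with the caller,
 * so the only safe way to keep it is to clone. */
void
Holder_set_from_const(Holder *self, const String *zstr) {
    /* self->var = Str_incref(zstr);   <- diagnosed: discards const qualifier */
    self->var = Str_clone(zstr);
}

/* A non-const parameter is a real heap string, so sharing it is fine. */
void
Holder_set_from_owned(Holder *self, String *str) {
    self->var = Str_incref(str);
}
```

With this split, a caller that only has a `const String*` is forced through
the copying path, while a caller holding a heap string gets the cheap
refcount bump that immutable strings are meant to allow.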