incubator-lucy-dev mailing list archives

From Michael McCandless <>
Subject Re: threads
Date Mon, 30 Mar 2009 22:08:28 GMT
On Sun, Mar 29, 2009 at 7:29 PM, Marvin Humphrey <> wrote:
> On Sun, Mar 29, 2009 at 09:21:58AM -0400, Michael McCandless wrote:
>> I think concurrency is so bad that either 1) we're (the "royal we",
>> our species) gonna have to punt on micro-concurrency (not use it, ie
>> use only macro/strongly-divorced concurrency, like multi-process,
>> one-thread-does-whole-search),
> Ensuring the bare minimum level of thread support in Lucy for
> "strongly-divorced concurrency" won't be insanely difficult.  We just have to
> ensure that all globals are safe to access -- and there aren't that many
> stateful globals in our KS prototype.


>  * VTables are stateful.
>  * The VTable_registry Hash which is used to associate VTables with class
>    names is stateful.

What is this hash used for?  Does it map name -> VTable?  It seems
like most references within core could be done by the compiler/linker?
And when the host needs to refer to an object inside the core,
shouldn't the bindings be exposed however the host would normally do
it (eg as a Python module with bindings), also compiled/linked
statically?

>  * The Hash class uses shared keys.
>  * Debug and IndexReader have some stateful test-only globals that don't
>    matter.
> All other globals in KS are stateless, including the UNDEF object, format
> constants, string literals, and most importantly, offsets into the VTables
> used by the "inside out vtable" dispatch algo, which are stored in a bazillion
> individual global size_t variables.  The method offsets are all thread-safe,
> because they are constant per-core-compile.
> A little footnote about those offsets, though.  When I was examining
> the assembler generated by GCC for the method dispatch static inline
> functions, I found that it ignored the "const" part of "extern const size_t"
> and performed an ordinary dynamic load.  When we compile under pthreads, we
> should see whether GCC skimps on the optimization because it's worried that
> those "extern const" variables might change.


> That won't be an issue for core, because we will probably end up
> pound-defining the offset symbols as actual integer constants if LUCY_CORE is
> defined.  But it might matter for compiled extensions if they run tight loops.


> The VTables themselves are stateful because they are refcounted.
> Furthermore, dynamically created VTables are reclaimed once the last object
> which needs them goes away -- so the "MyScorer" VTable at one point in the
> program might not be the same as the "MyScorer" VTable at another point in
> the program.

This is neat: so you can make a new subclass at runtime, instantiate a
bunch of objects off of it, and once all the objects are gone and
nobody directly refers to the subclass (vtable), it gets reclaimed.
Can you define new vtables from the host language?

> However, if we stop refcounting VTables (by making Inc_RefCount and
> Dec_RefCount no-ops) and accept that dynamically created VTables will leak,
> then those concerns go away.  I think this would be a reasonable approach.
> The leak ain't gonna matter unless somebody does something psycho like cycle
> through lots of unique class names.

That seems tentatively OK.  What do you see dynamic vtables being used
for in Lucy?

> That leaves the VTable_registry Hash, which is the hard part.
> One option is to write a lock-free hash implementation.  Cliff Click of Azul
> did a presentation at Google on a Java version.  The crucial requirement is an
> atomic pointer-compare-and-swap.  It'd take some work to set up probes for
> that to achieve portability, but it's probably doable.
> Another option is to throw a mutex around every read and write op on
> VTable_registry.
> A third option is to lock writes only but disallow deletions or displacements.
> Rehashing (where we double the table size and redistribute entries) is
> problematic; we'd probably have to keep all old versions around.
> The shared-keys aspect of Hash was a recent change, and we can just junk it
> in favor of per-object keys.

OK it sounds like "macro threading" should work fine.

So then we don't have to worry about individual objects being thread
safe as long as we can ensure threads never share objects.

>> In any event, if Lucy will not allow more than one thread in the core
>> at once, it still must mate properly with the threads running in the
>> host.  EG you can't just up and call a Python API; you have to
> register your thread with it (likewise for JNI), release/reacquire
>> Python's global lock when crossing the bridge, etc.
> Let's avoid that mess by allowing more than one thread in the Lucy core.


>> > For indexing, I think we should make it possible to support concurrency using
>> > multiple indexer *objects*.  Whether those multiple indexer objects get used
>> > within a single multi-threaded app, or within separate processes shouldn't be
>> > important.  However, I think it's very important that we not *require* threads
>> > to exploit concurrency at index-time.
>> Interesting... so you could take the "strongly divorced" approach,
>> yet, allow multiple such "worlds" inside a single Lucy process?  This
>> seems like a nice middle ground.  And mmap is the basis for sharing.
>> Python allows this too.  You can create multiple interpreter "worlds",
>> and each freely runs independent of the other (separate global lock).
> I agree -- this sounds like our best option.
> I think separate "worlds" should be good enough for indexing, provided that our
> multi-writer concurrency plans work out in practice as they have in theory.

It might get weird if the host doesn't also maintain separate worlds,
eg the host could create object X in Lucy but then check back later
against the wrong world and find object X gone.

> Then there's search-time, where the multi-world approach isn't always
> adequate.  Here, things get tricky because we have to worry about the
> statefulness of objects that could be operating in multiple search threads.
> I think we'll be generally OK if we do most of our scoring at the segment
> level.  Individual SegReaders don't share much within a larger PolyReader, so
> Scorers derived from those SegReaders won't either.

OK even for concurrency within a single search, it sounds like?

> The main classes I'd be worried about right now in KS are Analyzers, which are
> cached per-FieldSpec in Schema.  It turns out that one of the main Analyzers
> in KS, Tokenizer, is stateful because it uses a Perl regex object which is
> stateful.  If for some reason a custom Scorer requests to use an Analyzer from
> a Schema instance (which is shared by all the SegReaders), it wouldn't be
> safe.


> Then there's refcounting.  For most objects operating in a per-segment scoring
> thread, it wouldn't be necessary to synchronize on Inc_RefCount and
> Dec_RefCount, but I don't see how we'd be able to pick and choose our sync
> points.

It sounds like we either 1) try to guarantee threads won't share
objects, in which case you don't need to lock in the object nor in
incref/decref, or 2) allow for the possibility of sharing.  (And in
either case we need the known stateful globals to handle multiple
threads safely.)

You could use atomic increment/decrement, though I'm unsure how costly
those really are at the CPU level.

>> Say lots of searches need to run concurrently (call this "macro"
>> concurrency, vs the "micro" concurrency below where more than one
>> thread works on a single search): will Lucy allow a host to send in N
>> threads doing their own searching, and allow those threads to run
>> concurrently (if the C platform supports real threads)?
> Yeah, that should work fine pending resolution of the issues surrounding the
> remaining stateful globals.


