incubator-lucy-dev mailing list archives

From Marvin Humphrey <>
Subject Re: threads
Date Sun, 29 Mar 2009 23:29:03 GMT
On Sun, Mar 29, 2009 at 09:21:58AM -0400, Michael McCandless wrote:

> I think concurrency is so bad that either 1) we're (the "royal we",
> our species) gonna have to punt on micro-concurrency (not use it, ie
> use only macro/strongly-divorced concurrency, like multi-process,
> one-thread-does-whole-search), 

Ensuring the bare minimum level of thread support in Lucy for
"strongly-divorced concurrency" won't be insanely difficult.  We just have to
ensure that all globals are safe to access -- and there aren't that many
stateful globals in our KS prototype.

  * VTables are stateful.
  * The VTable_registry Hash which is used to associate VTables with class
    names is stateful.
  * The Hash class uses shared keys.
  * Debug and IndexReader have some stateful test-only globals that don't
    matter outside the test suite.

All other globals in KS are stateless, including the UNDEF object, format
constants, string literals, and most importantly, offsets into the VTables
used by the "inside out vtable" dispatch algo, which are stored in a bazillion
individual global size_t variables.  The method offsets are all thread-safe
because they are constant once the core has been compiled.
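
To make that concrete, here's a stripped-down toy of the dispatch scheme.  The
type layout, names, and offset below are invented for illustration and don't
match the real KS source:

    #include <stddef.h>
    #include <stdio.h>

    typedef struct Obj Obj;
    typedef size_t (*Obj_hash_t)(Obj *self);

    /* Toy vtable: a class name followed by method pointers. */
    typedef struct VTable {
        const char *name;
        Obj_hash_t  hash;
    } VTable;

    struct Obj {
        VTable *vtable;
    };

    /* The per-method offset lives in a global size_t.  It gets written once
     * and never changes, which is why it's thread-safe. */
    const size_t Obj_Hash_OFFSET = offsetof(VTable, hash);

    /* "Inside out" dispatch: load the function pointer from the object's
     * vtable at the global offset, then call it. */
    static inline size_t
    Obj_Hash(Obj *self) {
        char *const address     = (char*)self->vtable + Obj_Hash_OFFSET;
        Obj_hash_t const method = *(Obj_hash_t*)address;
        return method(self);
    }

    static size_t
    toy_hash(Obj *self) {
        return (size_t)self;
    }

    int
    main(void) {
        VTable vtable = { "Obj", toy_hash };
        Obj    obj    = { &vtable };
        printf("%lu\n", (unsigned long)Obj_Hash(&obj));
        return 0;
    }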

A little footnote about those offsets, though.  When I was examining
the assembler generated by GCC for the method dispatch static inline
functions, I found that it ignored the "const" part of "extern const size_t"
and performed an ordinary dynamic load.  When we compile under pthreads, we
should see whether GCC skimps on the optimization because it's worried that
those "extern const" variables might change.

That won't be an issue for core, because we will probably end up
pound-defining the offset symbols as actual integer constants if LUCY_CORE is
defined.  But it might matter for compiled extensions if they run tight loops.
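
Something along these lines is what I have in mind -- the symbol name and the
literal value are made up:

    /* Hypothetical sketch: core code sees a literal the compiler can fold,
     * while separately compiled extensions keep the extern const variable. */
    #ifdef LUCY_CORE
      #define Obj_Hash_OFFSET 24
    #else
      #include <stddef.h>
      extern const size_t Obj_Hash_OFFSET;
    #endif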

The VTables themselves are stateful because they are refcounted.
Furthermore, dynamically created VTables are reclaimed once the last object
which needs them goes away -- so the "MyScorer" VTable at one point in the
program might not be the same as the "MyScorer" VTable at another point in
the program.

However, if we stop refcounting VTables (by making Inc_RefCount and
Dec_RefCount no-ops) and accept that dynamically created VTables will leak,
then those concerns go away.  I think this would be a reasonable approach.
The leak ain't gonna matter unless somebody does something psycho like cycle
through lots of unique class names.
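
In other words, something like this sketch (function names hypothetical), where
the VTable refcounting hooks degenerate to no-ops and dynamically created
VTables simply become immortal:

    typedef struct VTable VTable;   /* opaque for the sketch */

    VTable*
    VTable_inc_refcount(VTable *self) {
        return self;        /* no-op: don't bother tracking the count */
    }

    void
    VTable_dec_refcount(VTable *self) {
        (void)self;         /* no-op: never destroy, accept the leak */
    }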

That leaves the VTable_registry Hash, which is the hard part.

One option is to write a lock-free hash implementation.  Cliff Click of Azul
did a presentation at Google on a Java version.  The crucial requirement is an
atomic pointer-compare-and-swap.  It'd take some work to set up probes for
that to achieve portability, but it's probably doable.
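
As a rough sketch of the portability shim, with placeholder HAS_* symbols
standing in for whatever a configure-time probe would define:

    #include <stdbool.h>
    #if defined(HAS_WINDOWS_INTERLOCKED)
      #include <windows.h>
    #endif

    /* Atomically swap new_val into *target iff *target still equals old_val.
     * Returns true if the swap happened. */
    static inline bool
    lucy_atomic_cas_ptr(void *volatile *target, void *old_val, void *new_val) {
    #if defined(HAS_GCC_SYNC_CAS)
        return __sync_bool_compare_and_swap(target, old_val, new_val);
    #elif defined(HAS_WINDOWS_INTERLOCKED)
        return InterlockedCompareExchangePointer(target, new_val, old_val)
               == old_val;
    #else
      #error "No atomic compare-and-swap probe succeeded on this platform"
    #endif
    }

The probes would just try to compile and run each branch and define the
matching symbol.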

Another option is to throw a mutex around every read and write op on the
VTable_registry Hash.
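
Sketched out, that's just a process-global pthread mutex bracketing the
registry calls.  The Hash_* and VTable_* prototypes below are stand-ins and
won't exactly match whatever the final API looks like:

    #include <pthread.h>
    #include <stddef.h>
    #include <string.h>

    typedef struct Hash   Hash;      /* opaque stand-ins for the real types */
    typedef struct Obj    Obj;
    typedef struct VTable VTable;

    extern Hash *VTable_registry;
    Obj*  Hash_Fetch_Str(Hash *self, const char *key, size_t key_len);
    void  Hash_Store_Str(Hash *self, const char *key, size_t key_len, Obj *val);
    const char* VTable_get_name(VTable *self);

    static pthread_mutex_t VTable_registry_lock = PTHREAD_MUTEX_INITIALIZER;

    /* Every lookup takes the lock... */
    VTable*
    VTable_fetch_vtable(const char *class_name, size_t len) {
        pthread_mutex_lock(&VTable_registry_lock);
        VTable *vtable
            = (VTable*)Hash_Fetch_Str(VTable_registry, class_name, len);
        pthread_mutex_unlock(&VTable_registry_lock);
        return vtable;
    }

    /* ... and so does every registration. */
    void
    VTable_register(VTable *vtable) {
        const char *name = VTable_get_name(vtable);
        pthread_mutex_lock(&VTable_registry_lock);
        Hash_Store_Str(VTable_registry, name, strlen(name), (Obj*)vtable);
        pthread_mutex_unlock(&VTable_registry_lock);
    }

Simple, but every reader pays for the lock even though the registry is almost
always read-only.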

A third option is to lock writes only but disallow deletions or displacements.
Rehashing (where we double the table size and redistribute entries) is
problematic; we'd probably have to keep all old versions around.
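
Roughly, the data layout for that might look like this (everything below is
hypothetical): entries are append-only, writers serialize on a mutex, and an
outgrown bucket array gets chained onto a retired list instead of freed, so a
reader holding the old pointer can finish its probe safely:

    #include <pthread.h>
    #include <stddef.h>

    typedef struct Entry {
        const char   *key;
        void         *value;
        struct Entry *next;        /* bucket chain: append-only, never unlinked */
    } Entry;

    typedef struct BucketArray {
        Entry              **buckets;
        size_t               num_buckets;
        struct BucketArray  *retired_next;  /* older arrays, kept alive forever */
    } BucketArray;

    typedef struct WriteLockedHash {
        BucketArray     *current;     /* readers load this and probe, lock-free */
        BucketArray     *retired;     /* head of the list of outgrown arrays    */
        size_t           num_entries;
        pthread_mutex_t  write_lock;  /* serializes stores and rehashes         */
    } WriteLockedHash;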

The shared-keys aspect of Hash was a recent change, and we can just junk it
in favor of per-object keys.

> or 2) move to better dynamic languages
> that are largely declarative such that the runtime has the freedom to
> use micro-concurrency as it sees fit.

For now, an academically interesting point.  :)  

> In any event, if Lucy will not allow more than one thread in the core
> at once, it still must mate properly with the threads running in the
> host.  EG you can't just up and call a Python API; you have to
> register your thread with it (likewise for JNI), release/reacquire
> Python's global lock when crossing the bridge, etc.  

Let's avoid that mess by allowing more than one thread in the Lucy core.

> > For indexing, I think we should make it possible to support concurrency using
> > multiple indexer *objects*.  Whether those multiple indexer objects get used
> > within a single multi-threaded app, or within separate processes shouldn't be
> > important.  However, I think it's very important that we not *require* threads
> > to exploit concurrency at index-time.
> Interesting... so you could take the "strongly divorced" approach,
> yet, allow multiple such "worlds" inside a single Lucy process?  This
> seems like a nice middle ground.  And mmap is the basis for sharing.
> Python allows this too.  You can create multiple interpreter "worlds",
> and each freely runs independent of the other (separate global lock).

I agree -- this sounds like our best option.

I think separate "worlds" should be good enough for indexing, provided that our
multi-writer concurrency plans work out in practice as they have in theory.

Then there's search-time, where the multi-world approach isn't always
adequate.  Here, things get tricky because we have to worry about the
statefulness of objects that could be operating in multiple search threads.  

I think we'll be generally OK if we do most of our scoring at the segment
level.  Individual SegReaders don't share much within a larger PolyReader, so
Scorers derived from those SegReaders won't either.  

The main classes I'd be worried about right now in KS are Analyzers, which are
cached per-FieldSpec in Schema.  It turns out that one of the main Analyzers
in KS, Tokenizer, is stateful because it uses a Perl regex object which is
stateful.  If for some reason a custom Scorer requests to use an Analyzer from
a Schema instance (which is shared by all the SegReaders), it wouldn't be safe
for multiple search threads to run that Analyzer at once.

Then there's refcounting.  For most objects operating in a per-segment scoring
thread, it wouldn't be necessary to synchronize on Inc_RefCount and
Dec_RefCount, but I don't see how we'd be able to pick and choose our sync
points.
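
To make the tradeoff concrete, synchronizing everything would mean paying for
an atomic op on every single Inc/Dec, shared object or not -- something like
this sketch, with a hypothetical Obj layout and the GCC __sync builtins
standing in for whatever portable atomics we'd probe for:

    #include <stddef.h>
    #include <stdlib.h>

    typedef struct Obj {
        void   *vtable;
        size_t  refcount;
    } Obj;

    /* Stand-in destructor; the real thing would dispatch through the vtable. */
    static void
    Obj_destroy(Obj *self) {
        free(self);
    }

    Obj*
    Obj_inc_refcount(Obj *self) {
        __sync_add_and_fetch(&self->refcount, 1);    /* atomic bump */
        return self;
    }

    void
    Obj_dec_refcount(Obj *self) {
        /* Atomic decrement; last one out destroys the object. */
        if (__sync_sub_and_fetch(&self->refcount, 1) == 0) {
            Obj_destroy(self);
        }
    }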

> Say lots of searches need to run concurrently (call this "macro"
> concurrency, vs the "micro" concurrency below where more than one
> thread works on a single search): will Lucy allow a host to send in N
> threads doing their own searching, and allow those threads to run
> concurrently (if the C platform supports real threads)?

Yeah, that should work fine pending resolution of the issues surrounding the
remaining stateful globals.

Marvin Humphrey
