lucene-dev mailing list archives

From "Mikhail Khludnev (Commented) (JIRA)" <>
Subject [jira] [Commented] (SOLR-3011) DIH MultiThreaded bug
Date Tue, 13 Mar 2012 19:58:40 GMT


Mikhail Khludnev commented on SOLR-3011:


bq. So it seems that for this to work, not only does the core (DocBuilder etc) need to be
thread-safe, but every component in a given DIH configuration needs to be also.

I find that statement doubtful. I believe it's possible to have a bunch of thread-unsafe classes
synchronized by some smart orchestrator.
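
To illustrate the orchestrator idea, here is a minimal sketch, not DIH code. It assumes a hypothetical thread-unsafe component (UnsafeTransformer) and a hypothetical Orchestrator that gives each worker thread its own instance, so the component itself never needs locking. Thread confinement is only one possible coordination strategy; all names here are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Supplier;

// Hypothetical stand-in for a thread-unsafe component: mutable scratch state, no locking.
class UnsafeTransformer {
    private final StringBuilder scratch = new StringBuilder(); // must not be shared across threads

    String transform(String row) {
        scratch.setLength(0);
        scratch.append(row).reverse();
        return scratch.toString();
    }
}

// Hypothetical orchestrator: owns the threads and confines one component instance to each.
class Orchestrator {
    private final int threads;
    private final Supplier<UnsafeTransformer> factory;

    Orchestrator(int threads, Supplier<UnsafeTransformer> factory) {
        this.threads = threads;
        this.factory = factory;
    }

    List<String> run(List<String> rows) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        // Each worker thread lazily creates its own private transformer.
        ThreadLocal<UnsafeTransformer> perThread = ThreadLocal.withInitial(factory);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String row : rows) {
                Callable<String> task = () -> perThread.get().transform(row);
                futures.add(pool.submit(task));
            }
            // Collect results in submission order.
            List<String> out = new ArrayList<>();
            for (Future<String> f : futures) {
                out.add(f.get());
            }
            return out;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

In this sketch only the orchestrator knows about threads; the component stays a plain single-threaded class.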

bq. There also is quite a bit of code duplication in DocBuilder and classes

Yep, agreed: ThrdEPWrapper is a FullImport-only duplicate of the DocBuilder code.

bq. Mikhail, you've just noticed that MockDataSource was not designed to test a multi-threaded
scenario in a valid fashion.

Not really, they are just odd mocks. With a real DS, every query gives you a full result set from
the beginning, but once you reach EOF in MockDS's result set, re-querying gets you the same EOF.
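
The difference can be sketched as follows. This is a simplified illustration, not the actual DIH DataSource or MockDataSource API: the RowSource interface and both implementations are hypothetical, written only to show a source that restarts on every query next to one that keeps handing back the same exhausted iterator.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical, simplified data source interface (not the real DIH DataSource).
interface RowSource {
    Iterator<String> getData(String query);
}

// Behaves like a real data source: every query starts a new result set from the beginning.
class FreshRowSource implements RowSource {
    private final List<String> rows;

    FreshRowSource(List<String> rows) {
        this.rows = rows;
    }

    public Iterator<String> getData(String query) {
        return rows.iterator();
    }
}

// Behaves like the odd mock: one iterator is created once and returned for every query,
// so after it reaches EOF, re-querying yields the same EOF.
class SharedIteratorMock implements RowSource {
    private final Iterator<String> shared;

    SharedIteratorMock(List<String> rows) {
        this.shared = rows.iterator();
    }

    public Iterator<String> getData(String query) {
        return shared;
    }
}

class RowCounter {
    // Consumes the iterator and returns how many rows it produced.
    static int drain(Iterator<String> it) {
        int n = 0;
        while (it.hasNext()) {
            it.next();
            n++;
        }
        return n;
    }
}
```

Draining two consecutive queries from FreshRowSource yields the full row count both times, while the second query against SharedIteratorMock yields nothing.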

bq. Take a look at TestDocBuilderThreaded.

I've never seen it actually.

bq. 1. Keep 3.x as-is, and make any quick fixes to threads for common use-cases there, as

No quick fix for any "common" use case is possible, I'm sure.

bq. 2. In 4.0 (or a separate branch), remove threading from DIH.

I suggest the opposite way:
* be honest with users and remove "threads" from 3.6. Zero impact here: nobody uses it, because it
just doesn't work.
* I have also already spent an enormous effort fixing it in 4.0, and I hope to complete the
fix anyway (it will live on GitHub at least). Btw, the reason I am fixing 4.0 is SOLR-2382;
I actually waited some time for it to be completed.

bq. 4. Make DocBuilder, etc threadsafe. 5. Create a marker interface or annotation

I don't see how that is possible or how it would be really helpful.

bq.  The SOLR-3011 patches work on 4.x .. But I can probably help with porting (some of?)
this patch back to 3.x.

Petr found a case where the patch doesn't work. After (if) I get that fixed, all commits around SOLR-2382
can be cherry-picked to 3.x. Porting the fix without DIHCacheSupport will take more time.

In addition to my opposite proposals above, I think we really need to start designing a new Ultimate
DIH. I propose:
# to pick use cases (you are experienced in extreme caching, I did throughput maximization
via an async producer-consumer, Peter will give us his cases, etc.)
# to sketch a design in PlantUML and check that it's bulletproof
# cut in 
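
For reference, the async producer-consumer use case mentioned in the first item could look roughly like the sketch below. It is an assumption-laden illustration, not code from any SOLR-3011 patch: AsyncPipeline, the bounded queue capacity, and the "doc:" prefix standing in for indexing are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical pipeline: one thread reads rows, another turns them into documents.
class AsyncPipeline {
    // End-of-stream sentinel, compared by identity so no real row can collide with it.
    private static final String POISON = new String("EOF");

    static List<String> run(List<String> sourceRows, int capacity) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(capacity);
        List<String> indexed = new ArrayList<>(); // written only by the consumer thread

        Thread producer = new Thread(() -> {
            try {
                for (String row : sourceRows) {
                    queue.put(row); // blocks when the consumer falls behind
                }
                queue.put(POISON);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread consumer = new Thread(() -> {
            try {
                for (String row = queue.take(); row != POISON; row = queue.take()) {
                    indexed.add("doc:" + row); // stand-in for building and indexing a document
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join(); // join makes the consumer's writes visible to the caller
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return indexed;
    }
}
```

The bounded queue is the design choice that matters here: it lets reading and indexing overlap while keeping memory use capped when one side is slower than the other.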

> DIH MultiThreaded bug
> ---------------------
>                 Key: SOLR-3011
>                 URL:
>             Project: Solr
>          Issue Type: Sub-task
>          Components: contrib - DataImportHandler
>    Affects Versions: 3.5, 4.0
>            Reporter: Mikhail Khludnev
>            Priority: Minor
>             Fix For: 4.0
>         Attachments: SOLR-3011.patch, SOLR-3011.patch, patch-3011-EntityProcessorBase-iterator.patch,
> The current DIH design is not thread-safe; see the last comments on SOLR-2382 and SOLR-2947.
I'm going to provide a patch that makes the DIH core thread-safe. It is mostly the SOLR-2947 patch from
28th Dec.
