lucene-dev mailing list archives

From "Mikhail Khludnev (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-3011) DIH MultiThreaded bug
Date Tue, 13 Mar 2012 19:58:40 GMT

    [ https://issues.apache.org/jira/browse/SOLR-3011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228649#comment-13228649 ]

Mikhail Khludnev commented on SOLR-3011:
----------------------------------------

James,

bq. So it seems that for this to work, not only does the core (DocBuilder etc) need to be
thread-safe, but every component in a given DIH configuration needs to be also.

To me that is a doubtful statement. I believe it's possible to have a bunch of thread-unsafe
classes synchronized by some smart orchestrator.
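
To illustrate what I mean by an orchestrator, here is a minimal sketch. It is not DIH code: UnsafeRowSource and Orchestrator are hypothetical names invented for this example, and the only assumption is that all locking lives in one place while the components themselves stay thread-unsafe.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class OrchestratorSketch {

    /** Hypothetical stand-in for a component that is NOT thread-safe. */
    static class UnsafeRowSource {
        private int next = 0;
        private final int limit;

        UnsafeRowSource(int limit) { this.limit = limit; }

        /** Returns the next row id, or -1 at end of data. Unsafe to call concurrently. */
        int nextRow() { return next < limit ? next++ : -1; }
    }

    /** Hypothetical orchestrator: the only class that knows about locking. */
    static class Orchestrator {
        private final UnsafeRowSource source;
        private final List<Integer> indexed = new ArrayList<>(); // also not thread-safe
        private final Object lock = new Object();

        Orchestrator(UnsafeRowSource source) { this.source = source; }

        int pull() { synchronized (lock) { return source.nextRow(); } }

        void push(int row) { synchronized (lock) { indexed.add(row); } }

        int indexedCount() { synchronized (lock) { return indexed.size(); } }
    }

    /** Runs the given number of workers and returns how many rows were handed over. */
    static int run(int rows, int threads) {
        Orchestrator orchestrator = new Orchestrator(new UnsafeRowSource(rows));
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                int row;
                while ((row = orchestrator.pull()) >= 0) {
                    // Per-row transformation work would run here, outside the lock.
                    orchestrator.push(row);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return orchestrator.indexedCount();
    }

    public static void main(String[] args) {
        System.out.println(run(10_000, 4)); // prints 10000
    }
}
```

The workers never touch the source or the result list directly, so neither class needs to be thread-safe on its own.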

bq. There also is quite a bit of code duplication in DocBuilder and classes

Yep, agreed: ThrdEPWrapper is a full-import-only duplicate of the DocBuilder code.

bq. Mikhail, you've just noticed that MockDataSource was not designed to test a multi-threaded
scenario in a valid fashion.

Not really, they are just odd mocks. With a real DS you get a full result set from the beginning
every time you query, but once you reach EOF in MockDS's result set, re-querying gets you the same EOF.
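
The difference can be shown with a small sketch. Both classes below are hypothetical simplifications written for this example, not the actual DataSource or MockDataSource code; the assumption is only that the mock hands out one shared iterator while a real source builds a fresh one per query.

```java
import java.util.Iterator;
import java.util.List;

class DataSourceSemanticsSketch {

    /** Behaves like a real data source: every query restarts from the first row. */
    static class FreshDataSource {
        private final List<String> rows;

        FreshDataSource(List<String> rows) { this.rows = rows; }

        Iterator<String> getData(String query) { return rows.iterator(); }
    }

    /** Behaves like the mock described above: one iterator shared by all queries. */
    static class SharedIteratorMock {
        private final Iterator<String> shared;

        SharedIteratorMock(List<String> rows) { this.shared = rows.iterator(); }

        Iterator<String> getData(String query) { return shared; }
    }

    /** Consumes the iterator and returns how many rows it still had. */
    static int drain(Iterator<String> it) {
        int count = 0;
        while (it.hasNext()) {
            it.next();
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> rows = List.of("row1", "row2", "row3");

        FreshDataSource real = new FreshDataSource(rows);
        System.out.println("fresh: " + drain(real.getData("q")) + ", " + drain(real.getData("q")));
        // prints "fresh: 3, 3"

        SharedIteratorMock mock = new SharedIteratorMock(rows);
        System.out.println("mock: " + drain(mock.getData("q")) + ", " + drain(mock.getData("q")));
        // prints "mock: 3, 0"
    }
}
```

With several threads issuing the same query, a source of the second kind would split the rows between them instead of giving each thread the full set.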

bq. Take a look at TestDocBuilderThreaded.

I've never seen it actually.

bq. 1. Keep 3.x as-is, and make any quick fixes to threads for common use-cases there, as
possible.

No quick fixes are possible for any "common" use cases, I'm sure.

bq. 2. In 4.0 (or a separate branch), remove threading from DIH.

I suggest the opposite way:
* be honest with users and remove "threads" from 3.6. Zero impact here: nobody uses it, it
just doesn't work.
* I have already spent an enormous effort on fixing it in 4.0, and I hope to complete the
fix anyway (it will live on GitHub at least). Btw, the reason I am fixing 4.0 is SOLR-2382;
I actually waited some time for it to be completed.

bq. 4. Make DocBuilder, etc threadsafe. 5. Create a marker interface or annotation

I don't see how that is possible or how it would really help.

bq.  The SOLR-3011 patches work on 4.x .. But I can probably help with porting (some of?)
this patch back to 3.x.

Petr found a case where the patch doesn't work. After (if) I get it done, all commits around SOLR-2382
can be cherry-picked to 3.x. Porting the fix without DIHCacheSupport will take more time.

In addition to my opposite proposals above, I think we really need to start designing a new
Ultimate DIH. I propose:
# pick use cases (you are experienced in extreme caching, I did throughput maximization
via async producer-consumer, Peter will give us his cases, etc.)
# sketch a design in PlantUML and check that it's bulletproof
# cut in
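
For reference, the async producer-consumer shape mentioned in the first item can be sketched roughly as below. This is a generic illustration and not the code I actually used; the queue size, the POISON sentinel and the class name are all assumptions made for the example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

class ProducerConsumerSketch {

    private static final int POISON = -1; // sentinel that tells a consumer to stop

    /** One producer feeds row ids to several consumers; returns the number of rows consumed. */
    static int run(int rows, int consumers) {
        // A bounded queue keeps the producer from running far ahead of the consumers.
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(64);
        AtomicInteger processed = new AtomicInteger();

        Thread[] workers = new Thread[consumers];
        for (int i = 0; i < consumers; i++) {
            workers[i] = new Thread(() -> {
                try {
                    while (queue.take() != POISON) {
                        processed.incrementAndGet(); // indexing work would go here
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }

        try {
            for (int row = 0; row < rows; row++) {
                queue.put(row); // blocks while the queue is full
            }
            for (int i = 0; i < consumers; i++) {
                queue.put(POISON); // one sentinel per consumer
            }
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return processed.get();
    }

    public static void main(String[] args) {
        System.out.println(run(1_000, 3)); // prints 1000
    }
}
```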


 
                
> DIH MultiThreaded bug
> ---------------------
>
>                 Key: SOLR-3011
>                 URL: https://issues.apache.org/jira/browse/SOLR-3011
>             Project: Solr
>          Issue Type: Sub-task
>          Components: contrib - DataImportHandler
>    Affects Versions: 3.5, 4.0
>            Reporter: Mikhail Khludnev
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-3011.patch, SOLR-3011.patch, patch-3011-EntityProcessorBase-iterator.patch,
patch-3011-EntityProcessorBase-iterator.patch
>
>
> The current DIH design is not thread-safe; see the last comments at SOLR-2382 and SOLR-2947.
> I'm going to provide a patch that makes the DIH core thread-safe. Mostly it's the SOLR-2947
> patch from 28th Dec.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

