lucene-dev mailing list archives

From "Fuad Efendi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-2233) DataImportHandler - JdbcDataSource is not thread safe
Date Tue, 31 May 2011 22:44:47 GMT

    [ https://issues.apache.org/jira/browse/SOLR-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13041884#comment-13041884 ]

Fuad Efendi commented on SOLR-2233:
-----------------------------------

Hi Frank, thanks for the patch; unfortunately it is not thread safe. If you don't mind, let
me continue working on this: I want to use an internal connection pool (if a JNDI data source
is not available).

My initial patch already contains *too much*; the new one will remove ResultSetIterator and
make the code much simpler to understand (and multithreaded). The code shouldn't have any
dependency on rare *optionally supported* patterns such as ResultSet.TYPE_FORWARD_ONLY;
READ_ONLY should be managed differently (and it is hard to manage if the data size is huge
and the data is concurrently updated while we are importing it).
A possible solution could be to call connection.close() after reading each single record
(with the initial query returning only the PKs of the records), but that would be the next
step. I wrote the initial patch for a production system where complex 10-query-based
documents (about 500k docs) took many hours to import; now it takes only about 40 minutes.
(And what happens if we have a network problem while we are in the middle of an Iterator?)

Thanks
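
The "internal connection pool" idea mentioned above could be sketched roughly as follows. This
is only an illustration under stated assumptions, not the code from any of the attached
patches: SimplePool, withResource and idleCount are hypothetical names, and the pool is
generic so that it runs without a JDBC driver (T would be java.sql.Connection in practice).

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical fixed-size pool; T would be java.sql.Connection in a JDBC setting. */
final class SimplePool<T> {
    private final BlockingQueue<T> idle;

    SimplePool(int size, Supplier<T> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get());
        }
    }

    /** Borrows one resource, runs the work on it, and always returns it to the pool. */
    <R> R withResource(Function<T, R> work) {
        final T resource;
        try {
            resource = idle.take();      // blocks until a resource is free
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a resource", e);
        }
        try {
            return work.apply(resource); // only this thread holds the resource here
        } finally {
            idle.add(resource);          // hand it back instead of closing it
        }
    }

    /** Number of resources currently not borrowed. */
    int idleCount() {
        return idle.size();
    }
}
```

Each worker thread borrows its own resource for the duration of one unit of work, so no
thread can close a connection that another thread is still reading from. Real JDBC work
throws SQLException, so an actual version would need a callback type that permits checked
exceptions.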

> DataImportHandler - JdbcDataSource is not thread safe
> -----------------------------------------------------
>
>                 Key: SOLR-2233
>                 URL: https://issues.apache.org/jira/browse/SOLR-2233
>             Project: Solr
>          Issue Type: Bug
>          Components: contrib - DataImportHandler
>    Affects Versions: 1.5
>            Reporter: Fuad Efendi
>         Attachments: FE-patch.txt, SOLR-2233-JdbcDataSource.patch, SOLR-2233-JdbcDataSource.patch,
>                      SOLR-2233.patch
>
>
> Whenever Thread A spends more than 10 seconds on a Connection (by retrieving records
> in a batch), Thread B will close the connection.
> Related exceptions happen when we use the "threads=" attribute for an entity; usually the
> exception stack contains the message "connection already closed".
> It shouldn't happen with some JNDI data sources, where Connection.close() simply returns
> the Connection to a pool of available connections, but we might get different errors.
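
The failure mode described in the issue can be illustrated with a small simulation. This is
a sketch under assumptions, not the actual JdbcDataSource code: FakeConnection and
SharedConnectionHolder are hypothetical classes, the 10-second timeout is taken from the
description above, and the clock is passed in explicitly so the behaviour is deterministic.

```java
/** Hypothetical stand-in for a JDBC connection; it only tracks open/closed state. */
final class FakeConnection {
    private boolean closed;

    void close() {
        closed = true;
    }

    boolean isClosed() {
        return closed;
    }
}

/** Hypothetical shared holder that replaces its connection after a timeout. */
final class SharedConnectionHolder {
    static final long TIMEOUT_MS = 10_000;

    private FakeConnection conn = new FakeConnection();
    private long lastHandedOut;

    SharedConnectionHolder(long nowMs) {
        lastHandedOut = nowMs;
    }

    /** Returns the shared connection, replacing it first if the timeout has elapsed. */
    synchronized FakeConnection get(long nowMs) {
        if (nowMs - lastHandedOut > TIMEOUT_MS) {
            conn.close();                // an earlier caller may still be using it
            conn = new FakeConnection();
        }
        lastHandedOut = nowMs;
        return conn;
    }
}
```

If "Thread A" calls get(0) and is still reading its batch when "Thread B" calls
get(11_000), the object held by Thread A is closed underneath it. Making get()
synchronized does not help, because the connection is used outside the lock.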

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


