hadoop-mapreduce-issues mailing list archives

From "Kannan Rajah (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MAPREDUCE-6237) DBRecordReader is not thread safe
Date Sat, 31 Jan 2015 00:12:38 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-6237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kannan Rajah updated MAPREDUCE-6237:
------------------------------------
    Description: 
DBInputFormat.createDBRecordReader reuses a single JDBC connection across instances of DBRecordReader.
This is not a good idea. We should create a separate connection for each reader. If performance
is a concern, we should use connection pooling instead.

I looked at DBOutputFormat.getRecordReader. It creates a new Connection object for each
DBRecordReader. So can we just change DBInputFormat to create a new Connection every time?
The connection reuse code was added as part of the fix for the connection leak bug in
MAPREDUCE-1443. Is there any reason for caching the connection?
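
To make the concern concrete, here is a minimal, self-contained sketch of one way a shared connection can go wrong: one reader closing the connection while another still holds it. It is an illustration only, not the Hadoop code and not necessarily the failure seen in this issue. The names ConnectionPerReaderSketch, Reader and fakeConnection are hypothetical, and the Connection is an in-memory fake so the example runs without a database.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.SQLException;
import java.util.function.Supplier;

public class ConnectionPerReaderSketch {

    /** Hypothetical stand-in for DBRecordReader: holds a connection and closes it when done. */
    static final class Reader implements AutoCloseable {
        private final Connection conn;

        Reader(Connection conn) {
            this.conn = conn;
        }

        /** A reader can only fetch rows while its connection is still open. */
        boolean usable() throws SQLException {
            return !conn.isClosed();
        }

        @Override
        public void close() throws SQLException {
            conn.close();
        }
    }

    /** Builds a fake Connection that only supports close() and isClosed(). */
    static Connection fakeConnection() {
        final boolean[] closed = {false};
        return (Connection) Proxy.newProxyInstance(
                Connection.class.getClassLoader(),
                new Class<?>[] {Connection.class},
                (proxy, method, args) -> {
                    switch (method.getName()) {
                        case "close":
                            closed[0] = true;
                            return null;
                        case "isClosed":
                            return closed[0];
                        default:
                            throw new UnsupportedOperationException(method.getName());
                    }
                });
    }

    /** Cached-connection pattern: both readers hold the same Connection object. */
    static boolean secondReaderUsableWhenShared() throws SQLException {
        Connection shared = fakeConnection();
        Reader first = new Reader(shared);
        Reader second = new Reader(shared);
        first.close();            // also closes the connection the second reader depends on
        return second.usable();
    }

    /** Proposed pattern: every reader asks the factory for its own Connection. */
    static boolean secondReaderUsableWhenSeparate(Supplier<Connection> factory) throws SQLException {
        Reader first = new Reader(factory.get());
        Reader second = new Reader(factory.get());
        first.close();            // only affects the first reader's connection
        return second.usable();
    }

    public static void main(String[] args) throws SQLException {
        System.out.println("shared:   " + secondReaderUsableWhenShared());
        System.out.println("separate: "
                + secondReaderUsableWhenSeparate(ConnectionPerReaderSketch::fakeConnection));
    }
}
```

Running main prints "shared:   false" and "separate: true". In real code the Supplier could wrap either a plain DriverManager.getConnection call or a connection pool, which would keep the per-reader isolation while addressing the performance concern.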



> DBRecordReader is not thread safe
> ---------------------------------
>
>                 Key: MAPREDUCE-6237
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6237
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 2.5.0
>            Reporter: Kannan Rajah
>            Assignee: Kannan Rajah



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
