spark-reviews mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [spark] cozos edited a comment on issue #25899: [SPARK-29089][SQL] Parallelize blocking FileSystem calls in DataSource#checkAndGlobPathIfNecessary
Date Sat, 26 Oct 2019 00:24:17 GMT
cozos edited a comment on issue #25899: [SPARK-29089][SQL] Parallelize blocking FileSystem
calls in DataSource#checkAndGlobPathIfNecessary
URL: https://github.com/apache/spark/pull/25899#issuecomment-546550712
 
 
   @steveloughran Sorry for the delay; I've been busy the past couple of weeks.
   
   Here are the results for the test with various values of the s3a connection maximum and the number of threads:
   
   ```
   ______________________________________________________________________
   | Type            | s3a conn max      | Num Threads| Runtime(seconds)|
   |====================================================================|
   | Flat Paths      | 40                | 10         | 24.08           |
   | Flat Paths      | 40                | 20         | 12.07           |
   | Flat Paths      | 40                | 40         | 6.63            |
   | Flat Paths      | 40                | 60         | 6.94            |
   | Flat Paths      | 40                | 80         | 6.58            |
   | Flat Paths      | 40                | 100        | 8.24            |
   | Flat Paths      | 40                | 150        | 7.19            |
   | Flat Paths      | 40                | 200        | 6.24            |
   | Flat Paths      | 300               | 10         | 19.39           |
   | Flat Paths      | 300               | 20         | 10.16           |
   | Flat Paths      | 300               | 40         | 6.78            |
   | Flat Paths      | 300               | 60         | 6.34            |
   | Flat Paths      | 300               | 80         | 6.94            |
   | Flat Paths      | 300               | 100        | 5.35            |
   | Flat Paths      | 300               | 150        | 5.96            |
   | Flat Paths      | 300               | 200        | 6.78            |
   | Glob Paths      | 40                | 10         | 37.28           |
   | Glob Paths      | 40                | 20         | 4.74            |
   | Glob Paths      | 40                | 40         | 3.81            |
   | Glob Paths      | 40                | 60         | 4.17            |
   | Glob Paths      | 40                | 80         | 3.41            |
   | Glob Paths      | 40                | 100        | 3.01            |
   | Glob Paths      | 40                | 150        | 3.08            |
   | Glob Paths      | 40                | 200        | 2.63            |
   | Glob Paths      | 300               | 10         | 4.59            |
   | Glob Paths      | 300               | 20         | 3.26            |
   | Glob Paths      | 300               | 40         | 3.46            |
   | Glob Paths      | 300               | 60         | 2.62            |
   | Glob Paths      | 300               | 80         | 2.32            |
   | Glob Paths      | 300               | 100        | 2.45            |
   | Glob Paths      | 300               | 150        | 4.61            |
   | Glob Paths      | 300               | 200        | 2.5             |
   | Single glob path| 40                | 10         | 44.02           |
   | Single glob path| 40                | 20         | 38.54           |
   | Single glob path| 40                | 40         | 33.25           |
   | Single glob path| 40                | 60         | 34.83           |
   | Single glob path| 40                | 80         | 36.2            |
   | Single glob path| 40                | 100        | 34.94           |
   | Single glob path| 40                | 150        | 46.32           |
   | Single glob path| 40                | 200        | 35.36           |
   | Single glob path| 300               | 10         | 31.33           |
   | Single glob path| 300               | 20         | 35.35           |
   | Single glob path| 300               | 40         | 36.4            |
   | Single glob path| 300               | 60         | 34.7            |
   | Single glob path| 300               | 80         | 35.1            |
   | Single glob path| 300               | 100        | 33.87           |
   | Single glob path| 300               | 150        | 35.61           |
   | Single glob path| 300               | 200        | 37.25           |
     FileSystem org.apache.hadoop.fs.s3a.S3AFileSystem: 0 bytes read, 0 bytes written, 21232 read ops, 0 large read ops, 0 write ops
   ```
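
To make the table easier to compare, here is a small sketch that computes, for each series, the fastest run and the speedup of each thread count relative to the 10-thread run. The runtimes are copied from the table above; the names `THREADS`, `SECONDS`, `speedups`, and `best` are my own for illustration and are not part of the PR or the test harness.

```python
# Thread counts used in every series of the table above.
THREADS = [10, 20, 40, 60, 80, 100, 150, 200]

# Runtimes in seconds, copied from the table, keyed by (type, s3a conn max).
SECONDS = {
    ("Flat Paths", 40): [24.08, 12.07, 6.63, 6.94, 6.58, 8.24, 7.19, 6.24],
    ("Flat Paths", 300): [19.39, 10.16, 6.78, 6.34, 6.94, 5.35, 5.96, 6.78],
    ("Glob Paths", 40): [37.28, 4.74, 3.81, 4.17, 3.41, 3.01, 3.08, 2.63],
    ("Glob Paths", 300): [4.59, 3.26, 3.46, 2.62, 2.32, 2.45, 4.61, 2.5],
    ("Single glob path", 40): [44.02, 38.54, 33.25, 34.83, 36.2, 34.94, 46.32, 35.36],
    ("Single glob path", 300): [31.33, 35.35, 36.4, 34.7, 35.1, 33.87, 35.61, 37.25],
}


def speedups(seconds, baseline_index=0):
    """Return {threads: baseline_runtime / runtime} for one series."""
    base = seconds[baseline_index]
    return {t: base / s for t, s in zip(THREADS, seconds)}


def best(seconds):
    """Return (threads, runtime) for the fastest run in one series."""
    return min(zip(THREADS, seconds), key=lambda pair: pair[1])


for (kind, conn_max), seconds in SECONDS.items():
    threads, runtime = best(seconds)
    ratio = speedups(seconds)[threads]
    print(f"{kind} (conn max {conn_max}): fastest at {threads} threads, "
          f"{runtime:.2f}s, {ratio:.2f}x vs 10 threads")
```

These are single measurements per configuration, so small differences between neighboring rows may just be noise.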
   
   


