flink-issues mailing list archives

From StephanEwen <...@git.apache.org>
Subject [GitHub] flink pull request: [FLINK-1350] [runtime] Add blocking result par...
Date Thu, 12 Mar 2015 12:59:30 GMT
Github user StephanEwen commented on the pull request:

    https://github.com/apache/flink/pull/471#issuecomment-78474538
  
    Concerning the questions:
    
    1.  I think deploying after all blocking producers are finished is what we should go for
as a start. It is also what people would expect from a blocking model.
    
    2.  This is a fair initial restriction. Let's relax that later; I can see benefits in
that when dealing with tasks that cannot be deployed due to a lack of resources.
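
    To illustrate point 1, here is a minimal sketch of "deploy the consumer only after all blocking producers are finished". The class `BlockingConsumerGate` and the producer ids are hypothetical names chosen for illustration; this is not the scheduler code in the pull request.

```python
class BlockingConsumerGate:
    """Hypothetical helper: tracks the unfinished blocking producers of one consumer."""

    def __init__(self, producer_ids):
        # Producers whose blocking result is not yet complete.
        self._pending = set(producer_ids)
        self.deployed = False

    def producer_finished(self, producer_id):
        """Mark one producer as finished.

        Returns True only for the call that makes the consumer deployable.
        """
        self._pending.discard(producer_id)
        if not self._pending and not self.deployed:
            self.deployed = True
            return True
        return False


gate = BlockingConsumerGate(["map-0", "map-1", "map-2"])
# Producers may finish in any order; only the last one triggers deployment.
results = [gate.producer_finished(p) for p in ["map-0", "map-2", "map-1"]]
```

    The gate stays closed while any producer is outstanding, which is the behavior people would expect from a blocking model.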
    
    A few questions:
    
    The pull request generifies the IOManager and uses asynchronous disk I/O for the intermediate
result spilling. Is there any evidence (measurements or prior experience) that this helps performance in this case?
I am curious, because the async I/O in the hash join / sorters was tricky enough. The interaction
between asynchronous disk I/O and asynchronous network I/O must be very tricky. I think there
should be a good reason to do this; otherwise we simply introduce error-prone code for a completely
unknown benefit.
    
    The asynchronous writing seems straightforward. For the reading / sending part:
      - When do you issue the read requests to the reader (from disk)? Is that dependent on
when the TCP channel is writable?
      - After a read request is issued and before the response arrives, is the subpartition de-registered
from Netty and then re-registered once a buffer has returned from disk?
      - Given many spilled partitions, which one is read from next? How is the buffer assignment
realized? There is a lot of trickiness in there, because disk I/O performs well with longer
sequential reads, but those may occupy many buffers that are then missing for other reads into writable
TCP channels.
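
    To make the buffer trade-off in the last question concrete, a small back-of-the-envelope sketch. It assumes a fixed pool of equally sized buffers shared by all spilled subpartitions; the pool size and read lengths are illustrative numbers, not values from the pull request.

```python
def concurrent_readers(total_buffers, buffers_per_read):
    """How many spilled subpartitions can have a read in flight at once,
    if every read request claims `buffers_per_read` buffers from a shared pool."""
    if buffers_per_read <= 0:
        raise ValueError("buffers_per_read must be positive")
    return total_buffers // buffers_per_read


POOL = 64  # assumed pool size, in buffers

# Longer sequential reads are friendlier to the disk, but fewer
# subpartitions can be served at the same time.
for read_len in (1, 8, 32):
    print(read_len, concurrent_readers(POOL, read_len))
```

    With these assumed numbers, moving from 1-buffer to 32-buffer reads reduces the number of subpartitions that can be read concurrently from 64 to 2, so a writable TCP channel may have to wait for buffers.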
    
    
    Can you elaborate on the mechanism behind this? I expect this to have quite an impact
on the reliability of the mechanism and the performance.
    
    *IMPORTANT*: @tillrohrmann fixed an issue in the Asynchronous Channel Readers
/ Writers a few weeks back. Are we sure that this fix is not undone by the changes here?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
