spark-dev mailing list archives

From "Mendelson, Assaf" <>
Subject structured streaming documentation does not match behavior
Date Thu, 15 Jun 2017 17:46:37 GMT
I have started experimenting with Structured Streaming, and the documentation (the Structured
Streaming Programming Guide) does not seem to match the behavior I am seeing.
The documentation lists maxFilesPerTrigger (as well as latestFirst) as options
for the File sink. In practice, at least maxFilesPerTrigger does not seem to have any
real effect there. The streaming source (readStream), on the other hand, has no documentation
for this option, yet it does limit the number of files when the option is set on it.
This behavior actually makes more sense than the documentation: I expect the file reader
to define how files are read, not the sink (e.g. if I used a Kafka sink or a foreach
sink, they should still get the same behavior from the reading side).
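
For illustration, here is a minimal PySpark sketch of the setup described above, with the
options placed on the reader rather than on the sink. The helper names (build_file_stream,
start_console_query), the option values, the JSON input format and the console sink are
assumptions chosen for the example, not something taken from the guide.

```python
# Options are set on the reader (readStream), which is where the limiting
# behavior described above was observed. The values are illustrative.
SOURCE_OPTIONS = {
    "maxFilesPerTrigger": "1",  # intended: at most one new file per trigger
    "latestFirst": "false",     # intended: pick up older files first
}


def build_file_stream(spark, input_dir, schema):
    """Return a streaming DataFrame that reads JSON files from input_dir.

    `spark` is an existing SparkSession; `schema` is the schema of the
    input files (file-based streaming sources need one by default).
    """
    reader = spark.readStream.schema(schema)
    for key, value in SOURCE_OPTIONS.items():
        reader = reader.option(key, value)
    return reader.json(input_dir)


def start_console_query(stream_df):
    """Start a query on the console sink; no file options are set here."""
    return stream_df.writeStream.format("console").outputMode("append").start()
```

With an existing SparkSession and a schema, calling
start_console_query(build_file_stream(spark, "/path/to/input", schema)) would start the
query. Swapping the console sink for another sink should not change how the files are read.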

