impala-dev mailing list archives

From Tim Armstrong
Subject Re: Cancellation logic in HdfsScanners
Date Tue, 16 Jan 2018 17:05:57 GMT
ScannerContext::cancelled() == true means that the scan has completed,
either because it has returned enough rows, because the query is cancelled,
or because it hit an error.

RuntimeState::is_cancelled() == true means that the query is cancelled.

So there are cases where ScannerContext::cancelled() == true and
RuntimeState::is_cancelled() is false, e.g. where there's a limit on the scan.
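
A minimal C++ sketch of the distinction above. QueryState and ScanState are
hypothetical stand-ins, not Impala's actual RuntimeState and ScannerContext,
and the way cancellation reaches the scan (OnQueryCancelled()) is an
assumption made only for illustration. It shows that the per-scan flag can be
set by a limit or an error while the query-wide flag stays false.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the query-wide flag (not Impala's RuntimeState).
class QueryState {
 public:
  bool is_cancelled() const { return cancelled_.load(); }
  void Cancel() { cancelled_.store(true); }

 private:
  std::atomic<bool> cancelled_{false};
};

// Hypothetical stand-in for the per-scan flag (not Impala's ScannerContext).
class ScanState {
 public:
  // A negative limit means "no limit".
  explicit ScanState(int64_t limit) : limit_(limit) {}

  // True once the scan is finished for any reason: limit reached,
  // error, or query cancellation.
  bool done() const { return done_.load(); }

  // Records returned rows and finishes the scan once the limit is met.
  void AddRows(int64_t n) {
    rows_returned_ += n;
    if (limit_ >= 0 && rows_returned_ >= limit_) done_.store(true);
  }

  // An error finishes the scan without cancelling the query-wide flag.
  void SetError() { done_.store(true); }

  // Assumed propagation hook: called when the query is cancelled.
  void OnQueryCancelled() { done_.store(true); }

 private:
  const int64_t limit_;
  int64_t rows_returned_ = 0;
  std::atomic<bool> done_{false};
};

// Usage sketch: a scan with LIMIT 3 finishes while the query is still live.
inline void LimitExample() {
  QueryState query;
  ScanState scan(/*limit=*/3);
  scan.AddRows(3);
  assert(scan.done());             // the scan is finished...
  assert(!query.is_cancelled());   // ...but the query was never cancelled
}
```

In this sketch a scanner loop that only polls the query-wide flag would keep
running after the limit is reached, which is one reason to check the per-scan
flag instead.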

I think the name of ScannerContext::cancelled() is misleading; it should
probably be called "done()" to match HdfsScanNode::done(). More generally,
the cancellation logic could probably be cleaned up and simplified further.

On Mon, Jan 15, 2018 at 6:20 PM, Quanlong Huang wrote:

> Hi all,
> I'm confused about the cancellation logic in HDFS scanners. There are two
> functions to detect cancellation: ScannerContext::cancelled() and
> RuntimeState::is_cancelled().
> When MT_DOP is not set (i.e. MT_DOP=0), ScannerContext::cancelled() will
> return HdfsScanNode::done(). However, the field done_ in HdfsScanNode seems
> to be set according to the status returned from scanners.
> I've seen cases where RuntimeState::is_cancelled() is true but
> ScannerContext::cancelled() is false.
> My question is why scanners don't use RuntimeState::is_cancelled() to
> detect cancellation, which is more timely than using
> ScannerContext::cancelled(). There must be some detailed reasons that I've
> missed. Would you be so kind as to answer my question?
> Thanks,
> Quanlong
