druid-commits mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [druid] eeren-mosaic opened a new issue #11348: index_parallel not creating subtask for a specific interval
Date Tue, 22 Jun 2021 10:12:36 GMT

eeren-mosaic opened a new issue #11348:
URL: https://github.com/apache/druid/issues/11348

   ### Affected Version
   ### Description
   To provide some background: we run a batch ingestion from datasource A into datasource
A_tmp, a kafka_indexer also ingests into A_tmp, and we then run index_parallel
again to copy A_tmp back into A.
   However, we are seeing odd behavior: with a granularity of 'MONTH', the task does not
generate a subtask for one specific interval (2019-12-01 to 2020-01-01 in this case).
We have the exact same setup in a different environment, which does not have this issue.

   The only notable difference in the index_parallel log is the line below, which appears
in the run that failed:
   `ParallelIndexPhaseRunner - There's no input split to process`
   In the other environment with the same setup, a subtask for the interval
2019-12-01 to 2020-01-01 is generated and submitted.
   We can see segments for 2019-12-01 to 2021-01-01 being generated in the _tmp datasource.

   No errors/exceptions observed in historicals/coordinator. 
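
One possible way to narrow this down is to compare the MONTH buckets of the requested interval against the segment intervals that actually exist in the input datasource in each environment. The sketch below is only an illustration of that comparison, not Druid's split logic: the function names are hypothetical, and the segment intervals are assumed to have been collected separately (for example from the coordinator) as `(start, end)` date pairs.

```python
from datetime import date


def month_buckets(start, end):
    """Yield (bucket_start, bucket_end) calendar-month intervals covering [start, end)."""
    cur = date(start.year, start.month, 1)
    while cur < end:
        # Roll over to January of the next year after December.
        nxt = date(cur.year + (cur.month == 12), cur.month % 12 + 1, 1)
        yield cur, nxt
        cur = nxt


def buckets_without_segments(start, end, segment_intervals):
    """Return the month buckets in [start, end) that no segment interval overlaps.

    segment_intervals is a list of (start, end) date pairs with exclusive ends;
    it is hypothetical input gathered outside this script.
    """
    missing = []
    for b_start, b_end in month_buckets(start, end):
        overlaps = any(s < b_end and e > b_start for s, e in segment_intervals)
        if not overlaps:
            missing.append((b_start, b_end))
    return missing


if __name__ == "__main__":
    # Made-up example data: no segment covers December 2019.
    segments = [
        (date(2019, 11, 1), date(2019, 12, 1)),
        (date(2020, 1, 1), date(2020, 2, 1)),
    ]
    for b_start, b_end in buckets_without_segments(
        date(2019, 11, 1), date(2020, 2, 1), segments
    ):
        print(f"no segments found for {b_start}/{b_end}")
```

Running the same check against the segment lists from both environments would show whether the 2019-12-01 to 2020-01-01 bucket is visible as input in one and not the other.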

