hudi-commits mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [incubator-hudi] amitsingh-10 commented on issue #1335: [SUPPORT] HoodieDeltaStreamer Kafka offset reset config not being read
Date Sat, 15 Feb 2020 13:10:43 GMT
amitsingh-10 commented on issue #1335: [SUPPORT] HoodieDeltaStreamer Kafka offset reset config
not being read
URL: https://github.com/apache/incubator-hudi/issues/1335#issuecomment-586588934
 
 
   Okay, so I added some debug logging to the `KafkaOffsetGen#getNextOffsetRanges` function.
What I found was that Hudi already had a previous checkpoint registered. However, I am still
trying to understand why it restarted the sync with a starting offset of 0 after checking for
a valid offset. What was also interesting is what happened when I ran the following snippet:
   ```
   fromOffsets.entrySet().forEach(entry -> {
     LOG.debug(entry.getKey().topic() + "-" + entry.getKey().partition() + " -> " + entry.getValue());
   });
   ```
   It printed nothing, which, as far as I understand, means that the `fromOffsets` map was empty.
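
A silent loop does not by itself distinguish an empty map from a logger whose DEBUG level is disabled, so an explicit emptiness check can make this kind of diagnosis less ambiguous. The following is a minimal, self-contained sketch of that idea, not Hudi code: the class `OffsetDebug`, the method `describe`, and the topic name `events` are hypothetical, and a plain `Map<Integer, Long>` (partition to offset) stands in for the `fromOffsets` map so the example runs without Kafka on the classpath.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class OffsetDebug {

  // Hypothetical helper: renders a partition -> offset map as a single line
  // and reports explicitly when the map is null or empty.
  public static String describe(String topic, Map<Integer, Long> fromOffsets) {
    if (fromOffsets == null || fromOffsets.isEmpty()) {
      return "fromOffsets is empty for topic " + topic;
    }
    // Copy into a TreeMap so partitions are printed in ascending order.
    return new TreeMap<>(fromOffsets).entrySet().stream()
        .map(e -> topic + "-" + e.getKey() + " -> " + e.getValue())
        .collect(Collectors.joining(", "));
  }

  public static void main(String[] args) {
    Map<Integer, Long> offsets = new HashMap<>();
    // Prints: fromOffsets is empty for topic events
    System.out.println(describe("events", offsets));

    offsets.put(1, 42L);
    offsets.put(0, 7L);
    // Prints: events-0 -> 7, events-1 -> 42
    System.out.println(describe("events", offsets));
  }
}
```

In the real method, the returned string could be passed to a log call at a level known to be enabled, so that an empty map produces a visible message rather than no output at all.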
   
   I am still trying to understand whether the checkpoint was registered by an earlier
`spark-submit` run that succeeded while I was experimenting with other properties, or whether
Hudi somehow registered the checkpoint in spite of a failed run.
   
   For now, deleting the entire folder in the S3 bucket, which also deleted the metadata, made it work.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
