spark-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From GitBox <...@apache.org>
Subject [GitHub] [spark] HeartSaVioR commented on issue #25577: [WIP][CORE][SPARK-28867] InMemoryStore checkpoint to speed up replay log file in HistoryServer
Date Wed, 28 Aug 2019 04:25:48 GMT
HeartSaVioR commented on issue #25577: [WIP][CORE][SPARK-28867] InMemoryStore checkpoint to
speed up replay log file in HistoryServer
URL: https://github.com/apache/spark/pull/25577#issuecomment-525578717
 
 
   Thanks for pointing out about "recover live entities". I've commented in your comment in
the doc, but elaborate again to get some guides about what we should do from experts for event
logging part.
   
   I thought `live*` fields are working as "cache" (code comment made me a bit confused).
If it's designed as a "cache" instead of playing as truth of source, it should also load from
KVStore if the entity is not in cache, and fill the cache if found. Looks like it's not: AppStatusListener
doesn't seek KVStore if it can't find the entity from live* fields.
   
   Some part of SQLAppStatusListener deals with that, though it doesn't deal with cache behavior
entirely.
   https://github.com/apache/spark/blob/8848af263570a82a985037a95468efff050c61b0/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLAppStatusListener.scala#L79-L101
   
   It seems that AppStatusListener always assumes KVStore is empty and we're replaying all
events. As we're breaking the assumption, at least as of now I'd say it's a bug and we should
fix the bug, regardless of how to snapshot - once we load KVStore and provide to AppStatusListener
we should fix.
   
   "initLiveEntities" in your PR seems to deal with this, so hopefully you can create smaller
PR to fix the bug first. (You could test the functionality without snapshotting, as once you
know the type of entities you could retrieve entities from KVStore.) 
   
   One thing I would like to modify is, I don't think we should initialize only for certain
condition. We should just initialize all the time for startup of AppStatusListener. It should
be fast enough if KVStore is empty, and it must be done regardless of flags if KVStore is
not empty - correctness matters, speed doesn't matter here. SQLAppStatusListener may also
need to be fixed as well.
   
   Once liveEntities can be restored from KVStore, it would also work for incremental replaying,
as well as incremental snapshotting (replaying some more events and snapshotting again). `live`
flag makes thing a bit complicated due to "maybeUpdate" and "liveUpdate", but AppStatusListener
guarantees to flush these live entities to KVStore eventually, at least when ElementTrackingStore
is closed. I guess it would be pretty much helpful if ElementTrackingStore provides "flush"
to force flush without closing it.
   
   @vanzin 
   As you're expert on this area, could you go through this and provide guidance on the approach?

   
   Feels like we could deal with below steps:
   1. address live entities initialization in AppStatusListener/SQLAppStatusListener with
non-empty KVStore (based on the part of this PR, maybe @Ngone51 would be better to deal with
this)
   2. address snapshotting of KVStore
   3. based on 1 and 2, address both issues concurrently, SPARK-28867 (@Ngone51), SPARK-28594
(me)
   
   The approach would be much better than the origin plan when you prefer smaller PR, as SPARK-28594
will be split to 3 sub-issues instead of 2.
   
   Let me know it makes sense to you. Thanks in advance!

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Mime
View raw message