mesos-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Michael Park" <mcyp...@gmail.com>
Subject Re: Review Request 29918: Introduced a generic checkpoint function.
Date Sun, 25 Jan 2015 01:23:35 GMT


> On Jan. 23, 2015, 6:44 p.m., Jie Yu wrote:
> > src/slave/state.hpp, line 120
> > <https://reviews.apache.org/r/29918/diff/9/?file=829596#file829596line120>
> >
> >     One more thought:
> >     
> >     Can we introduce a third parameter for this function to let users opt out the
temp + rename (we can use a default value)?
> >     
> >     This function is super critical. What if /tmp has too many links? It'll cause
slave to flap. For majority of the checkpointing work, we don't need this temp + rename because
we know the old file does not exist.
> >     
> >     What do you think?

But if we want to guarantee that the state of the checkpointed file is always valid, I don't
think we can get away from writing to temp + renaming.
For the case where the old file does not exist, we'd do something like, write directly to
the file if it doesn't already exist, and delete on error?
We would end up with an invalid checkpoint file if we crash while writing, or run into an
error and crash before deleting the file.

I'm also not sure if /tmp having too many links is the level of problem we need to worry about?
It seems like a very unlikely case to me, but I'm not sure.

Just my thoughts.


- Michael


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29918/#review69435
-----------------------------------------------------------


On Jan. 24, 2015, 1:21 a.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29918/
> -----------------------------------------------------------
> 
> (Updated Jan. 24, 2015, 1:21 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, and Vinod Kone.
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Introduced a generic checkpoint function.
> 
> 
> Diffs
> -----
> 
>   src/slave/state.hpp 70777cf6ab681c29ca4df601fe47903e1dbdf41f 
>   src/slave/state.cpp a36fa53099300ee03f051b0f5eaaafe9f1da68d1 
>   src/tests/slave_recovery_tests.cpp 809822e63b05a21418cd9297c927d656d6fd871d 
> 
> Diff: https://reviews.apache.org/r/29918/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message