mesos-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jie Yu <yujie....@gmail.com>
Subject Re: Mesos replicated log dual writes
Date Tue, 05 Jun 2018 16:35:45 GMT
I am not aware any ticket related to this, so creating a ticket for
tracking the issue would be great.

- Jie

On Tue, Jun 5, 2018 at 9:30 AM, Meghdoot bhattacharya <
meghdoot_b@yahoo.com.invalid> wrote:

> Thx Joseph. Writing back the entire blob is costly instead of writing back
> only the 2 bytes. In this case this record is just a chunk of large
> snapshot that is getting written. So it multiplies.
>
> Should I open a jira to track and resolve this or something like this is
> already open?
>
>
> Thx
>
> > On Jun 5, 2018, at 9:10 AM, Joseph Wu <joseph@mesosphere.io> wrote:
> >
> > The two-byte difference is most likely coming from the "learned" field:
> > https://github.com/apache/mesos/blob/master/src/messages/log.proto#L49
> >
> > The first time an entry is recorded, the entry is "unlearned", basically
> meaning that the entry has not been written to a quorum of masters (yet).
> Once a quorum of masters indicate to each other that they have written an
> entry, they then flip the bit to "learned".
> >
> > This process is mandatory for Paxos to work, but I can see how
> overwriting an entry could be expensive.  There's no particular restriction
> on why we must overwrite the entire entry, except that we currently use the
> position number (i.e. 6506690 in your example) as the key in the leveldb
> implementation.
> >
> >> On Tue, Jun 5, 2018 at 12:30 AM, meghdoot bhattacharya
> <meghdoot_b@yahoo.com.invalid> wrote:
> >> Recently investigation of logs of aurora around snapshot creations in
> replicated log come up with this
> >> I0601 21:57:27.322444   144 log.cpp:577] Attempting to append 524304
> bytes to the logI0601 21:57:27.322501   144 coordinator.cpp:348]
> Coordinator attempting to write APPEND action at position 6506690I0601
> 21:57:27.323122   139 replica.cpp:539] Replica received write request for
> position 6506690 from __req_res__(3685)@x.x.x.x:yyyyI0601
> 21:57:27.443131   139 leveldb.cpp:341] Persisting action (524331 bytes) to
> leveldb took 119.847993msI0601 21:57:27.443208   139 replica.cpp:710]
> Persisted action APPEND at position 6506690I0601 21:57:27.443437   139
> replica.cpp:693] Replica received learned notice for position 6506690 from
> @0.0.0.0:0I0601 21:57:27.478754   139 leveldb.cpp:341] Persisting action
> (524333 bytes) to leveldb took 35.278735msI0601 21:57:27.478818   139
> replica.cpp:710] Persisted action APPEND at position 6506690
> >>
> >>
> >> It seems there are dual writes to leveldb for the same record (with 2
> byte diff). Becomes expensive if that is the case for large records. Hoping
> to get some insights.
> >> Thx
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message