jackrabbit-oak-issues mailing list archives

From "Francesco Mari (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (OAK-6749) Segment-Tar standby sync fails with "in-memory" blobs present in the source repo
Date Wed, 16 Jan 2019 13:04:00 GMT

     [ https://issues.apache.org/jira/browse/OAK-6749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Francesco Mari reassigned OAK-6749:
-----------------------------------

    Assignee: Francesco Mari

> Segment-Tar standby sync fails with "in-memory" blobs present in the source repo
> --------------------------------------------------------------------------------
>
>                 Key: OAK-6749
>                 URL: https://issues.apache.org/jira/browse/OAK-6749
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: blob, tarmk-standby
>    Affects Versions: 1.6.2
>            Reporter: Csaba Varga
>            Assignee: Francesco Mari
>            Priority: Major
>
> We have run into an issue while transitioning from an active/active Mongo NodeStore cluster to a single Segment-Tar server with a cold standby. The issue manifests when the standby server tries to pull changes from the primary after the first round of online revision GC.
> Let me summarize how we ended up in the current state, along with my hypothesis about what happened, based on my debugging so far:
> # We started with a Mongo NodeStore and an external FileDataStore as the blob store. The FileDataStore was set up with minRecordLength=4096. The Mongo store keeps blobs smaller than minRecordLength as special "in-memory" blobIDs, where the data itself is baked into the ID string in hex (a sketch of this encoding follows the log excerpt below).
> # We executed a sidegrade of the Mongo store into a Segment-Tar store. Our datastore is over 1TB in size, so copying the binaries wasn't an option; the new repository simply reuses the existing datastore. The "in-memory" blobIDs still look like external blobIDs to the sidegrade process, so they were copied into the Segment-Tar repository as-is instead of being converted into the efficient in-line format.
> # The server started up without issues on the new Segment-Tar store. The migrated "in-memory" blobIDs seemed to work fine, if somewhat suboptimal.
> # At this point, we created a cold standby instance by copying the files of the stopped primary instance and making the necessary config changes on both servers.
> # Everything worked fine until the primary server started its first round of online revision GC. After that process completed, the standby node started throwing exceptions about missing segments, and eventually stopped altogether. In the meantime, the following warning showed up in the primary log:
> {code:java}
> 29.09.2017 06:12:08.088 *WARN* [nioEventLoopGroup-3-10] org.apache.jackrabbit.oak.segment.standby.server.ExceptionHandler
Exception caught on the server
> io.netty.handler.codec.TooLongFrameException: frame length (8208) exceeds the allowed
maximum (8192)
>         at io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:146)
>         at io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:142)
>         at io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:99)
>         at io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:75)
>         at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:411)
>         at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:248)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:345)
>         at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:345)
>         at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
>         at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
>         at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:611)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:552)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:466)
>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:438)
>         at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:140)
>         at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
>         at java.lang.Thread.run(Thread.java:745)
> {code}
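> To make the failure mode concrete, here is a minimal sketch of the kind of encoding described in step 1. The "0x" prefix and all class and method names are illustrative assumptions, not the exact format used by the data store:
> {code:java}
> import java.nio.charset.StandardCharsets;
>
> // Minimal sketch of an "in-memory" blob ID: the blob's bytes are hex-encoded
> // directly into the ID string instead of being written to the data store.
> // The prefix and layout are illustrative assumptions.
> public class InMemoryBlobIdSketch {
>
>     static final String PREFIX = "0x"; // hypothetical marker for in-memory IDs
>
>     static String encode(byte[] data) {
>         StringBuilder sb = new StringBuilder(PREFIX);
>         for (byte b : data) {
>             sb.append(String.format("%02x", b)); // two hex characters per byte
>         }
>         return sb.toString();
>     }
>
>     public static void main(String[] args) {
>         byte[] blob = "a small binary".getBytes(StandardCharsets.UTF_8);
>         String id = encode(blob);
>         // A blob just under minRecordLength=4096 bytes yields an ID of more
>         // than 8192 characters: two hex characters per byte, plus the prefix.
>         System.out.println(id + " (" + id.length() + " chars)");
>     }
> }
> {code}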
> This is what seems to be happening:
> # The revision GC creates brand new segments, and the standby instance starts pulling them into its own store.
> # When the standby sees an "in-memory" blobID, it concludes that it doesn't have this blob in its own blobstore, so it asks the primary for the blob's bytes, even though they are already encoded in the ID itself.
> # A blobID can exceed 8K characters: a blob just under the 4096-byte minRecordLength doubles in size when hex-encoded, which is consistent with the 8208-byte frame in the log above. When such a long blobID is submitted to the primary, the request is rejected for exceeding the frame length limit, the standby keeps waiting until the request times out, and no progress is made in syncing (see the sketch after this list).
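> The rejection can be reproduced in isolation with Netty's line-based framing, which the stack trace above points to. This is a standalone sketch, not the actual standby protocol code; the request content and class name are made up, and only the 8192-byte limit and the decoder are taken from the log:
> {code:java}
> import io.netty.buffer.Unpooled;
> import io.netty.channel.embedded.EmbeddedChannel;
> import io.netty.handler.codec.LineBasedFrameDecoder;
> import io.netty.handler.codec.TooLongFrameException;
> import java.nio.charset.StandardCharsets;
>
> // Reproduces the framing failure in isolation: a LineBasedFrameDecoder with
> // an 8192-byte limit (as in the stack trace above) rejects a longer line.
> public class FrameLimitDemo {
>     public static void main(String[] args) {
>         EmbeddedChannel channel = new EmbeddedChannel(new LineBasedFrameDecoder(8192));
>
>         // Simulate a request line carrying a hex-encoded ~4K blob ID:
>         // 8208 bytes before the newline, matching the frame length in the log.
>         StringBuilder request = new StringBuilder();
>         for (int i = 0; i < 8208; i++) {
>             request.append('a');
>         }
>         request.append('\n');
>
>         try {
>             channel.writeInbound(Unpooled.copiedBuffer(request, StandardCharsets.UTF_8));
>         } catch (TooLongFrameException e) {
>             // e.g. "frame length (8208) exceeds the allowed maximum (8192)";
>             // the exact message text varies by Netty version.
>             System.out.println(e.getMessage());
>         }
>     }
> }
> {code}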
> The issue doesn't show up in repositories that started out as Segment-Tar, since Segment-Tar always inlines blobs below some hardcoded threshold (16K if I remember correctly).
> I think there could be multiple ways to approach this, not mutually exclusive:
> * Special-case the "in-memory" blobIDs during sidegrade and replace them with the "native" segment values. If hardcoding knowledge of this implementation detail isn't desired, the sidegrade process could get a new option to force inlining of blobs below a certain threshold, even if they aren't in-line in the source repo.
> * Special-case the "in-memory" blobIDs in StandbyDiff so they aren't requested from the primary, but are either kept as-is or converted to the "native" format (a rough sketch follows this list).
> * Increase the network packet size limit in the sync protocol, or allow it to be configured. This is the least efficient option, but the one with the least impact on the code.
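> The second option could look roughly like the following. This is a hedged sketch: isInMemoryBlobId, its "0x" marker, copyBlob, and fetchFromPrimary are hypothetical stand-ins, not the actual StandbyDiff API:
> {code:java}
> // Sketch of option 2: skip the round trip to the primary when the blob ID
> // already embeds its data. All names below are illustrative; the real
> // StandbyDiff API may differ.
> class BlobSyncSketch {
>
>     // Hypothetical predicate: in-memory IDs carry a recognizable marker
>     // (see the encoding sketch earlier in this issue).
>     static boolean isInMemoryBlobId(String blobId) {
>         return blobId.startsWith("0x"); // assumed marker
>     }
>
>     void copyBlob(String blobId) {
>         if (isInMemoryBlobId(blobId)) {
>             // The bytes are already in the ID: keep the reference as-is,
>             // or decode it and re-inline the value in the segment store.
>             return;
>         }
>         fetchFromPrimary(blobId); // existing behavior: ask the primary over the wire
>     }
>
>     void fetchFromPrimary(String blobId) {
>         // network fetch omitted from this sketch
>     }
> }
> {code}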
> I can work on detailed reproduction steps if needed, but I'd rather not do it beforehand because this is rather cumbersome to reproduce.



