From: Apache Wiki
To: Apache Wiki
Date: Tue, 15 Aug 2017 21:14:13 -0000
Subject: [Jackrabbit Wiki] Update of "Composite Blob Store Storage Filters" by MattRyan

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Jackrabbit Wiki" for change notification.

The "Composite Blob Store Storage Filters" page has been changed by MattRyan:
https://wiki.apache.org/jackrabbit/Composite%20Blob%20Store%20Storage%20Filters

New page:
= Composite Blob Store Storage Filters =
NOTE: The current status of this component is a '''rejected feature''' for Oak 1.8, and a '''possible future feature''' for later Oak versions.

== Overview ==
Storage filters add capability to [[Composite Blob Store]] by allowing configuration to define criteria that restrict which blobs can be stored in a composite blob store delegate.  Storage filters would then be an optional element of delegate configuration.  Delegates would not be required to have any storage filters at all.

This is a possible future feature that can add a lot of value to the composite blob store concept.
A major hurdle to this implementation is that JCR node information is not provided to data store implementations; the normal case is that a binary is created before the corresponding node.  This problem has to be resolved in order to implement this feature.  More discussion on this below.

== Technical Details ==
=== Storage Filter Types ===
Storage filters operate on any combination of the following:
 * JCR path
 * JCR node type
 * Existence of JCR property
 * JCR property value

=== Impacts on Composite Blob Store ===
The existence of a storage filter on a delegate blob store may have an impact on the delegate traversal strategy implementation.  For example, an implementation may consider that a blob store with filters should take precedence over a blob store without filters in terms of choosing a write destination.

==== Impact on Intelligent Delegate Traversal Strategy ====
If storage filters were added to composite blob store, it is expected that the following changes would be added to the Intelligent Delegate Traversal Strategy, which is the default traversal strategy.

===== Delegate Search Order =====
Speaking generally, delegates with storage filters would take highest precedence, and would be tried first for reads and writes.

===== Writes =====
Delegates that have storage filters always have write precedence over delegates that do not have storage filters - meaning writes always happen to delegates with filters if the filters match what is being written.

The algorithm for writing a blob to the composite blob store would be changed:
 * When determining where to write the blob, start with delegates that have storage filters.  Prefer writing to a delegate with a storage filter that matches the blob being written.
 * When searching for other versions of the same blob, '''also''' search using delegates that have filters that don't match the blob being written (after searching delegates without filters).
   * Configuration change can cause this situation - a blob may originally have been written to a blob store, and afterward configuration changes such that the blob would not have been written to that blob store originally. See "Using Non-matching Delegates" below for more.
   * If this case occurs, the code should also asynchronously remove the blob from the incorrect location once it has been written to the correct location.

===== Reads =====
Delegates that have storage filters always have read precedence over delegates that do not have storage filters - meaning we always attempt to read from them first, because matching filters is faster than checking a delegate to see if a blob exists.

The algorithm for reading a blob from the composite blob store would be changed:
 * When searching for a blob, start with delegates that have storage filters that match the blob.
 * If no match, search delegates without storage filters.
 * If no match, search delegates that have storage filters that do not match the blob.
   * Configuration change can cause this situation - a blob may originally have been written to a blob store, and afterward configuration changes such that the blob would not have been written to that blob store originally. See "Using Non-matching Delegates" below for more.
   * If this case occurs, the code should also asynchronously move the blob from the incorrect location to the correct location.

===== Using Non-matching Delegates =====
This step is necessary to handle situations where blobs are temporarily located in the "wrong" blob store - in other words, when a blob is located in a delegate where it would not be written according to configuration.
The most obvious case where this could occur is a configuration change.  A delegate D may be configured with certain storage filters, causing Blob B to be written there.  Then the configuration is changed such that if B were being written now it would not be written to D.  This final step allows B to be found in D even though it doesn't match the storage filters.

When this situation is encountered, the composite blob store should also initiate an asynchronous background job to relocate the blob.  On read requests, the job moves the blob from its current location to the proper one, that is, the location where it would be found if it were being created now.  On write requests, the blob is written to the correct location first and is then removed from the current location asynchronously in the background.
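
The search order described above can be illustrated with a small, self-contained sketch.  It is not Oak code: `BlobInfo`, `Delegate`, the filter factories, `read_order`, `choose_write_target` and `find_blob` are hypothetical names invented for this example.  The sketch also assumes that the JCR path, node type and properties are already available when a blob is read or written (the unresolved hurdle noted in the Overview), and that several filters on one delegate combine with a logical AND, which the page does not specify.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class BlobInfo:
    """Hypothetical: a blob id plus the JCR node data that filters inspect."""
    blob_id: str
    path: str
    node_type: str
    properties: Dict[str, str] = field(default_factory=dict)


# A storage filter is modelled as a predicate over BlobInfo (an assumption).
StorageFilter = Callable[[BlobInfo], bool]


def path_filter(prefix: str) -> StorageFilter:
    # Simple string-prefix match on the JCR path.
    return lambda blob: blob.path.startswith(prefix)


def node_type_filter(node_type: str) -> StorageFilter:
    return lambda blob: blob.node_type == node_type


def property_exists_filter(name: str) -> StorageFilter:
    return lambda blob: name in blob.properties


def property_value_filter(name: str, value: str) -> StorageFilter:
    return lambda blob: blob.properties.get(name) == value


@dataclass
class Delegate:
    """Hypothetical delegate blob store backed by an in-memory dict."""
    name: str
    filters: List[StorageFilter] = field(default_factory=list)
    blobs: Dict[str, bytes] = field(default_factory=dict)

    def has_filters(self) -> bool:
        return bool(self.filters)

    def matches(self, blob: BlobInfo) -> bool:
        # Assumption: all filters on a delegate must match (logical AND).
        return self.has_filters() and all(f(blob) for f in self.filters)


def read_order(delegates: List[Delegate], blob: BlobInfo) -> List[Delegate]:
    """Matching filtered delegates, then unfiltered, then non-matching filtered."""
    matching = [d for d in delegates if d.matches(blob)]
    unfiltered = [d for d in delegates if not d.has_filters()]
    non_matching = [d for d in delegates
                    if d.has_filters() and not d.matches(blob)]
    return matching + unfiltered + non_matching


def choose_write_target(delegates: List[Delegate],
                        blob: BlobInfo) -> Optional[Delegate]:
    """Prefer a delegate whose filters match; otherwise use an unfiltered one."""
    for d in read_order(delegates, blob):
        if d.matches(blob) or not d.has_filters():
            return d
    return None  # only non-matching filtered delegates are configured


def find_blob(delegates: List[Delegate],
              blob: BlobInfo) -> Tuple[Optional[Delegate], bool]:
    """Return (delegate holding the blob, whether it sits in the "wrong" store)."""
    for d in read_order(delegates, blob):
        if blob.blob_id in d.blobs:
            misplaced = d.has_filters() and not d.matches(blob)
            return d, misplaced
    return None, False
```

A caller could treat a `True` second element from `find_blob` as the trigger for the asynchronous move described under "Using Non-matching Delegates"; the background job itself is left out of the sketch.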