jackrabbit-oak-issues mailing list archives

From "Matt Ryan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (OAK-8551) Minimize network calls in cloud data stores (performance optimization)
Date Sat, 17 Aug 2019 01:26:00 GMT

    [ https://issues.apache.org/jira/browse/OAK-8551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16909530#comment-16909530 ]

Matt Ryan commented on OAK-8551:

This issue was noticed once the fix for OAK-7998 was introduced and started being used in test
environments. That fix did not cause the problem; it only compounded it somewhat.

> Minimize network calls in cloud data stores (performance optimization)
> ----------------------------------------------------------------------
>                 Key: OAK-8551
>                 URL: https://issues.apache.org/jira/browse/OAK-8551
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: blob-cloud, blob-cloud-azure
>    Affects Versions: 1.16.0, 1.10.4
>            Reporter: Matt Ryan
>            Assignee: Matt Ryan
>            Priority: Major
> Oak cloud data stores (e.g. {{AzureDataStore}}, {{S3DataStore}}) are by definition more
> susceptible to performance degradation due to network issues. While we can't do much about
> the performance of uploading or downloading a blob, there are other places within the
> implementations where we make network calls to the storage service that might be avoided
> or minimized.
> One example is the {{exists()}} call to check whether a blob with a particular identifier
> exists in the blob storage. In some places {{exists()}} is called where we could instead
> simply attempt the network access and handle failures gracefully, avoiding an extra
> network call. In other places a cache could perhaps be used to minimize round trips.
> Another example is the higher-level {{getReference()}} call in {{DataStoreBlobStore}}.
> This asks the implementation for a {{DataRecord}} and then gets the reference from that,
> but the data store backend can already obtain a reference for an identifier on its own.
> Asking for the {{DataRecord}}, however, requires a network request to get the blob
> metadata for the record.
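
The {{exists()}} point in the description can be illustrated with a minimal sketch. It is not Oak code: {{CountingBackend}}, {{readWithExistsCheck()}} and {{readDirectly()}} are hypothetical names invented for this example, and the "network" is an in-memory map that only counts simulated round trips. The assumption is that a read of a missing blob fails in a recognizable way (here a {{FileNotFoundException}}), so the caller can handle that failure instead of asking first.

```java
import java.io.ByteArrayInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class ExistsCheckSketch {

    /** Hypothetical stand-in for a cloud blob service; counts each simulated network call. */
    static class CountingBackend {
        private final Map<String, byte[]> blobs = new HashMap<>();
        int networkCalls = 0;

        void put(String id, byte[] data) {
            blobs.put(id, data); // test setup only, not counted as a network call
        }

        boolean exists(String id) {
            networkCalls++; // one round trip just to learn whether the blob is there
            return blobs.containsKey(id);
        }

        InputStream read(String id) throws IOException {
            networkCalls++; // one round trip to fetch the blob
            byte[] data = blobs.get(id);
            if (data == null) {
                throw new FileNotFoundException("No blob with id " + id);
            }
            return new ByteArrayInputStream(data);
        }
    }

    /** Check-then-read: two round trips whenever the blob exists. */
    static Optional<InputStream> readWithExistsCheck(CountingBackend backend, String id)
            throws IOException {
        if (!backend.exists(id)) {
            return Optional.empty();
        }
        return Optional.of(backend.read(id));
    }

    /** Attempt the read and treat "not found" as an ordinary outcome: one round trip. */
    static Optional<InputStream> readDirectly(CountingBackend backend, String id)
            throws IOException {
        try {
            return Optional.of(backend.read(id));
        } catch (FileNotFoundException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) throws IOException {
        CountingBackend backend = new CountingBackend();
        backend.put("blob-1", new byte[] {1, 2, 3});

        readWithExistsCheck(backend, "blob-1");
        System.out.println("check-then-read calls: " + backend.networkCalls); // prints 2

        backend.networkCalls = 0;
        readDirectly(backend, "blob-1");
        System.out.println("read-only calls: " + backend.networkCalls); // prints 1
    }
}
```

In this sketch the saving applies to the case where the blob exists; for a missing blob both variants make a single call. Whether a real backend reports a missing blob cheaply and unambiguously enough for this pattern depends on the storage service and its client library.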

