jackrabbit-oak-issues mailing list archives

From "Vikas Saurabh (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (OAK-7389) Mongo/FileBlobStore does not update timestamp for already existing blobs
Date Thu, 05 Apr 2018 10:45:00 GMT

    [ https://issues.apache.org/jira/browse/OAK-7389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16426732#comment-16426732 ]

Vikas Saurabh edited comment on OAK-7389 at 4/5/18 10:44 AM:
-------------------------------------------------------------

{quote}could not get the upsert working with Mongo findOneAndUpdate so, resorted to a separate
call for update on error.
{quote}
[1] says that findOneAndUpdate only accepts update operators [2] - I think {{$currentDate}}
and {{$set}} should serve the purpose well. Also, you'd probably need to pass
\{"upsert": true} for the {{options}} param.

*EDIT*: btw, \[1] mentions that the method is new in 3.2 and updated (how??) in 3.6. I don't
recall what we recommend for 1.2 users - but, I think it won't be 3.2. Maybe, instead of having
different impls in different branches, we could work with what you suggested (I wouldn't expect
resurrections of blobs just in time during blob gc... so, 2 remote calls might be ok).

[1]: [https://docs.mongodb.com/manual/reference/method/db.collection.findOneAndUpdate/]
[2]: [https://docs.mongodb.com/manual/reference/operator/update/]
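
For illustration only, the "2 remote calls" fallback mentioned above (insert first, then update the timestamp only when the insert hits a duplicate) can be sketched without a database. The class {{TouchOnDuplicateSketch}}, its {{Entry}} record and the in-memory map are hypothetical stand-ins for the blob collection, not Oak or MongoDB driver code; {{putIfAbsent}} plays the role of the insert and the field write plays the role of the follow-up update.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** In-memory stand-in for a blob collection keyed by block id (hypothetical, not Oak code). */
class TouchOnDuplicateSketch {

    /** Minimal record: block data plus last-modified time in milliseconds. */
    static final class Entry {
        final byte[] data;
        volatile long lastMod;

        Entry(byte[] data, long lastMod) {
            this.data = data;
            this.lastMod = lastMod;
        }
    }

    private final ConcurrentMap<String, Entry> blobs = new ConcurrentHashMap<>();

    /**
     * Call 1: try to insert the block.
     * Call 2 (only if the id already exists): refresh its timestamp.
     * Returns true if the block was newly inserted, false if it was only touched.
     */
    public boolean storeBlock(String id, byte[] data, long now) {
        Entry existing = blobs.putIfAbsent(id, new Entry(data, now)); // stands in for insert
        if (existing == null) {
            return true;
        }
        existing.lastMod = now; // stands in for the separate update on duplicate key
        return false;
    }

    /** Returns the stored timestamp, or -1 if the id is unknown. */
    public long lastMod(String id) {
        Entry e = blobs.get(id);
        return e == null ? -1L : e.lastMod;
    }
}
```

The point of the sketch is only the control flow: a duplicate is no longer silently ignored, so a block stored again gets a fresh timestamp. A single-call upsert would collapse both steps into one, at the cost of depending on the server version.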


was (Author: catholicon):
bq. could not get the upsert working with Mongo findOneAndUpdate so, resorted to a separate
call for update on error.
\[1] says that calling findOneAndUpdate only accepts update operators \[2] - I think {{$currentDate}}
and {{$set}} should serve the purpose well. Also, you'd probably need to pass \{"upsert":
true} for {{options}} param.

\[1]: https://docs.mongodb.com/manual/reference/method/db.collection.findOneAndUpdate/
\[2]: https://docs.mongodb.com/manual/reference/operator/update/

> Mongo/FileBlobStore does not update timestamp for already existing blobs
> ------------------------------------------------------------------------
>
>                 Key: OAK-7389
>                 URL: https://issues.apache.org/jira/browse/OAK-7389
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: blob
>    Affects Versions: 1.2.14, 1.4.20, 1.8.2, 1.6.11
>            Reporter: Amit Jain
>            Assignee: Amit Jain
>            Priority: Critical
>             Fix For: 1.2.30
>
>         Attachments: OAK-7389-v1.patch
>
>
> MongoBlobStore uses the {{insert}} call and ignores duplicate-key exceptions, which means any existing value won't be updated.
> {code:java}
>     @Override
>     protected void storeBlock(byte[] digest, int level, byte[] data) throws IOException {
>         String id = StringUtils.convertBytesToHex(digest);
>         cache.put(id, data);
>         // Check if it already exists?
>         MongoBlob mongoBlob = new MongoBlob();
>         mongoBlob.setId(id);
>         mongoBlob.setData(data);
>         mongoBlob.setLevel(level);
>         mongoBlob.setLastMod(System.currentTimeMillis());
>         // TODO check the return value
>         // TODO verify insert is fast if the entry already exists
>         try {
>             getBlobCollection().insertOne(mongoBlob);
>         } catch (DuplicateKeyException e) {
>             // the same block was already stored before: ignore
>         } catch (MongoException e) {
>             if (e.getCode() == DUPLICATE_KEY_ERROR_CODE) {
>                 // the same block was already stored before: ignore
>             } else {
>                 throw new IOException(e.getMessage(), e);
>             }
>         }
>     }
> {code}
> FileBlobStore also returns early if the file already exists, without updating the timestamp.
> {code:java}
>     @Override
>     protected synchronized void storeBlock(byte[] digest, int level, byte[] data) throws IOException {
>         File f = getFile(digest, false);
>         if (f.exists()) {
>             return;
>         }
>         .........
> {code}
> The above would cause data loss in DSGC if there are updates to the blob blocks which are resurrected (stored again at the time of DSGC), because the timestamp would never have been modified.
>  
> cc/ [~tmueller], [~mreutegg], [~chetanm], [~catholicon]
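
As a rough sketch of the behaviour the description asks for on the file side, the early return could refresh the file's modification time instead of doing nothing. This is not the attached patch: {{FileTouchSketch}} and its {{storeBlock(File, byte[], long)}} signature are made up for illustration, and passing the target file and the timestamp in directly is an assumption to keep the example self-contained.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

/** Hypothetical stand-in for the file-based store; not Oak's FileBlobStore. */
class FileTouchSketch {

    /** Writes the block if it is missing; otherwise refreshes the file's last-modified time. */
    public static synchronized void storeBlock(File f, byte[] data, long now) throws IOException {
        if (f.exists()) {
            // Existing block: touch it so a later GC run sees it as recently used.
            if (!f.setLastModified(now)) {
                throw new IOException("Could not update timestamp of " + f);
            }
            return;
        }
        File parent = f.getParentFile();
        if (parent != null) {
            Files.createDirectories(parent.toPath());
        }
        Files.write(f.toPath(), data);
    }
}
```

Whether a failed {{setLastModified}} should throw or only be logged is a design choice the sketch does not settle; throwing is used here so the failure is not silently lost.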



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
