couchdb-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From jiangph <jiangpeng...@hotmail.com>
Subject Re: [DISCUSS] soft-deletion
Date Sat, 21 Mar 2020 13:47:31 GMT
Thanks a lot, Mike for your comments and suggestion on APIs.  They are quite helpful. Please
see embedded response below.

> On Mar 20, 2020, at 10:07 PM, Mike Rhodes <couchdb@dx13.co.uk> wrote:
> 
> Looking good. I'm going to skip the database internals as I'm not at all qualified to
comment. I do have some comments on the HTTP API however.
> 
> First, I think this API spec is missing how you find databases which have soft-deleted
versions. For example, this could be:
> 
> /_all_dbs 
>    - lists only live databases
> 
> /_deleted_dbs 
>    - lists only dbs with at least one soft-deleted version
>    - API identical otherwise to _all_dbs

Good call for `_deleted_dbs`. I can implement this endpoint soon.

> 
> Secondly, this API lacks the ability to restore a soft-deleted database to a new database,
you can only restore a soft-deleted database to the original name, which could be over the
top of a new database. I suspect this could be problematic in day-to-day use. I'd propose:
> 
> - You can't restore a soft-deleted database on top of a "live" database. This prevents
data loss from a restore.
> - Support for the ability to restore a database to a new database path.
> 
> Could you also clarify whether the restore operation "removes" the soft-deleted version.
My assumption is yes, that we're just switching back the database into the "live" keyspace.

Yes, it will remove the soft-deleted version and make this version back to a “live” one.

> 
> An alternative API takes the conceptual model of deleted databases moving to a "recycle
bin" within the API's namespace. I quite like this, but it's just how it works for me; others
may hate it. The original proposal, to me, takes the approach that every database has it's
own "recycle bin", so this is merely altering the conceptual containers here. 
> 
> Mostly, I found that the following way feels a bit more RESTful in that there is a specific
set of deleted database resources, and you avoid the cognitive dissonance of being able to
carry out operations on non-existent resources.
> 
> In brief it roots operations on soft-deleted databases in a /_deleted_dbs namespace rather
than augmenting existing endpoints:
> 
> GET /_all_dbs
>    - lists only live databases (noted just to clarify this)
> 
> GET /_deleted_dbs 
>    - lists only dbs with at least one soft-deleted version
>    - API identical otherwise to _all_dbs
>    - It's a bit confusing to me that there can be overlap
>        between _deleted_dbs and _all_dbs as a given database may 
>        have live and soft-deleted versions, but perhaps this is okay.
> 
> GET /_deleted_dbs/{db}
>    - Equivalent to `GET /{db}/_deleted_dbs_info` in original.
>    - To me, this makes it very clear that a deleted db has only one 
>        operation -- get the metadata -- rather than hanging the _deleted_dbs_info
>        on a path that could be a live database or a non-existent database.
> 
> POST /_deleted_dbs/_restore
>    - Restore a database at a given timestamp
>    - Allows supplying a destination database.
>    - I view _restore as an RPC-like call than a REST call, so it has a body 
>        containing the arguments rather than supplying as part of the path:
> 
>         { "source": {"db": "animaldb", "ts": "<deletedTS>"}, "destination": "animaldb"}
> 
>    - Fails if "destination" exists.
> 
> If we choose to go with the original suggestion, I would still suggest taking the RPC
approach to restore, meaning it looks like this:
> 
>    POST /{db}/_restore
> 
>    { "sourceTS": "<deletedTS>", "destination": "animaldb"}
> 
> For either approach, "destination" could be optional and default to the original database
name.
> 

>From technical point of view, it is feasible to implement to restore database to different
destination database. I just want to bring up my concern about this. If I am correct, there
is no endpoint to rename database in CouchDB 3.0 or before.  if we support to restore database
to new database here, it will provide alternative way to support *rename*. I am not sure whether
this is expected or not.

You mentioned that restoring database might fail if there is live database with same name,
and it will cause data loss. But I think that users may replicate live database to another
database, and then delete it, and then restore previously deleted database back to live database,
and replicate data again. The result seems to be equivalent as we provide restoring to different
destination.  

> -- 
> Mike.
> 
> On Wed, 18 Mar 2020, at 12:04, jiangph wrote:
>> ## API
>> 
>> 1)  `DELETE /{db}`
>> 
>> There is no change on this endpoint [4] to send DELETE against one 
>> database. The soft-deletion is triggered once 
>> [couchdb][enable_database_recovery] is set to true in configuration 
>> file.
>> 
>> 
>> 2)  `GET /{db}/_deleted_dbs_info`
>> 
>>    returning basic information of all deleted instances for the 
>> specified database, including when the instance was deleted.
>>    Parameters:	
>> 
>>        db –Database name
>> 
>>    Request Headers:
>>     	
>>        Content-Type –application/json
>> 
>>    Response Headers:
>>     	
>> 
>>        Content-Type –
>>            application/json
>> 
>>    Status Codes:	
>> 
>>        200 OK –Request completed successfully
>>        404 Not Found –Requested database not found
>> 
>>    Request:
>> 
>>    GET /db/_deleted_dbs_info HTTP/1.1
>>    Accept: application/json
>>    Host: localhost:5984
>> 
>>    Response:
>> 
>>    HTTP/1.1 200 OK
>>    Cache-Control: must-revalidate
>>    Content-Type: application/json
>>    {
>>    	"total_rows": 2,
>>    	"rows": [{
>>    		"deleted_when": "20200318.071532",
>>    		"info": {
>>    			"update_seq": "0000019100b5992700000000",
>>    			"doc_del_count": 0,
>>    			"doc_count": 3,
>>    			"sizes": {
>>    				"external": 287,
>>    				"views": 0
>>    			}
>>    		}
>>    	}, {
>>    		"deleted_when": "20200318.071703",
>>    		"info": {
>>    			"update_seq": "0000019105f0e29900000000",
>>    			"doc_del_count": 0,
>>    			"doc_count": 2,
>>    			"sizes": {
>>    				"external": 200,
>>    				"views": 0
>>    			}
>>    		}
>>    	}]
>>    }
>> 
>> 
>> 3) `PUT /{db}/_restore/{deletedTS}`
>> 
>>    Restore a deleted database. 
>>    Parameters: 
>> 
>>        db –Database name
>>        deletedTS - timestamp when database was deleted
>> 
>>    Request Headers:
>> 
>> 
>>        Accept –
>>            application/json
>>            text/plain
>> 
>>    Response Headers:
>> 
>> 
>>        Content-Type –
>>            application/json
>>            text/plain; charset=utf-8
>> 
>>    Response JSON Object:
>> 
>> 
>>        ok (boolean) –Operation status. Available in case of success
>>        error (string) –Error type. Available if response code is 4xx
>>        reason (string) –Error description. Available if response code is 4xx
>> 
>>    Status Codes:   
>> 
>>        200 Restored –Database restored successfully 
>>        400 Bad Request –Invalid database name or deleted timestamp
>>        401 Unauthorized –CouchDB Server Administrator privileges required
>>        412 Precondition Failed –Database already exists
>> 
>> 
>> What do you think of that? Any questions or thoughts on this? Once 
>> again a big acknowledgment to Nick and Paul who helped with initial 
>> design and provide consultation on this.
>> 
>> 
>> Cheers
>> Peng Hui
>> 
>> 
>> [1] 
>> https://github.com/apache/couchdb/blob/master/src/couch/src/couch_file.erl#L251 
>> [2] 
>> https://github.com/apache/couchdb/blob/prototype/fdb-layer/src/fabric/src/fabric2_fdb.erl#L182-L184
>> [3] https://activesphere.com/blog/2018/08/05/high-contention-allocator 
>> <https://activesphere.com/blog/2018/08/05/high-contention-allocator>
>> [4] 
>> https://docs.couchdb.org/en/stable/api/database/common.html#delete--db
>> 
>> 


Mime
View raw message