incubator-cloudstack-dev mailing list archives

From Edison Su <Edison...@citrix.com>
Subject RE: new storage framework update
Date Thu, 10 Jan 2013 01:10:58 GMT


> -----Original Message-----
> From: John Burwell [mailto:jburwell@basho.com]
> Sent: Tuesday, January 08, 2013 8:51 PM
> To: cloudstack-dev@incubator.apache.org
> Subject: Re: new storage framework update
>
> Edison,
>
> Please see my thoughts in-line below.  I apologize for the S3-centric nature of my
> example in advance -- it happens to be top of mind for obvious reasons ...
>
> Thanks,
> -John
>
> On Jan 8, 2013, at 5:59 PM, Edison Su <Edison.su@citrix.com> wrote:
>
> >
> >
> >> -----Original Message-----
> >> From: John Burwell [mailto:jburwell@basho.com]
> >> Sent: Tuesday, January 08, 2013 10:59 AM
> >> To: cloudstack-dev@incubator.apache.org
> >> Subject: Re: new storage framework update
> >>
> >> Edison,
> >>
> >> In reviewing the javelin branch, I feel that there is a missing abstraction.
> >> At the lowest level, storage operations are the storage, retrieval,
> >> deletion, and listing of byte arrays stored at a particular URI.  In
> >> order to implement this concept in the current Javelin branch, 3-5
> >> strategy classes must be implemented to perform the following low-level
> >> operations:
> >>
> >>   * open(URI aDestinationURI): OutputStream throws IOException
> >>   * write(URI aDestinationURI, OutputStream anOutputStream) throws
> >> IOException
> >>   * list(URI aDestinationURI) : Set<URI> throws IOException
> >>   * delete(URI aDestinationURI) : boolean throws IOException
> >>
> >> The logic for each of these strategies will be identical, which will
> >> lead to the creation of a support class + glue code (i.e. either
> >> individual adapter classes
>
> I realize that I omitted a couple of definitions in my original email.  First, the
> StorageDevice most likely would be implemented on a domain object that
> also contained configuration information for a resource.  For example, the
> S3Impl class would also implement StorageDevice.  On reflection (and a little
> pseudo coding), I would also like to refine my original proposed
> StorageDevice interface:
>
>    * void read(URI aURI, OutputStream anOutputStream) throws IOException
>    * void write(URI aURI, InputStream anInputStream)  throws IOException
>    * Set<URI> list(URI aURI)  throws IOException
>    * boolean delete(URI aURI) throws IOException
>    * StorageDeviceType getType()
>
> >
> > If the lowest-level API is too opaque, e.g. taking a single URI as its parameter, I wonder
> > whether it may make the implementation more complicated than it sounds.
> > For example, there are at least 3 APIs for the primary storage driver:
> > createVolumeFromTemplate, createDataDisk, deleteVolume, and two
> > snapshot-related APIs: createSnapshot, deleteSnapshot.
> > How do we encode the above operations into simple write/delete APIs? If one URI
> > contains too much information, then at the end of the day the receiving side (the
> > code in the hypervisor resource), which is responsible for decoding the URI,
> > becomes complicated.  That's the main reason I decided to use more specific
> > APIs instead of one opaque URI.
> > It's true that if the API is too specific, people need to implement a ton of
> > APIs (mainly imagedatastoredriver, primarydatastoredriver,
> > backupdatastoredriver) all over the place.
> > Which one is better? People can jump into the discussion.
> >
>
> The URI scheme should be a logical, unique, and reversible value associated
> with the type of resource being stored.  For example, the general form of
> template URIs would be
> "/template/<account_id>/<template_id>/template.properties" and
> "/template/<account_id>/<template_id>/<uuid>.vhd".  Therefore, for
> account id 2, template id 200, the template.properties resource would be
> assigned a URI of "/template/2/200/template.properties".  The StorageDevice
> implementation translates the logical URI to a physical representation.  Using
> S3 as an example, the StorageDevice is configured to use bucket
> jsb-cloudstack at endpoint s3.amazonaws.com.  The S3 storage device would
> translate the URI to
> s3://jsb-cloudstack/templates/2/200/template.properties.  For an NFS storage device
> mounted on nfs://localhost/cloudstack, the StorageDevice would translate
> the logical URI to
> nfs://localhost/cloudstack/template/<account_id>/<template_id>/template.properties.
> In short, I believe that we can devise a simple scheme that
> allows the StorageDevice to treat the URI path relative to its root.
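
A minimal sketch of the logical-to-physical translation described above, assuming the device root is itself expressed as a URI and that translation is plain path concatenation. The names `UriTranslationSketch` and `toPhysical` are hypothetical and are not taken from the javelin branch.

```java
import java.net.URI;

class UriTranslationSketch {

    // Hypothetical helper: appends a logical path to a device root URI.
    // Assumes the logical path contains no characters that need escaping.
    static URI toPhysical(URI deviceRoot, String logicalPath) {
        String root = deviceRoot.toString();
        // Avoid a doubled slash where the root and the logical path meet
        if (root.endsWith("/")) {
            root = root.substring(0, root.length() - 1);
        }
        String path = logicalPath.startsWith("/") ? logicalPath : "/" + logicalPath;
        return URI.create(root + path);
    }

    public static void main(String[] args) {
        String logical = "/template/2/200/template.properties";
        // Same logical URI, two differently rooted devices
        System.out.println(toPhysical(URI.create("s3://jsb-cloudstack"), logical));
        System.out.println(toPhysical(URI.create("nfs://localhost/cloudstack"), logical));
    }
}
```

Because the services only ever see the logical path, the same call works whether the device is rooted at a bucket or at an NFS export.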
>
> To my mind, the createVolumeFromTemplate is decomposable into a series
> of StorageDevice#read and StorageDevice#write operations which would be
> issued by the VolumeManager service such as the following:
>
> public void createVolumeFromTemplate(Template aTemplate,
>         StorageDevice aTemplateDevice, Volume aVolume,
>         StorageDevice aVolumeDevice) {
>
>     try {
>
>         // Reject the device only if it is neither block nor file system
>         if (aVolumeDevice.getType() != StorageDeviceType.BLOCK &&
>                 aVolumeDevice.getType() != StorageDeviceType.FILE_SYSTEM) {
>             throw new UnsupportedStorageDeviceException(...);
>         }
>
>         // Pull the template from template device into a temporary directory
>         final File aTemplateDirectory = new File(<template temp path>);
>
>         // Non-DRY -- likely a candidate for a TemplateService#downloadTemplate method
>         aTemplateDevice.read(
>             new URI("/templates/<account_id>/<template_id>/template.properties"),
>             new FileOutputStream(new File(aTemplateDirectory, "template.properties")));
>         aTemplateDevice.read(
>             new URI("/templates/<account_id>/<template_id>/<template_uuid>.vhd"),
>             new FileOutputStream(new File(aTemplateDirectory, "<template_uuid>.vhd")));
>
>         // Perform operations with hypervisor as necessary to register storage,
>         // which yields anInputStream (possibly a List<InputStream>)
>
>         aVolumeDevice.write(new URI("/volume/<account_id>/<volume_id>"),
>             anInputStream);


I'm not sure the API really needs to look like java IO, but I can see the value of using a URI to
encode objects (volume/snapshot/template etc.): the driver layer API will be very simple and can
be shared by multiple components (volume/image services etc.).
Currently, there is one datastore object for each storage. The datastore object is mainly used
by the cloudstack mgt server to read/write the database and to maintain the state of each object
(volume/snapshot/template) in the datastore. The datastore object also provides an interface for
lifecycle management, and a transformer (which can transform a db object into a *TO, or a URI).
The purpose of the datastore object is to offload a lot of logic from the volume/template manager
into each object, because the manager is a singleton, which is not easy to extend.
The relationships between these classes are:
For volume service: Volumeserviceimpl -> primarydatastore -> primarydatastoredriver
For image service: imageServiceImpl -> imagedataStore -> imagedataStoredriver
For snapshot service: snapshotServiceImpl -> {primarydataStore/imagedataStore} -> {primarydatastoredriver/imagedatastoredriver},
since a snapshot can be on both primarydatastore and imagedatastore.

The current driver API is not good enough; it's too specific to each object. For example,
there will be an API called createSnapshot in primarydatastoredriver and an API called moveSnapshot
in imagedataStoredriver (in order to implement moving a snapshot from primary storage to the image
store). There may also be an API called createVolume in primarydatastoredriver and an API called
moveVolume in imagedatastoredriver (in order to implement moving a volume from primary storage to
the image store). The more objects we add, the more bloated the driver API becomes.

If the driver API used the model you suggested, simple read/write/delete with URIs, it would
look like, for example:
void create(URI uri) throws IOException
void copy(URI destUri, URI srcUri) throws IOException
boolean delete(URI uri) throws IOException
Set<URI> list(URI uri) throws IOException

The create API has multiple meanings depending on context: if the URI contains "*/volume/*" it
means creating a volume, if the URI contains "*/template" it means creating a template, and so on.
The same goes for the copy API:
If both destUri and srcUri are volumes, it can have different meanings: if both volumes are
in the same storage, it means creating a volume from a base volume; if they are in different
storages, it means volume migration.
If destUri is a volume and srcUri is a template, it means creating a volume from a template.
If destUri is a volume and srcUri is a snapshot on the same storage, it means reverting the snapshot.
If destUri is a volume and srcUri is a snapshot on a different storage, it means creating a
volume from the snapshot.
If destUri is a snapshot and srcUri is a volume, it means creating a snapshot from the volume.
If destUri is a snapshot and srcUri is a snapshot in a different place, it means snapshot
backup.
If destUri is a template and srcUri is a snapshot, it means creating a template from the snapshot.
As you can see, the API is too opaque and needs complicated logic to encode and decode the
URIs.
Are you OK with the above API?
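
To make the decoding cost concrete, here is a rough sketch of what the receiving side of copy(destUri, srcUri) might have to do. It rests on two assumptions that are not settled anywhere in this thread: the object kind appears as a path segment (/volume/, /snapshot/, /template/), and "same storage" means the same scheme and authority. All class, enum and method names are made up for illustration.

```java
import java.net.URI;
import java.util.Objects;

class CopyDecodeSketch {

    enum Kind { VOLUME, SNAPSHOT, TEMPLATE }

    enum Operation {
        CLONE_VOLUME, MIGRATE_VOLUME, CREATE_VOLUME_FROM_TEMPLATE,
        REVERT_SNAPSHOT, CREATE_VOLUME_FROM_SNAPSHOT, CREATE_SNAPSHOT,
        BACKUP_SNAPSHOT, CREATE_TEMPLATE_FROM_SNAPSHOT
    }

    // Assumption: the object kind is encoded as a path segment
    static Kind kindOf(URI uri) {
        String path = uri.getPath() + "/";
        if (path.contains("/volume/")) return Kind.VOLUME;
        if (path.contains("/snapshot/")) return Kind.SNAPSHOT;
        if (path.contains("/template/")) return Kind.TEMPLATE;
        throw new IllegalArgumentException("Unknown object kind: " + uri);
    }

    // Assumption: scheme plus authority identify one storage
    static boolean sameStorage(URI a, URI b) {
        return Objects.equals(a.getScheme(), b.getScheme())
            && Objects.equals(a.getAuthority(), b.getAuthority());
    }

    // Maps a (dest, src) pair to the operation it implies
    static Operation decode(URI destUri, URI srcUri) {
        Kind dest = kindOf(destUri);
        Kind src = kindOf(srcUri);
        boolean same = sameStorage(destUri, srcUri);

        if (dest == Kind.VOLUME && src == Kind.VOLUME) {
            return same ? Operation.CLONE_VOLUME : Operation.MIGRATE_VOLUME;
        }
        if (dest == Kind.VOLUME && src == Kind.TEMPLATE) {
            return Operation.CREATE_VOLUME_FROM_TEMPLATE;
        }
        if (dest == Kind.VOLUME && src == Kind.SNAPSHOT) {
            return same ? Operation.REVERT_SNAPSHOT : Operation.CREATE_VOLUME_FROM_SNAPSHOT;
        }
        if (dest == Kind.SNAPSHOT && src == Kind.VOLUME) {
            return Operation.CREATE_SNAPSHOT;
        }
        if (dest == Kind.SNAPSHOT && src == Kind.SNAPSHOT && !same) {
            return Operation.BACKUP_SNAPSHOT;
        }
        if (dest == Kind.TEMPLATE && src == Kind.SNAPSHOT) {
            return Operation.CREATE_TEMPLATE_FROM_SNAPSHOT;
        }
        throw new UnsupportedOperationException(src + " -> " + dest);
    }

    public static void main(String[] args) {
        System.out.println(decode(URI.create("nfs://a/volume/2/11"),
                                  URI.create("nfs://a/snapshot/2/5")));
        System.out.println(decode(URI.create("nfs://b/volume/2/11"),
                                  URI.create("nfs://a/snapshot/2/5")));
    }
}
```

Every new object type adds another set of branches to decode, so the complexity moves out of the driver interface and into this table.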

>
> } catch (IOException e) {
>
>       // Log and handle the error ...
>
> } finally {
>
>       // Close resources ...
>
> }
>
> }
>
> Depending on the capabilities of the hypervisor's Java API, the temporary
> files may not be required, and an OutputStream could be copied directly to an
> InputStream.
>
> >
> >> or a class that implements a ton of interfaces).  In addition to this
> >> added complexity, this segmented approach prevents the
> implementation
> >> of common, logical storage features such as ACL enforcement and asset
> >
> > This is a good question: how to share code across multiple components.
> > For example, one storage can be used as both primary storage and backup
> > storage. In the current code, the developer needs to implement both
> > primarydataStoredriver and backupdatastoredriver. To share code
> > between these two drivers if needed, I think the developer can write one driver
> > which implements both interfaces.
>
> In my opinion, storage drivers classifying their usage limits functionality and
> composability.  Hence, my thought is that the StorageDevice should describe
> its capabilities -- allowing the various services (e.g. Image, Template, Volume,
> etc) to determine whether or not the passed storage devices can support
> the requested operation.
>
> >
> >> encryption.  With a common representation of a StorageDevice that
> >> operates on the standard Java I/O model, we can layer in
> >> cross-cutting storage operations in a consistent manner.
> >
> > I agree that it would be nice to have a standard device model, like the POSIX file
> > system API in the Unix world. But I haven't figured out how to generalize all the
> > operations on the storage, as I mentioned above.
> > I can see how createVolumeFromTemplate can be generalized as a link
> > API, but what about taking a snapshot? And who will handle the
> > difference between deleting a volume and deleting a snapshot, if they are using
> > the same delete API?
>
> The following is a snippet that would be part of the SnapshotService to take
> a snapshot:
>
>       // Ask the hypervisor to take a snapshot, which yields anInputStream
>       // (e.g. FileInputStream)
>
>       aSnapshotDevice.write(new
> URI("/snapshots/<account_id>/<snapshot_id>"), anInputStream)
>
> Ultimately, a snapshot can be exported to a single file or OutputStream
> which can be written back out to a StorageDevice.  For deleting a snapshot, the
> following snippet would perform the deletion in the SnapshotService:
>
>       // Ask the hypervisor to delete the snapshot ...
>
>       aSnapshotDevice.delete(new
> URI("/snapshots/<account_id>/<snapshot_id>"))
>
> Finally, for deleting a volume, the following snippet would perform the
> deletion in the VolumeService:
>
>       // Ask the hypervisor to delete the volume
>
>       aVolumeDevice.delete(new
> URI("/volumes/<account_id>/<volume_id>"))
>
> In summary, I believe that the opaque operations specified in the
> StorageDevice interface can accomplish these goals if the following
> approaches are employed:
>
>       * Logical, reversible URIs are constructed by the storage services.
> These URIs are translated by the StorageDevice implementation to the
> semantics of the underlying device
>       * The storage service methods break their logic down into a series of
> operations against one or more StorageDevices.  These operations should
> conform to common Java idioms because StorageDevice is built on the
> standard Java I/O model (i.e. InputStream, OutputStream, URI).
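
As a rough illustration of the summary above, here is a sketch of the refined StorageDevice interface together with a toy in-memory implementation. The interface methods follow the list earlier in this message; `InMemoryDevice` and its map-backed storage are assumptions made purely so the example can run, not a proposal for a real driver.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class StorageDeviceSketch {

    enum StorageDeviceType { BLOCK, FILE_SYSTEM, OBJECT }

    interface StorageDevice {
        void read(URI aURI, OutputStream anOutputStream) throws IOException;
        void write(URI aURI, InputStream anInputStream) throws IOException;
        Set<URI> list(URI aURI) throws IOException;
        boolean delete(URI aURI) throws IOException;
        StorageDeviceType getType();
    }

    // Hypothetical device that keeps every resource in a map keyed by logical URI
    static class InMemoryDevice implements StorageDevice {
        private final Map<URI, byte[]> blobs = new TreeMap<>();

        public void read(URI aURI, OutputStream anOutputStream) throws IOException {
            byte[] data = blobs.get(aURI);
            if (data == null) {
                throw new FileNotFoundException(aURI.toString());
            }
            anOutputStream.write(data);
        }

        public void write(URI aURI, InputStream anInputStream) throws IOException {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            // Drain the stream fully before storing it
            while ((n = anInputStream.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            blobs.put(aURI, buffer.toByteArray());
        }

        public Set<URI> list(URI aURI) {
            String path = aURI.getPath();
            String prefix = path.endsWith("/") ? path : path + "/";
            Set<URI> result = new TreeSet<>();
            for (URI key : blobs.keySet()) {
                if (key.getPath().startsWith(prefix)) {
                    result.add(key);
                }
            }
            return result;
        }

        public boolean delete(URI aURI) {
            return blobs.remove(aURI) != null;
        }

        public StorageDeviceType getType() {
            return StorageDeviceType.OBJECT;
        }
    }

    public static void main(String[] args) throws IOException {
        StorageDevice aSnapshotDevice = new InMemoryDevice();
        URI snapshot = URI.create("/snapshots/2/7");
        // Stand-in for the stream a hypervisor would hand back
        aSnapshotDevice.write(snapshot, new ByteArrayInputStream(new byte[] {1, 2, 3}));
        System.out.println(aSnapshotDevice.list(URI.create("/snapshots/2")));
        System.out.println(aSnapshotDevice.delete(snapshot));
    }
}
```

The service code in main never learns how the bytes are stored, which is the point of keeping the device operations opaque.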
>
> Thanks,
> -John
>
> >
> >>
> >> Based on this line of thought, I propose the addition of following
> >> notions to the storage framework:
> >>
> >>   * StorageType (Enumeration)
> >>      * BLOCK (raw block devices such as iSCSI, NBD, etc)
> >>      * FILE_SYSTEM (devices addressable through the filesystem such
> >> as local disks, NFS, etc)
> >>      * OBJECT (object stores such as S3 and Swift)
> >>   * StorageDevice (interface)
> >>       * open(URI aDestinationURI): OutputStream throws IOException
> >>       * write(URI aDestinationURI, OutputStream anOutputStream)
> >> throws IOException
> >>       * list(URI aDestinationURI) : Set<URI> throws IOException
> >>       * delete(URI aDestinationURI) : boolean throws IOException
> >>       * getType() : StorageType
> >>   * UnsupportedStorageDevice (unchecked exception): Thrown when an
> >> unsuitable device type is provided to a storage service.
> >>
> >> All operations on the higher level storage services (e.g.
> >> ImageService) would accept a StorageDevice parameter on their
> >> operations.  Using the type property, services can determine whether
> >> or not the passed device is suitable (e.g. guarding against the
> >> use of an object store such as S3 as a VM disk) -- throwing an
> >> UnsupportedStorageDevice exception when a device is unsuitable for the
> >> requested operation.  The services would then perform all storage
> >> operations through the passed StorageDevice.
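
A minimal sketch of the type guard described above, using the StorageType values and the unchecked UnsupportedStorageDevice exception from the proposal. The `requireSuitable` helper and the `VM_DISK_TYPES` set are hypothetical names, and the rule that VM disks need a block or file system device is taken from the S3 example in the text.

```java
import java.util.EnumSet;
import java.util.Set;

class DeviceTypeGuardSketch {

    enum StorageType { BLOCK, FILE_SYSTEM, OBJECT }

    // Unchecked, as proposed, so services need not declare it
    static class UnsupportedStorageDevice extends RuntimeException {
        UnsupportedStorageDevice(String message) {
            super(message);
        }
    }

    // Assumption: VM disks may only live on block or file system devices
    static final Set<StorageType> VM_DISK_TYPES =
        EnumSet.of(StorageType.BLOCK, StorageType.FILE_SYSTEM);

    // Hypothetical helper a service would call before touching the device
    static void requireSuitable(StorageType actual, Set<StorageType> allowed,
                                String operation) {
        if (!allowed.contains(actual)) {
            throw new UnsupportedStorageDevice(
                operation + " does not support " + actual + " devices");
        }
    }

    public static void main(String[] args) {
        requireSuitable(StorageType.FILE_SYSTEM, VM_DISK_TYPES, "createVolume");
        try {
            requireSuitable(StorageType.OBJECT, VM_DISK_TYPES, "createVolume");
        } catch (UnsupportedStorageDevice e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```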
> >>
> >> One potential gap is security.  I do not know whether or not
> >> authorization decisions are assumed to occur up the stack from the
> >> storage engine or as part of it.
> >>
> >> Thanks,
> >> -John
> >>
> >> P.S. I apologize for taking so long to push my feedback.  I am just
> >> getting back on station from the birth of our second child.
> >
> >
> > Congratulations! Thanks for your great feedback.
> >
> >>
> >> On Dec 28, 2012, at 8:09 PM, Edison Su <Edison.su@citrix.com> wrote:
> >>
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: Marcus Sorensen [mailto:shadowsor@gmail.com]
> >>>> Sent: Friday, December 28, 2012 2:56 PM
> >>>> To: cloudstack-dev@incubator.apache.org
> >>>> Subject: Re: new storage framework update
> >>>>
> >>>> Thanks. I'm trying to picture how this will change the existing code.
> >>>> I think it is something i will need a real example to understand.
> >>>> Currently we pass a
> >>> Yah, the example code is in these files:
> >>> XenNfsConfigurator
> >>> DefaultPrimaryDataStoreDriverImpl
> >>> DefaultPrimaryDatastoreProviderImpl
> >>> VolumeServiceImpl
> >>> DefaultPrimaryDataStore
> >>> XenServerStorageResource
> >>>
> >>> You can start from the volumeServiceTest -> createVolumeFromTemplate
> >>> test case.
> >>>
> >>>> storageFilerTO and/or volumeTO from the serverto the agent, and the
> >>>> agent
> >>> This model has not changed; what changed are the commands sent to the
> >>> resource. Right now, each storage protocol can send its own command
> >>> to the resource.
> >>> All the storage-related commands are put under the
> >>> org.apache.cloudstack.storage.command package. Take
> >>> CopyTemplateToPrimaryStorageCmd as an example:
> >>> it has a field called ImageOnPrimayDataStoreTO, which contains a
> >>> PrimaryDataStoreTO. PrimaryDataStoreTO contains the basic
> >>> information about a primary storage. If you need to send extra
> >>> information to the resource, you can subclass PrimaryDataStoreTO, e.g.
> >>> NfsPrimaryDataStoreTO, which contains the nfs server ip and nfs path. In
> >>> this way, one can write a CLVMPrimaryDataStoreTO, which contains clvm's
> >>> own special information if needed. Using a different TO for each protocol
> >>> can simplify the code and make it easier to add new storage.
> >>>
> >>>> does all of the work. Do we still need things like
> >>>> LibvirtStorageAdaptor to do the work on the agent side of actually
> >>>> managing the volumes/pools and implementing them, connecting them to
> >>>> vms? So in implementing new storage we will need to write both a
> >>>> configurator and potentially a storage adaptor?
> >>>
> >>> Yes, those are the minimal requirements.
> >>>
> >>>> On Dec 27, 2012 6:41 PM, "Edison Su" <Edison.su@citrix.com> wrote:
> >>>>
> >>>>> Hi All,
> >>>>>    Before heading into the holidays, I'd like to give an update on the current
> >>>>> status of the new storage framework since the last collab12.
> >>>>>   1. The class diagram of primary storage has evolved:
> >>>>> https://cwiki.apache.org/confluence/download/attachments/30741569/storage.jpg?version=1&modificationDate=1356640617613
> >>>>>         Highlights of the current design:
> >>>>>         a.  One storage provider can cover multiple storage
> >>>>> protocols for multiple hypervisors. The default storage provider
> >>>>> can cover almost all the current primary storage protocols. In
> >>>>> most cases, you don't need to write a new storage provider;
> >>>>> what you need to do is write a new storage configurator. Writing
> >>>>> a new storage provider requires a lot of code, which we
> >>>>> should avoid as much as possible.
> >>>>>        b. A new type hierarchy, primaryDataStoreConfigurator, is added.
> >>>>> The configurator is a factory for primaryDataStore, which assembles
> >>>>> StorageProtocolTransformer, PrimaryDataStoreLifeCycle and
> >>>>> PrimaryDataStoreDriver for the PrimaryDataStore object, based on the
> >>>>> hypervisor type and the storage protocol.  For example, for nfs
> >>>>> primary storage on xenserver, there is a class called
> >>>>> XenNfsConfigurator, which puts DefaultXenPrimaryDataStoreLifeCycle,
> >>>>> NfsProtocolTransformer and DefaultPrimaryDataStoreDriverImpl into
> >>>>> DefaultPrimaryDataStore. One provider can only have one
> >>>>> configurator for a pair of hypervisor type and storage protocol.
> >>>>> For example, if you want to add a new nfs protocol configurator
> >>>>> for the xenserver hypervisor, you need to write a new storage provider.
> >>>>>       c. A new interface, StorageProtocolTransformer, is added.
> >>>>> The main purpose of this interface is to handle the differences
> >>>>> between storage protocols. It has four methods:
> >>>>>            getInputParamNames: returns a list of parameter names
> >>>>> for a particular protocol. E.g. the NFS protocol has ["server",
> >>>>> "path"], ISCSI has ["iqn", "lun"] etc. The UI shouldn't hardcode these
> >>>>> parameters any more.
> >>>>>            normalizeUserInput: given user input from the UI/API,
> >>>>> validates the input, breaks it apart, then stores the pieces in the
> >>>>> database.
> >>>>>            getDataStoreTO/getVolumeTO: each protocol can have its
> >>>>> own volumeTO and primaryStorageTO. The TO is the object that will be passed
> >>>>> down to the resource; if your storage has extra information you want
> >>>>> to pass to the resource, these two methods are the place you can
> >>>>> override.
> >>>>>       d. All the time-consuming API calls related to storage are
> >>>>> async.
> >>>>>
> >>>>>      2. Minimal functionalities are implemented:
> >>>>>           a. Can register a http template, without SSVM
> >>>>>           b. Can register a NFS primary storage for xenserver
> >>>>>           c. Can download a template into primary storage directly
> >>>>>          d. Can create a volume from a template
> >>>>>
> >>>>>      3. All about test:
> >>>>>          a. The TestNG test framework is used, as it can provide
> >>>>> parameters for each test case. For integration tests, we need to
> >>>>> know the ip address of the hypervisor host, the host uuid (if it's
> >>>>> xenserver), the primary storage url, the template url etc. These
> >>>>> configurations are better parameterized, so for each test
> >>>>> run we don't need to modify the test case itself; instead, we provide
> >>>>> a test configuration file for each test run. The TestNG framework
> >>>>> already has this functionality, I just reuse it.
> >>>>>          b. Every piece of code can be unit tested, which means:
> >>>>>                b.1 the xcp plugin can be unit tested. I wrote a
> >>>>> small python script, called mockxcpplugin.py, which can directly
> >>>>> call the xcp plugin.
> >>>>>                b.2 the direct agent hypervisor resource can be tested.
> >>>>> I wrote a mock agent manager, which can load and initialize the
> >>>>> hypervisor resource, and can also send commands to the resource.
> >>>>>                b.3 a storage integration test maven project is
> >>>>> created, which can test the whole storage subsystem, such as
> >>>>> create volume from template, which includes both the image and volume
> >>>>> components.
> >>>>>          A new section, called "how to test", is added into
> >>>>> https://cwiki.apache.org/confluence/display/CLOUDSTACK/Storage+subsystem+2.0,
> >>>>> please check it out.
> >>>>>
> >>>>>     The code is on the javelin branch; the maven projects whose
> >>>>> names start with cloud-engine-storage-* contain the code related to
> >>>>> the storage subsystem. Most of the primary storage code is in the
> >>>>> cloud-engine-storage-volume project.
> >>>>>      Any feedback/comment is appreciated.
> >>>>>
> >

