mxnet-commits mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [incubator-mxnet] NRauschmayr commented on a change in pull request #15114: Added transform tutorial
Date Sun, 02 Jun 2019 03:52:15 GMT
NRauschmayr commented on a change in pull request #15114: Added transform tutorial
URL: https://github.com/apache/incubator-mxnet/pull/15114#discussion_r289625253
 
 

 ##########
 File path: docs/tutorials/gluon/transforms.md
 ##########
 @@ -0,0 +1,156 @@
+
+# Data Transforms
+
+Creating a [`Dataset`](https://mxnet.incubator.apache.org/api/python/gluon/data.html?highlight=dataset#mxnet.gluon.data.Dataset)
is the starting point of the data pipeline, but we usually need to change samples before passing
them to the network. Gluon `transforms` provide us with a simple way to apply these changes.
We can use out-of-the-box transforms or create our own.
+
+We'll demonstrate this by adjusting samples returned by the [`CIFAR10`](https://mxnet.incubator.apache.org/api/python/gluon/data.html?highlight=cifar#mxnet.gluon.data.vision.datasets.CIFAR10)
dataset and start by importing the relevant modules.
+
+
+```python
+import mxnet as mx
+from matplotlib import pyplot as plt
+from mxnet import image
+from mxnet.gluon import data as gdata, utils
+import numpy as np
+```
+
+After creating our [CIFAR-10 `Dataset`](https://mxnet.incubator.apache.org/api/python/gluon/data.html?highlight=cifar#mxnet.gluon.data.vision.datasets.CIFAR10),
we can inspect a random sample.
+
+
+```python
+dataset = mx.gluon.data.vision.CIFAR10()
+```
+
+
+```python
+sample_idx = 42
+sample_data, sample_label = dataset[sample_idx]
+print("data shape: {}".format(sample_data.shape))
+print("data type: {}".format(sample_data.dtype))
+print("data range: {} to {}".format(sample_data.min().asscalar(),
+                                    sample_data.max().asscalar()))
+print("label: {}".format(sample_label))
+plt.imshow(sample_data.asnumpy())
+```
+
+Our sample looks fine, but we need to make a few changes before using it as an
input to a neural network.
+
+### Using `ToTensor` and `.transform_first`
+
+Ordering of dimensions (sometimes called the data layout) is important for correct usage
of a neural network. Currently our samples are ordered (height, width, channel) but we need
to change this to (channel, height, width) before passing to our network. We also need to
change our data type. Currently it's `uint8`, but we need to change this to `float32`.
+
+MXNet Gluon provides a number of useful `transform`s for common computer vision cases like
this. We will use [`ToTensor`](https://mxnet.incubator.apache.org/api/python/gluon/data.html?highlight=totens#mxnet.gluon.data.vision.transforms.ToTensor)
to change the data layout and convert integers (between 0 and 255) to floats (between 0 and
1). We apply the transform to our `dataset` using the [`transform_first`](https://mxnet.incubator.apache.org/api/python/gluon/data.html?highlight=transform_first#mxnet.gluon.data.Dataset.transform_first)
method. We have 2 elements per sample here (i.e. data and label), so the transform is only
applied to the first element (i.e. data).
+
+Advanced: `transform` (instead of [`transform_first`](https://mxnet.incubator.apache.org/api/python/gluon/data.html?highlight=transform_first#mxnet.gluon.data.Dataset.transform_first))
can be used to transform all elements in the sample.
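
The difference between the two methods can be sketched in plain Python. The helpers `apply_transform_first` and `apply_transform` below are hypothetical stand-ins that only mimic the behavior described above on a list of `(data, label)` tuples; they are not part of MXNet and not Gluon's implementation.

```python
# Hypothetical stand-ins that mimic the described behavior on a plain list.
# These helpers are illustrative only and are not part of MXNet.
def apply_transform_first(samples, fn):
    """Apply fn to the first element of each sample; keep the rest as-is."""
    return [(fn(sample[0]),) + tuple(sample[1:]) for sample in samples]


def apply_transform(samples, fn):
    """Pass all elements of each sample to fn, which returns the new sample."""
    return [fn(*sample) for sample in samples]


samples = [(10, 0), (20, 1)]  # toy (data, label) pairs

# Only the data element is doubled; the labels are left untouched.
first_only = apply_transform_first(samples, lambda data: data * 2)

# The function receives both data and label and returns a new sample.
both = apply_transform(samples, lambda data, label: (data * 2, label + 100))

print(first_only)
print(both)
```

In the first case the function never sees the label, which is why `transform_first` is the usual choice for image-only preprocessing.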
+
+
+```python
+transform_fn = mx.gluon.data.vision.transforms.ToTensor()
+dataset = dataset.transform_first(transform_fn)
+```
+
+
+```python
+sample_data, sample_label = dataset[sample_idx]
+print("data shape: {}".format(sample_data.shape))
+print("data type: {}".format(sample_data.dtype))
+print("data range: {} to {}".format(sample_data.min().asscalar(),
+                                    sample_data.max().asscalar()))
+print("label: {}".format(sample_label))
+```
+
+Our data has changed, while the label has been left untouched.
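
For intuition, the two changes described for `ToTensor` (layout and value range) can be reproduced on a toy array with NumPy. This is a minimal sketch under the assumption that the conversion amounts to a transpose plus a division by 255; the array `hwc_image` is made-up data, and this is not MXNet's own code.

```python
import numpy as np

# Toy 2x2 RGB image in (height, width, channel) layout with uint8 values.
hwc_image = np.array([[[0, 128, 255], [255, 0, 0]],
                      [[0, 255, 0], [0, 0, 255]]], dtype=np.uint8)

# Move the channel axis to the front: (H, W, C) -> (C, H, W).
chw_image = hwc_image.transpose(2, 0, 1)

# Convert to float32 and rescale from [0, 255] to [0, 1].
tensor_like = chw_image.astype(np.float32) / 255.0

print("data shape: {}".format(tensor_like.shape))
print("data type: {}".format(tensor_like.dtype))
print("data range: {} to {}".format(tensor_like.min(), tensor_like.max()))
```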
+
+### `Normalize`
+
+We scaled the values of our data samples between 0 and 1 as part of [`ToTensor`](https://mxnet.incubator.apache.org/api/python/gluon/data.html?highlight=totens#mxnet.gluon.data.vision.transforms.ToTensor
but we may want or need to normalize our data instead: i.e. shift to zero-center and scale
to unit variance. You can do this with the following steps:
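
The quoted hunk ends at this point. As a rough illustration of what zero-centering and unit-variance scaling mean per channel, here is a NumPy sketch. It is an assumption about the intent, not the tutorial's own `Normalize` code; the batch `images` is random placeholder data and the variable names are hypothetical.

```python
import numpy as np

# Placeholder batch in (N, C, H, W) layout with values in [0, 1].
rng = np.random.default_rng(0)
images = rng.random((8, 3, 4, 4)).astype(np.float32)

# Per-channel statistics, computed over the batch and spatial axes.
mean = images.mean(axis=(0, 2, 3))
std = images.std(axis=(0, 2, 3))

# Shift each channel to zero mean and scale it to unit variance.
normalized = (images - mean.reshape(1, 3, 1, 1)) / std.reshape(1, 3, 1, 1)

print(normalized.mean(axis=(0, 2, 3)))  # close to 0 for each channel
print(normalized.std(axis=(0, 2, 3)))   # close to 1 for each channel
```

In practice the mean and standard deviation would be computed once over the training set and then reused for every sample.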
 
 Review comment:
   The link is not displayed correctly: the `ToTensor` link above is missing its closing parenthesis.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
