From "zhangzhaoqi (Jira)" <j...@apache.org>
Date Wed, 26 Feb 2020 12:17:00 GMT
```
[ https://issues.apache.org/jira/browse/SINGA-506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

zhangzhaoqi updated SINGA-506:
------------------------------
Description:
*We are going to support three NLP models: Bidirectional Attention Flow, BERT-Squad, and GPT-2.*

*In total, 19 operators still need to be added, as follows:*
||Operator||Difficulty||Estimate||Description||Bidirectional Attention Flow||BERT-Squad||GPT-2||
|*Transpose*|easy|1h|Transposes the input tensor, similar to numpy.transpose.|T|T|T|
|*ConstantOfShape*|easy|2h|Generates a tensor with the given value and shape.|T| |T|
|*ReduceMax*|easy|4h|Computes the max of the input tensor's elements along the provided axes.|T| | |
|*ReduceMean*|easy|4h|Computes the mean of the input tensor's elements along the provided axes.| |T|T|
|*ReduceSum*|easy|4h|Computes the sum of the input tensor's elements along the provided axes.|T| | |
|*Shape*|easy|2h|Takes a tensor as input and outputs a 1D int64 tensor containing the shape of the input tensor.|T|T|T|
|*Slice*|easy|4h|Produces a slice of the input tensor along multiple axes.|T|T|T|
|*Dropout*|easy|3h|Takes an input floating-point tensor and an input ratio (floating-point scalar), and produces two tensor outputs, output (floating-point tensor) and mask (Tensor<bool>).|T| | |
|*Hardmax*|easy|6h|Computes the hardmax (1 for the first maximum value, and 0 for all others) values for each layer in the batch of the given input.|T| | |
|*NonZero*|easy|12h|Returns the indices of the elements that are non-zero (in row-major order - by dimension).| | |T|
|*Split*|easy|12h|Splits a tensor into a list of tensors along the specified 'axis'.| |T|T|
|*Tile*|easy|1d|Constructs a tensor by tiling a given tensor. This is the same as the tile function in NumPy, but without broadcasting. For example, A = [[1, 2], [3, 4]], B = [1, 2], tile(A, B) = [[1, 2, 1, 2], [3, 4, 3, 4]]| |T| |
|*Ceil*|easy|4h|y = ceil(x)|T| | |
|*Compress*|easy|6h|Selects slices from an input tensor along a given axis where condition evaluates to True for each axis index.|T| | |
|*Gather*|complicated|3d|Given data tensor of rank r >= 1, and indices tensor of rank q, gathers entries of the axis dimension of data (by default outer-most one as axis=0) indexed by indices, and concatenates them|T|T|T|
|*ArgMax*|complicated|2d|Computes the indices of the max elements of the input tensor along the provided axis.|T| | |
|*Cast*|hard|-|Casts the elements of a given input tensor to a data type specified by the 'to' argument and returns an output tensor of the same size in the converted type.|T|T|T|
|*Scan*|hard|2w|Scan can be used to iterate over one or more scan_input tensors, constructing zero or more scan_output tensors. It combines ideas from general recurrences, functional programming constructs such as scan, fold, map, and zip and is intended to enable generalizations of RNN-like constructs for sequence-to-sequence processing.|T| | |
|*CategoryMapper*| |-|Not in the ONNX documentation.|T| | |

*In detail, the operators required by each of the three models are:*

*Bidirectional Attention Flow:*
ArgMax
Cast
CategoryMapper
Ceil
Compress
ConstantOfShape
Dropout
Gather
Hardmax
ReduceMax
ReduceSum
Scan
Shape
Slice
Transpose

*BERT-Squad:*
Slice
Shape
Gather
ReduceMean
Cast
Tile
Transpose
Split

*GPT-2:*
ConstantOfShape
Slice
Shape
Gather
ReduceMean
NonZero
Cast
Transpose
Split

