batchReduceLastAxis (...)
Signature: memAllocator:MemAllocatorT -> reduceFn:(ArrayNDManikinT -> ArrayNDManikinT -> CudaExecItemT list) -> trgt:ArrayNDManikinT -> src:ArrayNDManikinT -> CudaExecItemT list
|
execution items to reduce src over the last axis into trgt
|
blasArg (...)
Signature: memAllocator:(TypeNameT -> int64 -> MemAllocKindT -> MemManikinT) -> manikin:ArrayNDManikinT -> shared:bool -> willOverwrite:bool -> ArrayNDManikinT * BlasTransposeOpT * CudaExecItemT list * bool
|
BLAS input argument passing, so that orientation is preserved.
Can return copy items if deemed necessary.
|
blasArgOperation (...)
Signature: manikin:ArrayNDManikinT -> shared:bool -> willOverwrite:bool -> BlasArgOperation
|
Returns the operation that blasArg will perform.
|
blasTarget manikin
Signature: manikin:ArrayNDManikinT -> ArrayNDManikinT
|
BLAS target argument passing, so that orientation is preserved
|
copyExecItems trgt src
Signature: trgt:ArrayNDManikinT -> src:ArrayNDManikinT -> CudaExecItemT list
|
generates ExecItems to copy src to trgt
|
copyKeepingBroadcasted (...)
Signature: memAllocator:(TypeNameT -> int64 -> MemAllocKindT -> MemManikinT) -> broadcastAllowed:bool list -> src:ArrayNDManikinT -> ArrayNDManikinT * CudaExecItemT list
|
Generates ExecItems to copy src into newly allocated memory in C-order.
Broadcasted dimensions of src for which broadcastAllowed is true are kept broadcasted.
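
The layout rule described above can be illustrated on plain shape and stride lists. This is a minimal sketch, not the library's implementation: it assumes that a broadcasted dimension is represented by a stride of 0, and the function name `stridesKeepingBroadcastSketch` is hypothetical. It only computes the C-order strides and the number of elements to allocate; it does not generate any ExecItems.

```fsharp
// Hypothetical helper for illustration; works on plain lists, not on ArrayNDManikinT.
// Assumption: a dimension is broadcasted when its source stride is 0.
let stridesKeepingBroadcastSketch
        (shape: int64 list)
        (stride: int64 list)
        (broadcastAllowed: bool list) : int64 list * int64 =
    // A dimension stays broadcasted only if it is broadcasted in the source
    // and the caller allows it for that dimension.
    let keep = List.map2 (fun s allowed -> s = 0L && allowed) stride broadcastAllowed
    // Walk from the last axis to the first, accumulating the C-order stride.
    let folder (size, kept) (acc, strides) =
        if kept then acc, 0L :: strides      // kept broadcasted: needs no storage
        else acc * size, acc :: strides      // materialized: contiguous in C-order
    let nElems, newStride = List.foldBack folder (List.zip shape keep) (1L, [])
    newStride, nElems

// Shape [2; 3; 4] with the middle axis broadcasted in the source.
let keptStride, keptElems =
    stridesKeepingBroadcastSketch [2L; 3L; 4L] [4L; 0L; 1L] [true; true; true]
// keptStride = [4L; 0L; 1L], keptElems = 8L

let fullStride, fullElems =
    stridesKeepingBroadcastSketch [2L; 3L; 4L] [4L; 0L; 1L] [true; false; true]
// fullStride = [12L; 4L; 1L], fullElems = 24L
```

In the first call the broadcasted axis is kept, so only 8 elements need storage; in the second call broadcasting is not allowed for that axis, so all 24 elements are materialized.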
|
cppTemplateInstantiation tmpl args
Signature: tmpl:string -> args:string list -> string
|
returns the C++ template instantiation code for the given template and argument list
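
As an illustration of what such an instantiation string might look like, here is a minimal sketch with the same signature. It is a hypothetical re-implementation, not the library's code: the exact spacing and the handling of an empty argument list are assumptions, and the template name `MyKernel` is made up for the example.

```fsharp
// Hypothetical re-implementation for illustration only;
// the library's actual output format may differ.
let cppTemplateInstantiationSketch (tmpl: string) (args: string list) : string =
    match args with
    | [] -> tmpl   // assumption: no angle brackets when there are no arguments
    | _ -> sprintf "%s<%s>" tmpl (String.concat ", " args)

// "MyKernel" is a made-up template name.
let inst = cppTemplateInstantiationSketch "MyKernel" ["float"; "2"]
// inst = "MyKernel<float, 2>"
```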
|
dynamicSubtensorTmplAndIdx (...)
Signature: bas:ArrayNDManikinT -> rngs:UExprRngsSpecT -> rngManikins:ArrayNDManikinT list -> ArrayNDArgTmpl * ArrayNDSDArgTmpl * CPPArrayTmpl<IntPtr>
|
|
elementsFuncnameAndArgs (...)
Signature: trgt:ArrayNDManikinT -> cOp:ICudaArgTmpl -> srcViews:'?179146 list -> workSize:int64 list -> string * ICudaArgTmpl list * ICudaArgTmpl list
Type parameters: '?179146
|
function name of elements wrapper and its arguments for the given target, operation and sources
|
elemwiseFuncnameAndArgs (...)
Signature: trgt:ArrayNDManikinT -> cOp:'?179134 -> srcViews:'?179135 list -> string * ICudaArgTmpl list
Type parameters: '?179134, '?179135
|
function name of element-wise wrapper and its arguments for the given target, operation and sources
|
execItemsForCFunc tmplTmpls argTmpls
Signature: tmplTmpls:ICudaArgTmpl list -> argTmpls:ICudaArgTmpl list -> CudaExecItemT list
Type parameters: 'FuncDelegate
|
generate ExecItems to call a C++ template function
|
execItemsForCopyFromDynamicSubtensor (...)
Signature: trgt:ArrayNDManikinT -> src:ArrayNDManikinT -> rngs:UExprRngsSpecT -> rngManikins:ArrayNDManikinT list -> CudaExecItemT list
|
|
execItemsForCopyToDynamicSubtensor (...)
Signature: trgt:ArrayNDManikinT -> rngs:UExprRngsSpecT -> rngManikins:ArrayNDManikinT list -> src:ArrayNDManikinT -> CudaExecItemT list
|
|
execItemsForElements (...)
Signature: compileEnv:CudaCompileEnvT -> trgt:ArrayNDManikinT -> elemFunc:UElemFuncT -> srcViews:ArrayNDManikinT list -> CudaExecItemT list
|
execution items for an elements operation
|
execItemsForElemwise trgt cOp srcViews
Signature: trgt:ArrayNDManikinT -> cOp:'?179137 -> srcViews:'?179138 list -> CudaExecItemT list
Type parameters: '?179137, '?179138
|
execution items for an element-wise operation
|
execItemsForGather trgt src idxViews
Signature: trgt:ArrayNDManikinT -> src:ArrayNDManikinT -> idxViews:'?179140 option list -> CudaExecItemT list
Type parameters: '?179140
|
execution items for a gather operation
|
execItemsForIdxReduceAxis (...)
Signature: memAllocator:'?179161 -> ax:int -> eOpName:string -> initial:ConstSpecT -> trgt:ArrayNDManikinT -> src:ArrayNDManikinT -> CudaExecItemT list
Type parameters: '?179161
|
reduce one axis by applying an operation on indices such as argMax, argMin, ...
|
execItemsForKernel (...)
Signature: cppFuncName:string -> tmplTmpls:ICudaArgTmpl list -> argTmpls:ICudaArgTmpl list -> (int64 * int64 * int64) -> CudaExecItemT list
|
execution item to launch the given kernel template function
|
execItemsForOp compileEnv arg2
Signature: compileEnv:CudaCompileEnvT -> ExecItemsForOpArgs -> CudaExecItemT list
|
returns the execution units for the specified op
|
execItemsForReduce (...)
Signature: memAllocator:MemAllocatorT -> eOpName:string -> initial:ConstSpecT -> trgt:ArrayNDManikinT -> src:ArrayNDManikinT -> CudaExecItemT list
|
execution items to reduce all elements of src into the scalar trgt
|
execItemsForReduceAxis (...)
Signature: memAllocator:MemAllocatorT -> ax:int -> eOpName:string -> initial:ConstSpecT -> trgt:ArrayNDManikinT -> src:ArrayNDManikinT -> CudaExecItemT list
|
reduce one axis by applying an operation such as sum, max, min, ...
|
execItemsForReduction (...)
Signature: trgt:ArrayNDManikinT -> indexed:bool -> cOp:ICudaArgTmpl -> cInitialOp:ICudaArgTmpl -> src:ArrayNDManikinT -> CudaExecItemT list
|
execution items for a reduction operation
|
execItemsForScatter trgt src idxViews
Signature: trgt:ArrayNDManikinT -> src:ArrayNDManikinT -> idxViews:'?179142 option list -> CudaExecItemT list
Type parameters: '?179142
|
execution items for a scatter operation
|
exprToCudaExecUnits compileEnv
Signature: compileEnv:CudaCompileEnvT -> UExprT -> ExecUnitsForExprT
|
generates CUDA execution units that will evaluate the given unified expression
|
needExtra op
Signature: op:'?179112 -> '?179113
Type parameters: '?179112, '?179113
|
failure for extra ops
|
reductionFuncnameAndArgs (...)
Signature: trgt:ArrayNDManikinT -> indexed:bool -> cOp:ICudaArgTmpl -> cInitialOp:ICudaArgTmpl -> src:ArrayNDManikinT -> string * ICudaArgTmpl list
|
function name of reduction wrapper and its arguments for the given target, operation, initial value and source
|
srcReqs cudaEnv arg2
Signature: cudaEnv:CudaCompileEnvT -> SrcReqsArgs -> ChannelReqsT list
|
Computes desired source views given desired target view.
There is no guarantee that the desired source views will be used.
|
toCudaUOp uop
Signature: uop:obj -> ICudaUOp
|
converts an IUOp or an IOp to an ICudaUOp
|
toIExecItem items
Signature: items:'?179169 list -> IExecItem list
Type parameters: '?179169
|
|
tracePostItemsForExpr compileEnv arg2
Signature: compileEnv:'?179167 -> TraceItemsForExprArgs -> CudaExecItemT list
Type parameters: '?179167
|
returns the execution units for tracing the result after execution of the op items
|
tracePreItemsForExpr compileEnv arg2
Signature: compileEnv:'?179165 -> TraceItemsForExprArgs -> CudaExecItemT list
Type parameters: '?179165
|
returns the execution units for tracing before execution of the op items
|
trgtGivenSrcs compileEnv arg2
Signature: compileEnv:CudaCompileEnvT -> TrgtGivenSrcsArgs -> ChannelManikinsAndSharedT
|
computes the definitive target view of an op given its source views
|
trimUnitaryBatchedBlasDims manikin
Signature: manikin:ArrayNDManikinT -> ArrayNDManikinT
|
If all batch dimensions (all dimensions but the last two) of the array are of
size one, a view of the last two dimensions is returned.
Otherwise the original array is returned.
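
The rule above can be sketched on plain shape lists. This is an illustration under stated assumptions, not the library's implementation: it operates on an `int64 list` instead of an ArrayNDManikinT, the name `trimUnitaryBatchDimsSketch` is hypothetical, and returning shapes with fewer than two dimensions unchanged is an assumption.

```fsharp
// Hypothetical helper for illustration; works on a shape list, not on ArrayNDManikinT.
let trimUnitaryBatchDimsSketch (shape: int64 list) : int64 list =
    let nDims = List.length shape
    if nDims < 2 then
        shape   // assumption: nothing to trim without two matrix dimensions
    else
        // Batch dimensions are all dimensions but the last two.
        let batch = List.take (nDims - 2) shape
        if List.forall (fun d -> d = 1L) batch then
            List.skip (nDims - 2) shape   // keep only the last two dimensions
        else
            shape                         // a non-unitary batch dimension: keep as is

let trimmed = trimUnitaryBatchDimsSketch [1L; 1L; 3L; 4L]    // [3L; 4L]
let untouched = trimUnitaryBatchDimsSketch [2L; 1L; 3L; 4L]  // [2L; 1L; 3L; 4L]
```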
|
unsupLoc dev
Signature: dev:ITensorDevice -> '?179115
Type parameters: '?179115
|
|
workDimForElemwise trgt hetero
Signature: trgt:ArrayNDManikinT -> hetero:bool -> int64 * int64 * int64
|
returns the CUDA work dimensions (x, y, z) for an element-wise or elements operation
|
workDimForWorkSize workSize hetero
Signature: workSize:int64 list -> hetero:bool -> int64 * int64 * int64
|
returns the CUDA work dimensions (x, y, z) for work of given size
|