CudaExecUnit

Namespace: SymTensor.Compiler.Cuda

Nested types and modules

Type	Description
BlasArgOperation	The operation the blasArg will perform.

Functions and values

Function or value	Description
`batchReduceLastAxis (...)` Signature: memAllocator:MemAllocatorT -> reduceFn:(ArrayNDManikinT -> ArrayNDManikinT -> CudaExecItemT list) -> trgt:ArrayNDManikinT -> src:ArrayNDManikinT -> CudaExecItemT list	execution items to reduce src over the last axis into trgt
`blasArg (...)` Signature: memAllocator:(TypeNameT -> int64 -> MemAllocKindT -> MemManikinT) -> manikin:ArrayNDManikinT -> shared:bool -> willOverwrite:bool -> ArrayNDManikinT * BlasTransposeOpT * CudaExecItemT list * bool	BLAS input argument passing, so that orientation is preserved. Can return copy items if deemed necessary.
`blasArgOperation (...)` Signature: manikin:ArrayNDManikinT -> shared:bool -> willOverwrite:bool -> BlasArgOperation	Returns the operation that blasArg will perform.
`blasTarget manikin` Signature: manikin:ArrayNDManikinT -> ArrayNDManikinT	BLAS target argument passing, so that orientation is preserved
`copyExecItems trgt src` Signature: trgt:ArrayNDManikinT -> src:ArrayNDManikinT -> CudaExecItemT list	generates ExecItems to copy srcView to trgtView
`copyKeepingBroadcasted (...)` Signature: memAllocator:(TypeNameT -> int64 -> MemAllocKindT -> MemManikinT) -> broadcastAllowed:bool list -> src:ArrayNDManikinT -> ArrayNDManikinT * CudaExecItemT list	Generates ExecItems to copy srcView into newly allocated memory in C-order. Broadcasted dimensions of srcView for which broadcastAllowed is true are kept broadcasted.
`cppTemplateInstantiation tmpl args` Signature: tmpl:string -> args:string list -> string	returns the C++ template instantiation code for the given template and argument list
`dynamicSubtensorTmplAndIdx (...)` Signature: bas:ArrayNDManikinT -> rngs:UExprRngsSpecT -> rngManikins:ArrayNDManikinT list -> ArrayNDArgTmpl * ArrayNDSDArgTmpl * CPPArrayTmpl<IntPtr>
`elementsFuncnameAndArgs (...)` Signature: trgt:ArrayNDManikinT -> cOp:ICudaArgTmpl -> srcViews:'?179146 list -> workSize:int64 list -> string * ICudaArgTmpl list * ICudaArgTmpl list Type parameters: '?179146	function name of elements wrapper and its arguments for the given target, operation and sources
`elemwiseFuncnameAndArgs (...)` Signature: trgt:ArrayNDManikinT -> cOp:'?179134 -> srcViews:'?179135 list -> string * ICudaArgTmpl list Type parameters: '?179134, '?179135	function name of element-wise wrapper and its arguments for the given target, operation and sources
`execItemsForCFunc tmplTmpls argTmpls` Signature: tmplTmpls:ICudaArgTmpl list -> argTmpls:ICudaArgTmpl list -> CudaExecItemT list Type parameters: 'FuncDelegate	generate ExecItems to call a C++ template function
`execItemsForCopyFromDynamicSubtensor (...)` Signature: trgt:ArrayNDManikinT -> src:ArrayNDManikinT -> rngs:UExprRngsSpecT -> rngManikins:ArrayNDManikinT list -> CudaExecItemT list
`execItemsForCopyToDynamicSubtensor (...)` Signature: trgt:ArrayNDManikinT -> rngs:UExprRngsSpecT -> rngManikins:ArrayNDManikinT list -> src:ArrayNDManikinT -> CudaExecItemT list
`execItemsForElements (...)` Signature: compileEnv:CudaCompileEnvT -> trgt:ArrayNDManikinT -> elemFunc:UElemFuncT -> srcViews:ArrayNDManikinT list -> CudaExecItemT list	execution items for an element-wise operation
`execItemsForElemwise trgt cOp srcViews` Signature: trgt:ArrayNDManikinT -> cOp:'?179137 -> srcViews:'?179138 list -> CudaExecItemT list Type parameters: '?179137, '?179138	execution items for an element-wise operation
`execItemsForGather trgt src idxViews` Signature: trgt:ArrayNDManikinT -> src:ArrayNDManikinT -> idxViews:'?179140 option list -> CudaExecItemT list Type parameters: '?179140	execution items for a gather operation
`execItemsForIdxReduceAxis (...)` Signature: memAllocator:'?179161 -> ax:int -> eOpName:string -> initial:ConstSpecT -> trgt:ArrayNDManikinT -> src:ArrayNDManikinT -> CudaExecItemT list Type parameters: '?179161	reduce one axis by applying an operation on indices such as argMax, argMin, ...
`execItemsForKernel (...)` Signature: cppFuncName:string -> tmplTmpls:ICudaArgTmpl list -> argTmpls:ICudaArgTmpl list -> (int64 * int64 * int64) -> CudaExecItemT list	execution item to launch the given kernel template function
`execItemsForOp compileEnv arg2` Signature: compileEnv:CudaCompileEnvT -> ExecItemsForOpArgs -> CudaExecItemT list	returns the execution units for the specified op
`execItemsForReduce (...)` Signature: memAllocator:MemAllocatorT -> eOpName:string -> initial:ConstSpecT -> trgt:ArrayNDManikinT -> src:ArrayNDManikinT -> CudaExecItemT list	execution items to reduce all elements of src into the scalar trgt
`execItemsForReduceAxis (...)` Signature: memAllocator:MemAllocatorT -> ax:int -> eOpName:string -> initial:ConstSpecT -> trgt:ArrayNDManikinT -> src:ArrayNDManikinT -> CudaExecItemT list	reduce one axis by applying an operation such as sum, max, min, ...
`execItemsForReduction (...)` Signature: trgt:ArrayNDManikinT -> indexed:bool -> cOp:ICudaArgTmpl -> cInitialOp:ICudaArgTmpl -> src:ArrayNDManikinT -> CudaExecItemT list	execution items for a reduction operation
`execItemsForScatter trgt src idxViews` Signature: trgt:ArrayNDManikinT -> src:ArrayNDManikinT -> idxViews:'?179142 option list -> CudaExecItemT list Type parameters: '?179142	execution items for a scatter operation
`exprToCudaExecUnits compileEnv` Signature: compileEnv:CudaCompileEnvT -> UExprT -> ExecUnitsForExprT	generates CUDA execution units that will evaluate the given unified expression
`needExtra op` Signature: op:'?179112 -> '?179113 Type parameters: '?179112, '?179113	failure for extra ops
`reductionFuncnameAndArgs (...)` Signature: trgt:ArrayNDManikinT -> indexed:bool -> cOp:ICudaArgTmpl -> cInitialOp:ICudaArgTmpl -> src:ArrayNDManikinT -> string * ICudaArgTmpl list	function name of reduction wrapper and its arguments for the given target, operation, initial value and source
`srcReqs cudaEnv arg2` Signature: cudaEnv:CudaCompileEnvT -> SrcReqsArgs -> ChannelReqsT list	Computes desired source views given desired target view. There is no guarantee that the desired source views will be used.
`toCudaUOp uop` Signature: uop:obj -> ICudaUOp	converts an IUOp or an IOp to an ICudaUOp
`toIExecItem items` Signature: items:'?179169 list -> IExecItem list Type parameters: '?179169
`tracePostItemsForExpr compileEnv arg2` Signature: compileEnv:'?179167 -> TraceItemsForExprArgs -> CudaExecItemT list Type parameters: '?179167	returns the execution units for tracing the result after execution of the op items
`tracePreItemsForExpr compileEnv arg2` Signature: compileEnv:'?179165 -> TraceItemsForExprArgs -> CudaExecItemT list Type parameters: '?179165	returns the execution units for tracing before execution of the op items
`trgtGivenSrcs compileEnv arg2` Signature: compileEnv:CudaCompileEnvT -> TrgtGivenSrcsArgs -> ChannelManikinsAndSharedT	computes the definitive target view of an op given its source views
`trimUnitaryBatchedBlasDims manikin` Signature: manikin:ArrayNDManikinT -> ArrayNDManikinT	If all batch dimensions (all dimensions but the last two) of the array are of size one, a view of the last two dimensions is returned. Otherwise the original array is returned.
`unsupLoc dev` Signature: dev:ITensorDevice -> '?179115 Type parameters: '?179115
`workDimForElemwise trgt hetero` Signature: trgt:ArrayNDManikinT -> hetero:bool -> int64 * int64 * int64	returns the CUDA work dimensions (x, y, z) for an element-wise or elements operation
`workDimForWorkSize workSize hetero` Signature: workSize:int64 list -> hetero:bool -> int64 * int64 * int64	returns the CUDA work dimensions (x, y, z) for work of given size
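
The rule described for `trimUnitaryBatchedBlasDims` in the table above can be illustrated on plain shapes. The following is a minimal sketch, not the module's implementation: `trimUnitaryBatchDimsOfShape` is a hypothetical helper that works on an `int64 list` shape instead of an `ArrayNDManikinT`, and it assumes that a shape with two or fewer dimensions has no batch dimensions and is returned unchanged.

```fsharp
// Hypothetical stand-in for illustration only: it operates on a plain shape
// (int64 list), not on ArrayNDManikinT, and does not create a real view.
let trimUnitaryBatchDimsOfShape (shape: int64 list) : int64 list =
    let nDims = List.length shape
    if nDims <= 2 then
        // Assumption: with at most two dimensions there is nothing to trim.
        shape
    else
        // Batch dimensions are all dimensions but the last two.
        let batchDims, matrixDims = List.splitAt (nDims - 2) shape
        if List.forall (fun d -> d = 1L) batchDims then
            // Every batch dimension has size one: keep only the last two dimensions.
            matrixDims
        else
            // At least one batch dimension is non-unitary: keep the original shape.
            shape
```

Under these assumptions, a shape of `[1L; 1L; 3L; 4L]` is trimmed to `[3L; 4L]`, while `[2L; 1L; 3L; 4L]` is returned as is because its first batch dimension has size two.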
