TVM
TVM is a kind code generation and testing tools target at specific architecture. It improves the efficiency of code optimization on CNN tasks on various platforms.
This blog records my reading of the source code and tracking of the execution (and even modification) of TVM.
TVM provides operator description / scheduler (blocking and partition) interface. Following the tensor_expr_get_started.ipynb
for an example.
Python and C++ connect
From $TVM/python/tvm/init.py, we could see they are from the api.py in the same folder, which actually a wrapper of C++ function(all function begin with _api_internal). We can find the following note in the python file.
The functions in this namespace are automatically exported from C++ side via PackedFunc
that is registered by “TVM_REGISTER_” macro. This way makes calling Python functions from C++
side very easily.
Each string starts with “_” in the “TVM_REGISTER_” macro is an internal API. You can find
all the functions in “api_lang.cc”, “api_base.cc”, “api_arith.cc” and “api_ir.cc” under “src/api”.
It seems that the python files in the $TVM/python/tvm folder are highly related with these cc files. And the following c++ api is wrapper in these python files. One could find the related functions in file:
from . import tensor
from . import arith —> api_arith.cc
from . import expr —> api_ir.cc
from . import make —> api_ir.cc
from . import ir_pass
from . import codegen
from . import container
from . import schedule
from . import module
from . import node
from . import attrs
from . import ir_builder
from . import target
from . import generic —> api_ir.cc
from . import hybrid
from . import testing
from . import errorfrom .api import *
There are TVM_REGISTER_API and TVM_REGISTER_NODE_TYPE macros. We first focus on those on the src/api folder ones.
grep -r TVM_REGISTER_ src/api
we get (141 lines):
src/api/api_test.cc:TVM_REGISTER_NODE_TYPE(TestAttrs);
src/api/api_test.cc:TVM_REGISTER_API(“_nop”)
src/api/api_test.cc:TVM_REGISTER_API(“_test_wrap_callback”)
src/api/api_test.cc:TVM_REGISTER_API(“_test_raise_error_callback”)
src/api/api_test.cc:TVM_REGISTER_API(“_test_check_eq_callback”)
src/api/api_test.cc:TVM_REGISTER_API(“_context_test”)
src/api/api_test.cc:TVM_REGISTER_API(“_ErrorTest”)
src/api/api_test.cc:TVM_REGISTER_API(“_ndarray_use_count”)
src/api/api_ir.cc:TVM_REGISTER_API(“_Var”)
src/api/api_ir.cc:TVM_REGISTER_API(“make.abs”)
src/api/api_ir.cc:TVM_REGISTER_API(“make.floor”)
src/api/api_ir.cc:TVM_REGISTER_API(“make.ceil”)
src/api/api_ir.cc:TVM_REGISTER_API(“make.round”)
src/api/api_ir.cc:TVM_REGISTER_API(“make.trunc”)
src/api/api_ir.cc:TVM_REGISTER_API(“make._cast”)
src/api/api_ir.cc:TVM_REGISTER_API(“make._range_by_min_extent”)
src/api/api_ir.cc:TVM_REGISTER_API(“make.For”)
src/api/api_ir.cc:TVM_REGISTER_API(“make.Load”)
src/api/api_ir.cc:TVM_REGISTER_API(“make.Store”)
src/api/api_ir.cc:TVM_REGISTER_API(“make.Realize”)
src/api/api_ir.cc:TVM_REGISTER_API(“make.Call”)
src/api/api_ir.cc:TVM_REGISTER_API(“make.CommReducer”)
src/api/api_ir.cc: TVM_REGISTER_API(“make.”#Node) \
src/api/api_ir.cc: TVM_REGISTER_API(“make.”#Node) \
src/api/api_ir.cc: TVM_REGISTER_API(“make.”#Node) \
src/api/api_ir.cc: TVM_REGISTER_API(“make.”#Node) \
src/api/api_ir.cc: TVM_REGISTER_API(“make.”#Node) \
src/api/api_ir.cc: TVM_REGISTER_API(“make.”#Node) \
src/api/api_ir.cc: TVM_REGISTER_API(“make.”#Node) \
src/api/api_codegen.cc:TVM_REGISTER_API(“codegen._Build”)
src/api/api_codegen.cc:TVM_REGISTER_API(“module._PackImportsToC”)
src/api/api_base.cc:TVM_REGISTER_API(“_format_str”)
src/api/api_base.cc:TVM_REGISTER_API(“_raw_ptr”)
src/api/api_base.cc:TVM_REGISTER_API(“_save_json”)
src/api/api_base.cc:TVM_REGISTER_API(“_load_json”)
src/api/api_base.cc:TVM_REGISTER_API(“_TVMSetStream”)
src/api/api_base.cc:TVM_REGISTER_API(“_save_param_dict”)
src/api/api_lang.cc:TVM_REGISTER_API(“_min_value”)
src/api/api_lang.cc:TVM_REGISTER_API(“_max_value”)
src/api/api_lang.cc:TVM_REGISTER_API(“_const”)
src/api/api_lang.cc:TVM_REGISTER_API(“_str”)
src/api/api_lang.cc:TVM_REGISTER_API(“_Array”)
src/api/api_lang.cc:TVM_REGISTER_API(“_ArrayGetItem”)
src/api/api_lang.cc:TVM_REGISTER_API(“_ArraySize”)
src/api/api_lang.cc:TVM_REGISTER_API(“_Map”)
src/api/api_lang.cc:TVM_REGISTER_API(“_MapSize”)
src/api/api_lang.cc:TVM_REGISTER_API(“_MapGetItem”)
src/api/api_lang.cc:TVM_REGISTER_API(“_MapCount”)
src/api/api_lang.cc:TVM_REGISTER_API(“_MapItems”)
src/api/api_lang.cc:TVM_REGISTER_API(“Range”)
src/api/api_lang.cc:TVM_REGISTER_API(“_Buffer”)
src/api/api_lang.cc:TVM_REGISTER_API(“_BufferAccessPtr”)
src/api/api_lang.cc:TVM_REGISTER_API(“_BufferVLoad”)
src/api/api_lang.cc:TVM_REGISTER_API(“_BufferVStore”)
src/api/api_lang.cc:TVM_REGISTER_API(“_Layout”)
src/api/api_lang.cc:TVM_REGISTER_API(“_LayoutIndexOf”)
src/api/api_lang.cc:TVM_REGISTER_API(“_LayoutFactorOf”)
src/api/api_lang.cc:TVM_REGISTER_API(“_LayoutNdim”)
src/api/api_lang.cc:TVM_REGISTER_API(“_LayoutGetItem”)
src/api/api_lang.cc:TVM_REGISTER_API(“_BijectiveLayout”)
src/api/api_lang.cc:TVM_REGISTER_API(“_BijectiveLayoutForwardIndex”)
src/api/api_lang.cc:TVM_REGISTER_API(“_BijectiveLayoutBackwardIndex”)
src/api/api_lang.cc:TVM_REGISTER_API(“_BijectiveLayoutForwardShape”)
src/api/api_lang.cc:TVM_REGISTER_API(“_BijectiveLayoutBackwardShape”)
src/api/api_lang.cc:TVM_REGISTER_API(“_Tensor”)
src/api/api_lang.cc:TVM_REGISTER_API(“_TensorIntrin”)
src/api/api_lang.cc:TVM_REGISTER_API(“_TensorIntrinCall”)
src/api/api_lang.cc:TVM_REGISTER_API(“_TensorEqual”)
src/api/api_lang.cc:TVM_REGISTER_API(“_TensorHash”)
src/api/api_lang.cc:TVM_REGISTER_API(“_Placeholder”)
src/api/api_lang.cc:TVM_REGISTER_API(“_ComputeOp”)
src/api/api_lang.cc:TVM_REGISTER_API(“_ScanOp”)
src/api/api_lang.cc:TVM_REGISTER_API(“_TensorComputeOp”)
src/api/api_lang.cc:TVM_REGISTER_API(“_ExternOp”)
src/api/api_lang.cc:TVM_REGISTER_API(“_HybridOp”)
src/api/api_lang.cc:TVM_REGISTER_API(“_OpGetOutput”)
src/api/api_lang.cc:TVM_REGISTER_API(“_OpNumOutputs”)
src/api/api_lang.cc:TVM_REGISTER_API(“_OpInputTensors”)
src/api/api_lang.cc:TVM_REGISTER_API(“_IterVar”)
src/api/api_lang.cc:TVM_REGISTER_API(“_CreateSchedule”)
src/api/api_lang.cc:TVM_REGISTER_API(“_StageSetScope”)
src/api/api_lang.cc:TVM_REGISTER_API(“_StageBind”)
src/api/api_lang.cc:TVM_REGISTER_API(“_StageSplitByFactor”)
src/api/api_lang.cc:TVM_REGISTER_API(“_StageSplitByNParts”)
src/api/api_lang.cc:TVM_REGISTER_API(“_StageFuse”)
src/api/api_lang.cc:TVM_REGISTER_API(“_StageComputeAt”)
src/api/api_lang.cc:TVM_REGISTER_API(“_StageComputeInline”)
src/api/api_lang.cc:TVM_REGISTER_API(“_StageComputeRoot”)
src/api/api_lang.cc:TVM_REGISTER_API(“_StageReorder”)
src/api/api_lang.cc:TVM_REGISTER_API(“_StageTile”)
src/api/api_lang.cc:TVM_REGISTER_API(“_StageEnvThreads”)
src/api/api_lang.cc:TVM_REGISTER_API(“_StageSetStorePredicate”)
src/api/api_lang.cc:TVM_REGISTER_API(“_StageUnroll”)
src/api/api_lang.cc:TVM_REGISTER_API(“_StageVectorize”)
src/api/api_lang.cc:TVM_REGISTER_API(“_StageTensorize”)
src/api/api_lang.cc:TVM_REGISTER_API(“_StageParallel”)
src/api/api_lang.cc:TVM_REGISTER_API(“_StagePragma”)
src/api/api_lang.cc:TVM_REGISTER_API(“_StagePrefetch”)
src/api/api_lang.cc:TVM_REGISTER_API(“_StageStorageAlign”)
src/api/api_lang.cc:TVM_REGISTER_API(“_StageDoubleBuffer”)
src/api/api_lang.cc:TVM_REGISTER_API(“_StageOpenGL”)
src/api/api_lang.cc:TVM_REGISTER_API(“_ScheduleNormalize”)
src/api/api_lang.cc:TVM_REGISTER_API(“_ScheduleCreateGroup”)
src/api/api_lang.cc:TVM_REGISTER_API(“_ScheduleCacheRead”)
src/api/api_lang.cc:TVM_REGISTER_API(“_ScheduleCacheWrite”)
src/api/api_lang.cc:TVM_REGISTER_API(“_ScheduleRFactor”)
src/api/api_lang.cc:TVM_REGISTER_API(“_CommReducerCombine”)
src/api/dsl_api.cc:TVM_REGISTER_GLOBAL(“dsl_api.singleton”)
src/api/api_pass.cc:TVM_REGISTER_API(“ir_pass.Simplify”)
src/api/api_pass.cc:TVM_REGISTER_API(“ir_pass.CanonicalSimplify”)
src/api/api_pass.cc:TVM_REGISTER_API(“ir_pass.Substitute”)
src/api/api_pass.cc:TVM_REGISTER_API(“ir_pass.Equal”)
src/api/api_pass.cc:TVM_REGISTER_API(“ir_pass.StorageFlatten”)
src/api/api_pass.cc:TVM_REGISTER_API(“ir_pass.AttrsEqual”)
src/api/api_pass.cc:TVM_REGISTER_API(“ir_pass.AttrsHash”)
src/api/api_pass.cc:TVM_REGISTER_API(“ir_pass.ExprUseVar”)
src/api/api_pass.cc:TVM_REGISTER_API(“ir_pass.PostOrderVisit”)
src/api/api_pass.cc: TVM_REGISTER_API(“ir_pass.”#PassName) \
src/api/api_pass.cc: TVM_REGISTER_API(“ir_pass.”#PassName) \
src/api/api_pass.cc: TVM_REGISTER_API(“ir_pass.”#PassName) \
src/api/api_pass.cc: TVM_REGISTER_API(“ir_pass.”#PassName) \
src/api/api_pass.cc: TVM_REGISTER_API(“ir_pass.”#PassName) \
src/api/api_schedule.cc:TVM_REGISTER_API(“schedule.AutoInlineElemWise”)
src/api/api_schedule.cc:TVM_REGISTER_API(“schedule.AutoInlineInjective”)
src/api/api_schedule.cc:TVM_REGISTER_API(“schedule.ScheduleOps”)
src/api/api_schedule.cc: TVM_REGISTER_API(“schedule.”#PassName) \
src/api/api_schedule.cc: TVM_REGISTER_API(“schedule.”#PassName) \
src/api/api_arith.cc:TVM_REGISTER_API(“arith.intset_single_point”)
src/api/api_arith.cc:TVM_REGISTER_API(“arith.intset_vector”)
src/api/api_arith.cc:TVM_REGISTER_API(“arith.intset_interval”)
src/api/api_arith.cc:TVM_REGISTER_API(“arith.DetectLinearEquation”)
src/api/api_arith.cc:TVM_REGISTER_API(“arith.DetectClipBound”)
src/api/api_arith.cc:TVM_REGISTER_API(“arith.DeduceBound”)
src/api/api_arith.cc:TVM_REGISTER_API(“arith.DomainTouched”)
src/api/api_arith.cc:TVM_REGISTER_API(“_IntervalSetGetMin”)
src/api/api_arith.cc:TVM_REGISTER_API(“_IntervalSetGetMax”)
src/api/api_arith.cc:TVM_REGISTER_API(“_IntSetIsNothing”)
src/api/api_arith.cc:TVM_REGISTER_API(“_IntSetIsEverything”)
src/api/api_arith.cc:TVM_REGISTER_API(“arith._make_ConstIntBound”)
src/api/api_arith.cc:TVM_REGISTER_API(“arith._make_ModularSet”)
src/api/api_arith.cc:TVM_REGISTER_API(“arith._CreateAnalyzer”)
glimpse
api_base seems to give serilize/deserilze/save/load functions
api_ir seems to provide all possible operators and variable definition and loop control interface. They are worth to read if wanting to get more insight how to descripe an new operator.
api_codegen: only two interface: “codegen._Build” “module._PackImportsToC”
api_pass.cc: different optimization pass for the IR
api_lang: data type/variable and their attr