Author: blueardour 2019-06-12 12:38:14
ARM Compute Library
| Freq | Width | Height | Cin | Cout | Kernel | CPU Direct | CPU GEMM | CPU Winograd | GPU Direct | GPU GEMM | GPU Winograd |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 415 | 28 | 28 | 64 | 64 | 3 | 4894 | 8329 | 8764 | 3702 | 3160 | 2073 |
| 415 | 56 | 56 | 64 | 64 | 3 | 10550 | 8928 | 6389 | 11480 | 9794 | 4364 |
| 415 | 112 | 112 | 64 | 64 | 3 | 61924 | 36448 | 29850 | 49285 | 44137 | 10729 |
| 415 | 224 | 224 | 64 | 64 | 3 | 221346 | 156849 | 90626 | 609088 | 553315 | 31557 |
| 415 | 448 | 448 | 64 | 64 | 3 | 938130 | 707488 | 584700 | 2083466 | 2036609 | 387459 |
| 415 | 56 | 56 | 16 | 16 | 3 | 2078 | 2222 | 2407 | 1502 | 1878 | 1377 |
| 415 | 56 | 56 | 32 | 32 | 3 | 5924 | 4531 | 3929 | 3619 | 2770 | 2256 |
| 415 | 56 | 56 | 64 | 64 | 3 | 20023 | 8636 | 5254 | 12443 | 9336 | 3677 |
| 415 | 56 | 56 | 128 | 128 | 3 | 39891 | 27909 | 16788 | 64911 | 43961 | 18008 |
| 767 | 28 | 28 | 64 | 64 | 3 | 13135 | 4867 | 3867 | 2569 | 2174 | 4876 |
| 767 | 56 | 56 | 64 | 64 | 3 | 20223 | 8148 | 6814 | 7177 | 5901 | 4322 |
| 767 | 112 | 112 | 64 | 64 | 3 | 76097 | 37595 | 28982 | 26862 | 25174 | 6157 |
| 767 | 224 | 224 | 64 | 64 | 3 | 195877 | 167989 | 87208 | 529305 | 684542 | 120056 |
| 767 | 448 | 448 | 64 | 64 | 3 | 790089 | 610065 | 534801 | 3177016 | 488922 | 180504 |
| 767 | 56 | 56 | 16 | 16 | 3 | 2570 | 3334 | 4277 | 3186 | 5256 | 3109 |
| 767 | 56 | 56 | 32 | 32 | 3 | 5593 | 6777 | 4587 | 12356 | 9398 | 4785 |
| 767 | 56 | 56 | 64 | 64 | 3 | 20750 | 9228 | 6636 | 46598 | 37182 | 13012 |
| 767 | 56 | 56 | 128 | 128 | 3 | 39743 | 24085 | 16577 | 225332 | 24101 | 10531 |

Unit: microseconds (us)

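| Index | Freq | Width | Height | Cin | Cout | Kernel | GPU Direct | GPU GEMM | GPU Winograd | Speedup (Binary) | Speedup (Ternary) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 415 | 28 | 28 | 64 | 64 | 3 | 3702 | 3160 | 2073 | 6.7 | 6.1 |
| 2 | 415 | 56 | 56 | 64 | 64 | 3 | 11480 | 9794 | 4364 | 7.9 | 7.9 |
| 3 | 415 | 112 | 112 | 64 | 64 | 3 | 49285 | 44137 | 10729 | 10.8 | 11.1 |
| 4 | 415 | 224 | 224 | 64 | 64 | 3 | 609088 | 553315 | 31557 | 36.3 | 37.3 |
| 5 | 415 | 448 | 448 | 64 | 64 | 3 | 2083466 | 2036609 | 387459 | 33.9 | 35.0 |
| 6 | 415 | 56 | 56 | 16 | 16 | 3 | 1502 | 1878 | 1377 | 5.5 | 5.2 |
| 7 | 415 | 56 | 56 | 32 | 32 | 3 | 3619 | 2770 | 2256 | 5.0 | 4.9 |
| 8 | 415 | 56 | 56 | 64 | 64 | 3 | 12443 | 9336 | 3677 | 7.4 | 7.7 |
| 9 | 415 | 56 | 56 | 128 | 128 | 3 | 64911 | 43961 | 18008 | 11.2 | 12.1 |
| 10 | 415 | 56 | 56 | 256 | 256 | 3 | 799638 | 344602 | 142474 | 24.4 | 27.1 |
| 11 | 767 | 28 | 28 | 64 | 64 | 3 | 2569 | 2174 | 4876 | 5.8 | 6.5 |
| 12 | 767 | 56 | 56 | 64 | 64 | 3 | 7177 | 5901 | 4322 | 7.0 | 8.3 |
| 13 | 767 | 112 | 112 | 64 | 64 | 3 | 26862 | 25174 | 6157 | 10.9 | 11.3 |
| 14 | 767 | 224 | 224 | 64 | 64 | 3 | 529305 | 684542 | 120056 | 76.0 | 84.3 |
| 15 | 767 | 448 | 448 | 64 | 64 | 3 | 3177016 | 488922 | 180504 | 15.0 | 15.5 |
| 16 | 767 | 56 | 56 | 16 | 16 | 3 | 3186 | 5256 | 3109 | 9.2 | 9.7 |
| 17 | 767 | 56 | 56 | 32 | 32 | 3 | 12356 | 9398 | 4785 | 7.4 | 7.9 |
| 18 | 767 | 56 | 56 | 64 | 64 | 3 | 46598 | 37182 | 13012 | 46.5 | 21.7 |
| 19 | 767 | 56 | 56 | 128 | 128 | 3 | 225332 | 24101 | 10531 | 11.0 | 11.8 |
| 20 | 767 | 56 | 56 | 256 | 256 | 3 | 500958 | 88924 | 53745 | 10.6 | 12.7 |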

Note: For certain indices, for example 4, 5, and 14, the speedup is abnormally high. This is not because our implementation really runs that much faster. It is caused by bad corner cases in the ARM Compute Library, where the baseline itself is unusually slow.
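
One rough way to spot such corner cases is to check how the library's time grows as the input gets larger. The sketch below is only an illustration, not the method used to produce the speedup columns: it takes the GPU GEMM times for index 1 to 5 (copied from the table above) and compares the growth in time with the growth in pixel count. The assumption that time should scale roughly with `width * height` at fixed channel count is mine, and the names `GPU_GEMM_415` and `scaling_ratios` are made up for this example.

```python
# GPU GEMM times in microseconds at Freq 415, cin = cout = 64, kernel 3,
# copied from index 1-5 of the table above: (width, height, time_us).
GPU_GEMM_415 = [
    (28, 28, 3160),
    (56, 56, 9794),
    (112, 112, 44137),
    (224, 224, 553315),
    (448, 448, 2036609),
]


def scaling_ratios(rows):
    """Return (width, time ratio / pixel-count ratio) for each step up in size.

    Assumption: at a fixed channel count, convolution time should grow
    roughly in proportion to width * height. A value near 1.0 matches that;
    a much larger value marks a step where the library slowed down far more
    than the extra work would explain.
    """
    ratios = []
    for (w0, h0, t0), (w1, h1, t1) in zip(rows, rows[1:]):
        pixel_ratio = (w1 * h1) / (w0 * h0)
        time_ratio = t1 / t0
        ratios.append((w1, time_ratio / pixel_ratio))
    return ratios


if __name__ == "__main__":
    for width, excess in scaling_ratios(GPU_GEMM_415):
        print(f"width {width:>3}: time ratio / pixel ratio = {excess:.2f}")
```

With these numbers, the step from 112 to 224 stands out: the time grows about three times faster than the pixel count, while the other steps stay close to 1. That lines up with index 4 being one of the abnormal entries.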
