Thursday, March 14, 2019

TI DSP code optimization

Texas Instruments TMS320C6x DSP code optimization

1. Hand-Tuning Loops and Control Code on the TMS320C6000
1.1 loop optimization
 - restrict
 - #pragma MUST_ITERATE(lower_bound, upper_bound, factor)
 - _nassert()
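
A minimal sketch of how the three hints above might be combined on one loop. The function name dot_product, the 8-byte alignment, and the trip-count limits (8, 1024, 4) are assumptions made for illustration, not values from the TI notes. The guards on __TI_COMPILER_VERSION__ are only there so the same file also builds with a non-TI compiler.

```c
#include <assert.h>
#include <stdint.h>

/* _nassert is a TI compiler intrinsic. On other compilers this sketch
   defines it as a no-op so the file still builds. */
#ifndef __TI_COMPILER_VERSION__
#define _nassert(cond) ((void)0)
#endif

/* dot_product is a hypothetical example, not code from the TI notes. */
int dot_product(const short *restrict a, const short *restrict b, int n)
{
    int i;
    int sum = 0;

    /* Debug-build check of the promises made to the compiler below. */
    assert(n >= 8 && n <= 1024 && n % 4 == 0);

    /* Assumption: both arrays start on an 8-byte boundary. */
    _nassert(((uintptr_t)a & 7) == 0);
    _nassert(((uintptr_t)b & 7) == 0);

    /* Assumption: the trip count is 8..1024 and a multiple of 4. */
#ifdef __TI_COMPILER_VERSION__
    #pragma MUST_ITERATE(8, 1024, 4)
#endif
    for (i = 0; i < n; i++) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

All three hints are promises from the programmer, not checks: restrict says a and b never alias, and the other two describe alignment and trip count. If a caller breaks any of them, the optimized loop may produce wrong results.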
1.2 if statement optimization
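
One common way to tune an if statement inside a loop is to rewrite it as a conditional expression with an unconditional store, so the compiler has a better chance of using predicated instructions instead of branches. The sketch below is an illustration under that assumption and is not necessarily the technique the referenced note describes. The function names clamp_branchy and clamp_select are hypothetical.

```c
/* Hypothetical example: clamp each sample to [lo, hi].
   Both versions assume lo <= hi. */

/* Original form: if/else chain with conditional stores. */
void clamp_branchy(short *restrict x, int n, short lo, short hi)
{
    int i;
    for (i = 0; i < n; i++) {
        if (x[i] < lo) {
            x[i] = lo;
        } else if (x[i] > hi) {
            x[i] = hi;
        }
    }
}

/* Rewritten form: one load, two selects, one unconditional store. */
void clamp_select(short *restrict x, int n, short lo, short hi)
{
    int i;
    for (i = 0; i < n; i++) {
        short v = x[i];
        v = (v < lo) ? lo : v;
        v = (v > hi) ? hi : v;
        x[i] = v;
    }
}
```

Both functions compute the same result for lo <= hi. Whether the second form actually schedules better depends on the compiler and options, so it is worth checking the generated code.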

2. Introduction to TMS320C6000 DSP Optimization
2.1 Conditions that prevent a "Pipelined Loop"
 - the loop body exceeds 14 execute packets (one execute packet holds up to 8 instructions)
 - nested loops
 - conditional branches inside loops
 - function calls inside loops
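
Two of the obstacles above, nested loops and function calls inside loops, can sometimes be removed at the source level. The following is a minimal sketch, assuming the data is stored contiguously so the loop nest can be flattened, and assuming the helper is small enough to be inlined. The names scale, scale_rows_nested, and scale_rows_flat are hypothetical.

```c
/* Hypothetical helper. Declaring it static inline lets the compiler
   fold it into the loop body instead of emitting a call. */
static inline int scale(int v, int gain)
{
    return (v * gain) / 16;
}

/* Nested form: two loop levels with a helper call in the body. */
void scale_rows_nested(int *restrict out, const int *restrict in,
                       int rows, int cols, int gain)
{
    int r, c;
    for (r = 0; r < rows; r++) {
        for (c = 0; c < cols; c++) {
            out[r * cols + c] = scale(in[r * cols + c], gain);
        }
    }
}

/* Flattened form: a single loop over rows * cols elements.
   Valid only because the rows are contiguous in memory. */
void scale_rows_flat(int *restrict out, const int *restrict in,
                     int rows, int cols, int gain)
{
    int i;
    int total = rows * cols;
    for (i = 0; i < total; i++) {
        out[i] = scale(in[i], gain);
    }
}
```

The flattened version gives the compiler one long loop to schedule instead of many short ones.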
2.2 "Pipelined Loop" consists of
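 - Prolog: the code above the kernel, which fills the pipeline
 - Kernel: the steady-state loop body, where the pipeline is fully utilized
 - Epilog: the code below the kernel, which drains the pipeline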
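
The three parts can be illustrated by pipelining a loop by hand in C. This is only a sketch of the structure: the compiler does the real work at the instruction level, and the two-stage split (load, then multiply and store) and the function name scale_pipelined are assumptions made for illustration.

```c
/* Hand-pipelined version of: for (i = 0; i < n; i++) out[i] = in[i] * k;
   Stage 1 is the load, stage 2 is the multiply and store. */
void scale_pipelined(int *restrict out, const int *restrict in, int n, int k)
{
    int i;
    int loaded;

    if (n <= 0) {
        return;
    }

    /* Prolog: start iteration 0 (stage 1 only). */
    loaded = in[0];

    /* Kernel: stage 1 of iteration i + 1 overlaps stage 2 of iteration i. */
    for (i = 0; i < n - 1; i++) {
        int next = in[i + 1];
        out[i] = loaded * k;
        loaded = next;
    }

    /* Epilog: finish the last iteration (stage 2 only). */
    out[n - 1] = loaded * k;
}
```

In the kernel both stages are busy on every pass, which is what "fully utilized" refers to. The prolog and epilog each run only part of the stages.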
2.3 ii
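 - ii = iteration interval, the number of cycles between the start of successive loop iterations.
 - the cycle count of a software-pipelined loop can be approximated by ii * number_of_iterations.
 - ii is bounded below by two factors: the loop carried dependency bound and the partitioned resource bound.
 - the loop carried dependency bound: the distance of the largest loop carry path, that is, the longest chain in which one iteration writes a value that a later iteration must read.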
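
The two relations above can be written out as a small calculation. This is a sketch for back-of-the-envelope estimates only. The function names min_ii and estimated_cycles are hypothetical, and the estimate ignores the prolog, the epilog, and any memory stalls.

```c
/* Lower bound on ii: the larger of the two bounds named above. */
int min_ii(int loop_carried_dependency_bound, int partitioned_resource_bound)
{
    return (loop_carried_dependency_bound > partitioned_resource_bound)
               ? loop_carried_dependency_bound
               : partitioned_resource_bound;
}

/* Approximate cycle count of a software-pipelined loop:
   ii * number_of_iterations. */
long estimated_cycles(int ii, long number_of_iterations)
{
    return (long)ii * number_of_iterations;
}
```

For example, with a loop carried dependency bound of 1 and a partitioned resource bound of 2, ii cannot be lower than 2. A loop of 1000 iterations scheduled at ii = 2 then takes roughly 2000 cycles.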


* References
Hand-Tuning Loops and Control Code on the TMS320C6000
Introduction to TMS320C6000 DSP Optimization
TMS320C6000 DSP Optimization Workshop - Texas Instruments Wiki
TMS320C6000 Programmer's Guide (Rev. K) - Texas Instruments
http://processors.wiki.ti.com/index.php/Software_libraries
http://processors.wiki.ti.com/index.php/Profiler
DSP/BIOS Timers and Benchmarking Tips (SPRA829): Profile