"DSPs vs GPUs: Understanding the Differences"
In this presentation, Stephanie Mitchell and Tim Knudtson explore the real difference between DSPs (digital signal processors) and GPUs (graphics processing units).
Slide 1: Is There a Real Difference between DSPs and GPUs?
by Stephanie Mitchell and Tim Knudtson
Slide 2: Main Topics
1. Examples Used in this Presentation
2. D.S.P. Processor
3. Features of the D.S.P. Processor
4. D.S.P. Architecture
5. D.S.P. Programming
6. G.P.U. Processor
7. Features of the G.P.U. Processor
8. G.P.U. Architecture
9. G.P.U. Programming
10. Conclusions
Slide 3: Examples Used in this Presentation
Information is given for the following processors:
1. Digital Signal Processor (DSP): TigerSHARC
2. Graphics Processor (GPU): Nvidia GeForce 6 Series
Slide 4: D.S.P. Processor
A digital signal processor (DSP) is a specialized microprocessor designed specifically for digital signal processing, generally in real time. Programmable DSPs are tuned to efficiently execute the computationally intensive loops that typically characterize digital signal processing algorithms (e.g. FIR and IIR filters).
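To make the "computationally intensive loops" on this slide concrete, here is a minimal Python sketch of a direct-form FIR filter and a first-order IIR filter. The function names, the zero initial conditions, and the particular one-pole IIR form are assumptions chosen for illustration; they are not taken from the presentation or from any DSP toolchain.

```python
def fir_filter(samples, coeffs):
    """Direct-form FIR: y[n] = sum over k of coeffs[k] * x[n - k].

    Samples before the start of the input are treated as zero (assumption).
    """
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * samples[n - k]  # one multiply-accumulate per tap
        out.append(acc)
    return out


def iir_one_pole(samples, a):
    """First-order IIR: y[n] = (1 - a) * x[n] + a * y[n - 1], with y[-1] = 0."""
    out = []
    prev = 0.0
    for x in samples:
        prev = (1.0 - a) * x + a * prev  # feedback uses the previous output
        out.append(prev)
    return out


smoothed = fir_filter([1.0, 2.0, 3.0, 4.0], [0.5, 0.5])
step_response = iir_one_pole([1.0, 1.0, 1.0], 0.5)
```

The inner loop of `fir_filter` is the kind of tight, repetitive multiply-and-add work that a DSP is built to run once per incoming sample.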
Slide 5: Features of the D.S.P. Processor
- Designed for real-time processing
- Optimum performance with streaming data
- Separate program and data memories (Harvard architecture)
- Special instructions for SIMD operations
- No hardware support for multitasking
- The ability to act as a direct memory access device if in a host environment
Slide 6: D.S.P. Architecture
Memory architecture
DSPs often use special memory architectures that are able to fetch multiple data and/or instructions at the same time:
- Harvard architecture
- Modified von Neumann architecture
- Use of direct memory access
- Memory-address calculation unit
Slide 7: D.S.P. Architecture … continued
Data operations
- Saturation arithmetic: operations that produce overflows saturate at the maximum (or minimum) value that the register can hold rather than wrapping around (maximum + 1 does not overflow to minimum, as in many general-purpose CPUs; instead it stays at maximum).
- Fixed-point arithmetic is often used to speed up arithmetic processing.
- Single-cycle operations to increase the benefits of pipelining.
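The difference between wrapping and saturating overflow, and the fixed-point arithmetic mentioned above, can be sketched in a few lines of Python. The 16-bit register width and the Q15 format (values scaled by 2**15) are assumptions made for this example; the slide does not name a specific word size or fixed-point format.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def wrap_add16(a, b):
    """16-bit two's-complement add that wraps around on overflow."""
    return ((a + b + 0x8000) & 0xFFFF) - 0x8000


def sat_add16(a, b):
    """16-bit add that clamps to the register limits instead of wrapping."""
    return max(INT16_MIN, min(INT16_MAX, a + b))


def q15_mul(a, b):
    """Multiply two Q15 fixed-point values (integers scaled by 2**15).

    The 32-bit product is shifted back to Q15 and saturated to 16 bits.
    """
    product = (a * b) >> 15
    return max(INT16_MIN, min(INT16_MAX, product))


wrapped = wrap_add16(INT16_MAX, 1)    # wraps to the minimum value
saturated = sat_add16(INT16_MAX, 1)   # stays at the maximum value
quarter = q15_mul(16384, 16384)       # 0.5 * 0.5 in Q15
```

For an audio or control signal, clamping at full scale is usually a far smaller error than flipping sign, which is why saturation is the preferred overflow behaviour on a DSP.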
Slide 8: D.S.P. Programming
- Floating-point unit integrated directly into the data path
- Special looping hardware: low-overhead or zero-overhead looping capability
- Multiply-accumulate (MAC) operations, which are good for all kinds of matrix operations, such as convolution for filtering, dot product, or even polynomial evaluation
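As a rough illustration of why the MAC operation covers so many algorithms, the sketch below builds a dot product and a polynomial evaluation (Horner's rule) from one `mac` helper. The helper and function names are hypothetical; on a real DSP the multiply-accumulate is a single hardware instruction rather than a Python function call.

```python
def mac(acc, a, b):
    """Multiply-accumulate: returns acc + a * b."""
    return acc + a * b


def dot_product(xs, ys):
    """Dot product as a chain of MACs, one per pair of elements."""
    acc = 0
    for x, y in zip(xs, ys):
        acc = mac(acc, x, y)
    return acc


def horner(coeffs, x):
    """Evaluate a polynomial with Horner's rule.

    coeffs are ordered from the highest-degree term down to the constant.
    Each step is a single MAC: acc = c + acc * x.
    """
    acc = 0
    for c in coeffs:
        acc = mac(c, acc, x)
    return acc


d = dot_product([1, 2, 3], [4, 5, 6])
p = horner([2, 0, 1], 3)  # 2*x**2 + 1 evaluated at x = 3
```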
Slide 9: D.S.P. Programming … continued
- Instructions to increase parallelism: SIMD, VLIW, superscalar architecture
- Specialized instructions for modulo addressing in ring buffers and bit-reversed addressing mode for FFT cross-referencing
- Digital signal processors sometimes use time-stationary encoding to simplify hardware and increase coding efficiency
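The two addressing modes named on this slide can be modelled in software. The following Python sketch shows modulo addressing in a ring buffer and the bit-reversed index order used to reorder FFT data. The class and function names are assumptions for illustration, and the FFT length is assumed to be a power of two; a DSP performs these index updates in its address-generation hardware at no extra cost.

```python
class RingBuffer:
    """Fixed-size delay line whose write index wraps using modulo addressing."""

    def __init__(self, size):
        self.data = [0] * size
        self.head = 0  # index where the next sample will be written

    def push(self, sample):
        self.data[self.head] = sample
        self.head = (self.head + 1) % len(self.data)  # modulo address update

    def recent(self, k):
        """Return the k-th most recent sample; k = 0 is the newest."""
        return self.data[(self.head - 1 - k) % len(self.data)]


def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def bit_reversed_order(n):
    """Index permutation for an n-point radix-2 FFT (n a power of two)."""
    bits = n.bit_length() - 1
    return [bit_reverse(i, bits) for i in range(n)]


buf = RingBuffer(4)
for sample in range(1, 7):  # push 1..6 into a 4-entry buffer
    buf.push(sample)
order = bit_reversed_order(8)
```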
Slide 10: G.P.U. Processor
A Graphics Processing Unit or GPU (also occasionally called a Visual Processing Unit or VPU) is a dedicated graphics rendering device for a personal computer, workstation, or game console. A GPU is the main processing unit in the architecture of every graphics card used in computers or game consoles.
Slide 11: Features of the G.P.U. Processor
- GPU architecture offers a large degree of parallelism.
- It supports Single Instruction, Multiple Data (SIMD).
- Most GPUs have two different types of processing units:
  - Vertex processor (or vertex shader): responsible for mathematical operations
  - Pixel (or fragment) processor: responsible for texturing operations
- The third stage is for detailed processing, and may change from one architecture to another.
Slide 12: G.P.U. Architecture
Processing Unit
- Focus on floating-point math
- fp32 and fp16 precision support for intermediate calculations
- 6 four-wide fp32 vector in shaders and 1 scalar multifunction op
- 16 four-wide fp32 vector in fragment processors plus 16 four-wide fp32 MULs
- Dedicated fp16 normalization hardware
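The practical gap between fp16 and fp32 precision can be seen by rounding the same values to each format. This is a small Python sketch using the standard `struct` module, whose `e` and `f` format characters pack IEEE 754 half and single precision. The helper name `round_to` is hypothetical, and the sketch assumes the GPU's fp16 and fp32 behave like the IEEE formats, which the slide does not state.

```python
import struct


def round_to(fmt, value):
    """Round a Python float to the storage format fmt and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


# 0.1 is not exactly representable in either format; fp16 is further off.
fp32_tenth = round_to("<f", 0.1)
fp16_tenth = round_to("<e", 0.1)
fp32_error = abs(fp32_tenth - 0.1)
fp16_error = abs(fp16_tenth - 0.1)

# fp16 keeps 11 significant bits, so integers above 2048 are spaced apart.
fp16_int = round_to("<e", 4097.0)
fp32_int = round_to("<f", 4097.0)
```

Half precision is often adequate for colour values and normalized vectors, while positions and accumulated sums generally need the wider format.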
Slide 13: G.P.U. Architecture … continued
Memory
- Uses dedicated but standard memory architectures (e.g. DRAM)
- Multiple small independent memory partitions for improved latency
- Memory is used to store buffers and, optionally, textures
- In low-end systems (e.g. Intel 855GM), system memory is shared as the graphics memory
Slide 14: G.P.U. Architecture … continued
Cache
- Texture caches (2 level)
  - Shared between vertex processors and fragment processors
  - Cache processed/filtered textures
- Vertex caches
  - Cache processed and unprocessed vertices
  - Improve computation and fetch performance
- Z and buffer cache and write queues
Slide 15: G.P.U. Programming
Running non-graphical applications on GPUs is known as GPGPU, or General-Purpose Computation on GPUs.
Optimization
- Texture caches (2 level)
- Super-scalability resulting in high parallelism
- SIMD (single instruction, multiple data) structure
- RISC (reduced instruction set computer) architecture
- Neither a board design nor an extra high-speed data link is necessary
- A programmable pipeline (shading and lighting calculations programmed by the user)
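The GPGPU programming style can be pictured as one small function (a "kernel") applied independently to every element of the data. Below is a minimal sketch of that idea in plain Python. The names `saxpy_kernel` and `launch` are hypothetical and do not correspond to any real GPU API; the loop here runs sequentially, whereas a GPU would run many kernel instances at once.

```python
def saxpy_kernel(i, a, x, y):
    """Work for a single element i: a * x[i] + y[i].

    Each call depends only on its own index, so calls can run in any order.
    """
    return a * x[i] + y[i]


def launch(kernel, n, *args):
    """Apply the kernel to indices 0..n-1 (sequential stand-in for a GPU)."""
    return [kernel(i, *args) for i in range(n)]


result = launch(saxpy_kernel, 3, 2.0, [1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
```

Because no element depends on another, the same instruction stream can be issued across many data items, which is the SIMD structure listed above.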
Slide 16: Conclusions
The answer to the title of this presentation: Is There a Real Difference between DSPs and GPUs?
There is no ‘real’ difference, simply because these two technologies are always in competition with one another and both architectures offer a large degree of parallelism at a relatively low cost. But …
Slide 17: Conclusions … continued
Their pipelines have different units.
The GPU is a specialist in gaming graphics, so:
- Vertex Unit: transforms primitives from a global 3D coordinate system into a 2D coordinate system
- Rasterizer Unit: converts primitives into square fragments
- Fragment Unit: computes the final color for each fragment (e.g. texture)
- Composing Unit: combines fragments with the current rendering
The DSP is a specialist in digital signal processing, so:
- Data ALU Unit: performs multiply/accumulate and other ALU operations
- AGU Unit: performs memory operand address calculation
- Program Control Pipeline (PCP) Unit: performs all other instructions (branches, loops, bit tests, etc.)
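The four GPU units listed above can be read as a chain of functions. The Python sketch below is a toy model of that chain for a single point rather than a full primitive. Every formula in it (perspective divide by z, mapping to integer pixels, multiplying a base color by a texel, alpha blending) is an assumption picked for illustration and is not taken from the presentation or from any GPU specification.

```python
def vertex_stage(point, focal=1.0):
    """Vertex unit: project a 3D point to 2D by dividing by depth z."""
    x, y, z = point
    return (focal * x / z, focal * y / z)


def rasterize(p2d, width, height):
    """Rasterizer unit: map coordinates in [-1, 1] to an integer pixel."""
    u, v = p2d
    px = int((u + 1.0) * 0.5 * (width - 1))
    py = int((1.0 - v) * 0.5 * (height - 1))  # screen y grows downward
    return (px, py)


def fragment_stage(base_color, texel):
    """Fragment unit: final color as base color modulated by a texture sample."""
    return tuple(b * t for b, t in zip(base_color, texel))


def compose(src, dst, alpha):
    """Composing unit: blend the new fragment with the current rendering."""
    return tuple(alpha * s + (1.0 - alpha) * d for s, d in zip(src, dst))


projected = vertex_stage((1.0, -1.0, 2.0))
pixel = rasterize(projected, 5, 5)
color = fragment_stage((1.0, 0.5, 0.0), (0.5, 0.5, 1.0))
blended = compose((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), 0.5)
```

The DSP units map onto the earlier examples in a similar way: the data ALU does the multiply-accumulate work, the AGU computes the buffer indices, and the program control unit handles the loop itself.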