Introduction to the ARM Architecture and Cortex M

In this seminar, Joe Bungo, an applications engineer at the ARM University Program, gives an overview of the ARM architecture with a focus on Cortex-M. Learn about ARM Ltd, the history of the architecture, development tools, and more.

Presentation Transcript


1. The ARM Architecture (with focus on Cortex-M3). Joe Bungo, Applications Engineer, ARM University Program.

2. Agenda: Introduction to ARM Ltd; ARM Architecture/Programmer's Model; Data Path and Pipelines; System Design; Development Tools.

3. ARM Ltd. Founded in November 1990, spun out of Acorn Computers, with initial funding from Apple, Acorn and VLSI. Designs the ARM range of RISC processor cores. Licenses ARM core designs to semiconductor partners, who fabricate and sell to their customers; ARM does not fabricate silicon itself. Also develops technologies to assist with the design-in of the ARM architecture: software tools, boards, debug hardware; application software; bus architectures; peripherals, etc.

4. ARM's Activities: Processors; System Level IP (Data Engines, Fabric, 3D Graphics); Physical IP; Software IP; Development Tools; Connected Community. [Diagram labels: memory, SoC.]

5. ARM Connected Community: 700+

6. Huge Range of Applications. Equipment adopting 32-bit ARM microcontrollers: energy-efficient appliances; IR fire detector; intelligent vending; tele-parking; utility meters; exercise machines; intelligent toys.

7. World's Smallest ARM Computer? (University of Michigan.) Wirelessly networked into large-scale sensor arrays. [Figure labels: sensors, timers; Cortex-M0 + 16KB RAM, 65nm; UWB radio antenna; 10 kB storage memory, ~3fW/bit; 12 Ah Li-ion battery; wireless sensor network.]

8. World's Largest ARM Computer? 4200 ARM-powered neutrino detectors: 70 bore holes 2.5km deep; 60 detectors per string, starting 1.5km down; 1km³ of active telescope. Work supported by the National Science Foundation and University of Wisconsin-Madison.

9. From 1mm³ to 1km³. [Figure: device dimensions 2mm, 0.7mm, 1.2mm, 0.35mm; scale labels "10" and "$1000"; market segments: Mobile, Embedded, Consumer, Mobile Computing, Server, Enterprise, PC, Home, HPC.]

10. Agenda: Introduction to ARM Ltd; ARM Architecture/Programmer's Model; Data Path and Pipelines; System Design; Development Tools.

11. ARM Cortex Processors (v7). ARM Cortex-A family (v7-A): applications processors for full OS and 3rd-party applications. ARM Cortex-R family (v7-R): embedded processors for real-time signal processing and control applications. ARM Cortex-M family (v7-M): microcontroller-oriented processors for MCU and SoC applications. [Figure labels: Cortex-R4, Cortex-A8, SC300, Cortex-M1, Cortex-M3, Cortex-A9, Cortex-M0, Cortex-M4, Cortex-A5, Heron R, Cortex-A15; annotations "12k gates...", "... 2.5 GHz", "x1-4", "1-2".]

12. Cortex family. Cortex-A8: architecture v7-A; MMU; AXI; VFP & NEON support. Cortex-R4: architecture v7-R; MPU (optional); AXI; dual issue. Cortex-M3: architecture v7-M; MPU (optional); AHB-Lite & APB.

13. Relative Performance*. (*Represents attainable speeds in 130, 90, 65, or 45nm processes.)

14. Data Sizes and Instruction Sets. The ARM is a 32-bit architecture. When used in relation to the ARM: byte means 8 bits; halfword means 16 bits (two bytes); word means 32 bits (four bytes). Most ARMs implement two instruction sets: the 32-bit ARM instruction set and the 16-bit Thumb instruction set. Jazelle cores can also execute Java bytecode.

15. ARM and Thumb Performance. [Chart: Dhrystone 2.1/sec @ 20MHz against memory width (zero wait state).]

16. The Thumb-2 instruction set. Variable-length instructions: ARM instructions are a fixed length of 32 bits; Thumb instructions are a fixed length of 16 bits; Thumb-2 instructions can be either 16-bit or 32-bit. Thumb-2 gives approximately 26% improvement in code density over ARM. Thumb-2 gives approximately 25% improvement in performance over Thumb.
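
A decoder can tell whether a Thumb-2 instruction is 16-bit or 32-bit from the first halfword alone: if its top five bits are 0b11101, 0b11110 or 0b11111, a second halfword follows. The sketch below models only that length rule; the function names are made up for illustration and are not part of any ARM tool.

```python
def thumb2_instruction_length(first_halfword: int) -> int:
    """Return the instruction length in bytes (2 or 4) from its first halfword."""
    if not 0 <= first_halfword <= 0xFFFF:
        raise ValueError("halfword must fit in 16 bits")
    top5 = first_halfword >> 11
    # These three prefixes mark the first half of a 32-bit encoding.
    return 4 if top5 in (0b11101, 0b11110, 0b11111) else 2


def split_instructions(halfwords):
    """Group a stream of halfwords into per-instruction tuples."""
    instructions = []
    i = 0
    while i < len(halfwords):
        count = thumb2_instruction_length(halfwords[i]) // 2
        instructions.append(tuple(halfwords[i:i + count]))
        i += count
    return instructions
```

For example, the stream 0x4608, 0xF04F, 0x0001, 0xBF00 splits into a 16-bit instruction, a 32-bit instruction, and another 16-bit instruction.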

17. Cortex-M Programmer's Model. Fully programmable in C. Stack-based exception model. Only two processor modes: Thread Mode for user tasks; Handler Mode for OS tasks and exceptions. Vector table contains addresses. [Register diagram: r0-r12, sp (Main and Process), lr, r15 (pc), xPSR.]

18. ARM Cortex-M3. [Diagram of privileged and non-privileged operation. Labels: Cortex-M3 Processor Privilege; Privileged; Non-Privileged; Supervisor; User; Handler Mode; Thread Mode; Application code; OS; System Call (SVCall); Undefined Instruction; Memory; Instructions & Data; Aborts; Interrupts; Reset.]

19. Cortex-M3 Interrupt Handling. One Non-Maskable Interrupt (INTNMI) supported. 1-240 prioritizable interrupts supported; interrupts can be masked; an implementation option selects the number of interrupts supported. The Nested Vectored Interrupt Controller (NVIC) is tightly coupled with the processor core. Interrupt inputs are active HIGH. [Diagram: INTNMI and INTISR[239:0] (1-240 interrupts) feed the NVIC inside the Cortex-M3 processor, next to the processor core.]
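
As a rough illustration of "prioritizable" and "masked", the sketch below picks which pending interrupt would be serviced next. It is a deliberately simplified model, not the NVIC itself: it assumes that a numerically lower priority value means higher urgency and that ties go to the lower interrupt number, and it ignores preemption of a running handler, priority grouping and the core's mask registers. The function and argument names are hypothetical.

```python
def next_interrupt(pending, enabled, priority):
    """Return the IRQ number to service next, or None if nothing is eligible.

    pending, enabled: sets of IRQ numbers (0-239).
    priority: dict mapping IRQ number -> priority value (lower = more urgent).
    """
    # A pending interrupt that is not enabled is masked and cannot be taken.
    candidates = [irq for irq in pending if irq in enabled]
    if not candidates:
        return None
    # Lowest priority value wins; equal priorities fall back to IRQ number.
    return min(candidates, key=lambda irq: (priority[irq], irq))
```

With IRQs 3, 7 and 12 pending, IRQ 7 disabled, and priorities {3: 2, 7: 0, 12: 1}, the model selects IRQ 12: IRQ 7 has the most urgent priority but is masked.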

20. Cortex-M3 Exception Handling. Reset: power-on or system reset. NMI: cannot be stopped or preempted by any exception other than reset. Faults: Hard Fault (default fault, or any fault unable to activate); Memory Manage (MPU violations); Bus Fault (prefetch and memory access violations); Usage Fault (undefined instructions, divide by zero, etc.). SVCall: privileged OS requests. Debug Monitor: debug monitor program. PendSV: pending SVCalls. SysTick Interrupt: internal system timer, e.g., used by an RTOS to periodically check resources or peripherals. External Interrupt: e.g., external peripherals.
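
Each exception listed on this slide has a fixed exception number, and the handler address for exception N sits at byte offset 4 * N in the vector table (offset 0 holds the initial main stack pointer). The numbers below follow the ARMv7-M numbering; the dictionary and helper names are only illustrative.

```python
# ARMv7-M exception numbers for the exceptions named on the slide.
EXCEPTION_NUMBERS = {
    "Reset": 1,
    "NMI": 2,
    "HardFault": 3,
    "MemManage": 4,
    "BusFault": 5,
    "UsageFault": 6,
    "SVCall": 11,
    "DebugMonitor": 12,
    "PendSV": 14,
    "SysTick": 15,
}

FIRST_EXTERNAL_EXCEPTION = 16  # external interrupt 0 is exception 16


def irq_exception_number(irq: int) -> int:
    """Map external interrupt n (0-239) to its exception number."""
    if not 0 <= irq <= 239:
        raise ValueError("Cortex-M3 supports external interrupts 0-239")
    return FIRST_EXTERNAL_EXCEPTION + irq


def vector_offset(exception_number: int) -> int:
    """Byte offset of an exception's handler address in the vector table."""
    if not 1 <= exception_number <= 255:
        raise ValueError("exception number out of range")
    return 4 * exception_number
```

So the SysTick vector is at offset 0x3C and the first external interrupt's vector is at offset 0x40.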

21. Cortex-M3 Program Status Register. One status register consisting of: APSR (Application Program Status Register), holding the ALU flags; IPSR (Interrupt Program Status Register), holding the interrupt/exception number; EPSR (Execution Program Status Register), holding the IT field (If/Then block information) and the ICI field (Interruptible-Continuable Instruction information). xPSR is the composite of the 3 PSRs and is stored on the stack on exception entry. Bit layout: N (31), Z (30), C (29), V (28), Q (27), ICI/IT (26:25), T (24), ICI/IT (15:10), ISR Number (low bits, from bit 0).
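
To make the xPSR layout concrete, here is a small sketch that splits a 32-bit xPSR value into its fields. It assumes the ARMv7-M bit positions (flags N, Z, C, V, Q in bits 31 to 27, T in bit 24, ICI/IT in bits 26:25 and 15:10, exception number in bits 8:0); the function name and the dictionary keys are invented for this example.

```python
def decode_xpsr(xpsr: int) -> dict:
    """Split a 32-bit xPSR value into APSR, EPSR and IPSR fields."""
    xpsr &= 0xFFFFFFFF
    return {
        # APSR: ALU flags
        "N": (xpsr >> 31) & 1,
        "Z": (xpsr >> 30) & 1,
        "C": (xpsr >> 29) & 1,
        "V": (xpsr >> 28) & 1,
        "Q": (xpsr >> 27) & 1,
        # EPSR: Thumb bit and the two ICI/IT bit ranges, kept separate here
        "T": (xpsr >> 24) & 1,
        "ici_it_26_25": (xpsr >> 25) & 0x3,
        "ici_it_15_10": (xpsr >> 10) & 0x3F,
        # IPSR: number of the exception currently being handled (0 = none)
        "isr_number": xpsr & 0x1FF,
    }
```

For instance, decoding 0x6100000F yields Z = 1, C = 1, T = 1 and an exception number of 15, which corresponds to SysTick.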

22. Conditional Execution. If-Then (IT) instruction added (16-bit). Up to 3 additional "then" or "else" conditions may be specified (T or E), making up to 4 following instructions conditional. Any normal ARM condition code can be used. 16-bit instructions in the block do not affect condition code flags, apart from comparison instructions; 32-bit instructions may affect flags (normal rules apply). Current if-then status is stored in the CPSR, so a conditional block may be safely interrupted and returned to. Must NOT branch into or out of an if-then block. Example: ITTET EQ followed by Inst 1 to Inst 4: MOVEQ, ADDEQ, SUBNE, ORREQ.
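
The ITTET EQ example on this slide can be unpacked mechanically: the first instruction always takes the base condition, each following T repeats it and each E takes its inverse. The sketch below does just that expansion. It is not an assembler; the helper name is hypothetical, and the AL condition is left out for simplicity.

```python
# Each condition code paired with its logical inverse.
OPPOSITE = {
    "EQ": "NE", "NE": "EQ", "CS": "CC", "CC": "CS",
    "MI": "PL", "PL": "MI", "VS": "VC", "VC": "VS",
    "HI": "LS", "LS": "HI", "GE": "LT", "LT": "GE",
    "GT": "LE", "LE": "GT",
}


def expand_it(mnemonic: str, cond: str) -> list:
    """Return the condition applied to each instruction in an IT block."""
    mnemonic = mnemonic.upper()
    cond = cond.upper()
    if not mnemonic.startswith("IT") or len(mnemonic) > 5:
        raise ValueError("expected IT followed by up to three T/E letters")
    suffix = mnemonic[2:]
    if any(letter not in "TE" for letter in suffix):
        raise ValueError("only T and E may follow IT")
    if cond not in OPPOSITE:
        raise ValueError("unsupported condition in this sketch: " + cond)
    # First instruction is always "then"; T repeats cond, E inverts it.
    return [cond] + [cond if letter == "T" else OPPOSITE[cond] for letter in suffix]
```

Calling expand_it("ITTET", "EQ") gives EQ, EQ, NE, EQ, matching the MOVEQ, ADDEQ, SUBNE, ORREQ sequence on the slide.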

23. Classes of Instructions (v4T): Load/Store; Miscellaneous; Data Operations; Change of Flow (MOV PC, Rm; Bcc; BL; BLX).

24. Data processing Instructions. Consist of: arithmetic (ADD, ADC, SUB, SBC, RSB, RSC); logical (AND, ORR, EOR, BIC); comparisons (CMP, CMN, TST, TEQ); data movement (MOV, MVN). These instructions only work on registers, NOT memory. Syntax: <Operation>{<cond>}{S} Rd, Rn, Operand2. Comparisons set flags only; they do not specify Rd. Data movement does not specify Rn. The second operand is sent to the ALU via the barrel shifter.
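
The optional S suffix and the comparison instructions update the N, Z, C and V flags. As a sketch of how those flags fall out of 32-bit addition and subtraction, the model below follows the usual add-with-carry formulation, in which subtraction is addition of the inverted operand plus one and C after a subtraction means "no borrow". The function names are illustrative only.

```python
MASK32 = 0xFFFFFFFF


def add_with_carry(x: int, y: int, carry_in: int):
    """Return (32-bit result, flags) for x + y + carry_in."""
    x &= MASK32
    y &= MASK32
    unsigned_sum = x + y + carry_in
    result = unsigned_sum & MASK32
    flags = {
        "N": result >> 31,                          # result is negative
        "Z": int(result == 0),                      # result is zero
        "C": int(unsigned_sum > MASK32),            # unsigned carry out
        "V": ((x ^ result) & (y ^ result)) >> 31,   # signed overflow
    }
    return result, flags


def adds(rn: int, operand2: int):
    """Model ADDS Rd, Rn, Operand2."""
    return add_with_carry(rn, operand2, 0)


def subs(rn: int, operand2: int):
    """Model SUBS Rd, Rn, Operand2 (CMP sets the same flags, discarding Rd)."""
    return add_with_carry(rn, ~operand2 & MASK32, 1)
```

Adding 1 to 0x7FFFFFFF sets N and V (signed overflow), while subtracting 5 from 5 sets Z and C.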

25. Using a Barrel Shifter: The 2nd Operand. Register, optionally with shift operation: the shift value can be either a 5-bit unsigned integer or specified in the bottom byte of another register; used for multiplication by a constant. Immediate value: an 8-bit number, with a range of 0-255, rotated right through an even number of positions; this allows an increased range of 32-bit constants to be loaded directly into registers. [Diagram: Operand 1 goes straight to the ALU; Operand 2 passes through the barrel shifter first; the ALU produces the Result.]
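
The immediate rule above (an 8-bit value rotated right by an even amount) determines which 32-bit constants fit in a single ARM-state data processing instruction. The sketch below searches for such an encoding. It models only the rule as stated on the slide, not the different modified-immediate scheme used by 32-bit Thumb-2 encodings, and its function names are made up for the example.

```python
MASK32 = 0xFFFFFFFF


def ror32(value: int, amount: int) -> int:
    """Rotate a 32-bit value right by amount bits."""
    amount %= 32
    value &= MASK32
    return ((value >> amount) | (value << (32 - amount))) & MASK32


def encode_arm_immediate(value: int):
    """Return (imm8, rotation) with ror32(imm8, rotation) == value, or None."""
    value &= MASK32
    for rotation in range(0, 32, 2):  # only even rotations are allowed
        # Rotating left by `rotation` undoes a right rotation by `rotation`.
        imm8 = ror32(value, 32 - rotation)
        if imm8 <= 0xFF:
            return imm8, rotation
    return None
```

Under this rule 0xFF000000 is encodable as 0xFF rotated right by 8, whereas 0x101 and 0x1FE are not: the first needs nine bits, and the second would need an odd rotation.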

26. Single register data transfer. LDR / STR: word. LDRB / STRB: byte. LDRH / STRH: halfword. LDRSB: signed byte load. LDRSH: signed halfword load. The memory system must support all access sizes. Syntax: LDR{<cond>}{<size>} Rd, <address> and STR{<cond>}{<size>} Rd, <address>, e.g. LDREQB.
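
The difference between the plain and signed loads is what happens to the upper bits of the 32-bit destination register: LDRB and LDRH zero-extend, LDRSB and LDRSH sign-extend. A minimal sketch, assuming little-endian memory and using a hypothetical load helper:

```python
import struct

# struct format for each load: unsigned for LDR/LDRB/LDRH, signed for LDRSB/LDRSH.
LOAD_FORMATS = {
    "LDR": "<I",
    "LDRB": "<B",
    "LDRH": "<H",
    "LDRSB": "<b",
    "LDRSH": "<h",
}


def load(memory: bytes, address: int, kind: str) -> int:
    """Return the 32-bit register value produced by the given load."""
    (value,) = struct.unpack_from(LOAD_FORMATS[kind], memory, address)
    # Masking a negative Python int yields its 32-bit two's-complement pattern.
    return value & 0xFFFFFFFF
```

With the bytes 0x80, 0xFF, 0x34, 0x12 at address 0, LDRB returns 0x00000080 while LDRSB returns 0xFFFFFF80.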

27. Agenda: Introduction to ARM Ltd; ARM Architecture/Programmer's Model; Data Path and Pipelines; System Design; Development Tools.

28. Cortex-M3 Datapath. [Block diagram labels: Register Bank; Mul/Div; Barrel Shifter; ALU (inputs A and B); Writeback; Instruction Decode; Read Data Register; Write Data Register; two Address Register and Address Incrementer pairs; signals INTADDR, I_HADDR, I_HRDATA, D_HADDR, D_HWDATA, D_HRDATA.]

29. Cortex-M3 Pipeline. Cortex-M3 has a 3-stage fetch-decode-execute pipeline, similar to ARM7; Cortex-M3 does more in each stage to increase overall performance. 1st stage: Fetch. 2nd stage: Decode. 3rd stage: Execute. Branch forwarding & speculation; execute-stage branch (ALU branch & load/store branch). [Diagram labels: Fetch (Prefetch); AGU; Instruction Decode & Register Read; Branch; Address Phase & Write Back; Data Phase Load/Store & Branch; Multiply & Divide; Shift; ALU & Branch; Write.]
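
To visualize what a 3-stage pipeline means for throughput, the sketch below lays out which instruction occupies each stage on each clock cycle. It is an idealized model that assumes one cycle per stage and no stalls, branches or multi-cycle instructions, which real Cortex-M3 code will not always satisfy; the names are illustrative.

```python
STAGES = ("Fetch", "Decode", "Execute")


def pipeline_schedule(instructions):
    """Return one dict per cycle mapping stage name -> instruction or None."""
    count = len(instructions)
    if count == 0:
        return []
    schedule = []
    # The last instruction leaves Execute after count + 2 cycles in total.
    for cycle in range(count + len(STAGES) - 1):
        row = {}
        for depth, stage in enumerate(STAGES):
            index = cycle - depth  # instruction that entered Fetch `depth` cycles ago
            row[stage] = instructions[index] if 0 <= index < count else None
        schedule.append(row)
    return schedule
```

For three instructions the model takes five cycles, and in the third cycle all three stages are busy at once: the third instruction is being fetched while the second is decoded and the first executes.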

30. ARM10 vs. ARM11 Pipelines. [Diagram. ARM11 labels: Fetch 1; Fetch 2; Decode; Issue; Shift; ALU; Saturate; Write back; MAC 1; MAC 2; MAC 3; Address; Data Cache 1; Data Cache 2. ARM10 labels: FETCH; ISSUE; DECODE; EXECUTE; MEMORY; WRITE; Branch Prediction; Instruction Fetch; ARM or Thumb Instruction Decode; Reg Read; Shift + ALU; Multiply; Multiply Add; Memory Access; Reg Write.]

31. Full Cortex-A8 Pipeline Diagram: 13-stage integer pipeline; 10-stage NEON pipeline.

32. Agenda: Introduction to ARM Ltd; ARM Architecture/Programmer's Model; Data Path and Pipelines; System Design; Development Tools.

33. An Example AMBA System. AHB: high performance; pipelined; burst support; multiple bus masters. APB: low power; non-pipelined; simple interface. [Diagram labels: high-performance ARM processor; high-bandwidth on-chip RAM; high-bandwidth external memory interface; DMA bus master; APB bridge; keypad; UART; PIO; timer.]

34. Agenda: Introduction to ARM Ltd; ARM Architecture/Programmer's Model; Data Path and Pipelines; System Design; Development Tools.

35. ARM Debug Architecture. EmbeddedICE Logic: provides breakpoints and processor/system access. JTAG interface (ICE): converts debugger commands to JTAG signals. Embedded Trace Macrocell (ETM): compresses real-time instruction and data access trace; contains ICE features (trigger & filter logic). Trace port analyzer (TPA): captures trace in a deep buffer. [Diagram labels: ARM core; ETM; EmbeddedICE Logic; TAP controller; Trace Port; JTAG port; Ethernet; Debugger (+ optional trace tools).]

36. Keil Development Tools for ARM. Includes ARM macro assembler, compilers (ARM RealView C/C++ Compiler, Keil CARM Compiler, or GNU compiler), ARM linker, Keil uVision Debugger and Keil uVision IDE. Keil uVision Debugger accurately simulates on-chip peripherals (I2C, CAN, UART, SPI, interrupts, I/O ports, A/D and D/A converters, PWM, etc.). Evaluation limitations: 16K byte object code + 16K data limitation; some linker restrictions, such as base addresses for code/constants; GNU tools provided are not restricted in any way. http://www.keil.com/demo/

37. Keil Development Tools for ARM

38. University Resources: http://www.arm.com/support/university/ ; University@arm.com

39. Your Future at ARM. Graduate and internship/co-op opportunities. Engineering: memory, validation, performance, DFT, R&D, GPU and more! Sales and Marketing: corporate and technical. Corporate: IT, patents, services (training and support), and human resources. Incredible culture and comprehensive benefit package: competitive reward; work/life balance; personal development; brilliant minds and innovative solutions. Keep in touch! www.arm.com/about/careers

40. TI Panda Board. OMAP4430 Processor: 1 GHz dual-core ARM Cortex-A9 (NEON+VFP); C64x+ DSP; PowerVR SGX 3D GPU; 1080p video support. POP Memory: 1 GB LPDDR2 RAM. USB Powered: < 4W max consumption (OMAP small % of that); many adapter options (car, wall, battery, solar, ..).

41. Project Ideas Using Panda. OS Projects: OS porting to ARM/Cortex (TI OMAP); MythTV system; Super-Panda (stack of Pandas as compute engine and task distribution); Linux applications. NEON Optimization Projects: codec optimization in ffmpeg (pick your favorite codec); voice and image recognition; open-source Flash player optimizations (swfdec).

42. Fin

43. Nokia N95 Multimedia Computer. Connect. Collaborate. Create. Symbian OS v9.2: operating system supporting ARM processor-based mobile devices, developed using ARM RealView Compilation Tools. OMAP 2420 Applications Processor: ARM1136 processor-based SoC, developed using Magma Blast family and winner of 2005 INSIGHT Award for Most Innovative SoC. Mobiclip Video Codec: software video codec for ARM processor-based mobile devices. ST WLAN Solution: ultra-low power 802.11b/g WLAN chip with ARM9 processor-based MAC. S60 3rd Edition: S60 Platform supporting ARM processor-based mobile devices.

44. Beagle Board

45. $149. > 1000 participants and growing. Open access to hardware documentation; wikis, blogs, promotion of community activity; free software; freedom to innovate; personally affordable; active & technical community; opportunity to tinker and learn; instant access to >10 million lines of code; addressing open source community needs; targeting community development.

46. Fast, low power, flexible expansion. OMAP3530 Processor: 600MHz Cortex-A8 (NEON+VFPv3, 16KB/16KB L1$, 256KB L2$); 430MHz C64x+ DSP (32K/32K L1$, 48K L1D, 32K L2); PowerVR SGX GPU; 64K on-chip RAM. POP Memory: 128MB LPDDR RAM; 256MB NAND flash. USB Powered: 2W maximum consumption (OMAP is small % of that); many adapter options (car, wall, battery, solar). Peripheral I/O: DVI-D video out; SD/MMC+; S-Video out; USB 2.0 HS OTG; I2C, I2S, SPI, MMC/SD; JTAG; stereo in/out; alternate power; RS-232 serial.

47. Peripheral I/O: DVI-D video out; SD/MMC+; S-Video out; USB HS OTG; I2C, I2S, SPI, MMC/SD; JTAG; stereo in/out; alternate power; RS-232 serial. Other Features: 4 LEDs (USR0, USR1, PMU_STAT, PWR); 2 buttons (USER, RESET); 4 boot sources (SD/MMC, NAND flash, USB, serial). On-going collaboration at BeagleBoard.org: live chat via IRC for 24/7 community support; links to software projects to download; and more.

48. Project Ideas Using Beagle. OS Projects: OS porting to ARM/Cortex (TI OMAP); MythTV system; Super-Beagle (stack of Beagles as compute engine and task distribution); Linux applications. NEON Optimization Projects: codec optimization in ffmpeg (pick your favorite codec); voice and image recognition; open-source Flash player optimizations (swfdec).