ARC® FPX Floating Point Extensions
Silicon-Efficient Floating Point Extensions for ARC Configurable Cores
ARC FPX Floating Point Extensions add high-performance single and double precision math instructions to the configurable ARC 600 and ARC 700 core families. ARC FPX dramatically accelerates computations where data sets have a large dynamic range and where high precision is required.
When used with ARC's MetaWare®, ARC FPX complies with the IEEE-754 Standard for Binary Floating Point Arithmetic.
ARC cores with FPX provide an ideal solution for systems-on-chip (SoCs) that perform graphics and image processing, complex computations or control algorithms, especially where power and area budgets are constrained.
Highlights
- Very small die area and power: ARC FPX is implemented using the extendibility feature of the ARC 600 and ARC 700 configurable architectures. In contrast to the very large floating point coprocessors required by competing cores, ARC FPX instructions are integrated into the ARC CPU core. ARC's approach achieves floating point performance similar to that of a coprocessor, but with significantly smaller die area and lower power.
- Flexible configuration options: SoC designers can specify single precision extensions only, double precision only, or both, as required in their application.
- Compiler math library optimizes performance: ARC's compiler takes full advantage of ARC FPX instructions to accelerate transcendental and other functions specified in IEEE-754.
ARC's Extendible Architecture
Extendibility is designed into the ARC 600 and ARC 700 configurable processor architectures. It provides the flexibility to add instructions, registers, flags and condition codes to create a processor that is highly tuned for specific applications. ARC makes this powerful feature available to SoC designers, along with an Extension Instruction Automation (EIA) tool to simplify and automate the process of designing and verifying extensions.
Instead of taking the conventional coprocessor approach, ARC chose to use extendibility and the EIA tool to implement hardware floating point instructions. A coprocessor is essentially a second processor core, with its own pipeline, data paths, registers and ALU. In contrast, ARC FPX makes use of the main processor pipeline and data paths, adding only the minimum registers and logic required for the floating point instructions.
The resulting design is much smaller and lower power. It is also much more flexible and further extendible. The advantages of ARC's configurable design approach can be seen by comparing ARC FPX with a competitor's floating point coprocessor of similar performance, as shown in the comparison table at the end of this page.
Features
Single Precision Instructions
- MUL, ADD, SUB implemented directly in hardware
- 3 CPU cycles latency per instruction, pipelined
- 13X – 23X faster than optimized software library
- Peak performance: 1.0 Mflops/MHz
- 13K - 22K gates
Double Precision Instructions
- MUL, ADD, SUB implemented directly in hardware
- 5 CPU cycles per ADD or SUB instruction, 7 CPU cycles per MUL instruction
- 9X – 19X faster than optimized software library
- Peak performance: 200 Kflops/MHz
- 26K - 31K gates
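The peak figures above are consistent with a simple issue-rate calculation. The following is a minimal sketch, assuming that the pipelined single precision unit can start one instruction per cycle and that a double precision ADD or SUB occupies the unit for its full 5 cycles. Those issue intervals, and the helper name `peak_flops_per_mhz`, are assumptions made for illustration rather than statements from the ARC documentation.

```python
# Sketch: peak throughput per MHz of clock, derived from the issue interval.
# Assumption: one floating point operation completes every
# `issue_interval_cycles` cycles when the unit is kept busy.

def peak_flops_per_mhz(issue_interval_cycles):
    """Return peak floating point operations per second for each MHz of clock."""
    cycles_per_second_per_mhz = 1_000_000
    return cycles_per_second_per_mhz / issue_interval_cycles

# Single precision: pipelined, assumed to issue one instruction per cycle.
single = peak_flops_per_mhz(1)
# Double precision ADD or SUB: assumed non-pipelined, 5 cycles each.
double_add = peak_flops_per_mhz(5)
# Double precision MUL: 7 cycles each, shown for comparison.
double_mul = peak_flops_per_mhz(7)

print(f"single precision:         {single / 1e6:.1f} Mflops/MHz")
print(f"double precision ADD/SUB: {double_add / 1e3:.0f} Kflops/MHz")
print(f"double precision MUL:     {double_mul / 1e3:.0f} Kflops/MHz")
```

Under these assumptions the single precision and double precision ADD/SUB results match the quoted peaks of 1.0 Mflops/MHz and 200 Kflops/MHz, which suggests the double precision peak is quoted for ADD and SUB rather than MUL.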
ARC MetaWare® Math Library
- Optimized for ARC FPX hardware
- Provides additional arithmetic and transcendental functions
- Complies with IEEE-754
- Allows relinking of existing object files
ARC FPX vs. Traditional FP Coprocessor
| | ARC™ FPX | Floating Point Coprocessor |
| --- | --- | --- |
| Size in gates | Single precision: 13K - 22K; double precision: 26K - 31K | 100K - 130K |
| Performance | 1.0 Mflops/MHz | 1.3 Mflops/MHz |
| Additional power | None (confirmed with customer benchmarks) | 0.4 mW/MHz typical (0.13 µm, G process) |
| Configurability options | Flexible: SP, DP or both | None, monolithic design |
| Additional math instructions | Can be implemented by the user | Fixed design, not extendible |
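
One way to read the table is as peak throughput per unit of logic. The sketch below is illustrative only: it divides each quoted peak performance figure by the quoted gate-count range. The `Design` class and its `flops_per_mhz_per_kgate` method are hypothetical names. The FPX row uses the single precision gate range on the assumption that 1.0 Mflops/MHz is the single precision peak, and the table does not say which precisions the coprocessor figures cover, so the ratio is a rough indication at best.

```python
from dataclasses import dataclass


@dataclass
class Design:
    """Peak performance and gate-count range, as quoted in the table."""
    name: str
    mflops_per_mhz: float
    min_kgates: float
    max_kgates: float

    def flops_per_mhz_per_kgate(self):
        """Return (lowest, highest) peak flops per MHz per thousand gates."""
        flops = self.mflops_per_mhz * 1_000_000
        # The largest gate count gives the lowest density, and vice versa.
        return flops / self.max_kgates, flops / self.min_kgates


designs = [
    # Assumption: the 1.0 Mflops/MHz peak pairs with the single precision gates.
    Design("ARC FPX (single precision)", 1.0, 13, 22),
    Design("Floating point coprocessor", 1.3, 100, 130),
]

for d in designs:
    lowest, highest = d.flops_per_mhz_per_kgate()
    print(f"{d.name}: {lowest:,.0f} to {highest:,.0f} flops/MHz per Kgate")
```

The same structure could be extended with the power row, for example by multiplying the coprocessor's 0.4 mW/MHz figure by a target clock frequency.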