2022-09-23 10:31:41
ADSP-2106x SHARC® Processor, DSP Microcomputer Series - ADSP-21060/ADSP-21060L
Summary
High-performance signal processor for communications, graphics, and imaging applications; Super Harvard Architecture with four independent buses for dual data fetch, instruction fetch, and non-intrusive I/O; 32-bit IEEE floating-point computation units (multiplier, ALU, and shifter); dual-ported on-chip SRAM and integrated I/O peripherals, forming a complete system-on-a-chip; integrated multiprocessing features.
Key features
40 MIPS, 25 ns instruction rate, single-cycle instruction execution; 120 MFLOPS peak, 80 MFLOPS sustained performance; dual data address generators with modulo and bit-reverse addressing; efficient program sequencing with zero-overhead looping and single-cycle loop setup; IEEE JTAG Standard 1149.1 test access port and on-chip emulation; 240-lead thermally enhanced MQFP package; 225-ball PBGA package; 32-bit single-precision and 40-bit extended-precision IEEE floating-point data formats, or 32-bit fixed-point data format; parallel computations: single-cycle multiply and ALU operations in parallel with dual memory read/writes and instruction fetch; multiply with add and subtract for accelerated FFT butterfly computation; 4 Mbit on-chip SRAM, dual-ported for independent access by the core processor and DMA; off-chip memory interface with 4 gigawords addressable, programmable wait-state generation, and page-mode DRAM support.
DMA controller
10 DMA channels for transfers between ADSP-2106x internal memory and external memory, external peripherals, the host processor, serial ports, or link ports; background DMA transfers at 40 MHz, in parallel with full-speed processor execution; host processor interface to 16-bit and 32-bit microprocessors; host can directly read and write ADSP-2106x internal memory; multiprocessing: glueless connection for scalable DSP multiprocessing architectures; distributed on-chip bus arbitration for parallel bus connection of up to six ADSP-2106xs plus a host; six link ports for point-to-point connectivity and array multiprocessing; 240 Mbytes/s transfer rate over the parallel bus; 240 Mbytes/s transfer rate over the link ports.
serial port
Two 40 Mbit/s synchronous serial ports with companding hardware; independent transmit and receive functions.
General Description
The ADSP-21060 SHARC (Super Harvard Architecture Computer) is a signal processing microcomputer that offers new capabilities and levels of performance. The ADSP-2106x SHARCs are 32-bit processors optimized for high-performance DSP applications. The ADSP-2106x builds on the ADSP-21000 DSP core to form a complete system-on-a-chip, adding a dual-ported on-chip SRAM and integrated I/O peripherals supported by a dedicated I/O bus.
The ADSP-2106x is fabricated in a high-speed, low-power CMOS process. It has a 25 ns instruction cycle time and operates at 40 MIPS. With its on-chip instruction cache, the processor can execute every instruction in a single cycle. Table 1 shows the performance benchmarks of the ADSP-2106x.
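The headline figures follow from the 25 ns cycle time. The sketch below is a back-of-the-envelope check, not a statement from the data sheet: it assumes that the peak figure counts three floating-point operations per cycle (multiply, add, and subtract) and that the sustained figure counts two (a multiply plus one ALU operation). The function names are illustrative.

```python
def instruction_rate_mips(cycle_ns: float) -> float:
    """Millions of instructions per second for single-cycle execution."""
    return 1000.0 / cycle_ns


def mflops(cycle_ns: float, flops_per_cycle: int) -> float:
    """Millions of floating-point operations per second.

    flops_per_cycle is an assumption about how many floating-point
    operations one instruction performs.
    """
    return instruction_rate_mips(cycle_ns) * flops_per_cycle


CYCLE_NS = 25.0
print(instruction_rate_mips(CYCLE_NS))  # 40.0 MIPS
print(mflops(CYCLE_NS, 3))              # 120.0 (assumed peak: mul + add + sub)
print(mflops(CYCLE_NS, 2))              # 80.0 (assumed sustained: mul + ALU op)
```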
The ADSP-2106x SHARC represents a new standard of integration for signal computers, combining a high-performance floating-point DSP core with integrated on-chip system features, including a 4 Mbit SRAM memory, a host processor interface, a DMA controller, serial ports, and link port and parallel bus connectivity for glueless DSP multiprocessing.
Figure 1 shows a block diagram of the ADSP-2106x, illustrating the following architectural features:
computation units (ALU, multiplier, and shifter) with a shared data register file; data address generators (DAG1, DAG2); program sequencer with instruction cache; interval timer; on-chip SRAM; external port for interfacing to off-chip memory and peripherals; host port and multiprocessor interface; DMA controller; serial ports and link ports; JTAG test access port.
Figure 2 shows a typical uniprocessor system. The multiprocessing system is shown in Figure 3.
ADSP-21000 Series Core Architecture
The ADSP-2106x includes the following architectural features of the ADSP-21000 series core. The ADSP-21060 is code- and function-compatible with the ADSP-21061 and ADSP-21062.
Independent, parallel computation units
The arithmetic/logic unit (ALU), multiplier, and shifter all execute single-cycle instructions. The three units are arranged in parallel, maximizing computational throughput. A single multifunction instruction performs parallel ALU and multiplier operations. These computation units support IEEE 32-bit single-precision floating-point, extended-precision 40-bit floating-point, and 32-bit fixed-point data formats.
data register file
The general-purpose data register file is used to transfer data between the computation units and the data buses, and to store intermediate results. This 10-port, 32-register (16 primary, 16 secondary) register file, combined with the ADSP-21000 Harvard architecture, allows unconstrained data flow between the computation units and internal memory.
Single-Cycle Fetch of Instructions and Two Operands
The ADSP-2106x employs an enhanced Harvard architecture in which the data memory (DM) bus transfers data and the program memory (PM) bus transfers both instructions and data (see Figure 1). With its separate program and data memory buses and on-chip instruction cache, the processor can simultaneously fetch two operands and an instruction (from the cache) in a single cycle.
instruction cache
The ADSP-2106x includes an on-chip instruction cache that enables three-bus operation for fetching an instruction and two data values. The cache is selective: only instructions whose fetches conflict with PM bus data accesses are cached. This allows core looping operations, such as digital filter multiply-accumulates and FFT butterfly processing, to execute at full speed.
Data Address Generators with Hardware Circular Buffers
The ADSP-2106x's two data address generators (DAGs) implement circular data buffers in hardware. Circular buffers allow efficient programming of delay lines and other data structures required in digital signal processing, and are commonly used in digital filters and Fourier transforms. The two DAGs of the ADSP-2106x contain enough registers to create up to 32 circular buffers (16 primary register sets, 16 secondary register sets). The DAGs automatically handle address pointer wraparound, reducing overhead, increasing performance, and simplifying implementation. A circular buffer can start and end at any memory location.
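The pointer wraparound that the DAGs perform in hardware can be modeled in software. The following is a minimal sketch of the general technique (post-modify addressing with modulo wraparound), not a description of the actual DAG registers; the class name, attribute names, and addresses are hypothetical.

```python
class CircularPointer:
    """Software model of a post-modify address pointer that wraps in a buffer."""

    def __init__(self, base: int, length: int):
        if length <= 0:
            raise ValueError("length must be positive")
        self.base = base      # first address of the buffer (any location)
        self.length = length  # number of words in the buffer
        self.index = base     # current address

    def post_modify(self, modify: int) -> int:
        """Return the current address, then step by `modify` with wraparound."""
        if abs(modify) > self.length:
            raise ValueError("step larger than buffer length")
        addr = self.index
        nxt = self.index + modify
        if nxt >= self.base + self.length:
            nxt -= self.length   # ran past the end: wrap to the start
        elif nxt < self.base:
            nxt += self.length   # ran before the start: wrap to the end
        self.index = nxt
        return addr


ptr = CircularPointer(base=100, length=4)
print([ptr.post_modify(1) for _ in range(6)])  # [100, 101, 102, 103, 100, 101]
```

In software, each access pays for the comparison and correction; the point of the hardware DAGs is that this wraparound costs no extra instructions.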
Flexible instruction set
The 48-bit instruction word accommodates a variety of parallel operations for concise programming. For example, the ADSP-2106x can conditionally execute a multiply, an add, a subtract, and a branch, all in a single instruction.
ADSP-21060/ADSP-21060L Features
Augmenting the ADSP-21000 series core, the ADSP-21060 adds the following architectural features:
Dual-port on-chip memory
The ADSP-21060 contains 4 Mbits of on-chip SRAM, organized as two blocks of 2 Mbits each, which can be configured for different combinations of code and data storage. Each memory block is dual-ported for single-cycle, independent accesses by the core processor and the I/O processor or DMA controller. The dual-ported memory and separate on-chip buses allow two data transfers from the core and one from I/O, all in a single cycle.
On the ADSP-21060, memory can be configured as a maximum of 128K words of 32-bit data, 256K words of 16-bit data, 80K words of 48-bit instructions (or 40-bit data), or combinations of different word sizes up to 4 Mbits. All memory can be accessed as 16-bit, 32-bit, or 48-bit words.
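The quoted maximum configurations can be checked against the 4 Mbit total with simple arithmetic. The sketch below only verifies that each configuration fits; the names are illustrative. Note that the 48-bit case does not fill the array exactly, and this text does not explain why, so the code draws no conclusion from that.

```python
K = 1024
TOTAL_BITS = 4 * K * K  # 4 Mbit of on-chip SRAM

# (number of words, word width in bits), as quoted in the text
CONFIGS = {
    "32-bit data": (128 * K, 32),
    "16-bit data": (256 * K, 16),
    "48-bit instructions": (80 * K, 48),
}


def bits_used(words: int, width: int) -> int:
    """Total storage in bits for `words` words of `width` bits each."""
    return words * width


for name, (words, width) in CONFIGS.items():
    used = bits_used(words, width)
    print(f"{name}: {used / (K * K):.2f} Mbit, fits: {used <= TOTAL_BITS}")
```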
A 16-bit floating-point storage format is supported, effectively doubling the amount of data that can be stored on-chip. Conversion between 32-bit floating-point and 16-bit floating-point formats is done in one instruction.
Although each memory block can store combinations of code and data, accesses are most efficient when one block stores data, using the DM bus for transfers, and the other block stores instructions and data, using the PM bus for transfers. Using the DM bus and PM bus in this way, with one bus dedicated to each memory block, ensures single-cycle execution with two data transfers. In this case, the instruction must be available in the cache. Single-cycle execution is also maintained when one of the data operands is transferred to or from off-chip memory through the ADSP-2106x's external port.
Off-Chip Memory and Peripheral Interface
The ADSP-2106x's external port provides the processor's interface to off-chip memory and peripherals. The 4-gigaword off-chip address space is included in the ADSP-2106x's unified address space. The separate on-chip buses for PM address, PM data, DM address, DM data, I/O address, and I/O data are multiplexed at the external port to create an external system bus with a single 32-bit address bus and a single 48-bit (or 32-bit) data bus.
The bank select signal is generated by on-chip decoding of high-order address lines, which facilitates addressing of external storage devices. To simplify addressing of page-mode DRAMs, separate control lines are also generated. The ADSP-2106x provides programmable memory wait states and external memory acknowledgement control, allowing interfacing with DRAM and peripherals with variable access, hold, and disable time requirements.
host processor interface
The host interface of the ADSP-2106x allows easy connection to standard 16-bit and 32-bit microprocessor buses with little additional hardware. Asynchronous transfers are supported at speeds up to the full clock rate of the processor. The host interface is accessed through the ADSP-2106x's external port and is memory-mapped into the unified address space. Four DMA channels are available for the host interface; code and data transfers are accomplished with low software overhead.
The host processor requests the ADSP-2106x's external bus using the host bus request (HBR), host bus grant (HBG), and ready (REDY) signals. The host can directly read and write the internal memory of the ADSP-2106x, and can access the DMA channel settings and mailbox registers. Vectored interrupt support is provided for efficient execution of host commands.
DMA controller
The ADSP-2106x's on-chip DMA controller allows zero-overhead data transfers without processor intervention. The DMA controller operates independently of, and invisibly to, the processor core, allowing DMA operations to occur while the core is simultaneously executing its program instructions.
DMA transfers can occur between the ADSP-2106x's internal memory and external memory, external peripherals, or a host processor. DMA transfers can also occur between the ADSP-2106x's internal memory and its serial ports or link ports. DMA transfers between external memory and external peripheral devices are another option. External bus packing to 16-, 32-, or 48-bit words is performed during DMA transfers.
Ten DMA channels are available on the ADSP-2106x: two via the link ports, four via the serial ports, and four via the processor's external port (for host processor, other ADSP-2106xs, memory, or I/O transfers). Four additional link port DMA channels are shared with serial port 1 and the external port. Programs can be downloaded to the ADSP-2106x using DMA transfers. Asynchronous off-chip peripherals can control two DMA channels using the DMA request/grant lines (DMAR1-2, DMAG1-2). Other DMA features include interrupt generation on completion of DMA transfers, and DMA chaining for automatically linked DMA transfers.
serial port
The ADSP-2106x features two synchronous serial ports that provide an inexpensive interface to a wide variety of digital and mixed-signal peripheral devices. The serial ports can operate at the full clock rate of the processor, providing a maximum data rate of 40 Mbit/s each. Independent transmit and receive functions provide greater flexibility for serial communications. Serial port data can be automatically transferred to and from on-chip memory via DMA. Each serial port offers a TDM multichannel mode.
Serial ports can use little-endian or big-endian transfer formats, and word lengths can be selected from 3 to 32 bits. They offer selectable synchronization and transmission modes and selectable μ-law or A-law companding. Serial port clock and frame synchronization can be generated internally or externally.
multiprocessing
The ADSP-2106x offers powerful features tailored to multiprocessing DSP systems. The unified address space (see Figure 4) allows direct interprocessor access to each ADSP-2106x's internal memory. Distributed bus arbitration logic is included on-chip for simple, glueless connection of systems containing up to six ADSP-2106xs and a host processor. Changeover of the bus master incurs only one cycle of overhead. Bus arbitration is selectable as either fixed or rotating priority. Bus lock allows indivisible read-modify-write sequences for semaphores. A vectored interrupt is provided for interprocessor commands. The maximum throughput for interprocessor data transfer is 240 Mbytes/s over the link ports or the external port. Broadcast writes allow simultaneous transmission of data to all ADSP-2106xs and can be used to implement reflective semaphores.
link port
The ADSP-2106x has six 4-bit link ports that provide additional I/O capabilities. Link ports can be clocked twice per cycle, allowing each port to transfer 8 bits per cycle. Link port I/O is particularly useful for point-to-point interprocessor communication in multiprocessing systems.
The link ports can operate independently and simultaneously, with a maximum combined throughput of 240 Mbytes/s. Link port data is packed into 32-bit or 48-bit words and can be read directly by the core processor or DMA-transferred to on-chip memory.
Each link port has its own double-buffered input and output registers. Clock/acknowledge handshaking controls link port transfers. Transfers are programmable as either transmit or receive.
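The 240 Mbytes/s figures quoted for the link ports and the parallel bus are consistent with the port widths and the 40 MHz clock. The sketch below reproduces that arithmetic; it assumes all six link ports run at once at 8 bits per cycle each, and that the external bus moves one 48-bit word per cycle. Function names are illustrative.

```python
CLOCK_HZ = 40_000_000  # 40 MHz processor clock


def link_ports_bytes_per_s(ports: int, bits_per_cycle: int = 8,
                           clock_hz: int = CLOCK_HZ) -> int:
    """Aggregate link port throughput; each 4-bit port is clocked twice per cycle."""
    return ports * bits_per_cycle * clock_hz // 8


def parallel_bus_bytes_per_s(bus_bits: int = 48,
                             clock_hz: int = CLOCK_HZ) -> int:
    """External bus throughput, assuming one full-width word per cycle."""
    return bus_bits * clock_hz // 8


print(link_ports_bytes_per_s(1) / 1e6)    # 40.0 Mbytes/s for a single port
print(link_ports_bytes_per_s(6) / 1e6)    # 240.0 Mbytes/s for all six ports
print(parallel_bus_bytes_per_s() / 1e6)   # 240.0 Mbytes/s on the 48-bit bus
```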
program booting
The internal memory of the ADSP-2106x can be booted at system power-up from an 8-bit EPROM, a host processor, or through one of the link ports. Selection of the boot source is controlled by the BMS (boot memory select), EBOOT (EPROM boot), and LBOOT (link/host boot) pins. Both 32-bit and 16-bit host processors can be used for booting.
development tools
The ADSP-21060 is supported by a complete set of software and hardware development tools, including the EZ-ICE in-circuit emulator, the EZ-KIT, and development software. The SHARC EZ-KIT is a complete low-cost package for DSP evaluation and prototyping. The EZ-KIT contains a PC plug-in card (EZ-LAB®) with an ADSP-21062 (5 V) processor. The EZ-KIT also includes an optimizing compiler, assembler, instruction-level simulator, runtime libraries, diagnostic utilities, and a complete set of example programs.
The same EZ-ICE hardware used for the ADSP-21061/ADSP-21062 can fully emulate the ADSP-21060, with the exception of displaying and modifying the two new SPORT (serial port) registers specific to the ADSP-21061.
Analog Devices ADSP-21000 series development software includes an easy-to-use assembler based on algebraic syntax, an assembly library/librarian, a linker, an instruction-level simulator, an ANSI C optimizing compiler, the CBUG™ C source-level debugger, and a C runtime library (including DSP and math functions). The optimizing compiler includes Numerical C extensions based on the work of the ANSI Numerical C Extensions Group. Numerical C extends the C language with array selections, vector math operations, complex data types, circular pointers, and variably dimensioned arrays. ADSP-21000 series development software is available for PC and Sun platforms.
The ADSP-21060 EZ-ICE emulator uses the IEEE 1149.1 JTAG test access port of the ADSP-21060 processor to monitor and control the target board processor during emulation. EZ-ICE provides full-speed emulation, allowing inspection and modification of memory, registers, and processor stacks. Use of the processor's JTAG interface ensures non-intrusive in-circuit emulation: the emulator does not affect the loading or timing of the target system.
More details and ordering information are available in the ADSP-21000 Series Hardware and Software Development Tools Data Sheet (ADDS-210xx-Tools). This data sheet is available from any Analog Devices sales office or distributor.
In addition to the software and hardware development tools available from Analog Devices, third parties provide a wide range of tools that support the SHARC processor family. Hardware tools include SHARC PC plug-in cards, multiprocessor SHARC VME boards, and daughter cards and modules with multiple SHARCs and additional memory. These modules are based on the SHARCPAC™ module specification. Third-party software tools include an Ada compiler, DSP libraries, operating systems, and block diagram design tools.
Additional Information
This data sheet provides a general overview of the architecture and functionality of the ADSP-21060. For details on the ADSP-21000 series core architecture and instruction set, refer to the ADSP-2106x SHARC User Manual, Second Edition.
Pin function description
All pins are identical on the ADSP-21060 and ADSP-21060L. Inputs identified as synchronous (S) must meet timing requirements with respect to CLKIN (or with respect to TCK for TMS and TDI). Inputs identified as asynchronous (A) can be asserted asynchronously to CLKIN (or to TCK for TRST).
EZ-ICE Probe Target Board Connector
The ADSP-2106x EZ-ICE emulator uses the IEEE 1149.1 JTAG test access port of the ADSP-2106x to monitor and control the target board processor during emulation. The EZ-ICE probe requires that the CLKIN, TMS, TCK, TRST, TDI, TDO, EMU, and GND signals of the ADSP-2106x be accessible on the target system through a 14-pin connector (a 2-row by 7-pin strip header), as shown in Figure 5. The EZ-ICE probe plugs directly onto this connector for chip-on-board emulation. This connector must be added to the target board design if the ADSP-2106x EZ-ICE is to be used. The total trace length between the EZ-ICE connector and the farthest device sharing the EZ-ICE JTAG pins should be limited to 15 inches maximum for guaranteed operation. This length limit must include EZ-ICE JTAG signals routed to one or more ADSP-2106x devices, or to a combination of ADSP-2106x devices and other JTAG devices on the chain.
The 14-pin, 2-row pin strip header is keyed at the pin 3 location: pin 3 must be removed from the header. The pins must be 0.025 inch square and at least 0.20 inch long. Pin spacing should be 0.1 x 0.1 inches. Pin strip headers are available from suppliers such as 3M, McKenzie, and Samtec.
The BTMS, BTCK, BTRST, and BTDI signals are provided so that the test access port can also be used for board-level testing. When the connector is not being used for emulation, place jumpers between the Bxxx pins and the xxx pins. If the test access port is not used for board testing, tie BTRST to GND and tie or pull up BTCK to VDD. The TRST pin must be asserted after power-up (through BTRST on the connector) or held low for proper operation of the ADSP-2106x. None of the Bxxx pins (pins 5, 7, 9, 11) are connected on the EZ-ICE probe.
Figure 6 shows the JTAG scan path connections for a system containing multiple ADSP-2106x processors.
Connecting CLKIN to pin 4 of the EZ-ICE header is optional. The emulator uses CLKIN only when directed to perform operations such as starting, stopping, and single-stepping multiple ADSP-2106xs in a synchronous manner. If these operations do not need to be synchronized across multiple processors, simply tie pin 4 of the EZ-ICE header to ground.
If synchronous multiprocessor operation is needed and CLKIN is connected, clock skew between the multiple ADSP-2106x processors and the CLKIN pin on the EZ-ICE header must be minimized. If the skew is too large, synchronous operations may be off by one or more cycles between processors. For synchronous multiprocessor operation, TCK, TMS, CLKIN, and EMU should be treated as critical signals in terms of skew, and should be routed as short as possible on the board. If TCK, TMS, and CLKIN drive a large number of ADSP-2106xs (more than eight) in the system, treat them as a clock tree, using multiple drivers to minimize skew. (See Figure 7, JTAG Clock Tree, and Clock Distribution in the "High Frequency Design Considerations" section of the ADSP-2106x User Manual, Second Edition.)
If synchronous multiprocessor operation is not needed (that is, CLKIN is not connected), just use appropriate parallel termination on TCK and TMS. TDI, TDO, EMU, and TRST are not critical signals in terms of skew.
For complete information on the SHARC EZ-ICE, see the ADSP-21000 Family JTAG EZ-ICE User Guide and Reference.
Timing Specifications
The ADSP-21060 is available in two speed grades: 40 MHz and 33.3 MHz. The specifications shown are based on a CLKIN frequency of 40 MHz (tCK = 25 ns). The DT derating factor allows specifications at other CLKIN frequencies (within the min-max range of the tCK specification; see Clock Inputs below). DT is the difference between the actual CLKIN period and a CLKIN period of 25 ns: DT = tCK – 25 ns. Use the exact timing information given. Do not attempt to derive parameters from the addition or subtraction of others. While addition or subtraction would yield meaningful results for an individual device, the values given in this data sheet reflect statistical variations and worst cases. Consequently, parameters cannot be meaningfully added to derive longer times.
Switching characteristics specify how the processor changes its signals. You have no control over this timing; circuitry external to the processor must be designed for compatibility with these signal characteristics. Switching characteristics tell you what the processor will do in a given circumstance. You can also use switching characteristics to ensure that any timing requirement of a device connected to the processor, such as memory, is satisfied.
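To illustrate how DT derating is applied, the sketch below evaluates a specification whose value is written in terms of DT. The parameter shown (a base of 10 ns plus DT/2) is hypothetical and chosen only for the arithmetic; real parameters and their DT coefficients come from the timing tables. The 30 ns period approximates the 33.3 MHz speed grade.

```python
NOMINAL_TCK_NS = 25.0  # CLKIN period at 40 MHz


def dt_ns(tck_ns: float) -> float:
    """DT = tCK - 25 ns, the difference from the nominal CLKIN period."""
    return tck_ns - NOMINAL_TCK_NS


def derated_spec_ns(base_ns: float, dt_coeff: float, tck_ns: float) -> float:
    """Evaluate a spec of the form base + dt_coeff * DT at a given tCK."""
    return base_ns + dt_coeff * dt_ns(tck_ns)


print(dt_ns(25.0))                        # 0.0 at 40 MHz
print(dt_ns(30.0))                        # 5.0 at roughly 33.3 MHz
print(derated_spec_ns(10.0, 0.5, 30.0))   # 12.5 for the hypothetical spec
```

This applies DT to a single tabulated parameter; it does not combine parameters with each other, which the text above warns against.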
Timing requirements apply to signals controlled by circuits external to the processor, such as data inputs for read operations. Timing requirements ensure that the processor works properly with other devices.
(O/D) = Open Drain
(A/D) = Active Drive
output drive current
Figure 28 shows typical I-V characteristics for the output drivers of the ADSP-2106x. The curves represent the current drive capability of the output drivers as a function of output voltage.
Power consumption
There are two parts to the total power dissipation, one is due to the internal circuitry and the other is due to the switching of the external output driver. Internal power consumption depends on the instruction execution sequence and the number of data operands involved. The internal power consumption is calculated as follows:
The external component of the total power dissipation is caused by the switching of the output pins. Its magnitude depends on:
– the number of output pins that switch during each cycle (O)
– the maximum frequency at which they can switch (f)
– their load capacitance (C)
– their voltage swing (VDD), and is calculated as follows:
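The equation itself is missing from this text. Combining the four factors listed above gives the standard expression for dynamic switching power; the following is a reconstruction from those definitions, with P_EXT denoting the external component as in the example that follows:

```latex
P_{EXT} = O \times C \times V_{DD}^{2} \times f
```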
The load capacitance should include the processor's package capacitance (CIN). The switching frequency includes driving the load high and then back low. Address and data pins can drive high and low at a maximum rate of 1/(2tCK). The write strobe can switch every cycle, at a frequency of 1/tCK. Select pins switch at 1/(2tCK), but selects can switch on each cycle.
example:
PEXT is estimated based on the following assumptions:
– A system with one bank of external data memory RAM (32-bit)
– Four 128K×8 RAM chips are used, each with a load of 10 pF
– External data memory writes occur every other cycle, a rate of 1/(4tCK), with 50% of the pins switching
– The instruction cycle rate is 40 MHz (tCK=25 ns).
The PEXT equation is calculated for each type of pin that can drive:
Typical power dissipation under these conditions can now be calculated by adding the typical internal power dissipation:
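The per-pin-type calculation can be sketched numerically. The code below applies P = O × C × VDD² × f to each group of pins under the assumptions listed above. Several inputs are not given in this text and are assumptions made for illustration: a 5 V supply, 17 address pins for a 128K-deep RAM, address and write-strobe pins loaded by all four RAM chips, each data pin loaded by one chip, a write strobe switching at 1/(2tCK), and no package capacitance (CIN) added to the loads. The resulting numbers are therefore illustrative, not data sheet values.

```python
T_CK = 25e-9      # instruction cycle, seconds
VDD = 5.0         # supply voltage in volts (assumed)
RAM_LOAD = 10e-12 # load per RAM chip, farads


def p_ext(pins: int, toggle_fraction: float, load_f: float,
          freq_hz: float, vdd: float = VDD) -> float:
    """External switching power in watts for one group of pins."""
    return pins * toggle_fraction * load_f * vdd ** 2 * freq_hz


# (name, pin count, fraction switching, load in farads, frequency in Hz)
PIN_GROUPS = [
    ("address", 17, 0.5, 4 * RAM_LOAD, 1 / (4 * T_CK)),
    ("data", 32, 0.5, 1 * RAM_LOAD, 1 / (4 * T_CK)),
    ("write strobe", 1, 1.0, 4 * RAM_LOAD, 1 / (2 * T_CK)),
]

total_w = 0.0
for name, pins, frac, load, freq in PIN_GROUPS:
    watts = p_ext(pins, frac, load, freq)
    total_w += watts
    print(f"{name}: {watts:.3f} W")
print(f"P_EXT total: {total_w:.3f} W")
```

Adding the typical internal power dissipation to this total gives the overall estimate described above.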
Note that the conditions that cause a worst-case PEXT differ from those that cause a worst-case PINT. Maximum PINT cannot occur while 100% of the output pins are switching from all ones to all zeros. Note also that it is uncommon for an application to have 100% or even 50% of the outputs switching simultaneously.
Test conditions
Output disable time
Output pins are considered disabled when they stop driving, enter a high-impedance state, and begin to decay from their output high or low voltage. The time for the voltage on the bus to decay by ΔV depends on the capacitive load, CL, and the load current, IL. This decay time can be approximated by the following equation:
As shown in Figure 25, the output disable time, tDIS, is the difference between tMEASURED and tDECAY. tMEASURED is the interval from when the reference signal switches to when the output voltage has decayed by ΔV from the measured output high or output low voltage. tDECAY is calculated with test loads CL and IL, and with ΔV equal to 0.5 V.
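The equation is missing from this text. For a capacitance discharged by a constant current, the time to change the voltage by a fixed amount is the charge divided by the current, which gives the following reconstruction (the symbol written ∏V in the extracted text is read here as ΔV):

```latex
t_{DECAY} = \frac{C_L \, \Delta V}{I_L}
```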
Output enable time
Output pins are considered enabled when they make the transition from a high-impedance state to when they start driving. The output enable time, tENA, is the interval from when the reference signal reaches a high or low voltage level to when the output reaches the specified high or low trip point, as shown in the output enable/disable diagram (Figure 25). If multiple pins are enabled (such as a data bus), the measurement value is that of the first pin to start driving.
System Hold Time Calculation Example
To determine the data output hold time in a particular system, first calculate tDECAY using the equation given above. Choose ΔV to be the difference between the output voltage of the ADSP-2106x and the input threshold of the device that requires the hold time. A typical ΔV is 0.4 V. CL is the total bus capacitance (per data line), and IL is the total leakage or three-state current (per data line). The hold time is tDECAY plus the minimum disable time (i.e., tDATRWH for the write cycle).
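The hold-time procedure can be sketched as follows, assuming the constant-current approximation tDECAY = CL × ΔV / IL. The bus capacitance, leakage current, and minimum disable time used here are hypothetical values chosen for illustration; only the 0.4 V figure for ΔV comes from the text.

```python
def t_decay_s(c_load_f: float, delta_v: float, i_load_a: float) -> float:
    """Approximate time for the bus voltage to decay by delta_v."""
    return c_load_f * delta_v / i_load_a


def hold_time_s(c_load_f: float, delta_v: float, i_load_a: float,
                min_disable_s: float) -> float:
    """Data output hold time: decay time plus the minimum disable time."""
    return t_decay_s(c_load_f, delta_v, i_load_a) + min_disable_s


C_L = 50e-12        # total bus capacitance per data line (hypothetical)
DELTA_V = 0.4       # typical value from the text
I_L = 20e-6         # total leakage current per data line (hypothetical)
MIN_DISABLE = 1e-9  # minimum disable time (hypothetical)

print(f"tDECAY = {t_decay_s(C_L, DELTA_V, I_L) * 1e9:.1f} ns")
print(f"hold   = {hold_time_s(C_L, DELTA_V, I_L, MIN_DISABLE) * 1e9:.1f} ns")
```

With a small leakage current, the decay term dominates: a lightly loaded bus holds its last value for a long time compared with the minimum disable time.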
capacitive load
Output delays and holds are based on standard capacitive loads: 50 pF on all pins (see Figure 26). For loads other than the nominal 50 pF, the delay and hold specifications given should be derated by a factor of 1.5 ns/50 pF. Figures 29-30 and 33-34 show how output rise time varies with capacitance. Figures 31 and 35 show graphically how output delays and holds vary with load capacitance. (Note that this graph or derating does not apply to output disable delays; see the previous section, Output disable time, under Test conditions.) The graphs of Figures 29, 30, and 31 may not be linear outside the ranges shown.
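The capacitive derating can be sketched as a linear adjustment around the 50 pF nominal load. The 10 ns starting specification below is hypothetical, and the sketch assumes the 1.5 ns/50 pF factor applies linearly, which, as noted above, may not hold outside the ranges plotted in the figures.

```python
NOMINAL_LOAD_PF = 50.0
DERATE_NS_PER_PF = 1.5 / 50.0  # 1.5 ns per 50 pF


def derated_delay_ns(spec_ns: float, load_pf: float) -> float:
    """Adjust a delay or hold spec given at 50 pF for a different load."""
    return spec_ns + (load_pf - NOMINAL_LOAD_PF) * DERATE_NS_PER_PF


print(derated_delay_ns(10.0, 50.0))   # 10.0 ns at the nominal load
print(derated_delay_ns(10.0, 100.0))  # about 11.5 ns with 100 pF
```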
environmental conditions
Thermal characteristics
The ADSP-21060KS and ADSP-21060LKS are packaged in a 240-lead thermally enhanced MQFP. The top surface of the package contains a copper slug through which most of the die heat is dissipated. The slug is flush with the top surface of the package. Note that the copper slug is internally connected to GND through the device substrate. The ADSP-21060KB and ADSP-21060LKB are packaged in a plastic ball grid array (PBGA). The θJC of the PBGA package is 1.7°C/W.
The ADSP-2106x is specified for case temperature (TCASE). To ensure that the TCASE data sheet specification is not exceeded, a heat sink and/or airflow source may be used. The heat sink should be attached with thermal adhesive.
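The legend that follows defines the terms of a case-temperature relation that is missing from this text. A plausible reconstruction, assuming T_AMB denotes the ambient temperature and θ_CA the case-to-ambient thermal resistance taken from the table, is:

```latex
T_{CASE} = T_{AMB} + \left( P_D \times \theta_{CA} \right)
```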
TCASE = case temperature (measured on the top surface of the package)
PD = power dissipation in W (this value depends on the specific application; a method for calculating PD is shown under Power consumption).
θCA = value from the table below.
Package Dimensions: Dimensions are in inches and (mm).
225 Plastic Ball Grid Array (PBGA)