2022-09-23 11:33:08
ADSP-21062 SHARC processor
General Description
The ADSP-21062 SHARC (Super Harvard Architecture Computer) is a signal processing microcomputer that offers new capabilities and levels of performance. It is a 32-bit processor optimized for high-performance DSP applications. The ADSP-21062 builds on the ADSP-21000 DSP core to form a complete system on a chip, adding a dual-ported on-chip SRAM and integrated I/O peripherals supported by a dedicated I/O bus.
The ADSP-21062 is fabricated in a high-speed, low-power CMOS process and has a 25 ns instruction cycle time, operating at 40 MIPS. With its on-chip instruction cache, the processor can execute every instruction in a single cycle. Table 1 shows performance benchmarks for the ADSP-21062.
The ADSP-21062 SHARC represents a new standard of integration for signal computers, combining a high-performance floating-point DSP core with integrated on-chip system features, including 2 Mbit of SRAM (4 Mbit on the ADSP-21060), a host processor interface, a DMA controller, serial ports, link ports, and parallel bus connectivity for glueless DSP multiprocessing.
Figure 1 shows a block diagram of the ADSP-21062, illustrating the following architectural features:
Computation units (ALU, multiplier, and shifter) with a shared data register file; data address generators (DAG1, DAG2); program sequencer with instruction cache; timer; on-chip SRAM; serial ports and link ports; JTAG test access port.
ADSP-21000 Series Core Architecture
The ADSP-21062 includes the following architectural features of the ADSP-21000 series core. The ADSP-21062 processor is code and function compatible with the ADSP-21020.
Independent, Parallel Computation Units
The arithmetic/logic unit (ALU), multiplier, and shifter all perform single-cycle instructions. The three units are arranged in parallel, maximizing computational throughput. A single multifunction instruction executes parallel ALU and multiplier operations. The computation units support IEEE 32-bit single-precision floating-point, extended-precision 40-bit floating-point, and 32-bit fixed-point data formats.
Data Register File
A general-purpose data register file is used to transfer data between the computation units and the data buses, and to store intermediate results. This 10-port, 32-register (16 primary, 16 secondary) register file, combined with the ADSP-21000 Harvard architecture, allows unconstrained data flow between the computation units and internal memory.
Single-Cycle Fetch of Instruction and Two Operands
The ADSP-21062 features an enhanced Harvard architecture in which the data memory (DM) bus transfers data and the program memory (PM) bus transfers both instructions and data (see Figure 1). With its separate program and data memory buses and on-chip instruction cache, the processor can simultaneously fetch two operands and an instruction (from the cache), all in a single cycle.
Instruction Cache
The ADSP-21062 includes an on-chip instruction cache that enables three-bus operation for fetching an instruction and two data values. The cache is selective: only the instructions whose fetches conflict with PM bus data accesses are cached. This allows full-speed execution of core looping operations such as digital filter multiply-accumulates and FFT butterfly processing.
Data Address Generators with Hardware Circular Buffers
The ADSP-21062's two data address generators (DAGs) implement circular data buffers in hardware. Circular buffers allow efficient programming of delay lines and other data structures required in digital signal processing, and are commonly used in digital filters and Fourier transforms. The two DAGs of the ADSP-21062 contain sufficient registers to allow the creation of up to 32 circular buffers (16 primary register sets, 16 secondary). The DAGs automatically handle address pointer wraparound, reducing overhead, increasing performance, and simplifying implementation. Circular buffers can start and end at any memory location.
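The pointer wraparound that the DAGs perform in hardware can be illustrated with a small software model. This is a minimal sketch, not a description of the DAG registers: the class `CircularBuffer`, its fields `base`, `length`, and `index`, and the helper `fir_step` are hypothetical names chosen for this example, and the model assumes a post-modify update applied after each access.

```python
class CircularBuffer:
    """Software model of a hardware circular buffer (illustrative only)."""

    def __init__(self, memory, base, length):
        self.memory = memory    # backing store, e.g. a list of samples
        self.base = base        # first address of the buffer
        self.length = length    # number of locations in the buffer
        self.index = base       # current address pointer

    def advance(self, modify):
        # Keep the pointer inside [base, base + length) after the update.
        offset = (self.index - self.base + modify) % self.length
        self.index = self.base + offset

    def write(self, value, modify=1):
        self.memory[self.index] = value
        self.advance(modify)

    def read(self, modify=1):
        value = self.memory[self.index]
        self.advance(modify)
        return value


def fir_step(delay_line, coeffs, sample):
    """Store one input sample and return one FIR filter output.

    Assumes delay_line.length == len(coeffs).
    """
    delay_line.write(sample, modify=0)       # overwrite the oldest sample
    acc = 0.0
    for c in coeffs:                         # newest sample first, stepping back
        acc += c * delay_line.read(modify=-1)
    delay_line.advance(1)                    # point at the next oldest sample
    return acc


coeffs = [0.5, 0.25, 0.25]
line = CircularBuffer([0.0] * 8, base=4, length=len(coeffs))
impulse_response = [fir_step(line, coeffs, x) for x in (1.0, 0.0, 0.0, 0.0)]
print(impulse_response)
```

Because the wraparound is folded into the pointer update, the filter loop contains no explicit bounds checks; this is the overhead that hardware circular buffering removes.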
Flexible Instruction Set
The 48-bit instruction word accommodates a variety of parallel operations for concise programming. For example, the ADSP-21062 can conditionally execute a multiply, an add, a subtract, and a branch, all in a single instruction.
ADSP-21062/ADSP-21062L Features
Augmenting the ADSP-21000 series core, the ADSP-21062 adds the following architectural features:
Dual-Ported On-Chip Memory
The ADSP-21062 contains 2 Mbit of on-chip SRAM, organized as two 1 Mbit blocks that can be configured for different combinations of code and data storage. Each memory block is dual-ported for single-cycle, independent accesses by the core processor and the I/O processor or DMA controller. The dual-ported memory and separate on-chip buses allow two data transfers from the core and one from I/O, all in a single cycle. On the ADSP-21062, the memory can be configured as a maximum of 64K words of 32-bit data, 128K words of 16-bit data, 40K words of 48-bit instructions (or 40-bit data), or combinations of different word sizes up to 2 Mbit. All of the memory can be accessed as 16-bit, 32-bit, or 48-bit words. A 16-bit floating-point storage format is supported, which effectively doubles the amount of data that can be stored on-chip. Conversion between the 32-bit floating-point and 16-bit floating-point formats is done in a single instruction.
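The word counts quoted above can be checked by dividing the 2 Mbit capacity by the word width. The sketch below is plain arithmetic, assuming 1 Mbit = 2^20 bits and 1K = 1024 words; the function name `max_words` is hypothetical. The raw quotient for 48-bit words comes out somewhat above the 40K figure quoted above, so that limit is evidently tighter than the bit count alone would suggest; this sketch does not model the reason.

```python
TOTAL_BITS = 2 * 2**20   # 2 Mbit of on-chip SRAM, assuming 1 Mbit = 2**20 bits
K = 1024                 # assumption: "K" means 1024 words


def max_words(word_bits, total_bits=TOTAL_BITS):
    """Upper bound on the number of words of a given width, by bit count alone."""
    return total_bits // word_bits


for width in (16, 32, 48):
    words = max_words(width)
    print(f"{width}-bit words: {words} ({words / K:.2f}K)")
```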
Although each memory block can store a combination of code and data, accesses are most efficient when one block stores data, using the DM bus for transfers, and the other block stores instructions and data, using the PM bus for transfers. Using the DM bus and PM bus in this way, with one bus dedicated to each memory block, assures single-cycle execution with two data transfers. In this case, the instruction must be available in the cache. Single-cycle execution is also maintained when one of the data operands is transferred to or from off-chip memory through the ADSP-21062's external port.
Off-Chip Memory and Peripherals Interface
The ADSP-21062's external port provides the processor's interface to off-chip memory and peripherals. The 4G off-chip address space is included in the ADSP-21062's unified address space. The separate on-chip buses (PM address, PM data, DM address, DM data, I/O address, and I/O data) are multiplexed at the external port to create an external system bus with a single 32-bit address bus and a single 48-bit (or 32-bit) data bus.
Addressing of external memory devices is facilitated by on-chip decoding of the high-order address lines to generate memory bank select signals. Separate control lines are also generated to simplify addressing of page-mode DRAM. The ADSP-21062 provides programmable memory wait states and external memory acknowledge controls, allowing it to interface with DRAM and peripherals that have variable access, hold, and disable time requirements.
Host Processor Interface
The ADSP-21062's host interface allows easy connection to standard 16-bit and 32-bit microprocessor buses with little additional hardware required. Asynchronous transfers are supported at speeds up to the full clock rate of the processor. The host interface is accessed through the ADSP-21062's external port and is memory-mapped into the unified address space. Four DMA channels are available for the host interface, so code and data transfers are accomplished with low software overhead. The host processor requests the ADSP-21062's external bus using the host bus request (HBR), host bus grant (HBG), and ready (REDY) signals. The host can directly read and write the internal memory of the ADSP-21062, and can access the DMA channel setup and mailbox registers. Vector interrupt support is provided for efficient execution of host commands.
DMA Controller
The ADSP-21062's on-chip DMA controller allows zero-overhead data transfers without processor intervention. The DMA controller operates independently of and invisibly to the processor core, so DMA operations can occur while the core is simultaneously executing its program instructions. DMA transfers can occur between the ADSP-21062's internal memory and external memory, external peripherals, or a host processor. DMA transfers can also occur between the ADSP-21062's internal memory and its serial ports or link ports. DMA transfers between external memory and external peripherals are another option. External bus packing to 16-, 32-, or 48-bit words is performed during DMA transfers.
The ADSP-21062 provides ten DMA channels: two via the link ports, four via the serial ports, and four via the processor's external port (for host processor, other ADSP-21062s, memory, or I/O transfers). Four additional link port DMA channels are shared with serial port 1 and the external port. Programs can be downloaded to the ADSP-21062 using DMA transfers. Asynchronous off-chip peripherals can control two DMA channels using the DMA request/grant lines (DMAR1-2, DMAG1-2). Other DMA features include interrupt generation on completion of a DMA transfer and DMA chaining for automatically linked DMA transfers.
Serial Ports
The ADSP-21062 features two synchronous serial ports that provide an inexpensive interface to a wide variety of digital and mixed-signal peripheral devices. The serial ports can operate at the full clock rate of the processor, giving each a maximum data rate of 40 Mbit/s. Independent transmit and receive functions provide greater flexibility for serial communications. Serial port data can be automatically transferred to and from on-chip memory via DMA. Each serial port offers a TDM multichannel mode.
The serial ports can use little-endian or big-endian transmission formats, with word lengths selectable from 3 to 32 bits. They offer selectable synchronization and transmit modes, as well as optional μ-law or A-law companding. Serial port clocks and frame syncs can be generated internally or externally.
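For readers unfamiliar with companding, the sketch below shows the continuous μ-law compression curve with μ = 255 and its inverse. This is the textbook formula only, not a model of the serial port hardware; telephony codecs normally use a segmented 8-bit approximation of this curve. The function names are hypothetical.

```python
import math

MU = 255.0  # standard value used for mu-law companding


def mu_law_compress(x, mu=MU):
    """Map x in [-1, 1] to [-1, 1] with logarithmic compression."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y, mu=MU):
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


for x in (0.01, 0.1, 0.5, 1.0):
    y = mu_law_compress(x)
    print(f"x = {x:5.2f}  compressed = {y:.4f}  expanded = {mu_law_expand(y):.4f}")
```

Small amplitudes are boosted before quantization and restored afterwards, which is how companding preserves resolution for quiet signals in a short word.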
Multiprocessing
The ADSP-21062 offers powerful features tailored to multiprocessor DSP systems. The unified address space (see Figure 4) allows direct interprocessor access to each ADSP-21062's internal memory. Distributed bus arbitration logic is included on-chip for simple, glueless connection of systems containing up to six ADSP-21062s and a host processor. Bus master changeover incurs only one cycle of overhead. Bus arbitration is selectable as either fixed or rotating priority. Bus lock allows indivisible read-modify-write sequences for semaphores. A vector interrupt is provided for interprocessor commands. Maximum throughput for interprocessor data transfer is 240 Mbytes/s over the link ports or the external port. Broadcast writes allow simultaneous transmission of data to all ADSP-21062s and can be used to implement reflective semaphores.
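The difference between the two arbitration schemes can be illustrated with a small model. This is a generic sketch of fixed and rotating (round-robin) priority, not a description of the on-chip arbitration logic: the processor IDs 1 to 6, the convention that the lowest ID wins under fixed priority, and the function names are all assumptions made for this example.

```python
NUM_PROCESSORS = 6  # the text allows up to six processors on the bus


def fixed_priority(requests):
    """Grant the bus to the lowest-numbered requester (assumed convention)."""
    return min(requests) if requests else None


def rotating_priority(requests, last_master):
    """Grant the bus to the first requester after last_master, wrapping around."""
    for step in range(1, NUM_PROCESSORS + 1):
        candidate = (last_master - 1 + step) % NUM_PROCESSORS + 1
        if candidate in requests:
            return candidate
    return None


requests = {1, 2, 5}      # processors that keep requesting the bus
master = 2                # processor that currently owns the bus
grants = []
for _ in range(4):
    master = rotating_priority(requests, master)
    grants.append(master)
print("fixed priority always grants:", fixed_priority(requests))
print("rotating priority grants:", grants)
```

Under fixed priority a persistent high-priority requester can starve the others, while rotating priority gives every requester a turn.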
Link Ports
The ADSP-21062 features six 4-bit link ports that provide additional I/O capabilities. The link ports can be clocked twice per cycle, allowing each port to transfer eight bits of data per cycle. Link port I/O is especially useful for point-to-point interprocessor communication in multiprocessing systems. The link ports can operate independently and simultaneously, with a maximum data throughput of 240 Mbytes/s. Link port data is packed into 32-bit or 48-bit words and can be read directly by the core processor or transferred to on-chip memory by DMA. Each link port has its own double-buffered input and output registers. Clock/acknowledge handshaking controls link port transfers. Transfers are programmable as either transmit or receive.
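The throughput figures above follow directly from the port width and the clock rate. The sketch below reproduces the arithmetic, assuming the 40 MHz clock (25 ns cycle) given earlier in the document; the constant and function names are hypothetical.

```python
CLOCK_HZ = 40_000_000      # 40 MHz processor clock (25 ns cycle)
LINK_PORT_BITS = 4         # each link port is 4 bits wide
TRANSFERS_PER_CYCLE = 2    # link ports can be clocked twice per cycle
NUM_LINK_PORTS = 6


def link_port_bytes_per_second(ports=1):
    """Peak throughput in bytes/s for the given number of link ports."""
    bits_per_cycle = LINK_PORT_BITS * TRANSFERS_PER_CYCLE   # 8 bits = 1 byte
    return ports * bits_per_cycle * CLOCK_HZ // 8


per_port = link_port_bytes_per_second()
all_ports = link_port_bytes_per_second(NUM_LINK_PORTS)
print(per_port // 1_000_000, "Mbytes/s per port")
print(all_ports // 1_000_000, "Mbytes/s with all six ports active")
```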
Program Booting
The internal memory of the ADSP-21062 can be booted at system power-up from an 8-bit EPROM, from a host processor, or through one of the link ports. Selection of the boot source is controlled by the BMS (boot memory select), EBOOT (EPROM boot), and LBOOT (link/host boot) pins. Both 32-bit and 16-bit host processors can be used for booting.
EZ-ICE Probe Target Board Connector
The ADSP-2106X EZ-ICE emulator uses the IEEE 1149.1 JTAG test access port of the ADSP-2106X to monitor and control the target board processor during emulation. The EZ-ICE probe requires the ADSP-2106X's CLKIN, TMS, TCK, TRST, TDI, TDO, EMU, and GND signals to be accessible on the target system through a 14-pin connector (a 2-row by 7-pin strip header), as shown in Figure 5. The EZ-ICE probe plugs directly onto this connector for chip-on-board emulation. You must add this connector to your target board design if you intend to use the ADSP-2106X EZ-ICE. The total trace length between the EZ-ICE connector and the farthest device sharing the EZ-ICE JTAG pins should be limited to a maximum of 15 inches for guaranteed operation. This length limit must include the EZ-ICE JTAG signals routed to one or more ADSP-2106X devices, or to a combination of ADSP-2106X devices and other JTAG devices on the chain.
The 14-pin, 2-row pin strip header is keyed at the pin 3 location; pin 3 must be removed from the header. The pins must be 0.025 inch square and at least 0.20 inch long. Pin spacing should be 0.1 × 0.1 inches. Pin strip headers are available from vendors such as 3M, McKenzie, and Samtec.
The BTMS, BTCK, BTRST, and BTDI signals are provided so that the test access port can also be used for board-level testing. When the connector is not being used for emulation, place jumpers between the BXXX pins and the XXX pins as shown in Figure 5. If you are not going to use the test access port for board testing, tie BTRST to GND and tie or pull up BTCK to VDD. The TRST pin must be asserted after power-up (through BTRST on the connector) or held low for the ADSP-2106X to operate properly. None of the BXXX pins are connected to the EZ-ICE probe.
TRST is driven low until the EZ-ICE probe is turned on by the emulator at software startup. After software startup, TRST is driven high. Figure 6 shows the JTAG scan path connections for systems that contain multiple ADSP-2106X processors. Connecting CLKIN to pin 4 of the EZ-ICE header is optional. The emulator uses CLKIN only when directed to perform operations such as starting, stopping, and single-stepping multiple ADSP-2106Xs in a synchronous manner. If you do not need these operations to occur synchronously on the multiple processors, simply tie pin 4 of the EZ-ICE header to ground.
If synchronous multiprocessor operation is required and CLKIN is connected, the clock skew between the multiple ADSP-21062 processors and the CLKIN pin on the EZ-ICE header must be minimal. If the skew is too large, synchronous operations may be off by one or more cycles between processors. For synchronous multiprocessor operation, TCK, TMS, CLKIN, and EMU should be treated as critical signals in terms of skew, and should be routed as short as possible on your board. If TCK, TMS, and CLKIN are driving a large number of ADSP-21062s (more than eight) in your system, treat them as a "clock tree" using multiple drivers to minimize skew. (See Figure 7, "JTAG Clock Tree," and "Clock Distribution" in the "High-Frequency Design Considerations" section of the ADSP-2106X User Manual, Second Edition.)
If synchronous multiprocessor operation is not required (that is, CLKIN is not connected), simply use appropriate parallel termination on TCK and TMS. TDI, TDO, EMU, and TRST are not critical signals in terms of skew.
Timing Specifications
The ADSP-21062 is available in two speed grades, 40 MHz and 33.3 MHz. The specifications shown are based on a CLKIN frequency of 40 MHz (tCK = 25 ns). The DT derating factor allows specifications at other CLKIN frequencies (within the min-max range of the tCK specification; see Clock Input below). DT is the difference between the actual CLKIN period and a CLKIN period of 25 ns:
DT = tCK - 25 ns
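As a worked illustration of the derating factor, the sketch below computes DT for a given CLKIN period and applies it to a specification. The specification form `base + factor * DT` and the example values 10 ns and 0.5 are hypothetical and are not taken from the ADSP-21062 timing tables; the 30 ns period approximates the 33.3 MHz speed grade.

```python
NOMINAL_TCK_NS = 25.0  # CLKIN period at 40 MHz


def derating_ns(tck_ns):
    """DT: difference between the actual CLKIN period and the nominal 25 ns."""
    return tck_ns - NOMINAL_TCK_NS


def derated_spec_ns(base_ns, dt_factor, tck_ns):
    """Evaluate a spec of the hypothetical form base + dt_factor * DT."""
    return base_ns + dt_factor * derating_ns(tck_ns)


tck = 30.0  # ns, approximately 33.3 MHz
print(derating_ns(tck))                 # 5.0
print(derated_spec_ns(10.0, 0.5, tck))  # 12.5
```

At the nominal 25 ns period DT is zero, so every specification reduces to its base value.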
Use the exact timing information given. Do not attempt to derive parameters from the addition or subtraction of others. While addition or subtraction would yield meaningful results for an individual device, the values given in this data sheet reflect statistical variations and worst cases. Consequently, you cannot meaningfully add parameters to derive longer times.
See Figure 27 under Test Conditions for voltage reference levels.
Switching characteristics specify how the processor changes its signals. You have no control over this timing; circuitry external to the processor must be designed to be compatible with these signal characteristics. Switching characteristics tell you what the processor will do in a given circumstance. You can also use switching characteristics to ensure that any timing requirement of a device connected to the processor, such as memory, is satisfied.
Timing requirements apply to signals controlled by circuits external to the processor, such as data inputs for read operations. Timing requirements ensure that the processor works properly with other devices.
Development Tools
The ADSP-21062 is supported by a complete set of software and hardware development tools, including the EZ-ICE In-Circuit Emulator, the EZ-Lab® Development Board, the EZ-Kit, and development software. The EZ-Lab includes an evaluation board with an ADSP-21062 (5 V) processor and provides a serial connection to a PC. The SHARC EZ-Kit combines the ADSP-21000 series development software for the PC and the EZ-Lab ADSP-21062 development board in one package. In addition to the EZ-Lab development board, the EZ-Kit includes an optimizing compiler, assembler, instruction-level simulator, runtime library, diagnostic utilities, and a complete set of example programs.
The same EZ-ICE hardware can be used for the ADSP-21060/ADSP-21061 to fully emulate the ADSP-21062, with the exception of displaying and modifying the two new SPORT registers. The emulator does not display these two registers, but your code can use them.
Analog Devices' ADSP-21000 series development software includes an easy-to-use assembler based on algebraic syntax, an assembly library/librarian, a linker, an instruction-level simulator, an ANSI C optimizing compiler, the CBUG™ C source-level debugger, and a C runtime library that includes DSP and mathematical functions. The optimizing compiler includes Numerical C extensions based on the work of the ANSI Numerical C Extensions Group. Numerical C provides extensions to the C language for array selections, vector math operations, complex data types, circular pointers, and variably dimensioned arrays.
The ADSP-21062 EZ-ICE emulator uses the IEEE 1149.1 JTAG test access port of the ADSP-21062 processor to monitor and control the target board processor during emulation. The EZ-ICE provides full-speed emulation, allowing inspection and modification of memory, registers, and processor stacks. Non-intrusive in-circuit emulation is assured by the use of the processor's JTAG interface: the emulator does not affect the loading or timing of the target system.
Further details and ordering information are available in the ADSP-21000 Series Hardware and Software Development Tools data sheet (ADDS-210XX-TOOLS). This data sheet can be requested from any Analog Devices sales office, distributor, or literature center.
In addition to the software and hardware development tools available from Analog Devices, third parties provide a wide range of tools supporting the SHARC processor family. Hardware tools include SHARC PC plug-in cards, multiprocessor SHARC VME boards, and daughter card modules with multiple SHARCs and additional memory. These modules are based on the SHARCPAC™ module specification. Third-party software tools include an Ada compiler, DSP libraries, operating systems, and block diagram design tools.
Pin Function Descriptions
The ADSP-21062 pin definitions are listed below. All pins are identical on the ADSP-21062 and ADSP-21062L. Inputs identified as synchronous (S) must meet timing requirements with respect to CLKIN (or with respect to TCK for TMS and TDI). Inputs identified as asynchronous (A) can be asserted asynchronously to CLKIN (or to TCK for TRST).
Unused inputs should be tied or pulled to VDD or GND, except for ADDR31-0, DATA47-0, FLAG3-0, SW, and inputs that have internal pull-up or pull-down resistors (CPA, ACK, DTX, DRX, TCLKX, RCLKX, LXDAT3-0, LXCLK, LXACK, TMS, and TDI); these pins can be left floating. These pins have a logic-level hold circuit that prevents the input from floating internally.
A = asynchronous; G = ground; I = input; O = output; P = power supply; S = synchronous; (A/D) = active drive; (O/D) = open drain; T = three-state (when SBTS is asserted, or when the ADSP-21062 is a bus slave).