2022-09-23 11:45:21
ADSP-21161N SHARC processor
The ADSP-21161N SHARC DSP is a low-cost derivative of the ADSP-21160 featuring the Analog Devices Super Harvard Architecture. To ease portability, the ADSP-21161N is source-code compatible with the ADSP-21160 and with first-generation ADSP-2106X SHARC processors in SISD (Single Instruction, Single Data) mode. Like other SHARC DSPs, the ADSP-21161N is a 32-bit processor optimized for high-performance digital signal processing applications. The ADSP-21161N includes a 100 MHz or 110 MHz core, a dual-ported on-chip SRAM, an integrated I/O processor with multiprocessing support, and multiple internal buses to eliminate I/O bottlenecks. As first offered in the ADSP-21160, the ADSP-21161N provides a Single Instruction, Multiple Data (SIMD) architecture. Using two computational units (ADSP-2106X SHARC processors have one), the ADSP-21161N can double cycle performance relative to the ADSP-2106X on a range of DSP algorithms.
The ADSP-21161N is fabricated in a state-of-the-art, high-speed, low-power CMOS process and has an instruction cycle time of 10 ns or 9 ns. With its SIMD computational hardware running at 110 MHz, the ADSP-21161N can perform 660 million floating-point operations per second. Table 1 shows the performance benchmarks of the ADSP-21161N. These benchmarks provide single-channel extrapolations of measured dual-channel processing performance.
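The peak-throughput and cycle-time figures above can be cross-checked with simple arithmetic. The sketch below assumes that each processing element completes a multiply, an add, and a subtract in every cycle (three operations); that breakdown is an assumption made for illustration, and all names are hypothetical.

```python
# Back-of-the-envelope check of the peak figures quoted in the text.
CORE_CLOCK_HZ = 110_000_000      # 110 MHz core (from the text)
PROCESSING_ELEMENTS = 2          # two processing elements in SIMD mode (from the text)
OPS_PER_PE_PER_CYCLE = 3         # ASSUMPTION: multiply + add + subtract per cycle

peak_ops_per_second = CORE_CLOCK_HZ * PROCESSING_ELEMENTS * OPS_PER_PE_PER_CYCLE
cycle_time_ns = 1e9 / CORE_CLOCK_HZ

print(f"peak: {peak_ops_per_second / 1e6:.0f} million operations per second")
print(f"cycle time: {cycle_time_ns:.2f} ns")
```

Under these assumptions the product comes to 660 million operations per second, and the 110 MHz cycle time of about 9.09 ns rounds to the quoted 9 ns.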
The ADSP-21161N continues SHARC's industry-leading standards of DSP integration, combining a high-performance 32-bit DSP core with integrated on-chip system features. These features include 1 Mbit of dual-ported SRAM, a host processor interface, an I/O processor supporting 14 DMA channels, four serial ports, two link ports, an SDRAM controller, an SPI interface, an external parallel bus, and glueless multiprocessing.
The block diagram of the ADSP-21161N illustrates the following architectural features:
* Two processing elements, each consisting of an ALU, multiplier, shifter, and data register file
* Data address generators (DAG1, DAG2)
* Program sequencer with instruction cache
* PM and DM buses capable of supporting four 32-bit data transfers between memory and the core in every core processor cycle
* Interval timer
* On-chip SRAM (1 Mbit)
* SDRAM controller for glueless interface to SDRAM
* External port that supports interfacing with off-chip memory and peripherals, glueless multiprocessing for six ADSP-21161N SHARCs, and host port read/write of IOP registers
* DMA controller
* Four serial ports
* Two link ports
* SPI-compatible interface
* JTAG test access port
* 12 general-purpose I/O pins
ADSP-21161N Series Core Architecture
The ADSP-21161N includes the following structural features of the ADSP-2116X series core. The ADSP-21161N is compatible at the assembly level with the ADSP-21160, ADSP-21060, ADSP-21061, ADSP-21062 and ADSP-21065L.
SIMD computational engine
The ADSP-21161N contains two computational processing elements that operate as a Single Instruction, Multiple Data (SIMD) engine. The processing elements are referred to as PEX and PEY, and each contains an ALU, multiplier, shifter, and register file. PEX is always active, and PEY can be enabled by setting the PEYEN mode bit in the MODE1 register. When this mode is enabled, the same instruction is executed in both processing elements, but each processing element operates on different data. This architecture efficiently executes math-intensive DSP algorithms.
Entering SIMD mode also affects the way data is transferred between memory and the processing elements. In SIMD mode, twice the data bandwidth is required to sustain computational operation in the processing elements. Because of this requirement, entering SIMD mode also doubles the bandwidth between memory and the processing elements.
When the DAGs are used to transfer data in SIMD mode, two data values are transferred with each access of memory or the register file. SIMD mode is supported only for internal memory accesses, not for off-chip accesses.
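The difference between the two modes can be pictured with a toy model. The sketch below is plain Python, not SHARC code, and it is not cycle-accurate; it assumes that in SIMD mode the second processing element receives the value stored next to the one used by PEX. The function name and the sum-of-squares workload are hypothetical.

```python
# Toy model of SISD vs. SIMD execution (illustrative only).

def run(memory, simd_enabled):
    """Sum of squares over `memory`; returns (pex_sum, pey_sum, accesses)."""
    pex_sum = 0.0
    pey_sum = 0.0
    accesses = 0
    step = 2 if simd_enabled else 1          # SIMD moves two values per access
    for i in range(0, len(memory), step):
        pex_sum += memory[i] * memory[i]     # PEX is always active
        if simd_enabled and i + 1 < len(memory):
            # PEY runs the same instruction on the neighbouring value (ASSUMPTION)
            pey_sum += memory[i + 1] * memory[i + 1]
        accesses += 1
    return pex_sum, pey_sum, accesses

samples = [1.0, 2.0, 3.0, 4.0]
print(run(samples, simd_enabled=False))   # all work done by PEX, four accesses
print(run(samples, simd_enabled=True))    # work split across PEX and PEY, two accesses
```

In this model the two partial sums still have to be combined at the end, but the loop needs half as many memory accesses, which mirrors the doubled bandwidth described above.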
Independent, parallel computation units
Within each processing element is a set of computation units. The computation units consist of an arithmetic/logic unit (ALU), a multiplier, and a shifter. These units execute single-cycle instructions. The three units within each processing element are arranged in parallel to maximize computational throughput. A single multifunction instruction executes parallel ALU and multiplier operations. In SIMD mode, the parallel ALU and multiplier operations occur in both processing elements. These computation units support IEEE 32-bit single-precision floating-point, 40-bit extended-precision floating-point, and 32-bit fixed-point data formats.
Data register file
A general-purpose data register file is contained in each processing element. The register files transfer data between the computation units and the data buses, and store intermediate results. These 10-port, 32-register (16 primary, 16 secondary) register files, combined with the SHARC enhanced Harvard architecture, allow unconstrained data flow between the computation units and memory. The registers in PEX are referred to as R0–R15, and the registers in PEY as S0–S15.
Single-Cycle Fetch of Instruction and Four Operands
The ADSP-21161N uses an enhanced Harvard architecture in which the data memory (DM) bus transfers data and the program memory (PM) bus transfers both instructions and data (see Figure 2). With its separate program and data memory buses and on-chip instruction cache, the processor can simultaneously fetch four operands (two over each data bus) and one instruction (from the cache), all in a single cycle.
Instruction cache
The ADSP-21161N includes an on-chip instruction cache that enables three-bus operation for fetching an instruction and four data values. The cache is selective: only instructions whose fetches conflict with PM bus data accesses are cached. This cache allows full-speed execution of the core for looped operations such as digital filter multiply-accumulates and FFT butterfly processing.
Data address generators with hardware circular buffers
The ADSP-21161N's two data address generators (DAGs) are used for indirect addressing and for implementing circular data buffers in hardware. Circular buffers allow efficient programming of delay lines and other data structures required in digital signal processing, and are commonly used in digital filters and Fourier transforms. The two DAGs contain enough registers to create up to 32 circular buffers (16 primary register sets, 16 secondary register sets). The DAGs handle address pointer wraparound automatically, reducing overhead, improving performance, and simplifying implementation. A circular buffer can start and end at any memory location.
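The pointer wraparound that the DAGs perform in hardware can be sketched in software. The example below is a minimal model, assuming an address is described by an index, a modify (step) value, a buffer base, and a buffer length, and that the step is never larger than the length. The function name and the buffer location are hypothetical.

```python
def post_modify(index, modify, base, length):
    """Return the next address after a circular post-modify update.

    Assumes abs(modify) <= length, so one correction step is enough.
    """
    index += modify
    if index >= base + length:     # stepped past the end: wrap to the start
        index -= length
    elif index < base:             # stepped before the start: wrap to the end
        index += length
    return index

BASE, LENGTH = 0x100, 5            # hypothetical buffer location and size
addr = BASE
visited = []
for _ in range(8):
    visited.append(addr - BASE)    # record the offset inside the buffer
    addr = post_modify(addr, 2, BASE, LENGTH)

print(visited)   # -> [0, 2, 4, 1, 3, 0, 2, 4]
```

In software this check costs a compare and a branch on every access; performing it in the address generator is what removes that overhead from filter and delay-line loops.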
Flexible instruction set
The 48-bit instruction word accommodates a variety of parallel operations for concise programming. For example, the ADSP-21161N can conditionally execute a multiply, an add, and a subtract in both processing elements while branching, all in a single instruction.
ADSP-21161N Memory and I/O Interface Capabilities
The ADSP-21161N adds the following architectural features to the ADSP-2116X series core.
Dual-port on-chip memory
The ADSP-21161N contains 1 Mbit of on-chip SRAM, organized as two blocks of 0.5 Mbit each (Figure 3). Each block can be configured for different combinations of code and data storage. Each memory block is dual-ported for single-cycle, independent accesses by the core processor and the I/O processor. The dual-ported memory, in combination with three separate on-chip buses, allows two data transfers from the core and one from the I/O processor in a single cycle. On the ADSP-21161N, the memory can be configured as a maximum of 32K words of 32-bit data, 64K words of 16-bit data, 21K words of 48-bit instructions (or 40-bit data), or combinations of different word sizes up to 1 Mbit. All of the memory can be accessed as 16-bit, 32-bit, 48-bit, or 64-bit words. A 16-bit floating-point storage format is supported, which effectively doubles the amount of data that can be stored on-chip. Conversion between the 32-bit and 16-bit floating-point formats is done in a single instruction.
Although each memory block can store combinations of code and data, accesses are most efficient when one block stores data, using the DM bus for transfers, and the other block stores instructions and data, using the PM bus for transfers. Using the DM bus and PM bus in this way, with one bus dedicated to each memory block, ensures single-cycle execution with two data transfers. In this case, the instruction must be available in the cache.
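The word-count limits quoted for each data width follow directly from dividing the total capacity by the word size. The short sketch below reproduces them, assuming a binary megabit (1,048,576 bits) and ignoring any physical layout constraints of the memory blocks; the names are hypothetical.

```python
TOTAL_BITS = 1 * 1024 * 1024        # 1 Mbit of on-chip SRAM (binary megabit assumed)

def max_words(word_bits):
    """Largest whole number of words of the given width that fit in the SRAM."""
    return TOTAL_BITS // word_bits

capacity = {bits: max_words(bits) for bits in (16, 32, 48)}

for bits, words in capacity.items():
    print(f"{bits}-bit words: {words} ({words // 1024}K)")
```

The 48-bit case does not divide evenly (21,845 whole words), which is why the limit is stated as 21K rather than a round power of two.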
Off-Chip Memory and Peripheral Interfaces
The external port of the ADSP-21161N provides the interface between the processor and off-chip memory and peripherals. The ADSP-21161N's unified address space includes 62.7 Mwords of off-chip address space (254.7 Mwords if all of it is SDRAM). The separate on-chip buses for PM addresses, PM data, DM addresses, DM data, I/O addresses, and I/O data are multiplexed at the external port to create an external system bus with a single 24-bit address bus and a single 32-bit data bus. Every access to external memory is based on an address that fetches a 32-bit word. When an instruction is fetched from external memory, two 32-bit data locations are accessed for the packed instruction. Unused link port lines can also be used as additional data lines DATA15–DATA0, allowing single-cycle execution of instructions from external memory at up to 110 MHz. Figure 4 shows the alignment of various accesses to external memory.
The external port supports asynchronous, synchronous, and synchronous burst accesses. Synchronous burst SRAM can be interfaced gluelessly, and the ADSP-21161N can also interface gluelessly to SDRAM. Addressing of external memory devices is facilitated by on-chip decoding of the high-order address lines to generate memory bank select signals. The ADSP-21161N provides programmable memory wait states and external memory acknowledge controls, allowing it to interface with memory and peripherals that have variable access, hold, and disable time requirements.
SDRAM interface
The SDRAM interface enables the ADSP-21161N to transfer data to and from synchronous DRAM (SDRAM) at the core clock frequency or at half the core clock frequency. The synchronous approach, coupled with the core clock frequency, supports high-throughput data transfers of up to 440 Mbytes/s for 32-bit transfers and 660 Mbytes/s for 48-bit transfers.
The SDRAM interface provides a glueless interface to standard SDRAMs (16 Mb, 64 Mb, 128 Mb, and 256 Mb) and includes options to support additional buffers between the ADSP-21161N and SDRAM. The interface is very flexible: SDRAM can be connected to any of the ADSP-21161N's four external memory banks, with up to all four banks mapped to SDRAM. Systems with several SDRAM devices connected in parallel may require buffering to meet overall system timing requirements. The ADSP-21161N supports pipelining of the address and control signals to enable such buffering between itself and multiple SDRAM devices.
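The SDRAM throughput figures are consistent with one word moving on every clock at the full core rate. The sketch below works this out, assuming a 110 MHz core clock and one transfer per clock with no overhead for refresh or row activation; these are simplifying assumptions, and the function name is hypothetical.

```python
CORE_CLOCK_HZ = 110_000_000     # ASSUMPTION: 110 MHz core clock

def peak_mbytes_per_second(word_bits, clock_hz=CORE_CLOCK_HZ):
    """Peak rate if one word of `word_bits` moves on every clock."""
    return clock_hz * (word_bits // 8) / 1e6

print(peak_mbytes_per_second(32))                       # 32-bit transfers, full rate
print(peak_mbytes_per_second(48))                       # 48-bit transfers, full rate
print(peak_mbytes_per_second(32, CORE_CLOCK_HZ // 2))   # 32-bit transfers, half rate
```

Running the interface at half the core clock halves each figure, which is the trade-off to weigh when slower SDRAM parts or added buffers are used.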
Target board JTAG emulator connector
The Analog Devices DSP Tools product line of JTAG emulators uses the IEEE 1149.1 JTAG test access port of the ADSP-21161N processor to monitor and control the target board processor during emulation. These emulators provide emulation at full processor speed, allowing inspection and modification of memory, registers, and processor stacks. The processor's JTAG interface ensures that the emulator does not affect target system loading or timing. For complete information on the operation of the Analog Devices DSP Tools JTAG emulators for SHARC, see the appropriate emulator hardware user's guide.
DMA controller
The ADSP-21161N's on-chip DMA controller enables zero-overhead data transfers without processor intervention. The DMA controller operates independently of, and invisibly to, the processor core, so DMA operations can occur while the core is simultaneously executing its program instructions. DMA transfers can occur between the ADSP-21161N's internal memory and external memory, external peripherals, or a host processor. DMA transfers can also occur between the ADSP-21161N's internal memory and its serial ports, link ports, or SPI-compatible (Serial Peripheral Interface) port. During DMA transfers from 8-bit, 16-bit, or 32-bit wide external memory, external bus packing and unpacking of 32-bit, 48-bit, or 64-bit words in internal memory is performed. Fourteen DMA channels are available on the ADSP-21161N: two are shared between the SPI interface and the link ports, eight serve the serial ports, and four serve the processor's external port (for host processor, other ADSP-21161N, memory, or I/O transfers). Programs can be downloaded to the ADSP-21161N using DMA transfers. Asynchronous off-chip peripherals can control two DMA channels using the DMA request/grant lines (DMAR2–1, DMAG2–1). Other DMA features include interrupt generation on completion of DMA transfers and DMA chaining for automatically linked DMA transfers.
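The bus packing mentioned above can be illustrated in software. The sketch below assembles narrow external-bus words into wider internal-memory words. It assumes that the first narrow word received lands in the least significant bits, which is an illustrative choice and not a statement about the device's actual packing order. It also handles only widths that divide evenly, so the 48-bit-from-32-bit case is left out. The function name is hypothetical.

```python
def pack_words(narrow_words, narrow_bits, wide_bits):
    """Pack narrow external-bus words into wider internal-memory words."""
    if wide_bits % narrow_bits != 0:
        raise ValueError("wide width must be a multiple of the narrow width")
    per_word = wide_bits // narrow_bits
    if len(narrow_words) % per_word != 0:
        raise ValueError("input does not fill a whole number of wide words")
    mask = (1 << narrow_bits) - 1
    packed = []
    for start in range(0, len(narrow_words), per_word):
        word = 0
        for position, value in enumerate(narrow_words[start:start + per_word]):
            # ASSUMPTION: first narrow word goes into the least significant bits
            word |= (value & mask) << (position * narrow_bits)
        packed.append(word)
    return packed

print([hex(w) for w in pack_words([0x78, 0x56, 0x34, 0x12], 8, 32)])   # -> ['0x12345678']
print([hex(w) for w in pack_words([0xBEEF, 0xDEAD], 16, 32)])          # -> ['0xdeadbeef']
```

Four 8-bit reads are needed per 32-bit word and six per 48-bit word, so the width of the external memory directly sets how many bus cycles each internal word costs.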
Multiprocessing
The ADSP-21161N offers powerful features tailored to multiprocessing DSP systems. The external port and link ports provide integrated glueless multiprocessing support.
The external port supports a unified address space (see Figure 3) that allows direct interprocessor accesses of each ADSP-21161N's internal memory-mapped (I/O processor) registers. All other internal memory can be accessed indirectly through DMA transfers initiated by programming the IOP DMA parameter and control registers. Distributed bus arbitration logic is included on-chip for simple, glueless connection of systems containing up to six ADSP-21161Ns and a host processor. Master processor changeover incurs only one cycle of overhead. Bus arbitration is selectable as either fixed or rotating priority. Bus lock allows indivisible read-modify-write sequences for semaphores. A vector interrupt is provided for interprocessor commands. At an instruction rate of 110 MHz, the maximum throughput for interprocessor data transfer over the external port is 440 Mbytes/s.
Two link ports provide a second method of multiprocessing communication. Each link port can support communication with another ADSP-21161N. Running at 110 MHz, the ADSP-21161N has a maximum throughput for interprocessor communication over the links of 220 Mbytes/s. The link ports and cluster multiprocessing can be used concurrently or independently.
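The difference between fixed and rotating bus priority can be shown with a generic arbitration model. The sketch below is not the on-chip arbitration logic. It assumes processor IDs 1 through 6, that fixed priority favours the lowest ID, and that rotating priority starts its search just after the previous bus master. All names are hypothetical.

```python
def fixed_priority(requests):
    """Grant the bus to the lowest-numbered requester (ASSUMPTION: low ID wins)."""
    return min(requests) if requests else None

def rotating_priority(requests, last_master, num_processors=6):
    """Grant the bus to the first requester found after the previous master."""
    for offset in range(1, num_processors + 1):
        candidate = (last_master - 1 + offset) % num_processors + 1
        if candidate in requests:
            return candidate
    return None

requests = {1, 3}
print(fixed_priority(requests))                     # -> 1
print(rotating_priority(requests, last_master=2))   # -> 3
print(rotating_priority(requests, last_master=3))   # -> 1
```

With fixed priority a busy low-numbered processor can keep winning the bus, while the rotating scheme gives every requester a turn, so the choice depends on whether one processor's traffic is more urgent than the rest.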
Link ports
The ADSP-21161N has two 8-bit link ports that provide additional I/O capabilities. Because each link port can run at 110 MHz, each can support 110 Mbytes/s. Link port I/O is especially useful for point-to-point interprocessor communication in multiprocessing systems. The link ports can operate independently and simultaneously, with a maximum combined data throughput of 220 Mbytes/s. Link port data is packed into 48-bit or 32-bit words that can be read directly by the core processor or transferred by DMA to on-chip memory. Each link port has its own double-buffered input and output registers. Clock/acknowledge handshaking controls link port transfers. Transfers are programmable as either transmit or receive.
Serial ports
The ADSP-21161N features four synchronous serial ports that provide an inexpensive interface to a wide variety of digital and mixed-signal peripherals. Each serial port consists of two data lines, a clock, and a frame sync. The data lines can be programmed to either transmit or receive.
The serial ports operate at up to half the core clock rate, giving each a maximum data rate of 55 Mbit/s. The serial data pins are programmable as either transmitters or receivers, providing greater flexibility for serial communications. Serial port data can be automatically transferred to and from on-chip memory via dedicated DMA. Each serial port has a time division multiplexed (TDM) multichannel mode, in which two serial ports are TDM transmitters and two are TDM receivers (SPORT0 Rx paired with SPORT2 Tx, and SPORT1 Rx paired with SPORT3 Tx). Each serial port also supports the I2S protocol (an industry-standard interface commonly used by audio codecs, ADCs, and DACs) on its two data pins, allowing four I2S channels per serial port (using two I2S stereo devices), for a maximum of 16 I2S channels. The serial ports allow little-endian or big-endian transmission formats and word lengths selectable from 3 to 32 bits. In I2S mode, the data word length is selectable between 8 and 32 bits. The serial ports offer selectable synchronization and transmit modes, as well as optional μ-law or A-law companding. Serial port clocks and frame syncs can be generated internally or externally.
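The serial port figures can be derived from the port count and clocking described above. The sketch below assumes a 110 MHz core clock and that each I2S stereo device contributes two channels (left and right) on one data pin; the names are hypothetical.

```python
CORE_CLOCK_HZ = 110_000_000     # ASSUMPTION: 110 MHz core clock
NUM_SERIAL_PORTS = 4            # from the text
DATA_PINS_PER_PORT = 2          # from the text
CHANNELS_PER_I2S_DEVICE = 2     # ASSUMPTION: left + right on one data pin

max_bit_rate = CORE_CLOCK_HZ // 2                      # half the core clock rate
i2s_channels_per_port = DATA_PINS_PER_PORT * CHANNELS_PER_I2S_DEVICE
total_i2s_channels = NUM_SERIAL_PORTS * i2s_channels_per_port

print(f"max serial bit rate: {max_bit_rate / 1e6:.0f} Mbit/s per port")
print(f"I2S channels: {i2s_channels_per_port} per port, {total_i2s_channels} in total")
```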
Serial Peripheral (Compatible) Interface
The Serial Peripheral Interface (SPI) is an industry-standard synchronous serial link that enables the ADSP-21161N SPI-compatible port to communicate with other SPI-compatible devices. SPI is a 4-wire interface consisting of two data pins, a device-select pin, and a clock pin. It is a full-duplex synchronous serial interface that supports both master and slave modes. The SPI port can operate in a multimaster environment by interfacing with up to four other SPI-compatible devices, acting as either a master or a slave. The ADSP-21161N SPI-compatible peripheral implementation also features programmable baud rate and clock phase/polarity. The ADSP-21161N SPI-compatible port uses open-drain drivers to support multimaster configurations and to avoid data contention.
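Full-duplex operation means that the master and the slave exchange data on every clock: each side shifts one bit out while shifting one bit in. The sketch below simulates a single transfer with two shift registers. It assumes MSB-first shifting and does not model clock phase, polarity, or the device-select pin; the function name is hypothetical.

```python
def spi_exchange(master_byte, slave_byte, bits=8):
    """Simulate one full-duplex transfer; returns (master_reg, slave_reg)."""
    mask = (1 << bits) - 1
    master_reg = master_byte & mask
    slave_reg = slave_byte & mask
    for _ in range(bits):                        # one iteration per clock
        mosi = (master_reg >> (bits - 1)) & 1    # master drives its top bit
        miso = (slave_reg >> (bits - 1)) & 1     # slave drives its top bit
        master_reg = ((master_reg << 1) & mask) | miso
        slave_reg = ((slave_reg << 1) & mask) | mosi
    return master_reg, slave_reg

received_by_master, received_by_slave = spi_exchange(0xA5, 0x3C)
print(hex(received_by_master), hex(received_by_slave))   # -> 0x3c 0xa5
```

After eight clocks the two registers have swapped contents, which is why a master that only wants to read still has to send a dummy byte.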
Host processor interface
The ADSP-21161N host interface allows easy connection to a standard 8-, 16-, or 32-bit microprocessor bus without requiring additional hardware. The host interface is accessed through the external port of the ADSP-21161N. Four DMA channels are available for the host interface, so code and data transfers are accomplished with low software overhead. The host processor requests the ADSP-21161N's external bus using the host bus request (HBR), host bus grant (HBG), and chip select (CS) signals. The host can directly read and write the internal IOP registers of the ADSP-21161N, and can access the DMA channel setup and message registers. By setting up DMA, the host can access any internal memory address through DMA transfers. Vector interrupt support provides efficient execution of host commands.
The host processor interface can be used in either multiprocessor or single-processor SHARC systems. For multiprocessor systems, host access to the SHARC requires that address pins ADDR17, ADDR18, ADDR19, and ADDR20 be driven low.
It is not enough to tie these pins to ground through a resistor (for example, 10 kΩ). These pins must be driven low with a strong enough drive strength (10–50 Ω) to overcome the SHARC keeper latches present on these pins. If the drive is not strong enough, data access failures can occur.
For single-processor SHARC systems using this host access feature, address pins ADDR17, ADDR18, ADDR19, and ADDR20 may be tied low (for example, through a 10 kΩ resistor), driven low by a buffer/driver, or left floating. Any of these options is sufficient.
General purpose I/O ports
The ADSP-21161N also contains 12 programmable general-purpose I/O pins that can be used as inputs or outputs. As outputs, these pins can send signals to peripheral devices; as inputs, these pins can provide tests for conditional branches.
Program booting
The internal memory of the ADSP-21161N can be booted at system power-up from an 8-bit EPROM, a host processor, the SPI interface, or one of the link ports. Selection of the boot source is controlled by the boot memory select (BMS), EBOOT (EPROM boot), and LBOOT (link/host boot) pins. An 8-, 16-, or 32-bit host processor can also be used for booting.
Phase-Locked Loop and Crystal Double Enable
The ADSP-21161N uses an on-chip phase-locked loop (PLL) to generate the internal clock for the core. The CLK_CFG1–0 pins select PLL ratios of 2:1, 3:1, and 4:1. In addition to the PLL ratios, the CLKDBL pin provides more clock ratio options. The rate set by the CLKDBL pin (1× or 2× CLKIN) determines the rate of the PLL input clock and the rate at which the external port operates. With the combination of CLK_CFG1–0 and CLKDBL, ratios of 2:1, 3:1, 4:1, 6:1, and 8:1 between the core and CLKIN are supported.
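The list of supported core-to-CLKIN ratios can be reproduced by combining the two settings. The sketch below assumes that the CLKDBL pin selects either 1x or 2x CLKIN as the PLL input and that this factor simply multiplies the PLL ratio; the names are hypothetical.

```python
from itertools import product

PLL_RATIOS = (2, 3, 4)       # selected by CLK_CFG1-0 (from the text)
CLKDBL_FACTORS = (1, 2)      # ASSUMPTION: CLKDBL selects 1x or 2x CLKIN

core_to_clkin = sorted({pll * dbl for pll, dbl in product(PLL_RATIOS, CLKDBL_FACTORS)})
print(core_to_clkin)         # -> [2, 3, 4, 6, 8]

def clkin_for_core(core_hz, ratio):
    """CLKIN frequency needed to reach `core_hz` at the given overall ratio."""
    return core_hz / ratio

print(clkin_for_core(100_000_000, 4) / 1e6, "MHz")   # -> 25.0 MHz
```

The 4:1 ratio appears twice (4 x 1 and 2 x 2), which is why six pin combinations yield only five distinct ratios.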
Power supplies
The ADSP-21161N has separate power supply connections for the analog (AVDD/AGND), internal (VDDINT), and external (VDDEXT) power supplies. The internal and analog supplies must meet the 1.8 V requirement. The external supply must meet the 3.3 V requirement. All external supply pins must be connected to the same supply.
Note that the analog supply (AVDD) powers the clock generator PLL of the ADSP-21161N. To produce a stable clock, provide an external circuit to filter the power input to the AVDD pin, and place the filter as close to the pin as possible. The AVDD filter circuit shown in Figure 6 must be added for each ADSP-21161N in a multiprocessor system. To prevent noise coupling, use a wide trace for the analog ground (AGND) signal and install a decoupling capacitor as close to the pin as possible.