2022-09-23 11:34:48
Spartan II System-Level Features - SelectRAM Hierarchical Memory
Introduction
The Spartan® II Field Programmable Gate Array family gives users high performance, abundant logic resources, and a rich feature set at a very low price. The six-member family offers densities ranging from 15,000 to 200,000 system gates. System performance is supported up to 200 MHz. Features include block RAM (up to 56K bits), distributed RAM (up to 75,264 bits), 16 selectable I/O standards, and four DLLs. Fast, predictable interconnect means that successive design iterations continue to meet timing requirements.
The Spartan II family is a superior alternative to mask-programmed ASICs. The FPGA avoids the initial cost, lengthy development cycles, and inherent risk of conventional ASICs. In addition, FPGA programmability permits design upgrades in the field with no hardware replacement necessary (impossible with ASICs).
Features
Second-generation ASIC replacement technology
- Densities as high as 5,292 logic cells with up to 200,000 system gates
- Streamlined features based on the Virtex® FPGA architecture
- Unlimited reprogrammability
- Very low cost
- Cost-effective 0.18-micron process
System-level features
- SelectRAM™ hierarchical memory:
  · 16 bits/LUT distributed RAM
  · Configurable 4K-bit block RAM
  · Fast interfaces to external RAM
- Fully PCI compliant
- Low-power segmented routing architecture
- Full readback ability for verification/observability
- Dedicated carry logic for high-speed arithmetic
- Efficient multiplier support
- Cascade chain for wide-input functions
- Abundant registers/latches with enable, set, and reset
- Four dedicated DLLs for advanced clock control
- Four primary low-skew global clock distribution nets
- IEEE 1149.1 compatible boundary scan logic
Versatile I/O and packaging
- Pb-free package options
- Low-cost packages available in all densities
- Family footprint compatibility in common packages
- 16 high-performance interface standards
- Hot-swap CompactPCI friendly
- Zero hold time simplifies system timing
Core logic powered at 2.5V and I/Os powered at 1.5V, 2.5V, or 3.3V
Fully supported by the powerful Xilinx® ISE® development system
- Fully automatic mapping, placement, and routing
Overview
Spartan II FPGAs have a regular, flexible, programmable architecture of Configurable Logic Blocks (CLBs) surrounded by a perimeter of programmable Input/Output Blocks (IOBs). There are four Delay-Locked Loops (DLLs), one at each corner of the die. Two columns of block RAM lie on opposite sides of the die, between the CLBs and the IOB columns. These functional elements are interconnected by a powerful hierarchy of versatile routing channels.
Spartan II FPGAs are customized by loading configuration data into internal static memory cells. Unlimited reprogramming cycles are possible with this approach. The values stored in these cells determine the logic functions and interconnections implemented in the FPGA. Configuration data can be read from an external serial PROM (master serial mode) or written into the FPGA in slave serial, slave parallel, or boundary scan mode.
Spartan II FPGAs are typically used in high-volume applications where the versatility of a fast programmable solution adds benefits. They are ideal for shortening product development cycles while offering a cost-effective solution for high-volume production.
Spartan II FPGAs achieve high-performance, low-cost operation through advanced architecture and semiconductor technology. Spartan II devices provide system clock rates up to 200 MHz. In addition to the conventional benefits of high-volume programmable logic solutions, Spartan II FPGAs offer on-chip synchronous single-port and dual-port RAM (block and distributed forms), DLL clock drivers, programmable set and reset on all flip-flops, fast carry logic, and many other features.
Spartan II Product Availability
Product availability is described by the maximum number of user I/Os on each device and the number of user I/Os available for each device/package combination. The four global clock pins can be used as additional user I/Os when they are not used as global clock pins. These pins are not included in the user I/O counts.
Architectural Description
Spartan II FPGA Array
The Spartan II field programmable gate array, shown in Figure 2, is composed of five major configurable elements:
• IOBs provide the interface between the package pins and the internal logic
• CLBs provide the functional elements for constructing most logic
• Dedicated block RAM memories of 4,096 bits each
• Clock DLLs for clock-distribution delay compensation and clock domain control
• Versatile multi-level interconnect structure
The CLBs form the central logic structure with easy access to all support and routing structures. The IOBs are located around all the logic and memory elements for easy and quick routing of signals on and off the chip.
Values stored in static memory cells control all the configurable logic elements and interconnect resources. These values are loaded into the memory cells at power-up and can be reloaded if the function of the device needs to change.
The following sections discuss these elements in detail.
Input/Output Block
The three IOB registers function either as edge-triggered D-type flip-flops or as level-sensitive latches. Each IOB has a clock signal (CLK) shared by the three registers and an independent clock enable (CE) signal for each register. In addition to the CLK and CE control signals, the three registers share a Set/Reset (SR) signal. For each register, this signal can be independently configured as a synchronous set, a synchronous reset, an asynchronous preset, or an asynchronous clear.
Polarity control is a feature that is not shown in the block diagram but is controlled by the software. The input and output buffers and all of the IOB control signals have independent polarity controls.
Optional pull-up and pull-down resistors and an optional weak-keeper circuit are attached to each pad. Prior to configuration, all outputs not involved in configuration are forced into their high-impedance state. The pull-down resistors and the weak-keeper circuits are inactive, but inputs can optionally be pulled up.
All pads are protected against damage from electrostatic discharge (ESD) and from overvoltage transients. Two forms of overvoltage protection are provided, one that permits 5V compliance and one that does not. For 5V compliance, a zener-like structure connected to ground turns on when the output rises to approximately 6.5V. When 5V compliance is not required, a conventional clamp diode can be connected to the output supply voltage, VCCO. The type of overvoltage protection can be selected independently for each pad.
All Spartan II FPGA IOBs support IEEE 1149.1 compatible boundary scan testing.
Input Path
A buffer in the Spartan II FPGA IOB input path routes the input signal either directly to internal logic or through an optional input flip-flop.
An optional delay element at the D input of this flip-flop eliminates pad-to-pad hold time. The delay is matched to the internal clock-distribution delay of the FPGA and, when used, ensures that the pad-to-pad hold time is zero.
Each input buffer can be configured to conform to any of the supported low-voltage signaling standards. In some of these standards the input buffer uses a user-supplied threshold voltage, VREF.
Each input has optional pull-up and pull-down resistors for use after configuration.
Output Path
The output path includes a 3-state output buffer that drives the output signal onto the pad. The output signal can be routed to the buffer directly from the internal logic or through an optional IOB output flip-flop.
The 3-state control of the output can also be routed directly from the internal logic or through a flip-flop that provides synchronous enable and disable.
Each output driver can be individually programmed for a wide range of low-voltage signaling standards. Each output buffer can source up to 24 mA and sink up to 48 mA. Drive strength and slew rate controls minimize bus transients.
In most signaling standards, the output high voltage depends on an externally supplied VCCO voltage. The need to supply VCCO imposes constraints on which standards can be used close to each other. See "I/O Banking".
An optional weak-keeper circuit is connected to each output. When selected, the circuit monitors the voltage on the pad and weakly drives the pin high or low to match the input signal. If the pin is connected to a multiple-source signal, the weak keeper holds the signal in its last state if all drivers are disabled. Maintaining a valid logic level in this way helps eliminate bus chatter.
Because the weak-keeper circuit uses the IOB input buffer to monitor the input level, an appropriate VREF voltage must be provided if the signaling standard requires one. The provision of this voltage must comply with the I/O banking rules.
I/O Banking
Some of the I/O standards described above require VCCO and/or VREF voltages. These voltages are connected externally to device pins that serve groups of IOBs, called banks. Consequently, restrictions exist on which I/O standards can be combined within a given bank.
Eight I/O banks result from separating each edge of the FPGA into two banks. Each bank has multiple VCCO pins, which must all be connected to the same voltage. This voltage is determined by the output standards in use.
Some input standards require a user-supplied threshold voltage, VREF. In this case, certain user I/O pins are automatically configured as inputs for the VREF voltage. About one in six of the I/O pins in the bank assume this role.
The VREF pins within a bank are interconnected internally, so only one VREF voltage can be used within each bank. However, all VREF pins in the bank must be connected to the external voltage source for correct operation.
Within a bank, inputs that require VREF can be mixed with those that do not, but only one VREF voltage can be used within a bank. Input buffers that use VREF are not 5V tolerant. LVTTL, LVCMOS2, and PCI are 5V tolerant. The VCCO and VREF pins for each bank appear in the device pinout tables.
Within a given package, the number of VREF and VCCO pins can vary depending on the size of the device. In larger devices, more I/O pins convert to VREF pins. Because these are always a superset of the VREF pins used for smaller devices, it is possible to design a PCB that permits migration to a larger device. All VREF pins for the largest device anticipated must be connected to the VREF voltage and not used for I/O.
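The banking rules above lend themselves to a small consistency check. The following Python sketch is illustrative only: the `BankIO` record, the `bank_conflicts` function, and any voltages passed to it are assumptions made for the example, not values taken from the device documentation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BankIO:
    """One I/O placed in a bank (hypothetical record for this sketch)."""
    name: str
    is_output: bool
    vcco: Optional[float] = None  # output supply the standard needs, if any
    vref: Optional[float] = None  # input threshold the standard needs, if any


def bank_conflicts(ios: List[BankIO]) -> List[str]:
    """Return the banking-rule violations among I/Os that share one bank."""
    problems = []
    # All VCCO pins of a bank carry one voltage, set by the output standards in use.
    vccos = {io.vcco for io in ios if io.is_output and io.vcco is not None}
    if len(vccos) > 1:
        problems.append(f"outputs need different VCCO voltages: {sorted(vccos)}")
    # VREF pins of a bank are tied together internally, so only one VREF is possible.
    vrefs = {io.vref for io in ios if not io.is_output and io.vref is not None}
    if len(vrefs) > 1:
        problems.append(f"inputs need different VREF voltages: {sorted(vrefs)}")
    return problems
```

Inputs that need no VREF simply leave `vref` as `None`, which mirrors the rule that such inputs can be mixed freely with VREF-based inputs in the same bank.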
Configurable Logic Block
The basic building block of the Spartan II FPGA CLB is the logic cell (LC). An LC includes a 4-input function generator, carry logic, and a storage element. The output from the function generator in each LC drives both the CLB output and the D input of the flip-flop. Each Spartan II FPGA CLB contains four LCs, organized in two similar slices. In addition to the four basic LCs, the Spartan II FPGA CLB contains logic that combines function generators to provide functions of five or six inputs.
Look-Up Tables
Spartan II FPGA function generators are implemented as 4-input look-up tables (LUTs). In addition to operating as a function generator, each LUT can provide a 16 x 1-bit synchronous RAM. Furthermore, the two LUTs within a slice can be combined to create a 16 x 2-bit or 32 x 1-bit synchronous RAM, or a 16 x 1-bit dual-port synchronous RAM.
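A behavioral sketch may help show why one structure serves both roles: a 4-input LUT is a 16-entry table, and the four inputs act as the address of one stored bit. The Python below is a simplified model with hypothetical names (`make_lut4`, `lut4_read`, `Ram16x1`); it does not describe the actual circuit or any vendor primitive.

```python
from typing import Callable, List


def make_lut4(func: Callable[[int, int, int, int], int]) -> List[int]:
    """Build the 16-entry truth table for a 4-input Boolean function."""
    table = []
    for addr in range(16):
        a, b, c, d = addr & 1, (addr >> 1) & 1, (addr >> 2) & 1, (addr >> 3) & 1
        table.append(func(a, b, c, d) & 1)
    return table


def lut4_read(table: List[int], a: int, b: int, c: int, d: int) -> int:
    """Evaluate the LUT: the four inputs form the address of one stored bit."""
    return table[a | (b << 1) | (c << 2) | (d << 3)]


class Ram16x1:
    """The same 16 bits used as a 16 x 1 synchronous RAM (writes occur on a clock)."""

    def __init__(self) -> None:
        self.bits = [0] * 16

    def clock(self, addr: int, data_in: int, write_enable: int) -> None:
        """One clock edge: store data_in at addr when write_enable is set."""
        if write_enable:
            self.bits[addr & 0xF] = data_in & 1

    def read(self, addr: int) -> int:
        return self.bits[addr & 0xF]
```

For example, `make_lut4(lambda a, b, c, d: a ^ b ^ c ^ d)` produces the table for a 4-input parity function, which `lut4_read` then evaluates by simple indexing.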
The Spartan II FPGA LUT can also provide a 16-bit shift register that is ideal for capturing high-speed or burst-mode data. This mode can also be used to store data in applications such as digital signal processing.
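As a rough behavioral picture of the shift-register mode, the sketch below shifts one bit in per clock and keeps the last 16 bits. The class name and the addressable `tap` method are assumptions of this example; the text above only states that a 16-bit shift register is available.

```python
class ShiftRegister16:
    """Behavioral model of a LUT used as a 16-bit shift register."""

    def __init__(self) -> None:
        self.bits = [0] * 16  # bits[0] is the most recently shifted-in bit

    def clock(self, data_in: int) -> None:
        """Shift one bit in; the oldest bit falls off the end."""
        self.bits = [data_in & 1] + self.bits[:-1]

    def tap(self, addr: int) -> int:
        """Read the bit shifted in `addr` clocks before the latest one."""
        return self.bits[addr & 0xF]
```

Capturing a burst then amounts to clocking the incoming bits in and reading them back from the taps at a slower rate.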
Storage Elements
The storage elements in the Spartan II FPGA slice can be configured either as edge-triggered D-type flip-flops or as level-sensitive latches. The D inputs can be driven either by the function generators within the slice or directly from slice inputs, bypassing the function generators.
In addition to the clock and clock enable signals, each slice has synchronous set and reset signals (SR and BY). SR forces a storage element into the initialization state specified for it in the configuration. BY forces it into the opposite state. Alternatively, these signals can be configured to operate asynchronously.
All of the control signals are independently invertible and are shared by the two flip-flops within the slice.
Additional Logic
The F5 multiplexer in each slice combines the function generator outputs. This combination provides either a function generator that can implement any 5-input function, a 4:1 multiplexer, or selected functions of up to nine inputs.
Similarly, the F6 multiplexer combines the outputs of all four function generators in the CLB by selecting one of the F5 multiplexer outputs. This permits the implementation of any 6-input function, an 8:1 multiplexer, or selected functions of up to 19 inputs.
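The way two and then four 4-input LUTs extend to five and six inputs can be checked with a small model. This Python sketch assumes the LUTs share their four inputs and that the fifth and sixth inputs drive the multiplexer selects; the function names are hypothetical and chosen for the example.

```python
from typing import Callable, List


def lut4(table: List[int], inputs: List[int]) -> int:
    """Look up one bit in a 16-entry table addressed by four input bits."""
    addr = inputs[0] | (inputs[1] << 1) | (inputs[2] << 2) | (inputs[3] << 3)
    return table[addr]


def f5(table_lo: List[int], table_hi: List[int], inputs: List[int]) -> int:
    """Two LUTs share inputs[0:4]; inputs[4] drives the F5 multiplexer select."""
    lo = lut4(table_lo, inputs[:4])
    hi = lut4(table_hi, inputs[:4])
    return hi if inputs[4] else lo


def f6(tables: List[List[int]], inputs: List[int]) -> int:
    """Four LUTs and two F5 muxes; inputs[5] selects between the F5 outputs."""
    lo = f5(tables[0], tables[1], inputs[:5])
    hi = f5(tables[2], tables[3], inputs[:5])
    return hi if inputs[5] else lo


def tables_for(func: Callable[[List[int]], int], n_inputs: int) -> List[List[int]]:
    """Split an n-input truth function (n = 5 or 6) into 16-entry LUT tables."""
    tables = []
    for upper in range(1 << (n_inputs - 4)):
        table = []
        for addr in range(16):
            bits = [(addr >> i) & 1 for i in range(4)]
            bits += [(upper >> i) & 1 for i in range(n_inputs - 4)]
            table.append(func(bits) & 1)
        tables.append(table)
    return tables
```

Because each LUT holds an arbitrary 16-entry table, selecting between two of them with a fifth input covers every 5-input function, and selecting between two such pairs covers every 6-input function.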
Each CLB has four direct feedthrough paths, one per LC. These paths provide extra data input lines or additional local routing that does not consume logic resources.
Arithmetic Logic
Dedicated carry logic provides the capability for high-speed arithmetic functions. The Spartan II FPGA CLB supports two separate carry chains, one per slice. The height of the carry chains is two bits per CLB.
The arithmetic logic includes an XOR gate that allows a 1-bit full adder to be implemented within an LC. In addition, a dedicated AND gate improves the efficiency of multiplier implementations.
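A 1-bit full adder built this way can be modeled in a few lines. In the Python sketch below, the function generator computes the half sum, the dedicated XOR adds the carry-in, and a carry multiplexer produces the carry-out; the multiplexer form of the carry is an assumption of this model, and the function names are hypothetical.

```python
from typing import Tuple


def full_adder_lc(a: int, b: int, carry_in: int) -> Tuple[int, int]:
    """One logic cell acting as a 1-bit full adder; returns (sum, carry_out)."""
    half_sum = a ^ b                          # function generator output
    total = half_sum ^ carry_in               # dedicated XOR gate
    carry_out = carry_in if half_sum else a   # carry multiplexer (assumed form)
    return total, carry_out


def ripple_add(x: int, y: int, width: int) -> Tuple[int, int]:
    """Chain `width` logic cells along a carry chain; returns (sum, carry_out)."""
    carry, result = 0, 0
    for i in range(width):
        bit, carry = full_adder_lc((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result, carry
```

When the two operand bits differ, the carry-in propagates; when they are equal, either operand bit is the carry-out, which is why the multiplexer needs only the half sum as its select.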
A dedicated carry path is also available for cascading function generators to implement wide logic functions.
3-State Buffers
Each Spartan II FPGA CLB contains two 3-state drivers (BUFTs) that can drive on-chip buses. See "Dedicated Routing". Each Spartan II FPGA BUFT has an independent 3-state control pin and an independent input pin.
Block RAM
Spartan II FPGAs incorporate several large block RAM memories. These complement the distributed RAM look-up tables (LUTs), which provide shallow memory structures implemented in CLBs.
Block RAM memory blocks are organized in columns. All Spartan II devices contain two such columns, one along each vertical edge. These columns extend the full height of the chip. Each memory block is four CLBs high, so a Spartan II device eight CLBs high contains two memory blocks per column, for a total of four blocks.
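The arithmetic in the paragraph above can be written out as a short helper. This Python sketch uses only the figures stated in the text (4,096 bits per block, four CLBs per block, two columns); the function names and the row counts passed to them are illustrative.

```python
BLOCK_RAM_BITS = 4096    # bits per block RAM, as stated in the text
CLBS_PER_BLOCK = 4       # each block is four CLBs high
BLOCK_RAM_COLUMNS = 2    # one column along each vertical edge


def block_ram_count(clb_rows: int) -> int:
    """Number of block RAMs in a device whose CLB array is `clb_rows` high."""
    return BLOCK_RAM_COLUMNS * (clb_rows // CLBS_PER_BLOCK)


def block_ram_bits(clb_rows: int) -> int:
    """Total block RAM capacity in bits."""
    return block_ram_count(clb_rows) * BLOCK_RAM_BITS
```

For the eight-row example, `block_ram_count(8)` gives four blocks. A hypothetical 28-row array would give 14 blocks, or 57,344 bits, which equals the 56K-bit family maximum quoted in the introduction.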
Programmable Routing Matrix
It is the longest delay path that limits the speed of any worst-case design. Consequently, the Spartan II routing architecture and its place-and-route software were defined in a single optimization process. This joint optimization minimizes long-path delays and consequently yields the best system performance.
Joint optimization also reduces design compilation time because the architecture is software friendly. Due to the shortened design iteration time, the design cycle is correspondingly reduced.
Local Routing
Local routing resources provide the following three types of connections:
• Interconnections among the LUTs, flip-flops, and the General Routing Matrix (GRM)
• Internal CLB feedback paths that provide high-speed connections to LUTs within the same CLB, chaining them together with minimal routing delay
• Direct paths that provide high-speed connections between horizontally adjacent CLBs, eliminating the delay of the GRM
General Purpose Routing
Most Spartan II FPGA signals are routed on the general purpose routing, so the majority of interconnect resources are associated with this level of the routing hierarchy. The general routing resources are located in horizontal and vertical routing channels associated with the rows and columns of CLBs. The general purpose routing resources are listed below.
• Adjacent to each CLB is a General Routing Matrix (GRM). The GRM is the switch matrix through which horizontal and vertical routing resources connect, and it is also the means by which the CLB gains access to the general purpose routing.
• 24 single-length lines route GRM signals to adjacent GRMs in each of the four directions.
• 96 buffered hex lines route GRM signals to other GRMs six blocks away in each of the four directions. Organized in a staggered pattern, hex lines can be driven only at their endpoints. Hex-line signals can be accessed either at the endpoints or at the midpoint (three blocks from the source). One third of the hex lines are bidirectional, while the remaining ones are unidirectional.
• 12 long lines are buffered, bidirectional wires that distribute signals across the device quickly and efficiently. Vertical long lines span the full height of the device, and horizontal ones span the full width of the device.
I/O Routing
Spartan II devices have additional routing resources around their periphery that form an interface between the CLB array and the IOBs. This additional routing, called the VersaRing, facilitates pin swapping and pin locking, so that logic redesigns can adapt to existing PCB layouts. Time to market is reduced, because PCBs and other system components can be manufactured while the logic design is still in progress.
Dedicated Routing
Some classes of signal require dedicated routing resources to maximize performance. In the Spartan II architecture, dedicated routing resources are provided for two classes of signal.
• Horizontal routing resources are provided for on-chip 3-state buses. Four partitionable bus lines are provided per CLB row, permitting multiple buses within a row, as shown in Figure 7.
• Two dedicated nets per CLB propagate carry signals vertically to the adjacent CLB.
Global Routing
Global routing resources distribute clocks and other signals with very high fanout throughout the device. Spartan II devices include two tiers of global routing resources, referred to as primary and secondary global routing resources.
• The primary global routing resources are four dedicated global nets with dedicated input pins, designed to distribute high-fanout clock signals with minimal skew. Each global clock net can drive all CLB, IOB, and block RAM clock pins. The primary global nets can be driven only by global buffers. There are four global buffers, one for each global net.
• The secondary global routing resources consist of 24 backbone lines, 12 across the top of the chip and 12 across the bottom. From these lines, up to 12 unique signals per column can be distributed through the 12 long lines in the column. These secondary resources are more flexible than the primary resources because they are not restricted to routing only to clock pins.
Clock Distribution
The Spartan II family provides high-speed, low-skew clock distribution through the primary global routing resources described above.
Four global buffers are provided, two at the top center of the device and two at the bottom center. These drive the four primary global nets, which in turn drive any clock pin.
Four dedicated clock pads are provided, one adjacent to each of the global buffers. The input to a global buffer is selected either from these pads or from signals in the general purpose routing. Global clock pins do not have the option of internal weak pull-up resistors.
Delay-Locked Loop (DLL)
Associated with each global clock input buffer is a fully digital Delay-Locked Loop (DLL) that can eliminate skew between the clock input pad and the internal clock input pins throughout the device. Each DLL can drive two global clock networks. The DLL monitors the input clock and the distributed clock, and automatically adjusts a clock delay element. Additional delay is introduced so that clock edges reach internal flip-flops exactly one clock period after they arrive at the input. This closed-loop system effectively eliminates clock-distribution delay by ensuring that clock edges arrive at internal flip-flops in synchronism with clock edges arriving at the input.
In addition to eliminating clock-distribution delay, the DLL provides advanced control of multiple clock domains. The DLL provides four quadrature phases of the source clock, and it can double the clock or divide the clock by 1.5, 2, 2.5, 3, 4, 5, 8, or 16. It has six outputs.
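The six outputs can be tabulated for a given input clock. The Python sketch below uses the divide values listed in the text; the output labels (`phase_0`, `doubled`, `divided`, and so on) are placeholders invented for this example rather than the device's port names.

```python
from typing import Dict

DIVIDE_VALUES = ("1.5", "2", "2.5", "3", "4", "5", "8", "16")  # from the text
PHASES_DEG = (0, 90, 180, 270)  # the four quadrature phases


def dll_frequencies(clkin_mhz: float, divide: str = "2") -> Dict[str, float]:
    """Frequencies in MHz of the four phase outputs, the doubled and the divided clock."""
    if divide not in DIVIDE_VALUES:
        raise ValueError(f"unsupported divide value: {divide}")
    out = {f"phase_{deg}": clkin_mhz for deg in PHASES_DEG}
    out["doubled"] = clkin_mhz * 2
    out["divided"] = clkin_mhz / float(divide)
    return out
```

With a 100 MHz input and a divide value of 2.5, the sketch reports four 100 MHz phase outputs, a 200 MHz doubled clock, and a 40 MHz divided clock.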
The DLL also operates as a clock mirror. By driving the output from a DLL off-chip and then back on again, the DLL can be used to deskew a board-level clock among multiple Spartan II devices.
To guarantee that the system clock is operating correctly before the FPGA starts up after configuration, the DLL can delay the completion of the configuration process until after it has achieved lock.
Boundary Scan
Spartan II devices support all the mandatory boundary scan instructions specified in IEEE Standard 1149.1. A Test Access Port (TAP) and registers are provided that implement the EXTEST, SAMPLE/PRELOAD, and BYPASS instructions. The TAP also supports two USERCODE instructions and internal scan chains.
The TAP uses dedicated package pins that always operate using LVTTL. For TDO to operate using LVTTL, the VCCO for Bank 2 must be 3.3V. Otherwise, TDO switches rail-to-rail between ground and VCCO. TDI, TMS, and TCK have a default internal weak pull-up resistor, and TDO has no default resistor. Bitstream options allow any of the four TAP pins to be set to have an internal pull-up, a pull-down, or neither.
Boundary scan operation is independent of individual IOB configurations and unaffected by package type. All IOBs, including unbonded ones, are treated as independent 3-state bidirectional pins in a single scan chain. Retention of the bidirectional test capability after configuration facilitates the testing of external interconnections.
Internal signals can be captured during EXTEST by connecting them to unbonded or unused IOBs. They can also be connected to the unused outputs of IOBs defined as unidirectional input pins.
The public boundary scan instructions are available prior to configuration. After configuration, the public instructions remain available together with any USERCODE instructions installed during configuration. While the SAMPLE and BYPASS instructions are available during configuration, it is recommended that boundary scan operations not be performed during this transitional period.
In addition to the test instructions outlined above, the boundary scan circuitry can be used to configure the FPGA and to read back the configuration data.
Development System
Spartan II FPGAs are supported by the Xilinx ISE® development tools. The basic methodology for Spartan II FPGA design consists of three interrelated steps: design entry, implementation, and verification. Industry-standard tools are used for design entry and simulation, while Xilinx provides proprietary architecture-specific tools for implementation.
The Xilinx development system is integrated under a single graphical interface, giving designers a common user interface regardless of their choice of entry and verification tools. The software simplifies the selection of implementation options with pull-down menus and online help.
For HDL design entry, the Xilinx FPGA development system provides interfaces to several synthesis design environments.
A standard interface-file specification, Electronic Design Interchange Format (EDIF), simplifies file transfers into and out of the development system.
Spartan II FPGAs are supported by a unified library of standard functions. This library contains over 400 primitives and macros, ranging from 2-input AND gates to 16-bit accumulators, and includes arithmetic functions, comparators, counters, data registers, decoders, encoders, I/O functions, latches, Boolean functions, multiplexers, shift registers, and barrel shifters.
The design environment supports hierarchical design entry. These hierarchical design elements are automatically combined by the implementation tools. Different design entry tools can be combined within a hierarchical design, allowing the most convenient entry method to be used for each portion of the design.
Design Implementation
The place-and-route tools (PAR) automatically provide the implementation flow described in this section. The partitioner takes the EDIF netlist for the design and maps the logic into the architectural resources of the FPGA (CLBs and IOBs, for example). The placer then determines the best locations for these blocks based on their interconnections and the desired performance. Finally, the router interconnects the blocks.
The PAR algorithms support fully automatic implementation of most designs. For demanding applications, however, the user can exercise various degrees of control over the process. User partitioning, placement, and routing information can optionally be specified during design entry. The implementation of highly structured designs can benefit greatly from basic floorplanning.
The implementation software incorporates timing-driven placement and routing. Designers specify timing requirements along entire paths during design entry. The timing path analysis routines in PAR then recognize these user-specified requirements and accommodate them.
Timing requirements are entered in a form directly related to the system requirements, such as the target clock frequency or the maximum allowable delay between two registers. In this way, the overall performance of the system along entire signal paths is automatically tailored to user-generated specifications. Specific timing information for individual nets is unnecessary.
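To make the relationship between a target clock frequency and an allowable register-to-register delay concrete, the following Python sketch converts one into the other. It is a simplification under stated assumptions: setup time, clock skew, and other margins are ignored, and the function names are hypothetical rather than part of any tool.

```python
def period_ns(target_mhz: float) -> float:
    """Clock period in nanoseconds for a target frequency in MHz."""
    return 1000.0 / target_mhz


def slack_ns(target_mhz: float, path_delay_ns: float) -> float:
    """Positive slack means the register-to-register path meets the target."""
    return period_ns(target_mhz) - path_delay_ns
```

A 200 MHz target corresponds to a 5 ns period, so under these assumptions any register-to-register path slower than 5 ns would miss the requirement.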
Design Verification
In addition to conventional software simulation, FPGA users can use in-circuit debugging techniques. Because Xilinx devices are infinitely reprogrammable, designs can be verified in real time without the need for extensive sets of software simulation vectors.
The development system supports both software simulation and in-circuit debugging techniques. For simulation, the system extracts the post-layout timing information from the design database and back-annotates this information into the netlist for use by the simulator. Alternatively, the user can verify timing-critical portions of the design using the static timing analyzer.
For in-circuit debugging, the development system includes a download cable that connects the FPGA in the target system to a PC or workstation. After downloading the design into the FPGA, the designer can read back the contents of the flip-flops and so observe the internal logic state. Simple modifications can be downloaded into the system in a matter of minutes.
Configuration
Configuration is the process by which the bitstream of a design, as generated by the Xilinx software, is loaded into the internal configuration memory of the FPGA. Spartan II devices support both serial configuration, using the master/slave serial and JTAG modes, and byte-wide configuration, using the slave parallel mode.
Configuration File
Spartan II devices are configured by sequentially loading frames of data that have been concatenated into a configuration file.
It is important to note that, while a PROM is commonly used to store configuration data before loading it into the FPGA, it is by no means required. Any of a number of under-utilized nonvolatile memories already available on or off the board (for example, hard drives and flash cards) can be used.
Signals
There are two kinds of pins used to configure Spartan II devices: dedicated pins perform only specific configuration-related functions, while the other pins can serve as general-purpose I/Os once user operation has begun.
The dedicated pins comprise the mode pins (M2, M1, M0), the configuration clock pin (CCLK), the PROGRAM pin, the DONE pin, and the boundary scan pins (TDI, TDO, TMS, TCK). Depending on the configuration mode chosen, CCLK can be an output generated by the FPGA or can be generated externally and provided to the FPGA as an input.
Note that some configuration pins can act as outputs. For correct operation, these pins require a VCCO of 3.3V to drive an LVTTL signal or 2.5V to drive an LVCMOS signal. For details, see the pinout tables in Module 4 and XAPP176, Spartan II FPGA Family Configuration and Readback.
The sequence of steps necessary to configure Spartan II devices is shown in Figure 11. The overall flow can be divided into the following phases:
• Initiating configuration
• Configuration memory clear
• Loading data frames
• Start-up
The memory clearing and start-up phases are the same for all configuration modes; however, the steps for loading data frames differ. Thus, the details of data frame loading are described separately in the sections devoted to each mode.
Initiating Configuration
There are two different ways to initiate the configuration process: applying power to the device or asserting the PROGRAM input.
Configuration on power-up occurs automatically unless it is delayed by the user, as described in a separate section below. Before configuration can begin, VCCO Bank 2 must be greater than 1.0V.
Furthermore, all VCCINT power pins must be connected to a 2.5V supply.
Once in user operation, the device can be reconfigured simply by pulling the PROGRAM pin low. The device acknowledges the beginning of the configuration process by driving DONE low, and then enters the memory clearing phase.
Clearing Configuration Memory
The device indicates that clearing of the configuration memory is in progress by driving INIT low. At this time, the user can delay configuration by holding either PROGRAM or INIT low, which causes the device to remain in the memory clearing phase. Note that the bidirectional INIT line is driving a low logic level during memory clearing. To avoid contention, use an open-drain driver to keep INIT low.
With no delay in force, the device indicates that the memory is completely clear by driving INIT high. The FPGA samples its mode pins on this low-to-high transition.
Loading Configuration Data
Once INIT is high, the user can begin loading configuration data frames into the device. The details of loading the configuration data are discussed in the sections covering the individual configuration modes. The sequence of operations necessary to load configuration data using the serial modes is shown in Figure 14. Loading data using the slave parallel mode is shown in Figure 19.
CRC Error Checking
During the loading of configuration data, a CRC value embedded in the configuration file is checked against a CRC value calculated within the FPGA. If the CRC values do not match, the FPGA drives INIT low to indicate that a frame error has occurred, and configuration is aborted.
To reconfigure the device, the PROGRAM pin should be asserted to reset the configuration logic. Cycling power also resets the FPGA for configuration. See "Clearing Configuration Memory".
Start-up
The start-up sequence oversees the transition of the FPGA from the configuration state to full user operation. A match of CRC values, indicating successful loading of the configuration data, initiates the sequence.
During start-up, the device performs four operations:
1. Assertion of DONE. Failure of DONE to go high may indicate that the configuration data did not load successfully.
2. Release of the Global Three-State net (GTS). This activates the I/Os to which signals are assigned. The remaining I/Os stay in a high-impedance state with internal weak pull-down resistors present.
3. Negation of Global Set/Reset (GSR). This allows all flip-flops to change state.
4. Assertion of Global Write Enable (GWE). This allows all RAMs and flip-flops to change state.
By default, these operations are synchronized to CCLK.
The entire start-up sequence lasts eight cycles, called C0-C7, after which the loaded design is fully functional.
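The configuration flow described so far, from initiation through start-up, can be summarized as an ordered list of phases. The Python sketch below is a reading aid only: the phase labels and the `configuration_flow` function are invented for this example, and the start-up events are listed in the order given in the text, without implying which of the cycles C0-C7 each one occupies.

```python
from typing import List

# Start-up operations in the order the text lists them.
STARTUP_EVENTS = ["DONE asserted", "GTS released", "GSR negated", "GWE asserted"]


def configuration_flow(crc_ok: bool) -> List[str]:
    """Ordered phases of one configuration attempt."""
    phases = [
        "initiate (power-up or PROGRAM low)",
        "clear configuration memory (INIT low)",
        "INIT high, mode pins sampled",
        "load data frames",
    ]
    if not crc_ok:
        # A CRC mismatch aborts configuration; INIT is driven low.
        return phases + ["abort (INIT low, frame error)"]
    # A CRC match starts the start-up sequence.
    return phases + [f"start-up: {event}" for event in STARTUP_EVENTS]
```

Calling `configuration_flow(False)` ends at the abort phase, which matches the requirement to assert PROGRAM or cycle power before another attempt.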
The bottom half shows another commonly used version of startup timing, called Sync to Completion. This version makes gts, gsr and gwe events conditional on done pin going high. This timing is important for daisy-chaining multiple FPGAs in serial mode, as it ensures that all FPGAs start together after all done pins go high.
Sync-to-DONE timing is selected by setting the GTS, GSR, and GWE cycles to a value of DONE in the configuration options. This causes these signals to transition one clock cycle after DONE externally transitions High.
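The effect of Sync-to-DONE on a chain of devices can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and the per-device cycle list are hypothetical, and the sketch assumes the DONE pins are tied together so that the shared line goes High only when the last device releases it.

```python
def sync_to_done_release_cycle(done_high_cycles):
    """Return the CCLK cycle on which GTS, GSR and GWE transition.

    done_high_cycles is a hypothetical list giving, for each FPGA in the
    chain, the CCLK cycle on which that device releases its DONE pin.
    """
    if not done_high_cycles:
        raise ValueError("at least one device is required")
    # Assumption: the shared DONE line is High only once the slowest
    # device has released it.
    shared_done_cycle = max(done_high_cycles)
    # The startup events follow one clock cycle after DONE goes High.
    return shared_done_cycle + 1


print(sync_to_done_release_cycle([120, 340, 560]))  # 561
```

All devices see the same release cycle, which is why they start together regardless of when each one finished loading.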
Serial Modes
There are two serial configuration modes. In Master Serial mode, the FPGA controls the configuration process by driving CCLK as an output. In Slave Serial mode, the FPGA passively receives CCLK as an input from an external agent that controls the configuration process (such as a microprocessor, CPLD, or a second FPGA in master mode). In both modes, the FPGA is configured by loading one bit per CCLK cycle. The MSB of each configuration data byte is always written to the DIN pin first.
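The bit ordering described above (one bit per CCLK cycle, MSB of each byte first) can be illustrated with a short Python sketch. The function name is hypothetical and the code only models the ordering, not the electrical interface.

```python
def bytes_to_din_bits(data):
    """Yield the DIN bit for each CCLK cycle, MSB of each byte first."""
    for byte in data:
        for shift in range(7, -1, -1):   # bit 7 (MSB) down to bit 0
            yield (byte >> shift) & 1


bits = list(bytes_to_din_bits(b"\xA5"))
print(bits)  # [1, 0, 1, 0, 0, 1, 0, 1]
```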
Figure 14 shows the sequence for loading data into the Spartan-II FPGA serially. This is an expansion of the "Load Configuration Data Frames" block of the overall configuration flow. Note that CS and WRITE are not normally used during serial configuration. To ensure successful loading of the FPGA, do not toggle WRITE while CS is Low during serial configuration.
Slave Serial Mode
In Slave Serial mode, the CCLK pin of the FPGA is driven by an external source, allowing the FPGA to be configured from other logic devices such as a microprocessor, or in a daisy-chain configuration. Figure 15 shows the connections for a Master Serial FPGA configuring a Slave Serial FPGA from a PROM. A Spartan-II device in Slave Serial mode should be connected as shown for the third device from the left. Slave Serial mode is selected by <11x> on the mode pins (M0, M1, M2).
For Slave Serial configuration timing, the serial bitstream must be set up at the DIN input pin a short time before each rising edge of the externally generated CCLK.
Multiple FPGAs in Slave Serial mode can be daisy-chained for configuration from a single source. For a serial daisy chain, the maximum amount of data that can be sent to the DOUT pin is 2^20 - 1 (1,048,575) 32-bit words, or 33,554,400 bits, which is about 25 XC2S200 bitstreams. The configuration bitstream for downstream devices is limited to this size.
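The daisy-chain limit can be checked with simple arithmetic. In the sketch below, the XC2S200 bitstream length of 1,335,840 bits is an assumed figure that this section does not state; the other numbers come from the paragraph above.

```python
MAX_WORDS = 2**20 - 1                 # limit on 32-bit words sent to DOUT
BITS_PER_WORD = 32
XC2S200_BITSTREAM_BITS = 1_335_840    # assumed value, not given in this section

max_bits = MAX_WORDS * BITS_PER_WORD
print(MAX_WORDS)                            # 1048575
print(max_bits)                             # 33554400
print(max_bits // XC2S200_BITSTREAM_BITS)   # 25
```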
After an FPGA is configured, the data for the next device is routed to the DOUT pin. Data on the DOUT pin changes on the rising edge of CCLK. Configuration must be delayed
Master Serial Mode
In Master Serial mode, the CCLK output of the FPGA drives a Xilinx PROM, which feeds a serial stream of configuration data to the DIN input of the FPGA.
pins (M0, M1, M2). The PROM RESET pin is driven by INIT, and the CE input is driven by DONE. This interface is the same as in Slave Serial mode, except that an oscillator internal to the FPGA is used to generate the configuration clock (CCLK). Any of a number of frequencies from 4 to 60 MHz can be set using the ConfigRate option in the Xilinx software. At power-up, while the first 60 bytes of configuration data are being loaded, the CCLK frequency is always 2.5 MHz. This frequency is used until the ConfigRate bits, which are part of the configuration file, have been loaded into the FPGA, at which point the frequency changes to the selected ConfigRate. The default ConfigRate is 4 MHz unless a different frequency is specified in the design. The frequency of the CCLK signal generated by the internal oscillator can deviate from the specified value by +45% / -30%.
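To see what the oscillator tolerance means in practice, the following sketch computes the frequency window around a nominal ConfigRate and the nominal time spent at the 2.5 MHz power-up rate. The function name is hypothetical, and the calculation simply applies the +45% / -30% figures quoted above.

```python
def cclk_range_mhz(nominal_mhz, plus=0.45, minus=0.30):
    """Return the (lowest, highest) CCLK frequency for a nominal ConfigRate."""
    return nominal_mhz * (1.0 - minus), nominal_mhz * (1.0 + plus)


low, high = cclk_range_mhz(4.0)          # default ConfigRate of 4 MHz
print(f"{low:.2f} to {high:.2f} MHz")    # 2.80 to 5.80 MHz

# Time to clock in the first 60 bytes at the nominal 2.5 MHz power-up rate
# (one bit per CCLK cycle, so bits / MHz gives microseconds).
initial_bits = 60 * 8
print(initial_bits / 2.5, "microseconds")  # 192.0 microseconds
```

A PROM or other data source therefore has to keep up with the upper end of the window, not just the nominal rate.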
Slave Parallel Mode
Slave Parallel mode is the fastest configuration option. Byte-wide data is written into the FPGA. A BUSY flag is provided to control the flow of data at clock frequencies (FCCNH) above 50 MHz.
Spartan-II devices support Slave Parallel mode. Slave Parallel mode is selected by <011> on the mode pins (M0, M1, M2).
If a configuration file in .bit, .rbt, or non-swapped HEX format is used for parallel programming, the most significant bit (that is, the leftmost bit of each configuration byte, as displayed in a text editor) must be routed to the D0 input on the FPGA.
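One way to satisfy this bit-ordering rule in software is to reverse the bits of each byte before writing it. The sketch below assumes a hypothetical controller whose data bus drives D0 from bit 0 (the LSB) of each written value; if the board already routes the MSB to D0, no swap is needed. The function names are illustrative only.

```python
def reverse_bits(byte):
    """Reverse the bit order of one 8-bit value."""
    result = 0
    for _ in range(8):
        result = (result << 1) | (byte & 1)
        byte >>= 1
    return result


def swap_for_d0_msb(data):
    """Bit-swap every byte so the file's MSB ends up on D0."""
    return bytes(reverse_bits(b) for b in data)


print(swap_for_d0_msb(b"\x80\x01\xA0").hex())  # 018005
```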
The agent controlling configuration is not shown. Typically, a processor, microcontroller, or CPLD controls the Slave Parallel interface. The controlling agent provides byte-wide configuration data, CCLK, a chip select (CS) signal, and a write signal (WRITE). If the FPGA asserts BUSY (High), the data must be held until BUSY goes Low.
Once configured, the pins of the slave parallel port (d0-d7) can be used as additional user I/O. Alternatively, the port can be reserved to allow high-speed 8-bit
Multiple Spartan-II FPGAs can be configured using Slave Parallel mode and made to start up simultaneously. To configure multiple devices this way, wire the individual CCLK, data, WRITE, and BUSY pins of all devices in parallel. Load each device separately by asserting the CS pin of each device in turn and writing the appropriate data. Sync-to-DONE startup timing is used to ensure that the startup sequence does not begin until all FPGAs have been loaded. See "Startup".
When using Slave Parallel mode, write operations send packets of byte-wide configuration data into the FPGA. Figure 19 shows a flowchart of the write sequence used to load data into the Spartan-II FPGA.
For this example, the user holds WRITE and CS Low throughout the sequence of write operations. Note that when CS is asserted on successive CCLK cycles, WRITE must remain either asserted or de-asserted. Otherwise an abort is initiated, as described in the next section.
1. Drive data onto D0-D7. Note that to avoid contention, the data source should not be enabled while CS is Low and WRITE is High. Similarly, while WRITE is High, no more than one device's CS should be asserted.
2. On the rising edge of CCLK: if BUSY is Low, the data is accepted on this clock. If BUSY is High (from a previous write), the data is not accepted. Acceptance instead occurs on the first clock after BUSY goes Low, and the data must be held until this happens.
3. Repeat steps 1 and 2 until all the data has been sent.
4. De-assert CS and WRITE.
If CCLK is slower than FCCNH, the FPGA never asserts BUSY. In this case, the above handshake is unnecessary, and data can simply be entered into the FPGA on every CCLK cycle.
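The write steps above can be modeled from the controller's point of view. The sketch below is a behavioral illustration, not driver code: the function name and the busy_per_clock trace are hypothetical, and the model assumes CS and WRITE stay Low for the whole transfer and that BUSY is Low once the supplied trace runs out.

```python
def slave_parallel_write(data, busy_per_clock):
    """Return (accepted_bytes, clocks_used) for one write sequence.

    busy_per_clock is a hypothetical iterable of booleans: the BUSY level
    sampled on each rising CCLK edge while CS and WRITE are held Low.
    """
    accepted = []
    clocks = 0
    busy_iter = iter(busy_per_clock)
    index = 0
    while index < len(data):
        busy = next(busy_iter, False)    # assume BUSY Low after the trace ends
        clocks += 1
        if not busy:
            accepted.append(data[index])  # byte is taken on this edge
            index += 1
        # Otherwise keep driving the same byte until BUSY goes Low.
    return bytes(accepted), clocks


out, clocks = slave_parallel_write(b"\x01\x02\x03",
                                   [False, True, True, False, False])
print(out.hex(), clocks)  # 010203 5
```

With a BUSY trace that is always Low, the loop degenerates to one byte per clock, which matches the case where CCLK is slower than FCCNH.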
Using Delay-Locked Loops
The Spartan-II FPGA family provides up to four fully digital, dedicated on-chip delay-locked loop (DLL) circuits, which provide zero propagation delay, low clock skew between output clock signals distributed throughout the device, and advanced clock domain control. These dedicated DLLs can be used to implement circuits that improve and simplify system-level design.
Introduction
Quality on-chip clock distribution is important. Clock skew and clock delay affect device performance, and managing them with conventional clock trees becomes more difficult in large devices. The Spartan-II family of devices addresses this potential problem by providing up to four fully digital, dedicated on-chip delay-locked loop (DLL) circuits that provide zero propagation delay and low clock skew between output clock signals distributed throughout the device.
Each DLL can drive up to two global clock routing networks within the device. The global clock distribution network minimizes clock skew caused by loading differences. By monitoring a sample of the DLL output clock, the DLL can compensate for the delay on the routing network, effectively eliminating the delay from the external input port to the individual clock loads within the device.
In addition to providing zero delay with respect to the user source clock, the DLL can provide multiple phases of the source clock. The DLL can also act as a clock doubler, or it can divide the user source clock by up to 16.
Clock multiplication gives designers a number of design alternatives. For example, a 50 MHz source clock doubled by the DLL can drive an FPGA design operating at 100 MHz. This technique simplifies board design because the clock path on the board no longer has to distribute such a high-speed signal. A multiplied clock also gives designers the option of time-domain multiplexing, using one circuit twice per clock cycle, which consumes less area than two copies of the same circuit.
The DLL can also act as a clock mirror. By driving the DLL output off-chip and then back in again, the DLL can be used to deskew a board-level clock between multiple devices.
To ensure that the system clock is established before the device "wakes up", the DLL can delay the completion of the device configuration process until the DLL achieves lock.
By eliminating on-chip clock delay, the delay-locked loops (DLLs) can greatly simplify and improve system-level designs involving high-fanout, high-performance clocks.
Library CLKDLL Primitive Pin Descriptions
The library CLKDLL primitives provide access to the complete set of DLL features needed when implementing more complex applications with the DLL.
Source Clock Input - CLKIN
The CLKIN pin provides the user source clock (the clock signal on which the DLL operates) to the DLL. The CLKIN frequency must fall within the ranges specified in the data sheet. This clock signal must come from either a global clock buffer (BUFG) driven by another CLKDLL or a global clock input buffer (IBUFG) on the same edge (top or bottom) of the device.
Feedback Clock Input - CLKFB
The DLL requires a reference or feedback signal to provide the delay-compensated output. Connect only the CLK0 or CLK2X DLL output to the feedback clock input (CLKFB) pin to provide the necessary feedback to the DLL. This clock signal must come from either a global clock buffer (BUFG) or a global clock input buffer (IBUFG) on the same edge (top or bottom) of the device.
If an IBUFG sources the CLKFB pin, the following special rules apply.
1. An external input port must source the signal that drives the IBUFG I pin.
2. If both the CLK0 and CLK2X outputs drive off-chip devices, the CLK2X output must be fed back to the device.
3. That signal must directly drive only OBUFs and nothing else.
These rules enable the software to determine which DLL clock output sources the CLKFB pin.
Reset Input - RST
When the reset pin RST is activated, the LOCKED signal deactivates within four source clock cycles. The RST pin (active High) must either be connected to a dynamic signal or be tied to ground. As the DLL delay taps reset to zero, glitches can occur on the DLL clock output pins. Activation of the RST pin can also severely affect the duty cycle of the clock output pins. Furthermore, the DLL output clocks are no longer deskewed with respect to one another. The DLL must be reset when the input clock frequency changes, if the device is reconfigured in Boundary-Scan mode, if the device undergoes a hot swap, and after the device is configured if the input clock is not stable during the startup sequence.
2x Clock Output - CLK2X
The output pin CLK2X provides a frequency-doubled clock with automatic 50/50 duty cycle correction. Until the CLKDLL has achieved lock, the CLK2X output appears as a 1x version of the input clock with a 25/75 duty cycle. This behavior allows the DLL to lock on the correct edge with respect to the source clock. This pin is not available on the CLKDLLHF primitive.
Clock Divide Output - CLKDV
The clock divide output pin CLKDV provides a lower-frequency version of the source clock. The CLKDV_DIVIDE property controls CLKDV such that the source clock is divided by N, where N is 1.5, 2, 2.5, 3, 4, 5, 8, or 16.
This feature provides automatic duty cycle correction. The CLKDV output pin has a 50/50 duty cycle for all values of the division factor N except for non-integer division in High Frequency (HF) mode. For a division factor of 1.5, the duty cycle in HF mode is 33.3% High and 66.7% Low. For a division factor of 2.5, the duty cycle in HF mode is 40.0% High and 60.0% Low.
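The CLKDV rules above reduce to a small lookup. The following sketch tabulates the permitted division factors and the duty cycles quoted in the text; the function name and the high_frequency_mode flag are hypothetical, and the code does not check that the source frequency is within the DLL's operating range.

```python
ALLOWED_DIVIDE = (1.5, 2, 2.5, 3, 4, 5, 8, 16)
HF_DUTY_HIGH = {1.5: 1 / 3, 2.5: 0.4}   # non-integer division in HF mode


def clkdv(source_mhz, n, high_frequency_mode=False):
    """Return (CLKDV frequency in MHz, fraction of the period spent High)."""
    if n not in ALLOWED_DIVIDE:
        raise ValueError(f"unsupported division factor: {n}")
    duty_high = 0.5                      # duty-cycle corrected by default
    if high_frequency_mode and n in HF_DUTY_HIGH:
        duty_high = HF_DUTY_HIGH[n]
    return source_mhz / n, duty_high


print(clkdv(100.0, 2.5, high_frequency_mode=True))  # (40.0, 0.4)
print(clkdv(100.0, 4))                              # (25.0, 0.5)
```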
1x Clock Outputs - CLK[0|90|180|270]
The 1x clock output pin CLK0 represents a delay-compensated version of the source clock (CLKIN) signal. The CLKDLL primitive provides three phase-shifted versions of the CLK0 signal, while CLKDLLHF provides only the 180-degree phase-shifted version.
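The relationship between a phase-shifted output and the corresponding time shift is a simple proportion of the clock period. The sketch below works it out for an illustrative 50 MHz source clock; the function name is hypothetical and the frequency is an example value only.

```python
def phase_shift_ns(clkin_mhz, phase_degrees):
    """Time shift of a phase-shifted output relative to CLK0, in ns."""
    period_ns = 1000.0 / clkin_mhz
    return period_ns * phase_degrees / 360.0


for phase in (0, 90, 180, 270):
    print(f"CLK{phase}: {phase_shift_ns(50.0, phase):.2f} ns")
# CLK0: 0.00 ns
# CLK90: 5.00 ns
# CLK180: 10.00 ns
# CLK270: 15.00 ns
```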
Locked Output - LOCKED
To achieve lock, the DLL may need to sample several thousand clock cycles. After the DLL achieves lock, the LOCKED signal activates. The "DLL Timing Parameters" section of Module 3 provides estimates of locking times.
To ensure that the system clock is established before the device "wakes up", the DLL can delay the completion of the device configuration process until after the DLL has locked. The STARTUP_WAIT property activates this feature.
Until the LOCKED signal activates, the DLL output clocks are not valid and can exhibit glitches, spikes, or other spurious movement. In particular, the CLK2X output appears as a 1x clock with a 25/75 duty cycle.
DLL Properties
Properties provide access to some of the Spartan-II family DLL features, such as clock division and duty cycle correction.
Duty Cycle Correction Property
The 1x clock outputs, CLK0, CLK90, CLK180, and CLK270, use the duty-cycle corrected default, so that they exhibit a 50/50 duty cycle. The DUTY_CYCLE_CORRECTION property (TRUE by default) controls this feature. To deactivate DLL duty cycle correction for the 1x clock outputs, attach the DUTY_CYCLE_CORRECTION=FALSE property to the DLL primitive.
Design Considerations
Use the following design considerations to avoid pitfalls and improve success when designing with Xilinx devices.
Input Clock
The output clock signal of a DLL, essentially a delayed version of the input clock signal, reflects any instability on the input clock in the output waveform. For this reason, the quality of the DLL input clock relates directly to the quality of the output clock waveforms generated by the DLL. The DLL input clock requirements are specified in the "DLL Timing Parameters" section of the data sheet.
In most systems, a crystal oscillator generates the system clock. The DLL can be used with any commercially available quartz crystal oscillator. For example, most crystal oscillators produce an output waveform with a frequency tolerance of 100 ppm, which means a 0.01% change in the clock period. The DLL operates reliably on an input waveform with a frequency drift of up to 1 ns, orders of magnitude more than is needed to support any crystal oscillator in the industry. However, cycle-to-cycle jitter must be kept to less than 300 ps at low frequencies and less than 150 ps at high frequencies.
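A quick calculation shows how much margin the 1 ns drift allowance leaves over a 100 ppm oscillator. The 50 MHz clock used here is an illustrative value, and the function name is hypothetical.

```python
def period_variation_ps(freq_mhz, tolerance_ppm):
    """Worst-case change in clock period, in picoseconds."""
    period_ps = 1e6 / freq_mhz           # 1 MHz corresponds to 1,000,000 ps
    return period_ps * tolerance_ppm / 1e6


variation = period_variation_ps(50.0, 100)
print(variation)            # 2.0  (ps of period change at 50 MHz, 100 ppm)
print(1000.0 / variation)   # 500.0  (ratio of the 1 ns allowance to that change)
```

At this frequency the allowance is several hundred times larger than the oscillator's variation, which is the sense in which the text speaks of orders of magnitude.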
Changing the period of the input clock by more than the maximum drift amount requires a manual reset of the CLKDLL. Failure to reset the DLL produces an unreliable LOCKED signal and output clock.
It is possible to stop the input clock in a way that has little impact on the DLL. Stopping the clock should be limited to less than approximately 100 µs to keep device cooling to a minimum and maintain the validity of the current tap setting. The clock should be stopped during a Low phase, and when it is restored a full High period should be seen. During this time, LOCKED stays High and remains High when the clock is restored. If these conditions may not be met in the design, apply a manual reset to the DLL after restarting the input clock, even if the LOCKED signal has not changed.
When the clock is stopped, one to four more clocks are still observed as the delay line is flushed. When the clock is restarted, the output clocks are not observed for one to four clocks as the delay line is filled. The most common case is two or three clocks.
In a similar manner, a phase shift of the input clock is also possible. The phase shift propagates to the output one to four clocks after the original shift, with no disruption to the CLKDLL control.
Output Clocks
As mentioned earlier in the DLL pin descriptions, some restrictions apply to the connectivity of the output pins. The DLL clock outputs can drive an OBUF or a global clock buffer (BUFG), or they can be routed directly to destination clock pins. The only BUFGs that the DLL clock outputs can drive are the two on the same edge (top or bottom) of the device. One DLL output can drive more than one OBUF; however, this adds skew.
Do not use the DLL output clock signals until after activation of the LOCKED signal. Until the LOCKED signal activates, the DLL output clocks are not valid and can exhibit glitches, spikes, or other spurious movement.
Using Block RAM Features
The Spartan-II FPGA family provides dedicated blocks of on-chip, true dual read/write port synchronous RAM, with 4096 memory cells each. Each port of the block RAM memory can be independently configured as a read/write port, a read port, or a write port, and can be configured to a specific data width. The block RAM memory offers new capabilities that allow FPGA designers to simplify their designs.
Operating Modes
Block RAM memory supports two operating modes:
• Read Through
• Write Back
Read Through (Single Clock Edge)
The read address is registered on the read port clock edge, and the data appears on the output after the RAM access time. Some memories may place the latch/register at the outputs, depending on whether a faster clock-to-out or a faster setup time is desired. This is generally considered an inferior solution because it changes the read operation to an asynchronous function, with the possibility of missing an address/control line transition during the generation of the read pulse clock.
Write Back (Single Clock Edge)
The write address is registered on the write port clock edge, and the data input is written to the memory and mirrored on the write port output.
Block RAM Features
1. All inputs are registered with the port clock and have a setup-to-clock timing specification.
2. All outputs have a read-through or write-back function, depending on the state of the port WE pin. The outputs relative to the port clock are available after the clock-to-out timing specification.
3. The block RAMs are true SRAM memories and do not have a combinatorial path from the address to the output. The LUT cells in the CLBs are still available with this function.
4. The ports are completely independent of each other (that is, clocking, control, address, read/write function, and data width) without arbitration.
5. A write operation requires only one clock edge.
6. A read operation requires only one clock edge.
The output ports are latched with a self-timed circuit to guarantee a glitch-free read. The state of the output port does not change until the port executes another read or write operation.
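The read-through and write-back behavior, together with the latched outputs, can be captured in a small behavioral model. This is a simplified sketch and not a description of the silicon: the class and function names are hypothetical, both ports are assumed to use the same data width, and the en argument (a port enable that skips the operation) is an assumption added so that the output-hold behavior can be shown.

```python
class BlockRamPort:
    """Behavioral model of one synchronous block RAM port."""

    def __init__(self, memory, width):
        self.memory = memory             # word list shared by both ports
        self.mask = (1 << width) - 1
        self.dout = 0                    # latched output value

    def clock(self, addr, we=False, din=0, en=True):
        """Apply one rising clock edge and return the latched output."""
        if not en:
            return self.dout             # no operation: output holds
        if we:
            self.memory[addr] = din & self.mask
            self.dout = self.memory[addr]    # write back: data mirrored on output
        else:
            self.dout = self.memory[addr]    # read through
        return self.dout


def make_dual_port(width=8, total_bits=4096):
    """Create two ports over one 4096-cell memory (same width on both)."""
    memory = [0] * (total_bits // width)
    return BlockRamPort(memory, width), BlockRamPort(memory, width)


port_a, port_b = make_dual_port()
print(port_a.clock(addr=3, we=True, din=0x5A))  # 90 (write back)
print(port_b.clock(addr=3))                     # 90 (read through on other port)
print(port_a.clock(addr=0, en=False))           # 90 (output unchanged)
```

Each call to clock stands for a single clock edge, which reflects the point that both reads and writes complete in one edge.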
Using Versatile I/O
The Spartan-II FPGA family includes a highly configurable, high-performance I/O resource called Versatile I/O that supports a wide variety of I/O standards. The Versatile I/O resource is a robust set of features including programmable control of output drive strength, slew rate, and input delay and hold time. Taking advantage of the flexibility of the Versatile I/O features and the design considerations described in this document can improve and simplify system-level design.
Introduction
As FPGAs continue to grow in size and capacity, the larger and more complex systems designed for them demand an increased variety of I/O standards. Furthermore, as system clock speeds continue to increase, the need for high-performance I/O becomes more important. While chip-to-chip delays have an increasingly substantial impact on overall system speed, the task of achieving the desired system performance becomes more difficult with the proliferation of low-voltage I/O standards. Versatile I/O, the input/output resource of Spartan-II devices, addresses this potential problem by providing a highly configurable, high-performance alternative to the I/O resources of more conventional programmable devices. The Versatile I/O features of the Spartan-II FPGA combine the flexibility and time-to-market advantages of programmable logic with the high performance previously available only with ASICs and custom ICs.
Each Versatile I/O block can support up to 16 I/O standards.
Supporting such a variety of I/O standards allows a wide range of applications, from general-purpose standard applications to high-speed, low-voltage memory buses.
Versatile I/O blocks also provide selectable output drive strengths and programmable slew rates for the LVTTL output buffers, as well as an optional, programmable weak pull-up, weak pull-down, or weak "keeper" circuit ideal for use in external bussing applications.
Each input/output block (IOB) includes three registers, one each for the input, output, and three-state signals within the IOB. These registers can optionally be configured as either D-type flip-flops or level-sensitive latches.
The input buffer has an optional delay element used to guarantee a zero hold time requirement for input signals registered within the IOB.
The Versatile I/O features also provide dedicated resources for input reference voltage (VREF) and output source voltage (VCCO), along with a convenient banking system that simplifies board design.
By taking advantage of the built-in features and the wide variety of I/O standards supported by Versatile I/O, system-level design and board design can be greatly simplified and improved.
Fundamentals
Modern bus applications, pioneered by the largest and most influential companies in the digital electronics industry, are commonly introduced with a new I/O standard tailored specifically to the needs of that application. The bus I/O standards provide specifications to other vendors who create products designed to interface with these applications. Each standard typically has its own specifications for current, voltage, I/O buffering, and termination techniques.
The ability to provide the flexibility and time-to-market advantages of programmable logic increasingly depends on the capability of the programmable logic device to support an ever-increasing variety of I/O standards. The Versatile I/O resources feature highly configurable input and output buffers, which provide support for a wide variety of I/O standards. As shown in Table 15, each buffer type can support a variety of voltage requirements.