A Vec in Chisel represents a collection of signals of the same type. It is similar to the array data structures in other languages. Each element of a Vec can be accessed by an index. A Vec is created by calling a constructor with two parameters:
The number of elements
The type of the elements
A combinational Vec must be wrapped in a Wire:
val vs = Wire(Vec(3, UInt(4.W)))
Individual elements of the vector can be accessed by index:
vs(0) := 1.U
vs(1) := 2.U
vs(2) := 3.U
A vector wrapped in a Wire is simply a multiplexer: reading it with a signal as the index selects one of its elements. We can also wrap the Vec in a Reg to define an array of registers, as shown below:
val regfile = Reg(Vec(32, UInt(32.W)))
The elements of that register file are accessed by index:
val idx = 1.U(2.W)
val dout = regfile(idx)
We can freely mix bundles and vectors, as shown below:
val vecBundle = Wire(Vec(8, new vlsispace())) // vlsispace is a user-defined Bundle
Chisel provides two constructs to group related signals:
A Bundle groups signals of different types.
A Vec represents a collection of signals of the same type.
A Chisel Bundle groups several signals. The bundle can be accessed as a whole, or its individual fields can be accessed by name. A designer defines a bundle (a collection of signals) by defining a class that extends Bundle and listing the fields as vals within the constructor block:
class vlsispace() extends Bundle {
  val vlsi = UInt(32.W)
  val space = Bool()
}
A bundle is used in the following way:
val vs = Wire(new vlsispace())
vs.vlsi := 124.U
vs.space := false.B
val a = vs.space
Dot notation, which is common in object-oriented programming, accesses an individual field of the bundle. A Bundle is similar to a struct in C or SystemVerilog, or a record in VHDL. A bundle can also be referred to as a whole, as follows:
val channel = vs
A bundle may as well contain a vector:
class BundleVec extends Bundle {
  val vs = UInt(8.W)
  val vector = Vec(4, UInt(4.W))
}
When we want a register of a bundle type that needs a reset value, we first create a Wire of that bundle, set the values of its fields, and then pass the bundle to RegInit:
val initval = Wire(new vlsispace())
initval.vlsi := 1.U
initval.space := true.B
val vsReg = RegInit(initval)
A single 1-bit full adder can be constructed from basic logic gates, as shown below.
But what if we want to add two n-bit numbers? Then n 1-bit full adders need to be connected, or “cascaded”, together to produce an adder known as a ripple carry adder.
A ripple carry adder is simply n 1-bit full adders cascaded together, with each full adder representing a single weighted column in a long binary addition. It is called a ripple carry adder because the carry signals produce a “ripple” effect through the binary adder from LSB to MSB.
Consider three full adders used to add two 3-bit numbers. The two outputs of the first full adder provide the first-place digit of the sum (S) plus a carry-out bit that acts as the carry-in of the next full adder. The second full adder in the chain also produces a sum output (the 2nd bit) plus another carry-out bit. We can keep adding more full adders to the chain to add larger numbers, linking the carry-out of each full adder to the carry-in of the next, and so forth. An example of a 3-bit adder is given below.
Let us consider A = 4 and B = 3; the sum of A and B is 7. In binary, 100 + 011 = 111.
One disadvantage of this adder is that an overflow occurs if the sum is greater than or equal to 2^n. Let us consider A = 4 and B = 4; the sum of A and B is 8, which is equal to 2^3, so we have an overflow.
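The cascade described above can be sketched as a small behavioural model. This is a minimal Python sketch, not hardware code; the function names full_adder and ripple_carry_add are illustrative assumptions.

```python
def full_adder(a, b, cin):
    """1-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout

def ripple_carry_add(a, b, n):
    """Add two n-bit unsigned numbers; returns (n-bit sum, carry_out)."""
    carry = 0
    total = 0
    for i in range(n):  # LSB first, the carry ripples toward the MSB
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        total |= s << i
    return total, carry

print(ripple_carry_add(4, 3, 3))  # (7, 0)
print(ripple_carry_add(4, 4, 3))  # (0, 1): the carry-out signals the overflow
```

The loop makes the delay problem visible: bit i cannot be computed until the carry from bit i-1 is known.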
There are two main disadvantages of the ripple carry adder:
Propagation delay: if the inputs A and B change, the sum at the output is not valid until any carry has “rippled” through every full adder in the chain, because the MSB of the sum has to wait for any change in the carry input of the LSB. Consequently, there is a finite delay before the output of the adder responds to a change in its inputs, resulting in an accumulated propagation delay. This delay can be neglected for 4 to 8 bits, but not for larger widths such as 32 bits and more.
Overflow: overflow occurs when two n-bit numbers are added and their sum is greater than or equal to 2^n.
To reduce the carry propagation delay, we can use a carry look-ahead adder.
In digital systems, counters play a major role. Counters are used to keep track of events, time intervals, the number of interrupts, and so on. A counter can be described in many languages, such as Verilog, C, C++, Java, Python and Chisel. In this post we will see how to describe a counter in Chisel.
Design a counter that counts from 0 to 9 and returns to 0 once it reaches 9.
val cntSpace = RegInit(0.U(8.W)) // defines an 8-bit register that is initialized to 0 on reset
cntSpace := Mux(cntSpace === 9.U, 0.U, cntSpace + 1.U)
The := operator updates the register. When cntSpace reaches 9, it is set back to 0; otherwise cntSpace is incremented by one.
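To see the wrap-around behaviour cycle by cycle, the counter can be modelled in plain Python. This is only a sketch of the next-state logic; next_count is a hypothetical helper name, and each loop iteration stands for one rising clock edge.

```python
def next_count(cnt, limit=9):
    """Next-state logic: return to 0 at the limit, otherwise increment."""
    return 0 if cnt == limit else cnt + 1

cnt = 0          # value after reset
trace = []
for _ in range(12):   # 12 clock edges
    trace.append(cnt)
    cnt = next_count(cnt)

print(trace)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1]
```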
A scan cell is one of the DFT techniques used to test sequential circuits in an ASIC/SoC design. The tool converts a normal D flip-flop to a scan flip-flop if the flop meets the following criteria:
The clock of the flip-flop must be controllable.
The set/reset of the flop must be inactive during shift mode.
Once all the flops in the design meet the above two rules, the tool converts the D flops to scan flops by adding a mux, as shown in the figure above. The tool then stitches the scan cells into scan chains according to the design requirements.
When Scan Enable is 0 (SE=0), the scan chains in the design are disconnected and the flops are connected to the combinational logic.
When Scan Enable is 1 (SE=1), the combinational logic is bypassed and the scan cells are connected to form a scan chain.
When SE=1, patterns are loaded into the scan chains; when SE=0, the data from the combinational logic is captured.
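The shift and capture behaviour of the SE mux can be illustrated with a small Python model. It is a sketch under simplifying assumptions (a single chain, ideal clocking); clock_chain and its arguments are illustrative names, not part of any tool.

```python
def clock_chain(state, se, scan_in, d_inputs):
    """One clock edge for a chain of scan flops.

    state    : current flop values, state[0] is nearest to scan-in
    se       : scan enable (1 = shift, 0 = capture functional data)
    scan_in  : bit presented at the scan-in pin
    d_inputs : functional D inputs coming from the combinational logic
    """
    if se:
        # Mux selects the scan path: each flop takes its predecessor's value.
        return [scan_in] + state[:-1]
    # Mux selects the functional path: each flop captures its D input.
    return list(d_inputs)

state = [0, 0, 0]
for bit in [1, 0, 1]:                 # load a pattern with SE=1
    state = clock_chain(state, 1, bit, [0, 0, 0])
print(state)                          # [1, 0, 1]

state = clock_chain(state, 0, 0, [0, 1, 1])   # capture with SE=0
print(state)                          # [0, 1, 1]
```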
Scan chain reordering is an optimization technique that ensures the scan chains are connected in a more efficient way, based on the placement of the flip-flops. At the initial stage we do not have placement information, so we just stitch the flops register by register. The tool stitches the flops randomly to form a scan chain before placement. For clarity, the flops are numbered and two scan chains are stitched in the screenshot shown below.
But after placement, two flops from different blocks that were stitched together at the initial stage may sit far from each other. If we keep the same scan chain order, we will face placement congestion and timing problems, and more routing resources will be required.
We can see from the screenshot above that, depending on timing and placement congestion, the flops are placed at different locations compared to the pre-placement figure. This results in the use of more resources, increased congestion and timing violations. By reordering the scan cells in the scan chain we can reduce the congestion.
To avoid this congestion, we follow the steps below:
Disconnect the scan chains in the design.
Based on timing and congestion, the tool optimizes the standard cells.
Once placement is done, the scan chains are reordered based on the timing and placement congestion in the design, while maintaining the same number of scan cells.
The SCANDEF file contains the scan chain information of the design; this file needs to be read during PnR.
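The effect of reordering can be illustrated with a toy example. The greedy nearest-neighbour heuristic below is an assumption chosen for illustration; it is not the algorithm of any particular PnR tool, and the flop names and coordinates are made up.

```python
def wire_length(order, pos):
    """Total Manhattan distance between consecutive flops in the chain."""
    return sum(abs(pos[a][0] - pos[b][0]) + abs(pos[a][1] - pos[b][1])
               for a, b in zip(order, order[1:]))

def reorder_chain(order, pos):
    """Greedy nearest-neighbour reordering; keeps the first flop and the flop count."""
    remaining = list(order[1:])
    new_order = [order[0]]
    while remaining:
        last = pos[new_order[-1]]
        nxt = min(remaining,
                  key=lambda f: abs(pos[f][0] - last[0]) + abs(pos[f][1] - last[1]))
        remaining.remove(nxt)
        new_order.append(nxt)
    return new_order

# Hypothetical placement: f1/f3 sit in one corner, f2/f4 in the opposite corner.
pos = {"f1": (0, 0), "f2": (9, 9), "f3": (1, 0), "f4": (9, 8)}
before = ["f1", "f2", "f3", "f4"]     # pre-placement stitching order
after = reorder_chain(before, pos)

print(after)                                            # ['f1', 'f3', 'f4', 'f2']
print(wire_length(before, pos), wire_length(after, pos))  # 51 18
```

The chain still contains the same four scan cells; only the stitching order changes.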
The faults in a design are categorized into classes and sub-classes. The fault classes are mainly divided into:
Testable (TE): faults that can be tested by some pattern.
Untestable (UT): faults for which no pattern exists to detect them.
Untestable faults: these are the faults for which no pattern exists to either detect or possibly detect them. These faults cannot cause any functional failure, so the tool excludes them when calculating the test coverage. The types of untestable faults are:
Unused (UU)
Tied (TI)
Blocked(BL)
Redundant Faults (RE)
Unused (UU)
Faults on any floating pins that are not used in the design come under UU faults.
The unused fault class includes all faults on circuitry that is not connected to any observation point.
Tied (TI)
This class includes faults on gates where the point of the fault is tied to a value identical to the fault's stuck-at value.
Blocked (BL)
Because of tied logic in the design, some faults are blocked; these are categorized as blocked faults. Adding observable test points can increase the reported coverage.
Redundant (RE)
The faults that the tool proves to be undetectable by any pattern are classified as redundant faults.
Testable faults: there are four sub-categories under TE:
DETECTED (DT)
POSDET (PD)
ATPG_UNTESTABLE (AU)
UNDETECTED (UD)
Detected (DT): the faults that are detected during the ATPG process are categorized under DT.
det_simulation (DS): the faults detected when the tool performs simulation.
det_implication (DI): the faults detected when the tool performs learning analysis.
POSDET (PD): the possible-detected class includes all the faults that fault simulation identifies as possibly detected.
posdet_testable (PT): potentially detectable posdet faults. With a higher abort limit we can reduce the number of these faults.
posdet_untestable (PU): these are proven ATPG-untestable and hard undetectable faults.
ATPG_UNTESTABLE (AU): this fault class includes all the faults for which the test generator is unable to find a pattern to create a test. Testable faults become ATPG-untestable because of constraints or limitations placed on the ATPG tool, such as a pin constraint or an insufficient sequential depth. These faults may become detectable if we remove some constraints or change some limitations on the test generator.
UNDETECTED (UD): this fault class includes the undetected faults that cannot be proven untestable or ATPG-untestable.
uncontrolled (UC)
unobserved (UO)
All testable faults are put in the UC category prior to ATPG. Faults that remain UC or UO after ATPG are aborted faults, which means that a higher abort limit may reduce the number of UC and UO faults.
Before implementing the logic, we will have a look at the truth table of the NAND gate and the inverter.
NAND GATE
A  B  O
0  0  1
0  1  1
1  0  1
1  1  0
NOT GATE
A  O
0  1
1  0
From the NAND gate truth table we can conclude the following:
When both inputs are zero (0), the output is 1 (same as an inverter).
When both inputs are one (1), the output is 0 (same as an inverter).
Thus we can implement the NOT gate by connecting both inputs together, as shown below.
There is another way to implement an inverter using a NAND gate: from the truth table, when input A is tied high (logic 1), the output is the inverse of input B, so the NAND gate behaves as an inverter.
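Both wirings can be checked against the truth table with a few lines of Python. This is a minimal sketch; nand is a helper defined here for illustration.

```python
def nand(a, b):
    """2-input NAND on single bits."""
    return 1 - (a & b)

for x in (0, 1):
    tied = nand(x, x)        # both inputs connected together
    held_high = nand(1, x)   # input A tied to logic 1
    print(x, tied, held_high)
# prints:
# 0 1 1
# 1 0 0
```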
Before implementing the logic, we will have a look at the truth table of the NOR gate and the inverter.
NOR GATE
A  B  O
0  0  1
0  1  0
1  0  0
1  1  0
NOT GATE
A  O
0  1
1  0
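Case I: from the NOR truth table we can see that when both inputs are zero (0), the output is 1 (same as an inverter), and when both inputs are one (1), the output is 0 (same as an inverter). Connecting both inputs together therefore gives an inverter.
Case II: a second way of implementing an inverter using a NOR gate. From the truth table, when input A is tied low (logic 0), the output is the inverse of input B.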
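Both cases can be verified with a short Python check. This is a minimal sketch; nor is a helper defined here for illustration.

```python
def nor(a, b):
    """2-input NOR on single bits."""
    return 1 - (a | b)

for x in (0, 1):
    tied = nor(x, x)       # Case I: both inputs connected together
    held_low = nor(0, x)   # Case II: input A tied to logic 0
    print(x, tied, held_low)
# prints:
# 0 1 1
# 1 0 0
```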
A digital design can be represented at various levels from three different angles
Behavioral
Structural
Physical
This can be represented by the Y chart.
Behavioral Representation
Specifies how a particular design should respond to a given set of inputs.
May be specified by:
Boolean equations
Tables of input and output values
Algorithms written in a standard HLL like C/C++
Algorithms written in a special HDL like Verilog, VHDL or Chisel
Example:
An algorithmic-level description of carry (cy):
module carry (cy, a, b, c);
  input a, b, c;
  output cy;
  assign cy = (a & b) | (a & c) | (b & c);
endmodule
In general, a structural description is a list of modules and their interconnections, called a netlist, which can be represented at various levels.
At the structural level, the levels of abstraction are:
The module (functional) level
The gate level
The switch level
The circuit level
Example: structural representation
module carry (cy, a, b, c);
  input a, b, c;
  output cy;
  wire w1, w2, w3;
  and g1 (w1, a, b);
  and g2 (w2, a, c);
  and g3 (w3, b, c);
  or g4 (cy, w1, w2, w3);
endmodule
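The behavioural and structural descriptions of carry should compute the same function. A quick way to convince yourself is to compare them over all eight input combinations; the Python sketch below mirrors the two descriptions, and the function names are illustrative.

```python
from itertools import product

def carry_behavioral(a, b, c):
    """Mirror of: assign cy = (a&b)|(a&c)|(b&c)."""
    return (a & b) | (a & c) | (b & c)

def carry_structural(a, b, c):
    """Mirror of the gate netlist: three AND gates feeding one OR gate."""
    w1 = a & b            # and g1
    w2 = a & c            # and g2
    w3 = b & c            # and g3
    return w1 | w2 | w3   # or g4

for a, b, c in product((0, 1), repeat=3):
    assert carry_behavioral(a, b, c) == carry_structural(a, b, c)
    print(a, b, c, carry_behavioral(a, b, c))
```

The carry is 1 exactly when at least two of the three inputs are 1.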
Physical Representation
The lowest level of physical specification is the photo-mask information required by the various processing steps in the fabrication process.
At the module level, the physical layout for the adder may be defined by a rectangle or polygon, and a collection of ports.
Example: physical representation
A possible (partial) physical description of a 4-bit adder
Generate the netlist of the register-transfer-level components
Logic Design
Generate the netlist of gates/flip-flops or standard cells
Physical Design
Generate the final layout
Manufacturing the chip in a fabrication unit
Some more intermediate steps are required during the design flow:
Simulation for Verification
It should be carried out at various levels, including the logic level, switch level and circuit level.
Formal Verification
A logical equivalence check (LEC) is carried out at various stages to check that the core design was not disturbed.
LEC/formal verification of the design is done between:
RTL and synthesized netlist
Synthesized netlist and DFT-inserted netlist
MBIST-inserted netlist and synthesized netlist, etc.
In digital design, registers are basic elements that are used widely. Chisel provides a register, which is a collection of D flip-flops. The register is connected to a clock, and the output of the register updates on every rising edge. When an initialization value is provided at the declaration of the register, it uses a synchronous reset connected to the reset signal. A register can be of any Chisel type that can be represented as a collection of bits.
The line below defines an 8-bit register, initialized with 0 at reset:
val reg = RegInit(0.U(8.W))
An input is connected to the register with the := update operator, and the output of the register can be used just by its name in an expression:
reg := d
val q = reg
A register can also be connected to its input at the definition:
val nextReg = RegNext(d)
A register can also be initialized during the definition:
MBIST (Memory Built-In Self-Test) is implemented to test the memories in a design for different types of faults. MBIST consists of a processor and wrappers that are wrapped around the memories. The MBIST processor controls the wrappers and generates various control signals during memory testing. A design may have multiple processors depending on the number of memories, memory size, power, frequency and memory placement.
Memories that are placed near each other are grouped together and controlled by a single processor. Therefore, we need the memory placement information to group the memories under a controller. The PD/PnR team gives this information to the DFT team in the form of a DEF file and a floorplan snapshot.
What happens if memories are not grouped properly? If memories are not grouped according to their physical location, i.e. memories under the same processor sit at different corners, the MBIST logic spreads out. This impacts MBIST timing during STA due to long paths, increases congestion due to the many criss-crossing routes during PnR, and also increases power consumption unnecessarily.
A multiplexer is a circuit that selects between input signals depending on a select signal. The basic form of multiplexer (a 2:1 mux) selects between two signals. The figure below represents the 2:1 multiplexer; depending on the sel signal, y carries the input signal a or b.
A multiplexer can be designed using logic gates. As the multiplexer is used so frequently in digital design, Chisel provides a function called Mux:
val results = Mux(sel, a, b)
Here a is selected when sel is true; otherwise b is selected. The type of sel is a Chisel Bool. The inputs a and b can be of any Chisel base type or aggregate (bundles or vectors), as long as they are of the same type.
A Bundle groups signals of different types. A Vec represents an indexable collection of signals of the same type.
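As a check on the statement that a multiplexer can be built from logic gates, the sketch below compares a gate-level 2:1 mux on single bits with the selection rule described for Mux. It is a Python model rather than Chisel, and mux_gates and mux_select are illustrative names.

```python
from itertools import product

def mux_gates(sel, a, b):
    """2:1 mux from gates: y = (sel AND a) OR (NOT sel AND b)."""
    return (sel & a) | ((1 - sel) & b)

def mux_select(sel, a, b):
    """Selection rule: a when sel is true, otherwise b."""
    return a if sel else b

for sel, a, b in product((0, 1), repeat=3):
    assert mux_gates(sel, a, b) == mux_select(sel, a, b)
    print(sel, a, b, mux_gates(sel, a, b))
```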
During DFT validation, the patterns generated during the ATPG stage are used; the same patterns (in STIL or WGL format) are also used to test the chip on the ATE. As the ATE has limited memory, the size of the generated patterns must be within the memory limit of the ATE. Thus we have to reduce the pattern count/pattern volume for a design without losing coverage. A few of the techniques are:
For pattern reduction, the first step is chain balancing. During scan insertion, the scan chains in the design must be balanced (of equal length), so that the tool inserts fewer dummy patterns to reach a required flip-flop.
We can also include compression on the chains. For example, with a compression factor of 2, one scan chain is divided into 2 chains inside the device, which reduces the chain length (flops per scan chain), so fewer patterns are required.
Compression ratio: the compression ratio in DFT is used to reduce the tester application time and the tester data volume (size of the patterns).
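A rough feel for why balancing and compression help comes from simple arithmetic on chain length. The sketch below uses made-up design numbers and a simplified cost model (shift cycles are approximated as patterns times longest chain, ignoring capture cycles); all names are illustrative assumptions.

```python
import math

def chain_length(num_flops, num_chains):
    """Length of the longest chain when flops are spread as evenly as possible."""
    return math.ceil(num_flops / num_chains)

def shift_cycles(num_patterns, longest_chain):
    """Approximate shift cycles: every pattern is shifted through the longest chain."""
    return num_patterns * longest_chain

flops, chains, patterns = 10_000, 8, 500   # hypothetical design numbers

balanced = chain_length(flops, chains)        # 1250 flops per chain
compressed = chain_length(flops, chains * 2)  # 625 with a compression factor of 2
unbalanced_longest = 3000                     # one long chain dominates if chains are unbalanced

print(shift_cycles(patterns, unbalanced_longest))  # 1500000
print(shift_cycles(patterns, balanced))            # 625000
print(shift_cycles(patterns, compressed))          # 312500
```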
If you are asked to implement DFT on a design, which factors do you mainly need to consider? While implementing DFT, the designer needs some knowledge of the tester that will be used for testing the IC:
Number of channels available on the tester
Memory size of the channel
Number of scan pins
The operating frequency of the tester
The above factors must be considered while implementing DFT on the design.
Design for Testability (DFT) is required to guarantee product quality, reliability, performance, etc. Design for Testability refers to those design techniques that:
Enhance the testability of a device
Ease the generation of test vectors
Reduce test time
Reduce the cost involved during test
There are different methods to implement DFT logic for digital circuits, which are listed below:
Ad-hoc methods: good design practices learnt through experience, which are used as guidelines:
Avoid combinational feedback
All flip-flops must be initializable
Avoid redundant and large fan-in gates
Provide test control for the signals which are not controllable
While designing test logic we have to consider the ATE requirements
Ad-hoc methods have a few disadvantages, which give the advantage to structured methods.
Disadvantages of ad-hoc DFT methods:
Experts and tools not always available
Test generation is often manual with no guarantee of high fault coverage
Design iterations may be necessary
Structured Methods: Structured DFT provides a more systematic and automatic approach to enhancing design testability. Structured DFT’s goal is to increase the controllability and observability of a circuit. Various methods exist for accomplishing this. The most common is the scan design technique, which modifies the internal sequential circuitry of the design.
Scan: all the flip-flops in the design are converted to scan flip-flops.
Boundary Scan
Built-in self-test
We have come across the scan flip-flop, and you may be wondering what the difference is between a normal flip-flop and a scan flip-flop. The pictorial representation below gives a clear picture of a flop and a scan flop.
TM represents the Test Mode signal; this signal should be 1 during DFT testing and 0 in functional mode.
Chisel uses the same Boolean algebra operators and arithmetic operators as C, Java, Scala and other programming languages.
val sel = a & b
The keyword val is part of Scala and is used to name variables whose values won’t change. Here it is used to name the Chisel wire, sel, holding the output of the bitwise AND operation. A signal can also first be defined as a Wire of some type. Afterwards, we can assign a value to the wire with the := update operator:
val sel = Wire(UInt())
sel := a & b
Boolean operators
& represents the bitwise AND operator: val and = a & b
| represents the bitwise OR operator: val or = a | b
^ represents the bitwise XOR operator: val xor = a ^ b
~ represents bitwise negation: val not = ~a
Arithmetic operations
+ is the addition operation: val add = a + b
- is the subtraction operation: val sub = a - b
* is the multiplication operation: val mul = a * b
/ is the division operation: val div = a / b
% is the modulo operation: val mod = a % b
Recovery time: recovery time is the minimum time that an asynchronous control signal must be stable before the active clock edge. In other words, this check ensures that after the asynchronous signal becomes inactive, there is adequate time to recover so that the next active clock edge can be effective.
Consider the time, as shown in the figure below, between an asynchronous reset becoming inactive and the active clock edge of a flip-flop. If the active clock edge occurs too soon after the release of reset, the state of the flip-flop may be unknown. Therefore a minimum time is required for the asynchronous control signal to be stable before the clock edge. Recovery time is similar to setup time.
Removal time: removal time is the minimum length of time that an asynchronous control signal must be stable after the active clock edge. This check ensures that the active clock edge has no effect, because the asynchronous control signal remains active until the removal time after the active clock edge.
Here the asynchronous control signal is released (becomes inactive) well after the active clock edge, so that the clock edge can have no effect. Similar to a hold check, it is a minimum path check, except that it is on an asynchronous pin of the flip-flop.
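The two checks can be written down as simple inequalities on the reset release time relative to the active clock edge. The sketch below is a simplified illustration, not how an STA tool is implemented; all times are hypothetical and given in picoseconds, and the function names are assumptions.

```python
def recovery_ok(release_ps, clock_edge_ps, t_recovery_ps):
    """Reset must be released at least t_recovery before the active clock edge."""
    return clock_edge_ps - release_ps >= t_recovery_ps

def removal_ok(release_ps, clock_edge_ps, t_removal_ps):
    """Reset must stay active at least t_removal after the active clock edge."""
    return release_ps - clock_edge_ps >= t_removal_ps

clock_edge = 10_000   # active clock edge at 10 ns

# Recovery check (similar to setup): release before the edge.
print(recovery_ok(9_500, clock_edge, 300))   # True: 500 ps of margin
print(recovery_ok(9_900, clock_edge, 300))   # False: released too close to the edge

# Removal check (similar to hold): release after the edge.
print(removal_ok(10_400, clock_edge, 200))   # True: held 400 ps past the edge
print(removal_ok(10_100, clock_edge, 200))   # False: released too soon after the edge
```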
Chisel data types are used to specify the type of values held in state elements or flowing on wires. Chisel defines Bundles for making collections of values with named fields (similar to structs in other languages) and Vecs for indexable collections of values. There are three data types, each representing a vector of bits, to describe signals, combinational logic and registers:
Bits
UInt (Unsigned Integer)
SInt (Signed Integer)
UInt and SInt extend the Bits data type. Chisel uses two’s complement as the signed integer representation. The three data types are defined as follows:
Bits(8.W) // an 8-bit Bits
UInt(8.W) // an 8-bit unsigned integer
SInt(8.W) // an 8-bit signed integer
Example:
0.U // defines the unsigned integer constant 0
-3.S // defines the signed integer constant -3
We can also define an integer with a width:
7.U(4.W) // unsigned integer of width 4
8.S(7.W) // signed decimal 7-bit literal of type SInt
For constants defined in bases other than decimal, the constant is defined in a string with a preceding h for hexadecimal (base 16), o for octal (base 8) or b for binary (base 2). In these cases we omit the bit width and Chisel infers the minimum width that fits the constant; in the cases below, a width of 8 is inferred.
"hdf".U // hexadecimal representation of 223
"o337".U // octal representation of 223
"b1101_1111".U // binary representation of 223
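The three literals and the inferred width can be double-checked with ordinary integer arithmetic. The Python sketch below only verifies the numbers; it does not use Chisel, and to_twos_complement is a helper written here to show the two's complement bit pattern of a signed value.

```python
values = [int("df", 16), int("337", 8), int("1101_1111", 2)]
print(values)                   # [223, 223, 223]
print(values[0].bit_length())   # 8: minimum width that fits 223

def to_twos_complement(value, width):
    """Bit pattern of a signed integer in two's complement at the given width."""
    return format(value & ((1 << width) - 1), f"0{width}b")

print(to_twos_complement(-3, 8))   # 11111101
print(to_twos_complement(8, 7))    # 0001000
```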
Design for Testability (DFT) logic is used to improve the controllability and observability of a design. The test logic is inserted into the main core logic for testing the chip once it is manufactured. The types of DFT logic are:
Logic BIST: built-in self-test logic is inserted into the core logic design. This circuit is used to test the core logic.
MBIST: memory built-in self-test is carried out on the memory elements; this logic is used for testing memories.
Boundary scan: at the board level, boundary scan circuitry provides access to the input and output ports of the chips. This circuitry not only does board-level testing and tests board interconnections; it can also drive chip-level tests such as BIST or internal scan. To control all these operations, a TAP controller is used.