About VLSI

Very-large-scale integration (VLSI) is the process of creating integrated circuits by combining thousands of transistors into a single chip. VLSI began in the 1970s when complex semiconductor and communication technologies were being developed.

Monday, November 28, 2011

Physical Design: Routing

Routing

Routing flow is shown in the Figure (1).



Figure (1) Routing flow [1]

Routing is the process of creating physical connections based on logical connectivity. Signal pins are connected by routing metal interconnects. Routed metal paths must meet timing, clock skew and maximum transition/capacitance requirements, as well as physical DRC requirements.

In a grid-based routing system, each metal layer has its own tracks and preferred routing direction, which are defined in a unified cell in the standard cell library.

There are four steps of routing operations:

1. Global routing
2. Track assignment
3. Detail routing
4. Search and repair


Global Route assigns nets to specific metal layers and global routing cells. Global route tries to avoid congested global cells while minimizing detours. Global route also avoids pre-routed P/G (power/ground) nets, placement blockages and routing blockages.

Track Assignment (TA) assigns each net to a specific track and lays down the actual metal traces. It tries to make long, straight traces to reduce the number of vias. DRC rules are not checked during the TA stage. TA operates on the entire design at once.

Detail Routing tries to fix all DRC violations after track assignment, working within a fixed-size small area known as an "SBox". Detail route traverses the whole design box by box until the entire routing pass is complete.

Search and Repair fixes remaining DRC violations through multiple iterative loops using progressively larger SBox sizes.

Reference

[1] Astro User Guide, Version X-2005.09, September 2005

Expressions, Operators and Operands in Verilog HDL

Verilog HDL: Expressions, Operators and Operands

Dataflow modeling in Verilog describes the design in terms of expressions instead of primitive gates. 'Expressions', 'operators' and 'operands' form the basis of Verilog dataflow modeling.

Arithmetic:

                            *       ---> Multiplication
                            /        ---> Division
                           +        ---> Addition
                           -         ---> Subtraction
                           %       ---> Modulo
                          **        ---> Power or exponent
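
Eg (an illustrative sketch; A, B and Y are assumed to be declared with suitable widths):

         A = 4'b0011; B = 4'b0100;
         Y = A + B;                          //Y = 4'b0111 (3 + 4 = 7)
         Y = B % A;                          //Y = 1 (remainder of 4/3)
         Y = A ** 2;                         //Y = 9 (power operator)
         //Note: if any operand bit is x or z, the arithmetic result is x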

Logical:

                           !         ---> logical negation (one operand)
                       &&       ---> logical AND
                           ||         ---> logical OR

Relational:

                          >        ---> greater than
                          <        ---> less than
                          >=      ---> greater than or equal to
                          <=      ---> less than or equal to

Equality:

                          ==      ---> equality
                          !=       ---> inequality
                       ===       ---> case equality
                        !==       ---> case inequality
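
The difference between logical equality (==) and case equality (===) matters when operands contain x or z bits; a small assumed example:

         A = 4'b1x0z; B = 4'b1x0z;
         Y = (A == B);                       //Y = x (== returns x if any bit is x or z)
         Y = (A === B);                      //Y = 1 (=== compares x and z bits literally)
         Y = (4'b1010 != 4'b1011);           //Y = 1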

Bitwise:

                          ~        ---> bitwise negation (one operand)
                         &        ---> bitwise AND
                          |          ---> bitwise OR
                         ^         ---> bitwise XOR
             ^~ or ~^         ---> bitwise XNOR

Reduction:

                         &          ---> reduction AND (one operand) 
                       ~&          ---> reduction NAND
                          |            ---> reduction OR
                        ~|            ---> reduction NOR
                         ^           ---> reduction XOR
              ^~ or ~^          ---> reduction XNOR
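
Reduction operators take a single vector operand and return a 1-bit result; an assumed example:

         A = 4'b1010;
         Y = &A;                             //Y = 1'b0 (1 & 0 & 1 & 0)
         Y = |A;                             //Y = 1'b1 (1 | 0 | 1 | 0)
         Y = ^A;                             //Y = 1'b0 (even number of 1s)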

Shift:

                       >>           ---> right shift
                      <<            ---> left shift
                    >>>            ---> arithmetic right shift
                    <<<            ---> arithmetic left shift
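
The ordinary shifts fill vacated bits with zeros, while the arithmetic right shift preserves the sign bit of a signed operand; an assumed sketch:

         reg [3:0] A;  reg signed [3:0] B;
         A = 4'b1000;  B = 4'b1000;          //B is -8 when interpreted as signed
         Y = A >> 2;                         //Y = 4'b0010 (zeros shifted in)
         Y = B >>> 2;                        //Y = 4'b1110 (sign bit replicated; -2)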

Concatenation:

                          { }        ---> any number of operands 

Eg:

         A = 1'b1, B = 2'b00, C = 2'b10, D = 3'b110
         Y = {B,C}                                       //result Y is 4'b0010
         Y = {A,B,C,D,3'b001}                            //Y = 11'b10010110001
         Y = {A,B[0],C[1]}                               //Y = 3'b101


Replication:

                       {{ }}        ---> replication (a repetition count with a nested concatenation) 

Eg :- 
           reg A;
           reg [1:0] B,C;
           reg [2:0] D;
          A=1'b1; B=2'b00; C=2'b10; D=3'b110;

          Y={4{A}}                                      //result Y is 4'b1111
          Y={{4{A}},{2{B}}}                             //Y = 8'b11110000
          Y={{4{A}},{2{B}},C}                           //Y = 10'b1111000010


Conditional:

                          ?: (three operands)
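
The conditional operator is often used to model a multiplexer or a tri-state driver; a minimal assumed sketch:

         assign out = sel ? in1 : in0;           //2-to-1 mux: in1 when sel is 1, else in0
         assign bus = enable ? data : 32'bz;     //simple tri-state driver model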

Data Types in Verilog HDL

Verilog HDL: Data Types

Value Set: 
                           ---> Four values  to model the functionality

                           ---> Eight strengths of real hardware

 
    Value level--------- Condition in hardware circuits

 
            0 ------------- > Logic zero, false condition
            1 ------------- > Logic one, true condition
            X ------------ > Unknown logic value
            Z ------------- > High impedance, floating state

 

Nets: 

 
  • Represent connections between hardware elements; 'net' is a class of data types, not a keyword
  • Nets are declared primarily with the keyword 'wire'
  • Default value is 'z'
  • Exception: the 'trireg' net, which defaults to 'x'

Eg: 

 
           wire a;
           wire b,c;
           wire d = 1'b0;   //Net d is fixed to logic value zero at declaration

 
Register:

 
  • Represent data storage element
  • Retain value until another value is placed onto them
  • Keyword is 'reg'

 
Eg: 

 
            reg reset;          //declare a variable reset that can hold its value

 
Registers can also be declared as signed variables

 
Eg:

 
             reg signed [63:0] m; //64 bit signed value

 

 

Vectors:

 
Nets or reg data types can be declared as vectors (multiple bit widths)

Default is scalar (1-bit)

 
Eg : 

 
       wire a;                                         //scalar net variable;default

       wire [7:0] bus;                            //8 bit bus

       wire [31:0] busA, busB, busC;     //3 buses of 32-bit width

       reg clock;                                    //scalar register; default

       reg [0:40] virtual_addr;              //vector register. Virtual address 41 bits wide

 
// 0:always MSB ; 40:always LSB

 
Vector Part Select:
It is possible to address bits or parts of vectors.

 
Eg: 

 
        busA[7]                   //bit 7 of vector busA

        bus[2:0]                  //three LSBs of vector bus

        virtual_addr[0:1]    //two MSBs of vector virtual_addr

 

 Variable Vector Part Select:

 

 
Eg

 
       reg [255:0] data1;          //Little endian notation

       byte = data1[31-:8];        //starting bit = 31, width = 8 => data1[31:24]

       byte = data1[24+:8];        //starting bit = 24, width = 8 => data1[31:24]

 
Integer:

 
Default width is the host-machine word size, which is implementation-specific but at least 32 bits

 
Eg:    

 
         integer counter;                    //general purpose variable used as a counter

 
Real:

 
  • Can be in decimal notation(eg: 3.14)
  • Can be in scientific notation(eg: 3e6)
  • No range declaration possible
  • Default value is zero

 
Eg

 
          real delta;           //define a real variable

 

 
Time:

 
  • A special time register data type is used in Verilog to store simulation time
  • Width is implementation-specific, but at least 64 bits
  • The system function '$time' is invoked to get the current simulation time
  • The simulation time is measured in terms of simulation seconds

Eg : 

 
       time save_sim_time;                 //define a time variable save_sim_time

       initial save_sim_time= $time;  //save the current simulation time

 

 
Arrays:

 
  • Allowed for reg, integer, time, real, realtime and vector
  • Multidimensional arrays are also allowed
  • Arrays are accessed by []

 
Eg

 
      integer count[0:7] ;         //an array of 8 count variables

      reg bool[31:0] ;              //array of 32 one-bit Boolean register variables

      time chk_point[1:100];   //array of 100 time checkpoint variables

      reg [4:0] port_id[0:7];    //array of 8 port_ids; each port_id is 5 bits wide
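
Array elements are then accessed with an index; assuming the declarations above:

      count[5] = 0;                  //reset the 6th element of array count
      chk_point[100] = 0;            //reset the 100th time checkpoint
      port_id[3] = 5'b10011;         //assign a 5-bit value to the 4th port_id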

 

 
Memories:

 
  • Memories are modeled as a one-dimensional array of registers
  • Each word can be one or more bits

 
Eg : 

 
      reg memory_1_bit[0:1023];           //memory memory_1_bit with 1K 1-bit words

      reg[7:0] memory_byte[0:1023];    //memory memory_byte with 1K 8-bit words(bytes)

      memory_byte[511]                        //fetches 1 byte word whose address is 511
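
A word is written the same way, by indexing the memory with its address; a minimal assumed sketch (clk, write_enable, addr and data_in are illustrative names):

      always @(posedge clk)
         if (write_enable)
            memory_byte[addr] <= data_in;   //store one 8-bit word at address 'addr'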

 

 

Parameters:

 
  • Constant definitions

 
Eg

 
         parameter port_id = 5;                       //defines a constant port_id

         parameter cache_line_width=256;    //constant defines width of cache line

         parameter signed [15:0] width;         //fixed sign and range for parameter width
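
A hedged sketch of how a parameter is typically used and overridden (module and signal names are assumed for illustration):

         module shift_reg (out, in, clk);
            parameter WIDTH = 8;                //default width; can be overridden per instance
            output reg [WIDTH-1:0] out;
            input in, clk;

            always @(posedge clk)
               out <= {out[WIDTH-2:0], in};     //shift left by one, insert 'in' at the LSB
         endmodule

         //Overriding the parameter at instantiation:
         //shift_reg #(16) sr16 (q, d, clk);    //this instance is 16 bits wide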

Tuesday, November 22, 2011

VLSI DESIGNS ARE CLASSIFIED INTO THREE CATEGORIES

VLSI DESIGNS ARE CLASSIFIED INTO THREE CATEGORIES:  

  1. Analog: Small transistor count precision circuits such as Amplifiers, Data converters, filters, Phase Locked Loops, Sensors etc.

 

  2. ASICs or Application Specific Integrated Circuits: Progress in the fabrication of ICs has enabled us to create fast and powerful circuits in smaller and smaller devices. This also means that we can pack a lot more functionality into the same area. The biggest application of this ability is found in the design of ASICs. These are ICs that are created for specific purposes - each device is created to do a particular job, and do it well. The most common application area for this is DSP - signal filters, image compression, etc. To go to extremes, consider the fact that a digital wristwatch normally consists of a single IC doing all the time-keeping jobs as well as extra features like games, calendar, etc.

 

  3. SoC or Systems on a Chip: These are highly complex mixed-signal circuits (digital and analog all on the same chip). A network processor chip or a wireless radio chip is an example of an SoC. 

THE VLSI DESIGN PROCESS: Analog design

In the case of analog design, the flow changes somewhat. A typical analog design flow is as follows:

  • Specifications
  • Architecture
  • Circuit Design
  • Simulation
  • Layout
  • Parametric Extraction / Back Annotation
  • Final Design
  • Tape Out to foundry.

 

While digital design is highly automated now, only a very small portion of analog design can be automated. There is a hardware description language called AHDL, but it is not widely used because it does not accurately capture the behavioral model of the circuit, owing to the complexity of the effects of parasitics on the analog behavior of the circuit. Many analog chips are what are termed "flat" or non-hierarchical designs. This is true for small transistor-count chips such as an operational amplifier, a filter or a power-management chip. For more complex analog chips such as data converters, the design is done at a transistor level, building up to a cell level, then a block level, and then integrated at a chip level. Not many CAD tools are available for analog design even today, and thus analog design remains a difficult art.

 

THE VLSI DESIGN PROCESS: Digital design

A typical digital design flow is as follows: 


  • Specification
  • Architecture  
  • RTL Coding  
  • RTL Verification  
  • Synthesis
  • Backend
  • Tape out to foundry to get the end product: a wafer with a repeated number of identical ICs.

 

All modern digital designs start with a designer writing a hardware description of the IC (using HDL or Hardware Description Language) in Verilog/VHDL. A Verilog or VHDL program essentially describes the hardware (logic gates, Flip-Flops, counters etc) and the interconnect of the circuit blocks and the functionality. Various CAD tools are available to synthesize a circuit based on the HDL.  

Without going into details, we can say that VHDL can be called the "C" of the VLSI industry. VHDL stands for "VHSIC Hardware Description Language", where VHSIC stands for "Very High Speed Integrated Circuit". This language is used to design circuits at a high level, in two ways. It can either be a behavioral description, which describes what the circuit is supposed to do, or a structural description, which describes what the circuit is made of. There are other languages for describing circuits, such as Verilog, which work in a similar fashion.

Both forms of description are then used to generate a very low-level description that actually spells out how all these are to be fabricated on the silicon chips. This will result in the manufacture of the intended IC.

Saturday, November 19, 2011

Production Programming of Flash for FPGAs and MCUs...

This is from the comp.arch.fpga Google group:
experiences, thoughts and answers from members of the group.
1.
Someone on Linkedin asked about a stand alone device for programming 
the flash for FPGAs in the field or in a production environment. 
There doesn't seem to be anything currently available like this. 
Looking at the big three manufacturers I see at least two formats for 
the files that might be used.  Xilinx and Lattice use SVF with Xilinx 
offering support for a compressed version called... XSVF of course. 
Altera uses JAM.  JAM seems to be a JEDEC standard while SVF appears 
to be a de facto industry standard developed by a company. 
I'm curious why two standards came about.  Was there a problem with 
using the version the company developed?  I'm assuming the industry 
version came first and the JEDEC version came later.  Or is that 
wrong?  It won't be too much trouble to support both, but I don't get 
why both standards exist. 
How do you program production devices?  I know in large facilities 
they pay big bucks for JTAG hardware and software that will work 
across the spectrum including test and diagnosis.  I'm thinking there 
is a market for a more limited device that is just used to program the 
non-volatile memory in embedded systems in an efficient manner for 
production and field upgrades.  Any thoughts? 



2.
Procedure for programming a bare at91sam9 board. 
1. Insert SD-Card 
2. Press reset 
3. Wait until LED blinks 
4. Remove SD-Card 
5. Press Reset 
    Application boots... 
Procedure for updating the linux kernel in a preprogrammed board. 
1. Connect board to host PC using USB. 
2. Reset the board in the USB Mass Storage mode. 
3. Wait until FAT partition window appears on the host PC. 
4. Click on the new kernel version, drag and drop it onto the FAT window 
5. Reset the board into the normal Linux boot sequence. 
 
3.
Where I've been, we always ended up building our own board with an MCU on 
it for the production testing, usually involving bit-banging (everything 
from JTAG to PCI) a bootloader or test program into the DUT and programming 
a flash via UART/SPI or something like that. 
The code to be programmed was usually stored in flash on the board, so unless 
you needed to add serial numbers and such it could be used standalone: just 
plug it into the DUT, push the program button, and done. 
We did at one point try some JTAG hardware, but it could never really do 
what we wanted. 

4.
IEEE 1532 is something that is a bit newer; I believe both Xilinx and Altera 
support it, not sure 'bout the others. http://grouper.ieee.org/groups/1532/ 
(we are primarily Xilinx users...) 
As for programming, it depends on the system.  These days we usually have a 
PC in the test fixture for all but the simplest of boards, so we use a Xilinx 
cable for the initial load.  We usually have the Xilinx part as a coprocessor 
with other devices, so even if the Xilinx boots first, we have other devices 
that can do updates to the memory already on-board. 
Even if you don't have a CPU, it's not hard to put in a PicoBlaze core and do 
a loader to update a SPI flash via bit-banging.  You could probably do one 
with access to raw SD/MMC cards without too much trouble. 

5.
It depends on the devices in question. 
Many larger microcontrollers have a bootloader in ROM that you can use 
to program them over a serial link or perhaps USB.  Smaller 
microcontrollers can often be programmed easily using a JTAG or other 
debugging port, or an SPI-like interface (such as AVR devices). 
Typically that means using the manufacturer's own JTAG debuggers and 
software, but these are always far cheaper than the JTAG test equipment 
and software you describe. 
I have also used the JTAG or BDM port of bigger microcontrollers, 
combined with a cheap hardware interface and gdb, to script programming 
and testing setups.  For ARM devices you can use OpenOCD or Urjtag in a 
similar fashion. 
For devices that can boot from a serial flash, the easiest method is 
often to make these pins available on a header, along with the "boot 
mode" control pins for the device.  Then you can make a little card with 
a serial flash device that you plug into the board for initial bootup - 
this software can then test the board and program the real code into the 
main memory. 
If you have a serial flash on the board as the main memory, then you can 
have a similar header that lets an off-board device hold the processor 
or FPGA in reset while it programs the serial flash.  You can make such 
a device using an FTDI 2232H module and a few wires. 

6.
I try to stick with devices which can be programmed over a standard 
serial port. A programmer is nothing more than a USB to serial 
converter. Very convenient. 
If I need in system programming I use a standard programmer with a 
cable. IC socket to put in the programmer at one end, a special 
connector on the other end. 
In order to program large numbers of devices I once built a special 
rig with 8 JTAG and 8 serial ports. The devices to be programmed were 
designed to be plugged into this programmer. There is a lot you can do 
at the design stage to make programming easier & faster. A cheap 
device may cost more in the end if the programming takes more time & 
effort. Time is expensive in many places. 

Monday, November 14, 2011

Coding Style of Verilog HDL-Modules

Verilog is one of the two widely used Hardware Description Languages (HDLs); VHDL is the other one. Verilog syntax very closely matches C language syntax, which is a big advantage when learning Verilog. Logic operators, data types and loops are similar to C. In addition to this, certain data types which are necessary to describe hardware are available in Verilog. For example, a net in a schematic or hardware design is referred to here as a 'wire'. Flip-flops (in general called registers) are defined with the type 'reg'.



"module"s are building block of Verilog. Consider any design represented by a block diagram with its inputs and outputs.


Consider a design named "A". Let "Input1" and "Input2" be its inputs, and let "Output1" be its output. In Verilog we define this design "A" as module "A", and its inputs and outputs become the Input/Output (I/O) "ports" of that module. The keyword for declaring a module in Verilog is "module". This keyword should be followed by the name of the module, in this case "A"; hence it becomes: module A. Thus the complete syntax for declaring or defining a hardware circuit is as follows:


module <module_name> (<module_terminal_list>);

            input ---- ; // tell which are inputs

            output ---- ; //these are outputs

             ------------------------------

             ------------------------------

           

            --------------------------------

endmodule

Where,

module ==> Verilog keyword to declare a name of the hardware circuit.

<module_name> ==> Name of the module, i.e., the hardware circuit. (Here it is "A").

<module_terminal_list> ==> 'terminal' is the Verilog way of naming I/O ports; here we should specify all inputs and outputs of the circuit. Outputs are written first. Why? The answer is "That's how Verilog is!!!!"

input ==> Verilog keyword to tell the Verilog compiler which signals in the given terminal list are inputs of the module.

output ==> Verilog keyword to tell the Verilog compiler which signals in the given terminal list are outputs of the module.

endmodule ==> Another Verilog keyword to indicate that all module-related definitions are over and this is the end of the module.

; ==> The semicolon marks the end of a statement.

// --> represents start of comment; same as C syntax.

( /* …… */ encloses a multi-line comment)



So above design "A" can be defined in Verilog as given below:

module A (Output1, Input1, Input2);

                   output Output1;

                   input Input1, Input2;

                    -------------

                    -------------

 endmodule



If above design is a T flip-flop then we can write module description as:



module T_FF(q,clock,reset); // Here q is o/p; must be first

                    //Other control i/ps must be next, i.e. the clock and reset inputs

                    -------------

                    --------------

                   

endmodule

Nesting of modules is not allowed in Verilog. One module definition cannot contain another module definition within the 'module' and 'endmodule' statements.

Hardware circuit of design "A" can be defined in 3 different ways in Verilog. They are:

1. Gate level modeling

2. Data flow modeling

3. Behavioural modeling
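
As a hedged illustration (signal and module names are assumed), a simple 2-input AND version of design "A" could be written in each of the three styles roughly as follows:

// 1. Gate level modeling - instantiate Verilog primitive gates
module A_gate (Output1, Input1, Input2);
   output Output1;
   input  Input1, Input2;
   and g1 (Output1, Input1, Input2);
endmodule

// 2. Data flow modeling - describe the logic with a continuous assignment
module A_dataflow (Output1, Input1, Input2);
   output Output1;
   input  Input1, Input2;
   assign Output1 = Input1 & Input2;
endmodule

// 3. Behavioural modeling - describe the behaviour inside a procedural block
module A_behav (Output1, Input1, Input2);
   output reg Output1;
   input  Input1, Input2;
   always @(Input1 or Input2)
      Output1 = Input1 & Input2;
endmodule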

Sunday, November 13, 2011

Fundamental DSP/speech processing patent for sale

US Patent 7,124,075 "Methods and apparatus for pitch determination" 
will be auctioned as Lot 147 at the upcoming ICAP Patent Brokerage 
Live IP Action on November 17, 2011 at The Ritz Carlton, San 
Francisco. 
The patent addresses a core problem of signal processing in general, 
and speech signal processing in particular: period (fundamental 
frequency) determination of a (quasi)-periodic signal, or pitch 
detection problem in speech/audio signal processing. 
Patented nonlinear signal processing techniques originate from chaos 
theory and address known limitations of traditional linear signal 
processing methods like FFT or correlation. 
Patented methods are amenable to efficient implementation in both 
software and hardware (FPGAs, ASICs). 
Forward citations include Microsoft, Mitsubishi Space Software, 
Broadcom, Sharp and Teradata. 
Visit ICAP's website for more information: http://icappatentbrokerage.com/forsale 

ASIC design job vs FPGA design job

ASIC design job vs FPGA design 

This is from the comp.arch.fpga group:

question:

I am an ASIC design engineer with over 6 years of experience. My experience 
in ASIC design spans microarchitecture, RTL coding, synthesis, 
timing closure and verification. Is it advisable for me to change to an 
FPGA design job? I mean, what are the pros and cons? I do not have much 
experience in FPGA other than school projects. How much learning is 
involved? Will it be difficult to switch back to an ASIC design position in 
the future if I move to an FPGA job? Does FPGA design involve less work and 
stress than ASIC? Please provide your opinion, experience or any other 
comment. 

 answer by anonymous author  : I knew a guy who had done really good FPGA designs for years, and for 
years had yearned to do ASIC design with the "big boys".  He lasted a 
year or two -- not because he wasn't up to the job, but because he hadn't 
realized the difference in the design cycle between ASIC and FPGA, and he 
vastly preferred FPGA design. 

Because with FPGA design, you do your system design and have a design 
review, then you do your coding and have a design review, and then you 
pour it all into the PC board that's been underway at the same time that 
you were doing your FPGA design.  You bring it all up with the test 
features in the software whose design has _also_ been underway while you 
were working, and you test the heck out of it. 

At this point, you're far from done: the board will be getting green 
wires, the software will be getting revised (or, if everyone is smart, 
only the critical portions of the software will have been completed), and 
your logic design will probably need revision (or be incomplete). 

So it's not uncommon to spend a month or two tweaking and revising your 
"finished" design after it's finished. 

Tom's experience with ASIC design, on the other hand, was that you get 
the system design done, then you go write a bunch of behavioral code to 
completely embody the system design, and a testbench to completely test 
it.  You churn on that for weeks or months while your colleagues make up 
new tests for corner cases.   

Then, once you've verified the snot out of the system design, you start 
replacing parts of your behavioral system piece-by-piece with the RTL- 
level code for your ASIC, testing all the way. 

So, (in Tom's words), you spend 90% of your time flogging the 
verification. 

This all makes sense:  the cycle time between moving a comma in a Verilog 
file and testing the effect in an FPGA might only take between half an 
hour and several hours.  The cycle time to do the same thing with an ASIC 
is weeks, and $$$, and trash bins full of parts.  So doing the 
verification "live" makes good economic sense with FPGAs, and doing it in 
simulation makes equally good economic sense with ASICs. 

So:  if the design cycle that I'm quoting for ASICs sounds accurate to 
you (I'm just forwarding a long-ago conversation), and the design cycle 
for FPGA work makes you think "ewww!", then FPGA work isn't for you.  If, 
on the other hand, you get no joy from spending 90% of your time 
verifying before you actually get to see your work working -- maybe 
you'll like FPGA work. 

Tom did note barriers to transitioning to ASIC work (in part because he 
has an EET degree, not a "real" EE degree), and may not have found the 
transition back to FPGA work as easy as he did if he did not have a large 
circle of former coworkers who -- to a man -- were impressed by his work 
and willing to tell their bosses.  (Tom's one of those guys that if he's 
applying for work you tell your boss "just hire him, he'll make it work"). 

So, that's what I know. 

Monday, November 7, 2011

Physical Design Flow




The main steps in the flow are:
Design Netlist (after synthesis)
Floor Planning
Partitioning
Placement
Clock-tree Synthesis (CTS)
Routing
Physical Verification
GDS II Generation
These steps are just the basics. There are detailed PD flows that are used depending on the tools used and the methodology/technology. Some of the tools/software used in back-end design are:
Cadence (SOC Encounter, VoltageStorm, NanoRoute)
Synopsys (Design Compiler)
Magma (BlastFusion, etc)
Mentor Graphics (Olympus SoC, IC-Station, Calibre)
A more detailed Physical Design Flow is shown below. Here you can see the exact steps and the tools used in each step outlined.

Membership and member grades for IEEE....

Membership and member grades

Most IEEE members are electrical and electronics engineers, but the organization's wide scope of interests has attracted people in other disciplines as well (e.g., computer science, mechanical and civil engineering), including biologists, physicists, and mathematicians.

An individual can join the IEEE as a student member, professional member, or associate member. In order to qualify for membership, the individual must fulfil certain academic or professional criteria and abide by the code of ethics and bylaws of the organization. There are several categories and levels of IEEE membership and affiliation:

  • Student Members: Student membership is available for a reduced fee to those who are enrolled in an accredited institution of higher education as undergraduate or graduate students in technology or engineering.
  • Members: Ordinary or professional Membership requires that the individual have graduated from a technology or engineering program of an appropriately-accredited institution of higher education or have demonstrated professional competence in technology or engineering through at least six years of professional work experience. An associate membership is available to individuals whose area of expertise falls outside the scope of the IEEE or who does not, at the time of enrollment, meet all the requirements for full membership. Students and Associates have all the privileges of members, except the right to vote and hold certain offices.
  • Society Affiliates: Some IEEE Societies also allow a person who is not an IEEE member to become a Society Affiliate of a particular Society within the IEEE, which allows a limited form of participation in the work of a particular IEEE Society.
  • Senior Members: Upon meeting certain requirements, a professional member can apply for Senior Membership, which is the highest level of recognition that a professional member can directly apply for. Applicants for Senior Member must have at least three letters of recommendation from Senior, Fellow, or Honorary members and fulfill other rigorous requirements of education, achievement, remarkable contribution, and experience in the field. The Senior Members are a selected group, and certain IEEE officer positions are available only to Senior (and Fellow) Members. Senior Membership is also one of the requirements for those who are nominated and elevated to the grade IEEE Fellow, a distinctive honor.
  • Fellow Members: The Fellow grade of membership is the highest level of membership, and cannot be applied for directly by the member – instead the candidate must be nominated by others. This grade of membership is conferred by the IEEE Board of Directors in recognition of a high level of demonstrated extraordinary accomplishment.
  • Honorary Members: Individuals who are not IEEE members but have demonstrated exceptional contributions, such as being a recipient of an IEEE Medal of Honor, may receive Honorary Membership from the IEEE Board of Directors.
  • Life Members and Life Fellows: Members who have reached the age of 65 and whose number of years of membership plus their age in years adds up to at least 100 are recognized as Life Members – and, in the case of Fellow members, as Life Fellows.

Wednesday, November 2, 2011

Process-Voltage-Temperature (PVT) Variations and Static Timing Analysis

Process-Voltage-Temperature (PVT) Variations and Static Timing Analysis

The major design challenges of ASIC design consist of microscopic issues and macroscopic issues [1]. The microscopic issues are ultra-high speeds, power dissipation, supply rail drop, growing importance of interconnect, noise, crosstalk, reliability, manufacturability and the clock distribution. The macroscopic issues are time to market, design complexity, high levels of abstractions, reuse, IP portability, systems on a chip and tool interoperability.

To meet the design challenge of clock distribution, timing analysis is performed. Timing analysis estimates when the output of a given circuit becomes stable. Timing Analysis (TA) is a design automation program which provides an alternative to hardware debugging of timing problems. The program establishes whether all paths within the design meet the stated timing criteria, that is, that data signals arrive at storage elements early enough for valid gating but not so early as to cause premature gating. The output of timing analysis includes the 'slack' at each block to provide a measure of the severity of any timing problem [13].


Static vs. Dynamic Timing Analysis

Timing analysis can be static or dynamic.

Static Timing Analysis (STA) works with timing models, whereas Dynamic Timing Analysis (DTA) works with SPICE models. STA is more pessimistic and thus gives the maximum delay of the design. DTA overcomes this difficulty because it performs full timing simulation. The problem associated with DTA is the computational complexity involved in finding the input pattern(s) that produce the maximum delay at the output, and hence it is slow. The static timing analyzer will report the following delays: register-to-register delays, setup times of all external synchronous inputs, clock-to-output delays, and pin-to-pin combinational delays. The clock-to-output delay is usually just reported as simply another pin-to-pin combinational delay. Timing analysis reports are often pessimistic since they use worst-case conditions.

The widespread use of STA can be attributed to several factors [2]:

The basic STA algorithm is linear in runtime with circuit size, allowing analysis of designs in excess of 10 million instances.

The basic STA analysis is conservative in the sense that it will over-estimate the delay of long paths in the circuit and under-estimate the delay of short paths in the circuit. This makes the analysis "safe", guaranteeing that the design will function at least as fast as predicted and will not suffer from hold-time violations.

The STA algorithms have become fairly mature, addressing critical timing issues such as interconnect analysis, accurate delay modeling, false or multi-cycle paths, etc.
Delay characterization for cell libraries is clearly defined, forms an effective interface between the foundry and the design team, and is readily available. In addition to this, the Static Timing Analysis (STA) does not require input vectors and has a runtime that is linear with the size of the circuit [9].


PVT vs. Delay

Sources of variation can be:

  • Process variation (P)
  • Supply voltage (V)
  • Operating Temperature (T)



Process Variation [14]


This variation accounts for deviations in the semiconductor fabrication process. Usually process variation is treated as a percentage variation in the performance calculation. Variations in the process parameters can be impurity concentration densities, oxide thicknesses and diffusion depths. These are caused by non-uniform conditions during deposition and/or during diffusion of the impurities. This introduces variations in the sheet resistance and in transistor parameters such as threshold voltage. There are also variations in the dimensions of the devices, mainly resulting from the limited resolution of the photolithographic process. This causes (W/L) variations in MOS transistors.

Process variations are due to variations in manufacturing conditions such as temperature, pressure and dopant concentrations. ICs are produced in lots of 50 to 200 wafers with approximately 100 dice per wafer. The electrical properties in different lots can be very different. There are also slighter differences within each lot, and even within a single manufactured chip. There are variations in the process parameters throughout a whole chip. As a consequence, the transistors have different transistor lengths throughout the chip. This makes the propagation delay different everywhere in a chip, because a smaller transistor is faster and therefore its propagation delay is smaller.


Supply Voltage Variation [14]


The design's supply voltage can vary from the established ideal value during day-to-day operation. Often a complex calculation (using a shift in threshold voltages) is employed, but a simple linear scaling factor is also used for logic-level performance calculations.

The saturation current of a cell depends on the power supply. The delay of a cell is dependent on the saturation current. In this way, the power supply influences the propagation delay of a cell. Throughout a chip, the power supply is not constant and hence the propagation delay varies across the chip. The voltage drop is due to nonzero resistance in the supply wires. A higher voltage makes a cell faster and hence the propagation delay is reduced. The decrease is exponential over a wide voltage range. The self-inductance of a supply line also contributes to a voltage drop. For example, when a transistor is switching to high, it takes a current to charge up the output load. This time-varying current (for a short period of time) causes an opposite self-induced electromotive force. The amplitude of the voltage drop is given by ΔV = L*dI/dt, where L is the self-inductance and I is the current through the line.
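
As a purely illustrative example with assumed numbers: for a supply-line self-inductance of 2 nH and a current that ramps by 10 mA in 1 ns, the inductive drop is ΔV = L*dI/dt = 2 nH * (10 mA / 1 ns) = 20 mV.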


Operating Temperature Variation [14]


Temperature variation is unavoidable in the everyday operation of a design. Effects on performance caused by temperature fluctuations are most often handled as linear scaling effects, but some submicron silicon processes require nonlinear calculations.

When a chip is operating, the temperature can vary throughout the chip. This is due to the power dissipation in the MOS transistors. The power consumption is mainly due to switching, short-circuit and leakage power consumption. The average switching power dissipation (approximately given by P_average = C_load * V_supply^2 * f_clock) is due to the energy required to charge up the parasitic and load capacitances. The short-circuit power dissipation is due to the finite rise and fall times: the nMOS and pMOS transistors may conduct simultaneously for a short time during switching, forming a direct current path from the power supply to ground. The leakage power consumption is due to the nonzero reverse leakage and sub-threshold currents. The biggest contribution to the power consumption is the switching. The dissipated power increases the surrounding temperature. The electron and hole mobility depend on the temperature. The mobility (in Si) decreases with increased temperature for temperatures above –50 °C. The temperature at which the mobility starts to decrease depends on the doping concentration; a starting temperature of –50 °C holds for doping concentrations below 10^19 atoms/cm3. For higher doping concentrations, the starting temperature is higher. When the electrons and holes move more slowly, the propagation delay increases. Hence, the propagation delay increases with increased temperature. There is also a second temperature effect: the threshold voltage of a transistor depends on the temperature. A higher temperature decreases the threshold voltage. A lower threshold voltage means a higher current and therefore better delay performance. This effect depends strongly on the power supply, threshold voltage, load and input slope of a cell. There is a competition between the two effects, and generally the mobility effect wins.
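
As a purely illustrative example of the switching-power expression above (assumed numbers): with C_load = 10 pF, V_supply = 1.2 V and f_clock = 500 MHz, P_average ≈ 10 pF * (1.2 V)^2 * 500 MHz ≈ 7.2 mW for that load; a real estimate also includes a switching-activity factor and sums the contribution over all switching nodes.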


The following figure shows the PVT operating conditions.




The best and worst design corners are defined as follows:

  • Best case: fast process, highest voltage and lowest temperature

  • Worst case: slow process, lowest voltage and highest temperature


On Chip Variation


On-chip variation refers to minor differences between different parts of the chip within one operating condition. On-chip variation (OCV) delays vary across a single die due to:
  • Variations in the manufacturing process (P)

  • Variations in the voltage (due to IR drop)

  • Variations in the temperature (due to local hot spots etc)

These effects need to be modeled by scaling (derating) the delay coefficients. Delays have uncertainty due to the variation of Process (P), Voltage (V), and Temperature (T) across large dies. On-chip variation analysis allows you to account for the delay variations due to PVT changes across the die, providing more accurate delay estimates.





Timing Analysis With On-Chip Variation

  • For cell delays, the on-chip variation is between 5 percent above and 10 percent below the SDF back-annotated values.

  • For net delays, the on-chip variation is between 2 percent above and 4 percent below the SDF back-annotated values.

  • For cell timing checks, the on-chip variation is 10 percent above the SDF values for setup checks and 20 percent below the SDF values for hold checks.

    In PrimeTime, OCV derating is applied using the following commands:

  • pt_shell> read_sdf -analysis_type on_chip_variation my_design.sdf

  • pt_shell> set_timing_derate -cell_delay -min 0.90 -max 1.05

  • pt_shell> set_timing_derate -net -min 0.96 -max 1.02

  • pt_shell> set_timing_derate -cell_check -min 0.80 -max 1.10



In the traditional deterministic STA (DSTA), process variation is modeled by running the analysis multiple times, each at a different process condition. For each process condition, a so-called corner file is created that specifies the delay of the gates at that process condition. By analyzing a sufficient number of process conditions, the delay of the circuit under process variation can be bounded.

The uncertainty in the timing estimate of a design can be classified into three main categories.

  • Modeling and analysis errors: Inaccuracy in device models, in the extraction and reduction of interconnect parasitics and in the timing analysis algorithms.
  • Manufacturing variations: Uncertainty in the parameters of fabricated devices and interconnects from die-to-die and within a particular die.

  • Operating context variations: Uncertainty in the operating environment of a particular device during its lifetime, such as temperature, supply voltage, mode of operation and lifetime wear-out.
For instance, the STA tool might utilize a conservative delay noise algorithm resulting in certain paths operating faster than expected. Environmental uncertainty and uncertainty due to modeling and analysis errors are typically modeled using worst-case margins, whereas uncertainty in process is generally treated statistically.

Taxonomy of Process Variations

As process geometries continue to shrink, the ability to control critical device parameters is becoming increasingly difficult and significant variations in device length, doping concentrations and oxide thicknesses have resulted [9]. These process variations pose a significant problem for timing yield prediction and require that static timing analysis models the circuit delay not as a deterministic value, but as a random variable.

Process variations can be either systematic or random.

  • Systematic variation: Systematic variations are deterministic in nature and are caused by the structure of a particular gate and its topological environment. The systematic variations are the component of variation that can be attributed to a layout or manufacturing equipment related effects. They generally show spatial correlation behavior.

  • Random variation: Random or non-systematic variations are unpredictable in nature and include random variations in the device length, discrete doping fluctuations and oxide thickness variations. Random variations cannot be attributed to a specific repeatable governing principle. The radius of this variation is comparable to the sizes of individual devices, so each device can vary independently.

    Process variations can be classified as follows:

  • Inter-die variation or die-to-die: Inter-chip variations are variations that occur from one die to the next, meaning that the same device on a chip has different features among different dies of one wafer, from wafer to wafer and from wafer lot to wafer lot. Die-to-die variations have a variation radius larger than the die size, including within-wafer, wafer-to-wafer, lot-to-lot and fab-to-fab variations [12].

  • Intra-die or within-die variation: Intra-die variations are the variations in device features that are present within a single chip, meaning that a device feature varies between different locations on the same die. Intra-chip variations exhibit spatial correlations and structural correlations.


  • Front-end variation: Front-end variations mainly refer to the variations present at the transistor level. The primary components of the front end variations entail transistor gate length and gate width, gate oxide thickness, and doping related variations. These physical variations cause changes in the electrical characteristics of the transistors which eventually lead to the variability in the circuit performance.

  • Back-end variation: Back-end variations refer to the variations on various levels of interconnecting metal and dielectric layers used to connect numerous devices to form the required logic gates.
In practice, device features vary among the devices on a chip and the likelihood that all devices have a worst-case feature is extremely small. With increasing awareness of process variation, a number of techniques have been developed which model random delay variations and perform STA. These can be classified into full-chip analysis and path-based analysis approaches.


Full Chip Analysis

Full-chip analysis models the delay of a circuit as a random variable and endeavors to compute its probability distribution. The proposed methods are heuristic in nature and have a very high worst-case computational complexity. They are also based on very simple delay models, where the dependence of gate delay on slope variation at the input of the gate and on load variation at the output of the gate is not modeled. When run time and accuracy are considered, full-chip statistical STA is not yet practical for industrial designs.


Path Based STA


Path based STA provides statistical information on a path-by-path basis. It accounts for intra-die process variations and hence eliminates the pessimism in deterministic timing analysis, based on case files. It is a more accurate measure of which paths are critical under process variability, allowing more correct optimization of the circuit. This approach does not include the load dependence of the gate delay due to variability of fan out gates and does not address spatial correlations of intra-die variability.

To compute the intra-die path delay component of process variability, first the sensitivity of gate delay, output slope and input load with respect to slope, output load and device length are computed. Finally, when considering sequential circuits, the delay variation in the buffered clock tree must be considered.

In general, the fully correlated assumptions will under-estimate the variation in the arrival times at the leaf nodes of the clock tree which will tend to overestimate circuit performance.


References

[1] http://www.ecs.umass.edu/ece/vspgroup/burleson/courses/558/558%20L01.pdf
[2] David Blaauw, Kaviraj Chopra, Ashish Srivastava and Lou Scheffer, "Statistical Timing Analysis: From basic principles to state-of-the-art." Transactions on Computer-Aided Design of Integrated Circuits and Systems (T-CAD), invited review article, to appear.
[3] Andrew B. Kahng, Bao Liu and Xu Xu, "Statistical Timing Analysis in the Presence of Signal-Integrity Effects," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 22, no.10, Oct. 2007.
[4] http://eetimes.com/news/design/showArticle.jhtml?articleID=163703301
[5] Jinjun Xiong, Vladimir Zolotov, Natesan Venkateswaran and Chandu Visweswariah, "Criticality Computation in Parameterized Statistical Timing," DAC 2006: 63-68.
[6] http://www.cdnusers.org/Interviewsstastratosphere/tabid/418/Default.aspx
[7] http://www.edadesignline.com/showArticle.jhtml;jsessionid=1ISIZARO0KMGMQSNDLOSKH0CJUNN2J
[8] A. Nardi, E. Tuncer, S. Naidu, A. Antonau, S. Gradinaru, T.Lin and J. Song, "Use of Statistical timing Analysis on Real Designs" Proceedings of the IEEE Design, Automation & Test in Europe Conference & Exhibition, pp. 1-6, April 2007.
[9] Agarwal, A. Blaauw, D. Zolotov, V. Sundareswaran, S. Min Zhao Gala, K. and Panda, R., "Statistically Delay computation considering spatial correlations," Proceedings of the ASP-DAC 2003, pp.271-276, Jan 2003.
[10] Aseem Agarwal, David Blaauw and Vladimir Zolotov, "Statistical Timing Analysis for Intra-Die process Variations with spatial correlations" IEEE Transactions on Computer-Aided Design, pp. 900-907, Nov 2003.
[11] Aseem Agarwal, David Blaauw and Vladimir Zolotov, "Statistical Clock Skew Analysis Considering Intra-Die Process Variations," IEEE Transactions on Computer-Aided Design, vol. 23, no. 8, pp. 1231-1242, Aug, 2004.
[12] Ayhan Mutlu, Kelvin J. Le, Mustafa Celik, Dar-sun Tsien, Garry Shyu, and Long-Ching Yeh, "An Exploratory Study on Statistical Timing Analysis and Parametric Yield Optimization," Proceedings of the 8th International Symposium on Quality Electronic Design, pp. 677-684, 2007.
[13] Robert B.Hitchcock, Sr, Gordon L. Smith, David D. Cheng, "Timing Analysis of Computer Hardware," IBM Journal, vol. 26, no. 1, Jan 1981.

The link below was contributed by Rajneesh. Thanks, Raj.
[14] "Investigation of typical 0.13 μm CMOS technology timing effects in a complex digital system on-chip", www.diva-portal.org/diva/getDocument?urn_nbn_se_liu_diva-2118-1__fulltext.pdf