Published on SemiWiki (Article Link)
One task that is not very exciting but is critical is that of library quality assurance. Many design groups have created their own procedures, often having been burned in the past, to ensure that the libraries that they use are good. Failure to do so has resulted in:
- example 1: just before tapeout it was discovered that the layout and the LEF for a cell did not match. It took days to track this down
- example 2: the timing files did not match the netlist causing post-layout simulation to fail
- example 3: after weeks of iterating to try and achieve timing closure it was found that there was a constraint error in the library with setup+hold<0
These types of errors are at one level almost trivial, but the consequences can be severe. ICScape has a tool, Qualib, to address this. As you might guess from its name, Qualib is a library quality inspection tool. It is a comprehensive platform for qualifying libraries and IP, with advanced analysis features for better quality. Qualib can be used by design groups as a sort of incoming inspection on 3rd party libraries and IP, and it can also be used by the creators of libraries and IP to perform an outgoing inspection, ensuring that they are shipping good product.
Qualib performs a number of important checks:
- Cell Presence Check: based on a cell list ensure that all views of all cells are present and that there are no additional cells or views
- LEF vs GDS Check: ensure that the reduced cell view in the LEF matches the actual layout: pin name, pin shape, boundary, obstructions (layers should either be pins or obstructed)
- Timing versus Verilog Check: ensure that the timing in the .lib matches the Verilog: pin name and direction, function, timing arcs
- LEF Check: make sure that the LEF is consistent: cell properties, pin properties, DRC issues (such as off-grid), routability issues (unreachable pins)
- GDS Check: label errors, tag errors, DRC issues (layout out of boundary etc)
- Timing Check: ensure timing constraint consistency, setup+hold>0, timing arc presence, power arc presence, condition consistency, timing table monotonicity (if the load doubles the cell should get slower not faster)
- For transistor level designs there are also equivalent checks that the circuit description language (CDL) matches the LEF, timing and Verilog
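Two of the timing checks in the list above are simple enough to sketch in a few lines. The following is a minimal illustration only, not Qualib's implementation: the function names, the dictionary layout and the cell names are all hypothetical, and a real check would read the values from the .lib file.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: (cell, pin) -> (setup_ns, hold_ns), as if extracted from a .lib
Constraints = Dict[Tuple[str, str], Tuple[float, float]]


def check_setup_hold(constraints: Constraints) -> List[str]:
    """Return one message per (cell, pin) whose setup + hold is not positive."""
    errors = []
    for (cell, pin), (setup, hold) in sorted(constraints.items()):
        if setup + hold <= 0:
            errors.append(f"{cell}/{pin}: setup+hold = {setup + hold:.3f} ns (must be > 0)")
    return errors


def is_monotonic_in_load(delays: List[float]) -> bool:
    """True if delay never decreases as the output load index increases."""
    return all(a <= b for a, b in zip(delays, delays[1:]))


# Made-up example data: the second flop has a negative setup+hold window.
constraints = {
    ("DFFX1", "D"): (0.080, -0.020),
    ("DFFX2", "D"): (0.050, -0.060),
}
setup_hold_errors = check_setup_hold(constraints)

# One row of a delay table, ordered by increasing load (made-up values).
good_row = [0.10, 0.14, 0.22, 0.39]
bad_row = [0.10, 0.14, 0.12, 0.39]  # gets faster at a higher load: suspicious
```

A negative hold value on its own is legal; it is the sum with setup that has to stay positive, which is why the check looks at the pair rather than at each number.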
The flow is straightforward, selecting the rules, running the checks and then getting reports in either html or text formats. There is an interactive environment for setting up the checks and examining issues.
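As a rough illustration of that flow, here is a sketch of the Cell Presence Check producing a plain-text report. It is not how Qualib works internally: the `Library` representation, the function name and the cell and view names are assumptions made for the example.

```python
from typing import Dict, List, Set

# Hypothetical representation: cell name -> set of view names found on disk
Library = Dict[str, Set[str]]


def cell_presence(lib: Library, expected_cells: Set[str], expected_views: Set[str]) -> List[str]:
    """Compare the library against a cell list; report missing and extra cells and views."""
    present = set(lib)
    issues = []
    for cell in sorted(expected_cells - present):
        issues.append(f"missing cell {cell}")
    for cell in sorted(present - expected_cells):
        issues.append(f"unexpected cell {cell}")
    for cell in sorted(expected_cells & present):
        for view in sorted(expected_views - lib[cell]):
            issues.append(f"{cell}: missing view {view}")
        for view in sorted(lib[cell] - expected_views):
            issues.append(f"{cell}: unexpected view {view}")
    return issues


# Made-up library contents and cell list.
lib = {
    "INVX1": {"lef", "gds", "lib"},
    "NAND2X1": {"lef", "gds"},
    "SPARE": {"lef"},
}
issues = cell_presence(lib, {"INVX1", "NAND2X1", "BUFX2"}, {"lef", "gds", "lib"})

# Text report, one issue per line.
report = "\n".join(issues)
print(report)
```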
The benefits of Qualib are not so much that errors are found that would otherwise be missed. Modern design practice almost guarantees that the problems will eventually come to light. But what is important is to find problems that will be tapeout show-stoppers earlier in the design cycle, with the reduced risk of major panic just before tapeout. By having “known good” libraries early in the design cycle there is a reduced risk of missing the tapeout schedule when errors only get discovered during final verification. This is another example of what has become almost universally known as “shift left”, moving the discovery and fixing of problems earlier in the design cycle so that the design cycle is shortened and issues are discovered when there is still time to fix them without slipping the tapeout.
Published on SemiWiki (Article Link)
- the clocks can consume 30% or more of the power of the whole chip, so minimizing the number of buffers inserted is critical to keeping power under control
- the clock insertion delay and clock skew have a major impact on timing. If a flop on the early side of the skew window drives a flop on the late side, or vice versa, it can consume a large part of the setup/hold margin and so affect the maximum clock frequency that the chip will work at
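The effect of skew on margin can be made concrete with the usual single-cycle slack arithmetic. This is a simplified sketch under stated assumptions: an ideal flop-to-flop path, one clock, and skew defined as capture latency minus launch latency. All the numbers are invented for illustration.

```python
def setup_slack(period, launch_latency, capture_latency, clk_to_q, logic_max, setup):
    """Setup slack in ns. A late launch clock (negative skew) eats into the margin."""
    skew = capture_latency - launch_latency
    return period + skew - (clk_to_q + logic_max + setup)


def hold_slack(launch_latency, capture_latency, clk_to_q, logic_min, hold):
    """Hold slack in ns. A late capture clock (positive skew) eats into the margin."""
    skew = capture_latency - launch_latency
    return clk_to_q + logic_min - hold - skew


# 1 GHz clock, balanced tree: 0.15 ns of setup margin.
balanced = setup_slack(1.0, 0.5, 0.5, clk_to_q=0.10, logic_max=0.70, setup=0.05)

# Same path, but the launching flop sits 0.1 ns later in the skew window.
skewed = setup_slack(1.0, 0.6, 0.5, clk_to_q=0.10, logic_max=0.70, setup=0.05)

# Hold on a short path when the capturing flop is the late one.
hold_balanced = hold_slack(0.5, 0.5, clk_to_q=0.10, logic_min=0.05, hold=0.03)
hold_skewed = hold_slack(0.5, 0.6, clk_to_q=0.10, logic_min=0.05, hold=0.03)

print(f"setup slack: {balanced:.2f} -> {skewed:.2f} ns")
print(f"hold slack:  {hold_balanced:.2f} -> {hold_skewed:.2f} ns")
```

In this example 0.1 ns of skew removes two thirds of the setup margin, which is the kind of loss that ends up limiting the clock frequency.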
The clock tree is actually constructed during physical design, in the clock-tree synthesis (CTS) phase. This is driven by constraints provided by the design team, and so a large part of producing a good clock tree is creating good constraints.
An additional issue is that increasingly SoCs are built out of blocks of IP assembled together. Typically the IP blocks are designed by a “front-end” design team, often overseas, and the physical design and assembly is done by a “back-end” team at the headquarters.
But this leads to another problem. The front-end designers have to come up with good constraints, plus avoid producing inherently unbalanced logic that will be difficult to clock. However they don’t think like back-end designers and don’t understand the physical CTS process well.
Meanwhile the back-end team doesn’t understand the clock structure well, and by that stage in the design process has little time for interaction. They will typically run with whatever the front-end teams gave them and do their best to close timing with what they have. But it is frustrating and may be impossible to close timing with a suboptimal clock tree.
ICScape has a tool, ClockExplorer, that addresses these problems. It provides front-end designers with feedback on the quality of the clock tree to find errors or suboptimal design. Structure and constraint checking can also evaluate clock quality, and help front-end and back-end designers to identify design problems that should be fixed early. It then allows the front-end designers to communicate this information to the back-end designers and gives them similar feedback. It can also be used after CTS to do a more in-depth analysis taking the physical information into account. Of course at this point it can display a layout view, showing where the actual clock paths run on the physical chip. For each problem, ClockExplorer can identify the problem, detail what issue it will cause and explain what needs to be changed to fix it. In this way it allows less experienced designers to be effective and avoid creating problems that will only show up later.
Note that ClockExplorer does not create the actual clock tree; that is still left to CTS. ClockExplorer is a tool that allows front-end and back-end designers together to create good clock constraints, which in turn will lead to better clocks, lower power, and a faster timing closure process. In short, better CTS QoR.
ClockExplorer allows designers to look at a schematic of the clock tree. Since all the datapath elements are suppressed, it can handle extremely large designs very fast. For front-end designers it produces a timing dependency report and reports suboptimal structures, missing constraints and so on. It can automatically identify false paths or unnecessary balancing, and so minimize the number of buffers that will need to be inserted. The clock tree can be displayed by level or by delay.
As an example of its use on a 28nm design with 600K instances it reduced the clock tree buffer count by 40%, the hold time total negative slack (TNS) by 80% and so on. See the table below.
In summary, ClockExplorer is a tool offering structure and constraint checking, constraint optimization, and clock tree debugging.
More details on ClockExplorer are available on the ICScape website here.
Published on SemiWiki (Article Link)
What is Skipper? Well, it seems it’s a penguin in the movie Madagascar. And one of Barbie’s sisters. Who knew? But for SemiWiki readers it’s an integrated chip finishing platform from ICScape. Skipper can read in a full-chip layout extremely fast, examine it and manipulate it in various ways, and write it out again.
Skipper solves a number of different problems, both before tapeout and when debugging silicon exhibiting problems:
Published on SemiWiki (Blog Link)
Interview with Jason Xing, Ph.D., CEO & President of ICScape Inc.
My EDA career started in the mid-90s when I started working on my PhD thesis at the University of Illinois in Urbana-Champaign. My thesis topic was on parallel algorithms for standard cell based placement. After graduation in 1997, I joined Sun Labs doing research on new physical design methodologies using concurrent logical and physical design. At that time, physical synthesis was becoming a critical need for high performance VLSI designs.
How did you end up at ICScape? How do you feel about the evolution of your role with ICScape?
After several years of research at Sun Labs, in 2001 I joined Sun’s internal physical design development team to lead the geometrical database design and router development, where I met Dr. Steve Yang. We talked often about physical design issues and EDA tool limitations. We decided to start a company to develop effective tools for physical design. In 2004, I quit Sun and started working on setting up ICScape Inc. In the early years of ICScape, I was the CTO and VP of Engineering in charge of the product architecture and development. After the products TimingExplorer™ and ClockExplorer™ were developed and achieved good market traction, the board of directors asked me to take on the role of CEO and run ICScape. I saw this as a great opportunity and a challenge. It has opened a new chapter in my career.
What are the specific design challenges your customers are facing?
For large SoC designs, closing timing takes too long and too many iterations because the timing sign-off and implementation tools use different timing engines, creating a major correlation issue. Timing closure typically involves up to hundreds of corners and modes, and requires setup, hold, max. transition, and max. capacitance violations to be addressed. In today’s designs, thousands of timing violations are found by the sign-off STA (static timing analysis) engine. Fixing them using the STA tool’s timing engine or with the users’ custom scripts means that the placement and routing constraints and requirements are not taken into account at all. This is the reason for too many iterations. On the other hand, it is difficult for current P&R tools to address timing closure because they can handle only a few modes and corners at a time. In addition, their lack of timing correlation with sign-off STA is a major hurdle to closure.
What are your plans for DAC this year? What is your goal for DAC?
We will continue to promote our SoC design closure products, which include our flagship product TimingExplorer. This tool performs placement and routing aware timing ECOs, and is capable of handling all multi-corner, multi-mode (MCMM) scenarios together. Since the introduction of the company and its products at DAC last year, some of our products have received a high level of interest from potential customers. We have closed several high profile accounts and are in active evaluation with other companies. We want to continue the momentum and increase the customer base.
How does your company help with your customers’ design challenges?
Timing closure is a major issue for customers. TimingExplorer fully addresses the two major limitations of current tools and methods: 1) lack of timing correlation between STA and P&R tools and 2) an inability to simultaneously handle all MCMM timing scenarios. This is done by directly mapping timing graphs from the sign-off STA engine on to the built-in timing engine and leveraging a built-in P&R engine, capable of simultaneously handling all MCMM timing scenarios to generate ECO directives for the sign-off STA as well as the user’s P&R engine.
The results are better and faster timing closure using typically 2-4 iterations and cutting ECO time by 50%.
What are the tool flows your customers are using?
The major P&R and timing sign-off flows. P&R flows include ICC and SoC Encounter EDI. Timing sign-off tools include PrimeTime and ETS.
ICScape is currently aiding customers in timing closure and in the creation of clock tree synthesis constraints. What adjacent areas do you think might make sense for ICScape to enter in the future?
We could and would like to do more in chip finishing and low power physical design solutions including low power clock trees, and dynamic and leakage power reduction.
To visit with ICScape at DAC, click here.
Published on SemiWiki (Blog Link)
Published on 05-09-2013 07:30 PM
As an applications engineer for over 15 years supporting physical design tools that enable implementation closure, I have seen the complexity of timing closure grow continuously from one process node to the next. At 28nm, the number of scenarios for timing sign-off has grown far beyond the number that a Place & Route tool can handle. Most designers turned to Static Timing Analysis (STA) tools for a solution. But the STA tools have two limitations:
- STA tools usually run in a scenario-by-scenario fashion. For STA tools to generate ECOs that close timing for all scenarios, one would need to run multiple sessions at the same time, one session for each scenario. This requires the STA tools to be run simultaneously on multiple servers, with each server needing a license.
- Current STA tools do not have or use the physical information. As a result, many ECOs (Engineering Change Orders) generated by STA tools may end up not being implementable in the physical world due to placement and/or routing congestion.
These limitations called for a new solution that can:
- Simultaneously handle a large number of scenarios without requiring a large number of licenses or server machines
- Understand the impact placement and routing have on those scenarios and hence implement ECO directives accordingly
These requirements are critical to achieving timing closure effectively and efficiently. Without these capabilities, designers are not only forced into a process that takes too many iterations and too long to reach closure, but often also have to accept lower chip performance for the sake of time to market.
In a recent customer engagement, I had to help the customer close timing on a design that was highly congested in both placement and routing. In addition, the design required timing closure on more than 100 sign-off scenarios. It would have taken multiple engineers and many weeks to close timing using an STA based methodology.
A key point to note is that not all routing congested areas are also placement congested, such as the channels between the macros at the top level of an SoC design. Hence, to effectively address timing violations, the tools and flow must understand both placement and routing congestion. Otherwise, fixing hold violations might cause new setup violations because the ECO routes are detoured. This is the primary reason why an STA based flow that is not placement-aware and, most importantly, not routing-aware takes many iterations to close timing.
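The interaction between hold fixes and setup margin across scenarios can be sketched as a feasibility window. This is a simplified illustration, not the algorithm of any particular tool: it assumes a single path whose setup and hold slack are known in every scenario, and models the per-corner speed of the inserted delay with a hypothetical `derate` factor.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Scenario:
    name: str
    setup_slack: float  # ns, positive means margin is available
    hold_slack: float   # ns, negative means a hold violation
    derate: float       # inserted delay in this scenario = nominal delay * derate


def hold_fix_window(scenarios: List[Scenario]) -> Optional[Tuple[float, float]]:
    """Return (min, max) nominal delay that fixes hold everywhere without
    breaking setup anywhere, or None if no such delay exists."""
    needed = max(max(0.0, -s.hold_slack) / s.derate for s in scenarios)
    allowed = min(s.setup_slack / s.derate for s in scenarios)
    return (needed, allowed) if needed <= allowed else None


# Made-up numbers: hold fails in the fast corner, setup is tight in the slow one.
fast = Scenario("func_fast", setup_slack=0.30, hold_slack=-0.04, derate=0.8)
slow = Scenario("func_slow", setup_slack=0.12, hold_slack=0.02, derate=1.5)
window = hold_fix_window([fast, slow])

# With less setup margin in the slow corner the window closes entirely.
tight = Scenario("func_slow", setup_slack=0.06, hold_slack=0.02, derate=1.5)
no_window = hold_fix_window([fast, tight])
```

In the first case the window is roughly 0.05 ns to 0.08 ns. A detoured ECO route adds wire delay on top of the planned buffer delay, so a fix that looked safe inside such a narrow window can land outside it, which is how a hold fix turns into a new setup violation.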
We identified the congestion issues and used a placement and routing aware timing closure solution that could simultaneously handle all MMMC scenarios. Results: quicker timing closure with far fewer iterations!
At 20nm, a timing closure solution must be routing aware, because the additional requirements of double patterning and Vt implanting rules have a direct impact on timing and hence closure.
I welcome your comments and your experiences with timing closure.
ICScape Inc. (Santa Clara, California) develops and markets solutions that accelerate SoC design closure. Its flagship products, ClockExplorer and TimingExplorer, were released to the market in 2006 and 2009 respectively. They have been used successfully in over 100 taped-out SoC designs. Other products from ICScape include PowerExplorer, RCExplorer and LibExplorer. It offers sales and technical support for its products in the US, China, Japan, South Korea and Taiwan.
Published on SemiWiki (Article Link)
Given today’s design requirements with respect to low power, there is increasing focus on the contribution to total power made by a design’s clock trees. The design decisions made by the front-end team to achieve high performance without wasting power must be conveyed to the back-end team. This hand-off must be accurate and complete. A key component of that hand-off is the clock tree synthesis (CTS) constraints. Let’s look at what can go wrong and how to avoid these pitfalls.
The clock trees in chips ten years ago were fairly simple and most chips had only a handful of clock trees. In today’s technologies this has exploded into a forest of clock trees. Sheer volume alone points to the need for automation. But even more daunting are the complexities of today’s clock trees. Clock gating has been in use for a while now to aid in reducing power. Included IP blocks will have their own clock requirements. There are generated clocks, overlapping clocks, clock dividers, and on and on. All of this information needs to be packaged by the front-end team into the SDC file and clock specification (clock constraint) file for use by the back-end team.
ICScape’s ClockExplorer tool was developed to provide analysis tools to help both teams understand the entire clock graph being developed. It crosschecks the equivalence of constraints generated by the front-end and back-end teams. Both teams can use ClockExplorer to analyze and sign off the netlist and clock constraints. ClockExplorer’s platform checks the clock structure and aids in the generation of constraints for a CTS tool, including CTS sequencing for complex situations with multiple SDC files and overlapping clock trees. If these tasks are done manually by either team, mistakes are much more likely to occur.
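To give a feel for what crosschecking two sets of constraints involves, here is a minimal sketch. It does not reflect ClockExplorer's actual checks or data model: the `ClockDef` record, the `diff_clocks` function and the clock names are hypothetical, and a real tool would parse the clock definitions out of the SDC files and compare far more than these three fields.

```python
from typing import Dict, List, NamedTuple, Optional


class ClockDef(NamedTuple):
    source: str                   # port or pin the clock is defined on
    period_ns: float
    master: Optional[str] = None  # set for generated clocks


def diff_clocks(front: Dict[str, ClockDef], back: Dict[str, ClockDef]) -> List[str]:
    """List clocks that are missing on one side or defined differently."""
    issues = []
    for name in sorted(set(front) | set(back)):
        if name not in back:
            issues.append(f"{name}: missing from back-end constraints")
        elif name not in front:
            issues.append(f"{name}: missing from front-end constraints")
        elif front[name] != back[name]:
            issues.append(f"{name}: definitions differ")
    return issues


# Made-up constraint sets: the back-end copy lost a generated clock
# and has a different period for the system clock.
front_end = {
    "sys_clk": ClockDef("clk_in", 2.0),
    "div2_clk": ClockDef("u_div/q", 4.0, master="sys_clk"),
    "jtag_clk": ClockDef("tck", 20.0),
}
back_end = {
    "sys_clk": ClockDef("clk_in", 2.5),
    "jtag_clk": ClockDef("tck", 20.0),
}
issues = diff_clocks(front_end, back_end)
print("\n".join(issues))
```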
Beyond the important capabilities of simply generating and checking the constraints, ClockExplorer also optimizes the clock topology to reduce latency. As a visual aid, ClockExplorer also generates a clock schematic, greatly assisting in reviews and discussions between the teams. For a more detailed look at all the analysis features of ClockExplorer, including more details on its SDC constraint checking features, see the white paper.
By using tools such as ICScape’s ClockExplorer, I think that front-end and back-end design teams will be able to cut design errors caused by misunderstanding or incorrectly generating clock tree synthesis constraints. They will have a common view of the clock system, consistent checking, and automated generation that handles the key aspects of the constraint files. This should make a difficult task much easier and more reliable. Where discrepancies do crop up, the automatically generated clock schematics should make debugging and communication between the teams much easier.