# ATE Test Time Reduction Using Asynchronous Clock Period

Praveen Venkataramani\* and Vishwani D. Agrawal<sup>†</sup>
Department of Electrical and Computer Engineering
Auburn University, Auburn, AL 36849

\*Email: pzv0006@tigermail.auburn.edu †Email: vagrawal@eng.auburn.edu

Abstract: A conventional wafer sort test on automatic test equipment (ATE) uses a fixed synchronous clock period. Typical test cycles may produce high signal activity and, to keep the power dissipation under control, a relatively slow test clock is used. This results in long test times, especially for large scan-based circuits. Observing that each test clock cycle may consume a different amount of power, we propose an asynchronous clock test methodology to reduce the test time. The smallest customized clock periods for test cycles or sets of cycles are computed based on power and critical path constraints. A theoretical analysis shows that the total energy consumed by the entire test is invariant and that the test time depends on the rate at which it is dissipated during test. An asynchronous clock test dissipates this energy at the maximum allowable rate, while the conventional synchronous clock test dissipates it at a lower average rate. The asynchronous clock test method is first implemented in simulation using several ISCAS'89 benchmark circuits. These results show test time reductions of up to 47%. To establish the test programming feasibility of the new methodology, the Advantest T2000GS ATE at the Auburn University Test Lab was used. A test time reduction of 38% is demonstrated for the scan test of a circuit. The paper ends with an investigation showing that, for a circuit under test, given its power budget and a test, there exists a supply voltage that minimizes the test time. An analysis determines whether the shortest test must use a synchronous or an asynchronous clock.

### I. Introduction

Most digital VLSI circuits today are tested using the scan-based method [8]. This reduces the complexity of testing sequential circuits to that of testing combinational circuits. In the scan method, flip-flops are loaded and unloaded through a shift register mechanism for testing faults in the combinational logic. In the era of low power devices that contain more than a billion gates, long test times have become a critical concern. Custom system-on-chip (SoC) designs containing microprocessors, digital signal processors and memories use a large number of clock cycles during scan-based tests. This directly impacts the final cost of the chip [8].

While the large size of a device is one reason for long test times, the main limiting factor for test speed is the power dissipated during test due to signal transitions in the circuit. Test power dissipation is known to be $2\times$ the functional power dissipation in CPUs [22] and $4\times$ the functional power dissipation in GPUs [35]. If the power dissipated during test goes beyond the rated power of the device, then it is possible for a good device to fail or even be damaged. Several approaches have been investigated and implemented to reduce the total power dissipation of the circuit under test (CUT); however, these methods in turn lengthen the test time [21]. Hence, in the current semiconductor industry, where devices continue to get denser and smaller, both test power and test time must be addressed together.

### II. PRIOR WORK

Earlier approaches to reduce test time used pattern overlapping [12], [14] and reusable scan chains [18], which exploit similar patterns to eliminate unnecessary scan chain operations and shorten the scan shift process. The reduction in test time depends on the availability of such patterns. Scan chain partitioning also greatly reduces test time but increases the number of scan input pins. Some authors [7], [9] propose methods that overcome this problem while achieving a test time reduction similar to that of multiple scan chains. Recent papers [31], [34] find an optimum voltage for scan test of stuck-at faults that improves test time in power constrained circuits without violating any timing constraints at reduced voltage, while considering the power dissipated in both shift and capture cycles. Test time reduction for multicore SoC designs requires power-constrained scheduling of tests [13], [19]. Recent proposals optimize SoC test schedules by selecting supply voltage and clock frequency [27]-[29].

Shanmugasundaram and Agrawal [25], [26] proposed a technique to reduce the test time in power constrained built-in self-test (BIST) circuits. They implement an activity monitor that increases the clock frequency if the monitor records low activity in the chain and decreases the frequency otherwise. The method achieves a 20-50% reduction in test time in BIST circuits with little area overhead. Hashempour et al. [17] combine BIST and ATE in an effort to reduce test time on the ATE. The methodology identifies all "easy-to-detect" faults using BIST and then uses the ATE to identify the "hard-to-detect" faults.

In this work we aim to reduce the test time for scan-based circuits during wafer sort on an ATE by varying the clock period cycle by cycle, based on the energy dissipated during each cycle. The work is first investigated mathematically to obtain the dependencies that enable us to reach this goal. The proposed method is then verified first through simulation and then experimentally on an Advantest T2000GS ATE. Although the research as it appears here has never been presented in its entirety, parts have been displayed as posters [30], [32] or discussed at technical forums [5], [6], [31], [33], [34]. In this paper we demonstrate the feasibility of a new method of testing through simulation and experiment.

### III. TEST TIME REDUCTION

During stuck-at fault testing a CUT is synchronously tested at a fixed clock rate, which is determined by the amount of power dissipated in the CUT during test and the structural delay of the CUT. The time taken to test the CUT is proportional to the product of the number of clock cycles and the test clock period. In our work the dynamic energy consumed during test, which is a function of the signal transitions (logic activity and glitches) and short circuit power, is assumed to dominate the total power dissipation during test. However, the following result always holds.

**Theorem 1.** For power constrained testing where the peak power during any clock cycle must not exceed  $P_{MAX}$ , the test time (TT) has a lower bound,

$$\frac{E_{TOTAL}}{P_{MAX}} \le TT = \frac{E_{TOTAL}}{P_{AVG}} \tag{1}$$

where  $E_{TOTAL}$  is the total energy and  $P_{AVG}$  is the average power consumed by the test.

**Proof:** Consider a test that runs for N clock cycles and for cycle i, we define:

 $T_i$  as the period of the clock cycle,

 $E_{di}$  as dynamic energy consumed during the cycle,

 $P_{li}$  as leakage power dissipated during the cycle, and

 $E_i$  as total energy consumed during the cycle.

Then, test time and total energy are given by,

$$TT = \sum_{i=1}^{N} T_i \tag{2}$$

$$E_{TOTAL} = \sum_{i=1}^{N} E_i = \sum_{i=1}^{N} (E_{di} + T_i \times P_{li}) \tag{3}$$

In particular, for a synchronous clock test,  $T_i = T_{test}$ , i.e., all clock cycles have the same period  $T_{test}$ ,

$$TT = N \times T_{test} \tag{4}$$

The equality in equation 1 follows from the standard definitions of energy and power.  $P_{AVG}$  is the rate of energy usage averaged over the test duration TT. Therefore, total energy is  $E_{TOTAL} = TT \times P_{AVG}$ .

To prove the lower bound, we examine the power constraint each clock cycle must satisfy. We assume clock cycles can have different periods; a conventional synchronous clock would be a special case. Thus,

$$\frac{E_{di}}{T_i} + P_{li} \le P_{MAX}, \ \forall \ 1 \le i \le N \tag{5}$$

or

$$T_i \ge \frac{E_{di} + T_i \times P_{li}}{P_{MAX}}, \ \forall \ 1 \le i \le N \tag{6}$$

Hence, from equations 2 and 3,

$$TT \ge \frac{1}{P_{MAX}} \sum_{i=1}^{N} (E_{di} + T_i \times P_{li}) = \frac{E_{TOTAL}}{P_{MAX}} \tag{7}$$

This proves the lower bound on test time in equation 1.  $\blacksquare$ 

Leakage power plays an interesting role. Notice that in inequality 6, $T_i$ appears on both sides. For a given $P_{MAX}$, as the clock period $T_i$ is increased to satisfy the power constraint, the right hand side also increases, though at a slower rate because $P_{li}$ is small. Solving inequality 5 for $T_i$ gives the minimum period for the ith clock cycle,

$$T_i = \frac{E_{di}}{P_{MAX} - P_{li}} \tag{8}$$

To determine  $T_i$  we must know dynamic energy  $E_{di}$  and leakage power  $P_{li}$ , both of which are functions of the input vector applied to the circuit in clock cycle i. For now, we neglect the leakage power and thus equation 8 will take a simpler form,

$$T_i = \frac{E_{di}}{P_{MAX}} \approx \frac{E_i}{P_{MAX}} \tag{9}$$
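As a rough numerical illustration of equations 8 and 9, the sketch below computes the minimum period of a single cycle with and without leakage. The function name and all numeric values are hypothetical and chosen only for illustration; they are not measurements from this work.

```python
def min_clock_period(e_dynamic_j, p_max_w, p_leak_w=0.0):
    """Smallest period (seconds) satisfying E_d/T + P_leak <= P_MAX (equation 8).

    With p_leak_w = 0 this reduces to equation 9.
    """
    if p_leak_w >= p_max_w:
        raise ValueError("leakage power must be below the power budget")
    return e_dynamic_j / (p_max_w - p_leak_w)


# Hypothetical values, for illustration only.
E_D = 5.0e-12      # dynamic energy in one cycle (J)
P_MAX = 1.0e-3     # power budget (W)
P_LEAK = 0.05e-3   # leakage power in that cycle (W)

t_no_leak = min_clock_period(E_D, P_MAX)        # equation 9
t_leak = min_clock_period(E_D, P_MAX, P_LEAK)   # equation 8

print(f"period ignoring leakage: {t_no_leak * 1e9:.3f} ns")
print(f"period with leakage:     {t_leak * 1e9:.3f} ns")
```

With leakage included, the period is slightly longer, and the cycle power at that period sits exactly at the budget.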

For a given set of test patterns generated by an automatic test pattern generator (ATPG), the total energy consumed during test remains unchanged irrespective of how tests are applied. The total test time is dependent only upon the average power consumed. In order to reduce the test time, it is required that the test be run with the smallest clock period possible while dissipating power less than the rated power. Since the minimum period is limited by the critical path delay of CUT, test time is dependent on both the rated power and the structural delay of the circuit. The two constraints that determine the minimum test clock period can be defined as follows,

- 1) Power Constraint: A test is power constrained if the minimum test clock period is limited by the maximum rated power for the circuit. We define this period as $T_{power} = E_{MAX(test)}/P_{MAX(rated)}$, where $P_{MAX(rated)}$ is the maximum power dissipated during functional operation or the rated maximum for the CUT and $E_{MAX(test)}$ is the maximum energy dissipated during any test cycle.
- 2) Structure Constraint: A test is structure constrained if the minimum test clock period is limited by the structural (critical path) delay of the CUT. We define the fastest clock as $f_{structure} = 1/T_{structure}$, where $T_{structure}$ is the structure constrained clock period.

Fig. 1. Example of a test using fixed clock period [5].

Based on the above definitions, the minimum test clock period would have to satisfy both power and structure constraints, i.e.,

$$T_{test} = \max\{ T_{structure}, T_{power} \} \tag{10}$$

In a power constrained test, the test clock period is  $T_{power} > T_{structure}$ , that is,

$$T_{test} = T_{power} = \frac{E_{MAX(test)}}{P_{MAX(rated)}} \tag{11}$$

Substituting equation (11) in equation (4), we get the total test time for a power constrained test as

$$TT_{min} = N \times \frac{E_{MAX(test)}}{P_{MAX(rated)}} \tag{12}$$

### IV. ASYNCHRONOUS CLOCK TEST

Equation (12) shows that, for a given rated power $P_{MAX(rated)}$, the total test time is a function of the maximum energy dissipated during test. At a constant supply voltage, the energy consumed during any cycle is a function of the number of signal transitions in the CUT caused by the applied pattern. It is observed that, for a given set of combinational ATPG patterns, the amount of energy dissipated during each clock cycle is not constant. This scenario is illustrated in Figure 1, which shows energy on the left y-axis and power on the right y-axis for eight clock periods of duration T on the x-axis. The energy during each clock cycle is shown by a gray circle and the power per cycle by an orange circle. The energy dissipated in a cycle depends on the circuit activity during that cycle, and the power during that cycle depends on the period T. The clock period T is based on the amount of power dissipation $P_{MAX}$ that the circuit can handle. Correspondingly, the maximum energy is $E_{MAX}$. When the period T is kept fixed during the entire test, the power dissipated during each cycle may not reach the maximum power. Therefore, the period of those cycles that dissipate less than the rated power $P_{MAX}$ can be squeezed. This is illustrated in Figure 2, where each clock period is adjusted such that the power (shown as a blue circle) during that cycle is closer to the rated power $P_{MAX}$. Since energy is independent of the clock period, the same amount of energy is now consumed in a shorter time interval. This is achieved by using an *asynchronous clock test*, where each cycle may not use exactly the same period as its neighboring cycle.

Fig. 2. Example of a test using varying clock period [5].

#### A. Optimum Test Clock Period

In an asynchronous clock test the amount of power dissipated during each cycle equals or is close to the rated power $P_{MAX(rated)}$. Assuming that every cycle dissipates the same amount of power $P_{MAX(rated)}$, from Theorem 1 we can say that during an asynchronous clock test the test energy $E_{TOTAL}$ is dissipated at a constant maximum average rate of $P_{MAX(rated)}$, thus achieving the lower bound on test time,

$$TT = \frac{E_{TOTAL}}{P_{MAX(rated)}} \tag{13}$$

Hence the test time for an N-cycle asynchronous clock test is,

$$TT = \sum_{i=1}^{N} \frac{E_i}{P_{MAX(rated)}} \tag{14}$$

where  $E_i$  is the energy dissipated in  $i^{th}$  test cycle. If  $T_{test(i)}$  is the test clock period for  $i^{th}$  cycle, then

$$TT = \sum_{i=1}^{N} T_{test(i)} \tag{15}$$

where,

$$T_{test(i)} = \frac{E_i}{P_{MAX(rated)}} \tag{16}$$

#### B. Optimum Test Time

Following equation (16), for a low energy vector where the CUT consumes arbitrarily low energy during test, it may seem possible to run the test at an arbitrarily fast frequency. This, however, is not true owing to the structural constraint of the device. When the CUT structure (critical path) constraint is such that $T_{structure} > T_{power}$, the lower bound on the test clock period in an asynchronous clock test must be,

$$T_{test(i)} = T_{structure} \tag{17}$$

Hence, we conclude that the clock period customized for  $i^{th}$  cycle should be,

$$T_{test(i)} = \max\left\{ T_{structure}, \frac{E_i}{P_{MAX(rated)}} \right\} \tag{18}$$

Equation (18) suggests that each test cycle during asynchronous clock test is either structure constrained where  $E_i \leq T_{structure} \times P_{MAX(rated)}$ , or power constrained where  $E_i > T_{structure} \times P_{MAX(rated)}$ .

Hence, all cycles that are structure constrained will use the same clock period $T_{structure}$, and all cycles that are power constrained will be adjusted asynchronously to dissipate power close to $P_{MAX(rated)}$. The lower bound on test time using an asynchronous clock test is approached when the maximum number of cycles consume low enough energy to be structure constrained; this value is greater than or equal to $N \times T_{structure}$. The upper bound is approached when the maximum number of cycles consume high energy, i.e., are power constrained; this value is less than or equal to the test time obtained using a constant synchronous clock period. Thus,

$$N \times T_{structure} \le TT_{asynch} \le N \times \frac{E_{MAX(test)}}{P_{MAX(rated)}} \tag{19}$$
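The per-cycle rule of equation (18) and the bounds of equation (19) can be sketched in a few lines. The function names and the per-cycle energies, power budget and critical path delay below are hypothetical values chosen for illustration, not data from this work.

```python
def synchronous_period(energies_j, p_max_w, t_structure_s):
    """Fixed period from equation (10), with T_power from equation (11)."""
    return max(t_structure_s, max(energies_j) / p_max_w)


def asynchronous_periods(energies_j, p_max_w, t_structure_s):
    """Per-cycle periods from equation (18)."""
    return [max(t_structure_s, e / p_max_w) for e in energies_j]


# Hypothetical per-cycle energies (J), power budget (W), critical path delay (s).
ENERGIES = [5.0e-12, 1.0e-12, 3.0e-12, 0.5e-12, 4.0e-12]
P_MAX = 1.0e-3
T_STRUCTURE = 2.0e-9

n_cycles = len(ENERGIES)
tt_sync = n_cycles * synchronous_period(ENERGIES, P_MAX, T_STRUCTURE)
tt_async = sum(asynchronous_periods(ENERGIES, P_MAX, T_STRUCTURE))
tt_lower = n_cycles * T_STRUCTURE

print(f"lower bound  : {tt_lower * 1e9:.1f} ns")
print(f"asynchronous : {tt_async * 1e9:.1f} ns")
print(f"synchronous  : {tt_sync * 1e9:.1f} ns")
```

In this toy case two of the five cycles are structure constrained and are clamped to the critical path delay, so the asynchronous test time falls strictly between the two bounds.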

### V. EXAMPLE

We examine the theory put forth in the previous section using an ISCAS'89 sequential benchmark circuit. For simplicity, we chose the s298 benchmark circuit, which contains 14 flip-flops, 3 primary inputs and 6 primary outputs. We synthesized the circuit using Mentor Graphics Leonardo Spectrum [3] with TSMC 180nm technology. The Leonardo Spectrum tool also provided the critical path delay through static timing analysis (STA) of the circuit. More accurate critical path delay information can be obtained after routing the circuit with inserted scan chains. Statistical static timing analysis (SSTA) can also be used to consider process variations during delay calculations [4], [15]. All flip-flops in the circuit were daisy chained for full scan using Mentor Graphics DFTAdvisor. Once the scan chain was inserted, a set of deterministic ATPG test patterns for stuck-at faults was generated using Mentor Graphics Tessent Fastscan [2]. A transistor level simulation was performed using Synopsys Nanosim [1] at the nominal voltage of 1.8V. The transistor level description of the netlist was generated using Mentor Graphics Design Architect and the Spice file was imported into Nanosim. Using Nanosim we measured the energy dissipated per cycle during the entire test. Based on the report obtained through transistor level simulation, we determined the test period for each cycle. For each cycle the test period is constrained both by structure, as given by STA, and by maximum rated power. The maximum rated power depends on the functional characteristics, physical design, packaging, etc., and is part of the specification of the circuit. In the absence of available data, for our analysis we measured the maximum power in functional mode through simulation of 1,000 random vectors, which was 1.23mW. Once the period for each cycle was obtained, the circuit was simulated again to calculate the power dissipated during each test cycle.

Fig. 3. Synchronous and asynchronous clock simulation of 450-cycle scan test of ISCAS'89 benchmark circuit s298. Synchronous test clock frequency is 240MHz and test time is $1.87\mu$s. Asynchronous clock test time is $1.31\mu$s.

Figure 3 shows the simulation results for the s298 benchmark circuit. The plot compares the test performed using synchronous (fixed) and asynchronous (varying) test clock periods. The x-axis shows time as the test was run and the y-axis shows the power dissipated during each test cycle. As observed from the figure, when a synchronous clock is used the power dissipated does not reach the maximum rated power in most cycles. Hence the test clock periods for cycles dissipating less power can be safely reduced until the cycle power is close to the rated power. This effect is seen in the simulation results using the asynchronous clock. When a particular cycle dissipates low power, its period is reduced such that the power for that cycle increases to a value closer to the rated power. However, if the period would become shorter than the critical path delay, the period is set to the critical path delay. Thus, we ensure that the power constrained period is minimum without violating any timing constraints. This limit on the minimum period forces the circuit to dissipate significantly less than the rated power in such cycles, hence the "dips" in the asynchronous clock plot.

In this example, the total test time with synchronous clock was $1.87\mu s$ and the test time with asynchronous clock was $\approx 1.31 \mu s$. This represents a reduction of 30%. Greater reduction is achievable if the average power of the entire test is significantly lower than the maximum power. Thermal analysis [36] and characterization of test power can be performed to determine the safe operation of testing by the proposed method. The test can then be modified appropriately.

TABLE I. Scan test time for ISCAS'89 circuits in TSMC 180nm technology.

| Circuit name | Total scan test clock cycles | Per-cycle peak power $P_{MAX(rated)}$ (W) | Max per-cycle energy $E_{MAX(test)}$ (pJ) | Total energy of test $E_{TOTAL}$ (nJ) | Synchronous clock frequency (MHz) | Synchronous clock test time ($\mu$s) | Asynchronous clock test time ($\mu$s) | Test time reduction (%) |
|---------|--------------|------------------|-----------------|--------------|-------------------|-----------|--------------------|-----------|
| s298    | 450          | 0.00123          | 5.12            | 1.611        | 240               | 1.87      | 1.31               | 30        |
| s298    | 540          | 0.00123          | 6.00            | 1.710        | 205               | 2.63      | 1.39               | 47.5      |
| s382    | 703          | 0.00290          | 7.47            | 3.828        | 388               | 1.81      | 1.32               | 27        |
| s713    | 701          | 0.00270          | 9.54            | 4.914        | 283               | 2.48      | 1.82               | 27        |
| s1423   | 6975         | 0.00450          | 33.33           | 189.15       | 135               | 51.5      | 42.06              | 18        |
| s1423   | 7724         | 0.00450          | 43.68           | 209.15       | 103               | 74.8      | 46.50              | 37        |
| s13207  | 41119        | 0.02130          | 163.84          | 5671.3       | 130               | 314.3     | 266.26             | 15        |
| s15850  | 101707       | 0.06780          | 423.75          | 26103        | 160               | 534.7     | 385.07             | 28        |
| s384584 | 224112       | 0.11060          | 582.10          | 133625       | 190               | 1393.7    | 1213               | 13        |

### VI. RESULTS

Table I shows the simulation results for several ISCAS'89 benchmark circuits using the procedure described in Section V. All circuits were synthesized using TSMC 180nm technology. The nominal supply voltage for this technology is 1.8V. For s298 and s1423, two different sets of test patterns were used for each circuit to observe the effect of test power on reduction in test time. This is discussed later in this section.

Column 2 of Table I shows the number of scan test clock cycles used for each circuit. It is determined by the number of flip-flops in the scan chain and the total number of vectors, along with one cycle per vector for capture. Since vectors were generated for stuck-at faults, only one capture cycle is used for response capture at the end of each test. The maximum rated power $(P_{MAX(rated)})$ shown in column 3 is normally given in the circuit datasheet. However, for these benchmark circuits we obtained a value by simulating the CUT in functional mode at its fastest frequency for 100 random vectors. In some cases the power value thus obtained might be close to the power calculated during test, but the benefit of employing an asynchronous clock to reduce test time can still be shown. Column 4 shows the maximum energy $(E_{MAX(test)})$ dissipated due to signal transitions in the clock cycle that consumes the most energy. Column 5 shows the total energy $(E_{TOTAL})$ consumed by the entire test. These were obtained by simulation as discussed in Section V.

Columns 6 and 7 give the test frequency and test time for synchronous clock test. The synchronous clock period $T_{power}$ is obtained from equation (11), using the data from columns 3 and 4. The test frequency in column 6 is $1/T_{power}$.

The total test time for synchronous clock in column 7 is calculated using equation (4). Column 8 shows the total test time taken when an asynchronous clock is used and the corresponding test time reduction over that of column 7 for synchronous clock is given in column 9.
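As a quick cross-check of how columns 6 and 7 follow from columns 2 to 4, the sketch below recomputes the synchronous clock frequency and test time for the first four rows of Table I using equations (11) and (4). The row values are copied from the table; the function and variable names are illustrative only.

```python
# Rows copied from Table I: (circuit, cycles N, P_MAX(rated) in W,
# E_MAX(test) in pJ, reported frequency in MHz, reported test time in us).
TABLE_ROWS = [
    ("s298", 450, 0.00123, 5.12, 240, 1.87),
    ("s298", 540, 0.00123, 6.00, 205, 2.63),
    ("s382", 703, 0.00290, 7.47, 388, 1.81),
    ("s713", 701, 0.00270, 9.54, 283, 2.48),
]


def synchronous_clock(cycles, p_max_w, e_max_pj):
    """Return (frequency in MHz, test time in us) for a power constrained test."""
    t_power_s = (e_max_pj * 1e-12) / p_max_w   # equation (11)
    freq_mhz = 1.0 / t_power_s / 1e6           # column 6
    test_time_us = cycles * t_power_s * 1e6    # equation (4), column 7
    return freq_mhz, test_time_us


computed = [synchronous_clock(n, p, e) for _, n, p, e, _, _ in TABLE_ROWS]

for (name, n, _, _, f_rep, tt_rep), (f, tt) in zip(TABLE_ROWS, computed):
    print(f"{name:>5} N={n:<4} f={f:7.2f} MHz (table {f_rep})  "
          f"TT={tt:.3f} us (table {tt_rep})")
```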

An interesting observation here is that the asynchronous to synchronous test time ratio for power constrained testing is the ratio of the average energy to the maximum energy per cycle. For example, consider s298 in the first row of Table I. The average energy per cycle is $E_{AVG} = 1.611 \text{nJ}/450 = 3.58 \text{pJ}$. The ratio $E_{AVG}/E_{MAX(test)} = 3.58/5.12 = 0.699$ is about the same as the test time ratio 1.31/1.87. In cases where a significant number of clock cycles are structure constrained, the test time ratio may move toward unity. If most cycles consume significantly less energy than a few cycles that consume very high energy, then it is possible to achieve a large reduction in test time. This is because, based on equation (17), all low energy cycles will be limited only by the critical path delay and only the cycles with high energy consumption will run at the slowest clock period. On the other hand, if the number of cycles consuming very high energy is significantly larger than the number of cycles consuming low energy, then the reduction in test time will be smaller. This effect was examined for two circuits of Table I, s298 and s1423. Using alternative sets of vectors, with one test pattern having high energy consuming cycles and the rest of the patterns consuming low energy, the test time reduction improved from 30% to 47.5% for s298 and from 18% to 37% for s1423.
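The energy-ratio observation can be checked against Table I with a short sketch. It assumes every cycle is power constrained, so the estimate $1 - E_{AVG}/E_{MAX(test)}$ is only an approximation of the reported reduction whenever some cycles are clamped to the critical path delay. Row values are copied from the table; the function name is illustrative.

```python
# Rows copied from Table I: (circuit, cycles N, E_MAX(test) in pJ,
# E_TOTAL in nJ, reported test time reduction in %).
ROWS = [
    ("s298", 450, 5.12, 1.611, 30.0),
    ("s298", 540, 6.00, 1.710, 47.5),
    ("s382", 703, 7.47, 3.828, 27.0),
    ("s713", 701, 9.54, 4.914, 27.0),
]


def predicted_reduction_percent(cycles, e_max_pj, e_total_nj):
    """Estimate 100 * (1 - E_AVG / E_MAX), assuming all cycles are power constrained."""
    e_avg_pj = e_total_nj * 1000.0 / cycles   # nJ -> pJ, averaged per cycle
    return 100.0 * (1.0 - e_avg_pj / e_max_pj)


predictions = [predicted_reduction_percent(n, e_max, e_tot)
               for _, n, e_max, e_tot, _ in ROWS]

for (name, n, _, _, reported), pred in zip(ROWS, predictions):
    print(f"{name:>5} N={n:<4} predicted {pred:5.1f}%  reported {reported}%")
```

For these four rows the estimate stays within about one percentage point of the reported reduction.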

### VII. ASYNCHRONOUS CLOCK TEST USING ATE

#### A. Experimental Setup

The asynchronous clock technique is experimentally verified on the Advantest T2000GS ATE at Auburn University. The ATE can be operated at a maximum speed of 250MHz and has 128 bi-directional tester channels. The power supply to the CUT is provided by the ATE through a digital power supply module, DPS500mA, which has a supply range of -2 to 8V and an output current range of up to 500mA. The test plan is programmed using the native *Open architecture Test system Programming Language*, OTPL for short. Provisions to place a chip on the tester head are available. For our experiments with benchmark circuits, we used a Xilinx Spartan 3 FPGA XC3S50 soldered on a printed circuit board. The CUT used for our experiment was the s298 benchmark circuit with daisy chained mux-type scan flip-flops configured on the FPGA. The FPGA is configured on the fly by the ATE using the bit file generated by the Xilinx ISE tool [20].

The ATE has a frame processor and a pattern generator, which are synchronized with the rate generator. The rate generator generates a fixed rate clock pulse and triggers the pattern at the start of each pulse. Based on the waveform set by the frame processor and the corresponding pattern value, the pattern is applied to the CUT mounted on the tester head. The test plan for the FPGA consists of three steps. The first two steps configure the FPGA using the ATE. In the first step, the FPGA is powered by the ATE with a supply voltage of 2.5V and the configuration memory is cleared. The second step provides the bit file generated by Xilinx ISE using slave serial mode. In this mode, the configuration data is provided through the DIN input pin of the FPGA and clocked externally by the ATE. A successful configuration of the FPGA is indicated by a High output value on the DONE pin. The third step performs the functional test on the CUT now configured on the FPGA.

#### B. External Test

The clock period required for the scan-based functional test is determined prior to the external testing. Certain limitations of the tester framework constrain the margins in the clock periods and the granularity of their variation. The limitations are as follows,

- 1) The latency due to the analog measurement module puts additional delay overhead.
- 2) Only 4 unique clock periods can be provided for each testflow.

Hence, the periods obtained through simulation for each test cycle are split into 4 groups. The latency of the analog measurement modules is included in the selected period. The longest period corresponds to the pulse width determined by the cycle during which maximum switching occurs. The shortest period corresponds to the lowest test period with which we achieve a significant reduction in test time. Each test cycle is assigned to the period that is closest to, but not less than, the required period for that cycle.
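The assignment rule in the last sentence, closest available period that is not less than the required one, can be sketched as below. The four period values are those reported for the ATE experiment in Section VII-C; the list of required periods and the function name are hypothetical and serve only to illustrate the rule.

```python
import bisect


def assign_ate_periods(required_ns, available_ns):
    """Map each required period to the smallest available period >= it."""
    levels = sorted(available_ns)
    assigned = []
    for t in required_ns:
        idx = bisect.bisect_left(levels, t)   # first level that is >= t
        if idx == len(levels):
            raise ValueError(f"no available period covers {t} ns")
        assigned.append(levels[idx])
    return assigned


# Four ATE clock periods reported for the experiment (ns).
ATE_PERIODS_NS = [200, 300, 410, 500]

# Hypothetical required periods for six test cycles (ns).
REQUIRED_NS = [488, 120, 305, 200, 410, 299]

assigned = assign_ate_periods(REQUIRED_NS, ATE_PERIODS_NS)
tt_async_ns = sum(assigned)
tt_sync_ns = len(REQUIRED_NS) * max(ATE_PERIODS_NS)

print("assigned periods (ns):", assigned)
print(f"asynchronous: {tt_async_ns} ns, synchronous: {tt_sync_ns} ns")
```

Because a cycle is never given a period shorter than it requires, the quantized schedule stays within the power budget but gives up part of the saving that a fully individual period per cycle would provide.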

Based on the periods obtained earlier, the synchronization with the rate generator is controlled by specifying the periods in the test program using a timing block. The timing block has information about the rates at which the pattern should be applied at each input and the behavior of the signal at each pin corresponding to the value in the pattern file. Since the patterns are applied at the start of each period, the pulse provided by the rate generator is not used as a clock to the scan circuit of the CUT; instead, it is used to synchronize the pattern generation. The clock pin is treated as an input pin and its pulse width is set to 50% of the period set by the rate generator. This way we avoid any race conditions during the application of the inputs at the start of each period of the rate generator. The pattern for each cycle contains the signal value needed at each input pin and the response to be observed at each output pin. The period for each cycle is specified by mapping the cycle to the waveform information that is uniquely defined to match one period.

#### C. Results

The proposed method was applied by the ATE to the s298 benchmark circuit configured on the Xilinx FPGA. We applied the 36 deterministic combinational ATPG patterns used for the simulation of the s298 circuit in row 2 of Table I. The period required for each cycle was obtained through a Perl script based on the energy consumption per cycle reported by Nanosim [1]. Due to the latency and settling time of the analog measurement modules, the minimum clock period that could be used with the CUT was 100ns. For clarity of our experiment, the clock periods obtained through simulation were multiplied by 100. Four unique clock periods were then selected such that we achieve a significant reduction in test time. Figure 4 shows the test clock period on the y-axis for each corresponding test cycle on the x-axis. The horizontal broken (red) lines show the four unique test clock periods. A test cycle uses the available ATE clock period just above its required period shown in Figure 4. For the synchronous clock test the maximum period in Figure 4 is used as the fixed clock period.

The waveforms for the ATE tests are shown in Figures 5 and 6, as viewed in the logic analyzer of the Advantest T2000GS system. The two figures have the same time scale. Figure 5 shows 33 cycles (13 to 46), which account for 2 scan sequences of the synchronous clock test using a 500ns clock. The cycle number is indicated in the first row, followed by the period for each cycle as indicated above the first waveform. The labels on the left side of each waveform correspond to the scan out, scan in, scan enable, 3 primary input and clock pins. The values expected at the scan out signal are indicated by X, L or H at the beginning of each period, and the strobe instants at which the output response is verified are indicated by downward/upward triangles placed at the end of each period. The strobe points are located such that there is enough time for the signal to settle after a clock pulse is applied. The input waveforms are indicated along with the pattern that is applied at the start of that period. A '1' pattern for the clock during a period indicates that the clock is turned on and, based on the 50% duty cycle for the clock during that period, the corresponding waveform is generated by the frame processor. For the synchronous clock test of Figure 5, which used a fixed clock period of 500ns for the entire test, the total time for 540 cycles was $270\mu$s.

Fig. 4. Asynchronous clock for 540-cycle scan test of s298 for a power budget of 1.23mW. Horizontal broken lines indicate four test clock periods available from the T2000GS ATE. Period used for a test cycle was the nearest higher ATE clock period.

Fig. 5. Synchronous clock: ATE result for 540-cycle scan test of s298 benchmark circuit. Waveforms show 33 test cycles (cycles 13 through 46) of 500ns clock. Signals shown are scan-out, scan-in, scan enable, three primary inputs and clock. Green triangles under scan-out waveform are matching strobes.

Fig. 6. Asynchronous clock test: ATE result for 540-cycle scan test of s298 benchmark circuit. Waveforms show 58 test cycles (cycles 13 through 71) taking the same time as the 33 cycles (cycles 13 through 46) of synchronous clock test in Figure 5. Clock periods used were 200, 300, 410 and 500 ns as shown in Figure 4. Signals shown are scan-out, scan-in, scan enable, three primary inputs and clock. Green triangles under scan-out waveform are matching strobes.

Figure 6 shows the ATE waveforms using an asynchronous clock with periods of 500, 410, 300 and 200 ns, selected for each cycle based on the activity it produces in the CUT. The test clock period is determined from Figure 4. Thus, the peak activity in the CUT is the same for both synchronous and asynchronous clock tests. Both Figures 5 and 6 show the waveforms for a time interval of $16.5\mu$s. Because the asynchronous clock test runs at varying clock periods, more cycles are run in this time. Hence, in Figure 6 we observe 58 cycles (13 to 71) within the same time interval as 33 cycles (13 to 46) for the synchronous clock test. The total test time for 540 cycles is now $157.7\mu$s, which corresponds to a reduction of $\approx 38\%$ over the synchronous clock test time.

This test time reduction depends on the relative clock schedules of the synchronous and asynchronous clock tests and hence can be compared with the 47.5% reduction reported for the 540-cycle test of s298 in Table I, even though a faster 205MHz (4.88ns period) clock was used there. There are two reasons for the ATE time saving being lower. First, the granularity of clocks, i.e., four ATE clocks versus an individual clock for each vector. Second, our selection of the four ATE frequencies was ad hoc, and we believe a better selection can improve the test time reduction.
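The granularity effect can be illustrated with a minimal sketch, assuming each cycle has an ideal minimum period that is rounded up to the nearest available ATE period, as stated in the caption of Figure 4. The four periods are those listed for Figure 6; the per-cycle ideal periods and the function names are hypothetical and serve only as illustration.

```python
from bisect import bisect_left

# Clock periods (ns) available on the ATE, as listed in the Fig. 6 caption.
ATE_PERIODS_NS = [200, 300, 410, 500]


def quantize_period(ideal_ns, available=ATE_PERIODS_NS):
    """Return the smallest available ATE period that is >= ideal_ns."""
    periods = sorted(available)
    idx = bisect_left(periods, ideal_ns)
    if idx == len(periods):
        raise ValueError(f"no ATE period can accommodate {ideal_ns} ns")
    return periods[idx]


def total_test_time_ns(ideal_periods_ns, available=ATE_PERIODS_NS):
    """Sum the quantized periods over all test cycles."""
    return sum(quantize_period(p, available) for p in ideal_periods_ns)


# Hypothetical per-cycle minimum periods (ns); NOT measured s298 data.
ideal = [180.0, 250.0, 305.0, 410.0, 499.0, 120.0]
quantized = [quantize_period(p) for p in ideal]
async_time_ns = total_test_time_ns(ideal)
# Synchronous reference: every cycle runs at the slowest period.
sync_time_ns = len(ideal) * max(ATE_PERIODS_NS)

print(quantized)                    # [200, 300, 410, 410, 500, 200]
print(async_time_ns, sync_time_ns)  # 2020 3000
```

In this toy schedule the 305 ns cycle is pushed up to 410 ns, which is the kind of loss that a finer or better chosen set of ATE periods would reduce.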

## VIII. REDUCED VOLTAGE TEST

In previous sections, we showed that the total test time can be reduced based on the amount of signal transitions occurring within the CUT during each cycle. The test time can be reduced further by finding an optimal test voltage at which the test runs fastest. Testing of digital circuits at very low voltage has been studied in the past and is known to be more effective in capturing faults [16], [11], [10]. In this section we examine the effect of reducing the supply voltage used for the asynchronous clock test and demonstrate its feasibility using an example circuit. We begin with the basic idea and its application to the synchronous clock test, as put forth in recent papers [31], [33], [34].

### A. Reducing Power Supply for Synchronous Clock Test

The test time depends on the following factors:

- 1) The critical path delay of the circuit, $T_{structure}$,
- 2) The maximum energy consumed by any cycle during the entire test, $E_{MAX(test)}$, and
- 3) The maximum rated power of the CUT, $P_{MAX(rated)}$.

Conventionally, testing is done at a nominal voltage determined by the technology used. However, reducing the supply voltage has a quadratic effect on the energy consumed per cycle. Hence, for a given set of ATPG patterns, reducing the operating voltage for test allows the test clock frequency to be increased such that the test time is reduced without exceeding the rated power of the device. But as we reduce the supply voltage, the critical path delay of the circuit increases. Hence, the upper bound on the test clock frequency is set by the structure constraint of the device. The optimum voltage, at which the test runs fastest, is the point at which both power and structure constraints are just satisfied. This scenario is shown graphically in Figure 7. As the voltage is reduced, the structural delay increases and $E_{MAX(test)}$ decreases. Since the maximum energy is reduced, the test clock period can be decreased such that the power dissipation remains almost the same. This reduces the test time, and the optimum voltage for a synchronous clock test is the crossing point of the structure constrained and power constrained curves. From a yield perspective, since the critical path delay is taken into account in finding the optimum test voltage, a good circuit will not fail due to the increase in gate delay at the reduced voltage. The mathematical approach for this optimization is described in [34].



Fig. 7. Synchronous clock test time as a function of supply voltage showing the minimum test time voltage, $V_{sync}$.
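The crossing point described above can be located numerically. The following is a minimal sketch, not the formulation of [34]: it assumes the peak cycle energy scales as the square of the supply voltage, so the power constrained period falls as the voltage is lowered, while the structure constrained period (the critical path delay) rises. The delay model, the peak energy value and all function names are hypothetical; only the 1.23 mW power budget and the 1.8 V nominal voltage are taken from the text.

```python
def power_limited_period(v, e_max_nom, p_max, v_nom):
    """Shortest period (ns) that keeps the peak cycle within p_max.

    Assumes cycle energy scales as (v / v_nom)**2; pJ / mW gives ns.
    """
    return e_max_nom * (v / v_nom) ** 2 / p_max


def sync_period(v, delay, e_max_nom, p_max, v_nom):
    """Synchronous clock period: the larger of the two constraints."""
    return max(power_limited_period(v, e_max_nom, p_max, v_nom), delay(v))


def find_v_sync(delay, e_max_nom, p_max, v_nom, v_lo, v_hi, tol=1e-9):
    """Bisection for the voltage where both constraints give the same period."""
    def gap(v):
        return power_limited_period(v, e_max_nom, p_max, v_nom) - delay(v)

    if gap(v_lo) > 0 or gap(v_hi) < 0:
        raise ValueError("constraints do not cross inside [v_lo, v_hi]")
    while v_hi - v_lo > tol:
        mid = 0.5 * (v_lo + v_hi)
        if gap(mid) < 0:   # still structure constrained: move up in voltage
            v_lo = mid
        else:              # power constrained: move down in voltage
            v_hi = mid
    return 0.5 * (v_lo + v_hi)


V_NOM = 1.8        # V, nominal voltage from the text
P_MAX = 1.23       # mW, power budget from the text
E_MAX_NOM = 6.15   # pJ, hypothetical peak cycle energy at V_NOM


def toy_delay(v):
    """Hypothetical critical path delay (ns), decreasing with voltage."""
    return 4.0 / v


v_sync = find_v_sync(toy_delay, E_MAX_NOM, P_MAX, V_NOM, v_lo=1.0, v_hi=V_NOM)
t_opt = sync_period(v_sync, toy_delay, E_MAX_NOM, P_MAX, V_NOM)
t_nom = sync_period(V_NOM, toy_delay, E_MAX_NOM, P_MAX, V_NOM)
print(f"V_sync = {v_sync:.4f} V, period {t_opt:.3f} ns vs {t_nom:.3f} ns at nominal")
```

Bisection is used here only because the gap between the two curves changes sign once over the search interval under these assumptions; any root finder would serve.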

### B. Reducing Power Supply for Asynchronous Clock Test

The reduced voltage approach can be extended to further reduce the asynchronous clock test time. From equation (16), the period for each cycle is proportional to the voltage used for test. Hence, reducing the test voltage shortens the power constrained periods and improves the test time. However, in asynchronous clock tests, some cycles may already have been compressed to the minimum permitted by the structure constraint. As the supply voltage is reduced, the critical path delay increases and more cycles become structure constrained. As the voltage is reduced further, the test starts to lose its asynchronous clock property and eventually becomes fully synchronous. At some voltage before the test becomes synchronous, most cycles are structure constrained and only a few remain power constrained. This voltage, at which the test still retains the asynchronous clock property, gives the optimum asynchronous clock test time. Figure 8 illustrates this effect. Based on equation (19), the asynchronous clock test time is bounded by the synchronous clock test time at the slowest frequency, limited by the power constraint, and that at the fastest frequency, limited only by the structure constraint. Point A indicates the optimum synchronous clock test and point B indicates the optimum asynchronous clock test.
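The transition from power constrained to structure constrained cycles can be sketched as follows. This is an illustration under stated assumptions, not the paper's procedure: each cycle's energy is assumed to scale quadratically with supply voltage, each cycle's period is the larger of its power constrained period and the critical path delay, and the cycle energies and delays below are hypothetical values.

```python
def async_test_time(cycle_energies_nom, v, v_nom, p_max, t_structure):
    """Return (total test time, number of structure constrained cycles).

    cycle_energies_nom: per-cycle energy (pJ) at the nominal voltage.
    t_structure: critical path delay (ns) at voltage v.
    Assumes energy scales as (v / v_nom)**2; pJ / mW gives ns.
    """
    scale = (v / v_nom) ** 2
    total = 0.0
    n_structure = 0
    for e_nom in cycle_energies_nom:
        t_power = e_nom * scale / p_max
        if t_power <= t_structure:
            n_structure += 1          # cycle cannot be shortened further
        total += max(t_power, t_structure)
    return total, n_structure


V_NOM = 1.8                       # V, nominal voltage from the text
P_MAX = 1.23                      # mW, power budget from the text
ENERGIES = [1.0, 2.0, 4.0, 6.15]  # pJ, hypothetical per-cycle energies

# Hypothetical critical path delays (ns) at three supply voltages.
for v, t_struct in [(1.8, 2.0), (1.5, 2.6), (1.2, 3.6)]:
    total, n_struct = async_test_time(ENERGIES, v, V_NOM, P_MAX, t_struct)
    print(f"V={v} V: test time {total:.3f} ns, "
          f"{n_struct} of {len(ENERGIES)} cycles structure constrained")
```

With these toy numbers the test time first drops as the voltage is lowered and then rises again once every cycle is structure constrained, at which point the test is effectively synchronous.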

Fig. 8. Asynchronous clock test time as a function of supply voltage showing the minimum test time voltage, $V_{async}$.

Analysis of this method is performed on the s298 benchmark circuit. The circuit was synthesized in 180nm technology using Leonardo Spectrum [3], and the scan chains were inserted using Mentor Graphics DFT Advisor. We used deterministic vectors generated by Fastscan [2], which also include one path delay vector to trigger the critical path of the device. The asynchronous clock test was performed for decreasing supply voltages and the corresponding test times were noted. Figure 9 shows the results plotted with test time on the y-axis and supply voltage on the x-axis. At each voltage we estimate the critical path delay using the alpha power law approximation [23], [24]:

$$T_{structure} = K \times \frac{V_{DD}}{(V_{DD} - V_{TH})^{\alpha}}$$
 (20)

A few assumptions were made when solving for the critical path delay:

- 1) The critical path does not change as the voltage is reduced; this was found valid for small voltage changes.
- 2) The threshold voltage does not vary.

The maximum rated power was found by simulating the circuit at nominal voltage for 100 random vectors in the functional mode. The resulting maximum power was $1.23 \mathrm{mW}$. The value of the velocity saturation index $\alpha$ was found to be 2 by curve fitting equation (20) to the simulated delays of a chain of inverters at different voltages. Once $\alpha$ is known, the value of $K$ for the s298 benchmark circuit can be found using the delay obtained through STA at nominal voltage. The value of $K$ is found to be 1.78. We can now find the delay of the critical path at every voltage step.
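Equation (20) and the calibration of $K$ can be sketched as below. The values $K = 1.78$ and $\alpha = 2$ are those reported above; the threshold voltage is not given in the text, so the 0.5 V used here is a hypothetical placeholder, and the delay units simply follow those of the STA delay used for calibration. The function names are illustrative only.

```python
def critical_path_delay(v_dd, k, v_th, alpha):
    """Alpha power law delay of equation (20): K * V_DD / (V_DD - V_TH)**alpha."""
    if v_dd <= v_th:
        raise ValueError("supply voltage must exceed the threshold voltage")
    return k * v_dd / (v_dd - v_th) ** alpha


def calibrate_k(delay_nom, v_nom, v_th, alpha):
    """Solve equation (20) for K given the STA delay at the nominal voltage."""
    return delay_nom * (v_nom - v_th) ** alpha / v_nom


K = 1.78       # reported in the text
ALPHA = 2.0    # reported in the text
V_TH = 0.5     # V, hypothetical; the text does not state the threshold voltage

# Critical path delay at a few voltage steps (units follow the STA delay).
voltages = [1.8, 1.6, 1.4, 1.2]
delays = [critical_path_delay(v, K, V_TH, ALPHA) for v in voltages]
for v, d in zip(voltages, delays):
    print(f"V_DD = {v:.1f} V -> T_structure = {d:.3f}")

# Round trip: recovering K from the nominal-voltage delay.
k_check = calibrate_k(delays[0], voltages[0], V_TH, ALPHA)
```

The sweep relies on the two assumptions listed above: the same path stays critical and the threshold voltage is fixed across the voltage steps.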

The asynchronous clock period at each voltage was calculated as explained in Section V. At the optimum voltage of $V_{async}=1.6 \rm V$, the corresponding minimum asynchronous clock test time using this method is found to be $TT_{asynch}=1.1 \mu \rm s$, which is a 58.17% reduction in test time compared to the test time $(TT_{Nominal})$ of $2.63 \mu \rm s$ using a synchronous clock at the nominal voltage of $1.8 \rm V$, as shown in Figure 9.
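The quoted reduction follows directly from the two test times, taking the values as stated:

$$\frac{TT_{Nominal} - TT_{asynch}}{TT_{Nominal}} = \frac{2.63 - 1.1}{2.63} \approx 0.5817$$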



Fig. 9. Minimum synchronous and asynchronous clock test times for s298 circuit after selecting suitable supply voltages.

#### CONCLUSION

Advanced technologies for CMOS VLSI design for low power applications require power constrained tests that could result in longer test times and high testing costs. New methods are required to reduce test time while conforming to the allowable power. In this work we simulated the scan tests for ISCAS'89 benchmark circuits and obtained the maximum energy dissipated using a synchronous clock period. Using the relation in equation (18), we generated a clock with varying periods. This enabled us to raise the power per clock cycle to the peak power limit and in turn reduce the test time. Results show that reductions of up to 47% are attainable. The maximum reduction in test time is observed when the peak energy dissipated by the circuit is significantly greater than the average energy dissipated. We demonstrated the feasibility of the proposed method on an ATE.

We have also shown that reducing the supply voltage at which a CUT is tested can reduce the test time. Future investigations should involve the use of the proposed method for delay testing and consideration of the effects of leakage power that occur in advanced technologies. We believe that the methods presented here will remain beneficial. Future analysis would also include process variations when finding asynchronous clock periods for the CUT. Such an analysis would be beneficial for the proposed method, and we believe that process variation will not be a limitation in determining optimum clock periods. System-on-chip (SoC) testing could have severe test time and power problems when multiple cores are tested in parallel. There will be benefits if core tests are optimized by asynchronous clock periods. However, the distribution of clocks through the test access mechanism (TAM) of the SoC and test program details have to be analyzed in the future.

*Acknowledgment:* This research is supported in part by the National Science Foundation Grant CCF-1116213.

#### REFERENCES

- [1] Nanosim User Guide. Synopsys, San Jose, CA, 2008.
- [2] ATPG and Failure Diagnosis Tools. Mentor Graphics Corp., Wilsonville, OR, 2009.
- [3] Leonardo Spectrum User Guide. Mentor Graphics Corp., Wilsonville, OR, 2011.
- [4] A. Agarwal, D. Blaauw, and V. Zolotov, "Statistical Timing Analysis for Intra-Die Process Variations With Spatial Correlations," in *Proc. International Conf. Computer Aided Design*, 2003, pp. 900–907.
- [5] V. D. Agrawal, "Pre-Computed Asynchronous Scan (Invited Talk)," in *Proc. 13th IEEE Latin American Test Workshop*, Quito, Ecuador, Apr. 2012.
- [6] V. D. Agrawal, "Reduced Voltage Test Can be Faster," in *Proc. International Test Conf.*, Nov. 2012. Elevator Talk.
- [7] Y. Bonhomme, T. Yoneda, H. Fujiwara, and P. Girard, "An Efficient Scan Tree Design for Test Time Reduction," in *Proc. 9th IEEE European Test Symp.*, 2004, pp. 174–179.
- [8] M. L. Bushnell and V. D. Agrawal, Essentials of Electronic Testing for Digital, Memory and Mixed-Signal VLSI Circuits. Springer, 2000.
- [9] M. Chalkia and Y. Tsiatouhas, "The Leafs Scan-Chain for Test Application Time and Scan Power Reduction," in *Proc. 19th IEEE International Conf. Electronics, Circuits and Systems*, 2012, pp. 749–752.
- [10] J. T. Y. Chang and E. J. McCluskey, "Detecting Delay Flaws by Very-Low-Voltage Testing," in *Proc. International Test Conf.*, Oct. 1996, pp. 367–376.
- [11] J. T. Y. Chang and E. J. McCluskey, "Quantitative Analysis of Very-Low-Voltage Testing," in *Proc. 14th IEEE VLSI Test Symp.*, 1996, pp. 332–337.
- [12] M. Chloupek, O. Novak, and J. Jenicek, "On Test Time Reduction Using Pattern Overlapping, Broadcasting and On-Chip Decompression," in *Proc. IEEE 15th International Symp. on Design and Diagnostics of Electronic Circuits Systems (DDECS)*, Apr. 2012, pp. 300–305.
- [13] R. M. Chou, K. K. Saluja, and V. D. Agrawal, "Scheduling Tests for VLSI Systems Under Power Constraints," *IEEE Trans. VLSI Systems*, vol. 5, no. 2, pp. 175–185, June 1997.
- [14] W. Daehn and J. Mucha, "Hardware Test Pattern Generation for Built-In Testing," in *Proc. International Test Conf.*, 1981, pp. 110–120.
- [15] C. Forzan and D. Pandini, "Why We Need Statistical Static Timing Analysis," in *Proc. 25th International Conf. Computer Design*, 2007, pp. 91–96.
- [16] H. Hao and E. J. McCluskey, "Very-Low-Voltage Testing for Weak CMOS Logic ICs," in *Proc. International Test Conf.*, Oct. 1993, pp. 275–284.
- [17] H. Hashempour, F. J. Meyer, and F. Lombardi, "Test Time Reduction in a Manufacturing Environment by Combining BIST and ATE," in *Proc. 17th IEEE International Symp. Defect and Fault Tolerance in VLSI Systems*, 2002, pp. 186–194.
- [18] W.-J. Lai, C.-P. Kung, and C.-S. Lin, "Test Time Reduction in Scan Designed Circuits," in *Proc. 4th European Conf. Design Automation*, Feb. 1993, pp. 489–493.

- [19] E. Larsson, Introduction to Advanced System-on-Chip Test Design and Optimization. Springer, 2005.
- [20] P. Mangilipally and V. P. Nelson, "Emulation of Slave Serial Mode to Configure the Xilinx Spartan 3 XC3S50 FPGA Using Advantest T2000 Tester," Technical report, Auburn University, 2011.
- [21] N. Nicolici and B. M. Al-Hashimi, Power Constrained Testing of VLSI Circuits. Springer, 2002.
- [22] S. Ravi, "Power-Aware Test: Challenges and Solutions," in *Proc. International Test Conf.*, Oct. 2007, pp. 1–10.
- [23] T. Sakurai, "Alpha Power-Law MOS Model," *Solid-State Circuits Society Newsletter*, vol. 9, Oct. 2004.
- [24] T. Sakurai and A. R. Newton, "Alpha Power-Law MOS Model," *IEEE Journal of Solid State Circuits*, vol. 25, pp. 584–593, Oct. 1990.
- [25] P. Shanmugasundaram and V. D. Agrawal, "Dynamic Scan Clock Control for Test Time Reduction Maintaining Peak Power Limit," in *Proc. 29th IEEE VLSI Test Symp.*, May 2011, pp. 248–253.
- [26] P. Shanmugasundaram and V. D. Agrawal, "Externally Tested Scan Circuit with Built-In Activity Monitor and Adaptive Test Clock," in *Proc. 25th International Conf. VLSI Design*, Jan. 2012, pp. 448–453.
- [27] V. Sheshadri, V. D. Agrawal, and P. Agrawal, "Optimal Power-Constrained SoC Test Schedules With Customizable Clock Rates," in *Proc. IEEE International SOC Conf. (SOCC)*, Sept. 2012, pp. 271–276.
- [28] V. Sheshadri, V. D. Agrawal, and P. Agrawal, "Optimum Test Schedule for SoC with Specified Clock Frequencies and Supply Voltages," in *Proc. 26th International Conf. VLSI Design*, Jan. 2013, pp. 267–272.
- [29] V. Sheshadri, V. D. Agrawal, and P. Agrawal, "Power-Aware SoC Test Optimization Through Dynamic Voltage and Frequency Scaling," in *Proc. 21st IFIP/IEEE International Conf. Very Large Scale Integration (VLSI-SoC)*, Istanbul, Turkey, Oct. 2013.
- [30] P. Venkataramani and V. D. Agrawal, "Reducing ATE Time for Power Constrained Scan Test by Asynchronous Clocking," in *Proc. International Test Conf.*, Nov. 2012. Poster P.13.
- [31] P. Venkataramani and V. D. Agrawal, "Reducing Test Time of Power Constrained Test by Optimal Selection of Supply Voltage," in *Proc. 26th International Conf. VLSI Design*, Jan. 2013, pp. 273–278.
- [32] P. Venkataramani and V. D. Agrawal, "Test-Time Reduction in ATE Using Asynchronous Clocking," in *Proc. 6th IEEE International Workshop on Design for Manufacturability and Yield*, June 2013. Poster.
- [33] P. Venkataramani, S. Sindia, and V. D. Agrawal, "A Test Time Theorem and Its Applications," in *Proc. 14th IEEE Latin-American Test Workshop*, Apr. 2013.
- [34] P. Venkataramani, S. Sindia, and V. D. Agrawal, "Finding Best Voltage and Frequency to Shorten Power-Constrained Test Time," in *Proc. 31st IEEE VLSI Test Symp.*, 2013, pp. 1–6.
- [35] B. Yang, A. Sanghani, S. Sarangi, and C. Liu, "A Clock-Gating Based Capture Power Droop Reduction Methodology for At-Speed Scan Testing," in *Proc. Design, Automation Test in Europe Conf. and Exhibition*, 2011, pp. 1–7.
- [36] C. Yao, K. K. Saluja, and P. Ramanathan, "Thermal-Aware Test Scheduling Using On-chip Temperature Sensors," in *Proc. 24th International Conf. VLSI Design*, 2011, pp. 376–381.