#### **RESEARCH**



# Verification and Validation with Prototype Chip Implemented with Layout Level Scan C-Elements

Hiroshi Iwata<sup>1</sup> · Kokoro Yamasaki<sup>1</sup> · Ken'ichi Yamaguchi<sup>1</sup>

Received: 12 January 2024 / Accepted: 10 July 2024 / Published online: 22 July 2024 © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024

#### Abstract

Establishing a general, high-quality testing method for fabricated asynchronous circuits is crucial for their widespread adoption. A full scan design for asynchronous circuits is imperative to address the major issue of manufacturing reliability. To establish a comprehensive testing workflow for asynchronous circuits, verification and validation of the full scan design must be conducted from the gate level to the chip level. Therefore, this paper proposes layout level circuits, corresponding to transistor level scan elements, that are capable of achieving a full scan design for general asynchronous circuits using the Rohm 0.18 [$\mu$m] process technology. Moreover, a prototype chip fabricated from the taped-out layout level circuits is used for verification and validation at both the layout and chip levels. At the layout level, the area and delay overhead of the scan C-elements relative to the original C-element were evaluated. Furthermore, the prototype chip implementing the proposed scan C-elements was mounted onto a chip tester for dynamic verification by simulation, and the functional delay was measured by observing the signals with an oscilloscope. The results on the real chip show that the proposed scan C-elements can be used as a library to realize a full scan design of asynchronous circuits.

Keywords Asynchronous circuit · Scan C-element · Layout level · Prototype chip · Verification and validation · Full scan design

#### 1 Introduction

Asynchronous circuits can solve several problems in synchronous circuit design, such as clock tree synthesis, clock skew, and peak power consumption. Therefore, VLSI design methods using asynchronous circuits are attracting much attention [1, 2]. CAD tools for designing asynchronous circuits, such as Balsa [3] and Petrify [4], have been developed for automatic design flows, and automatic design flows using these CAD tools make the design of practical asynchronous circuits feasible. However, manufacturing test of asynchronous circuits is more difficult than that of synchronous circuits. Two main factors contribute to this difficulty. First, asynchronous circuits have no global clock signal of the kind used in testing synchronous circuits; because they are synchronised by a handshake protocol using request and acknowledge signals, methods or mechanisms to control synchronisation during the manufacturing test are required. Second, asynchronous circuits use a variety of sequential elements depending on the application, whereas synchronous circuits are designed with a single type of sequential element (D-FF).

Responsible Editor: K. K. Saluja

Hiroshi Iwata iwata@info.nara-k.ac.jp

Kokoro Yamasaki yamasaki@nara.kosen-ac.jp

Ken'ichi Yamaguchi yamaguti@info.nara-k.ac.jp

Department of Information Engineering, National Institute of Technology (KOSEN), Nara College, 22 Yata, Yamatokooriyama, Nara 6391080, Japan

Manufacturing test of VLSI (hereafter simply "test") determines whether a chip is defective. Test is the most crucial step for improving chip reliability, since it is the final process for identifying defective chips before shipment. If a fault exists in the circuit, an erroneous value might be propagated to the primary outputs, and such an error might cause the chip to fail to perform its expected functions. The expected values at the primary outputs can be calculated by simulating the output responses of the fault-free circuit to the applied test patterns. To avoid shipping defective chips, the test patterns used in the manufacturing test must be evaluated. Generally, fault coverage and fault efficiency are used as indices of test pattern quality. Fault coverage is defined as the ratio of faults detected by the test patterns to all faults in the circuit. Fault efficiency is defined as the ratio of faults detected by the test patterns to the theoretically detectable faults in the circuit. For combinational circuits, an Automatic Test Pattern Generation (ATPG) algorithm can generate test patterns achieving 100% fault efficiency. However, test generation for sequential circuits is impractical because of the enormous computation time required by time expansion to account for the state of the sequential elements.
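The two indices defined above can be written compactly. The symbols are introduced here only for illustration and are not notation from this paper: $N_{\mathrm{all}}$ is the number of all faults in the circuit, $N_{\mathrm{det}}$ the number of faults detected by the test patterns, and $N_{\mathrm{und}}$ the number of faults that are theoretically undetectable, so that $N_{\mathrm{all}} - N_{\mathrm{und}}$ is the number of theoretically detectable faults.

$$
\mathrm{FC} = \frac{N_{\mathrm{det}}}{N_{\mathrm{all}}}, \qquad
\mathrm{FE} = \frac{N_{\mathrm{det}}}{N_{\mathrm{all}} - N_{\mathrm{und}}}
$$

As a purely hypothetical example, a circuit with $N_{\mathrm{all}} = 1000$, $N_{\mathrm{und}} = 30$, and $N_{\mathrm{det}} = 950$ has $\mathrm{FC} = 950/1000 = 95.0\%$ and $\mathrm{FE} = 950/970 \approx 97.9\%$. Fault efficiency reaches 100% exactly when every theoretically detectable fault is detected.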

The full scan design is the de facto standard design for testability method for synchronous circuits, which contain only a single type of sequential element (D-FF). It allows a sequential circuit that would otherwise require time expansion to be treated as a combinational circuit, so that automatic test pattern generation completes in practical time. The modification replaces the sequential elements in the circuit with corresponding scan elements and connects the scan elements in series, as a single path or multiple paths, like shift registers. A scan path is defined as an ordered set of scan elements composing a shift register. As a result, the input and output of a scan element can be regarded as a pseudo primary output and a pseudo primary input of the combinational circuit, respectively, since an arbitrary value can be applied to the combinational circuit and the output response from the combinational circuit can be observed. On the other hand, the lack of an established de facto standard design for testability method for asynchronous circuits has impeded the popularization of asynchronous circuit design methods.

In the past, many asynchronous scan designs have been reported [5–13]. Hulgaard et al. [5] and Zeidler et al. [6] introduced a full scan design in which the scan elements used in Level Sensitive Scan Design (LSSD) of synchronous circuits are inserted so as to cut all feedback loops. However, inserting a large number of LSSD scan elements leads to impractical area and delay overhead. To reduce the number of inserted scan elements, partial scan design methods [7, 8] were proposed. As another approach to reducing the area overhead, Beest et al. [9, 10] proposed gate-level scan C-elements, obtained by cutting the internal loop of each C-element, and a single latch (L1L2\*) full scan design method using these scan C-elements. However, it has been reported that these methods cannot achieve 100% fault efficiency for every asynchronous circuit because of certain undetectable faults [11]. Therefore, Iwata et al. [11] proposed the "bipartite full scan design", which guarantees a complete test (fault efficiency = 100%) for the combinational circuit and for the scan C-elements in the asynchronous circuit. Ishizaka et al. [12] implemented the scan C-element used in the bipartite full scan design at the gate and transistor levels in order to shorten the test sequence of scan paths, and Shintani et al. [13] reduced the number of transistors by reordering and sharing them. However, since none of the related studies has proposed a scan C-element at the layout level, asynchronous circuit designers must implement these elements themselves if a full scan design is to be applied to asynchronous circuits. Moreover, because chip level verification and validation are lacking, each designer needs to measure speed specifications such as functional delay, latency, and throughput on their own prototype chips.

In this research, we performed verification and validation of the scan C-element at both the layout and chip levels. The scan C-element was designed and prototyped using the Rohm 0.18 [$\mu$m] process as part of an asynchronous circuit design method. Our study shows that this approach enables technology mapping of the scan C-element from the gate level to the layout level and makes full scan design available to asynchronous circuit designers. In the experiments on the prototype chip, the function and various delays of the scan C-element were validated by applying test sequences to the real chip and observing the output response sequences.

Figure 1 shows the positioning of this paper in the design flow of asynchronous circuits and the scan C-element. Asynchronous circuits designed with an RTL description or a Petri net are compiled to a gate level circuit with logic synthesis tools (left flow in Fig. 1). The gate level asynchronous circuit including C-elements is transformed into a full scan gate level circuit by the full scan design method [11]. As a result, the C-elements in the asynchronous circuit are replaced with the corresponding scan C-elements. The full scan gate level circuit is then converted into a mask pattern, and finally into testable mass-produced chips, by technology mapping with the technology (physical) library of the scan C-element. However, no physical library of the scan C-element has been provided by chip manufacturers. Therefore, we have developed a physical library of scan C-elements for use in technology mapping (right flow in Fig. 1). The transistor level scan C-element proposed in [13] was used as the schematic of the proposed layout level circuit. The layout level circuit of the proposed scan C-element was designed according to the design rules defined for the Rohm 0.18 [$\mu$m] process. A prototype chip for the scan C-elements was fabricated by submitting the taped-out mask pattern to the chip foundry. In addition, functional and delay validation of the prototype chip was conducted by applying test sequences and observing the test response sequences, as a unit test of the layout and chip level scan C-element. Because the test flow is thus prepared from upstream to downstream, the reliability of the asynchronous circuit is guaranteed by the test flow.

The remainder of this paper is organized as follows. Section 2 describes the functions and a transistor level implementation of the scan C-element. The proposed layout level circuit and the prototype chip are evaluated through verification and validation by comparing the original C-element and the scan C-element in Sections 3 and 4, respectively. Section 5 concludes this paper.

**Fig. 1** Research target in the design flow of asynchronous circuits and scan C-element

#### 2 Gate Level and Transistor Level Scan C-Element

This section describes the function and a transistor level implementation of the original C-element and the scan C-element.

#### 2.1 Function of the Original C-Element

Figure 2 shows the logic symbol of the C-element. The C-element is one of the most common sequential elements used in asynchronous sequential circuits. C-elements are often used for waiting on requests and for the handshaking of registers. The state table of the C-element is shown in Table 1. The C-element resets its value to 0 if (a,b) is (0,0), sets its value to 1 if (a,b) is (1,1), and otherwise keeps the previous value.
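The behaviour in Table 1 can be sketched as a next-state function. This is a minimal behavioural model for illustration only; the function name `c_element` and the use of Python integers for logic values are assumptions, not part of the paper's design.

```python
def c_element(a: int, b: int, z: int) -> int:
    """Next output Z+ of a C-element (Table 1).

    a, b: current inputs (0 or 1); z: previous output.
    """
    if a == b:
        # (0,0) resets the output to 0, (1,1) sets it to 1.
        return a
    # Inputs disagree: the C-element keeps its previous value.
    return z


# Walk through an input sequence, starting from Z = 0.
z = 0
trace = []
for a, b in [(0, 1), (1, 1), (1, 0), (0, 0)]:
    z = c_element(a, b, z)
    trace.append(z)
print(trace)  # [0, 1, 1, 0]
```

The output follows the inputs only when both agree, which is what makes the element suitable for waiting until two handshake signals have both arrived.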

Fig. 2 C-element symbol



#### 2.2 Function of the Scan C-Element

Figure 3 shows the gate level scan C-element [12], which is composed of two parts: the scan control logic and the C-element. The scan control logic (SCL) is a combinational circuit whose function is shown in Table 2. There are four types of operations. (1) The functional operation is used in the normal mode of the circuit. The inputs A and B from the combinational circuit are captured into the C-element, (a,b)=(A,B). (2) The hold operation is used to hold the value captured in the C-element by applying (a,b)=(0,1) or (a,b)=(1,0). (3) The load operation is used to propagate a test pattern and a response/error. By applying (a,b)=(SI,SI), the value of input SI is captured in the C-element through the scan path. (4) The set operation is used to set the output of the C-element to 0 or 1 by applying (a,b)=(0,0) or (a,b)=(1,1), respectively. To avoid races associated with signal switching during scan operation, there are four hold codes (SC=001, 010, 101, 110).
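The mapping performed by the scan control logic can be sketched as a lookup on SC[2:0]. The code below is an illustrative behavioural model of Table 2, not the gate level circuit of Fig. 3; the names `scan_control_logic`, `USAGE`, and `adjacent_holds` are assumptions made for this sketch. It also checks one consequence of having four hold codes: from every non-hold code, a hold code is reachable by changing a single SC bit.

```python
# Behavioural sketch of the scan control logic in Table 2 (names are illustrative).
USAGE = {
    "000": "functional", "001": "hold", "010": "hold", "011": "load",
    "100": "set",        "101": "hold", "110": "hold", "111": "set",
}


def scan_control_logic(sc: str, A: int, B: int, SI: int) -> tuple:
    """Map SC[2:0] (a 3-character bit string) to the C-element inputs (a, b)."""
    table = {
        "000": (A, B),    # functional operation
        "001": (0, 1),    # hold
        "010": (1, 0),    # hold
        "011": (SI, SI),  # load from the scan path
        "100": (0, 0),    # set output to 0
        "101": (0, 1),    # hold
        "110": (1, 0),    # hold
        "111": (1, 1),    # set output to 1
    }
    return table[sc]


def hamming(x: str, y: str) -> int:
    """Number of bit positions in which two SC codes differ."""
    return sum(p != q for p, q in zip(x, y))


def adjacent_holds(sc: str) -> list:
    """Hold codes reachable from `sc` by flipping exactly one SC bit."""
    return sorted(h for h, use in USAGE.items()
                  if use == "hold" and hamming(sc, h) == 1)


for code in ("000", "011", "100", "111"):
    print(code, USAGE[code], "->", adjacent_holds(code))
```

In this model each of the functional, load, and set codes has two single-bit neighbours that are hold codes, one applying (a,b)=(0,1) and one applying (a,b)=(1,0).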

Table 1 Function table of C-element

| a | b | Z+ |
|---|---|----|
| 0 | 0 | 0  |
| 0 | 1 | Z  |
| 1 | 0 | Z  |
| 1 | 1 | 1  |





Fig. 3 Gate level structure of scan C-element [12]

#### 2.3 Transistor Level Circuit of Scan C-Element

Figure 4 shows the transistor level circuit of the scan control logic, optimized for area reduction, proposed in [13]. The scan control logic is divided into two parts, "SC\_INV" and "CTRL". SC\_INV consists of three inverters that invert SC[2:0]. CTRL consists of two functionally independent circuits, the "output $\overline{a}$ circuit" and the "output $\overline{b}$ circuit". These sub-circuits are designed with CMOS logic, and the scan control logic has 16 CMOS transistors.

Various transistor level implementations have been proposed for the C-element itself [14–16]. This paper uses Martin's C-element [14], which has a simple and basic structure (Fig. 5, 4 NMOS and 4 PMOS transistors), as the C-element part of the scan C-element.

To implement a scan C-element from the scan control logic and the C-element, two structures are possible, with inverters placed either after or before the C-element, as shown in Fig. 6. The scan C-element (a) applies $\bar{a}$ and $\bar{b}$ from the scan control logic directly to the C-element and negates Z at the output of the C-element. The number of NMOS and PMOS transistors used is 32 (SCL) + 8 (C-element) + 2 (inverter) = 42. The scan C-element (b) negates $\bar{a}$ and $\bar{b}$ from the scan control logic before applying them to the C-element. The number of NMOS and PMOS transistors used is 32 (SCL) + 4 (inverters) + 8 (C-element) = 44.

Table 2 Function table of scan control logic

| SC[2:0] | a  | b  | usage                 |
|---------|----|----|-----------------------|
| 000     | A  | B  | Functional operation  |
| 001     | 0  | 1  | Scan operation (hold) |
| 010     | 1  | 0  | Scan operation (hold) |
| 011     | SI | SI | Scan operation (load) |
| 100     | 0  | 0  | Scan operation (set)  |
| 101     | 0  | 1  | Scan operation (hold) |
| 110     | 1  | 0  | Scan operation (hold) |
| 111     | 1  | 1  | Scan operation (set)  |

Area and delay overhead are used to evaluate the scan C-element at the transistor level. Table 3 shows the evaluation results for the original C-element and the scan C-elements. The area overhead is measured as the number of NMOS and PMOS transistors used in each circuit (#transistors), and the delay overhead is measured by the transition delays and the rise and fall delays related to functional operation (from input a/b or A/B to output Z). The original C-element (original), the scan C-element with an inverter placed after the C-element (Fig. 6(a), scan (after)), and the scan C-element with two inverters placed before the C-element (Fig. 6(b), scan (before)) were designed at $V_{DD} = 1.8$ V under the typical condition of the Rohm 0.18 [$\mu$m] process, and the delays were measured by transient analysis using Synopsys HSPICE Version S-2021.09-1. To measure the rise delay of the functional output Z ($t_r$), a rising transition was applied to both functional inputs a and b (A and B), and the time required for Z to go from $V_{DD} \times 0.1 = 0.18$ V to $V_{DD} \times 0.9 = 1.62$ V was measured. To measure the rise transition delay from functional input a (A) to functional output Z ($tda_r$), functional input b (B) was fixed at 1, and the time from when a (A) exceeds $V_{DD} \times 0.5 = 0.9$ V to when Z exceeds 0.9 V was measured. The fall delay ($t_f$) was measured in the same manner as the rise delay ($t_r$), and the rise transition delay of b (B) ($tdb_r$) and the fall transition delays of a (A) and b (B) ($tda_f$, $tdb_f$) were measured in the same manner as the rise transition delay of a (A) ($tda_r$).
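The threshold-based definitions above can be illustrated on sampled waveforms. The sketch below is not the HSPICE measurement used in this paper; it is a hypothetical post-processing routine that locates threshold crossings by linear interpolation, and the waveform samples are synthetic values chosen only to make the arithmetic easy to follow.

```python
VDD = 1.8  # supply voltage [V], as stated in the text


def crossing_time(times, volts, level):
    """Time at which a rising waveform first reaches `level` (linear interpolation)."""
    samples = list(zip(times, volts))
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if v0 < level <= v1:
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    raise ValueError("waveform never reaches the requested level")


def rise_delay(times, v_out):
    """t_r: time for the output to go from 10% to 90% of VDD."""
    return crossing_time(times, v_out, 0.9 * VDD) - crossing_time(times, v_out, 0.1 * VDD)


def rise_transition_delay(times, v_in, v_out):
    """tda_r-style delay: 50% of VDD on the input to 50% of VDD on the output."""
    return crossing_time(times, v_out, 0.5 * VDD) - crossing_time(times, v_in, 0.5 * VDD)


# Synthetic example (times in ps): the input ramps up during 0..100 ps,
# the output ramps up linearly during 100..300 ps.
t = [0.0, 100.0, 200.0, 300.0]
v_a = [0.0, 1.8, 1.8, 1.8]
v_z = [0.0, 0.0, 0.9, 1.8]

print(rise_delay(t, v_z))                  # about 160 ps (0.18 V to 1.62 V)
print(rise_transition_delay(t, v_a, v_z))  # 150 ps (0.9 V to 0.9 V)
```

Fall delays would be handled symmetrically by searching for falling crossings; that variant is omitted here for brevity.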

Since $t_r$ and $t_f$ are dominated by the drive strength of the output stage, the values for the scan C-element (a), "scan (after)", reflect the drive strength of the inverter at the output. In contrast, in both the original C-element and the scan C-element (b), "scan (before)", the loop section drives the output, which has less drive strength than the output inverter of "scan (after)". The delay overhead can be evaluated by the rise and fall transition delays ($tda_r$, $tda_f$, $tdb_r$, and $tdb_f$). Note, however, that in "scan (after)" the rise and fall transition delays are swapped by the output inverter. Compared with the original C-element, the delay overhead of "scan (after)" is as follows. The rise transition delay of "a" goes from 101 to 316 ps (+2.13 times), the rise transition delay of "b" from 113 to 308 ps (+1.73 times), the fall transition delay of "a" from 376 to 843 ps (+1.24 times), and the fall transition delay of "b" from 350 to 811 ps (+1.32 times). In the same manner, "scan (before)" has a delay overhead of +2.08 times, +1.80 times, +0.57 times, and +0.60 times, respectively.
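The overhead figures quoted above can be reproduced from Table 3 with a few lines of arithmetic. In this sketch the overhead is taken as (scan delay / original delay) - 1, and for "scan (after)" the rise and fall columns are exchanged before comparison, following the remark that its output inverter reverses them. The dictionary layout and function name are illustrative assumptions.

```python
# Transition delays from Table 3, in picoseconds.
TABLE3 = {
    "original":      {"tda_r": 101, "tdb_r": 113, "tda_f": 376, "tdb_f": 350},
    "scan (after)":  {"tda_r": 843, "tdb_r": 811, "tda_f": 316, "tdb_f": 308},
    "scan (before)": {"tda_r": 311, "tdb_r": 316, "tda_f": 589, "tdb_f": 559},
}

# For "scan (after)" the output inverter exchanges rise and fall.
SWAP = {"tda_r": "tda_f", "tdb_r": "tdb_f", "tda_f": "tda_r", "tdb_f": "tdb_r"}


def delay_overhead(target: str, swap_edges: bool) -> dict:
    """Relative overhead (scan / original - 1) for each transition delay."""
    result = {}
    for key, base in TABLE3["original"].items():
        scan_key = SWAP[key] if swap_edges else key
        result[key] = round(TABLE3[target][scan_key] / base - 1, 2)
    return result


after = delay_overhead("scan (after)", swap_edges=True)
before = delay_overhead("scan (before)", swap_edges=False)
print(after)   # {'tda_r': 2.13, 'tdb_r': 1.73, 'tda_f': 1.24, 'tdb_f': 1.32}
print(before)  # {'tda_r': 2.08, 'tdb_r': 1.8, 'tda_f': 0.57, 'tdb_f': 0.6}
```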



**Fig. 4** Transistor level scan control logic [13]





Fig. 5 Transistor level C-element [14]

#### 3 Proposed Layout Level Scan C-Element

This section describes the results of implementing the transistor level scan C-elements at the layout level with manual placement and routing under the Rohm 0.18 [$\mu$m] process rules. To ensure uniform fan-out capability, channel widths of 5 [$\mu$m] and 2 [$\mu$m] were used for the PMOS and NMOS transistors, respectively. Implementing a layout level circuit requires careful manual placement and routing to ensure that its electrical characteristics are functionally equivalent to those defined at the transistor level. Cadence Virtuoso (IC6.17) Schematic Editor XL, EDA L, and Layout Editor GXL were used for transistor level design, netlist transformation, and layout level design, respectively. Mentor Calibre v2021.2\_18.11 was used to verify the designed layout level circuit (design rule check and layout versus schematic).



Fig. 6 Proposed structures of scan C-element





(a): An inverter is placed after the C-element (b): Inverters are placed before the C-element

#### 3.1 Layout Level C-Element

Figure 7 shows the proposed layout implementation of the original transistor level C-element (corresponding to Fig. 5). Since the fan-out capacity of the "SET\_RESET" module in the C-element must be larger than that of the "LOOP" module, each transistor of the "SET\_RESET" module was implemented as two transistors in parallel. The floor areas of "LOOP", "SET\_RESET", and the total were $4.04 \times 10.66 = 43.28$ [$\mu$m$^2$], $5.28 \times 10.66 = 56.28$ [$\mu$m$^2$], and $10.62 \times 10.66 = 113.21$ [$\mu$m$^2$], respectively.

#### 3.2 Layout Level Scan Control Logic

The layout of the scan control logic was implemented as two major modules, "SC\_INV" and "CTRL", shown in Fig. 8. "SC\_INV" is a simple module consisting of three NOT gates. The PMOS (5 [$\mu$m]) and NMOS (2 [$\mu$m]) transistors used in the transistor level circuits were placed at the top and bottom, respectively. The placement order of the transistors was determined manually to reduce the required area, greedily giving preference to transistors that can share a drain and source. The height of the scan control logic was determined by routing the complex wiring within the width occupied by the PMOS transistors. The floor areas of "SC\_INV", "CTRL", and the total were $8.54 \times 19.36 = 165.33$ [$\mu$m$^2$], $9.36 \times 19.36 = 181.21$ [$\mu$m$^2$], and $27.73 \times 19.36 = 536.85$ [$\mu$m$^2$], respectively. As shown in Fig. 8, the CTRL wiring was very complex, resulting in a height of 19.36 [$\mu$m] from VDD to GND, almost twice the height of the C-element (10.66 [$\mu$m]). Therefore, the area of even a single inverter increases considerably. Conversely, it is more straightforward to meet the design rules for wiring and mask layers.

Table 3 Transistor level evaluation of original and scan C-elements

| target       | original | scan (after) | scan (before) |
|--------------|----------|--------------|---------------|
| #transistors | 8        | 42           | 44            |
| $t_r$ [ps]   | 72       | 42           | 96            |
| $t_f$ [ps]   | 126      | 29           | 155           |
| $tda_r$ [ps] | 101      | 843          | 311           |
| $tdb_r$ [ps] | 113      | 811          | 316           |
| $tda_f$ [ps] | 376      | 316          | 589           |
| $tdb_f$ [ps] | 350      | 308          | 559           |

#### 3.3 Layout Level Area Comparison

Table 4 shows the layout level comparison of the area of the three C-elements (the original, the scan C-element with an inverter inserted after the C-element, and the scan C-element with two inverters inserted before the C-element). The area of "scan (after)" (Fig. 6(a)) was calculated as the product of the height 19.36 [$\mu$m] and the width 33.64 [$\mu$m], which includes the C-element, the scan control logic, an inverter, and wiring ($33.64 \times 19.36 = 651.27$ [$\mu$m$^2$]). The area of "scan (before)" (Fig. 6(b)) was likewise calculated as the product of the height 19.36 [$\mu$m] and the width 49.03 [$\mu$m], which includes the C-element, two inverters, the scan control logic, and wiring ($49.03 \times 19.36 = 949.22$ [$\mu$m$^2$]).
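The area figures can be cross-checked, and the area relative to the original C-element derived, with a short calculation. The sketch below uses the widths and heights that appear in the area products stated in the text; the variable names are illustrative, and the ratios are simple arithmetic on those figures rather than results reported by the paper.

```python
# (width, height) in micrometres, taken from the area products in the text.
DIMENSIONS_UM = {
    "original":      (10.62, 10.66),
    "scan (after)":  (33.64, 19.36),
    "scan (before)": (49.03, 19.36),
}

# Floor area of each element in square micrometres.
areas = {name: width * height for name, (width, height) in DIMENSIONS_UM.items()}

# Area of each element relative to the original C-element.
relative = {name: area / areas["original"] for name, area in areas.items()}

for name in DIMENSIONS_UM:
    print(f"{name:14s} area = {areas[name]:7.2f} um^2, "
          f"{relative[name]:.2f}x original")
```

Under these figures the two scan C-elements occupy about 5.75 and 8.38 times the area of the original C-element, respectively.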

#### 3.4 Chip Layout for Scan C-Element

Figures 9 and 10 show the chip layout and its hierarchical design map. The chip layout contains the original C-element (C\_ELEM), the scan C-element with an inverter inserted after the C-element (SCAN\_C\_AFTER), the B-scan path (B\_SCAN\_PATH), and IO buffers. As an application of the proposed scan C-elements, the B-scan path (Fig. 11) proposed in [11] was constructed from four scan C-elements with two inverters inserted before the C-element (SCAN\_C\_BEFORE). Dummy metal was also placed to satisfy the density rule defined in the design rules.

Fig. 7 Physical layout of the original C-element

Fig. 8 Physical layout of the scan control logic

Design Rule Checks (DRC) and Layout Versus Schematic (LVS) checks were applied to the designed layout level circuits to ensure that no design rule is violated and that each layout level circuit is functionally equivalent to its transistor level circuit. The proposed layout level circuit was converted to the GDSII format, and the taped-out mask pattern was submitted to the fabrication foundry to produce a prototype chip. 2.5 mm square packages with 160 pins were used for the fabrication.

Table 4 Layout level area comparison

|                     | original | scan (after) | scan (before) |
|---------------------|----------|--------------|---------------|
| Area [$\mu$m$^2$]   | 113.21   | 651.27       | 949.22        |



Fig. 9 Physical layout at the chip level

#### 4 Verification and Validation for a Prototype Chip

This section demonstrates the practicality of the proposed scan C-element through verification and validation of the fabricated prototype chip in which it is implemented. The results on the real chip show that the proposed scan C-element can be used as a library to realize a full scan design of asynchronous circuits.

#### 4.1 Experimental Environment

Fig. 10 Hierarchical design map of the chip layout

Fig. 11 B-scan path constructed with four scan C-elements

In the experiments, the prototype chip implementing the proposed layout circuit was mounted onto an FPGA tester (Mitsubishi Electric Micro-Computer Application Software MU300-EM IV with MU300-ADP and DUTUNIV-QFP160) for dynamic verification by simulation. The prototype chip was functionally verified using the logic analyzer function of an oscilloscope (RIGOL MSO5152-E) with sixteen logic analyzer probes (PLA2216), which captured the output responses of the prototype chip to the test patterns supplied by the FPGA tester. The delay of the prototype chip was also measured by connecting two 1:10 350 MHz passive probes (PVP2350) to the oscilloscope. The rated voltage of the prototype chip was 1.8 V, and this voltage was supplied to the IO buffers and CMOS transistors in the prototype chip through the I/O and core voltages of the FPGA tester. The following steps describe the verification and validation procedure for the prototype chip and its experimental environment.

- Step 1. Mount a prototype chip onto the FPGA tester (MU300-EM IV) through the 160-pin socket (DUTUNIV-QFP160) of the package-specific DUT board (MU300-ADP).
- Step 2. Connect probes to the test points on the socket board and observe the input/output waveforms with the oscilloscope (MSO5152-E).
- Step 3. Apply a test sequence (Algorithm 1) to the prototype chip from the prototype chip evaluation application (MU300-EVA IV).

#### Algorithm 1 Verification strategy of the scan C-element

- 1: SC[2:0]=000, AB=$00\rightarrow01\rightarrow11\rightarrow01\rightarrow00$ $\triangleright$ test for input A
- 2: SC[2:0]=000, AB=$00\rightarrow10\rightarrow11\rightarrow10\rightarrow00$ $\triangleright$ test for input B
- 3: SC[2:0]=$100\rightarrow101\rightarrow001\rightarrow101\rightarrow111$ $\triangleright$ 0 hold test with ab=01
- 4: SC[2:0]=$111\rightarrow101\rightarrow001\rightarrow101\rightarrow100$ $\triangleright$ 1 hold test with ab=01
- 5: SC[2:0]=$100\rightarrow110\rightarrow010\rightarrow110\rightarrow111$ $\triangleright$ 0 hold test with ab=10
- 6: SC[2:0]=$111\rightarrow110\rightarrow010\rightarrow110\rightarrow100$ $\triangleright$ 1 hold test with ab=10
- 7: SC[2:0]=011, SI=$0\rightarrow1\rightarrow0$ $\triangleright$ scan-in test

#### 4.2 Unit Test for a Single Scan C-Element

To verify the function of the proposed layout level scan C-elements, a unit test was applied to each scan C-element through the prototype chip. Algorithm 1 shows the verification strategy for the functions of the proposed scan C-element (Table 2). The first line applies 000 to SC[2:0] ("Functional operation" in Table 2) and verifies the original functions of the C-element (Table 1): 0 reset (ab=00), 0 hold and 1 hold with ab=01, and 1 set (ab=11). The second line verifies the same functions with ab=10 as the hold input. Together, the test sequences of lines 1 and 2 verify the SC[2:0]=000 operation of the scan C-element (all functions of the original C-element). Lines 3, 4, 5, and 6 verify the hold function by applying 01 or 10 to the input ab of the C-element. The third line first applies 100 to SC[2:0] ("Scan operation (set)" in Table 2), so that 00 is applied to the input ab of the C-element and its internal state is set to 0. Next, applying SC[2:0]=101 ("Scan operation (hold)" in Table 2) applies 01 to the input ab of the C-element to hold the internal state. The test then verifies that the internal state 0 is held under both codes that apply 01 to the input ab of the C-element (SC[2:0]=101 and 001), while keeping the Hamming distance between consecutive SC values at 1 to avoid a race. Finally, applying SC[2:0]=111 ("Scan operation (set)" in Table 2) verifies that the internal state can be set to 1. Similarly, lines 4, 5, and 6 verify that it is possible to hold 1 with ab=01, 0 with ab=10, and 1 with ab=10, respectively. The seventh line applies the test sequence $0 \rightarrow 1 \rightarrow 0$ to SI with SC[2:0]=011 ("Scan operation (load)" in Table 2, i.e., the scan-in function), so that the sequence $00 \rightarrow 11 \rightarrow 00$ is applied to the input ab of the C-element, and verifies that the same sequence as the scan input (SI) is observed at the output (Z, SO). Thus, all functions of the scan C-element in Table 2 are verified: SC[2:0]=000 in lines 1 and 2, 001 in lines 3 and 4, 010 in lines 5 and 6, 011 in line 7, 100 in lines 4 and 6, 101 in lines 3 and 4, 110 in lines 5 and 6, and 111 in lines 3 and 5.
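The expected responses to Algorithm 1 can be derived from a behavioural model that combines Table 1 and Table 2. The sketch below is such a model, written for illustration only: it abstracts away the inverter placement of Fig. 6 and all timing, the helper names (`scl`, `c_element`, `run`) are assumptions, and the traces it prints are what the model predicts, not the waveforms measured on the chip.

```python
def scl(sc, A, B, SI):
    """Scan control logic of Table 2: SC[2:0] -> C-element inputs (a, b)."""
    return {"000": (A, B), "001": (0, 1), "010": (1, 0), "011": (SI, SI),
            "100": (0, 0), "101": (0, 1), "110": (1, 0), "111": (1, 1)}[sc]


def c_element(a, b, z):
    """C-element of Table 1: follow the inputs when they agree, else hold."""
    return a if a == b else z


def run(steps, z=0):
    """Apply a list of (SC, A, B, SI) steps and return Z after each step."""
    trace = []
    for sc, A, B, SI in steps:
        z = c_element(*scl(sc, A, B, SI), z)
        trace.append(z)
    return trace


def sc_only(codes):
    """Steps that only change SC; A, B, and SI are irrelevant for set/hold."""
    return [(sc, 0, 0, 0) for sc in codes]


ALGORITHM_1 = {
    1: [("000", a, b, 0) for a, b in [(0, 0), (0, 1), (1, 1), (0, 1), (0, 0)]],
    2: [("000", a, b, 0) for a, b in [(0, 0), (1, 0), (1, 1), (1, 0), (0, 0)]],
    3: sc_only(["100", "101", "001", "101", "111"]),
    4: sc_only(["111", "101", "001", "101", "100"]),
    5: sc_only(["100", "110", "010", "110", "111"]),
    6: sc_only(["111", "110", "010", "110", "100"]),
    7: [("011", 0, 0, si) for si in (0, 1, 0)],
}

traces = {line: run(steps) for line, steps in ALGORITHM_1.items()}
for line, trace in traces.items():
    print(line, trace)

# In lines 3 to 6, consecutive SC codes differ in exactly one bit.
single_bit_steps = all(
    sum(p != q for p, q in zip(x[0], y[0])) == 1
    for line in (3, 4, 5, 6)
    for x, y in zip(ALGORITHM_1[line], ALGORITHM_1[line][1:])
)
print(single_bit_steps)
```

In this model, lines 3 and 5 keep Z at 0 through the three hold steps before setting it to 1, lines 4 and 6 keep Z at 1 before resetting it to 0, and line 7 reproduces the SI sequence at Z.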

Figures 12 and 13 show the waveforms displayed by the logic analyzer function of the oscilloscope during an experiment in which the test sequence implementing Algorithm 1 was applied to the scan C-element with an inverter inserted after the C-element (SCAN\_C\_AFTER) and the scan C-element with two inverters inserted before the C-element (SCAN\_C\_BEFORE), respectively. The experimental results verify that the proposed scan C-elements function in accordance with the specifications defined in Table 2.

**Fig. 12** Waveform of unit test for scan C-element (after)

#### 4.3 Unit Test for Scan Shift Function

In the experiments, the scan shift function was verified with a B-scan path consisting of four scan C-elements. The B-scan path is connected in the order scan-in, C1, C2, C3, C4, and scan-out. If a rising transition is applied to SI with all SCs set to "Scan operation (load)", the outputs (Z1–Z4) of the scan C-elements (C1–C4) transition from 0 to 1. Figure 14 shows the waveforms during the scan shift of the B-scan path. "F", "H", and "L" in the SC waveforms in the figure (orange waveform: L1SC, purple waveform: L2SC) denote the Functional, Hold, and Load operations, respectively. Z[0,2] and Z[1,3] are controlled by L1 and L2, respectively. At t=0, the control signals L1SC and L2SC select the functional operation, which applies the functional inputs A[3:0] = B[3:0] = 0000 to all C-elements, so that the outputs of all C-elements are set to Z[3:0] = 0000. At t=1, both L1 and L2 hold the internal value (0) of the C-elements with the hold operation. At t=2 (L1 load), the SI value 0 is scan-shifted to Z[0] (a load is indicated with a transparent red block), and Z[2] is loaded with the value of Z[1] (shifts of values are indicated with white arrows). Z[0] loads SI=1 at t=6, Z[1] loads Z[0] at t=8, Z[2] loads Z[1] at t=10, and Z[3] loads Z[2] at t=12. Likewise, Z[0] loads SI=0 at t=10, Z[1] loads Z[0] at t=12, Z[2] loads Z[1] at t=14, and Z[3] loads Z[2] at t=16. The result shows that each scan C-element constituting the B-scan path can iteratively load and hold its internal value to achieve a scan shift. In other words, the test pattern applied from SI is shifted to C1 (Z[0]), C2 (Z[1]), C3 (Z[2]), and C4 (Z[3]), and the values captured by C2 (Z[1]) and C3 (Z[2]) are also observed from SO (Z[3]), as shown in Fig. 14. Since the proposed scan C-element was tested at the unit level in Section 4.2, each scan C-element has been completely verified with Algorithm 1. Consequently, the scan shift operation was applied as an integration test using four scan C-elements, and the operation was confirmed to meet the specification.

Fig. 13 Waveform of unit test for scan C-element (before)

**Fig. 14** Scan shift waveform in B-scan path
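The alternating load and hold pattern of the scan shift described above can be sketched with a small step-by-step model. Several points are assumptions made for this illustration rather than details given in the paper: time is treated as integer steps, L1 is assumed to load at t = 2, 6, 10, 14 and L2 at t = 4, 8, 12, 16 (a schedule inferred from the load times quoted in the text), both groups hold at all other steps, SI is assumed to be 0 at t = 14, and a load is modelled simply as copying the value of the preceding element.

```python
def scan_shift(si_at_load, last_step=16):
    """Simulate the four-element B-scan path from t=2 to `last_step`.

    si_at_load: dict mapping each L1 load time to the SI value applied then.
    Returns a dict mapping time -> (Z[0], Z[1], Z[2], Z[3]).
    """
    z = [0, 0, 0, 0]  # Z[3:0] = 0000 after the functional operation at t=0
    history = {}
    for t in range(2, last_step + 1):
        prev = list(z)
        if t % 4 == 2:      # L1 load: C1 takes SI, C3 takes Z[1]
            z[0] = si_at_load[t]
            z[2] = prev[1]
        elif t % 4 == 0:    # L2 load: C2 takes Z[0], C4 takes Z[2]
            z[1] = prev[0]
            z[3] = prev[2]
        # At all other steps both groups hold their values.
        history[t] = tuple(z)
    return history


history = scan_shift({2: 0, 6: 1, 10: 0, 14: 0})
for t in (6, 8, 10, 12, 14, 16):
    print(t, history[t])
```

Under these assumptions the 1 applied at t=6 appears at Z[0], Z[1], Z[2], and Z[3] at t = 6, 8, 10, and 12, and the following 0 appears at t = 10, 12, 14, and 16, matching the load times listed in the text. Because neighbouring elements never load at the same step, each value is held by its source while the next element captures it.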

#### 4.4 Delay Overhead of the Original C-Element and the Scan C-Elements

To compare the functional delay overhead of the original C-element and the scan C-elements on the prototype chip, rising and falling transitions were applied to the functional input a (A) with b (B) = 0 and 1, and the time until output Z rises or falls was measured. The rise transition delay was measured from the time input a (A) exceeds 0.9 V to the time output Z exceeds 0.9 V, and the fall transition delay from the time input a (A) falls below 0.9 V to the time output Z falls below 0.9 V. Figure 15 shows the rising and falling transition waveforms of the original C-element and the scan C-elements measured with the oscilloscope. Since a 1:10 probe was used to measure the high-speed waveforms, a displayed value of 50 mV per line corresponds to an actual $50 \times 10 = 500$ mV. Waveforms 1 (yellow line) and 2 (blue line) show "Input a(A)" and "Output Z", respectively. With one row corresponding to 10 ns, the rise transition delays of the original C-element, the scan C-element with an inverter inserted after the C-element, and the scan C-element with two inverters inserted before the C-element were $tda_r = 6.4$ ns, $5.8$ ns, and $8.5$ ns, and the fall transition delays were $tda_f = 6.5$ ns, $6.2$ ns, and $7.8$ ns, respectively. Compared with the transition delays at the transistor level (Table 3), these values are 7 to 63 times larger, because measurements at the chip level are dominated by the delays of wiring, I/O pads, and external pins.

Fig. 15 Delay comparison with the original C-element and scan C-elements
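The 7 to 63-fold range can be reproduced by dividing each chip level delay by the transistor level delay of the same name in Table 3. The sketch below assumes this same-name pairing (for example, the chip level $tda_r$ of "scan (after)" against its Table 3 $tda_r$), which is the pairing that yields the quoted range; the data layout is illustrative.

```python
# Transistor level delays from Table 3 [ps].
TRANSISTOR_PS = {
    "original":      {"tda_r": 101, "tda_f": 376},
    "scan (after)":  {"tda_r": 843, "tda_f": 316},
    "scan (before)": {"tda_r": 311, "tda_f": 589},
}

# Chip level delays read from the oscilloscope [ns].
CHIP_NS = {
    "original":      {"tda_r": 6.4, "tda_f": 6.5},
    "scan (after)":  {"tda_r": 5.8, "tda_f": 6.2},
    "scan (before)": {"tda_r": 8.5, "tda_f": 7.8},
}

# Ratio of chip level to transistor level delay (1 ns = 1000 ps).
ratios = {
    (target, key): CHIP_NS[target][key] * 1000 / TRANSISTOR_PS[target][key]
    for target in CHIP_NS
    for key in CHIP_NS[target]
}

for (target, key), ratio in ratios.items():
    print(f"{target:14s} {key}: {ratio:5.1f}x")

print(round(min(ratios.values())), round(max(ratios.values())))  # 7 63
```

The smallest ratio comes from the $tda_r$ of "scan (after)" and the largest from the $tda_r$ of the original C-element.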

#### 5 Conclusion

It is crucial to establish a full scan design approach for asynchronous circuits because high-reliability asynchronous circuits necessitate a design for testability method equivalent to synchronous full scan design. This paper focused on the layout and chip level design as the final step of verification and validation to realize bipartite full scan testability [11], which guarantees 100% fault efficiency for asynchronous circuits. Two patterns of scan C-elements were designed using a Rohm 0.18 [µm] process at the layout level, and the delay overhead was evaluated quantitatively through transient analysis. A prototype chip was fabricated from the taped-out mask patterns, its operation was verified at the chip level, and the delay overhead was also analyzed. The experimental results demonstrate that the proposed scan C-elements operate as specified in real chips, filling the final gap in the highly reliable asynchronous circuit design methodology. In other words, since a full-custom asynchronous circuit design using the proposed scan C-elements as a library has been realized, it is possible to realize practical asynchronous circuits with high reliability.

This paper evaluated the practicality of the transistor level scan C-element [12, 13] proposed by our research group from three new perspectives: transistor level delay overhead, layout level area, and real chip operation verification and validation. These previously unexamined aspects are significant contributions to both industry and academia because they clarify the practicality of the device, which was previously unclear. In [12] and [13], a transistor level scan C-element was proposed, and in [12], a scan path construction method and a test method were also proposed. However, the evaluation was limited to a comparison of the number of transistors as an evaluation of area and the time required for the scan path test. Therefore, even if asynchronous circuit designers wished to conduct manufacturing tests using the scan C-element proposed in [12] and [13], they would have been required to design new layout level scan C-elements through their own research and development. Consequently, the results of this paper will enable asynchronous circuit designers to handle scan C-elements in the same way as synchronous circuit designers use scan FFs, because scan C-elements that have been tested on real chips and can be used safely are available as a library. This will overcome the most significant obstacle in asynchronous circuit design, the manufacturing test, and will represent a significant advancement in the proliferation of asynchronous circuits.

Future work includes the realization of a practical and testable asynchronous circuit using the proposed scan C-element libraries. Furthermore, as this paper focused on designing scan C-elements that meet the specified electrical characteristics (including functional equivalence, foundry layout rules, and crosstalk effects), there is potential for area reduction through the optimization of the placement and wiring. Other future work includes the implementation of the proposed scan C-elements as standard cells and efficient scan design methods for sequential elements other than C-elements.

**Acknowledgements** This work was supported by JSPS KAKENHI Grant Number 21K11820. The VLSI chip in this study has been fabricated through the activities of VDEC, the University of Tokyo, in collaboration with Rohm Corporation and Toppan Printing Corporation.

**Data Availability** Data not available due to NDA restrictions.

#### **Declarations**

**Conflicts of Interest** The authors have no conflicts of interest to disclose.

# References

1. Hua W, Lu Y-S, Pingali K, Manohar R (2020) Cyclone: A Static Timing and Power Engine for Asynchronous Circuits. In Proceedings of 26th IEEE International Symposium on Asynchronous Circuits and Systems, pp 11–19
2. Ataei S, Manohar R (2020) Shared-Staticizer for Area-Efficient Asynchronous Circuits. In Proceedings of 26th IEEE International Symposium on Asynchronous Circuits and Systems, pp 94–101
3. Sparso J, Furber S (2002) Principles of asynchronous circuit design: A Systems Perspective. Kluwer Academic Publishers
4. Cortadella J, Kishinevsky M, Kondratyev A, Lavagno L, Yakovlev A (1997) Petrify: A tool for manipulating concurrent specifications and synthesis of asynchronous controllers. IEICE Trans Inform Syst E80-D(3):315–325
5. Hulgaard H, Burns SM, Borriello G (1995) Testing asynchronous circuits: A survey. Integr VLSI J 19(3):111–131
6. Zeidler S, Krstic M (2015) A survey about testing asynchronous circuits. In Proceedings of European Conference on Circuit Theory and Design (ECCTD), pp 1–4
7. Efthymiou A, Bainbridge J, Edwards D (2005) Test pattern generation and partial-scan methodology for an asynchronous SoC interconnect. IEEE Trans Very Large Scale Integr (VLSI) Syst 13(12):1384–1393
8. Ohtake S, Saluja KK (2008) A systematic scan insertion technique for asynchronous on-chip interconnects. Digest of papers of Workshop on Low Power Design Impact on Test and Reliability, pp 12–13
9. te Beest F, Peeters A, Van Berkel K, Kerkhoff H (2003) Synchronous Full-Scan for Asynchronous Handshake Circuits. J Electron Test 19(4):397–406
10. te Beest F, Peeters A (2005) A multiplexer based test method for self-timed circuits. In Proceedings of the 11th IEEE International Symposium on Asynchronous Circuits and Systems, pp 166–175
11. Iwata H, Ohtake S, Inoue M, Fujiwara H (2010) Bipartite Full Scan Design: A DFT Method for Asynchronous Circuits. In Proceedings of IEEE 19th Asian Test Symposium (ATS'10), pp 206–211
12. Ishizaka M, Yamaguchi K, Iwata H (2019) Race-Free O(n) Scan Path Self-Test for Asynchronous Circuit. IEICE Trans J102-A(6):172–181. In Japanese
13. Shintani Y, Yamaguchi K, Iwata H (2020) An Implementation of Functional Speed Oriented Transistor-Level Scan C-element. In Proceedings of Workshop on RTL and High Level Testing 2021(TS3-2):1–5
14. Martin AJ (1989) Formal Program Transformations for VLSI Circuit Synthesis. In Formal development programs and proofs. Addison-Wesley Longman Publishing, pp 59–80
15. Sutherland IE (1989) Micropipelines. Commun ACM 32:720–738
16. van Berkel K (1992) Beware the isochronic fork. Integr VLSI J 13(2):103–128

**Publisher's Note** Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

**Hiroshi Iwata** received his B.E. degree in information engineering from National Institute of Technology (KOSEN), Nara College, Japan, in 2007 and the M.E. and Ph.D. degrees in information science from Nara Institute of Science and Technology, Nara, Japan, in 2008 and 2011, respectively. His research interests include VLSI CAD, IoT systems, design for testability, and asynchronous circuit testing.

**Kokoro Yamasaki** received his B.E. degree in information engineering from National Institute of Technology (KOSEN), Nara College, Japan, in 2023. His research interests include VLSI testing, dependable systems, and distributed systems.

**Ken'ichi Yamaguchi** received his B.E. degree in information engineering from National Institute of Technology (KOSEN), Nara College, Japan, in 1999 and the M.E. and Ph.D. degrees in information science from Nara Institute of Science and Technology, Nara, Japan, in 2001 and 2003, respectively. His research interests include VLSI CAD, design for testability, and high-level synthesis.

