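# National Chung Cheng University, Graduate Institute of Computer Science and Information Engineering, Master's Thesis

A Wide-Range All-Digital Delay-Locked Loop in 65nm CMOS technology

Student: Chia-Lin Chang (張嘉麟) Advisor: Dr. Ching-Che Chung (鍾菁哲)

July 2010 (Republic of China year 99)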

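National Chung Cheng University Master's Program

Consent Form for the Degree Examination

The thesis

A Wide-Range All-Digital Delay-Locked Loop in 65nm CMOS technology (適用寬頻操作之全數位延遲鎖相迴路)

submitted by Chia-Lin Chang (張嘉麟), a graduate student of the Department of Computer Science and Information Engineering under my supervision, is approved for submission to the master's degree thesis examination.

Advisor: (signature) Date: May 25, Republic of China year 99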

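National Chung Cheng University Master's Thesis Examination Approval

#### Department of Computer Science and Information Engineering

The thesis

A Wide-Range All-Digital Delay-Locked Loop in 65nm CMOS technology (適用寬頻操作之全數位延遲鎖相迴路)

submitted by graduate student Chia-Lin Chang (張嘉麟) has been reviewed by this committee and meets the standard for a master's degree thesis.

| Role                                         | Signature   |
|----------------------------------------------|-------------|
| Convener of the degree examination committee | (signature) |
| Committee member                             | (signature) |
| Advisor                                      | (signature) |

Date: July, Republic of China year 99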

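### Authorization for Online Release of the Electronic Thesis File

(To be bound by the authorizer on the page following the title page of the printed thesis.) The thesis covered by this authorization is the one for which the authorizer obtained a master's degree in the second semester of academic year 98 at the Department of Computer Science and Information Engineering (College of Engineering), National Chung Cheng University.

Thesis title: A Wide-Range All-Digital Delay-Locked Loop in 65nm CMOS technology Advisor: Dr. Ching-Che Chung (鍾菁哲)

The authorizer hereby grants the National Central Library and the library of the authorizer's university a non-exclusive, royalty-free license, unrestricted in territory, time, and number of uses, to reproduce the full text of the above thesis (including the abstract), in which the authorizer holds the copyright, by microfilm, optical disc, or any other digitization method, and to upload the digitized thesis and its electronic file to the network so that readers may search, read, download, or print it online for personal, non-profit purposes.

• Readers who search, read, download, or print the above thesis online for non-profit purposes shall comply with the relevant provisions of the Copyright Act.

Authorizer: Chia-Lin Chang (張嘉麟)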

Date: Republic of China year 19, month 1, day 27

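#### A Wide-Range All-Digital Delay-Locked Loop in 65nm CMOS technology

Student: Chia-Lin Chang (張嘉麟)

Advisor: Dr. Ching-Che Chung (鍾菁哲)

Graduate Institute of Computer Science and Information Engineering, National Chung Cheng University

#### Abstract (Chinese version)

This thesis proposes an all-digital delay-locked loop design. A binary search algorithm shortens the time required to lock. In addition, the proposed leakage delay cell can easily produce a very large signal propagation delay, which eases the design of the high-speed digital counter in the cycle-controlled delay unit for very-low-frequency operation. The cycle-controlled delay unit reuses its delay cells to widen the operating frequency range instead of cascading a large number of delay cells, which reduces the chip area.

A 600 kHz to 1.2 GHz all-digital delay-locked loop is implemented in UMC 65nm CMOS technology. Its maximum total power consumption is 2.6 mW, at 1.2 GHz. At an operating frequency of 1.2 GHz, the measured rms jitter is 3.38 ps and the peak-to-peak jitter is 39.29 ps.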

### A Wide-Range All-Digital Delay-Locked Loop in 65nm CMOS technology

Student: Chia-Lin Chang

Advisor: Dr. Ching-Che Chung

Department of Computer Science & Information Engineering National Chung Cheng University

### Abstract

A wide-range all-digital delay-locked loop (DLL) is proposed in this thesis. A binary search scheme reduces the locking time. In addition, the proposed leakage delay unit (LDU) can easily generate a large delay, which eases the design of the high-speed digital counter in the cycle-controlled delay unit (CCDU) for very-low-frequency operation. The CCDU reuses its delay units to enlarge the operating frequency range rather than cascading a large number of delay units, so the chip area is also reduced.

A 600 kHz to 1.2 GHz all-digital delay-locked loop has been fabricated in UMC 65nm CMOS technology. The proposed DLL consumes a maximum power of 2.6 mW at 1.2 GHz. At an operating frequency of 1.2 GHz, the measured rms jitter and peak-to-peak jitter are 3.38 ps and 39.29 ps, respectively.

# Acknowledgements

I would like to express my deepest gratitude to my advisor, Dr. Ching-Che Chung, for his enthusiastic guidance and encouragement in overcoming many difficulties throughout this research, and I wholeheartedly offer him and his family my best wishes.

I would like to thank my S3Lab mates Chiun-Yao Ko and Cheng-Ruei Yang for their great help during my research. I would also like to thank all the S3Lab members at CCU for their generous assistance during my graduate life.

Finally, I give my greatest respect and love to my family and my girlfriend, Joyce Chang, and I want to express my appreciation for their support and understanding. I would like to dedicate this thesis to my family. I hope never to let them down, and I wish them happiness and health forever.

### Contents

- Chapter 1 Introduction
  - 1.1 Background
  - 1.2 Motivation
  - 1.3 Thesis Overview
- Chapter 2 An Overview of Delay-Locked Loop
  - 2.1 Basic Concepts of Delay-Locked Loop
  - 2.2 Classification of Delay-Locked Loop
  - 2.3 Design of Analog Delay-Locked Loop
    - 2.3.1 Phase Detector (PD)
    - 2.3.2 Charge Pump (CP) / Loop Filter (LF)
    - 2.3.3 Voltage-Controlled Delay Line
  - 2.4 Design of Wide-Range Digital Delay-Locked Loop
    - 2.4.1 Register-controlled DLL
    - 2.4.2 Counter-controlled DLL
    - 2.4.3 SAR-controlled DLL
    - 2.4.4 TDC-based DLL
  - 2.5 Design of Mixed-Mode Delay-Locked Loop
- Chapter 3 Wide-Range All-Digital Delay-Locked Loop
  - 3.1 Introduction
  - 3.2 Design Trade-Off in Different DLL Architectures
  - 3.3 The proposed Wide-Range ADDLL Architecture
  - 3.4 Digital Controlled Delay Line
    - 3.4.1 Leakage Delay Unit
    - 3.4.2 Cycle-Controlled Delay Unit
    - 3.4.3 Coarse Delay Unit
    - 3.4.4 Fine Delay Unit
  - 3.5 Control Scheme
- Chapter 4 Chip Implementation
  - 4.1 Design Implementation
  - 4.2 Design Parameters
  - 4.3 Simulation Results
  - 4.4 Measurement Results
- Chapter 5 Conclusion and Future Works

# **List of Figures**

- Fig. 2.1 The general DLL architecture
- Fig. 2.2 Block diagram of analog DLL
- Fig. 2.3 Three-state phase detector
- Fig. 2.4 The characteristic of phase detector
- Fig. 2.5 A simplified diagram of charge pump and loop filter
- Fig. 2.6 RC-time-constant controlled delay line
- Fig. 2.7 Current-starved controlled delay line
- Fig. 2.8 The architecture of register-controlled DLL
- Fig. 2.9 The architecture of counter-controlled DLL
- Fig. 2.10 Successive approximation register (SAR) DLL
- Fig. 2.11 Flowchart of 3-bit binary search algorithm
- Fig. 2.12 TDC-based DLL
- Fig. 2.13 Block diagram of mixed-mode DLL
- Fig. 3.1 The block diagram of the proposed DLL
- Fig. 3.2 The proposed leakage delay unit
- Fig. 3.3 Schematic of the proposed leakage delay cell
- Fig. 3.4 The related timing diagram of the proposed leakage delay cell
- Fig. 3.5 The proposed cycle-controlled delay unit (CCDU)
- Fig. 3.6 Block diagram of the cycle-controlled delay cell (CCDC)
- Fig. 3.7 Schematic of the cycle-controlled delay cell
- Fig. 3.8 Timing diagram of the CCDC
- Fig. 3.9 Timing diagram of the CCDU
- Fig. 3.10 Schematic of the coarse delay unit (CDU)
- Fig. 3.11 The fine delay unit (FDU)
- Fig. 3.12 The binary search DLL controller
- Fig. 3.13 Timing diagram of the proposed DLL
- Fig. 4.1 Chip floorplan and I/O plan
- Fig. 4.2 The layout view of DLL
- Fig. 4.3 The delay line of the proposed DLL
- Fig. 4.4 The proposed leakage delay cell simulation in fast mode
- Fig. 4.5 The proposed leakage delay cell simulation in slow mode
- Fig. 4.6 Simulation of cycle-controlled delay unit (CCDU)
- Fig. 4.7 The overall DLL simulation
- Fig. 4.8 The microphotograph of the DLL
- Fig. 4.9 DLL is locked at 600 kHz
- Fig. 4.10 DLL is locked at 1.2 GHz
- Fig. 4.11 Measured long-term jitter histogram (at 1.2 GHz)
- Fig. 4.12 Measured figures-of-merit for long-term jitter histogram

# **List of Tables**

- Table 4.1 I/O pad description
- Table 4.2 Performance summary of the proposed DLL
- Table 4.3 Comparisons of recent wide-range DLLs



# **Chapter 1**

### Introduction

#### 1.1 Background

Semiconductor manufacturing processes continue to evolve quickly, and VLSI systems evolve with them. As high-speed circuits develop, both chip complexity and clock frequency keep increasing, so the quality of the clock signal delivered to on-chip modules matters more and more. Eliminating clock skew in high-speed, high-performance VLSI and System-on-Chip (SoC) designs has therefore become a very important issue.

Most sequential systems use a clock signal to synchronize the entire system. However, many circuits need an external oscillator because of design considerations, process factors, or other reasons, such as a chip that must work together with other chips on a synchronized clock. Transmitting the signal from an off-chip clock source raises several problems. Wire mismatch prevents the rising clock edge from arriving at all modules at the same time. This difference in the arrival time of the clock signal at different components is called clock skew. Its causes include:

- PVT variations
- different interconnection wire length
- material imperfections
- differences in the input capacitance of the clock inputs of devices

As the complexity of VLSI systems increases rapidly, clock skew also appears within a single chip. Clock tree synthesis is needed to distribute the clock over nets with a large clock load [35]. Minimizing the clock skew among all modules is therefore very important, because clock skew causes setup-time and hold-time violations that can corrupt the latched data.

Both Phase-Locked Loops (PLLs) [1]-[3] and Delay-Locked Loops (DLLs) [5]-[11] can be used to solve clock skew problems in microprocessors and high-speed I/O interfaces. However, a PLL accumulates phase error (clock jitter), which makes its jitter performance worse than that of a DLL. If frequency multiplication is not required, DLLs are preferred for their unconditional stability, faster locking time, and better jitter performance.

#### 1.2 Motivation

Delay-Locked Loops (DLLs) are widely used in high-speed microprocessors and memory interfaces to eliminate clock skew. To meet the specifications of different applications, DLLs should cover a wide frequency range, especially in low-power system-on-a-chip (SoC) designs with dynamic voltage and frequency scaling (DVFS) [4]. Traditionally, DLLs are designed with a charge pump-based architecture [5],[13]-[15]. However, charge pump-based DLLs suffer from a serious leakage current problem [12] in a 65nm CMOS process, and their jitter performance becomes unacceptable. As a result, a low-leakage CMOS process is often needed to implement charge pump-based DLLs at 65nm, but a low-leakage process also degrades circuit performance. All-digital DLLs [6]-[11], which use a robust digital control code to control the digital controlled delay line (DCDL), avoid the leakage current problem and are becoming increasingly popular.

The low supply voltage of a 65nm CMOS process also makes it difficult to design a wide-range delay line. The analog DLL in [13] therefore uses a multi-band voltage-controlled delay line (VCDL) to cover wide-range operation. However, this analog DLL needs extra I/O pins to specify the desired frequency band and the required ratio of charge-pump current. The mixed-mode DLL in [15] proposes a two-stage delay line to achieve wide-range operation. Its coarse-tuning delay stage, which uses a path selector with delay cells, provides a large delay for wide-range operation, and a voltage-controlled delay line added after the coarse-tuning stage provides high resolution. However, because the coarse-tuning delay line uses many delay cells, its area and power consumption are very large.

The digital DLL in [11] proposes a cycle-controlled delay line architecture to save area when designing a wide-range DLL. The cycle-controlled delay line uses a ring oscillator architecture to generate a large delay. However, the next-stage coarse-tuning delay unit must have a controllable delay range larger than the delay step of the preceding cycle-controlled delay line unit. The ring oscillator in the cycle-controlled delay line must therefore oscillate at a high frequency, and a counter in the cycle-controlled delay line counts the oscillations of the oscillator's output. As a result, the high-speed counter in the cycle-controlled delay line is very difficult to design, especially when the number of bits increases for wide-range operation. If an ultra-wide operating range is required, it is therefore difficult for the cycle-controlled delay line architecture to provide the delay required for low-frequency operation.

To overcome these problems in a 65nm CMOS process, a novel delay cell that uses the transistor leakage current to generate an extremely large delay is presented. The proposed delay cell reduces the number of bits required in the cycle-controlled delay line stage, which makes it possible to build a DLL with an ultra-wide operating range, from 600 kHz to 1.2 GHz, with low power consumption and small chip area. As a result, the proposed DLL is well suited to wide-range clock deskew applications in the SoC era.

#### 1.3 Thesis Overview

This thesis is organized as follows. Chapter 2 describes the characteristics of the delay-locked loop, surveys recent wide-range DLLs, and compares them. Chapter 3 proposes a wide-range all-digital DLL and discusses its detailed circuit implementation. Chapter 4 presents the experimental results and performance comparisons. Chapter 5 gives a brief conclusion and discusses future work on improving the performance of the delay-locked loop.

# **Chapter 2**

### An Overview of Delay-Locked Loop

#### 2.1 Basic Concepts of Delay-Locked Loop



Fig. 2.1 The general DLL architecture.

The general DLL architecture is shown in Fig. 2.1. A DLL consists of a phase detector (PD), a variable delay line, and a DLL controller that generates the analog or digital control signal for the delay line from the PD's output. The DLL selects an optimal delay ($T_d$) to compensate for the phase error between the reference clock and the remote clock. After the DLL is locked, the remote clock is synchronized with the reference clock, so the clock buffer propagation delay ($T_c$) can be ignored. The lock condition is given by the following equation:

$$K \cdot T_{ref} = T_d + T_c \tag{2.1}$$

where *K* is a positive integer and $T_{ref}$ is the clock period of the reference clock. $T_d$ and $T_c$ denote the delays of the delay line and the clock buffer, respectively. When the DLL is locked, there is no phase error between the reference clock and the remote clock.
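
As a worked illustration of Eq. (2.1), the sketch below solves the lock condition for the delay that the delay line must provide. It is a minimal example under stated assumptions, not part of the thesis design: the function name `required_line_delay`, the minimum delay-line delay `t_d_min_ps`, and all numeric values are hypothetical, and times are integers in picoseconds.

```python
def required_line_delay(t_ref_ps, t_c_ps, t_d_min_ps=0):
    """Solve K * T_ref = T_d + T_c for the smallest usable positive integer K.

    t_d_min_ps is an assumed lower bound on the delay-line delay (hypothetical).
    Returns (K, T_d) with all times in integer picoseconds.
    """
    if t_ref_ps <= 0 or t_c_ps < 0 or t_d_min_ps < 0:
        raise ValueError("t_ref_ps must be positive; delays must be non-negative")
    # Ceiling division: smallest K with K * T_ref >= T_c + T_d_min, and K >= 1.
    k = max(1, -(-(t_c_ps + t_d_min_ps) // t_ref_ps))
    return k, k * t_ref_ps - t_c_ps


# Hypothetical numbers: 1000 ps reference period, 2300 ps clock-buffer delay,
# and a delay line that cannot go below 100 ps.
k, t_d = required_line_delay(1000, 2300, 100)
print(f"K = {k}, T_d = {t_d} ps")
```

With these placeholder numbers the buffer delay alone exceeds two reference periods, so the loop can only lock with K = 3, which leaves T_d = 700 ps for the delay line.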

#### 2.2 Classification of Delay-Locked Loop

Several types of DLL have been proposed over the years: analog DLLs, digital DLLs, and mixed-mode DLLs. Analog DLLs [16]-[19] generally have better jitter and deskew performance, but they are process dependent, take a long time to lock, are sensitive to PVT variations, and need a longer design time. Digital DLLs [6],[20]-[21] are robust against PVT variations and lock quickly, but their achievable skew error is limited and their jitter performance is poor. Mixed-mode DLLs [22]-[25] combine the advantages of analog and digital DLLs: they lock quickly and have good jitter performance. However, it is hard to integrate digital and analog blocks together, and the design complexity is higher than that of a digital DLL. Each DLL architecture has its own advantages and disadvantages, and the trade-offs must be weighed to meet different specifications. The following sections introduce these three types: analog DLLs, digital DLLs, and mixed-mode DLLs.



#### 2.3 Design of Analog Delay-Locked Loop

Fig. 2.2 Block diagram of analog DLL.

Fig. 2.2 illustrates the block diagram of an analog DLL. It consists of a phase detector (PD), a charge pump (CP), a first-order loop filter (LF) capacitor, and a voltage-controlled delay line (VCDL). The reference clock signal propagates through the VCDL, which consists of cascaded variable delay stages. The output of the VCDL is fed back to the phase detector, which compares the phases of the reference clock and the feedback clock. If a phase error is detected, the DLL adjusts the phase by changing the control voltage of the delay cells. The charge pump integrates the phase detector output signal, and the loop filter generates a stable control voltage ($V_{ctrl}$) that adjusts the delay of the delay line.

#### 2.3.1 Phase Detector (PD)

The phase detector (PD) generates an output signal proportional to the phase difference between the reference clock and the feedback clock. Fig. 2.3 shows the three-state phase detector circuit used in [26],[27], and Fig. 2.4 shows its waveforms under several conditions. When the reference clock leads the feedback clock, the PD generates an UP pulse while the DOWN signal stays low. Conversely, if the reference clock lags the feedback clock, a DOWN pulse is generated. The average value of UP minus DOWN indicates the phase difference between the reference and feedback clocks. The PD output controls the charge pump, which generates the control voltage for the delay line.



Fig. 2.3 Three-state phase detector.

Fig. 2.4 The characteristic of phase detector.



#### 2.3.2 Charge Pump (CP) / Loop Filter (LF)

A simple model of the charge pump and loop filter is shown in Fig. 2.5. The charge pump consists of two current sources and two switches. It converts the two digital output signals of the PD, UP and DOWN, into charge flow, and the amount of charge is proportional to the phase error. When the UP signal is high, it turns on the upper switch and charges the output node $V_{ctrl}$. Conversely, when the DOWN signal is high, it turns on the lower switch and discharges the output node $V_{ctrl}$. Finally, if both UP and DOWN are low, the output node $V_{ctrl}$ is in a high-impedance state and its voltage does not change.



Fig. 2.5 A simplified diagram of charge pump and loop filter.
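
The three switch states described above can be sketched as a small behavioral model. This is an idealized illustration only: it assumes matched current sources, a single loop-filter capacitor, and no leakage or charge sharing, and the function name `charge_pump_step` and all component values are hypothetical.

```python
def charge_pump_step(v_ctrl, up, down, i_cp, c_lf, dt):
    """Return V_ctrl after dt seconds for the given UP/DOWN switch states.

    Idealized: matched current sources of i_cp amperes and a single
    loop-filter capacitor of c_lf farads (both values are assumptions).
    """
    i_net = (i_cp if up else 0.0) - (i_cp if down else 0.0)
    # Both switches off: i_net is 0 and the node simply holds its voltage.
    return v_ctrl + i_net * dt / c_lf


# Hypothetical values: 10 uA pump current, 10 pF capacitor, 100 ps pulse.
v0 = 0.5
v_up = charge_pump_step(v0, True, False, 10e-6, 10e-12, 100e-12)
v_down = charge_pump_step(v0, False, True, 10e-6, 10e-12, 100e-12)
v_hold = charge_pump_step(v0, False, False, 10e-6, 10e-12, 100e-12)
print(v_up - v0, v_down - v0, v_hold - v0)
```

Because the voltage step scales with the pulse width `dt`, a wider UP or DOWN pulse (a larger phase error) moves $V_{ctrl}$ further, which is the proportionality the text describes.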

The loop filter design must account for non-ideal effects such as leakage current, mismatch, charge sharing from the current switches, and the dead zone of the PD. The loop filter can be either passive or active. In general, a passive filter is simple to design and has better noise performance. A passive filter may be first-order, second-order, or of higher order. High-order filters have the advantage of rejecting out-of-band noise, whereas low-order filters give more stable operation. The choice between high-order and low-order filters depends on the application.

#### 2.3.3 Voltage-Controlled Delay Line

Delay cells (elements) are widely used in digital systems. They are essential parts of high-speed VLSI applications and clock phase modulation. A delay cell can be implemented in many ways, such as an RC delay, an inverter delay chain, a differential delay chain, or a current-starved delay cell. This section discusses two types of VCDL: the RC-time-constant controlled delay line [28] and the current-starved controlled delay line [15],[29].

The RC-time-constant controlled delay line is shown in Fig. 2.6. The delay line is composed of an even number of cascaded delay elements. The control voltage $V_{ctrl}$ controls the charging current. The NMOS transistor Mn1 controls the capacitance seen by the driving gate. A large $V_{ctrl}$ lowers the resistance of transistor Mn1, so the effective capacitance increases and the delay becomes larger.



Fig. 2.6 RC-time-constant controlled delay line.

Fig. 2.7 shows the current-starved controlled delay line. The control voltage $V_{ctrl}$ tunes the resistance of the pull-down transistor M2 and, through a current mirror, of the pull-up transistor M1. These variable resistances control the current that charges or discharges the load capacitance. A large $V_{ctrl}$ supplies a large current to the output node, so the delay becomes smaller.



Fig. 2.7 Current-Starved controlled delay line.

#### 2.4 Design of Wide-Range Digital Delay-Locked Loop

DLLs have been used in high-speed microprocessors and memory interfaces. The traditional analog DLL generally has good jitter and skew performance, but it needs a long locking time and a large chip area. Moreover, its process-dependent characteristics make it hard to port to advanced CMOS technologies. In contrast, the digital DLL can easily be ported to different processes, benefits from CMOS technology scaling, and operates at a lower supply voltage. Its locking time is also shorter than that of an analog DLL. To serve various clock generation circuits and phase alignment applications, a digital DLL should achieve a wide operating frequency range. Digital DLLs can be roughly divided into four categories according to their control scheme: register-controlled [30]-[33], counter-controlled [34]-[36], successive approximation register (SAR)-controlled [20], and time-to-digital converter (TDC)-based [6][15][37] DLLs. The following sections describe them in detail.

#### 2.4.1 Register-controlled DLL

Fig. 2.8 shows the architecture of the register-controlled DLL [30]-[33]. The n-bit shift register is controlled by the output of the phase detector. The phase detector detects the phase relationship between the input clock and the output clock and generates a Left/Right signal that makes the shift register adjust the delay time. When the output clock leads the input clock, the PD sends a "Left" signal, and the high bits in the shift register shift left to increase the delay time and compensate for the phase error. Similarly, when the PD sends a "Right" signal, the high bits in the shift register shift right to decrease the delay time. The DLL keeps adjusting the shift register until the phase error is eliminated.



Fig. 2.8 The architecture of register-controlled DLL.

Although the control scheme is simple, widening the operating range requires more delay cells and more shift register bits. Moreover, the control is one-to-one: each additional delay stage needs an additional shift register bit to control it. This increases the chip area, the power consumption, and the locking time. Thus, this architecture is not suitable for wide-range operation.

#### 2.4.2 Counter-controlled DLL

The block diagram of the counter-controlled DLL [34]-[36] is shown in Fig. 2.9. Its phase alignment scheme is similar to that of the register-controlled DLL. The only difference is the structure of the delay cell; the remaining circuit blocks are similar to those of the register-controlled DLL. The counter-controlled DLL adopts a binary-weighted delay line, so the delay cells must be carefully adjusted to maintain binary-weighted linearity. The linearity of the delay line is therefore the key to the resolution of the counter-controlled DLL.



Fig. 2.9 The architecture of counter-controlled DLL.

The n-bit up/down counter updates its value according to the output of the phase detector. The phase detector detects whether the input phase leads or lags and outputs UP/DOWN signals to control the n-bit up/down counter. The counter produces a binary-weighted control word and adjusts the delay of the delay line until the output clock is synchronized with the input clock. Replacing the shift register with a counter reduces the controller hardware. However, the counter-controlled DLL still searches for the desired code linearly, so its locking time is as long as that of the register-controlled DLL.
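
The cost of this linear search can be illustrated with a short behavioral sketch. It is a simplified model, not the circuit in [34]-[36]: the phase detector is reduced to an ideal comparison against a hypothetical locked code `target_code`, and the function name `linear_search_lock` is invented for this example.

```python
def linear_search_lock(target_code, n_bits, start_code=0):
    """Count the update cycles an up/down counter needs to reach target_code.

    The phase detector is modeled as an ideal comparison against target_code,
    the (hypothetical) code at which the delay line would be locked.
    """
    max_code = (1 << n_bits) - 1
    if not (0 <= target_code <= max_code and 0 <= start_code <= max_code):
        raise ValueError("codes must fit in n_bits")
    code, cycles = start_code, 0
    while code != target_code:
        code += 1 if code < target_code else -1  # UP or DOWN by one LSB
        cycles += 1
    return cycles


# Worst case in this model: start at 0 and lock at the largest code.
for n in (4, 8, 12):
    print(n, linear_search_lock((1 << n) - 1, n))
```

In this model the worst case grows as 2<sup>n</sup> - 1 update cycles, so each extra control bit roughly doubles the locking time.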

#### 2.4.3 SAR-controlled DLL

Locking time is an important design parameter for a digital DLL. Both register-controlled and counter-controlled DLLs search linearly for the optimal control code word, so the locking time grows with the code range. For example, an n-bit counter-based DLL takes 2<sup>n</sup> clock cycles to lock in the worst case. A binary search algorithm reduces the locking time: the n-bit SAR-controlled DLL [20] shown in Fig. 2.10 needs only n clock cycles to find the optimal delay of the delay line and lock.



Fig. 2.10 Successive approximation register (SAR) DLL.

Fig. 2.11 shows the 3-bit binary search algorithm. Assume that "110" is the final control word and that the initial code word is set to "100". In this example, the output clock lags the input clock in steps 1 and 2 and leads the input clock in step 3, so the binary search scheme sets the control code to the correct code word "110".



Fig. 2.11 Flowchart of 3-bit binary search algorithm.

The successive approximation register DLL changes the control mechanism to shorten the locking time. It not only reduces the locking time but also saves area. Hence, it is suitable for wide-range applications.
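
The binary search of the SAR controller can be sketched in a few lines. This is a behavioral illustration under assumptions: the phase detector decision is replaced by a comparison against a hypothetical locked code, the mapping from lead/lag to keeping or clearing a bit is not modeled, and the function name `sar_search` is invented here.

```python
def sar_search(target_code, n_bits):
    """Binary search for target_code, deciding one bit per step from the MSB.

    Returns (final_code, trial_codes). The comparison stands in for the phase
    detector: a trial bit is kept when the trial code does not exceed the
    (hypothetical) locked code, otherwise it is cleared.
    """
    if not 0 <= target_code < (1 << n_bits):
        raise ValueError("target_code must fit in n_bits")
    code, trials = 0, []
    for bit in reversed(range(n_bits)):
        trial = code | (1 << bit)  # tentatively set the next bit
        trials.append(trial)
        if trial <= target_code:
            code = trial  # keep the bit
    return code, trials


final, trials = sar_search(0b110, 3)
print(format(final, "03b"), [format(t, "03b") for t in trials])
```

For the 3-bit example with final code "110", the sketch tries "100", "110", and "111": the first two decisions go the same way and the third goes the other way, and the search always finishes in n steps.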

#### 2.4.4 TDC-based DLL

Fig. 2.12 shows the TDC-based DLL. The time-to-digital converter (TDC) converts the period of the input clock signal into a digital code. The TDC-based DLL [6][15][37] divides the locking scheme into two stages: phase compensation and phase tracking. In the phase compensation stage, the TDC measures the period of the input clock and outputs a digital control word immediately. This control code is sent to the controller, which jumps roughly to the desired code word. After the phase compensation stage finishes, the DLL enters the phase tracking stage, which uses a linear search to fine-tune the delay of the delay line. Only a few control bits need to be determined in the phase tracking stage.



Fig. 2.12 TDC-based DLL.
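
The two-stage locking scheme can be sketched as a coarse jump followed by a short linear search. The model below is a loose illustration, not the circuits of [6][15][37]: it assumes that one TDC measurement resolves all but the lowest `tracking_bits` bits of a hypothetical locked code, and the function name `tdc_two_stage_lock` is invented for this example.

```python
def tdc_two_stage_lock(target_code, n_bits, tracking_bits):
    """Return (coarse_code, tracking_steps) for a two-stage lock.

    Assumption: the TDC resolves every bit of the locked code except the
    lowest tracking_bits bits in a single measurement; the tracking stage
    then steps the code one LSB at a time.
    """
    if not 0 <= tracking_bits <= n_bits:
        raise ValueError("tracking_bits must be between 0 and n_bits")
    if not 0 <= target_code < (1 << n_bits):
        raise ValueError("target_code must fit in n_bits")
    coarse_code = (target_code >> tracking_bits) << tracking_bits
    return coarse_code, target_code - coarse_code


coarse, steps = tdc_two_stage_lock(0b10110110, 8, 3)
print(format(coarse, "08b"), steps)
```

Under this assumption the tracking stage never needs more than 2<sup>tracking_bits</sup> - 1 steps, however wide the full control word is.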

#### 2.5 Design of Mixed-Mode Delay-Locked Loop

The block diagram of the mixed-mode DLL [22]-[25] is shown in Fig. 2.13. It combines an analog DLL and a digital DLL and separates the locking scheme into a coarse-tuning stage and a fine-tuning stage. The coarse-tuning stage is adjusted by the digital DLL, which is based on a TDC circuit. After the digital DLL is locked, the analog DLL performs the fine tuning with a charge pump and loop filter circuit. The final output combines the outputs of the coarse-tuning and fine-tuning stages. The mixed-mode DLL achieves both fast locking and good deskew performance. However, it is less portable than a digital DLL and needs a long design time because of its higher complexity.



Fig. 2.13 Block diagram of Mixed-Mode DLL.

# **Chapter 3**

### Wide-Range All-Digital Delay-Locked Loop

#### 3.1 Introduction



Chapter 2 discussed many ways to implement a DLL. Each type of DLL has its advantages and disadvantages, so the DLL architecture is selected according to the design specifications. A wide-range all-digital DLL with fast locking time, small chip area, good jitter performance, and low power consumption can meet a broader set of design specifications. This chapter introduces an all-digital DLL that has wide-range operation, small chip area, low power consumption, fast locking time, and acceptable jitter performance.

#### 3.2 Design Trade-Off in Different DLL Architectures

Analog DLLs usually have good jitter and skew performance. However, they suffer from leakage current as CMOS technology scales down, and the resulting jitter performance becomes unacceptable. All-digital DLLs, which use a robust digital control code, avoid the leakage problem. In a 65nm CMOS process, the low supply voltage makes it difficult to design a wide-range DLL. Cascading a large number of delay cells enlarges the propagation delay enough for low-frequency operation, but the area and power consumption also increase. A cycle-controlled delay line [11] maintains the timing resolution of a conventional digital DLL and extends the operating frequency range. However, the internal counter of the cycle-controlled delay line must operate at a very high speed, and a wider operating frequency range requires more counter bits. A high-speed counter is difficult to design, especially as the number of bits increases. We therefore propose a delay unit that generates a large delay and thereby saves bits in the cycle-controlled delay line. The following sections discuss the proposed design in detail.

#### 3.3 The proposed Wide-Range ADDLL Architecture



Fig. 3.1 The block diagram of the proposed DLL.

The block diagram of the proposed DLL is shown in Fig. 3.1. The DLL consists of three parts: a phase detector, a DLL controller, and a digital controlled delay line (DCDL). The DCDL is composed of four delay units: the leakage delay unit (LDU), the cycle-controlled delay unit (CCDU), the coarse delay unit (CDU), and the fine delay unit (FDU). The reference clock (Ref\_clk) passes through the delay line and is then output to the clock buffer. The clock buffer propagation delay represents the delay of the external clock tree. The output of the clock buffer, Out\_clk, is fed back to the DLL. The phase detector detects the phase relation between the reference clock and the output clock and outputs up and down control signals to the DLL controller. The DLL controller changes the control code of the DCDL according to the PD's output to eliminate the phase error between the reference clock and the output clock. When this phase error is eliminated, the DLL is locked, and Out\_clk is synchronized with the reference clock.

In the proposed architecture, the leakage delay unit (LDU) provides a large delay in the DCDL, which extends the operating range of the DLL to very low frequencies. A conventional shift-register-controlled DLL often uses a sequential search to find the proper control code, which results in a long lock-in time. Because the sequential search scheme is not suitable for a wide-range DLL, the DLL controller uses a binary search scheme to shorten the lock-in time.

#### 3.4 Digital Controlled Delay Line

The digital controlled delay line has four cascaded delay stages:

- Leakage Delay Unit (LDU)
- Cycle-Controlled Delay Unit (CCDU)
- Coarse Delay Unit (CDU)
- Fine Delay Unit (FDU)

First, the leakage delay unit (LDU) provides a large delay and makes it possible for the DLL to operate at a very low frequency. The variable delay of the cycle-controlled delay unit (CCDU) covers one delay step of the LDU. The CCDU reuses its delay units to extend the delay range rather than cascading a large number of delay units. The coarse delay unit and the fine delay unit have smaller delays to keep a finer resolution. These four delay units are connected directly in series, which is simple and straightforward.
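
The series structure and the coverage requirement between stages can be illustrated with a small sketch. All step sizes and code ranges below are placeholders chosen for the example, not values from this design, and the names `Stage`, `total_delay_ps`, and `uncovered_steps` are hypothetical. Applying the coverage check to every adjacent pair of stages is also an assumption; the text states it explicitly only for the LDU and the CCDU.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    step_ps: int   # delay added per control-code increment (hypothetical)
    max_code: int  # largest control code of the stage (hypothetical)

    @property
    def range_ps(self):
        return self.step_ps * self.max_code


def total_delay_ps(stages, codes):
    """Sum the controllable delay of series-connected stages (intrinsic delay ignored)."""
    return sum(s.step_ps * c for s, c in zip(stages, codes))


def uncovered_steps(stages):
    """Names of stages whose single step exceeds the next stage's whole range."""
    return [a.name for a, b in zip(stages, stages[1:]) if b.range_ps < a.step_ps]


# Placeholder numbers chosen only so that each stage covers the previous step.
dcdl = [
    Stage("LDU", 50_000, 32),
    Stage("CCDU", 800, 63),
    Stage("CDU", 100, 15),
    Stage("FDU", 10, 15),
]
print(uncovered_steps(dcdl))
print(total_delay_ps(dcdl, [1, 2, 3, 4]))
```

An empty list from `uncovered_steps` means that no delay value falls into a gap between one stage's step and the next stage's tuning range.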



## 3.4.1 Leakage Delay Unit



Fig. 3.2 The proposed Leakage Delay Unit.

The leakage delay unit is composed of one multiplexer and 32 cascaded leakage delay cells, as shown in Fig. 3.2. Each leakage delay cell has a controllable delay. When "FAST[n]" is low, the n-th leakage delay cell is in slow mode and generates a large propagation delay for low-frequency operation. Conversely, when "FAST[n]" is high, the n-th leakage delay cell is in fast mode and works like a normal buffer. The highest operating frequency is limited by the intrinsic delay of the cascaded leakage delay path, so a bypass path is provided to shorten the signal delay: when the DLL is in high-speed operation, the input clock "Ref\_clk" is bypassed to the output.



Fig. 3.3 Schematic of the proposed leakage delay cell.





The proposed leakage delay cell is shown in Fig. 3.3. The cell uses the leakage current of a transistor in 65nm CMOS technology to generate an extremely large delay. In the schematic of Fig. 3.3, when "reset" is low and "FAST" is high, the delay cell operates in fast mode. In this mode the cell works like two cascaded inverters, and its delay is very small. When "reset" is low and "FAST" is also low, the delay cell operates in slow mode. As shown in part A of Fig. 3.3, the delay from the rising transition of "IN" to the falling transition of "INV\_OUT" is still small, because part A works like an inverter when "IN" rises. In part B, however, the delay from the falling transition of "INV\_OUT" to the rising transition of "OUT" is very large, because both the pull-up and the pull-down circuits are switched off. Only the leakage current I of the always-off PMOS transistor charges node "Y", so it takes a long time for "Y" to be charged to high.

Similarly, when "IN" has a falling transition, both the charge path and the discharge path of node "X" are switched off in part A of the proposed leakage delay cell, and the leakage current takes a very long time to charge "X" to high. The resulting rising transition of "INV\_OUT" then propagates quickly through part B, so the delay from the rising transition of "INV\_OUT" to the falling transition of "OUT" is small. As a result, the delay from "IN" to "OUT" of the proposed leakage delay cell is extremely large in slow mode for both input transitions.

The charging speed can be tuned by adjusting the width of the always-off PMOS transistor. The charging speed is also influenced by process, voltage, and temperature (PVT) variations. In the proposed leakage delay unit (LDU), the maximum delay is generated when FAST[31:0] is 32'h0000\_0000, and the minimum delay is generated when FAST[31:0] is 32'hFFFF\_FFFF.
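
The effect of the FAST[31:0] code on the LDU delay can be illustrated with a small numerical sketch. It assumes that the cell delays simply add along the chain and that the bypass multiplexer is ignored; the per-cell delays are the worst-case simulation figures reported in Section 4.3, and the function name `ldu_delay_ns` is only illustrative.

```python
# Worst-case simulated figures quoted in Section 4.3 of this thesis.
SLOW_DELAY_NS = 428.5   # one leakage delay cell in slow mode
FAST_DELAY_NS = 0.162   # one leakage delay cell in fast mode (162 ps)
NUM_CELLS = 32          # leakage delay cells in the LDU


def ldu_delay_ns(fast_code: int) -> float:
    """Estimate the LDU chain delay for a 32-bit FAST[31:0] code.

    Assumption: per-cell delays add linearly; the bypass path is not modeled.
    """
    if not 0 <= fast_code < (1 << NUM_CELLS):
        raise ValueError("FAST code must fit in 32 bits")
    fast_cells = bin(fast_code).count("1")   # FAST[n] = 1 -> fast mode
    slow_cells = NUM_CELLS - fast_cells      # FAST[n] = 0 -> slow mode
    return slow_cells * SLOW_DELAY_NS + fast_cells * FAST_DELAY_NS


print(ldu_delay_ns(0x0000_0000))  # all cells slow: maximum delay
print(ldu_delay_ns(0xFFFF_FFFF))  # all cells fast: minimum delay
```

Under these assumptions each cleared FAST bit adds roughly one slow-mode cell delay to the chain, which is the "delay step" of the LDU referred to in the following sections.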

Each leakage delay cell can therefore generate two different propagation delays. The delay step of the LDU is covered by the cycle-controlled delay unit (CCDU), which is described in detail in the following section.



## 3.4.2 Cycle-Controlled Delay Unit

Fig. 3.5 shows the architecture of the proposed cycle-controlled delay unit (CCDU). It is composed of two cycle-controlled delay cells (CCDC), one edge combiner, and one multiplexer. A dual structure is used in order to generate an output with a 50% duty cycle, so a pair of complementary inputs, LDL\_out+ and LDL\_out-, with a phase shift of 180 degrees is required. The edge combiner can be an SR flip-flop or a D-type flip-flop with asynchronous set and reset functions. When the DLL is in high-speed operation, the controller sends a counter value of zero to the CCDU and "count\_is\_0" is pulled high. Consequently, the input signal "LDL\_out+" is bypassed to the output as "CCDL\_out".



Fig. 3.5 The proposed Cycle-controlled Delay Unit (CCDU).

Fig. 3.6 shows the block diagram of the cycle-controlled delay cell (CCDC). Fig. 3.7 shows the detailed circuit of the CCDC, and the timing diagram is illustrated in Fig. 3.8. LDL\_out, generated by the LDU, triggers the D-type flip-flop. The delay cell with the enable signal "trig" in the ring oscillator is a fast resetting circuit: when "trig" is pulled low, the delay cells respond quickly and the CCDC is reset immediately. This prevents an additional trigger from reaching the programmable counter and producing a wrong initial counter value. Once the ring oscillator starts operating, "osc\_out" triggers the n-bit programmable counter to count upward. When the output of the programmable counter equals the value "count" generated by the controller, "match" is pulled high. The pulse generator then generates a pulse at the output and also resets the D-type flip-flop and the programmable counter. As shown in Fig. 3.8, the period of the ring oscillator is Td, and the delay between "LDL\_out" and "pulse" can be digitally adjusted: the input clock circulates in the CCDC a number of times set by the input "count". The delay time between "LDL\_out" and "pulse" can be expressed as

$$t = c \times Td, \quad \text{for } c = 0 \sim 2^n - 1 \qquad (3.1)$$

where c is the number of times that the CCDC is reused. The delay time can thus be extended by reusing the CCDC rather than by cascading extra delay units.
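
Eq. (3.1) can be checked with a short behavioral sketch of the counter loop. The default values (Td of about 3.41 ns and a 7-bit counter) are the figures given later in Section 4.2; the function name `ccdc_delay_ns` and the idealized loop (no flip-flop or pulse-generator overhead) are assumptions made for illustration only.

```python
def ccdc_delay_ns(count: int, td_ns: float = 3.41, n_bits: int = 7) -> float:
    """Idealized CCDC delay: the counter advances once per ring-oscillator
    period Td until it matches `count`, so the delay is count * Td (Eq. 3.1).

    Assumption: trigger, match, and pulse-generation overheads are ignored.
    """
    if not 0 <= count <= (1 << n_bits) - 1:
        raise ValueError("count must fit in the programmable counter")
    counter = 0
    elapsed_ns = 0.0
    while counter != count:      # "match" goes high when counter == count
        elapsed_ns += td_ns      # one more trip around the ring oscillator
        counter += 1
    return elapsed_ns


print(ccdc_delay_ns(0))    # bypass case: no added delay
print(ccdc_delay_ns(127))  # largest code of a 7-bit counter
```

With these figures the largest code gives about 127 × 3.41 ns, close to the roughly 433 ns total variable delay reported for the CCDU in simulation.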



Fig. 3.6 Block diagram of the cycle-controlled delay cell (CCDC).



Fig. 3.7 Schematic of the cycle-controlled delay cell.



Fig. 3.8 Timing diagram of the CCDC

Fig. 3.9 shows the timing diagram of the proposed cycle-controlled delay unit (CCDU). The input signals "LDL\_out+" and "LDL\_out-" trigger the two cycle-controlled delay cells (CCDC), respectively. When a trigger arrives, the inner counter starts counting upward until it reaches the count value supplied by the DLL controller. When the inner counter matches the count value, the signals "S\_p" and "R\_p" are generated by the two edge-triggered cycle-controlled delay cells. Applied to an SR latch, these two signals generate an output clock with a 50% duty cycle, as shown in Fig. 3.9. When the DLL is in high-frequency operation, the DLL controller sends a counter value of zero to the CCDU, the signal "count\_is\_0" is pulled high, and the input signal "LDL\_out" is bypassed to the output of the CCDU. In the proposed CCDU, the ring oscillator with the digital counter can generate a delay large enough to cover one delay step of the preceding leakage delay unit under PVT variations. The CCDU also keeps its resolution within the total variable delay of the coarse delay unit (CDU) for precise clock generation.
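
The way the dual structure preserves the duty cycle can be sketched as follows. The sketch assumes ideal complementary inputs (LDL\_out- rising exactly half a period after LDL\_out+) and identical delays in both CCDCs; the function names are illustrative and do not come from the design.

```python
def ccdu_output_edges(period_ns: float, delay_ns: float, cycles: int = 3):
    """Return (rise_times, fall_times) of the SR-latch output in ns.

    Assumptions: LDL_out+ rises at k*T, LDL_out- rises half a period later,
    and both CCDCs add the same delay before emitting S_p and R_p.
    """
    rises = [k * period_ns + delay_ns for k in range(cycles)]                    # S_p sets
    falls = [k * period_ns + period_ns / 2 + delay_ns for k in range(cycles)]    # R_p resets
    return rises, falls


def duty_cycle(period_ns: float, delay_ns: float) -> float:
    """Fraction of the period during which the SR-latch output is high."""
    rises, falls = ccdu_output_edges(period_ns, delay_ns)
    return (falls[0] - rises[0]) / period_ns


print(duty_cycle(period_ns=1666.67, delay_ns=433.07))
```

Because the set and reset pulses are shifted by the same amount, the high time stays at half a period regardless of the programmed delay; any mismatch between the two CCDCs would appear directly as duty-cycle error.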



Fig. 3.9 Timing diagram of the CCDU.

For finer resolution, coarse tuning and fine tuning are adopted in the proposed DLL. They are implemented by the coarse delay unit (CDU) and the fine delay unit (FDU), respectively. The following sections discuss these circuits in detail.

## 3.4.3 Coarse Delay Unit



Fig. 3.10 Schematic of the coarse delay unit (CDU).

Fig. 3.10 shows the circuit of the proposed coarse delay unit (CDU). It is composed of 31 delay cells, 31 AND gates, and 32 multiplexers. The circuit can generate 32 different delays, and the minimum delay is the propagation delay of one multiplexer. For example, when the control code "Coarse" is 32'h0000\_0003, as shown in Fig. 3.10, the input signal propagates through two delay cells, two AND gates, and three multiplexers. Unused delay cells can be turned off to reduce power consumption. According to HSPICE simulation, one delay step of the CDU is about 130 ps in the worst case, and the total variable delay of the CDU is about 4.039 ns. The total controllable delay range of the CDU should cover one delay step of the preceding cycle-controlled delay unit under PVT variations.
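
The 32'h0000\_0003 example suggests a thermometer-style control code, in which the k low bits are set to route the clock through k delay cells. The sketch below counts the elements on the signal path under that assumption; the encoding, the function names, and the linear 130 ps step model are illustrative assumptions rather than details confirmed by the schematic.

```python
CDU_STEP_PS = 130.0   # worst-case delay step quoted from HSPICE simulation
MAX_CELLS = 31        # delay cells in the CDU


def cdu_path(coarse_code: int) -> dict:
    """Count path elements for a thermometer-style "Coarse" code.

    Assumption: k set low-order bits select k delay cells, k AND gates,
    and k + 1 multiplexers (k = 0..31).
    """
    k = bin(coarse_code).count("1")
    if k > MAX_CELLS or coarse_code != (1 << k) - 1:
        raise ValueError("expected a thermometer code with at most 31 ones")
    return {"delay_cells": k, "and_gates": k, "multiplexers": k + 1}


def cdu_added_delay_ps(coarse_code: int) -> float:
    """Delay added on top of the minimum path, assuming uniform steps."""
    return cdu_path(coarse_code)["delay_cells"] * CDU_STEP_PS


print(cdu_path(0x0000_0003))             # the example code from Fig. 3.10
print(cdu_added_delay_ps(0x7FFF_FFFF))   # all 31 delay cells in the path
```

With all 31 cells selected, the uniform-step model gives about 4.03 ns of added delay, in line with the roughly 4.039 ns total variable delay reported above.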

## 3.4.4 Fine Delay Unit

Fig. 3.11 shows the proposed fine delay unit (FDU). It is composed of N cascaded buffers and (N-1) digital-controlled varactor (DCV) cells [3]. The DCV uses the difference in the gate capacitance of NAND gates under different digital control inputs to achieve different delay times. The DCV has several advantages, such as fine resolution, high portability, and a short design turnaround cycle. DCV cells are therefore used in the fine-tuning stage for better resolution and linearity of the delay line. The total controllable delay range of the FDU should cover one delay step of the preceding coarse delay unit under PVT variations.



Fig. 3.11 The fine delay unit (FDU).

## 3.5 Control Scheme



Fig. 3.12 The Binary search DLL controller.

Fig. 3.12 illustrates the locking process of the binary search DLL controller. Each of the four delay units starts from the middle control code of its tunable delay range. If a stage can provide "n" different delays, the search step is "n/4" in the initial state. When the delay time is smaller than the target delay, the controller adds the current search step to the control code to increase the delay of the delay unit. Conversely, when the output delay time is larger than the target, the controller subtracts the search step from the control code to reduce the delay of the delay unit. Whenever the PD's output changes from "up" to "down" or vice versa, the search step is divided by 2. Once the search step has been reduced to 1, the control code is determined.
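
The search procedure for a single delay stage can be sketched as below. This is a minimal software model, not the controller's actual implementation: the function name `binary_search_code`, the `delay_of` callback standing in for the delay line and phase detector, the clamping at the ends of the code range, the iteration cap, and the exact stopping rule (stop at the first direction change after the step has reached 1) are all assumptions.

```python
from typing import Callable, List, Tuple


def binary_search_code(n: int, target: float,
                       delay_of: Callable[[int], float]) -> Tuple[int, List[int]]:
    """Search one stage's control code (0..n-1) as described in the text.

    Start at the middle code with step n/4; halve the step whenever the
    up/down decision flips; stop once it flips again with step == 1.
    """
    code, step = n // 2, max(1, n // 4)
    prev_up = None
    trace = [code]
    for _ in range(4 * n):                      # safety bound (assumption)
        up = delay_of(code) < target            # PD says: delay too small
        if prev_up is not None and up != prev_up:
            if step == 1:
                break                           # code is determined
            step //= 2
        prev_up = up
        moved = code + step if up else code - step
        moved = min(n - 1, max(0, moved))       # clamp to valid codes (assumption)
        if moved == code:
            break                               # saturated at a range end
        code = moved
        trace.append(code)
    return code, trace


# Toy delay line: one unit of delay per code step.
code, trace = binary_search_code(32, 21, lambda c: float(c))
print(code, trace)
```

In this toy model the final code lands within one step of the target whenever the target lies inside the stage's range, which is the property the next, finer stage relies on.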



Fig. 3.13 Timing diagram of the proposed DLL.

Fig. 3.13 shows the timing diagram of the proposed DLL. The DLL takes one clock cycle to initialize the internal circuit. After that, a four-stage binary search scheme is used in the DLL controller to find the proper control code for each of the four delay units of the delay line. The phase detector detects whether the output clock (Out\_clk) leads or lags the reference clock (Ref\_clk) and outputs the up or down signal to the DLL controller, which updates the control code of the delay line. After the phase error between the reference clock and the output clock is cancelled, the DLL is locked.

# **Chapter 4**

# **Chip Implementation**

## 4.1 Design Implementation



Fig. 4.1 Chip Floorplan and I/O plan.

| Input     | Bits | Function                                                            |
|-----------|------|---------------------------------------------------------------------|
| RESET     | 1    | reset chip                                                          |
| CLK       | 1    | input clock                                                         |
| DCO_EN    | 1    | enable internal DCO                                                 |
| DCO_CODE  | 3    | DCO frequency select                                                |
| CODE_SEL  | 2    | select the control code to output (0: LDU, 1: CCDU, 2: CDU, 3: FDU) |

| Output    | Bits | Function               |
|-----------|------|------------------------|
| SYS_LOCK  | 1    | lock status            |
| PD_LOCK   | 1    | PD's output signal     |
| CTRL_CODE | T    | digital control code   |
| OUT_REF   |      | output reference clock |
| OUT_CLK   |      | output clock of DLL    |

Table 4.1 I/O PAD description



Fig. 4.1 shows the chip floorplan and I/O plan of the proposed DLL, and the pad description is given in Table 4.1. The layout view of the proposed DLL is shown in Fig. 4.2. The area of the chip including I/O pads is $670 \times 670 \ \mu\text{m}^2$. The delay line contains the LDU, CCDU, CDU, and FDU. A digitally controlled oscillator (DCO) is also included in the test chip to supply a high-frequency clock to the DLL.



Fig. 4.2 The layout view of DLL.

## 4.2 Design Parameters



Fig. 4.3 The delay line of the proposed DLL.

Fig. 4.3 shows the delay line of the proposed DLL. Each delay unit should have a controllable delay range larger than one delay step of the preceding delay unit. Let the delay step of the CCDU be Td, the period (the inverse of the oscillation frequency) of the ring oscillator in the CCDU; the total controllable delay range of the CDU and FDU must then be larger than Td. If the total controllable range of the CDU and FDU is increased, Td can also be increased, and the counter in the CCDU becomes easier to design. However, a larger CDU and FDU range also increases chip area and power consumption. Thus, a small variable delay range of the CDU and FDU is adopted for finer resolution and lower cost. Since Td is determined by the following delay units, the remaining design parameter of the CCDU is the number of bits of the programmable counter. For the maximum tunable delay range, the counter should have as many bits as possible. However, the maximum number of bits is limited by Td, because a high-speed counter is hard to implement with a cell-based design flow: the number of bits is bounded by the critical path delay of the counter, and wider counters usually have longer data paths and are more complicated. Fortunately, the proposed leakage delay unit (LDU) can generate a large signal delay, which reduces the number of counter bits needed in the CCDU and makes low-frequency operation possible.

The proposed CCDU contains a 7-bit programmable counter. One step of the CCDU, i.e. the period of the oscillator, is about 3.41 ns, and the total variable delay is 433.31 ns. One step of the leakage delay unit (LDU) is tuned to approach the total variable delay of the following CCDU stage by adjusting the width of the always-off PMOS transistor; one delay step of the LDU is about 428.5 ns.
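
The coverage requirement between adjacent stages can be checked numerically with the simulated figures quoted in this thesis (LDU step 428.5 ns, CCDU step 3.41 ns and range 433.31 ns, CDU range 4.039 ns). The data layout and the function name `coverage_margins` are illustrative; the FDU is left out because no FDU figures are quoted here.

```python
# (name, delay step in ns, total variable delay range in ns or None)
STAGES = [
    ("LDU", 428.5, None),     # range not needed for this check
    ("CCDU", 3.41, 433.31),
    ("CDU", 0.130, 4.039),
]


def coverage_margins(stages):
    """Margin (ns) by which each stage's range exceeds the previous stage's step.

    A positive margin means the finer stage can cover one step of the coarser one.
    """
    margins = {}
    for (prev_name, prev_step, _), (name, _, rng) in zip(stages, stages[1:]):
        margins[f"{name} range - {prev_name} step"] = rng - prev_step
    return margins


for label, margin_ns in coverage_margins(STAGES).items():
    print(f"{label}: {margin_ns:.3f} ns")
```

Both margins are positive for these figures (about 4.8 ns and 0.63 ns), so each finer stage covers one step of the stage before it at this simulation corner; other PVT corners would need their own figures.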

## 4.3 Simulation Results

Fig. 4.4 shows the simulation of the proposed leakage delay cell in fast mode under the worst-case condition. The input period is 3800 ns. In fast mode, the propagation delay of the leakage delay cell is 162 ps. Fig. 4.5 shows the simulation of the proposed leakage delay cell in slow mode, where the propagation delay is about 428.5 ns.



Fig. 4.4 The proposed leakage delay cell simulation in fast mode.



Fig. 4.5 The proposed leakage delay cell simulation in slow mode.

Each leakage delay cell can provide a large delay in slow mode, so the number of bits in the digital counter of the cycle-controlled delay unit can be reduced. As a result, the operating frequency of the DLL can easily be extended to a very low frequency.



Fig. 4.6 Simulation of cycle-controlled delay unit (CCDU).

Fig. 4.6 shows the simulation of the cycle-controlled delay unit. The controller sends the "reset", "count", and "count\_is\_0" signals to the CCDU, and the inner programmable counter, c\_count, counts upward until its value matches "count". When "count" is 127, the propagation delay from "LDL\_out" to "CCDL\_out" is about 433.31 ns in the worst-case condition. One step of the CCDU is 3.41 ns under the same simulation condition.



Fig. 4.7 The overall DLL simulation.

Fig. 4.7 shows the simulation of the overall DLL. When "reset" is pulled low, the DLL starts phase tracking based on the binary search scheme. The whole locking procedure takes 57 clock cycles. After phase tracking completes, the DLL is locked.

## 4.4 Measurement Results



Fig. 4.8 The microphotograph of the DLL.

The proposed ultra-wide-range DLL is fabricated in a standard performance (SP) 65nm CMOS technology. Fig. 4.8 shows the microphotograph of the DLL; its core area is 0.01 mm<sup>2</sup>.

Fig. 4.9 shows the measured output clocks when the proposed DLL is locked at 600 kHz. Besides the DLL circuit, the chip also contains a digitally controlled oscillator (DCO) that generates the on-chip reference clock for the DLL. This DCO can generate a 280 MHz to 1.2 GHz clock signal. The reference clock source can be selected from either the external reference clock or the internal DCO. Due to the speed limitation of the I/O pads, the output clock frequency must be lowered for testing: when the internal DCO is chosen as the reference clock, the output of the DLL is divided by 4. In Fig. 4.10, the signal on Channel 1 is the reference clock divided by 4.



Fig. 4.9 DLL is locked at 600 kHz.



Fig. 4.10 DLL is locked at 1.2 GHz.

The operating frequency of the proposed DLL ranges from 600 kHz to 1.2 GHz with a 1.0 V supply. The power consumption at a 1.0 V supply voltage is 2.6 mW at 1.2 GHz and 0.366 mW at 600 kHz. Because each leakage delay cell can provide a large delay in slow mode, the number of bits in the digital counter of the cycle-controlled delay unit can be reduced, and the operating frequency of the DLL can easily be extended to a very low frequency. Fig. 4.11 shows the measured jitter histogram of the output clock at 1.2 GHz. The measured root-mean-square jitter and peak-to-peak jitter are 3.38 ps and 39.29 ps, respectively, at 1.2 GHz.



Fig. 4.11 Measured long-term jitter histogram (at 1.2 GHz).

Table 4.2 Performance summary of the proposed DLL

| Parameter                 | Value               |
|---------------------------|---------------------|
| Process                   | SP 65nm CMOS        |
| Operation Frequency Range | 600 kHz ~ 1.2 GHz   |
| Supply voltage            | 1.0 V               |
| Root mean square jitter   | 3.38 ps             |
| Peak to peak jitter       | 39.29 ps            |
| Chip core area            | 0.01 mm<sup>2</sup> |
| Power consumption         | 2.6 mW @ 1.2 GHz    |
| Power consumption         | 0.366 mW @ 600 kHz  |



Fig. 4.12 Measured figures-of-merit for long-term jitter histogram.

The figures-of-merit (FOMs) for long-term jitter at different input frequencies are shown in Fig. 4.12. $FOM_{p\text{-}p\ \text{jitter}}$ is defined as

$$FOM_{p\text{-}p\ \text{jitter}} = \frac{\text{Measured p-p jitter (ps)}}{\text{Input clock period (ps)}}$$

where the period of the input clock is expressed in picoseconds. Similarly, $FOM_{r.m.s.\ \text{jitter}}$ is defined as

$$FOM_{r.m.s.\ \text{jitter}} = \frac{\text{Measured r.m.s. jitter (ps)}}{\text{Input clock period (ps)}}$$
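
As a worked example, the two jitter FOMs can be evaluated for the 1.2 GHz measurement reported above (3.38 ps r.m.s. and 39.29 ps peak-to-peak). The function name `jitter_fom` is illustrative; only the 1.2 GHz point is computed because it is the only operating point for which jitter values are quoted in the text.

```python
def jitter_fom(jitter_ps: float, input_freq_hz: float) -> float:
    """Jitter normalized to the input clock period (both in picoseconds)."""
    period_ps = 1e12 / input_freq_hz
    return jitter_ps / period_ps


fom_pp = jitter_fom(39.29, 1.2e9)   # peak-to-peak jitter at 1.2 GHz
fom_rms = jitter_fom(3.38, 1.2e9)   # r.m.s. jitter at 1.2 GHz

print(f"FOM p-p jitter:    {fom_pp:.4f}")
print(f"FOM r.m.s. jitter: {fom_rms:.4f}")
```

At 1.2 GHz the input period is about 833.3 ps, so the peak-to-peak jitter corresponds to roughly 4.7% of the period and the r.m.s. jitter to roughly 0.4%.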

| DLL                           | [38]           | [18]          | [8]             | [9]           | [11]            | [14]          | [10]           | Proposed         |
|-------------------------------|----------------|---------------|-----------------|---------------|-----------------|---------------|----------------|------------------|
| Type                          | Digital        | Analog        | Digital         | Digital       | Digital         | Analog        | Digital        | Digital          |
| Process                       | 0.18µm         | 0.18µm        | 0.18µm          | 0.18µm        | 0.18µm          | 0.13µm        | 0.13µm         | 65nm             |
| Supply                        | 1.8V           | 1.8V          | 1.8V            | 1.8V          | 1.8V            | 1.2V          | 1.2V           | 1.0V             |
| Min. input period             | 1.20ns         | 0.5ns         | 0.667ns         | 0.714ns       | 1.428ns         | 0.2ns         | 1.0ns          | 0.833ns          |
| Max. input period             | 12.5ns         | 4.0ns         | 2.273ns         | 5.88ns        | 500.0ns         | 2.0ns         | 33.3ns         | 1666.67ns        |
| r.m.s. jitter                 | 1.73ps         | 2.81ps        | 0.936ps         | 2.03ps        | 2.0ps           | 1.06ps        | X              | 3.38ps           |
| p-p jitter                    | 12ps           | 20.4ps        | 7.0ps           | 13.8ps        | 17.6ps          | 8.0ps         | 30ps           | 39.29ps          |
| Active area (mm<sup>2</sup>)  | 0.19           | 0.046         | 0.053           | 0.236         | 0.88            | 0.107         | 0.02           | 0.01             |
| Power                         | 48mW @ 800MHz  | 6.4mW @ 2GHz  | 43mW @ 1.5GHz   | 27mW @ 1GHz   | 23mW @ 700MHz   | 36mW @ 5GHz   | 3.6mW @ 1GHz   | 2.6mW @ 1.2GHz   |
| FOM<sub>Range</sub>           | 411.77         | 526.80        | 209.80          | 151.56        | 3922.64         | 46.73         | 4486.11        | 166583.7         |

Table 4.3 Comparisons of recent wide-range DLLs.

Table 4.3 compares the performance of the proposed DLL with recent wide-range DLLs. The figure-of-merit for adjustable range per unit area is defined as

$$FOM_{Range} = \frac{\text{Max. input period} - \text{Min. input period (ns)}}{\text{Normalized active area (mm}^2\text{)}}$$

where the maximum input period and minimum input period are expressed in nanoseconds, and the normalized active area is expressed in square millimeters. The proposed DLL achieves the largest FOM among the DLLs in Table 4.3. With leakage delay cells, the proposed DLL can generate a large delay for low-frequency operation, while the leakage delay cells occupy only a small area and extend the operating range. Therefore, it is suitable for wide-range frequency operation.
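
As a worked example, the range FOM of the proposed DLL can be reproduced from its entries in Table 4.3. The sketch assumes that the normalized active area of the proposed design equals its 0.01 mm<sup>2</sup> core area, since that choice reproduces the tabulated value; the normalization applied to the other processes is not stated in this section, so the other columns are not recomputed here. The function name `fom_range` is illustrative.

```python
def fom_range(max_period_ns: float, min_period_ns: float,
              normalized_area_mm2: float) -> float:
    """Adjustable input-period range per unit (normalized) active area."""
    return (max_period_ns - min_period_ns) / normalized_area_mm2


# Proposed DLL: 600 kHz to 1.2 GHz, 0.01 mm^2 (assumed already normalized).
proposed = fom_range(max_period_ns=1666.67, min_period_ns=0.833,
                     normalized_area_mm2=0.01)
print(f"FOM_Range (proposed): {proposed:.1f}")
```

The result, about 166583.7, matches the last column of Table 4.3.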



# **Chapter 5**

# **Conclusion and Future Works**

In this thesis, a novel leakage delay cell implemented in 65nm CMOS technology is presented. Combined with the cycle-controlled delay unit, the proposed leakage delay cell can easily achieve ultra-wide frequency range operation. The operating frequency range of the proposed DLL is from 600 kHz to 1.2 GHz. It also achieves a smaller chip area and lower power dissipation than previous wide-range DLLs. As a result, it is very suitable for wide-range clock deskew applications in the SoC era.

Nevertheless, the proposed wide-range DLL takes about 57 clock cycles to lock. For applications that demand a fast locking time, the proposed DLL may have difficulty meeting the requirement. Speeding up the locking procedure is therefore the first item of future work.

The proposed leakage delay cell also suffers from PVT variations, and the jitter performance in low-frequency operation needs to be improved. How to solve these problems remains to be investigated in further research.

# References

- [1] Floyd M. Gardner, "Charge-Pump Phase-Lock Loops," in *IEEE Transactions on Communications*, Vol. 28, no. 11, pp. 1849-1858, Nov. 1980.
- [2] Che-Fu Liang, Shin-Hua Chen, Shen-Iuan Liu, "A Digital Calibration Technique for Charge Pumps in Phase-Locked Systems," in *IEEE Journal of Solid-State Circuits*, Vol. 43, pp. 390-398, Feb. 2008.
- [3] Ashok Swaminathan, Kevin J. Wang, Ian Galton, "A Wide-Bandwidth 2.4GHz ISM Band Fractional-N PLL with Adaptive Phase Noise Cancellation," in *IEEE Journal of Solid-State Circuits*, Vol. 42, pp. 2639-2650, Dec. 2007.
- [4] Wonyoung Kim, Meeta S. Gupta, Gu-Yeon Wei, David Brooks, "System level analysis of fast, per-core DVFS using on-chip switching regulators," in *Proceeding of IEEE International Symposium on High Performance Computer Architecture*, pp. 123-134, Feb. 2008.
- [5] Byung-Guk Kim, Lee-Sup Kim, "A 250-MHz-2-GHz wide-range delay-locked loop," in *IEEE Journal of Solid-State Circuits*, Vol. 40, pp. 1310 - 1321, Jun. 2005.
- [6] Ching-Che Chung, Chen-Yi Lee, "A new DLL-based approach for all-digital multiphase clock generation," in *IEEE Journal of Solid-State Circuits*, Vol. 39, pp. 469-475, Mar. 2004.
- [7] Duo Sheng, Ching-Che Chung, Chen-Yi Lee, "An ultra-low-power and portable digitally controlled oscillator for SoC applications," in *IEEE Transactions on Circuits and Systems II: Express Briefs*, Vol. 54, pp. 954-958, Nov. 2007.
- [8] Dongsuk Shin, Chulwoo Kim, Janghoon Song, Hyunsoo Chae, "A 7 ps Jitter 0.053 mm<sup>2</sup> Fast Lock All-Digital DLL With a Wide Range and High Resolution DCC," in *IEEE Journal of Solid-State Circuits*, Vol. 44, no. 9, pp. 2437-2451, Sep. 2009.

- [9] Dongsuk Shin, Won-Joo Yun, Hyun-Woo Lee, Young-Jung Choi, Suki Kim, Chulwoo Kim, "A 0.17–1.4GHz low-jitter all digital DLL with TDC-based DCC using pulse width detection scheme," in *Proceedings of European Solid-State Circuits Conference (ESSCIRC)*, pp. 82-85, Sep. 2008.
- [10] Lei Wang, Leibo Liu, Hongyi Chen, "An Implementation of Fast-Locking and Wide-Range 11-bit Reversible SAR DLL," in *IEEE Transactions on Circuits and Systems II: Express Briefs*, Vol. 57, no. 6, pp. 421-425, Jun. 2010.
- [11] Hsiang-Hui Chang, Shen-Iuan Liu, "A wide-range and fast-locking all-digital cycle-controlled delay-locked loop," in *IEEE Journal of Solid-State Circuits*, Vol. 40, pp. 661-670, Mar. 2005.
- [12] Wook Kim, Kyung Tae Do, Young Hwan Kim, "Statistical Leakage Estimation Based on Sequential Addition of Cell Leakage Currents," in *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, Vol. 18, pp. 602-615, Apr. 2010.
- [13] Kuo-Hsing Cheng, Chia-Wei Su, Meng-Jhe Wu, Yu-Ling Chang, "A wide-range DLL-based clock generator with phase error calibration," in *Proceeding of IEEE International Conference on Electronics, Circuits and Systems*, pp. 798 - 801, Aug. 2008.
- [14] Chi-Nan Chuang, Shen-Iuan Liu, "A 0.5–5-GHz Wide-Range Multiphase DLL With a Calibrated Charge Pump," in *IEEE Transactions on Circuits and Systems II: Express Briefs*, Vol. 54, pp. 939-943, Nov. 2007.
- [15] Kuo-Hsing Cheng, Yu-Lung Lo, "A fast-lock wide-range delay-locked loop using frequency-range selector for multiphase clock generator," in *IEEE Transactions on Circuits and Systems II: Express Briefs*, Vol. 54, pp. 561-565, Jul. 2007.
- [16] Yongsam Moon, Jongsang Choi, Kyeongho Lee, Deog-Kyoon Jeong, Min-Kyu Kim, "An all-analog multiphase delay-locked loop using a replica delay line for wide-range operation and low-jitter performance," in *IEEE Journal of Solid-State Circuits*, Vol. 35, no. 3, pp. 377-384, Mar. 2000.

- [17] Abdulkerim L. Coban, Mustafa H. Koroglu, Kashif A Ahmed, "A 2.5-3.125-Gb/s quad transceiver with second-order analog DLL-based CDRs," in *IEEE Journal of Solid-State Circuits*, Vol. 40, no. 9, pp. 1940-1947, Sep. 2005.
- [18] Byung-Guk Kim, Lee-Sup Kim, "A 250-MHz-2-GHz wide-range delay-locked loop," in *IEEE Journal of Solid-State Circuits*, Vol. 40, no. 6, pp. 1310-1321, Jun. 2005.
- [19] Seung-Jun Bae, Hyung-Joon Chi, Young-Soo Sohn, Hong-June Park, "A VCDL-based 60-760-MHz dual-loop DLL with infinite phase-shift capability and adaptive-bandwidth scheme," in *IEEE Journal of Solid-State Circuits*, Vol. 40, no. 5, pp. 1119-1129, May 2005.
- [20] Guang-Kaai Dehng, June-Ming Hsu, Ching-Yuan Yang, Shen-Iuan Liu, "Clock-deskew buffer using a SAR-controlled delay-locked loop," in *IEEE Journal of Solid-State Circuits*, Vol. 35, no. 8, pp. 1128-1136, Aug. 2000.
- [21] Rong-Jyi Yang, Shen-Iuan Liu, "A 2.5 GHz All-Digital Delay-Locked Loop in 0.13 μm CMOS Technology," in *IEEE Journal of Solid-State Circuits*, Vol. 42, no. 11, pp. 2338-2347, Nov. 2007.
- [22] Feng Lin, Roman A. Royer, Brian Johnson, Brent Keeth, "A Wide-Range Mixed-Mode DLL for a Combination 512 Mb 2.0 Gb/s/pin GDDR3 and 2.5 Gb/s/pin GDDR4 SDRAM," in *IEEE Journal of Solid-State Circuits*, Vol. 43, no. 3, pp. 631-641, Mar. 2008.
- [23] Youngkwon Jo, Yong Shim, Soohwan Kim, Suki Kim, Kwanjun Cho, "A mixed-structure delay locked-loop with wide range and fast locking," in *Proceeding of IEEE International Symposium on Circuits and Systems*, 2006, pp. 1937-1940.
- [24] Jae Joon Kim, Sang-Bo Lee, Tae-Sung Jung, Chang-Hyun Kim, Soo-In Cho, Beomsup Kim, "A low-jitter mixed-mode DLL for high-speed DRAM applications," in *IEEE Journal of Solid-State Circuits*, Vol. 35, no. 10, pp. 1430-1436, Oct. 2000.

- [25] Eunseok Song, Seung-Wook Lee, Jeong-Woo Lee, Joonbae Park, Soo-Ik Chae, "A reset-free anti-harmonic delay-locked loop using a cycle period detector," in *IEEE Journal of Solid-State Circuits*, Vol. 39, no. 11, pp. 2055-2061, Nov. 2004.
- [26] Mathieu Renaud, Yvon Savaria, "A CMOS three-state frequency detector complementary to an enhanced linear phase detector for PLL, DLL or high frequency clock skew measurement," in *Proceedings of the International Symposium on Circuits and Systems*, Vol. 3, pp. 148-151, May 2003.
- [27] F. Centurelli, S. Costi, M. Olivieri, S. Pennisi, A. Trifiletti, "Robust three-state PFD architecture with enhanced frequency acquisition capabilities," in *Proceedings of International Symposium on Circuits and Systems*, Vol. 4, pp. 812-815, May 2004.
- [28] Mark G. Johnson, Edwin L. Hudson, "A variable delay line PLL for CPU-coprocessor synchronization," in *IEEE Journal of Solid-State Circuits*, Vol. 23, no. 5, pp. 1218-1223, Oct. 1988.
- [29] Federico Baronti, Diego Lunardini, Roberto Roncella, Roberto Saletti, "A self-calibrating delay-locked delay line with shunt-capacitor circuit scheme," in *IEEE Journal of Solid-State Circuits*, Vol. 39, pp. 384–387, Feb. 2004.
- [30] Young-Jin Jeon, Joong-Ho Lee, Hyun-Chul Lee, Kyo-Won Jin, Kyeong-Sik Min, Jin-Yong Chung, Hong-June Park, "A 66-333-MHz 12-mW register-controlled DLL with a single delay line and adaptive-duty-cycle clock dividers for production DDR SDRAMs," in *IEEE Journal of Solid-State Circuits*, Vol. 39, no. 11, pp. 2087-2092, Nov. 2004.
- [31] Yoshinori Okajima, Masao Taguchi, Miki Yanagawa, Koichi Nishimura, Osamu Hamada, "Digital Delay Locked Loop and Design Technique for High-Speed Synchronous Interface," in *IEICE Transactions on Electronics*, Vol. E79-C, no. 6, pp. 798-807, Jun. 1996.
- [32] Atsushi Hatakeyama, Hirohiko Mochizuki, Tadao Aikawa, Masato Takiia, Yuki Ishii, Hironobu Tsuboi, Shin-ya Fujioka, Shusaku Yamaguchi, Makoto Koga, Yuji Serizawa, Koichi Nishimura, Kuninori Kawabata, Yoshinori Okajima, Michiari Kawano, Hideyuki Kojima, Kazuhiro Mizutani, Toru Anezaki, Masatomo Hasegawa, Masao Taguchi, "A 256 Mb SDRAM using a register-controlled digital DLL," in *IEEE Journal of Solid-State Circuits*, Vol. 32, no. 11, pp. 1728-1734, Nov. 1997.
- [33] Feng Lin, Jason Miller, Aaron Schoenfeld, Manny Ma, R. Jacob Baker, "A register-controlled symmetrical DLL for double-data-rate DRAM," in *IEEE Journal of Solid-State Circuits*, Vol. 34, pp. 565-568, Apr. 1999.
- [34] Hiroki Sutoh, Kimihiro Yamakoshi, Masayuki Ino, "Circuit technique for skew-free clock distribution," in *Proceeding of IEEE Custom Integrated Circuits Conference*, 1995, pp. 163–166.
- [35] Hiroki Sutoh, Kimihiro Yamakoshi, "A clock distribution technique with an automatic skew compensation circuit," in *IEICE Transactions on Electronics*, Vol. E81-C, no. 2, pp. 277-283, Feb. 1998.
- [36] Chorng-Sii Hwang, Wang-Chih Chung, Chih-Yong Wang, Hen-Wai Tsao, Shen-Iuan Liu, "A 2V Clock Synchronizer using Digital Delay-Locked Loop," in *Proceeding of IEEE Asia Pacific Conference on ASICs*, pp. 91-94, Aug. 2000.
- [37] Ching-Che Chung, Chen-Yi Lee, "An all-digital phase-locked loop for high-speed clock generation," in *IEEE Journal of Solid-State Circuits*, Vol. 38, pp. 347-351, Feb. 2003.
- [38] Hyunsoo Chae, Dongsuk Shin, Kisoo Kim, Kwan-Weon Kim, Young Jung Choi, Chulwoo Kim, "A wide-range all-digital multiphase DLL with supply noise tolerance," in *Proceeding of IEEE Asian Solid-State Circuits Conference*, pp. 421-424, Nov. 2008.
- [39] Ching-Che Chung, Chia-Lin Chang, "A wide-range all-digital delay-locked loop in 65nm CMOS technology," in *Proceeding of International Symposium on VLSI Design Automation and Test (VLSI-DAT)*, pp. 66-69, Apr. 2010.