# An All-Digital Phase-Locked Loop with High-Resolution for SoC Applications

Duo Sheng, Ching-Che Chung and Chen-Yi Lee Dept. of Electronics Engineering National Chiao Tung University Hsinchu, Taiwan, R.O.C hysteria@si2lab.org

## ABSTRACT

In this paper, we propose a very high-resolution all-digital phase-locked loop (ADPLL) that is designed with a cell library and described in a hardware description language (HDL). The proposed ADPLL uses a novel digitally controlled oscillator (DCO) to achieve 1.06ps resolution, and the controllable range of the DCO can be extended easily. The dead zone of the proposed phase/frequency detector (PFD) is 5ps. The proposed ADPLL can be easily ported to different processes as a soft intellectual property (IP) block, making it very suitable for System-on-Chip (SoC) and system-level applications.

## 1. INTRODUCTION

A phase-locked loop (PLL) is a very important clocking IP for many digital systems, such as digital communication systems and microprocessors. It can be used for frequency synthesis, clock de-skewing, and duty-cycle enhancement. Traditionally, PLLs are designed with analog approaches. However, in the SoC era and in deep-submicron technologies, integrating an analog block into a digital system takes considerable design effort. Furthermore, the analog blocks must be redesigned whenever the process technology changes. In contrast, an all-digital phase-locked loop (ADPLL) uses a cell-based design approach, so it can be easily integrated into a digital system. In addition, the ADPLL has higher immunity to switching noise and to process, voltage, and temperature (PVT) variations.

Many ADPLLs have been proposed to overcome the drawbacks of the analog PLL [1]-[5]. There are two key components in the ADPLL: the digitally controlled oscillator (DCO) and the phase/frequency detector (PFD). The most important design consideration for the ADPLL is how to design a high-resolution, wide-range DCO and a PFD with a small dead zone. Since the major jitter sources of the ADPLL are the DCO and the PFD, a high-resolution DCO and a small-dead-zone PFD can reduce the jitter of the ADPLL significantly. In [1], [2], the DCO consists of many MOS transistors with binary-weighted widths to achieve high resolution. However, this requires full-custom design and takes a large area. In [3], [4], the DCO has two stages, a coarse-tuning stage and a fine-tuning stage, so it can achieve high resolution and a wide frequency range. Some designs use tri-state buffers to form the DCO [3]-[5]. The tri-state buffers act as a path selector that selects among different delay paths to obtain different oscillation frequencies. The proposed novel DCO achieves 1.06ps resolution, and its controllable range can be extended easily. In addition, the proposed PFD improves the dead zone of the PFD in [4] to 5ps. Thus the proposed ADPLL can achieve low-jitter operation.

This paper presents a new ADPLL solution for SoC applications. The proposed ADPLL can be implemented with a cell library, and all of its blocks, including the DCO and PFD, can be described in HDL. Because of this portability, it can be used as a soft IP. It is very suitable for SoC design since both design time and complexity are reduced. Since the proposed DCO achieves both high resolution and a wide frequency range, it can meet the demands of system-level integration.

This paper is organized as follows. Section 2 describes the architecture and the lock-in algorithm of the proposed ADPLL. Section 3 describes the proposed DCO and PFD circuits. Section 4 shows the experimental results of the proposed ADPLL, and Section 5 concludes the paper.

## 2. ARCHITECTURE OVERVIEW

#### 2.1 Architecture

Fig. 1 shows the proposed ADPLL architecture. It consists of several functional blocks: the PFD, the ADPLL controller, two DCOs (tracking DCO and average DCO), and three frequency dividers (pre-divider, DCO divider and output divider). The tracking DCO tracks the reference clock (Ref. CLK), and the average DCO generates the output clock. With this average mechanism, the proposed ADPLL can generate a low-jitter output clock.



FIGURE 1. THE PROPOSED ADPLL ARCHITECTURE

Through the DCO divider, the signal DCO\_M is the output of the tracking DCO divided by M, and Ref\_N is the reference clock divided by N. The PFD generates a "lead" or "lag" signal depending on the phase and frequency difference between Ref\_N and DCO\_M. If DCO\_M leads Ref\_N, the PFD generates a "lead" signal that slows down the tracking DCO. Conversely, when DCO\_M lags Ref\_N, the PFD generates a "lag" signal that speeds up the tracking DCO. When the ADPLL controller receives "lead" or "lag" from the PFD, it changes the DCO control code (DCO\_code [16:0]), which in turn controls the tracking DCO to generate the output clock (DCO\_CLK). These blocks form a closed loop that achieves the "phase-locked" function.
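The lock condition implied by this loop can be written out explicitly. As a sketch, let $f_{\mathrm{ref}}$ be the reference clock frequency and $f_{\mathrm{DCO}}$ the tracking DCO frequency (both symbols are introduced here for illustration only). Because the PFD compares the two divided clocks, their frequencies are equal at lock:

$$
\frac{f_{\mathrm{ref}}}{N} = \frac{f_{\mathrm{DCO}}}{M}
\quad\Longrightarrow\quad
f_{\mathrm{DCO}} = \frac{M}{N}\, f_{\mathrm{ref}}
$$

For instance, a ratio of $M/N = 10$ with a 30MHz reference gives a 300MHz DCO clock.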

#### 2.2 Lock-in Algorithm

The ADPLL has two operation modes: frequency acquisition mode and phase acquisition mode. Locking starts in the frequency acquisition mode. Initially, the DCO oscillates at the middle of its operating range, and the search step is one fourth of the DCO operating range. When the ADPLL controller receives the "lead" or "lag" signal from the PFD, it decreases or increases the DCO control code, respectively, and the DCO frequency changes accordingly. When the PFD output changes from "lead" to "lag" (or vice versa), the search direction is reversed and the search step is reduced to one half of the previous step. When the search step has been reduced to one, the frequency acquisition mode is complete. Fig. 2 shows the frequency acquisition mode operation of the ADPLL controller; the DCO control code converges to a small range of values.
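The search described above can be sketched in a few lines of Python. This is only a behavioral illustration, not the authors' HDL: the function names, the `pfd` callback, the clamping of the code to its valid range, and the `max_cycles` safety bound are all assumptions made for the example. It also assumes that a larger control code means a faster DCO, which matches the statement that "lead" decreases the code.

```python
def frequency_acquisition(pfd, code_bits=17, max_cycles=1000):
    """Return the list of DCO control codes visited during frequency acquisition.

    `pfd` is a hypothetical callback: given the current code it returns
    "lead" (DCO too fast) or "lag" (DCO too slow).
    """
    full_range = 1 << code_bits
    code = full_range // 2            # start at the middle of the operating range
    step = full_range // 4            # initial search step: one fourth of the range
    prev = None
    trace = [code]
    for _ in range(max_cycles):       # safety bound (assumption, not in the paper)
        out = pfd(code)
        if prev is not None and out != prev:
            step = max(step // 2, 1)  # polarity change: halve the search step
        # "lead" decreases the code and "lag" increases it, as in the text
        code += step if out == "lag" else -step
        code = min(max(code, 0), full_range - 1)  # clamp to valid codes (assumption)
        trace.append(code)
        prev = out
        if step == 1:                 # search step reduced to one: acquisition done
            break
    return trace


def ideal_pfd(target_code):
    """Toy PFD model: reports "lead" whenever the code is above the target."""
    return lambda code: "lead" if code > target_code else "lag"
```

Calling `frequency_acquisition(ideal_pfd(40000))` starts at code 65536 with a step of 32768 and ends within one code of the target, after which phase acquisition would take over.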

After the frequency acquisition mode completes, the ADPLL enters the phase acquisition mode, whose goal is to track the phase of the reference clock. Fig. 3 is the flow chart of the phase acquisition operation. At the beginning of phase acquisition, the speed-up count (SPEEDUP\_COUNT) is set to zero. When the PFD output changes from "lead" to "lag" (or vice versa), the polarity has changed, and the search step is reduced to one half of the previous step. If the direction stays the same, the speed-up count is incremented by one. When the speed-up count equals eight, the search step is doubled. Increasing the search step in this way accelerates phase tracking.
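A minimal Python sketch of this step-adaptation rule is given below. The class and field names are hypothetical. Resetting the speed-up count after a polarity change or after a doubling, and limiting the step to a minimum of one, are assumptions, since the text does not spell out those details of the flow chart.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PhaseTracker:
    code: int                      # current DCO control code
    step: int = 1                  # current search step
    speedup_count: int = 0         # SPEEDUP_COUNT, starts at zero
    prev: Optional[str] = None     # previous PFD output, "lead" or "lag"

    def update(self, pfd_out: str) -> int:
        if self.prev is not None and pfd_out != self.prev:
            # Polarity change: halve the step (minimum of one is an assumption)
            self.step = max(self.step // 2, 1)
            self.speedup_count = 0          # reset is an assumption
        elif self.prev is not None:
            # Same direction as before: count towards a speed-up
            self.speedup_count += 1
            if self.speedup_count == 8:
                self.step *= 2              # eight in a row: double the step
                self.speedup_count = 0      # reset is an assumption
        # "lag" increases the code, "lead" decreases it
        self.code += self.step if pfd_out == "lag" else -self.step
        self.prev = pfd_out
        return self.code
```

Feeding the tracker a long run of identical PFD outputs makes the step grow, while alternating outputs shrink it back towards one.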



FIGURE 2. FREQUENCY ACQUISITION MODE OPERATION



FIGURE 3. FLOW CHART OF PHASE ACQUISITION MODE

#### 2.3 Average Mechanism

Due to the dead zone of the PFD and the noise on the reference clock, the DCO control code varies slightly in the phase acquisition mode. To reduce the resulting jitter, the proposed ADPLL uses an average mechanism. The ADPLL controller detects the maximum and minimum of the DCO control code within m reference clock cycles and takes the average of these two values. This average value is used as the average DCO control code (avg\_code [16:0]) for the average DCO. Because the average DCO does not follow the tracking noise, the ADPLL generates a more stable, low-jitter output clock.
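The averaging step itself is simple arithmetic, sketched below in Python. The function name and the use of the most recent m samples are illustrative choices, and rounding down with integer division is an assumption about how a hardware implementation would truncate.

```python
def average_code(tracking_codes, m):
    """Midpoint of the max and min tracking-DCO codes over the last m cycles."""
    if m <= 0 or len(tracking_codes) < m:
        raise ValueError("need at least m samples of the DCO control code")
    window = tracking_codes[-m:]
    # Average of the two extremes; integer division is an assumption
    return (max(window) + min(window)) // 2
```

For example, if the tracking code wanders over 40000, 40003, 39998 and 40001 in four cycles, the average DCO receives (40003 + 39998) // 2 = 40000.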

## 3. KEY COMPONENTS DESIGN

We propose a DCO circuit and a PFD circuit to achieve a high-performance ADPLL. These essential components are implemented with the cell library and use no passive components, so they are easy to integrate into the ADPLL.

#### 3.1 Digitally Controlled Oscillator

The digitally controlled oscillator (DCO) is the heart of the ADPLL. Just like the VCO in an analog PLL, the DCO provides the ADPLL output clock signal. The frequency of the DCO output clock is controlled by the DCO control code. Fig. 4(a) shows the architecture of the proposed DCO. The DCO is composed of three stages: the coarse-tuning stage, the 1<sup>st</sup> fine-tuning stage and the 2<sup>nd</sup> fine-tuning stage. First, in the coarse-tuning stage, there are 128 different paths, and only one path is selected by a 128-to-1 path selection MUX built from tri-state buffers. To reduce the loading capacitance at the MUX output, the path selection MUX is divided into two stages. In the first stage, there are sixteen delay path groups (G0 ~ G15), and each delay path group has eight different delay paths. Only one delay path in each delay group is selected by the first stage selection signals (CON [0] ~ CON [7]). The second stage receives the sixteen delay paths from the first stage and selects one of them with the second stage selection signals (CON2 [0] ~ CON2 [15]). From SPICE simulation, the resolution of the coarse-tuning stage is the delay of one coarse delay cell, which is about 60.71ps. Because there are 128 different delay paths, the controllable range of the DCO is about 7.771ns (60.71ps \* 128).
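The two-stage selection described above can be illustrated with a short Python sketch that turns a coarse path index (0 to 127) into the one-hot CON and CON2 signals and reproduces the range arithmetic. How the index is split between the two stages (upper part selects the group, lower part the path within the group) is an assumption made for illustration; the function names are hypothetical.

```python
COARSE_STEP_PS = 60.71     # typical-case coarse delay cell, from the text
N_GROUPS = 16              # delay path groups G0 ~ G15
PATHS_PER_GROUP = 8        # delay paths per group


def decode_coarse(index):
    """Map a coarse path index to one-hot (CON[0..7], CON2[0..15]) lists."""
    if not 0 <= index < N_GROUPS * PATHS_PER_GROUP:
        raise ValueError("coarse index must be in 0..127")
    # Assumed split: quotient picks the group, remainder picks the path
    group, path = divmod(index, PATHS_PER_GROUP)
    con = [int(i == path) for i in range(PATHS_PER_GROUP)]   # first stage
    con2 = [int(i == group) for i in range(N_GROUPS)]        # second stage
    return con, con2


def coarse_range_ps():
    """Controllable range of the coarse stage: one cell delay per path."""
    return COARSE_STEP_PS * N_GROUPS * PATHS_PER_GROUP
```

With these numbers `coarse_range_ps()` evaluates to 60.71 \* 128 = 7770.88ps, i.e. about 7.771ns, and exactly one bit is set in each of the two selection vectors for any valid index.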

Second, to increase the frequency resolution of the DCO, the 1<sup>st</sup> fine-tuning stage is added to the DCO design. Fig. 4(b) shows the architecture of the proposed 1<sup>st</sup> fine-tuning stage. It is composed of 32 shunted tri-state buffers and inverters. The tri-state buffers are controlled by the control signals (F1ON [0] ~ F1ON [31]). The output frequency of the 1<sup>st</sup> fine-tuning stage depends on the number of turned-on tri-state buffers: as more buffers are turned on, the driving strength of the stage increases and its delay decreases. The controllable range of the 1<sup>st</sup> fine-tuning stage is 90.61ps and the largest step is 17.74ps.

Finally, to further increase the DCO resolution, the 2<sup>nd</sup> fine-tuning stage is added after the 1<sup>st</sup> fine-tuning stage. Fig. 4(c) shows the circuit of the 2<sup>nd</sup> fine-tuning stage. It is composed of 32 three-input NOR gates. The basic concept of the 2<sup>nd</sup> fine-tuning stage is to control the gate capacitance of a NOR gate through its input state [6], [7]. The control signals (F2ON [0] ~ F2ON [31]) set the input state of the NOR gates. As the gate capacitance of the NOR gates changes, the delay of the 2<sup>nd</sup> fine-tuning stage changes. From SPICE simulation, the controllable range of the 2<sup>nd</sup> fine-tuning stage is 34.06ps and the step is 1.06ps. Because the rise and fall times in the cell library are unbalanced, a duty cycle compensator is added after the 2<sup>nd</sup> fine-tuning stage. The duty cycle compensator is composed of a chain of OR gates and generates a clock signal with a balanced duty cycle.

Note that the controllable range of each stage should cover the step of the previous stage, so that the DCO does not have any unreachable zone. Table 1 shows the DCO simulation results in the typical case (TT, 1.2V, $25^{\circ}$C), the best case (FF, 1.32V, $-40^{\circ}$C), and the worst case (SS, 1.08V, $125^{\circ}$C). Because the controllable range of the coarse-tuning stage determines the controllable range of the DCO, the range can be extended by changing the coarse delay cell. Because the finest step of the 2<sup>nd</sup> fine-tuning stage determines the resolution of the DCO, the DCO can achieve both a wide controllable range and high resolution.
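The coverage condition stated above can be checked numerically against the Table 1 figures. The Python sketch below does this for the typical and worst corners, the ones for which Table 1 gives a clear coarse-tuning step. The dictionary layout and function name are hypothetical; the numbers are the ones listed in Table 1.

```python
# Step and range values in ps, taken from Table 1
CORNERS_PS = {
    "typical": {"coarse_step": 60.71, "fine1_step": 17.74,
                "fine1_range": 90.61, "fine2_range": 34.06},
    "worst":   {"coarse_step": 98.41, "fine1_step": 35.35,
                "fine1_range": 189.73, "fine2_range": 59.2},
}


def has_no_unreachable_zone(corner):
    """True if each fine stage's range covers the previous stage's step."""
    covers_coarse = corner["fine1_range"] >= corner["coarse_step"]
    covers_fine1 = corner["fine2_range"] >= corner["fine1_step"]
    return covers_coarse and covers_fine1


results = {name: has_no_unreachable_zone(c) for name, c in CORNERS_PS.items()}
```

Both corners satisfy the condition: for example, in the typical case 90.61ps covers the 60.71ps coarse step, and 34.06ps covers the 17.74ps largest step of the 1<sup>st</sup> fine-tuning stage.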

#### 3.2 Phase/Frequency Detector

The phase/frequency detector (PFD) detects the phase and frequency difference between Ref\_N and DCO\_M and sends "lead" or "lag" signals to the ADPLL controller. The schematic of the PFD is shown in Fig. 5. When DCO\_M leads Ref\_N, "lead" generates a low pulse and "lag" remains high. Conversely, when DCO\_M lags Ref\_N, "lag" generates a low pulse and "lead" remains high. To minimize the dead zone of the PFD, pulse amplifiers and signal extenders are added to the PFD. The signal extenders are shown as the shaded blocks in Fig. 5. The pulse amplifier circuit uses a chain of two-input AND gates to enlarge the pulse width applied to the output registers. Because the signal extender and the pulse amplifier (Pulse Amp.) enlarge the phase difference between Ref\_N and DCO\_M, the following D flip-flops can detect it. From the simulation results, the minimum detectable phase error of the PFD is 5ps. Fig. 6 shows the SPICE simulation waveforms of the PFD.
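The input/output behavior of the PFD can be summarized by a small behavioral model in Python. It only captures the active-low "lead"/"lag" outputs and the 5ps minimum detectable phase error, not the gate-level circuit of Fig. 5. The function name, the representation of edges as arrival times in picoseconds, and the treatment of an error of exactly 5ps as detectable are assumptions for illustration.

```python
DEAD_ZONE_PS = 5.0   # minimum detectable phase error from the simulation results


def pfd_outputs(ref_edge_ps, dco_edge_ps, dead_zone_ps=DEAD_ZONE_PS):
    """Return (lead, lag) levels: 0 means a low pulse, 1 means the line stays high.

    Arguments are the arrival times of the Ref_N and DCO_M rising edges.
    """
    error = ref_edge_ps - dco_edge_ps   # positive: DCO_M edge arrives first
    if error >= dead_zone_ps:
        return (0, 1)                   # DCO_M leads: low pulse on "lead"
    if error <= -dead_zone_ps:
        return (1, 0)                   # DCO_M lags: low pulse on "lag"
    return (1, 1)                       # inside the dead zone: nothing detected
```

A 10ps early DCO\_M edge therefore gives `(0, 1)`, while a 2ps difference falls inside the dead zone and gives `(1, 1)`.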



FIGURE 4. (A) ARCHITECTURE OF THE PROPOSED DCO, (B) 1ST FINE-TUNING STAGE, (C) 2ND FINE-TUNING STAGE

| Stage            | Best case step (ps) | Best case range (ps) | Typical case step (ps) | Typical case range (ps) | Worst case step (ps) | Worst case range (ps) |
|------------------|---------------------|----------------------|------------------------|-------------------------|----------------------|-----------------------|
| Coarse-tuning    |                     | 4272                 | 60.71                  | 7771                    | 98.41                |                       |
| 1st fine-tuning  | 9.89                | 54.43                | 17.74                  | 90.61                   | 35.35                | 189.73                |
| 2nd fine-tuning  | 0.67                | 21.51                | 1.06                   | 34.06                   | 1.85                 | 59.2                  |

TABLE 1. DCO SIMULATION RESULTS



FIGURE 5. PFD WITH SIGNAL EXTENDER AND DIGITAL AMPLIFIER



FIGURE 6. PFD SIMULATION WAVEFORMS

## 4. EXPERIMENTAL RESULTS

The proposed ADPLL is designed with a cell-based design flow. We use a hardware description language (HDL) to describe the ADPLL controller and the frequency dividers, and then use a logic synthesizer to synthesize them with a 0.13$\mu$m 1P8M CMOS process cell library. After gate-level simulation, the layout of the ADPLL is generated by an automatic placement and routing (APR) tool. Several points need attention during placement and routing. First, after one DCO has been placed and routed, the other DCO is duplicated from it. Second, during design integration, area and timing constraints should be given to minimize wire delay. Finally, for stable operation, as many power stripes and rings as possible should be added. Fig. 7 shows the layout of the ADPLL. The core size of the ADPLL is 500$\mu$m $\times$ 500$\mu$m, and the power consumption of the ADPLL is 4.22mW (@100MHz, 1.2V).

The post-layout simulation of the ADPLL is shown in Fig. 8. The frequency of the reference clock is 30MHz and the division ratio is 10, so the frequency of the ADPLL output clock is 300MHz (= 30MHz \* 10). When DCO\_M leads Ref\_N, "lead" generates a low pulse and "lag" stays high. Conversely, when DCO\_M lags Ref\_N, "lag" generates a low pulse and "lead" stays high. When the ADPLL controller receives the "lead" or "lag" signal from the PFD, the DCO control code is decreased or increased, respectively, and the DCO frequency changes accordingly. In Fig. 8, we can see that both the tracking DCO control code and the average DCO control code converge to stable values, completing the lock function.

## 5. CONCLUSIONS

In this paper, a high-resolution ADPLL is proposed. Because the novel DCO has three different tuning stages, it achieves a wide operating range and 1.06ps resolution. With the signal extender and the pulse amplifier, the dead zone of the proposed PFD is reduced to 5ps. The ADPLL is implemented with a $0.13\mu$m 1P8M CMOS process cell library. Since all blocks of the proposed ADPLL are described in HDL, it can be ported to different processes easily, which reduces design cycle time and complexity. It is therefore very suitable for system-level integration and SoC applications.



FIGURE 7. LAYOUT OF THE PROPOSED ADPLL



FIGURE 8. POST-LAYOUT SIMULATION OF THE PROPOSED ADPLL

## ACKNOWLEDGEMENT

The authors would like to thank the members of the SI2 group of National Chiao Tung University for many fruitful discussions on design and implementation. The MPCA cell-library support from Faraday Technology Corporation is acknowledged as well.

## REFERENCES

- [1] J. Dunning, G. Garcia, J. Lundberg, and E. Nuckolls, "An all-digital phase-locked loop with 50-cycle lock time suitable for high-performance microprocessors," IEEE J. Solid-State Circuits, vol. 30, pp. 412–422, Apr. 1995.
- [2] J.-S. Chiang and K.-Y. Chen, "The design of an all-digital phase-locked loop with small DCO hardware and fast phase lock," IEEE Trans. Circuits Syst. II, vol. 46, pp. 945–950, Jul. 1999.
- [3] T.-Y. Hsu, C.-C. Wang, and C.-Y. Lee, "Design and analysis of a portable high-speed clock generator," IEEE Trans. Circuits Syst. II, vol. 48, pp. 367–375, Apr. 2001.
- [4] C.-C. Chung and C.-Y. Lee, "An all digital phase-locked loop for high-speed clock generation," IEEE J. Solid-State Circuits, vol. 38, no. 2, pp. 347–351, Feb. 2003.
- [5] T. Olsson and P. Nilsson, "A digitally controlled PLL for SoC applications," IEEE J. Solid-State Circuits, vol. 39, no. 5, pp. 751–760, May 2004.
- [6] J. M. Rabaey, Digital Integrated Circuits: A Design Perspective. NJ: Prentice-Hall, 1996.
- [7] P.-L. Chen, C.-C. Chung, and C.-Y. Lee, "A novel digitally-controlled varactor for portable delay cell design," IEICE Trans. Fundamentals, vol. E87-A, pp. 3324–3326, Dec. 2004.