# National Chung Cheng University, Graduate Institute of Computer Science and Information Engineering

### Master's Thesis

An All-Digital Clock and Data Recovery Circuit for Spread Spectrum SerDes Applications

Graduate Student: 林揚迪

Advisor: Dr. 鍾菁哲

July 2012 (ROC year 101)

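### Abstract (Chinese Version)

With the continuous progress of semiconductor technology, the operating clock frequency of integrated circuits and the data rates of various communication interfaces keep increasing. However, higher clock frequencies and data rates make electromagnetic interference (EMI) more severe. EMI causes electronic products to disturb one another during high-speed operation, which leads to abnormal circuit behavior. Many techniques have been proposed to address the EMI problem, and among them the spread spectrum clock generator (SSCG) is the most popular. An SSCG adds modulation to an ordinary clock generator so that the frequency of the generated clock swings up and down within a range, which spreads the spectrum. Compared with other EMI reduction techniques, the SSCG has a lower cost.

In addition, once spread spectrum clocking is applied, the data rate is no longer fixed at a single rate but lies within a specific frequency range. This adds extra clock jitter to the transmitted data and raises the bit error rate at the receiver. The spreading range is therefore usually kept small, at about 5000ppm. However, a larger spreading range (greater than 10%) would attenuate EMI more effectively. Consequently, a receiver circuit that tolerates a larger spreading range benefits the EMI attenuation of the whole circuit.

The all-digital clock and data recovery circuit proposed in this thesis includes several features for coping with a large spreading range. An interpolator-based fine-tuning architecture for the oscillator overcomes the non-monotonic period behavior of conventional cascaded oscillators. Adaptive dynamic gain adjustment and fast phase compensation based on a time-to-digital converter greatly strengthen the tracking ability that conventional clock and data recovery circuits lack under a large spreading range (greater than 10%). Compared with conventional clock and data recovery circuits, the proposed circuit also locks quickly and requires no reference clock, multi-phase clock generator, or oversampling architecture, which significantly reduces chip area, power consumption, and hardware design complexity.

The chip in this thesis is implemented with a standard cell library in a 90nm process and therefore has good process portability. The operating range is 76MHz to 480MHz, the chip area is 0.09mm<sup>2</sup>, and the power consumption is 4.28mW at 480MHz with 10% down spread.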


### Abstract

With the great progress of semiconductor technology, the operating frequency of ICs and the data rate of communication systems keep increasing. However, higher operating frequencies and data rates cause stronger electromagnetic interference (EMI). With EMI, electronic devices disturb each other at high speed and operate irregularly. Many techniques have been proposed to solve the EMI problem, and the spread spectrum clock generator (SSCG) is the most popular among them. An SSCG adds a modulation profile to a conventional clock generator, so that the output frequency spreads over a frequency range and achieves a spread spectrum. Compared with other EMI solutions, the SSCG has a lower cost.

In addition, with an SSCG the data rate is no longer fixed but varies within a certain frequency range. The transmitter introduces additional jitter at the receiver, and the bit error rate of the receiver increases accordingly. Hence, the spreading ratio of an SSCG is usually kept small, such as 5000ppm. However, if the transmitter can transmit the data stream with a larger spreading ratio (>10%), the EMI reduction is greater. As a result, a receiver that can tolerate a larger spreading ratio improves the EMI reduction performance of the whole circuit.

In this thesis, we propose an all-digital clock and data recovery (ADCDR) circuit with several features for handling a large spreading ratio. The interpolator-based fine-tuning architecture of the digitally controlled oscillator (DCO) overcomes the non-monotonic behavior of the conventional cascaded DCO architecture. The adaptive gain control scheme and the time-to-digital converter (TDC)-based fast phase compensation enhance the tracking ability, which is insufficient in conventional CDR circuits under a large spreading ratio (>10%). In addition, the proposed ADCDR circuit locks quickly and needs no reference clock, multi-phase clock generator, or oversampling architecture. The area, power consumption, and design complexity are therefore greatly reduced.

The chip in this thesis is implemented in a 90nm CMOS process with standard cells, and thus it has good portability across different processes. The core area is 0.09mm<sup>2</sup>, and the power consumption is 4.28mW at 480MHz with 10% down spread.



### Contents

- Chapter 1 Introduction
  - 1.1 Electromagnetic Interference
  - 1.2 Spread Spectrum Clocking
  - 1.3 Thesis Organization
- Chapter 2 Survey of Conventional CDR circuits
  - 2.1 Basic concept of CDR circuit
    - 2.1.1 Sample CDR Circuit
    - 2.1.2 Phase Detector in CDR
  - 2.2 Conventional CDR architecture
    - 2.2.1 PLL-based CDR circuit
    - 2.2.2 Phase Interpolator based CDR
    - 2.2.3 Blind oversampling based CDR
  - 2.3 Packet and NRZI
  - 2.4 Summary
- Chapter 3 All-Digital CDR Circuit for Spread Spectrum SerDes Applications
  - 3.1 The Proposed ADCDR Overview
  - 3.2 Dual mode Phase and Frequency Detector
  - 3.3 Time-to-Digital Converter Embedded DCO
    - 3.3.1 Structure
    - 3.3.2 Fast Lock-in Procedure
    - 3.3.3 Coarse Tuning Stage
    - 3.3.4 Fine Tuning Stages
  - 3.4 Adaptive Gain Control Scheme and TDC-based Fast Phase Compensation
    - 3.4.1 Adaptive Gain Control Scheme
    - 3.4.2 TDC-based Fast Phase Compensation
- Chapter 4 Experimental Results
  - 4.1 Test Chip Implementation
  - 4.2 Full Chip Simulation
  - 4.3 Bit Error Rate measurement
    - 4.3.1 Random Jitter Tolerance
    - 4.3.2 Sinusoidal Jitter Tolerance
  - 4.4 Chip Summary and Comparison Table
- Chapter 5 Conclusion and Future Works
- Reference

### Figure

- Fig. 1.1 Spread spectrum clock generation
- Fig. 1.2 Power spectrum density of the center-spread profile
- Fig. 2.1 Block diagram of a spread-spectrum SerDes
- Fig. 2.2 Basic PLL type CDR circuit
- Fig. 2.3 Waveform of basic PLL type CDR circuit
- Fig. 2.4 Architecture of the PLL-based CDR circuit
- Fig. 2.5 Architecture of phase interpolator based CDR
- Fig. 2.6 Architecture of blind oversampling based CDR
- Fig. 2.7 Waveform of USB 2.0 packet [17]
- Fig. 3.1 The block diagram of the proposed ADCDR circuit architecture
- Fig. 3.2 Proposed dual mode PFD architecture [17]
- Fig. 3.3 Waveform of proposed dual mode PFD [17]
- Fig. 3.4 The TDC-embedded DCO architecture
- Fig. 3.5 TDC measurement procedure
- Fig. 3.6 coarse tuning stage architecture of DCO
- Fig. 3.7 fine tuning stage architecture of DCO
- Fig. 3.8 fine-tune code cross over 3 coarse stage
- Fig. 3.9 Phase error accumulation in the CID region
- Fig. 3.10 Polarity changes in the CID region
- Fig. 3.11 The proposed AGCID and TDC compensation flow
- Fig. 3.12 The proposed TDC circuit for the PFD
- Fig. 4.1 Test chip floor planning and I/O planning
- Fig. 4.2 Layout of the test chip
- Fig. 4.3 System simulation of proposed ADCDR circuit in 480MHz
- Fig. 4.4 Random jitter BER performance
- Fig. 4.5 Sinusoidal jitter tolerance performance

### Table

- Table 2.1 CDR architecture comparison
- Table 4.1 I/O PADs description
- Table 4.2 Block module name
- Table 4.3 Chip summary
- Table 4.4 Comparison table



### Chapter 1 Introduction

### **1.1 Electromagnetic Interference**

Nowadays, the operating speed of electronic devices is becoming faster and faster, which results in stronger electromagnetic interference (EMI). With EMI, electronic devices disturb each other and operate irregularly. As a result, EMI has become an important issue for circuit designers.

Many techniques have been proposed to reduce EMI, such as shielding and low-voltage differential clocking. The shielding technique can effectively suppress the EMI of the system. However, it increases not only the area but also the cost of the design. Similarly, the low-voltage differential clocking technique has a complex routing problem.

### **1.2 Spread Spectrum Clocking**

In recent years, in many high-speed serial link applications, such as USB 3.0, SATA 3.0, PCI-E 3.0, and DisplayPort, the spread spectrum clock generator (SSCG) is adopted in the transmitter part to effectively reduce the EMI with low hardware cost.

The spreading ratio in an SSCG determines the amount of EMI reduction, and it also influences the jitter performance of the output clock. In addition, a transmitter with an SSCG introduces additional jitter at the receiver, and thus the bit error rate (BER) of the receiver increases accordingly. Fig. 1.1 shows center-spread spread spectrum clock generation with a triangular modulation profile. In Fig. 1.1, the spreading ratio is  $\alpha$ , the baseline frequency is  $F_{center}$ , and the modulation frequency is  $f_m$ .



The average frequency (baseline frequency) in the center-spread modulation should be equal to the non-spread clock frequency, and the spreading ratio determines the maximum and minimum output frequencies of the SSCG. For example, if  $F_{center}$  is 160MHz and the spreading ratio ( $\alpha$ ) is 10%, the output clock frequency ranges from 152MHz to 168MHz.
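The arithmetic in this example can be checked with a short sketch. The function names `center_spread_range` and `triangular_profile` are illustrative assumptions, not part of the proposed design. The spreading ratio is taken as the total peak-to-peak deviation, which is the convention that reproduces the 152MHz to 168MHz range, and the profile is assumed to start at the minimum frequency at time zero.

```python
def center_spread_range(f_center_hz, alpha):
    """Return (f_min, f_max) of a center-spread clock.

    Assumption: alpha is the total peak-to-peak spreading ratio, so the
    clock deviates by +/- alpha/2 around f_center_hz.
    """
    half_span = f_center_hz * alpha / 2.0
    return f_center_hz - half_span, f_center_hz + half_span


def triangular_profile(t_s, f_center_hz, alpha, f_m_hz):
    """Instantaneous frequency of a triangular center-spread profile at time t_s."""
    f_min, f_max = center_spread_range(f_center_hz, alpha)
    phase = (t_s * f_m_hz) % 1.0  # position inside one modulation period
    # Rise from f_min to f_max in the first half period, then fall back.
    tri = 2.0 * phase if phase < 0.5 else 2.0 * (1.0 - phase)
    return f_min + (f_max - f_min) * tri


f_min, f_max = center_spread_range(160e6, 0.10)
print(f_min / 1e6, f_max / 1e6)  # frequency limits in MHz
```

Averaged over one modulation period, the triangular profile returns the baseline frequency, which matches the requirement that the average frequency equals the non-spread clock frequency.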

Fig. 1.2 compares the EMI of a spread spectrum clock (SSC) with center-spread modulation against that of a non-spread clock. As we can see, the SSCG can effectively reduce the EMI.



Fig. 1.2 Power spectrum density of the center-spread profile

In the SATA specifications [1], the spreading ratio is 5000ppm (0.5%) with a 30~33 kHz modulation frequency. In most standards, the modulation frequency is chosen above the audio band, typically in the range of 30~33 kHz, to prevent FM receivers from receiving this signal. This spreading ratio and modulation frequency provide enough EMI reduction while the receiver can still maintain a good bit error rate (BER) performance.

However, if the transmitter can transmit the data stream with a larger spreading ratio (>10%), the EMI reduction is greater, as discussed in [2]-[4]. Nevertheless, the frequency deviation produced by the SSCG becomes a challenge for the CDR circuit design.

### **1.3 Thesis Organization**

In this thesis, we discuss a referenceless all-digital clock and data recovery (ADCDR) circuit with an adaptive gain control scheme and time-to-digital converter (TDC)-based fast phase compensation for spread spectrum clocking applications.

In Chapter 2, we discuss the basic concept of the CDR circuit and survey conventional CDR circuits. In Chapter 3, the proposed all-digital clock and data recovery circuit for spread spectrum SerDes applications is presented. In Chapter 4, we show the experimental results, including the chip test plan, full chip simulation, chip summary, and comparison table. Finally, we conclude and point out some future work in Chapter 5.

### Chapter 2

### **Survey of Conventional CDR circuits**

### 2.1 Basic concept of CDR circuit

### 2.1.1 Sample CDR Circuit



Fig. 2.1 Block diagram of a spread-spectrum SerDes

The clock and data recovery circuit plays an important role in high-speed serial data transmission applications. Fig. 2.1 shows a possible transceiver block diagram with spread spectrum clocking. The transmitter (TX) is composed of a scrambler, an encoder, a serializer, a spread spectrum clock generator (SSCG), and a TX driver. The receiver (RX) is composed of a clock and data recovery (CDR) circuit, a deserializer, a decoder, and a de-scrambler.

In Fig. 2.1, the transmitter uses the scrambler to scramble the input data. The scrambler prevents the parallel data from consisting of regular patterns, which would result in severe EMI. After the parallel data is scrambled, the transmitter sends the encoded and serialized data pattern to the receiver. The CDR circuit has to extract the original clock from the incoming data pattern and retime the incoming data to perform the clock and data recovery. Then, the receiver de-serializes and de-scrambles the recovered data to complete the data transmission.



In conventional CDR circuits, a PLL-based architecture is usually applied. Fig. 2.2 shows a basic example of a PLL-based CDR circuit. The PLL loop consists of a phase and frequency detector (PFD), a loop filter (LF), and a voltage controlled oscillator (VCO), and it synchronizes the VCO clock (recovered clock) with the input data rate. After the CDR circuit is locked, the decision circuit (DFF) is triggered by the VCO clock to retime the input data pattern as the recovered data. Fig. 2.3 shows the waveform of the PLL-based CDR circuit. After the CDR is locked, the rising edge of the recovered clock is aligned with the input data transition edge, which completes the clock recovery. Finally, the falling edge of the recovered clock is used to retime the input data pattern and recover the data.



Fig. 2.3 Waveform of basic PLL type CDR circuit

### 2.1.2 Phase Detector in CDR

The phase detector (PD) in the CDR circuit must provide two essential functions: data transition detection and phase difference detection [5]. Several phase detector circuits for random data stream tracking have been published. Two PDs are well known for random data stream detection: the Hogge PD [7], which is a linear type, and the Alexander PD [8], which is a bang-bang type. Other PDs, such as the reduced-rate PD [9] and the 3X oversampling PD [10], have also been proposed in prior CDR circuits.

### 2.2 Conventional CDR architecture

### 2.2.1 PLL-based CDR circuit



Fig. 2.4 Architecture of the PLL-based CDR circuit

Fig. 2.4 shows an example of a PLL-based dual-loop CDR circuit. This architecture is composed of a multi-phase PLL loop and a CDR loop for data recovery and phase tracking. It is widely used in conventional CDR circuits [11][12][32]. The PLL uses the external reference clock to generate the high-speed multi-phase clocks. The CDR loop uses the multi-phase clocks to trigger the oversampling PD and controls the low-gain charge pump to perform the clock and data recovery.

In this architecture, the VCO clock rate can be half or a quarter of the data rate to save VCO power. However, the charge-pump based PLL faces several challenges in advanced CMOS processes (such as 90nm). The main problem is the charge-pump architecture, which uses a capacitor to store the control voltage of the VCO. In advanced CMOS processes, the transistor leakage current is severe. The leakage can therefore generate ripples on the control voltage and cause additional jitter in the output frequency. In addition, in the 90nm CMOS process, the operating voltage is down to 1.0V, which requires a trade-off between the operating frequency and the VCO gain.

In this dual-loop architecture, a narrow bandwidth loop filter is needed to prevent false lock, and the tuning range of the VCO should be decreased for the same reason. The narrow bandwidth filter decreases the tracking speed of the PLL, which reduces the input data jitter tolerance. Finally, a narrow bandwidth loop filter usually needs a large on-chip capacitor, and thus the chip area and the power consumption are increased.

The other problem of the dual-loop architecture is that the lock-in time of the charge-pump based PLL is too long. In USB 2.0 high-speed mode, the dual-loop architecture cannot complete the frequency and phase acquisition within the synchronization pattern. In addition, the PLL needs an external reference clock to generate the multi-phase clocks. The reference clock is often supplied by an external crystal oscillator, and thus the cost and power consumption are increased accordingly. As a result, the referenceless CDR architecture [13]-[18] is more attractive in today's system-on-chip (SoC) era.



### **2.2.2 Phase Interpolator based CDR**

Fig. 2.5 Architecture of phase interpolator based CDR

Fig. 2.5 shows the phase interpolator (PI) based CDR architecture. The PI based CDR is widely used in SATA applications [19]-[21]. This architecture is composed of a frequency tracking loop and a phase tracking loop. The frequency tracking loop uses the reference clock (F(ref)) to generate the high-speed multi-phase clocks that are referenced by the phase tracking loop. The phase interpolator is driven by the loop filter, and it adjusts the phase by interpolation. The PI based architecture is suitable for multichannel transmission because all channels can share the input clocks.

However, this architecture still needs an external reference clock and a multi-phase oscillator. With the external reference clock, the cost, area, and power consumption are increased. In addition, the multi-phase oscillator needs a special routing technique to ensure the consistency of each phase.

### 2.2.3 Blind oversampling based CDR



Fig. 2.6 Architecture of blind oversampling based CDR

Fig. 2.6 shows the blind oversampling circuit [22][23]. The PLL loop uses a reference clock to generate high-speed multi-phase clocks. The multi-phase clocks trigger the blind oversampling samplers (DFFs) to sample the input data. The data sampled by the multi-phase samplers is sent to the decision circuit (a majority-voting circuit) to decide the correct value of the data. The phase count (M) of the multi-phase clocks represents the oversampling rate, which is usually 2X, 3X, or 4X the data rate.
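The decision step can be illustrated with a minimal majority-voting sketch. The function name `majority_vote` is hypothetical, and the sketch assumes that the sample groups are already aligned to bit boundaries and that the oversampling rate is odd so that ties cannot occur. A real blind oversampling CDR has to establish the alignment with additional logic.

```python
def majority_vote(samples, oversampling=3):
    """Recover one bit per group of `oversampling` samples by majority voting.

    Assumption: groups are aligned to bit boundaries, and the rate is odd.
    """
    if oversampling % 2 == 0:
        raise ValueError("this sketch needs an odd oversampling rate to avoid ties")
    bits = []
    for start in range(0, len(samples) - oversampling + 1, oversampling):
        group = samples[start:start + oversampling]
        # The bit is 1 when more than half of the samples in the group are 1.
        bits.append(1 if 2 * sum(group) > oversampling else 0)
    return bits


# One corrupted sample per group does not change the decision.
print(majority_vote([1, 1, 0, 0, 0, 0, 1, 0, 1]))
```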

The blind oversampling architecture has no feedback phase tracking and offers fast acquisition. However, it still needs an external reference clock and a multi-phase oscillator, and the oversampling architecture increases the design complexity.

### 2.3 Packet and NRZI

In communication systems, the packet is stuffed with extra bits so that it does not go without a data transition for a long time. In a conventional CDR circuit, there is no tracking action in the consecutive identical digits (CID) region. If the input data has no transition for a long time, the CDR circuit loses the frequency lock and the bit error rate (BER) increases. We discuss the CID problem in more detail in Section 3.4.

In USB 2.0, a "0" bit is stuffed after any six consecutive "1" bits before the packet is transmitted. Then, each bit is encoded by a non-return-to-zero inverted (NRZI) encoder. In NRZI encoding, the signal has a transition if the transmitted bit is "0". On the other hand, the signal has no transition if the transmitted bit is "1". In 8b/10b encoding, which is used in SATA applications, the maximum run length of consecutive "0" or "1" is five.
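The stuffing and encoding steps can be sketched as follows. The function names `bit_stuff` and `nrzi_encode` are illustrative, the stuffing rule is assumed to insert a "0" after every run of six "1" bits, and the initial line level is an arbitrary assumption.

```python
def bit_stuff(bits, max_ones=6):
    """Insert a 0 after every run of `max_ones` consecutive 1 bits."""
    out, run = [], 0
    for b in bits:
        out.append(b)
        run = run + 1 if b == 1 else 0
        if run == max_ones:
            out.append(0)  # the stuffed bit forces a transition after NRZI encoding
            run = 0
    return out


def nrzi_encode(bits, initial_level=1):
    """NRZI encoding: a 0 toggles the line level, a 1 keeps it unchanged."""
    level, out = initial_level, []
    for b in bits:
        if b == 0:
            level ^= 1
        out.append(level)
    return out


payload = [1] * 8
print(nrzi_encode(bit_stuff(payload)))
```

Because every stuffed "0" toggles the line, the encoded stream cannot stay at one level indefinitely, which bounds the CID length that the CDR circuit has to ride through.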



Fig. 2.7 Waveform of USB 2.0 packet [17]

The encoded NRZI waveform of the USB 2.0 packet is shown in Fig. 2.7. The first 32 "0" bits are the synchronization bits. After NRZI encoding, the 32 "0" bits are encoded into 31 continuous data transitions as the SYNC pattern. The CDR circuit should finish the frequency and phase acquisition within 31 cycles.

In our pattern generator, the maximum CID length is 6, the same as in the USB 2.0 specification. The SYNC pattern length is 32/70 cycles.

### 2.4 Summary

| CDR<br>Architecture | Advantages | Disadvantages |
|---------------------|------------|---------------|
| PLL | Input Frequency Tracking<br>Input Jitter Rejection | Long Lock-in Time<br>Large Loop Filter Area (analog)<br>External Reference Clock needed<br>Multiphase Clocks needed |
| Phase<br>Interpolator | Multichannel Share Input Clock | External Reference Clock needed<br>Multiphase Clocks needed |
| Oversampling | Fast Lock-In<br>Fast Acquisition | External Reference Clock needed<br>Multiphase Clocks needed<br>Large FIFO Size<br>Hardware Complexity |

Table 2.1 CDR architecture comparison

Table 2.1 summarizes the prior CDR architectures. They all have some disadvantages, such as charge-pump leakage, a large loop filter area, a long lock-in time, the requirement for an external reference clock and a multi-phase clock generator, and high hardware design complexity.

In this thesis, we propose an all-digital clock and data recovery (ADCDR) circuit with a fast lock-in time and a referenceless architecture that requires neither a multi-phase clock generator nor an oversampling architecture. In addition, we propose the adaptive gain control scheme and the time-to-digital converter (TDC)-based fast phase compensation. The proposed adaptive gain control scheme adjusts the phase tracking gain by counting the consecutive identical digits (CIDs). In addition, the proposed ADCDR can compensate for a large phase error by using the proposed TDC. As a result, the frequency variations during data transmission can be tracked and compensated for even with a large spreading ratio in the transmitter.

### **Chapter 3**

### All-Digital CDR Circuit for Spread Spectrum SerDes Applications

### **3.1 The Proposed ADCDR Overview**



Fig. 3.1 The block diagram of the proposed ADCDR circuit architecture

In a conventional SSCG, the spreading ratio is chosen to be smaller than 5000ppm and the modulation frequency is chosen as 30~33 kHz. However, if the transmitter can transmit the data stream with a larger spreading ratio (>10%), the EMI reduction is greater, as discussed in Section 1.2. In the proposed ADCDR system, we adopt an SSCG with a 10% spreading ratio and a 30~33 kHz modulation frequency. The input data of the ADCDR circuit is modulated by the SSCG.

Fig. 3.1 shows the block diagram of the proposed ADCDR circuit architecture. The proposed ADCDR circuit is composed of a dual mode phase and frequency detector (PFD) [17], a delay line based time-to-digital converter (TDC) circuit for quantizing the phase error, a TDC-embedded monotonic low-power digitally controlled oscillator (DCO) [17][24] that can perform fast lock-in, a digital loop filter (DLF) [25], an ADCDR controller, and a state machine to perform the fast lock-in procedure.

The input data is delayed and XORed with the original input data to generate the data transition signal (Data\_T). The Data\_T signal triggers the state machine, which performs the fast lock-in procedure.
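The generation of Data\_T can be sketched on a sampled waveform. This is a behavioral illustration only: the function name `transition_pulses` is hypothetical, the line is modeled as a list of 0/1 samples, and the delayed copy is assumed to hold the first sample value before the waveform starts.

```python
def transition_pulses(samples, delay=1):
    """XOR a sampled waveform with a copy of itself delayed by `delay` samples.

    Each data edge produces a pulse whose width equals the delay.
    """
    if delay < 1:
        raise ValueError("delay must be at least one sample")
    if not samples:
        return []
    delayed = [samples[0]] * delay + samples[:-delay]
    return [a ^ b for a, b in zip(samples, delayed)]


data = [0, 0, 1, 1, 1, 0, 0]
print(transition_pulses(data, delay=1))
```

Both rising and falling edges of the input data produce a pulse, so the resulting signal marks every data transition regardless of its direction.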

The whole lock-in procedure has two steps: the TDC-embedded DCO lock-in step and the modified binary search algorithm [26]. In the TDC-embedded DCO lock-in step, the time duration between the first two data transitions (Data\_T) is measured by the TDC-embedded DCO and quantized in terms of the coarse tuning delay time. Then, the DCO sends the coarse\_tdc\_code to the ADCDR controller, where it is encoded as the initial dco\_code. After the TDC lock-in step, the frequency error between the input data rate and dco\_clk can be very small (within 1~2 coarse tuning stage delays).

After the TDC-embedded DCO lock-in step, the ADCDR controller applies the modified binary search algorithm to track the remaining phase and frequency error within the synchronization pattern. Finally, the frequency and phase are locked. The state machine then switches the PFD into phase tracking mode by setting the Track\_Mode signal to "0". In this mode, the dual mode PFD works as a PD and is suitable for random data detection. When dco\_clk is faster than Data\_T, the PFD generates a down pulse to the controller to slow down the DCO. Otherwise, when dco\_clk is slower than Data\_T, the PFD generates an up pulse to the controller to speed up the DCO.

In the phase tracking mode, the adaptive gain control with CID (AGCID) and the TDC-based fast phase compensation are applied to maintain phase tracking. The AGCID scheme automatically adjusts the phase tracking gain. The TDC-based fast phase compensation compensates for a large phase error and maintains the frequency and phase stability.

### **3.2 Dual mode Phase and Frequency Detector**



Fig. 3.2 shows the proposed dual mode phase and frequency detector (PFD). It supports two modes: PFD mode within the SYNC pattern and PD mode for random data detection.

When the Track\_Mode is switched to "1", the proposed dual mode PFD is turned into PFD mode. In PFD mode, the PFD works like a conventional sample based bang-bang PFD to detect the frequency and phase error within the SYNC pattern. If the dco\_clk is faster/slower than the Data\_T, the PFD will generate a down/up pulse to the controller to adjust DCO speed. The timing diagram of proposed dual mode PFD is shown in Fig. 3.3.

When the Track\_Mode is switched to "0", the proposed PFD is turned into PD mode for tracking random data pattern.



Fig. 3.3 Waveform of proposed dual mode PFD [17]

From Fig. 3.3, in the PD mode, the QD signal is reset by the negative edge of DCO\_Clk. This indicates that there is no data transition within the half period of DCO\_Clk, which means that DCO\_Clk is within the CID region. The irregular frequency error signal, OUTD, is masked by the Mask signal. The PD discards this comparison because it is not valid and continues with detecting the next phase error of the random data stream.

In PFD mode, if the DCO\_Clk is within the CID region, the frequency error will be detected by PFD. Thus, the proposed dual mode PFD can solve the problem of the conventional sample based bang-bang PFD in the CID region.

### **3.3 Time-to-Digital Converter Embedded DCO**

### 3.3.1 Structure



Fig. 3.4 shows the TDC-embedded DCO architecture. The DCO is composed of 64 coarse tuning stages, 32 fine tuning stages, and the TDC part [17] for fast lock-in. A coarse tuning stage is composed of three logic gates: two AND gates and one NAND gate. The fine tuning circuit is composed of two parallel connected tri-state buffer arrays operating as an interpolator [24]. The interpolator circuit keeps the response monotonic when switching between two coarse tuning stages. The TDC part is composed of 64 D flip-flops (DFFs) placed on every output node of the two AND gates. The Encoder encodes the 11-bit binary dco\_code[10:0] into the coarse and fine tuning control thermometer codes: dco\_code[10:5] is encoded into coarse[62:0], and dco\_code[4:0] is encoded into fine[30:0]. The Decoder decodes the 64-bit thermometer code generated by the TDC into the 6-bit binary coarse\_dco\_code, which the ADCDR controller uses for dco\_code initialization.

### 3.3.2 Fast Lock-in Procedure



(a) Input data clock period is smaller than half of the total delay of all coarse stages

(b) Input data clock period is larger than half of the total delay of all coarse stages, but smaller than the total delay of all coarse stages

(c) Input data clock period is larger than the total delay of all coarse stages, but smaller than one and a half times the total delay of all coarse stages

Fig. 3.5 TDC measurement procedure



In Fig. 3.4, when the TDC\_Lock is set to "0", the phase difference between the first two positive edges of data transition pulse (Data\_T) represents the data rate. The second positive edge of Data\_T will trigger the DFFs deployed on the DCO delay line to quantize the input data rate information in terms of the coarse tuning delay and complete the fast frequency acquisition.

Fig. 3.5 shows the TDC measurement procedure. In Fig. 3.5 (a), the input data clock period is smaller than half of the total delay of all coarse stages. The "1" is propagating in the upper AND gate chain. The Decoder must find the rightmost "1" and decode it into the corresponding coarse tuning control code. In Fig. 3.5 (a), the coarse\_tdc\_code is decoded as 30/2=15, which means that the initial coarse code of dco\_code is set to 15.

In Fig. 3.5 (b), the input data clock period is larger than half of the total delay of all coarse stages but smaller than the total delay of all coarse stages. The "0" is in the lower AND gate chain. The Decoder must find the leftmost "0" and decode it into the corresponding coarse tuning control code. In Fig. 3.5 (b), the coarse\_tdc\_code is decoded as 62/2=31, which means that the initial coarse code of dco\_code is set to 31.

In Fig. 3.5 (c), the input data clock period is larger than the total delay of all coarse stages but smaller than one and a half times that delay. In this case, the cyclic counter is adopted to count the negative edges of CLK\_OUT. A cyclic\_count of "1" indicates that the input data clock period is larger than the total delay of all coarse stages. The "0" is in the upper AND gate chain. The Decoder must find the rightmost "0" and decode it into the corresponding coarse tuning control code. In Fig. 3.5 (c), the coarse\_tdc\_code is decoded as 64/2=32, which means that the initial coarse code of dco\_code is set to 32.

### 3.3.3 Coarse Tuning Stage



Fig. 3.6 coarse tuning stage architecture of DCO

Fig. 3.6 shows the coarse stage architecture of the DCO [27]. The NAND chain based architecture has a smaller coarse tuning resolution than the conventional MUX type DCO [24]. In the conventional MUX type DCO, the delay of one coarse tuning stage is one NAND gate delay plus one MUX delay. In the proposed coarse tuning architecture, the delay of one coarse tuning stage is two NAND/AND gate delays. As discussed in Section 3.3.2, the coarse tuning stages are used as a TDC to quantize the input data clock period, so the resolution of a coarse tuning stage should be kept as small as possible. In this way, fast frequency acquisition can be achieved in a short time.

As shown in Fig. 3.6, if the dco\_code is 96, it is encoded into the coarse control code as 96/32=3 (coarse[0]~coarse[2] = 1, coarse[3]~coarse[62] = 0), where one coarse tuning delay is equal to 32 fine tuning control codes. The delay path is indicated by the arrows in Fig. 3.6.
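The mapping from dco\_code to the two thermometer codes can be sketched as follows. The function name `encode_dco_code` is hypothetical, and the sketch assumes that a field value k sets the k lowest-indexed bits, which agrees with the dco\_code = 96 example above.

```python
def encode_dco_code(dco_code):
    """Split an 11-bit dco_code into thermometer-coded coarse and fine words.

    Returns (coarse, fine) as lists where index i holds coarse[i] or fine[i].
    Assumption: a field value k sets the k lowest-indexed bits.
    """
    if not 0 <= dco_code < 2 ** 11:
        raise ValueError("dco_code must fit in 11 bits")
    coarse_value = dco_code >> 5   # dco_code[10:5], range 0..63
    fine_value = dco_code & 0x1F   # dco_code[4:0], range 0..31
    coarse = [1 if i < coarse_value else 0 for i in range(63)]  # coarse[62:0]
    fine = [1 if i < fine_value else 0 for i in range(31)]      # fine[30:0]
    return coarse, fine


coarse, fine = encode_dco_code(96)
print(sum(coarse), sum(fine))  # number of enabled coarse and fine controls
```

A thermometer code changes exactly one control bit per code step within a field, which is one reason it is a common choice for delay line control.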

### 3.3.4 Fine Tuning Stages



Fig. 3.7 fine tuning stage architecture of DCO

Fig. 3.7 shows the fine tuning stage architecture of the DCO [24]. The fine tuning stage is composed of two parallel connected tri-state buffer arrays operating as an interpolator. In the conventional cascaded DCO architecture, the non-monotonic phenomenon is a widely researched problem [24][28][29], and it makes that architecture unsuitable for spread spectrum applications. The interpolator based fine tuning architecture is a direct approach to solving the non-monotonic problem. However, the interpolator circuit may consume more power than the cascaded DCO architecture.



Fig. 3.8 fine-tune code cross over 3 coarse stage

Fig. 3.8 shows the simulation result of the monotonic DCO response during coarse tuning control code switching. In the 90nm CMOS process, the resolution of one coarse stage is about 248ps, and the average fine tuning resolution is about 7.75ps (248ps divided into 32 fine steps).

When more tri-state buffers are turned on in the left-hand array, the output clock (CLK\_OUT) moves closer to CA\_OUT. Conversely, CLK\_OUT moves closer to CB\_OUT when more tri-state buffers are turned on in the right-hand array.
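A first-order model of this interpolation is sketched below. It assumes that CB\_OUT lags CA\_OUT by exactly one coarse stage (248ps), that the interpolation is ideal and linear in the fine code, and that fine code 0 places CLK\_OUT on CA\_OUT; these are simplifications made for illustration, and the function name is hypothetical.

```python
COARSE_STEP_PS = 248.0       # simulated coarse stage resolution from the text
FINE_CODES_PER_COARSE = 32   # fine codes per coarse stage


def dco_delay_ps(dco_code):
    """Relative delay of CLK_OUT for a DCO code (ideal linear interpolation)."""
    coarse = dco_code // FINE_CODES_PER_COARSE
    fine = dco_code % FINE_CODES_PER_COARSE
    t_ca = coarse * COARSE_STEP_PS      # edge from the earlier coarse tap
    weight_b = fine / FINE_CODES_PER_COARSE
    # CLK_OUT sits between CA_OUT and CB_OUT, weighted by the fine code.
    return t_ca + weight_b * COARSE_STEP_PS


# Sweep the fine code across three coarse stages, as in Fig. 3.8.
delays = [dco_delay_ps(code) for code in range(3 * FINE_CODES_PER_COARSE)]
steps = [b - a for a, b in zip(delays, delays[1:])]
print(min(steps), max(steps))
```

Under these assumptions every code step adds 248/32 = 7.75ps, including the steps that cross a coarse boundary, which is the monotonic behaviour the interpolator is meant to guarantee.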



## **3.4 Adaptive Gain Control Scheme and TDC-based Fast Phase Compensation**

In a conventional bang-bang CDR architecture, the phase tracking ability is poor because only the bang-bang phase information is used to control the DCO. Moreover, there is no data transition in the CID region, so the frequency error accumulates in this region and may cause a large phase error at the end of the CID region. The proposed AGCID scheme and the TDC-based fast phase compensation approach enhance the phase and frequency tracking ability. Hence, the frequency variations in the received data can be tracked and compensated for even with a large spreading ratio in the transmitter.



Fig. 3.9 Phase error accumulation in the CID region

In the CID region, the PD is not operating and the CDR controller does not perform phase tracking. The frequency error between Input\_Data and DCO\_CLK accumulates as phase error and results in a large phase drift. Fig. 3.9 shows an example of phase error accumulation in the CID region. The initial phase error is $\Delta P1$, and the frequency error between Data\_T and DCO\_CLK causes a phase drift X at every cycle. After 3 cycles in the CID region, the phase error increases to $\Delta P1+3X$.

We propose a novel tracking scheme, called the adaptive gain control scheme (AGCID), to solve this problem. After frequency and phase acquisition is completed, the CID length (cycle\_count) is counted, as shown in Fig. 3.10. The cycle\_count is cleared whenever a data transition occurs. If there is no phase polarity change at the end of the CID region, there were consecutive up or down pulses in the CID region. In this case, the proposed AGCID scheme adds the CID length (cycle\_count) to, or subtracts it from, the DCO control code. As a result, the accumulated phase error is quickly compensated at the end of each CID region. Conversely, if there is a phase polarity change at the end of the CID region, the phase polarity changed during the CID region. However, we cannot identify where the polarity change occurred in the CID region, so we assume that it changed in the middle of the CID region. Therefore, the proposed AGCID scheme restores the baseline DCO code (avg\_dco\_code) calculated by the DLF, and then adds or subtracts the CID length (cycle\_count) divided by 2.
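The cycle\_count bookkeeping described above can be illustrated with a short software model. The sketch below is not the hardware counter: the function name is hypothetical, and whether the hardware counts the transition cycle itself as part of the run is an assumption.

```python
def cid_lengths(bits):
    """Return the cycle_count value seen at each data transition.

    cycle_count counts consecutive identical digits and is cleared
    whenever a data transition occurs, as described in the text.
    """
    lengths = []
    cycle_count = 0
    previous = None
    for bit in bits:
        if previous is not None and bit != previous:
            lengths.append(cycle_count)  # CID length at the end of the run
            cycle_count = 0              # cleared on the data transition
        cycle_count += 1
        previous = bit
    return lengths


print(cid_lengths([1, 1, 1, 0, 0, 1]))  # → [3, 2]
```

Each value in the returned list is the gain that the AGCID scheme would apply at the end of the corresponding CID region, or half of it when the phase polarity has changed.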



Fig. 3.10 Polarity changes in the CID region

#### **3.4.2 TDC-based Fast Phase Compensation**



Fig. 3.11 The proposed AGCID and TDC compensation flow

Fig. 3.11 shows the flow chart of the proposed AGCID scheme and the TDC-based fast phase compensation. If the ADCDR controller detects consecutive up or down pulses and the accumulated phase error is smaller than 1/3 of the clock period, the CID length (cycle\_count) is added to or subtracted from the DCO control code according to the PFD's output. If instead the accumulated phase error is too large (> 1/3 of the clock period), the phase error is compensated for by a larger phase tracking gain (tdc\_code) to quickly reduce the phase error. Conversely, if the ADCDR controller detects that the phase polarity has changed at the end of the CID region, the DCO control code (dco\_code) is restored to the baseline DCO control code (avg\_dco\_code) with an offset (cycle\_count/2). When the phase polarity changes during the CID region, the accumulated phase error is small, and thus the TDC code (tdc\_code) is not used in this situation.
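The decision flow of Fig. 3.11 can be summarized as a small function. This is a behavioural sketch under stated assumptions: the function and argument names are hypothetical, `direction` (+1 or -1) stands in for the PFD up/down decision, integer division models the hardware divide, and the tdc\_code/16 gain is the one this section states for Fig. 3.11 in its description of the TDC circuit.

```python
def next_dco_code(dco_code, avg_dco_code, cycle_count, tdc_code,
                  polarity_changed, large_phase_error, direction):
    """One update of the DCO control code at the end of a CID region.

    direction is +1 to add and -1 to subtract (assumed sign convention).
    """
    if polarity_changed:
        # Restore the DLF baseline and apply half of the CID length.
        return avg_dco_code + direction * (cycle_count // 2)
    if large_phase_error:
        # Phase error > 1/3 clock period: use the TDC-based gain.
        return dco_code + direction * (tdc_code // 16)
    # Consecutive up/down pulses with a small phase error: AGCID gain.
    return dco_code + direction * cycle_count


print(next_dco_code(1000, 990, 6, 64, False, False, +1))  # → 1006
print(next_dco_code(1000, 990, 6, 64, False, True, +1))   # → 1004
print(next_dco_code(1000, 990, 6, 64, True, False, -1))   # → 987
```

The three calls exercise the three branches of the flow chart; the numeric codes are arbitrary illustration values.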




Fig. 3.12 The proposed TDC circuit for the PFD

Fig. 3.12 shows the proposed TDC circuit for the PFD. A digital pulse amplifier is applied in the TDC to extend the pulse width of the input signal, so that the signal (Pulse\_amp) can be used to reset the TDC delay line after each TDC operation. The PFD outputs a low pulse to the TDC, and the input signal passes through the TDC delay line, which is composed of TDC delay units (TDUs). The output of each TDU is sampled at the positive edge of the input signal. Thus, the input pulse width (i.e. the phase error) is quantized by the delay time of the TDU. The TDU is composed of two AND gates to keep the mapping gain between the TDU and the DCO coarse tuning stage consistent. The proposed interpolation-type fine tuning provides a fixed ratio between the resolution of the DCO ($\Delta$DCO) and the TDC delay unit. Therefore, the resolution of the DCO coarse-tuning stage is equal to one TDC delay unit. Accordingly, the resolution of the DCO coarse-tuning stage becomes 2\*$\Delta$TDC. Subsequently, the resolution of the fine-tuning stage, which is the DCO resolution ($\Delta$DCO), is equal to (1/32) \* (2 \* $\Delta$TDC). The resolution of the TDU is equal to 16\*$\Delta$DCO. In Fig. 3.11, when the phase error is larger than 1/3 of the clock period, dco\_code = dco\_code ± tdc\_code/16. This means that 1/16 of the TDC-measured phase error is compensated.
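The resolution relations stated above can be checked numerically. The sketch below uses only the formulas given in the text (coarse stage = 2\*$\Delta$TDC, $\Delta$DCO = coarse/32); feeding in the 248ps coarse resolution reported in Section 3.3.4 is an assumption about how the simulated numbers connect, and the resulting 124ps value for $\Delta$TDC is derived here, not reported in the thesis.

```python
FINE_CODES_PER_COARSE = 32  # fine codes per coarse stage, from the text


def resolutions_from_tdc(delta_tdc_ps):
    """Return (coarse stage resolution, DCO fine resolution) in ps."""
    coarse_ps = 2 * delta_tdc_ps                      # coarse = 2 * dTDC
    delta_dco_ps = coarse_ps / FINE_CODES_PER_COARSE  # dDCO = coarse / 32
    return coarse_ps, delta_dco_ps


# Assumed input: dTDC chosen so that the coarse stage is 248 ps.
delta_tdc_ps = 248.0 / 2
coarse_ps, delta_dco_ps = resolutions_from_tdc(delta_tdc_ps)
print(coarse_ps, delta_dco_ps, delta_tdc_ps / delta_dco_ps)
```

With these inputs the fine resolution comes out as 7.75ps and the ratio $\Delta$TDC/$\Delta$DCO as 16, in agreement with the 16\*$\Delta$DCO relation in the text.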

### **Chapter 4**

### **Experimental Results**

### 4.1 Test Chip Implementation



Fig. 4.1 Test chip floor planning and I/O planning

Fig. 4.1 shows the floor planning and I/O planning of the proposed ADCDR circuit. There are 18 I/O PADs and 14 power PADs. The detailed I/O description is given in Table 4.1. The test chip is composed of the proposed ADCDR circuit and a spread spectrum clock generator (SSCG) that triggers the random pattern generator. The SSCG circuit covers the range of 76MHz to 480MHz for high-speed and low-speed measurement. The SSCG is controlled by an off-chip binary code (SSC\_HIGH\_CODE[3:0]) that coarsely adjusts the SSCG output frequency. The SPREAD\_COUNT input adjusts the fine code of the DCO in the SSCG to generate a down-spread triangular modulation. The pattern generator (Pattern\_Gen) generates the 32/70 SYNC pattern and adopts a linear feedback shift register (LFSR) for random bit generation.
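The thesis does not state which LFSR polynomial Pattern\_Gen uses. Purely as an illustration of LFSR-based random bit generation, the sketch below implements the common PRBS7 sequence (polynomial x^7 + x^6 + 1); the polynomial, the seed, and the function name are assumptions and are not taken from the test chip.

```python
def prbs7(nbits, seed=0x7F):
    """Generate nbits of a PRBS7 sequence (x^7 + x^6 + 1), hypothetical choice."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("an all-zero LFSR state never leaves zero")
    bits = []
    for _ in range(nbits):
        # Feedback is the XOR of the two highest bits of the 7-bit state.
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | new_bit) & 0x7F
        bits.append(new_bit)
    return bits


sequence = prbs7(254)
print(sequence[:127] == sequence[127:])  # the sequence repeats every 127 bits
```

A maximal-length 7-bit LFSR repeats every 127 bits, so its longest run of identical bits is bounded; the actual run-length statistics of the chip's pattern generator depend on the polynomial it really uses.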

| Output       | Bit | Function                   |
|--------------|-----|----------------------------|
| RECOVERY_CLK | 1   | CDR circuit recovery clock |
| TARGET_DATA  | 1   | CDR circuit input data     |
| LOCK         | 1   | CDR circuit lock in        |
| SSC_CLK_OUT  | 1   | SSCG clock                 |

| Input         | Bit | Function |
|---------------|-----|----------|
| RESET         | 1   | System reset |
| SSC_RESET     | 1   | SSCG reset |
| SSC_HIGH_CODE | 4   | SSCG baseline code for down spread |
| SPREAD_COUNT  | 2   | Down spread counter |
| SSC_ON        | 1   | SSCG enable |
| SSC_REF_CLK   | 1   | SSCG reference clock |
| TEST_MODE     | 1   | 0: Random mode; 1: Worst case mode |
| SYNC_MODE     | 1   | 0: SYNC pattern = 32; 1: SYNC pattern = 70 |
| OUTPUT_DIV    | 1   | RECOVERY_CLK/SSC_CLK_OUT divide by 8 for the high speed mode measurement |
| OUTPUT_TYPE   | 1   | Gated TARGET_DATA/RECOVERY_CLK for the SSCG clock measurement; 1: Gated SSC_CLK_OUT for the recovery clock and data measurement |

Table 4.1 I/O PADs description



Fig. 4.2 Layout of the test chip

Table 4.2 Block module name

| Block Number | Module Name                            |
|--------------|----------------------------------------|
| (1)          | Spread Spectrum Clock Generator (SSCG) |
| (2)          | TDC_embedded DCO                       |
| (3)          | Dual mode Phase Detector (PFD)         |
| (4)          | Pattern Generator                      |
| (5)          | Time to digital converter (TDC)        |
| (6)          | Digital loop filter (DLF)              |
| (7)          | ADCDR controller                       |
| (8)          | State machine                          |

Fig. 4.2 and Table 4.2 show the layout of the test chip. The test chip is implemented in a TSMC 90nm CMOS process with standard cells and a 1.0V power supply. The chip area is $864 \times 864\ \mu\text{m}^2$ and the active area is $300 \times 300\ \mu\text{m}^2$. The test chip is composed of the proposed ADCDR block and the testing block. The proposed ADCDR block contains the TDC-embedded DCO, the dual-mode PFD, the TDC, the controller, the DLF and the state machine. The testing block contains the SSCG and the pattern generator.



### 4.2 Full Chip Simulation

Fig. 4.3 System simulation of the proposed ADCDR circuit at 480MHz

Fig. 4.3 shows the post layout simulation of the proposed ADCDR circuit at 480MHz. A pattern generator is included in the test chip. It can perform the 10% down-spread modulation with a 30 kHz modulation frequency.
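To make the modulation concrete, the sketch below models an ideal triangular down-spread profile with the parameters used in this simulation (480MHz nominal, 10% down spread, 30 kHz modulation frequency). It is a mathematical idealization, not the SSCG implementation; the function name and the choice to start each modulation period at the nominal frequency are assumptions.

```python
def down_spread_frequency(t, f_nominal=480e6, spread_ratio=0.10, f_mod=30e3):
    """Instantaneous frequency (Hz) of an ideal triangular down-spread clock."""
    phase = (t * f_mod) % 1.0  # position within one modulation period
    # Triangle wave: 0 at the start, 1 at mid-period, back to 0 at the end.
    triangle = 2 * phase if phase < 0.5 else 2 * (1 - phase)
    return f_nominal * (1 - spread_ratio * triangle)


period = 1 / 30e3
for fraction in (0.0, 0.25, 0.5, 0.75):
    f = down_spread_frequency(fraction * period)
    print(f"{fraction:.2f} T_mod: {f / 1e6:.1f} MHz")
```

Under this model the data clock sweeps between 480MHz and 432MHz, so the receiver has to follow a 48MHz frequency excursion twice per modulation period.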

In the first phase, the TDC-embedded DCO quantizes the input data clock period and passes the resulting initial dco\_code value to the controller, which performs fast lock-in.

In the second phase, the ADCDR controller performs the modified binary search algorithm [26] to achieve frequency and phase acquisition in a short time. In Fig. 4.3, PolarityCode is the DCO control code recorded at the previous phase polarity change. When the phase polarity changes again, the DCO control code is updated to the average of dco\_code and PolarityCode. Compared with the binary search algorithm with a digital loop filter [25], the modified binary search algorithm [26] finds a DCO control code close to the target DCO control code more quickly. Then, the ADCDR controller enters the third phase.
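The averaging step of the second phase can be written as a small update function. This sketch covers only the update described above; the function name, the floor rounding of the average, and the choice to record the pre-average dco\_code as the new PolarityCode are assumptions.

```python
def on_polarity_change(dco_code, polarity_code):
    """Update applied when the phase polarity changes again.

    Returns (new dco_code, new PolarityCode).
    """
    # New DCO code: average of the current code and the recorded code.
    new_dco_code = (dco_code + polarity_code) // 2
    # Assumed bookkeeping: remember the code at this polarity change.
    new_polarity_code = dco_code
    return new_dco_code, new_polarity_code


print(on_polarity_change(1024, 960))  # → (992, 1024)
```

Because each polarity change brackets the target code between the two recorded codes, averaging them shrinks the search interval in the same way as a binary search step.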

The input data stream has a 10% down-spread modulation at 480MHz. In the third phase, the proposed adaptive gain control scheme and the TDC-based fast phase compensation approach are applied to track the frequency variations due to the down-spread modulation during data transmission. The Large\_PhaseError signal indicates that the phase error is too large (>1/3 clock period) in these cycles, and the phase error is compensated for by a larger phase tracking gain (tdc\_code) to quickly reduce the phase error.



### 4.3 Bit Error Rate measurement

### **4.3.1 Random Jitter Tolerance**





Fig. 4.4 Random jitter BER performance

Fig. 4.4 shows the BER performance of the proposed ADCDR. When the data is transmitted without the down-spread spectrum modulation, the proposed AGCID scheme and the TDC-based fast phase compensation approach improve the jitter tolerance of the CDR circuit. In Fig. 4.4(b), the data is transmitted with the 10% down-spread modulation, and the output frequency of the SSCG in the transmitter ranges from 432MHz to 480MHz. In this case, the proposed AGCID scheme and the TDC-based fast phase compensation approach enhance the phase and frequency tracking ability. Therefore, compared with the circuit without the AGCID and the TDC, the random jitter tolerance is improved to 130ps while the BER remains less than 10<sup>-12</sup>. As a result, the frequency variations during data transmission can be tracked and compensated for even with a 10% spreading ratio in the transmitter.



### 4.3.2 Sinusoidal Jitter Tolerance



Fig. 4.5 Sinusoidal jitter tolerance performance

In the jitter tolerance test of the receiver, the input data rate is modulated with a sinusoidal jitter. The sinusoidal jitter has three important parameters: $UI_{jitter}$ (peak-peak), $\Delta f$ and $f_j$. $UI_{jitter}$ (peak-peak) is the total jitter accumulated in one sinusoidal modulation period. $\Delta f$ is the maximum frequency variation in one sinusoidal modulation period. $f_j$ is the maximum frequency of the sinusoidal jitter. The relationship between these three parameters is given in Eq. 4.1.

$$UI_{jitter}(\text{peak-peak}) = \frac{\Delta f}{\pi f_{j}} \tag{4.1}$$
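As a worked example of Eq. 4.1, the sketch below converts between the peak-to-peak jitter in UI and the maximum frequency variation. The function names are hypothetical, and applying the equation to the 0.25UI at 9MHz operating point is an illustration added here; the resulting frequency variation is derived from Eq. 4.1, not a measured value from the thesis.

```python
import math


def ui_jitter_pp(delta_f_hz, f_j_hz):
    """Peak-to-peak jitter in UI from Eq. 4.1."""
    return delta_f_hz / (math.pi * f_j_hz)


def delta_f_for(ui_pp, f_j_hz):
    """Maximum frequency variation (Hz) for a given UI jitter, Eq. 4.1 inverted."""
    return ui_pp * math.pi * f_j_hz


delta_f = delta_f_for(0.25, 9e6)
print(f"{delta_f / 1e6:.2f} MHz")                # about 7.07 MHz
print(f"{ui_jitter_pp(delta_f, 9e6):.2f} UI")    # back to 0.25 UI
```

Under Eq. 4.1, tolerating 0.25UI of sinusoidal jitter at a 9MHz jitter frequency corresponds to a peak frequency variation of roughly 7MHz.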

Fig. 4.5 shows the sinusoidal jitter tolerance of the proposed ADCDR circuit at 300MHz. With the proposed AGCID scheme and the TDC-based fast phase compensation, the corner frequency is 9MHz with 0.25UI jitter tolerance. In contrast, without the proposed AGCID scheme and the TDC-based fast phase compensation, the corner frequency is 6MHz with 0.2UI. In addition, with the previous AGCID scheme [17], the corner frequency is 9MHz with 0.15UI.

### 4.4 Chip Summary and Comparison Table

| Parameter              | Value                             |
|------------------------|-----------------------------------|
| Process                | 90nm CMOS                         |
| Operating Range        | 76MHz ~ 480MHz                    |
| Supply Voltage         | 1.0V                              |
| Core Area              | 0.09mm<sup>2</sup>                |
| Power Consumption      | 4.28 mW (480MHz), 1.03 mW (76MHz) |
| Input Jitter Tolerance | 130ps with BER $< 10^{-12}$       |
| SSC Tracking Range     | Down spread 10%                   |
| Lock-in Time           | < 35 cycles                       |
| Reference Clock        | No                                |
Table 4.3 Chip summary

The chip summary is shown in Table 4.3. The chip is implemented in a TSMC 90nm CMOS process with standard cells and a 1.0V supply. The core area is 0.09mm<sup>2</sup>. The operating frequency of the proposed ADCDR ranges from 76MHz to 480MHz. The power consumption is 4.28 mW at 480MHz with the 10% down-spread modulation. The lock-in time of the proposed ADCDR is less than 35 cycles at 480MHz. The input random jitter tolerance is 130ps with the 10% down spread.

The performance of the proposed ADCDR and the comparison with related works are shown in Table 4.4. The test chip is composed of an ADCDR circuit, an SSCG and a pattern generator. Compared with the related works, the proposed ADCDR circuit has a smaller chip area, lower power consumption, a shorter lock-in time and better jitter tolerance, and it needs no external reference clock.

Table 4.4 Comparison table

|              | [14] JSSC'06 | [12] JSSC'08 | [22] TCAS-II'08 | [32] JSSC'11 | [16] APCCAS'08 | [17] VLSI-DAT'11 | Proposed |
|--------------|--------------|--------------|-----------------|--------------|----------------|------------------|----------|
| Process      | 0.18-µm | 0.18-µm | 0.18-µm | 0.13-µm | 0.18-µm | 65nm | 90nm |
| Data Rate    | 155Mb/s~3Gb/s | 200Mb/s~4Gb/s | 480MHz | 1Gb/s~4Gb/s | 1.25Gb/s | 480MHz | 76MHz~480MHz |
| CDR Type     | 4X Multi-phase | 8X Multi-phase | Blind Over-sampling | 4X Multi-phase | 4X Multi-phase | Full Rate | Full Rate |
| Area         | 0.88mm<sup>2</sup> | 0.8mm<sup>2</sup> | 0.185mm<sup>2</sup> | 0.074mm<sup>2</sup> | 0.63mm<sup>2</sup> | 0.0255mm<sup>2</sup> | 0.09mm<sup>2</sup>\* |
| Supply       | 1.8V | 1.4V | 1.8V | 1.2V | 1.8V | 1.0V | 1.0V |
| Power        | 95mW (3Gb/s) | 14mW (2Gb/s) | 8.2mW (480MHz) | 11.4mW (3Gb/s) | 80mW (1.25Gb/s) | 1.73mW (480MHz) | 4.28mW\* (480MHz), 1.03mW\* (76MHz) |
| Lock-in Time | 50µs (3Gb/s) | N/A | Zero lock time | N/A | 800ns (1.25Gb/s) | 40ns (500Mb/s) | 67ns (480MHz) |
| Reference    | No | Yes | Yes | Yes | No | No | No |
| SSC Tracking | No | Yes, 2500ppm | No | No | No | No | Yes, 10% down spread |
| Jitter Tolerance and Corner Frequency | 0.5111 0.301 (u)/MHZ (w2.488GD/s data rate | N/A | N/A | 0.22111 0.2301 (w/SMHZ) data rate | N/A | 0.15UI @9MHz (300MHz data rate) | 0.25UI @9MHz (300MHz data rate) |

\*including SSCG and pattern generator

### Chapter 5 Conclusion and Future Works

In this thesis, a referenceless all-digital clock and data recovery circuit for spread spectrum SerDes applications is proposed.

With the AGCID and the TDC-based fast phase compensation, the random jitter tolerance is increased to 130ps with a 432MHz to 480MHz spread spectrum clock. The sinusoidal jitter tolerance is increased from 0.15UI to 0.25UI with the corner frequency at 9MHz.

The proposed ADCDR circuit needs neither a reference clock, nor a multi-phase clock generator, nor an oversampling architecture. Therefore, the area, power consumption and design complexity are greatly reduced.

With the proposed TDC-embedded DCO, the lock-in time is within 35 cycles at a 480MHz data rate. In addition, the proposed interpolator-based fine tuning architecture easily solves the non-monotonic response problem of the DCO, which makes it very suitable for spread spectrum applications.

The test chip is implemented in TSMC 90nm process with standard cells, and thus it has good portability over different processes. The core area is 0.09mm<sup>2</sup> and the power consumption is 4.28mW at 480MHz with 10% down spread.

In recent years, the body channel communication (BCC) [30][31] is very attractive in ubiquitous healthcare systems and multimedia systems. It adopts the human body as the transmission medium to achieve high speed communication with low power consumption. However, the human body as the channel can generate additional jitter for the receivers. The CDR circuit must enhance the tracking ability to overcome the noise interference from the human body. Thus our proposed ADCDR circuit with strong tracking ability is very suitable in the body channel communication systems.



### Reference

- [1] Serial ATA Working Group, SATA-IO Revision 3.1 Specification, July, 2011.
- [2] Hsiang-Hui Chang, I-Hui Hua, and Shen-Iuan Liu, "A spread spectrum clock generator with triangular modulation," *IEEE Journal of Solid-State Circuits*, vol. 38, no. 4, pp. 673-676, Apr. 2003.
- [3] Davide De Caro, et al., "A 1.27 GHz, all-digital spread spectrum clock generator/synthesizer in 65nm CMOS," *IEEE Journal of Solid-State Circuits*, vol. 45, no. 5, pp. 1048-1060, May 2010.
- [4] Duo Sheng, Ching-Che Chung, and Chen-Yi Lee, "A low power and portable spread spectrum clock generator for SoC applications," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 19, no. 6, pp. 1113-1117, Jun. 2011.
- [5] Behzad Razavi, "Challenges in the Design of High-Speed Clock and Data Recovery Circuits," *IEEE Communications Magazine*, vol. 40, pp. 94-101, Aug. 2002.
- [6] Ming-ta Hsieh, Sobelman, G., "Architectures for Multi-Gigabit Wire-Linked Clock and Data Recovery," *IEEE Circuits and Systems Magazine*, vol. 8, pp. 45-57, 2008.
- [7] C. R. Hogge, Jr., "A Self Correcting Clock Recovery Circuit," *IEEE Transactions on Electron Devices*, vol. 32, pp. 2704-2706, Dec. 1985.
- [8] J. D. H. Alexander, "Clock recovery from random binary signals," *IEEE Electronics Letters*, vol. 11, pp. 541-542, Oct. 1975.
- [9] Shao-Hung Lin, Chang-Lin Hsieh and Shen-Iuan Liu, "A Half-Rate Bang-Bang Phase/Frequency Detector for Continuous-Rate CDR Circuits," in Proceeding of IEEE Conference on Electron Devices and Solid-State Circuits, 2007, pp. 353–356.
- [10] Yoshio Miki, Member, IEEE, Tatsuya Saito, Hiroki Yamashita, Fumio Yuki, Takashige Baba, Akio Koyama, and Masahito Sonehara, "A 50-mW/ch 2.5-Gb/s/ch Data Recovery Circuit for the SFI-5 Interface With Digital Eye-Tracking," *IEEE Journal of Solid-State Circuits*, vol. 39, pp. 613-621, Apr. 2004.
- [11] Jinghua Li, et al., "A full on-chip CMOS clock-and-data recovery IC for OC-192 applications," *IEEE Journal of Solid-State Circuits*, vol. 55, no. 5, pp. 1213-1222, Jun. 2008.
- [12] Pavan Kumar Hanumolu, Gu-Yeon Wei and Un-Ku Moon, "A wide-tracking range clock and data recovery circuit," *IEEE Journal of Solid-State Circuits*, vol. 43, no. 2, pp. 425-439, Feb. 2008.
- [13] Michael H. Perrott, et al., "A 2.5-Gb/s multi-rate 0.25µm CMOS clock and data recovery circuit utilizing a hybrid analog/digital loop filter and all-digital referenceless frequency acquisition," *IEEE Journal of Solid-State Circuits*, vol. 41, no. 12, pp. 2930-2944, Dec. 2006.

- [14] Rong-Jyi Yang, et al., "A 155.52 Mbps-3.125 Gbps continuous-rate clock and data recovery circuit," *IEEE Journal of Solid-State Circuits*, vol. 41, no. 6, pp. 1380-1390, Jun. 2006.
- [15] Rong-Jyi Yang, Kuan-Hua Chao and Shen-Iuan Liu, "A 200-Mbps ~ 2-Gbps continuous-rate clock-and-data-recovery circuit," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 53, no. 4, pp. 842-847, Apr. 2006.
- [16] Chi-Shuang Oulee and Rong-Jyi Yang, "A 1.25Gbps all-digital clock and data recovery circuit with binary frequency acquisition," in Proceedings of IEEE Asia Pacific Conference on Circuits and Systems (APCCAS), Nov. 2008, pp. 680-683.
- [17] Ching-Che Chung and Wei-Cheng Dai, "A referenceless all-digital fast frequency acquisition full-rate CDR circuit for USB 2.0 in 65nm CMOS technology," in Proceedings of International Symposium on VLSI Design, Automation, and Test (VLSI-DAT), Apr. 2011, pp. 217-220.
- [18] Chang-Lin Hsieh, and Shen-Iuan Liu, "A 1~16Gb/s wide-range clock/data recovery circuit with bidirectional frequency detector," *IEEE Trans. Circuits and Systems-II: Express Briefs*, vol. 58, pp. 487-491, Aug. 2011.
- [19] R. Kreienkamp, U. Langmann, C. Zimmermann, T. Aoyama, and H. Siedhoff, "A 10-Gb/s CMOS Clock and Data Recovery Circuit with an Analog Phase Interpolator," *IEEE Journal of Solid-State Circuits*, vol. 40, no. 3, pp. 736–743, Mar. 2005.
- [20] M.Y. He and J. Poulton, "A CMOS Mixed-Signal Clock and Data Recovery Circuit for OIF CEI-6G 1 Backplane Transceiver," *IEEE Journal of Solid-State Circuits*, vol. 41, no. 3, pp. 597–606, Mar. 2006.
- [21] M. Hsieh and G.E. Sobelman, "Clock and Data Recovery with Adaptive Loop Gain for Spread Spectrum SerDes Applications," in Proceedings of IEEE International Symposium on Circuits and Systems (ISCAS), May 2005, pp. 4883–4886.
- [22] Sang-Hune Park; Kwang-Hee Choi; Jung-Bum Shin; Jae-Yoon Sim; Hong-June Park, "A Single-Data-Bit Blind Oversampling Data-Recovery Circuit With an Add-Drop FIFO for USB2.0 High-Speed Interface," *IEEE Transaction on Circuits and System II: Express Briefs*, vol. 55 No. 2, pp. 156-160, Feb. 2008.
- [23] S.I. Ahmed and T.A. Kwasniewski, "Overview Of Oversampling Clock and Data Recovery Circuits" in Proceedings of Canadian Conference on Electrical and Computer Engineering, May 1–4, 2005, pp. 1876–1881.
- [24] Duo Sheng and Jhih-Ci Lan, "A monotonic and low-power digitally controlled oscillator with portability for SoC applications," *in Proceedings of IEEE International Midwest Symposium on Circuits and Systems (MWSCAS)*, Aug. 2011.
- [25] Ching-Che Chung and Chiun-Yao Ko, "A fast phase tracking ADPLL for video pixel clock generation in 65nm CMOS technology," *IEEE Journal of Solid-State Circuits*, vol. 46, no. 10, pp. 2300-2311, Oct. 2011.
- [26] Hsuan-Jung Hsu and Shi-Yu Huang, "A low-jitter ADPLL via a suppressive digital filter and an interpolation-based locking scheme," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 19, no. 1, pp. 165-170, Jan. 2011.

- [27] Rong-Jyi Yang and Shen-Iuan Liu, "A 40~550MHz harmonic-free all-digital delay-locked loop using a variable SAR algorithm," *IEEE Journal of Solid-State Circuits*, vol. 42, pp. 361-373, Feb. 2007.
- [28] Ching-Che Chung, Chiun-Yao Ko, and Sung-En Shen, "A built-in self calibration circuit for monotonic digitally controlled oscillator design in 65nm CMOS technology," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 58, no. 3, pp. 149-153, Mar. 2011.
- [29] P.-Y. Chao, C.-W. Tzeng, S.-C. Fang, C.-C. Weng, and S.-Y. Huang, "Low-Jitter Code-Jumping for All-Digital PLL to Support Almost Continuous Frequency Tracking," *in Proceedings of International Symposium on VLSI Design, Automation and Test* (VLSI-DAT), April 2011.
- [30] Joonsung Bae, Kiseok Song, Hyungwoo Lee, Hyunwoo Cho, Hoi-Jun Yoo, "A 0.24-nJ/b Wireless Body-Area-Network Transceiver With Scalable Double-FSK Modulation," *IEEE Journal of Solid-State Circuits*, vol. 47, pp. 310-322, Jan. 2012.
- [31] Joonsung Bae, Hyunwoo Cho, Kiseok Song, Hyungwoo Lee, Hoi-Jun Yoo, "The Signal Transmission Mechanism on the Surface of Human Body for Body Channel Communication," *IEEE Transactions on Microwave Theory and Techniques*, vol. 60, pp. 582-593, Mar. 2012.
- [32] H. Song, D. S. Kim, D. H. Oh, S. Kim, and D. K. Jeong, "A 1.0-4.0-Gb/s all-digital CDR with 1.0-ps period resolution DCO and adaptive proportional gain control," *IEEE Journal of Solid-State Circuits*, vol. 46, no. 2, pp. 424-434, Feb. 2011.