## National Chung Cheng University

Master's Thesis, Graduate Institute of Computer Science and Information Engineering

An Referenceless All-Digital Fast Frequency Acquisition Full-Rate Continuous Rate CDR Circuit for USB 2.0 in 65nm Technology

> Student: Wei-Cheng Dai Advisor: Dr. Ching-Che Chung

July 2011



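#### Abstract (translated from the Chinese)

Conventional USB 2.0 transceiver architectures need an external crystal oscillator as a reference clock and rely on a multi-phase generator for data recovery, which costs extra area and power. This thesis proposes a clock and data recovery circuit that needs no external reference clock. The circuit is used in the receiver of a USB 2.0 communication system. It uses the synchronization (SYNC) field of each incoming USB 2.0 packet to track the frequency of the transmitter clock and lock to its phase, so that the receiver can correctly reproduce the transmitted data in its registers. Because the SYNC field has a finite length (32 bits in high-speed mode and 7 bits in full-speed and low-speed modes), fast lock-in is required. To meet this requirement, the thesis proposes an oscillator with an embedded time-to-digital converter, which lets the circuit lock to a frequency close to the data rate within the duration of a single synchronization signal. This speeds up tracking so that the whole lock-in completes within the SYNC field. The thesis also proposes a dual-mode phase and frequency detector. It uses a frequency-detection mode during the SYNC field to track frequency, and switches to a phase-detection mode during the random-data portion of the packet to achieve and maintain phase alignment between the receiver oscillator and the input data.

USB 2.0 provides three speed modes: high speed (480 Mb/s), full speed (12 Mb/s) and low speed (1.5 Mb/s). In conventional USB 2.0 architectures each speed mode needs its own clock and data recovery circuit, which raises cost. The proposed circuit covers all USB 2.0 speed modes with a single wide-range hardware design and so reduces hardware cost. To support the low-frequency modes, the thesis proposes a low-frequency clock synthesizer that replaces the conventional approach of adding delay cells. This reduces area and yields a wide-range continuous-rate clock and data recovery circuit that supports every USB 2.0 speed mode.

The proposed all-digital circuit is implemented with a 65 nm standard-cell library. Its operating range is 700 kHz to 500 MHz, which makes it suitable for continuous-rate clock and data recovery. The chip area is 150µm<sup>2</sup>, and the maximum power consumption, at 500 MHz, is 2.63 mW.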



## An Referenceless All-Digital Fast Frequency Acquisition Full-Rate Continuous Rate CDR Circuit for USB2.0 in 65nm Technology

Student: Wei-Cheng Dai Advisor: Dr. Ching-Che Chung

Department of Computer Science and Information Engineering, National Chung Cheng University

#### Abstract

A conventional USB 2.0 transceiver usually needs an external reference clock and a multi-phase scheme for its over-sampling architecture, both of which add cost. To solve this problem, this thesis presents an all-digital, fast-frequency-acquisition, full-rate, continuous-rate clock and data recovery (CDR) circuit for USB 2.0 applications that needs no reference clock. The lock-in time is constrained by the short synchronization pattern of a USB 2.0 packet (32 bits in high-speed mode and 7 bits in full/low-speed mode). We therefore propose a digitally controlled oscillator (DCO) with an embedded wide-range time-to-digital converter (TDC) to achieve fast frequency acquisition, so that lock-in completes within the synchronization pattern. A dual-mode phase and frequency detector (PFD) is also proposed. It performs frequency tracking in the sync pattern region and phase alignment in the random data pattern region.

To support all USB 2.0 speed modes, namely high speed (480 Mb/s), full speed (12 Mb/s) and low speed (1.5 Mb/s), this thesis proposes a low-frequency clock synthesizer. It enables wide-range continuous-rate clock and data recovery at reduced area cost and covers all operation modes of USB 2.0.

The proposed ADCDR is implemented in a standard-performance 65 nm CMOS process. Its operating range is 700 kHz to 500 MHz, which covers all speed modes of USB 2.0. The chip size is $644\mu m^2$ and the core size is $150\mu m^2$. The power consumption is 2.63 mW at 500 MHz.



## Acknowledgements

I would like to express my deepest gratitude to my advisor, Dr. Ching-Che Chung, who always gave me the right direction and enthusiastic guidance, and helped me complete my research.

I would also like to thank my colleagues in the Silicon Sensor and System (S3) Lab of National Chung Cheng University. Without their assistance, I could not have overcome the many difficulties of these two years.

Finally, I would like to thank my family, who always care for me. Their support encouraged me to stay positive throughout my research.

## Contents

- Chapter 1 Introduction
  - 1.1 Conventional CDR circuit survey
    - 1.1.1 Basic concept of CDR circuit
      - 1.1.1.1 Sample CDR Circuit
      - 1.1.1.2 Phase Detector in CDR circuit
    - 1.1.2 Multiphase Scheme for CDR circuit
      - 1.1.2.1 Reduced-rate oversampling CDR circuit
      - 1.1.2.2 Blind oversampling type CDR circuit
    - 1.1.3 Modern implementation of CDR circuit
      - 1.1.3.1 Single rate CDR circuit
      - 1.1.3.2 Multi-rate CDR circuit
      - 1.1.3.3 Continuous rate CDR circuit
    - 1.1.4 Referenceless issue of CDR circuit
  - 1.2 Introduction to USB 2.0
    - 1.2.1 USB 2.0 overview
    - 1.2.2 USB 2.0 protocol layer
  - 1.3 Design challenges of CDR circuit in USB2.0
  - 1.4 Summary
- Chapter 2 Referenceless CDR Circuit for USB2.0 High Speed Mode
  - 2.1 System Architecture Overview
  - 2.2 Dual Mode Phase and Frequency Detector
  - 2.3 Time-to-Digital Converter Embedded DCO
    - 2.3.1 Structure
    - 2.3.2 MUX Type DCO structure
    - 2.3.3 DCV fine-tune delay line
    - 2.3.4 Time-to-Digital Converter
  - 2.4 Adaptive Gain in Consecutive Identical Digit
- Chapter 3 Wide Range Improvement of Supporting USB2.0 Full/Low Speed Mode
  - 3.1 System Architecture with Wide Operating Range Improvement
  - 3.2 Wide Range Cyclic TDC-embedded DCO
  - 3.3 Low Frequency Clock Synthesizer
    - 3.3.1 Cyclic Delay of low frequency clock synthesization
    - 3.3.2 Fast Reset Scheme on DCO
- Chapter 4 Experimental Results
  - 4.1 Test Chip Implementation
  - 4.2 Test Chip Measurement Result
  - 4.3 Final Chip Implementation
  - 4.4 Full Chip Overall Simulation
  - 4.5 Bit Error Rate measurement in RTL Behavior Model Simulation
  - 4.6 Chip Summary and Comparison Table
- Chapter 5 Conclusion and Future Works
- Reference



## **Figure list**

- Fig 1.1 The role of CDR circuit in a communication system
- Fig 1.2 Basic example of PLL type CDR circuit
- Fig 1.3 Waveform of basic PLL type CDR circuit
- Fig 1.4 Architecture of reduced rate CDR circuit
- Fig 1.5 Architecture of blind oversampling circuit
- Fig 1.6 Example of continuous rate CDR circuit
- Fig 1.7 USB 2.0 interconnection between host and device
- Fig 1.8 System diagram of UTMI
- Fig 1.9 Receiver sensitivity requirement of USB 2.0 specification
- Fig 1.10 Packet format of USB 2.0
- Fig 1.11 Example of USB2.0 transaction
- Fig 1.12 Waveform of USB 2.0 packet
- Fig 1.13 Tracking time of charge pump VCO loop
- Fig 2.1 The block diagram of proposed ADCDR Circuit architecture
- Fig 2.2 The waveform of data transition extraction
- Fig 2.3 Proposed dual mode PFD architecture
- Fig 2.4 Waveform of the proposed dual mode PFD
- Fig 2.5 The TDC-embedded DCO architecture
- Fig 2.6 DCO part of proposed TDC-embedded DCO
- Fig 2.7 Delay path selection of proposed MUX Type DCO
- Fig 2.8 The fine-tune delay line of proposed MUX Type DCO
- Fig 2.9 The TDC part of proposed TDC-embedded DCO
- Fig 2.10 Operation of TDC-embedded DCO
- Fig 2.11 Phase error accumulation
- Fig 2.12 Increased gain in the 3 consecutive identical digit
- Fig 3.1 System architecture of wide operation range improvement
- Fig 3.2 Proposed wide range cyclic TDC-embedded DCO architecture
- Fig 3.3 Operation waveform of proposed cyclic TDC-embedded DCO
- Fig 3.4 Another case of TDC measurement
- Fig 3.5 DCO delay line simulation with PVT variations
- Fig 3.6 The proposed low frequency clock synthesizer architecture
- Fig 3.7 Waveform of low frequency clock synthesizer
- Fig 3.8 Long reset time for original DCO architecture
- Fig 3.9 Fast Reset DCO architecture
- Fig 3.10 Operation of fast reset scheme
- Fig 4.1 Test chip floorplanning and I/O planning
- Fig 4.2 Microphotograph of the test chip of proposed ADCDR circuit
- Fig 4.3 The jitter histogram of proposed ADCDR circuit
- Fig 4.4 The jitter histogram of proposed ADCDR circuit
- Fig 4.5 Waveform of recovery clock and data pattern
- Fig 4.6 Second version chip floorplanning and I/O planning
- Fig 4.7 System simulation of proposed ADCDR circuit in 480MHz
- Fig 4.8 System simulation of proposed ADCDR circuit in 100MHz
- Fig 4.9 System simulation of proposed ADCDR circuit in 12MHz
- Fig 4.10 Input jitter tolerance of BER performance with AGCID scheme
- Fig 4.11 Layout of 2<sup>nd</sup> version of proposed ADCDR circuit



## Table list

- Table 1.1 PID Types
- Table 3.1 Coarse/Fine stage range and step in PVT variations
- Table 4.1 I/O PADs description
- Table 4.2 I/O PADs description of 2nd chip
- Table 4.3 Comparison table



## **Chapter 1 Introduction**

## 1.1 Conventional CDR circuit survey

#### 1.1.1 Basic concept of CDR circuit

#### 1.1.1.1 Sample CDR Circuit



Fig 1.1 The role of CDR circuit in a communication system

Clock and data recovery (CDR) circuits [8][9] are widely used in high-speed serial data link communication systems. In Fig. 1.1, the transmitter sends the encoded serial data to the receiver. The CDR circuit in the receiver recovers the clock that is synchronous with the incoming serial data and then uses the recovered clock to retime the received data pattern, producing the recovered data.



Fig 1.2 Basic example of PLL type CDR circuit

Conventionally, a PLL architecture is used to construct the CDR circuit. Fig. 1.2 shows the basic architecture of a PLL-type CDR circuit. The PLL loop is composed of a phase detector (PD), a loop filter (LP), and a voltage-controlled oscillator (VCO), and it synchronizes the VCO clock rate with the input data rate. The VCO clock triggers a D flip-flop (DFF) that samples the NRZ data, so the input data is retimed as the recovered data. Fig. 1.3 shows the concept of the PLL-type CDR circuit: the rising edge of the recovered clock is aligned with the input data transition, and the falling edge then samples the input data.



Fig 1.3 Waveform of basic PLL type CDR circuit

#### 1.1.1.2 Phase Detector in CDR circuit

To let the CDR circuit track a random data pattern, several phase detectors (PDs) for random data tracking have been published. The linear Hogge PD [30] and the bang-bang Alexander PD [31] are well known. Other works, such as the PD for 3X over-sampling eye tracking [7] or the half-rate PD [18] for reduced-rate CDR circuits, have been proposed to realize novel CDR circuits.
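The bang-bang decision of an Alexander-type PD can be illustrated with a minimal Python sketch. It assumes the usual three-sample formulation (the previous data sample, an edge sample taken halfway between two data samples, and the current data sample). The function name and the +1/-1 return convention are illustrative choices and are not taken from [31].

```python
def alexander_pd(prev_bit: int, edge_bit: int, curr_bit: int) -> int:
    """Return +1 if the clock is late, -1 if it is early, 0 if undecided.

    prev_bit and curr_bit are two consecutive data samples; edge_bit is
    sampled halfway between them, nominally at the data transition.
    """
    if prev_bit == curr_bit:
        return 0   # no data transition, so no phase information
    if edge_bit == curr_bit:
        return +1  # edge sample already sees the new bit: clock is late
    return -1      # edge sample still sees the old bit: clock is early
```

Because the output carries only the sign of the phase error, the loop corrects by a fixed step per decision, which is what distinguishes a bang-bang PD from a linear PD.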

#### 1.1.2 Multiphase Scheme for CDR circuit

The traditional PLL-type CDR circuit is limited in speed at high data rates. To track full-rate data, the VCO must run at the data rate, which increases both the power consumption and the design complexity of the VCO. In addition, the traditional PLL-type CDR circuit requires a 50% duty-cycle VCO output clock. If the duty cycle deviates from 50%, the performance of the CDR circuit degrades.

To overcome these problems and improve clock and data recovery, multiphase over-sampling schemes have been published [19][10]. Two architectures with the multiphase scheme are popular. One is the reduced-rate oversampling CDR circuit, whose lower VCO clock rate saves power and makes high data rates easier to reach. The other is the blind oversampling architecture, which is all-digital in nature and locks quickly because no phase alignment is required.

#### 1.1.2.1 Reduced-rate oversampling CDR circuit



Fig 1.4 Architecture of reduced rate CDR circuit

Fig. 1.4 shows an example of the reduced-rate over-sampling CDR circuit. A multi-phase VCO oscillates at a reduced clock rate (usually half rate or quarter rate), with the PLL loop locking the divided VCO clock to the data rate, and generates the multi-phase clocks (Ckout[n:0]). The multi-phase clocks trigger the multi-phase PD, which generates up/down pulses that charge or discharge the charge pump to adjust the frequency of the multi-phase VCO output clock. The circuit then outputs the recovered data and the recovered clock.

The advantage of this architecture is that the VCO clock is reduced to half or a quarter of the data rate, which saves VCO power. However, it suffers from the design complexity of multi-phase clock generation and the over-sampling circuit.

#### 1.1.2.2 Blind oversampling type CDR circuit



Fig 1.5 Architecture of blind oversampling circuit

Fig. 1.5 shows an example of the blind over-sampling circuit. The center frequency of the multi-phase clock generator is synchronized with a reference clock that is related to the input data rate. The multi-phase clocks trigger the blind over-sampling sampler to sample the input data (NRZ\_Data). The sampling rate is usually 2X, 3X or 4X the data rate. The samples captured by the DFFs in the sampler are sent to the decision circuit (majority-voting circuit), which picks the recovered data.

The blind over-sampling architecture is all-digital in nature and has a short lock-in time. However, it pays extra power for the multi-phase clock generator and usually needs an off-chip reference clock to generate multi-phase clocks whose center frequency is close to the data rate.
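The decision step of a blind over-sampling receiver can be sketched in Python as a majority vote over each group of samples. This is a simplified illustration: it assumes the samples are already grouped on bit boundaries, whereas a practical blind over-sampling circuit must also choose the grouping phase. The function name and its parameter are hypothetical.

```python
def majority_vote(samples, oversampling=3):
    """Recover one bit per group of `oversampling` samples by majority voting."""
    bits = []
    for i in range(0, len(samples) - oversampling + 1, oversampling):
        group = samples[i:i + oversampling]
        # A bit is 1 when more than half of its samples are 1.
        bits.append(1 if 2 * sum(group) > oversampling else 0)
    return bits
```

With 3X over-sampling, a single corrupted sample per bit (for example a sample taken too close to a jittered edge) is outvoted by the other two.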

#### 1.1.3 Modern implementation of CDR circuit

#### 1.1.3.1 Single rate CDR circuit

There are three modern implementations of the CDR circuit: single-rate, multi-rate and continuous-rate. The single-rate CDR circuit is designed for one data rate and is used in applications whose operating speed is specified [2][6][7]. Because the operating range is narrow, the circuit cost and design complexity can be reduced. In addition, because the operating speed is specified, an external reference clock is usually used for frequency acquisition [1][3].

#### 1.1.3.2 Multi-rate CDR circuit

The multi-rate CDR circuit has multiple operating speeds and targets applications with multiple operation modes. A multi-rate CDR circuit that uses multiple reference clocks with automatic bit-rate selection is published in [12]. However, multiple reference clocks are costly, so an architecture with a single external clock and a multiple-reference generation scheme is proposed in [13]. Another work, which needs no reference clock, is presented in [14].

#### 1.1.3.3 Continuous rate CDR circuit

A continuous-rate CDR circuit has a wide operating range by nature: the input data rate can be any speed within that range [15][16][17][27].



Fig 1.6 Example of continuous rate CDR circuit

Fig. 1.6 shows an example of the continuous-rate CDR circuit. Such a circuit usually has a frequency band selector that picks the VCO frequency band closest to the input data rate to speed up frequency acquisition. The frequency detector (FD) is used for frequency tracking: it compares the VCO clock with the input data rate and drives the high-gain charge pump to adjust the VCO clock frequency coarsely until frequency acquisition is achieved. The PD determines whether the VCO phase leads or lags the data transition and drives the lower-gain charge pump to fine-tune the VCO and maintain phase alignment.

#### 1.1.4 Referenceless issue of CDR circuit

An external off-chip reference clock is often needed in a PLL, so a PLL-based CDR circuit [29] often needs one as well. However, the off-chip reference clock increases the design complexity of system integration and costs area and power. The referenceless architecture is therefore more attractive in today's system-on-a-chip era.

However, referenceless designs usually suffer from a narrow operating range [21] and a long lock-in time [22]. The ADCDR proposed in this thesis addresses both problems.



## 1.2 Introduction to USB 2.0

#### 1.2.1 USB 2.0 overview

Universal Serial Bus (USB) is the industry standard for communication between computers and electronic devices. The USB 2.0 specification [5] was released in April 2000 and defines three data transfer rates: "Low Speed" (1.5 Mb/s), "Full Speed" (12 Mb/s) and "High Speed" (480 Mb/s).

As shown in Fig 1.7, the USB system specification defines the communication between a USB host and a USB device (function). There is only one USB host in any USB system, and it controls the whole communication flow. USB devices are either hubs, which provide additional attachment points to the USB, or functions, which provide capabilities to the system, such as a joystick or speakers.






Fig 1.7 USB 2.0 interconnection between host and device

Data is transmitted on the USB cable as differential signals (Data+ and Data-). The USB 2.0 specification defines different eye patterns (TP1 to TP4). The receiver sensitivity is defined at TP4, as shown in Fig 1.9.



Fig 1.8 System diagram of UTMI

Fig. 1.8 shows the block diagram of the USB 2.0 Transceiver Macrocell Interface (UTMI) [4][23]. UTMI is the transceiver interface of the USB system shown in Fig. 1.7. Data+ and Data- are the differential signals on the serial bus. The analog front end converts between the differential signals and digital signals.

In the transmitter path, the host parallel data (Parallel TX Data) is serialized by the shift register, bit-stuffed and NRZI-encoded, and then transmitted to the bus. In the receiver path, the UTMI contains two CDR circuits: the HS CDR for high-speed mode (480 Mb/s) and the FS CDR for full-speed mode (12 Mb/s). Both CDR circuits lock to the multi-phase reference clock and recover the data. After NRZI decoding and bit unstuffing, the recovered data is sent to the shift register and output as parallel data (Parallel RX Data).



Fig 1.9 Receiver sensitivity requirement of USB 2.0 specification

Fig. 1.9 shows the eye pattern for receiver sensitivity in USB 2.0 high-speed mode. The receiver sensitivity for the input data should be 70% of a unit interval (312 ps peak-to-peak input data jitter), and the recommended BER of the recovered data is below $10^{-12}$.



#### 1.2.2 USB 2.0 protocol layer

In the USB 2.0 protocol layer there are three types of packets: token packets, data packets and handshake packets. Each transaction is composed of these three packet types. The packet formats are shown in Fig. 1.10.

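**Token Packet**

| SYNC | PID  | ADDR | ENDP | CRC5 |
|------|------|------|------|------|
|      | 8bit | 7bit | 4bit | 5bit |

**Data Packet**

| SYNC | PID  | Data       | CRC16 |
|------|------|------------|-------|
|      | 8bit | 0-8192 bit | 16bit |

Fig 1.10 Packet format of USB 2.0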

Each packet has a SYNC field, which is 32 bits long in high-speed mode and 7 bits long in full/low-speed mode. The PID field defines the meaning of each packet. Table 1.1 lists the PID types and their functions. The Data field carries the transmitted data. The ADDR and ENDP fields address the device and its endpoint. The CRC field is used for error detection within the transaction. An example of a USB 2.0 transaction is shown in Fig. 1.11.

| PID Type  | PID Name | PID<3:0>\* | Description |
|-----------|----------|------------|-------------|
| Token     | OUT      | 0001B | host-to-function transaction. |
| Token     | IN       | 1001B | function-to-host transaction. |
| Token     | SOF      | 0101B | Start-of-Frame marker and frame number. |
| Token     | SETUP    | 1101B | host-to-function transaction for SETUP to a control pipe. |
| Data      | DATA0    | 0011B | Data packet PID even. |
| Data      | DATA1    | 1011B | Data packet PID odd. |
| Data      | DATA2    | 0111B | Data packet PID high-speed, high bandwidth isochronous transaction in a microframe. |
| Data      | MDATA    | 1111B | Data packet PID high-speed for split and high bandwidth isochronous transactions. |
| Handshake | ACK      | 0010B | Receiver accepts error-free data packet. |
| Handshake | NAK      | 1010B | Receiving device cannot accept data or transmitting device cannot send data. |
| Handshake | STALL    | 1110B | Endpoint is halted or a control pipe request is not supported. |
| Handshake | NYET     | 0110B | No response yet from receiver. |
| Special   | PRE      | 1100B | Host-issued preamble. Enables downstream bus traffic to low-speed devices. |
| Special   | ERR      | 1100B | Split Transaction Error Handshake (reuses PRE value). |
| Special   | SPLIT    | 1000B | High-speed Split Transaction Token. |
| Special   | PING     | 0100B | High-speed flow control probe for a bulk/control Endpoint. |
| Special   | Reserved | 0000B | Reserved PID. |

Table 1.1 PID Types
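The 8-bit PID field of Fig. 1.10 and the 4-bit PID<3:0> codes of Table 1.1 can be related with a short Python sketch. It relies on the USB 2.0 specification's rule that the PID field carries the 4-bit PID followed by its bitwise complement as a check field. This rule is not stated in the text above, and the helper names are hypothetical.

```python
# A few PID<3:0> codes from Table 1.1.
PID = {"OUT": 0b0001, "IN": 0b1001, "SETUP": 0b1101,
       "DATA0": 0b0011, "DATA1": 0b1011, "ACK": 0b0010, "NAK": 0b1010}

def pid_field(pid4: int) -> int:
    """Build the 8-bit PID field: PID<3:0> in the low nibble,
    its bitwise complement in the high nibble."""
    return ((~pid4 & 0xF) << 4) | (pid4 & 0xF)

def pid_is_valid(field: int) -> bool:
    """Check that the high nibble is the complement of the low nibble."""
    return (field & 0xF) == (~(field >> 4) & 0xF)
```

For example, `pid_field(PID["ACK"])` gives `0xD2`. A receiver can use `pid_is_valid` to reject a packet whose PID was corrupted before it interprets the remaining fields.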



Fig 1.11 Example of USB2.0 transaction

Fig. 1.11 shows a simple example of a USB 2.0 transaction. The host sends an IN token packet to request data. The function receives the token packet and, being ready to transmit, sends the data packet. When the data packet has been received without error, the host sends an ACK handshake packet to the function, and the whole transaction completes successfully.

## 1.3 Design challenges of CDR circuit in USB2.0

As described above, before a packet is transmitted on the bus it is bit-stuffed: a "0" is inserted after every six consecutive "1"s (consecutive identical digits) to avoid a long period without a data transition. Each bit is then encoded by a non-return-to-zero inverted (NRZI) encoder. The NRZI signal has a transition if the transmitted bit is "0" and no transition if the transmitted bit is "1". Finally, the packet is transmitted on the bus.



Fig 1.12 Waveform of USB 2.0 packet

Fig. 1.12 shows the NRZI-encoded waveform of a USB 2.0 packet. The first 32 bits are the SYNC field, and after NRZI encoding the SYNC pattern contains 31 consecutive data transitions. Frequency and phase acquisition must therefore be completed within these 31 cycles.

However, conventional analog approaches have difficulty satisfying this fast lock-in requirement, because a charge-pump VCO loop suffers from a long unilateral frequency acquisition time. Fig. 1.13 shows the locking procedure of conventional analog frequency tracking. At the beginning, the VCO is discharged to its minimum frequency. In the coarse-tuning phase, the VCO is charged by a high-gain charge pump to speed up frequency acquisition. After frequency acquisition is completed, in the fine-tuning phase the VCO is charged by a low-gain charge pump for finer tracking until phase acquisition is achieved. The lock-in time from the minimum frequency to the target frequency is usually several microseconds.
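The bit stuffing and NRZI rules described in this section can be sketched in Python as follows. The function names, the initial line level and the SYNC bit pattern used in the usage note (31 zeros followed by a single one) are assumptions made for illustration. The text only states that the SYNC field is 32 bits long and produces 31 transitions.

```python
def bit_stuff(bits):
    """Insert a 0 after every run of six consecutive 1s."""
    out, run = [], 0
    for b in bits:
        out.append(b)
        run = run + 1 if b == 1 else 0
        if run == 6:
            out.append(0)  # the stuffed bit forces a transition after NRZI
            run = 0
    return out

def nrzi_encode(bits, level=1):
    """Toggle the line level for a 0 and hold it for a 1.
    Returns the line level for each bit."""
    out = []
    for b in bits:
        if b == 0:
            level ^= 1
        out.append(level)
    return out

def count_transitions(levels, initial=1):
    """Count level changes, starting from the assumed idle level."""
    count, prev = 0, initial
    for lv in levels:
        if lv != prev:
            count += 1
        prev = lv
    return count
```

Under the assumed pattern, `count_transitions(nrzi_encode([0] * 31 + [1]))` evaluates to 31, which matches the number of SYNC transitions available for lock-in.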



Fig 1.13 Tracking time of charge pump VCO loop
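To put the lock-in requirement in numbers, the following Python sketch computes the duration of the SYNC field in each speed mode from the SYNC lengths and data rates quoted in this chapter. The function and variable names are illustrative.

```python
def sync_window_ns(sync_bits: int, rate_mbps: float) -> float:
    """Duration of the SYNC field in nanoseconds."""
    return sync_bits / rate_mbps * 1e3

# SYNC lengths and data rates as quoted in this chapter.
budget_ns = {
    "high speed": sync_window_ns(32, 480.0),  # about 66.7 ns
    "full speed": sync_window_ns(7, 12.0),    # about 583 ns
    "low speed": sync_window_ns(7, 1.5),      # about 4667 ns
}
```

In high-speed mode the whole SYNC field lasts roughly 67 ns, well below the lock-in time of several microseconds quoted above for a charge-pump VCO loop.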

## 1.4 Summary

Because an external reference clock and the multi-phase clock generation of an over-sampling architecture both add cost, this thesis designs an ADCDR circuit that needs neither an external reference clock nor an over-sampling scheme. The off-chip crystal and the multi-phase generator shown in Fig. 1.8 can thus be eliminated, saving power and area.

To support the three speed modes of the USB 2.0 specification, the ADCDR circuit is designed as a wide-range continuous-rate CDR circuit. Instead of using multiple CDR circuits, one per speed mode, the ADCDR circuit supports all speed modes at reduced hardware cost.

Because the SYNC pattern of USB 2.0 is short, the proposed ADCDR circuit must also lock in quickly.


## Chapter 2 Referenceless CDR Circuit for USB2.0 High Speed Mode

## 2.1 System Architecture Overview



Fig 2.1 The block diagram of proposed ADCDR Circuit architecture

Fig. 2.1 shows the block diagram of the proposed ADCDR circuit. It is composed of a dual-mode phase frequency detector (PFD), a TDC-embedded digitally controlled oscillator (DCO), a phase drift detector, a CDR controller (Controller), and a lock-in procedure control state machine (State Machine).

The data transition signal (Data\_T) is extracted by XORing a delayed copy of the input data with the original input data. Every transition of the input data generates a short pulse, as shown in Fig. 2.2.



Fig 2.2 The waveform of data transition extraction
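The delay-and-XOR extraction of Data\_T can be modelled on a sampled waveform with a few lines of Python. This is only a behavioural sketch: the waveform is represented as a list of 0/1 samples, the delay is expressed in samples, and the delayed copy is assumed to start at the first input level. None of these modelling choices come from the circuit itself.

```python
def extract_transitions(data, delay=1):
    """XOR the input with a copy delayed by `delay` samples.
    Each data edge yields a pulse that is `delay` samples wide."""
    if not 1 <= delay < len(data):
        raise ValueError("delay must be between 1 and len(data) - 1")
    delayed = [data[0]] * delay + list(data[:-delay])
    return [d ^ q for d, q in zip(data, delayed)]
```

The pulse width equals the delay, so in the model the delay sets how long Data\_T stays high after each transition.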

The lock-in procedure control state machine is triggered by Data\_T and controls the whole lock-in procedure.

The lock-in procedure has three phases: the TDC-locking phase, the coarse-tuning phase and the fine-tuning phase. In the TDC-locking phase, the interval between the first two data transitions (Data\_T) is measured by the TDC-embedded DCO, which generates the TDC\_Code. The Controller encodes the TDC\_Code as the initial DCO\_Code. After this phase, the DCO output frequency (DCO\_Clk) is very close to the input data rate.

After the TDC-locking phase, the state machine turns off the TDC function to save power by setting the TDC\_lock signal to "1". The circuit then enters the coarse-tuning phase, still within the SYNC pattern. In this phase the state machine sets Track\_Mode to "1", so the dual-mode PFD works as a common PFD: it compares the frequency of the DCO output clock (DCO\_Clk) with that of the data transition signal (Data\_T), which represents the input data rate. When DCO\_Clk is faster than the input data rate, the UP signal is generated and the controller reduces the DCO\_Code to slow down the DCO output clock. Conversely, when DCO\_Clk is slower than the input data rate, the DN signal is generated and the controller increases the DCO\_Code to speed up the DCO output clock. In the coarse-tuning phase the DCO output clock is adjusted with the coarse-tuning cells to shorten the lock-in time.

After two polarity changes between the UP and DN signals in the coarse-tuning phase, the state machine switches the CDR circuit to the fine-tuning phase. In this phase the dual-mode PFD is switched to PD mode by setting Track\_Mode to "0", so that it generates correct UP/DN signals while tracking random data. In addition, the DCO output clock is adjusted with the fine-tuning cells for finer resolution. The CDR circuit can thus keep the DCO output clock phase-aligned with the data transitions until the end of the packet.
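The three-phase lock-in procedure can be summarised as a small behavioural state machine in Python. This is a sketch under stated assumptions rather than the implemented controller. The class and method names are hypothetical, one PFD decision is assumed per data transition, and Track\_Mode is assumed to be "0" during the TDC-locking phase, which the text does not specify.

```python
from enum import Enum

class Phase(Enum):
    TDC_LOCKING = 0
    COARSE = 1
    FINE = 2

class LockInFSM:
    def __init__(self):
        self.phase = Phase.TDC_LOCKING
        self.transitions = 0       # Data_T pulses seen in the TDC-locking phase
        self.last_dir = None       # previous PFD decision, "UP" or "DN"
        self.polarity_changes = 0  # UP/DN polarity changes in the coarse phase

    @property
    def tdc_lock(self) -> int:
        # "1" switches the TDC off once the first measurement is done.
        return int(self.phase != Phase.TDC_LOCKING)

    @property
    def track_mode(self) -> int:
        # "1" selects PFD mode (coarse tuning), "0" selects PD mode.
        return int(self.phase == Phase.COARSE)

    def on_data_transition(self, direction=None):
        """Advance on one Data_T pulse; direction is "UP", "DN" or None."""
        if self.phase == Phase.TDC_LOCKING:
            self.transitions += 1
            if self.transitions == 2:  # interval between two edges measured
                self.phase = Phase.COARSE
        elif self.phase == Phase.COARSE and direction is not None:
            if self.last_dir is not None and direction != self.last_dir:
                self.polarity_changes += 1
            self.last_dir = direction
            if self.polarity_changes == 2:
                self.phase = Phase.FINE
        return self.phase
```

Feeding the model two transitions followed by the decisions UP, UP, DN, UP moves it from the TDC-locking phase through coarse tuning into fine tuning, because the sequence contains two polarity changes.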



## 2.2 Dual Mode Phase and Frequency Detector



Fig 2.3 Proposed dual mode PFD architecture

Fig. 2.3 shows the proposed dual-mode PFD architecture. It supports two tracking modes to achieve frequency and phase acquisition with the USB 2.0 packet format.

When Track\_Mode is set to "1", the proposed dual-mode PFD is in PFD mode. In this mode it works like a conventional bang-bang PFD to achieve frequency tracking within the SYNC pattern. If the DCO output clock (DCO\_Clk) is faster or slower than the input data rate, the dual-mode PFD outputs the UP or DN signal, respectively, to the CDR controller, which adjusts the frequency of the DCO output clock. The waveform of PFD mode is shown on the left side of Fig 2.4.

When Track\_Mode is set to "0", the proposed dual-mode PFD is switched to PD mode. In this mode it achieves phase acquisition on the random data pattern of the USB 2.0 packet.



Fig 2.4 Waveform of the proposed dual mode PFD

On the right side of Fig. 2.4, QD is reset by the negative edge of DCO\_Clk. This means that if no data transition arrives within half a period of DCO\_Clk, the comparison for that cycle is discarded. Furthermore, the reset of QD generates an irregular down pulse on OUTD, which would produce irregular UP/DN signals. A mask signal is therefore created by the NAND of QU and QD, and the irregular down pulse on OUTD is masked by an AND operation, so the PFD keeps outputting correct UP/DN signals.

## 2.3 Time-to-Digital Converter Embedded DCO

#### 2.3.1 Structure



Fig 2.5 The TDC-embedded DCO architecture

As shown in Fig. 2.5, the TDC-embedded DCO consists of a DCO part and a TDC part. The DCO part is a ring-type DCO built from a coarse-tuning delay line and a fine-tuning delay line. The coarse-tuning delay line is a chain of 2-to-1 multiplexers (MUXs). The fine-tuning delay line is composed of digitally controlled varactors (DCVs) [11][20][24][26] to achieve a finer resolution. The TDC part is built by placing D flip-flops (DFFs) on every node between two adjacent MUXs. The Decoder converts the 128-bit thermometer code into the 7-bit binary TDC\_code and outputs it to the CDR controller. The Encoder converts the 7-bit control code from the CDR controller into the 128-bit DCO code (Sel[127:0]) that adjusts the DCO output clock (DCO\_Clk).

#### 2.3.2 MUX Type DCO structure



Fig 2.6 DCO part of proposed TDC-embedded DCO

The gray region in Fig. 2.6 is the DCO part of the proposed TDC-embedded DCO. In the proposed ADCDR circuit, 127 cascaded MUXs form the coarse-tuning delay line. Coarse\_code[6:0] is the digital DCO code output by the CDR controller. It is encoded to Sel[127:0], which selects the delay path and thereby controls the DCO output clock (DCO\_Clk). As shown in Fig. 2.7, when Coarse\_code is "125", Sel[125] is set to "0" and all other bits are set to "1", so the delay path passes through 125 MUXs. The resolution of the coarse-tuning delay line is about 34ps.
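The select-code encoding described above can be sketched as a small behavioral model. This is only an illustration, not the thesis RTL: the names `encode_coarse` and `coarse_delay_ps` are hypothetical, and the delay estimate assumes that every MUX on the selected path contributes the quoted resolution of about 34ps.

```python
N_SEL = 128      # width of Sel[127:0]
T_MUX_PS = 34    # approximate coarse resolution quoted in the text

def encode_coarse(code):
    """Return Sel[127:0] as a list: Sel[code] = 0, every other bit = 1."""
    if not 0 <= code < N_SEL:
        raise ValueError("Coarse_code must fit in 7 bits")
    return [0 if i == code else 1 for i in range(N_SEL)]

def coarse_delay_ps(code):
    """Approximate coarse delay, assuming the path crosses `code` MUXs."""
    return code * T_MUX_PS

sel = encode_coarse(125)
print("Sel[125] =", sel[125], "| ones in Sel =", sum(sel))
print("estimated coarse delay:", coarse_delay_ps(125), "ps")
```

The model ignores the fine-tuning delay line and any fixed gate delay in the ring, so it only indicates how the coarse code scales the delay.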



Fig 2.7 Delay path selection of proposed MUX Type DCO

#### 2.3.3 DCV fine-tune delay line



Fig 2.8 The fine-tune delay line of proposed MUX Type DCO

Fig. 2.8 shows the architecture of the fine-tuning delay line of the proposed MUX-type DCO. To achieve a finer resolution, digitally controlled varactors (DCVs) [11] are used. In the proposed ADCDR circuit, the thermometer code Fine[6:0] controls the nodal capacitance between every two buffers to adjust the delay of the delay line. This architecture achieves a resolution of about 8ps, and the total delay range is about 56ps, which covers one step of the coarse-tuning delay line.
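The relation between the fine code and the coarse step can be checked with a short sketch. It assumes, for illustration only, that each asserted bit of Fine[6:0] adds one fine step of about 8ps (a linear model); the helper names `thermometer` and `fine_delay_ps` are hypothetical.

```python
FINE_BITS = 7           # width of Fine[6:0]
T_FINE_STEP_PS = 8      # approximate fine resolution quoted in the text
T_COARSE_STEP_PS = 34   # approximate coarse resolution quoted in the text

def thermometer(n):
    """7-bit thermometer code with n ones in the low-order positions."""
    if not 0 <= n <= FINE_BITS:
        raise ValueError("n must be between 0 and 7")
    return [1 if i < n else 0 for i in range(FINE_BITS)]

def fine_delay_ps(fine):
    """Added delay, assuming each enabled DCV contributes one fine step."""
    return sum(fine) * T_FINE_STEP_PS

full_range = fine_delay_ps(thermometer(FINE_BITS))
print("fine range:", full_range, "ps")
print("covers one coarse step:", full_range >= T_COARSE_STEP_PS)
```

Because the fine range exceeds one coarse step, any target delay between two adjacent coarse codes stays reachable with the fine code.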



#### 2.3.4 Time-to-Digital Converter



Fig 2.9 The TDC part of proposed TDC-embedded DCO

The gray region of Fig. 2.9 is the TDC part of the proposed TDC-embedded DCO. When TDC\_Lock is set to "0", the small pulse (Data\_T) generated from each data transition triggers the DFFs placed along the DCO delay line. The DFFs quantize the input data rate into a digital code, which enables fast frequency acquisition.



Fig 2.10 Operation of TDC-embedded DCO

Fig. 2.10 shows the operation of the proposed TDC-embedded DCO. When the CDR is reset, DCO\_rst is "0", and the coarse-tuning code is set to its maximum value to select the longest path of the delay line. When the first positive edge of the data-transition pulse arrives, the state machine sets DCO\_rst to "1" and a "0" begins to propagate through the delay line. When the second positive edge of the data-transition pulse arrives, the DFFs are triggered to sample the values on the delay line. The number of "0"s sampled by the DFFs equals the number of MUXs that the "0" has passed through during one bit time. After the TDC code is encoded, it is used as the initial value of DCO\_code, so that the initial DCO clock frequency is very close to the input data rate. This speeds up frequency acquisition within the short SYNC pattern of USB 2.0. Finally, when the TDC measurement is finished, TDC\_Lock is set to "1" to turn off the TDC and save power.
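The measurement step can be illustrated with a simplified behavioral sketch. It assumes an ideal delay line with a uniform per-MUX delay of 34ps (the coarse resolution quoted earlier), ignores the fine-tuning delay line and flip-flop timing, and uses hypothetical function names `tdc_sample` and `tdc_code`.

```python
N_STAGES = 128     # number of sampled nodes (128-bit thermometer code)
T_MUX_PS = 34.0    # assumed uniform delay per MUX

def tdc_sample(bit_time_ps, t_mux_ps=T_MUX_PS, n_stages=N_STAGES):
    """Sample the delay line one bit time after the "0" is launched.

    Nodes the "0" has reached read 0; the rest still hold the reset value 1.
    """
    reached = min(int(bit_time_ps // t_mux_ps), n_stages)
    return [0] * reached + [1] * (n_stages - reached)

def tdc_code(samples):
    """Count the sampled zeros, i.e. the number of MUXs traversed."""
    return samples.count(0)

bit_time_ps = 1e12 / 480e6   # one bit time at the USB 2.0 high-speed rate
code = tdc_code(tdc_sample(bit_time_ps))
print("bit time: %.1f ps, TDC code: %d" % (bit_time_ps, code))
```

Under these assumptions a single bit time already yields a code close to the final lock point, which is why only one measurement interval is needed before normal tracking takes over.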



## 2.4 Adaptive Gain in Consecutive Identical Digit

In a region of consecutive identical digits (CID), the PD does not operate, so no frequency adjustment takes place. The frequency error between the DCO output clock (DCO\_Clk) and the input data rate (Data\_T) then accumulates as phase error, causing phase drift and data recovery errors. Fig. 2.11 shows an example. Suppose the initial phase error is $\Delta$P1 and the period difference between the data transition (Data\_T) and the DCO output clock (DCO\_Clk) is X. On the right side of Fig. 2.11, frequency tracking is turned off, and the phase error grows by the period difference X in each cycle. After 3 cycles, the phase error has increased to $\Delta$P1 + 3X. In a region of 3 consecutive identical digits, the phase error is the same as when frequency tracking is turned off for 3 cycles, as shown on the left side of Fig. 2.11.
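Assuming, as in the example above, that the period difference X stays constant from cycle to cycle, the accumulated phase error after n uncorrected cycles can be written as follows (the symbol $\Delta P_n$ is introduced here only for illustration):

$$\Delta P_n = \Delta P_1 + nX$$

With n = 3 this gives the value $\Delta$P1 + 3X of the example, and it shows that the drift grows linearly with the length of the CID region.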



Fig 2.11 Phase error accumulation

To solve the data recovery errors caused by phase error accumulation, we propose a tracking scheme in the CDR controller called adaptive gain in consecutive identical digits (AGCID). We count the number of negative edges between two successive data transitions, and the counter value is used to increase the gain of the DCO clock frequency adjustment. As shown in Fig. 2.12, in the first comparison between Data\_T and DCO\_Clk, Data\_T leads DCO\_Clk and DCO\_code is reduced by one step. Then, after 3 consecutive identical digits, Data\_T still leads DCO\_Clk and DCO\_code is reduced with an increased gain of 3\*step.
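The gain scaling can be sketched as a single update function. This is a minimal illustration under stated assumptions: the function name `agcid_update` and its parameters are hypothetical, and the lagging case (increasing DCO\_code) is assumed by symmetry, since the text only describes the leading case.

```python
def agcid_update(dco_code, data_leads, cid_count, step=1):
    """Apply one AGCID adjustment to the DCO code.

    data_leads: True when Data_T leads DCO_Clk in this comparison.
    cid_count:  negative edges counted since the previous data transition.
    """
    gain = max(1, cid_count)      # never less than one step
    delta = gain * step
    # Leading data -> reduce the code (as in the text); lagging is assumed symmetric.
    return dco_code - delta if data_leads else dco_code + delta

code = 60
code = agcid_update(code, data_leads=True, cid_count=1)   # first comparison: one step
code = agcid_update(code, data_leads=True, cid_count=3)   # after 3 CIDs: 3 * step
print("DCO_code after both updates:", code)
```

Scaling the correction by the run length compensates for the cycles in which no comparison was possible.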



## Chapter 3 Wide Range Improvement of Supporting USB2.0 Full/Low Speed Mode

## **3.1 System Architecture with Wide Operating Range Improvement**



Fig 3.1 System architecture of wide operation range improvement

Fig. 3.1 shows the system architecture of the second version of the CDR circuit proposed in this thesis. The previous version supports only a single operating rate, the USB 2.0 high-speed mode. The second version widens the operating range to cover 700kHz to 500MHz, which includes all speed modes (high speed, full speed, and low speed) of the USB 2.0 specification.

Two components are added in the 2<sup>nd</sup> version of the proposed ADCDR circuit: the low frequency clock synthesizer (LFCS), which achieves a wide operating range at low area cost, and the TDC Counter, which is combined with the TDC-embedded DCO to achieve wide-range TDC measurement.



## **3.2 Wide Range Cyclic TDC-embedded DCO**



Fig 3.2 Proposed wide range cyclic TDC-embedded DCO architecture

The proposed wide-range cyclic TDC-embedded DCO architecture is almost the same as the former version, with two differences: the number of MUXs and DFFs is reduced to 63 to save chip area, and an additional counter (TDC Counter) is connected to the DCO to achieve wide-range TDC measurement.

The difficulty in implementing a wide-range TDC is that the number of DFFs is not enough to quantize a long pulse width. To achieve a wide-range TDC, the cyclic concept is combined with the original TDC-embedded DCO. The TDC measurement is completed within the first two data transitions (Data\_T). When the first positive edge arrives, a "0" begins to propagate through the delay line. If the input data rate is slower than the minimum DCO frequency, the "0" passes through the entire delay line and the DCO begins oscillating. The TDC Counter attached to the DCO is triggered by the negative edge. When the second positive edge arrives, the DFFs sample the remaining digits on the delay line, which have not yet reached the end to trigger the TDC Counter, and the TDC measurement is complete. The measurement result consists of two parts: Cyclic\_Count, which is output by the TDC Counter, and TDC\_code, which is encoded from the DFF values (code[63:0]).

Assume that *K* is the value of Cyclic\_Count, *M* is the value of TDC\_code, *Tf* is the delay of the fine-tuning delay line (Fine Tune Delay Line), and *Tmux* is the delay of one MUX. The period between the two data transitions, written as *DataRate* below, is quantized in one of two cases:

If LSB of code[63:0] is "0":

$$DataRate = (K-1)*(128Tmux + 2Tf) + (64Tmux + Tf) + M*Tmux \tag{3.1}$$

If LSB of code[63:0] is "1":

$$DataRate = (K-1)*(128Tmux + 2Tf) + 2*(64Tmux + Tf) + M*Tmux \tag{3.2}$$
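Equations (3.1) and (3.2) can be evaluated with a small helper. This is a sketch, not part of the thesis: the function name `measured_interval` is hypothetical, and the numeric values of *Tmux* and *Tf* used in the example call are illustrative placeholders rather than measured results.

```python
def measured_interval(k, m, lsb, t_mux, t_f):
    """Interval between two data transitions per (3.1) and (3.2).

    k: Cyclic_Count, m: TDC_code, lsb: LSB of code[63:0] (0 or 1).
    """
    if lsb not in (0, 1):
        raise ValueError("lsb must be 0 or 1")
    half = 64 * t_mux + t_f        # one pass through the delay line
    full = 2 * half                # 128*Tmux + 2*Tf
    passes = 1 if lsb == 0 else 2  # (3.1) adds one pass, (3.2) adds two
    return (k - 1) * full + passes * half + m * t_mux

# Illustrative values only: Tmux = 52 ps, Tf = 100 ps.
print(measured_interval(k=2, m=10, lsb=0, t_mux=52, t_f=100), "ps")
print(measured_interval(k=2, m=10, lsb=1, t_mux=52, t_f=100), "ps")
```

The two cases differ by exactly one pass through the delay line, 64*Tmux* + *Tf*.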

Fig. 3.3 shows an example of one case of the TDC measurement. In this case the LSB of code is "0", K is 10, and M is 39, so the proposed TDC quantizes the period between the two data transitions to a delay of

$$9*(128Tmux+2Tf)+2*(64Tmux+Tf)+39Tmux.$$

Fig. 3.4 shows the other case of the TDC measurement. In this case the LSB of code is "1", K is still 10, and M is 52, so the proposed TDC quantizes the period between the two data transitions to a delay of

$$9*(128Tmux+2Tf)+64Tmux+Tf+52Tmux.$$



Fig 3.3 Operation waveform of proposed cyclic TDC-embedded DCO





Fig 3.5 DCO delay line simulation with PVT variations

| Stage | Slow case step | Slow case range | Typical case step | Typical case range | Fast case step | Fast case range |
|-----------------|--------|---------|--------|---------|--------|---------|
| Coarse          | 68 ps  | 4854 ps | 52 ps  | 3665 ps | 57 ps  | 2850 ps |
| Fine-tune stage | 7.6 ps | 124 ps  | 6.8 ps | 103 ps  | 5.9 ps | 89 ps   |

Table 3.1 Coarse/Fine stage range and step in PVT variations

Fig. 3.5 shows the simulated period of the cyclic TDC-embedded DCO delay line versus control code under PVT variations. The range and step of each stage are listed in Table 3.1.

## **3.3 Low Frequency Clock Synthesizer**

#### **3.3.1 Cyclic Delay of Low Frequency Clock Synthesis**



Fig 3.6 The proposed low frequency clock synthesizer architecture

Fig. 3.6 shows the proposed low frequency clock synthesizer (LFCS). A Cyclic Delay Counter triggered by DCO\_Clk acts as the cyclic delay cell and outputs the Cyclic\_delay signal. Cyclic\_delay passes through the Coarse Tuning Delay Line and the Fine Tuning Delay Line to generate the output clock (OUT\_Clk). By combining the cyclic delay with the delay of the coarse and fine delay lines, a low frequency clock can be synthesized. The Path Selector chooses which path is used as the recovery clock (Recovery\_Clk): if the input data rate is faster than the minimum frequency of DCO\_Clk, the LFCS is not needed and DCO\_Clk is output directly as the recovery clock.

The proposed LFCS uses the TDC measurement result to set the initial output clock (Recovery\_Clk) very close to the target data rate, which achieves fast frequency acquisition. As described above, the input data rate is quantized as in (3.1) and (3.2), so the recovery clock period should be:

$$\text{Recovered Clock} = (K-1)*(64Tmux + Tf) + (32Tmux + Tf/2) + M*Tmux/2 \tag{3.3}$$

or

$$\text{Recovered Clock} = (K-1)*(64Tmux + Tf) + 2*(32Tmux + Tf/2) + M*Tmux/2 \tag{3.4}$$
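As a consistency check, halving (3.1) and (3.2) term by term reproduces (3.3) and (3.4), so the target recovery clock period is one half of the interval measured by the TDC:

$$\tfrac{1}{2}\left[(K-1)*(128Tmux + 2Tf) + (64Tmux + Tf) + M*Tmux\right] = (K-1)*(64Tmux + Tf) + (32Tmux + Tf/2) + M*Tmux/2$$

$$\tfrac{1}{2}\left[(K-1)*(128Tmux + 2Tf) + 2*(64Tmux + Tf) + M*Tmux\right] = (K-1)*(64Tmux + Tf) + 2*(32Tmux + Tf/2) + M*Tmux/2$$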

Fig. 3.7 shows the waveform of low frequency clock synthesis. The Cyclic Delay Counter generates a transition of Cyclic\_delay after a delay time X. DCO\_rst is constructed by an XNOR operation involving Cyclic\_delay, and the low pulse of DCO\_rst resets and restarts the DCO within several periods. Let Y be the width of the low pulse of DCO\_rst. Adjusting Coarse\_Tuning\_Code and Fine\_Tuning\_Code changes this width and therefore the OUT\_Clk period, so the period of OUT\_Clk can be synthesized as X+Y.



Fig 3.7 Waveform of low frequency clock synthesis

Assume that the cyclic TDC-embedded DCO outputs Cyclic\_count = K and TDC\_code = M, *Tmux* is the unit delay of a MUX, and *Tf* is the delay of the fine-tuning delay line. During low frequency clock synthesis, the period of DCO\_Clk is set to half of its maximum period. Therefore, when the Cyclic Delay Counter counts to K, a transition of the Cyclic\_delay signal is generated, and the delay time X is:

$$X = (K-1)*(64Tmux + Tf) + 32Tmux + Tf/2 \tag{3.5}$$

Then M/2 or M/2+32 is used as the Coarse\_Tuning\_Code, and the delay time Y is:

$$Y = M * Tmux / 2 + Tf \tag{3.6}$$

or

$$Y = 32Tmux + M * Tmux / 2 + Tf \tag{3.7}$$

The period of OUT\_Clk (X+Y) is then:

If LSB of code[63:0] is "0":

$$X + Y = (K - 1) * (64Tmux + Tf) + 32Tmux + M * Tmux / 2 + 1.5 * Tf \tag{3.8}$$

If LSB of code[63:0] is "1":

$$X + Y = (K - 1) * (64Tmux + Tf) + 64Tmux + M * Tmux / 2 + 1.5 * Tf \tag{3.9}$$

Finally, the initial frequency of OUT\_Clk is close to the input data rate, which achieves fast frequency acquisition. There is a small mismatch between (3.8) and (3.3), and likewise between (3.9) and (3.4). However, the mismatch is small enough to be eliminated easily by the subsequent frequency tracking of the lock-in procedure.
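The size of this mismatch can be made explicit by evaluating both sides numerically. The sketch below is an illustration only: the function names `lfcs_period` and `target_period` are hypothetical, and the values chosen for *Tmux* and *Tf* are placeholders rather than measured data.

```python
def lfcs_period(k, m, lsb, t_mux, t_f):
    """Initial OUT_Clk period X + Y, following (3.5) to (3.9)."""
    x = (k - 1) * (64 * t_mux + t_f) + 32 * t_mux + t_f / 2   # (3.5)
    y = m * t_mux / 2 + t_f                                   # (3.6)
    if lsb == 1:
        y += 32 * t_mux                                       # (3.7)
    return x + y

def target_period(k, m, lsb, t_mux, t_f):
    """Desired recovery clock period, following (3.3) and (3.4)."""
    half = 32 * t_mux + t_f / 2
    passes = 1 if lsb == 0 else 2
    return (k - 1) * (64 * t_mux + t_f) + passes * half + m * t_mux / 2

# Illustrative values only: K = 10, M = 40, Tmux = 52 ps, Tf = 100 ps.
args = dict(k=10, m=40, t_mux=52.0, t_f=100.0)
for lsb in (0, 1):
    diff = lfcs_period(lsb=lsb, **args) - target_period(lsb=lsb, **args)
    print("LSB = %d: mismatch = %.1f ps" % (lsb, diff))
```

Subtracting the expressions symbolically gives a mismatch of *Tf* for the LSB "0" case and *Tf*/2 for the LSB "1" case, independent of K and M, which is consistent with the claim that the residual error is small.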

#### 3.3.2 Fast Reset Scheme on DCO



Fig 3.8 Long reset time for original DCO architecture

The proposed LFCS requires a short reset time for the DCO. A DCO with a fast reset scheme is therefore proposed that does not worsen the resolution of the delay line. This architecture is integrated into the cyclic TDC-embedded DCO.

The conventional operation is shown in Fig. 3.8. When DCO\_rst is set to "0", the reset time needed for the value "1" to reach DCO\_Clk is half of the period corresponding to the current control code. In the conventional architecture, a long time is spent waiting for the value "1" to pass through the delay line.



Fig 3.9 Fast Reset DCO architecture

Therefore, in this thesis we propose a fast reset scheme that does not need to wait for the reset signal to pass through the delay line. Fig. 3.9 shows the proposed fast reset scheme implemented on the DCO part of the TDC-embedded DCO.



Fig 3.10 Operation of fast reset scheme

Fig. 3.10 shows the operation of the fast reset scheme. When the reset signal (DCO\_rst) is set to "0", the output node of the NAND gate is set to "1". In addition, the MUXs are forced to select the path through which the value "1" passes to their outputs, so the entire DCO delay line is reset very quickly. This fast reset scheme is what makes the proposed LFCS feasible.



## **Chapter 4** Experimental Results

## 4.1 Test Chip Implementation



Fig 4.1 Test chip floorplanning and I/O planning

Fig. 4.1 shows the floorplanning and I/O planning of the proposed ADCDR circuit. There are 14 I/O PADs and 18 power PADs; the detailed I/O description is given in Table 4.1. The test chip contains the first version of the proposed ADCDR circuit and an on-chip clock generator (Clock\_Generator) that generates the 480MHz clock. This clock triggers the pattern generator (Pattern\_Generator), which outputs USB 2.0 format data patterns to the proposed ADCDR circuit. The clock generator is implemented with a DCO and has a range of 300MHz to 500MHz to cover the USB 2.0 high-speed operating frequency (480MHz). Its DCO is controlled by the off-chip input CLK\_GEN\_CODE[3:0], which adjusts the output clock coarsely. The pattern generator produces the 32-bit SYNC pattern followed by a data pattern from a linear feedback shift register (LFSR) with a 2<sup>7</sup>-1 pseudo random binary sequence (PRBS), inserting a stuff bit after every 6 consecutive identical digits.
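A software model of such a pattern source can be sketched as follows. Several details are assumptions made for illustration: the thesis only states a 2<sup>7</sup>-1 PRBS, so the common polynomial x^7 + x^6 + 1 is assumed, the stuffing rule is taken literally from the sentence above (insert the opposite bit after 6 identical digits), the SYNC pattern is omitted, and the function names are hypothetical.

```python
from itertools import groupby, islice

def prbs7(seed=0x7F):
    """Yield PRBS7 bits from a 7-bit LFSR (polynomial x^7 + x^6 + 1, assumed)."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero")
    while True:
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | new_bit) & 0x7F
        yield new_bit

def stuff_bits(bits, max_run=6):
    """Insert an opposite bit after every run of max_run identical bits."""
    out = []
    run_val, run_len = None, 0
    for b in bits:
        out.append(b)
        if b == run_val:
            run_len += 1
        else:
            run_val, run_len = b, 1
        if run_len == max_run:
            stuffed = 1 - b
            out.append(stuffed)
            run_val, run_len = stuffed, 1   # the stuffed bit starts a new run
    return out

def longest_run(bits):
    """Length of the longest run of identical bits."""
    return max(len(list(group)) for _, group in groupby(bits))

raw = list(islice(prbs7(), 127))   # one full PRBS7 period
stuffed = stuff_bits(raw)
print("longest run before/after stuffing:", longest_run(raw), longest_run(stuffed))
```

Limiting the run length bounds the time the CDR must coast without a data transition, which is the condition the AGCID scheme of Section 2.4 is designed for.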

| Pad           | Direction | Bits  | Function                                                                                             |
|---------------|-----------|-------|------------------------------------------------------------------------------------------------------|
| CLK_GEN_OUT   | Output    | 1 bit | Clock of USB 2.0 data pattern generator                                                              |
| RECOVERY_CLK  | Output    | 1 bit | CDR circuit recovery clock                                                                           |
| STATE         | Output    | 2 bit | State of circuit condition                                                                           |
| TARGET_DATA   | Output    | 1 bit | CDR circuit's input data generated by USB 2.0 data pattern generator                                 |
| CLK_GEN_CODE  | Input     | 4 bit | Adjust the on-chip clock generator to provide a 480MHz clock to the USB 2.0 data pattern generator   |
| LFSR_MODE     | Input     | 1 bit | 0: normal case; 1: worst case                                                                        |
| TEST_MODE     | Input     | 1 bit | 0: output CLK_GEN_OUT and RECOVERY_CLK; 1: output RECOVERY_CLK and TARGET_DATA                       |
| DIV           | Input     | 1 bit | 0: output normal RECOVERY_CLK; 1: output RECOVERY_CLK/8                                              |
| RESET         | Input     | 1 bit | System reset                                                                                         |
| CLK_GEN_RESET | Input     | 1 bit | On-chip clock generator reset                                                                        |

Table 4.1 I/O PADs description



Fig 4.2 Microphotograph of the test chip of proposed ADCDR circuit

Fig. 4.2 shows the microphotograph of the test chip. The test chip is fabricated in a standard performance 65nm CMOS process. The chip size is $644\mu m^2$ and the core size is $150\mu m^2$. The whole chip is composed of a testing part and the proposed ADCDR part. The testing part contains the Pattern Generator and the Clock Generator. The ADCDR part contains the TDC-embedded DCO, Dual Mode PFD, Controller, State Machine, and Loop Filter [28].


## **4.2 Test Chip Measurement Result**

(a) Jitter histogram of the recovered clock at 526MHz with the output divided by 8

(b) Jitter histogram of the on-chip clock generator clock at 526MHz with the output divided by 8

Fig 4.3 The jitter histogram of proposed ADCDR circuit

Fig. 4.3 (a) shows the jitter measurement of the recovered clock of the proposed ADCDR circuit at 526MHz. Because of the speed limitation of the I/O PADs, the recovered clock is divided by 8 before it is output. The rms jitter is 15.13ps and the peak-to-peak jitter is 101.82ps. Fig. 4.3 (b) shows the jitter measurement of the on-chip clock generator, which is also divided by 8. Its rms jitter is 7.47ps and its peak-to-peak jitter is 63.64ps. The proposed ADCDR circuit tolerates this input jitter and still operates correctly.

Because the divider influences the jitter performance, the recovered clock and the on-chip clock generator are also measured at 325MHz without the clock divider; the results are shown in Fig. 4.4. Fig. 4.4 (a) shows that the recovered clock of the proposed ADCDR circuit has an rms jitter of 12.41ps and a peak-to-peak jitter of 90.91ps. Fig. 4.4 (b) shows that the on-chip clock generator output clock has an rms jitter of 7.57ps and a peak-to-peak jitter of 50.91ps.

The influence of the divider is shown in Fig. 4.4 (c) and (d), where the 325MHz recovered clock and on-chip clock generator output clock are measured after division by 8. In Fig. 4.4 (c), the rms jitter of the recovered clock increases to 25.25ps and the peak-to-peak jitter increases to 167.27ps. In Fig. 4.4 (d), the rms jitter of the on-chip clock generator output clock increases to 10.6ps and the peak-to-peak jitter increases to 80ps.

As the measurement results show, the operating range of the proposed ADCDR circuit is 325MHz to 526MHz, which covers the operating frequency of the USB 2.0 high-speed mode at 480MHz.
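To compare the jitter figures across the two operating frequencies, they can be normalized to the clock period. The sketch below is an illustrative calculation only: the helper name `jitter_in_ui` is hypothetical, and normalizing to the undivided clock period (one unit interval, UI) is a choice made here, not a figure reported in the measurements.

```python
def jitter_in_ui(jitter_ps, clock_hz):
    """Express a jitter value as a fraction of one clock period (UI)."""
    period_ps = 1e12 / clock_hz
    return jitter_ps / period_ps

# (clock frequency in Hz, rms jitter in ps, peak-to-peak jitter in ps)
measurements = {
    "recovered clock, 526MHz, divided by 8": (526e6, 15.13, 101.82),
    "recovered clock, 325MHz, no divider": (325e6, 12.41, 90.91),
}
for label, (freq, rms, pp) in measurements.items():
    print("%s: rms %.4f UI, p-p %.4f UI"
          % (label, jitter_in_ui(rms, freq), jitter_in_ui(pp, freq)))
```

Because the 526MHz values were captured after the divide-by-8 stage, they include the divider's contribution and are not directly comparable with the undivided 325MHz values.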



(a) Jitter histogram of the recovered clock at 325MHz without the output divider

(b) Jitter histogram of the on-chip clock generator output clock at 325MHz without the output divider

(c) Jitter histogram of the recovered clock at 325MHz with the output divided by 8

(d) Jitter histogram of the on-chip clock generator output clock at 325MHz with the output divided by 8





Fig 4.5 Waveform of recovery clock and data pattern

Fig. 4.5 shows the waveform of the recovery clock and the data pattern at 325MHz. The 325MHz recovery clock is output directly from the I/O PADs. In the proposed ADCDR, the negative edge of the recovery clock triggers the DFF that retimes the input data as the recovered data. In Fig. 4.5, the arrows point to the positions in the input data that are sampled by the negative edge of the recovery clock.

## 4.3 Final Chip Implementation



Fig 4.6 Second version chip floorplanning and I/O planning

Fig. 4.6 shows the floorplanning and I/O planning of the final chip, the 2<sup>nd</sup> version of the proposed ADCDR circuit. An additional programmable divider supplies low frequency clocks for wide-range testing. The DCO supplies a 1GHz clock, which the programmable divider divides to generate 500MHz, 100MHz, 10MHz, and 1MHz clocks. The programmable divider supports divide-by-2, 10, 100, and 1000 modes, selected by the off-chip input DIV\_CODE[1:0]. The other I/O PAD descriptions are given in Table 4.2.
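The divider settings can be tabulated with a short sketch. The mapping from DIV\_CODE[1:0] values to divide ratios is an assumption made for illustration (the text lists the ratios but not their code order), and the names `DIV_RATIOS` and `divided_clock_hz` are hypothetical.

```python
DCO_HZ = 1e9   # DCO output clock stated in the text

# Assumed code order: the text gives the ratios, not their encoding.
DIV_RATIOS = {0b00: 2, 0b01: 10, 0b10: 100, 0b11: 1000}

def divided_clock_hz(div_code):
    """Test clock frequency produced for a given DIV_CODE[1:0] value."""
    return DCO_HZ / DIV_RATIOS[div_code]

for div_code in sorted(DIV_RATIOS):
    print("DIV_CODE = {:02b} -> {:.0f} MHz".format(div_code, divided_clock_hz(div_code) / 1e6))
```

The four resulting test clocks span the range from the high-speed mode down to below the low-speed mode of USB 2.0.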

| Output                                    | Bit   | Function                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |  |  |  |
|-------------------------------------------|-------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--|--|--|
| CLK_GEN_OUT                               | 1 bit | Clock of USB 2.0 data pattern generator                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |  |  |  |
| RECOVERY_CLK 1                            |       | CDR Circuit recovery clock                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |  |  |  |
| LOCK                                      | 1 bit | System lock in                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |  |  |  |
| TARGET_DATA                               | 1 bit | CDR Circuit's input data generated by USB 2.0 data pattern generator                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |  |  |  |
| Input                                     | Bit   | Function                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |  |  |  |
| DIV_CODE | 2 bit | Adjusts the on-chip clock generator frequency from 500MHz to 1MHz by adjusting the divider counter. |  |  |  |
| CLK_GEN_CODE | 4 bit | Control code of the on-chip clock generator |  |  |  |
| TEST_MODE | 1 bit | 0: Random Mode; 1: Worst Case/USB Test Mode (SPEED_MODE = 1) |  |  |  |
| SYNC_MODE | 1 bit | 0: SYNC Pattern = 7; 1: SYNC Pattern = 32 |  |  |  |
| OUTPUT_TYPE | 1 bit | 0: Gated TARGET_DATA; 1: Gated CLK_GEN_OUT |  |  |  |
| OUT_DIV                                   | 1 bit | Turn on the divider for RECOVERED_CLK/8 and CLK_GEN_OUT/8 for high speed mode measurement                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |  |  |  |
| RESET                                     | 1 bit | System reset                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |  |  |  |
| CLK_GEN_RESET | 1 bit | On-chip clock generator reset |  |  |  |

### Table 4.2 I/O PADs description of 2nd chip

## **4.4 Full Chip Overall Simulation**



Fig 4.7 System simulation of proposed ADCDR circuit in 480MHz

Fig. 4.7 shows the post-layout simulation waveform of the proposed ADCDR circuit operating at the USB 2.0 high speed mode data rate. Because the data rate (480MHz) is higher than the minimum frequency of the cyclic TDC-embedded DCO, Cyclic\_Count is 0, the LFCS does not operate (so synthesizer\_code[10:0] is 11'b0), and the TDC-embedded DCO clock is output directly as the recovered clock. The TDC\_Value is 56; after encoding, it is sent to the ADCDR controller as the initial dco\_code value of 281. The UP/DN pulses are then sent to the ADCDR controller to adjust dco\_code, which tunes the TDC-embedded DCO clock frequency for frequency tracking during the 32 consecutive data transitions (data\_t) of the USB 2.0 SYNC pattern. Finally, frequency and phase acquisition are completed and the circuit locks in within 40ns.



Fig 4.8 System simulation of proposed ADCDR circuit in 100MHz

Fig. 4.8 shows the post-layout simulation waveform of the proposed ADCDR circuit operating at 100MHz. After the TDC measurement, Cyclic\_count is 2, which is greater than 0, so the LFCS must be used to synthesize the recovered clock. The dco\_code is then reduced to half of its maximum value, and synthesizer\_code controls the LFCS to adjust the recovered clock. Finally, the whole system locks in within 310ns.
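The choice between the two clock paths described for Fig. 4.7 and Fig. 4.8 can be summarized in a short behavioral sketch. It is only an illustration under stated assumptions, not the controller's RTL: the names `initial_setting` and `InitialSetting` and the value of `DCO_CODE_MAX` are hypothetical, because the width of dco\_code is not given in this section.

```python
from dataclasses import dataclass

# Hypothetical maximum dco_code; the real code width is not stated in this section.
DCO_CODE_MAX = 511


@dataclass
class InitialSetting:
    use_lfcs: bool  # True when the LFCS synthesizes the recovered clock
    dco_code: int   # starting dco_code handed to the ADCDR controller


def initial_setting(cyclic_count: int, encoded_tdc_code: int,
                    dco_code_max: int = DCO_CODE_MAX) -> InitialSetting:
    """Pick the clock path and starting dco_code after the TDC measurement."""
    if cyclic_count == 0:
        # Data rate is above the DCO minimum frequency: the DCO clock is
        # output directly and the encoded TDC value seeds dco_code.
        return InitialSetting(use_lfcs=False, dco_code=encoded_tdc_code)
    # Data rate is below the DCO range: the LFCS is used and dco_code
    # is set to half of its maximum.
    return InitialSetting(use_lfcs=True, dco_code=dco_code_max // 2)


# 480MHz case: Cyclic_Count = 0, encoded TDC value 281.
high_speed = initial_setting(cyclic_count=0, encoded_tdc_code=281)
# 100MHz case: Cyclic_count = 2; the encoded value is a placeholder,
# since it is not used on this path.
mid_rate = initial_setting(cyclic_count=2, encoded_tdc_code=0)
print(high_speed, mid_rate)
```

In this sketch only the sign of the cyclic count matters for the path decision; how synthesizer\_code is subsequently tuned is left out.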



Fig 4.9 System simulation of proposed ADCDR circuit in 12MHz

Fig. 4.9 shows the post-layout simulation waveform of the proposed ADCDR circuit operating at the USB 2.0 full speed mode data rate. The Cyclic\_count measured by the TDC is 15, and the lock-in time is 2.4µs.
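To compare the three simulated cases on a common scale, the lock-in times above can be expressed in bit periods (lock-in time multiplied by data rate). The sketch below is plain arithmetic on the figures quoted for Fig. 4.7 to Fig. 4.9; the helper name `lock_in_bits` is hypothetical.

```python
def lock_in_bits(data_rate_hz: float, lock_time_s: float) -> float:
    """Number of bit periods that elapse during the lock-in time."""
    return data_rate_hz * lock_time_s


# (data rate in Hz, lock-in time in seconds), taken from the simulations above
cases = {
    "480MHz": (480e6, 40e-9),
    "100MHz": (100e6, 310e-9),
    "12MHz": (12e6, 2.4e-6),
}

results = {name: lock_in_bits(rate, t) for name, (rate, t) in cases.items()}
for name, bits in results.items():
    print(f"{name}: {bits:.1f} bit periods")
```

At 480MHz this gives about 19.2 bit periods, which is inside the 32-bit SYNC pattern of the high speed mode.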

## **4.5 Bit Error Rate Measurement in RTL Behavior Model Simulation**



(b) BER performance of AGCID in 12MHz



(c) BER performance of AGCID in 1MHz

Fig 4.10 Input jitter tolerance of BER performance with AGCID scheme

As shown in Fig. 4.10 (a), when the proposed ADCDR circuit operates in USB 2.0 high speed mode with the AGCID scheme turned off, the BER (Bit Error Rate) stays below $10^{-12}$ until the input data jitter exceeds 100ps; beyond 100ps, the BER begins to degrade. When the AGCID scheme is turned on, the jitter tolerance improves: with an input jitter of 150ps, the BER is still less than $10^{-12}$. Fig. 4.10 (b) shows the proposed ADCDR circuit operating in USB 2.0 full speed mode at 12MHz. Turning on the AGCID improves the input jitter tolerance from 8ns to 9ns. Fig. 4.10 (c) shows that the proposed ADCDR circuit at 1MHz tolerates an input jitter of 11ns with a BER below $10^{-12}$. With the AGCID scheme turned on, the tolerable input jitter improves to 12ns while the BER remains below $10^{-12}$.

## **4.6 Chip Summary and Comparison Table**



Fig 4.11 Layout of 2<sup>nd</sup> version of proposed ADCDR circuit

Fig. 4.11 shows the layout of the $2^{nd}$ version of the proposed ADCDR circuit. The chip is implemented in a standard performance 65nm CMOS process. The chip size is $644\mu m^2$ and the core size is $150\mu m^2$. In addition to the proposed ADCDR, the chip contains testing circuits: the pattern generator occupies $1479\mu m^2$ and the clock generator occupies $230\mu m^2$. The power consumption of the testing circuits is 0.213mW for the pattern generator and 0.155mW for the clock generator, both at 500MHz.

The operation range of the proposed ADCDR is from 700kHz to 500MHz, which covers all speed modes of the USB 2.0 specification.

The lock-in time of the proposed CDR is less than 32 cycles at 480MHz, which means the lock-in procedure can be completed within the SYNC pattern of a USB 2.0 packet in high speed mode. In addition, the resolution of the wide range cyclic TDC is fine enough for the USB 2.0 full speed mode (12MHz) and low speed mode (1.5MHz) that the initial phase error between the data transition and the recovered clock is less than 90°. Thus the proposed ADCDR circuit can recover data correctly within the short SYNC pattern in the full speed and low speed modes of USB 2.0.
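As a rough worked example of the 90° condition, assume that 360° of phase corresponds to one bit period $T_{bit}$ of the full-rate recovered clock (an assumption made here for illustration; $t_{90^\circ}$ is a symbol introduced only for this example). The initial timing error allowed by the condition is then a quarter of a bit period:

$$
t_{90^\circ} = \frac{T_{bit}}{4} = \frac{1}{4 f_{data}}, \qquad
t_{90^\circ}\big|_{12\,\mathrm{MHz}} = \frac{1}{4 \times 12\,\mathrm{MHz}} \approx 20.8\,\mathrm{ns}, \qquad
t_{90^\circ}\big|_{1.5\,\mathrm{MHz}} = \frac{1}{4 \times 1.5\,\mathrm{MHz}} \approx 166.7\,\mathrm{ns}
$$

Under this assumption, the TDC measurement error only has to stay within tens of nanoseconds at these two rates.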

The input jitter tolerance of the proposed ADCDR reaches 150ps pk-pk (86% UI) in USB 2.0 high speed mode, 9ns (80% UI) in USB 2.0 full speed mode, and 12ns (96% UI) in USB 2.0 low speed mode. Components inside the transceiver analog front end shown in Fig. 1.8, such as an equalizer, can be used with the proposed ADCDR to achieve the receiver sensitivity requirement of an input data eye pattern of 70% UI, which is shown in Fig. 1.7.

The performance of the proposed ADCDR and a comparison with related clock and data recovery circuits are shown in Table 4.3.

|  | [16]<br>ASSCC'07 | [17]<br>TCAS-I'06 | [26]<br>ISSC'06 | [15]<br>TCAS-II'09 | [2]<br>TCAS-II'08 | [25]<br>Version 1 | Version 2 |
|---|---|---|---|---|---|---|---|
| Process | 0.13-µm | 0.35-µm | 0.18-µm | 0.25-µm | 0.18-µm | 65nm | 65nm |
| Data Rate | 180Mb/s~<br>3.2Gb/s | 200Mb/s~<br>2Gb/s | 155Mb/s~<br>3Gb/s | 140Mb/s~<br>1.8Gb/s | 480MHz | 480MHz | 700kb/s~<br>500Mb/s |
| Operation<br>Rate | Continuous<br>Rate | Continuous<br>Rate | Continuous<br>Rate | Continuous<br>Rate | Single<br>Rate | Single<br>Rate | Continuous<br>Rate |
| CDR<br>Type | 4X<br>Multi-phase | 4X<br>Multi-phase | 4X<br>Multi-phase | 14X<br>Multi-phase | Blind<br>Over-sampling | Full<br>Rate | Full<br>Rate |
| Area | 0.55mm<sup>2</sup> | 0.4mm<sup>2</sup> | 0.88mm<sup>2</sup> | 2mm<sup>2</sup> | 0.185mm<sup>2</sup> | 0.0255mm<sup>2</sup> | 0.0255mm<sup>2</sup> |
| Supply | 1.2V | 3.3V | 1.8V | 2.5V | 1.8V | 1.0V | 1.0V |
| Power | 140 mW<br>(3.2Gb/s)<br>75 mW<br>(180Mb/s) | 170 mW<br>(2Gb/s) | 95 mW<br>(3Gb/s) | 342.5 mW<br>(Receiver) | 8.2 mW<br>(480MHz) | 1.73 mW<br>(480MHz) | 2.63 mW<br>(500MHz)<br>1.73 mW<br>(150MHz)<br>0.54 mW<br>(1MHz) |
| Jitter rms | 14 ps<br>(3.2Gb/s)<br>138 ps<br>(180MHz) | 5.86 ps<br>(2Gb/s)<br>18.8 ps<br>(200Mb/s) | 64 ps<br>(3Gb/s)<br>80 ps<br>(155Mb/s) | 14.96 ps<br>(1.8Gb/s) | N/A | 15 ps<br>(526MHz)<br>12.41 ps<br>(325MHz) | N/A |
| Jitter<br>pk-pk | 116 ps<br>(3.2Gb/s)<br>700 ps<br>(180Mb/s) | 41.8 ps<br>(2Gb/s)<br>120 ps<br>(200Mb/s) | 48.9 ps<br>(3Gb/s)<br>467 ps<br>(155Mb/s) | N/A | N/A | 101 ps<br>(526MHz)<br>90.91 ps<br>(325MHz) | N/A |
| Lock-in<br>Time | 12 µs<br>(3.2Gb/s)<br>9.5 µs<br>(200Mb/s) | 30 µs<br>(1Gb/s) | 50 µs<br>(3Gb/s) | 7.5 µs<br>(1.8Gb/s)<br>8 µs<br>(140Mb/s) | Zero<br>lock time | 40 ns<br>(500Mb/s) | 40 ns<br>(500Mb/s)<br>310 ns<br>(100Mb/s)<br>2.4 µs<br>(12Mb/s)<br>25 µs<br>(1Mb/s) |
| Reference<br>Clock | No | No | No | No | Yes | No | No |

## Chapter 5 Conclusion and Future Works

In this thesis, a referenceless continuous-rate ADCDR with a short lock-in time and a wide operation range is proposed.

The cyclic TDC-embedded DCO is used to measure the input data rate so that the initial recovered clock frequency is close to the target data rate, achieving fast frequency acquisition within the SYNC pattern of a USB 2.0 packet.

The proposed cyclic concept of the TDC allows the DFFs on the delay line to be reused, achieving a wide range TDC measurement. In addition, the cyclic delay concept of the low frequency clock synthesis achieves the wide operation range without increasing the area.

The proposed ADCDR does not need a multi-phase over-sampling scheme, so both the design complexity and the cost are reduced.

The dual mode phase detector achieves both phase and frequency tracking on the SYNC pattern and phase tracking on the random data pattern. The adaptive gain in consecutive identical digit (AGCID) scheme improves the input jitter tolerance and the bit error rate performance.

For this ADCDR, the version 1 test chip was taped out to verify the proposed methods. This version is designed only for the operation speed of USB 2.0 high speed mode (480MHz). The test chip is fabricated in the UMC 65nm standard performance CMOS process and the core area is $150\mu$m×$150\mu$m.

CDR circuits are widely used in high speed serial data links. However, Electromagnetic Interference (EMI) is a problem because it can disturb the transmitted data. The conventional solution is metal shielding, but this solution is costly. Therefore, the spread spectrum technique has been proposed [32] for EMI reduction.

If the transmitter uses triangular modulation to generate the spread spectrum clock, the output data rate changes continuously in a regular pattern. The tracking ability of the CDR circuit must then be enhanced to track the changing data rate while keeping its robustness and bit error rate performance. At the present stage, the proposed AGCID scheme can be used to enhance the tracking ability of the CDR circuit, but it still encounters limitations at high modulation frequencies. We hope to find a novel method to improve the CDR circuit performance under spread spectrum clocking in the future.

## Reference

- [1] Sang-Hyun Lee, Moon-Sang Hwang, Youngdon Choi, "A 5-Gb/s 0.25-um CMOS Jitter-Tolerant Variable-Interval Oversampling Clock/Data Recovery Circuit," in *IEEE Journal of Solid-State Circuits*, Vol. 37, pp. 1822-1830, Dec. 2002.
- [2] Sang-Hune Park; Kwang-Hee Choi; Jung-Bum Shin; Jae-Yoon Sim; Hong-June Park, "A Single-Data-Bit Blind Oversampling Data-Recovery Circuit With an Add-Drop FIFO for USB2.0 High-Speed Interface," in *IEEE Transaction on Circuits and System II: Express Briefs*, Vol. 55 No. 2, pp. 156-160, Feb. 2008.
- [3] Banu, M.; Dunlop, A, "A 660Mb/s CMOS Clock Recovery Circuit with Instantaneous Locking for NRZ Data and Burst-Mode Transmission," in Proceeding of IEEE Solid-State Circuits Conference, pp. 102-103, 1993.
- [4] "USB 2.0 Transceiver Macrocell Interface(UTMI) Specification," Intel Corporation, Mar. 2001
- [5] USB Implementers Forum, "USB 2.0 specification", Apr. 2000
- [6] José Sarmento, John T. Stonick, "A Minimal-Gate-Count Fully Digital Frequency-Tracking Oversampling CDR Circuit," *in Proceeding of IEEE International Symposium Circuit and Systems*, pp. 2099-2102, 2010.
- [7] Yoshio Miki, Tatsuya Saito, Hiroki Yamashita, Fumio Yuki, Takashige Baba, Akio Koyama, and Masahito Sonehara, "A 50-mW/ch 2.5-Gb/s/ch Data Recovery Circuit for the SFI-5 Interface With Digital Eye-Tracking," in *IEEE Journal of Solid-State Circuits*, Vol. 39, pp. 613-621, Apr. 2004.
- [8] Ming-ta Hsieh, Sobelman, G., "Architectures for Multi-Gigabit Wire-Linked Clock and Data Recovery," *IEEE Circuits and Systems Magazine*, Vol. 8, pp. 45-57, 2008
- [9] Behzad Razavi, "Challenges in the Design of High-Speed Clock and Data Recovery Circuits," *IEEE Communication Magazine*, Vol. 40, pp. 94-101, Aug. 2002
- [10] Jaeha Kim and Deog-Kyoon Jeong, "Multi-Gigabit-Rate Clock and Data Recovery Based on Blind Oversampling," *IEEE Communication Magazine*, Vol. 41 pp. 68-74, Dec. 2003.
- [11] Pao-Lung Chen, Ching-Che Chung, and Chen-Yi Lee, "A Clock Generation with Cascaded Dynamic Frequency Counting Loops for Wide Multiplication Range Applications," in *IEEE Journal of Solid-State Circuits*, Vol. 41, pp. 1275-1285, Jun. 2006.

- [12] Scheytt, J.C., Hanke, G., Langmann, U., "A 0.155-, 0.622-, and 2.488-Gb/s Automatic Bit-Rate Selecting Clock and Data Recovery IC for Bit-Rate Transparent SDH Systems," in *IEEE Journal of Solid-State Circuits*, Vol. 34, pp. 1935-1943, Dec. 1999.
- [13] Che-Fu Liang and Shen-Iuan Liu, "A 20/10/5/2.5Gb/s Power-scaling Burst-Mode CDR Circuit Using GVCO/Div2/DFF Tri-mode Cells," in Proceeding of IEEE International Symposium Circuit and Systems, pp. 224-608, Feb. 2008.
- [14] Pyung-Su Han and Woo-Young Choi, "1.25/2.5-Gb/s Burst-Mode Clock Recovery Circuit with a Novel Dual Bit-Rate Structure in 0.18-µm CMOS," in Proceeding of IEEE International Symposium Circuit and Systems, pp. 3069-3072, 2006.
- [15] Inhwa Jung, Daejung Shin, Taejin Kim and Chulwoo Kim, "A 140-Mb/s to 1.82-Gb/s Continuous-Rate Embedded Clock Receiver for Flat-Panel Displays," in *IEEE Transaction on Circuits and System II: Express Briefs*, Vol. 56, pp. 773-777, Oct. 2009.
- [16] Moon-Sang Hwang, Sang-Yoon Lee, Jeong-Kyoum Kim, Suhwan Kim; Deog-Kyoon Jeong, "A 180-Mb/s to 3.2-Gb/s, Continuous-Rate, Fast-Locking CDR without Using External Reference Clock" in Proceeding of IEEE Asia Solid State Circuits Conference, pp. 144-147, Nov. 2007.
- [17] Rong-Jyi Yang, Kuan-Hua Chao, and Shen-Iuan Liu, "A 200-Mbps~2-Gbps Continuous-Rate Clock-and-Data-Recovery Circuit," in *IEEE Transactions on Circuits and Systems I*, Vol. 53, pp. 842-847, Apr. 2006.
- [18] Shao-Hung Lin, Chang-Lin Hsieh, and Shen-Iuan Liu, "A Half-Rate Bang-Bang Phase/Frequency Detector for Continuous-Rate CDR Circuits," in Proceedings of IEEE Conference on Electron Devices and Solid-State Circuits, pp. 353-356, 2007.
- [19] S. I. Ahmed and T. A. Kwasniewski, "Overview of Oversampling Clock and Data Recovery Circuits," in Proceedings of IEEE Canadian Conference on Electrical and Computer Engineering, pp. 1876-1881, May 2005.
- [20] Robert Bogdan Staszewski, Chih-Ming Hung, Dirk Leipold, and Poras T. Balsara, "A first multigigahertz digitally controlled oscillator for wireless applications," in *IEEE Transactions on Microwave Theory and Techniques*, Vol. 51, pp. 2154-2164, Nov. 2003.
- [21] S. Tontisirin and R. Tielert, "A Gb/s One-Fourth-Rate CMOS CDR Circuit without External Reference Clock," in Proceedings of IEEE International Symposium on Circuits and Systems, pp. 3265-3268, 2006.

- [22] D. Dalton, K. Chai, E. Evans, M. Ferriss, D. Hitchcox, P. Murray, S. Selvanayagam, P. Shepherd, and L. DeVito, "A 12.5-Mb/s to 2.7-Gb/s Continuous-Rate CDR With Automatic Frequency Acquisition and Data-Rate Readback," in *IEEE Journal of Solid-State Circuits*, Vol. 40, pp. 2713-2725, Dec. 2005.
- [23] K. Babulu and K. S. Rajan, "FPGA Implementation of USB Transceiver Macrocell Interface with USB2.0 Specifications," in Proceedings of IEEE International Conference on Emerging Trends in Engineering and Technology, pp. 966-970, 2008.
- [24] Hsuan-Jung Hsu, Chun-Chieh Tu, and Shi-Yu Huang, "A high-resolution all-digital phase-locked loop with its application to built-in speed grading for memory," in Proceedings of IEEE Symposium on VLSI Design Automation and Test (VLSI-DAT), pp. 267-270, Apr. 2008.
- [25] Ching-Che Chung and Wei-Cheng Dai, "An Referenceless All-Digital Fast Frequency Acquisition Full-Rate CDR Circuit for USB 2.0 in 65nm CMOS Technology," in Proceedings of IEEE International Symposium VLSI Design, Automation and Test (VLSI-DAT), pp. 1-4, 2011.
- [26] Rong-Jyi Yang, Kuan-Hua Chao, Sy-Chyuan Hwu, Chuan-Kang Liang, and Shen-Iuan Liu, "A 155.52 Mbps–3.125 Gbps Continuous-Rate Clock and Data Recovery Circuit," in *IEEE Journal of Solid-State Circuits*, Vol. 41, pp. 1380-1390, Jun. 2006.
- [27] Seon-Kyoo Lee, Young-Sang Kim, Hyunsoo Ha, Younghun Seo, Hong-June Park, and Jae-Yoon Sim, "A 650Mb/s-to-8Gb/s Referenceless CDR Circuit with Automatic Acquisition of Data Rate," in Proceedings of IEEE Solid-State Circuits Conference, pp. 184-185, 185a, Feb. 2009.
- [28] Chen-Yi Lee and Ching-Che Chung, "Digital Loop Filter for All-Digital Phase-Locked Loop Design," <u>US Patent 7,696,832 B1</u>, Apr. 13, 2010.
- [29] Terng-Yin Hsu, Bai-Jue Shieh and Chen-Yi Lee, "An All-Digital Phase-Locked Loop (ADPLL)-Based Clock Recovery Circuit," in *IEEE Journal of Solid-State Circuits*, Vol. 34, pp. 1063-1073, Aug. 1999.
- [30] C. R. Hogge, Jr., "A Self Correcting Clock Recovery Circuit," in *IEEE Transactions on Electron Devices*, Vol. 32, pp. 2704-2706, Dec. 1985.
- [31] J. D. H. Alexander, "Clock recovery from random binary signals," in *IEEE Electronics Letters*, Vol. 11, pp. 541-542, Oct. 1975.
- [32] Kuo-Hsing Cheng, Cheng-Liang Hung, and Chih-Hsien Chang, "A 0.77 ps RMS Jitter 6-GHz Spread-Spectrum Clock Generator Using a Compensated Phase-Rotating Technique," in *IEEE Journal of Solid-State Circuits*, Vol. 46, pp. 1198-1213, May 2011.