CSCNPhaser: Inference of Chromosome-Specific Copy Number Using Population Haplotypes

 

Site last updated 02/10/11

A large number of copy number variations (CNVs) have been identified in humans. In practice, because the human genome is diploid, most sequencing platforms are limited to (or are more accurate at) detecting total copy numbers rather than the chromosome-specific copy number on each of the two homologous chromosomes. Given a set of total copy numbers and their background haplotypes, CSCNPhaser aims to infer chromosome-specific copy numbers using the linkage disequilibrium between CNVs and SNPs.
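To make the problem concrete, the sketch below lists every way a total copy number could be divided between the two homologous chromosomes. It only illustrates the ambiguity that has to be resolved; it is not the inference method used by CSCNPhaser, and the function name `possible_splits` is hypothetical.

```python
def possible_splits(total):
    """Return every ordered (chromosome 1, chromosome 2) pair of
    non-negative copy numbers that adds up to the observed total."""
    return [(a, total - a) for a in range(total + 1)]


# A total of 3 copies is compatible with four chromosome-specific assignments.
for total in (0, 2, 3):
    splits = ", ".join(f"{a}/{b}" for a, b in possible_splits(total))
    print(f"total {total}: {splits}")
```

A total copy number alone cannot distinguish between these candidates, which is why the haplotypes around each CNV are supplied as additional input.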

CSCNPhaser:

CSCNPhaser is implemented in C.

Download the program and example input/output files compressed using WinZip
Windows Platform
Unzip the files using WinZip or other software
Usage: double-click CSCNphaser.exe or run the program from the DOS prompt

Unix Platform (you may need to chmod the program)
1. unzip CSCNphaser_Unix.zip
2. chmod 755 CSCNphaser
3. ./CSCNphaser

Input file

A) CNV.txt

ex)
CNV_id sample1 sample2 sample3 sample4 sample5 sample6
1 2 2 2 2 3 3
2 2 2 0 3 3 2

*columns are separated by TAB.
col1 : CNV-id
col2, col3, ... : copy number for each sample
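As a quick way to check a CNV.txt file before running the program, the format above can be read with a few lines of Python. This is a minimal sketch and not part of CSCNPhaser; the function name `parse_cnv_table` is hypothetical, and the example text is the table above written with explicit TAB characters.

```python
def parse_cnv_table(text):
    """Parse the content of CNV.txt.

    Returns (sample_names, totals), where totals maps each CNV id to the
    list of total copy numbers, one per sample, in header order.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    samples = lines[0].split("\t")[1:]
    totals = {}
    for ln in lines[1:]:
        fields = ln.split("\t")
        counts = [int(x) for x in fields[1:]]
        if len(counts) != len(samples):
            raise ValueError(f"CNV {fields[0]}: expected {len(samples)} values")
        totals[fields[0]] = counts
    return samples, totals


example = (
    "CNV_id\tsample1\tsample2\tsample3\tsample4\tsample5\tsample6\n"
    "1\t2\t2\t2\t2\t3\t3\n"
    "2\t2\t2\t0\t3\t3\t2\n"
)
samples, totals = parse_cnv_table(example)
print(samples)
print(totals)
```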

B) CNV_info.txt

ex)
CNV_id Chromosome Start_position End_position
1 chr1 10 20
2 chr5 5 25

*columns are separated by TAB.
col1 : CNV-id
col2 : chromosome
col3 : start position of the CNV
col4 : end position of the CNV
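Because the SNP files are named after the chromosomes in CNV_info.txt, the same file can be used to work out which SNP files must be present. The sketch below assumes the chromosome column already holds the "chrN" label used in the file names, as in the example; the function names are hypothetical and not part of CSCNPhaser.

```python
def parse_cnv_info(text):
    """Parse the content of CNV_info.txt into {cnv_id: (chromosome, start, end)}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    regions = {}
    for ln in lines[1:]:  # skip the header row
        cnv_id, chrom, start, end = ln.split("\t")
        regions[cnv_id] = (chrom, int(start), int(end))
    return regions


def required_snp_files(regions):
    """List the SNP haplotype and SNP info files needed for these CNVs."""
    files = []
    for chrom in sorted({chrom for chrom, _, _ in regions.values()}):
        files.append(f"SNP_{chrom}.txt")
        files.append(f"SNP_{chrom}_info.txt")
    return files


example = (
    "CNV_id\tChromosome\tStart_position\tEnd_position\n"
    "1\tchr1\t10\t20\n"
    "2\tchr5\t5\t25\n"
)
regions = parse_cnv_info(example)
print(regions)
print(required_snp_files(regions))
```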

C) SNP_chrN.txt

chrN : one file is needed for each chromosome listed in file "CNV_info.txt"

ex SNP_chr1.txt)
0011101111
0011101111
0011101111
0011101110
0011101110
0011101111
0011101111
0011101111
0011101111
1100010001
1100010001
0011101111

ex SNP_chr5.txt)
111010001100111
111010001100111
111010001100111
111010001100111
101101001011100
101101001011100
111010001100111
010010110011011
111010001100111
010010110011011
101101001011100
010010110011011

row1 : sample1 haplotype 1
row2 : sample1 haplotype 2
row3 : sample2 haplotype 1
row4 : sample2 haplotype 2
row5 : sample3 haplotype 1
row6 : sample3 haplotype 2
.
.
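The row layout above (two consecutive rows per sample) can be read as follows. This is a minimal sketch, not part of CSCNPhaser; `pair_haplotypes` is a hypothetical name, and the check that all rows have equal length is an assumption based on the examples, where every row has one character per SNP.

```python
def pair_haplotypes(text):
    """Group the rows of SNP_chrN.txt into (haplotype 1, haplotype 2) per sample."""
    rows = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(rows) % 2 != 0:
        raise ValueError("expected two haplotype rows per sample")
    if len({len(r) for r in rows}) > 1:
        raise ValueError("all haplotype rows should have the same length")
    return [(rows[i], rows[i + 1]) for i in range(0, len(rows), 2)]


# Content of the SNP_chr1.txt example above.
chr1_text = "\n".join([
    "0011101111", "0011101111", "0011101111", "0011101110",
    "0011101110", "0011101111", "0011101111", "0011101111",
    "0011101111", "1100010001", "1100010001", "0011101111",
])
pairs = pair_haplotypes(chr1_text)
for n, (hap1, hap2) in enumerate(pairs, start=1):
    print(f"sample{n}: {hap1} / {hap2}")
```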

D) SNP_chrN_info.txt

chrN : one file is needed for each chromosome listed in file "CNV_info.txt"

ex SNP_chr1_info.txt)
SNP_id position allele_0 allele_1
1 2 G A
2 3 A G
3 6 C T
4 11 C G
5 12 C G
6 15 A G
7 16 A G
8 18 C T
9 21 C T
10 25 G A

ex SNP_chr5_info.txt)
SNP_id position allele_0 allele_1
11 1 C T
12 3 A C
13 6 A C
14 8 C A
15 9 A G
16 13 C G
17 16 C G
18 17 C G
19 19 A C
20 20 A C
21 23 C T
22 24 C T
23 28 G A
24 29 A C
25 31 A G

*columns are separated by TAB.
col1 : SNP-id
col2 : position of the SNP
col3 : the nucleotide encoded as 0 in file "SNP_chrN.txt"
col4 : the nucleotide encoded as 1 in file "SNP_chrN.txt"
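The 0/1 coding can be translated back to nucleotides with the info file. The sketch below assumes that the i-th character of a haplotype row corresponds to the i-th SNP row of the info file, which matches the examples (10 characters and 10 SNPs for chr1) but is not stated explicitly; the function names are hypothetical.

```python
def parse_snp_info(text):
    """Parse SNP_chrN_info.txt into a list of (snp_id, position, allele_0, allele_1)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    snps = []
    for ln in lines[1:]:  # skip the header row
        snp_id, position, allele_0, allele_1 = ln.split("\t")
        snps.append((snp_id, int(position), allele_0, allele_1))
    return snps


def decode_haplotype(bits, snps):
    """Translate a 0/1 haplotype string into its nucleotide sequence."""
    if len(bits) != len(snps):
        raise ValueError("haplotype length does not match the number of SNPs")
    out = []
    for bit, (_snp_id, _position, allele_0, allele_1) in zip(bits, snps):
        if bit not in "01":
            raise ValueError(f"unexpected allele code {bit!r}")
        out.append(allele_1 if bit == "1" else allele_0)
    return "".join(out)


# Content of the SNP_chr1_info.txt example above, with explicit TABs.
info_text = (
    "SNP_id\tposition\tallele_0\tallele_1\n"
    "1\t2\tG\tA\n2\t3\tA\tG\n3\t6\tC\tT\n4\t11\tC\tG\n5\t12\tC\tG\n"
    "6\t15\tA\tG\n7\t16\tA\tG\n8\t18\tC\tT\n9\t21\tC\tT\n10\t25\tG\tA\n"
)
snps = parse_snp_info(info_text)
print(decode_haplotype("0011101111", snps))
print(decode_haplotype("1100010001", snps))
```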
 

Output file: result.txt

ex)
CNV_id sample1 sample2 sample3 sample4 sample5 sample6
1 1/1 1/1 1/1 1/1 1/2 2/1
2 1/1 1/1 0/0 1/2 1/2 0/2

*columns are separated by TAB.
col1 : CNV-id
col2 : the inferred chromosome-specific copy number of sample1
col3 : the inferred chromosome-specific copy number of sample2
col4 : the inferred chromosome-specific copy number of sample3
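One simple sanity check on result.txt is that the two chromosome-specific copy numbers of each sample add up to the total given in CNV.txt, as they do in the examples on this page. The sketch below is not part of CSCNPhaser; the function names are hypothetical, and the totals are copied from the CNV.txt example.

```python
def parse_result(text):
    """Parse result.txt into {cnv_id: [(copies_a, copies_b), ...]}, one pair per sample."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    phased = {}
    for ln in lines[1:]:  # skip the header row
        fields = ln.split("\t")
        phased[fields[0]] = [
            tuple(int(x) for x in field.split("/")) for field in fields[1:]
        ]
    return phased


def find_mismatches(totals, phased):
    """Return (cnv_id, sample_index) wherever the pair does not sum to the total."""
    bad = []
    for cnv_id, pairs in phased.items():
        for idx, (a, b) in enumerate(pairs):
            if a + b != totals[cnv_id][idx]:
                bad.append((cnv_id, idx))
    return bad


result_text = (
    "CNV_id\tsample1\tsample2\tsample3\tsample4\tsample5\tsample6\n"
    "1\t1/1\t1/1\t1/1\t1/1\t1/2\t2/1\n"
    "2\t1/1\t1/1\t0/0\t1/2\t1/2\t0/2\n"
)
# Total copy numbers from the CNV.txt example.
totals = {"1": [2, 2, 2, 2, 3, 3], "2": [2, 2, 0, 3, 3, 2]}
phased = parse_result(result_text)
print(find_mismatches(totals, phased))
```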

If you encounter any problems, please contact Yao-Ting Huang (ythuang at cs.ccu.edu.tw)