The Chromosome 7 Annotation Project
Tables S7a and S7b list twenty regions on human chromosome 7 that are devoid of genes (gene deserts), together with the syntenic intervals in mouse. For this study, a gene desert was defined as a region of 500 kb or more containing no known, novel, or partial genes. The control regions examined are shown in Tables S7c, S7d, and S7e. In addition to the randomly selected genomic regions (described in the published text), we examined, as a further control, the genomic intervals encompassed by large genes (>500 kb) on chromosome 7 (Table S7f). In all cases, the only correlation observed was with low CpG density.

The orthologous genes in mouse and syntenic anchor points were used to identify the equivalent regions in the murine genome. Table S7b shows the orthologous murine genes that flank each region. If an orthologous gene was not yet defined, the nearest known gene flanking the region was selected (the closest EST or syntenic anchor marker to the boundary is also listed in parentheses). When syntenic anchors are used, the corresponding region in the public mouse assembly (UCSC) can be identified by retrieving the syntenic anchor sequence from the chromosome 7 Genome Browser (http://www.chr7.org) and searching the mouse assembly with it. The best sequence alignment (using BLAT) represents the border of the mouse region equivalent to the human gene desert.

For our analysis of mouse, we examined the UCSC assembly (results shown in Table S7b) and the Celera assembly separately. In 19 of 20 cases the results were equivalent. In one instance (human desert #7), a break in synteny occurred: UCSC placed both mouse segments on chromosome 5 (Table S7b), while Celera positioned them on chromosomes 12 and 5. Nevertheless, by our criteria all 20 regions would still be characterized as deserts in mouse. The locations of the human gene deserts along chromosome 7 can also be viewed in the 'Structural Feature' track in the Genome Browser at http://www.chr7.org.
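The 500 kb definition above can be illustrated with a short sketch. This is not the project's actual pipeline, which is not described here; it assumes gene annotations are available as simple (start, end) base-pair intervals on one chromosome, and the function name `find_gene_deserts` and the toy coordinates are hypothetical.

```python
from typing import List, Tuple

MIN_DESERT_BP = 500_000  # 500 kb threshold taken from the definition in the text


def find_gene_deserts(
    genes: List[Tuple[int, int]],
    chrom_length: int,
    min_size: int = MIN_DESERT_BP,
) -> List[Tuple[int, int]]:
    """Return (start, end) gaps of at least min_size bp that contain no gene.

    Assumption: every known, novel, or partial gene is already in `genes`,
    so any gap between annotated intervals counts as gene-free.
    """
    deserts = []
    cursor = 0  # end of the furthest-reaching gene seen so far
    for start, end in sorted(genes):
        if start - cursor >= min_size:
            deserts.append((cursor, start))
        cursor = max(cursor, end)  # handles overlapping or nested genes
    if chrom_length - cursor >= min_size:
        deserts.append((cursor, chrom_length))
    return deserts


# Hypothetical coordinates for illustration only, not chromosome 7 data.
toy_genes = [
    (100_000, 250_000),
    (200_000, 300_000),
    (1_000_000, 1_200_000),
    (1_500_000, 1_600_000),
]
deserts = find_gene_deserts(toy_genes, chrom_length=2_000_000)
print(deserts)
```

In this toy input only the 700 kb gap between the second and third genes passes the threshold; the 300 kb and 400 kb gaps do not.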
(Table S7a) Putative Gene Deserts on Chromosome 7

| Gene Desert | Size (kb) | Location | Flanking Reference Gene 1 | Flanking Reference Gene 2 | Genes and Models in Region | CpG Islands | CpG Islands/Mb | % Syntenic (>75%) | Repetitive Content: LINEs | Repetitive Content: SINEs |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1850 | 7q11.22-q11.23 | FLJ13195 | AUTS2 | 2 predicted | 1 | 0.5 | 4.20% | 10.80% | 25.80% |
| 2 | 1740 | 7q31.1 | THC1201470 | IMMP2L | 3 predicted; 1 pseudogene | 0 | 0 | 4.60% | 32.90% | 5.20% |
| 3 | 1700 | 7p12.2-p12.1 | KIAA0633 | FLJ40449 | 2 predicted; 1 putative; 1 pseudogene | 4 | 2.4 | 2.90% | 26.00% | 8.10% |
| 4 | 1690 | 7p22.1-p21.3 | NXPH1 | IMAGE:3605453 | 2 predicted; 1 putative; 1 pseudogene | 3 | 1.8 | 4.70% | 31.20% | 5.30% |
| 5 | 1640 | 7q31.31-q31.32 | ANKRD7 | hCT1816883 | 1 pseudogene | 2 | 1.2 | 2.10% | 32.20% | 4.20% |
| 6 | 1190 | 7p21.3 | ARL4 | ETV1 | 1 pseudogene | 2 | 1.7 | 5.60% | 24.00% | 8.70% |
| 7 | 1040 | 7q21.11-q21.13 | IMAGE:5272175 | GRM3 | 1 predicted | 1 | 1 | 2.10% | 33.80% | 4.90% |
| 8 | 990 | 7p12.3-p12.2 | MGC26484 | ZPBP | none | 1 | 1 | 2.60% | 31.00% | 6.00% |
| 9 | 950 | 7q31.33-q31.2 | THC1079110 | GRM8 | 1 predicted | 0 | 0 | 3.10% | 28.70% | 5.10% |
| 10 | 900 | 7q36.1-q36.2 | ARP3BETA | DPP6 | 2 predicted | 2 | 2.2 | 1.90% | 24.30% | 8.40% |
| 11 | 850 | 7p13-p12.3 | IGFBP3 | PRO1866 | 2 predicted; 1 pseudogene | 1 | 1.2 | 1.50% | 31.40% | 5.80% |
| 12 | 810 | 7q31.2-q31.31 | IMAGE:4276820 | TFEC | 1 putative | 0 | 0 | 11.80% | 19.90% | 6.10% |
| 13 | 770 | 7q21.2 | FLJ32110 | IMAGE:5295327 | 1 predicted | 1 | 1.3 | 4.80% | 33.60% | 5.20% |
| 14 | 730 | 7q31.2 | GPR85 | PPP1R3 | none | 1 | 1.4 | 4.10% | 31.40% | 5.10% |
| 15 | 720 | 7q35 | TPK1 | THC1203597 | 3 pseudogenes | 1 | 1.4 | 4.10% | 29.70% | 6.30% |
| 16 | 660 | 7p14.1-p13 | GLI3 | MGC2821 | 1 predicted | 0 | 0 | 4.10% | 31.10% | 6.50% |
| 17 | 610 | 7q21.11 | AIP1 | GNAI1 | 1 predicted; 1 pseudogene | 0 | 0 | 2.40% | 30.00% | 6.30% |
| 18 | 540 | 7p21.2-p21.1 | FERD3L | LOC221830 | 1 predicted | 0 | 0 | 6.50% | 35.40% | 5.20% |
| 19 | 540 | 7p14.1 | BC033981 | INHBA | none | 0 | 0 | 6.60% | 16.60% | 9.70% |
| 20 | 540 | 7q21.3-q22.1 | DC11 | TAC1 | 2 predicted | 0 | 0 | 3.30% | 28.70% | 12.20% |
| Total | 20460 | | | | | Average | 0.8 | 4.20% | 28.10% | 7.50% |
| | | | | | | Standard Deviation | 0.8 | 2.30% | 6.30% | 4.70% |
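As a rough check on how the density column in Table S7a relates to the others, the sketch below recomputes CpG islands per Mb from the size and island-count columns. It assumes the density is simply the island count divided by the region size in Mb, which is not stated in the source; the choice of sample rather than population standard deviation is likewise an assumption.

```python
from statistics import mean, stdev

# (size in kb, CpG island count) for deserts 1-20, copied from Table S7a.
DESERTS = [
    (1850, 1), (1740, 0), (1700, 4), (1690, 3), (1640, 2),
    (1190, 2), (1040, 1), (990, 1), (950, 0), (900, 2),
    (850, 1), (810, 0), (770, 1), (730, 1), (720, 1),
    (660, 0), (610, 0), (540, 0), (540, 0), (540, 0),
]


def cpg_islands_per_mb(size_kb: float, island_count: int) -> float:
    """Assumed formula: island count divided by region size in Mb."""
    return island_count / (size_kb / 1000)


densities = [cpg_islands_per_mb(size, count) for size, count in DESERTS]
total_kb = sum(size for size, _ in DESERTS)
avg_density = mean(densities)
sd_density = stdev(densities)  # sample SD; an assumption about the source

print(f"total size: {total_kb} kb")
print(f"CpG islands/Mb: mean {avg_density:.1f}, SD {sd_density:.1f}")
```

Under these assumptions the per-desert values, the total size, and the summary statistics agree with the table to the one decimal place it reports.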
|
(Table S7b) Mouse Regions Syntenic to Gene Deserts on Chromosome 7 (Data shown are from analysis of the UCSC mouse sequence)

| Gene Desert | Mouse Size (kb) | Mouse Location | Flanking Reference Gene 1 | Flanking Reference Gene 2 | Genes and Models in Region | CpG Islands | CpG Islands/Mb | % Syntenic (>75%) | Repetitive Content: LINEs | Repetitive Content: SINEs |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1488 | chr5 | BC021509 | Gats (hmSA93056) | none | 1 | 0.7 | 6.40% | 7.80% | 16.20% |
| 2 | 1951 | chr12 | Immp2l-pending | Dnajb9 (hmSA72264) | none | 1 | 0.5 | 4.20% | 31.50% | 2.90% |
| 3 | 2128 | chr11 | U26967 | Sec61g (hmSA220916) | none | 3 | 1.4 | 2.30% | 33.80% | 2.60% |
| 4 | 2216 | chr6 | Ica1 (BB641832) | BC011114 (hmSA167586) | 2 spliced EST clusters | 4 | 1.8 | 3.80% | 39.80% | 2.30% |
| 5 | 2019 | chr6 | AW214405 (hmSA16662) | Kcnd2 (hmSA15769) | none | 1 | 0.5 | 1.80% | 39.30% | 2.20% |
| 6 | 1138 | chr12 | Etv1 | Arl4 | none | 0 | 0 | 5.80% | 27.50% | 4.00% |
| 7 (segment 1) | 172 | chr5 | Cdk6 (hmSA322558) | Sema3a (BB451280) | none | 0 | 0 | 4.80% | 33.80% | 3.40% |
| 7 (segment 2) | 1117 | chr5 | Telomere (BB871298) | Png (hmSA322547) | none | 3 | 2.7 | 1.90% | 34.60% | 2.60% |
|
|
8 |
1239< | |||||||||