or at most a small amount of flanking sequence, and have not been able to assess the role of chromatin context, provided by the relatively large tracts of intervening sequence (2–50 kb) between the relatively small (500 bp) V genes. Further, it has recently become increasingly clear that large-scale chromatin remodelling events, such as nuclear relocalization (21), antisense intergenic transcription (22), and locus contraction (23), precede V(D)J recombination. These are not confined to genes, but rather affect large chromatin domains. Thus, to gain a complete understanding of how the Ig repertoire is established, it will be necessary to investigate the large noncoding regions between genes.

In this study, we set out to assemble and annotate the primary sequence data of the mouse Igh V region from publicly available sources, including the mouse genome sequencing project (Ensembl) (24), to provide a detailed picture of the locus, including exact numbers and positions of genes within each family and their correct genomic context. This assembled and fully annotated sequence is the first report that places V genes relative to flanking regions, pseudogenes, repeats, and nonrepetitive intergenic sequences, enabling study of their role in V(D)J recombination. Further, studies of V(D)J recombination in the C57BL/6 mouse strain have been limited. Because the C57BL/6 genome sequence is being assembled in Ensembl, and this is likely to be the major mouse strain for future study, it will be vital to investigate the recombination patterns of V genes in this strain with the benefit of complete locus knowledge.

The Igh locus has been partially assembled by the Ensembl project (24). However, at the date of manuscript submission, the Ensembl assembly is incomplete, with single contig coverage over large parts of NCBI Build m34 (freeze September 2005), many contigs in draft form, and over 40 large and small gaps encompassing ~450 kb of sequence. Thus, we chose to assemble the locus manually, which has enabled us to include many BAC sequences not available on Ensembl, order the contigs correctly, provide at least two-fold coverage over large parts, and close all of the gaps.

The Igh locus spans 3.3 Mb on mouse chromosome 12, from the 3′ enhancer at position 108.5 Mb to the first non-V gene at 111.85 Mb (Ensembl). We have assembled 2.75 Mb of the Igh locus, which contains the complete mouse Igh V region (2.5 Mb) and upstream flanking sequence (109.11–111.85 Mb). We used Sequencher software, which enabled us to incorporate and align all BACs that we found by BLAST analysis (Fig. For simplicity, only the most complete BACs are shown. A list of other overlapping BACs used is available on request.

We noticed a number of mistakes in the ordering of contigs in the Ensembl assembly. For example, contig AC074328 overlaps both AC073939 and AC087166 at the 5′ end of the Igh locus in our assembly but is placed further 3′ in Ensembl. This may be because this locus contains a large number of very similar V genes and a high density of LINE1 and other repeats (31), both of which make automated assembly less accurate.

Analysis of our assembled sequence using NIX software enabled us to identify sequences with homology to known V gene fragments, including diverged pseudogenes and gene remnants. Further, it aided identification of germline genes, because many genes deposited with databases are recombined, mature, somatically hypermutated, and, hence, nongermline V genes. V genes and pseudogenes were assigned to families using the nomenclature originally suggested by Brodeur and Riblet (32). A sequence was assigned to a V family if it had at least 80% identity at the nucleotide level over the entire coding exon sequence (excluding the leader).

A sequence was classed as a V gene if it had an intact translation initiation codon (ATG), splice junctions, and RSS, had no in-frame stop codons or frameshifts, and had a minimum length of 291 bp. Where a sequence had all the features of a normal coding gene, except for a noncanonical splice junction, we classed it as a coding gene, even though it may be nonfunctional. However, this applied to only one gene (J558.13.103), which had a GT-to-GC splice site change, which allows splicing, but at lower efficiency. A sequence was classed as a V pseudogene if it contained stop codons or frameshifts, lacked an ATG, or had a significantly altered RSS (spacer longer or shorter than 23 ± 1 bp; any alterations in the first three nucleotides of the heptamer (CAC); > 2 mismatches elsewhere in the heptamer; or > 1 G residue in the nonamer). Most pseudogenes had more than one of the.

Reproductive isolation drives the formation of new species, and many genes contribute to this through Dobzhansky-Muller incompatibilities (DMIs). These incompatibilities occur when gene divergence affects loci encoding interacting products such as receptors and their ligands.
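The V gene and V pseudogene criteria described above, together with the 80% identity threshold for family assignment, can be sketched as a small rule set. This is only an illustrative sketch, not the authors' pipeline: the `VSegment` record, its field names, the `"unclassified"` fallback for cases the text does not cover (for example, an otherwise intact sequence shorter than 291 bp), and the use of the standard consensus heptamer `CACAGTG` as the comparison reference are all assumptions made for this example. The identity helper also assumes the two coding exon sequences have already been aligned to equal length.

```python
from dataclasses import dataclass

# Assumption: the standard consensus heptamer is used as the reference.
CONSENSUS_HEPTAMER = "CACAGTG"
MIN_V_GENE_LENGTH_BP = 291        # minimum length stated in the text
FAMILY_IDENTITY_THRESHOLD = 80.0  # percent nucleotide identity stated in the text


@dataclass
class VSegment:
    """Hypothetical record of pre-annotated features for one V sequence."""
    has_atg: bool
    has_stop_codon: bool
    has_frameshift: bool
    canonical_splice: bool  # recorded, but not used to demote a gene (see classify)
    heptamer: str
    nonamer: str
    spacer_length: int
    length_bp: int


def rss_significantly_altered(seg: VSegment) -> bool:
    """Apply the four RSS alteration rules listed in the text."""
    if abs(seg.spacer_length - 23) > 1:          # spacer outside 23 +/- 1 bp
        return True
    if seg.heptamer[:3].upper() != "CAC":        # any change in the first three nt
        return True
    rest = zip(seg.heptamer[3:].upper(), CONSENSUS_HEPTAMER[3:])
    if sum(a != b for a, b in rest) > 2:         # > 2 mismatches elsewhere
        return True
    return seg.nonamer.upper().count("G") > 1    # > 1 G residue in the nonamer


def classify(seg: VSegment) -> str:
    """Return 'pseudogene', 'gene', or 'unclassified' (the last is an assumption)."""
    if (seg.has_stop_codon or seg.has_frameshift or not seg.has_atg
            or rss_significantly_altered(seg)):
        return "pseudogene"
    if seg.length_bp >= MIN_V_GENE_LENGTH_BP:
        # A noncanonical splice junction alone still counts as a coding gene.
        return "gene"
    return "unclassified"


def percent_identity(a: str, b: str) -> float:
    """Percent identity between two pre-aligned, equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    return 100.0 * sum(x == y for x, y in zip(a.upper(), b.upper())) / len(a)


def same_family(a: str, b: str) -> bool:
    """True if two aligned coding exon sequences meet the 80% threshold."""
    return percent_identity(a, b) >= FAMILY_IDENTITY_THRESHOLD
```

For instance, a sequence with an intact ATG, no stop codons or frameshifts, a consensus RSS with a 23 bp spacer, and a length of 300 bp would be classed as a gene under this sketch, whether or not its splice junction is canonical.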