Genome Browser

Status

Version 1.0 is the latest assembly and is mirrored by the Ensembl and UCSC genome browsers. The other versions are provided mainly for archival purposes and should not be used unless you have a specific reason (e.g. you already have your own annotations for an older assembly).

The Medaka Genome Sequencing Project

As one of the important targets of the group grant project Genome Science (Grant-in-Aid for Scientific Research on Priority Areas supported by the Ministry of Education, Culture, Sports, Science and Technology of Japan), we started sequencing the medaka genome at the Academia Sequencing Center of the National Institute of Genetics (NIG) in mid-2002. The strain we chose was a southern inbred strain, Hd-rR, and sequencing was conducted using the whole-genome shotgun strategy.

The genome of Hd-rR, an inbred medaka strain, was assembled from 13.8 million reads obtained from whole-genome shotgun plasmid, fosmid, and bacterial artificial chromosome (BAC) libraries. The total size of the assembled contigs was 700.4 megabases (Mb). Of these 700.4 Mb, 50% of the nucleotides lie in scaffolds of at least 1.41 Mb and in contigs of at least 9.8 kilobases (kb); these lengths are the N50 values for scaffolds and contigs, respectively. This contiguity is sufficient to characterize the genomic structures of genes.
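The N50 statistic can be illustrated with a minimal sketch. It assumes the common definition (the largest length L such that sequences of length L or more contain at least half of all assembled bases). The function name n50 and the toy lengths are hypothetical, chosen only for illustration, and are not taken from the medaka assembly or from the project's own software.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Assumed definition: sort lengths from longest to shortest and
    accumulate them; the N50 is the length at which the running total
    first reaches half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example with made-up scaffold lengths (not medaka data).
toy_scaffolds = [2, 2, 2, 3, 3, 4, 8, 8]
toy_n50 = n50(toy_scaffolds)  # total is 32; the two longest (8 + 8) reach 16
```

Applied to the lengths of all scaffolds, or of all contigs, in an assembly, the same calculation yields the scaffold N50 and the contig N50.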

The medaka genome sequence data have been released to the public four times to meet urgent requests from the medaka research community. The four versions, named 200406, 200506, version 0.9, and version 1.0, were created to provide users with timely information. The first two versions had shorter scaffolds that were not anchored on the medaka chromosomes because they were built in 2004 and 2005, before genetic markers were available. Versions 0.9 and 1.0 were created in 2006, when comprehensive genetic markers were available, so that about 90% of their scaffolds and ultracontigs were located on the twenty-four medaka chromosomes. Versions 0.9 and 1.0 were built from identical contigs and scaffolds, but the assembly of version 1.0 is longer than that of version 0.9 because more genetic markers could be used to generate version 1.0. Version 0.9 remains publicly available because most of the data analysis in the medaka genome paper was based on it. In these two versions, two scaffolds linked by a single BAC are connected into one ultracontig if the connection is consistent with the genetic markers. The N50 value of ultracontigs in version 1.0 is 5.1 Mb, excluding gaps. This high contiguity, together with the ample number of confirmed genetic markers in our database, should accelerate positional cloning.

In addition to the genome of the Hd-rR medaka inbred strain, the genome of another inbred strain, HNI, was sequenced to produce the draft 648-Mb HNI genome. The inbred strains Hd-rR and HNI originated in the southern and northern Japanese populations, respectively. They can mate and produce healthy offspring, although they are estimated to have diverged about 4 million years ago, and their genome sequences have diverged by approximately 3.42%. The alignment of the two medaka genomes identified about 16.4 million single nucleotide polymorphisms (SNPs), from which 2,401 SNPs were selected and mapped genetically onto medaka chromosomes using a backcross panel between these two inbred strains. These genetic SNP markers, together with 140 simple sequence length polymorphism (SSLP) and restriction fragment length polymorphism (RFLP) markers, were used to anchor scaffolds on chromosomes and thereby construct the medaka chromosome map. These confirmed markers should be useful for isolating genes of interest by positional cloning. To make the markers easy to use, our database contains PCR primers for 2,473 markers, along with the genetic distances between the markers and their locations on the chromosomes.

Project Core Members

  • Yuji Kohara (Shotgun sequencing)
    • Center for Genetic Resource Information, National Institute of Genetics
  • Shinichi Morishita (Bioinformatics Analysis / Genome assembler)
    • Graduate School of Frontier Sciences, University of Tokyo
  • Hiroyuki Takeda (Materials, SNP and EST mapping, and medaka biology)
    • Graduate School of Science, University of Tokyo

Contributions

  • Yuji Kohara, Shinichi Morishita and Hiroyuki Takeda designed research.

Sequencing (Center for Genetic Resource Information, National Institute of Genetics)

  • Kazuko Ohishi, Shinobu Haga, Fumiko Ohta, Hisayo Nomoto, Keiko Nogata, Tomomi Morishita, Tomoko Endo, Tadasu Shin-I and Yuji Kohara sequenced medaka shotgun libraries.

Bioinformatics Analysis (Department of Computational Biology, Graduate School of Frontier Sciences, The University of Tokyo)

  • Masahiro Kasahara and Shin Sasaki assembled WGS fragments using their in-house WGS assembler Ramen, and designed highly specific primers for SNP markers.
  • Shin Sasaki analyzed polymorphisms and genomic alterations.
  • Yoichiro Nakatani estimated the teleost genome evolution.
  • Wei Qu analyzed homologues, orthologues, paralogues, and local gene duplications.
  • Budrul Ahsan made gene predictions using 5' SAGE TSS data and identified non-coding genes, including novel miRNA candidates.
  • Tomoyuki Yamada elucidated novel repetitive elements using his software.
  • Masahiro Kasahara, Wei Qu, Budrul Ahsan and Tomoyuki Yamada validated specificity and efficiency of primers.
  • Budrul Ahsan, Daisuke Kobayashi, Tomoyuki Yamada, Shin Sasaki, Taro L. Saito, Yukinobu Nagayasu, Yasuhiko Kasai, and Koichiro Doi developed the medaka genome browser.
  • Shinichi Morishita supervised bioinformatics analyses.

Materials, SNP and EST mapping, and medaka biology (Department of Biological Sciences, Graduate School of Science, The University of Tokyo)

  • Kiyoshi Naruse, Takanori Narita and Hiroyuki Takeda constructed the fosmid and 7.5 kb plasmid libraries.
  • Kiyoshi Naruse, Tomoko Jindo and Hiroyuki Takeda constructed a high-density SNP map.
  • Daisuke Kobayashi and Hiroyuki Takeda analyzed the expression of medaka novel genes and miRNA.

BAC libraries (RIKEN Genomic Sciences Center / National Institute of Informatics)

  • Atsushi Toyoda, Yoko Kuroki and Asao Fujiyama constructed BAC libraries and sequenced BAC ends.

BAC libraries (Department of Molecular Biology, Keio University School of Medicine)

  • Takashi Sasaki, Atsushi Shimizu, Shuichi Asakawa, and Nobuyoshi Shimizu constructed another class of BAC libraries and sequenced BAC ends.

Hd-rR and HNI strains (Department of Environmental Science, Faculty of Science, Niigata University)

  • Atsuko Shimada and Mitsuru Sakaizumi provided a continuous supply of the Hd-rR and HNI strains.
  • Mitsuru Sakaizumi constructed a typing panel with Hd-rR and Kunming strains.

Acknowledgements

This work has been supported by a Grant-in-Aid for Scientific Research on Priority Areas (Grant #12209003) to Shinichi Morishita.

The members of the Ramen Assembler Development Team are indebted to Yuji Kohara and Tadasu Shin-I for their technical discussions on whole-genome shotgun assembly.

The members of the UT Genome Browser Development Team are grateful to Kiyoshi Naruse, Daisuke Kobayashi, and Takanori Narita for their valuable input, which improved the functions of the browser in a variety of ways.