Data construction

Illumina short-read re-sequencing data for 176 rice accessions (1) were obtained from the NCBI SRA. Low-quality bases and adapter sequences were removed from each read using Trimmomatic (v0.36) (2). Reads were mapped to the Nipponbare reference genome (IRGSP-1.0) (3) with BWA (v0.7.15) (4), and local realignment was performed using GATK (v3.7) (5). After PCR duplicates were removed with Picard (v2.9.0) (http://picard.sourceforge.net), variants were called with GATK HaplotypeCaller. The effect of each variant was annotated using snpEff (6). Rice genome annotations from RAP-DB (7) and MSU release 7 (3) were used for the snpEff analysis and for the annotation tracks. A genome-wide association study (GWAS) was conducted on the SNVs and the phenotypic data for seven traits (1) using a linear mixed model (LMM) implemented in FaST-LMM (8).
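
The per-sample steps from trimming to variant calling could be laid out roughly as in the following Python sketch. It only assembles command lines and is not the authors' actual script. The file names, jar locations, paired-end mode, Trimmomatic thresholds, read-group string, the use of `bwa mem`, and the added Picard SortSam step are all assumptions made for illustration. The text does not state the exact parameters, and the ordering of realignment and duplicate removal follows the order in which the text mentions them. GATK also expects the reference to have `.fai` and `.dict` index files, which are assumed to exist.

```python
import subprocess
from typing import List, NamedTuple, Optional


class Step(NamedTuple):
    argv: List[str]               # command and arguments
    stdout: Optional[str] = None  # file that captures stdout, if any


def build_pipeline(sample: str, ref: str = "IRGSP-1.0.fa") -> List[Step]:
    """Assemble the per-sample commands (all paths are hypothetical)."""
    gatk = ["java", "-jar", "GenomeAnalysisTK.jar"]  # GATK 3.x launcher
    picard = ["java", "-jar", "picard.jar"]
    p1, p2 = f"{sample}_1.paired.fq.gz", f"{sample}_2.paired.fq.gz"
    u1, u2 = f"{sample}_1.unpaired.fq.gz", f"{sample}_2.unpaired.fq.gz"
    # bwa expects escaped tabs (backslash + t), not literal tab characters
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:ILLUMINA"
    return [
        # 1. Trim adapters and low-quality bases (thresholds are assumptions)
        Step(["java", "-jar", "trimmomatic-0.36.jar", "PE",
              f"{sample}_1.fq.gz", f"{sample}_2.fq.gz", p1, u1, p2, u2,
              "ILLUMINACLIP:adapters.fa:2:30:10",
              "SLIDINGWINDOW:4:15", "MINLEN:36"]),
        # 2. Map to the reference; bwa mem writes SAM to stdout
        Step(["bwa", "mem", "-R", read_group, ref, p1, p2], f"{sample}.sam"),
        # 3. Coordinate-sort and index (assumed helper step, not in the text)
        Step(picard + ["SortSam", f"I={sample}.sam", f"O={sample}.sorted.bam",
                       "SORT_ORDER=coordinate", "CREATE_INDEX=true"]),
        # 4-5. Local realignment around indels (GATK 3.x tools)
        Step(gatk + ["-T", "RealignerTargetCreator", "-R", ref,
                     "-I", f"{sample}.sorted.bam",
                     "-o", f"{sample}.intervals"]),
        Step(gatk + ["-T", "IndelRealigner", "-R", ref,
                     "-I", f"{sample}.sorted.bam",
                     "-targetIntervals", f"{sample}.intervals",
                     "-o", f"{sample}.realigned.bam"]),
        # 6. Remove PCR duplicates
        Step(picard + ["MarkDuplicates", f"I={sample}.realigned.bam",
                       f"O={sample}.dedup.bam", f"M={sample}.dup_metrics.txt",
                       "REMOVE_DUPLICATES=true", "CREATE_INDEX=true"]),
        # 7. Call variants
        Step(gatk + ["-T", "HaplotypeCaller", "-R", ref,
                     "-I", f"{sample}.dedup.bam", "-o", f"{sample}.vcf"]),
    ]


def run(steps: List[Step]) -> None:
    """Execute the steps in order, stopping at the first failure."""
    for step in steps:
        if step.stdout is None:
            subprocess.run(step.argv, check=True)
        else:
            with open(step.stdout, "w") as out:
                subprocess.run(step.argv, stdout=out, check=True)
```

Calling `build_pipeline("sample01")` returns seven steps that can be inspected or logged before anything is executed, and `run(...)` would execute them if the tools and input files are present. The snpEff annotation and FaST-LMM association steps are left out because they depend on a custom annotation database and on phenotype files that are not described here.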

Citation

1. Yano K, Yamamoto E, Aya K, et al (2016) Genome-wide association study using whole-genome sequencing rapidly identifies new genes influencing agronomic traits in rice. Nat Genet 48:927-934.
2. Bolger AM, Lohse M, Usadel B (2014) Trimmomatic: A flexible trimmer for Illumina sequence data. Bioinformatics 30:2114-2120.
3. Kawahara Y, de la Bastide M, Hamilton JP, et al (2013) Improvement of the Oryza sativa Nipponbare reference genome using next generation sequence and optical map data. Rice (N Y) 6:4.
4. Li H, Durbin R (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754-1760.
5. Van der Auwera GA, Carneiro MO, Hartl C, et al (2013) From FastQ data to high confidence variant calls: the Genome Analysis Toolkit best practices pipeline. Curr Protoc Bioinforma 43:11.10.1-33.
6. Cingolani P, Platts A, Wang LL, et al (2012) A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin) 6:80-92.
7. Sakai H, Lee SS, Tanaka T, et al (2013) Rice Annotation Project Database (RAP-DB): an integrative and interactive database for rice genomics. Plant Cell Physiol 54:e6.
8. Lippert C, Listgarten J, Liu Y, et al (2011) FaST linear mixed models for genome-wide association studies. Nat Methods 8:833-835.