Bpop: an efficient program for estimating base population allele frequencies in single and multiple group structured populations
Base population allele frequencies (AF) should be used in genomic evaluations. A program named Bpop was implemented to estimate base population AF using a generalized least squares (GLS) method when the base population individuals can be assigned to groups. The required products $\mathbf{A}_{22}^{-1}\mathbf{v}$, which involve a dense matrix, were implemented efficiently using sparse submatrices of $\mathbf{A}^{-1}$, where $\mathbf{A}$ and $\mathbf{A}_{22}$ are the pedigree relationship matrices for all animals and for genotyped animals, respectively. Three approaches were implemented: iteration on pedigree (IOP), iteration in memory (IM), and direct inversion by sparsity-preserving Cholesky decomposition (CHM). The test data had 1.5 million animals genotyped using 50,240 markers. Total computing time (with the time for the product $\mathbf{A}_{22}^{-1}\mathbf{1}$ in parentheses) was 53 min (1.2 min) by IOP, 51 min (0.3 min) by IM, and 56 min (4.6 min) by CHM. Peak memory use was 0.67 GB by IOP, 0.80 GB by IM, and 7.53 GB by CHM. Thus, the IOP and IM approaches can be recommended for large data sets because of their low memory use and computing time.
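The idea of working with sparse submatrices of the inverse pedigree relationship matrix can be illustrated on a toy pedigree. The sketch below is not Bpop's code. It relies on the standard block-inverse identity $\mathbf{A}_{22}^{-1} = \mathbf{A}^{22} - \mathbf{A}^{21}(\mathbf{A}^{11})^{-1}\mathbf{A}^{12}$, where superscripts denote blocks of $\mathbf{A}^{-1}$, block 1 holds the non-genotyped animals, and block 2 holds the genotyped animals. Several things are assumptions made for illustration: the single-group estimator $\hat{p} = \mathbf{1}'\mathbf{A}_{22}^{-1}\mathbf{m} / (2\,\mathbf{1}'\mathbf{A}_{22}^{-1}\mathbf{1})$ with genotypes $\mathbf{m}$ coded as 0/1/2 allele counts, a pedigree without inbreeding, a small dense exact solve standing in for the IOP, IM, and CHM approaches, and all function names and the five-animal pedigree, which are hypothetical.

```python
from fractions import Fraction


def a_inverse(pedigree):
    """Build A^{-1} by Henderson's rules, assuming no inbreeding.

    pedigree: list of (sire, dam) with 1-based ids, 0 = unknown parent,
    and parents listed before their offspring.
    """
    n = len(pedigree)
    ainv = [[Fraction(0)] * n for _ in range(n)]
    for i, parents in enumerate(pedigree):
        known = [p - 1 for p in parents if p > 0]
        alpha = {0: Fraction(1), 1: Fraction(4, 3), 2: Fraction(2)}[len(known)]
        ainv[i][i] += alpha
        for p in known:
            ainv[i][p] -= alpha / 2
            ainv[p][i] -= alpha / 2
            for q in known:
                ainv[p][q] += alpha / 4
    return ainv


def solve(mat, rhs):
    """Solve mat x = rhs by Gauss-Jordan elimination with exact fractions."""
    n = len(mat)
    aug = [row[:] + [b] for row, b in zip(mat, rhs)]
    for c in range(n):
        piv = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c] / aug[c][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [aug[r][n] / aug[r][r] for r in range(n)]


def a22_inv_times(ainv, geno, v):
    """Return A22^{-1} v using only blocks of A^{-1}; A22 is never formed.

    geno: 0-based indices of the genotyped animals.
    """
    gset = set(geno)
    non = [i for i in range(len(ainv)) if i not in gset]
    # t = A^{12} v
    t = [sum(ainv[i][g] * x for g, x in zip(geno, v)) for i in non]
    # s = (A^{11})^{-1} t
    a11 = [[ainv[i][j] for j in non] for i in non]
    s = solve(a11, t) if non else []
    # A^{22} v - A^{21} s
    return [
        sum(ainv[g][h] * x for h, x in zip(geno, v))
        - sum(ainv[g][i] * y for i, y in zip(non, s))
        for g in geno
    ]


def base_allele_frequency(ainv, geno, counts):
    """Single-group GLS estimate of the base AF for one marker (assumed form)."""
    ones = [Fraction(1)] * len(geno)
    w = a22_inv_times(ainv, geno, ones)  # A22^{-1} 1, reusable across markers
    return sum(wi * m for wi, m in zip(w, counts)) / (2 * sum(w))


# Hypothetical pedigree: animals 1 and 2 are base animals, 3 and 4 are
# full sibs from 1 x 2, and 5 is an offspring of 3 with an unknown mate.
pedigree = [(0, 0), (0, 0), (1, 2), (1, 2), (3, 0)]
ainv = a_inverse(pedigree)
geno = [2, 3, 4]      # animals 3, 4, 5 are genotyped
counts = [2, 1, 1]    # allele counts (0/1/2) at one marker
w = a22_inv_times(ainv, geno, [Fraction(1)] * 3)
p = base_allele_frequency(ainv, geno, counts)
print(p)  # 3/5
```

In this toy case the GLS estimate (3/5) differs from the raw mean of the genotypes (2/3) because related genotyped animals are down-weighted. The vector $\mathbf{A}_{22}^{-1}\mathbf{1}$ is computed once and, under the assumed estimator, can be reused for every marker, which is consistent with the abstract reporting the time for that product separately.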