Abstract
Transfer of hepatitis C virus (HCV) infection from a donor to a new recipient is associated with a bottleneck of genetic diversity in the transmitted viral variants. Existing data suggest that one, or very few, variants emerge from this bottleneck to establish the infection (transmitted founder [T/F] variants). In HCV, very few T/F variants have been characterized because of the challenges of obtaining early infection samples and of high-throughput viral genome sequencing. This study used a large deep-sequenced dataset of acute HCV infections, derived from first viremia samples collected in nine prospective cohorts across four countries, to estimate the prevalence of single T/F viruses and to identify host and virus-related factors associated with infections initiated by a single T/F variant. The short reads generated by Illumina sequencing were used to reconstruct viral haplotypes with two haplotype reconstruction algorithms. To identify T/F viruses, the haplotypes were examined for random mutations (Poisson distribution) and a star-like phylogeny. The findings were cross-validated by haplotype reconstructions across three regions of the genome (Core-E2, NS3, NS5A) to minimize the possibility of spurious overestimation of single T/F variants. Of 190 acute infection samples examined, 54 were very early acute infections (HCV antibody negative, RNA positive), and a single T/F virus was identified in 14 of these (26%, 95% CI: 16-39%) after cross-validation across multiple regions of the genome with two haplotype reconstruction algorithms. The presence of a single T/F virus was not associated with any host or virus-related factors, notably viral genotype or spontaneous clearance. In conclusion, approximately one in four new HCV infections originates from a single T/F virus. Resolution of genomic sequences of single T/F variants is the first step in exploring unique properties of these variants in the infection of host hepatocytes.
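
The reported prevalence and its confidence interval can be checked from the counts given above (14 of 54 very early acute infections). The abstract does not state which interval method was used; the sketch below assumes a Wilson score interval, which is consistent with the rounded bounds reported. The function name `wilson_interval` is illustrative and not taken from the study.

```python
from math import sqrt


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion.

    Assumption: the study's interval method is not stated in the
    abstract; Wilson is used here only as a plausible choice.
    z = 1.96 corresponds to a two-sided 95% interval.
    """
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Counts taken from the abstract: 14 single T/F infections among 54 samples.
low, high = wilson_interval(14, 54)
print(f"{14 / 54:.0%} (95% CI: {low:.0%}-{high:.0%})")
```

With only 54 samples the interval is wide, so the "one in four" estimate is best read together with its bounds.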