Analysis of Compositional Bias in a Commercial Phage Display Peptide Library by Next-Generation Sequencing

Research output: Contribution to journal › Journal article › Research › peer-review


The principal presumption of phage display biopanning is that the naïve library contains an unbiased repertoire of peptides, and thus that the enriched variants derive from affinity selection of an entirely random peptide pool. In the current study, we utilized deep sequencing to characterize the widely used Ph.D.™-12 phage display peptide library (New England Biolabs). The next-generation sequencing (NGS) data indicated the presence of stop codons and a high abundance of wild-type clones in the naïve library, which together reduce the effective size of the library. Analysis of the DNA sequence logo and of the global and position-specific frequencies of amino acids demonstrated significant bias in the nucleotide and amino acid composition of the library inserts. Principal component analysis (PCA) uncovered four distinct clusters in the naïve library, and investigation of the peptide frequency distribution revealed a broad range of unequal peptide abundances. Taken together, our data provide strong evidence that the naïve library exhibits substantial departures from randomness at the nucleotide, amino acid, and peptide levels, despite not having undergone any selective pressure for target binding. This non-uniform sequence representation arises from both M13 phage biology and technical errors in library construction. Our findings highlight the paramount importance of the qualitative assessment of naïve phage display libraries prior to biopanning.
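
The kinds of summary statistics named in the abstract (stop-codon content, global and position-specific amino acid frequencies, and peptide abundances) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `translate`, `summarize`, and `nnk_expected` are hypothetical, the inserts are assumed to be already extracted from the reads as in-frame DNA strings, and the NNK randomization scheme used for the reference frequencies is an assumption that does not come from the text above.

```python
from collections import Counter
from itertools import product

# Standard genetic code, bases ordered T, C, A, G (NCBI translation table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate an in-frame DNA insert; '*' is a stop, 'X' an unknown codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


def summarize(inserts):
    """Hypothetical helper: basic composition statistics for a list of inserts."""
    peptides = [translate(s) for s in inserts]
    n = len(peptides)
    # Fraction of inserts whose translation contains at least one stop codon.
    stop_fraction = sum("*" in p for p in peptides) / n
    # Global amino acid frequencies pooled over all positions.
    pooled = Counter("".join(peptides))
    total = sum(pooled.values())
    global_freq = {aa: c / total for aa, c in pooled.items()}
    # Position-specific frequencies, up to the shortest peptide length.
    length = min(len(p) for p in peptides)
    position_freq = [
        {aa: c / n for aa, c in Counter(p[i] for p in peptides).items()}
        for i in range(length)
    ]
    # Peptide abundance distribution (copies observed per unique peptide).
    abundance = Counter(peptides)
    return stop_fraction, global_freq, position_freq, abundance


def nnk_expected():
    """Expected residue frequencies if every codon were drawn uniformly
    from NNK (K = G or T). The NNK scheme is an assumption of this sketch."""
    counts = Counter(aa for codon, aa in CODON_TABLE.items() if codon[2] in "GT")
    total = sum(counts.values())
    return {aa: c / total for aa, c in counts.items()}
```

Comparing `global_freq` or each entry of `position_freq` against `nnk_expected()` gives a simple view of departure from the assumed random composition, and a heavily skewed `abundance` counter corresponds to the unequal peptide abundances described above.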

Original language: English
Article number: 2402
Issue number: 11
Publication status: Published - 2022

Bibliographical note

Publisher Copyright:
© 2022 by the authors.

    Research areas

  • biopanning, compositional bias, deep sequencing, departure from randomness, M13 phage, next-generation sequencing, Ph.D.-12 peptide library, phage display, principal component analysis

ID: 326740841