U.S. Department of Energy

Pacific Northwest National Laboratory

Comprehensive, Quantitative, Intact Proteoform Measurements of Patient-Derived Breast Tumor Xenografts Using an Improved Top-Down Proteomics Pipeline

The widely used bottom-up proteomics approaches are inherently limited by the “peptide-to-protein” inference problem; moreover, they cannot identify or distinguish many of the processed and/or modified versions of gene products whose coverage is important for understanding the development and treatment of cancers. To address this need, we have recently developed a significantly improved top-down proteomics pipeline that allows for comprehensive, quantitative intact proteoform measurements. Applying this pipeline to the comparative analysis of patient-derived xenograft (PDX) models of basal and luminal B human breast cancer yielded both comprehensive proteome coverage and precise, robust intact proteoform quantification, resulting in novel biological insights that are inaccessible to bottom-up analysis approaches.

Intact proteoforms were extracted from the breast cancer PDX samples using mechanical homogenization in 6 M guanidine, followed by enrichment of the lower-mass proteome and desalting in centrifugal concentration devices with molecular weight cutoffs of 100 kDa and 10 kDa, respectively. The resulting protein samples were separated on a 75 µm × 80 cm C2 column using a 3-h gradient and analyzed using a Thermo Scientific Orbitrap Velos Elite mass spectrometer (240K MS; Top-3 CID MS/MS at 60K). Data were analyzed with an entirely new top-down informatics pipeline, MSPathFinder, which includes ProMex for feature finding, a sequence graph for proteoform exploration, and LcMsSpectator for visualizing and editing results. Proteoforms were quantified using the area under the curve of their LC-MS features.
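As an illustration of area-under-the-curve quantification, the sketch below integrates a single elution profile with the trapezoidal rule. It is a minimal example, not the ProMex implementation: the function name `feature_area`, the input layout (retention times in minutes paired with summed feature intensities), and the choice of trapezoidal integration are all assumptions made here.

```python
def feature_area(times_min, intensities):
    """Trapezoidal area under one LC-MS feature's elution profile.

    times_min   -- retention times in minutes, in ascending order (assumed layout)
    intensities -- summed feature intensity at each retention time (assumed layout)
    """
    if len(times_min) != len(intensities):
        raise ValueError("times and intensities must have the same length")
    area = 0.0
    for i in range(1, len(times_min)):
        dt = times_min[i] - times_min[i - 1]
        # Average of two neighbouring intensities times the time step.
        area += 0.5 * dt * (intensities[i] + intensities[i - 1])
    return area


# Hypothetical elution profile: a symmetric triangular peak.
profile_times = [10.0, 10.5, 11.0, 11.5, 12.0]
profile_intensities = [0.0, 2.0e5, 4.0e5, 2.0e5, 0.0]
peak_area = feature_area(profile_times, profile_intensities)
```

The resulting area is in intensity × minutes; only relative areas between samples matter for comparative quantification.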

Preliminary data 
Application of the improved top-down proteomics pipeline to the characterization of PDX breast tumor samples identified an average of 1077 proteoforms from 405 unique proteins in each of the 5 replicates of the two different PDX samples; a total of 3123 proteoforms across 867 proteins were identified from all 10 replicates. Moreover, the new pipeline provided precise, robust quantification of the proteoforms. The Pearson correlation of proteoform peak areas between replicates within each subtype was 0.70 to 0.80, and >90% of the detected LC-MS features had a CV of <30%, demonstrating excellent reproducibility in the quantitative top-down measurements. Overall, our top-down proteomics pipeline provided significant improvements in proteome coverage, precision of quantification, and the ability to identify statistically significant changes in proteoform abundances, in comparison to another recent report that analyzed proteoforms in the same PDX samples. Although still relatively limited in coverage compared to the conventional bottom-up approach, top-down analysis provides a complementary view of the proteome (e.g., truncated and/or post-translationally modified proteoforms, variants, additional protein identifications).
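The two reproducibility metrics quoted above can be computed as in the following sketch. The helper names, the use of the sample standard deviation for the CV, and the example peak areas are assumptions for illustration; the abstract does not state whether correlations were computed on raw or log-transformed areas.

```python
import math
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation (%) using the sample standard deviation."""
    return 100.0 * stdev(values) / mean(values)


def pearson(x, y):
    """Pearson correlation between two equal-length lists of peak areas."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fraction_below_cv(feature_areas, threshold=30.0):
    """Fraction of features whose replicate CV is below `threshold` percent.

    feature_areas -- dict mapping a feature id to its replicate peak areas
    """
    cvs = [cv_percent(areas) for areas in feature_areas.values()]
    return sum(cv < threshold for cv in cvs) / len(cvs)


# Hypothetical peak areas for two features across 5 replicates.
areas = {
    "feature_a": [100.0, 110.0, 90.0, 105.0, 95.0],   # tight replicates
    "feature_b": [10.0, 100.0, 50.0, 5.0, 200.0],     # noisy replicates
}
share_reproducible = fraction_below_cv(areas)
```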

The proteoform quantitation results showed striking differences between the basal and luminal breast cancer subtypes. Both unsupervised hierarchical clustering and principal component analysis readily separated the replicates by the subtypes they represent. Statistical analysis showed that a total of 1780 out of 3123 proteoforms were differentially expressed between the two subtypes (adjusted p value <0.01 and fold change >2). Interestingly, top-down, bottom-up, and peptidomics analyses of the same PDX samples showed the same degree of change in protein/peptide abundances between the two subtypes. The top-down measurements provided unique information on the up-regulation of many proteoforms of histones and 60S ribosomal proteins, as well as key pathway-level changes (e.g., glycolysis), in the luminal subtype, showing great promise for providing new insights into cancer biology.
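The differential-expression criterion (adjusted p value <0.01 and fold change >2) can be sketched as follows. The abstract does not name the statistical test or the multiple-testing correction, so this example takes per-proteoform p values as given and assumes a Benjamini-Hochberg adjustment; the function names and the input rows are hypothetical. A fold change >2 is treated as a change in either direction.

```python
import math


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def differential(rows, p_cut=0.01, fc_cut=2.0):
    """Names of proteoforms passing both thresholds.

    rows -- list of (name, mean_area_basal, mean_area_luminal, p_value)
    """
    adjusted = benjamini_hochberg([p for *_, p in rows])
    hits = []
    for (name, basal, luminal, _), q in zip(rows, adjusted):
        log2_fc = math.log2(luminal / basal)
        if q < p_cut and abs(log2_fc) > math.log2(fc_cut):
            hits.append(name)
    return hits


# Hypothetical proteoforms: (name, basal mean, luminal mean, raw p value).
rows = [
    ("A", 100.0, 400.0, 0.0001),  # 4-fold up in luminal, significant
    ("B", 100.0, 150.0, 0.0001),  # significant, but only 1.5-fold
    ("C", 100.0, 500.0, 0.2),     # 5-fold, but not significant
    ("D", 400.0, 100.0, 0.001),   # 4-fold down in luminal, significant
]
changed = differential(rows)
```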

Novel aspect 
This improved top-down proteomics pipeline provides superior proteome coverage and precise, robust quantification of the proteoforms of the PDX samples.
