U.S. Department of Energy

Pacific Northwest National Laboratory


All of our open source software is cross-posted at our group's GitHub Repository.

Software Category: Featured Tools

  • Used to de-isotope mass spectra and to detect features from mass spectrometry data using observed isotopic signatures.

  • Reduces mass measurement errors for parent ions of tandem MS/MS data by modeling systematic errors based on putative peptide identifications. This information is used to subtract out errors from parent ion protonated masses.

  • GlyQ-IQ is software that performs a targeted, chromatographic centric search of mass spectral data for glycans.

  • InfernoRDN can perform various downstream data analysis, data reduction, and data comparison tasks including normalization, hypothesis testing, clustering, and heatmap generation.

  • Aligns multiple LC-MS datasets to one another, after which LC-MS features can be matched to a database of peptides (typically an AMT tag database).

  • SIPPER can be used to automatically detect and quantify partially labeled C13 peptides.

  • VIPER (Visual Inspection of Peak/Elution Relationships) can be used to visualize and characterize the features detected during LC-MS analyses.

Software Category: Data Analysis and Data Presentation Tools

  • Active Data Canvas is a web-based visual analytics tool for visualizing a data matrix (expression matrix) and for letting users interactively identify the structured domain knowledge (e.g., pathways and other gene sets) linked to a cluster.

  • The Residue Frequency Summarizer is a VB.NET command-line utility that reads in a text file or fasta file containing peptide or protein sequences and prepares statistics on the occurrence of each amino acid residue throughout the file. Statistics include the number of sequences containing each amino acid and the occurrence percentage across all residues for each amino acid.

  • APE (Automated Processing Engine) provides a flexible platform for automation, standardization, documentation, QA/QC, and optimization of high-throughput data processing steps.

  • Used to perform various downstream data analysis, data reduction, and data comparison steps including normalization, hypothesis testing and clustering.  Note that InfernoRDN is the suggested replacement for DanteR (due to installation issues with DanteR).

  • Windows graphical user interface tool for viewing LC-MS data and identifications.

  • The Protein Coverage Summarizer can be used to determine the percent of the residues in each protein sequence that have been identified.

  • A high-performance multiprocessor implementation of the NCBI BLAST library.

  • Draws correctly proportioned and positioned two- and three-circle Venn diagrams (aka Euler diagrams). Colors can be customized, and the diagrams can be copied to the clipboard or saved to disk.

  • The Visual Integration for Bayesian Evaluation (VIBE) software is a visualization tool that allows the user to observe classification accuracies at the class level and evaluate classification accuracies on any subset of available data types based on the posterior probability models defined for the individual and integrated data.
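
As an illustration of the percent-coverage calculation described for the Protein Coverage Summarizer in the list above, here is a minimal Python sketch. It is not the tool's actual implementation: the function name percent_coverage, the exact-substring matching, and the example sequences are all assumptions made for illustration.

```python
def percent_coverage(protein: str, peptides: list[str]) -> float:
    """Return the percentage of residues in `protein` covered by any peptide.

    Hypothetical helper: peptides are matched as exact substrings, and
    every occurrence of a peptide in the protein counts toward coverage.
    """
    if not protein:
        return 0.0
    covered = [False] * len(protein)
    for peptide in peptides:
        if not peptide:
            continue  # an empty peptide covers nothing
        start = protein.find(peptide)
        while start != -1:
            for i in range(start, start + len(peptide)):
                covered[i] = True
            # Keep searching so overlapping occurrences are also found
            start = protein.find(peptide, start + 1)
    return 100.0 * sum(covered) / len(protein)


# Made-up example sequence and peptides
protein = "MKTAYIAKQRQISFVKSHFSRQ"
print(percent_coverage(protein, ["TAYIAK", "SFVK"]))
```

Tracking coverage per residue, rather than summing peptide lengths, keeps overlapping or repeated peptides from being counted twice.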

Software Category: Fasta File, Protein Sequence, or Protein Database Related tools

  • The Fasta File Comparer program will compare portions of the protein sequences in the source Fasta file with those in the comparison Fasta file and write the matching proteins and sequences to an output text file.

  • Console application that reads a protein FASTA file and splits it apart into a number of sections. Although the splitting is random, each section will have a nearly identical number of residues.

  • Evaluates data quality metrics to fit logistic regression classification models to quality control (QC) LC-MS datasets to predict whether a dataset is in or out of control.

  • MSPolygraph is an open source hybrid database and spectral library MS/MS search engine that runs in either serial or parallel modes.

  • The NET Prediction Utility can be used to compute predicted normalized elution time (NET) values for a list of peptide sequences.

  • The PepAligner program can read a file containing peptides and align them to a file of protein sequences (.Fasta or delimited text) using Smith-Waterman alignment.

  • The Peptide Fragmentation Modeller is a command-line utility that reads in a text file of peptide sequences and generates the theoretical fragmentation pattern for each, outputting the results in a single concatenated DTA file or in separate .Dta files.

  • A Population Variation plug-in for the Skyline software program that can assist researchers in determining whether their target peptides have known mutations in the general human population.

  • The Protein Digestion Simulator can be used to read a text file containing protein or peptide sequences (FASTA format or delimited text) then output the data to a tab-delimited file.

  • The Protein Sequence Motif Extractor reads a FASTA file or tab-delimited file containing protein sequences, then looks for the specified motif in each protein sequence.

  • The SCX Prediction Utility can be used to compute the predicted strong cation exchange (SCX) fraction (on a 0 to 1 scale) in which a given peptide will appear.

  • The Uniprot DAT File Parser can read a Uniprot .Dat file and parse out the information for each entry, creating a series of tab-delimited text files or creating a FASTA file.

  • The Validate Fasta File utility is a Windows command-line application that will parse a Fasta file to check the validity of the file, looking for common, known problems.
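
One way to split a FASTA file at random while keeping the sections nearly equal in residue count, as described for the FASTA splitting console application in the list above, is sketched below in Python. This is an assumption about how such balancing could work, not the tool's actual algorithm. The names read_fasta and split_balanced are hypothetical.

```python
import random


def read_fasta(lines):
    """Parse FASTA text lines into a list of (header, sequence) tuples."""
    entries, header, chunks = [], None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                entries.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line)
    if header is not None:
        entries.append((header, "".join(chunks)))
    return entries


def split_balanced(entries, section_count, seed=None):
    """Shuffle proteins, then give each one to the currently smallest section.

    Assumed strategy: random order plus greedy assignment by residue count.
    Returns the sections and the residue total of each section.
    """
    rng = random.Random(seed)
    shuffled = list(entries)
    rng.shuffle(shuffled)
    sections = [[] for _ in range(section_count)]
    totals = [0] * section_count
    for header, sequence in shuffled:
        target = totals.index(min(totals))  # section with the fewest residues
        sections[target].append((header, sequence))
        totals[target] += len(sequence)
    return sections, totals


# Made-up example input
fasta_text = """>P1 example protein
MKTAYIAKQR
QISFVK
>P2 example protein
SHFSRQ
>P3 example protein
MKLVVTAGHE
"""
sections, totals = split_balanced(read_fasta(fasta_text.splitlines()), 2, seed=1)
print(totals)
```

The shuffle supplies the randomness, and the greedy step limits the imbalance between any two sections to at most the length of the longest protein.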

Software Category: MS Analysis Tools

  • Finds peaks in raw mass spectra. Capable of full waveform generation, automated mass spectra interpretation and database searching integration of FASTA or GenBank files.

  • This software can be used to generate an Accurate Mass and Time tag database (Microsoft Access format) from local MS/MS search engine results from either SEQUEST or X!Tandem.

Software Category: MS/MS Analysis Tools

  • DeconMSn creates spectrum files for tandem mass spectrometry data.

  • LC-IMS-MS Feature Finder is a command line software application that searches for molecular ion signatures in multidimensional liquid chromatography-ion mobility spectrometry-mass spectrometry (LC-IMS-MS) data by clustering deisotoped peaks with similar monoisotopic mass, charge state, LC elution time, and ion mobility drift time values.

  • This program is a command-line utility that reads in a Mascot Generic Format (MGF) file and creates the equivalent _Dta.txt or .Dta files. For more information, see Matrix Science's Data File Format page.

  • MASIC (MS/MS Automated Selected Ion Chromatogram generator) generates selected ion chromatograms (SICs) for all of the parent ions chosen for fragmentation in an LC-MS/MS analysis.

  • Reads the contents of a tab-delimited peptide hit results file (e.g. from Sequest, XTandem, Inspect, or MSGF+) and merges that information with the corresponding MASIC results files, appending the relevant MASIC stats for each peptide hit result.

  • MetISIS is a software application that takes user-observed MS/MS spectra and generates ranked lists of candidate identifications from a database of metabolites. MetISIS uses a database of molecular structures for which we have generated in silico spectra (ISIS). While MetISIS is still very limited in its capability, we share it in its current form in hopes it will be found useful. We are continuing to update the capability of MetISIS and will release new versions on this website periodically.

  • MS-GF+ (aka MSGF+ or MSGFPlus) performs peptide identification by scoring MS/MS spectra against peptides derived from a protein sequence database. It supports the HUPO PSI standard input file (mzML) and saves results in the mzIdentML format, though they can easily be transformed to TSV. ProteomeXchange supports Complete data submissions using MS-GF+ search results.

  • DataProcessing toolbox for running MSGF+ and MASIC, then merging the results. Uses Windows batch files to automate the process for a folder of Thermo .Raw files.

  • MSPathFinder is a database search engine for top-down proteomics. 

  • PE-MMR (SA) can be used to create an MGF file with refined parent ion masses and charges, which can lead to more accurate search results from MS/MS spectra.

  • Converts an MSGF+ TSV file, X!Tandem results file (XML format), or a SEQUEST Synopsis/First Hits file to a series of tab-delimited text files summarizing the results.

  • SpectrumLook can be used to inspect the fragmentation (MS/MS) spectra in an LC-MS/MS analysis.

Software Category: MS Data File Utilities

  • Command-line utility that reads in a _Dta.txt file and creates the equivalent Mascot Generic Format (MGF) file. _Dta.txt files are large text files that contain numerous .Dta files, all concatenated together.

  • The Concatenated Text File Splitter can be used to split apart a concatenated text file (such as a _Dta.txt file) to re-create the individual text files (creating one file per spectrum). This is necessary if you wish to re-search the data with SEQUEST (which reads individual .Dta files).

  • The Flexible File Sort Utility is a command line application that sorts a text file alphabetically (forward or reverse).
    It supports both in-memory sorts for smaller files and use of temporary swap files for large files.
    It can alternatively sort on a column in a tab-delimited or comma-separated file.
    The column sort mode also supports numeric sorting.

  • The MsDataFileReader DLL is a VB.NET DLL that can be used to read mass spectrum data.

  • The MS File Info Scanner can be used to scan a series of MS data files (or data folders) and extract the acquisition start and end times, number of spectra, and the total size of the data.

  • Utility for converting ontology OBO files to a tab-delimited text file.

  • The Thermo Raw File Reader is a .NET DLL that demonstrates how to read Thermo-Finnigan .Raw files using Thermo's MS File Reader.
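
The column sort mode described for the Flexible File Sort Utility in the list above can be illustrated with a short Python sketch. It covers only an in-memory sort, not the temporary swap files used for large files, and it is not the utility's own code. The function name sort_delimited and its parameters are hypothetical, and every row is assumed to contain the requested column.

```python
def sort_delimited(lines, column, delimiter="\t", numeric=False, reverse=False):
    """Sort delimited text lines on one zero-based column.

    Hypothetical helper: with numeric=True the column is compared as a
    number, otherwise values are compared alphabetically as text.
    """
    def sort_key(line):
        value = line.rstrip("\n").split(delimiter)[column]
        return float(value) if numeric else value

    return sorted(lines, key=sort_key, reverse=reverse)


# Made-up example rows: name, then a numeric score
rows = ["b\t10", "a\t9", "c\t100"]
print(sort_delimited(rows, column=1, numeric=True))
print(sort_delimited(rows, column=1))
```

The two calls show why a numeric mode matters: compared as text, "100" sorts before "9", while numeric comparison orders the rows by value.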

Software Category: Mass Spectrometry Auxiliary Tools

Software Category: Tutorials

  • This topic provides a basic introduction to using software tools that do not have a graphical user interface (GUI) and instead can only be used at the Windows Command Prompt.

