MS-GF+

(How to migrate from MS-GFDB to MS-GF+)

ChangeLog

Usage: java -Xmx3500M -jar MSGFPlus.jar

-s SpectrumFile (*.mzML, *.mzXML, *.mgf, *.ms2, *.pkl or *_dta.txt)
   Spectra should be centroided (see below for MSConvert example). Profile spectra will be ignored.

-d DatabaseFile (*.fasta or *.fa or *.faa)

[-conf ConfigurationFile] (Configuration file path; options specified at the command line will override settings in the config file)
   Example parameter file is at https://github.com/MSGFPlus/msgfplus/blob/master/docs/examples/MSGFPlus_Params.txt

[-decoy DecoyPrefix] (Prefix for decoy protein names; Default: XXX)

[-o OutputFile (*.mzid)] (Default: [SpectrumFileName].mzid)

[-t PrecursorMassTolerance] (e.g. 2.5Da, 20ppm or 0.5Da,2.5Da; Default: 20ppm)
   Use a comma to define asymmetric values. 
   E.g. "-t 0.5Da,2.5Da" will set 0.5Da to the left (ObservedPepMass < TheoreticalPepMass) 
                              and 2.5Da to the right (ObservedPepMass > TheoreticalPepMass)
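
The left/right rule above can be sketched in a few lines of Python. This is only an illustration, not MS-GF+'s own code: the function names are hypothetical, ppm values are assumed to be taken relative to the theoretical mass, and the boundary is assumed to be inclusive.

```python
import re

def parse_precursor_tolerance(value):
    """Parse a -t style string into ((left, unit), (right, unit)). Hypothetical helper."""
    def parse_one(token):
        match = re.fullmatch(r"\s*([0-9]*\.?[0-9]+)\s*(Da|ppm)\s*", token, re.IGNORECASE)
        if match is None:
            raise ValueError(f"unrecognized tolerance: {token!r}")
        return float(match.group(1)), match.group(2).lower()

    parts = value.split(",")
    if len(parts) == 1:
        # A single value applies symmetrically to both sides.
        left = right = parse_one(parts[0])
    elif len(parts) == 2:
        left, right = parse_one(parts[0]), parse_one(parts[1])
    else:
        raise ValueError("expected one tolerance or 'left,right'")
    return left, right

def within_tolerance(observed, theoretical, left, right):
    """Apply the left tolerance when observed < theoretical, else the right one."""
    def to_da(tol):
        amount, unit = tol
        # Assumption: ppm is measured relative to the theoretical mass.
        return amount * theoretical * 1e-6 if unit == "ppm" else amount

    diff = observed - theoretical
    limit = to_da(left) if diff < 0 else to_da(right)
    return abs(diff) <= limit  # assumption: inclusive boundary

left, right = parse_precursor_tolerance("0.5Da,2.5Da")
print(within_tolerance(1002.0, 1000.0, left, right))  # 2.0 Da to the right is allowed
print(within_tolerance(999.0, 1000.0, left, right))   # 1.0 Da to the left is not
```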

[-ti IsotopeErrorRange] (Range of allowed isotope peak errors; Default: 0,1)
   Takes into account the error introduced by choosing a non-monoisotopic peak for fragmentation.
   The combination of -t and -ti determines the precursor mass tolerance.
   E.g. "-t 20ppm -ti -1,2" tests abs(ObservedPepMass - TheoreticalPepMass - n * 1.00335Da) < 20ppm for n = -1, 0, 1, 2.
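
As a rough illustration of the test quoted above, the following Python sketch lists which isotope offsets n would let an observed mass match. The function name is hypothetical, and converting ppm relative to the theoretical mass is an assumption; only the 1.00335 Da spacing and the inequality come from the text.

```python
ISOTOPE_SPACING_DA = 1.00335  # spacing used in the -ti description

def matching_isotope_errors(observed, theoretical, tol_ppm, isotope_range):
    """Return every n in the inclusive range for which
    abs(observed - theoretical - n * 1.00335) < tolerance holds."""
    low, high = isotope_range
    # Assumption: the ppm tolerance is taken relative to the theoretical mass.
    limit_da = tol_ppm * theoretical * 1e-6
    return [
        n
        for n in range(low, high + 1)
        if abs(observed - theoretical - n * ISOTOPE_SPACING_DA) < limit_da
    ]

# A precursor picked one isotope peak too high still matches with n = 1.
print(matching_isotope_errors(2001.01335, 2000.0, 20, (-1, 2)))
# With the default range 0,1 a precursor one peak too low finds no match.
print(matching_isotope_errors(1998.99665, 2000.0, 20, (0, 1)))
```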

[-thread NumThreads] (Number of concurrent threads to be executed; Default: Number of available cores)

[-tasks NumTasks] (Override the number of tasks to use on the threads; Default: (internally calculated based on inputs))
   More tasks than threads will reduce the memory requirements of the search, but will be slower (how much depends on the inputs).
   1 <= tasks <= numThreads: will create one task per thread, which is the original behavior.
   tasks = 0: use default calculation - minimum of: (threads*3) and (numSpectra/250).
   tasks < 0: multiply number of threads by abs(tasks) to determine number of tasks (i.e., -2 means "2 * numThreads" tasks).
   One task per thread will use the most memory, but will usually finish the fastest.
   2-3 tasks per thread will use comparatively less memory, but may cause the search to take 1.5 to 2 times as long.
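
The -tasks rules can be read as a small decision function. The Python sketch below is one possible reading, not the MS-GF+ implementation: the function name is hypothetical, numSpectra/250 is assumed to be integer division, a value above numThreads is assumed to be used as given, and any computed value below numThreads is assumed to fall back to one task per thread.

```python
def resolve_num_tasks(tasks, num_threads, num_spectra):
    """Hypothetical helper mirroring the documented -tasks rules."""
    if tasks == 0:
        # Default: minimum of (threads * 3) and (numSpectra / 250).
        # Assumption: integer division.
        requested = min(num_threads * 3, num_spectra // 250)
    elif tasks < 0:
        # Negative values act as a multiplier on the thread count.
        requested = num_threads * abs(tasks)
    else:
        requested = tasks
    # 1 <= tasks <= numThreads gives one task per thread.
    # Assumption: the same floor applies to computed values.
    return max(requested, num_threads)

print(resolve_num_tasks(0, 4, 10000))   # default calculation
print(resolve_num_tasks(-2, 4, 10000))  # 2 * numThreads
print(resolve_num_tasks(3, 4, 10000))   # one task per thread
```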

[-verbose 0/1] (0: Report total progress only (Default), 1: Report total and per-thread progress/status)

[-tda 0/1] (0: Don't search decoy database (Default), 1: Search decoy database)

[-m FragmentMethodID] (0: As written in the spectrum or CID if no info (Default), 1: CID, 2: ETD, 3: HCD, 4: UVPD)

[-inst InstrumentID] (0: Low-res LCQ/LTQ (Default), 1: Orbitrap/FTICR/Lumos, 2: TOF, 3: Q-Exactive)

[-e EnzymeID] (0: Unspecific cleavage, 1: Trypsin (Default), 2: Chymotrypsin, 3: Lys-C, 4: Lys-N, 5: glutamyl endopeptidase, 6: Arg-C, 7: Asp-N, 8: alphaLP, 9: no cleavage)

[-protocol ProtocolID] (0: Automatic (Default), 1: Phosphorylation, 2: iTRAQ, 3: iTRAQPhospho, 4: TMT, 5: Standard)

[-ntt 0/1/2] (Number of Tolerable Termini; Default: 2)
   E.g. For trypsin, 0: non-tryptic, 1: semi-tryptic, 2: fully-tryptic peptides only.
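
To make the -ntt values concrete, here is a small Python sketch that counts how many termini of a peptide are consistent with trypsin. It is a simplification with stated assumptions: peptides are written with flanking residues as in the output table (e.g. K.PEPTIDEK.A), trypsin is taken to cleave after K or R with no proline exception, and "-" or "_" is assumed to mark a protein terminus. The function name is hypothetical.

```python
def count_tryptic_termini(annotated):
    """Count termini (0, 1 or 2) consistent with cleavage after K or R."""
    if len(annotated) < 5 or annotated[1] != "." or annotated[-2] != ".":
        raise ValueError("expected the form X.PEPTIDE.Y")
    prev_res, next_res = annotated[0], annotated[-1]
    core = annotated[2:-2]
    # Skip modification masses such as +15.995 when finding the last residue.
    last_res = next(c for c in reversed(core) if c.isalpha())
    # Assumption: '-' or '_' denotes a protein terminus, which counts as valid.
    n_term_ok = prev_res in "KR-_"
    c_term_ok = last_res in "KR" or next_res in "-_"
    return int(n_term_ok) + int(c_term_ok)

for peptide in ("K.NLANPTSVILASIQM+15.995LEYLGMADK.A", "A.PEPTIDEK.G", "A.PEPTIDEL.G"):
    print(peptide, count_tryptic_termini(peptide))
```

Under this reading, a peptide would be kept when its count is at least the -ntt value, so -ntt 1 would admit both semi-tryptic and fully-tryptic peptides.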

[-mod ModificationFileName] (Modification file; if -mod is not specified, the default is standard amino acids with fixed C+57)

[-minLength MinPepLength] (Minimum peptide length to consider; Default: 6)

[-maxLength MaxPepLength] (Maximum peptide length to consider; Default: 40)

[-minCharge MinCharge] (Minimum precursor charge to consider if charges are not specified in the spectrum file; Default: 2)

[-maxCharge MaxCharge] (Maximum precursor charge to consider if charges are not specified in the spectrum file; Default: 3)

[-n NumMatchesPerSpec] (Number of matches per spectrum to be reported; Default: 1)

[-addFeatures 0/1] (0: Output basic scores only (Default), 1: Output additional features)

[-ccm ChargeCarrierMass] (Mass of charge carrier; Default: mass of proton (1.00727649))
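
The charge carrier mass enters the standard conversion from a precursor m/z to a neutral mass, sketched below in Python. The function name is hypothetical, and whether MS-GF+ applies -ccm in exactly this form is an assumption; the formula itself is the usual one.

```python
PROTON_MASS = 1.00727649  # default charge carrier mass from the option text

def neutral_mass(mz, charge, ccm=PROTON_MASS):
    """Neutral mass from precursor m/z: remove one charge carrier per charge."""
    return (mz - ccm) * charge

# Precursor m/z and charge taken from the first row of the sample output.
print(neutral_mass(1285.3457, 3))
```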

[-maxMissedCleavages Count] (Exclude peptides with more than this number of missed cleavages from the search; Default: -1 (no limit))

[-numMods Count] (Maximum number of dynamic (variable) modifications per peptide; Default: 3)
      

Example command (high-precision spectra):

java -Xmx3500M -jar MSGFPlus.jar -s Dataset.mzML -d IPI_human_3.79.fasta -inst 1 -t 20ppm -ti -1,2 -ntt 2 -tda 1 -o PSMs.mzid

Example command (low-precision spectra):

java -Xmx3500M -jar MSGFPlus.jar -s Dataset.mzML -d IPI_human_3.79.fasta -inst 0 -t 0.5Da,2.5Da -ntt 2 -tda 1 -o PSMs.mzid
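
When searches are launched from a script, the commands above can be assembled as an argument list. The Python sketch below uses only flags documented on this page; the helper name and its parameters are hypothetical, and it builds the command without running it.

```python
import shlex

def build_msgfplus_command(spectrum_file, database_file, options=None,
                           jar="MSGFPlus.jar", max_heap="3500M"):
    """Assemble the java command line as a list of arguments."""
    cmd = ["java", f"-Xmx{max_heap}", "-jar", jar,
           "-s", spectrum_file, "-d", database_file]
    # Each option key is a flag name without the leading dash, e.g. "tda".
    for flag, value in (options or {}).items():
        cmd += [f"-{flag}", str(value)]
    return cmd

high_res = build_msgfplus_command(
    "Dataset.mzML", "IPI_human_3.79.fasta",
    {"inst": 1, "t": "20ppm", "ti": "-1,2", "ntt": 2, "tda": 1, "o": "PSMs.mzid"},
)
print(shlex.join(high_res))
```

Passing the list to subprocess.run(high_res, check=True) would start the search, provided java and MSGFPlus.jar are available at those paths.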

Parameters:

MS-GF+ output

MS-GF+ outputs results as an mzIdentML (version 1.1) file. See http://www.psidev.info/mzidentml/ for details on the mzIdentML format. For every PSM, MS-GF+ reports the following scores:

MS-GF+ output example

Shown below is a sample of the MS-GF+ output in table form, as extracted from a simple mzIdentML file: test.mzid

There are two options for converting an MS-GF+ output file (.mzid) into a tab-separated file (.tsv).

  1. The MzIDToTsv utility built into MSGFPlus.jar (see the MzIDToTsv page)
  2. The Mzid-To-Tsv-Converter standalone application, available on GitHub

#SpecFile	SpecID	ScanNum	FragMethod	Precursor	IsotopeError	PrecursorError(ppm)	Charge	Peptide	Protein	DeNovoScore	MSGFScore	SpecEValue	EValue	QValue	PepQValue
test.mgf	index=0	26559	CID	1285.3457	1	-5.049801	3	K.IGAYLFVDMAHVAGLIAAGVYPNPVPHAHVVTSTTHK.T	test	299	244	1.4807088E-31	3.2871733E-29	0.0	0.0
test.mgf	index=0	26559	CID	1285.3457	1	-5.049801	3	K.IGAYLFVDMAHVAGLIAAGVYPNPVPHAHVVTSTTHK.T	test_isoform	299	244	1.4807088E-31	3.2871733E-29	0.0	0.0
test.mgf	index=1	-1	CID	870.11743	0	0.14029178	3	K.NLANPTSVILASIQM+15.995LEYLGMADK.A	test2	156	136	1.4807088E-22	4.4217308E-20	0.0	0.0
(Text file of this table: test_Unrolled.tsv)
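
In the sample above, the same spectrum and peptide appear on two rows that differ only in the Protein column. A minimal Python sketch for collecting the proteins of each PSM from such a table follows; it assumes the file is tab-separated with the header shown, embeds the sample rows instead of reading test_Unrolled.tsv, and uses a hypothetical function name.

```python
import csv
import io
from collections import defaultdict

# Sample rows copied from the table; no field contains a space, so split() is safe here.
ROWS = [
    "#SpecFile SpecID ScanNum FragMethod Precursor IsotopeError PrecursorError(ppm) Charge Peptide Protein DeNovoScore MSGFScore SpecEValue EValue QValue PepQValue",
    "test.mgf index=0 26559 CID 1285.3457 1 -5.049801 3 K.IGAYLFVDMAHVAGLIAAGVYPNPVPHAHVVTSTTHK.T test 299 244 1.4807088E-31 3.2871733E-29 0.0 0.0",
    "test.mgf index=0 26559 CID 1285.3457 1 -5.049801 3 K.IGAYLFVDMAHVAGLIAAGVYPNPVPHAHVVTSTTHK.T test_isoform 299 244 1.4807088E-31 3.2871733E-29 0.0 0.0",
    "test.mgf index=1 -1 CID 870.11743 0 0.14029178 3 K.NLANPTSVILASIQM+15.995LEYLGMADK.A test2 156 136 2.2559852E-22 4.4217308E-20 0.0 0.0",
]
SAMPLE_TSV = "\n".join("\t".join(row.split()) for row in ROWS)

def proteins_per_psm(tsv_text):
    """Group protein names by (SpecID, Peptide) across the unrolled rows."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    grouped = defaultdict(list)
    for row in reader:
        grouped[(row["SpecID"], row["Peptide"])].append(row["Protein"])
    return dict(grouped)

for (spec_id, peptide), proteins in proteins_per_psm(SAMPLE_TSV).items():
    print(spec_id, peptide, proteins)
```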