Package 'flowTraceR' reference manual

Title:	Tracing Information Flow for Inter-Software Comparisons in Mass Spectrometry-Based Bottom-Up Proteomics
Description:	Useful functions to standardize software outputs from ProteomeDiscoverer, Spectronaut, DIA-NN and MaxQuant on precursor, modified peptide and proteingroup level and to trace software differences for identifications such as varying proteingroup denotations for common precursor.
Authors:	Oliver Kardell [aut, cre]
Maintainer:	Oliver Kardell <[email protected]>
License:	MIT + file LICENSE
Version:	0.1.0
Built:	2025-03-09 03:26:08 UTC
Source:	https://github.com/okdll/flowtracer

Analysis of connected levels

Description

Analysis of the traceR_connected_pg_prec or traceR_connected_mod.pep_prec column

Usage

analyze_connected_levels(
  input_df,
  connected_levels = c("proteinGroup_precursor", "mod.peptides_precursor"),
  count_level = c("upper", "lower"),
  plot = TRUE,
  plot_characteristic = c("absolute", "relative")
)
analyze_connected_levels(
  input_df,
  connected_levels = c("proteinGroup_precursor", "mod.peptides_precursor"),
  count_level = c("upper", "lower"),
  plot = TRUE,
  plot_characteristic = c("absolute", "relative")
)

Arguments

`input_df`	A tibble with flowTraceR´s connected level information e.g. traceR_connected_pg_prec.
`connected_levels`	Choose either `proteinGroup_precursor` or `mod.peptides_precursor` for the corresponding traceR connection. Default is proteinGroup_precursor.
`count_level`	Counts appearances per possible connections. Choose either `upper` or `lower` - lower is always precursor level; upper is either proteingroup or mod.peptide level depending on chosen `connected_levels`. Default is upper. Duplicate entries are removed.
`plot`	Logical value, default is TRUE. If `TRUE` barplot is generated, if `FALSE` report as output.
`plot_characteristic`	if `absolute` the absolute count is displayed in barplot, if `relative` the relative count is displayed in barplot. Default is absolute. `plot_characteristic` has no influence on report.

Details

Shows the absolute and relative counts of possible connections - unique_unique/unique_common/common_unique/common_common of the respective column - as report or plot.

Value

This function returns a plot - absolute/relative counts - or a data frame.

Author(s)

Oliver Kardell

Examples

# Load libraries
library(dplyr)
library(stringr)
library(ggplot2)
library(tibble)

# DIA-NN example data
data <- tibble::tibble(
  "traceR_connected_pg_prec" = c("common_common", "common_unique", "unique_common"),
  "traceR_traced_proteinGroups" = c("common", "common", "unique"),
  "traceR_traced_mod.peptides" = c("common", "unique", "common"),
  "traceR_traced_precursor" = c("common", "unique", "common"),
  "traceR_proteinGroups" = c("P02768", "P02671", "Q92496"),
  "traceR_precursor" = c("AAC(UniMod:4)LLPK1", "RLEVDIDIK2", "EGIVEYPR2")
)

# Upper level - proteingroup level - how many proteingroups have a specific categorization
# Plot
analyze_connected_levels(input_df = data,
 connected_levels = "proteinGroup_precursor",
 count_level = "upper",
 plot = TRUE,
 plot_characteristic = "relative")

#Report
analyze_connected_levels(input_df = data,
 connected_levels = "proteinGroup_precursor",
 count_level = "upper",
 plot = FALSE)
# Load libraries
library(dplyr)
library(stringr)
library(ggplot2)
library(tibble)

# DIA-NN example data
data <- tibble::tibble(
  "traceR_connected_pg_prec" = c("common_common", "common_unique", "unique_common"),
  "traceR_traced_proteinGroups" = c("common", "common", "unique"),
  "traceR_traced_mod.peptides" = c("common", "unique", "common"),
  "traceR_traced_precursor" = c("common", "unique", "common"),
  "traceR_proteinGroups" = c("P02768", "P02671", "Q92496"),
  "traceR_precursor" = c("AAC(UniMod:4)LLPK1", "RLEVDIDIK2", "EGIVEYPR2")
)

# Upper level - proteingroup level - how many proteingroups have a specific categorization
# Plot
analyze_connected_levels(input_df = data,
 connected_levels = "proteinGroup_precursor",
 count_level = "upper",
 plot = TRUE,
 plot_characteristic = "relative")

#Report
analyze_connected_levels(input_df = data,
 connected_levels = "proteinGroup_precursor",
 count_level = "upper",
 plot = FALSE)

Analysis of unknown modifications

Description

Analysis of the traceR_precursor_unknownMods or traceR_mod.peptides_unknownMods column

Usage

analyze_unknown_mods(
  input_df,
  level = c("precursor, modified_peptides"),
  plot = TRUE,
  plot_characteristic = c("absolute", "relative")
)
analyze_unknown_mods(
  input_df,
  level = c("precursor, modified_peptides"),
  plot = TRUE,
  plot_characteristic = c("absolute", "relative")
)

Arguments

`input_df`	A tibble with the traceR_precursor_unknownMods or traceR_mod.peptides_unknownMods column.
`level`	Choose either `precursor` for traceR_precursor_unknownMods or `modified_peptides` for traceR_mod.peptides_unknownMods. Default is precursor.
`plot`	Logical value, default is TRUE. If `TRUE` barplot is generated, if `FALSE` report as output.
`plot_characteristic`	If `absolute` the absolute count is displayed in barplot, if `relative` the relative count is displayed in barplot. Default is absolute. `plot_characteristic` has no influence on report.

Details

Shows the absolute and relative counts of TRUE/FALSE of the traceR_precursor_unknownMods or traceR_mod.peptides_unknownMods column - as data frame or plot. Duplicate traceR_mod.peptides entries or traceR_precursor entries are removed, respectively.

Value

This function returns a plot - absolute/relative counts - or a data frame.

Author(s)

Oliver Kardell

Examples

# Load libraries
library(dplyr)
library(stringr)
library(ggplot2)
library(tibble)

# Generate data
data <- tibble::tibble(
 "traceR_mod.peptides" = c("AACLLPK",
  "ALTDM(UniMod:35)PQM(UniMod:35)R",
  "ALTDM(DummyModification)PQMK",
  "ALTDM(UniMod:35)PQM(UniMod:35)R",
  "ALTDM(DummyModification)PQMK"),
 "traceR_mod.peptides_unknownMods" = c(FALSE, FALSE, TRUE, FALSE, TRUE),
 "traceR_precursor" = c("AACLLPK2",
  "ALTDM(UniMod:35)PQM(UniMod:35)R2",
  "ALTDM(DummyModification)PQMK3",
  "ALTDM(UniMod:35)PQM(UniMod:35)R2",
  "ALTDM(DummyModification)PQMK3"),
 "traceR_precursor_unknownMods" = c(FALSE, FALSE, TRUE, FALSE, TRUE)
)

# Generate Report - precursor level
analyze_unknown_mods(
 input_df = data,
 level = "precursor",
 plot = FALSE
)

# Generate relative Plot - peptide level
analyze_unknown_mods(
 input_df = data,
 level = "modified_peptides",
 plot = TRUE,
 plot_characteristic = "relative"
)
# Load libraries
library(dplyr)
library(stringr)
library(ggplot2)
library(tibble)

# Generate data
data <- tibble::tibble(
 "traceR_mod.peptides" = c("AACLLPK",
  "ALTDM(UniMod:35)PQM(UniMod:35)R",
  "ALTDM(DummyModification)PQMK",
  "ALTDM(UniMod:35)PQM(UniMod:35)R",
  "ALTDM(DummyModification)PQMK"),
 "traceR_mod.peptides_unknownMods" = c(FALSE, FALSE, TRUE, FALSE, TRUE),
 "traceR_precursor" = c("AACLLPK2",
  "ALTDM(UniMod:35)PQM(UniMod:35)R2",
  "ALTDM(DummyModification)PQMK3",
  "ALTDM(UniMod:35)PQM(UniMod:35)R2",
  "ALTDM(DummyModification)PQMK3"),
 "traceR_precursor_unknownMods" = c(FALSE, FALSE, TRUE, FALSE, TRUE)
)

# Generate Report - precursor level
analyze_unknown_mods(
 input_df = data,
 level = "precursor",
 plot = FALSE
)

# Generate relative Plot - peptide level
analyze_unknown_mods(
 input_df = data,
 level = "modified_peptides",
 plot = TRUE,
 plot_characteristic = "relative"
)

Connects traced levels

Description

Connects two levels after categorization in unique and common entries.

Usage

connect_traceR_levels(
  input_df,
  level = c("proteinGroups", "modified_peptides")
)
connect_traceR_levels(
  input_df,
  level = c("proteinGroups", "modified_peptides")
)

Arguments

`input_df`	A tibble with flowTraceR´s traced level information e.g. traceR_traced_proteinGroups.
`level`	Choose between `proteinGroups` or `modified_peptides`. Connection between proteinGroups/modified_peptides and precursor categorization. Default is proteinGroups.

Details

Based on flowTraceR´s categorization in unique and common identifications two levels are connected. Possible connections are proteinGroup or modified peptide with precursor categorization.

Value

This function returns a tibble with one of the following columns depending on chosen level:

traceR_connected_pg_prec - connection between proteinGroup categorization and precursor categorization.
traceR_connected_mod.pep_prec - connection between modified peptide categorization and precursor categorization.

Author(s)

Oliver Kardell

Examples

# Load libraries
library(tidyr)
library(stringr)
library(tibble)

# DIA-NN example data
diann <- tibble::tibble(
   "traceR_traced_proteinGroups" = c("common", "common", "unique"),
   "traceR_traced_mod.peptides" = c("common", "unique", "common"),
   "traceR_traced_precursor" = c("common", "unique", "common"),
   "traceR_proteinGroups" = c("P02768", "P02671", "Q92496"),
   "traceR_mod.peptides" = c("AAC(UniMod:4)LLPK", "RLEVDIDIK", "EGIVEYPR"),
   "traceR_mod.peptides_unknownMods" = c(FALSE, FALSE, FALSE),
   "traceR_precursor" = c("AAC(UniMod:4)LLPK1", "RLEVDIDIK2", "EGIVEYPR2"),
   "traceR_precursor_unknownMods" = c(FALSE, FALSE, FALSE)
)

spectronaut <- tibble::tibble(
   "traceR_traced_proteinGroups" = c("common", "common", "unique"),
   "traceR_traced_mod.peptides" = c("common", "unique", "common"),
   "traceR_traced_precursor" = c("common", "unique", "common"),
   "traceR_proteinGroups" = c("P02768", "P02671", "Q02985"),
   "traceR_mod.peptides" = c("AAC(UniMod:4)LLPK", "M(UniMod:35)KPVPDLVPGNFK", "EGIVEYPR"),
   "traceR_mod.peptides_unknownMods" = c(FALSE, FALSE, FALSE),
   "traceR_precursor" = c("AAC(UniMod:4)LLPK1", "M(UniMod:35)KPVPDLVPGNFK2", "EGIVEYPR2"),
   "traceR_precursor_unknownMods" = c(FALSE, FALSE, FALSE)
)

# Connect Precursor and ProteinGroup level
diann_connected <- connect_traceR_levels(input_df = diann, level = "proteinGroups")

spectronaut_connected <- connect_traceR_levels(input_df = spectronaut, level = "proteinGroups")
# Load libraries
library(tidyr)
library(stringr)
library(tibble)

# DIA-NN example data
diann <- tibble::tibble(
   "traceR_traced_proteinGroups" = c("common", "common", "unique"),
   "traceR_traced_mod.peptides" = c("common", "unique", "common"),
   "traceR_traced_precursor" = c("common", "unique", "common"),
   "traceR_proteinGroups" = c("P02768", "P02671", "Q92496"),
   "traceR_mod.peptides" = c("AAC(UniMod:4)LLPK", "RLEVDIDIK", "EGIVEYPR"),
   "traceR_mod.peptides_unknownMods" = c(FALSE, FALSE, FALSE),
   "traceR_precursor" = c("AAC(UniMod:4)LLPK1", "RLEVDIDIK2", "EGIVEYPR2"),
   "traceR_precursor_unknownMods" = c(FALSE, FALSE, FALSE)
)

spectronaut <- tibble::tibble(
   "traceR_traced_proteinGroups" = c("common", "common", "unique"),
   "traceR_traced_mod.peptides" = c("common", "unique", "common"),
   "traceR_traced_precursor" = c("common", "unique", "common"),
   "traceR_proteinGroups" = c("P02768", "P02671", "Q02985"),
   "traceR_mod.peptides" = c("AAC(UniMod:4)LLPK", "M(UniMod:35)KPVPDLVPGNFK", "EGIVEYPR"),
   "traceR_mod.peptides_unknownMods" = c(FALSE, FALSE, FALSE),
   "traceR_precursor" = c("AAC(UniMod:4)LLPK1", "M(UniMod:35)KPVPDLVPGNFK2", "EGIVEYPR2"),
   "traceR_precursor_unknownMods" = c(FALSE, FALSE, FALSE)
)

# Connect Precursor and ProteinGroup level
diann_connected <- connect_traceR_levels(input_df = diann, level = "proteinGroups")

spectronaut_connected <- connect_traceR_levels(input_df = spectronaut, level = "proteinGroups")

Conversion of software specific levels

Description

Conversion of precursor, modified peptide and proteinGroup entries to standardized format.

Usage

convert_all_levels(
  input_df,
  input_MQ_pg,
  software = c("MaxQuant", "DIA-NN", "Spectronaut", "PD")
)
convert_all_levels(
  input_df,
  input_MQ_pg,
  software = c("MaxQuant", "DIA-NN", "Spectronaut", "PD")
)

Arguments

`input_df`	A tibble with precursor, modified peptide and proteinGroup level information. For MaxQuant: evidence.txt and proteinGroups.txt, for PD: PSMs.txt with R-friendly headers enabled, for DIA-NN and Spectronaut default output reports.
`input_MQ_pg`	For MaxQuant: A tibble with proteinGroup level information - proteinGroups.txt.
`software`	The used analysis software - MaxQuant, PD, DIA-NN or Spectronaut. Default is MaxQuant.

Details

The input entries are converted to a software independent format. The generated entries are appended to the submitted dataframe.

Value

This function returns the original submitted tibble - input_df - including the following new columns:

traceR_precursor - software-independent standardized text for precursor entries.
traceR_precursor_unknownMods - logical value, if TRUE: a modification is detected, which is not converted to a standardized format.
traceR_mod.peptides - software-independent standardized text for modified peptide entries.
traceR_mod.peptides_unknownMods - logical value, if TRUE: a modification is detected, which is not converted to a standardized format.
traceR_proteinGroups - software-independent standardized text for proteinGroups.

Author(s)

Oliver Kardell

Examples

# Load libraries
library(dplyr)
library(stringr)
library(tidyr)
library(comprehenr)
library(tibble)

# MaxQuant example data
evidence <- tibble::tibble(
  "Modified sequence" = c("_AACLLPK_",
   "_ALTDM(Oxidation (M))PQM(Oxidation (M))R_",
   "ALTDM(Dummy_Modification)PQMK"),
  Charge = c(2,2,3),
  "Protein group IDs" = c("26", "86;17", "86;17")
)

proteingroups <- tibble::tibble(
"Protein IDs" = c("A0A075B6P5;P01615;A0A087WW87;P01614;A0A075B6S6", "P02671", "P02672"),
id = c(26, 86, 17)
)

# Conversion
convert_all_levels(
 input_df = evidence,
 input_MQ_pg = proteingroups,
 software = "MaxQuant"
)
# Load libraries
library(dplyr)
library(stringr)
library(tidyr)
library(comprehenr)
library(tibble)

# MaxQuant example data
evidence <- tibble::tibble(
  "Modified sequence" = c("_AACLLPK_",
   "_ALTDM(Oxidation (M))PQM(Oxidation (M))R_",
   "ALTDM(Dummy_Modification)PQMK"),
  Charge = c(2,2,3),
  "Protein group IDs" = c("26", "86;17", "86;17")
)

proteingroups <- tibble::tibble(
"Protein IDs" = c("A0A075B6P5;P01615;A0A087WW87;P01614;A0A075B6S6", "P02671", "P02672"),
id = c(26, 86, 17)
)

# Conversion
convert_all_levels(
 input_df = evidence,
 input_MQ_pg = proteingroups,
 software = "MaxQuant"
)

Conversion of software specific modified peptide entries

Description

Modified peptide entries are converted to a common text representation

Usage

convert_modified_peptides(
  input_df,
  software = c("MaxQuant", "PD", "DIA-NN", "Spectronaut")
)
convert_modified_peptides(
  input_df,
  software = c("MaxQuant", "PD", "DIA-NN", "Spectronaut")
)

Arguments

`input_df`	A tibble with modified peptide level information. For MaxQuant: evidence.txt, for PD: PSMs.txt with R-friendly headers enabled, for DIA-NN and Spectronaut default output reports.
`software`	The used analysis software for the input_df - MaxQuant, PD, DIA-NN or Spectronaut. Default is MaxQuant.

Details

The input entries are converted to a software independent format. The generated entries are appended to the submitted dataframe. Conversion of modifications is currently only available for UniMod:35 and UniMod:4. Other modifications will not be converted to standardized format.

Value

This function returns the original submitted tibble - input_df - including two new columns:

traceR_mod.peptides - software-independent standardized text for modified peptide entries.
traceR_mod.peptides_unknownMods - logical value, if TRUE: a modification is detected, which is not converted to a standardized text.

Author(s)

Oliver Kardell

Examples

# Load libraries
library(dplyr)
library(stringr)
library(tidyr)
library(tibble)

# MaxQuant example data
data <- tibble::tibble(
 "Modified sequence" = c("_AACLLPK_",
  "_ALTDM(Oxidation (M))PQM(Oxidation (M))R_",
  "ALTDM(Dummy_Modification)PQMK"),
  Charge = c(2,2,3)
)

# Conversion
convert_modified_peptides(
 input_df = data,
 software = "MaxQuant"
)
# Load libraries
library(dplyr)
library(stringr)
library(tidyr)
library(tibble)

# MaxQuant example data
data <- tibble::tibble(
 "Modified sequence" = c("_AACLLPK_",
  "_ALTDM(Oxidation (M))PQM(Oxidation (M))R_",
  "ALTDM(Dummy_Modification)PQMK"),
  Charge = c(2,2,3)
)

# Conversion
convert_modified_peptides(
 input_df = data,
 software = "MaxQuant"
)

Conversion of software specific precursor entries

Description

Precursor entries are converted to a common text representation

Usage

convert_precursor(
  input_df,
  software = c("MaxQuant", "PD", "DIA-NN", "Spectronaut")
)
convert_precursor(
  input_df,
  software = c("MaxQuant", "PD", "DIA-NN", "Spectronaut")
)

Arguments

`input_df`	A tibble with precursor level information. For MaxQuant: evidence.txt, for PD: PSMs.txt with R-friendly headers enabled, for DIA-NN and Spectronaut default output reports.
`software`	The used analysis software for the input_df - MaxQuant, PD, DIA-NN or Spectronaut. Default is MaxQuant.

Details

Value

This function returns the original submitted tibble - input_df - including two new columns:

traceR_precursor - software-independent standardized text for precursor entries.
traceR_precursor_unknownMods - logical value, if TRUE: a modification is detected, which is not converted to a standardized text.

Author(s)

Oliver Kardell

Examples

# Load libraries
library(dplyr)
library(stringr)
library(tidyr)
library(tibble)

# MaxQuant example data
data <- tibble::tibble(
 "Modified sequence" = c("_AACLLPK_",
 "_ALTDM(Oxidation (M))PQM(Oxidation (M))R_",
 "ALTDM(Dummy_Modification)PQMK"),
  Charge = c(2,2,3)
)

# Conversion
convert_precursor(
 input_df = data,
 software = "MaxQuant"
)
# Load libraries
library(dplyr)
library(stringr)
library(tidyr)
library(tibble)

# MaxQuant example data
data <- tibble::tibble(
 "Modified sequence" = c("_AACLLPK_",
 "_ALTDM(Oxidation (M))PQM(Oxidation (M))R_",
 "ALTDM(Dummy_Modification)PQMK"),
  Charge = c(2,2,3)
)

# Conversion
convert_precursor(
 input_df = data,
 software = "MaxQuant"
)

Conversion of software specific proteinGroups

Description

ProteinGroups are converted to a common text representation

Usage

convert_proteingroups(
  input_df,
  software = c("MaxQuant", "DIA-NN", "Spectronaut", "PD")
)
convert_proteingroups(
  input_df,
  software = c("MaxQuant", "DIA-NN", "Spectronaut", "PD")
)

Arguments

`input_df`	A tibble with proteinGroup level information. For MaxQuant: proteinGroups.txt, for PD: PSMs.txt with R-friendly headers enabled, for DIA-NN and Spectronaut default output reports.
`software`	The used analysis software for the input_df - MaxQuant, PD, DIA-NN or Spectronaut. Default is MaxQuant.

Details

The input entries are converted to a software independent format. The generated entries are appended to the submitted dataframe.

Value

This function returns the original submitted tibble - input_df - including one new column:

traceR_proteinGroups - software-independent standardized text for proteinGroups.

Author(s)

Oliver Kardell

Examples

# Load libraries
library(dplyr)
library(stringr)
library(comprehenr)
library(tibble)

# MaxQuant example data
data <- tibble::tibble(
"Protein IDs" = c("A0A075B6P5;P01615;A0A087WW87;P01614;A0A075B6S6", "P02671", "P02672"),
id = c(26, 86, 17)
)

# Conversion
convert_proteingroups(
 input_df = data,
 software = "MaxQuant"
)
# Load libraries
library(dplyr)
library(stringr)
library(comprehenr)
library(tibble)

# MaxQuant example data
data <- tibble::tibble(
"Protein IDs" = c("A0A075B6P5;P01615;A0A087WW87;P01614;A0A075B6S6", "P02671", "P02672"),
id = c(26, 86, 17)
)

# Conversion
convert_proteingroups(
 input_df = data,
 software = "MaxQuant"
)

flowTraceR: a package for standardization of level information and tracking inter-software differences in bottom-up label-free proteomics

Description

Useful functions to standardize software outputs from ProteomeDiscoverer, Spectronaut, DIA-NN and MaxQuant on precursor, modified peptide and proteingroup level and to trace software differences for identifications such as varying proteingroup denotations for common precursor.

Author(s)

Maintainer: Oliver Kardell [email protected]

Create example data

Description

Example data for ProteomeDiscoverer, Spectronaut, DIA-NN and MaxQuant.

Usage

get_example(
  example = c("MaxQuant", "DIA-NN", "Spectronaut", "PD", "RetentionTime")
)
get_example(
  example = c("MaxQuant", "DIA-NN", "Spectronaut", "PD", "RetentionTime")
)

Arguments

example

Choose between "ProteomeDiscoverer", "Spectronaut", "DIA-NN" and "MaxQuant" or for an example for downstream analysis "RetentionTime". Default is MaxQuant.

Details

Data for each software for testing functions of flowTraceR. Additional example data for Spectronaut and DIA-NN for analyzing retention time distribution on precursor level.

Value

This function returns example data as dataframe for the respective chosen example. For "MaxQuant" a list with evidence/proteingroup dataframe. For "RetentionTime" a list with Spectronaut/DIA-NN data including retention time information.

Author(s)

Oliver Kardell

Examples


# Spectronaut example data
Spectronaut_data <- get_example(example = "Spectronaut")
# Spectronaut example data
Spectronaut_data <- get_example(example = "Spectronaut")

Check of converted modifications

Description

Check if conversion to UniMod-format of identified modifications is successful.

Usage

get_unknown_mods(input_string, pattern_start, pattern_end)
get_unknown_mods(input_string, pattern_start, pattern_end)

Arguments

`input_string`	character column traceR_precursor as string.
`pattern_start`	character of software-dependent beginning of representation of modifications.
`pattern_end`	character of software-dependent end of representation of modifications.

Details

After conversion to standardized format by convert_precursor or convert_modified_peptides, entries with modifications are checked for a successful conversion. Conversion of modifications is currently only available for UniMod:35 and UniMod:4. Other modifications will not be converted to standardized format.

Value

This function returns vector with logical values. This function is incorporated in the functions convert_precursor and convert_modified_peptides; used to generate the unknownMods column : if TRUE: a modification is detected, which is not converted to a standardized text.

Author(s)

Oliver Kardell

Examples

# Load libraries
library(dplyr)
library(stringr)
library(tibble)

# Generate data
data <- tibble::tibble(
  "traceR_precursor" = c("AACLLPK",
   "ALTDM(UniMod:35)PQM(UniMod:35)R2",
   "ALTDM(DummyModification)PQMK3")
)

# Unknown modifications present?
get_unknown_mods(input_string = data$traceR_precursor, pattern_start= "(", pattern_end = ")")
# Load libraries
library(dplyr)
library(stringr)
library(tibble)

# Generate data
data <- tibble::tibble(
  "traceR_precursor" = c("AACLLPK",
   "ALTDM(UniMod:35)PQM(UniMod:35)R2",
   "ALTDM(DummyModification)PQMK3")
)

# Unknown modifications present?
get_unknown_mods(input_string = data$traceR_precursor, pattern_start= "(", pattern_end = ")")

Trace common and unique identifications between different software outputs for all levels

Description

Identifications of two input data frames are compared and categorized in unique and common entries for each level.

Usage

trace_all_levels(
  input_df1,
  input_df2,
  analysis_name1 = "input_df1",
  analysis_name2 = "input_df2",
  filter_unknown_mods = TRUE
)
trace_all_levels(
  input_df1,
  input_df2,
  analysis_name1 = "input_df1",
  analysis_name2 = "input_df2",
  filter_unknown_mods = TRUE
)

Arguments

`input_df1`	A tibble with flowTraceR's standardized precursor, modified peptide and proteinGroup level information.
`input_df2`	A tibble with flowTraceR's standardized precursor, modified peptide and proteinGroup level information.
`analysis_name1`	output tibble name for input_df1 - default is `"input_df1"`.
`analysis_name2`	output tibble name for input_df2 - default is `"input_df2"`.
`filter_unknown_mods`	Logical value, default is TRUE. If TRUE, unknown modifications are filtered out - requires "traceR_precursor_unknownMods" or "traceR_mod.peptides_unknownMods" column.

Details

Based on flowTraceR's standardized output format two software outputs can be compared and categorized into common and unique identifications - for precursor, modified peptide and proteinGroup level.

Value

This function returns a list with both original submitted tibbles - input_df1 and input_df2 - with the following new columns:

traceR_traced_precursor - categorization on precursor level in common and unique entries.
traceR_traced_mod.peptides - categorization on modified peptide level in common and unique entries.
traceR_traced_proteinGroups - categorization on proteinGroups level in common and unique entries.

Author(s)

Oliver Kardell

Examples

# Load libraries
library(dplyr)
library(stringr)
library(tibble)

# DIA-NN example data
diann <- tibble::tibble(
  "traceR_proteinGroups" = c("P02768", "P02671", "Q92496", "DummyProt"),
  "traceR_mod.peptides" = c("AAC(UniMod:4)LLPK", "RLEVDIDIK",
   "EGIVEYPR", "ALTDM(DummyModification)PQMK"),
  "traceR_mod.peptides_unknownMods" = c(FALSE, FALSE, FALSE, TRUE),
  "traceR_precursor" = c("AAC(UniMod:4)LLPK1", "RLEVDIDIK2",
   "EGIVEYPR2", "ALTDM(DummyModification)PQMK3" ),
  "traceR_precursor_unknownMods" = c(FALSE, FALSE, FALSE, TRUE)
)
# Spectronaut example data
spectronaut <- tibble::tibble(
  "traceR_proteinGroups" = c("P02768", "Q02985", "P02671"),
  "traceR_mod.peptides" = c("AAC(UniMod:4)LLPK", "EGIVEYPR", "M(UniMod:35)KPVPDLVPGNFK"),
  "traceR_mod.peptides_unknownMods" = c(FALSE, FALSE, FALSE),
  "traceR_precursor" = c("AAC(UniMod:4)LLPK1", "EGIVEYPR2", "M(UniMod:35)KPVPDLVPGNFK2"),
  "traceR_precursor_unknownMods" = c(FALSE, FALSE, FALSE)
)

# trace all levels in one step
traced_all <- trace_all_levels(
  input_df1 = diann,
  input_df2 = spectronaut,
  analysis_name1 = "DIA-NN",
  analysis_name2 = "Spectronaut",
  filter_unknown_mods = TRUE
)
# Load libraries
library(dplyr)
library(stringr)
library(tibble)

# DIA-NN example data
diann <- tibble::tibble(
  "traceR_proteinGroups" = c("P02768", "P02671", "Q92496", "DummyProt"),
  "traceR_mod.peptides" = c("AAC(UniMod:4)LLPK", "RLEVDIDIK",
   "EGIVEYPR", "ALTDM(DummyModification)PQMK"),
  "traceR_mod.peptides_unknownMods" = c(FALSE, FALSE, FALSE, TRUE),
  "traceR_precursor" = c("AAC(UniMod:4)LLPK1", "RLEVDIDIK2",
   "EGIVEYPR2", "ALTDM(DummyModification)PQMK3" ),
  "traceR_precursor_unknownMods" = c(FALSE, FALSE, FALSE, TRUE)
)
# Spectronaut example data
spectronaut <- tibble::tibble(
  "traceR_proteinGroups" = c("P02768", "Q02985", "P02671"),
  "traceR_mod.peptides" = c("AAC(UniMod:4)LLPK", "EGIVEYPR", "M(UniMod:35)KPVPDLVPGNFK"),
  "traceR_mod.peptides_unknownMods" = c(FALSE, FALSE, FALSE),
  "traceR_precursor" = c("AAC(UniMod:4)LLPK1", "EGIVEYPR2", "M(UniMod:35)KPVPDLVPGNFK2"),
  "traceR_precursor_unknownMods" = c(FALSE, FALSE, FALSE)
)

# trace all levels in one step
traced_all <- trace_all_levels(
  input_df1 = diann,
  input_df2 = spectronaut,
  analysis_name1 = "DIA-NN",
  analysis_name2 = "Spectronaut",
  filter_unknown_mods = TRUE
)

Trace common and unique identifications between different software outputs

Description

Identifications of two input data frames are compared and categorized in unique and common entries.

Usage

trace_level(
  input_df1,
  input_df2,
  analysis_name1 = "input_df1",
  analysis_name2 = "input_df2",
  level = c("precursor", "modified_peptides", "proteinGroups"),
  filter_unknown_mods = TRUE
)
trace_level(
  input_df1,
  input_df2,
  analysis_name1 = "input_df1",
  analysis_name2 = "input_df2",
  level = c("precursor", "modified_peptides", "proteinGroups"),
  filter_unknown_mods = TRUE
)

Arguments

`input_df1`	A tibble with flowTraceR's standardized precursor, modified peptide, or proteinGroup level information - required column depends on chosen `level`.
`input_df2`	A tibble with flowTraceR's standardized precursor, modified peptide, or proteinGroup level information - required column depends on chosen `level`.
`analysis_name1`	output tibble name for input_df1 - default is `"input_df1"`.
`analysis_name2`	output tibble name for input_df2 - default is `"input_df2"`.
`level`	`"precursor", "modified_peptides", "proteinGroups"` - respective level for tracing common vs. unique entries. Default is precursor.
`filter_unknown_mods`	Logical value, default is TRUE. If TRUE, unknown modifications are filtered out - requires "traceR_precursor_unknownMods" or "traceR_mod.peptides_unknownMods" column; depends on chosen `level`.

Details

Based on flowTraceR's standardized output format two software outputs can be compared and categorized into common and unique identifications for a chosen level: precursor, modified peptide or proteinGroup level.

Value

This function returns a list with both original submitted tibbles - input_df1 and input_df2 - including one of the following new columns depending on chosen level :

traceR_traced_precursor - categorization on precursor level in common and unique entries.
traceR_traced_mod.peptides - categorization on modified peptide level in common and unique entries.
traceR_traced_proteinGroups - categorization on proteinGroups level in common and unique entries.

Author(s)

Oliver Kardell

Examples

# Load libraries
library(dplyr)
library(stringr)
library(tibble)

# DIA-NN example data
diann <- tibble::tibble(
  "traceR_proteinGroups" = c("P02768", "P02671", "Q92496", "DummyProt"),
  "traceR_mod.peptides" = c("AAC(UniMod:4)LLPK", "RLEVDIDIK",
   "EGIVEYPR", "ALTDM(DummyModification)PQMK"),
  "traceR_mod.peptides_unknownMods" = c(FALSE, FALSE, FALSE, TRUE),
  "traceR_precursor" = c("AAC(UniMod:4)LLPK1", "RLEVDIDIK2",
   "EGIVEYPR2", "ALTDM(DummyModification)PQMK3" ),
  "traceR_precursor_unknownMods" = c(FALSE, FALSE, FALSE, TRUE)
)

# Spectronaut example data
spectronaut <- tibble::tibble(
  "traceR_proteinGroups" = c("P02768", "Q02985", "P02671"),
  "traceR_mod.peptides" = c("AAC(UniMod:4)LLPK", "EGIVEYPR", "M(UniMod:35)KPVPDLVPGNFK"),
  "traceR_mod.peptides_unknownMods" = c(FALSE, FALSE, FALSE),
  "traceR_precursor" = c("AAC(UniMod:4)LLPK1", "EGIVEYPR2", "M(UniMod:35)KPVPDLVPGNFK2"),
  "traceR_precursor_unknownMods" = c(FALSE, FALSE, FALSE)
)

# trace proteinGroup level
traced_proteinGroups <- trace_level(
  input_df1 = diann,
  input_df2 = spectronaut,
  analysis_name1 = "DIA-NN",
  analysis_name2 = "Spectronaut",
  level = "proteinGroups",
  filter_unknown_mods = TRUE
)

# trace precursor level
traced_pecursor <- trace_level(
  input_df1 = diann,
  input_df2 = spectronaut,
  analysis_name1 = "DIA-NN",
  analysis_name2 = "Spectronaut",
  level = "precursor",
  filter_unknown_mods = TRUE
)
# Load libraries
library(dplyr)
library(stringr)
library(tibble)

# DIA-NN example data
diann <- tibble::tibble(
  "traceR_proteinGroups" = c("P02768", "P02671", "Q92496", "DummyProt"),
  "traceR_mod.peptides" = c("AAC(UniMod:4)LLPK", "RLEVDIDIK",
   "EGIVEYPR", "ALTDM(DummyModification)PQMK"),
  "traceR_mod.peptides_unknownMods" = c(FALSE, FALSE, FALSE, TRUE),
  "traceR_precursor" = c("AAC(UniMod:4)LLPK1", "RLEVDIDIK2",
   "EGIVEYPR2", "ALTDM(DummyModification)PQMK3" ),
  "traceR_precursor_unknownMods" = c(FALSE, FALSE, FALSE, TRUE)
)

# Spectronaut example data
spectronaut <- tibble::tibble(
  "traceR_proteinGroups" = c("P02768", "Q02985", "P02671"),
  "traceR_mod.peptides" = c("AAC(UniMod:4)LLPK", "EGIVEYPR", "M(UniMod:35)KPVPDLVPGNFK"),
  "traceR_mod.peptides_unknownMods" = c(FALSE, FALSE, FALSE),
  "traceR_precursor" = c("AAC(UniMod:4)LLPK1", "EGIVEYPR2", "M(UniMod:35)KPVPDLVPGNFK2"),
  "traceR_precursor_unknownMods" = c(FALSE, FALSE, FALSE)
)

# trace proteinGroup level
traced_proteinGroups <- trace_level(
  input_df1 = diann,
  input_df2 = spectronaut,
  analysis_name1 = "DIA-NN",
  analysis_name2 = "Spectronaut",
  level = "proteinGroups",
  filter_unknown_mods = TRUE
)

# trace precursor level
traced_pecursor <- trace_level(
  input_df1 = diann,
  input_df2 = spectronaut,
  analysis_name1 = "DIA-NN",
  analysis_name2 = "Spectronaut",
  level = "precursor",
  filter_unknown_mods = TRUE
)

Trace unique_common categorization for proteinGroup level

Description

Unique_common categorizations are analyzed on proteinGroup level

Usage

trace_unique_common_pg(
  input_df1,
  input_df2,
  analysis_name1 = "input_df1",
  analysis_name2 = "input_df2",
  string_analysis = FALSE
)
trace_unique_common_pg(
  input_df1,
  input_df2,
  analysis_name1 = "input_df1",
  analysis_name2 = "input_df2",
  string_analysis = FALSE
)

Arguments

`input_df1`	A tibble with flowTraceR's unique_common categorization for the proteinGroup_precursor connection.
`input_df2`	A tibble which is the counter part for input_df1 - which was used to generate the unique_common categorization for the proteinGroup_precursor connection.
`analysis_name1`	String. Appended to input_df1's traceR_proteinGroups column - default is `"input_df1"`.
`analysis_name2`	String. Appended to input_df1's traceR_proteinGroups column - default is `"input_df2"`.
`string_analysis`	Logical value, default is `FALSE`. If TRUE, only keeps proteinGroup identifications of input_df1 in which protein denotations are not present in the counterpart - the proteinGroups of input_df2 - and vice versa.

Details

For each submitted dataframe the unique_common proteinGroup_precursor connection is analyzed to highlight potential differences in proteinGroup denotations for common precursors.

Value

This function returns a tibble with the following columns :

traceR_proteinGroups_input_df1 - proteinGroup denotations of input_df1 for common precursor between input_df1 and input_df2
traceR_precursor - common precursor between input_df1 and input_df2
traceR_proteinGroups_input_df2 - proteinGroup denotations of input_df2 for common precursor between input_df1 and input_df2

Author(s)

Oliver Kardell

Examples

# Load libraries
library(dplyr)
library(stringr)
library(tibble)

# DIA-NN example data
diann <- tibble::tibble(
  "traceR_connected_pg_prec" = c("common_common", "common_unique",
  "unique_common", "unique_common"),
  "traceR_proteinGroups" = c("P02768", "P02671", "Q92496", "P04433"),
  "traceR_precursor" = c("AAC(UniMod:4)LLPK1", "RLEVDIDIK2",
  "EGIVEYPR2", "ASQSVSSYLAWYQQK2"),
)

# Spectronaut example data
spectronaut <- tibble::tibble(
  "traceR_connected_pg_prec" = c("common_common", "common_unique",
  "unique_common", "unique_common"),
  "traceR_proteinGroups" = c("P02768", "P02671", "Q02985", "A0A0A0MRZ8;P04433"),
  "traceR_precursor" = c("AAC(UniMod:4)LLPK1", "M(UniMod:35)KPVPDLVPGNFK2",
  "EGIVEYPR2", "ASQSVSSYLAWYQQK2"),
)

# Find difference in pg denotation
# string_analysis = TRUE
resultA <- trace_unique_common_pg(input_df1 = diann,
 input_df2 = spectronaut,
 analysis_name1 = "DIA-NN",
 analysis_name2 = "Spectronaut",
 string_analysis = TRUE)

# Find difference in pg denotation
# string_analysis = FALSE
# compare with resultA
resultB <- trace_unique_common_pg(input_df1 = diann,
 input_df2 = spectronaut,
 analysis_name1 = "DIA-NN",
 analysis_name2 = "Spectronaut",
 string_analysis = FALSE)
# Load libraries
library(dplyr)
library(stringr)
library(tibble)

# DIA-NN example data
diann <- tibble::tibble(
  "traceR_connected_pg_prec" = c("common_common", "common_unique",
  "unique_common", "unique_common"),
  "traceR_proteinGroups" = c("P02768", "P02671", "Q92496", "P04433"),
  "traceR_precursor" = c("AAC(UniMod:4)LLPK1", "RLEVDIDIK2",
  "EGIVEYPR2", "ASQSVSSYLAWYQQK2"),
)

# Spectronaut example data
spectronaut <- tibble::tibble(
  "traceR_connected_pg_prec" = c("common_common", "common_unique",
  "unique_common", "unique_common"),
  "traceR_proteinGroups" = c("P02768", "P02671", "Q02985", "A0A0A0MRZ8;P04433"),
  "traceR_precursor" = c("AAC(UniMod:4)LLPK1", "M(UniMod:35)KPVPDLVPGNFK2",
  "EGIVEYPR2", "ASQSVSSYLAWYQQK2"),
)

# Find difference in pg denotation
# string_analysis = TRUE
resultA <- trace_unique_common_pg(input_df1 = diann,
 input_df2 = spectronaut,
 analysis_name1 = "DIA-NN",
 analysis_name2 = "Spectronaut",
 string_analysis = TRUE)

# Find difference in pg denotation
# string_analysis = FALSE
# compare with resultA
resultB <- trace_unique_common_pg(input_df1 = diann,
 input_df2 = spectronaut,
 analysis_name1 = "DIA-NN",
 analysis_name2 = "Spectronaut",
 string_analysis = FALSE)

Package 'flowTraceR'

Help Index

Analysis of connected levels

Description

Usage

Arguments

Details

Value

Author(s)

Examples

Analysis of unknown modifications

Description

Usage

Arguments

Details

Value

Author(s)

Examples

Connects traced levels

Description

Usage

Arguments

Details

Value

Author(s)

Examples

Conversion of software specific levels

Description

Usage

Arguments

Details

Value

Author(s)

Examples

Conversion of software specific modified peptide entries

Description

Usage

Arguments

Details

Value

Author(s)

Examples

Conversion of software specific precursor entries

Description

Usage

Arguments

Details

Value

Author(s)

Examples

Conversion of software specific proteinGroups

Description

Usage

Arguments

Details

Value

Author(s)

Examples

flowTraceR: a package for standardization of level information and tracking inter-software differences in bottom-up label-free proteomics

Description

Author(s)

See Also

Create example data

Description

Usage

Arguments

Details

Value

Author(s)

Examples

Check of converted modifications

Description

Usage

Arguments

Details

Value

Author(s)

Examples

Trace common and unique identifications between different software outputs for all levels

Description