Global compRehensive Atlas of Peptide and Protein Abundance

Study Code / Acronym

GRAPPA

Award Number

BB/T019670/1

Programme

Research Grant

Status / Stage

Active

Dates

1 March 2021 - 29 February 2024

Duration (calculated)

02 years 11 months

Funder(s)

BBSRC (UKRI)

Funding Amount

£671,803.00

Funder/Grant study page

BBSRC UKRI

Contracted Centre

EMBL - European Bioinformatics Institute

Principal Investigator

Dr Juan Antonio Vizcaino

PI Contact

juan@ebi.ac.uk

PI ORCID

0000-0002-3905-4335

WHO Categories

Understanding Underlying Disease

Disease Type

Dementia (Unspecified)

CPEC Review Info

Reference ID	687
Researcher	Reside Team
Published	07/07/2023

Abstract

The world-leading PRIDE database now contains >14,000 proteomics datasets. All of them contain raw mass spectrometry (MS) data and some contain standardised lists of protein identifications, but currently none contain quantitative data expressed in a standard format. As such, there is vast untapped potential for quantitative data re-use by the majority of research groups, who do not have the capability to re-process datasets themselves. In this project, we will develop robust, open, cloud-based data analysis pipelines that will be used to process hundreds of publicly available datasets, using standardised data processing and normalisation protocols. All datasets will be made available within a new portal, PRIDE Quant, to support computational users, and will be passed to the Expression Atlas database to provide a biologist-friendly view of the data. Data processing will largely focus on human samples, for which the highest data volumes exist, including both “baseline” datasets (e.g. to provide cell line or tissue/organ-level estimates of protein abundance) and “differential” expression datasets for various diseases including cancer, dementia, diabetes and major infectious diseases. We will develop several exemplar applications of the data, including displays showing correlations between gene and protein expression for matched samples, generation of co-expression networks from proteomics data, and generation of vast maps of peptide-level abundance to support new research in proteome bioinformatics.

Research In Dementia Mapping

RESIDE