Global compRehensive Atlas of Peptide and Protein Abundance
Study Code / Acronym | GRAPPA |
---|---|
Award Number | BB/T019557/1 |
Programme | Research Grant |
Status / Stage | Active |
Dates | 1 August 2020 - 31 July 2023 |
Duration (calculated) | 02 years 11 months |
Funder(s) | BBSRC (UKRI) |
Funding Amount | £328,372.00 |
Funder/Grant study page | BBSRC UKRI |
Contracted Centre | University of Liverpool |
Principal Investigator | Professor Andrew Jones |
PI Contact | andrew.jones.3@city.ac.uk |
PI ORCID | 0000-0001-6118-9327 |
WHO Categories | Understanding Underlying Disease |
Disease Type | Dementia (Unspecified) |
CPEC Review Info
Reference ID | 685 |
---|---|
Researcher | Reside Team |
Published | 07/07/2023 |
Data
Study Code / Acronym | GRAPPA |
---|---|
Award Number | BB/T019557/1 |
Status / Stage | Active |
Start Date | 20200801 |
End Date | 20230731 |
Duration (calculated) | 02 years 11 months |
Funder/Grant study page | BBSRC UKRI |
Contracted Centre | University of Liverpool |
Funding Amount | £328,372.00 |
Abstract
The world-leading PRIDE database now contains more than 14,000 proteomics datasets. All of them contain raw mass spectrometry (MS) data and some contain standardised lists of protein identifications, but currently none contain quantitative data expressed in a standard format. As a result, there is vast untapped potential for re-use of quantitative data by the majority of research groups, who do not have the capability to re-process datasets themselves. In this project, we will develop robust, open, cloud-based data analysis pipelines that will be used to process hundreds of publicly available datasets using standardised data processing and normalisation protocols. All datasets will be made available within a new portal, PRIDE Quant, to support computational users, and will be passed to the Expression Atlas database to provide a biologist-friendly view of the data. Data processing will largely focus on human samples, for which the highest data volumes exist, including both “baseline” datasets, e.g. to provide cell line or tissue/organ-level estimates of protein abundance, and “differential” expression datasets for various diseases including cancer, dementia, diabetes and major infectious diseases. We will develop several exemplar applications of the data, including displays showing correlations between gene and protein expression for matched samples, generation of co-expression networks from proteomics data, and generation of vast maps of peptide-level abundance to support new research in proteome bioinformatics.
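As an illustration of the gene/protein correlation application mentioned in the abstract, the following is a minimal sketch and not the project's pipeline. It assumes abundances have already been normalised (for example log-transformed) and are held as plain dictionaries keyed by gene and sample. The function names, gene labels, sample labels and values are all hypothetical, and Pearson correlation is used only as one plausible choice of measure.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant series
    return cov / sqrt(var_x * var_y)


def gene_protein_correlation(rna, protein, min_samples=3):
    """Correlate RNA and protein abundance per gene across matched samples.

    Both inputs map gene -> {sample: abundance}. Only genes present in both
    inputs, and only samples measured in both, are used.
    """
    result = {}
    for gene in rna.keys() & protein.keys():
        shared = sorted(rna[gene].keys() & protein[gene].keys())
        if len(shared) < min_samples:
            continue  # too few matched samples for a meaningful estimate
        result[gene] = pearson(
            [rna[gene][s] for s in shared],
            [protein[gene][s] for s in shared],
        )
    return result


# Hypothetical toy data: two genes measured in three matched samples.
rna = {
    "GENE_A": {"s1": 1.0, "s2": 2.0, "s3": 3.0},
    "GENE_B": {"s1": 3.0, "s2": 2.0, "s3": 1.0},
}
protein = {
    "GENE_A": {"s1": 10.0, "s2": 20.0, "s3": 30.0},
    "GENE_B": {"s1": 1.0, "s2": 2.0, "s3": 3.0},
}
correlations = gene_protein_correlation(rna, protein)
```

In this toy input, `GENE_A` rises in both data types while `GENE_B` moves in opposite directions, so the two genes sit at opposite ends of the correlation range. A rank-based measure such as Spearman correlation may be preferable for real abundance data, which is often skewed.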