Welcome to the TAP project

Click here to download the published dataset & raw datasets

Click here to search the published dataset

Click here for the GenePro - Cytoscape Web-Start

Click here to search the raw MS data*

The Yeast TAP Project aims to elucidate the entire network of protein-protein interactions (PPIs) in a model eukaryotic organism, the yeast Saccharomyces cerevisiae. Our principal approach is based on the highly effective tandem affinity purification (TAP) method developed by Seraphin and colleagues (Rigaut et al. Nature Biotech. 1999) to isolate native protein complexes to virtual homogeneity. As an entry point for mapping the interactome more completely, a genome-wide set of strains was constructed, each bearing an in-frame insertion introduced by homologous recombination at the 3' end of a predicted ORF. Each insertion encoded a TAP tag and a selectable marker, resulting in endogenous expression of the full-length protein fused at its C-terminus to a calmodulin binding peptide (CBP), a tobacco etch virus (TEV) protease cleavage site, and two IgG-binding domains of Staphylococcus aureus protein A. The tagged baits and their stably interacting partners were purified from 4-litre yeast cultures under conditions that retained native PPIs, and the identities of the co-purifying proteins (preys) were determined in two complementary ways. First, a portion of the isolated complexes was electrophoresed on an SDS polyacrylamide gel and stained with silver, and the visible bands were excised and identified by trypsin digestion and peptide mass fingerprinting using MALDI-TOF MS.

In parallel, another aliquot of each purified protein preparation was digested in solution, and the peptides were separated and sequenced by data-dependent liquid chromatography-tandem mass spectrometry (LC-MS/MS). Because the samples were sufficiently pure, this approach generally identified all the constituent interacting polypeptides, including low-abundance trace components.

Purifications were attempted for 4562 different proteins, including all predicted non-membrane proteins. Of these, 2357 were successful, meaning that at least one protein was identified (in 1613 cases by MALDI-TOF MS and in 2001 cases by LC-MS/MS) that was not present in parallel purifications from another tagged strain or an untagged control strain. Our dataset comprises an extended network of several thousand protein-protein interactions.
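The counts above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the project's own analysis. It assumes that every one of the 2357 successful purifications was scored as successful by at least one of the two methods, so that the overlap follows from inclusion-exclusion. All variable names are illustrative.

```python
# Counts quoted in the text above.
attempted = 4562    # proteins for which purification was attempted
successful = 2357   # purifications with at least one specific prey
by_maldi = 1613     # successful cases identified by MALDI-TOF MS
by_lcms = 2001      # successful cases identified by LC-MS/MS

# Fraction of attempted purifications that succeeded.
success_rate = successful / attempted

# Assumption: 'successful' is the union of the two method-specific sets,
# so inclusion-exclusion gives the number supported by both methods.
both = by_maldi + by_lcms - successful
maldi_only = by_maldi - both
lcms_only = by_lcms - both

print(f"success rate: {success_rate:.1%}")
print(f"both methods: {both}")
print(f"MALDI-TOF MS only: {maldi_only}")
print(f"LC-MS/MS only: {lcms_only}")
```

Under that assumption, roughly half of the attempted purifications succeeded, and about half of the successful ones were supported by both identification methods.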

Both the quality of the MS spectra used for protein identification and the approximate stoichiometry of the interacting protein partners can be evaluated using this comprehensive, publicly accessible database, which reports the supporting experimental evidence for putative protein interactions. A series of linked pages provides the corresponding primary mass spectral data, suitably marked up, along with the database search algorithm scores, descriptive protein information, and, where available, annotated gel images.

The data presented in this database correspond to the publication by Krogan et al. in the upcoming issue of Nature. To view or download the supplementary sections of that article, or to manually search through the raw spectrometry data, please follow the links below.

*Note: the following link provides access to the raw mass spectrometry database search results and gel images, which represent data prior to any machine learning-based PPI confidence score assignment or clustering-based complex prediction. Much of the information within this site may be cryptic to those unfamiliar with mass spectrometry-based search results and proteomic data interpretation.