Here we summarize the benchmark results from the paper PEER: A Comprehensive and Multi-Task Benchmark for Protein Sequence Understanding. We maintain a leaderboard for each of the 14 protein understanding tasks considered. All benchmark results can be reproduced with the PEER benchmark codebase. We also maintain an integrated leaderboard that compares methods by their mean reciprocal rank (MRR) across tasks. In the future, we plan to accept submissions of new benchmark results for new methods from the community.

Note that all benchmark results reported here are averaged over three runs with seeds 0, 1 and 2; the standard deviation over the three runs is also reported.
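As a minimal sketch of how a "mean ± std" entry can be produced from three seeded runs: the per-seed scores below are hypothetical, and the use of the population standard deviation is an assumption (the sample standard deviation, `statistics.stdev`, would give slightly larger values).

```python
from statistics import mean, pstdev


def summarize_runs(scores, digits=3):
    """Format one metric measured over several seeded runs as 'mean ± std'."""
    # Assumption: population standard deviation; swap in statistics.stdev
    # if the sample standard deviation is wanted instead.
    return f"{mean(scores):.{digits}f} ± {pstdev(scores):.{digits}f}"


# Hypothetical scores for seeds 0, 1 and 2 (not taken from the paper).
scores_by_seed = {0: 0.680, 1: 0.683, 2: 0.686}
print(summarize_runs(list(scores_by_seed.values())))  # 0.683 ± 0.002
```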

Integrated Leaderboard
Protein Function Prediction
Protein Localization Prediction
- Leaderboard for Subcellular Localization Prediction
- Leaderboard for Binary Localization Prediction
Protein Structure Prediction
Protein-Protein Interaction (PPI) Prediction
Protein-Ligand Interaction (PLI) Prediction
- Leaderboard for PLI Affinity Prediction on PDBbind
- Leaderboard for PLI Affinity Prediction on BindingDB

Integrated Leaderboard

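Evaluation metric - Mean Reciprocal Rank (MRR) over all applicable benchmark tasks (the higher, the better). The Ranks column lists each method's rank on every task, in the order the task leaderboards appear below (Fluorescence first, BindingDB last); "/" marks a task that is not applicable to the method.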

Rank	Method	MRR	Ranks: Fluorescence → BindingDB	Reference	External data
1	[MTL] ESM-1b + Contact	0.517	[4, 4, 1, 2, 2, 1, /, 1, 1, 5, 4, 2, 13, 5]	paper	UniRef50 for pre-train; Contact for MTL
2	ESM-1b (fix)	0.401	[17, 3, 12, 14, 1, 5, 2, 2, 2, 1, 1, 19, 4, 15]	paper	UniRef50 for pre-train
3	[MTL] CNN + Contact	0.277	[6, 11, 5, 1, 9, 9, /, 7, 8, 9, 12, 1, 3, 8]	paper	Contact for MTL
4	[MTL] CNN + SSP	0.272	[1, 7, 6, 8, 13, 10, 13, 6, /, 11, 11, 6, 1, 3]	paper	SSP for MTL
5	ESM-1b	0.270	[9, 8, 4, 3, 4, 2, 1, 4, 3, 6, 6, 7, 15, 12]	paper	UniRef50 for pre-train
6	[MTL] ESM-1b + SSP	0.269	[5, 2, 3, 6, 5, 3, 5, 3, /, 4, 3, 4, 7, 4]	paper	UniRef50 for pre-train; SSP for MTL
7	[MTL] ESM-1b + Fold	0.250	[8, 5, 2, 15, 3, 4, 4, /, 4, 2, 5, 3, 8, 9]	paper	UniRef50 for pre-train; Fold for MTL
8	ProtBert	0.231	[7, 1, 9, 12, 6, 6, 3, 5, 5, 3, 7, 5, 16, 11]	paper	BFD for pre-train
9	[MTL] CNN + Fold	0.226	[2, 17, 8, 10, 14, 12, 12, /, 10, 16, 8, 8, 2, 1]	paper	Fold for MTL
10	CNN	0.127	[3, 14, 7, 16, 10, 8, 11, 8, 9, 8, 15, 13, 5, 7]	paper	/
11	ProtBert (fix)	0.121	[19, 6, 11, 18, 8, 11, 7, 9, 12, 14, 2, 17, 11, 17]	paper	BFD for pre-train
12	[MTL] Transformer + Fold	0.116	[11, 9, 14, 11, 11, 15, 14, /, 14, 13, 10, 10, 14, 2]	paper	Fold for MTL
13	LSTM	0.104	[16, 16, 19, 4, 7, 7, 6, 14, 7, 15, 13, 14, 12, 16]	paper	/
14	[MTL] Transformer + SSP	0.091	[10, 10, 16, 9, 12, 17, 10, 15, /, 12, 18, 11, 6, 10]	paper	SSP for MTL
15	Transformer	0.090	[12, 13, 15, 5, 15, 16, 9, 13, 13, 10, 17, 9, 10, 14]	paper	/
16	ResNet	0.084	[15, 19, 17, 13, 17, 13, 8, 12, 6, 19, 9, 18, 9, 13]	paper	/
17	[MTL] Transformer + Contact	0.082	[13, 15, 18, 7, 16, 18, /, 11, 11, 18, 16, 12, 17, 6]	paper	Contact for MTL
18	DDE	0.082	[14, 12, 10, 17, 18, 14, /, 10, /, 7, 14, 15, /, /]	paper	/
19	Moran	0.058	[18, 18, 13, 19, 19, 19, /, 16, /, 17, 19, 16, /, /]	paper	/
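The MRR values in the table above can be recomputed from the rank lists. The sketch below assumes that tasks marked "/" are skipped rather than counted as zero, which is consistent with the reported numbers; the function name is illustrative and not part of the PEER codebase.

```python
def mean_reciprocal_rank(ranks):
    """Average 1/rank over the tasks a method has a rank for."""
    # "/" marks a task without a rank for this method; it is excluded
    # from both the sum and the denominator (assumption).
    applicable = [r for r in ranks if r != "/"]
    return sum(1.0 / r for r in applicable) / len(applicable)


# Rank lists copied from the integrated leaderboard.
esm1b_contact = [4, 4, 1, 2, 2, 1, "/", 1, 1, 5, 4, 2, 13, 5]
esm1b_fix = [17, 3, 12, 14, 1, 5, 2, 2, 2, 1, 1, 19, 4, 15]

print(round(mean_reciprocal_rank(esm1b_contact), 3))  # 0.517
print(round(mean_reciprocal_rank(esm1b_fix), 3))      # 0.401
```

Because reciprocal ranks decay quickly, a method is rewarded far more for a few first places than for many mid-table finishes, which is why ESM-1b (fix) ranks second overall despite several ranks below 10.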

Protein Function Prediction

Leaderboard for Fluorescence Prediction

Task type - Protein-wise Regression
Dataset statistics - #Train: 21,446 #Validation: 5,362 #Test: 27,217
Evaluation metric - Spearman’s Rho on the test set (the higher, the better)
Dataset splitting scheme - Train & Validation: mutants with three or fewer mutations; Test: mutants with four or more mutations.
Description - Models are asked to predict the fitness of green fluorescent protein mutants. The prediction target is a real number indicating the logarithm of fluorescence intensity.
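For reference, Spearman's Rho is the Pearson correlation between the rank-transformed predictions and targets. The following is a dependency-free sketch of that definition (tied values receive the mean of their ranks); the predictions and labels are hypothetical, and the PEER codebase may compute the metric differently.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share one rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(pred, target):
    """Pearson correlation of the ranks of pred and target."""
    rp, rt = average_ranks(pred), average_ranks(target)
    mp, mt = sum(rp) / len(rp), sum(rt) / len(rt)
    cov = sum((a - mp) * (b - mt) for a, b in zip(rp, rt))
    var_p = sum((a - mp) ** 2 for a in rp)
    var_t = sum((b - mt) ** 2 for b in rt)
    return cov / (var_p * var_t) ** 0.5


# Hypothetical log-fluorescence predictions and labels.
pred = [1.2, 3.4, 2.2, 3.9]
target = [1.0, 3.0, 3.5, 3.8]
print(spearman_rho(pred, target))  # 0.8
```

Only the ordering of the predictions matters for this metric, so a model can score well even if its predicted values are on a different scale from the measured fluorescence.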

Rank	Method	Test Spearman’s Rho	Reference	External data	#Params	Hardware
1	[MTL] CNN + SSP	0.683 ± 0.001	paper	SSP for MTL	7,455,748	4 × Tesla V100 (32GB)
2	[MTL] CNN + Fold	0.682 ± 0.001	paper	Fold for MTL	8,677,548	4 × Tesla V100 (32GB)
3	CNN	0.682 ± 0.002	paper	/	6,403,073	4 × Tesla V100 (32GB)
4	[MTL] ESM-1b + Contact	0.681 ± 0.001	paper	UniRef50 for pre-train; Contact for MTL	657,279,416	4 × Tesla V100 (32GB)
5	[MTL] ESM-1b + SSP	0.681 ± 0.002	paper	UniRef50 for pre-train; SSP for MTL	655,643,578	4 × Tesla V100 (32GB)
6	[MTL] CNN + Contact	0.680 ± 0.001	paper	Contact for MTL	8,502,274	4 × Tesla V100 (32GB)
7	ProtBert	0.679 ± 0.001	paper	BFD for pre-train	420,981,761	4 × Tesla V100 (32GB)
8	[MTL] ESM-1b + Fold	0.679 ± 0.001	paper	UniRef50 for pre-train; Fold for MTL	657,170,530	4 × Tesla V100 (32GB)
9	ESM-1b	0.679 ± 0.002	paper	UniRef50 for pre-train	654,000,055	4 × Tesla V100 (32GB)
10	[MTL] Transformer + SSP	0.656 ± 0.002	paper	SSP for MTL	21,810,180	4 × Tesla V100 (32GB)
11	[MTL] Transformer + Fold	0.648 ± 0.004	paper	Fold for MTL	22,421,676	4 × Tesla V100 (32GB)
12	Transformer	0.643 ± 0.005	paper	/	21,545,985	4 × Tesla V100 (32GB)
13	[MTL] Transformer + Contact	0.642 ± 0.017	paper	Contact for MTL	22,071,298	4 × Tesla V100 (32GB)
14	DDE	0.638 ± 0.003	paper	/	468,481	4 × Tesla V100 (32GB)
15	ResNet	0.636 ± 0.021	paper	/	11,300,354	4 × Tesla V100 (32GB)
16	LSTM	0.494 ± 0.071	paper	/	27,080,328	4 × Tesla V100 (32GB)
17	ESM-1b (fix)	0.430 ± 0.002	paper	UniRef50 for pre-train	654,000,055	4 × Tesla V100 (32GB)
18	Moran	0.400 ± 0.001	paper	/	386,561	4 × Tesla V100 (32GB)
19	ProtBert (fix)	0.339 ± 0.003	paper	BFD for pre-train	420,981,761	4 × Tesla V100 (32GB)

Leaderboard for Stability Prediction

Task type - Protein-wise Regression
Dataset statistics - #Train: 53,571 #Validation: 2,512 #Test: 12,851
Evaluation metric - Spearman’s Rho on the test set (the higher, the better)
Dataset splitting scheme - Train & Validation: proteins from four rounds of experimental design; Test: top candidates with single mutations.
Description - Models are asked to predict the stability of proteins in their natural environment. The prediction target is a real number indicating the experimental measurement of stability.

Rank	Method	Test Spearman’s Rho	Reference	External data	#Params	Hardware
1	ProtBert	0.771 ± 0.020	paper	BFD for pre-train	420,981,761	4 × Tesla V100 (32GB)
2	[MTL] ESM-1b + SSP	0.759 ± 0.002	paper	UniRef50 for pre-train; SSP for MTL	655,643,578	4 × Tesla V100 (32GB)
3	ESM-1b (fix)	0.750 ± 0.010	paper	UniRef50 for pre-train	654,000,055	4 × Tesla V100 (32GB)
4	[MTL] ESM-1b + Contact	0.733 ± 0.007	paper	UniRef50 for pre-train; Contact for MTL	657,279,416	4 × Tesla V100 (32GB)
5	[MTL] ESM-1b + Fold	0.728 ± 0.002	paper	UniRef50 for pre-train; Fold for MTL	657,170,530	4 × Tesla V100 (32GB)
6	ProtBert (fix)	0.697 ± 0.013	paper	BFD for pre-train	420,981,761	4 × Tesla V100 (32GB)
7	[MTL] CNN + SSP	0.695 ± 0.016	paper	SSP for MTL	7,455,748	4 × Tesla V100 (32GB)
8	ESM-1b	0.694 ± 0.073	paper	UniRef50 for pre-train	654,000,055	4 × Tesla V100 (32GB)
9	[MTL] Transformer + Fold	0.672 ± 0.010	paper	Fold for MTL	22,421,676	4 × Tesla V100 (32GB)
10	[MTL] Transformer + SSP	0.667 ± 0.063	paper	SSP for MTL	21,810,180	4 × Tesla V100 (32GB)
11	[MTL] CNN + Contact	0.661 ± 0.006	paper	Contact for MTL	8,502,274	4 × Tesla V100 (32GB)
12	DDE	0.652 ± 0.033	paper	/	468,481	4 × Tesla V100 (32GB)
13	Transformer	0.649 ± 0.056	paper	/	21,545,985	4 × Tesla V100 (32GB)
14	CNN	0.637 ± 0.010	paper	/	6,403,073	4 × Tesla V100 (32GB)
15	[MTL] Transformer + Contact	0.620 ± 0.004	paper	Contact for MTL	22,071,298	4 × Tesla V100 (32GB)
16	LSTM	0.533 ± 0.101	paper	/	27,080,328	4 × Tesla V100 (32GB)
17	[MTL] CNN + Fold	0.472 ± 0.170	paper	Fold for MTL	8,677,548	4 × Tesla V100 (32GB)
18	Moran	0.322 ± 0.011	paper	/	386,561	4 × Tesla V100 (32GB)
19	ResNet	0.126 ± 0.094	paper	/	11,300,354	4 × Tesla V100 (32GB)

Leaderboard for Beta-lactamase Activity Prediction

Task type - Protein-wise Regression
Dataset statistics - #Train: 4,158 #Validation: 520 #Test: 520
Evaluation metric - Spearman’s Rho on the test set (the higher, the better)
Dataset splitting scheme - Random split.
Description - Models are asked to predict the activity among first-order mutants of the TEM-1 beta-lactamase protein. The prediction target is the experimentally tested fitness score (a real number) which records the scaled mutation effect for each mutant.

Rank	Method	Test Spearman’s Rho	Reference	External data	#Params	Hardware
1	[MTL] ESM-1b + Contact	0.899 ± 0.001	paper	UniRef50 for pre-train; Contact for MTL	657,279,416	4 × Tesla V100 (32GB)
2	[MTL] ESM-1b + Fold	0.882 ± 0.007	paper	UniRef50 for pre-train; Fold for MTL	657,170,530	4 × Tesla V100 (32GB)
3	[MTL] ESM-1b + SSP	0.881 ± 0.001	paper	UniRef50 for pre-train; SSP for MTL	655,643,578	4 × Tesla V100 (32GB)
4	ESM-1b	0.839 ± 0.053	paper	UniRef50 for pre-train	654,000,055	1 × Tesla V100 (32GB)
5	[MTL] CNN + Contact	0.835 ± 0.009	paper	Contact for MTL	8,502,274	4 × Tesla V100 (32GB)
6	[MTL] CNN + SSP	0.811 ± 0.014	paper	SSP for MTL	7,455,748	4 × Tesla V100 (32GB)
7	CNN	0.781 ± 0.011	paper	/	6,403,073	1 × Tesla V100 (32GB)
8	[MTL] CNN + Fold	0.736 ± 0.012	paper	Fold for MTL	8,677,548	4 × Tesla V100 (32GB)
9	ProtBert	0.731 ± 0.226	paper	BFD for pre-train	420,981,761	4 × Tesla V100 (32GB)
10	DDE	0.623 ± 0.019	paper	/	468,481	4 × Tesla V100 (32GB)
11	ProtBert (fix)	0.616 ± 0.002	paper	BFD for pre-train	420,981,761	4 × Tesla V100 (32GB)
12	ESM-1b (fix)	0.528 ± 0.009	paper	UniRef50 for pre-train	654,000,055	4 × Tesla V100 (32GB)
13	Moran	0.375 ± 0.008	paper	/	386,561	4 × Tesla V100 (32GB)
14	[MTL] Transformer + Fold	0.276 ± 0.029	paper	Fold for MTL	22,421,676	4 × Tesla V100 (32GB)
15	Transformer	0.261 ± 0.015	paper	/	21,545,985	4 × Tesla V100 (32GB)
16	[MTL] Transformer + SSP	0.197 ± 0.017	paper	SSP for MTL	21,810,180	4 × Tesla V100 (32GB)
17	ResNet	0.152 ± 0.029	paper	/	11,300,354	4 × Tesla V100 (32GB)
18	[MTL] Transformer + Contact	0.142 ± 0.063	paper	Contact for MTL	22,071,298	4 × Tesla V100 (32GB)
19	LSTM	0.139 ± 0.051	paper	/	27,080,328	4 × Tesla V100 (32GB)

Leaderboard for Solubility Prediction

Task type - Protein-wise Classification
Dataset statistics - #Train: 62,478 #Validation: 6,942 #Test: 1,999
Evaluation metric - Accuracy on the test set (the higher, the better)
Dataset splitting scheme - Random split; remove redundancy in training and validation sets with 30% sequence identity cutoff against the test set.
Description - Models are required to predict whether a protein is soluble or not (binary classification).
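The redundancy-removal step in the splitting scheme can be pictured as a filter over the training and validation sequences. The sketch below is only illustrative: `crude_identity` uses Python's `difflib` as a stand-in similarity and is not an alignment-based sequence identity, the strict `<` comparison against the cutoff is an assumption, and the sequences are made up.

```python
from difflib import SequenceMatcher


def crude_identity(a, b):
    """Stand-in similarity in [0, 1]; NOT a true alignment-based identity."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def remove_redundancy(candidates, test_set, cutoff=0.30, identity=crude_identity):
    """Keep candidates whose similarity to every test sequence is below cutoff."""
    return [
        seq for seq in candidates
        if all(identity(seq, t) < cutoff for t in test_set)
    ]


# Hypothetical sequences: the first and third candidates are (near-)copies
# of the test sequence and are dropped.
test_set = ["MKTAYIAKQR"]
candidates = ["MKTAYIAKQR", "GGGGGGGGGG", "MKTAYIAKQW"]
print(remove_redundancy(candidates, test_set))  # ['GGGGGGGGGG']
```

Filtering against the test set in this direction keeps the test set fixed while ensuring that no training or validation protein is a close homolog of a test protein.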

Rank	Method	Test Acc	Reference	External data	#Params	Hardware
1	[MTL] CNN + Contact	70.63 ± 0.34	paper	Contact for MTL	8,503,299	4 × Tesla V100 (32GB)
2	[MTL] ESM-1b + Contact	70.46 ± 0.16	paper	UniRef50 for pre-train; Contact for MTL	657,280,697	4 × Tesla V100 (32GB)
3	ESM-1b	70.23 ± 0.75	paper	UniRef50 for pre-train	654,001,336	4 × Tesla V100 (32GB)
4	LSTM	70.18 ± 0.63	paper	/	27,080,969	4 × Tesla V100 (32GB)
5	Transformer	70.12 ± 0.31	paper	/	21,546,498	4 × Tesla V100 (32GB)
6	[MTL] ESM-1b + SSP	70.03 ± 0.15	paper	UniRef50 for pre-train; SSP for MTL	655,644,859	4 × Tesla V100 (32GB)
7	[MTL] Transformer + Contact	70.03 ± 0.42	paper	Contact for MTL	22,071,811	4 × Tesla V100 (32GB)
8	[MTL] CNN + SSP	69.85 ± 0.62	paper	SSP for MTL	7,456,773	4 × Tesla V100 (32GB)
9	[MTL] Transformer + SSP	69.81 ± 0.46	paper	SSP for MTL	21,810,693	4 × Tesla V100 (32GB)
10	[MTL] CNN + Fold	69.23 ± 0.10	paper	Fold for MTL	8,678,573	4 × Tesla V100 (32GB)
11	[MTL] Transformer + Fold	68.85 ± 0.43	paper	Fold for MTL	22,422,189	4 × Tesla V100 (32GB)
12	ProtBert	68.15 ± 0.92	paper	BFD for pre-train	420,982,786	4 × Tesla V100 (32GB)
13	ResNet	67.33 ± 1.46	paper	/	11,300,867	4 × Tesla V100 (32GB)
14	ESM-1b (fix)	67.02 ± 0.40	paper	UniRef50 for pre-train	654,001,336	4 × Tesla V100 (32GB)
15	[MTL] ESM-1b + Fold	64.80 ± 0.49	paper	UniRef50 for pre-train; Fold for MTL	657,171,811	4 × Tesla V100 (32GB)
16	CNN	64.43 ± 0.25	paper	/	6,404,098	4 × Tesla V100 (32GB)
17	DDE	59.77 ± 1.21	paper	/	468,994	4 × Tesla V100 (32GB)
18	ProtBert (fix)	59.17 ± 0.21	paper	BFD for pre-train	420,982,786	4 × Tesla V100 (32GB)
19	Moran	57.73 ± 1.33	paper	/	387,074	4 × Tesla V100 (32GB)

Protein Localization Prediction

Leaderboard for Subcellular Localization Prediction

Task type - Protein-wise Classification
Dataset statistics - #Train: 8,945 #Validation: 2,248 #Test: 2,768
Evaluation metric - Accuracy on the test set (the higher, the better)
Dataset splitting scheme - Random split; remove redundancy in training and validation sets with 30% sequence identity cutoff against the test set.
Description - Models are required to predict where a natural protein is located in the cell. The label denotes one of 10 possible locations.

Rank	Method	Test Acc	Reference	External data	#Params	Hardware
1	ESM-1b (fix)	79.82 ± 0.18	paper	UniRef50 for pre-train	654,011,584	4 × Tesla V100 (32GB)
2	[MTL] ESM-1b + Contact	78.86 ± 0.75	paper	UniRef50 for pre-train; Contact for MTL	657,290,945	4 × Tesla V100 (32GB)
3	[MTL] ESM-1b + Fold	78.43 ± 0.28	paper	UniRef50 for pre-train; Fold for MTL	657,182,059	4 × Tesla V100 (32GB)
4	ESM-1b	78.13 ± 0.49	paper	UniRef50 for pre-train	654,011,584	4 × Tesla V100 (32GB)
5	[MTL] ESM-1b + SSP	78.00 ± 0.34	paper	UniRef50 for pre-train; SSP for MTL	655,655,107	4 × Tesla V100 (32GB)
6	ProtBert	76.53 ± 0.93	paper	BFD for pre-train	420,990,986	4 × Tesla V100 (32GB)
7	LSTM	62.98 ± 0.37	paper	/	27,086,097	4 × Tesla V100 (32GB)
8	ProtBert (fix)	59.44 ± 0.16	paper	BFD for pre-train	420,990,986	4 × Tesla V100 (32GB)
9	[MTL] CNN + Contact	59.07 ± 0.45	paper	Contact for MTL	8,511,499	4 × Tesla V100 (32GB)
10	CNN	58.73 ± 1.05	paper	/	6,412,298	4 × Tesla V100 (32GB)
11	[MTL] Transformer + Fold	56.74 ± 0.29	paper	Fold for MTL	22,426,293	4 × Tesla V100 (32GB)
12	[MTL] Transformer + SSP	56.70 ± 0.16	paper	SSP for MTL	21,814,797	4 × Tesla V100 (32GB)
13	[MTL] CNN + SSP	56.64 ± 0.33	paper	SSP for MTL	7,464,973	4 × Tesla V100 (32GB)
14	[MTL] CNN + Fold	56.54 ± 0.65	paper	Fold for MTL	8,686,773	4 × Tesla V100 (32GB)
15	Transformer	56.02 ± 0.82	paper	/	21,550,602	4 × Tesla V100 (32GB)
16	[MTL] Transformer + Contact	52.92 ± 0.64	paper	Contact for MTL	22,075,915	4 × Tesla V100 (32GB)
17	ResNet	52.30 ± 3.51	paper	/	11,304,971	4 × Tesla V100 (32GB)
18	DDE	49.17 ± 0.40	paper	/	473,098	4 × Tesla V100 (32GB)
19	Moran	31.13 ± 0.47	paper	/	391,178	4 × Tesla V100 (32GB)

Leaderboard for Binary Localization Prediction

Task type - Protein-wise Classification
Dataset statistics - #Train: 5,161 #Validation: 1,727 #Test: 1,746
Evaluation metric - Accuracy on the test set (the higher, the better)
Dataset splitting scheme - Random split; remove redundancy in training and validation sets with 30% sequence identity cutoff against the test set.
Description - Models are asked to predict whether a protein is “membrane-bound” or “soluble” (binary classification).

Rank	Method	Test Acc	Reference	External data	#Params	Hardware
1	[MTL] ESM-1b + Contact	92.50 ± 0.26	paper	UniRef50 for pre-train; Contact for MTL	657,280,697	4 × Tesla V100 (32GB)
2	ESM-1b	92.40 ± 0.35	paper	UniRef50 for pre-train	654,001,336	4 × Tesla V100 (32GB)
3	[MTL] ESM-1b + SSP	92.26 ± 0.20	paper	UniRef50 for pre-train; SSP for MTL	655,644,859	4 × Tesla V100 (32GB)
4	[MTL] ESM-1b + Fold	91.83 ± 0.20	paper	UniRef50 for pre-train; Fold for MTL	657,171,811	4 × Tesla V100 (32GB)
5	ESM-1b (fix)	91.61 ± 0.10	paper	UniRef50 for pre-train	654,001,336	4 × Tesla V100 (32GB)
6	ProtBert	91.32 ± 0.89	paper	BFD for pre-train	420,982,786	4 × Tesla V100 (32GB)
7	LSTM	88.11 ± 0.14	paper	/	27,080,969	4 × Tesla V100 (32GB)
8	CNN	82.67 ± 0.32	paper	/	6,404,098	4 × Tesla V100 (32GB)
9	[MTL] CNN + Contact	82.67 ± 0.72	paper	Contact for MTL	8,503,299	4 × Tesla V100 (32GB)
10	[MTL] CNN + SSP	81.83 ± 0.86	paper	SSP for MTL	7,456,773	4 × Tesla V100 (32GB)
11	ProtBert (fix)	81.54 ± 0.09	paper	BFD for pre-train	420,982,786	4 × Tesla V100 (32GB)
12	[MTL] CNN + Fold	81.14 ± 0.40	paper	Fold for MTL	8,678,573	4 × Tesla V100 (32GB)
13	ResNet	78.99 ± 4.41	paper	/	11,300,867	4 × Tesla V100 (32GB)
14	DDE	77.43 ± 0.42	paper	/	468,994	4 × Tesla V100 (32GB)
15	[MTL] Transformer + Fold	76.27 ± 0.57	paper	Fold for MTL	22,422,189	4 × Tesla V100 (32GB)
16	Transformer	75.74 ± 0.74	paper	/	21,546,498	4 × Tesla V100 (32GB)
17	[MTL] Transformer + SSP	75.20 ± 1.23	paper	SSP for MTL	21,810,693	4 × Tesla V100 (32GB)
18	[MTL] Transformer + Contact	74.98 ± 0.77	paper	Contact for MTL	22,071,811	4 × Tesla V100 (32GB)
19	Moran	55.63 ± 0.85	paper	/	387,074	4 × Tesla V100 (32GB)

Protein Structure Prediction

Leaderboard for Contact Prediction

Task type - Residue-pair Classification
Dataset statistics - #Train: 25,299 #Validation: 224 #Test: 40
Evaluation metric - L/5 Precision (L: protein sequence length) on the test set (the higher, the better)
Dataset splitting scheme - Adopt the splits of ProteinNet; use the data of CASP12 for test.
Description - Models are asked to estimate whether each pair of residues is in contact or not (binary classification).
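To make the L/5 Precision metric concrete: for a protein of length L, the L/5 highest-scoring residue pairs are taken as predicted contacts, and the metric is the fraction of them that are true contacts. The sketch below follows that definition for a single protein; the `min_separation` filter and its default value are assumptions (contact benchmarks commonly ignore pairs that are close in sequence), and the toy matrices are hypothetical.

```python
def precision_at_l_over_5(scores, contacts, min_separation=6):
    """Precision of the top L/5 scored residue pairs for one protein.

    scores[i][j]   -- predicted contact score for residues i and j
    contacts[i][j] -- True if residues i and j are in contact
    min_separation -- assumed minimum sequence gap j - i for a pair to count
    """
    length = len(scores)
    pairs = [
        (scores[i][j], contacts[i][j])
        for i in range(length)
        for j in range(i + min_separation, length)
    ]
    if not pairs:
        return 0.0
    pairs.sort(key=lambda p: p[0], reverse=True)
    top = pairs[: max(1, length // 5)]
    return sum(1 for _, is_contact in top if is_contact) / len(top)


# Toy protein with L = 10, so the top 2 pairs are evaluated.
length = 10
scores = [[0.0] * length for _ in range(length)]
contacts = [[False] * length for _ in range(length)]
scores[0][5], contacts[0][5] = 0.9, True    # confident and correct
scores[1][7], contacts[1][7] = 0.8, False   # confident but wrong
print(precision_at_l_over_5(scores, contacts, min_separation=2))  # 0.5
```

The function returns a fraction in [0, 1]; the leaderboard reports the metric as a percentage.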

Rank	Method	Test L/5 Precision	Reference	External data	#Params	Hardware
1	ESM-1b	45.78 ± 2.73	paper	UniRef50 for pre-train	655,638,455	4 × Tesla V100 (32GB)
2	ESM-1b (fix)	40.37 ± 0.22	paper	UniRef50 for pre-train	655,638,455	4 × Tesla V100 (32GB)
3	ProtBert	39.66 ± 1.21	paper	BFD for pre-train	422,030,337	4 × Tesla V100 (32GB)
4	[MTL] ESM-1b + Fold	35.86 ± 1.27	paper	UniRef50 for pre-train; Fold for MTL	658,808,930	4 × Tesla V100 (32GB)
5	[MTL] ESM-1b + SSP	32.03 ± 12.25	paper	UniRef50 for pre-train; SSP for MTL	657,281,978	4 × Tesla V100 (32GB)
6	LSTM	26.34 ± 0.65	paper	/	29,948,808	4 × Tesla V100 (32GB)
7	ProtBert (fix)	24.35 ± 0.44	paper	BFD for pre-train	422,030,337	4 × Tesla V100 (32GB)
8	ResNet	20.43 ± 0.74	paper	/	11,562,498	4 × Tesla V100 (32GB)
9	Transformer	17.50 ± 0.77	paper	/	21,808,129	4 × Tesla V100 (32GB)
10	[MTL] Transformer + SSP	12.76 ± 1.62	paper	SSP for MTL	22,072,324	4 × Tesla V100 (32GB)
11	CNN	10.00 ± 0.20	paper	/	7,451,649	4 × Tesla V100 (32GB)
12	[MTL] CNN + Fold	5.87 ± 0.21	paper	Fold for MTL	9,726,124	4 × Tesla V100 (32GB)
13	[MTL] CNN + SSP	5.73 ± 0.66	paper	SSP for MTL	8,504,324	4 × Tesla V100 (32GB)
14	[MTL] Transformer + Fold	2.04 ± 0.31	paper	Fold for MTL	22,683,820	4 × Tesla V100 (32GB)

Leaderboard for Fold Classification

Task type - Protein-wise Classification
Dataset statistics - #Train: 12,312 #Validation: 736 #Test: 718
Evaluation metric - Accuracy on the test set (the higher, the better)
Dataset splitting scheme - Adopt data from SCOP 1.75 database; entire superfamilies are held out from training to compose the test set.
Description - Models are required to classify the global structural topology of a protein at the fold level. The label indicates one of 1,195 different folding topologies. Models are expected to detect proteins with similar structures but dissimilar sequences, i.e., to perform remote homology detection.

Rank	Method	Test Acc	Reference	External data	#Params	Hardware
1	[MTL] ESM-1b + Contact	32.10 ± 0.72	paper	UniRef50 for pre-train; Contact for MTL	658,808,930	4 × Tesla V100 (32GB)
2	ESM-1b (fix)	29.95 ± 0.21	paper	UniRef50 for pre-train	655,529,569	4 × Tesla V100 (32GB)
3	[MTL] ESM-1b + SSP	28.63 ± 1.55	paper	UniRef50 for pre-train; SSP for MTL	657,173,092	4 × Tesla V100 (32GB)
4	ESM-1b	28.17 ± 2.05	paper	UniRef50 for pre-train	655,529,569	4 × Tesla V100 (32GB)
5	ProtBert	16.94 ± 0.42	paper	BFD for pre-train	422,205,611	4 × Tesla V100 (32GB)
6	[MTL] CNN + SSP	11.67 ± 0.56	paper	SSP for MTL	8,679,598	4 × Tesla V100 (32GB)
7	[MTL] CNN + Contact	11.07 ± 0.38	paper	Contact for MTL	9,726,124	4 × Tesla V100 (32GB)
8	CNN	10.93 ± 0.35	paper	/	7,626,923	1 × Tesla V100 (32GB)
9	ProtBert (fix)	10.74 ± 0.93	paper	BFD for pre-train	422,205,611	4 × Tesla V100 (32GB)
10	DDE	9.57 ± 0.46	paper	/	1,081,003	4 × Tesla V100 (32GB)
11	[MTL] Transformer + Contact	9.16 ± 0.91	paper	Contact for MTL	22,683,820	4 × Tesla V100 (32GB)
12	ResNet	8.89 ± 1.45	paper	/	11,912,876	4 × Tesla V100 (32GB)
13	Transformer	8.52 ± 0.63	paper	/	22,158,507	4 × Tesla V100 (32GB)
14	LSTM	8.24 ± 1.61	paper	/	27,845,682	4 × Tesla V100 (32GB)
15	[MTL] Transformer + SSP	8.14 ± 0.76	paper	SSP for MTL	22,422,702	4 × Tesla V100 (32GB)
16	Moran	7.10 ± 0.56	paper	/	999,083	4 × Tesla V100 (32GB)

Leaderboard for Secondary Structure Prediction

Task type - Residue-wise Classification
Dataset statistics - #Train: 8,678 #Validation: 2,170 #Test: 513
Evaluation metric - Accuracy on the test set (the higher, the better)
Dataset splitting scheme - Train & Validation: from NetSurfP; Test: CB513 dataset.
Description - Models are asked to predict the secondary structure (i.e., coil, strand or helix) of each residue.

Rank	Method	Test Acc	Reference	External data	#Params	Hardware
1	[MTL] ESM-1b + Contact	83.21 ± 0.32	paper	UniRef50 for pre-train; Contact for MTL	657,281,978	4 × Tesla V100 (32GB)
2	ESM-1b (fix)	83.14 ± 0.10	paper	UniRef50 for pre-train	654,002,617	4 × Tesla V100 (32GB)
3	ESM-1b	82.73 ± 0.21	paper	UniRef50 for pre-train	654,002,617	4 × Tesla V100 (32GB)
4	[MTL] ESM-1b + Fold	82.27 ± 0.23	paper	UniRef50 for pre-train; Fold for MTL	657,173,092	4 × Tesla V100 (32GB)
5	ProtBert	82.18 ± 0.05	paper	BFD for pre-train	420,983,811	4 × Tesla V100 (32GB)
6	ResNet	69.56 ± 0.20	paper	/	11,301,380	4 × Tesla V100 (32GB)
7	LSTM	68.99 ± 0.76	paper	/	28,312,970	4 × Tesla V100 (32GB)
8	[MTL] CNN + Contact	66.13 ± 0.06	paper	Contact for MTL	8,504,324	4 × Tesla V100 (32GB)
9	CNN	66.07 ± 0.06	paper	/	6,405,123	1 × Tesla V100 (32GB)
10	[MTL] CNN + Fold	65.93 ± 0.04	paper	Fold for MTL	8,679,598	4 × Tesla V100 (32GB)
11	[MTL] Transformer + Contact	63.10 ± 0.43	paper	Contact for MTL	22,072,324	4 × Tesla V100 (32GB)
12	ProtBert (fix)	62.51 ± 0.06	paper	BFD for pre-train	420,983,811	4 × Tesla V100 (32GB)
13	Transformer	59.62 ± 0.94	paper	/	21,547,011	4 × Tesla V100 (32GB)
14	[MTL] Transformer + Fold	50.93 ± 0.20	paper	Fold for MTL	22,422,702	4 × Tesla V100 (32GB)

Protein-Protein Interaction (PPI) Prediction

Leaderboard for Yeast PPI Prediction

Task type - Protein-pair Classification
Dataset statistics - #Train: 1,668 #Validation: 131 #Test: 373
Evaluation metric - Accuracy on the test set (the higher, the better)
Dataset splitting scheme - Random split; remove redundancy in training and validation sets with 40% sequence identity cutoff against the test set.
Description - Models are asked to predict whether two yeast proteins interact or not (binary classification).

Rank	Method	Test Acc	Reference	External data	#Params	Hardware
1	ESM-1b (fix)	66.07 ± 0.58	paper	UniRef50 for pre-train	655,639,736	4 × Tesla V100 (32GB)
2	[MTL] ESM-1b + Fold	64.76 ± 1.42	paper	UniRef50 for pre-train; Fold for MTL	658,810,211	4 × Tesla V100 (32GB)
3	ProtBert	63.72 ± 2.80	paper	BFD for pre-train	422,031,362	4 × Tesla V100 (32GB)
4	[MTL] ESM-1b + SSP	62.06 ± 5.98	paper	UniRef50 for pre-train; SSP for MTL	657,283,259	4 × Tesla V100 (32GB)
5	[MTL] ESM-1b + Contact	58.50 ± 2.15	paper	UniRef50 for pre-train; Contact for MTL	658,919,097	4 × Tesla V100 (32GB)
6	ESM-1b	57.00 ± 6.38	paper	UniRef50 for pre-train	655,639,736	4 × Tesla V100 (32GB)
7	DDE	55.83 ± 3.13	paper	/	731,138	4 × Tesla V100 (32GB)
8	CNN	55.07 ± 0.02	paper	/	7,452,674	1 × Tesla V100 (32GB)
9	[MTL] CNN + Contact	54.50 ± 1.61	paper	Contact for MTL	9,551,875	4 × Tesla V100 (32GB)
10	Transformer	54.12 ± 1.27	paper	/	21,808,642	4 × Tesla V100 (32GB)
11	[MTL] CNN + SSP	54.12 ± 2.87	paper	SSP for MTL	8,505,349	4 × Tesla V100 (32GB)
12	[MTL] Transformer + SSP	54.00 ± 1.17	paper	SSP for MTL	22,072,837	4 × Tesla V100 (32GB)
13	[MTL] Transformer + Fold	54.00 ± 2.58	paper	Fold for MTL	22,684,333	4 × Tesla V100 (32GB)
14	ProtBert (fix)	53.87 ± 0.38	paper	BFD for pre-train	422,031,362	4 × Tesla V100 (32GB)
15	LSTM	53.62 ± 2.72	paper	/	27,490,569	4 × Tesla V100 (32GB)
16	[MTL] CNN + Fold	53.28 ± 1.91	paper	Fold for MTL	9,727,149	4 × Tesla V100 (32GB)
17	Moran	53.00 ± 0.50	paper	/	649,218	4 × Tesla V100 (32GB)
18	[MTL] Transformer + Contact	52.86 ± 1.15	paper	Contact for MTL	22,333,955	4 × Tesla V100 (32GB)
19	ResNet	48.91 ± 1.78	paper	/	11,563,011	4 × Tesla V100 (32GB)

Leaderboard for Human PPI Prediction

Task type - Protein-pair Classification
Dataset statistics - #Train: 6,844 #Validation: 277 #Test: 227
Evaluation metric - Accuracy on the test set (the higher, the better)
Dataset splitting scheme - Random split; remove redundancy in training and validation sets with 40% sequence identity cutoff against the test set.
Description - Models are asked to predict whether two human proteins interact or not (binary classification).

Rank	Method	Test Acc	Reference	External data	#Params	Hardware
1	ESM-1b (fix)	88.06 ± 0.24	paper	UniRef50 for pre-train	655,639,736	4 × Tesla V100 (32GB)
2	ProtBert (fix)	83.61 ± 1.34	paper	BFD for pre-train	422,031,362	4 × Tesla V100 (32GB)
3	[MTL] ESM-1b + SSP	83.00 ± 0.88	paper	UniRef50 for pre-train; SSP for MTL	657,283,259	4 × Tesla V100 (32GB)
4	[MTL] ESM-1b + Contact	81.66 ± 2.88	paper	UniRef50 for pre-train; Contact for MTL	658,919,097	4 × Tesla V100 (32GB)
5	[MTL] ESM-1b + Fold	80.28 ± 1.27	paper	UniRef50 for pre-train; Fold for MTL	658,810,211	4 × Tesla V100 (32GB)
6	ESM-1b	78.17 ± 2.91	paper	UniRef50 for pre-train	655,639,736	4 × Tesla V100 (32GB)
7	ProtBert	77.32 ± 1.10	paper	BFD for pre-train	422,031,362	4 × Tesla V100 (32GB)
8	[MTL] CNN + Fold	69.03 ± 2.68	paper	Fold for MTL	9,727,149	4 × Tesla V100 (32GB)
9	ResNet	68.61 ± 3.78	paper	/	11,563,011	4 × Tesla V100 (32GB)
10	[MTL] Transformer + Fold	67.33 ± 2.68	paper	Fold for MTL	22,684,333	4 × Tesla V100 (32GB)
11	[MTL] CNN + SSP	66.39 ± 0.86	paper	SSP for MTL	8,505,349	4 × Tesla V100 (32GB)
12	[MTL] CNN + Contact	65.10 ± 2.26	paper	Contact for MTL	9,551,875	4 × Tesla V100 (32GB)
13	LSTM	63.75 ± 5.12	paper	/	27,490,569	4 × Tesla V100 (32GB)
14	DDE	62.77 ± 2.30	paper	/	731,138	4 × Tesla V100 (32GB)
15	CNN	62.60 ± 1.67	paper	/	7,452,674	1 × Tesla V100 (32GB)
16	[MTL] Transformer + Contact	60.76 ± 6.87	paper	Contact for MTL	22,333,955	4 × Tesla V100 (32GB)
17	Transformer	59.58 ± 2.09	paper	/	21,808,642	4 × Tesla V100 (32GB)
18	[MTL] Transformer + SSP	54.80 ± 2.06	paper	SSP for MTL	22,072,837	4 × Tesla V100 (32GB)
19	Moran	54.67 ± 4.43	paper	/	649,218	4 × Tesla V100 (32GB)

Leaderboard for PPI Affinity Prediction

Task type - Protein-pair Regression
Dataset statistics - #Train: 2,127 #Validation: 212 #Test: 343
Evaluation metric - RMSE on the test set (the lower, the better)
Dataset splitting scheme - Train: wild-type complexes as well as mutants with at most 2 mutations; Validation: mutants with 3 or 4 mutations; Test: mutants with more than 4 mutations.
Description - Models are required to predict the binding affinity between two proteins, measured by pKd (a real number). This task performs evaluation under a multi-round protein binder design scenario.
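The splitting scheme above assigns each complex to a split purely by its mutation count. A minimal sketch of that rule, assuming each sample carries a hypothetical `num_mutations` field in which 0 denotes a wild-type complex:

```python
def split_by_mutation_count(samples):
    """Assign samples to train/valid/test from their number of mutations."""
    splits = {"train": [], "valid": [], "test": []}
    for sample in samples:
        n = sample["num_mutations"]  # hypothetical field; 0 means wild type
        if n <= 2:
            splits["train"].append(sample)   # wild type and at most 2 mutations
        elif n <= 4:
            splits["valid"].append(sample)   # 3 or 4 mutations
        else:
            splits["test"].append(sample)    # more than 4 mutations
    return splits


# Hypothetical samples with made-up affinities.
samples = [
    {"id": "wt", "num_mutations": 0, "pKd": 8.1},
    {"id": "m2", "num_mutations": 2, "pKd": 7.6},
    {"id": "m3", "num_mutations": 3, "pKd": 7.9},
    {"id": "m6", "num_mutations": 6, "pKd": 9.0},
]
splits = split_by_mutation_count(samples)
print({name: [s["id"] for s in group] for name, group in splits.items()})
# {'train': ['wt', 'm2'], 'valid': ['m3'], 'test': ['m6']}
```

Splitting this way mimics successive design rounds: the model is trained on lightly mutated binders and evaluated on more heavily mutated ones.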

Rank	Method	Test RMSE	Reference	External data	#Params	Hardware
1	[MTL] CNN + Contact	1.732 ± 0.044	paper	Contact for MTL	9,550,850	4 × Tesla V100 (32GB)
2	[MTL] ESM-1b + Contact	1.893 ± 0.064	paper	UniRef50 for pre-train; Contact for MTL	658,917,816	4 × Tesla V100 (32GB)
3	[MTL] ESM-1b + Fold	2.002 ± 0.065	paper	UniRef50 for pre-train; Fold for MTL	658,808,930	4 × Tesla V100 (32GB)
4	[MTL] ESM-1b + SSP	2.031 ± 0.031	paper	UniRef50 for pre-train; SSP for MTL	657,281,978	4 × Tesla V100 (32GB)
5	ProtBert	2.195 ± 0.073	paper	BFD for pre-train	422,030,337	4 × Tesla V100 (32GB)
6	[MTL] CNN + SSP	2.270 ± 0.041	paper	SSP for MTL	8,504,324	4 × Tesla V100 (32GB)
7	ESM-1b	2.281 ± 0.250	paper	UniRef50 for pre-train	655,638,455	4 × Tesla V100 (32GB)
8	[MTL] CNN + Fold	2.392 ± 0.041	paper	Fold for MTL	9,726,124	4 × Tesla V100 (32GB)
9	Transformer	2.499 ± 0.156	paper	/	21,808,129	4 × Tesla V100 (32GB)
10	[MTL] Transformer + Fold	2.524 ± 0.146	paper	Fold for MTL	22,683,820	4 × Tesla V100 (32GB)
11	[MTL] Transformer + SSP	2.651 ± 0.034	paper	SSP for MTL	22,072,324	4 × Tesla V100 (32GB)
12	[MTL] Transformer + Contact	2.733 ± 0.126	paper	Contact for MTL	22,333,442	4 × Tesla V100 (32GB)
13	CNN	2.796 ± 0.071	paper	/	7,451,649	1 × Tesla V100 (32GB)
14	LSTM	2.853 ± 0.124	paper	/	27,489,928	4 × Tesla V100 (32GB)
15	DDE	2.908 ± 0.043	paper	/	730,625	4 × Tesla V100 (32GB)
16	Moran	2.984 ± 0.026	paper	/	648,705	4 × Tesla V100 (32GB)
17	ProtBert (fix)	2.996 ± 0.462	paper	BFD for pre-train	422,030,337	4 × Tesla V100 (32GB)
18	ResNet	3.005 ± 0.244	paper	/	11,562,498	4 × Tesla V100 (32GB)
19	ESM-1b (fix)	3.031 ± 0.014	paper	UniRef50 for pre-train	655,638,455	4 × Tesla V100 (32GB)

Protein-Ligand Interaction (PLI) Prediction

Leaderboard for PLI Affinity Prediction on PDBbind

Task type - Protein-ligand Regression
Dataset statistics - #Train: 16,436 #Validation: 937 #Test: 285
Evaluation metric - RMSE on the test set (the lower, the better)
Dataset splitting scheme - Random split; remove redundancy in training and validation sets with 90% sequence identity cutoff against the test set.
Description - Models are asked to predict the binding affinity between a protein and a ligand, measured by pKd (a real number).
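As a reminder of what the metric measures, RMSE is the square root of the mean squared difference between predicted and measured affinities. A minimal sketch with hypothetical pKd values:

```python
import math


def rmse(pred, target):
    """Root mean squared error between predictions and targets."""
    squared_errors = [(p - t) ** 2 for p, t in zip(pred, target)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical predicted and measured pKd values.
pred = [6.0, 7.5, 5.0]
target = [6.5, 7.0, 4.0]
print(round(rmse(pred, target), 3))  # 0.707
```

Since pKd is on a log scale, an RMSE of about 1.3 corresponds to a typical error of a little more than one order of magnitude in the dissociation constant.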

Rank	Method	Test RMSE	Reference	External data	#Params	Hardware
1	[MTL] CNN + SSP	1.295 ± 0.030	paper	SSP for MTL	8,984,068	4 × Tesla V100 (32GB)
2	[MTL] CNN + Fold	1.316 ± 0.064	paper	Fold for MTL	10,205,868	4 × Tesla V100 (32GB)
3	[MTL] CNN + Contact	1.328 ± 0.033	paper	Contact for MTL	10,030,594	4 × Tesla V100 (32GB)
4	ESM-1b (fix)	1.368 ± 0.076	paper	UniRef50 for pre-train	655,790,519	4 × Tesla V100 (32GB)
5	CNN	1.376 ± 0.008	paper	/	7,931,393	1 × Tesla V100 (32GB)
6	[MTL] Transformer + SSP	1.387 ± 0.019	paper	SSP for MTL	22,814,212	4 × Tesla V100 (32GB)
7	[MTL] ESM-1b + SSP	1.419 ± 0.026	paper	UniRef50 for pre-train; SSP for MTL	657,434,042	4 × Tesla V100 (32GB)
8	[MTL] ESM-1b + Fold	1.435 ± 0.015	paper	UniRef50 for pre-train; Fold for MTL	658,960,994	4 × Tesla V100 (32GB)
9	ResNet	1.441 ± 0.064	paper	/	12,304,386	4 × Tesla V100 (32GB)
10	Transformer	1.455 ± 0.070	paper	/	22,550,017	4 × Tesla V100 (32GB)
11	ProtBert (fix)	1.457 ± 0.024	paper	BFD for pre-train	422,510,081	4 × Tesla V100 (32GB)
12	LSTM	1.457 ± 0.131	paper	/	28,215,432	4 × Tesla V100 (32GB)
13	[MTL] ESM-1b + Contact	1.458 ± 0.003	paper	UniRef50 for pre-train; Contact for MTL	659,069,880	4 × Tesla V100 (32GB)
14	[MTL] Transformer + Fold	1.531 ± 0.181	paper	Fold for MTL	23,425,708	4 × Tesla V100 (32GB)
15	ESM-1b	1.559 ± 0.164	paper	UniRef50 for pre-train	655,790,519	4 × Tesla V100 (32GB)
16	ProtBert	1.562 ± 0.072	paper	BFD for pre-train	422,510,081	4 × Tesla V100 (32GB)
17	[MTL] Transformer + Contact	1.574 ± 0.215	paper	Contact for MTL	23,075,330	4 × Tesla V100 (32GB)

Leaderboard for PLI Affinity Prediction on BindingDB

Task type - Protein-ligand Regression
Dataset statistics - #Train: 7,900 #Validation: 878 #Test: 5,230
Evaluation metric - RMSE on the test set (the lower, the better)
Dataset splitting scheme - Four protein classes (ER, GPCR, ion channels and receptor tyrosine kinases) are held out from training and validation for generalization test.
Description - Models are asked to predict the binding affinity between a protein and a ligand, measured by pKd (a real number).
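The held-out-class split described above can be sketched as follows. The `protein_class` field and the exact class label strings are hypothetical; only the four class names come from the splitting scheme.

```python
# Class labels are illustrative spellings of the four held-out classes.
HELD_OUT_CLASSES = frozenset(
    {"ER", "GPCR", "ion channel", "receptor tyrosine kinase"}
)


def split_by_protein_class(samples, held_out=HELD_OUT_CLASSES):
    """Return (train_and_valid, test); test holds only held-out classes."""
    train_and_valid = [s for s in samples if s["protein_class"] not in held_out]
    test = [s for s in samples if s["protein_class"] in held_out]
    return train_and_valid, test


# Hypothetical samples.
samples = [
    {"id": "p1", "protein_class": "protease"},
    {"id": "p2", "protein_class": "GPCR"},
    {"id": "p3", "protein_class": "ER"},
]
train_and_valid, test = split_by_protein_class(samples)
print([s["id"] for s in train_and_valid], [s["id"] for s in test])
# ['p1'] ['p2', 'p3']
```

Because no protein from the held-out classes is seen during training or validation, the test RMSE reflects generalization to unseen protein classes rather than interpolation within known ones.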

Rank	Method	Test RMSE	Reference	External data	#Params	Hardware
1	[MTL] CNN + Fold	1.462 ± 0.044	paper	Fold for MTL	10,205,868	4 × Tesla V100 (32GB)
2	[MTL] Transformer + Fold	1.464 ± 0.007	paper	Fold for MTL	23,425,708	4 × Tesla V100 (32GB)
3	[MTL] CNN + SSP	1.481 ± 0.036	paper	SSP for MTL	8,984,068	4 × Tesla V100 (32GB)
4	[MTL] ESM-1b + SSP	1.482 ± 0.014	paper	UniRef50 for pre-train; SSP for MTL	657,434,042	4 × Tesla V100 (32GB)
5	[MTL] ESM-1b + Contact	1.490 ± 0.033	paper	UniRef50 for pre-train; Contact for MTL	659,069,880	4 × Tesla V100 (32GB)
6	[MTL] Transformer + Contact	1.490 ± 0.058	paper	Contact for MTL	23,075,330	4 × Tesla V100 (32GB)
7	CNN	1.497 ± 0.022	paper	/	7,931,393	4 × Tesla V100 (32GB)
8	[MTL] CNN + Contact	1.501 ± 0.035	paper	Contact for MTL	10,030,594	4 × Tesla V100 (32GB)
9	[MTL] ESM-1b + Fold	1.511 ± 0.017	paper	UniRef50 for pre-train; Fold for MTL	658,960,994	4 × Tesla V100 (32GB)
10	[MTL] Transformer + SSP	1.519 ± 0.050	paper	SSP for MTL	22,814,212	4 × Tesla V100 (32GB)
11	ProtBert	1.549 ± 0.019	paper	BFD for pre-train	422,510,081	4 × Tesla V100 (32GB)
12	ESM-1b	1.556 ± 0.047	paper	UniRef50 for pre-train	655,790,519	4 × Tesla V100 (32GB)
13	ResNet	1.565 ± 0.033	paper	/	12,304,386	4 × Tesla V100 (32GB)
14	Transformer	1.566 ± 0.052	paper	/	22,550,017	4 × Tesla V100 (32GB)
15	ESM-1b (fix)	1.571 ± 0.032	paper	UniRef50 for pre-train	655,790,519	4 × Tesla V100 (32GB)
16	LSTM	1.572 ± 0.022	paper	/	28,215,432	4 × Tesla V100 (32GB)
17	ProtBert (fix)	1.649 ± 0.022	paper	BFD for pre-train	422,510,081	4 × Tesla V100 (32GB)

Benchmark for Protein Sequence Understanding (PEER)