TorchProtein

Here, we provide four tutorials that give new users a general picture of TorchProtein from four perspectives: the protein data structure, sequence-based protein property prediction, structure-based protein property prediction, and learning protein representations from unlabeled protein structures.

Tutorial 1 - Protein Data Structure

In this tutorial, we will learn TorchProtein from the following aspects:

How TorchProtein represents the sequence and structure of a protein with a unified data structure;
How to read/write a protein with such a data structure from/to a file;
What operations we can perform to analyze the protein data;
What atom-, residue- and protein-level attributes are incorporated in the data by default;
How we can register customized attributes of proteins.
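
To make the idea of a unified data structure more concrete, here is a minimal, library-independent sketch. It is not TorchProtein's actual class or API: `ToyProtein`, its fields, and its `register` method are hypothetical names chosen for illustration. The sketch only shows how one object might hold both the sequence view and the atom-level view of a protein, and how custom attributes could be attached at the atom or residue level with a length check.

```python
from dataclasses import dataclass, field


@dataclass
class ToyProtein:
    """Hypothetical container for illustration; not TorchProtein's class."""

    residues: list       # one-letter residue codes, one per residue
    atom_names: list     # atom names, one per atom
    atom2residue: list   # index of the owning residue, one per atom
    residue_attrs: dict = field(default_factory=dict)
    atom_attrs: dict = field(default_factory=dict)

    def to_sequence(self):
        """The sequence view is just the residue codes joined together."""
        return "".join(self.residues)

    def register(self, level, name, values):
        """Attach a custom per-atom or per-residue attribute.

        The length check keeps every attribute aligned with the
        atoms or residues it describes.
        """
        if level == "residue":
            expected, store = len(self.residues), self.residue_attrs
        elif level == "atom":
            expected, store = len(self.atom_names), self.atom_attrs
        else:
            raise ValueError(f"unknown level: {level!r}")
        if len(values) != expected:
            raise ValueError(f"expected {expected} values, got {len(values)}")
        store[name] = list(values)


# A two-residue toy example (Met, Lys) with backbone atoms only.
protein = ToyProtein(
    residues=["M", "K"],
    atom_names=["N", "CA", "C", "N", "CA", "C"],
    atom2residue=[0, 0, 0, 1, 1, 1],
)
protein.register("residue", "is_charged", [False, True])
```

Tying each attribute to a level is one way to keep attributes consistent when the protein is sliced or batched; Tutorial 1 describes how TorchProtein itself handles this.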

Tutorial 2 - Sequence-based Protein Property Prediction

In this tutorial, we will learn TorchProtein from the following aspects:

How to fetch a protein sequence dataset and specify the transformation functions we would like to perform on each sample;
How to construct a sequence-based model to extract protein sequence representations;
Five types of protein sequence understanding tasks considered in TorchProtein;
How to solve each type of task by wrapping a protein sequence encoder into a task-specific module;
How to instantiate an engine to conduct training and evaluation.

Tutorial 3 - Structure-based Protein Property Prediction

In this tutorial, we will learn TorchProtein from the following aspects:

How to fetch a protein structure dataset for function prediction and specify the transformation functions applied to each sample;
How to better represent the geometric structures of proteins with various dynamic graph construction methods;
How to construct a superior protein structure encoder;
How to solve the function prediction task by wrapping the structure encoder into a task-specific module;
How to define an engine that accommodates training and evaluation.
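
As a rough illustration of what building a graph from protein geometry can look like, the sketch below connects residues whose coordinates lie within a distance cutoff. It is a plain-Python approximation written for this overview, not TorchProtein's graph construction layers: the function name `spatial_edges` and the parameters `radius` and `min_seq_gap` are assumptions made here for illustration.

```python
import math


def spatial_edges(coords, radius, min_seq_gap=1):
    """Return edges (i, j) with i < j between residues that are close in space.

    coords:      list of (x, y, z) tuples, one per residue (e.g. alpha carbons)
    radius:      distance cutoff, in the same unit as the coordinates
    min_seq_gap: minimum separation along the sequence; values above 1
                 skip pairs that are already close in the sequence
    """
    gap = max(1, min_seq_gap)  # a gap below 1 would create self-loops
    edges = []
    for i in range(len(coords)):
        for j in range(i + gap, len(coords)):
            if math.dist(coords[i], coords[j]) <= radius:
                edges.append((i, j))
    return edges


# Four toy residue positions; the last one folds back towards the first.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (3.8, 3.0, 0.0)]

all_edges = spatial_edges(coords, radius=5.0)
long_range = spatial_edges(coords, radius=5.0, min_seq_gap=2)
```

Because the edges are computed from coordinates rather than stored with the data, they can be rebuilt whenever the cutoff or the structure changes. Tutorial 3 covers the graph construction methods that TorchProtein actually provides.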

Tutorial 4 - Pre-trained Protein Structure Representations

In this tutorial, we will learn TorchProtein from the following aspects:

How to fetch an unlabeled protein structure dataset for pre-training and specify the transformation functions applied to each sample;
How to effectively represent the geometric structures of proteins through dynamic graph construction methods;
How to define a superior protein structure encoder;
How to pre-train the protein structure encoder via two typical self-supervised learning approaches;
How to fine-tune the pre-trained encoder on a structure-based protein function prediction task.

Note. For more details about the interfaces involved in these tutorials, please refer to the documentation.
