Here, we provide four tutorials to show new users a general picture of TorchProtein from four perspectives, the protein data structure, the solution to sequence-based protein property prediction, the solution to structure-based property prediction and how to learn protein representations from unlabeled protein structures.
Tutorial 1 - Protein Data Structure
In this tutorial, we will learn TorchProtein from following aspects:
- How TorchProtein represents the sequence and structure of a protein with a unified data structure;
- How to read/write a protein with such data structure from/to a file;
- What operations we can perform to analyze the protein data;
- What atom-, residue- and protein-level attributes are incorporated in the data by default;
- How we can register customized attributes of proteins.
Tutorial 2 - Sequence-based Protein Property Prediction
In this tutorial, we will learn TorchProtein from following aspects:
- How to fetch a protein sequence dataset and specify the transformation functions we would to perform on each sample;
- How to construct a sequence-based model to extract protein sequence representations;
- Five types of protein sequence understanding tasks considered in TorchProtein;
- How can we solve each type of tasks by wraping a protein sequence encoder into a task-specific module;
- How to instantiate an engine to conduct training and evaluation.
Tutorial 3 - Structure-based Protein Property Prediction
In this tutorial, we will learn TorchProtein from following aspects:
- How to fetch a protein structure dataset for function prediction and specify the transformation functions applied on each sample;
- How to better represent the geometric structures of proteins with various dynamic graph construction methods;
- How to construct a superior protein structure encoder;
- How can we solve the function prediction task by wraping the structure encoder into a task-specific module;
- How to define an engine that accommodates training and evaluation.
Tutorial 4 - Pre-trained Protein Structure Representations
In this tutorial, we will learn TorchProtein from following aspects:
- How to fetch an unlabeled protein structure dataset for pre-training and specify the transformation functions applied on each sample;
- Effectively representing the geometric structures of proteins through dynamic graph construction methods;
- The definition of a superior protein structure encoder;
- How to pre-train the protein structure encoder via two typical self-supervised learning approaches;
- How to fine-tune the pre-trained encoder on a structure-based protein function prediction task.
Note. For more details about the interfaces involved in these tutorials, please refer to the document.