- Enabling the infrastructure for use
- Reading in a chain from PDB file and displaying it
- Generating an arbitrary chain of some set of residues
- Creating a subchain of a protein
- Manipulating a protein
- Querying a chain for data in various ways
- Applying Rotamer
- Visualizing a protein and conformations
- Deleting / obliterating a protein
- Accessing collision checking methods
- Creating a protein with the trimmed sidechain
Before the LoopTK infrastructure can be used for protein manipulation, it must load data from the "resources/" directory.
The data in this directory define all the atoms, blocks, and residues. This directory MUST be in the same directory as
the executable using LoopTK. Once it is in place, initialize the infrastructure by calling:
LoopTK::Initialize(ENABLE_WARNINGS);
One can also pass in SUPPRESS_WARNINGS so that the infrastructure never displays warnings. A warning is displayed when a
non-fatal situation is encountered that may be contrary to what the user wanted to accomplish.
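Since initialization depends on "resources/" sitting next to the executable, it can be useful to check for the directory before initializing. The following is only a sketch under stated assumptions: hasResourcesDir is a hypothetical helper that is not part of LoopTK, and it relies on std::filesystem from C++17.

```cpp
#include <filesystem>
#include <system_error>

// Hypothetical helper (not part of LoopTK): returns true when a
// "resources" directory exists directly inside exeDir.
bool hasResourcesDir(const std::filesystem::path &exeDir) {
    std::error_code ec; // non-throwing overload: a missing path just yields false
    return std::filesystem::is_directory(exeDir / "resources", ec);
}
```

A program could call this with the directory that holds its executable and report a clear error before calling LoopTK::Initialize.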
Suppose we have a PDB file called "pdbTest.pdb". The following code reads the protein from it and opens a visualizer:
PProtein *protein = PDBIO::readFromFile("pdbTest.pdb"); //gets the protein
PChainNavigator(protein).Run(); //makes and executes a visualizer for the protein
The identifiers for residues are located in the PID class within "PConstants.h". Use these identifiers to specify which
residues to add to a PProtein. When all the residues have been added, make sure to call finalize() on the PProtein. This is
the function that actually finishes constructing the chain and makes it functional. Suppose we want to make a 4-residue
chain with the residues "PHE", "HIS", "ALA", and "TRP":
PProtein *protein = new PProtein();
protein->AddResidue(PID::PHE);
protein->AddResidue(PID::HIS);
protein->AddResidue(PID::ALA);
protein->AddResidue(PID::TRP);
protein->finalize(); //completes construction
A subchain shares all of its atoms and bonds with the protein it was created from. So, manipulating a subchain manipulates
the protein too, but only the residues within the subchain are affected. The following code grabs the residues indexed
from 1 to 7 from an already generated PProtein called "protein":
PProtein *subchain = new PProtein(protein,1,7);
For example, when the subchain is rotated, the residues in the rest of the protein are not affected.
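The sharing behavior can be pictured with a toy model. This is a minimal sketch, not LoopTK's implementation: SharedChain, SharedResidue, and rotateAll are hypothetical names, and a single angle per residue stands in for real atom coordinates.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Toy stand-in for a residue: one value instead of real atoms and bonds.
struct SharedResidue {
    double angle = 0.0;
};

// Toy chain (hypothetical, not a LoopTK class): a window onto residue
// data that the top-level chain and all of its subchains share.
class SharedChain {
public:
    // Top-level chain holding n residues.
    explicit SharedChain(std::size_t n)
        : data_(std::make_shared<std::vector<SharedResidue>>(n)),
          first_(0), count_(n) {}

    // Subchain covering residues first..last (inclusive) of parent.
    SharedChain(const SharedChain &parent, std::size_t first, std::size_t last)
        : data_(parent.data_),
          first_(parent.first_ + first), count_(last - first + 1) {}

    std::size_t size() const { return count_; }

    // Index i is relative to this chain's own first residue.
    SharedResidue &residue(std::size_t i) { return data_->at(first_ + i); }

    // Changes only the residues inside this window.
    void rotateAll(double degrees) {
        for (std::size_t i = 0; i < count_; ++i) {
            (*data_)[first_ + i].angle += degrees;
        }
    }

private:
    std::shared_ptr<std::vector<SharedResidue>> data_;
    std::size_t first_;
    std::size_t count_;
};
```

With a 10-residue SharedChain and a subchain over residues 1 to 7, calling rotateAll on the subchain changes residues 1 through 7 as seen from the parent and leaves residues 0, 8, and 9 untouched.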
A protein is manipulated by rotating it around some bond by some number of degrees. Manipulation requires a specification of
"direction", since during a rotation one part of the protein remains static while the other part moves. The following code
takes the 3rd DOF on the backbone (index 2) and rotates the protein 120 degrees in the "forward" direction:
protein->RotateBackbone(2,forward,120);
Note that this is equivalent to:
protein->RotateChain(PID::BACKBONE,2,forward,120);
where this function takes as a parameter the block type the DOF should be queried from.
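The geometry behind such a rotation can be sketched independently of LoopTK. The code below is an illustration only: it treats a chain as a plain list of points, applies Rodrigues' rotation formula about the axis through two bonded atoms, and moves only the atoms on one side of the bond. All names (Vec3, rotateAboutBond, rotateToyChain, Direction) are hypothetical, and the choice to rotate by the negated angle in the backward direction is an assumption, not LoopTK's documented convention.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec3 { double x, y, z; };

Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3 operator*(double s, Vec3 a) { return {s * a.x, s * a.y, s * a.z}; }
double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Rotate point p about the axis through a and b (a != b) by `degrees`,
// using Rodrigues' rotation formula.
Vec3 rotateAboutBond(Vec3 p, Vec3 a, Vec3 b, double degrees) {
    const double t = degrees * std::acos(-1.0) / 180.0;
    const Vec3 axis = b - a;
    const Vec3 k = (1.0 / std::sqrt(dot(axis, axis))) * axis; // unit axis
    const Vec3 v = p - a;
    const Vec3 r = std::cos(t) * v + std::sin(t) * cross(k, v)
                 + (dot(k, v) * (1.0 - std::cos(t))) * k;
    return a + r;
}

enum class Direction { Forward, Backward };

// Bond i joins atoms[i] and atoms[i + 1]; requires bond + 1 < atoms.size().
// Forward moves the atoms after the bond; Backward moves the atoms before it.
void rotateToyChain(std::vector<Vec3> &atoms, std::size_t bond,
                    Direction dir, double degrees) {
    const Vec3 a = atoms[bond];
    const Vec3 b = atoms[bond + 1];
    if (dir == Direction::Forward) {
        for (std::size_t i = bond + 2; i < atoms.size(); ++i) {
            atoms[i] = rotateAboutBond(atoms[i], a, b, degrees);
        }
    } else {
        for (std::size_t i = 0; i < bond; ++i) {
            atoms[i] = rotateAboutBond(atoms[i], a, b, -degrees);
        }
    }
}
```

The two atoms that define the bond never move, which is why one side of the chain stays static while the other side turns.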
Remember that in a protein there are two DOFs for each amino acid backbone: N-Ca and Ca-C.
The number of DOFs in a sidechain depends on the type of amino acid.
Look at the *.blk files to see which bonds in the sidechain have DOF set to 1.
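To make the backbone DOF indexing concrete, here is a small sketch that maps a DOF index to a residue and bond. It assumes the DOFs are ordered residue by residue, N-Ca first and then Ca-C, starting at residue 0. That ordering is an assumption for illustration, and backboneDof and backboneDofCount are hypothetical helpers, not LoopTK functions.

```cpp
#include <cstddef>
#include <string>

struct BackboneDof {
    std::size_t residue; // zero-based residue index
    std::string bond;    // "N-Ca" or "Ca-C"
};

// Assumed ordering: residue 0 N-Ca, residue 0 Ca-C, residue 1 N-Ca, ...
BackboneDof backboneDof(std::size_t dofIndex) {
    return {dofIndex / 2, dofIndex % 2 == 0 ? "N-Ca" : "Ca-C"};
}

// Two backbone DOFs per residue.
std::size_t backboneDofCount(std::size_t numResidues) {
    return 2 * numResidues;
}
```

Under this assumed ordering, DOF index 2 (the 3rd DOF) would be the N-Ca bond of the residue at index 1.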
When manipulating a protein that has a subchain, manipulating a residue
that is part of the subchain will only affect the subchain. This provides an alternative
way of manipulating the subchain while using the protein's own, consistent indexing.
There are various hierarchies of data stored in a PProtein. The chain level contains information about residues, DOFs,
anchors, and end effectors of the chain. The residue level contains information about what atoms exist in a residue, and the
atoms know about their position and what they're colliding with. Here are a number of queries:
Getting the first residue in the chain:
PResidue *firstRes = protein->GetResidue(0);
Getting the number of residues in the chain:
int numResidues = protein->size();
Getting the C_Alpha atom's position in the 5th residue:
Vector3 ca_position = protein->GetResidue(4)->getAtomPosition(PID::C_ALPHA);
Detecting whether there are self-collisions (collisions of the chain with itself):
if(subchain->InSelfCollision()) cout<<"Subchain is in collision with itself.";
Detecting whether the 5th residue is in collision:
if(subchain->getResidue(4)->InAnyCollision()) cout<<"Residue at index 4 is in collision.";
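What a self-collision test has to decide can be shown with a brute-force toy version. This sketch is not how LoopTK checks collisions: atoms are modeled as spheres, every sufficiently separated pair is compared, and neighbors closer than minSeparation positions are skipped on the assumption that bonded atoms legitimately overlap. Sphere, overlaps, and toyInSelfCollision are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

struct Sphere { double x, y, z, radius; };

// True when the two spheres interpenetrate.
bool overlaps(const Sphere &a, const Sphere &b) {
    const double dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    const double reach = a.radius + b.radius;
    return dx * dx + dy * dy + dz * dz < reach * reach;
}

// True when any two atoms at least minSeparation positions apart overlap.
// O(n^2) pairwise scan, for illustration only.
bool toyInSelfCollision(const std::vector<Sphere> &atoms,
                        std::size_t minSeparation = 2) {
    for (std::size_t i = 0; i < atoms.size(); ++i) {
        for (std::size_t j = i + minSeparation; j < atoms.size(); ++j) {
            if (overlaps(atoms[i], atoms[j])) return true;
        }
    }
    return false;
}
```

A straight three-atom chain with unit spacing and radius 0.6 reports no collision here, while folding the third atom back onto the first does.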
Rotamer conformations can be applied to the sidechains so that the most likely sidechain conformation can be found
quickly.
To apply a collision-free rotamer to a particular residue
(run multiple times to cycle through all possible rotamers):
protein->getResidue(1)->ApplyRotamer();
Alternatively, to manually apply a desired rotamer, first check how many rotamers are available for a residue:
int rotamersize = PResources::GetRotamerSize(PID::HIS);
Then, save the current sidechain position (so it can be undone later):
protein->getResidue(1)->SaveSideChain();
Apply the desired rotamer on the given residue:
protein->getResidue(1)->ApplyRotamer(0);
If the result is not desired, restore the saved sidechain state:
protein->getResidue(1)->ResetSideChain();
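The save, apply, and reset steps combine naturally into a search loop. The sketch below illustrates that pattern on a toy data model and makes no claim about LoopTK internals: a sidechain is just a vector of angles, the rotamer library is a list of such vectors, and the acceptance test is passed in by the caller. ToySidechain and applyFirstAccepted are hypothetical names.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

using Angles = std::vector<double>; // sidechain angles in degrees

// Toy sidechain with a one-level undo buffer.
struct ToySidechain {
    Angles current;
    Angles saved;
    void save() { saved = current; }
    void reset() { current = saved; }
    void apply(const Angles &rotamer) { current = rotamer; }
};

// Saves the current state, then tries each rotamer in order. Keeps the
// first one the predicate accepts and returns its index. If none is
// accepted, restores the saved state and returns -1.
int applyFirstAccepted(ToySidechain &sc, const std::vector<Angles> &library,
                       const std::function<bool(const Angles &)> &accept) {
    sc.save();
    for (std::size_t i = 0; i < library.size(); ++i) {
        sc.apply(library[i]);
        if (accept(sc.current)) return static_cast<int>(i);
    }
    sc.reset();
    return -1;
}
```

In a real program the predicate would wrap a collision query on the residue, so the loop stops at the first rotamer that leaves it collision free.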
LoopTK provides classes for visualizing and manipulating proteins and conformation spaces.
Include PChainNavigator.h or PConfSpaceNavigator.h to use them. Their key bindings are listed below.
PChainNavigator:
Manipulation
r: Rotate current backbone DOF by 5 degrees
e: Rotate current backbone DOF by -5 degrees
>: Increase backbone DOF Index
<: Decrease backbone DOF Index
[: Set backbone DOF Index to 0
]: Set backbone DOF Index to last
v: Randomize DOFs of protein
Sidechain
b: Toggle Sidechain visibility.
D: Disable sidechain
E: Enable sidechain
CCD
a: CCD Descent 1 step
c: CCD Descent until 0.001
IO:
s: Write PDB to "output.pdb"
Rotamer
1: Set sidechain DOF index to 0
2: Increment sidechain DOF index
3: Decrement sidechain DOF index
4: Rotate the current sidechain DOF by 5 degrees
5: Rotate the current sidechain DOF by -5 degrees
6: Set Residue index to 0
7: Increment Residue index
8: Decrement Residue index
9: Apply Rotamer to current residue
0: Reset the sidechain config
/: Toggle current residue highlight
PConfSpaceNavigator:
b: Toggle sidechain visibility
l: Toggle loop visibility
-: Decrement conformation index.
=: Increment conformation index.
d: Display one conformation only.
s: Toggle section display.
+: Increase quality.
-: Decrease quality.
When should we use "delete" and when should we use "obliterate"?
What are the differences between them?
Deleting a protein will deallocate it and all of its children but not its parents. Calling "obliterate" is equivalent to
deleting the top-level protein. PDBIO::readFromFile returns the entire protein, so deleting it and calling "obliterate" will
do the same thing. The reason for including "obliterate" is that if you only have a subchain, you can still deallocate the
whole protein. Consider the following examples:
// Example 1: deleting a subchain leaves the parent protein intact
PProtein *p = PDBIO::readFromFile("2CRO.pdb");
PProtein *p2 = new PProtein(p,5,10);
delete p2;
PProtein *p3 = new PProtein(p,12,15); // valid

// Example 2: obliterating a subchain deallocates the whole protein
PProtein *p = PDBIO::readFromFile("2CRO.pdb");
PProtein *p2 = new PProtein(p,5,10);
p2->Obliterate();
PProtein *p3 = new PProtein(p,12,15); // invalid since "p" is now deallocated
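The ownership rule described above (deleting frees a chain and its children, obliterating frees everything from the top-level protein down) can be modeled with a small tree. This is a sketch of the idea only, not LoopTK's code: OwnedChain is a hypothetical class, and the live counter exists just to make the effect observable.

```cpp
#include <algorithm>
#include <vector>

// Toy chain in a parent/child tree (hypothetical, not a LoopTK class).
struct OwnedChain {
    OwnedChain *parent;
    std::vector<OwnedChain *> children;
    static int live; // number of chains currently allocated

    explicit OwnedChain(OwnedChain *p = nullptr) : parent(p) {
        if (parent) parent->children.push_back(this);
        ++live;
    }

    // Deleting a chain deletes its children but not its parent.
    ~OwnedChain() {
        std::vector<OwnedChain *> owned;
        owned.swap(children);
        for (OwnedChain *c : owned) {
            c->parent = nullptr; // already detached, skip the erase below
            delete c;
        }
        if (parent) {
            std::vector<OwnedChain *> &s = parent->children;
            s.erase(std::remove(s.begin(), s.end(), this), s.end());
        }
        --live;
    }

    // Walk up to the top-level chain and delete it, which frees the whole
    // tree including this object. Do not use the object afterwards.
    void obliterate() {
        OwnedChain *top = this;
        while (top->parent) top = top->parent;
        delete top;
    }
};

int OwnedChain::live = 0;
```

Deleting a child in this model leaves its parent usable for creating further subchains, whereas calling obliterate on any descendant invalidates every pointer into the tree, mirroring the two examples above.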
PChain objects have a method called "getSpaceManager()" which gives direct access to the grid (via an abstract interface).
Recall that PProtein objects are subtypes of PChain. For example, to get atoms from "PProtein *p" that collide with a ball
centered at (1,2,3) with radius 5, the following code will do the trick:
PSpaceManager *m = p->getSpaceManager();
auto colliding = m->AtomsNearPoint(Vector3(1,2,3),5); //list of the atoms inside the ball
To create a protein with trimmed sidechains, use PTools::CreateSlimProtein(). It takes as input a PProtein and returns a
new PProtein object with the sidechains trimmed to contain only the C-beta atom.
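The trimming itself amounts to a filter over atom names. The sketch below shows that filter on a toy data model and does not reflect how PTools::CreateSlimProtein is written: TrimAtom, TrimResidue, keptInSlimChain, and makeSlim are hypothetical, and the kept names (N, CA, C, O, CB) assume PDB-style atom naming.

```cpp
#include <string>
#include <vector>

struct TrimAtom { std::string name; };
struct TrimResidue {
    std::string name;
    std::vector<TrimAtom> atoms;
};

// Backbone atoms plus C-beta, using PDB-style atom names (an assumption).
bool keptInSlimChain(const std::string &atomName) {
    return atomName == "N" || atomName == "CA" || atomName == "C" ||
           atomName == "O" || atomName == "CB";
}

// Returns a new chain; the input chain is left unchanged.
std::vector<TrimResidue> makeSlim(const std::vector<TrimResidue> &chain) {
    std::vector<TrimResidue> slim;
    slim.reserve(chain.size());
    for (const TrimResidue &res : chain) {
        TrimResidue trimmed{res.name, {}};
        for (const TrimAtom &a : res.atoms) {
            if (keptInSlimChain(a.name)) trimmed.atoms.push_back(a);
        }
        slim.push_back(trimmed);
    }
    return slim;
}
```

A residue without a C-beta atom, such as glycine, simply keeps its backbone atoms in this model.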