Add atom container align and rmsd methods (Zube 2914).#173
Add atom container align and rmsd methods (Zube 2914).#173
align and rmsd methods (Zube 2914).#173Conversation
avirshup
left a comment
There was a problem hiding this comment.
This code looks really, really good. Here are some picky style comments.
| two paired sets of points. | ||
|
|
||
| Args: | ||
| src_points(numpy.ndarray): An Nx3 array of the 3D source points. |
There was a problem hiding this comment.
The docstring parser expects a space between variable name and the type, i.e.,
src_points (numpy.ndarray): [...]
| # limitations under the License. | ||
|
|
||
| import numpy as np | ||
| from math import sqrt |
There was a problem hiding this comment.
This is super-picky, but: standard library imports should go above third-party module imports. Why? Because PEP8 says so
| Args: | ||
| points1(numpy.ndarray): An Nx3 array of the 3D points. | ||
| points2(numpy.ndarray): An Nx3 array of the 3D points. | ||
| centered(Boolean): If true then the points are centered around (0,0,0). |
There was a problem hiding this comment.
These types should be spelled bool and float
| """ | ||
|
|
||
| if points1.shape[0] != points2.shape[0]: | ||
| raise ValueError('The number of points for calculating the RMSD between the first array %d does not equal the number of points in the second array %d' % (num_points1, num_points2)) |
There was a problem hiding this comment.
Should reflow this onto multiple lines to keep everything within 100 columns
| other (AtomContainer): The list of atoms to calculate the RMSD with. | ||
|
|
||
| Returns: | ||
| rmsd_error(Float): The root mean square deviation measuring the alignment error. |
There was a problem hiding this comment.
This should return a Scalar[length] object
| other (AtomContainer): The list of atoms to align to. | ||
|
|
||
| Returns: | ||
| rmsd_error(Float): The root mean square deviation measuring the alignment error. |
There was a problem hiding this comment.
This should return a Scalar[length] object (i.e., a number with length units)
|
|
||
| # Calculate the 4x4 matrix transforming self_coords into other_coords. | ||
| rmsd_error, xform = rmsd_align(self_coords, other_coords) | ||
| return rmsd_error, xform |
There was a problem hiding this comment.
This method should apply the transformation before returning. Three cases to handle:
- if all the atoms in this container are part of the same molecule, then transform the entire molecule:
self.atoms[0].molecule.transform(xform)- if all the atoms in this container don't belong to any molecule, just transform the atoms in this container
self.transform(xform)- if all the atoms in this container belong to different molecules,
raise ValueError
There was a problem hiding this comment.
I don't think this method should apply the transformation because what it transforms is based on the contents of the atom lists, a really significant operation, transforming the entire molecule, need to be aware of what your atom lists contain.
I don't see any reason to restrict the container atoms to a single molecule either, could be useful to position a group of molecules.
There was a problem hiding this comment.
Definitely see your point here. In terms of API philosophy, I still like the idea of a one-line method to automatically align an entire system, but we definitely need something that just calculates a transformation without applying it.
Before going further, let's meet up (offline) and write out some API use cases to see what makes sense.
| @exports | ||
| def rmsd_align(src_points, dst_points): | ||
| """ Calculate the 4x4 transformation matrix that aligns a set of source points | ||
| to a set of destination of points. |
There was a problem hiding this comment.
Docstring text should be aligned at the same indentation level as the quotes, except for the first line. So this should look like this:
""" Calculate the 4x4 transformation matrix that aligns a set of source points
to a set of destination of points.
[...]
"""Not like this:
""" Calculate the 4x4 transformation matrix that aligns a set of source points
to a set of destination of points.
[...]
"""| d = np.linalg.det(V) * np.linalg.det(U) | ||
|
|
||
| # Make sure the rotation preserves orientation (det = 1). | ||
| if d < 0.0: |
There was a problem hiding this comment.
Two more comments. One big thing (reversing myself on this) - the mathutils routines should be able to handle quantities with units in addition to bare numpy arrays (some sample code to deal with this in the comments below).
This makes the xform matrix tricky, since only some of its elements have units. My preference is probably to replace it with a class, but we should discuss.
ETA - here's my sketch for what the interface for an xform object could look like:
class Transform(object):
""" A combined translation and rotation.
Attributes:
rotation (np.ndarray): 3x3 rotation matrix
translation (Vector[len=3]): translation vector (including units if applicable)
This is equivalent to the tranform described by the 4x4 matrix:
[ self.rotation self.translation]
[ 0 1 ]
"""
def __init__(self, translation, rotation):
self.translation = translation
self.rotation = rotation
def transform_coords(self, other):
""" Transform the coordinates of the passed object and return them.
Does not modify the passed object (use ``self.apply`` for that).
This method is overloaded to accept _either_ a set of coordinates OR any atom container.
Args:
other (Matrix[shape=(*,3)] OR AtomContainer): object to transform the coordinates of
Returns:
Matrix[shape=(*,3)]: transformed coordinates.
"""
def apply(self, other):
""" Apply this transformation to the coordinates of the passed object.
This will modify the coordinates of the passed object in-place.
This method is overloaded to accept _either_ a set of coordinates OR any atom container.
Args:
other (Matrix[shape=(*,3)] OR AtomContainer): object to transform
"""| dst_center = dst_points.sum(axis=0) / num_points | ||
|
|
||
| # Calculate correlation matrix. | ||
| C = np.dot(np.transpose(dst_points[:]-dst_center), src_points[:]-src_center) |
There was a problem hiding this comment.
- Would prefer to rename
Cto something likecorrelation_matto improve readability. - The variable names should be lowercase - it makes life easier for various introspection tools
|
|
||
| @exports | ||
| def rmsd_align(src_points, dst_points): | ||
| """ Calculate the 4x4 transformation matrix that aligns a set of source points |
There was a problem hiding this comment.
After looking at the code, I'm going to contradict myself - I think the routines in mathutils should optionally handle both MdtQuantities (ie numbers with units) as well as numpy arrays. Trying to strip units every time we want to use these routines is just too error-prone.
Easiest way to do this is probably just strip units out at the beginning and add them at the end. We already do this in several places in the code, but all in a haphazard way (feel free to come up with a cleaner way to do it). Here's the basic idea though.
To strip units at the beginning:
if isinstance(src_points, mdt.units.MdtQuantity):
units = mdt.units.get_units(src_points)
p1 = units.value_of(src_points)
p2 = units.value_of(dst_points)
else:
units = NoneTo reapply units at the end:
if units is not None:
rmsd_error *= unitsThere was a problem hiding this comment.
It seems that numpy plays well with pint units, no need to convert the input points. The only modifications needed is to set units and ensure matrix/vector multiply retains the correct units.
Check for units:
if isinstance(src_points, mdt.units.MdtQuantity):
units = mdt.units.get_units(src_points)
else:
units = 1.0
Multiply martix/vector product by units:
cdiff = rot_mat.dot(src_points[i]-src_center)*units - (dst_points[i]-dst_center)
… code to support units in align functions.
Here's the alignment code. I've also included a pytest and notebook example.
I'd like to add visualization to the notebook.