Skip to content

Add atom container align and rmsd methods (Zube 2914).#173

Open
ktbolt wants to merge 5 commits intomasterfrom
atom-container-align
Open

Add atom container align and rmsd methods (Zube 2914).#173
ktbolt wants to merge 5 commits intomasterfrom
atom-container-align

Conversation

@ktbolt
Copy link
Copy Markdown

@ktbolt ktbolt commented Aug 24, 2017

Here's the alignment code. I've also included a pytest and notebook example.

I'd like to add visualization to the notebook.

@CLAassistant
Copy link
Copy Markdown

CLAassistant commented Aug 24, 2017

CLA assistant check
All committers have signed the CLA.

Copy link
Copy Markdown
Contributor

@avirshup avirshup left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code looks really, really good. Here are some picky style comments.

Comment thread moldesign/mathutils/align.py Outdated
two paired sets of points.

Args:
src_points(numpy.ndarray): An Nx3 array of the 3D source points.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring parser expects a space between variable name and the type, i.e.,
src_points (numpy.ndarray): [...]

# limitations under the License.

import numpy as np
from math import sqrt
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is super-picky, but: standard library imports should go above third-party module imports. Why? Because PEP8 says so

Comment thread moldesign/mathutils/align.py Outdated
Args:
points1(numpy.ndarray): An Nx3 array of the 3D points.
points2(numpy.ndarray): An Nx3 array of the 3D points.
centered(Boolean): If true then the points are centered around (0,0,0).
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These types should be spelled bool and float

Comment thread moldesign/mathutils/align.py Outdated
"""

if points1.shape[0] != points2.shape[0]:
raise ValueError('The number of points for calculating the RMSD between the first array %d does not equal the number of points in the second array %d' % (num_points1, num_points2))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should reflow this onto multiple lines to keep everything within 100 columns

Comment thread moldesign/molecules/atomcollections.py Outdated
other (AtomContainer): The list of atoms to calculate the RMSD with.

Returns:
rmsd_error(Float): The root mean square deviation measuring the alignment error.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should return a Scalar[length] object

Comment thread moldesign/molecules/atomcollections.py Outdated
other (AtomContainer): The list of atoms to align to.

Returns:
rmsd_error(Float): The root mean square deviation measuring the alignment error.
Copy link
Copy Markdown
Contributor

@avirshup avirshup Aug 30, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should return a Scalar[length] object (i.e., a number with length units)


# Calculate the 4x4 matrix transforming self_coords into other_coords.
rmsd_error, xform = rmsd_align(self_coords, other_coords)
return rmsd_error, xform
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method should apply the transformation before returning. Three cases to handle:

  1. if all the atoms in this container are part of the same molecule, then transform the entire molecule:
self.atoms[0].molecule.transform(xform)
  1. if all the atoms in this container don't belong to any molecule, just transform the atoms in this container
self.transform(xform)
  1. if all the atoms in this container belong to different molecules, raise ValueError

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this method should apply the transformation because what it transforms is based on the contents of the atom lists, a really significant operation, transforming the entire molecule, need to be aware of what your atom lists contain.

I don't see any reason to restrict the container atoms to a single molecule either, could be useful to position a group of molecules.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely see your point here. In terms of API philosophy, I still like the idea of a one-line method to automatically align an entire system, but we definitely need something that just calculates a transformation without applying it.

Before going further, let's meet up (offline) and write out some API use cases to see what makes sense.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, let's discuss offline.

Comment thread moldesign/mathutils/align.py Outdated
@exports
def rmsd_align(src_points, dst_points):
""" Calculate the 4x4 transformation matrix that aligns a set of source points
to a set of destination of points.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docstring text should be aligned at the same indentation level as the quotes, except for the first line. So this should look like this:

    """ Calculate the 4x4 transformation matrix that aligns a set of source points 
    to a set of destination of points.
    [...]
    """

Not like this:

    """ Calculate the 4x4 transformation matrix that aligns a set of source points 
        to a set of destination of points.
        [...]
    """

Comment thread moldesign/mathutils/align.py Outdated
d = np.linalg.det(V) * np.linalg.det(U)

# Make sure the rotation preserves orientation (det = 1).
if d < 0.0:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 👍 👍 👍 👍 👍

Copy link
Copy Markdown
Contributor

@avirshup avirshup left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two more comments. One big thing (reversing myself on this) - the mathutils routines should be able to handle quantities with units in addition to bare numpy arrays (some sample code to deal with this in the comments below).

This makes the xform matrix tricky, since only some of its elements have units. My preference is probably to replace it with a class, but we should discuss.

ETA - here's my sketch for what the interface for an xform object could look like:

class Transform(object):
    """ A combined translation and rotation.

    Attributes:
        rotation (np.ndarray): 3x3 rotation matrix
        translation (Vector[len=3]): translation vector (including units if applicable)

    This is equivalent to the tranform described by the 4x4 matrix:
    [  self.rotation  self.translation]
    [  0              1               ]

    """
    def __init__(self, translation, rotation):
        self.translation = translation
        self.rotation = rotation

    def transform_coords(self, other):
        """ Transform the coordinates of the passed object and return them.

        Does not modify the passed object (use ``self.apply`` for that).
        This method is overloaded to accept _either_ a set of coordinates OR any atom container.

        Args:
            other (Matrix[shape=(*,3)] OR AtomContainer): object to transform the coordinates of

        Returns:
            Matrix[shape=(*,3)]: transformed coordinates.
        """

    def apply(self, other):
        """ Apply this transformation to the coordinates of the passed object.

        This will modify the coordinates of the passed object in-place.
        This method is overloaded to accept _either_ a set of coordinates OR any atom container.

        Args:
            other (Matrix[shape=(*,3)] OR AtomContainer): object to transform
        """

Comment thread moldesign/mathutils/align.py Outdated
dst_center = dst_points.sum(axis=0) / num_points

# Calculate correlation matrix.
C = np.dot(np.transpose(dst_points[:]-dst_center), src_points[:]-src_center)
Copy link
Copy Markdown
Contributor

@avirshup avirshup Aug 30, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Would prefer to rename C to something like correlation_mat to improve readability.
  2. The variable names should be lowercase - it makes life easier for various introspection tools


@exports
def rmsd_align(src_points, dst_points):
""" Calculate the 4x4 transformation matrix that aligns a set of source points
Copy link
Copy Markdown
Contributor

@avirshup avirshup Aug 30, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After looking at the code, I'm going to contradict myself - I think the routines in mathutils should optionally handle both MdtQuantities (ie numbers with units) as well as numpy arrays. Trying to strip units every time we want to use these routines is just too error-prone.

Easiest way to do this is probably just strip units out at the beginning and add them at the end. We already do this in several places in the code, but all in a haphazard way (feel free to come up with a cleaner way to do it). Here's the basic idea though.

To strip units at the beginning:

if isinstance(src_points, mdt.units.MdtQuantity):
    units = mdt.units.get_units(src_points)
    p1 = units.value_of(src_points)
    p2 = units.value_of(dst_points)
else:
   units = None

To reapply units at the end:

if units is not None:
    rmsd_error *= units

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that numpy plays well with pint units, no need to convert the input points. The only modifications needed is to set units and ensure matrix/vector multiply retains the correct units.

Check for units:

if isinstance(src_points, mdt.units.MdtQuantity):
    units = mdt.units.get_units(src_points)
else:
    units = 1.0

Multiply martix/vector product by units:

    cdiff = rot_mat.dot(src_points[i]-src_center)*units - (dst_points[i]-dst_center)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That works too!

… code to support units in align functions.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants