Skip to content

holehouse-lab/PIMMS

PIMMS: Polymer Interactions in Multi-component MixtureS


Travis Build Status codecov


PIMMS version v0.1.4 (March 2026)

Preamble

PIMMS is still in development, but in general the master branch on this repository can be considered (mostly) stable.

As of version 0.1.37 (March 2024), we consider this to be the 'gold' release with respect to features, and further updates before we increment to 0.2.0 will include only bug fixes, additional tests, and performance improvements

The current version (0.1.4)

As you use this version of PIMMS, please report any/all issues where things:

  1. Are wrong/don't make sense
  2. Don't behave in a way you expect
  3. Errors/exceptions when you wouldn't expect them

Please log any issues in the issue tracker.

There shouldn't be any bugs in this version... but there could be. So we're working on it! This includes developing and deploying a large unit test suite, which takes time, but we're working hard!

Background

What is PIMMS?

PIMMS is a lattice-based simulation engine that allows both 2D and 3D simulations. Useful features include:

  1. Easy to use! Upon installation, a command-line executable (PIMMS) is available, should be in your $PATH variable, and can be used to run simulations. No messing around; it (should) just work!
  2. Easy to define interaction parameters through a simple parameter file (example included in /demo_keyfiles/demo_1/params.prm)
  3. Easily run fast 2D or 3D lattice-based simulations
  4. Run simulations with many distinct components
  5. Run simulations of a single homo or heteropolymer
  6. Run simulations of many copies of polymers to explore phase behavior
  7. Drive interactions over three distinct length scales
  8. Various other things

How is PIMMS written?

PIMMS is written almost fully in Python (=>3.7, 3.12 recommended), with the most computationally intensive parts written in fully optimized Cython that compiles down to native C.

With most of the complex behavior implemented in Python, maintenance and development are fast and efficient. However, certain functionality (i.e., analysis of large systems) is, as a result, disproportionately expensive vs. the actual simulation, so you may wish to alter the frequency at which certain analysis routines are performed based on your interests.

PIMMS is a relatively large codebase of ~20K lines of (mostly) Python code. As mentioned, it is under active development, including streamlining and optimization. Several features currently built into PIMMS are not documented here, either because they are not quite ready or are still in development. Again, we're working on finalizing all this up.

Who develops PIMMS?

Alex Holehouse developed an initial version of PIMMS during his time in the Pappu lab. Since starting his own lab, much of PIMMS has been rewritten, and Dr. Ryan Emenecker has joined as a core developer. PIMMS is maintained exclusively by the Holehouse lab.

Has PIMMS been used in any publications to date?

Why yes, it has, thank you for asking! Please check out:

Alston, J. J. & Soranno, A. Condensation goes viral: a polymer physics perspective. J. Mol. Biol. 167988 (2023).

Soranno, A., Incicco, J. J., De Bona, P., Tomko, E. J., Galburt, E. A., Holehouse, A. S. & Galletto, R. Shelterin Components Modulate Nucleic Acids Condensation and Phase Separation in the Context of Telomeric DNA. J. Mol. Biol. 434, 167685 (2022).

Sankaranarayanan, M., Emenecker, R. J., Wilby, E. L., Jahnel, M., Trussina, I. R. E. A., Wayland, M., Alberti, S., Holehouse, A. S. & Weil, T. T. Adaptable P body physical states differentially regulate bicoid mRNA storage during early Drosophila development. Dev. Cell 56, 2886–2901.e6 (2021).

Moses, D., Yu, F., Ginell, G. M., Shamoon, N. M., Koenig, P. S., Holehouse, A. S. & Sukenik, S. Revealing the Hidden Sensitivity of Intrinsically Disordered Proteins to their Chemical Environment. J. Phys. Chem. Lett. 11, 10131–10136 (2020).

Holehouse, A. S., Ginell, G. M., Griffith, D. & Böke, E. Clustering of Aromatic Residues in Prion-like Domains Can Tune the Formation, State, and Organization of Biomolecular Condensates. Biochemistry 60, 3566–3581 (2021).

Martin, E. W.*, Holehouse, A. S.*, Peran, I.*, Farag, M., Incicco, J. J., Bremer, A., Grace, C. R., Soranno, A., Pappu, R. V. & Mittag, T. Valence and patterning of aromatic residues determine the phase behavior of prion-like domains. Science 367, 694–699 (2020).

Boeynaems, S., Holehouse, A. S., Weinhardt, V., Kovacs, D., Van Lindt, J., Larabell, C., Van Den Bosch, L., Das, R., Tompa, P. S., Pappu, R. V. & Gitler, A. D. Spontaneous driving forces give rise to protein-RNA condensates with coexisting phases and complex material properties. Proc. Natl. Acad. Sci. U. S. A. 116, 7889–7898 (2019).

Martin, E. W.*, Holehouse, A. S.*, Grace, C. R., Hughes, A., Pappu, R. V. & Mittag, T. Sequence Determinants of the Conformational Properties of an Intrinsically Disordered Protein Prior to and upon Multisite Phosphorylation. J. Am. Chem. Soc. 138, 15323–15335 (2016).

Is this the final documentation?

No. We are actively working on a full Sphinx-based readthedocs documentation suite for PIMMS, but for now, this readme file serves as the core PIMMS documentation.

Installation

NB: Installation assumes you have set up a correct conda or uv environment with Python 3.7 or higher (any 3.7 is fine). Using 3.7 or higher is important as there are some language features in 3.7 that we use that were not in earlier versions that PIMMS requires. We recommend using Python 3.10+ (tests and standard development environment is 3.12 for us).

If conda and pip are new to you, there is a lot of documentation on this online, and I'd suggest taking a look at this page here as a first step. Assuming conda is set up, installation should be easy!

Step one: Make sure dependencies are installed

Assuming conda is installed and your in the relevant environment, the first thing to do is ensure the channel conda-forge is available.

Specifically run

conda config --add channels conda-forge

We then recommend creating a clean environment with Python 3.12

conda create -n pimms  python=3.12 -y
conda activate pimms

Note you're welcome to install PIMMS into an existing environment if you want, but we strongly recommend doing this FIRST to ensure things actually install correctly into a vanilla and empty environment. Having a dedicated environment also avoids any dependency clashes and is generally a safer bet.

Assuming this works correctly, next install some standard packages:

pip install numpy scipy cython pandas versioningit

And assuming these work, install mdtraj:

	# Then install mdtraj, which provides the xtc library backend - this is
	# what lets us write VMD-compatible trajectories
	pip install mdtraj

This should all work out of the box without issue. At this stage if anything goes wrong it's outside of my hands (although I'm happy to offer advice).

Step two: Installing PIMMS

Assuming the packages above are installed correctly, the next step is to actually install PIMMS.

Installing directly from GitHub

One way this can be done is by installing directly from GitHub:

pip install --no-build-isolation git+https://github.com/holehouse-lab/PIMMS.git

NB you need the --no-build-isolation or pip will try and build the Wheel in an isolated environment that lacks Cython.

Installing from source

Alternatively, you can download the source and install directly from source:

git clone https://github.com/holehouse-lab/PIMMS.git

Alternatively, to install from source, download the codebase and navigate to the directory where the pyproject.toml file is and run

pip install -e . --upgrade --force-reinstall
Checking install worked

If all seems to have gone off without a hitch, open a new terminal, start up the conda environment you just installed PIMMS in, and run (from any directory) the command:

PIMMS --version

If it worked, you should see:

version <current version number>

NB: the VERY first time you do this may take 5-10 seconds while the internal Python environment initializes, but after that it should be basically instantaneous.

If you have any issues during installtion, please raise an issue on here and we'll try and fix it!

This installation has been tested and works on both Linux and macOS. If someone has a Windows machine and wants to test this out they are more the welcome, but, PIMMS has (AFAIK) never been run on Windows so I would anticipate things not working well out of the box...

Versioning and releases (maintainers)

PIMMS uses versioningit for version resolution (instead of versioneer).

  • Build/version configuration lives in pyproject.toml:
    • build backend: setuptools.build_meta
    • version provider: versioningit
  • Runtime version reporting (pimms.__version__) comes from installed package metadata via importlib.metadata.
  • If version metadata cannot be resolved (for example, source-tree import before install), PIMMS falls back to 0+unknown.

Release/version workflow:

  1. Create an annotated git tag for the release (for example 0.1.39).
  2. Build/install from that tagged commit.
  3. Verify reported version:
    • PIMMS --version
    • or python -c "import pimms; print(pimms.__version__)"

Development builds from non-tagged commits will include local version metadata (for example commit distance/hash) as generated by versioningit.

Usage

Running a simulation

PIMMS simulations require two files

  1. A keyfile, which defines the components of the simulation and all aspects of that simulation.
  2. A parameter file which defines the interactions between distinct components. The keyfile also defines the location of the parameter file.

Simulations are run as follows:

PIMMS -k <keyfile.kf>

For convenience some correctly formatted and annotated keyfiles and parameter files are available in the /demo directory. We recommend that you use these as a starting point for your own. These files have been annotated, but for completeness a more expansive description of the keywords is provided below.

Keyfile keywords

Keyfiles define everything about the system and simulation, including

  • What polymers are present, their sequence, and how many of them are there
  • How long the simulation should run for and how big the simulation box is
  • The frequency of different analysis and output
  • The move-set

Below we outline the keywords you may wish to change. Note that there are additional keywords that control some advanced functionality, but we're still finalizing that behaviour.

System setup keywords

Keyword Format (type) Description
DIMENSIONS INT
(2 or 3)

A x B
or
A x B x C
Size of the simulation box (in lattice units). 2D or 3D (defines if the simulation is a 2D or 3D simulation)
RESIZED_EQUILIBRATION [OPTIONAL] INT
(2 or 3)

A x B
or
A x B x C
Defines alternative simulation dimensions to be used during equilibration; e.g., if you wish to equilibrate chains at a higher concentration to facilitate single condensate forming. MUST be smaller than the dimensions defined by the DIMENSIONS keyword.
CHAIN See description One of the few multi component keywords in PIMMS and the only keyword that can appear multiple times, the CHAIN keyword defines a specific polymer chain and the number of that chain that will exist in the simulation. The format should be

CHAIN : N {CHAIN IDENTIY}

Where N defines the number of the chain and {CHAIN IDENTITY} gives polymer sequence in one-letter alphabet code. As an example

CHAIN : 20 QQQQQQQQQQ

Would give 20 poly-glutamine polymers. In later versions of PIMMS we will be updating this to allow the reading of keyfiles that use three-letter codes
TEMPERATURE FLOAT Simulation temperature to be used (units are arbitrary and depend on the units of the parameter file)
N_STEPS INT Number of main chain Monte Carlo steps
PARAMETER_FILE STRING Relative or absolute path of the parameter file
EQUILIBRATION INT Number of steps to be run as equilibration (i.e. before any analysis or trajectory output is generated)
HARDWALL [OPTIONAL] BOOL Boolean flag set to True or False that defines if a hardwall boundary is used or not. By default periodic boundary conditions (PBC) are used, but if hardwall is set to true the edges of the simulation box are reflective with an infinitely repulsive potential. Default = False
PRINT_FREQ INT Step frequency at which the system status is printed to STDOUT
XTC_FREQ INT Step frequency at which the system system configuration is written to XTC file
EN_FREQ INT Step frequency at which the system energy is written to ENERGY.dat
SEED [OPTIONAL] INT Random seed to allow reproducible runs
ENERGY_CHECK INT Step frequency at which global energy is recalculated and compared to the current current as determined locally on each step. All energy calculations are exact, so this is primarily for sanity checking. For large systems this can be an expensive operation.
NON_INTERACTING [OPTIONAL] BOOL Boolean that defines if the Hamiltonian is non-interacting or not. This is a convenient way to generate "EV" ensembles for the same system configuration. Default = False
ANGLES_OFF BOOL Boolean flag set to True or False that defines if angle potentials are to be used or not. If set to False (or not set), angles from the parameter file will be used. If set to True, angles are ignored and parameter files do not need to define angles. Default = False
EXPERIMENTAL_FEATURES [OPTIONAL] BOOL Boolean flag set to True or False that defines if experimental features are allowed. Strongly recommend if your name is not Ryan or Alex to keep this at False, and even then, check if your last name is Holehouse or Emenenecker because IF NOT you should still keep it False, probably. Default = False
CASE_INSENSITIVE_CHAINS [OPTIONAL] BOOL Boolean flag which, if set to False, means that chain sequence is case sensitive. By default, this is True, which means upon reading a keyfile chains are converted to upper case. However, sometimes you may wish for more unique beads, in which case a lower-case chain can be useful. Default = True
AUTOCENTER [OPTIONAL] BOOL Only relevant for single-chain simulations, but this flag, if set to True, ensures every frame of the resulting trajectory is centered in the middle of the box. This is especially useful if you want to avoid the need to align a trajectory after for analysis or visualization. Default = False
LATTICE_TO_ANGSTROMS [OPTIONAL] FLOAT Defines the conversion for lattice units to Angstroms for the output trajectory file that's generated. NB This ONLY influences the XTC trajectory, not any of the internal analysis which is always returned in lattice units. Default = 3.65

MOVE_* keywords

MOVE_ keywords define the frequency with which different moves are performed during the simulation. The values associated with these keywords must sum up to 1.0, and all must be defined.

Keyword Format (type) Description
MOVE_CRANKSHAFT FLOAT Crankshaft moves drive local chain perturbations and are coded in optimized C (and so very fast). In general, a large fraction of your simulation moveset should be these moves.
CRANKSHAFT_SUBSTEPS INT Defines a multiplier for the number of substeps performed. So each time a crankshaft move is selected, the underlying code performs CRANKSHAFT_SUBSTEPS multiplied by some scaling factors (defined by CRANKSHAFT_MODE) worth of moves for each bead in the system. In this way a single crankshaft move can actually encompass millions of individual MC moves!
CRANKSHAFT_MODE KEYWORD [PROPORTIONAL] defines how chain length influences the multiplier for the crankshaft moves. Use PROPORTIONAL.
MOVE_CHAIN_TRANSLATE FLOAT Single chain rigid body translation
MOVE_CHAIN_ROTATE FLOAT Single chain rigid body rotation
MOVE_CHAIN_PIVOT FLOAT Chain pivot at a random potion
MOVE_CLUSTER_TRANSLATE FLOAT Translate a randomly selected contiguous cluster of chains
MOVE_CLUSTER_ROTATE FLOAT Rotate a randomly selected contiguous cluster of chains

Quench simulation keywords

The following block of keywords defines various options that control quench-based simulations. In quench simulations, the simulation starts at a temperature defined by QUENCH_START and progressively decreases (or increases) to QUENCH_END. This is particularly useful to achieve convergence of complex systems, and is simply an annealing simulation.

Keyword Format (type) Description
QUENCH_RUN BOOL Boolean (true or false) that defines if a quench run will be used. The 'quench' part of a question run always happens first in the simulation, although quenches can be low to high or high to low. Default = False
QUENCH_FREQ INT Frequency at which the temperature is updated.
QUENCH_STEPSIZE FLOAT Step (in temperature) that is taken each time the temperature is updated
QUENCH_START FLOAT Starting temperature
QUENCH_END FLOAT Ending temperature
QUENCH_AS_EQUILIBRATION BOOL Boolean (true or false) that defines if the quench is treated as an equilibration period.

Analysis keywords

The following block of keywords defines various options that control on-the-fly analysis done in PIMMS:

Keyword Format (type) Description
ANALYSIS_FREQ INT Step frequency at which all default analysis is performed. This sets the default for all other types of analysis unless explicitly defined.
ANA_POL INT Step frequency at which polymeric analysis is performed.
ANA_INTSCAL INT Step frequency at which internal scaling analysis is performed.
ANA_DISTMAP INT Step frequency at which distance-map analysis is performed.
ANA_ACCEPTANCE INT Step frequency at which acceptance information is printed out
ANA_INTER_RESIDUE INT Step frequency at which inter-residue interaction analysis is performed
ANA_CLUSTER INT Step frequency at which cluster-based analysis is performed
ANA_RESIDUE_PAIRS INT INT Defines pairs of residues (i.e. "1 5") which are analyzed for inter-residue distance.

Save keywords

Keywords for changing how PIMMS saves your trajectory file.

Keyword Format (type) Description
SAVE_AT_END BOOL Boolean (true or false) that determines whether PIMMS saves your .xtc file at the end of the simulation or saves at each 'save step'. The default is False. If set to true, this will mean the simulation has more sustained RAM usage but may increase simulation performance substantially depending on hardware configuration and setup. It's worth saying the RAM footprint here is not huge, but the bigger issue is if your job crashes/fails, you would lose all trajectory information.
SAVE_EQ BOOL Boolean (true or false) that determines whether PIMMS saves trajectory frames for the equilibration steps of a simulation. The default is True. If set to False, PIMMS begins to save your trajectory frames after the equilibration steps have completed.

Restart file keywords

PIMMS supports running simulations from restart files. This can be useful for a few reasons:

  1. Your simulation crashed and you want to pick up where you left off...

  2. You ran a simulation but want to collect more data for a specific trajectory.

  3. You want to see how changing simulation parameters alters a conclusion given a specific starting state (e.g. temperature, moveset, actual forcefield parameters etc. etc.).

Additionally, PIMMS provides the EXTRA_CHAIN keyword, which lets you take a restart file and then start a new simulation using the chains defined in the restart file AND place new additional chains in the simulation box.

There are a few key restart keywords to be aware of

Keyword Format (type) Description
RESTART_FREQ INT When running a simulation, this defines the frequency with which PIMMS writes a restart file (called restart.pimms out to disk. Note that regardless of what this value is, a final restart file is generated at the end of every simulation, just in case...
RESTART_FILE STRING Defines an absolute or relative path for a PIMMS-generated restart file. If this is provided, the simulation will override the provided DIMENSIONS and HARDWALL keywords and read these from the restart file. HOWEVER, these can be over-ridden using RESTART_OVERRIDE_DIMENSIONS and RESTART_OVERRIDE_HARDWALL
RESTART_OVERRIDE_HARDWALL BOOL Flag which if set (to either TRUE or FALSE) will over-ride the HARDWALL mode defined in the restart file. NB: A simulation can have been run initially with HARDWALL:TRUE and then restarted as HARDWALL:FALSE but the converse is not possible (see FAQ).
RESTART_OVERRIDE_DIMENSIONS INT
(2 or 3)

A x B
or
A x B x C
Enables you to change the dimensions of the box on restart. NB: To change box dimensions the original simulation must have been run with HARDWALL:TRUE and the new box dimensions can only be bigger, not smaller (see FAQ).
EXTRA_CHAINS see description The EXTRA_CHAIN keyword follows the same syntax as the CHAINS keyword and lets you add additional chains into a restart file that did not exist before. This is particularly useful if you want to ask how a system at equilibrium changes upon the addition of some chains. From a chain ID perspective (relevant if you want to freeze chains), extra chains are added after the restart file chains. NOTE that if no RESTART\_FILE is provided, providing an EXTRA_CHAIN keyword will trigger an error to avoid a situation where the keyfile parsing silently ignores EXTRA\_CHAINS in the keyfile. Note also that EXTRA_CHAIN requires that EXPERIMENTAL_FEATURES be set to True.

Restart FAQs

Because restarting simulation is actually pretty powerful, we provide some specific hints/suggestions as to things you can/cannot do...

My original simulation was in a 30x30x30 box - can I restart in a 50x50x50 box? Yes, assuming the initial simulation had HARDWALL:TRUE set. This is because to re-start using a different box size we want to avoid needing to re-position any chains, and if HARDWALL is False this will mean we'd likely have chains going through the periodic boundaries. Note that to do this you must provide the RESTART_OVERRIDE_DIMENSIONS keyword, or the keyfile will simply default to the restart file's dimensions.

My original simulation was in a 30x30x30 box - can I restart in a 20x20x20 box? No - there is currently no way to restart in a smaller box for similar reasons as described above.

My original simulation had HARDWALL:TRUE set, can I restart with HARDWALL:FALSE? Yes! If your original simulation had HARDWALL set to True

Note that to do this you must provide the RESTART_OVERRIDE_HARDWALL keyword, or the keyfile will simply default to the restart file's hardwall flag.

I really need to restart in a smaller box, or after equilibrating with PBC conditions Do you? Do you really? If you REALLY do, we can implement these features by re-equilibrating chains that clash boundaries while holding all others frozen. However, this would require additional code to be written, so if you REALLY need this and have a compelling use case, let me know. However, I'd strongly recommend trying to cast your question in a way that does not require that.

What is a restart file anyway? Good question! A restart file is ACTUALLY just a Python dict saved as a pickle object. It only contains information on the lattice dimensions, prior energy, and chain identity and positions. This means in principle (and in practice) you can actually build your own "restart" file to configure a starting structure for your simulation... This is quite powerful, because you can build structures, surfaces, basically whatever you can imagine as long as you can define that starting structure in terms of one or more chains with one or more beads.

To get technical, the underlying dictionary has four top-level key-value pairs:

  • DIMENSIONS : [x,y,z] dimensions of the box
  • ENERGY : float instantaneous potential energy of the prior configuration. This is not currently used, so you can set this to any value without concern.
  • HARDWALL : bool flag defining if the simulation was using HARDWALL rules or not
  • CHAINS : dict dictionary that maps chain ID to sequence and position of each chain.

The CHAINS dictionary has key-value pairs where each key is a chainID (must start at one and increment monotonically and without gaps) while value is a 3-position list where

  • [0] = position for each bead (list of lists, where each sublist has 2x or 3x elements to define [x,y] or [x,y,z] coordinates.
  • [1] = sequence for each bead
  • [2] = chain type - a unique ID for each chain type, although PIMMS doesn't enforce two sequences with the same sequence to be the same chain type, although we probably expect that to be the case most of the time... We do allow two chains identical in terms of sequence to have different chain types to facilitate specific behavior for a subset of chains of some sequence. However, this is not currently implemented anywhere... As such, in general, it's a good idea to set each unique chain (as defined by sequence) to be its chain type, but two chains with the same sequence should be the same chain type.

Based on this, it should be clear one could, in principle, create a restart file from scratch... Also, combined with a freeze file, one can build complex structures that are frozen (and therefore do not add to the computational cost of the simulation), enabling simulations of polymers under all manner of structural constraints to be entirely feasible.

Freeze chain keywords

As of version 0.1.37, PIMMS supports chain freezing. Briefly, this allows you to pass a freeze file (described below) where you can specify specific chains you want to freeze. Freezing means the chains do not move during the simulation and are not sampled, although they still interact as they would if they could move.

There are two keywords relevant to freeze files: WRITE_CHAIN_TO_CHAINID and FREEZE_FILE.

Keyword Format (type) Description
WRITE_CHAIN_TO_CHAINID BOOL Flag which if set to true means we generate a file called chain_to_chainid.txt which maps the chain ID (a unique index starting at 1) to the sequence as PIMMS represents the chain internally. This is useful in that it makes it straight forward to determine which chain(s) one might want to freeze. The chain_to_chainid.txt file has three columns: (1) chain ID, (2) chain length (3) chain sequence.
FREEZE_FILE STRING Absolute or relative path to a freeze file.

The freeze file format is very simple - it's a two-column file where the first column defines the freeze mode being used (right now, the only available mode is C for chain), and the second column is the chain ID of the chain to freeze. Each chain must be specified on a single line. A # symbol should precede any comments, and comments can be written in-line (i.e., after a specific ahin) or on their own line.

Note that PIMMS will report on the freeze file status during setup. Finally, freeze files work both for de novo simulations and for simulations run from restart files with or without EXTRA_CHAIN. This becomes especially useful in that it enables you to design a specific starting configuration in an arbitrary restart file, and then freeze that configuration.

Forbidden keywords

The following block of keywords defines various options that control temperature-sweep Metropolis Monte Carlo moves. This functionality is not fully ready, so I do not recommend using it for now until we confirm some key things! To keep things secret, we haven't even included a description of the keywords!

Keyword Format (type) Description
TSMMC_JUMP_TEMP FLOAT
TSMMC_STEP_MULTIPLIER INT
TSMMC_INTERPOLATION_MODE STRING
TSMMC_NUMBER_OF_POINTS INT
TSMMC_FIXED_OFFSET INT

Similarly, there are some legacy moves that should also not be altered but must be included. Some of these may be removed for the final release or updated, depending on our ongoing tests.

Keyword Format (type) Description
MOVE_SLITHER FLOAT Slither the chain through the system (BROKEN, do not use). Must be set to 0.0
MOVE_HEAD_PIVOT FLOAT Pivot the head residue of the chain (this is a somewhat redundant move) Must be set to 0.0
MOVE_CTSMMC FLOAT Single chain TSMMC Must be set to 0.0
MOVE_MULTICHAIN_TSMMC FLOAT Multiple randomly selected chains undergo TSMMC Must be set to 0.0
MOVE_SYSTEM_TSMMC FLOAT Entire system undergoes TSMMC Must be set to 0.0
MOVE_RATCHET_PIVOT FLOAT A single chain undergoes a directed pivot move. Must be set to 0.0

Parameter file

The parameter file defines the interactions experienced by the system. Note - EVERY bead defined on a CHAIN must be included in the parameter file and fully defined, with no exceptions.

A parameter file has three sections:

The angle section

PIMMS has a rudimentary 'backbone' angle term. The start of the parameter file includes a section where those angle strengths are defined. The format is

ANGLE_PENALTY <residue name> X X X

For your purposes, these Xs should be 0 - i.e., there is no angle restraint applied. We're still optimizing the implementation here, so I wouldn't use this (yet)

The bead-bead section

Next, for EVERY bead, one must define the bead - bead interaction. PIMMS allows three different distance ranges for interactions (short range, long range, and super long range). These are defined by three distinct values. In this way, of you wanted to define A-B interaction as short, medium, and long, one might write

A	B	-30	-10	-5

This would mean beads A and B are directly adjacent to one another, and an interaction strength of -30 is realized. When they are one site apart -10 and two sites apart -5. These pairwise interactions must be defined for every bead in the system.

The bead-solvent section

Finally, we must ALSO define bead-solvent interactions explicitly; The solvent reserves the bead type 0, such that bead-solvent interactions are

A	0	-5

This would say every solvent-exposed face of bead A provides -5 energy.

Putting it all together

With this, a simple example of a parameter file might be

# angle section
ANGLE_PENALTY <residue name> X X X
ANGLE_PENALTY <residue name> X X X

# bead-bead interactions
A	A -5
A	B -10
B	B 0

# bead-solvent interaction
A	0 0
B	B 0

Output files

PIMMS generates a ton of output files. Below is a brief overview of those files. All files are overwritten when the simulation starts, so simulations can be re-run in the same directory. Output files are subdivided below into distinct types.

General

The following files provide a description of different aspects of the system and simulation.

Filename Explanation
log.txt Contains information on system setup. The current code underutilizes this, and we are expanding the information written here.
parameters_used.prm We've discovered it's useful to explicitly save which parameters were used WITH a simulation for cross-referencing in the future. This means one can 100% reproduce a simulation from the output files.
absolute_energies_of_angles.txt There are two modes that angle energies can be defined, one of which scales the energies by T, in which case the absolute energies depend on the simulation temperature. This file reports those absolute energies and is honestly best used for debugging stuff

System output

The following files report on the 2D or 3D orientation of the simulated system and the frequency at which they are written out is determined by XTC_FREQ.

Filename Explanation
START.pdb This file defines the topology of the simulation system and can be viewed in all good molecular viewers (i.e. VMD: vmd START.pdb)
traj.xtc This trajectory file defines the molecular evolution of the system and can be viewed in most good molecular viewers in conjunction with the topology file [START.pdb] (i.e., for VMD: vmd START.pdb traj.xtc

The following files report the step-dependent value of different aspects of the simulation. For all the following files the first column reports on the relative step number (where each of the move-types is treated as a single step (i.e. ignoring sub-steps).

Filename Explanation
ENERGY.dat Reports on the instantaneous potential energy of the system.
MOVE_FREQS.dat Reports the frequency with which each move type is proposed. Note that for crankshaft, the TOTAL number of moves is reported (i.e., including subset MC steps) such that these values can be interpreted as the absolute number of accept/reject moves proposed.
ACCEPTANCE.dat Shows the same information as in MOVE_FREQS.dat but with accepted moves, allowing the user to back-calculate the acceptance ratio on any move type.
TOTAL_MOVES.dat Tracks the total number of moves (again, moves here include all sub-moves in crankshaft steps, so this counts the number of accept/reject operations performed).
PERFORMANCE.dat For evaluating performance, this file writes out the time per step (using overall steps) and elapsed and expected remaining time. This file is generated every 5th percentile of the simulation (or every 1 step, whichever is larger). Estimated remaining time information is also added to the logging file simultaneously. There is no keyword to control this frequency for consistency across runs.

Instantaneous single-chain analysis files

These files describe the instantaneous analysis that will be most relevant for thinking about a single chain. This means output from this analysis is written at some regular interval as defined by ANALYSIS_FREQ or, if simulations with many chains are run, the associated analysis is performed for every chain, which - if you don't care about it - can be computationally expensive.

Filename Explanation
RG.dat Reports on the instantaneous radius of gyration of the chain(s).
ASPH.dat Reports on the instantaneous asphericity of the chain(s).
END_TO_END_DIST.dat Reports on the instantaneous end-to-end distance of the chain(s)
RES_TO_RES_DIST.dat Reports on the instantaneous residue-to-residue distance, where residue pairs are defined by the keyword ANA_RESIDUE_PAIRS

Summary single-chain analysis files

These files describe the analysis that will be most relevant for thinking about a single chain, but the analysis that is reported ONLY at the end of the simulation represents an ensemble average.

If simulations with many chains are run, the associated analysis is performed for every chain, which - if you don't care about it - can be computationally expensive. The ANA_* keywords control the frequency of analysis moves - in particular, the ANA_POL will scale all polymeric analyses, which can be useful when running simulations in which the behavior of individual chains is of no interest.

Filename Explanation
INTSCAL.dat Reports the ensemble-average instantaneous internal scaling profile.
INTSCAL_SQUARED.dat Reports the ensemble-average instantaneous root-mean squared (RMS) internal scaling profile.
SCALING_INFORMATION.dat Result of analytical fits to the root-mean square scaling profile to extract the apparent scaling exponent and pre-factor that best describes the 1D scaling information. For details on this, see the associated discussion in Peran, Holehouse et al. PNAS 2019
DISTANCE_MAP.dat Reports the ensemble-average inter-residue distance map. A NON_INTERACTING simulation can be run to generate the excluded-volume equivalent, allowing for a scaling map to be easily generated.

Instantaneous multi-chain output

These files describe the analysis that will be most relevant for thinking about multiple chains interacting together. Two types of multi-chain assemblies are analyzed - clusters and long-range (LR) clusters.

Filename Explanation
CLUSTERS.dat Lists the number of chains in each possible cluster. Chains not in a cluster are counted as clusters of "1" chain. Clusters are defined as chains in a continuous connected network that are 1 or 2 lattice sites away from one another.
LR_CLUSTERS.datw Lists the number of chains in each long-range cluster. Chains not in a long-range cluster are counted as clusters of "1" chain. Long-range clusters are defined as chain in a continuous connected network that are 1 or 2 lattice sites away from one another.
NUM_[LR]_CLUSTERS.dat Total number of (long-range) clusters at a given moment.
[LR]_CLUSTER_RG.dat Radius of gyration of each (long range) cluster. The clusters defined in CLUSTERS.dat (or LR_CLUSTERS.dat) map to those analyzed here.
[LR]_CLUSTER_ASPH.dat Asphericity of each (long-range) cluster. The clusters defined in CLUSTERS.dat (or LR_CLUSTERS.dat) map to those analyzed here.
[LR]_CLUSTER_VOL.dat Volume of each (long-range) cluster. The clusters defined in CLUSTERS.dat (or LR_CLUSTERS.dat) map to those analyzed here.
[LR]_CLUSTER_AREA.dat Volume of each (long-range) cluster. The clusters defined in CLUSTERS.dat (or LR_CLUSTERS.dat) map to those analyzed here.
[LR]_CLUSTER_DEN.dat Density of each (long-range) cluster. The clusters defined in CLUSTERS.dat (or LR_CLUSTERS.dat) map to those analyzed here.
[LR]_CLUSTER_RADIAL_DENSITY_PROFILE.dat Radial density profile of each (long-range) cluster. The clusters defined in CLUSTERS.dat (or LR_CLUSTERS.dat) map to those analyzed here.
CHAIN_<n>_CLUSTERS.dat Fraction of each cluster defined in CLUSTERS.dat that consists of CHAIN .
CHAIN_<n>_LR_CLUSTERS.dat Fraction of each long-range cluster defined in LR_CLUSTERS.dat that consists of CHAIN .

Other

Output files that (for now) are not useful/useable

Filename Explanation
restart.pimms PIMMS allows simulations to be restarted, although this functionality is not yet ready. However, the restart.pimms is the only file required for restart.

Example keyfiles

The pimms.tar.gz tarball comes with two examples under

demo_keyfiles/demo_1  # multi-chain simulation
demo_keyfiles/demo_2  # single-chain simulation
demo_keyfiles/demo_3  # single-chain simulation

The key files here (KEYFILE.kf are heavily annotated, and a separate readme.md is found.

Changelog

0.1.40 (March modernization)

  • See changelog.md for an extensive description of the March Modernization

0.1.39 (TBD currently 0.1.38 patches)

  • Improved PIMMS -i keyword descriptions to capture all valid keywords.
  • Added checks when reading in XTC/PDB files for writing output to provide useful warnings if files are missing.
  • Fixed smol bug where chainType ID was not being correctly incremented when multiple different EXTRA_CHAINS were being added.
  • Added files into the demos/surface showing example of running a simulation on a surface.
  • Added support for non-cubic / square boxes (EXPERIMENTAL_FEATURE) as well as the new keyword EQUILIBRATION_OFFSET
  • Added check to escape infinite loop in case broken chains have somehow been loaded...

0.1.38 (April 2024)

  • Fixed bug where autocenter was not ignored if multiple chains are provided. NOTE there still seems to be a weird bug where autocenter chains saved using SAVE_AT_END occasionally 'jump' in absolute position, but chain never straddles PBC so should not through any issues for chain-centric frame of reference analysis.
  • Fixed bug where if RESIZED_EQUILIBRATION was set to True we tried to write to START.pdb even though eq_START.pdb was created. Also fixed so unused eq_START.pbd and eq_traj.xtc are not created.

0.1.37 (March 2024)

  • Fixed bug in write_positions_to_file() where spacing was not correctly passed to initialize_pdb_file().
  • Added ability to pass an sequence to write_positions_to_file(), although this currently only works/makes sense if you have a SINGLE chain. This could be updated in the future.
  • Actually documented restart functionality and the EXTRA CHAINS keyword, which enables restarted simulations to start with additional chains.
  • Added sanity check to ensure EXTRA_CHAIN is only valid if a restart file is provided
  • Added sanity check to restart file to ensure chain sequence string and chain position list are the same length.
  • Added freeze chain functionality:
    • Freeze chain functionality enables one or more chains in the simulation to be frozen in place. Chains are identified by their chain ID. In PIMMS, each chain is deterministically assigned a unique ID based on the input order (i.e. number of chains, order chains are defined using CHAIN keyword, use of restart file, and EXTRA_CHAIN keyword).
    • To define chains to be frozen, the FREEZE_FILE keyword lets you define the path to a 2-column file that describes what should be frozen, referencing by this chain ID (see FREEZE_FILE keyword).
    • To make it easy to figure out this mapping, we also included a WRITE_CHAIN_TO_CHAINID keyword which if set to True means PIMMS writes a file mapping the chain ID to each chain so you can manually work out which chainID you want to use.
    • Added documentation (both here and internally) for the FREEZE file keywords.
  • KEYWORDS ADDED: FREEZE_FILE, WRITE_CHAIN_TO_CHAINID (discussed above).

0.1.36 (February 2024)

  • Updated docs
  • Updated demo keyfiles
  • Fixed mismatch in integer type breaking PIMMS entirely...
  • Changed performance output, removing the STEPS_PER_SECOND.dat file, which was always questionably useless, and replacing it with a PERFORMANCE.dat, which includes steps per second information, time elapsed, and anticipated time remaining, all nicely formatted in a header-containing output file. This file is always written and is written every 5th percentile through the simulation AND after 20 steps just so if you're running a REALLY long simulation, you can get a ballpark estimate of how bad this is gonna be relatively quickly.
  • Improved many docstrings
  • Ensured SAVE_AT_END and SAVE_EQ status is now written out during initialization
  • Updated log info to get estimated time remaining updates
  • Added safety check to ensure Cython and Numpy intsizes are matched, rather than just having PIMMS crash with an obscure error!
  • Added safety check to ensure PIMMS can accomodate the number of beads on the lattice of the given integer type

0.1.35 (January 2024)

  • Major update to Cython backend to facilitate better control over memory usage. In previous versions, PIMMS defined all back-end grids (chain grids and type grids) as [n x n x n] matrices, where each element was a 64-bit number. Because of the cubic term here, as grids become larger the memory footprint associated with PIMMS becomes very large; for a [200 x 200 x 200] grid, the memory footprint can reach 100s of MBs. In version 0.1.35, the backend memory management has been made dynamic; that is, we can compile PIMMS versions that specify the number of bits associated with the elements in the grids. Right now, the default version compiles with 64-bit integers still, but to recompile with a smaller memory footprint you can change just two flags in CONFIGS.py and the newly created cython_config.pxd; specifically :

      ctypedef cnp.int64_t NUMPY_INT_TYPE 
      
      # can be changed to
      
      ctypedef cnp.int32_t NUMPY_INT_TYPE 
      
      # or 
      
      ctypedef cnp.int16_t NUMPY_INT_TYPE 
    

    While in CONFIGS.py

      NP_INT_TYPE = np.int64
      
      # can be changed to
      
      NP_INT_TYPE = np.int32
      
      # or
      
      NP_INT_TYPE = np.int16
    

    Right now, we continue to default to 64-bit numbers as this is test driven, but ultimately the plan is to transition to a 16-bit backend which basically reduces the memory footprint down to 25% of what it would have been with a 64-bit backend. To what extent this improves performance (steps/second) is unclear, but it HUGELY helps running many parallel jobs on many CPU systems where we actually become memory limited!

  • In addition to the memory re-write, we re-wrote delete_pbc_pairs() in inner_loops_hardwall.pyx to substantially improve performance by fully typing the function - for non 64-bit numbers this adds a ~8x improvement in performance, and maybe 1-2x for native (64-bit) memory implementations.

  • Despite the big improvement in memory utilization, arguably the most important update in version 0.1.35 is a re-write of how trajectory saving is done. In particular, we previously had a punishingly inefficient approach for writing new trajectory frames that was so stupid it almost makes you wonder if it was a deliberate act of sabotage by someone. In any case, we (Ryan) has re-written this code to [firstly] ensure trajectory writing is done in a single output operation of XTC only data (instead of the 3x I/O operations we had previously, [don't ask...]).

  • Beyond this update, we (Ryan) also added the SAVE_AT_END keyword (default = False). If set to True, this means the simulation only writes the entire XTC at the end of the simulation. If you are worried about simulations crashing this is not ideal. However, where this is not a major concern, avoiding many I/O operations offers big gains, especially for larger systems.

  • Added SAVE_EQ keyword. Default True. If set to False, equilibrations steps are not saved. This works for both RESIZED_EQUILIBRATION experiments (an eq.traj file is still made but it only contains a single frame) and when RESIZED_EQUILIBRATION is not used (standard sims). When RESIZED_EQUILIBRATION is not used, PIMMS will begin saving (or updating the trajobj if SAVE_AT_END is set to True) after the equilibration step but does not save before.

  • KEYWORDS ADDED: SAVE_AT_END, SAVE_EQ (discussed above).

  • Version 0.1.35 is the final architectural change prior to the bump to 0.2.0 which will be the first live PIMMS release. Get psyched. Small updated (e.g. 0.1.36) will come after but these will be minor patches.

1.0.34 (September 2023)

  • Major update to Cython backend to improve performance. All numpy arrays are now passed as memory views instead of as new arrays, which reduces the overhead on large arrays substantially
  • This big re-write has been tested extensively without any issues identified
  • The default lattice-to-realspace value (LATTICE_TO_ANGSTROMS) has been updated from 4.0 nm to 3.65 nm
  • KEYWORDS ADDED: CASE_INSENSITIVE_CHAINS, AUTOCENTER

0.1.33-patch

  • Update so logfile is always a new file instead of appended to (pimmslogger.py)

0.1.33 (October 2022)

  • Restructured to define the DEFAULTS dictionary which sets and explains default parameters for keyfiles. This means default options are encoded directly in
  • Updated internal documentation
  • Added reduceD_printing mode
  • KEYWORDS ADDED: REDUCED_PRINTING
  • Fixed bug which could lead to an error when a non-essential keyword was unset
  • If residues are unknown to single letter-to-three letter conversion ensure that first character in the unknown residue type is not a number, because this causes PDB readers to fail.

0.1.32 (April 2022)

  • Added EXTRA_CHAIN keyword
  • Fixed bug in how pdb chain ID was being written for TER lines (always using chain A)
  • Added and improved internal RestartObject code and functionality (including improved parsing)
  • Improved information printed when a RESTART file is used to make it easier to see what is going on.
  • Added CONECT records to output PDB files, so bonds between chains are easily visualized
  • Added LATTICE_TO_ANGSTROMS keyword such that PDB file dimensions are controllable. Default=4 (same as before) so this will not change anything compared to prior simulations.
  • Improved code documentation and removed xtc_utils due to redundancy.

0.1.31 (March 2022)

  • Changed so PDB chains defined by the internal chainType - that is, all chains of same type have same PDB chain ID, which is convenient for visualization

Copyright

Copyright (c) 2015-2024, Alex Holehouse & Ryan Emenecker

About

No description, website, or topics provided.

Resources

License

Code of conduct

Contributing

Stars

Watchers

Forks

Packages

 
 
 

Contributors