Philosophy and approach
The mdacli project evolved from our experiences in taurenmd and maicos which both build a CLI interface on the fly.
They were developed since we were facing the same problem in the labs regularly. New students/scientists start writing their analysis scripts and face the same challenges and problems i.e.
How to initialize the universe and loop through frames without copying many lines of code?
How to write a CLI parser to analyze several of their simulations?
How to process and save their trajectories in a clever way?
…
Some of these problems can be solved by using the
MDAnalysis.analysis.base.AnalysisBase
class of MDA. However,
this class is limited to python, and sometimes a direct CLI to these
scripts is very helpful for the day-to-day analysis.
A generic CLI wrapper for all classes based on the AnalysisBase could
therefore help people to analyze their simulation
data with the least effort. With this approach, it is easier to use for
MDA-users since they just stay within their known universe
with known selection commands and results structures.
An existing framework makes it also more attractive for users
and developers to write their analysis using the
base.AnalysisBase.
Starting from taurenmd and maicos we
developed a general CLI for any
MDAnalysis.analysis.base.AnalysisBase
class.
mdacli detects all analysis classes located inside the
MDA project and builds a CLI wrapper around them. The wrapper
is generic so it also applies to any downstream
project that uses the MDAnalysis.analysis.base.AnalysisBase
as parent class. If new classes are added to the
MDA codebase they are just they will show up without
any adjustments to mdacli itself.
The core of the wrapper is a docstring parser in combination
with an argument inspection using the inspect library. Based on
a created dictionary containing
each parameter of the class with its docstring and type, the actual command
line interface is build using argparse
. The syntax of the
topology and the trajectory flags (-f, -s, …) is inspired by the
GROMACS CLI
syntax.
The interface also provides a way to save the data using that all analysis results of an AnalysisBase class is stored inside results objects. The saving routines automatically detects the type of the results and saves them either as JSON dumps (for simple variables), CSV files (for 1D and 2D arrays), or zipped data dumps for high higher-dimensional arrays.