We developed a Python-based open-source package to analyze the results of ab initio molecular-dynamics simulations of fluids. The package is a collection of Python scripts that includes two major libraries, one dealing with file formats and one with crystallography. We propose a simplified format to store the extracted trajectories and the relevant thermodynamic information of the simulations, saved in UMD (“Universal Molecular Dynamics”) files. The package computes a series of structural, transport and thermodynamic properties. Starting from the pair-distribution function, it defines bond lengths, builds an interatomic connectivity matrix, and eventually determines the chemical speciation. Determining the lifetimes of the chemical species allows a full statistical analysis to be run. Dedicated scripts then compute the mean-square displacements of the atoms as well as of the chemical species, and determine the diffusion coefficients. The implemented self-correlation analysis of the atomic velocities yields the diffusion coefficients and the vibrational spectrum; the same analysis applied to the stresses yields the viscosity. The package is available on GitHub as an open-access package, and we encourage further collaborators. A compact archive can be downloaded from here.

If you use the code, please cite it using its DOI while waiting for its publication in a scientific paper (later this year): 10.5281/zenodo.3710978

## UMD file format

The UMD files are ASCII files; the typical extension is .umd.dat, but it is not mandatory.

They are appropriate for ensembles in which the number of particles remains unchanged throughout the simulation. The UMD package can read files stemming from calculations where the shape and volume of the simulation box vary. This covers the most common calculations, like NVT and NPT, where the number of particles, N, the temperature, T, and the volume, V, or the pressure, P, are kept constant.

Each physical property is expressed on one line. Every line starts with a keyword. In this way the format is highly adaptable and allows for new properties to be added to the umd file, all the while preserving its readability throughout versions.
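The keyword-per-line layout makes a tolerant reader straightforward to write. The following is a minimal sketch, not the package's own reader: the function names and the sample keywords (`natom`, `pressure`) are hypothetical and chosen only for illustration. It assumes whitespace-separated tokens, with the first token of each line acting as the keyword.

```python
def parse_keyword_line(line):
    """Split one line into (keyword, values); numeric tokens become floats."""
    tokens = line.split()
    if not tokens:
        return None  # empty line, e.g. the separator that ends the header
    keyword, values = tokens[0], []
    for token in tokens[1:]:
        try:
            values.append(float(token))
        except ValueError:
            values.append(token)  # keep non-numeric tokens (units, symbols) as strings
    return keyword, values


def parse_block(lines):
    """Collect keyword -> values for a block of lines, skipping empty lines."""
    record = {}
    for line in lines:
        parsed = parse_keyword_line(line)
        if parsed is not None:
            keyword, values = parsed
            record[keyword] = values
    return record


# Hypothetical keywords, for illustration only
sample = ["natom 4", "pressure 12.5 GPa", ""]
record = parse_block(sample)
```

Because every property is looked up by its keyword, a reader written this way simply stores keywords it does not know, which is what lets new properties be added without breaking older scripts.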

All .umd files contain a header describing the content of the simulation cell: the number of atoms, electrons, and atomic types, as well as details for each atom, such as its type, chemical symbol, number of valence electrons, and its mass. One empty line marks the end of the header, and separates it from the main part of the umd file.

Then each step of the simulation is detailed. A short header lists the instantaneous thermodynamic parameters, like energy, stresses, equivalent hydrostatic pressure, density, volume, lattice parameters, etc. This information is followed by the table describing the actual atomic positions and velocities. The table starts with a header line, where the different quantities are listed together with their units.

Then each atom is detailed on one line, where

- columns 1-3 give the atomic positions expressed in reduced coordinates
- columns 4-6 give the atomic positions expressed in Cartesian coordinates
- columns 7-9 give the real Cartesian positions that take diffusion into account
- columns 10-12 give the atomic velocities
- columns 13-15 give the atomic forces
- column 16 gives the atomic charges
- column 17 gives the atomic local magnetic moment.
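The column layout above can be mapped onto a small record type. This is a sketch under stated assumptions, not the package's reader: `AtomRecord` and `parse_atom_line` are hypothetical names, and the 17 columns are assumed to be contiguous and whitespace-separated, so that the forces occupy columns 13-15.

```python
from typing import NamedTuple, Tuple

Vec3 = Tuple[float, ...]


class AtomRecord(NamedTuple):
    reduced: Vec3          # columns 1-3
    cartesian: Vec3        # columns 4-6
    real_cartesian: Vec3   # columns 7-9, positions that include diffusion
    velocity: Vec3         # columns 10-12
    force: Vec3            # columns 13-15 (assumed contiguous layout)
    charge: float          # column 16
    magnetic_moment: float # column 17


def parse_atom_line(line: str) -> AtomRecord:
    """Convert one atom line of the per-step table into an AtomRecord."""
    fields = [float(x) for x in line.split()]
    if len(fields) != 17:
        raise ValueError(f"expected 17 columns, got {len(fields)}")
    return AtomRecord(
        reduced=tuple(fields[0:3]),
        cartesian=tuple(fields[3:6]),
        real_cartesian=tuple(fields[6:9]),
        velocity=tuple(fields[9:12]),
        force=tuple(fields[12:15]),
        charge=fields[15],
        magnetic_moment=fields[16],
    )
```

Rejecting lines with an unexpected number of columns makes a truncated or malformed file fail early instead of silently shifting the fields.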

All the functions related to the reading and printing of the umd files are combined into one Python library, umd_process.py. These functions are consistently called by all the scripts of the package after proper loading.

## Crystallography library

All the information related to the actual atomic structure is stored and handled by a series of functions and data structures grouped in the *crystallography.py* library. The underlying philosophy is to treat the lattice as a vector space. The unit cell parameters together with their orientation represent the basis vectors. The space has a series of scalar attributes (*e.g.* specific volume, density, temperature, and specific number of atoms), thermodynamic properties (*e.g.* internal energy, pressure, and heat capacity), and a series of tensorial properties (*e.g.* stress and elasticity). The space is populated by atoms. This whole ensemble is defined in the *Lattice* class.

The atoms are defined in the *Atoms* class. The atoms are characterized by a series of scalar properties (*e.g.* name, symbol, mass, and number of electrons) and a series of vectorial properties. The latter are related to the position of the atoms in space, either relative to the vectorial basis described in the *Lattice* class, or relative to universal Cartesian coordinates. Transformations between the various types of coordinates are realized inside the *Lattice* class. The atoms also possess velocities and forces, which are both vectorial properties.

## Thermodynamic averages

The statistical error of the average is computed using the halving method. Starting with the initial data sample, at each step *k* the number of samples is halved by averaging every two consecutive samples from the previous step *k-1*. The procedure is repeated recursively until only two averages remain. As the procedure advances, the error estimate converges, and the converged value can be extracted automatically.
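The halving procedure can be sketched as below. This is an illustration under assumptions rather than the package's implementation: `halving_errors` is a hypothetical name, the error at each level is taken as the standard error of the mean of the current samples, and a trailing unpaired sample is simply dropped. How the package detects convergence is not specified here, so the sketch only returns the sequence of estimates.

```python
import math


def halving_errors(samples):
    """Return a list of (number of samples, standard error of the mean) per halving level."""
    data = list(samples)
    levels = []
    while len(data) >= 2:
        n = len(data)
        mean = sum(data) / n
        # Unbiased sample variance of the current (possibly averaged) samples
        variance = sum((x - mean) ** 2 for x in data) / (n - 1)
        levels.append((n, math.sqrt(variance / n)))
        # Halve the sample: average consecutive pairs, dropping an unpaired last value
        data = [(data[i] + data[i + 1]) / 2 for i in range(0, n - 1, 2)]
    return levels


levels = halving_errors([1.0, 3.0, 2.0, 4.0])
```

For correlated data the estimate typically grows over the first levels and then levels off; the plateau value is the one to report, while the last levels, built from very few samples, are noisy.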