pymatgen
Comprehensive Python materials science toolkit for crystal structure manipulation, electronic structure analysis, phase diagram computation, and Materials Project integration.
Introduction
Pymatgen (Python Materials Genomics) is an open-source Python library for materials scientists, computational chemists, and researchers working on high-throughput materials design. It is the analysis library behind the Materials Project and, together with the mp-api client, lets users fetch, analyze, and manipulate structural and electronic data from that database. This skill is intended for professionals analyzing atomic configurations, thermodynamics, and physical properties of inorganic compounds, and it provides a standardized framework for materials modeling.
- Advanced crystal structure manipulation including creation from scratch, supercell generation, primitive cell transformation, and space group symmetry analysis using SpacegroupAnalyzer.
- Multi-format support for computational chemistry codes, with seamless reading and writing for over 100 file types, including VASP (POSCAR, CONTCAR), CIF, XYZ, and Gaussian input files.
- Deep analysis capabilities for electronic structure data, such as Band Structures and Density of States (DOS) visualization and data extraction.
- Thermodynamic stability assessment through phase diagram generation, including hull distance calculation and stability analysis for multi-component systems.
- Programmatic access to the Materials Project (MP) API to search for material properties, retrieve structural data, and filter materials by chemical composition or thermodynamic criteria.
- Coordination environment analysis using CrystalNN to characterize local atomic arrangements and bonding neighborhoods.
- Install the core library with pip (pip install pymatgen) and, optionally, install mp-api (pip install mp-api) for access to the Materials Project database. Additional dependencies for visualization and advanced analysis are available via optional pip extras.
- Common inputs are standard computational materials science files such as POSCAR or CIF. Outputs are typically parsed Python objects (Structure, Molecule), numerical property data, or converted files.
- The tool is well suited to automating research pipelines, such as converting batches of structures between simulation formats or calculating stability trends across large design spaces.
- Results are limited by the computational data available in the underlying databases, and Materials Project access requires a valid API key to be configured.
Repository Stats
- Stars: 19,677
- Forks: 2,197
- Open Issues: 42
- Language: Python
- Default Branch: main
- Sync Status: Idle
- Last Synced: Apr 29, 2026, 01:49 AM