rdkit

Cheminformatics toolkit for molecular analysis and design. Perform SMILES/SDF parsing, descriptor calculation (LogP, TPSA), fingerprinting, substructure searching, and chemical reaction modeling using RDKit.

Introduction

The RDKit skill provides an environment for cheminformatics and computational chemistry workflows. It is designed for researchers, medicinal chemists, and data scientists working on drug discovery, materials science, or molecular modeling. Built on the widely used RDKit Python library, it lets users handle chemical data programmatically, automate the parsing of structure files, and run computational analyses inside pipelines without manual steps.

  • Perform molecular I/O using SMILES strings, MOL files, MOL blocks, and InChI formats.

  • Apply automated sanitization (valence checks, aromaticity perception) and stereochemistry assignment when molecules are parsed.

  • Calculate key molecular descriptors such as Molecular Weight, LogP, TPSA, and Lipinski's Rule of Five parameters for drug-likeness.

  • Conduct batch processing of large chemical datasets via SDMolSupplier and multithreaded readers.

  • Perform structural analysis including ring perception (SSSR), fragment identification, Murcko scaffold extraction, and substructure searching.

  • Generate 2D depictions and 3D coordinate sets for molecular visualization and spatial analysis.

  • Compute molecular fingerprints (RDKit, MACCS keys, topological torsions) for similarity assessment and virtual screening.

  • Model chemical reactions and manipulate structures programmatically for library enumeration.

  • Always validate Mol objects using conditional checks to handle parsing errors, as invalid input returns None.

  • Molecules are sanitized automatically when parsed; pass sanitize=False to the parser to disable this for custom processing or troubleshooting.

  • For large input files or high-throughput screening, prefer the multithreaded suppliers to speed up reading.

  • Use the MurckoScaffold module to decompose molecules into core frameworks for medicinal chemistry analysis.

  • When running similarity searches, choose a fingerprint type that suits the analysis: different fingerprints encode different structural features and can rank the same molecules differently.

  • This tool is intended for programmatic control; for simpler workflows, consider whether high-level wrappers like datamol provide a more efficient entry point.
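
The parsing, validation, sanitization, and descriptor points in the list above can be illustrated with a minimal sketch. The helper names `describe` and `parse_unsanitized` are hypothetical and chosen only for this example; the RDKit calls themselves (`Chem.MolFromSmiles`, `Chem.SanitizeMol`, `Descriptors.MolWt`, `Crippen.MolLogP`, `rdMolDescriptors.CalcTPSA`, `Lipinski.NumHDonors`, `Lipinski.NumHAcceptors`) are standard library functions. It assumes RDKit is installed.

```python
from rdkit import Chem
from rdkit.Chem import Crippen, Descriptors, Lipinski, rdMolDescriptors


def describe(smiles):
    """Parse a SMILES string and return a few drug-likeness descriptors.

    Returns None when RDKit cannot parse or sanitize the input.
    (Hypothetical helper, for illustration only.)
    """
    mol = Chem.MolFromSmiles(smiles)  # sanitization runs by default
    if mol is None:  # invalid input yields None instead of raising
        return None
    return {
        "mol_wt": Descriptors.MolWt(mol),
        "logp": Crippen.MolLogP(mol),
        "tpsa": rdMolDescriptors.CalcTPSA(mol),
        "h_donors": Lipinski.NumHDonors(mol),
        "h_acceptors": Lipinski.NumHAcceptors(mol),
    }


def parse_unsanitized(smiles):
    """Parse without sanitization, then sanitize explicitly.

    Returns (mol, error_message); error_message is None on success.
    Useful for seeing why a molecule is rejected. (Hypothetical helper.)
    """
    mol = Chem.MolFromSmiles(smiles, sanitize=False)
    if mol is None:  # the SMILES syntax itself is invalid
        return None, "SMILES syntax error"
    try:
        Chem.SanitizeMol(mol)  # raises on chemistry problems such as bad valence
    except Exception as exc:
        return mol, str(exc)
    return mol, None


if __name__ == "__main__":
    print(describe("CC(=O)Oc1ccccc1C(=O)O"))    # aspirin
    print(describe("C1CC"))                      # unclosed ring: None
    print(parse_unsanitized("C(C)(C)(C)(C)C"))   # five-valent carbon
```

Separating parsing from sanitization in `parse_unsanitized` keeps the molecule object available for inspection even when the chemistry check fails, which the default parser call does not allow.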

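The scaffold, substructure, and fingerprint points in the list above can be sketched in a similar way. The function names `scaffold_smiles`, `has_substructure`, and `maccs_tanimoto` are hypothetical helpers for this example, and MACCS keys are used only as one of the fingerprint types mentioned; another type may suit a given search better. It assumes RDKit is installed.

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import MACCSkeys
from rdkit.Chem.Scaffolds import MurckoScaffold


def scaffold_smiles(mol):
    """Return the Murcko scaffold of a molecule as a canonical SMILES string."""
    core = MurckoScaffold.GetScaffoldForMol(mol)
    return Chem.MolToSmiles(core)


def has_substructure(mol, smarts):
    """Return True if the molecule contains the SMARTS pattern."""
    pattern = Chem.MolFromSmarts(smarts)
    if pattern is None:  # invalid SMARTS also yields None
        raise ValueError(f"invalid SMARTS: {smarts!r}")
    return mol.HasSubstructMatch(pattern)


def maccs_tanimoto(mol_a, mol_b):
    """Tanimoto similarity between the MACCS key fingerprints of two molecules."""
    fp_a = MACCSkeys.GenMACCSKeys(mol_a)
    fp_b = MACCSkeys.GenMACCSKeys(mol_b)
    return DataStructs.TanimotoSimilarity(fp_a, fp_b)


if __name__ == "__main__":
    aspirin = Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")
    salicylic_acid = Chem.MolFromSmiles("O=C(O)c1ccccc1O")
    print(scaffold_smiles(aspirin))                   # ring framework only
    print(has_substructure(aspirin, "C(=O)[OH]"))     # carboxylic acid group
    print(maccs_tanimoto(aspirin, salicylic_acid))
```

Both input molecules are assumed to be valid `Mol` objects here; in a pipeline, check each for `None` after parsing before passing it to these helpers.
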
Repository Stats

Stars: 195
Forks: 26
Open Issues: 4
Language: Python
Default Branch: main
Sync Status: Idle
Last Synced: Apr 30, 2026, 09:35 AM