Protonation Utilities#
Modules for handling molecular protonation states.
Protonation#
Script to protonate SMILES strings using OpenBabel, MolScrub, or ChemAxon.
Requires (depending on the method chosen):
- openbabel
- chemaxon
- molscrub
- shepherd_score.protonation.protonate.neutralize_atoms(mol)[source]#
Attempts to neutralize every atom of a molecule by adding/removing hydrogens. It doesn’t attempt to keep the formal charge of the original molecule, and it neutralizes more charged atoms than rdMolStandardize.Uncharger does.
Below is copied from: https://rdkit.org/docs/Cookbook.html
This neutralize_atoms() algorithm is adapted from Noel O’Boyle’s nocharge code. It is a neutralization by atom approach and neutralizes atoms with a +1 or -1 charge by removing or adding hydrogen where possible. The SMARTS pattern checks for a hydrogen in +1 charged atoms and checks for no neighbors with a negative charge (for +1 atoms) and no neighbors with a positive charge (for -1 atoms), this is to avoid altering molecules with charge separation (e.g., nitro groups).
The neutralize_atoms() function differs from the rdMolStandardize.Uncharger behavior. See the MolVS documentation for Uncharger:
https://molvs.readthedocs.io/en/latest/api.html#molvs-charge
“This class uncharges molecules by adding and/or removing hydrogens. In cases where there is a positive charge that is not neutralizable, any corresponding negative charge is also preserved.”
As an example, rdMolStandardize.Uncharger will not change charges on C[N+](C)(C)CCC([O-])=O, as there is a positive charge that is not neutralizable. In contrast, the neutralize_atoms() function will attempt to neutralize any atoms it can (in this case to C[N+](C)(C)CCC(=O)O). That is, neutralize_atoms() ignores the overall charge on the molecule, and attempts to neutralize charges even if the neutralization introduces an overall formal charge on the molecule.
- Parameters:
mol (rdkit.Chem.Mol)
- Return type:
rdkit.Chem.Mol
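The recipe described above can be sketched as follows. This is a lightly restyled version of the RDKit Cookbook code, not necessarily the package's exact implementation. The lazy import is an addition made here so the sketch loads even where RDKit is not installed.

```python
def neutralize_atoms(mol):
    """Neutralize +1/-1 atoms in place by removing/adding hydrogens (RDKit Cookbook recipe)."""
    # Imported lazily (an assumption of this sketch) so the snippet loads without RDKit.
    from rdkit import Chem

    # +1 atoms that carry a hydrogen and have no negatively charged neighbor,
    # or -1 atoms that have no positively charged neighbor.
    pattern = Chem.MolFromSmarts(
        "[+1!h0!$([*]~[-1,-2,-3,-4]),-1!$([*]~[+1,+2,+3,+4])]"
    )
    for (atom_idx,) in mol.GetSubstructMatches(pattern):
        atom = mol.GetAtomWithIdx(atom_idx)
        charge = atom.GetFormalCharge()
        h_count = atom.GetTotalNumHs()
        atom.SetFormalCharge(0)
        # Removing a +1 charge drops one hydrogen; removing a -1 charge adds one.
        atom.SetNumExplicitHs(h_count - charge)
        atom.UpdatePropertyCache()
    return mol
```

Note that the molecule is modified in place and also returned. Charge-separated groups such as nitro are left untouched, because each charged atom has an oppositely charged neighbor and therefore does not match the pattern.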
- shepherd_score.protonation.protonate.force_neutralize(mol)[source]#
Force neutralize a molecule by adding/removing hydrogens. Does not attempt to keep the formal charge of the original molecule. First runs neutralize_atoms, then runs rdMolStandardize.ChargeParent.
- Parameters:
mol (rdkit.Chem.Mol)
- Return type:
rdkit.Chem.Mol
- shepherd_score.protonation.protonate.remove_bad_protomers(protomers)[source]#
Remove protomers with bad charges, following the rules used in the ZINC22 build pipeline.
- shepherd_score.protonation.protonate.protonate_smiles(smiles, pH=7.4, method='molscrub', *, path_to_bin='', cxcalc_exe=None, molconvert_exe=None, chemaxon_license_path=None)[source]#
Protonate SMILES string with MolScrub, OpenBabel, or ChemAxon at given pH.
ChemAxon requires the cxcalc and molconvert executables to be installed, along with a valid license. The license path is either passed as input or read from the CHEMAXON_LICENSE_URL environment variable.
OpenBabel workflow adapted from DockString: dockstring/dockstring
- Parameters:
smiles (str) – SMILES string of molecule to be protonated.
pH (float (default: 7.4)) – pH at which the molecule should be protonated.
method (Literal['openbabel', 'molscrub', 'chemaxon']) – Method to use for protonation. Defaults to ‘molscrub’.
path_to_bin (str (default: '')) – Path to environment bin containing mk_prepare_ligand.py.
cxcalc_exe (str | None (default: None)) – Path to cxcalc executable.
molconvert_exe (str | None (default: None)) – Path to molconvert executable.
chemaxon_license_path (str | None (default: None)) – Path to chemaxon license file. If None, the CHEMAXON_LICENSE_URL environment variable is used.
- Returns:
List of SMILES strings of tautomers/protomers.
- Return type:
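To illustrate the argument handling documented above, here is a minimal sketch of two pre-flight helpers. Both `check_method` and `resolve_chemaxon_license` are hypothetical names introduced for this example and are not part of shepherd_score. The sketch only mirrors the documented behavior: three accepted method names, and a fallback to the CHEMAXON_LICENSE_URL environment variable when no license path is given.

```python
import os
from typing import Optional

# The three backends named in the protonate_smiles signature.
VALID_METHODS = ("openbabel", "molscrub", "chemaxon")


def check_method(method: str) -> str:
    """Hypothetical helper: reject method names other than the documented three."""
    if method not in VALID_METHODS:
        raise ValueError(f"method must be one of {VALID_METHODS}, got {method!r}")
    return method


def resolve_chemaxon_license(chemaxon_license_path: Optional[str] = None) -> Optional[str]:
    """Hypothetical helper: prefer the explicit path, else the environment variable."""
    if chemaxon_license_path is not None:
        return chemaxon_license_path
    # Returns None when the variable is unset, leaving the decision to the caller.
    return os.environ.get("CHEMAXON_LICENSE_URL")
```

Running checks like these before calling protonate_smiles with method='chemaxon' would surface a missing license early, rather than partway through an external tool call.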
ChemAxon Utilities#
Script to tautomerize and protonate molecules using ChemAxon’s package. REQUIRES ChemAxon license.
tautomerize_chemaxon functions are adapted from jenna-fromer/build_3d_py
- shepherd_score.protonation.chemaxon_utils.tautomerize_chemaxon(smiles, cxcalc_exe, molconvert_exe, pH=7.4, cutoff=10, tautomer_limit=20, protomer_limit=20, neutralize=True, verbose=False, chemaxon_license_path=None)[source]#
Tautomerize/protonate a molecule using ChemAxon’s package. Defaults are from the ZINC22 build pipeline.
- Parameters:
smiles (str) – SMILES string of the molecule to tautomerize/protonate.
cxcalc_exe (str) – Path to the cxcalc executable.
molconvert_exe (str) – Path to the molconvert executable.
pH (float (default: 7.4)) – pH value to use for the protonation.
cutoff (float (default: 10)) – Cutoff value to use for the tautomerization/protonation.
tautomer_limit (float (default: 20)) – Limit for the tautomerization/protonation.
protomer_limit (float (default: 20)) – Limit for the protomerization.
neutralize (bool (default: True)) – Whether to neutralize the molecule before tautomerization/protonation.
verbose (bool (default: False)) – Whether to print verbose output.
chemaxon_license_path (str | None (default: None)) – Path to the chemaxon license file. If None, the CHEMAXON_LICENSE_URL environment variable is used.
- Returns:
List of SMILES strings of the tautomers/protomers with bad charges removed.
- Return type:
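Because tautomerize_chemaxon takes paths to external executables, it can help to confirm those paths before calling it. The helper below is a hypothetical sketch (`require_executable` is not part of shepherd_score) built on the standard library's `shutil.which`. Whether the package performs a similar check internally is not stated here.

```python
import shutil


def require_executable(path: str, label: str) -> str:
    """Hypothetical helper: return the resolved executable path or raise."""
    # shutil.which accepts either a bare command name or a full path and
    # returns None when nothing executable is found.
    resolved = shutil.which(path)
    if resolved is None:
        raise FileNotFoundError(f"{label} not found or not executable: {path!r}")
    return resolved
```

A caller might pass the results of `require_executable(cxcalc_exe, "cxcalc_exe")` and `require_executable(molconvert_exe, "molconvert_exe")` on to tautomerize_chemaxon.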
MolScrub Utilities#
Script to tautomerize and protonate molecules using the molscrub package developed by the Forli Lab.
- shepherd_score.protonation.molscrub_utils.tautomerize_molscrub(smiles, pH=7.4, neutralize=True)[source]#
Find all tautomers/protomers of a molecule using the molscrub package.
- Parameters:
smiles (str) – SMILES string of the molecule to tautomerize/protonate.
pH (float (default: 7.4)) – pH value to use for the protonation.
neutralize (bool (default: True)) – Whether to neutralize the molecule before scrubbing.
chemaxon_license_path (str | None (default: None)) – Path to the chemaxon license file. If None, the CHEMAXON_LICENSE_URL environment variable is used.
- Returns:
List of SMILES strings of the tautomers/protomers.
- Return type: