Pharmacophore Utilities#
Modules for pharmacophore feature extraction and manipulation.
Pharmacophore Extraction#
Generate pharmacophores from an RDKit conformer.
Parts of code adapted from Francois Berenger / Tsuda Lab and RDKit.
References:
Tsuda Lab: tsudalab/ACP4 (From https://doi.org/10.1021/acs.jcim.2c01623)
RDKit: rdkit/rdkit
- shepherd_score.pharm_utils.pharmacophore.find_hydrophobes(mol, cluster_hydrophobic=True)[source]#
Find hydrophobes and cluster them.
- shepherd_score.pharm_utils.pharmacophore.get_pharmacophores_dict(mol, multi_vector=True, exclude=[], check_access=False, scale=1.0)[source]#
Get the positions of pharmacophore anchors and their associated unit vectors.
Returns a dictionary. Adapted from rdkit.Chem.Features.ShowFeats.ShowMolFeats.
- Parameters:
mol (rdkit.Chem.Mol) – RDKit Mol object with a conformer.
multi_vector (bool, optional) – Whether to represent pharmacophores with multiple vectors. Default is True.
exclude (list, optional) – List of atom indices to not include as an HBD. Default is [].
check_access (bool, optional) – Check if HBD/HBA are accessible to the molecular surface. Default is False.
scale (float, optional) – Length of the vector in Angstroms. Default is 1.0.
- Returns:
Dictionary with format {'FeatureName': {'P': [(anchor coord), ...], 'V': [(rel. vec), ...]}}.
- Return type:
dict
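The dictionary format above can be flattened into parallel identity, anchor, and vector lists. The following is a minimal sketch in plain Python, not the library's implementation. It assumes identities are integer indices into the ordering listed for get_pharmacophores; the helper names and the example coordinates are hypothetical.

```python
# Ordering as listed in the get_pharmacophores description.
PHARM_ORDER = ("Acceptor", "Donor", "Aromatic", "Hydrophobe",
               "Cation", "Anion", "ZnBinder")


def flatten_pharmacophores(pharm_dict):
    """Flatten {'Name': {'P': [...], 'V': [...]}} into parallel lists X, P, V.

    Hypothetical helper: X holds the integer index of each feature type
    (an assumption about how identities are encoded).
    """
    X, P, V = [], [], []
    for idx, name in enumerate(PHARM_ORDER):
        entry = pharm_dict.get(name)
        if entry is None:
            continue  # feature type absent from this molecule
        for anchor, vec in zip(entry["P"], entry["V"]):
            X.append(idx)
            P.append(tuple(anchor))
            V.append(tuple(vec))
    return X, P, V


def extended_points(P, V):
    """Anchor plus relative vector gives the tip of each pharmacophore vector."""
    return [tuple(p + v for p, v in zip(anchor, vec))
            for anchor, vec in zip(P, V)]


# Made-up coordinates, for illustration only.
example = {
    "Donor": {"P": [(0.0, 0.0, 0.0)], "V": [(1.0, 0.0, 0.0)]},
    "Acceptor": {"P": [(1.5, 0.0, 0.0), (1.5, 0.0, 0.0)],
                 "V": [(0.0, 1.0, 0.0), (0.0, -1.0, 0.0)]},
}
X, P, V = flatten_pharmacophores(example)
tips = extended_points(P, V)
```

An acceptor with two vectors contributes two rows that share the same anchor, which is how a multi-vector representation fits into the flat layout.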
- shepherd_score.pharm_utils.pharmacophore.get_pharmacophores(mol, multi_vector=True, exclude=[], check_access=False, scale=1.0)[source]#
Get the identity, anchor positions, and relative unit vectors for each pharmacophore.
Pharmacophore ordering for indexing: (‘Acceptor’, ‘Donor’, ‘Aromatic’, ‘Hydrophobe’, ‘Cation’, ‘Anion’, ‘ZnBinder’)
Notes
The check_access parameter currently tests whether interaction points, sampled from the surface of a sphere of radius 1.8 Å around the acceptor/donor atom, fall outside the solvent-accessible surface defined by the vdW radius + 0.8 Å of the neighboring atoms. This works for buried acceptors/donors, but may be prone to false positives. For example, CN(C)C would have its sole HBA rejected. Other approaches, such as buried volume, should be considered in the future.
- Parameters:
mol (rdkit.Chem.Mol) – RDKit Mol object with conformer.
multi_vector (bool, optional) – Whether to represent pharmacophores with multiple vectors. Default is True.
exclude (list, optional) – List of hydrogen indices to not include as an HBD. Default is [].
check_access (bool, optional) – Check if HBD/HBA are accessible to the molecular surface. Default is False.
scale (float, optional) – Length of a pharmacophore vector in Angstroms. Default is 1.0.
- Returns:
X (np.ndarray) – Identity of pharmacophore corresponding to the indexing order, shape (N,).
P (np.ndarray) – Anchor positions of each pharmacophore, shape (N, 3).
V (np.ndarray) – Unit vectors in a relative position to the anchor positions, shape (N, 3). Adding P and V results in the position of the vector’s extended point.
- Return type:
Tuple
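The accessibility test described in the notes can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the Fibonacci-lattice sampling, the number of sample points, the choice of which atoms count as neighbors, and the vdW radii in the example are all assumptions. Only the 1.8 Å probe radius and the 0.8 Å padding come from the description.

```python
import math


def fibonacci_sphere(n):
    """Return n roughly evenly spaced unit vectors (Fibonacci lattice)."""
    golden = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n
        r = math.sqrt(max(0.0, 1.0 - z * z))
        points.append((r * math.cos(golden * i), r * math.sin(golden * i), z))
    return points


def is_accessible(center, neighbors, probe_radius=1.8, padding=0.8, n_points=200):
    """Hypothetical helper: True if any sampled interaction point is exposed.

    center    -- xyz of the donor/acceptor atom
    neighbors -- list of (xyz, vdw_radius) for the surrounding atoms
    """
    for ux, uy, uz in fibonacci_sphere(n_points):
        point = (center[0] + probe_radius * ux,
                 center[1] + probe_radius * uy,
                 center[2] + probe_radius * uz)
        # Exposed if the point lies outside every padded vdW sphere.
        if all(math.dist(point, xyz) > vdw + padding for xyz, vdw in neighbors):
            return True
    return False


# One neighbor: the far side of the sphere stays exposed.
exposed = is_accessible((0.0, 0.0, 0.0), [((1.4, 0.0, 0.0), 1.7)])

# Six neighbors along +/- x, y, z: every sampled point is covered.
cage = [((s * 1.4 if a == 0 else 0.0,
          s * 1.4 if a == 1 else 0.0,
          s * 1.4 if a == 2 else 0.0), 1.7)
        for a in range(3) for s in (1.0, -1.0)]
buried = is_accessible((0.0, 0.0, 0.0), cage)
```

Because a single exposed sample is enough to accept the atom, crowded but chemically available atoms can still be rejected when every sample lands inside a padded sphere, which is the failure mode the notes mention.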
Pharmacophore Vectors#
Generate the vector features for pharmacophores from an RDKit conformer.
- Adapted from rdkit:
Changed to return anchor position and relative unit vector.
- shepherd_score.pharm_utils.pharmvec.GetAromaticFeatVects(conf, featAtoms, featLoc, return_both=False, scale=1.0)[source]#
Compute the direction vector for an aromatic feature.
Changed: only one vector is returned unless return_both is True; the vector is processed later for visualization and scoring.
- Parameters:
conf – A conformer.
featAtoms – List of atom IDs that make up the feature.
featLoc – Location of the aromatic feature, specified as a Point3D.
return_both – Bool for whether to return both vectors or just one.
scale – The size of the direction vector.
- Returns:
(list of anchor position(s) as RDKit Point3D, list of relative unit vector(s) as RDKit Point3D)
- Return type:
Tuple
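As a geometric illustration, the sketch below assumes the aromatic direction is the normal to the ring plane, computed from the feature location and two ring atoms. It uses plain tuples in place of RDKit Point3D objects, and the function name is hypothetical.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _unit(v, scale=1.0):
    norm = math.sqrt(sum(x * x for x in v))
    return tuple(scale * x / norm for x in v)


def aromatic_vectors(ring_coords, feat_loc, return_both=False, scale=1.0):
    """Hypothetical helper: ring normal anchored at the feature location."""
    v1 = _sub(ring_coords[0], feat_loc)
    v2 = _sub(ring_coords[1], feat_loc)
    normal = _unit(_cross(v1, v2), scale)
    if return_both:
        flipped = tuple(-x for x in normal)
        return [feat_loc, feat_loc], [normal, flipped]
    return [feat_loc], [normal]


# Idealized planar hexagon in the xy-plane, centered at the origin.
ring = [(math.cos(math.radians(60 * k)), math.sin(math.radians(60 * k)), 0.0)
        for k in range(6)]
centroid = (0.0, 0.0, 0.0)
anchors, vectors = aromatic_vectors(ring, centroid)
both_anchors, both_vectors = aromatic_vectors(ring, centroid, return_both=True)
```

Which of the two faces the single vector points to depends on atom ordering, so downstream scoring has to treat the normal and its negative as equivalent.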
- shepherd_score.pharm_utils.pharmvec.GetDonorFeatVects(conf, featAtoms, scale=1.0, exclude=[])[source]#
Get vectors for hydrogen bond donors in the direction of the hydrogens.
- Parameters:
- Returns:
(list of anchor position(s) as RDKit Point3D or None, list of relative unit vector(s) as RDKit Point3D or None, list of neighboring hydrogens or None)
- Return type:
Tuple
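A minimal sketch of the idea, using plain tuples rather than RDKit objects: one vector per attached hydrogen, pointing from the donor atom toward that hydrogen, with excluded hydrogen indices skipped. The function name, the dictionary input, and the choice of the donor heavy atom as the anchor are assumptions for illustration.

```python
import math


def donor_vectors(donor_xyz, hydrogens, scale=1.0, exclude=()):
    """Hypothetical helper.

    hydrogens -- dict mapping hydrogen atom index to its xyz position
    exclude   -- hydrogen indices to skip
    """
    anchors, vectors, used = [], [], []
    for idx, h_xyz in hydrogens.items():
        if idx in exclude:
            continue
        diff = [h - d for h, d in zip(h_xyz, donor_xyz)]
        norm = math.sqrt(sum(c * c for c in diff))
        anchors.append(tuple(donor_xyz))
        vectors.append(tuple(scale * c / norm for c in diff))
        used.append(idx)
    if not anchors:
        # Mirrors the "or None" in the documented return values.
        return None, None, None
    return anchors, vectors, used


# Made-up geometry: a donor at the origin with two hydrogens.
amine_h = {5: (0.0, 0.0, 1.01), 6: (1.01, 0.0, 0.0)}
anchors, vectors, used = donor_vectors((0.0, 0.0, 0.0), amine_h, exclude=[6])
nothing = donor_vectors((0.0, 0.0, 0.0), amine_h, exclude=[5, 6])
```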
- shepherd_score.pharm_utils.pharmvec.GetAcceptorFeatVects(conf, featAtoms, scale=1.0)[source]#
Get the anchor positions and relative unit vectors of an acceptor atom.
Assumes HBAs are only O and N, as defined by smarts_features.fdef. If the HBA is not one of those, the atom is assumed to have one lone pair.
- Parameters:
- Returns:
(list of anchor position(s) as RDKit Point3D or [None], list of relative unit vector(s) as RDKit Point3D or [None])
- Return type:
Tuple
- shepherd_score.pharm_utils.pharmvec.GetHalogenFeatVects(conf, featAtoms, scale=1.0)[source]#
Get the anchor positions and relative unit vectors of a halogen atom. Assumes only one connection.
- Parameters:
conf – RDKit Mol object with a conformer.
featAtoms – List containing RDKit Atom object of atom attributed as an acceptor.
scale – Float (default = 1.0), length of direction vector.
- Returns:
(list of anchor position(s) as RDKit Point3D or [None], list of relative unit vector(s) as RDKit Point3D or [None])
- Return type:
Tuple
- shepherd_score.pharm_utils.pharmvec.GetDonor1FeatVects_single(conf, featAtoms, scale=1.0)[source]#
Get the direction vectors for Donor of type 1. Made to generate a single vector representation.
This is a donor with one heavy atom. It is not clear where we should be putting the direction vector for this. It should probably be a cone. In this case we will just use the direction vector from the donor atom to the heavy atom.
Changed: conditioning based on the number of hydrogens:
1. If 1 hydrogen, the vector points in the direction of the hydrogen.
2. If 2 hydrogens, the vector points in the direction bisecting the two hydrogens.
3. If 3 hydrogens, the vector points in the direction of the bond.
- Parameters:
conf – RDKit Mol object with conformer.
featAtoms – List of atoms that are part of the feature.
scale – Float for length of the direction vector (default = 1.0).
- Returns:
(anchor position as RDKit Point3D or None, relative unit vector(s) as RDKit Point3D or None, list of hydrogen RDKit Atom objects)
- Return type:
Tuple
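The three hydrogen-count cases can be sketched with plain tuples as below. This is not the library's code: the function name is hypothetical, and for three hydrogens the bond direction is taken as heavy atom to donor, which is an assumption about the sign.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _unit(v, scale=1.0):
    norm = math.sqrt(sum(x * x for x in v))
    return tuple(scale * x / norm for x in v)


def donor1_single(donor_xyz, heavy_xyz, hydrogens, scale=1.0):
    """Hypothetical helper returning (anchor, vector) for a one-heavy-neighbor donor."""
    if not hydrogens:
        return None, None
    if len(hydrogens) == 1:
        # Case 1: point at the single hydrogen.
        direction = _unit(_sub(hydrogens[0], donor_xyz), scale)
    elif len(hydrogens) == 2:
        # Case 2: bisect the two donor-hydrogen unit vectors.
        u1 = _unit(_sub(hydrogens[0], donor_xyz))
        u2 = _unit(_sub(hydrogens[1], donor_xyz))
        direction = _unit(tuple(a + b for a, b in zip(u1, u2)), scale)
    else:
        # Case 3: along the bond, assumed here to run heavy atom -> donor.
        direction = _unit(_sub(donor_xyz, heavy_xyz), scale)
    return tuple(donor_xyz), direction


donor, heavy = (0.0, 0.0, 0.0), (-1.4, 0.0, 0.0)
_, one_h = donor1_single(donor, heavy, [(0.0, 0.0, 2.0)])
_, two_h = donor1_single(donor, heavy, [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)])
_, three_h = donor1_single(
    donor, heavy, [(0.3, 0.9, 0.0), (0.3, -0.5, 0.8), (0.3, -0.5, -0.8)])
```

Normalizing each donor-hydrogen vector before summing keeps the bisector independent of small differences in bond length.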
- shepherd_score.pharm_utils.pharmvec.GetDonor2FeatVects_single(conf, featAtoms, scale=1.0)[source]#
Get the direction vectors for Donor of type 2. Made to generate a single vector representation.
This is a donor with two heavy atoms as neighbors. The atom may or may not have hydrogens on it. The situations with the neighbors considered here are:
1. Two heavy atoms and two hydrogens: we assume an sp3 arrangement.
2. Two heavy atoms and one hydrogen: this can be either sp2 or sp3.
3. Two heavy atoms and no hydrogens.
Changed: conditioning based on the number of hydrogens:
1. For case 1, point in the direction bisecting the two hydrogens.
2. For case 2, point in the direction of the hydrogen.
3. For case 3, no changes.
- Parameters:
conf – RDKit Mol object with conformer.
featAtoms – List of atoms that are part of the feature.
scale – Float for length of the direction vector (default = 1.0).
- Returns:
(anchor position as RDKit Point3D or None, relative unit vector(s) as RDKit Point3D or None, list of hydrogen RDKit Atom objects)
- Return type:
Tuple
- shepherd_score.pharm_utils.pharmvec.GetDonor3FeatVects_single(conf, featAtoms, scale=1.0)[source]#
Get the direction vectors for Donor of type 3. Made to generate a single vector representation.
This is a donor with three heavy atoms as neighbors. We assume a tetrahedral arrangement of these neighbors, so the direction we are seeking is the remaining fourth arm of the sp3 arrangement.
Changed: Return anchor and relative unit vector tuple
- Parameters:
conf – RDKit Mol object with conformer.
featAtoms – List of atoms that are part of the feature.
scale – Float for length of the direction vector (default = 1.0).
- Returns:
(anchor position as RDKit Point3D or None, relative unit vector(s) as RDKit Point3D or None, list of hydrogen RDKit Atom objects)
- Return type:
Tuple
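The "fourth arm" of a tetrahedral center can be found by summing the unit vectors to the three neighbors and inverting the result. The sketch below shows this with plain tuples. The function name is hypothetical, and returning None for a planar center is an assumption rather than documented behavior.

```python
import math


def fourth_arm(center, neighbors, scale=1.0):
    """Hypothetical helper: direction opposite the mean of three bond vectors."""
    total = [0.0, 0.0, 0.0]
    for nbr in neighbors:
        diff = [n - c for n, c in zip(nbr, center)]
        norm = math.sqrt(sum(d * d for d in diff))
        for i in range(3):
            total[i] += diff[i] / norm  # accumulate unit bond vectors
    length = math.sqrt(sum(t * t for t in total))
    if length < 1e-8:
        # Planar arrangement: the bond vectors cancel and no arm is defined.
        return tuple(center), None
    return tuple(center), tuple(-scale * t / length for t in total)


# Three corners of an ideal tetrahedron around the origin; the missing
# corner is (-1, -1, 1).
anchor, arm = fourth_arm((0.0, 0.0, 0.0),
                         [(1.0, 1.0, 1.0), (1.0, -1.0, -1.0), (-1.0, 1.0, -1.0)])
```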
- shepherd_score.pharm_utils.pharmvec.GetAcceptor1FeatVects_single(conf, featAtoms, scale=1.0)[source]#
Get the direction vectors for Acceptor of type 1 (single vector representation).
This is an acceptor with one heavy atom neighbor. There are two possibilities:
The bond to the heavy atom is a single bond (e.g. CO): We use the inversion of this bond direction and mark it as a ‘cone’.
The bond to the heavy atom is a double bond (e.g. C=O): We have two possible directions except in some special cases (e.g. SO2) where we use bond direction.
Notes
Modified to condition on the number of hydrogens with methanamine fix:
Case 1: If one hydrogen, vector points in the opposite direction of the bisection of the acute angle formed by the heavy-acceptor-hydrogen. If two hydrogens, assume sp3 and project in that lone-pair direction. If not tetrahedral, return None.
Case 2: Return the bisecting vector of the two lone-pairs.
- Parameters:
- Returns:
(anchor position as RDKit Point3D or None, relative unit vector(s) as RDKit Point3D or None)
- Return type:
Tuple
- shepherd_score.pharm_utils.pharmvec.GetAcceptor2FeatVects_single(conf, featAtoms, scale=1.0)[source]#
Get the direction vectors for Acceptor of type 2. Made to generate a single vector representation.
This is the acceptor with two adjacent heavy atoms. A few things are special-cased here. If the acceptor atom is an oxygen, we assume sp3 hybridization, and the acceptor directions (two of them) reflect that configuration. Otherwise, the direction vector is in plane with the neighboring heavy atoms.
Changed: only one vector is generated. Rather than generating two vectors for an sp3 oxygen, just keep the average vector.
- Parameters:
conf – RDKit Mol object with conformer.
featAtoms – List of atoms that are part of the feature.
scale – Float for length of the direction vector (default = 1.0).
- Returns:
(anchor position as RDKit Point3D or None, relative unit vector(s) as RDKit Point3D or None)
- Return type:
Tuple
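To see why averaging the two sp3 lone-pair directions gives an in-plane vector, consider the sketch below. It builds the two lone pairs by tilting the in-plane bisector out of the plane by half the tetrahedral angle, then averages them. The construction, the function name, and the 109.47° angle are assumptions for illustration, not taken from the library.

```python
import math

HALF_TETRAHEDRAL = math.radians(109.47 / 2.0)  # assumed ideal sp3 geometry


def _unit(v):
    norm = math.sqrt(sum(x * x for x in v))
    return tuple(x / norm for x in v)


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def sp3_lone_pairs_and_average(center, nbr1, nbr2):
    """Hypothetical helper: two lone-pair directions and their normalized mean."""
    u1 = _unit(tuple(a - c for a, c in zip(nbr1, center)))
    u2 = _unit(tuple(a - c for a, c in zip(nbr2, center)))
    # In-plane bisector pointing away from both neighbors.
    bisector = _unit(tuple(-(a + b) for a, b in zip(u1, u2)))
    normal = _unit(_cross(u1, u2))  # perpendicular to the neighbor plane
    cos_t, sin_t = math.cos(HALF_TETRAHEDRAL), math.sin(HALF_TETRAHEDRAL)
    lp_up = tuple(cos_t * b + sin_t * n for b, n in zip(bisector, normal))
    lp_down = tuple(cos_t * b - sin_t * n for b, n in zip(bisector, normal))
    average = _unit(tuple(a + b for a, b in zip(lp_up, lp_down)))
    return lp_up, lp_down, average


# Ether-like oxygen at the origin with two neighbors in the xy-plane.
lp_up, lp_down, average = sp3_lone_pairs_and_average(
    (0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (-1.0, 1.0, 0.0))
```

The out-of-plane components cancel in the average, so the single-vector representation for an sp3 oxygen coincides with the in-plane bisector direction.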
- shepherd_score.pharm_utils.pharmvec.GetAcceptor3FeatVects_single(conf, featAtoms, scale=1.0)[source]#
Get the direction vectors for Acceptor of type 3. Made to generate a single vector representation.
This is an acceptor with three heavy atoms as neighbors. We assume a tetrahedral arrangement of these neighbors, so the direction we are seeking is the remaining fourth arm of the sp3 arrangement.
Changed: to return anchor and relative unit vector tuple
- Parameters:
conf – RDKit Mol object with conformer.
featAtoms – List of atoms that are part of the feature.
scale – Float for length of the direction vector (default = 1.0).
- Returns:
(anchor position as RDKit Point3D or None, relative unit vector(s) as RDKit Point3D or None)
- Return type:
Tuple
- shepherd_score.pharm_utils.pharmvec.GetAcceptor1FeatVects(conf, featAtoms, scale=1.0)[source]#
Get the direction vectors for Acceptor of type 1 (multi-vector representation).
This is an acceptor with one heavy atom neighbor. There are two possibilities:
The bond to the heavy atom is a single bond (e.g. CO): We use the inversion of this bond direction and mark it as a ‘cone’.
The bond to the heavy atom is a double bond (e.g. C=O): We have two possible directions except in some special cases (e.g. SO2) where we use bond direction.
Notes
Modified to change return format, with fixes for methanamine and two vectors for hydroxyls.
- Parameters:
- Returns:
(list of anchor position(s) as RDKit Point3D or None, list of relative unit vector(s) as RDKit Point3D or None)
- Return type:
Tuple
- shepherd_score.pharm_utils.pharmvec.GetAcceptor2FeatVects(conf, featAtoms, scale=1.0)[source]#
Get the direction vectors for Acceptor of type 2 (multi-vector representation).
This is the acceptor with two adjacent heavy atoms. A few things are special-cased here. If the acceptor atom is an oxygen, we assume sp3 hybridization, and the acceptor directions (two of them) reflect that configuration. Otherwise, the direction vector is in plane with the neighboring heavy atoms.
Changed: return format
- Parameters:
conf – RDKit Mol object with a conformer.
featAtoms – List containing RDKit Atom object of atom attributed as an acceptor.
scale – Float (default = 1.0), length of direction vector.
- Returns:
(list of anchor position(s) as RDKit Point3D or None, list of relative unit vector(s) as RDKit Point3D or None)
- Return type:
Tuple
- shepherd_score.pharm_utils.pharmvec.GetAcceptor3FeatVects(conf, featAtoms, scale=1.0)[source]#
Get the direction vectors for Acceptor of type 3 (multi-vector representation).
This is an acceptor with three heavy atoms as neighbors. We assume a tetrahedral arrangement of these neighbors, so the direction we are seeking is the remaining fourth arm of the sp3 arrangement.
Changed: return format
- Parameters:
conf – RDKit Mol object with a conformer.
featAtoms – List containing RDKit Atom object of atom attributed as an acceptor.
scale – Float (default = 1.0), length of direction vector.
- Returns:
(list of anchor position(s) as RDKit Point3D or None, list of relative unit vector(s) as RDKit Point3D or None)
- Return type:
Tuple