Electrostatic Scoring#
PyTorch Implementation#
Gaussian volume overlap scoring functions combined with continuous electrostatics (PyTorch version).
Supports both batched and non-batched inputs.
Reference math: https://doi.org/10.1002/(SICI)1096-987X(19961115)17:14<1653::AID-JCC7>3.0.CO;2-K https://doi.org/10.1021/j100011a016
- shepherd_score.score.electrostatic_scoring.VAB_2nd_order_esp(centers_1, centers_2, charges_1, charges_2, alpha, lam)[source]#
Torch implementation of the 2nd order volume overlap of AB. Handles batching.
- shepherd_score.score.electrostatic_scoring.shape_tanimoto_esp(centers_1, centers_2, charges_1, charges_2, alpha, lam)[source]#
Compute Tanimoto shape similarity weighted by electrostatics.
- shepherd_score.score.electrostatic_scoring.get_overlap_esp(centers_1, centers_2, charges_1, charges_2, alpha=0.81, lam=62.20604814099848)[source]#
Torch implementation. Compute electrostatic similarity which weights Gaussian volume overlap by electrostatics. The Tanimoto score is used.
Typically lam=0.3*LAM_SCALING is used for surface point clouds and lam=0.1 for partial charge weighted volumetric overlap.
- Parameters:
centers_1 (torch.Tensor (batch, N, 3) or (N, 3)) – Coordinates for the sets of points representing molecule 1.
centers_2 (torch.Tensor (batch, N, 3) or (N, 3)) – Coordinates for the sets of points representing molecule 2.
charges_1 (torch.Tensor (batch, N) or (N,)) – Electrostatic energy for the sets of points representing molecule 1.
charges_2 (torch.Tensor (batch, N) or (N,)) – Electrostatic energy for the sets of points representing molecule 2.
alpha (float) – Parameter controlling the width of the Gaussians.
lam (float) – Parameter controlling the influence of electrostatics.
- Returns:
tanimoto_esp – Tanimoto similarities of electrostatics.
- Return type:
torch.Tensor (batch, 1) or (1,)
- shepherd_score.score.electrostatic_scoring.esp_combo_score(centers_w_H_1, centers_w_H_2, centers_1, centers_2, points_1, points_2, partial_charges_1, partial_charges_2, point_charges_1, point_charges_2, radii_1, radii_2, alpha, lam=0.001, probe_radius=1.0, esp_weight=0.5)[source]#
Computes a similarity score as defined by ShaEP, balancing electrostatic and shape similarity. Accepts a single instance or a batch (along the 0th dimension). Only the shape of points_1 is checked to determine whether the input is batched, so shape errors in the other tensors may not be caught.
- Parameters:
centers_w_H_1 (torch.Tensor (N + n_H, 3) | (batch, N + n_H, 3)) – Coordinates of atom centers INCLUDING hydrogens of molecule 1. Used for computing electrostatic potential. Same for centers_w_H_2 except (M + m_H, 3).
centers_1 (torch.Tensor (N, 3) or (n_surf, 3) | (batch, N, 3) or (batch, n_surf, 3)) – Coordinates of points for molecule 1 used to compute shape similarity. Use atom centers for volumetric similarity. Use surface centers for surface similarity. Same for centers_2 except (M, 3) or (m_surf, 3).
points_1 (torch.Tensor (n_surf, 3) | (batch, n_surf, 3)) – Coordinates of surface points for molecule 1. Same for points_2 except (m_surf, 3).
partial_charges_1 (torch.Tensor (N + n_H,) | (batch, N + n_H,)) – Partial charges corresponding to the atoms in centers_w_H_1. Same for partial_charges_2 except (M + m_H,).
point_charges_1 (torch.Tensor (n_surf,) | (batch, n_surf,)) – The electrostatic potential calculated at each surface point (points_1). Same for point_charges_2 except (m_surf,).
radii_1 (torch.Tensor (N + n_H,) | (batch, N + n_H,)) – vdW radii corresponding to the atoms in centers_w_H_1 (angstroms). Same for radii_2 except (M + m_H,).
alpha (float) – Gaussian width parameter used for shape similarity.
lam (float (default = 0.001)) – Electrostatic potential weighting parameter (smaller = higher weight). 0.001 was chosen as the default based on empirical observations of the distribution of scores generated by _esp_comparison before summation.
probe_radius (float (default = 1.0)) – Surface points found within vdW radii + probe radius will be masked out. Surface generation uses a probe radius of 1.2 (radius of hydrogen), so a slightly lower radius is used here to be more tolerant.
esp_weight (float (default = 0.5)) – Weight placed on electrostatic similarity relative to shape similarity. 0 = only shape similarity; 1 = only electrostatic similarity.
centers_w_H_2 (torch.Tensor)
centers_2 (torch.Tensor)
points_2 (torch.Tensor)
partial_charges_2 (torch.Tensor)
point_charges_2 (torch.Tensor)
radii_2 (torch.Tensor)
- Returns:
Similarity score (range: [0, 1]). Higher is more similar.
- Return type:
torch.Tensor (1,) or (batch, 1)
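Because only points_1 is inspected to decide whether the inputs are batched, a caller may want to validate the remaining tensors before calling esp_combo_score. The following is a minimal sketch of such a check. check_batch_consistency is a hypothetical helper, not part of shepherd_score; it relies only on the .ndim attribute, which both NumPy arrays and torch tensors expose, and NumPy arrays are used here to keep the sketch dependency-light.

```python
import numpy as np


def check_batch_consistency(points_1, coords=(), vectors=()):
    """Hypothetical pre-flight check (not part of shepherd_score).

    points_1 : (n_surf, 3) or (batch, n_surf, 3)
    coords   : coordinate-like inputs, expected (K, 3) or (batch, K, 3)
    vectors  : per-point inputs (charges, radii), expected (K,) or (batch, K)
    Returns True if the inputs are batched, False otherwise.
    """
    if points_1.ndim not in (2, 3):
        raise ValueError(f"points_1 must be 2-D or 3-D, got {points_1.ndim}-D")
    batched = points_1.ndim == 3

    # Coordinate arrays carry a trailing axis of length 3.
    expected_coord_ndim = 3 if batched else 2
    for arr in coords:
        if arr.ndim != expected_coord_ndim or arr.shape[-1] != 3:
            raise ValueError(f"coordinate array has unexpected shape {arr.shape}")

    # Charges and radii have one value per point.
    expected_vector_ndim = 2 if batched else 1
    for arr in vectors:
        if arr.ndim != expected_vector_ndim:
            raise ValueError(f"per-point array has unexpected shape {arr.shape}")

    return batched


# Single instance: 10 surface points, 4 atoms.
single = check_batch_consistency(
    np.zeros((10, 3)), coords=[np.zeros((4, 3))], vectors=[np.zeros(4)]
)
# Batch of 2.
batch = check_batch_consistency(
    np.zeros((2, 10, 3)), coords=[np.zeros((2, 4, 3))], vectors=[np.zeros((2, 4))]
)
```

Running a check like this first turns a silent shape mismatch into an explicit ValueError.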
NumPy Implementation#
Gaussian volume overlap scoring functions combined with continuous electrostatics (NumPy versions).
Single-instance functionality only.
Reference math: https://doi.org/10.1002/(SICI)1096-987X(19961115)17:14<1653::AID-JCC7>3.0.CO;2-K https://doi.org/10.1021/j100011a016
- shepherd_score.score.electrostatic_scoring_np.VAB_2nd_order_esp_np(centers_1, centers_2, charges_1, charges_2, alpha, lam)[source]#
2nd order volume overlap of AB
- shepherd_score.score.electrostatic_scoring_np.shape_tanimoto_esp_np(centers_1, centers_2, charges_1, charges_2, alpha, lam)[source]#
Compute Tanimoto shape similarity weighted by electrostatics.
- shepherd_score.score.electrostatic_scoring_np.get_overlap_esp_np(centers_1, centers_2, charges_1, charges_2, alpha=0.81, lam=62.20604814099848)[source]#
Compute electrostatic similarity which weights Gaussian volume overlap by electrostatics. The Tanimoto score is used.
Typically lam=0.3*LAM_SCALING is used for surface point clouds and lam=0.1 for partial charge weighted volumetric overlap.
- Parameters:
centers_1 (np.ndarray (N, 3)) – Coordinates for the sets of points representing molecule 1.
centers_2 (np.ndarray (N, 3)) – Coordinates for the sets of points representing molecule 2.
charges_1 (np.ndarray (N,)) – Electrostatic energy for the sets of points representing molecule 1.
charges_2 (np.ndarray (N,)) – Electrostatic energy for the sets of points representing molecule 2.
alpha (float) – Parameter controlling the width of the Gaussians.
lam (float) – Parameter controlling the influence of electrostatics.
- Returns:
Tanimoto similarities of electrostatics
- Return type:
np.ndarray (N,)
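To make the roles of alpha and lam concrete, here is a minimal NumPy sketch of an electrostatics-weighted Gaussian overlap and its Tanimoto score. It assumes that each pairwise overlap of two equal-width Gaussians is multiplied by exp(-(q_i - q_j)^2 / lam); that functional form, and the names vab_esp_sketch and tanimoto_esp_sketch, are illustrative assumptions rather than the library's actual code.

```python
import numpy as np


def vab_esp_sketch(centers_1, centers_2, charges_1, charges_2, alpha, lam):
    """Second-order overlap of two point sets, down-weighted by ESP mismatch.

    Assumed form (not taken from the library source):
        sum_ij (pi / (2 alpha))^(3/2) * exp(-alpha/2 * |r_i - r_j|^2)
               * exp(-(q_i - q_j)^2 / lam)
    """
    # Pairwise squared distances, shape (N, M).
    diff = centers_1[:, None, :] - centers_2[None, :, :]
    r2 = np.sum(diff ** 2, axis=-1)
    # Pairwise squared differences of the electrostatic values, shape (N, M).
    c2 = (charges_1[:, None] - charges_2[None, :]) ** 2
    # Analytic overlap integral of two Gaussians exp(-alpha * r^2).
    prefactor = (np.pi / (2.0 * alpha)) ** 1.5
    return np.sum(prefactor * np.exp(-0.5 * alpha * r2) * np.exp(-c2 / lam))


def tanimoto_esp_sketch(centers_1, centers_2, charges_1, charges_2,
                        alpha=0.81, lam=62.20604814099848):
    """Tanimoto score V_AB / (V_AA + V_BB - V_AB) using the overlap above."""
    vab = vab_esp_sketch(centers_1, centers_2, charges_1, charges_2, alpha, lam)
    vaa = vab_esp_sketch(centers_1, centers_1, charges_1, charges_1, alpha, lam)
    vbb = vab_esp_sketch(centers_2, centers_2, charges_2, charges_2, alpha, lam)
    return vab / (vaa + vbb - vab)


rng = np.random.default_rng(0)
xyz_a, q_a = rng.normal(size=(5, 3)), rng.normal(size=5)
xyz_b, q_b = rng.normal(size=(7, 3)), rng.normal(size=7)

score_self = tanimoto_esp_sketch(xyz_a, xyz_a, q_a, q_a)  # identical inputs
score_ab = tanimoto_esp_sketch(xyz_a, xyz_b, q_a, q_b)
score_ba = tanimoto_esp_sketch(xyz_b, xyz_a, q_b, q_a)
```

Under this assumed form, a smaller lam penalizes electrostatic mismatch more strongly, and identical inputs give a score of 1.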
- shepherd_score.score.electrostatic_scoring_np.esp_combo_score_np(centers_w_H_1, centers_w_H_2, centers_1, centers_2, points_1, points_2, partial_charges_1, partial_charges_2, point_charges_1, point_charges_2, radii_1, radii_2, alpha, lam=0.001, probe_radius=1.0, esp_weight=0.5)[source]#
Computes a similarity score defined by ShaEP. It is a balanced score between electrostatics and shape similarity. Follows ShaEP formulation.
- Parameters:
centers_w_H_1 (np.ndarray (N + n_H, 3)) – Coordinates of atom centers INCLUDING hydrogens of molecule 1. Used for computing electrostatic potential. Same for centers_w_H_2 except (M + m_H, 3).
centers_1 (np.ndarray (N, 3) or (n_surf, 3)) – Coordinates of points for molecule 1 used to compute shape similarity. Use atom centers for volumetric similarity. Use surface centers for surface similarity. Same for centers_2 except (M, 3) or (m_surf, 3).
points_1 (np.ndarray (n_surf, 3)) – Coordinates of surface points for molecule 1. Same for points_2 except (m_surf, 3).
partial_charges_1 (np.ndarray (N + n_H,)) – Partial charges corresponding to the atoms in centers_w_H_1. Same for partial_charges_2 except (M + m_H,).
point_charges_1 (np.ndarray (n_surf,)) – The electrostatic potential calculated at each surface point (points_1). Same for point_charges_2 except (m_surf,).
radii_1 (np.ndarray (N + n_H,)) – vdW radii corresponding to the atoms in centers_w_H_1 (angstroms). Same for radii_2 except (M + m_H,).
alpha (float) – Gaussian width parameter used for shape similarity.
lam (float (default = 0.001)) – Electrostatic potential weighting parameter (smaller = higher weight). 0.001 was chosen as the default based on empirical observations of the distribution of scores generated by _esp_comparison before summation.
probe_radius (float (default = 1.0)) – Surface points found within vdW radii + probe radius will be masked out. Surface generation uses a probe radius of 1.2 (radius of hydrogen), so a slightly lower radius is used here to be more tolerant.
esp_weight (float (default = 0.5)) – Weight placed on electrostatic similarity relative to shape similarity. 0 = only shape similarity; 1 = only electrostatic similarity.
centers_w_H_2 (ndarray)
centers_2 (ndarray)
points_2 (ndarray)
partial_charges_2 (ndarray)
point_charges_2 (ndarray)
radii_2 (ndarray)
- Returns:
Similarity score (range: [0, 1]). Higher is more similar.
- Return type:
np.ndarray (1,)
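Two of the parameters above, probe_radius and esp_weight, can be illustrated in isolation. The sketch below is written under stated assumptions: the masking test compares each surface point against every atom's vdW radius plus probe_radius, and the final score is a linear blend that matches the documented endpoints (0 = shape only, 1 = electrostatics only). Which atom set the points are tested against, and the exact blending formula, are not stated in this section. exposed_point_mask and blend_scores are hypothetical names.

```python
import numpy as np


def exposed_point_mask(points, atom_centers, radii, probe_radius=1.0):
    """True for points lying outside every atom's (vdW radius + probe_radius) sphere.

    points       : (n_surf, 3) surface points
    atom_centers : (n_atoms, 3) atom coordinates, including hydrogens
    radii        : (n_atoms,) vdW radii in angstroms
    """
    # Distance from every surface point to every atom, shape (n_surf, n_atoms).
    dist = np.linalg.norm(points[:, None, :] - atom_centers[None, :, :], axis=-1)
    # A point is kept only if it clears the cutoff of all atoms.
    return np.all(dist > radii[None, :] + probe_radius, axis=1)


def blend_scores(shape_sim, esp_sim, esp_weight=0.5):
    """Assumed linear blend: esp_weight=0 gives shape only, 1 gives ESP only."""
    return esp_weight * esp_sim + (1.0 - esp_weight) * shape_sim


# One atom at the origin with a 1.5 angstrom radius; the cutoff is 1.5 + 1.0 = 2.5.
atom_centers = np.array([[0.0, 0.0, 0.0]])
radii = np.array([1.5])
points = np.array([[2.0, 0.0, 0.0],   # inside the cutoff, masked out
                   [3.0, 0.0, 0.0]])  # outside the cutoff, kept
mask = exposed_point_mask(points, atom_centers, radii, probe_radius=1.0)

combined = blend_scores(shape_sim=0.2, esp_sim=0.8, esp_weight=0.5)
```

With the default esp_weight of 0.5, the two similarities contribute equally under this assumed blend.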
JAX Implementation#
Electrostatic potential similarity scoring functions. JAX VERSIONS
- shepherd_score.score.electrostatic_scoring_jax.VAB_2nd_order_esp_jax(centers_1, centers_2, charges_1, charges_2, alpha, lam)[source]#
2nd order volume overlap of AB
- shepherd_score.score.electrostatic_scoring_jax.VAB_2nd_order_esp_jax_mask(centers_1, centers_2, charges_1, charges_2, mask_1, mask_2, alpha, lam)[source]#
2nd order volume overlap of AB with masking for padded entries. charges expected as shape (-1, 1).
- shepherd_score.score.electrostatic_scoring_jax.shape_tanimoto_esp_jax_mask(centers_1, centers_2, charges_1, charges_2, mask_1, mask_2, alpha, lam)[source]#
Compute Tanimoto ESP similarity with masking for padded entries.
- shepherd_score.score.electrostatic_scoring_jax.get_overlap_esp_jax_mask(centers_1, centers_2, charges_1, charges_2, mask_1, mask_2, alpha=0.81, lam=62.20604814099848)#
Jitted JAX function. Masked version of get_overlap_esp_jax. Padding entries indicated by the mask arrays are excluded from the overlap computation.
- Returns:
Tanimoto ESP similarity.
- Return type:
Array scalar
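Padding lets point sets of different sizes share one fixed array shape, which is what a jitted function needs to avoid recompilation. The sketch below shows how a mask can remove padded entries from a pairwise overlap sum. It is written in NumPy for brevity and uses only sum, exp and broadcasting. The overlap form, the 1-D float mask layout (1.0 = real point, 0.0 = padding) and the names vab_esp_masked_sketch and pad_points are assumptions made for illustration; only the (-1, 1) charge shape follows the description above.

```python
import numpy as np


def vab_esp_masked_sketch(centers_1, centers_2, charges_1, charges_2,
                          mask_1, mask_2, alpha, lam):
    """Masked pairwise overlap; charges are column vectors of shape (-1, 1)."""
    # Pairwise squared distances, shape (N, M).
    r2 = np.sum((centers_1[:, None, :] - centers_2[None, :, :]) ** 2, axis=-1)
    # (N, 1) - (1, M) broadcasts to the (N, M) charge differences.
    c2 = (charges_1 - charges_2.T) ** 2
    terms = (np.pi / (2.0 * alpha)) ** 1.5 * np.exp(-0.5 * alpha * r2) * np.exp(-c2 / lam)
    # A pair contributes only if both of its points are real (mask value 1.0).
    pair_mask = mask_1[:, None] * mask_2[None, :]
    return np.sum(terms * pair_mask)


def pad_points(centers, charges, n_pad):
    """Append n_pad dummy points at the origin and return the matching mask."""
    n = centers.shape[0]
    centers_p = np.concatenate([centers, np.zeros((n_pad, 3))])
    charges_p = np.concatenate([charges, np.zeros((n_pad, 1))])
    mask = np.concatenate([np.ones(n), np.zeros(n_pad)])
    return centers_p, charges_p, mask


rng = np.random.default_rng(1)
c1, q1 = rng.normal(size=(4, 3)), rng.normal(size=(4, 1))
c2, q2 = rng.normal(size=(6, 3)), rng.normal(size=(6, 1))

# Reference value without any padding.
reference = vab_esp_masked_sketch(c1, c2, q1, q2, np.ones(4), np.ones(6), 0.81, 0.1)

# Pad both sets to 7 points; the masks keep the result unchanged.
c1p, q1p, m1 = pad_points(c1, q1, 3)
c2p, q2p, m2 = pad_points(c2, q2, 1)
padded = vab_esp_masked_sketch(c1p, c2p, q1p, q2p, m1, m2, 0.81, 0.1)
```

Without the masks, the dummy points at the origin would add spurious overlap terms.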
- shepherd_score.score.electrostatic_scoring_jax.shape_tanimoto_esp_jax(centers_1, centers_2, charges_1, charges_2, alpha, lam)[source]#
Compute Tanimoto shape similarity weighted by electrostatics.
- shepherd_score.score.electrostatic_scoring_jax.get_overlap_esp_jax(centers_1, centers_2, charges_1, charges_2, alpha=0.81, lam=62.20604814099848)#
Jitted JAX function. Compute electrostatic similarity which weights Gaussian volume overlap by electrostatics. The Tanimoto score is used.
Typically lam=0.3*LAM_SCALING is used for surface point clouds and lam=0.1 for partial charge weighted volumetric overlap.
- Parameters:
centers_1 (Array (N, 3)) – Coordinates for the sets of points representing molecule 1.
centers_2 (Array (N, 3)) – Coordinates for the sets of points representing molecule 2.
charges_1 (Array (N,1)) – Electrostatic energy for the sets of points representing molecule 1.
charges_2 (Array (N,1)) – Electrostatic energy for the sets of points representing molecule 2.
alpha (float) – Parameter controlling the width of the Gaussians.
lam (float) – Parameter controlling the influence of electrostatics.
- Returns:
Tanimoto similarities of electrostatics.
- Return type:
Array (N,)
- shepherd_score.score.electrostatic_scoring_jax.esp_combo_score_jax(centers_w_H_1, centers_w_H_2, centers_1, centers_2, points_1, points_2, partial_charges_1, partial_charges_2, point_charges_1, point_charges_2, radii_1, radii_2, alpha, lam=0.001, probe_radius=1.0, esp_weight=0.5)#
Computes a similarity score defined by ShaEP. It is a balanced score between electrostatics and shape similarity.
- Parameters:
centers_w_H_1 (Array (N + n_H, 3)) – Coordinates of atom centers INCLUDING hydrogens of molecule 1. Used for computing electrostatic potential. Same for centers_w_H_2 except (M + m_H, 3).
centers_1 (Array (N, 3) or (n_surf, 3)) – Coordinates of points for molecule 1 used to compute shape similarity. Use atom centers for volumetric similarity. Use surface centers for surface similarity. Same for centers_2 except (M, 3) or (m_surf, 3).
points_1 (Array (n_surf, 3)) – Coordinates of surface points for molecule 1. Same for points_2 except (m_surf, 3).
partial_charges_1 (Array (N + n_H,)) – Partial charges corresponding to the atoms in centers_w_H_1. Same for partial_charges_2 except (M + m_H,).
point_charges_1 (Array (n_surf,)) – The electrostatic potential calculated at each surface point (points_1). Same for point_charges_2 except (m_surf,).
radii_1 (Array (N + n_H,)) – vdW radii corresponding to the atoms in centers_w_H_1 (angstroms). Same for radii_2 except (M + m_H,).
alpha (float) – Gaussian width parameter used for shape similarity.
lam (float (default = 0.001)) – Electrostatic potential weighting parameter (smaller = higher weight). 0.001 was chosen as the default based on empirical observations of the distribution of scores generated by _esp_comparison before summation.
probe_radius (float (default = 1.0)) – Surface points found within vdW radii + probe radius will be masked out. Surface generation uses a probe radius of 1.2 (radius of hydrogen), so a slightly lower radius is used here to be more tolerant.
esp_weight (float (default = 0.5)) – Weight placed on electrostatic similarity relative to shape similarity. 0 = only shape similarity; 1 = only electrostatic similarity.
centers_w_H_2 (jax.Array)
centers_2 (jax.Array)
points_2 (jax.Array)
partial_charges_2 (jax.Array)
point_charges_2 (jax.Array)
radii_2 (jax.Array)
- Returns:
Similarity score (range: [0, 1]). Higher is more similar.
- Return type:
Array (1,)