Electrostatic Scoring

PyTorch Implementation

Gaussian volume overlap scoring functions combined with continuous electrostatics (PyTorch version).

Supports both batched and non-batched inputs.

Reference math: https://doi.org/10.1002/(SICI)1096-987X(19961115)17:14<1653::AID-JCC7>3.0.CO;2-K https://doi.org/10.1021/j100011a016

shepherd_score.score.electrostatic_scoring.VAB_2nd_order_esp(centers_1, centers_2, charges_1, charges_2, alpha, lam)

Torch implementation of the 2nd order volume overlap of AB, weighted by electrostatics. Handles batching.

Parameters:
  • centers_1 (torch.Tensor)

  • centers_2 (torch.Tensor)

  • charges_1 (torch.Tensor)

  • charges_2 (torch.Tensor)

  • alpha (float)

  • lam (float)

Return type:

torch.Tensor
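As a rough illustration of what a second-order, electrostatics-weighted Gaussian overlap can look like, the plain-Python sketch below sums a Gaussian term over all point pairs and damps each pair by the squared difference of its charges. The functional form (the prefactor, the `alpha / 2` exponent, and the `exp(-dq**2 / lam)` weight) and the name `vab_2nd_order_esp_sketch` are assumptions made for illustration only; the linked source holds the library's actual expression.

```python
import math

def vab_2nd_order_esp_sketch(centers_1, centers_2, charges_1, charges_2, alpha, lam):
    """Hypothetical pairwise overlap; NOT the library's implementation."""
    # Assumed prefactor for the overlap of two equal-width Gaussians.
    prefactor = (math.pi / (2.0 * alpha)) ** 1.5
    total = 0.0
    for p, qp in zip(centers_1, charges_1):
        for r, qr in zip(centers_2, charges_2):
            dist2 = sum((a - b) ** 2 for a, b in zip(p, r))
            # Assumed ESP weight: pairs with dissimilar charges contribute less.
            esp_weight = math.exp(-((qp - qr) ** 2) / lam)
            total += prefactor * math.exp(-0.5 * alpha * dist2) * esp_weight
    return total

centers = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
# Same geometry, same charges versus same geometry, swapped charges.
same = vab_2nd_order_esp_sketch(centers, centers, [0.2, -0.2], [0.2, -0.2], alpha=0.81, lam=0.1)
flipped = vab_2nd_order_esp_sketch(centers, centers, [0.2, -0.2], [-0.2, 0.2], alpha=0.81, lam=0.1)
```

Under these assumptions, `same` is larger than `flipped`: the geometry is identical, but the swapped charges reduce the weight of the closest pairs.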

shepherd_score.score.electrostatic_scoring.shape_tanimoto_esp(centers_1, centers_2, charges_1, charges_2, alpha, lam)

Compute the Tanimoto similarity of the electrostatics-weighted overlap.

Parameters:
  • centers_1 (torch.Tensor)

  • centers_2 (torch.Tensor)

  • charges_1 (torch.Tensor)

  • charges_2 (torch.Tensor)

  • alpha (float)

  • lam (float)

Return type:

torch.Tensor
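For orientation, the standard Tanimoto form combines one cross overlap with two self overlaps. The sketch below shows that arithmetic in isolation; the helper name `tanimoto` is hypothetical, and it is an assumption that the library composes its overlaps in exactly this way.

```python
def tanimoto(v_ab, v_aa, v_bb):
    """Tanimoto similarity from a cross overlap and two self overlaps."""
    return v_ab / (v_aa + v_bb - v_ab)

# Identical inputs: the cross overlap equals both self overlaps.
identical = tanimoto(2.0, 2.0, 2.0)
# Partial overlap between two differently sized objects.
partial = tanimoto(1.0, 2.0, 3.0)
```

With identical inputs the score is 1, and it decreases as the cross overlap shrinks relative to the self overlaps.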

shepherd_score.score.electrostatic_scoring.get_overlap_esp(centers_1, centers_2, charges_1, charges_2, alpha=0.81, lam=62.20604814099848)

Torch implementation. Compute electrostatic similarity which weights Gaussian volume overlap by electrostatics. The Tanimoto score is used.

Typically lam=0.3*LAM_SCALING is used for surface point clouds and lam=0.1 for partial charge weighted volumetric overlap.

Parameters:
  • centers_1 (torch.Tensor (batch, N, 3) or (N, 3)) – Coordinates for the sets of points representing molecule 1.

  • centers_2 (torch.Tensor (batch, N, 3) or (N, 3)) – Coordinates for the sets of points representing molecule 2.

  • charges_1 (torch.Tensor (batch, N) or (N,)) – Electrostatic energy for the sets of points representing molecule 1.

  • charges_2 (torch.Tensor (batch, N) or (N,)) – Electrostatic energy for the sets of points representing molecule 2.

  • alpha (float) – Parameter controlling the width of the Gaussians.

  • lam (float) – Parameter controlling the influence of electrostatics.

Returns:

tanimoto_esp – Tanimoto similarities of electrostatics.

Return type:

torch.Tensor (batch, 1) or (1,)
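The two suggested `lam` settings differ by orders of magnitude, which is easy to miss. The arithmetic below assumes that the default `lam` in the signature corresponds to the surface setting `0.3 * LAM_SCALING`; under that assumption the implied value of `LAM_SCALING` can be recovered directly. The variable names are illustrative, not library constants.

```python
DEFAULT_LAM = 62.20604814099848  # default `lam` from the signature above

# Assumption: the default is the surface-point-cloud setting, 0.3 * LAM_SCALING.
implied_lam_scaling = DEFAULT_LAM / 0.3

# Ratio between the surface setting and the volumetric setting (lam=0.1).
surface_to_volumetric = DEFAULT_LAM / 0.1
```

If the assumption holds, `implied_lam_scaling` is roughly 207.35, and the surface setting is several hundred times larger than the volumetric one, so the two settings should not be swapped between input types.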

shepherd_score.score.electrostatic_scoring.esp_combo_score(centers_w_H_1, centers_w_H_2, centers_1, centers_2, points_1, points_2, partial_charges_1, partial_charges_2, point_charges_1, point_charges_2, radii_1, radii_2, alpha, lam=0.001, probe_radius=1.0, esp_weight=0.5)

Computes a similarity score as defined by ShaEP, balancing electrostatic and shape similarity. Accepts a single instance or a batch (along the 0th dimension). Only the shape of points_1 is checked to determine whether the input is batched, so shape errors in the other tensors may or may not be caught.

Parameters:
  • centers_w_H_1 (torch.Tensor (N + n_H, 3) | (batch, N + n_H, 3)) – Coordinates of atom centers INCLUDING hydrogens of molecule 1. Used for computing electrostatic potential. Same for centers_w_H_2 except (M + m_H, 3).

  • centers_1 (torch.Tensor (N, 3) or (n_surf, 3) | (batch, N, 3) or (batch, n_surf, 3)) – Coordinates of points for molecule 1 used to compute shape similarity. Use atom centers for volumetric similarity. Use surface centers for surface similarity. Same for centers_2 except (M, 3) or (m_surf, 3).

  • points_1 (torch.Tensor (n_surf, 3) | (batch, n_surf, 3)) – Coordinates of surface points for molecule 1. Same for points_2 except (m_surf, 3).

  • partial_charges_1 (torch.Tensor (N + n_H,) | (batch, N + n_H,)) – Partial charges corresponding to the atoms in centers_w_H_1. Same for partial_charges_2 except (M + m_H,).

  • point_charges_1 (torch.Tensor (n_surf,) | (batch, n_surf,)) – The electrostatic potential calculated at each surface point (points_1). Same for point_charges_2 except (m_surf,).

  • radii_1 (torch.Tensor (N + n_H,) | (batch, N + n_H,)) – vdW radii corresponding to the atoms in centers_w_H_1 (angstroms). Same for radii_2 except (M + m_H,).

  • alpha (float) – Gaussian width parameter used for shape similarity.

  • lam (float (default = 0.001)) – Electrostatic potential weighting parameter (smaller = higher weight). 0.001 was chosen as the default based on empirical observations of the distribution of scores generated by _esp_comparison before summation.

  • probe_radius (float (default = 1.0)) – Surface points found within vdW radii + probe radius will be masked out. Surface generation uses a probe radius of 1.2 (radius of hydrogen), so a slightly lower radius is used to be more tolerant.

  • esp_weight (float (default = 0.5)) – Weight placed on electrostatic similarity relative to shape similarity: 0 = only shape similarity; 1 = only electrostatic similarity.

  • centers_w_H_2 (torch.Tensor)

  • centers_2 (torch.Tensor)

  • points_2 (torch.Tensor)

  • partial_charges_2 (torch.Tensor)

  • point_charges_2 (torch.Tensor)

  • radii_2 (torch.Tensor)

Returns:

Similarity score (range: [0, 1]). Higher is more similar.

Return type:

torch.Tensor (1,) or (batch, 1)
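The role of `esp_weight` can be pictured as a blend of two similarities that each lie in [0, 1]. The sketch below assumes a simple linear combination, which matches the documented end points (0 = shape only, 1 = electrostatics only) but is not confirmed as the library's exact formula; `combo_score_sketch` is a hypothetical name.

```python
def combo_score_sketch(shape_sim, esp_sim, esp_weight=0.5):
    """Assumed linear blend of shape and electrostatic similarity."""
    if not 0.0 <= esp_weight <= 1.0:
        raise ValueError("esp_weight must lie in [0, 1]")
    return esp_weight * esp_sim + (1.0 - esp_weight) * shape_sim

shape_only = combo_score_sketch(0.8, 0.4, esp_weight=0.0)  # ignores esp_sim
esp_only = combo_score_sketch(0.8, 0.4, esp_weight=1.0)    # ignores shape_sim
balanced = combo_score_sketch(0.8, 0.4)                    # midpoint of the two
```

Because both inputs are bounded by [0, 1] and the weights sum to 1, a blend of this kind also stays within [0, 1].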

NumPy Implementation

Gaussian volume overlap scoring functions combined with continuous electrostatics (NumPy versions).

Single-instance functionality only.

Reference math: https://doi.org/10.1002/(SICI)1096-987X(19961115)17:14<1653::AID-JCC7>3.0.CO;2-K https://doi.org/10.1021/j100011a016

shepherd_score.score.electrostatic_scoring_np.VAB_2nd_order_esp_np(centers_1, centers_2, charges_1, charges_2, alpha, lam)

2nd order volume overlap of AB

Return type:

ndarray

shepherd_score.score.electrostatic_scoring_np.shape_tanimoto_esp_np(centers_1, centers_2, charges_1, charges_2, alpha, lam)

Compute the Tanimoto similarity of the electrostatics-weighted overlap.

Return type:

ndarray

shepherd_score.score.electrostatic_scoring_np.get_overlap_esp_np(centers_1, centers_2, charges_1, charges_2, alpha=0.81, lam=62.20604814099848)

Compute electrostatic similarity which weights Gaussian volume overlap by electrostatics. The Tanimoto score is used.

Typically lam=0.3*LAM_SCALING is used for surface point clouds and lam=0.1 for partial charge weighted volumetric overlap.

Parameters:
  • centers_1 (np.ndarray (N, 3)) – Coordinates for the sets of points representing molecule 1.

  • centers_2 (np.ndarray (N, 3)) – Coordinates for the sets of points representing molecule 2.

  • charges_1 (np.ndarray (N,)) – Electrostatic energy for the sets of points representing molecule 1.

  • charges_2 (np.ndarray (N,)) – Electrostatic energy for the sets of points representing molecule 2.

  • alpha (float) – Parameter controlling the width of the Gaussians.

  • lam (float) – Parameter controlling the influence of electrostatics.

Returns:

Tanimoto similarities of electrostatics

Return type:

np.ndarray (N,)

shepherd_score.score.electrostatic_scoring_np.esp_combo_score_np(centers_w_H_1, centers_w_H_2, centers_1, centers_2, points_1, points_2, partial_charges_1, partial_charges_2, point_charges_1, point_charges_2, radii_1, radii_2, alpha, lam=0.001, probe_radius=1.0, esp_weight=0.5)

Computes a similarity score following the ShaEP formulation, balancing electrostatic and shape similarity.

Parameters:
  • centers_w_H_1 (np.ndarray (N + n_H, 3)) – Coordinates of atom centers INCLUDING hydrogens of molecule 1. Used for computing electrostatic potential. Same for centers_w_H_2 except (M + m_H, 3).

  • centers_1 (np.ndarray (N, 3) or (n_surf, 3)) – Coordinates of points for molecule 1 used to compute shape similarity. Use atom centers for volumetric similarity. Use surface centers for surface similarity. Same for centers_2 except (M, 3) or (m_surf, 3).

  • points_1 (np.ndarray (n_surf, 3)) – Coordinates of surface points for molecule 1. Same for points_2 except (m_surf, 3).

  • partial_charges_1 (np.ndarray (N + n_H,)) – Partial charges corresponding to the atoms in centers_w_H_1. Same for partial_charges_2 except (M + m_H,).

  • point_charges_1 (np.ndarray (n_surf,)) – The electrostatic potential calculated at each surface point (points_1). Same for point_charges_2 except (m_surf,).

  • radii_1 (np.ndarray (N + n_H,)) – vdW radii corresponding to the atoms in centers_w_H_1 (angstroms). Same for radii_2 except (M + m_H,).

  • alpha (float) – Gaussian width parameter used for shape similarity.

  • lam (float (default = 0.001)) – Electrostatic potential weighting parameter (smaller = higher weight). 0.001 was chosen as the default based on empirical observations of the distribution of scores generated by _esp_comparison before summation.

  • probe_radius (float (default = 1.0)) – Surface points found within vdW radii + probe radius will be masked out. Surface generation uses a probe radius of 1.2 (radius of hydrogen), so a slightly lower radius is used to be more tolerant.

  • esp_weight (float (default = 0.5)) – Weight placed on electrostatic similarity relative to shape similarity: 0 = only shape similarity; 1 = only electrostatic similarity.

  • centers_w_H_2 (ndarray)

  • centers_2 (ndarray)

  • points_2 (ndarray)

  • partial_charges_2 (ndarray)

  • point_charges_2 (ndarray)

  • radii_2 (ndarray)

Returns:

Similarity score (range: [0, 1]). Higher is more similar.

Return type:

np.ndarray (1,)
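The `probe_radius` description above amounts to a distance test between surface points and atom spheres. The sketch below shows one way such a mask could be built in plain Python; which molecule's atoms are tested against which molecule's points, and whether the comparison is strict, are assumptions, and `surface_mask_sketch` is a hypothetical helper rather than a library function.

```python
import math

def surface_mask_sketch(points, atom_centers, radii, probe_radius=1.0):
    """Return True for points lying outside every (vdW radius + probe radius) sphere."""
    keep = []
    for p in points:
        # A point is masked out if it falls inside any enlarged atom sphere.
        buried = any(
            math.dist(p, c) < r + probe_radius
            for c, r in zip(atom_centers, radii)
        )
        keep.append(not buried)
    return keep

atoms = [(0.0, 0.0, 0.0)]
radii = [1.7]                                  # illustrative vdW radius in angstroms
points = [(2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]   # 2.0 < 2.7 is buried; 3.0 is kept
mask = surface_mask_sketch(points, atoms, radii, probe_radius=1.0)
```

Lowering `probe_radius` shrinks the exclusion spheres, so fewer surface points are masked out.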

JAX Implementation

Electrostatic potential similarity scoring functions (JAX versions).

shepherd_score.score.electrostatic_scoring_jax.VAB_2nd_order_esp_jax(centers_1, centers_2, charges_1, charges_2, alpha, lam)

2nd order volume overlap of AB

Parameters:
  • centers_1 (jax.Array)

  • centers_2 (jax.Array)

  • charges_1 (jax.Array)

  • charges_2 (jax.Array)

  • alpha (float)

  • lam (float)

Return type:

jax.Array

shepherd_score.score.electrostatic_scoring_jax.VAB_2nd_order_esp_jax_mask(centers_1, centers_2, charges_1, charges_2, mask_1, mask_2, alpha, lam)

2nd order volume overlap of AB with masking for padded entries. Charges are expected to have shape (-1, 1).

Parameters:
  • centers_1 (jax.Array)

  • centers_2 (jax.Array)

  • charges_1 (jax.Array)

  • charges_2 (jax.Array)

  • mask_1 (jax.Array)

  • mask_2 (jax.Array)

  • alpha (float)

  • lam (float)

Return type:

jax.Array

shepherd_score.score.electrostatic_scoring_jax.shape_tanimoto_esp_jax_mask(centers_1, centers_2, charges_1, charges_2, mask_1, mask_2, alpha, lam)

Compute Tanimoto ESP similarity with masking for padded entries.

Parameters:
  • centers_1 (jax.Array)

  • centers_2 (jax.Array)

  • charges_1 (jax.Array)

  • charges_2 (jax.Array)

  • mask_1 (jax.Array)

  • mask_2 (jax.Array)

  • alpha (float)

  • lam (float)

Return type:

jax.Array

shepherd_score.score.electrostatic_scoring_jax.get_overlap_esp_jax_mask(centers_1, centers_2, charges_1, charges_2, mask_1, mask_2, alpha=0.81, lam=62.20604814099848)

Jitted JAX function. Masked version of get_overlap_esp_jax. Padding entries indicated by mask arrays are excluded from the overlap computation.

Parameters:
  • centers_1 (Array (N, 3))

  • centers_2 (Array (N, 3))

  • charges_1 (Array (N,) or (N, 1))

  • charges_2 (Array (N,) or (N, 1))

  • mask_1 (Array (N,)) – Binary mask: 1 for real entries, 0 for padding.

  • mask_2 (Array (N,)) – Binary mask: 1 for real entries, 0 for padding.

  • alpha (float)

  • lam (float)

Returns:

Tanimoto ESP similarity.

Return type:

Array scalar
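To make the mask convention concrete, the plain-Python sketch below pads a point set to a fixed length, builds the matching binary mask (1 for real entries, 0 for padding), and checks that a masked pairwise sum ignores the padding. Padding to a common length is a common way to avoid recompiling a jitted function for each new array shape, although that motivation is an assumption here. The helper names and the pairwise term (shown without a prefactor) are illustrative and do not reproduce the library's code.

```python
import math

def pad_with_mask(points, charges, length):
    """Pad a point set to `length` entries and return (points, charges, mask)."""
    n_pad = length - len(points)
    if n_pad < 0:
        raise ValueError("length is smaller than the number of points")
    padded_points = list(points) + [(0.0, 0.0, 0.0)] * n_pad
    padded_charges = list(charges) + [0.0] * n_pad
    mask = [1.0] * len(points) + [0.0] * n_pad
    return padded_points, padded_charges, mask

def masked_overlap_sketch(c1, c2, q1, q2, m1, m2, alpha, lam):
    """Pairwise sum in which each term is multiplied by both mask values."""
    total = 0.0
    for p, qp, mp in zip(c1, q1, m1):
        for r, qr, mr in zip(c2, q2, m2):
            dist2 = sum((a - b) ** 2 for a, b in zip(p, r))
            # Assumed pair term: Gaussian in distance times a charge-difference weight.
            term = math.exp(-0.5 * alpha * dist2 - (qp - qr) ** 2 / lam)
            total += mp * mr * term
    return total

pts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
chg = [0.1, -0.1]
ones = [1.0, 1.0]
unpadded = masked_overlap_sketch(pts, pts, chg, chg, ones, ones, alpha=0.81, lam=0.1)

pp, pc, pm = pad_with_mask(pts, chg, length=4)
padded = masked_overlap_sketch(pp, pp, pc, pc, pm, pm, alpha=0.81, lam=0.1)
```

Here `padded` matches `unpadded`, because every pair that involves a padded entry is multiplied by a zero mask value.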

shepherd_score.score.electrostatic_scoring_jax.shape_tanimoto_esp_jax(centers_1, centers_2, charges_1, charges_2, alpha, lam)

Compute the Tanimoto similarity of the electrostatics-weighted overlap.

Parameters:
  • centers_1 (jax.Array)

  • centers_2 (jax.Array)

  • charges_1 (jax.Array)

  • charges_2 (jax.Array)

  • alpha (float)

  • lam (float)

Return type:

jax.Array

shepherd_score.score.electrostatic_scoring_jax.get_overlap_esp_jax(centers_1, centers_2, charges_1, charges_2, alpha=0.81, lam=62.20604814099848)

Jitted JAX function. Compute electrostatic similarity which weights Gaussian volume overlap by electrostatics. The Tanimoto score is used.

Typically lam=0.3*LAM_SCALING is used for surface point clouds and lam=0.1 for partial charge weighted volumetric overlap.

Parameters:
  • centers_1 (Array (N, 3)) – Coordinates for the sets of points representing molecule 1.

  • centers_2 (Array (N, 3)) – Coordinates for the sets of points representing molecule 2.

  • charges_1 (Array (N,1)) – Electrostatic energy for the sets of points representing molecule 1.

  • charges_2 (Array (N,1)) – Electrostatic energy for the sets of points representing molecule 2.

  • alpha (float) – Parameter controlling the width of the Gaussians.

  • lam (float) – Parameter controlling the influence of electrostatics.

Returns:

Tanimoto similarities of electrostatics.

Return type:

Array (N,)

shepherd_score.score.electrostatic_scoring_jax.esp_combo_score_jax(centers_w_H_1, centers_w_H_2, centers_1, centers_2, points_1, points_2, partial_charges_1, partial_charges_2, point_charges_1, point_charges_2, radii_1, radii_2, alpha, lam=0.001, probe_radius=1.0, esp_weight=0.5)

Computes a similarity score defined by ShaEP. It is a balanced score between electrostatics and shape similarity.

Parameters:
  • centers_w_H_1 (Array (N + n_H, 3)) – Coordinates of atom centers INCLUDING hydrogens of molecule 1. Used for computing electrostatic potential. Same for centers_w_H_2 except (M + m_H, 3).

  • centers_1 (Array (N, 3) or (n_surf, 3)) – Coordinates of points for molecule 1 used to compute shape similarity. Use atom centers for volumetric similarity. Use surface centers for surface similarity. Same for centers_2 except (M, 3) or (m_surf, 3).

  • points_1 (Array (n_surf, 3)) – Coordinates of surface points for molecule 1. Same for points_2 except (m_surf, 3).

  • partial_charges_1 (Array (N + n_H,)) – Partial charges corresponding to the atoms in centers_w_H_1. Same for partial_charges_2 except (M + m_H,).

  • point_charges_1 (Array (n_surf,)) – The electrostatic potential calculated at each surface point (points_1). Same for point_charges_2 except (m_surf,).

  • radii_1 (Array (N + n_H,)) – vdW radii corresponding to the atoms in centers_w_H_1 (angstroms). Same for radii_2 except (M + m_H,).

  • alpha (float) – Gaussian width parameter used for shape similarity.

  • lam (float (default = 0.001)) – Electrostatic potential weighting parameter (smaller = higher weight). 0.001 was chosen as the default based on empirical observations of the distribution of scores generated by _esp_comparison before summation.

  • probe_radius (float (default = 1.0)) – Surface points found within vdW radii + probe radius will be masked out. Surface generation uses a probe radius of 1.2 (radius of hydrogen), so a slightly lower radius is used to be more tolerant.

  • esp_weight (float (default = 0.5)) – Weight placed on electrostatic similarity relative to shape similarity: 0 = only shape similarity; 1 = only electrostatic similarity.

  • centers_w_H_2 (jax.Array)

  • centers_2 (jax.Array)

  • points_2 (jax.Array)

  • partial_charges_2 (jax.Array)

  • point_charges_2 (jax.Array)

  • radii_2 (jax.Array)

Returns:

Similarity score (range: [0, 1]). Higher is more similar.

Return type:

Array (1,)