Main Alignment Module
Alignment algorithms using Torch-based scoring functions.
- shepherd_score.alignment.objective_ROCS_overlay(se3_params, ref_points, fit_points, alpha, precomputed_U=None)
Objective function to optimize ROCS overlay. Supports batched and non-batched inputs. If the inputs are batched, the loss is the average across the batch.
- Parameters:
se3_params (torch.Tensor (batch, 7) or (7,)) – Parameters for SE(3) transformation. The first 4 values in the last dimension are quaternions of form (r,i,j,k) and the last 3 values of the last dimension are the translations in (x,y,z).
ref_points (torch.Tensor (batch, N, 3) or (N,3)) – Reference points. If you want to optimize to the same ref_points with a batch of different se3_params, try using torch.Tensor.repeat((batch, 1, 1)).
fit_points (torch.Tensor (batch, M, 3) or (M,3)) – Set of points to apply SE(3) transformations to in order to maximize shape similarity with ref_points. If you want to optimize the same fit_points with a batch of different se3_params, try using torch.Tensor.repeat((batch, 1, 1)).
alpha (float) – Gaussian width parameter used in scoring function.
precomputed_U (torch.Tensor | None)
- Returns:
loss – 1 - average(Tanimoto score).
- Return type:
torch.Tensor (1,)
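The parameter layout and the loss described above can be sketched in plain Python, without torch. This is a minimal illustration, not the library's implementation: the helper names (`quaternion_to_rotation`, `apply_se3`, `gaussian_overlap`, `tanimoto_loss`) are hypothetical, and the kernel `exp(-alpha/2 * d^2)` is an assumed first-order Gaussian overlap whose constant prefactor cancels in the Tanimoto ratio.

```python
import math

def quaternion_to_rotation(q):
    """Convert a quaternion (r, i, j, k) to a 3x3 rotation matrix (list of rows)."""
    r, i, j, k = q
    norm = math.sqrt(r * r + i * i + j * j + k * k)
    r, i, j, k = r / norm, i / norm, j / norm, k / norm  # use a unit quaternion
    return [
        [1 - 2 * (j * j + k * k), 2 * (i * j - k * r), 2 * (i * k + j * r)],
        [2 * (i * j + k * r), 1 - 2 * (i * i + k * k), 2 * (j * k - i * r)],
        [2 * (i * k - j * r), 2 * (j * k + i * r), 1 - 2 * (i * i + j * j)],
    ]

def apply_se3(se3_params, points):
    """Rotate, then translate, each (x, y, z) point using 7 SE(3) parameters."""
    rot = quaternion_to_rotation(se3_params[:4])
    t = se3_params[4:]
    return [
        tuple(sum(rot[a][b] * p[b] for b in range(3)) + t[a] for a in range(3))
        for p in points
    ]

def gaussian_overlap(a_points, b_points, alpha):
    """Sum of pairwise Gaussian overlaps, up to a constant prefactor (assumed kernel)."""
    total = 0.0
    for a in a_points:
        for b in b_points:
            d2 = sum((x - y) ** 2 for x, y in zip(a, b))
            total += math.exp(-0.5 * alpha * d2)
    return total

def tanimoto_loss(se3_params, ref_points, fit_points, alpha):
    """Return 1 - Tanimoto(ref, transformed fit) for a single, non-batched input."""
    moved = apply_se3(se3_params, fit_points)
    v_rf = gaussian_overlap(ref_points, moved, alpha)
    v_rr = gaussian_overlap(ref_points, ref_points, alpha)
    v_ff = gaussian_overlap(moved, moved, alpha)
    return 1.0 - v_rf / (v_rr + v_ff - v_rf)

points = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 1.0, 0.0)]
identity = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # unit quaternion, zero translation
loss_at_identity = tanimoto_loss(identity, points, points, alpha=1.0)
```

With identical point sets and the identity transform the loss is zero; any translation or rotation that moves the fit points off the reference raises it toward 1.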
- shepherd_score.alignment.score_ROCS_overlay_with_avoid(ref_points, fit_points, alpha, fit_points_for_avoid, avoid_points, avoid_min_dist, avoid_weight, precomputed_U=None)
See objective_ROCS_overlay_with_avoid for parameter descriptions.
- shepherd_score.alignment.objective_ROCS_overlay_with_avoid(se3_params, ref_points, fit_points, alpha, fit_points_for_avoid, avoid_points, avoid_min_dist, avoid_weight, precomputed_U=None)
Objective function to optimize ROCS overlay while penalizing overlap with avoid_points. Supports batched and non-batched inputs. If the inputs are batched, the loss is the average across the batch.
- Parameters:
se3_params (torch.Tensor (batch, 7) or (7,)) – Parameters for SE(3) transformation. The first 4 values in the last dimension are quaternions of form (r,i,j,k) and the last 3 values of the last dimension are the translations in (x,y,z).
ref_points (torch.Tensor (batch, N, 3) or (N,3)) – Reference points. If you want to optimize to the same ref_points with a batch of different se3_params, try using torch.Tensor.repeat((batch, 1, 1)).
fit_points (torch.Tensor (batch, M, 3) or (M,3)) – Set of points to apply SE(3) transformations to in order to maximize shape similarity with ref_points. If you want to optimize the same fit_points with a batch of different se3_params, try using torch.Tensor.repeat((batch, 1, 1)).
alpha (float) – Gaussian width parameter used in scoring function.
fit_points_for_avoid (torch.Tensor (M,3)) – Set of points to apply SE(3) transformations to and then compare to avoid_points.
avoid_points (torch.Tensor (K,3) (default=None)) – If not None, these are points that are used in an additional term in the objective function to penalize overlap with these points.
avoid_min_dist (float (default=2.0)) – Minimum distance with no penalization between fit_points_for_avoid and avoid_points.
avoid_weight (float (default=1.0)) – Weight for the avoid_points term in the scoring function.
precomputed_U (torch.Tensor | None)
- Returns:
loss – 1 - (average(Tanimoto score of fit_points to ref_points) - avoid_weight * average(hard sphere overlap of fit_points_for_avoid to avoid_points)).
- Return type:
torch.Tensor (1,)
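As a rough illustration of how an avoid term can enter the loss, the sketch below uses two assumptions that are not confirmed by this page: the hard sphere overlap is taken to be the fraction of fit_points_for_avoid lying closer than avoid_min_dist to any avoid point, and the weighted penalty is subtracted from the shape score before forming the loss. Both function names are hypothetical.

```python
import math

def hard_sphere_overlap(fit_points_for_avoid, avoid_points, avoid_min_dist=2.0):
    """Fraction of fit points closer than avoid_min_dist to any avoid point (assumed definition)."""
    if not fit_points_for_avoid:
        return 0.0
    clashing = 0
    for p in fit_points_for_avoid:
        if any(math.dist(p, a) < avoid_min_dist for a in avoid_points):
            clashing += 1
    return clashing / len(fit_points_for_avoid)

def loss_with_avoid(tanimoto, overlap, avoid_weight=1.0):
    """Combine a shape score and an avoid penalty into a loss (assumed sign convention)."""
    return 1.0 - (tanimoto - avoid_weight * overlap)

fit = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
avoid = [(0.5, 0.0, 0.0)]
overlap = hard_sphere_overlap(fit, avoid, avoid_min_dist=2.0)  # one of two points clashes
loss = loss_with_avoid(tanimoto=0.8, overlap=overlap, avoid_weight=1.0)
```

Under these assumptions a larger avoid_weight makes clashes with avoid_points more costly relative to shape similarity.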
- shepherd_score.alignment.objective_ROCS_esp_overlay(se3_params, ref_points, fit_points, ref_charges, fit_charges, alpha, lam, precomputed_U=None)
Objective function to optimize ESP-weighted ROCS overlay. Supports batched and non-batched inputs. If the inputs are batched, the loss is the average across the batch.
- Parameters:
se3_params (torch.Tensor (batch, 7) or (7,)) – Parameters for SE(3) transformation. The first 4 values in the last dimension are quaternions of form (r,i,j,k) and the last 3 values of the last dimension are the translations in (x,y,z).
ref_points (torch.Tensor (batch, N, 3) or (N,3)) – Reference points.
fit_points (torch.Tensor (batch, M, 3) or (M,3)) – Set of points to apply SE(3) transformations to maximize shape similarity with ref_points.
ref_charges (torch.Tensor (batch, N) or (N,)) – Electric potential at the corresponding ref_points coordinates.
fit_charges (torch.Tensor (batch, M) or (M,)) – Electric potential at the corresponding fit_points coordinates.
alpha (float) – Gaussian width parameter used in scoring function.
lam (float) – Scaling term for charges used in the exponential kernel of the ESP scoring function.
precomputed_U (torch.Tensor | None)
- Returns:
loss – 1 - mean(ESP Tanimoto score).
- Return type:
torch.Tensor (1,)
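The roles of alpha and lam can be illustrated with a small stdlib-only sketch. It assumes the ESP kernel multiplies the spatial Gaussian by `exp(-(q_i - q_j)^2 / lam)`; that form, and the names `esp_overlap` and `esp_tanimoto`, are assumptions made for illustration rather than the library's confirmed implementation.

```python
import math

def esp_overlap(a_points, a_charges, b_points, b_charges, alpha, lam):
    """Pairwise Gaussian overlap, down-weighted where the potentials differ (assumed kernel)."""
    total = 0.0
    for pa, qa in zip(a_points, a_charges):
        for pb, qb in zip(b_points, b_charges):
            d2 = sum((x - y) ** 2 for x, y in zip(pa, pb))
            spatial = math.exp(-0.5 * alpha * d2)
            electrostatic = math.exp(-((qa - qb) ** 2) / lam)
            total += spatial * electrostatic
    return total

def esp_tanimoto(ref_points, ref_charges, fit_points, fit_charges, alpha, lam):
    """Tanimoto ratio built from the ESP-weighted overlaps."""
    v_rf = esp_overlap(ref_points, ref_charges, fit_points, fit_charges, alpha, lam)
    v_rr = esp_overlap(ref_points, ref_charges, ref_points, ref_charges, alpha, lam)
    v_ff = esp_overlap(fit_points, fit_charges, fit_points, fit_charges, alpha, lam)
    return v_rf / (v_rr + v_ff - v_rf)

points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
charges = [0.2, -0.1]
same = esp_tanimoto(points, charges, points, charges, alpha=1.0, lam=0.3)
different = esp_tanimoto(points, charges, points, [0.7, 0.4], alpha=1.0, lam=0.3)
esp_loss = 1.0 - different
```

In this sketch a smaller lam makes the score more sensitive to differences in potential, while identical geometry and potentials always give a score of 1.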
- shepherd_score.alignment.objective_esp_combo_score_overlay(se3_params, ref_centers_w_H, fit_centers_w_H, ref_centers, fit_centers, ref_points, fit_points, ref_partial_charges, fit_partial_charges, ref_surf_esp, fit_surf_esp, ref_radii, fit_radii, alpha, lam, probe_radius, esp_weight)
Objective for ESP combo score. Handles broadcasting for ref_* inputs. fit_* inputs are expected to be repeated if se3_params is batched.
- Parameters:
se3_params (torch.Tensor)
ref_centers_w_H (torch.Tensor)
fit_centers_w_H (torch.Tensor)
ref_centers (torch.Tensor)
fit_centers (torch.Tensor)
ref_points (torch.Tensor)
fit_points (torch.Tensor)
ref_partial_charges (torch.Tensor)
fit_partial_charges (torch.Tensor)
ref_surf_esp (torch.Tensor)
fit_surf_esp (torch.Tensor)
ref_radii (torch.Tensor)
fit_radii (torch.Tensor)
alpha (float)
lam (float)
probe_radius (float)
esp_weight (float)
- Return type:
torch.Tensor
- shepherd_score.alignment.objective_pharm_overlay(se3_params, ref_pharms, fit_pharms, ref_anchors, fit_anchors, ref_vectors, fit_vectors, similarity='tanimoto', extended_points=False, only_extended=False, precomputed_self_overlaps=None)
Objective function to optimize pharmacophore overlay. Supports batched and non-batched inputs. If the inputs are batched, the loss is the average across the batch.
- Parameters:
se3_params (torch.Tensor (batch, 7) or (7,)) – Parameters for SE(3) transformation. The first 4 values in the last dimension are quaternions of form (r,i,j,k) and the last 3 values of the last dimension are the translations in (x,y,z).
ref_anchors (torch.Tensor (batch, N, 3) or (N,3)) – Reference anchors. If you want to optimize to the same ref_anchors with a batch of different se3_params, try using torch.Tensor.repeat((batch, 1, 1)).
fit_anchors (torch.Tensor (batch, M, 3) or (M,3)) – Set of anchors to apply SE(3) transformations to in order to maximize similarity with ref_anchors. If you want to optimize the same fit_anchors with a batch of different se3_params, try using torch.Tensor.repeat((batch, 1, 1)).
ref_pharms (torch.Tensor)
fit_pharms (torch.Tensor)
ref_vectors (torch.Tensor)
fit_vectors (torch.Tensor)
similarity (Literal['tanimoto', 'tversky', 'tversky_ref', 'tversky_fit'])
extended_points (bool)
only_extended (bool)
precomputed_self_overlaps (Tuple[torch.Tensor, torch.Tensor] | None)
- Returns:
loss – 1 - mean(pharmacophore similarity score).
- Return type:
torch.Tensor (1,)
- shepherd_score.alignment.crippen_align(ref_rdmol, fit_rdmol)
Align fit_rdmol with respect to ref_rdmol using RDKit’s Crippen alignment algorithm.
- Parameters:
ref_rdmol (rdkit.Chem.rdchem.Mol) – Reference molecule that fit_rdmol is aligned to.
fit_rdmol (rdkit.Chem.rdchem.Mol) – Fit molecule that will be aligned to the reference.
- Returns:
aligned_fit_rdmol – Fit molecule with new aligned coordinates.
- Return type:
rdkit.Chem.rdchem.Mol
- shepherd_score.alignment.optimize_ROCS_overlay(ref_points, fit_points, alpha, *, fit_points_for_avoid=None, avoid_points=None, avoid_min_dist=2.0, avoid_weight=1.0, num_repeats=50, trans_centers=None, lr=0.1, max_num_steps=200, verbose=False)
Optimize alignment of fit_points with respect to ref_points using SE(3) transformations by maximizing the Gaussian overlap score.
If num_repeats is 1, the initial guess for alignment is an identity rotation and aligned COMs. If num_repeats is 5 or greater, four initial guesses are aligned using principal components.
- Parameters:
ref_points (torch.Tensor (N,3)) – Reference points.
fit_points (torch.Tensor (M,3)) – Set of points to apply SE(3) transformations to maximize shape similarity with ref_points.
alpha (float) – Gaussian width parameter used in scoring function.
fit_points_for_avoid (torch.Tensor (M,3)) – Set of points to apply SE(3) transformations to and then compare to avoid_points.
avoid_points (torch.Tensor (K,3) (default=None)) – If not None, these are points that are used in an additional term in the objective function to penalize overlap with these points.
avoid_min_dist (float (default=2.0)) – Minimum distance with no penalization between fit_points_for_avoid and avoid_points.
avoid_weight (float (default=1.0)) – Weight for the avoid_points term in the scoring function.
num_repeats (int (default=50)) – Number of different random initializations of SE(3) transformation parameters.
trans_centers (torch.Tensor (P, 3) (default=None)) – Locations to translate fit_points’ center of mass to as initial guesses for optimization. At each translation center, 10 rotations are also sampled, so the number of initializations scales as (# translation centers * 10 + 5), where the 5 comes from the identity and 4 PCA alignments with aligned COMs. If None, then num_repeats rotations are done with aligned COMs.
lr (float (default=0.1)) – Learning rate or step size for optimization.
max_num_steps (int (default=200)) – Maximum number of steps to optimize over.
verbose (bool (default=False)) – Print initial and final similarity scores, with scores every 100 steps.
- Returns:
- aligned_points (torch.Tensor (M,3)) – The transformed point cloud for fit_points using the optimized SE(3) transformation for alignment with ref_points.
- SE3_transform (torch.Tensor (4,4)) – Optimized SE(3) transformation matrix used to obtain aligned_points from fit_points.
- score (torch.Tensor (1,)) – Tanimoto shape similarity score for the optimal transformation.
- Return type:
tuple of (aligned_points, SE3_transform, score)
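The initialization counts stated in the num_repeats and trans_centers descriptions reduce to a small piece of arithmetic. The helper below is hypothetical and only restates those descriptions; it does not call the library.

```python
def count_initializations(num_repeats=50, trans_centers=None):
    """Number of starting poses implied by the parameter descriptions above."""
    if trans_centers is None:
        # All starts share aligned centers of mass; num_repeats rotations are tried.
        return num_repeats
    # 10 sampled rotations per translation center, plus identity and 4 PCA alignments.
    return len(trans_centers) * 10 + 5

default_starts = count_initializations()
starts_with_centers = count_initializations(
    trans_centers=[(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
)
```

Read this way, supplying three translation centers gives 35 starting poses regardless of num_repeats.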
- shepherd_score.alignment.optimize_ROCS_overlay_analytical(ref_points, fit_points, alpha, *, fit_points_for_avoid=None, avoid_points=None, avoid_min_dist=2.0, avoid_weight=1.0, num_repeats=50, trans_centers=None, lr=0.1, max_num_steps=200, verbose=False)
Optimize shape alignment using analytical gradients instead of autograd.
Same interface and behavior as optimize_ROCS_overlay, but uses hand-derived analytical gradients with a manual Adam optimizer, eliminating PyTorch autograd overhead.
- Parameters:
ref_points (torch.Tensor (N,3))
fit_points (torch.Tensor (M,3))
alpha (float)
fit_points_for_avoid (torch.Tensor (M2,3) or None) – Points to penalize for overlap with avoid_points. Defaults to fit_points if None.
avoid_points (torch.Tensor (K,3) or None) – Fixed points to avoid overlapping with.
avoid_min_dist (float) – Distance threshold for avoid penalty.
avoid_weight (float) – Weight of the avoid penalty term.
num_repeats (int)
trans_centers (torch.Tensor or None)
lr (float)
max_num_steps (int)
verbose (bool)
- Return type:
tuple of (aligned_points, SE3_transform, score)
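To show what a manual Adam update driven by an analytical gradient looks like, here is a toy sketch in plain Python. It minimizes a simple quadratic instead of the overlap objective; `adam_minimize` and `toy_grad` are hypothetical names, and the beta and epsilon values are the common Adam defaults, assumed here rather than taken from the library.

```python
import math

def adam_minimize(grad_fn, x0, lr=0.1, max_num_steps=200,
                  beta1=0.9, beta2=0.999, eps=1e-8):
    """Minimize a function given only its analytical gradient, using hand-written Adam."""
    x = list(x0)
    m = [0.0] * len(x)  # running mean of gradients
    v = [0.0] * len(x)  # running mean of squared gradients
    for step in range(1, max_num_steps + 1):
        g = grad_fn(x)
        for idx in range(len(x)):
            m[idx] = beta1 * m[idx] + (1 - beta1) * g[idx]
            v[idx] = beta2 * v[idx] + (1 - beta2) * g[idx] ** 2
            m_hat = m[idx] / (1 - beta1 ** step)  # bias correction
            v_hat = v[idx] / (1 - beta2 ** step)
            x[idx] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

def toy_grad(x):
    """Analytical gradient of f(x) = (x0 - 3)^2 + (x1 + 1)^2."""
    return [2.0 * (x[0] - 3.0), 2.0 * (x[1] + 1.0)]

solution = adam_minimize(toy_grad, [0.0, 0.0])
```

Because the gradient is supplied in closed form, no computation graph has to be built or differentiated at each step, which is the overhead the analytical variants aim to avoid.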
- shepherd_score.alignment.optimize_ROCS_esp_overlay(ref_points, fit_points, ref_charges, fit_charges, alpha, lam, num_repeats=50, trans_centers=None, lr=0.1, max_num_steps=200, verbose=False)
Optimize alignment of fit_points with respect to ref_points using SE(3) transformations by maximizing the electrostatic-weighted Gaussian overlap score.
- Parameters:
ref_points (torch.Tensor (N,3)) – Reference points.
fit_points (torch.Tensor (M,3)) – Set of points to apply SE(3) transformations to maximize shape similarity with ref_points.
ref_charges (torch.Tensor (batch, N) or (N,)) – Electric potential at the corresponding ref_points coordinates.
fit_charges (torch.Tensor (batch, M) or (M,)) – Electric potential at the corresponding fit_points coordinates.
alpha (float) – Gaussian width parameter used in scoring function.
lam (float) – Scaling term for charges used in the exponential kernel of the ESP scoring function.
num_repeats (int (default=50)) – Number of different random initializations of SE(3) transformation parameters.
trans_centers (torch.Tensor (P, 3) (default=None)) – Locations to translate fit_points’ center of mass to as initial guesses for optimization. At each translation center, 10 rotations are also sampled, so the number of initializations scales as (# translation centers * 10 + 5), where the 5 comes from the identity and 4 PCA alignments with aligned COMs. If None, then num_repeats rotations are done with aligned COMs.
lr (float (default=0.1)) – Learning rate or step size for optimization.
max_num_steps (int (default=200)) – Maximum number of steps to optimize over.
verbose (bool (default=False)) – Print initial and final similarity scores, with scores every 100 steps.
- Returns:
- aligned_points (torch.Tensor (M,3)) – The transformed point cloud for fit_points using the optimized SE(3) transformation for alignment with ref_points.
- SE3_transform (torch.Tensor (4,4)) – Optimized SE(3) transformation matrix used to obtain aligned_points from fit_points.
- score (torch.Tensor (1,)) – ESP-weighted Tanimoto similarity score for the optimal transformation.
- Return type:
tuple of (aligned_points, SE3_transform, score)
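The relationship between SE3_transform and aligned_points can be sketched as applying a 4x4 homogeneous matrix to each point. The example assumes the usual column-vector convention (rotation in the upper-left 3x3 block, translation in the last column), which this page does not state explicitly; `apply_se3_matrix` is a hypothetical helper.

```python
def apply_se3_matrix(se3_transform, points):
    """Apply a 4x4 homogeneous transform to a list of (x, y, z) points."""
    transformed = []
    for x, y, z in points:
        transformed.append(tuple(
            se3_transform[row][0] * x
            + se3_transform[row][1] * y
            + se3_transform[row][2] * z
            + se3_transform[row][3]  # translation component (assumed last column)
            for row in range(3)
        ))
    return transformed

# 90 degree rotation about z followed by a translation of (1, 2, 3).
transform = [
    [0.0, -1.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 2.0],
    [0.0, 0.0, 1.0, 3.0],
    [0.0, 0.0, 0.0, 1.0],
]
aligned = apply_se3_matrix(transform, [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0)])
```

With tensors, and under the same convention, the equivalent operation would typically be written as `fit_points @ T[:3, :3].T + T[:3, 3]`.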
- shepherd_score.alignment.optimize_ROCS_esp_overlay_analytical(ref_points, fit_points, ref_charges, fit_charges, alpha, lam, *, num_repeats=50, trans_centers=None, lr=0.1, max_num_steps=200, verbose=False)
Optimize ESP alignment using analytical gradients instead of autograd.
Same interface and behavior as optimize_ROCS_esp_overlay, but uses hand-derived analytical gradients with a manual Adam optimizer.
- Parameters:
ref_points (torch.Tensor (N,3))
fit_points (torch.Tensor (M,3))
ref_charges (torch.Tensor (N,))
fit_charges (torch.Tensor (M,))
alpha (float)
lam (float) – Pre-scaled lam (e.g. LAM_SCALING * lam_user).
num_repeats (int)
trans_centers (torch.Tensor or None)
lr (float)
max_num_steps (int)
verbose (bool)
- Return type:
tuple of (aligned_points, SE3_transform, score)
- shepherd_score.alignment.optimize_esp_combo_score_overlay(ref_centers_w_H, fit_centers_w_H, ref_centers, fit_centers, ref_points, fit_points, ref_partial_charges, fit_partial_charges, ref_surf_esp, fit_surf_esp, ref_radii, fit_radii, alpha, lam, probe_radius=1.0, esp_weight=0.5, num_repeats=50, trans_centers=None, lr=0.1, max_num_steps=200, verbose=False)
Optimize alignment using ESP combo score.
- Parameters:
ref_centers_w_H (torch.Tensor)
fit_centers_w_H (torch.Tensor)
ref_centers (torch.Tensor)
fit_centers (torch.Tensor)
ref_points (torch.Tensor)
fit_points (torch.Tensor)
ref_partial_charges (torch.Tensor)
fit_partial_charges (torch.Tensor)
ref_surf_esp (torch.Tensor)
fit_surf_esp (torch.Tensor)
ref_radii (torch.Tensor)
fit_radii (torch.Tensor)
alpha (float)
lam (float)
probe_radius (float)
esp_weight (float)
num_repeats (int)
trans_centers (torch.Tensor | None)
lr (float)
max_num_steps (int)
verbose (bool)
- Return type:
Tuple[torch.Tensor, torch.Tensor, torch.Tensor]
- shepherd_score.alignment.optimize_pharm_overlay(ref_pharms, fit_pharms, ref_anchors, fit_anchors, ref_vectors, fit_vectors, similarity='tanimoto', extended_points=False, only_extended=False, num_repeats=50, trans_centers=None, lr=0.1, max_num_steps=200, verbose=False)
Optimize alignment of fit_anchors with respect to ref_anchors using SE(3) transformations by maximizing the pharmacophore overlap score.
- Parameters:
ref_pharms (torch.Tensor (N,)) – Indices reflecting pharmacophore type of reference molecule.
fit_pharms (torch.Tensor (M,)) – Indices reflecting pharmacophore type of fit molecule.
ref_anchors (torch.Tensor (N,3)) – Reference pharmacophore positions (anchors).
fit_anchors (torch.Tensor (M,3)) – Set of anchors to align pharmacophores to ref.
ref_vectors (torch.Tensor (batch, N, 3) or (N,3)) – Unit vectors relative to the reference anchors.
fit_vectors (torch.Tensor (batch, M, 3) or (M,3)) – Unit vectors relative to the fit anchors.
similarity (str from ('tanimoto', 'tversky', 'tversky_ref', 'tversky_fit')) – Specifies what similarity function to use.
- 'tanimoto' – symmetric scoring function.
- 'tversky' – asymmetric; uses OpenEye’s formulation, 95% normalization by molecule 1.
- 'tversky_ref' – asymmetric; uses Pharao’s formulation, 100% normalization by molecule 1.
- 'tversky_fit' – asymmetric; uses Pharao’s formulation, 100% normalization by molecule 2.
extended_points (bool) – Whether to score HBA/HBD with Gaussian overlaps of extended points.
only_extended (bool) – When extended_points is True, whether to score only the extended points (ignoring anchor overlaps).
num_repeats (int (default=50)) – Number of different random initializations of SE(3) transformation parameters.
trans_centers (torch.Tensor (P, 3) (default=None)) – Locations to translate fit_anchors’ center of mass to as initial guesses for optimization. At each translation center, 10 rotations are also sampled, so the number of initializations scales as (# translation centers * 10 + 5), where the 5 comes from the identity and 4 PCA alignments with aligned COMs. If None, then num_repeats rotations are done with aligned COMs.
lr (float (default=0.1)) – Learning rate or step size for optimization.
max_num_steps (int (default=200)) – Maximum number of steps to optimize over.
verbose (bool (default=False)) – Print initial and final similarity scores, with scores every 100 steps.
- Returns:
- aligned_points (torch.Tensor (M,3)) – The transformed fit_anchors using the optimized SE(3) transformation for alignment with ref_anchors.
- aligned_vectors (torch.Tensor (M,3)) – The transformed fit_vectors using the optimized SO(3) transformation for alignment with the reference.
- SE3_transform (torch.Tensor (4,4)) – Optimized SE(3) transformation matrix used to obtain aligned_points from fit_anchors.
- score (torch.Tensor (1,)) – Pharmacophore similarity score for the optimal transformation.
- Return type:
tuple of (aligned_points, aligned_vectors, SE3_transform, score)
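The four similarity options can be compared with a short sketch that starts from already-computed overlaps. It follows the parameter description literally, treating molecule 1 as the reference and molecule 2 as the fit; the function name and the exact 0.95/0.05 weighting for 'tversky' are an interpretation of that description, not a confirmed copy of the library's code.

```python
def pharm_similarity(o_rf, o_rr, o_ff, similarity="tanimoto"):
    """Turn cross and self overlaps into a similarity score.

    o_rf: overlap between reference and fit
    o_rr: self-overlap of the reference (molecule 1)
    o_ff: self-overlap of the fit (molecule 2)
    """
    if similarity == "tanimoto":
        return o_rf / (o_rr + o_ff - o_rf)         # symmetric
    if similarity == "tversky":
        return o_rf / (0.95 * o_rr + 0.05 * o_ff)  # mostly normalized by the reference
    if similarity == "tversky_ref":
        return o_rf / o_rr                         # fully normalized by the reference
    if similarity == "tversky_fit":
        return o_rf / o_ff                         # fully normalized by the fit
    raise ValueError(f"unknown similarity: {similarity!r}")

scores = {
    name: pharm_similarity(2.0, 4.0, 5.0, similarity=name)
    for name in ("tanimoto", "tversky", "tversky_ref", "tversky_fit")
}
```

The asymmetric variants are useful when one molecule is a substructure-like match of the other, since the score is then not penalized for features present only in the non-normalizing molecule.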
- shepherd_score.alignment.optimize_pharm_overlay_analytical(ref_pharms, fit_pharms, ref_anchors, fit_anchors, ref_vectors, fit_vectors, similarity='tanimoto', extended_points=False, only_extended=False, num_repeats=50, trans_centers=None, lr=0.1, max_num_steps=200, verbose=False)
Optimize pharmacophore alignment using analytical gradients instead of autograd.
Same interface and behavior as optimize_pharm_overlay, but uses hand-derived analytical gradients with PyTorch’s Adam optimizer, eliminating PyTorch autograd overhead. Supports similarity='tanimoto', 'tversky', 'tversky_ref', and 'tversky_fit', and extended_points=True.
- Parameters:
ref_pharms (torch.Tensor (N,))
fit_pharms (torch.Tensor (M,))
ref_anchors (torch.Tensor (N,3))
fit_anchors (torch.Tensor (M,3))
ref_vectors (torch.Tensor (N,3))
fit_vectors (torch.Tensor (M,3))
similarity (str)
extended_points (bool)
only_extended (bool)
num_repeats (int)
trans_centers (torch.Tensor or None)
lr (float)
max_num_steps (int)
verbose (bool)
- Return type:
tuple of (aligned_anchors, aligned_vectors, SE3_transform, score)