Distances

Distance matrices are a core geometric primitive behind many structural analyses. In Lahuta, the distance functions operate directly on coordinate arrays of shape (n, 3) and return NumPy arrays, so they combine easily with other tools in the scientific Python ecosystem (e.g. SciPy, pandas).

Use this page when you want explicit distance values. Use Spatial when you want neighbor lists within a radius.

Coordinate Input

All distance functions expect 3D Cartesian coordinates with shape (n, 3) (or (m, 3) for cross-distance cases).

import numpy as np
from lahuta import LahutaSystem

system = LahutaSystem("core/data/ubi.cif")
X = np.asarray(system.props.positions_view, dtype=np.float64, order="C")
print(X.shape)  # (n_atoms, 3)
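
The conversion above can be wrapped in a small guard that rejects malformed input before it reaches a distance function. This is a minimal sketch: `as_coords` is a hypothetical helper, not part of Lahuta, and it only assumes that a C-contiguous float64 array of shape (n, 3) is the safe input form, as in the example above.

```python
import numpy as np

def as_coords(points):
    """Hypothetical helper: return a C-contiguous float64 array of shape (n, 3)."""
    arr = np.ascontiguousarray(points, dtype=np.float64)
    if arr.ndim != 2 or arr.shape[1] != 3:
        raise ValueError(f"expected shape (n, 3), got {arr.shape}")
    return arr

coords = as_coords([[0.0, 0.0, 0.0], [1.0, 2.0, 2.0]])
print(coords.shape)  # (2, 3)
```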

pairwise_distances

metrics.pairwise_distances(X) returns a full square matrix (n, n) with all pairwise distances.

from lahuta import metrics

D = metrics.pairwise_distances(X[:16])
print(D.shape)      # (16, 16)
print(D[0, 1:5])    # distances from point 0 to points 1..4

You can also pass a second array to get a rectangular matrix of distances between the two sets:

C = metrics.pairwise_distances(X[:8], X[8:24])
print(C.shape)  # (8, 16)
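
For intuition about what these matrices contain, the sketch below computes the same quantity with plain NumPy broadcasting. `pairwise_reference` is a hypothetical name used only for illustration; it is not Lahuta's implementation and makes no claim about how Lahuta computes distances internally.

```python
import numpy as np

def pairwise_reference(A, B=None):
    """NumPy-only reference: Euclidean distances between rows of A and rows of B."""
    A = np.asarray(A, dtype=np.float64)
    B = A if B is None else np.asarray(B, dtype=np.float64)
    diff = A[:, None, :] - B[None, :, :]       # shape (n, m, 3)
    return np.sqrt((diff * diff).sum(axis=-1)) # shape (n, m)

rng = np.random.default_rng(0)
pts = rng.normal(size=(6, 3))

D_ref = pairwise_reference(pts)            # square case, shape (6, 6)
C_ref = pairwise_reference(pts[:2], pts[2:])  # rectangular case, shape (2, 4)

# A self-distance matrix is symmetric with a zero diagonal.
print(np.allclose(D_ref, D_ref.T), np.allclose(np.diag(D_ref), 0.0))
```

A reference like this allocates an (n, m, 3) intermediate, so it is only suitable for small inputs, for example when sanity-checking library output with `np.allclose`.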

cdist

metrics.cdist(XA, XB) computes cross-distances between two coordinate sets.

XA = X[:8]
XB = X[8:24]
cross = metrics.cdist(XA, XB)
print(cross.shape)  # (8, 16)

Use this when the geometry naturally splits into two sets (for example, query versus reference atoms).
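
A common follow-up for the query/reference case is finding the closest reference point for each query point. The sketch below shows that reduction on a small hand-made example; the cross-distance matrix is built with NumPy as a stand-in so the snippet runs without Lahuta, and the variable names are illustrative.

```python
import numpy as np

query = np.array([[0.0, 0.0, 0.0],
                  [5.0, 0.0, 0.0]])
reference = np.array([[1.0, 0.0, 0.0],
                      [4.0, 0.0, 0.0],
                      [0.0, 3.0, 0.0]])

# Stand-in for metrics.cdist(query, reference): shape (n_query, n_reference).
cross = np.linalg.norm(query[:, None, :] - reference[None, :, :], axis=-1)

nearest_idx = cross.argmin(axis=1)   # index of the closest reference per query
nearest_dist = cross.min(axis=1)     # distance to that reference
print(nearest_idx)   # [0 1]
print(nearest_dist)  # [1. 1.]
```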

pdist

metrics.pdist(X) returns the upper-triangle distances in condensed form: a 1D vector of length n(n-1)/2, similar to SciPy's pdist semantics.

condensed = metrics.pdist(X[:16])
print(condensed.shape)  # (16 * 15 // 2,)

This is useful when you want all pairwise distances without allocating a full square matrix.
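
To look up a single pair in a condensed vector, you need the mapping from (i, j) to a flat index. The sketch below uses SciPy's row-major upper-triangle ordering; that Lahuta's pdist uses exactly the same ordering is an assumption based on the "similar to SciPy" note above, and `condensed_index` is a hypothetical helper.

```python
from itertools import combinations

def condensed_index(i, j, n):
    """Flat index of pair (i, j) in a condensed vector for n points (SciPy ordering)."""
    if i == j:
        raise ValueError("the diagonal is not stored in condensed form")
    if i > j:
        i, j = j, i
    return n * i - i * (i + 1) // 2 + (j - i - 1)

n = 16
# Row-major upper triangle: (0,1), (0,2), ..., (0,15), (1,2), ...
pairs = list(combinations(range(n), 2))
print(len(pairs))                 # 120, i.e. 16 * 15 // 2
print(condensed_index(0, 1, n))   # 0
print(condensed_index(14, 15, n)) # 119
```

If the ordering does match SciPy's, `scipy.spatial.distance.squareform` can expand a condensed vector into the full square matrix when one is needed.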

Squared Distances

All three functions accept squared=True to return squared distances instead:

D_sq = metrics.pairwise_distances(X[:16], squared=True)
P_sq = metrics.pdist(X[:16], squared=True)
C_sq = metrics.cdist(X[:8], X[8:24], squared=True)

Squared distances are often preferable in scoring code where you want to avoid repeated square roots.
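
A typical case is a distance cutoff: because distances are non-negative, d < r selects the same pairs as d² < r², so the square root can be skipped. The sketch below illustrates this with a NumPy stand-in for the squared matrix; the cutoff value and array names are illustrative assumptions, not Lahuta defaults.

```python
import numpy as np

rng = np.random.default_rng(1)
pts = rng.uniform(0.0, 10.0, size=(50, 3))

# Stand-in for metrics.pairwise_distances(pts, squared=True).
diff = pts[:, None, :] - pts[None, :, :]
d_sq = (diff * diff).sum(axis=-1)

cutoff = 4.0
within_sq = d_sq < cutoff ** 2        # no square root needed
within = np.sqrt(d_sq) < cutoff       # same mask, one sqrt per pair

print(np.array_equal(within_sq, within))
```

The two masks can differ only for values that sit exactly on the cutoff after floating-point rounding, which is rarely a concern for real coordinates.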