Spatial And Neighbors
Spatial analysis answers a simple biological question: which parts of a structure are close in 3D space?
This is useful on its own (packing, local environments) and as a foundation for later interaction analysis.
In Lahuta, the geometric layer is built around atom coordinates from LahutaSystem.props and neighbor search utilities that work on those coordinates.
Start Simple: Neighbors From A LahutaSystem
If you already have a structure file, the fastest start is system.find_neighbors(...).
from lahuta import LahutaSystem
system = LahutaSystem("core/data/ubi.cif")
neighbors = system.find_neighbors(cutoff=4.0, residue_difference=1)
print("pairs:", len(neighbors))
print("pairs array shape:", neighbors.pairs.shape)
print("distance_sq shape:", neighbors.distances.shape)
print("first 5 sqrt distances [A]:", neighbors.get_sqrt_distances()[:5])
residue_difference lets you suppress trivially local sequence neighbors.
For proteins, this is often useful when you want more meaningful non-local proximity signals instead of immediate backbone-adjacent pairs.
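To make the effect of residue_difference concrete, here is a minimal brute-force NumPy sketch that does not call Lahuta. The helper `neighbor_pairs`, the toy coordinates, and the rule that a pair is dropped when its residue indices differ by at most `residue_difference` are all assumptions made for illustration; Lahuta's exact threshold semantics may differ.

```python
import numpy as np

def neighbor_pairs(coords, resids, cutoff, residue_difference=None):
    """Brute-force pairs (i < j) within `cutoff`, optionally skipping local pairs.

    Hypothetical helper for illustration only; not part of Lahuta.
    Assumption: a pair is dropped when |resid_i - resid_j| <= residue_difference.
    """
    coords = np.asarray(coords, dtype=np.float64)
    resids = np.asarray(resids)
    i, j = np.triu_indices(len(coords), k=1)      # all unique index pairs
    dist_sq = np.sum((coords[i] - coords[j]) ** 2, axis=1)
    keep = dist_sq <= cutoff ** 2                 # compare squared values, no sqrt
    if residue_difference is not None:
        keep &= np.abs(resids[i] - resids[j]) > residue_difference
    return np.column_stack((i[keep], j[keep])), dist_sq[keep]

# Four atoms on a line, 1.5 A apart, belonging to residues 1, 1, 2 and 5.
coords = [[0.0, 0, 0], [1.5, 0, 0], [3.0, 0, 0], [4.5, 0, 0]]
resids = [1, 1, 2, 5]

all_pairs, _ = neighbor_pairs(coords, resids, cutoff=4.0)
nonlocal_pairs, nonlocal_dist_sq = neighbor_pairs(
    coords, resids, cutoff=4.0, residue_difference=1
)
print("all pairs:", len(all_pairs))
print("non-local pairs:", nonlocal_pairs.tolist())
```

In this toy case the spatial cutoff alone keeps five pairs, while the sequence filter leaves only the two pairs that involve the distant residue.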
Working With NSResults
Neighbor results are returned as NSResults:
- pairs: (n, 2) integer array of atom index pairs
- distances / distances_sq: squared distances (A^2)
- get_sqrt_distances(): Euclidean distances (A)
You can filter results incrementally:
# Keep only pairs within 2.5 A
tight = neighbors.filter(2.5)
# Keep pairs where either side is in `keep`
keep = list(range(10))
subset_any = tight.filter(keep)
# Keep pairs where only the first column index is in `keep`
subset_col0 = tight.filter(keep, 0)
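The three filter calls above can be read as boolean masks over the pairs array. The sketch below mirrors that reading in plain NumPy on made-up data; it is an interpretation of the comments above, not Lahuta's implementation. The arrays `pairs` and `dist_sq` are toy stand-ins, and treating the distance cutoff as inclusive is an assumption.

```python
import numpy as np

# Toy stand-ins for NSResults data: index pairs and squared distances (A^2).
pairs = np.array([[0, 12], [3, 4], [11, 2], [15, 20]])
dist_sq = np.array([4.0, 9.0, 2.25, 1.0])

# Distance filter: compare squared distances with the squared 2.5 A cutoff.
within = dist_sq <= 2.5 ** 2
tight_pairs, tight_dist_sq = pairs[within], dist_sq[within]

keep = np.arange(10)

# "Either side": a pair survives if at least one of its indices is in `keep`.
mask_any = np.isin(tight_pairs, keep).any(axis=1)
subset_any = tight_pairs[mask_any]

# "Column 0 only": a pair survives only if its first index is in `keep`.
mask_col0 = np.isin(tight_pairs[:, 0], keep)
subset_col0 = tight_pairs[mask_col0]

print("tight:", tight_pairs.tolist())
print("any side:", subset_any.tolist())
print("column 0:", subset_col0.tolist())
```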
For high-throughput analysis, pairs_view and distances_view expose zero-copy read-only views.
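As a general NumPy illustration of what a zero-copy read-only view means (this sketch does not call Lahuta, and `make_readonly_view` is a hypothetical helper): the view shares memory with its source array, so nothing is duplicated, and writes through the view are rejected.

```python
import numpy as np

def make_readonly_view(arr):
    """Return a view that shares memory with `arr` but cannot be written to.

    Hypothetical helper for illustration only; not part of Lahuta.
    """
    view = arr.view()
    view.flags.writeable = False
    return view

data = np.arange(6, dtype=np.int64).reshape(3, 2)
view = make_readonly_view(data)

# The view and the source share one buffer, so no copy was made.
shares = np.shares_memory(data, view)

# Writing through a read-only view raises ValueError.
try:
    view[0, 0] = 99
    write_blocked = False
except ValueError:
    write_blocked = True

# Copy explicitly when a writable array is needed.
editable = view.copy()
editable[0, 0] = 99

print("shares memory:", shares)
print("write blocked:", write_blocked)
```

The practical consequence is that a view is cheap to obtain, and any modification should happen on an explicit copy.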
If your goal is explicit distance matrices (pairwise_distances, cdist, pdist), use the dedicated Distances page.
Radius Search In A Scikit-Learn Style
NearestNeighbors provides a familiar interface for grouped radius queries:
import numpy as np
from lahuta import LahutaSystem, NearestNeighbors
system = LahutaSystem("core/data/ubi.cif")
X = np.asarray(system.props.positions_view, dtype=np.float64, order="C")
nn = NearestNeighbors(radius=6.0, algorithm="kd_tree", sort_results=True).fit(X)
distances, indices = nn.radius_neighbors(return_distance=True)
print("n query points:", len(indices))
print("neighbors of first point:", indices[0][:5])
print("distances of first point [A]:", distances[0][:5])
This is convenient when you want per-query neighbor lists for geometric statistics or clustering-style postprocessing.
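As a sketch of such postprocessing, the snippet below derives a neighbor count and a mean neighbor distance for each query point. It assumes the results are ragged, with one array per query point as in scikit-learn's radius_neighbors; the toy `distances` and `indices` values are made up, and `summarize_neighbors` is a hypothetical helper.

```python
import numpy as np

def summarize_neighbors(distances, indices):
    """Per-query neighbor count and mean distance from ragged results.

    Hypothetical helper for illustration only; not part of Lahuta.
    Queries without neighbors get NaN as their mean distance.
    """
    counts = np.array([len(idx) for idx in indices])
    mean_dist = np.array([d.mean() if len(d) else np.nan for d in distances])
    return counts, mean_dist

# Toy ragged output for three query points (distances in A).
indices = [np.array([1, 2]), np.array([0]), np.array([], dtype=int)]
distances = [np.array([1.0, 3.0]), np.array([1.0]), np.array([])]

counts, mean_dist = summarize_neighbors(distances, indices)
print("neighbor counts:", counts.tolist())
print("mean distances [A]:", mean_dist.tolist())
```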
Advanced Backends: KDIndex And FastNS
For lower-level control, Lahuta exposes two direct engines:
- KDIndex: indexed cross-query radius search (good for repeated query analysis)
- FastNS: grid-based self-search over one coordinate set
import numpy as np
from lahuta import FastNS, KDIndex, LahutaSystem
system = LahutaSystem("core/data/ubi.cif")
X = np.asarray(system.props.positions_view, dtype=np.float64, order="C")
# KD index for repeated cross-searches
kd = KDIndex()
kd.build_view(X) # zero-copy build on C-contiguous float64 array
flat = kd.radius_search(X[:10], radius=5.0)
print("kd flat pairs:", len(flat))
# FastNS for self-search
grid = FastNS(X)
if grid.build(4.0):
    res = grid.self_search()
    print("fastns pairs:", len(res))
Most users can stay with system.find_neighbors(...) or NearestNeighbors. Use KDIndex / FastNS when you need explicit control over index lifecycle, memory behavior, or output layout.
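For intuition about what a grid-based self-search does, here is a minimal cell-list sketch in NumPy and plain Python. It illustrates the general technique only and makes no claim about how FastNS is implemented; `grid_self_search` and the toy coordinates are hypothetical. Points are binned into cubic cells with an edge length equal to the cutoff, so every pair within the cutoff lies in the same cell or in adjacent cells.

```python
from collections import defaultdict
from itertools import product

import numpy as np

def grid_self_search(coords, cutoff):
    """Return sorted (i, j) pairs with i < j and distance <= cutoff.

    Hypothetical cell-list sketch for illustration; not Lahuta's FastNS.
    """
    coords = np.asarray(coords, dtype=np.float64)
    # Integer cell coordinates, with the cell edge equal to the cutoff.
    cells = np.floor((coords - coords.min(axis=0)) / cutoff).astype(np.int64)
    buckets = defaultdict(list)
    for idx, cell in enumerate(map(tuple, cells.tolist())):
        buckets[cell].append(idx)

    offsets = list(product((-1, 0, 1), repeat=3))  # the cell and its 26 neighbors
    cutoff_sq = cutoff * cutoff
    pairs = []
    for cell, members in buckets.items():
        for off in offsets:
            other = (cell[0] + off[0], cell[1] + off[1], cell[2] + off[2])
            for i in members:
                for j in buckets.get(other, ()):
                    if i < j:  # report each unordered pair once
                        diff = coords[i] - coords[j]
                        if diff @ diff <= cutoff_sq:
                            pairs.append((i, j))
    return sorted(pairs)

# Two close pairs separated by a gap larger than the cutoff.
toy = [[0.0, 0, 0], [1.0, 0, 0], [5.0, 0, 0], [5.5, 0, 0]]
toy_pairs = grid_self_search(toy, cutoff=2.0)
print("grid pairs:", toy_pairs)
```

Only neighboring cells are compared, so the work grows roughly with the number of points for uniformly dense coordinates instead of with the number of all possible pairs.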