
From cones to track — EMA tracking, KNN graphs, and DFS path search

How a list of detected cones becomes a pair of track boundaries and a center line — exponential smoothing to stabilize noisy detections, a K-nearest-neighbor graph to encode which cones are adjacent, and a depth-first search that finds the best path through that graph. This is the second half of the voxel cone detection pipeline. The full working implementation is on GitHub.

Where we left off

The previous post covered how raw LiDAR point clouds become a list of detected cones: voxel downsampling, ground removal, DBSCAN clustering, and geometric scoring. That pipeline runs every 500 ms on an accumulated window of up to 20,000 points streamed over UDP from a Livox HAP sensor. The 500 ms interval is a configurable constant (DETECT_INTERVAL) set deliberately slow for initial testing — the full pipeline comfortably runs at 10 Hz (100 ms) without modification; it's just a matter of what you want to accumulate. The output is a Vec<Cone> — each with a position, a bounding size, and a confidence score.

The next problem: individual detections are noisy. A cone that returns 8 points in one frame might return only 3 in the next, shifting the computed centroid by several centimeters. Worse, at the 500 ms detection rate a cone that was briefly occluded might disappear entirely for one or two frames, creating gaps in the cone list. Connecting noisy, incomplete cone lists directly into track boundaries produces unstable, flickering paths. We need to smooth and persist observations across frames before doing any path computation.

Step 1 — EMA tracking

The tracker maintains a list of tracked cones across frames. Each tracked cone has a smoothed position, a hit count, and a missed-frame counter. Every 500 ms, after the detection pipeline finishes, the tracker does two things: it matches new detections to existing tracked cones, then it applies exponential smoothing to update the positions of matched cones.

Matching is greedy: for each new detection, find the nearest unmatched tracked cone within a 40 cm radius. If one is found, they're paired. If not, the detection becomes a new tracked cone. This is a greedy alternative to the Hungarian assignment algorithm — unlike the Hungarian method it doesn't guarantee the globally optimal assignment, but with typical cone spacings of 1–3 m the greedy approach rarely makes mistakes.
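The greedy matcher can be sketched in a few lines. This is an illustrative reconstruction, not the project's actual code — the struct names and return shape are invented; only the 40 cm gate comes from the post:

```rust
#[derive(Clone, Copy)]
struct Detection { x: f64, y: f64 }

struct TrackedCone { x: f64, y: f64 }

const MATCH_RADIUS: f64 = 0.4; // 40 cm matching gate from the post

/// For each detection, greedily claim the nearest still-unmatched tracked
/// cone within MATCH_RADIUS. Returns, per detection, the index of the
/// matched track, or None if the detection should spawn a new track.
fn greedy_match(dets: &[Detection], tracks: &[TrackedCone]) -> Vec<(usize, Option<usize>)> {
    let mut claimed = vec![false; tracks.len()];
    dets.iter().enumerate().map(|(di, d)| {
        let best = tracks.iter().enumerate()
            .filter(|(ti, _)| !claimed[*ti])
            .map(|(ti, t)| (ti, ((t.x - d.x).powi(2) + (t.y - d.y).powi(2)).sqrt()))
            .filter(|(_, dist)| *dist <= MATCH_RADIUS)
            .min_by(|a, b| a.1.partial_cmp(&b.1).unwrap());
        match best {
            Some((ti, _)) => { claimed[ti] = true; (di, Some(ti)) }
            None => (di, None), // becomes a new tracked cone
        }
    }).collect()
}
```

Because each detection claims its nearest track immediately, a later detection can be left without a partner that a globally optimal assignment would have given it — the failure mode the post notes is rare at 1–3 m cone spacing.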

For matched pairs, the tracked cone's position is updated with an exponential moving average (EMA):

x_new = α · x_old + (1 − α) · x_detected
y_new = α · y_old + (1 − α) · y_detected
α = 0.6 → 60% old state, 40% new measurement

The EMA is the simplest possible low-pass filter. It has no memory of how the position got to where it is — it only needs the previous state and the new measurement. Compared to a windowed average (keep N samples, compute mean), it uses constant memory and has constant update cost regardless of how long the cone has been tracked.

Why α = 0.6? The step response of an EMA with factor α reaches 95% of a step change in approximately log(0.05) / log(α) frames:

n₀.₉₅ = log(0.05) / log(0.6) ≈ −2.996 / −0.511 ≈ 5.9 frames

At the default 500 ms interval that works out to ~3 seconds, but since the interval is configurable the step response stays constant in frames regardless of rate — about 6 detection cycles to track a real position change. That's fast enough to follow the slow drift from ego-motion while filtering centimeter-scale noise from point density variation. A higher α (say 0.9) would filter more aggressively but take 10+ frames to track a real position change — unacceptable for a moving vehicle.
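The update rule and its step response are small enough to verify directly. A minimal sketch (function names are illustrative; α = 0.6 and the 95% formula are from the post):

```rust
const ALPHA: f64 = 0.6; // 60% old state, 40% new measurement

/// One EMA update step: blend the old smoothed value with the measurement.
fn ema(old: f64, measured: f64) -> f64 {
    ALPHA * old + (1.0 - ALPHA) * measured
}

/// Frames for the filter to reach 95% of a step change:
/// n = log(0.05) / log(alpha).
fn frames_to_95pct(alpha: f64) -> f64 {
    (0.05f64).ln() / alpha.ln()
}
```

Feeding a unit step through the filter confirms the ~6-frame figure: after five updates the output is still below 0.95, after six it crosses it.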

Unmatched tracked cones have their missed counter incremented. Unconfirmed cones (fewer than 2 hits) are dropped after 8 missed frames. Confirmed cones can be set to persist indefinitely — useful once the full track has been mapped. This asymmetry means false positives are short-lived but true positives are never lost.
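The lifecycle rule can be captured in one predicate. A sketch under stated assumptions — the struct and function names are invented, and the post doesn't specify what happens to confirmed cones without the persistence flag, so here only the persistent-and-confirmed case is exempt from the miss limit:

```rust
struct Track { hits: u32, missed: u32, persistent: bool }

const CONFIRM_HITS: u32 = 2; // fewer than 2 hits = unconfirmed
const MAX_MISSED: u32 = 8;   // unconfirmed tracks die after 8 missed frames

/// Record a missed frame and return whether the track should be kept.
fn survives_miss(t: &mut Track) -> bool {
    t.missed += 1;
    (t.persistent && t.hits >= CONFIRM_HITS) || t.missed < MAX_MISSED
}
```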

Step 2 — KNN graph construction

With a stable list of tracked cones, the next step is to figure out which cones are adjacent along the track boundary. We can't just sort by distance from the car — on a curved track, the nearest cone in straight-line distance isn't necessarily the next one along the boundary. Instead, we build a graph where each cone is a node and edges connect cones that are geometrically nearby.

For each cone, we find its K = 6 nearest neighbors within a 6 m radius and draw directed edges to them. Then we symmetrize: an edge is kept in the final undirected graph only if it's mutual (cone A is in B's K-NN and B is in A's K-NN) or if the two cones are within 4 m of each other. The 4 m rule prevents an isolated cone from having no connections just because its nearest neighbor is on the opposite boundary and not mutual.
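The two phases — directed K-NN lists, then symmetrization — can be sketched with a brute-force scan, matching the O(n²) approach the post describes. Names are illustrative; K = 6, the 6 m radius, and the 4 m mutual-exemption rule are from the post:

```rust
type P = (f64, f64);

const K: usize = 6;
const KNN_RADIUS: f64 = 6.0; // neighbors beyond this are ignored
const FORCE_LINK: f64 = 4.0; // below this, mutuality is not required

fn dist(a: P, b: P) -> f64 {
    ((a.0 - b.0).powi(2) + (a.1 - b.1).powi(2)).sqrt()
}

/// Directed K-NN lists within the radius, brute force over all pairs.
fn knn(cones: &[P]) -> Vec<Vec<usize>> {
    cones.iter().enumerate().map(|(i, &c)| {
        let mut nbrs: Vec<(usize, f64)> = cones.iter().enumerate()
            .filter(|&(j, &o)| j != i && dist(c, o) <= KNN_RADIUS)
            .map(|(j, &o)| (j, dist(c, o)))
            .collect();
        nbrs.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap());
        nbrs.truncate(K);
        nbrs.into_iter().map(|(j, _)| j).collect()
    }).collect()
}

/// Keep an undirected edge if it is mutual, or if the cones are closer
/// than FORCE_LINK. BTreeSet deduplicates and sorts the edge list.
fn symmetrize(cones: &[P], dir: &[Vec<usize>]) -> Vec<(usize, usize)> {
    let mut edges = std::collections::BTreeSet::new();
    for (i, nbrs) in dir.iter().enumerate() {
        for &j in nbrs {
            if dir[j].contains(&i) || dist(cones[i], cones[j]) <= FORCE_LINK {
                edges.insert((i.min(j), i.max(j)));
            }
        }
    }
    edges.into_iter().collect()
}
```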

The O(n²) complexity in the pipeline header refers to this step: for each of the n cones we scan all other cones to build the neighbor list. With n ≈ 30 tracked cones that's 900 distance computations per frame — negligible. The reason we don't use a k-d tree here (as in the clustering step) is that 30 points don't justify the tree construction overhead.

Why K = 6? On a straight section the "correct" graph has each cone connected to the one behind it and the one ahead — degree 2. K = 6 provides redundancy: if one edge goes to the wrong side of the track, there are five others, and the path search can discard the bad one. It also handles hairpin corners where the "next" cone might not be the closest in Euclidean distance.

Step 3 — DFS path search

Given the KNN graph, we need to find the best path from the car's current position along the left boundary and separately along the right boundary. We run two independent depth-first searches — one starting from the nearest valid left cone (y > 0), one from the nearest valid right cone (y < 0).

The DFS maintains a stack of (path, last_direction) pairs. At each step, it tries to extend the current path through every neighbor in the KNN graph, pruning candidates that violate geometric constraints, and pushes valid extensions back onto the stack. When the stack is empty, the lowest-cost path found is returned.
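The search skeleton, abstracted over the pruning predicate and cost function, looks roughly like this. A minimal sketch, not the project's code — here the stack holds only paths (direction is recomputed by the caller's predicate), and the 16-cone depth cap is the post's MAX_PATH_LEN:

```rust
/// Stack-based DFS over an adjacency list, keeping the lowest-cost
/// complete path (length >= 2). `ok` prunes candidate extensions;
/// `cost` scores a path (lower is better).
fn dfs_best_path(
    adj: &[Vec<usize>],
    start: usize,
    max_len: usize,
    ok: &dyn Fn(&[usize], usize) -> bool,
    cost: &dyn Fn(&[usize]) -> f64,
) -> Vec<usize> {
    let mut best: (f64, Vec<usize>) = (f64::INFINITY, vec![start]);
    let mut stack = vec![vec![start]];
    while let Some(path) = stack.pop() {
        if path.len() >= 2 {
            let c = cost(&path);
            if c < best.0 { best = (c, path.clone()); }
        }
        if path.len() >= max_len { continue; }
        let last = *path.last().unwrap();
        for &n in &adj[last] {
            // never revisit a cone; apply the geometric pruning rules
            if !path.contains(&n) && ok(&path, n) {
                let mut ext = path.clone();
                ext.push(n);
                stack.push(ext);
            }
        }
    }
    best.1
}
```

With an always-true predicate and the post's 5000 / path_len length term as the only cost, the search simply finds the longest simple path.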

Pruning rules. Four constraints are checked before any candidate extension is accepted:

  1. Distance: the segment from the current cone to the candidate must be between 1 cm and 3 m. Shorter rejects duplicates; longer rejects jumps across the track.
  2. Absolute angle: the angle between the current direction vector and the candidate direction must be at most 75°. This is computed via the dot product:
    cos θ = (d⃗_prev · d⃗_next) / (‖d⃗_prev‖ · ‖d⃗_next‖)
    θ = arccos(cos θ) → reject if θ > 75°
  3. Signed angle: the cross product gives the sign of the turn — left or right. For the left boundary the path should not turn sharply right (cross_z > 0 means left turn in 2D):
    cross_z = d⃗_prev.x · d⃗_next.y − d⃗_prev.y · d⃗_next.x
    signed_angle = atan2(cross_z, dot) → reject left path if signed_angle < −50°
  4. Reversal: if the previous turn and the current turn have opposite signs and both exceed 1.3 rad (≈ 74°), the path is zigzagging — reject. This catches cases where the graph has short-circuit edges that loop back.

There's also a blocking check: if any off-path cone lies within 80 cm of the proposed segment, the extension is rejected. This prevents paths from cutting through clusters of cones that belong to the opposite boundary.
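Rules 1–3 for the left boundary reduce to one predicate on three points. A sketch with illustrative names; the 1 cm / 3 m gates and the 75° / −50° angle limits are from the post (the reversal and blocking checks are omitted for brevity):

```rust
use std::f64::consts::PI;

type V = (f64, f64);

const MIN_SEG: f64 = 0.01; // 1 cm: rejects duplicate detections
const MAX_SEG: f64 = 3.0;  // 3 m: rejects jumps across the track
const MAX_ABS_ANGLE: f64 = 75.0 * PI / 180.0;
const MAX_RIGHT_TURN: f64 = -50.0 * PI / 180.0; // left-boundary limit

fn sub(a: V, b: V) -> V { (a.0 - b.0, a.1 - b.1) }
fn norm(v: V) -> f64 { (v.0 * v.0 + v.1 * v.1).sqrt() }

/// Signed angle between two direction vectors via atan2(cross_z, dot);
/// positive means a left (counter-clockwise) turn.
fn signed_angle(prev: V, next: V) -> f64 {
    let dot = prev.0 * next.0 + prev.1 * next.1;
    let cross_z = prev.0 * next.1 - prev.1 * next.0;
    cross_z.atan2(dot)
}

/// Pruning rules 1-3 for extending a left-boundary path from `cur` to
/// `cand`, given the previous direction vector `prev_dir`.
fn accept_left(prev_dir: V, cur: V, cand: V) -> bool {
    let d = sub(cand, cur);
    let len = norm(d);
    if len < MIN_SEG || len > MAX_SEG { return false; }
    let theta = signed_angle(prev_dir, d);
    theta.abs() <= MAX_ABS_ANGLE && theta >= MAX_RIGHT_TURN
}
```

Note the asymmetry: a candidate turning 63° left passes, while the same magnitude to the right fails the −50° signed gate.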

The cost function. Every complete path (length ≥ 2) is scored. Lower cost is better. The cost has four components:

  1. Length: 5000 / path_len. Penalizes short paths — longer is better, all else equal.
  2. Curvature: 5–1000 × angle per turn. Correct-direction curves are cheap; wrong-direction curves are expensive.
  3. Gap: 150 × (dist − 5) per segment. Large gaps suggest a skipped cone.
  4. Wrong side: +1500 per cone. Heavy penalty if a left-path cone has y < −1 m.

The curvature term is the most nuanced. It uses the cross product sign to determine whether a turn curves toward the inside of the track (correct for that boundary) or the outside (wrong). A gentle correct curve adds almost nothing to cost; the same angle in the wrong direction adds 80–1000× more. This encodes the physical constraint that both track boundaries curve in the same direction around a corner.
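Putting the four terms together for a left-boundary path might look like this. An illustrative sketch: the weights 5000, 150, 5, 1000, and 1500 are the figures quoted above, but the exact curvature weighting between those endpoints is the post's, not shown here, so the two extremes stand in for it:

```rust
type P = (f64, f64);

const LEN_WEIGHT: f64 = 5000.0;
const GAP_WEIGHT: f64 = 150.0;
const GAP_FREE: f64 = 5.0; // segments under 5 m add no gap cost
const WRONG_SIDE_PENALTY: f64 = 1500.0;

fn seg_len(a: P, b: P) -> f64 {
    ((b.0 - a.0).powi(2) + (b.1 - a.1).powi(2)).sqrt()
}

/// Four-term cost for a left-boundary path; lower is better.
fn left_path_cost(path: &[P]) -> f64 {
    let mut cost = LEN_WEIGHT / path.len() as f64; // length term
    for w in path.windows(2) {
        let d = seg_len(w[0], w[1]); // gap term
        if d > GAP_FREE { cost += GAP_WEIGHT * (d - GAP_FREE); }
    }
    for w in path.windows(3) {
        let d1 = (w[1].0 - w[0].0, w[1].1 - w[0].1);
        let d2 = (w[2].0 - w[1].0, w[2].1 - w[1].1);
        let dot = d1.0 * d2.0 + d1.1 * d2.1;
        let cross = d1.0 * d2.1 - d1.1 * d2.0;
        let ang = cross.atan2(dot);
        // left turns are the correct direction for the left boundary
        cost += if ang >= 0.0 { 5.0 * ang } else { 1000.0 * (-ang) };
    }
    for &(_, y) in path {
        if y < -1.0 { cost += WRONG_SIDE_PENALTY; } // wrong-side term
    }
    cost
}
```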

The complexity O(b^d) has branching factor b ≤ K = 6 and depth d ≤ 16 (MAX_PATH_LEN). In the worst case that's 6¹⁶ ≈ 2.8 × 10¹² nodes — but the pruning constraints eliminate the vast majority of branches immediately. In practice the effective branching factor is 1.5–2.0, making the search complete in microseconds.

Top-down view: tracked cones → KNN graph edges → DFS left path (blue) → DFS right path (yellow) → midpoint center line (green).

Step 4 — Conflict resolution

The two DFS searches are independent. Nothing prevents them from assigning the same cone to both the left and right boundary — this happens at corners where a cone is ambiguously positioned, or when the graph has unexpected cross-edges. Before computing the center line, overlapping cones are resolved with a four-strategy cascade:

  1. Continuity. If the cone's predecessor in one path is within 3 m, that path has a continuous local connection — keep it there, remove from the other. This handles the common case where one boundary "owns" the cone via a short edge while the other reached it via a longer detour.
  2. Curvature direction. Compute the signed turn angle at the conflict cone for both paths using atan2(cross_z, dot). A left turn on the left path is geometrically correct; a right turn on the right path is geometrically correct. The path with the consistent curvature wins.
  3. Remaining length. If after the conflict point one path has 3+ more cones than the other, it's more likely correct. Remove the cone from the shorter path.
  4. Spatial fallback. If the cone's y coordinate is above +0.5 m it's definitively on the left; below −0.5 m it's on the right. Remove from the wrong path. If still tied, give preference to the shorter path.
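Strategies 1 and 4 are the easiest to pin down in code. A sketch with invented names; the 3 m continuity threshold and the ±0.5 m spatial thresholds come from the cascade above:

```rust
type P = (f64, f64);

fn dist(a: P, b: P) -> f64 {
    ((a.0 - b.0).powi(2) + (a.1 - b.1).powi(2)).sqrt()
}

/// Strategy 1: a path "owns" a shared cone at `idx` if its predecessor
/// in that path is within 3 m — a continuous local connection.
fn has_continuity(path: &[P], idx: usize) -> bool {
    idx > 0 && dist(path[idx - 1], path[idx]) <= 3.0
}

/// Strategy 4: spatial fallback on the y coordinate.
/// Some(true) = left boundary, Some(false) = right, None = still ambiguous.
fn spatial_side(y: f64) -> Option<bool> {
    if y > 0.5 { Some(true) } else if y < -0.5 { Some(false) } else { None }
}
```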

The cascade is applied independently for every conflicting cone. The O(l × r) cost is from scanning both path arrays to find each conflict, where l and r are the path lengths — typically ≤ 16, so at most 256 comparisons.

Step 5 — Virtual cones and center line

Before computing the center line, gaps larger than 5 m in either path are filled with virtual cones. These are linearly interpolated positions — not real sensor detections — inserted to prevent the center line from jumping over large holes. The number of virtual cones per gap is n = ⌈gap / 3.5⌉ − 1, targeting a maximum spacing of 3.5 m. Their height is the average of the two bracketing real cones, and they're flagged as virtual so downstream code can treat them differently if needed.
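The interpolation is a direct application of the formula n = ⌈gap / 3.5⌉ − 1. A sketch in 2D (the real cones also carry the averaged height; names are illustrative):

```rust
type P = (f64, f64);

const GAP_LIMIT: f64 = 5.0;      // only gaps over 5 m get filled
const TARGET_SPACING: f64 = 3.5; // maximum spacing after filling

/// Linearly interpolated virtual cones between two real cones `a`, `b`.
fn virtual_cones(a: P, b: P) -> Vec<P> {
    let gap = ((b.0 - a.0).powi(2) + (b.1 - a.1).powi(2)).sqrt();
    if gap <= GAP_LIMIT { return Vec::new(); }
    let n = (gap / TARGET_SPACING).ceil() as usize - 1;
    (1..=n).map(|i| {
        let t = i as f64 / (n + 1) as f64; // evenly spaced along the segment
        (a.0 + t * (b.0 - a.0), a.1 + t * (b.1 - a.1))
    }).collect()
}
```

A 7 m gap gets one virtual cone at the midpoint, splitting it into two 3.5 m segments; a 4 m gap is left alone.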

The center line is computed by pairing each left cone with its best-matching right cone. "Best" is defined by minimizing |Δx| + |width − 3| subject to |Δx| < 4 m and 1.5 m ≤ width ≤ 5 m. The midpoint of each matched pair becomes a center point. The result is sorted by x-coordinate (forward direction) and published as the vehicle's reference trajectory.
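The pairing can be sketched as a nested scan. One assumption to flag: the post doesn't define "width" precisely, so here it is taken as the Euclidean distance between the paired cones; all gates and the |Δx| + |width − 3| score are from the text:

```rust
type P = (f64, f64);

/// Pair each left cone with its best right cone and emit sorted midpoints.
fn center_line(left: &[P], right: &[P]) -> Vec<P> {
    let mut pts: Vec<P> = left.iter().filter_map(|&l| {
        right.iter()
            .filter(|&&r| {
                let dx = (l.0 - r.0).abs();
                let w = ((l.0 - r.0).powi(2) + (l.1 - r.1).powi(2)).sqrt();
                dx < 4.0 && (1.5..=5.0).contains(&w) // gates from the post
            })
            .min_by(|&&a, &&b| {
                // score = |dx| + |width - 3|, lower is better
                let score = |r: P| {
                    let w = ((l.0 - r.0).powi(2) + (l.1 - r.1).powi(2)).sqrt();
                    (l.0 - r.0).abs() + (w - 3.0).abs()
                };
                score(a).partial_cmp(&score(b)).unwrap()
            })
            .map(|&r| ((l.0 + r.0) / 2.0, (l.1 + r.1) / 2.0))
    }).collect();
    pts.sort_by(|a, b| a.0.partial_cmp(&b.0).unwrap()); // forward order
    pts
}
```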

There's also a path persistence mechanism: if the new left or right path overlaps more than 60% with the previous frame's path (measured as fraction of positions matched within 80 cm), the previous path is kept unchanged — the new one is assumed to be the same track, not an improvement. The hysteresis increases over time: after 12 stable frames, the system requires the new path to be at least 20% cheaper before switching. This prevents flickering between two valid-looking paths that the cost function can't decisively separate.
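The overlap measure behind the persistence check is straightforward. A sketch assuming overlap is measured over the new path's points (the post doesn't specify the direction of the comparison); the 80 cm match radius and 60% threshold are from the text:

```rust
type P = (f64, f64);

/// Fraction of new-path points that lie within 80 cm of some old-path point.
/// Above 0.6 the previous frame's path is kept unchanged.
fn overlap_fraction(new_path: &[P], old_path: &[P]) -> f64 {
    if new_path.is_empty() { return 0.0; }
    let matched = new_path.iter().filter(|&&p| {
        old_path.iter().any(|&q| {
            ((p.0 - q.0).powi(2) + (p.1 - q.1).powi(2)).sqrt() <= 0.8
        })
    }).count();
    matched as f64 / new_path.len() as f64
}
```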

Why it's built this way

Every algorithm in this pipeline was chosen to match the latency and resource constraints of a real-time embedded system. The EMA uses constant memory and O(1) update per tracked cone. The KNN graph is O(n²) but n is bounded by the physical number of cones visible — never more than ~40. The DFS is exponential in theory but the geometric pruning makes it linear in practice.

Crucially, no step requires a global optimizer or iterative convergence. Each stage makes a single pass over its input and hands off a smaller, cleaner representation to the next. The entire detection + tracking + pathfinding pipeline consistently completes in under 10 ms on an ARM laptop, leaving the 33 ms rendering budget untouched. No GPU, no neural network, no ROS — just geometry and a few hundred lines of Rust.

The main limitation is exactly where you'd expect it: when cones are sparse (long straights, distant corners), the KNN graph becomes sparse, the DFS has fewer paths to explore, and the cost function has less evidence to work with. Virtual cone interpolation patches the center line, but the path quality degrades gracefully rather than catastrophically. For a Formula Student track that stays within 20 m of the sensor, it's more than enough.