Problem
Spectral-Doppler velocity depends on the beam-to-vessel angle $\theta$ through $f_d = 2 f_0 v \cos\theta / c$, and angle correction is set by hand. Patil & Anand (EMBC 2019) learn $\theta$ directly from a single grayscale B-mode carotid image — no color Doppler, no segmentation. This project rebuilds that pipeline from scratch, replicates it cleanly, explains why it works, and pushes the estimator as far as it will go.
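To make the angle dependence concrete, the Doppler equation can be inverted for velocity, $v = c f_d / (2 f_0 \cos\theta)$, so an error in the assumed angle rescales the reported velocity by $\cos\theta_{\text{true}} / \cos\theta_{\text{assumed}}$. The sketch below is illustrative only: the function names and the numeric values (1540 m/s speed of sound, 5 MHz transmit frequency, 60° true angle) are assumptions chosen for the example, not values taken from the paper or from this project.

```python
import math

SPEED_OF_SOUND = 1540.0  # m/s; a commonly used soft-tissue value (assumption)


def velocity_from_shift(f_d, f_0, theta_deg, c=SPEED_OF_SOUND):
    """Invert f_d = 2 * f_0 * v * cos(theta) / c for the velocity v."""
    return c * f_d / (2.0 * f_0 * math.cos(math.radians(theta_deg)))


def relative_velocity_error(true_deg, assumed_deg):
    """Relative velocity error when assumed_deg is entered instead of true_deg.

    The measured shift is fixed by the true angle, so the reported velocity
    is v_true * cos(true) / cos(assumed).
    """
    ratio = math.cos(math.radians(true_deg)) / math.cos(math.radians(assumed_deg))
    return ratio - 1.0


# Hypothetical numbers: 5 MHz probe, 0.5 m/s flow, 60 degree true angle.
f_0 = 5.0e6
shift = 2.0 * f_0 * 0.5 * math.cos(math.radians(60.0)) / SPEED_OF_SOUND
v_correct = velocity_from_shift(shift, f_0, 60.0)  # recovers 0.5 m/s
v_off_by_5 = velocity_from_shift(shift, f_0, 65.0)  # overestimates the flow
err = relative_velocity_error(60.0, 65.0)  # roughly +18%
```

Because $\cos\theta$ falls steeply at large angles, the same 5° mistake costs more at 60° than it would at 30°, which is why a reliable angle estimate matters.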
Approach
- One typed, test-first library (Keras 3 / JAX, pixi), with the model written once and the backend chosen per machine.
- Orientation-preserving grid pooling instead of global average pooling (global pooling is partly rotation-invariant, which is wrong for an orientation target). This is the load-bearing design choice that makes a frozen backbone reproduce the paper.
- Two sampling protocols behind a config flag: image-level sampling (the paper’s standard augmented-corpus protocol) and patient-level sampling (cross-subject, holding out whole volunteers) — two complementary lenses, each reported and each tuned to its own best.
- Optuna TPE hyperparameter search against cached frozen features (each trial a shallow head fit; one extraction per backbone serves both protocols), then a stacked ensemble of the tuned backbones.
- Clinical-grade, post-hoc evaluation, all Keras-free: split-conformal intervals, Bland–Altman, calibration curves, patient-level nested CV, test-time augmentation, a classical structure-tensor prior + fusion, and Grad-CAM.
- Every figure is regenerated from results/ by script; the whole thing is reproducible with pixi run all.
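The pooling point in the list above can be illustrated with a toy feature map. This is a minimal NumPy sketch, not the project's layer: the function names, the 2×2 grid size, and the single-channel test maps are all assumptions made for the example. It shows that global average pooling gives identical outputs for a diagonal streak and its mirror image, while pooling over a coarse grid keeps them apart.

```python
import numpy as np


def global_average_pool(fmap):
    """Collapse an (H, W, C) feature map to (C,), discarding spatial layout."""
    return fmap.mean(axis=(0, 1))


def grid_pool(fmap, grid=2):
    """Average inside each cell of a grid x grid partition, then flatten.

    Returns a vector of length grid * grid * C whose ordering records where
    in the image each response occurred. Assumes grid <= H and grid <= W.
    """
    h, w, _ = fmap.shape
    # Cell boundaries; also handles H or W not divisible by grid.
    rows = np.linspace(0, h, grid + 1).astype(int)
    cols = np.linspace(0, w, grid + 1).astype(int)
    cells = [
        fmap[rows[i]:rows[i + 1], cols[j]:cols[j + 1]].mean(axis=(0, 1))
        for i in range(grid)
        for j in range(grid)
    ]
    return np.concatenate(cells)


# Toy one-channel maps: a diagonal streak and its left-right mirror image.
streak = np.eye(4)[:, :, None]
mirrored = streak[:, ::-1]

gap_a, gap_b = global_average_pool(streak), global_average_pool(mirrored)
grid_a, grid_b = grid_pool(streak), grid_pool(mirrored)
# gap_a and gap_b are both [0.25]: the two orientations are indistinguishable.
# grid_a is [0.5, 0, 0, 0.5] and grid_b is [0, 0.5, 0.5, 0]: they differ.
```

In a real backbone some channels are themselves orientation-selective, so global pooling loses only part of the orientation signal; the grid layout keeps the rest available to a shallow head.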
Headline results
- Replication: a frozen DenseNet201 + grid pooling reproduces Table I at 5.84% MAPE (3.77° MAE) — the fix was the pooling, not the backbone (it lifts the frozen model from ~14% to 5.84%).
- Best estimator, image-level sampling: an Optuna-tuned 5-model ensemble reaches 2.79% MAPE / 1.96° MAE ($R^2$ 0.995) — better than the paper’s best single model.
- Best estimator, patient-level sampling: the tuned ensemble reaches 8.53% MAPE / 5.93° MAE ($R^2$ 0.952) on the stricter cross-subject regime.
- Architecture bake-off: frozen DenseNet201 beats ConvNeXt and EfficientNetV2 — newer is not better for small-data frozen transfer.
- Clinical-grade: nominal 90% split-conformal intervals of ±20.5°, with 95% empirical coverage; Bland–Altman +4.3° bias vs the single reference reading (method-vs-reference, not inter-observer, and flagged as such); test-time augmentation cuts per-image MAE 7.8° → 4.7°.
- Honest about the ceiling: end-to-end fine-tuning and modern self-supervised encoders (DINOv2, USFM) are deferred to a CUDA box — documented, not hidden.
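For readers unfamiliar with the split-conformal intervals quoted above, the following is a minimal sketch of the standard recipe using absolute residuals as the nonconformity score. The score choice, the function names, and the toy numbers are assumptions; the source does not state which score the project uses.

```python
import math


def conformal_halfwidth(cal_true, cal_pred, alpha=0.10):
    """Half-width q so that [pred - q, pred + q] targets 1 - alpha coverage.

    Uses absolute residuals on a held-out calibration set and the
    finite-sample rank ceil((n + 1) * (1 - alpha)).
    """
    scores = sorted(abs(t - p) for t, p in zip(cal_true, cal_pred))
    n = len(scores)
    k = math.ceil((n + 1) * (1.0 - alpha))
    if k > n:
        return math.inf  # too few calibration points for this alpha
    return scores[k - 1]


def empirical_coverage(true, pred, q):
    """Fraction of targets that fall inside pred +/- q."""
    hits = sum(abs(t - p) <= q for t, p in zip(true, pred))
    return hits / len(true)


# Toy calibration set: predictions of 0 with absolute errors 1, 2, ..., 19.
cal_true = [float(i) for i in range(1, 20)]
cal_pred = [0.0] * len(cal_true)
q = conformal_halfwidth(cal_true, cal_pred, alpha=0.10)  # 18th smallest score

# Coverage is then measured on a separate test set (toy values here).
coverage = empirical_coverage([5.0, 18.0, 30.0], [0.0, 0.0, 0.0], q)
```

The guarantee is marginal and holds on average over calibration draws, so coverage measured on one finite test set can land above or below the nominal level.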
Links
- Repo: github.com/nilesh-patil/ultrasound-doppler-angle-estimation
- Blog post: Reading the Doppler angle off B-mode: a deep-learning replication, tuned two ways — the full write-up, the pooling insight, and the figures.
- Prior work: EMBC 2019 paper · extended preprint arXiv:2508.04243.