ffdas: Volumetric ultrasound reconstruction at warp speed

Abstract

Volumetric ultrafast ultrasound imaging demands reconstruction of images with millions of voxels thousands of times per second, creating computational challenges that limit both real-time feedback and easy offline analysis. Graphics processing units (GPUs) are well suited to this workload, yet we show that standard delay-and-sum implementations underutilize GPU resources through fragmented memory access patterns, even when sufficient computational capacity is available. Three optimization strategies address this: aligning memory access with GPU transfer granularity, halving memory traffic through mixed-precision storage, and exploiting spatial locality to utilize tensor core arithmetic. Together, these achieve kilohertz frame rates for 128³-voxel grids with 1024-element arrays, substantially outperforming existing implementations while maintaining image quality. This enables real-time volumetric imaging at scales previously restricted to offline processing, supporting applications such as intraoperative brain imaging and brain-computer interfaces where immediate feedback is essential. We release our implementation as part of the open-source ffdas library.
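
To give a sense of the scale implied by these figures, the following back-of-envelope sketch counts voxel-element pairs and the memory traffic a naive delay-and-sum would generate. It is an illustration only, not part of ffdas: the function name `das_workload` is hypothetical, and the assumptions of one transmit event per frame, exactly 1000 frames per second, one complex sample fetched per voxel-element pair with no reuse, and 8-byte versus 4-byte complex samples are ours, not stated in the abstract.

```python
# Rough workload estimate for the grid and array sizes named in the abstract.
# ASSUMPTIONS (not from the paper): one transmit per frame, 1000 frames/s,
# one complex sample read per voxel-element pair, no caching or reuse.

def das_workload(grid_side=128, n_elements=1024, frames_per_s=1000,
                 bytes_per_sample=8):
    """Return (voxels, pairs per frame, pairs per second, bytes per second)."""
    voxels = grid_side ** 3                   # 128^3 voxel grid
    pairs_per_frame = voxels * n_elements     # one delay lookup per pair
    pairs_per_s = pairs_per_frame * frames_per_s
    bytes_per_s = pairs_per_s * bytes_per_sample
    return voxels, pairs_per_frame, pairs_per_s, bytes_per_s


# Complex samples stored as two 32-bit floats (8 bytes), assumed baseline.
fp32 = das_workload(bytes_per_sample=8)
# Complex samples stored as two 16-bit floats (4 bytes), assumed mixed precision.
fp16 = das_workload(bytes_per_sample=4)

print(f"voxels per frame:       {fp32[0]:,}")
print(f"voxel-element pairs:    {fp32[1]:,} per frame")
print(f"naive traffic, 32-bit:  {fp32[3] / 1e12:.1f} TB/s")
print(f"naive traffic, 16-bit:  {fp16[3] / 1e12:.1f} TB/s")
```

Under these assumptions a frame involves about two billion voxel-element pairs, and the naive 32-bit traffic comes to roughly 17 TB/s. Halving the sample size halves that figure, and the remainder has to be absorbed by reuse of nearby samples, which is consistent with the abstract's emphasis on access granularity and spatial locality.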
