Computed Tomography (CT) is a key imaging technology that relies on computationally intensive filtering and back-projection algorithms for 3D image reconstruction. While conventional high-resolution image reconstruction (> 2K³) solutions provide quick results, they typically treat reconstruction as an offline workload to be performed remotely on large-scale HPC systems. The growing demand for post-reconstruction AI-driven analytics and the need for real-time adjustments call for high-resolution reconstruction solutions that are feasible on local computing resources, i.e., a multi-GPU server at most. In this paper, we propose a novel approach that utilizes Tensor Cores to optimize image reconstruction without sacrificing precision. We also introduce a framework designed to enable real-time execution of end-to-end distributed image reconstruction in a multi-GPU environment. Evaluations conducted on a single Nvidia A100 GPU and a single Nvidia H100 GPU show performance improvements of 1.91× and 2.15×, respectively, compared to highly optimized production libraries. Furthermore, our framework, when deployed on an 8-card Nvidia A100 GPU system, demonstrates the ability to reconstruct real-world datasets into 2048³ volumes (32 GB) in slightly more than one minute and 4096³ volumes (256 GB) in 7 minutes.
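The volume sizes quoted above are consistent with simple arithmetic if each voxel occupies 4 bytes (for example, single-precision floating point). That voxel width is an assumption made here for illustration, since the text does not state the storage format, and the helper names `volume_bytes` and `to_gib` are hypothetical. A minimal sketch:

```python
def volume_bytes(n: int, bytes_per_voxel: int = 4) -> int:
    """Size in bytes of an n x n x n volume.

    bytes_per_voxel=4 is an assumed voxel width (e.g. float32),
    not something stated in the text.
    """
    return n ** 3 * bytes_per_voxel


def to_gib(num_bytes: int) -> float:
    """Convert a byte count to binary gigabytes (2**30 bytes)."""
    return num_bytes / 2 ** 30


if __name__ == "__main__":
    for n in (2048, 4096):
        print(f"{n}^3 volume: {to_gib(volume_bytes(n)):.0f} GiB")
```

Under this assumption the two volumes come to 32 and 256 binary gigabytes, matching the figures quoted in the text. Doubling the edge length multiplies the footprint by eight, which is why the larger volume is the more demanding case for a single multi-GPU server.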