Driven by the rapid development of ultrafast laser platforms in recent decades, the spatiotemporal manipulation of ultrashort laser pulses has attracted considerable attention because structured light enables cutting-edge applications, including optical tweezers, optical communications, super-resolution imaging, time-resolved spectroscopy of molecules and quantum materials, and strong-field physics. Techniques capable of characterizing the full spatial, temporal, and polarization properties of structured light are therefore in strong demand. Here, we demonstrate a technique, termed 3D TIPTOE, for characterizing structured mid-infrared waveforms that uses only a two-dimensional silicon-based image sensor as both the detector and the nonlinear medium. By combining the sub-cycle time resolution afforded by nonlinear excitation with the spatial resolution inherent to the two-dimensional sensor, 3D TIPTOE allows full characterization of structured electric fields while significantly reducing the complexity of detection compared to other techniques. We validate the technique by measuring both few-cycle Bessel–Gaussian pulses and radially polarized femtosecond vector beams.