ResearchHub | Open Science Community

Fan Zhang

Author with expertise in Single Image Super-Resolution Techniques

Achievements

This user has not unlocked any achievements yet.

Key Stats

Upvotes received: 0

Publications: 9 (11% Open Access)

Cited by: 3

h-index: 23 / i10-index: 43

Reputation

Biology: < 1%

Chemistry: < 1%

Economics: < 1%

Publications

Accelerating Learnt Video Codecs with Gradient Decay and Layer-Wise Distillation

Tianhao Peng et al., Jun 12, 2024

In recent years, end-to-end learnt video codecs have demonstrated their potential to compete with conventional coding algorithms in terms of compression efficiency. However, most learning-based video compression models are associated with high computational complexity and latency, in particular at the decoder side, which limits their deployment in practical applications. In this paper, we present a novel model-agnostic pruning scheme based on gradient decay and adaptive layer-wise distillation. Gradient decay enhances parameter exploration during sparsification whilst preventing runaway sparsity and is superior to the standard Straight-Through Estimation. The adaptive layer-wise distillation regulates the sparse training in various stages based on the distortion of intermediate features. This stage-wise design efficiently updates parameters with minimal computational overhead. The proposed approach has been applied to three popular end-to-end learnt video codecs: FVC, DCVC, and DCVC-HEM. Results confirm that our method yields up to 65% reduction in MACs and 2× speedup with less than 0.3 dB drop in BD-PSNR. Supporting code and supplementary material can be downloaded from: https://jasminepp.github.io/lightweighltdvc/.
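The contrast between Straight-Through Estimation and a decayed gradient can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the linear schedule, and the choice to scale only the gradients of pruned weights are all assumptions made for illustration. A decay factor of 1 passes gradients through the mask unchanged (STE-like), and a factor of 0 freezes pruned weights entirely.

```python
def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask that keeps the largest-magnitude weights."""
    n_prune = int(len(weights) * sparsity)
    # Indices sorted by ascending magnitude; the first n_prune are pruned.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])
    return [0.0 if i in pruned else 1.0 for i in range(len(weights))]


def masked_gradient(grads, mask, decay):
    """Scale the gradients of pruned weights by `decay` (assumed in [0, 1]).

    decay = 1.0 behaves like a straight-through estimator;
    decay = 0.0 blocks all updates to pruned weights.
    """
    return [g if m == 1.0 else decay * g for g, m in zip(grads, mask)]


def decay_schedule(step, total_steps):
    """Hypothetical linear schedule from 1 (STE-like) down to 0 (hard mask)."""
    return max(0.0, 1.0 - step / total_steps)


weights = [0.5, -0.1, 0.3, 0.05]
mask = magnitude_mask(weights, sparsity=0.5)
grads = masked_gradient([1.0, 1.0, 1.0, 1.0], mask, decay_schedule(5, 10))
```

Under these assumptions, pruned weights can still move early in training (and so re-enter the kept set when the mask is recomputed), while later steps leave them increasingly fixed.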

Artificial Intelligence

Signal Processing

BVI-Artefact: An Artefact Detection Benchmark Dataset for Streamed Videos

Feng Chen et al., Jun 12, 2024

Professionally generated content (PGC) streamed online can contain visual artefacts that degrade the quality of user experience. These artefacts arise from different stages of the streaming pipeline, including acquisition, post-production, compression, and transmission. To better guide streaming experience enhancement, it is important to detect specific artefacts at the user end in the absence of a pristine reference. In this work, we address the lack of a comprehensive benchmark for artefact detection within streamed PGC, via the creation and validation of a large database, BVI-Artefact. Considering the ten most relevant artefact types encountered in video streaming, we collected and generated 480 video sequences, each containing various artefacts with associated binary artefact labels. Based on this new database, existing artefact detection methods are benchmarked, with results showing the challenging nature of this task and indicating the need for more reliable artefact detection methods. To facilitate further research in this area, we have made BVI-Artefact publicly available at https://chenfeng-bristol.github.io/BVI-Artefact/.

Artificial Intelligence

Computer Vision And Pattern Recognition

Object detection on low-resolution images with two-stage enhancement

Minghong Li et al., May 24, 2024

Although deep learning-based object detection methods have achieved superior performance on conventional benchmark datasets, it is still difficult to detect objects in low-resolution (LR) images under diverse degradation conditions. To this end, a two-stage enhancement framework for LR image object detection (TELOD) is proposed. In the first stage, an extremely lightweight task disentanglement enhancement network (TDEN) is developed as a super-resolution (SR) sub-network placed before the detector. In the TDEN, SR images are obtained from the input LR images through recurrent connections between an image restoration branch (IRB) and a resolution enhancement branch (REB). Specifically, the TDEN reduces the difficulty of image reconstruction by dividing the overall image enhancement task into two sub-tasks, which are accomplished by the IRB and REB, respectively. Furthermore, a shared feature extractor is applied across the two sub-tasks to explore common and accurate feature representations. In the second stage, an auxiliary feature enhancement head (AFEH) driven by high-resolution (HR) image priors is designed to improve the task-specific features produced by the detection neck without any extra inference cost. In particular, a feature interaction module is built into the AFEH to integrate the features from the enhancement and detection phases and learn comprehensive information for detection. Extensive experiments show that the proposed TELOD significantly outperforms other methods. Specifically, TELOD achieves mAP improvements of 1.8% and 3.3% over the second-best method, AERIS, on the degraded VOC and COCO datasets, respectively.

Artificial Intelligence

Power supply reliability evaluation of distribution network based on non-intrusive low-voltage power load identification and time series algorithm

Wenqian Jiang et al., Jan 1, 2024

Electrical And Electronic Engineering

Immersive Video Compression Using Implicit Neural Representations

Ho Kwan et al., Jun 12, 2024

Recent work on implicit neural representations (INRs) has evidenced their potential for efficiently representing and encoding conventional video content. In this paper we, for the first time, extend their application to immersive (multi-view) videos, by proposing MV-HiNeRV, a new INR-based immersive video codec. MV-HiNeRV is an enhanced version of a state-of-the-art INR-based video codec, HiNeRV, which was developed for single-view video compression. We have modified the model to learn a different group of feature grids for each view, and share the learnt network parameters among all views. This enables the model to effectively exploit the spatio-temporal and the inter-view redundancy that exists within multi-view videos. The proposed codec was used to compress multi-view texture and depth video sequences in the MPEG Immersive Video (MIV) Common Test Conditions, and tested against the MIV Test Model (TMIV) that uses the VVenC video codec. The results demonstrate the superior performance of MV-HiNeRV, with significant coding gains (up to 72.33%) over TMIV. The implementation of MV-HiNeRV is published for further development and evaluation at https://hmkx.github.io/mv-hinerv/.
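The split between per-view feature grids and shared network parameters can be pictured with a toy sketch. It does not reproduce the MV-HiNeRV architecture: the class name, the flat per-frame grids, and the single linear decoder are hypothetical simplifications chosen only to show which parameters grow with the number of views and which do not.

```python
class MultiViewINR:
    """Toy model: one feature grid per view, one decoder shared by all views."""

    def __init__(self, num_views, num_frames, feat_dim):
        # Per-view parameters: a (num_frames x feat_dim) grid for each view.
        self.grids = [
            [[0.0] * feat_dim for _ in range(num_frames)]
            for _ in range(num_views)
        ]
        # Shared parameters: a single linear decoder used for every view.
        self.decoder = [1.0] * feat_dim

    def decode(self, view, frame):
        """Look up the view-specific feature and decode it with shared weights."""
        feature = self.grids[view][frame]
        return sum(w * f for w, f in zip(self.decoder, feature))

    def param_counts(self):
        """Return (per-view parameter total, shared parameter total)."""
        per_view = sum(len(row) for grid in self.grids for row in grid)
        return per_view, len(self.decoder)


model = MultiViewINR(num_views=3, num_frames=4, feat_dim=2)
model.grids[1][2] = [0.5, 1.5]  # set one feature vector for view 1, frame 2
value = model.decode(view=1, frame=2)
```

In this sketch, adding a view adds only another grid; the decoder is reused, which is the kind of sharing that lets a model exploit redundancy between views.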

Artificial Intelligence

Computer Vision And Pattern Recognition

KCGGC: Keypoint Confidence-Guided Gamma Correction for Automatic Enhancement of Lateral Cervical Spine X-ray Images

M. Zhang et al., Jan 1, 2024

Artificial Intelligence

CSFIN: A lightweight network for camouflaged object detection via cross-stage feature interaction

Minghong Li et al., Jan 1, 2025

RankDVQA-Mini: Knowledge Distillation-Driven Deep Video Quality Assessment

Chen Feng et al., Jun 12, 2024

Artificial Intelligence

EDSD: efficient driving scenes detection based on Swin Transformer

Wei Chen et al., Jul 20, 2024

Artificial Intelligence

Automotive Engineering
