ResearchHub | Open Science Community

Jiaming Shen

Author with expertise in Natural Language Processing

Key Stats

Upvotes received: 0

Publications: 4 (25% Open Access)

Cited by: 0

h-index: 22 / i10-index: 35

Reputation

Biology: < 1%

Chemistry: < 1%

Economics: < 1%

Publications

Large Language Models are Effective Text Rankers with Pairwise Ranking Prompting

Zhen Qin et al., Jan 1, 2024

Artificial Intelligence

Computer Science

Multilingual Fine-Grained News Headline Hallucination Detection

Jiaming Shen et al., Jan 1, 2024

Artificial Intelligence

Knowledge Distillation with Perturbed Loss: From a Vanilla Teacher to a Proxy Teacher

Rongzhi Zhang et al., Aug 24, 2024

Knowledge distillation is a popular technique to transfer knowledge from a large teacher model to a small student model. Typically, the student learns to imitate the teacher by minimizing the KL divergence between its output distribution and the teacher's output distribution. In this work, we argue that such a learning objective is sub-optimal because there exists a discrepancy between the teacher's output distribution and the ground truth label distribution. Therefore, forcing the student to blindly imitate the unreliable teacher output distribution leads to inferior performance. To this end, we propose a novel knowledge distillation objective, PTLoss, by first representing the vanilla KL-based distillation loss function via a Maclaurin series and then perturbing the leading-order terms in this series. This perturbed loss implicitly transforms the original teacher into a proxy teacher with a distribution closer to the ground truth distribution. We establish the theoretical connection between this "distribution closeness" and the student model's generalizability, which enables us to select PTLoss's perturbation coefficients in a principled way. Extensive experiments on six public benchmark datasets demonstrate the effectiveness of PTLoss with teachers of different scales.
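The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The first function is the standard KL distillation loss. The second relies on the identity -log(p) = sum over m >= 1 of (1 - p)^m / m, which is a Maclaurin series in u = 1 - p, and adds a coefficient eps[m-1] to the first few terms. The exact expansion and perturbation scheme used for PTLoss are not given in the abstract, so the function name perturbed_series_loss, the eps parameter, and the choice of expansion are assumptions made for illustration only.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_distillation_loss(teacher_probs, student_probs):
    """Standard distillation loss: KL(teacher || student)."""
    return sum(
        pt * (math.log(pt) - math.log(ps))
        for pt, ps in zip(teacher_probs, student_probs)
        if pt > 0
    )


def perturbed_series_loss(teacher_probs, student_probs, eps=(), num_terms=200):
    """Hypothetical sketch (assumption, not the paper's formula).

    Replaces -log(p_s) with the truncated series
    sum_{m=1..num_terms} (1/m + eps_m) * (1 - p_s)^m,
    where eps_m is eps[m-1] for the leading terms and 0 afterwards.
    With eps empty and enough terms, this approximates the KL loss.
    """
    loss = 0.0
    for pt, ps in zip(teacher_probs, student_probs):
        if pt <= 0:
            continue
        u = 1.0 - ps
        power = 1.0
        series = 0.0
        for m in range(1, num_terms + 1):
            power *= u  # power == u ** m
            coeff = 1.0 / m + (eps[m - 1] if m <= len(eps) else 0.0)
            series += coeff * power
        loss += pt * (math.log(pt) + series)
    return loss


# Illustrative logits (made up for this example).
teacher = softmax([2.0, 1.0, 0.1])
student = softmax([1.5, 1.2, 0.3])

print("KL loss:          ", kl_distillation_loss(teacher, student))
print("series, no eps:   ", perturbed_series_loss(teacher, student))
print("series, eps=(0.5,):", perturbed_series_loss(teacher, student, eps=(0.5,)))
```

With eps left empty, the truncated series tracks the KL loss closely. A nonzero leading coefficient changes how strongly low student probabilities are penalized, which is the kind of adjustment the abstract describes as moving from the original teacher to a proxy teacher.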

Artificial Intelligence

Machine Learning

0

Paper

Artificial Intelligence

Machine Learning

Save

Coordinated protection of low-voltage DC distribution system based on the front zone braking coefficient

Ruike Zhang et al., Jan 1, 2025

Mechanical Engineering

Automotive Engineering
