Posts by Collection

portfolio

Real MTCNN-PyTorch

The MTCNN that makes full use of PyTorch and Torchvision! I never understood why common MTCNN-PyTorch implementations convert images to PIL Images and back to Tensors again and again. This implementation avoids those round trips and is much faster.

generals.ai

Group project for COMP2113. We implemented the game generals.io in the terminal using only the C++ STL. We also added four kinds of AI players with different strategies. (I have wanted to do this ever since I started playing the game in high school… quite a dream come true.)

SPY ETF Close Price Regression

Group project for FITE3010. We tried out nine models: ARIMA-GARCH Hybrid Model, BNN, Multi-asset Attention Model, RF, MLP, SVR, HMM, TFT, and XGBoost. I was responsible for the TFT part and made it public on GitHub. The TFT model reached a test MSE of 89.19.

publications

Protego: User-Centric Pose-Invariant Privacy Protection Against Face Recognition-Induced Digital Footprint Exposure

Published on arXiv, under review at a CCF-A conference, 2025

Protego encapsulates a user’s 3D facial signatures into a pose-invariant 2D representation, which is dynamically deformed into a natural-looking 3D mask tailored to the pose and expression of any facial image of the user, and applied prior to online sharing. Motivated by a critical limitation of existing methods, Protego amplifies the sensitivity of FR models so that protected images cannot be matched even among themselves.

Recommended citation: Ziling Wang, Shuya Yang, Jialin Lu, Ka-Ho Chow (2025). "Protego: User-Centric Pose-Invariant Privacy Protection Against Face Recognition-Induced Digital Footprint Exposure." arXiv. 1(3).
Download Paper

TRivia: Self-supervised Fine-tuning of Vision-Language Models for Table Recognition

Published on arXiv, under review at a CCF-A conference, 2025

TRivia is a self-supervised fine-tuning method that enables pretrained vision-language models to learn table recognition directly from unlabeled table images in the wild. Built upon Group Relative Policy Optimization, TRivia automatically identifies unlabeled samples that most effectively facilitate learning and eliminates the need for human annotations through a question-answering-based reward mechanism.

Recommended citation: Junyuan Zhang, Bin Wang, Qintong Zhang, Fan Wu, Zichen Wen, Jialin Lu, Junjie Shan, Ziqi Zhao, Shuya Yang, Ziling Wang, Ziyang Miao, Huaping Zhong, Yuhang Zang, Xiaoyi Dong, Ka-Ho Chow, Conghui He (2025). "TRivia: Self-supervised Fine-tuning of Vision-Language Models for Table Recognition." arXiv. 1(3).
Download Paper
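
For readers unfamiliar with Group Relative Policy Optimization, which the TRivia summary above builds on, the "group relative" part can be sketched in a few lines: several outputs are sampled for the same input, each gets a scalar reward, and each reward is normalized against the others in its group. This is only a minimal illustration of that general idea, not TRivia's implementation. The function name, the reward values, and the use of the population standard deviation are all assumptions made for the example.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the group sampled for the same input.

    Hypothetical helper for illustration only; `eps` guards against a
    zero standard deviation when all rewards in the group are equal.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption for this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up rewards for four sampled outputs on one unlabeled table image.
rewards = [1.0, 0.0, 0.5, 0.5]
advantages = group_relative_advantages(rewards)
```

Outputs that score above the group mean get a positive advantage and those below get a negative one, so no learned value function or human label is needed to rank the samples.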

services

talks

teaching
