CUHK-X Multimodal Human Activity Challenge

The Challenge

Competition Overview

The CUHK-X Multimodal Human Activity Challenge is the first large-scale international competition that excludes RGB data entirely: models learn human dynamics from depth, IMU, mmWave radar, skeleton, thermal, and infrared modalities. This privacy-preserving design mirrors the deployment reality of healthcare, smart home, and elderly-care systems, where visual privacy must be preserved at every stage of training, validation, and inference.

The challenge is hosted by the AIoT Lab at The Chinese University of Hong Kong and runs on Kaggle as two parallel tracks, with finals held alongside UbiComp 2026 in Shanghai. The Small Model Track targets efficient, edge-deployable HAR; the Large Model Track pushes multimodal LLMs on VQA covering action understanding and reasoning. Total prize pool: USD $20,000.

Host

CUHK · AIoT Lab

Platform

Kaggle — two parallel competitions

Duration

June 20 – September 15, 2026

Finals Venue

UbiComp 2026 · Shanghai · Oct 11, 2026

Two Tracks

Small Model (HAR) · Large Model (VQA)

Permitted Modalities

Depth · IMU · mmWave · Skeleton · Thermal · Infrared

Total Prize Pool

USD $20,000

Finals Format

On-site deployment · technical report · awards

Competition structure

Challenge Format

Two independent parallel tracks, each with its own Kaggle leaderboard, prize pool, and evaluation criteria.

Small Model Track

HAR · Lightweight

Efficient multimodal Human Activity Recognition

Targeted at resource-constrained edge deployment in smart home and healthcare scenarios — applications such as Alzheimer's monitoring, fall detection, and elderly care, where models must run on low-power devices with limited memory and compute.

Participants build lightweight multimodal models that fuse depth imagery, IMU streams, mmWave radar, and skeleton keypoints to classify 40 daily activities under strict cross-subject evaluation. Traditional architectures (CNN, RNN, Transformer) are encouraged; large pretrained foundation models are not permitted — placing the spotlight on architecture design, sensor fusion, and inference efficiency.

Task: 40-class HAR · Cross-Subject

Cross-Subject Split: Train: user 1–9, 16–24 · Test: user 10–11, 25–26

Primary Modalities: Depth, IMU, mmWave, Skeleton, IR, Thermal

Model Constraint: CNN / RNN / Transformer · model size ≤ 100 MB · no large pretrained backbones

Prize Pool: USD $10,000
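
The 100 MB cap can be checked locally before submitting. The sketch below is a minimal example, assuming the limit applies to the on-disk size of the saved checkpoint and that 1 MB means 10^6 bytes; how the organizers actually measure model size is not stated here, and `checkpoint_within_limit` is a hypothetical helper name.

```python
import os
import tempfile

# Assumption: "100 MB" is measured on the saved checkpoint file, 1 MB = 10**6 bytes.
MAX_MODEL_BYTES = 100 * 10**6


def checkpoint_within_limit(path, limit_bytes=MAX_MODEL_BYTES):
    """Return True if the file at `path` is no larger than `limit_bytes`."""
    return os.path.getsize(path) <= limit_bytes


# Demo on a throwaway 1 KiB file standing in for a real checkpoint.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "model.pth")
    with open(demo_path, "wb") as f:
        f.write(b"\0" * 1024)
    demo_ok = checkpoint_within_limit(demo_path)

print(demo_ok)
```

In practice the same call would point at `checkpoints/model.pth` from the submission package.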

Final Scoring Breakdown (for finalists)

Kaggle Private LB

20%

Final On-Site Private Test

30%

Reproducibility (Stage 2)

10%

Technical Report

20%

Presentation

10%

Model Efficiency

10%
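
The breakdown above is a weighted sum whose weights total 100%. As a rough illustration, the sketch below combines six component scores into one final score. It assumes every component is already normalized to a 0–100 scale; the page does not say how components are normalized, and the dictionary keys are hypothetical names chosen for this example.

```python
# Weights in percent, taken from the scoring breakdown above.
WEIGHTS_PERCENT = {
    "kaggle_private_lb": 20,
    "onsite_private_test": 30,
    "reproducibility": 10,
    "technical_report": 20,
    "presentation": 10,
    "model_efficiency": 10,
}
assert sum(WEIGHTS_PERCENT.values()) == 100


def final_score(components):
    """Weighted sum of component scores (assumed to be on a 0-100 scale)."""
    missing = set(WEIGHTS_PERCENT) - set(components)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS_PERCENT[k] * components[k] for k in WEIGHTS_PERCENT) / 100


# Made-up example scores, for illustration only.
example = {
    "kaggle_private_lb": 80,
    "onsite_private_test": 70,
    "reproducibility": 100,
    "technical_report": 90,
    "presentation": 85,
    "model_efficiency": 60,
}
print(final_score(example))  # 79.5
```

Because the on-site private test carries the largest weight, a model that generalizes to unseen subjects gains more than one tuned only to the Kaggle leaderboard.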

Large Model Track

VQA · HAU & HARn

LVLMs on depth & non-RGB modalities

Pushes the frontier of Large Vision-Language Models on non-RGB modalities.

Participants tackle Human Action Understanding (HAU) and Human Action Reasoning (HARn) through privacy-preserving video Visual Question Answering. There is no parameter limit, encouraging exploration of prompt design, modality alignment, and fine-tuning at scale.

Task: VQA on privacy-preserving videos

Cross-Subject Split: Train: user 1–9, 16–24 · Test: user 10–11, 25–26

Primary Modalities: Depth, Thermal, IR

Model Constraint: No parameter limit · LVLMs encouraged

Prize Pool: USD $10,000

Final Scoring Breakdown (for finalists)

Kaggle Private LB

20%

Final On-Site Private Test

30%

Reproducibility (Stage 2)

10%

Technical Report

20%

Presentation

10%

Model Efficiency

10%

Three-Stage Competition · Both Tracks

The CUHK-X Challenge runs across three stages: a 3-month Kaggle phase, a Zoom-based selection stage to verify reproducibility and prevent cheating, and the on-site finals at UbiComp 2026 in Shanghai. Top 15 teams per track on the Kaggle private leaderboard advance to selection; top 6 then advance to finals.

Kaggle Open Competition

Jun 20 – Sep 15, 2026 · 3 months

Submit prediction CSVs to Kaggle
Public leaderboard for reference; private leaderboard decides ranking
Certificate awards: Excellence (Top 15%), Distinction (Top 30%), Successful Participation
Top 15 teams per track advance to Selection Stage

Selection Stage · Zoom Verification

Sep 16 – Sep 30, 2026 · 2 weeks

Top 15 submit code + checkpoint + inference.sh via email/Drive
Zoom session: live inference on sample data (released at session start)
Sample data includes both seen and unseen subjects
Organizers reproduce Kaggle result offline to verify weights
Top 6 teams per track pass to UbiComp finals

UbiComp 2026 Finals

Oct 11, 2026 · Shanghai · 1 day

On-site inference on the organizers' brand-new private dataset (cross-subject)
Mandatory: 15-min technical report presentation + Q&A
Awards ceremony + USD $500 travel grant per attending finalist team
Remote participation via Zoom supported if travel is not possible

Prizes

Awards

Each track carries an independent prize pool of USD $10,000. The following prizes apply independently to both the Small Model Track and the Large Model Track.

🥇

1st Place × 1

$6,000

Highest overall score

🥈

2nd Place × 1

$3,000

Second-ranked team

🥉

3rd Place × 1

$1,000

Third-ranked team

📄

Best Report Award

TBD

Selected by review committee

⭐

Tier	Award	Eligibility
🏆	Outstanding Award	UbiComp Finals Top 6
🎖️	Finalist Award	Kaggle Private LB Top 15
🎗️	Excellence Award	Kaggle Private LB Top 15% (excl. Top 15)
📜	Distinction Award	Kaggle Private LB Top 30% (excl. above)
✨	Successful Participation Award	Teams with ≥ 1 valid submission
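
The tier table can be read as a cascade from the highest tier down. The sketch below is one possible reading, not the organizers' official procedure: it assumes 1-based ranks on the Kaggle private leaderboard, that percentage cutoffs round up to a whole number of teams, and that the team has at least one valid submission. `percent_cutoff` and `award_tier` are hypothetical helper names.

```python
def percent_cutoff(total_teams, percent):
    """Number of ranks covered by `percent` of teams, rounded up (assumption)."""
    return (total_teams * percent + 99) // 100


def award_tier(kaggle_rank, total_teams, finals_top6=False):
    """Return the highest tier a ranked team qualifies for."""
    if finals_top6:
        return "Outstanding Award"
    if kaggle_rank <= 15:
        return "Finalist Award"
    if kaggle_rank <= percent_cutoff(total_teams, 15):
        return "Excellence Award"
    if kaggle_rank <= percent_cutoff(total_teams, 30):
        return "Distinction Award"
    return "Successful Participation Award"


# With 200 teams the cutoffs fall at rank 30 (15%) and rank 60 (30%).
for rank in (10, 30, 31, 61):
    print(rank, award_tier(rank, total_teams=200))
```

Integer arithmetic is used for the cutoffs so that boundary ranks are not affected by floating-point rounding.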

✈️ Travel Grants for UbiComp 2026 Finalists

Flat travel grant for every finalist team attending in person at UbiComp 2026 Shanghai

✈️

All Finalist Teams (Top 6 per track)

USD $500 / team

In-person attendance is strongly encouraged. Teams unable to travel may join remotely via Zoom — organizers will operate the projector and coordinate live Q&A on their behalf. Remote teams remain eligible for prizes and awards, but travel grants only apply to teams attending in person.

Schedule

Challenge Timeline

May 23
2026

Website Released

Official CUHK-X Challenge website goes live with full timeline, track details, and dataset overview.

Jun 20
2026

Competition Launch

Both Kaggle competitions open for registration and submissions. Dataset publicly released. Promotion channels go live simultaneously.

Sep 15
2026

Kaggle Leaderboard Freeze

Public submissions close. Top 15 teams per track on the Kaggle private leaderboard are notified and required to upload code + checkpoint within 48 hours.

Sep 16–30
2026

Selection Stage · Zoom Verification

Top 15 teams attend a Zoom verification session where they run live inference on freshly released sample data. Organizers also reproduce Kaggle results offline using the submitted code. Teams whose accuracy deviates by more than 10% from their private LB score are disqualified.

Oct 1
2026

Final Top 6 Announced

Top 6 teams per track passing verification are officially invited to UbiComp 2026 finals. Final Technical Report due by this date.

Oct 11
2026

UbiComp 2026 Finals · Shanghai

Finalists run inference on a brand-new private dataset (cross-subject, never released). 15-min technical report presentation + Q&A. Awards ceremony same day. Teams unable to travel may participate via Zoom.

How to participate

Registration

Step 1. Register your team on this official website using the form below. Registration is required to be eligible for prizes, announcements, and finals invitations.

Step 2. Join the competition on Kaggle and create your team there with the same team name.

⚠ Your Kaggle team name must exactly match the team name registered here; otherwise your certificate and shortlist notification cannot reach you.

Step 1 — Register Your Team

Submit your team info so we can keep you in the loop on competition news, finals logistics, prize coordination, and last-minute announcements. Takes about 2 minutes to fill out.

Fields collected

Team Name
Contact Email
Affiliation
Country / Region
Track(s) of interest
Team Members (1–3)
Faculty Advisor

Step 2 — Join on Kaggle (use the same team name)

Small Model Track

Lightweight HAR

Multimodal action recognition under resource constraints. Build efficient CNN / RNN / Transformer architectures — no large pretrained backbones permitted.

Large Model Track

VQA · HAU & HARn

Vision-language models on depth and non-RGB modalities. Tackle 6,160 VQA questions across five reasoning types — no parameter limit.

Registration Steps

Create Kaggle Account

Join Competition

Accept the competition rules on one or both Kaggle pages.

Form Your Team

Solo entry or up to 3 members. Merge teams via Kaggle's UI before deadline.

Download & Submit

Access the dataset, train your model, and submit predictions to the leaderboard.

Eligibility

Open to students, researchers, and industry teams worldwide
Cross-institution and cross-country teams are permitted
Members of the AIoT Lab and direct collaborators are ineligible for prizes
Participants must comply with Kaggle's terms of service

Team Rules

Team size: 1–3 members (faculty advisor not counted)
Each individual may join only one team per track
Participation in both tracks is allowed
Team mergers lock 7 days before the submission deadline
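
As a quick worked example of the merger rule, the snippet below computes the lock date. It assumes that "the submission deadline" refers to the Sep 15, 2026 Kaggle leaderboard freeze listed in the timeline; the exact cutoff time of day is not stated on this page.

```python
from datetime import date, timedelta

# Assumption: the submission deadline is the Sep 15, 2026 leaderboard freeze.
SUBMISSION_DEADLINE = date(2026, 9, 15)
MERGER_LOCK_DATE = SUBMISSION_DEADLINE - timedelta(days=7)

print(MERGER_LOCK_DATE.isoformat())  # 2026-09-08
```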

The data behind the challenge

Dataset Introduction

Most large vision-language models still depend almost entirely on RGB data, while modalities such as depth, thermal imaging, IMU, and millimeter-wave radar remain severely underrepresented. The root cause is a lack of large-scale, high-quality paired multimodal datasets.

CUHK-X, built by the AIoT Lab at CUHK, addresses this gap with 64,267 samples across seven fully synchronized modalities collected from 30 participants performing 40 daily activities across two real-world indoor environments. Annotations follow a Ground-Truth First strategy, combining LLM-generated scene descriptions with human review to ensure temporal and logical consistency. The dataset supports three progressive tasks: HAR (action classification), HAU (action understanding), and HARn (action reasoning).

Total Samples

64,267 fully synchronized recordings

Participants

30 (diverse age and gender)

Environments

2 real-world indoor settings

Action Classes

40 daily life activities

Dataset Modalities

RGB · Depth · Thermal · Infrared · Skeleton · IMU (×5) · mmWave

Challenge Modalities

All except RGB

Benchmark Tasks

HAR · HAU · HARn

Annotation Strategy

Ground-Truth First · LLM + human review

Cross-Subject Split — 30 Participants

Training Set

user 1–9 · 16–24

Public Test Set

user 10–11 · 25–26

Private Test Set

held out
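
A cross-subject split assigns every recording to a partition by the ID of the person who performed it, so no subject appears in both training and test data. The sketch below encodes the user IDs listed above. It is a minimal illustration: the page does not say which IDs make up the private test set, so any other ID is reported as "unlisted" rather than guessed, and `split_of` is a hypothetical helper name.

```python
# User IDs as listed for the challenge split.
TRAIN_USERS = set(range(1, 10)) | set(range(16, 25))  # 1-9 and 16-24
PUBLIC_TEST_USERS = {10, 11, 25, 26}

# A cross-subject split requires the partitions to share no subjects.
assert TRAIN_USERS.isdisjoint(PUBLIC_TEST_USERS)


def split_of(user_id):
    """Map a user ID to its partition name."""
    if user_id in TRAIN_USERS:
        return "train"
    if user_id in PUBLIC_TEST_USERS:
        return "public_test"
    return "unlisted"  # not assigned on this page; do not assume a partition


print(len(TRAIN_USERS), len(PUBLIC_TEST_USERS))  # 18 4
print(split_of(5), split_of(25), split_of(12))   # train public_test unlisted
```

Filtering samples with `split_of` before building data loaders is one simple way to keep a local validation scheme consistent with the official split.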

🌐

CUHK-X Dataset Homepage · Train & test splits · 30 participants

Modalities — Available in Dataset

RGB (challenge-excluded) · Depth · Thermal · Infrared · Skeleton · IMU ×5 · mmWave Radar

Benchmark Tasks

HAR: Action Classification · HAU: Action Understanding · HARn: Action Reasoning

Anti-Cheating & Reproducibility

Verification & Rules

All Top 15 teams per track must pass a Zoom-based verification session before advancing to the UbiComp 2026 finals. The verification involves live inference on freshly released sample data, plus offline reproduction of Kaggle results by organizers.

What you submit

Submission Package

By Sep 22, 23:59 UTC, Top 15 teams upload:

code/ — full training and inference code
checkpoints/model.pth — final model weights
inference.sh — single entry script (data_dir → CSV)
README.md — reproducibility artifact
honor_declaration.pdf — signed
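
Before uploading, a team may want to confirm that the package contains every listed item. The sketch below is a minimal local check, assuming the five items sit directly under one package root with exactly the names listed above; `missing_items` is a hypothetical helper, not an organizer-provided tool.

```python
import tempfile
from pathlib import Path

# Names taken from the submission package list above.
REQUIRED_DIRS = ["code"]
REQUIRED_FILES = [
    "checkpoints/model.pth",
    "inference.sh",
    "README.md",
    "honor_declaration.pdf",
]


def missing_items(root):
    """Return the required directories and files absent under `root`."""
    root = Path(root)
    missing = [d + "/" for d in REQUIRED_DIRS if not (root / d).is_dir()]
    missing += [f for f in REQUIRED_FILES if not (root / f).is_file()]
    return missing


# Demo: a package that so far holds only code/ and README.md.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "code").mkdir()
    (Path(tmp) / "README.md").write_text("How to reproduce\n")
    demo_missing = missing_items(tmp)

print(demo_missing)
```

An empty list means all five items are present; the check says nothing about whether the contents actually reproduce the Kaggle result.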

What happens in Zoom

Live Verification Session

A 45-min recorded session per team. Sample data link released at session start. Teams have ≤ 2 hours to complete inference and submit results.

Sample data = seen + unseen subjects (Part A + Part B)
Committee follows your README step-by-step
Pass = ≤ 10% accuracy gap from Kaggle private LB
Failed teams replaced by next-ranked team
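
The pass criterion can be expressed as a simple threshold test. The sketch below assumes the 10% gap is measured in absolute percentage points and applies in both directions (a live score far above the leaderboard score also counts as a gap); the page does not specify either detail, and `passes_verification` is a hypothetical name.

```python
def passes_verification(private_lb_acc, live_acc, max_gap=10.0):
    """Check the gap between two accuracies given in percent (0-100).

    Assumption: the gap is in absolute percentage points and symmetric.
    """
    return abs(private_lb_acc - live_acc) <= max_gap


print(passes_verification(85.0, 78.5))  # True
print(passes_verification(85.0, 70.0))  # False
```

If the organizers instead intend a relative gap, the comparison would divide the difference by the private leaderboard accuracy before applying the threshold.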

Track-Specific Rules

Both Tracks · Strictly Forbidden

Manual labeling of test samples
Using test set ground-truth labels in training (any form)
Multi-account registration or collusion between teams

Small Model Track · Additional Restrictions

No large pretrained backbones
No closed-source APIs or LLMs for development
No API / LLM labeling of training data

Large Model Track

Any pretrained model allowed (including LVLMs)
Closed-source APIs encouraged
LLM-based pseudo-labeling of training data permitted
Prompt engineering is part of the expected toolkit

IP & Code Usage (Kaggle Standard)

Participants retain full copyright on all submitted code and models
All competition data are the exclusive property of AIoT Lab. Please download and review the detailed data license and terms here: LICENSE.txt
Non-finalist code is destroyed after the competition — no public release
Finalist teams (Champion Award, Top 6 per track) must open-source their solution under Apache 2.0 license within 30 days of UbiComp 2026 finals — standard Kaggle Winner License practice
Finalist teams unwilling to open-source may decline their finalist status; the slot is then passed to the next-ranked team