Publicly Shared Datasets
While some parts of our database are confidential, the following datasets are available for download upon request.

SIXray-D Annotations – Annotations for the detection of prohibited items in the SIXray dataset. The set consists of 10,925 annotation files covering 5 classes: gun, knife, wrench, pliers, and scissors.
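
Per-image annotation files for detection datasets are commonly stored as PASCAL-VOC-style XML; as a minimal sketch under that assumption (the exact field layout of the SIXray-D files may differ), one file could be parsed into class/bounding-box pairs like this:

```python
import xml.etree.ElementTree as ET

# The five prohibited-item classes listed for SIXray-D.
CLASSES = {"gun", "knife", "wrench", "pliers", "scissors"}

def parse_annotation(xml_text):
    """Parse one VOC-style annotation into (class, (xmin, ymin, xmax, ymax)) tuples."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        if name not in CLASSES:
            continue  # ignore anything outside the five annotated classes
        bb = obj.find("bndbox")
        box = tuple(int(bb.findtext(t)) for t in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((name, box))
    return boxes

# Illustrative annotation snippet (not from the actual dataset).
sample = """<annotation>
  <object><name>knife</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>90</ymax></bndbox>
  </object>
</annotation>"""
print(parse_annotation(sample))  # → [('knife', (10, 20, 110, 90))]
```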

For details, please visit

Warwick-NTU Multi-camera Forecasting (WNMF) database – The WNMF dataset contains footage of individuals traversing the Nanyang Technological University (NTU) campus. It includes cross-camera tracking information, so that future trajectories of individuals can be anticipated across multiple camera views. The data was collected over a span of 20 days using 15 different CCTV cameras, in uncontrolled, real-world conditions.

For details, please visit

NTU CCTV-Fights Dataset – The CCTV-Fights dataset contains 1,000 videos of real-world fights, recorded by CCTV or mobile cameras. We also provide frame-level annotation of each fight instance segment in the videos, with its exact starting and ending points.

For details, please visit

ROSE-Youtu Face Liveness Detection Dataset – We introduce a new and comprehensive face anti-spoofing database, the ROSE-Youtu Face Liveness Detection Database (ROSE-Youtu), which covers a large variety of illumination conditions, camera models, and attack types. It consists of 4,225 videos of 25 subjects in total; 3,350 videos from 20 subjects (5.45 GB) are publicly available.

SIR2 Benchmark Dataset – We propose the Single Image Reflection Removal (SIR2) benchmark dataset, with a large number and a great diversity of mixture images, together with ground truth for both background and reflection. The dataset includes controlled scenes taken indoors and wild scenes taken outdoors. One part of the controlled scenes is composed of solid objects, using commonly available daily-life items (e.g., ceramic mugs, plush toys, fruits) for both the background and the reflected scenes. The other part uses five different postcards and combines them in a pair-wise manner, with each card serving in turn as background and as reflection. The wild scenes contain real-world objects of complex reflectance (cars, tree leaves, glass windows, etc.), at various distances and scales (residential halls, gardens, lecture rooms, etc.), and under different illuminations (direct sunlight, cloudy skylight, and twilight).
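
The pair-wise postcard protocol described above can be sketched directly: with each of the five cards usable as either background or reflection, there are 5×4 = 20 ordered combinations (card names here are placeholders, not the actual postcard identifiers):

```python
from itertools import permutations

# Placeholder identifiers for the five postcards.
postcards = ["card1", "card2", "card3", "card4", "card5"]

# Ordered pairs: the first element acts as background, the second as reflection.
pairs = list(permutations(postcards, 2))
print(len(pairs))  # → 20
```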

Action Recognition Dataset – The NTU RGB+D action recognition dataset consists of 56,880 action samples, each containing an RGB video, a depth map sequence, 3D skeletal data, and an infrared video, captured concurrently by 3 Microsoft Kinect v2 cameras. The RGB videos are 1920×1080, the depth maps and IR videos are 512×424, and the 3D skeletal data contains the three-dimensional locations of 25 major body joints at each frame. The total size of the dataset is 1.3 TB.
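
As a minimal sketch of working with the skeletal modality, assuming one sample's skeleton stream has been loaded into a `(num_frames, 25, 3)` array of (x, y, z) joint coordinates (the on-disk `.skeleton` file format itself is not reproduced here), a single joint's trajectory across frames can be sliced out like this:

```python
import numpy as np

NUM_JOINTS = 25  # 25 major body joints per frame, as stated above

def joint_trajectory(skeleton, joint_index):
    """Return the (num_frames, 3) trajectory of one joint across all frames."""
    assert skeleton.shape[1:] == (NUM_JOINTS, 3)
    return skeleton[:, joint_index, :]

# Toy sample: 4 frames of all-zero joint coordinates.
sample = np.zeros((4, NUM_JOINTS, 3))
traj = joint_trajectory(sample, joint_index=0)
print(traj.shape)  # → (4, 3)
```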

Video Object Instance Dataset – The Video-Object-Instance (NTU-VOI) dataset from NTU’s ROSE Lab is provided for the evaluation of object instance search and localization in large-scale videos. It consists of 146 ground-truth video clips with bounding-box annotations of object instances in each frame. The total download size of the videos is ~222 MB.

Recaptured Images Dataset – The images in the database are captured using cameras from 5 different brands (Canon, Casio, Lumix, Nikon, and Sony), comprising 2,000 natural images and 2,700 finely recaptured images. Resolutions range from 2272×1704 to 4256×2832.

For details, please visit