Abstract

Object detectors have improved in recent years, achieving better results and faster inference times. However, small object detection remains a problem without a definitive solution. Automatic weapon detection in closed-circuit television (CCTV) footage has been studied recently and is extremely useful in security, counter-terrorism, and risk mitigation. This article presents a new dataset obtained from a real CCTV system installed in a university, together with generated synthetic images. Faster R-CNN with a Feature Pyramid Network and ResNet-50 was applied to these data, resulting in a weapon detection model usable in quasi-real-time CCTV (90 ms of inference time on an NVIDIA GeForce GTX-1080Ti card) that improves the state of the art in weapon detection through two-stage training. An exhaustive experimental study of the detector on these datasets was performed, showing the impact of synthetic datasets on the training of weapon detection systems, as well as the main limitations these systems currently present. The generated synthetic dataset and the real CCTV dataset are available to the whole research community.

Datasets

This study presents two new datasets: the “Mock attack dataset” and the “Unity synthetic dataset”.

Mock attack dataset

This dataset was collected during a mock attack and manually annotated, after obtaining all the permissions from our university and the security personnel. The details below describe each camera used during the mock attack and the scenario it covers.

The data acquisition infrastructure consists of three surveillance cameras placed at different locations in the same area, covering two corridors and one entrance and thus forming different scenarios. The description of each camera is as follows:

[Image: description of each camera]

Unity synthetic dataset

This dataset was generated by modeling, in the Unity game engine, a scenario that emulates part of a city and an educational center within it. Several cameras capture the movements of multiple characters, built from 11 different models and 7 animations. The resulting images include 11 different objects: 4 types of handguns, 5 types of rifles, a knife, and a smartphone. The dataset consists of three splits with 500 (U0.5), 1000 (U1), and 2500 (U2.5) images.
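The split names appear to encode the image count in thousands (U0.5 for 500 images, U2.5 for 2500). The sketch below checks that reading against the three sizes listed above. The helper `size_from_name` and the dictionary `SPLIT_SIZES` are hypothetical names introduced only for illustration; they are not part of the dataset or of any tooling released with it.

```python
# Split names and sizes are taken from the dataset description.
# Reading the name as "thousands of images" is an assumption.
SPLIT_SIZES = {"U0.5": 500, "U1": 1000, "U2.5": 2500}


def size_from_name(split_name: str) -> int:
    """Infer the image count from a split name such as 'U2.5'."""
    if not split_name.startswith("U"):
        raise ValueError(f"unexpected split name: {split_name!r}")
    # The part after 'U' is interpreted as a count in thousands.
    return round(float(split_name[1:]) * 1000)


# Confirm that the naming convention matches every documented split.
for name, size in SPLIT_SIZES.items():
    assert size_from_name(name) == size
```

A lookup like this can be handy when scripting experiments over the three splits, but the documented sizes remain the reference if the convention turns out not to hold.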

Download Datasets

Terms of use

This dataset can be used free of charge for academic research, provided the paper is cited as explained below. If you wish to use the data for commercial purposes, please contact us.

Citation

If you use our dataset, please cite the following paper: Real-time gun detection in CCTV: An open problem. Neural Networks (2020), doi: https://doi.org/10.1016/j.neunet.2020.09.013.

@article{SalazarGonzalez2020,
title = "Real-time gun detection in CCTV: An open problem",
journal = "Neural Networks",
year = "2020",
issn = "0893-6080",
doi = "https://doi.org/10.1016/j.neunet.2020.09.013",
url = "http://www.sciencedirect.com/science/article/pii/S0893608020303361",
author = "Salazar Gonz{\'{a}}lez, Jose L. and Zaccaro, Carlos and {\'{A}}lvarez-Garc{\'{i}}a, Juan A. and Soria-Morillo, Luis M. and Sancho Caparrini, Fernando",
}

License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. Contact the authors of this work for commercial use.
