BrnoCompSpeed: Review of Traffic Camera Calibration and A Comprehensive Dataset for Monocular Speed Measurement [arXiv] (IEEE ITS - under review)Abstract: In this paper, we focus on visual speed measurement from a single monocular camera, which is an important task of visual traffic surveillance. Existing methods addressing this problem are hard to compare due to lack of a common dataset with reliable ground truth. Therefore, it is not clear how the methods compare in various aspects and what are the factors affecting their performance. We captured a new dataset of 18 full-HD videos, each around one hour long, captured at 6 different locations. Vehicles in videos (20,865 instances in total) are annotated with precise speed measurements from optical gates using LIDAR and verified with several reference GPS tracks. We provide the videos and metadata (calibration, lengths of features in image, annotations, etc.) for future comparison and evaluation. Camera calibration is the most crucial part of the speed measurement; therefore, we analyze a recently published method for fully automatic camera calibration and vehicle speed measurement and report the results in detail on this dataset.
Abstract: Detection of vehicles in traffic surveillance needs good and large training datasets in order to achieve competitive detection rates. We are showing an approach to automatic synthesis of custom datasets, simulating various major influences: viewpoint, camera parameters, sunlight, surrounding environment, etc. Our goal is to create a competitive vehicle detector which “has not seen a real car before.” We are using Blender as the modeling and rendering engine. A suitable scene graph accompanied by a set of scripts was created, that allows simple configuration of the synthesized dataset. The generator is also capable of storing rich set of metadata that are used as annotations of the synthesized images. We synthesized several experimental datasets, evaluated their statistical properties, as compared to real-life datasets. Most importantly, we trained a detector on the synthetic data. Its detection performance is comparable to a detector trained on state-of-the-art real-life dataset. Synthesis of a dataset of 10,000 images takes only several hours, which is much more efficient, compared to manual annotation, let aside the possibility of human error in annotation.