BoxCars: 3D Boxes as CNN Input for Improved Fine-Grained Vehicle Recognition [CVPR 2016]

Jakub Sochor, Adam Herout, Jiří Havel
GRAPH@FIT, Brno University of Technology
Corresponding authors: {ispanhel; herout} [at]


We address the problem of fine-grained vehicle make and model recognition and verification. Our contribution is to show that extracting additional data from the video stream (besides the vehicle image itself) and feeding it into a deep convolutional neural network considerably boosts recognition performance. This additional information includes the 3D vehicle bounding box used for “unpacking” the vehicle image, its rasterized low-resolution shape, and information about the 3D vehicle orientation. Experiments show that adding such information decreases classification error by 26% (accuracy improves from 0.772 to 0.832) and raises verification average precision to 208% of the baseline value (0.378 to 0.785), compared to a baseline pure CNN without any input modifications. The pure baseline CNN also outperforms the recent state-of-the-art solution by 0.081. We provide “BoxCars”, an annotated set of surveillance vehicle images augmented with various automatically extracted auxiliary information. Our approach and the dataset can considerably improve the performance of traffic surveillance systems.
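The percentages quoted above can be checked from the four reported numbers. The following is only a small arithmetic sketch: the variable names are illustrative and not taken from the paper, and it assumes that classification error is simply one minus accuracy.

```python
# Reported values from the abstract (baseline CNN vs. CNN with the added inputs).
acc_base, acc_boxes = 0.772, 0.832   # classification accuracy
ap_base, ap_boxes = 0.378, 0.785     # verification average precision

# Assumption: classification error = 1 - accuracy.
err_base = 1.0 - acc_base
err_boxes = 1.0 - acc_boxes

# Relative reduction of the classification error.
error_reduction = (err_base - err_boxes) / err_base

# Ratio of the improved average precision to the baseline one.
ap_ratio = ap_boxes / ap_base

print(f"relative error reduction: {error_reduction:.1%}")
print(f"average precision ratio:  {ap_ratio:.1%}")
```

Under these assumptions the error reduction rounds to 26%, and the 208% figure corresponds to the ratio of the two average precision values.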


  • Paper (CVF Open Access repository)
  • BoxCars21k – dataset used for training and evaluation of the algorithm in the paper.
    • Dataset details can be found in the paper
    • See the README in the zip file for further information about the dataset structure
    • Some statistics about the dataset:
      • 21,250 vehicles (63,750 images)
      • 27 different makes
      • 148 make & model & submodel + model year classes
    • The dataset is for non-commercial usage only
    • For commercial use, please contact us at {ispanhel, herout} [at]
  • All source code will be published together with the journal version of the paper, which we are currently working on.

Dataset License

Except where otherwise noted, this work is licensed under
© 2016, SOCHOR, HEROUT, HAVEL. Some Rights Reserved.


@InProceedings{Sochor2016,
 Title = {{BoxCars}: {3D} Boxes as {CNN} Input for Improved Fine-Grained Vehicle Recognition},
 Author = {Sochor, Jakub and Herout, Adam and Havel, Jiri},
 Booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
 Year = {2016},
 Month = {June}
}



This work was supported by TACR project “RODOS”, TE01020155.