Enhancing aerial vehicle detection in transportation monitoring using spatiotemporal object detection models
Date: 2023-06-01
Author: Telegraph, Kristina
Publisher: Πανεπιστήμιο Κύπρου, Πολυτεχνική Σχολή / University of Cyprus, Faculty of Engineering
Place of publication: Cyprus
Abstract
Image object detection has advanced rapidly in recent years, which has led to its adaptation to the domain of video. However, the major advances based on single-shot deep learning models process each frame individually. Relying on spatial information alone can be problematic when there are occlusions, blurred or unclear backgrounds, limited information in low-resolution footage, and changing lighting conditions, all of which are common in transportation monitoring applications. Overcoming these problems requires incorporating both spatial and temporal information into the detection process. To address this challenge, several spatiotemporal detection models were investigated, which use sequences of video frames and explicit motion cues to build better representations of the scene context. First, a representative custom dataset of video sequences of aerial road network footage, captured from an unmanned aerial vehicle, was collected and annotated with three vehicle classes for model training and validation. Then, different spatiotemporal models were developed and incorporated into the YOLO framework. Overall, the spatiotemporal models show a significant improvement in results: the best model achieves a mean average precision (mAP50) of 83.1% across all classes, a 16.22% improvement over its corresponding single-frame model. The addition of attention mechanisms to the spatiotemporal models’ architecture was also explored. Inference tests were carried out to compare the models qualitatively and in terms of inference speed. Finally, it was concluded that adding temporal information to deep learning object detectors is an effective approach to improving vehicle detection in aerial video data.
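For readers unfamiliar with the mAP50 metric quoted in the abstract: the "50" refers to an intersection-over-union (IoU) threshold of 0.5, meaning a predicted box counts as a correct detection only if it overlaps a ground-truth box of the same class with IoU of at least 0.5. The following is a minimal sketch of that overlap test only, not the thesis's evaluation code. The function names `iou` and `is_true_positive` and the `(x1, y1, x2, y2)` box format are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2
    (an assumed format, chosen for illustration).
    """
    # Corners of the overlapping rectangle, if any.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes give zero intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.5):
    """Overlap test used at the mAP50 operating point (IoU >= 0.5).

    Hypothetical helper: a full evaluation also matches classes,
    ranks predictions by confidence, and averages precision per class.
    """
    return iou(pred_box, gt_box) >= threshold
```

For example, `is_true_positive((0, 0, 10, 10), (2, 0, 12, 10))` holds because the boxes share 80 of 120 units of combined area, while shifting the second box to `(5, 0, 15, 10)` drops the IoU to one third and the prediction no longer counts as a match.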