Unmanned aircraft systems, or drones, enable us to record or capture many scenes from a bird’s-eye view, and they have been rapidly deployed across a wide range of practical domains, e.g., agriculture, aerial photography, fast delivery, and surveillance. Object detection is one of the core steps in understanding videos collected from drones. However, this task is very challenging due to the unconstrained viewpoints and low resolution of the captured videos. While modern deep-learning object detectors have recently achieved great success on general benchmarks, e.g., PASCAL-VOC and MS-COCO, their robustness on aerial images captured by drones is not well studied. In this paper, we present an evaluation of state-of-the-art deep-learning detectors, including Faster R-CNN (Faster Regional CNN), RFCN (Region-based Fully Convolutional Networks), SNIPER (Scale Normalization for Image Pyramids with Efficient Resampling), SSD (Single-Shot Detector), YOLO (You Only Look Once), RetinaNet, and CenterNet, for object detection in videos captured by drones. We conduct experiments on the VisDrone2019 dataset, which contains 96 videos with 39,988 annotated frames, and provide insights into efficient object detectors for aerial images.
Object detection is a fundamental yet difficult task in image processing and computer vision research. It has been an important research topic for decades, and its development over the past two decades can be regarded as an epitome of computer vision history. Because it plays a principal role in understanding the context of images, object detection is considered a prerequisite step that lets a computer detect various objects. Given a test image, an object detector localizes the coordinates of the objects and assigns each one a label for its object category, e.g., human, dog, or cat. The coordinates of a detected object represent the object’s bounding box. Object detection has many applications in robot vision, autonomous driving, human-computer interaction, and intelligent video surveillance. Deep-learning technology has brought significant breakthroughs in recent years; in particular, these techniques have driven remarkable progress in object detection. Object detection can target a specific instance, e.g., Obama’s face, the Eiffel Tower, or the Golden Gate Bridge, or objects of specific categories, e.g., humans, cars, or bicycles. Historically, object detection has mainly focused on a single category, for example, the person class. In recent years, the research community has started moving toward categories beyond the well-known ones such as person, cat, or dog.
Object detectors face several common challenges on aerial images: viewpoint, illumination, scale variation, perspective, intra-class variation, low resolution, and occlusion. In pedestrian detection, for example, the main challenges come from crowded scenes with heavy overlaps, occlusion, and low-resolution images. Generic object detection has received significant attention. There are many competition benchmarks, e.g., PASCAL-VOC, the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), MS-COCO, and VisDrone-DET. There are also several notable studies on specific object detection, such as face detection, pedestrian detection, and vehicle detection. Recently, the research community has focused on deep learning and its applications to object recognition and detection tasks. In the past few years, Convolutional Neural Networks (CNNs) have brought breakthroughs in speech, audio, image, and video processing, and they have driven notable progress in visual recognition and object detection. Many successful CNN architectures, e.g., OverFeat, R-CNN, Fast R-CNN, Faster R-CNN, SSD, RFCN, YOLO, YOLOv2, RetinaNet, YOLOv3, and SNIPER, have performed well on the task of object detection. For example, SNIPER, RetinaNet, and YOLOv3 are among the top models for object detection on the MS-COCO dataset, with mAP of 46.1%, 40.8%, and 33.0%, respectively. Object detection is now widely used in many practical scenarios, with use cases ranging from protecting personal security to boosting productivity in the workplace. While most work has addressed the challenges of conventional viewpoints, recent years have seen increasing interest in flying drones and their applications in healthcare, video surveillance, search-and-rescue, and agriculture. Drones are now common devices that enable us to record or capture many scenes from a bird’s-eye view.
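The detector output described earlier (bounding-box coordinates plus a category label) can be illustrated with a small sketch. This is not code from any of the evaluated detectors: the `Detection` class and the `iou` function are hypothetical names introduced here, and the (x_min, y_min, x_max, y_max) box convention is an assumption. Intersection-over-union (IoU) is included because it is the overlap measure that benchmarks such as PASCAL-VOC and MS-COCO rely on when computing mAP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """Hypothetical container for one detected object."""
    x_min: float  # left edge of the bounding box (assumed pixel coordinates)
    y_min: float  # top edge
    x_max: float  # right edge
    y_max: float  # bottom edge
    label: str    # object category, e.g. "car" or "pedestrian"
    score: float  # detector confidence in [0, 1]

    def area(self) -> float:
        # Clamp at zero so a degenerate box never yields a negative area.
        return max(0.0, self.x_max - self.x_min) * max(0.0, self.y_max - self.y_min)


def iou(a: Detection, b: Detection) -> float:
    """Intersection-over-union of two bounding boxes, in [0, 1]."""
    inter_w = min(a.x_max, b.x_max) - max(a.x_min, b.x_min)
    inter_h = min(a.y_max, b.y_max) - max(a.y_min, b.y_min)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0  # the boxes do not overlap
    inter = inter_w * inter_h
    union = a.area() + b.area() - inter
    return inter / union if union > 0 else 0.0


# Made-up boxes, purely for illustration.
predicted = Detection(10, 10, 50, 50, label="car", score=0.9)
ground_truth = Detection(30, 30, 70, 70, label="car", score=1.0)
overlap = iou(predicted, ground_truth)
print(f"IoU = {overlap:.3f}")
```

A prediction is typically counted as correct when its label matches the ground truth and the IoU exceeds a threshold; PASCAL-VOC uses 0.5, while MS-COCO averages over several thresholds.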