As drones continue to find use in a wide range of commercial applications, the focus is shifting towards higher levels of automation and tighter integration of drones with business processes, for significantly improved efficiency. Recent advances in AI have enabled computers to make sense of the visual data around them, almost reaching human-level performance in some cases. Some of the tasks enabled by these algorithms include:
- Object detection – identify and locate objects of interest in an image
- Object counting – identify and count objects of interest in an image
- Image segmentation – classify pixels in an image into multiple finite segments to simplify representation
- Change detection – detect changes between two temporally spaced images
- Image classification – classify an image into one of the known categories of images
Fig. Detection and counting of Arabian Oryx from aerial image (actual case study)
The technological potential of drones is being further enhanced by combining autonomous drone tech with AI. Computer vision systems mounted on drones enable them to gather rich visual data in the form of photos or videos. Processing this data using AI yields perspectives and information that would otherwise be either impossible or very expensive to derive using traditional techniques involving human effort.
With the vision to leverage AI for drone applications, the FlytBase platform is being extended to incorporate AI capabilities for processing aerial image data.
The FlytBase AI platform is based in the cloud, where the entire workflow of preparing datasets, training models and deploying trained models for inferencing has been automated. This enables quicker turnaround and faster iterations while a use case is being worked upon. Being in the cloud also helps in scaling the system up at runtime when demand (either for training or for real-time inferencing) increases.
Examples of use cases that can be automated with the FlytBase AI platform are:
- Object counting – e.g. counting the number of Arabian Oryx in an orthomap image. This is an endangered species, and keeping track of its population supports its conservation.
- Object detection – e.g. locating cracks and rust areas from an image of industrial structures.
- Change detection – e.g. detecting changes between two photos of a parking lot taken from almost the same vantage point at different times.
To harness the capabilities of the FlytBase AI platform, customers bring their use case to FlytBase, along with a sufficiently large dataset of training images.
Fig. Preprocessing image training data
The customer-provided data is carefully cropped, labeled and packaged for training, and added to an Image Dataset Library.
Fig. FlytBase AI training workflow
The FlytBase AI model-training workflow consists of:
- Model Library: Hosts object detection models to choose from during training.
- Pre-trained weights library: Hosts weights from previously trained models to borrow representation from.
- Image dataset library: Hosts packaged datasets provided by customers. The raw data is pre-processed for image augmentation and labeling before being added to this library.
Via the above workflow, a user can select the various pieces of the training pipeline and initiate training on one of our GPU-enabled cloud compute nodes. This results in a trained model ready for inferencing.
Fig. Customer does live inferencing via GUI or APIs over secure channels
Once a model is trained, it is deployed on the platform for direct use by our users. Users can do live inferencing either via our web console or by using the REST APIs exposed by the platform. The REST APIs have the added advantage of allowing the platform to be integrated with the customer’s systems for further automation.
The FlytBase AI platform is designed to support multi-tenancy, which enables efficient utilisation of resources and hence cost savings for our customers.
Deep Learning Algorithm
At the heart of the image-processing pipeline are state-of-the-art CNN models employing recent advancements in computer vision and deep learning.
Over the last few years, several object detection models have been published that significantly improve upon the previous generation in terms of accuracy and inference speed. Notable ones are SSD, DetectNet, Fast R-CNN, Faster R-CNN, YOLO and YOLOv2. Similarly, for image classification, ResNet50, VGG16/19 and Inception are some of the most preferred models. Some models are more accurate, while others are faster at inferencing. Model selection weighs these criteria against the customer’s use case.
Fig. Several models can be trained simultaneously, and the best chosen
The pipeline allows several model implementations (same model with different hyper-parameters, or different models altogether) to be trained on the same dataset, simultaneously, so that the best can be chosen. Since different model implementations might need datasets to be arranged in different formats (e.g. from PASCAL VOC to TFRecord format), we have built adapters to transform the data on the fly to suit the model.
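To illustrate what such a format adapter could look like, here is a minimal sketch in Python. It is not the FlytBase implementation. It parses a PASCAL VOC annotation into a framework-neutral record and then renders that record in a second format. To keep the example dependency-free, the target shown is the plain-text YOLO label format rather than TFRecord (an assumption made for illustration), and the function names `parse_voc` and `to_yolo_lines` are hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_voc(xml_text):
    """Parse one PASCAL VOC annotation (XML string) into a neutral dict."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    record = {
        "filename": root.findtext("filename"),
        "width": int(size.findtext("width")),
        "height": int(size.findtext("height")),
        "objects": [],
    }
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        record["objects"].append({
            "label": obj.findtext("name"),
            # Some tools write coordinates as floats, so parse via float().
            "xmin": float(box.findtext("xmin")),
            "ymin": float(box.findtext("ymin")),
            "xmax": float(box.findtext("xmax")),
            "ymax": float(box.findtext("ymax")),
        })
    return record


def to_yolo_lines(record, class_ids):
    """Render a neutral record as YOLO label lines.

    Each line is: class_id x_center y_center width height,
    with all four box values normalised to the range [0, 1].
    """
    w, h = record["width"], record["height"]
    lines = []
    for o in record["objects"]:
        x_center = (o["xmin"] + o["xmax"]) / 2 / w
        y_center = (o["ymin"] + o["ymax"]) / 2 / h
        box_w = (o["xmax"] - o["xmin"]) / w
        box_h = (o["ymax"] - o["ymin"]) / h
        lines.append(
            f"{class_ids[o['label']]} "
            f"{x_center:.6f} {y_center:.6f} {box_w:.6f} {box_h:.6f}"
        )
    return lines
```

Routing every conversion through one neutral record means each new format needs only one reader and one writer, rather than a converter for every pair of formats.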
We use transfer learning to fine-tune off-the-shelf pre-trained models, achieving higher accuracy in detecting our object(s) of interest. This involves removing some layers of the pre-trained model, so as to keep the right level of representation learned from the previous dataset, before training it on the new dataset.
The FlytBase AI platform is agnostic to the particular framework in which the models are implemented (TensorFlow, Caffe, Theano, etc.), by virtue of an abstraction layer. This allows the platform to easily assimilate the best implementations of cutting-edge models coming out of research labs.
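One common way to build such an abstraction layer is a small framework-neutral interface plus a registry of backends. The sketch below shows the general idea only. All names in it (`Detector`, `Detection`, `register`, `create_detector`, `DummyDetector`) are hypothetical and do not describe the platform's actual API.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    label: str
    score: float
    box: tuple  # (xmin, ymin, xmax, ymax) in pixels


class Detector(ABC):
    """Framework-neutral interface (hypothetical) that every backend implements."""

    @abstractmethod
    def load(self, weights_path):
        """Load trained weights from the given path."""

    @abstractmethod
    def predict(self, image):
        """Return a list of Detection objects for one image."""


_BACKENDS = {}


def register(name):
    """Class decorator that adds a Detector subclass to the registry."""
    def wrap(cls):
        _BACKENDS[name] = cls
        return cls
    return wrap


def create_detector(name):
    """Instantiate a backend by name, without importing it at the call site."""
    try:
        return _BACKENDS[name]()
    except KeyError:
        raise ValueError(f"unknown backend: {name!r}") from None


@register("dummy")
class DummyDetector(Detector):
    """Stand-in backend; a real one would wrap a TensorFlow or Caffe model."""

    def load(self, weights_path):
        self.weights_path = weights_path

    def predict(self, image):
        return [Detection("object", 1.0, (0, 0, 1, 1))]
```

Code that calls `create_detector("dummy").predict(image)` never touches framework-specific types, so a new model implementation only has to provide one more registered subclass.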
Challenges and Solutions
Using high resolution aerial images to train computer vision models poses unique challenges:
- Lack of sufficient training data: There are plenty of open training datasets available, but almost all of them contain images taken from human eye level. What makes aerial images unique is their top-down view of the objects. Moreover, for custom object detection, customers often don’t have enough images to train the model on, so we have to make do with a limited set of images.
- Very high resolution of the images: Computer vision models can only process images of limited resolution at a time. High-resolution images need to be cropped into manageable tiles, with inference run on one tile at a time. Objects that straddle tile boundaries can then be counted twice or missed.
- Shallow features of objects: When viewed from above, objects can have very generic shapes, which a) can be hard to detect and b) can appear similar to other objects.
The FlytBase AI platform uses various approaches to address these challenges, including data augmentation, cropping with different offsets for hi-res images, and training models on similar-looking objects for better differentiation. Improving the algorithms to address these challenges is a continuous process that further enriches the platform.
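As an illustration of the high-resolution problem, here is a minimal sketch of one possible approach. It is not necessarily the one used on the platform. It cuts the image into overlapping tiles, runs the detector on each tile, shifts the detections back into full-image coordinates, and removes duplicates with greedy non-maximum suppression. The function names, the `(score, box)` detection layout and the 0.5 IoU threshold are all assumptions made for this example.

```python
def tile_origins(length, tile, overlap):
    """Start offsets along one axis so that neighbouring tiles share
    at least `overlap` pixels and the last tile ends at the image edge."""
    if tile <= overlap:
        raise ValueError("tile must be larger than overlap")
    if length <= tile:
        return [0]
    stride = tile - overlap
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # final tile sits flush with the edge
    return starts


def tiles(width, height, tile, overlap):
    """Tile boxes (xmin, ymin, xmax, ymax) covering the whole image."""
    return [(x, y, min(x + tile, width), min(y + tile, height))
            for y in tile_origins(height, tile, overlap)
            for x in tile_origins(width, tile, overlap)]


def iou(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def merge_detections(per_tile, iou_threshold=0.5):
    """Combine per-tile detections into one de-duplicated list.

    per_tile: list of (tile_box, [(score, box_in_tile_coords), ...]).
    """
    shifted = []
    for (tx, ty, _, _), detections in per_tile:
        for score, (x0, y0, x1, y1) in detections:
            shifted.append((score, (x0 + tx, y0 + ty, x1 + tx, y1 + ty)))
    # Greedy NMS: keep the highest-scoring box, drop boxes that overlap it.
    shifted.sort(key=lambda d: d[0], reverse=True)
    kept = []
    for score, box in shifted:
        if all(iou(box, k[1]) < iou_threshold for k in kept):
            kept.append((score, box))
    return kept
```

The object count is then `len(merge_detections(per_tile))`. For this to work, the overlap should be at least as large as the biggest expected object, so that every object appears whole in at least one tile. An object cut by a tile edge can still yield a partial box that the IoU test does not match against its full counterpart.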
The Road Ahead
There is vast potential to be unlocked for our customers from the images they collect via drones. With its scalable architecture and automated pipeline, and with our extensive experience in dealing with drones, their data and automation, the FlytBase AI platform will result in significant improvement in efficiency for our customers.
The FlytBase AI platform is optimised for the interpretation of drone data, and it integrates seamlessly with the rest of the FlytBase platform to offer connectivity with your business applications.
If you are looking to leverage machine-learning technology for automation of your drone data-processing, please reach out to our experts at firstname.lastname@example.org.
Visit flytbase.com/ai and submit the form.