The Nisqually River Foundation is a nature conservation organization tasked with the successful implementation of a watershed stewardship plan. As a part of this plan, they assist the Nisqually Indian Tribe to measure and monitor the fish species present in the Nisqually River.
To do this, the Nisqually Indian Tribe has installed a video camera and infrared sensors in a fish ladder at a dam on the river. The camera is triggered to capture 30 seconds of video whenever a fish swims past the infrared sensors. The recorded videos are then scrutinized manually to identify the fish species in them.
Manually identifying species from captured video is resource intensive in time, staffing, and cost. Facing this challenge in measuring and monitoring the river's salmon species, the Nisqually River Foundation approached us for an automated, technology-driven solution.
First, the collected video feeds were processed to extract the relevant frames. Deep learning models were then trained to draw bounding boxes around each fish passing the camera. The entire workflow, encapsulated in a web app, automated video feed input, detection, and classification. The solution leveraged deep learning algorithms on the Microsoft Azure and Cognitive Services platform stack.
Given the nature of the problem and the format of the video files, processing power was a key requirement for the training and validation phases. A GPU machine was the natural choice to run the object detection models. Hence, we selected the NC6 GPU VM in the Azure portal.
"The Microsoft AI for Earth team was a key enabler of the project's success, providing timely technical support and resolution of AI platform queries."
--Lucas Joppa, Chief Environmental Scientist, Microsoft AI for Earth
For a reliable cloud solution with machine learning capabilities, the Microsoft Azure Data Science Virtual Machine (DSVM) was chosen.
For extracting frames from the videos and tagging them, the Microsoft Visual Object Tagging Tool (VOTT) came in handy.
The final object detection algorithm chosen was the YOLO V3 video object detection algorithm.
The first challenge was to process the videos and tag the fish. The heavy manual work involved was reduced by leveraging the VOTT tool.
The tagged frames were then used to train a model with CNTK and Faster R-CNN. This model was tested against additional frames extracted from the videos. While this approach produced good detections, it lacked the speed needed for real-time video detection.
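Testing a detector against held-out frames typically means comparing its predicted bounding boxes with the tagged ground truth, most commonly via intersection-over-union (IoU). The sketch below is our illustration of that standard check, not the project's exact evaluation code:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection rectangle (empty if the boxes do not overlap)
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union else 0.0


def is_match(pred, truth, threshold=0.5):
    """A detection counts as correct when IoU exceeds the threshold."""
    return iou(pred, truth) >= threshold
```

An IoU threshold of 0.5 is the conventional cutoff for counting a predicted fish box as a true positive against a tagged box.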
As an enhancement, we moved to video object detection with YOLO V3, which is faster and supports real-time detection.
This web-based AI solution would save the client valuable hours of expert biologist time and the infrastructure costs of manually reviewing the videos. As part of a planned upgrade, an enhanced version of the solution has been provided to the customer, which is projected to deliver cost savings of roughly 80%.