I created an automated multi-step Python, R and batch script machine learning pipeline which runs and utilizes the output of a pre-trained object detection model (MegaDetector) to classify all Snapshot Wisconsin project images. This decreased the classification backlog by nearly 10 million images (> 50%) and gives an initial classification to all new images within minutes of upload.
Prior to implementation of the automated classification process, it was estimated that catching up on classifying the backlog of historical images would take over 3 years with human classification options available at full capacity. With an additional 1 million images being added to the system each month, there was no expectation of catching up without intervention.
The automated classification process not only reduced the historic data requiring additional human tagging, but it keeps up with new images added each month as well.