Leroy, my first AI project, has presented a number of new technical challenges. Here are my lessons learned presented in a multi-part series.
Object Detection vs. Classification
My first iteration of Project Leroy relied on classification alone: Leroy took an entire frame from his camera's video stream and tried to classify it as a type of bird. The results were very inconsistent.
I'm 91 percent sure this is a Male Northern Cardinal. #ai pic.twitter.com/xBhWZDFbzS
— ProjectLeroy (@ProjectLeroy) July 29, 2020
It makes sense why classification was not effective. When we bird watch as humans, we are not taking everything our eyes see and trying to understand it as one type of thing. Instead, we understand that what we're looking at is a composition of many little things. Among those things, we look for birds in particular, and when we spot one, we try to understand what type of bird it is. This led me to update Leroy to achieve his goal in two steps: first object detection, then classification.
Once Leroy has detected an object, the results include a confidence level and bounding box coordinates. With this information, I can set a threshold for when to proceed with actually capturing a photo. I'm constantly tweaking this, but for now 40% works well. Once Leroy hits that threshold, he uses the bounding box to save only that part of the frame, like a cutout of the original photo. That cutout is what I run through classification. The results have been much more accurate.
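A minimal sketch of that threshold-and-cutout step, assuming the detector returns a confidence score and a bounding box in normalized (ymin, xmin, ymax, xmax) coordinates, which is the format SSD MobileNet detection models emit; the frame here is a plain NumPy array standing in for a camera frame, and the function name is mine, not Leroy's actual code:

```python
import numpy as np

DETECTION_THRESHOLD = 0.40  # only classify detections above 40% confidence

def crop_detection(frame, score, bbox, threshold=DETECTION_THRESHOLD):
    """Return the bounding-box cutout of the frame, or None if the
    detection score is below the threshold.

    bbox is (ymin, xmin, ymax, xmax) in normalized [0, 1] coordinates.
    """
    if score < threshold:
        return None
    h, w = frame.shape[:2]
    ymin, xmin, ymax, xmax = bbox
    top, left = int(ymin * h), int(xmin * w)
    bottom, right = int(ymax * h), int(xmax * w)
    # Slice out just the detected region; this cutout is what goes
    # to the classification model instead of the whole frame.
    return frame[top:bottom, left:right]

# A stand-in 480x640 RGB frame and a 78%-confidence detection.
frame = np.zeros((480, 640, 3), dtype=np.uint8)
cutout = crop_detection(frame, 0.78, (0.25, 0.25, 0.75, 0.75))
print(cutout.shape)  # (240, 320, 3) — the middle half of the frame
```

A detection below the threshold (say 30%) returns None, so no photo is captured and classification is skipped entirely.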
I'm 78% sure I see a bird and 47% sure its Haemorhous mexicanus (House Finch) #ai #GoogleCoral pic.twitter.com/pxSW6JMTGj
— ProjectLeroy (@ProjectLeroy) August 26, 2020
The models I am using for object detection and classification respectively are:
- ssd_mobilenet_v2_coco_quant_postprocess_edgetpu.tflite
- mobilenet_v2_1.0_224_inat_bird_quant_edgetpu.tflite
Both are provided on the Google Coral models page.