Getting Started with Azure Computer Vision and Object Detection

Recently, I wanted to play around with Azure Computer Vision. More specifically, I wanted to play with object detection.

Computer vision is the ability of a machine learning algorithm to understand and predict the contents of images and videos.

Object detection means not only telling that there is something in the image, but also identifying what each object is and where it sits in the image.

I had a conference coming up in Orlando and I thought it would be cool to have a local reference in one of my demos using computer vision in a fun way.

So I looked up computer vision and found out that I could play with the cognitive service right within the browser. You can check that out here.

The first thing I wanted to do was see what the prebuilt AI models were able to do out of the box. I was trying to answer a simple question:

How can I tell the difference between Chip and Dale?

So I uploaded a picture of the Disney characters Chip and Dale (no, not the other ones) to see what it would come up with.

Looking at the output, the model couldn’t tell whether the shapes were Disney characters, let alone chipmunks. After seeing the metadata available from computer vision, I realized that I needed to custom train a model if I wanted it to answer my burning question…

So my next stop was Custom Vision, and I found that I could create an object detection model to tell the difference between features that I have tagged. Here are the steps I took to create the model.

  1. I checked out the documentation.
  2. I went to Bing and searched for images to train the model, and separated a few images to validate the model.
  3. I uploaded the images and tagged the features I needed.
  4. I saved and trained my model.
  5. I reviewed the performance to ensure my model was delivering the right result.
  6. I selected an image, not in my original training data, and validated the model.
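To make step 6 a bit more concrete, here is a minimal sketch of how the result of a detection call could be read. It assumes the response is JSON with a "predictions" list whose items carry a "tagName", a "probability", and a "boundingBox" normalized to the 0 to 1 range; treat that shape as an assumption and check it against the current documentation. The sample values, the function names best_detections and to_pixels, and the 0.5 threshold are all made up for illustration and are not output from my model.

```python
import json

# Hypothetical sample shaped like a detection response; the numbers are invented.
SAMPLE_RESPONSE = """
{
  "predictions": [
    {"tagName": "Chip", "probability": 0.97,
     "boundingBox": {"left": 0.10, "top": 0.20, "width": 0.30, "height": 0.50}},
    {"tagName": "Dale", "probability": 0.94,
     "boundingBox": {"left": 0.55, "top": 0.22, "width": 0.30, "height": 0.50}},
    {"tagName": "Dale", "probability": 0.12,
     "boundingBox": {"left": 0.05, "top": 0.05, "width": 0.10, "height": 0.10}}
  ]
}
"""


def best_detections(response, threshold=0.5):
    """Keep the highest-probability detection per tag at or above the threshold."""
    best = {}
    for prediction in response["predictions"]:
        if prediction["probability"] < threshold:
            continue  # drop low-confidence boxes
        tag = prediction["tagName"]
        if tag not in best or prediction["probability"] > best[tag]["probability"]:
            best[tag] = prediction
    return best


def to_pixels(box, image_width, image_height):
    """Convert a normalized box to pixel (left, top, width, height)."""
    return (
        round(box["left"] * image_width),
        round(box["top"] * image_height),
        round(box["width"] * image_width),
        round(box["height"] * image_height),
    )


if __name__ == "__main__":
    detections = best_detections(json.loads(SAMPLE_RESPONSE))
    for tag, prediction in sorted(detections.items()):
        # 800x600 is an assumed image size for the example.
        print(tag, f"{prediction['probability']:.0%}",
              to_pixels(prediction["boundingBox"], 800, 600))
```

Filtering by a probability threshold is a design choice: a higher threshold gives fewer but more trustworthy boxes, and a lower one catches more objects at the cost of more false positives.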


I was able to do this in just a few minutes. The result was that I could pass in an image of Chip and Dale and the model confidently predicted who was who. Think of the types of use cases this could apply to! You could tell good produce from bad, good hardware from bad, or known from unknown people at your door. The opportunities are endless.

Identify a use case you are interested in and build a POC (Proof of Concept). You might be surprised at how much you can do without having to be, or consult, a data scientist. This empowers every developer to infuse their app with AI and change the world they see. Power at our fingertips!



P.S. If you are a business leader and interested in AI, check out the AI Leadership Institute!