Serverless image classification with Azure Functions and Custom Vision – Part 1

Welcome to a new series focused on Azure Custom Vision and Azure Functions! In this series, you will train a TensorFlow image classification model in Azure Custom Vision and run the model in an Azure Function.

In the first part, you will build an image classification model using the Custom Vision service. You will learn how to:

  • Build and train a custom image classification model using the Custom Vision SDK for Python.
  • Publish and consume the model using Python.

To complete the exercise, you will need an Azure subscription and a Custom Vision resource. You will also need to install Python 3 and Visual Studio Code or another code editor.

Scenario details

You are working at an animal shelter, and you want to build a registration app for new animals that will automatically classify the animals based on the uploaded images. You are going to use Azure Custom Vision to train a “Cats and Dogs” classification model and deploy the model in an Azure Function. The function will be invoked via HTTP requests.

How to use Azure Custom Vision

Azure Custom Vision is an Azure Cognitive Services service that lets you build and deploy your own image classification and object detection models. You can train your models using either the Custom Vision web-based interface or the Custom Vision client library SDKs. In this article, we will use Python and Visual Studio Code to train our Custom Vision model.

Do you want to use the web-based interface to build your model? You can read my previous article about creating a Custom Vision model for flower classification.

Collect the data

Images of cats and dogs were taken from the Kaggle Cats and Dogs Dataset. This dataset is composed of over three million images of cats and dogs, manually classified by people at thousands of animal shelters across the US. To build and train our Custom Vision model, we will only consider 120 images per class.

Get the key and endpoint of the Custom Vision resource

Once you have created a Custom Vision resource in the Azure portal, you will need the key and endpoint of your training and prediction resources to connect your Python application to Custom Vision. You can find these values both in the Azure portal and the Custom Vision web portal.

  1. Navigate to customvision.ai and sign in.

  2. Click the settings icon (⚙) in the top toolbar.

  3. Expand your prediction resource and save the Key, the Endpoint, and the Prediction resource ID.

  4. Expand the training resource and save the Key and the Endpoint.


Set up your application

Install the client library

To create an image classification project with Custom Vision for Python, you’ll need to install the Custom Vision client library. Install the Azure Cognitive Services Custom Vision SDK for Python package, together with python-dotenv (used later to load the configuration file), with pip:

pip install azure-cognitiveservices-vision-customvision python-dotenv

Create a configuration file

Create a configuration file (.env) and save the keys and endpoints you copied in the previous step.
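For example, the .env file could look like the following (placeholder values shown; the variable names must match the ones loaded in app.py later on):

```shell
TRAINING_ENDPOINT=https://<your-training-resource>.cognitiveservices.azure.com/
TRAINING_KEY=<your-training-key>
PREDICTION_ENDPOINT=https://<your-prediction-resource>.cognitiveservices.azure.com/
PREDICTION_KEY=<your-prediction-key>
PREDICTION_RESOURCE_ID=<your-prediction-resource-id>
```

Replace the placeholders with the values you saved from the Custom Vision web portal, and keep this file out of source control since it contains secrets.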

Create a new Python application

Want to view the whole code at once? You can find it on GitHub.

  1. Create a new Python file (app.py) and import the following libraries:

    from azure.cognitiveservices.vision.customvision.training import CustomVisionTrainingClient
    from azure.cognitiveservices.vision.customvision.prediction import CustomVisionPredictionClient
    from azure.cognitiveservices.vision.customvision.training.models import ImageFileCreateBatch, ImageFileCreateEntry
    from msrest.authentication import ApiKeyCredentials
    from dotenv import load_dotenv
    import os, time
    
  2. Add the following code to load the keys and endpoints from the configuration file.

    load_dotenv()
    training_endpoint = os.getenv('TRAINING_ENDPOINT')
    training_key = os.getenv('TRAINING_KEY')
    prediction_endpoint = os.getenv('PREDICTION_ENDPOINT')
    prediction_key = os.getenv('PREDICTION_KEY')
    prediction_resource_id = os.getenv('PREDICTION_RESOURCE_ID')
    

Authenticate the client

Use the following code to create a CustomVisionTrainingClient and CustomVisionPredictionClient object. You will use the trainer object to create a new Custom Vision project and train an image classification model and the predictor object to make a prediction using the published endpoint.

credentials = ApiKeyCredentials(in_headers={"Training-key": training_key})
trainer = CustomVisionTrainingClient(training_endpoint, credentials)
prediction_credentials = ApiKeyCredentials(in_headers={"Prediction-key": prediction_key})
predictor = CustomVisionPredictionClient(prediction_endpoint, prediction_credentials)

Create a new project

To create a new Custom Vision project, you use the create_project method of the trainer object. By default, the domain of the new project is set to General. Since we want to export our model, we should select one of the compact domains.

Add the following code to create a new Multiclass classification project in the General (compact) domain.

# Find the domain id
classification_domain = next(domain for domain in trainer.get_domains() if domain.type == "Classification" and domain.name == "General (compact)")

# Create a new project
publish_iteration_name = "Iteration1"
project_name = "Cats and Dogs classifier"
project_description = "A Custom Vision project to classify cats and dogs"
domain_id = classification_domain.id
classification_type = "Multiclass"
print ("Creating project...")
project = trainer.create_project(project_name, project_description, domain_id, classification_type)

Create the tags

The images that we will use to build our model are grouped into two classes (tags): cat and dog. Insert the following code after the project creation. It adds the classification tags to the project using the create_tag method.

# Create tags for cats and dogs
cat_tag = trainer.create_tag(project.id, "Cat")
dog_tag = trainer.create_tag(project.id, "Dog")

Upload and tag images

The following code uploads the training images and their corresponding tags. In each batch, it uploads 60 images of the same tag.

# Upload and tag images
images_folder = os.path.join(os.path.dirname(__file__), "images", "Train")
tags_folder_names = [ "Cat", "Dog" ]

print("Adding images...")

for tag_num in range(0, 2):
    if tag_num == 0:
        tag = cat_tag
    else:
        tag = dog_tag
    for batch_num in range(0, 2):
        image_list = []
        for image_num in range(1, 61):
            file_name = f"{tags_folder_names[tag_num]} ({60*batch_num + image_num}).jpg"
            with open(os.path.join(images_folder, tags_folder_names[tag_num], file_name), "rb") as image_contents:
                image_list.append(ImageFileCreateEntry(name=file_name, contents=image_contents.read(), tag_ids=[tag.id]))

        upload_result = trainer.create_images_from_files(project.id, ImageFileCreateBatch(images=image_list))
        if not upload_result.is_batch_successful:
            print("Image batch upload failed.")
            for image in upload_result.images:
                print("Image status: ", image.status)
            exit(-1)
    print(f"{tags_folder_names[tag_num]} Uploaded")

In a single batch, there is a limit of 64 images and 20 tags.
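The script above stays under this limit by uploading fixed batches of 60. If your dataset does not divide evenly, a small helper like the following (a hypothetical sketch, not part of the original script) can split any image list into batches of at most 64:

```python
def chunk_batches(items, batch_size=64):
    """Yield successive chunks of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# Example: 150 images are split into batches of 64, 64, and 22
batch_sizes = [len(chunk) for chunk in chunk_batches(list(range(150)))]
print(batch_sizes)  # [64, 64, 22]
```

You could feed each chunk of ImageFileCreateEntry objects to create_images_from_files instead of the fixed 60-image batches used above.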

Train and evaluate the model

Use the train_project method of the trainer object to train your model using all the images in the training set.

# Training
print ("Training...")
iteration = trainer.train_project(project.id)
while (iteration.status != "Completed"):
    iteration = trainer.get_iteration(project.id, iteration.id)
    print ("Training status: " + iteration.status)
    print ("Waiting 10 seconds...")
    time.sleep(10)

The Custom Vision service calculates three performance metrics:

  • precision,
  • recall, and
  • average precision.
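As a quick refresher on what the first two metrics mean (an illustrative calculation, not part of the tutorial code): precision is the fraction of predicted positives that are correct, and recall is the fraction of actual positives that the model finds.

```python
def precision_recall(tp, fp, fn):
    """Compute precision and recall from true positives,
    false positives, and false negatives."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall

# Example: of 10 images predicted "Cat", 9 really are cats (tp=9, fp=1),
# and 3 actual cats were missed (fn=3).
p, r = precision_recall(tp=9, fp=1, fn=3)
print(p, r)  # 0.9 0.75
```

Average precision summarizes precision and recall across all probability thresholds, rather than at a single threshold.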

The following code displays performance information for the latest training iteration and for each tag, using a threshold value of 50%. You can also use the get_iteration_performance method to retrieve the standard deviation of these performance metrics.

# Get iteration performance information
threshold = 0.5
iter_performance_info = trainer.get_iteration_performance(project.id, iteration.id, threshold)
print("Iteration Performance:")
print(f"\tPrecision: {iter_performance_info.precision*100 :.2f}%\n"
      f"\tRecall: {iter_performance_info.recall*100 :.2f}%\n"
      f"\tAverage Precision: {iter_performance_info.average_precision*100 :.2f}%")

print("Performance per tag:")
for item in iter_performance_info.per_tag_performance:
    print(f"* {item.name}:")
    print(f"\tPrecision: {item.precision*100 :.2f}%\n"
          f"\tRecall: {item.recall*100 :.2f}%\n"
          f"\tAverage Precision: {item.average_precision*100 :.2f}%")

Quick test an image

If you want to test your model before publishing it, you can use the quick_test_image method of the trainer object.

# Quick test
test_images_folder_path = os.path.join(os.path.dirname(__file__), "images", "Test")
test_image_filename = "4.jpg"

print("Quick test a local image...")
with open(os.path.join(test_images_folder_path, test_image_filename), "rb") as image_contents:
    quick_test_results = trainer.quick_test_image(project.id, image_contents.read(), iteration_id=iteration.id)
    # Display the results
    print(f"Quick Test results for image {test_image_filename}:")
    for prediction in quick_test_results.predictions:
        print(f"\t{prediction.tag_name}: {prediction.probability*100 :.2f}%")

Publish the iteration

Add the following code, which publishes the current iteration. Once the iteration is published, you can use the prediction endpoint to make predictions.

# Publish the current iteration
print("Publishing the current iteration...")
trainer.publish_iteration(project.id, iteration.id, publish_iteration_name, prediction_resource_id)
print ("Iteration published!")

Test the prediction endpoint

Use the classify_image method of the predictor object to send a new image for analysis and retrieve the prediction result. Add the following code to the end of the program and run the application.

# Test - Make a prediction
print("Testing the prediction endpoint...")
for img_num in range(1,9):
    test_image_filename = str(img_num) + ".jpg"
    with open(os.path.join(test_images_folder_path, test_image_filename), "rb") as image_contents:
        results = predictor.classify_image(project.id, publish_iteration_name, image_contents.read())

        # Display the results
        print(f"Testing image {test_image_filename}...")
        for prediction in results.predictions:
            print(f"\t{prediction.tag_name}: {prediction.probability*100 :.2f}%")

If the application ran successfully, navigate to the Custom Vision website to see the newly created project.

Summary and next steps

In this article, you learned how to use the Azure Custom Vision SDK for Python to build and train an image classification model. In the second part of this learning series, I will show you how to export your image classifier using the Python SDK and run the exported TensorFlow model locally to predict whether an image contains a cat or a dog.

You may also check out the following resources:

Clean-up

If you want to delete this project, navigate to the Custom Vision project gallery page and select the trash icon under the project.

If you have finished learning, you can delete the resource group from your Azure subscription:

  1. In the Azure portal, select Resource groups in the left menu and then select the resource group that you created.
  2. Click Delete resource group.
