Overview

This exercise demonstrates how to extract custom entities from text using Azure AI Language Service. The solution involves training a model to recognize specific entities in classified ads.


Steps & Configuration Details

1. Provision an Azure AI Language Resource

  • Open Azure Portal (https://portal.azure.com) and sign in.
  • Select Create a resource → Search for Language Service → Click Create.
  • Configuration Items:
    • Subscription: Your Azure subscription.
    • Resource Group: Select or create a resource group.
    • Region: Choose from:
      • Australia East
      • Central India
      • East US
      • East US 2
      • North Europe
      • South Central US
      • Switzerland North
      • UK South
      • West Europe
      • West US 2
      • West US 3
    • Name: Enter a unique name.
    • Pricing Tier: F0 (Free) or S (Standard).
    • Storage Account: Create a new storage account.
    • Storage Account Type: Standard LRS.
    • Responsible AI Notice: Agree.

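After provisioning, open the resource and go to Keys and Endpoint in the Resource Management section. You will need the endpoint and one of the keys when you configure the application in step 9.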


2. Configure Storage Access

  • Open Azure Portal → Navigate to Storage Account.
  • Select Access Control (IAM) → Add Role Assignment.
  • Configuration Items:
    • Role: Storage Blob Data Contributor
    • Assign Access To: User, group, or service principal
    • Select Members: Choose your user account.

3. Upload Sample Ads

  • Download sample ads:
    https://aka.ms/entity-extraction-ads
    
  • Extract files and upload them to Azure Blob Storage:
    • Create a Blob Container named classifieds.
    • Set Anonymous Access Level to Container (anonymous read access for containers and blobs).
    • Upload extracted .txt files.

4. Create a Custom Named Entity Recognition Project

  • Open Language Studio (https://language.cognitive.azure.com) and sign in.
  • Select Custom Named Entity Recognition → Create a project.
  • Configuration Items:
    • Project Name: CustomEntityLab
    • Primary Language: English (US)
    • Description: Custom entities in classified ads
    • Blob Store Container: classifieds
    • Labeling Option: No, I need to label my files as part of this project

5. Label Training Data

  • Navigate to Data Labeling.
  • Define three entities:
    • ItemForSale
    • Price
    • Location
  • Assign labels to sample ads:
    "face cord of firewood" → ItemForSale
    "Denver, CO" → Location
    "$90" → Price
    
  • Save labels.
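
To make the labeling step concrete, the sketch below records each label as a character span (category, offset, length) inside an ad. The ad text and the label_span helper are made up for illustration, and the exact schema Language Studio uses to store labels is not shown here.

```python
# Illustrative only: the ad text and helper are hypothetical, not taken from the lab files.
def label_span(text: str, phrase: str, category: str) -> dict:
    """Return the first occurrence of phrase in text as a labeled character span."""
    offset = text.find(phrase)
    if offset < 0:
        raise ValueError(f"{phrase!r} not found in text")
    return {"category": category, "offset": offset, "length": len(phrase)}


ad_text = "For sale: face cord of firewood, $90. Pickup in Denver, CO."

labels = [
    label_span(ad_text, "face cord of firewood", "ItemForSale"),
    label_span(ad_text, "$90", "Price"),
    label_span(ad_text, "Denver, CO", "Location"),
]

# Print each category next to the exact text its span covers.
for label in labels:
    end = label["offset"] + label["length"]
    print(f'{label["category"]}: {ad_text[label["offset"]:end]!r}')
```

Thinking of labels as spans explains why consistent labeling matters: the model learns from exactly the characters each span covers.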

6. Train the Model

  • Navigate to Training Jobs → Start a Training Job.
  • Configuration Items:
    • Model Name: ExtractAds
    • Training Mode: Automatically split the testing set from training data
  • Click Train.

7. Evaluate the Model

  • Navigate to Model Performance → Select ExtractAds.
  • Review Accuracy Metrics.
  • If errors exist, check Test Set Details.
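
When reviewing the metrics, it can help to recall how the usual entity extraction scores are derived. The sketch below computes precision, recall, and F1 from counts of correct, spurious, and missed entities. The counts are hypothetical, and this is the standard textbook calculation, not necessarily the exact procedure Language Studio applies.

```python
from typing import Tuple


def precision_recall_f1(true_pos: int, false_pos: int, false_neg: int) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 from entity-level counts."""
    predicted = true_pos + false_pos   # everything the model extracted
    actual = true_pos + false_neg      # everything that was labeled
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical counts for a single entity type such as Price.
p, r, f1 = precision_recall_f1(true_pos=8, false_pos=2, false_neg=2)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Low precision suggests the model is extracting text that should not be labeled. Low recall suggests it is missing labeled entities, which often points to too few training examples for that entity.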

8. Deploy the Model

  • Navigate to Deploying Model → Add Deployment.
  • Configuration Items:
    • Deployment Name: AdEntities
    • Model: ExtractAds
  • Click Deploy.

9. Configure Your Application

  • Clone the repository:
    git clone https://github.com/MicrosoftLearning/mslearn-ai-language
    
  • Open the folder in Visual Studio Code.
  • Install dependencies:
    • C#:
      dotnet add package Azure.AI.TextAnalytics --version 5.3.0
      
    • Python:
      pip install azure-ai-textanalytics==5.3.0
      
  • Open the configuration file:
    • C#: appsettings.json
    • Python: .env
  • Update Configuration Values:
    • Azure AI Language Endpoint
    • API Key
    • Project Name (CustomEntityLab)
    • Deployment Name (AdEntities)
  • Save the configuration file.
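
For the Python version, the four configuration values are typically read from environment variables once the .env file has been loaded (for example with a package such as python-dotenv, which is an assumption here). The sketch below checks that all four are present before the client is created. The variable names and the load_settings helper are hypothetical, so match them to the keys in your own .env file.

```python
import os

# Assumed variable names; adjust them to match the keys in your .env file.
REQUIRED_SETTINGS = ("AI_SERVICE_ENDPOINT", "AI_SERVICE_KEY", "PROJECT", "DEPLOYMENT")


def load_settings(env=os.environ) -> dict:
    """Collect the required settings, failing fast if any are missing or empty."""
    missing = [name for name in REQUIRED_SETTINGS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing configuration values: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_SETTINGS}


# Example with placeholder values; use the real endpoint and key from Keys and Endpoint.
example_env = {
    "AI_SERVICE_ENDPOINT": "https://<your-resource>.cognitiveservices.azure.com/",
    "AI_SERVICE_KEY": "<your-key>",
    "PROJECT": "CustomEntityLab",
    "DEPLOYMENT": "AdEntities",
}
settings = load_settings(example_env)
print(settings["PROJECT"], settings["DEPLOYMENT"])
```

Failing early with a named list of missing values is usually easier to debug than an authentication error raised later by the service.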

10. Add Code to Extract Entities

  • Open the code file:
    • C#: Program.cs
    • Python: custom-entities.py
  • Add references:
    • C#:
      using Azure;
      using Azure.AI.TextAnalytics;
      
    • Python:
      from azure.core.credentials import AzureKeyCredential
      from azure.ai.textanalytics import TextAnalyticsClient
      
  • Create the AI Language client:
    • C#:
      AzureKeyCredential credentials = new AzureKeyCredential(aiSvcKey);
      Uri endpoint = new Uri(aiSvcEndpoint);
      TextAnalyticsClient aiClient = new TextAnalyticsClient(endpoint, credentials);
      
    • Python:
      credential = AzureKeyCredential(ai_key)
      ai_client = TextAnalyticsClient(endpoint=ai_endpoint, credential=credential)
      
  • Submit entity extraction requests:
    • C#:
      // Start the long-running operation and wait for it to complete.
      RecognizeCustomEntitiesOperation operation = await aiClient.RecognizeCustomEntitiesAsync(WaitUntil.Completed, batchedDocuments, projectName, deploymentName);
      // Results are returned in pages; each page holds one result per document.
      await foreach (RecognizeCustomEntitiesResultCollection documentsInPage in operation.Value) {
          foreach (RecognizeEntitiesResult documentResult in documentsInPage) {
              Console.WriteLine($"Recognized {documentResult.Entities.Count} entities:");
              foreach (CategorizedEntity entity in documentResult.Entities) {
                  Console.WriteLine($"Entity: {entity.Text}, Category: {entity.Category}, ConfidenceScore: {entity.ConfidenceScore}");
              }
          }
      }
      
    • Python:
      operation = ai_client.begin_recognize_custom_entities(batchedDocuments, project_name=project_name, deployment_name=deployment_name)
      document_results = operation.result()
      for doc, custom_entities_result in zip(files, document_results):
          print(doc)
          for entity in custom_entities_result.entities:
              print(f"Entity '{entity.text}' has category '{entity.category}' with confidence score of '{entity.confidence_score}'")
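
The Python request above relies on two lists, files and batchedDocuments, that are defined elsewhere in the code file. A minimal sketch of how such lists might be built from a folder of .txt ads follows. The read_ads helper, the temporary demo folder, and the second ad's text are assumptions for illustration, not the lab's own code.

```python
from pathlib import Path
import tempfile


def read_ads(folder):
    """Return (file names, file contents) for every .txt file in folder, in name order."""
    files, batched_documents = [], []
    for path in sorted(Path(folder).glob("*.txt")):
        files.append(path.name)
        batched_documents.append(path.read_text(encoding="utf8"))
    return files, batched_documents


# Demo using a throwaway folder with made-up ad text.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "ad1.txt").write_text("face cord of firewood, $90, Denver, CO", encoding="utf8")
    Path(tmp, "ad2.txt").write_text("placeholder text for a second ad", encoding="utf8")
    files, batched_documents = read_ads(tmp)

print(files)
```

Keeping the two lists in the same order is what lets the zip call pair each file name with its result.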
      

11. Run Your Application

  • C#:
    dotnet run
    
  • Python:
    python custom-entities.py
    
  • Observe the output, which should display extracted entities and confidence scores.
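
Once the output looks reasonable, a natural next step is to keep only the entities the model is reasonably sure about. The sketch below groups entities by category above a confidence threshold. It uses stand-in objects with made-up scores that mimic the text, category, and confidence_score attributes used earlier. The helper name and the 0.5 threshold are arbitrary assumptions.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_confident_entities(entities, min_confidence=0.5):
    """Group entity text by category, skipping entities below min_confidence."""
    grouped = defaultdict(list)
    for entity in entities:
        if entity.confidence_score >= min_confidence:
            grouped[entity.category].append(entity.text)
    return dict(grouped)


# Stand-ins for service results; the confidence scores are invented for the demo.
sample_entities = [
    SimpleNamespace(text="face cord of firewood", category="ItemForSale", confidence_score=0.92),
    SimpleNamespace(text="$90", category="Price", confidence_score=0.88),
    SimpleNamespace(text="Denver, CO", category="Location", confidence_score=0.41),
]

confident = group_confident_entities(sample_entities, min_confidence=0.5)
print(confident)
```

In practice, the threshold would be chosen after inspecting the scores your deployed model actually returns.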

12. Clean Up

  • In the Azure Portal, delete the Azure AI Language resource and the storage account created for this exercise to avoid unnecessary costs.

This summary captures the essential steps while highlighting all configuration items and code references required for extracting custom entities using Azure AI Language.
