Overview

This exercise demonstrates how to develop an audio-enabled chat application using Azure AI Foundry and the Phi-4-multimodal-instruct model. The app provides AI assistance by summarizing voice messages left by customers.


Steps & Configuration Details

1. Create an Azure AI Foundry Project

  • Open Azure AI Foundry portal (https://ai.azure.com) and sign in.
  • Select + Create project.
  • Configuration Items:
    • Hub Name: A valid name.
    • Subscription: Your Azure subscription.
    • Resource Group: Select or create a resource group.
    • Location: Choose from:
      • East US
      • East US 2
      • North Central US
      • South Central US
      • Sweden Central
      • West US
      • West US 3
    • Connect Azure AI Services: Create a new AI Services resource.
    • Connect Azure AI Search: Skip connecting.

2. Deploy a Multimodal Model

  • Navigate to Models + endpoints, then select Deploy base model.
  • Search for Phi-4-multimodal-instruct and select it.
  • Configuration Items:
    • Deployment Name: A valid name.
    • Deployment Type: Global Standard.
    • Deployment Details: Use default settings.

3. Configure the Client Application

  • Open Azure Portal (https://portal.azure.com).
  • Launch Azure Cloud Shell (PowerShell environment).
  • Clone the repository:
    rm -r mslearn-ai-audio -f
    git clone https://github.com/MicrosoftLearning/mslearn-ai-language mslearn-ai-audio
    
  • Navigate to the correct folder:
    • Python: cd mslearn-ai-audio/Labfiles/09-audio-chat/Python
    • C#: cd mslearn-ai-audio/Labfiles/09-audio-chat/C-sharp
  • Install dependencies:
    • Python:
      python -m venv labenv
      ./labenv/bin/Activate.ps1
      pip install -r requirements.txt azure-identity azure-ai-projects azure-ai-inference
      
    • C#:
      dotnet add package Azure.Identity
      dotnet add package Azure.AI.Inference --version 1.0.0-beta.3
      dotnet add package Azure.AI.Projects --version 1.0.0-beta.3
      
  • Open the configuration file:
    • Python: .env
    • C#: appsettings.json
  • Update Configuration Values:
    • Project Connection String (copied from the project's Overview page in the Azure AI Foundry portal).
    • Model Deployment Name (the name you gave the Phi-4-multimodal-instruct deployment).
  • Save the configuration file.
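  • For reference, the sample code reads these two values at startup. A minimal Python sketch, assuming the .env keys are named PROJECT_CONNECTION and MODEL_DEPLOYMENT (check the lab's configuration file for the exact names):
    # Load settings from the .env file; the key names below are assumptions
    # and must match your configuration file.
    import os
    from dotenv import load_dotenv

    load_dotenv()
    project_connection = os.getenv("PROJECT_CONNECTION")   # project connection string
    model_deployment = os.getenv("MODEL_DEPLOYMENT")        # e.g. Phi-4-multimodal-instruct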

4. Implement the AI Chat Client

  • Open the code file:
    • Python: audio-chat.py
    • C#: Program.cs
  • Add references:
    • Python:
      from dotenv import load_dotenv
      from azure.identity import DefaultAzureCredential
      from azure.ai.projects import AIProjectClient
      from azure.ai.inference.models import SystemMessage, UserMessage, TextContentItem
      
    • C#:
      using Azure.Identity;
      using Azure.AI.Projects;
      using Azure.AI.Inference;
      
  • Initialize the AI Foundry client:
    • Python:
      project_client = AIProjectClient.from_connection_string(
          conn_str=project_connection,
          credential=DefaultAzureCredential()
      )
      
    • C#:
      var projectClient = new AIProjectClient(project_connection, new DefaultAzureCredential());
      
  • Create a chat client:
    • Python:
      chat_client = project_client.inference.get_chat_completions_client(model=model_deployment)
      
    • C#:
      ChatCompletionsClient chat = projectClient.GetChatCompletionsClient();
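
  • Optional sanity check: before adding audio, you can verify authentication and the model deployment with a plain text prompt. A minimal Python sketch using the same client (the message wording is illustrative only):
    # Hypothetical smoke test: send a text-only prompt through the chat client
    # created above to confirm the connection and deployment work.
    response = chat_client.complete(
        messages=[
            SystemMessage("You are a helpful assistant."),
            UserMessage("Say hello in one sentence.")
        ]
    )
    print(response.choices[0].message.content)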
      

5. Submit an Audio-Based Prompt

  • Python:
    file_path = "https://github.com/MicrosoftLearning/mslearn-ai-language/raw/refs/heads/main/Labfiles/09-audio-chat/data/avocados.mp3"
    response = chat_client.complete(
        messages=[
            SystemMessage(system_message),
            UserMessage([
                TextContentItem(text=prompt),
                {"type": "audio_url", "audio_url": {"url": file_path}}
            ])
        ]
    )
    print(response.choices[0].message.content)
    
  • C#:
    string audioUrl = "https://github.com/MicrosoftLearning/mslearn-ai-language/raw/refs/heads/main/Labfiles/09-audio-chat/data/avocados.mp3";
    var requestOptions = new ChatCompletionsOptions() {
        Messages = {
            new ChatRequestSystemMessage(system_message),
            new ChatRequestUserMessage(
                new ChatMessageTextContentItem(prompt),
                new ChatMessageAudioContentItem(new Uri(audioUrl))
            )
        },
        Model = model_deployment
    };
    var response = chat.Complete(requestOptions);
    Console.WriteLine(response.Value.Content);
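
  • Both snippets assume system_message and prompt are defined earlier in the code file. A minimal Python sketch of those definitions (the exact wording and the use of input() are assumptions):
    # Assumed definitions for the variables used in the request above.
    system_message = "You are an AI assistant that summarizes customers' voice messages."
    prompt = input("Ask a question about the audio file: ")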
    

6. Run Your Application

  • Python:
    python audio-chat.py
    
  • C#:
    dotnet run
    
  • Example prompt:
    Can you summarize this customer's voice message?
    
  • The response should display the AI-generated summary.
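  • To test with your own recording instead of the sample URL, one option is to embed a local .mp3 as a base64 data URL and pass it in the same audio_url content item. A Python sketch; the local file name is hypothetical, and whether your deployment accepts data URLs is an assumption, so check the model's documentation:
    # Encode a local audio file (hypothetical name) as a base64 data URL
    # and reuse it as file_path in the request shown in step 5.
    import base64

    with open("voicemail.mp3", "rb") as f:
        audio_b64 = base64.b64encode(f.read()).decode("utf-8")

    file_path = f"data:audio/mp3;base64,{audio_b64}"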

7. Clean Up

  • To avoid unnecessary costs, delete the Azure resources created for this exercise: in the Azure portal, open the resource group you used and delete it.
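  • Alternatively, the resource group can be deleted from Cloud Shell with the Azure CLI; replace the placeholder with the name of your resource group:
    az group delete --name <your-resource-group> --yes --no-wait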
