Overview

This exercise demonstrates how to develop an audio-enabled chat application using Azure AI Foundry and the Phi-4-multimodal-instruct model. The app provides AI assistance by summarizing voice messages left by customers.


Steps & Configuration Details

1. Create an Azure AI Foundry Project

  • Open Azure AI Foundry portal (https://ai.azure.com) and sign in.
  • Select + Create project.
  • Configuration Items:
    • Hub Name: A valid name.
    • Subscription: Your Azure subscription.
    • Resource Group: Select or create a resource group.
    • Location: Choose from:
      • East US
      • East US 2
      • North Central US
      • South Central US
      • Sweden Central
      • West US
      • West US 3
    • Connect Azure AI Services: Create a new AI Services resource.
    • Connect Azure AI Search: Skip connecting.

2. Deploy a Multimodal Model

  • Navigate to Models + endpoints, then select Deploy base model.
  • Search for Phi-4-multimodal-instruct and select it.
  • Configuration Items:
    • Deployment Name: A valid name.
    • Deployment Type: Global Standard.
    • Deployment Details: Use default settings.

3. Configure the Client Application

  • Open Azure Portal (https://portal.azure.com).
  • Launch Azure Cloud Shell (PowerShell environment).
  • Clone the repository:
    rm -r mslearn-ai-audio -f
    git clone https://github.com/MicrosoftLearning/mslearn-ai-language mslearn-ai-audio
    
  • Navigate to the correct folder:
    • Python: cd mslearn-ai-audio/Labfiles/09-audio-chat/Python
    • C#: cd mslearn-ai-audio/Labfiles/09-audio-chat/C-sharp
  • Install dependencies:
    • Python:
      python -m venv labenv
      ./labenv/bin/Activate.ps1
      pip install -r requirements.txt azure-identity azure-ai-projects azure-ai-inference
      
    • C#:
      dotnet add package Azure.Identity
      dotnet add package Azure.AI.Inference --version 1.0.0-beta.3
      dotnet add package Azure.AI.Projects --version 1.0.0-beta.3
      
  • Open the configuration file:
    • Python: .env
    • C#: appsettings.json
  • Update Configuration Values:
    • Project Connection String (copied from the project's Overview page in the Azure AI Foundry portal).
    • Model Deployment Name (the name you gave the Phi-4-multimodal-instruct deployment).
  • Save the configuration file.
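  • For reference, the sample code reads these two values at startup. A minimal Python sketch, assuming the .env keys are named PROJECT_CONNECTION and MODEL_DEPLOYMENT (check the lab's configuration file for the exact names):
    # Load settings from the .env file; the key names below are assumptions
    # and must match your configuration file.
    import os
    from dotenv import load_dotenv

    load_dotenv()
    project_connection = os.getenv("PROJECT_CONNECTION")   # project connection string
    model_deployment = os.getenv("MODEL_DEPLOYMENT")        # e.g. Phi-4-multimodal-instruct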

4. Implement the AI Chat Client

  • Open the code file:
    • Python: audio-chat.py
    • C#: Program.cs
  • Add references:
    • Python:
      from dotenv import load_dotenv
      from azure.identity import DefaultAzureCredential
      from azure.ai.projects import AIProjectClient
      from azure.ai.inference.models import SystemMessage, UserMessage, TextContentItem
      
    • C#:
      using Azure.Identity;
      using Azure.AI.Projects;
      using Azure.AI.Inference;
      
  • Initialize the AI Foundry client:
    • Python:
      project_client = AIProjectClient.from_connection_string(
          conn_str=project_connection,
          credential=DefaultAzureCredential()
      )
      
    • C#:
      var projectClient = new AIProjectClient(project_connection, new DefaultAzureCredential());
      
  • Create a chat client:
    • Python:
      chat_client = project_client.inference.get_chat_completions_client(model=model_deployment)
      
    • C#:
      ChatCompletionsClient chat = projectClient.GetChatCompletionsClient();
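
  • Optional sanity check: before adding audio, you can verify authentication and the model deployment with a plain text prompt. A minimal Python sketch using the same client (the message wording is illustrative only):
    # Hypothetical smoke test: send a text-only prompt through the chat client
    # created above to confirm the connection and deployment work.
    response = chat_client.complete(
        messages=[
            SystemMessage("You are a helpful assistant."),
            UserMessage("Say hello in one sentence.")
        ]
    )
    print(response.choices[0].message.content)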
      

5. Submit an Audio-Based Prompt

  • Python:
    file_path = "https://github.com/MicrosoftLearning/mslearn-ai-language/raw/refs/heads/main/Labfiles/09-audio-chat/data/avocados.mp3"
    response = chat_client.complete(
        messages=[
            SystemMessage(system_message),
            UserMessage([
                TextContentItem(text=prompt),
                {"type": "audio_url", "audio_url": {"url": file_path}}
            ])
        ]
    )
    print(response.choices[0].message.content)
    
  • C#:
    string audioUrl = "https://github.com/MicrosoftLearning/mslearn-ai-language/raw/refs/heads/main/Labfiles/09-audio-chat/data/avocados.mp3";
    var requestOptions = new ChatCompletionsOptions() {
        Messages = {
            new ChatRequestSystemMessage(system_message),
            new ChatRequestUserMessage(
                new ChatMessageTextContentItem(prompt),
                new ChatMessageAudioContentItem(new Uri(audioUrl))
            )
        },
        Model = model_deployment
    };
    var response = chat.Complete(requestOptions);
    Console.WriteLine(response.Value.Content);
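
  • Both snippets assume system_message and prompt are defined earlier in the code file. A minimal Python sketch of those definitions (the exact wording and the use of input() are assumptions):
    # Assumed definitions for the variables used in the request above.
    system_message = "You are an AI assistant that summarizes customers' voice messages."
    prompt = input("Ask a question about the audio file: ")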
    

6. Run Your Application

  • Python:
    python audio-chat.py
    
  • C#:
    dotnet run
    
  • Example prompt:
    Can you summarize this customer's voice message?
    
  • The response should display the AI-generated summary.
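  • To test with your own recording instead of the sample URL, one option is to embed a local .mp3 as a base64 data URL and pass it in the same audio_url content item. A Python sketch; the local file name is hypothetical, and whether your deployment accepts data URLs is an assumption, so check the model's documentation:
    # Encode a local audio file (hypothetical name) as a base64 data URL
    # and reuse it as file_path in the request shown in step 5.
    import base64

    with open("voicemail.mp3", "rb") as f:
        audio_b64 = base64.b64encode(f.read()).decode("utf-8")

    file_path = f"data:audio/mp3;base64,{audio_b64}"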

7. Clean Up

  • To avoid unnecessary costs, delete the Azure resources created for this exercise: in the Azure portal, open the resource group you used and delete it.
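  • Alternatively, the resource group can be deleted from Cloud Shell with the Azure CLI; replace the placeholder with the name of your resource group:
    az group delete --name <your-resource-group> --yes --no-wait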
