Webhook Conversation

Note

This integration requires Home Assistant >=2025.8.

Integration to connect Home Assistant conversation agents and AI features to external systems through webhooks.

This integration allows you to use n8n workflows or other custom webhook-based systems as conversation agents in Home Assistant, enabling powerful automation and AI-driven interactions with your smart home.

Features

🤖 Use n8n workflows as conversation agents in Home Assistant
🧩 AI Tasks via a dedicated webhook, supporting text or structured outputs
💬 Text-to-Speech (TTS) support with custom webhook-based voice synthesis
🎤 Speech-to-Text (STT) support with custom webhook-based voice recognition
📎 Support for file attachments in AI Tasks (images, documents, etc.)
📡 Send conversation context and exposed entities to webhooks
🏠 Seamless integration with Home Assistant's voice assistant system
🔧 Configurable webhook URLs and output fields
⏱️ Configurable timeout for handling long-running workflows (1-300 seconds)
🚀 Response streaming for real-time conversation responses

Quick Start

🚀 New to n8n workflows? Check out our example workflow for a complete working setup with OpenAI integration and attachment support!

Installation

HACS (Recommended)

Note

Quick Install: Click the "My Home Assistant" badge at the top of this README for one-click installation via HACS.

Make sure HACS is installed
Add this repository as a custom repository in HACS:
- Go to HACS → ⋮ → Custom repositories
- Add https://github.com/eulemitkeule/webhook-conversation with type Integration
Search for "Webhook Conversation" in HACS and install it
Restart Home Assistant

Manual Installation

Download the latest release from the releases page
Extract the custom_components/webhook_conversation folder to your custom_components directory
Restart Home Assistant

Configuration

Home Assistant Setup

The setup process consists of two steps:

Step 1: Create the Integration Entry

Go to Settings → Devices & Services
Click Add Integration and search for "Webhook Conversation"
Add the integration (no configuration options are required at this stage)

Step 2: Add Conversation Agents and AI Tasks

After the integration is added, you'll see the "Webhook Conversation" integration on your integrations page. From there:

Add Conversation Agent: Click the "Add Entry" button on the integration page and select "Conversation Agent" to create a new webhook-based conversation agent. Configure it with:
- Webhook URL: The URL of your webhook endpoint (remember to activate the workflow in n8n and to use the production webhook URL)
- Output Field: The field name in the webhook response containing the reply (default: "output")
- Timeout: The timeout in seconds for waiting for a response (default: 30 seconds, range: 1-300 seconds)
- Enable Response Streaming: Enable real-time streaming of responses as they are generated (default: disabled)
- System Prompt: A custom system prompt to provide additional context or instructions to your AI model

Custom Request Fields (JSON): Optional JSON object merged into each request. String values support Home Assistant templates.

Add AI Task: Click the "Add Entry" button on the integration page and select "AI Task" to create a webhook-based AI task handler. Configure it with:
- Webhook URL: The URL of your webhook endpoint (remember to activate the workflow in n8n and to use the production webhook URL)
- Output Field: The field name in the webhook response containing the reply (default: "output")
- Timeout: The timeout in seconds for waiting for a response (default: 30 seconds, range: 1-300 seconds)
- Enable Response Streaming: Enable real-time streaming of responses as they are generated (default: disabled)
- System Prompt: A custom system prompt to provide additional context or instructions to your AI model

Custom Request Fields (JSON): Optional JSON object merged into each request. String values support Home Assistant templates.

Add TTS (Text-to-Speech): Click the "Add Entry" button on the integration page and select "TTS" to create a webhook-based text-to-speech service. Configure it with:
- Webhook URL: The URL of your webhook endpoint that will handle TTS requests
- Timeout: The timeout in seconds for waiting for audio response (default: 30 seconds, range: 1-300 seconds)
- Supported Languages: List of supported language codes (e.g., "en-US", "de-DE", "fr-FR")
- Voices: Optional list of available voice names for speech synthesis
- Authentication: Optional HTTP basic authentication for securing your webhook endpoint

Custom Request Fields (JSON): Optional JSON object merged into each request. String values support Home Assistant templates.

Add STT (Speech-to-Text): Click the "Add Entry" button on the integration page and select "STT" to create a webhook-based speech-to-text service. Configure it with:
- Webhook URL: The URL of your webhook endpoint that will handle STT requests
- Timeout: The timeout in seconds for waiting for transcription response (default: 30 seconds, range: 1-300 seconds)
- Supported Languages: List of supported language codes (e.g., "en-US", "de-DE", "fr-FR")
- Output Field: The field name in the webhook response containing the transcribed text (default: "output")
- Authentication: Optional HTTP basic authentication for securing your webhook endpoint

Custom Request Fields (JSON): Optional JSON object merged into each request. String values support Home Assistant templates.

Note

You can add multiple conversation agents, AI task handlers, TTS services, and STT services by repeating steps 2-4. Each can be configured with different webhook URLs and settings to support various use cases.

n8n Workflow Setup

Create an n8n workflow with the following structure:

Webhook Trigger: Set up a webhook trigger to receive POST requests from Home Assistant
Process the payload: Your workflow should include a node to process the incoming payload from Home Assistant. This can be done using the "Set" node to extract relevant information from the incoming JSON.
Your AI/Processing Logic: Process the conversation and entity data
Return Response: Return a JSON response with your configured output field

Note: For AI Tasks, the output value should adhere to the JSON schema provided in the structure field.

Example Workflow

For a quick start, you can use the provided example workflow that demonstrates a complete integration with OpenAI's GPT model and attachment and streaming support:

📁 Simple n8n Workflow

This example workflow includes:

Webhook Trigger: Receives POST requests from Home Assistant
Extract Attachments: JavaScript code node that processes binary attachments from AI Tasks
OpenAI Integration: GPT model integration with dynamic response format (text or JSON)
AI Agent: LangChain agent that handles the conversation and processes attachments
Response Handler: Responses are returned to Home Assistant in chunks

To use this example:

Download the workflow file
Import it into your n8n instance (Settings → Import from file)
Configure your OpenAI credentials in the OpenAI node
Update the model name to match your available OpenAI model
Activate the workflow
Copy the webhook URL and use it in your Home Assistant n8n conversation integration

Input schema

For conversations

{
  "conversation_id": "abc123",
  "user_id": "user id from ha",
  "language": "de-DE",
  "agent_id": "conversation.webhook_agent",
  "device_id": "satellite_device_id",
  "device_info": {
    "name": "Kitchen Voice Satellite",
    "manufacturer": "Raspberry Pi",
    "model": "Pi 4B"
  },
  "messages": [
    {
      "role": "assistant|system|tool_result|user",
      "content": "message content"
    }
  ],
  "query": "latest user message",
  "exposed_entities": [
    {
      "entity_id": "light.living_room",
      "name": "Living Room Light",
      "state": "on",
      "aliases": ["main light"],
      "area_id": "living_room",
      "area_name": "Living Room"
    }
  ],
  "system_prompt": "optional additional system instructions",
  "stream": false
}

For AI tasks

{
  "conversation_id": "abc123",
  "messages": [
    {
      "role": "assistant|system|tool_result|user",
      "content": "message content"
    }
  ],
  "query": "task instructions",
  "task_name": "task name",
  "system_prompt": "optional additional system instructions",
  "structure": "json schema for output",
  "binary_objects": [
    {
      "name": "filename.jpg",
      "path": "/path/to/file",
      "mime_type": "image/jpeg",
      "data": "base64_encoded_file_content"
    }
  ],
  "stream": false
}

For STT (Speech-to-Text)

{
  "audio": {
    "name": "audio.wav",
    "path": "/path/to/audio.wav",
    "mime_type": "audio/wav",
    "data": "base64_encoded_audio_content"
  },
  "language": "en-US"
}

Note

For conversations: The device_id and device_info fields are only set when the conversation was initiated via a voice satellite. The language field contains the language code (e.g., "de-DE") configured for the conversation. The agent_id field contains the entity ID of the conversation agent.

For AI tasks: The binary_objects field is only included when attachments are present in the AI task. The structure field is only included when a JSON schema is provided by the action call. The task_name field is only included for AI tasks when provided by the action call. Each attachment is converted to base64 format and includes metadata such as filename, file path, and MIME type.

For TTS: The voice field is only included when a specific voice is requested and the TTS service has been configured with available voices. The webhook should return audio data with an appropriate Content-Type header (e.g., "audio/wav" or "audio/mp3").

For STT: The audio data is automatically converted to the appropriate format and encoded as base64. The webhook should return a JSON response with the transcribed text in the configured output field (default: "output").

Custom Request Fields (JSON)

Each subentry (Conversation Agent, AI Task, TTS, STT) supports Custom Request Fields (JSON). This is an optional JSON object that is merged into the outgoing request body.

Values that are strings are rendered as Home Assistant templates.
Template variables include payload (the current request body) plus all existing top-level keys (e.g. query, conversation_id, language, text, audio, etc.).
If a custom field key already exists in the request, it is ignored (the built-in field wins).

Example:

{
  "source": "home_assistant",
  "now": "{{ now().isoformat() }}",
  "user": "{{ user_id }}",
  "agent": "{{ agent_id }}",
  "query_length": "{{ query | length if query is defined else 0 }}"
}

Authentication

The webhook conversation integration supports basic HTTP authentication for secure communication with your webhook endpoints. This ensures that only authorized requests can access your n8n workflows or other webhook services.

Configuration

To enable basic HTTP authentication:

In the integration configuration, provide:
- Username: Your HTTP authentication username
- Password: Your HTTP authentication password
The integration will automatically include the proper authentication headers in all requests to your webhook URLs

n8n Authentication Setup

For n8n workflows, you can secure your webhook endpoints by:

In your n8n workflow:
- Open the Webhook Trigger node
- Go to the "Settings" tab
- Under "Authentication", select "Basic Auth"
- Set your desired username and password via the credential property
In Home Assistant:
- Use the same username and password in your webhook conversation integration configuration
- The integration will automatically authenticate with your secured n8n webhook

Important

Basic HTTP authentication credentials are transmitted with every request. Always use HTTPS to ensure credentials are encrypted in transit.

Usage

Voice Assistant Pipeline Setup

To use the n8n conversation agent with voice assistants, you need to create a voice assistant pipeline:

Go to Settings → Voice assistants
Click Add Assistant
Configure your pipeline:
- Name: Give your pipeline a descriptive name (e.g., "Webhook Assistant")
- Language: Select your preferred language
- Speech-to-text: Choose your preferred STT engine (e.g., Whisper, Google Cloud, or your webhook STT service)
- Conversation agent: Select your webhook conversation agent from the dropdown
- Text-to-speech: Choose your preferred TTS engine (e.g., Google Translate, Piper, or your webhook TTS service)
- Wake word: Optionally configure a wake word engine
Click Create to save your pipeline
Set this pipeline as the default for voice assistants or assign it to specific devices

Response Streaming

The webhook conversation integration supports optional response streaming for real-time conversation responses. When enabled, responses are streamed as they are generated, providing a more natural and responsive conversation experience.

How Response Streaming Works

When response streaming is enabled:

Real-time Updates: Responses appear in real-time as they are generated by your webhook endpoint
Improved User Experience: Users see responses being typed out naturally, similar to ChatGPT-style interfaces
Better Performance: No need to wait for the complete response before displaying it to the user

Webhook Response Format for Streaming

When streaming is enabled, your webhook endpoint should return responses in a streaming format instead of a single JSON response. The expected format is:

{"type": "item", "content": "First part of the response"}
{"type": "item", "content": " continues here"}
{"type": "item", "content": " and more content"}
{"type": "end"}

Example n8n Streaming Setup

To implement streaming in your n8n workflow:

Configure Webhook Node: Set the response mode to "Streaming"
Configure Agent Node: Enable streaming in the agent node settings

Attachment Support

The webhook conversation integration supports file attachments in AI Tasks, allowing you to send images, documents, and other files to your n8n workflows for processing.

How Attachments Work

When an AI Task includes attachments, they are automatically:

Read from the file system
Encoded as base64 strings
Included in the binary_objects field of the webhook payload

Attachment Data Structure

Each attachment in the binary_objects array contains:

name: The filename or media content ID
path: The full file path on the system
mime_type: The MIME type of the file (e.g., "image/jpeg", "application/pdf")
data: The base64-encoded file content

Processing Attachments in n8n

In your n8n workflow, you can process attachments by:

Accessing the binary_objects array: Use {{ $json.body.binary_objects }} to access all attachments
Processing individual files: Loop through the array or access specific attachments by index
Decoding base64 data: Use the function node in the example workflow or your own custom code to decode the file content
File type handling: Use the mime_type field to determine how to process different file types

Tip

Attachment support is only available for AI Tasks, not regular conversation messages. Make sure your n8n workflow can handle payloads both with and without the binary_objects field.

Speech-to-Text (STT) Support

The webhook conversation integration includes support for custom Speech-to-Text services through webhooks, allowing you to use external STT engines like OpenAI's Whisper API, Google Cloud Speech-to-Text, or custom speech recognition solutions.

How STT Works

When configured, the STT webhook integration:

Receives audio data: Home Assistant captures voice input from microphones or voice satellites
Processes via webhook: Your webhook endpoint receives the audio data and converts it to text
Returns transcribed text: The webhook returns the transcribed text in JSON format
Integrates with conversation: The transcribed text is passed to your conversation agent for processing

STT Configuration

When adding an STT subentry, you can configure:

Webhook URL: The endpoint that will handle speech-to-text transcription requests
Supported Languages: List of language codes your STT service supports (e.g., "en-US", "de-DE", "fr-FR")
Output Field: The field name in the webhook response containing the transcribed text (default: "output")
Timeout: How long to wait for transcription (default: 30 seconds)
Authentication: HTTP basic authentication for securing your webhook

STT Request Format

Your webhook will receive POST requests with this JSON payload:

{
  "audio": {
    "name": "audio.wav",
    "path": "/path/to/audio.wav",
    "mime_type": "audio/wav",
    "data": "base64_encoded_audio_content"
  },
  "language": "en-US"
}

STT Response Format

Your webhook should return a JSON response with the transcribed text:

{
  "output": "Hello, this is the transcribed text from the audio"
}

Usage in Voice Assistants

Once configured, your STT webhook service will appear in Home Assistant's STT service list and can be used:

Voice Assistant Pipelines: Select your webhook STT service in voice assistant pipeline configuration
Voice Satellites: Use with Wyoming satellite devices or other voice input devices
Mobile Apps: Compatible with Home Assistant mobile app voice input

Tip

The integration automatically converts raw audio streams to properly formatted WAV files with headers before encoding to base64. This ensures compatibility with most external STT services that expect standard audio file formats.

Text-to-Speech (TTS) Support

The webhook conversation integration includes support for custom Text-to-Speech services through webhooks, allowing you to use external TTS engines like OpenAI's TTS API, ElevenLabs, or custom voice synthesis solutions.

How TTS Works

When configured, the TTS webhook integration:

Receives TTS requests: Home Assistant sends text that needs to be synthesized to speech
Processes via webhook: Your webhook endpoint processes the text and generates audio
Returns audio data: The webhook returns audio data in WAV or MP3 format
Plays in Home Assistant: The audio is played through Home Assistant's audio system

TTS Configuration

When adding a TTS subentry, you can configure:

Webhook URL: The endpoint that will handle TTS synthesis requests
Supported Languages: List of language codes your TTS service supports (e.g., "en-US", "de-DE", "fr-FR")
Voices: Optional list of available voice names for different speaking styles
Timeout: How long to wait for audio generation (default: 30 seconds)
Authentication: HTTP basic authentication for securing your webhook

TTS Request Format

Your webhook will receive POST requests with this JSON payload:

{
  "text": "Hello, this is the text to be synthesized",
  "language": "en-US",
  "voice": "optional_voice_name"
}

TTS Response Format

Your webhook should return audio data with the appropriate Content-Type header:

Content-Type: Must be audio/wav or audio/mp3
Body: Raw audio data in the specified format

Usage in Voice Assistants

Once configured, your TTS webhook service will appear in Home Assistant's TTS service list and can be used:

Voice Assistant Pipelines: Select your webhook TTS service in voice assistant pipeline configuration
TTS Service Calls: Use the tts.speak service with your webhook TTS entity
Media Players: The generated audio can be played on any media player device

Supported Audio Formats

The TTS webhook integration supports:

WAV: Uncompressed audio format (audio/wav)
MP3: Compressed audio format (audio/mp3)

Tip

For best performance, consider using MP3 format to reduce bandwidth usage, especially for longer text synthesis. Make sure your webhook endpoint sets the correct Content-Type header to match the audio format being returned.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Support

🐛 Report issues
💬 GitHub Discussions

License

This project is licensed under the MIT License - see the LICENSE.md file for details.

HonzaBejvl/webhook-conversation

Webhook Conversation

Features

Quick Start

Installation

HACS (Recommended)

Manual Installation

Configuration

Home Assistant Setup

Step 1: Create the Integration Entry

Step 2: Add Conversation Agents and AI Tasks

n8n Workflow Setup

Example Workflow

Input schema

For conversations

For AI tasks

For STT (Speech-to-Text)

Custom Request Fields (JSON)

Authentication

Configuration

n8n Authentication Setup

Usage

Voice Assistant Pipeline Setup

Response Streaming

How Response Streaming Works

Webhook Response Format for Streaming

Example n8n Streaming Setup

Attachment Support

How Attachments Work

Attachment Data Structure

Processing Attachments in n8n

Speech-to-Text (STT) Support

How STT Works

STT Configuration

STT Request Format

STT Response Format

Usage in Voice Assistants

Text-to-Speech (TTS) Support

How TTS Works

TTS Configuration

TTS Request Format

TTS Response Format

Usage in Voice Assistants

Supported Audio Formats

Contributing

Support

License

On this page

Languages

Contributors