Stream AI chats using Django in 5 minutes (OpenAI and Anthropic) 💧

Published: April 7, 2024

I'll show you the fastest way to send LLM completions from Anthropic or OpenAI to the browser in real time using Django.

No third-party packages needed:

  • We'll use Django's inbuilt StreamingHttpResponse to send server-sent events, plus minimal vanilla JavaScript in a simple template.

Streaming completions from LLMs to the browser immediately:

  • We'll show results to users as soon as they are generated, rather than waiting for the entire completion, using Django's server-sent events (SSE).
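
If you haven't used StreamingHttpResponse for server-sent events before, here's the core idea as a minimal, standalone sketch (the view name and the "tick" messages are placeholders, not part of the app we're about to build):

# views.py (sketch only): yield "data: ...\n\n" strings and Django streams
# them to the browser as they are produced.
import time
from typing import Iterator

from django.http import StreamingHttpResponse

def sse_demo(request) -> StreamingHttpResponse:
    def event_stream() -> Iterator[str]:
        for i in range(3):
            # Each server-sent event is a "data: ..." line followed by a blank line.
            yield f"data: tick {i}\n\n"
            time.sleep(1)

    return StreamingHttpResponse(event_stream(), content_type="text/event-stream")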

Here's how our finished app will look:

This technique is surprisingly simple. We'll aim to do this in under 5 minutes.

Let's begin 🚀

Set up your Django app

  • Run this in the terminal:
pip install --upgrade django python-dotenv anthropic openai
django-admin startproject core .
python manage.py startapp sim
  • Add our app sim to the INSTALLED_APPS in settings.py:
# settings.py
INSTALLED_APPS = [
    'sim',
    ...
]

Add your environment variables

Create a file called .env at core/.env. We'll use this to store our environment variables, which we won't commit to version control.

  • Add your API keys to the .env file. You can get your API keys from the Anthropic and OpenAI websites.
ANTHROPIC_API_KEY=<your_anthropic_api_key>
OPENAI_API_KEY=<your_openai_api_key>
  • Add the following to the top of your settings.py file to load your environment variables from the .env file:
from pathlib import Path
import os

from dotenv import load_dotenv

load_dotenv()  # Loads the variables from your .env file into the environment.
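
Side note: the Anthropic() and OpenAI() clients read ANTHROPIC_API_KEY and OPENAI_API_KEY from the environment automatically, so nothing else is needed in settings.py. If you'd like to fail fast when a key is missing, an optional check like this sketch (not required for the rest of the tutorial) can sit just after load_dotenv():

# settings.py (optional sketch): raise at startup if an expected key wasn't loaded.
import os

for key in ("ANTHROPIC_API_KEY", "OPENAI_API_KEY"):
    if not os.environ.get(key):
        raise RuntimeError(f"Missing environment variable: {key}")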

Create your Django view to stream the LLM completions to the browser

  • Add the following code to sim/views.py:
from typing import Iterator

from anthropic import Anthropic
from django.http import HttpResponse, StreamingHttpResponse
from django.shortcuts import render
from openai import OpenAI


def index(request) -> HttpResponse:
    return render(request, 'index.html')


def generate_completion(request, user_prompt: str) -> StreamingHttpResponse:
    """
    This func returns a streaming response that will be used to stream the completion back to the client.
    We specify the LLM stream that we want to use in the `is_using` variable.
    You could easily modify this to choose the LLM in your request.
    """
    is_using = "anthropic"  # or "openai"

    def complete_with_anthropic() -> Iterator[str]:
        """
        Stream an Anthropic completion back to the client.
        Docs: https://docs.anthropic.com/claude/reference/messages-streaming
        """
        anthropic_client = Anthropic()
        with anthropic_client.messages.stream(
            max_tokens=1024,
            system="You turn anything that I say into a funny, jolly, rhyming poem. "
                   "Add emojis occasionally.",
            messages=[
                {"role": "user", "content": user_prompt},
            ],
            model="claude-3-opus-20240229",
        ) as stream:
            for text in stream.text_stream:
                content = text
                if content is not None:
                    # We tidy the content for showing in the browser:
                    content = content.replace("\n", "")
                    content = content.replace(",", ", ")
                    content = content.replace(".", ". ")
                    yield f"data: {content}\n\n"  # We yield the content in the text/event-stream format.
                    print(text, end="", flush=True)

    def complete_with_openai() -> Iterator[str]:
        """
        Stream an OpenAI completion back to the client.
        Docs: https://platform.openai.com/docs/api-reference/streaming
        """
        openai_client = OpenAI()
        stream = openai_client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[
                {
                    "role": "system",
                    "content": "You turn anything that I say into a funny, jolly, rhyming poem. "
                               "Add emojis occasionally.",
                },
                {"role": "user", "content": user_prompt},
            ],
            stream=True,
        )
        for chunk in stream:
            content = chunk.choices[0].delta.content
            if content is not None:
                # We tidy the content for showing in the browser:
                content = content.replace("\n", "")
                content = content.replace(",", ", ")
                content = content.replace(".", ". ")
                yield f"data: {content}\n\n"  # We yield the content in the text/event-stream format.

    # We select our chosen completion func.
    if is_using == "openai":
        completion_func = complete_with_openai
    elif is_using == "anthropic":
        completion_func = complete_with_anthropic
    else:
        raise ValueError(f"Unknown completion service: {is_using}")

    return StreamingHttpResponse(completion_func(), content_type="text/event-stream")
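
The docstring above mentions choosing the LLM per request. One way to do that (a quick sketch, not part of the tutorial code) is to read a query parameter instead of hard-coding is_using:

# sim/views.py (sketch): pick the provider from the query string,
# e.g. /generate-completion/hello?llm=openai (defaults to "anthropic").
def generate_completion(request, user_prompt: str) -> StreamingHttpResponse:
    is_using = request.GET.get("llm", "anthropic")
    # ...the rest of the view stays the same

On the frontend, you'd then append ?llm=openai (or ?llm=anthropic) to the EventSource URL.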

Once we've added the template and URLs in the next two steps, you should see the completions streaming in real time, like in the video below:

Create your Django template to display the LLM results to the user in the browser

  • Create a new folder at sim/templates
  • Add a new file called index.html inside it, containing the following code:
<!doctype html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <title>Stream LLM completion with Django</title>
    <style>
      .container {
        display: flex;
        flex-direction: column;
        align-items: center;
        text-align: center;
        font-family: Arial, sans-serif;
      }
      .heading {
        font-size: 24px;
        margin-bottom: 20px;
      }
      .btn {
        background-color: #ffcccc;
        color: black;
        padding: 10px 20px;
        border: none;
        border-radius: 20px;
        cursor: pointer;
      }
      .btn:hover {
        background-color: #ff9999;
      }
      #prompt-input {
        width: 80%;
        padding: 10px;
        border-radius: 5px;
        border: 1px solid #ccc;
        margin-bottom: 15px;
      }
      #completion-text {
        border-radius: 5px;
        width: 80%;
        overflow-y: scroll;
      }
    </style>
  </head>
  <body>
    <div class="container">
      <p class="heading">Stream data from an LLM</p>
      <div id="completion-text"></div>
      <input
        id="prompt-input"
        type="text"
        placeholder="Enter your text"
        required
      />
      <button class="btn" onclick="startSSE()">Generate</button>
    </div>
    <script>
      let eventSource
      const sseData = document.getElementById('completion-text')
      const promptInput = document.getElementById('prompt-input')

      function startSSE() {
        const prompt = promptInput.value
        if (!prompt) {
          alert('Please enter a prompt')
          return
        }
        const urlEncoded = encodeURIComponent(prompt)
        const url = `generate-completion/${urlEncoded}`
        eventSource = new EventSource(url)
        eventSource.onopen = () => {
          console.log('Connection to server opened')
        }
        eventSource.onmessage = (event) => {
          console.log('event.data = ', event.data)
          sseData.innerHTML += event.data
        }
        eventSource.onerror = () => {
          // The server closes the connection when the completion finishes;
          // close the EventSource too, or it will reconnect and repeat the request.
          eventSource.close()
        }
      }
    </script>
  </body>
</html>

💡 Side note: here's a video of me using my product, Photon Designer, to generate the above HTML 💡

-> Let's get back to building 🚀

Update your URLs

  • In core/urls.py, add the following code:
from django.contrib import admin
from django.urls import path, include

urlpatterns = [
    path('admin/', admin.site.urls),
    path('', include('sim.urls')),
]
  • Create a file at sim/urls.py and add the following code:
from django.urls import path

from . import views

urlpatterns = [
    path('', views.index, name='index'),
    path('generate-completion/<str:user_prompt>', views.generate_completion, name='generate-completion'),
]
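
If you want to sanity-check the new route before touching the page, here's a quick sketch you can run inside python manage.py shell (the "hello" prompt is just an example):

# Confirm the named route resolves to the URL the template's JavaScript will call.
from django.urls import reverse

print(reverse("generate-completion", args=["hello"]))
# -> /generate-completion/hello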

Run your Django app

  • Run this in the terminal:
python manage.py runserver
  • Visit http://localhost:8000/ in your browser to see the completions streaming in real time.
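
You can also watch the raw event stream without a browser. Here's an optional sketch (not required for the tutorial; the prompt is just an example) that reads the SSE response directly from the dev server while it's running:

# check_stream.py (sketch): prints the streamed completion as it arrives.
from urllib.request import urlopen

url = "http://localhost:8000/generate-completion/a%20happy%20dog"
with urlopen(url) as response:
    for raw_line in response:
        line = raw_line.decode("utf-8").rstrip("\n")
        if line.startswith("data: "):
            print(line[len("data: "):], end="", flush=True)
print()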

Complete - you can now stream your LLM completions to the browser using Django ✅

Congrats. You've successfully set up a Django app to stream LLM completions to the browser in real time, using server-sent events with Django's inbuilt StreamingHttpResponse.

You've added a new technique to your programming toolbelt 🙂

If you'd like to see another simple guide on server-sent events (SSE) and Django, check out my guide: The simplest way to add server sent events to Django.

P.S. Want to ship better features with AI?
Join my free weekly newsletter

Each week, I share bite-sized learnings and AI news that matters - so you can build better software in practice.

No spam guaranteed · Unsubscribe whenever