Stream AI chats using Django in 5 minutes (OpenAI and Anthropic)
I'll show you the fastest way to send LLM completions from Anthropic or OpenAI to the browser in real time using Django.
No third-party packages needed for the streaming itself:
- We'll use Django's built-in StreamingHttpResponse to send server-sent events (SSE), plus minimal vanilla JavaScript in a simple template.
Streaming completions from LLMs to the browser immediately:
- We'll show results to users as soon as they're generated, rather than waiting for the entire completion, by sending server-sent events (SSE) from Django.
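If it helps to see the core idea in isolation first, here's a minimal sketch (a hypothetical demo_stream view, separate from the app we build below) of how a generator plus StreamingHttpResponse produces a server-sent event stream:

import time

from django.http import StreamingHttpResponse


def demo_stream(request):
    # Each yielded "data: <payload>\n\n" message reaches the browser as soon as it is produced.
    def event_stream():
        for word in ["Streaming", "words", "one", "at", "a", "time"]:
            yield f"data: {word}\n\n"
            time.sleep(0.5)  # simulate slow generation

    return StreamingHttpResponse(event_stream(), content_type="text/event-stream")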
Here's how our finished app will look:
This technique is surprisingly simple. We'll aim to do this in under 5 minutes.
Let's begin.
Set up your Django app
- Run this in the terminal:
pip install --upgrade django python-dotenv anthropic openai
django-admin startproject core .
python manage.py startapp sim
- Add our app sim to the INSTALLED_APPS in settings.py:

# settings.py
INSTALLED_APPS = [
    'sim',
    ...
]
Add your environment variables
Create a file called .env at "core/.env". We'll use this to store our environment variables, which we won't commit to version control.
- Add your API keys to the .env file. You can get your API keys from the Anthropic and OpenAI websites.

ANTHROPIC_API_KEY=<your_anthropic_api_key>
OPENAI_API_KEY=<your_openai_api_key>
- Add the following to the top of your settings.py file to load your environment variables from the .env file:

from pathlib import Path
import os
from dotenv import load_dotenv

load_dotenv()
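Optional sanity check (my suggestion, not a required step): the Anthropic and OpenAI Python clients read ANTHROPIC_API_KEY and OPENAI_API_KEY from the environment by default, so you can confirm the .env file is being loaded by reading the variables back in a Django shell:

# Run: python manage.py shell
import os

# Each should print your key; None means the .env file isn't being loaded.
print(os.environ.get("ANTHROPIC_API_KEY"))
print(os.environ.get("OPENAI_API_KEY"))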
Create your Django view to stream the LLM completions to the browser
- Add the following code to sim/views.py:

from django.http import HttpResponse, StreamingHttpResponse
from django.shortcuts import render
from openai import OpenAI
from typing import Iterator
from anthropic import Anthropic


def index(request) -> HttpResponse:
    return render(request, 'index.html')


def generate_completion(request, user_prompt: str) -> StreamingHttpResponse:
    """
    This func returns a streaming response that will be used to stream the
    completion back to the client.

    We specify the LLM stream that we want to use in the `is_using` variable.
    You could easily modify this to choose the LLM in your request.
    """
    is_using = "anthropic"  # or "openai"

    def complete_with_anthropic() -> Iterator[str]:
        """
        Stream an anthropic completion back to the client.
        Docs: https://docs.anthropic.com/claude/reference/messages-streaming
        """
        anthropic_client = Anthropic()
        with anthropic_client.messages.stream(
            max_tokens=1024,
            system="You turn anything that I say into a funny, jolly, rhyming poem. "
                   "Add emojis occasionally.",
            messages=[
                {"role": "user", "content": user_prompt},
            ],
            model="claude-3-opus-20240229",
        ) as stream:
            for text in stream.text_stream:
                content = text
                if content is not None:
                    # We tidy the content for showing in the browser:
                    content = content.replace("\n", "")
                    content = content.replace(",", ", ")
                    content = content.replace(".", ". ")
                    yield f"data: {content}\n\n"  # We yield the content in the text/event-stream format.
                print(text, end="", flush=True)

    def complete_with_openai() -> Iterator[str]:
        """
        Stream an openai completion back to the client.
        Docs: https://platform.openai.com/docs/api-reference/streaming
        """
        openai_client = OpenAI()
        stream = openai_client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[
                {
                    "role": "system",
                    "content": "You turn anything that I say into a funny, jolly, rhyming poem. "
                               "Add emojis occasionally.",
                },
                {"role": "user", "content": user_prompt},
            ],
            stream=True,
        )
        for chunk in stream:
            content = chunk.choices[0].delta.content
            if content is not None:
                # We tidy the content for showing in the browser:
                content = content.replace("\n", "")
                content = content.replace(",", ", ")
                content = content.replace(".", ". ")
                yield f"data: {content}\n\n"  # We yield the content in the text/event-stream format.

    # We select our chosen completion func.
    if is_using == "openai":
        completion_func = complete_with_openai
    elif is_using == "anthropic":
        completion_func = complete_with_anthropic
    else:
        raise ValueError(f"Unknown completion service: {is_using}")

    return StreamingHttpResponse(completion_func(), content_type="text/event-stream")
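The docstring above mentions choosing the LLM per request rather than hard-coding it. As a small sketch of one way to do that (my variation, not part of the original code), you could replace the is_using line with a query-parameter lookup:

# e.g. /generate-completion/hello?provider=openai
# Falls back to "anthropic" when no provider is given; unknown values still hit the ValueError below.
is_using = request.GET.get("provider", "anthropic")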
- Once the URLs are wired up (see the "Update your urls" section below), you can check the stream by visiting http://localhost:8000/generate-completion/hello in your browser.
You should see the completion streaming in, word by word, in real time.
Create your Django template to display the LLM results to the user in the browser
- Create a new folder at sim/templates, add a new file called index.html inside it, and add the following code. (Django finds this template automatically, since the default TEMPLATES setting has APP_DIRS enabled and sim is in INSTALLED_APPS.)
<!doctype html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <title>Stream LLM completion with Django</title>
    <style>
      .container {
        display: flex;
        flex-direction: column;
        align-items: center;
        text-align: center;
        font-family: Arial, sans-serif;
      }
      .heading {
        font-size: 24px;
        margin-bottom: 20px;
      }
      .btn {
        background-color: #ffcccc;
        color: black;
        padding: 10px 20px;
        border: none;
        border-radius: 20px;
        cursor: pointer;
      }
      .btn:hover {
        background-color: #ff9999;
      }
      #prompt-input {
        width: 80%;
        padding: 10px;
        border-radius: 5px;
        border: 1px solid #ccc;
        margin-bottom: 15px;
      }
      #completion-text {
        border-radius: 5px;
        width: 80%;
        overflow-y: scroll;
      }
    </style>
  </head>
  <body>
    <div class="container">
      <p class="heading">Stream data from an LLM</p>
      <div id="completion-text"></div>
      <input
        id="prompt-input"
        type="text"
        placeholder="Enter your text"
        style="width: 80%; padding: 10px; border-radius: 5px; border: 1px solid #ccc; margin-bottom: 15px;"
        required
      />
      <button
        class="btn"
        style="background-color: #ffcccc; color: black; padding: 10px 20px; border: none; border-radius: 20px; cursor: pointer;"
        onclick="startSSE()"
      >
        Generate
      </button>
    </div>
    <script>
      let eventSource
      const sseData = document.getElementById('completion-text')
      const promptInput = document.getElementById('prompt-input')

      function startSSE() {
        const prompt = document.getElementById('prompt-input').value
        if (!prompt) {
          alert('Please enter a prompt')
          return
        }
        const urlEncoded = encodeURIComponent(prompt)
        const url = `generate-completion/${urlEncoded}`
        eventSource = new EventSource(url)
        eventSource.onopen = () => {
          console.log('Connection to server opened')
        }
        eventSource.onmessage = (event) => {
          console.log('event.data = ', event.data)
          sseData.innerHTML += event.data
        }
      }
    </script>
  </body>
</html>
Side note: here's a video of me generating the above HTML using my product, Photon Designer.
-> Let's get back to building.
Update your urls
- In core/urls.py, add the following code:

from django.contrib import admin
from django.urls import path, include

urlpatterns = [
    path('admin/', admin.site.urls),
    path('', include('sim.urls')),
]
- Create a file at sim/urls.py and add the following code:

from django.urls import path
from . import views

urlpatterns = [
    path('', views.index, name='index'),
    path('generate-completion/<str:user_prompt>', views.generate_completion, name='generate-completion'),
]
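Optionally, you can confirm the routes are registered as expected by resolving them by name in a Django shell (just a quick check, not a required step):

# Run: python manage.py shell
from django.urls import reverse

print(reverse('index'))  # -> /
print(reverse('generate-completion', kwargs={'user_prompt': 'hello'}))  # -> /generate-completion/hello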
Run your Django app
python manage.py runserver
- Visit http://localhost:8000/ in your browser to see the completions streaming in real time.
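- If you'd also like to inspect the raw SSE output without a browser, here's a small sketch using Django's test client (run it in a Django shell; it assumes your API key is set and will spend a few tokens):

# Run: python manage.py shell
from django.test import Client

response = Client().get("/generate-completion/hello")

# streaming_content iterates over the chunks our view yields, as bytes.
for chunk in response.streaming_content:
    print(chunk.decode(), end="")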
Complete - you can now stream your LLM completions to the browser using Django
Congrats. You've successfully set up a Django app that streams LLM completions to the browser in real time, using Django's built-in StreamingHttpResponse to send server-sent events.
You've added a new technique to your programming toolbelt.
If you'd like to see another simple guide on server-sent events (SSE) and Django, check out my guide: The simplest way to add server sent events to Django.