Stream AI chats using Django in 5 minutes (OpenAI and Anthropic)
I'll show you the fastest way to send LLM completions from Anthropic or OpenAI to the browser in real time using Django.
No third-party packages needed for the streaming itself:
- We'll use Django's built-in StreamingHttpResponse to send server-sent events (SSE), plus minimal vanilla JavaScript in a simple template.
Streaming completions from LLMs to the browser immediately:
- We'll show results to users as soon as they're generated, rather than waiting for the entire completion, by sending server-sent events (SSE) from Django.
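If it helps to see the core idea in isolation first, here's a minimal sketch (a hypothetical demo_stream view, separate from the app we build below) of how a generator plus StreamingHttpResponse produces a server-sent event stream:

import time

from django.http import StreamingHttpResponse


def demo_stream(request):
    # Each yielded "data: <payload>\n\n" message reaches the browser as soon as it is produced.
    def event_stream():
        for word in ["Streaming", "words", "one", "at", "a", "time"]:
            yield f"data: {word}\n\n"
            time.sleep(0.5)  # simulate slow generation

    return StreamingHttpResponse(event_stream(), content_type="text/event-stream")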
Here's how our finished app will look:
This technique is surprisingly simple. We'll aim to do this in under 5 minutes.
Let's begin.
Set up your Django app
- Run this in the terminal:
pip install --upgrade django python-dotenv anthropic openai
django-admin startproject core .
python manage.py startapp sim
- Add our app sim to the INSTALLED_APPS in settings.py:

# settings.py
INSTALLED_APPS = [
    'sim',
    ...
]
Add your environment variables
Create a file called .env at "core/.env". We'll use this to store our environment variables, which we won't commit to version control.
- Add your API keys to the .env file. You can get your API keys from the Anthropic and OpenAI websites.

ANTHROPIC_API_KEY=<your_anthropic_api_key>
OPENAI_API_KEY=<your_openai_api_key>
- Add the following to the top of your settings.py file to load your environment variables from the .env file:

from pathlib import Path
import os
from dotenv import load_dotenv

load_dotenv()
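Optional sanity check (my suggestion, not a required step): the Anthropic and OpenAI Python clients read ANTHROPIC_API_KEY and OPENAI_API_KEY from the environment by default, so you can confirm the .env file is being loaded by reading the variables back in a Django shell:

# Run: python manage.py shell
import os

# Each should print your key; None means the .env file isn't being loaded.
print(os.environ.get("ANTHROPIC_API_KEY"))
print(os.environ.get("OPENAI_API_KEY"))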
Create your Django view to stream the LLM completions to the browser
- Add the following code to sim/views.py:

from django.http import HttpResponse, StreamingHttpResponse
from django.shortcuts import render
from openai import OpenAI
from typing import Iterator
from anthropic import Anthropic


def index(request) -> HttpResponse:
    return render(request, 'index.html')


def generate_completion(request, user_prompt: str) -> StreamingHttpResponse:
    """
    This func returns a streaming response that will be used to stream the
    completion back to the client.

    We specify the LLM stream that we want to use in the `is_using` variable.
    You could easily modify this to choose the LLM in your request.
    """
    is_using = "anthropic"  # or "openai"

    def complete_with_anthropic() -> Iterator[str]:
        """
        Stream an anthropic completion back to the client.
        Docs: https://docs.anthropic.com/claude/reference/messages-streaming
        """
        anthropic_client = Anthropic()
        with anthropic_client.messages.stream(
            max_tokens=1024,
            system="You turn anything that I say into a funny, jolly, rhyming poem. "
                   "Add emojis occasionally.",
            messages=[
                {"role": "user", "content": user_prompt},
            ],
            model="claude-3-opus-20240229",
        ) as stream:
            for text in stream.text_stream:
                content = text
                if content is not None:
                    # We tidy the content for showing in the browser:
                    content = content.replace("\n", "")
                    content = content.replace(",", ", ")
                    content = content.replace(".", ". ")
                    yield f"data: {content}\n\n"  # We yield the content in the text/event-stream format.
                print(text, end="", flush=True)

    def complete_with_openai() -> Iterator[str]:
        """
        Stream an openai completion back to the client.
        Docs: https://platform.openai.com/docs/api-reference/streaming
        """
        openai_client = OpenAI()
        stream = openai_client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[
                {
                    "role": "system",
                    "content": "You turn anything that I say into a funny, jolly, rhyming poem. "
                               "Add emojis occasionally.",
                },
                {"role": "user", "content": user_prompt},
            ],
            stream=True,
        )
        for chunk in stream:
            content = chunk.choices[0].delta.content
            if content is not None:
                # We tidy the content for showing in the browser:
                content = content.replace("\n", "")
                content = content.replace(",", ", ")
                content = content.replace(".", ". ")
                yield f"data: {content}\n\n"  # We yield the content in the text/event-stream format.

    # We select our chosen completion func.
    if is_using == "openai":
        completion_func = complete_with_openai
    elif is_using == "anthropic":
        completion_func = complete_with_anthropic
    else:
        raise ValueError(f"Unknown completion service: {is_using}")

    return StreamingHttpResponse(completion_func(), content_type="text/event-stream")
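The docstring above mentions choosing the LLM per request rather than hard-coding it. As a small sketch of one way to do that (my variation, not part of the original code), you could replace the is_using line with a query-parameter lookup:

# e.g. /generate-completion/hello?provider=openai
# Falls back to "anthropic" when no provider is given; unknown values still hit the ValueError below.
is_using = request.GET.get("provider", "anthropic")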
- Once the URLs are wired up (see the "Update your urls" section below), you can check the stream by visiting http://localhost:8000/generate-completion/hello in your browser.
You should see the completion streaming in, word by word, in real time.
Create your Django template to display the LLM results to the user in the browser
- Create a new folder at sim/templates, add a new file called index.html inside it, and add the following code. (Django finds this template automatically, since the default TEMPLATES setting has APP_DIRS enabled and sim is in INSTALLED_APPS.)
<!doctype html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <title>Stream LLM completion with Django</title>
    <style>
      .container {
        display: flex;
        flex-direction: column;
        align-items: center;
        text-align: center;
        font-family: Arial, sans-serif;
      }
      .heading {
        font-size: 24px;
        margin-bottom: 20px;
      }
      .btn {
        background-color: #ffcccc;
        color: black;
        padding: 10px 20px;
        border: none;
        border-radius: 20px;
        cursor: pointer;
      }
      .btn:hover {
        background-color: #ff9999;
      }
      #prompt-input {
        width: 80%;
        padding: 10px;
        border-radius: 5px;
        border: 1px solid #ccc;
        margin-bottom: 15px;
      }
      #completion-text {
        border-radius: 5px;
        width: 80%;
        overflow-y: scroll;
      }
    </style>
  </head>
  <body>
    <div class="container">
      <p class="heading">Stream data from an LLM</p>
      <div id="completion-text"></div>
      <input
        id="prompt-input"
        type="text"
        placeholder="Enter your text"
        style="width: 80%; padding: 10px; border-radius: 5px; border: 1px solid #ccc; margin-bottom: 15px;"
        required
      />
      <button
        class="btn"
        style="background-color: #ffcccc; color: black; padding: 10px 20px; border: none; border-radius: 20px; cursor: pointer;"
        onclick="startSSE()"
      >
        Generate
      </button>
    </div>
    <script>
      let eventSource
      const sseData = document.getElementById('completion-text')
      const promptInput = document.getElementById('prompt-input')

      function startSSE() {
        const prompt = document.getElementById('prompt-input').value
        if (!prompt) {
          alert('Please enter a prompt')
          return
        }
        const urlEncoded = encodeURIComponent(prompt)
        const url = `generate-completion/${urlEncoded}`
        eventSource = new EventSource(url)
        eventSource.onopen = () => {
          console.log('Connection to server opened')
        }
        eventSource.onmessage = (event) => {
          console.log('event.data = ', event.data)
          sseData.innerHTML += event.data
        }
      }
    </script>
  </body>
</html>
Side note: here's a video of me generating the above HTML using my product, Photon Designer.
-> Let's get back to building.
Update your urls
- In core/urls.py, add the following code:

from django.contrib import admin
from django.urls import path, include

urlpatterns = [
    path('admin/', admin.site.urls),
    path('', include('sim.urls')),
]
- Create a file at sim/urls.py and add the following code:

from django.urls import path
from . import views

urlpatterns = [
    path('', views.index, name='index'),
    path('generate-completion/<str:user_prompt>', views.generate_completion, name='generate-completion'),
]
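Optionally, you can confirm the routes are registered as expected by resolving them by name in a Django shell (just a quick check, not a required step):

# Run: python manage.py shell
from django.urls import reverse

print(reverse('index'))  # -> /
print(reverse('generate-completion', kwargs={'user_prompt': 'hello'}))  # -> /generate-completion/hello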
Run your Django app
python manage.py runserver
- Visit http://localhost:8000/ in your browser to see the completions streaming in real time.
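- If you'd also like to inspect the raw SSE output without a browser, here's a small sketch using Django's test client (run it in a Django shell; it assumes your API key is set and will spend a few tokens):

# Run: python manage.py shell
from django.test import Client

response = Client().get("/generate-completion/hello")

# streaming_content iterates over the chunks our view yields, as bytes.
for chunk in response.streaming_content:
    print(chunk.decode(), end="")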
Complete - you can now stream your LLM completions to the browser using Django
Congrats. You've successfully set up a Django app that streams LLM completions to the browser in real time, using Django's built-in StreamingHttpResponse to send server-sent events.
You've added a new technique to your programming toolbelt.
If you'd like to see another simple guide on server-sent events (SSE) and Django, check out my guide: The simplest way to add server sent events to Django.