Stream AI chats using Django in 5 minutes (OpenAI and Anthropic) πŸ’§

Published: April 7, 2024

I'll show you the fastest way to send LLM completions from Anthropic or OpenAI to the browser in real time using Django.

No third-party packages needed:

  • We'll use Django's built-in StreamingHttpResponse to send server-sent events, plus minimal vanilla JavaScript in a simple template.

Streaming completions from LLMs to the browser immediately:

  • We'll show results to users as soon as they are generated, rather than waiting for the entire completion, using server-sent events (SSE).

Here's how our finished app will look: a text box for your prompt, a Generate button, and the completion streaming in word by word.

This technique is surprisingly simple. We'll aim to do this in under 5 minutes.

Let's begin πŸš€

Set up your Django app

  • Run this in the terminal:
pip install --upgrade django python-dotenv anthropic openai
django-admin startproject core .
python manage.py startapp sim
  • Add our app sim to the INSTALLED_APPS in settings.py:
# settings.py
INSTALLED_APPS = [
    'sim',
    ...
]

Add your environment variables

Create a file called .env at "core/.env". We'll use it to store our environment variables, which we won't commit to version control.

  • Add your API keys to the .env file. You can get them from the Anthropic and OpenAI websites.
ANTHROPIC_API_KEY=<your_anthropic_api_key>
OPENAI_API_KEY=<your_openai_api_key>

  • Add the following to the top of your settings.py file to load your environment variables from the .env file:
# settings.py
from dotenv import load_dotenv

load_dotenv()  # loads the variables from core/.env
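
A note on how the keys get used: once load_dotenv() has run, the Anthropic and OpenAI clients read ANTHROPIC_API_KEY and OPENAI_API_KEY from the environment automatically, so you never pass them explicitly in the view below. If the .env file isn't picked up (load_dotenv() searches upward from the directory of the calling file), you can point it at the path directly. A minimal sketch, assuming the default BASE_DIR that startproject generates:

# settings.py - optional: load core/.env by explicit path instead of auto-discovery.
from pathlib import Path
from dotenv import load_dotenv

BASE_DIR = Path(__file__).resolve().parent.parent  # already defined by startproject
load_dotenv(BASE_DIR / "core" / ".env")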

Create your Django view to stream the LLM completions to the browser

  • Add the following code to sim/views.py:
from django.http import HttpResponse, StreamingHttpResponse
from django.shortcuts import render
from openai import OpenAI
from typing import Iterator
from anthropic import Anthropic

def index(request) -> HttpResponse:
    return render(request, 'index.html')

def generate_completion(request, user_prompt: str) -> StreamingHttpResponse:
    """
    Return a streaming response that streams the completion back to the client.

    We specify which LLM to use in the `is_using` variable. You could easily
    modify this to choose the LLM in the request itself (see the sketch after
    this code block).
    """

    is_using = "anthropic"  # or "openai"

    def complete_with_anthropic() -> Iterator[str]:
        """
        Stream an anthropic completion back to the client.
        Docs: https://docs.anthropic.com/claude/reference/messages-streaming
        """
        anthropic_client = Anthropic()
        with anthropic_client.messages.stream(
                max_tokens=1024,
                system="You turn anything that I say into a funny, jolly, rhyming poem. "
                       "Add emojis occasionally.",
                messages=[
                    {"role": "user", "content": user_prompt},
                ],
                model="claude-3-opus-20240229",
        ) as stream:
            for text in stream.text_stream:
                content = text
                if content is not None:
                    # We tidy the content for showing in the browser:
                    content = content.replace("\n", "")
                    content = content.replace(",", ", ")
                    content = content.replace(".", ". ")

                    yield f"data: {content}\n\n"  # We yield the content in the text/event-stream format.

                print(text, end="", flush=True)

    def complete_with_openai() -> Iterator[str]:
        """
        Stream an openai completion back to the client.
        Docs: https://platform.openai.com/docs/api-reference/streaming
        """
        openai_client = OpenAI()
        stream = openai_client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[
                {
                    "role": "system",
                    "content": "You turn anything that I say into a funny, jolly, rhyming poem. "
                               "Add emojis occasionally.",
                },
                {"role": "user",
                 "content": user_prompt
                 },
            ],
            stream=True,
        )
        for chunk in stream:
            content = chunk.choices[0].delta.content
            if content is not None:
                # We tidy the content for showing in the browser:
                content = content.replace("\n", "")
                content = content.replace(",", ", ")
                content = content.replace(".", ". ")

                yield f"data: {content}\n\n"  # We yield the content in the text/event-stream format.

    # We select our chosen completion func.
    if is_using == "openai":
        completion_func = complete_with_openai
    elif is_using == "anthropic":
        completion_func = complete_with_anthropic
    else:
        raise ValueError(f"Unknown completion service: {is_using}")

    return StreamingHttpResponse(completion_func(), content_type="text/event-stream")
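
As the docstring mentions, you could let the request choose the provider instead of hard-coding is_using. A minimal sketch, using a query parameter name (provider) that I'm introducing here rather than something from the original code:

    # Read the provider from the query string, e.g.
    # /generate-completion/hello?provider=openai (falls back to anthropic):
    is_using = request.GET.get("provider", "anthropic")

On the browser side you'd then append ?provider=openai (or ?provider=anthropic) to the URL passed to EventSource.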

You'll see the completions streaming in real time once the template and URLs below are in place.

Create your Django template to display the LLM results to the user in the browser

  • Create a new folder at sim/templates
  • Add a new file called index.html inside it with the following code:
<!doctype html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <title>Stream LLM completion with Django</title>
    <style>
      .container {
        display: flex;
        flex-direction: column;
        align-items: center;
        text-align: center;
        font-family: Arial, sans-serif;
      }

      .heading {
        font-size: 24px;
        margin-bottom: 20px;
      }

      .btn {
        background-color: #ffcccc;
        color: black;
        padding: 10px 20px;
        border: none;
        border-radius: 20px;
        cursor: pointer;
      }

      .btn:hover {
        background-color: #ff9999;
      }

      #prompt-input {
        width: 80%;
        padding: 10px;
        border-radius: 5px;
        border: 1px solid #ccc;
        margin-bottom: 15px;
      }

      #completion-text {
        border-radius: 5px;
        width: 80%;
        overflow-y: scroll;
      }
    </style>
  </head>

  <body>
    <div class="container">
      <p class="heading">Stream data from an LLM</p>
      <div id="completion-text"></div>

      <input
        id="prompt-input"
        type="text"
        placeholder="Enter your text"
        required
      />
      <button class="btn" onclick="startSSE()">
        Generate
      </button>
    </div>

    <script>
      let eventSource
      const sseData = document.getElementById('completion-text')
      const promptInput = document.getElementById('prompt-input')

      function startSSE() {
        const prompt = promptInput.value
        if (!prompt) {
          alert('Please enter a prompt')
          return
        }

        // Close any previous stream so repeated clicks don't interleave output.
        if (eventSource) {
          eventSource.close()
        }

        const urlEncoded = encodeURIComponent(prompt)
        const url = `generate-completion/${urlEncoded}`

        eventSource = new EventSource(url)

        eventSource.onopen = () => {
          console.log('Connection to server opened')
        }

        eventSource.onmessage = (event) => {
          console.log('event.data = ', event.data)
          sseData.innerHTML += event.data
        }

        // When the server closes the stream, EventSource would reconnect and
        // re-run the completion, so close the connection when the stream ends.
        eventSource.onerror = () => {
          eventSource.close()
        }
      }
    </script>
  </body>
</html>
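
You don't need to register the template anywhere: the TEMPLATES setting that startproject generates has APP_DIRS enabled, so Django looks inside each installed app's templates/ folder and finds sim/templates/index.html on its own. For reference, the relevant default (no edit needed):

# settings.py (generated by startproject - shown for reference only)
TEMPLATES = [
    {
        'BACKEND': 'django.template.backends.django.DjangoTemplates',
        'DIRS': [],
        'APP_DIRS': True,  # finds <app>/templates/, e.g. sim/templates/index.html
        'OPTIONS': {
            # ... default context processors unchanged
        },
    },
]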

πŸ’‘ Side note: I used my product, Photon Designer, to generate the above HTML code πŸ’‘

-> Let's get back to building πŸš€

Update your urls

  • In core/urls.py, add the following code:
from django.contrib import admin
from django.urls import path, include

urlpatterns = [
    path('admin/', admin.site.urls),
    path('', include('sim.urls')),
]
  • Create a file at sim/urls.py and add the following code:
from django.urls import path

from . import views

urlpatterns = [
    path('', views.index, name='index'),
    path('generate-completion/<str:user_prompt>', views.generate_completion, name='generate-completion')
]
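
One caveat with this route: the <str:user_prompt> converter doesn't match / characters, so a prompt containing a slash may 404 even though the JavaScript percent-encodes it. If you need to support those, here's a sketch using Django's path converter instead (a deviation from the guide's pattern above):

# sim/urls.py - alternative: <path:...> also matches "/" characters.
urlpatterns = [
    path('', views.index, name='index'),
    path('generate-completion/<path:user_prompt>', views.generate_completion, name='generate-completion'),
]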

Run your Django app

  • Run this in the terminal:
python manage.py runserver
  • Visit http://localhost:8000/ in your browser to see the completions streaming in real-time.
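
If you'd like to sanity-check the stream without a browser, you can read the event stream from a short script. A sketch, assuming the dev server is running on port 8000 and that you have the requests package installed (it isn't one of this guide's dependencies):

# stream_check.py - print each SSE chunk as it arrives.
import requests

url = "http://localhost:8000/generate-completion/a%20cat%20in%20a%20hat"
with requests.get(url, stream=True) as response:
    for line in response.iter_lines(decode_unicode=True):
        if line and line.startswith("data: "):
            print(line[len("data: "):], end="", flush=True)
print()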

Complete - you can now stream your LLM completions to the browser using Django βœ…

Congrats. You've successfully set up a Django app to stream LLM completions to the browser in real-time, using Django's inbuilt server-sent events.

You've added a new technique to your programming toolbelt πŸ™‚

If you'd like to see another simple guide on server-sent events (SSE) and Django, check out my guide: The simplest way to add server sent events to Django.
