Using Django & AssemblyAI for More Accurate Twilio Call Transcriptions
Building a Django App to Enhance Twilio Call Transcriptions with AssemblyAI's Advanced Speech-to-Text Capabilities
Introduction
Recording phone calls is a breeze with Twilio's Programmable Voice API, but the speech-to-text accuracy can sometimes fall short, especially in niche domains like healthcare and engineering. AssemblyAI's transcription service, however, provides high accuracy by default and offers optional keyword lists for even better results. In this tutorial, we'll walk you through creating a Django application that records Twilio calls and transcribes them using AssemblyAI.
Grab your favorite beverage, and let's dive in!
Prerequisites
Before we start, make sure you have the following: Python 3, a Twilio account with a voice-capable phone number, an AssemblyAI API key, and Ngrok.
Setting Up the Project
Step 1: Create a Virtual Environment
First, let's create a virtual environment to keep our dependencies isolated:
virtualenv venv
source venv/bin/activate
On Windows, activate with venv\Scripts\activate instead. These commands create a new virtual environment named venv and activate it.
Step 2: Install Dependencies
Next, we'll install Django, requests, and Twilio:
pip install django requests twilio
Step 3: Start a Django Project
Let's create a new Django project named djtranscribe and, inside it, an app named caller:
django-admin startproject djtranscribe
cd djtranscribe
python manage.py startapp caller
Now we have a basic Django project with an app named caller.
Project Configuration
Step 4: Update urls.py
Open djtranscribe/urls.py and update it to include the caller app URLs:
from django.contrib import admin
from django.urls import include, path

urlpatterns = [
    path('', include('caller.urls')),
    path('admin/', admin.site.urls),
]
Step 5: Update settings.py
Open djtranscribe/settings.py and add the caller app to the INSTALLED_APPS list. Also, read the project settings from environment variables:
import os
INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'caller',
]
BASE_URL = os.getenv("BASE_URL")
TWIML_INSTRUCTIONS_URL = f"{BASE_URL}/record/"
TWILIO_PHONE_NUMBER = os.getenv("TWILIO_PHONE_NUMBER")
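Because BASE_URL comes from the environment, a missing value silently turns TWIML_INSTRUCTIONS_URL into the string "None/record/". The sketch below shows one way to fail early instead. The helper name build_twiml_url is hypothetical and not part of the original project.

```python
import os


def build_twiml_url(env=os.environ):
    """Return the TwiML instructions URL derived from BASE_URL.

    Hypothetical helper: raises instead of producing "None/record/".
    """
    base_url = env.get("BASE_URL")
    if not base_url:
        raise RuntimeError("BASE_URL is not set; export it before starting Django")
    # Strip a trailing slash so the result never contains "//record/".
    return f"{base_url.rstrip('/')}/record/"
```

In settings.py this could replace the f-string, for example TWIML_INSTRUCTIONS_URL = build_twiml_url(), if you prefer a loud failure over a broken URL.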
Step 6: Create caller/urls.py
In the caller directory, create a file named urls.py and add the following routes:
# django.conf.urls.url was removed in Django 4.0; re_path is its replacement.
from django.urls import re_path
from . import views

urlpatterns = [
    re_path(r'^dial/(\d+)/$', views.dial, name="dial"),
    re_path(r'^record/$', views.record_twiml, name="record-twiml"),
    re_path(r'^get-recording-url/([A-Za-z0-9]+)/$', views.get_recording_url, name='recording-url'),
]
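To see which paths these routes accept, the patterns can be tried outside Django with the standard re module. This is only a sketch: Django matches against the request path without its leading slash, and the patterns are anchored with ^ here for the standalone check. The helper name captured is made up for the example.

```python
import re

# The regular expressions from caller/urls.py, anchored at both ends.
DIAL = re.compile(r"^dial/(\d+)/$")
RECORDING = re.compile(r"^get-recording-url/([A-Za-z0-9]+)/$")


def captured(pattern, path):
    """Return the captured group for a matching path, or None."""
    match = pattern.match(path)
    return match.group(1) if match else None


# Digits are captured and passed to the view as phone_number.
print(captured(DIAL, "dial/14155551234/"))
# A "+" is not a digit, so this path does not match the dial route.
print(captured(DIAL, "dial/+14155551234/"))
# Call SIDs are alphanumeric, so they match the recording route.
print(captured(RECORDING, "get-recording-url/CA0123abc/"))
```

The captured group is what Django hands to the view as its second positional argument.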
Step 7: Update views.py
Now, let's add the view functions to handle dialing, recording, and retrieving the recording URL. Open caller/views.py and replace its content with:
from django.conf import settings
from django.http import HttpResponse
from django.views.decorators.csrf import csrf_exempt
from twilio.rest import Client
from twilio.twiml.voice_response import VoiceResponse


def dial(request, phone_number):
    """Place an outbound call to the given number (digits only, no leading +)."""
    # Client() reads TWILIO_ACCOUNT_SID and TWILIO_AUTH_TOKEN from the environment.
    twilio_client = Client()
    call = twilio_client.calls.create(
        to=f'+{phone_number}',
        from_=settings.TWILIO_PHONE_NUMBER,
        url=settings.TWIML_INSTRUCTIONS_URL,
    )
    return HttpResponse(f"dialing +{phone_number}. call SID is: {call.sid}")


@csrf_exempt
def record_twiml(request):
    """Return TwiML that announces the recording, records, then hangs up."""
    response = VoiceResponse()
    response.say('Ahoy! Call recording starts now.')
    response.record()
    response.hangup()
    return HttpResponse(str(response), content_type='application/xml')


def get_recording_url(request, call_sid):
    """List the recording URLs for a call, one per line."""
    twilio_client = Client()
    call = twilio_client.calls.get(call_sid)
    recording_urls = [f"https://api.twilio.com{r.uri}" for r in call.recordings.list()]
    # Pass status by keyword: HttpResponse's second positional argument is content_type.
    return HttpResponse("\n".join(recording_urls), status=200)
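The record_twiml view returns TwiML, an XML document that tells Twilio what to do during the call. As an illustration only, the sketch below reproduces the expected element structure with the standard library rather than the Twilio helper, so the exact serialization produced by VoiceResponse may differ. The function name sketch_record_twiml is hypothetical.

```python
import xml.etree.ElementTree as ET


def sketch_record_twiml(message):
    """Build a TwiML-shaped document using only the standard library."""
    response = ET.Element("Response")
    # <Say> speaks the message to the person who answered.
    ET.SubElement(response, "Say").text = message
    # <Record> starts recording; <Hangup> ends the call afterwards.
    ET.SubElement(response, "Record")
    ET.SubElement(response, "Hangup")
    return ET.tostring(response, encoding="unicode")


print(sketch_record_twiml("Ahoy! Call recording starts now."))
```

Twilio fetches this document from TWIML_INSTRUCTIONS_URL once the call connects and executes the verbs in order.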
Setting Environment Variables
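To keep sensitive information out of the source code, we'll use environment variables. Set the following variables in the terminal where you run Django and the scripts. BASE_URL, RECORDING_URL, and TRANSCRIPTION_ID only become known in later steps, so set those when you reach them: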
On Linux
export TWILIO_ACCOUNT_SID=your_twilio_account_sid
export TWILIO_AUTH_TOKEN=your_twilio_auth_token
export TWILIO_PHONE_NUMBER=your_twilio_phone_number
export BASE_URL=your_ngrok_url
export ASSEMBLYAI_KEY=your_assemblyai_key
export RECORDING_URL=your_recording_url
export TRANSCRIPTION_ID=your_transcription_id
On Windows
set TWILIO_ACCOUNT_SID=your_twilio_account_sid
set TWILIO_AUTH_TOKEN=your_twilio_auth_token
set TWILIO_PHONE_NUMBER=your_twilio_phone_number
set BASE_URL=your_ngrok_url
set ASSEMBLYAI_KEY=your_assemblyai_key
set RECORDING_URL=your_recording_url
set TRANSCRIPTION_ID=your_transcription_id
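A forgotten variable tends to surface later as a confusing authentication or URL error. The sketch below checks up front which of the variables needed to place a call and request a transcription are missing. The helper name missing_variables and the REQUIRED list are assumptions made for this example; RECORDING_URL and TRANSCRIPTION_ID are left out because they are set later.

```python
import os

# Variables needed before the first call; the later ones are added as you go.
REQUIRED = [
    "TWILIO_ACCOUNT_SID",
    "TWILIO_AUTH_TOKEN",
    "TWILIO_PHONE_NUMBER",
    "BASE_URL",
    "ASSEMBLYAI_KEY",
]


def missing_variables(env=os.environ, required=REQUIRED):
    """Return the names of required variables that are unset or empty."""
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_variables()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Run it in the same terminal session in which you exported the variables, since they are not shared between terminal windows.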
Step 8: Start Ngrok
Ngrok creates a secure tunnel to your localhost, making it accessible over the web. Run Ngrok in a separate terminal window:
./ngrok http 8000
Copy the HTTPS URL provided by Ngrok and set it as your BASE_URL environment variable.
Step 9: Update ALLOWED_HOSTS
In djtranscribe/settings.py, update ALLOWED_HOSTS to include your Ngrok hostname (without the https:// prefix):
ALLOWED_HOSTS = ['your_ngrok_subdomain.ngrok.io', '127.0.0.1', 'localhost']
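Ngrok usually assigns a new subdomain on each start unless you have reserved one, so the hostname may need updating each time. One option is to derive it from BASE_URL. The following is a sketch under that assumption; the helper name allowed_hosts_for is hypothetical.

```python
from urllib.parse import urlparse


def allowed_hosts_for(base_url):
    """Return an ALLOWED_HOSTS list containing the host part of base_url."""
    hosts = ["127.0.0.1", "localhost"]
    # urlparse only finds a hostname when the URL includes a scheme.
    host = urlparse(base_url or "").hostname
    if host:
        hosts.insert(0, host)
    return hosts


print(allowed_hosts_for("https://abc123.ngrok.io"))
```

In settings.py this could be used as ALLOWED_HOSTS = allowed_hosts_for(BASE_URL), placed after the line that defines BASE_URL.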
Running the Project
Step 10: Run Django Development Server
Ensure Ngrok is running and your virtual environment is active. Then, start the Django server:
python manage.py runserver
Step 11: Dial a Number
Open your browser and navigate to:
http://localhost:8000/dial/<phone_number>
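Replace <phone_number> with the number you want to call, written as digits only with the country code and no leading + (the view adds it). Twilio will call this number, play a message, and start recording. The page shows the call SID, which you'll need in the next step.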
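Since the dial route matches digits only, a number copied from elsewhere with a plus sign, spaces, or dashes will not match. A small sketch of normalizing such input before building the URL, using a hypothetical helper named digits_for_dial_url:

```python
import re


def digits_for_dial_url(number):
    """Strip everything except digits so the number fits the dial route."""
    digits = re.sub(r"\D", "", number)
    if not digits:
        raise ValueError(f"no digits found in {number!r}")
    return digits


number = digits_for_dial_url("+1 (415) 555-1234")
print(f"http://localhost:8000/dial/{number}/")
```

The helper does not validate that the result is a real phone number; Twilio reports an error if it cannot place the call.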
Step 12: Get Recording URL
After the call, navigate to:
http://localhost:8000/get-recording-url/<call_sid>
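Replace <call_sid> with the SID from the call you just made. The page displays the location of each recording for that call. Set it as your RECORDING_URL environment variable, dropping the trailing .json so that the URL points at the audio file.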
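The view prints the recording's resource URI, which ends in .json and describes the recording rather than containing the audio. The sketch below turns such a URI into a media URL, assuming Twilio's convention that the same path without .json (or with an audio extension such as .mp3) serves the audio. The helper name recording_media_url and the sample SIDs are made up.

```python
TWILIO_API_BASE = "https://api.twilio.com"


def recording_media_url(uri, extension=""):
    """Convert a recording resource URI into a URL for the audio itself."""
    suffix = ".json"
    if uri.endswith(suffix):
        # Remove the metadata suffix; what remains identifies the recording.
        uri = uri[: -len(suffix)]
    return f"{TWILIO_API_BASE}{uri}{extension}"


print(recording_media_url("/2010-04-01/Accounts/AC123/Recordings/RE456.json", ".mp3"))
```

The resulting URL is what transcribe.py expects in RECORDING_URL.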
Transcribing Recordings
Step 13: Create transcribe.py
Create a file named transcribe.py to send the recording to AssemblyAI for transcription:
import os
import requests
endpoint = "https://api.assemblyai.com/v2/transcript"
# Named payload rather than json to avoid confusion with the json module.
payload = {
    "audio_url": os.getenv("RECORDING_URL")
}
headers = {
    "authorization": os.getenv("ASSEMBLYAI_KEY"),
    "content-type": "application/json"
}
response = requests.post(endpoint, json=payload, headers=headers)
print(response.json())
Run the script to start the transcription process, then set the id field from the printed JSON as your TRANSCRIPTION_ID environment variable:
python transcribe.py
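The introduction mentions optional keyword lists for domain-specific vocabulary. The sketch below shows how the request body could carry one, assuming AssemblyAI's word_boost field (a list of words or phrases). Treat that field name as an assumption to verify against the current API reference. The helper build_transcript_request is hypothetical.

```python
def build_transcript_request(audio_url, keywords=None):
    """Return the JSON body for a transcription request.

    keywords is an optional list of domain terms to favor; it is sent
    in the word_boost field, which is assumed from AssemblyAI's docs.
    """
    body = {"audio_url": audio_url}
    if keywords:
        body["word_boost"] = list(keywords)
    return body


print(build_transcript_request(
    "https://example.com/call.mp3",
    keywords=["stent", "angioplasty"],
))
```

The returned dictionary can be passed as the json argument of requests.post in transcribe.py.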
Step 14: Create print_transcription.py
Create another file named print_transcription.py to retrieve and display the transcription:
import os
import requests
endpoint = f"https://api.assemblyai.com/v2/transcript/{os.getenv('TRANSCRIPTION_ID')}"
headers = {
    "authorization": os.getenv("ASSEMBLYAI_KEY"),
}
response = requests.get(endpoint, headers=headers)
print(response.json())
print("\n\n")
print(response.json()['text'])
Run the script to get the transcription:
python print_transcription.py
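Transcription is asynchronous, so running the script too early is likely to print an empty text value while the status is still queued or processing. A minimal polling sketch follows, assuming the status values queued, processing, completed, and error. The function wait_for_transcript and its fetch parameter are hypothetical; fetch is any callable that returns the parsed JSON of the GET request above.

```python
import time


def wait_for_transcript(fetch, interval=3.0, max_attempts=40, sleep=time.sleep):
    """Call fetch() until the transcript completes, then return its text."""
    for _ in range(max_attempts):
        result = fetch()
        status = result.get("status")
        if status == "completed":
            return result["text"]
        if status == "error":
            raise RuntimeError(result.get("error", "transcription failed"))
        # Still queued or processing: wait before asking again.
        sleep(interval)
    raise TimeoutError("transcript was not ready in time")
```

In print_transcription.py, fetch could be lambda: requests.get(endpoint, headers=headers).json(), which keeps the polling logic independent of the HTTP library and easy to test.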
Conclusion
Congratulations! You've built a Django application that can dial phone numbers, record calls using Twilio, and transcribe the recordings with AssemblyAI. You can now extend this project to fit your needs, whether it's for personal use, business, or an exciting new application idea. Happy coding!