
Transcribe Voice Audio to Text with AI on your Windows or Linux PC – OpenAI Whisper Tutorial



Hi everyone! This video covers…
• OpenAI Whisper, a FREE, powerful AI-driven speech/audio-to-text tool.
• How to create searchable text files from your audio and video clips.
• 100% local transcription on your PC. No Internet is required after the AI model is downloaded.
• How to install and use Whisper on Windows and Linux (Ubuntu 22.04 LTS is used as an example); a minimal command sketch follows this list.
• Install for either CPU- or GPU-driven AI.
• Compare performance of CPU vs GPU-driven OpenAI Whisper transcription.
• How to automatically transcribe entire directories of video and audio files to text (e.g., .mp4, .mov, .wav, .mp3, and many other formats).
• Transcribe audio/speech to text using an AMD GPU with ROCm.
• Transcribe audio/speech to text using an AMD GPU with ROCm within a docker container.
• Transcribe audio/speech to text using an NVIDIA GPU.
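
For quick reference, here is a minimal sketch of the basic flow the walkthroughs build on, assuming Python 3 and ffmpeg are already installed (the model choice and file name are examples only, not the exact commands from the video):

# create and activate a virtual environment, then install the Whisper CLI
python3 -m venv whisper-env
source whisper-env/bin/activate
pip install -U openai-whisper

# transcribe one clip; the model downloads on first use and the .txt/.srt/.vtt/.tsv/.json
# transcripts are written to the current directory
whisper clip1.mp4 --model medium --language en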

Look for "Walkthrough" in the table of contents below to find the start of this video's 7 main sections…

Clickable Table of Contents
00:00 Start
00:30 Overview
05:55 Test Clips Overview
06:20 Test Clip #1 Demo
06:45 Test Clip #2 Demo
07:21 Test Clip #3 Demo
07:42 Test Clip #4 Demo (aka the "long" clip)
08:55 Walkthrough #1: Windows simple setup.
09:46 Windows ffmpeg setup
10:22 Create Python environment on Windows
10:50 Activate Python environment on Windows
11:06 Install OpenAI-Whisper on Windows
11:18 Check Whisper setup on Windows
11:38 Transcribe test clip #1 on Windows
12:13 Examine transcription output
12:53 Transcribe directory tree on Windows
13:15 Transcribe dir PowerShell script
15:01 Walkthrough #2: Linux simple setup.
15:13 Check Whisper setup on Linux
15:25 Transcribe clip #1 on Linux
15:57 Model download on first use
16:17 Transcribe directory tree on Linux
16:38 Transcribe dir bash script
17:30 Run bash script, transcribe dir
17:49 Open .srt file
18:14 Walkthrough #3: Linux/ROCm/Docker/Whisper setup.
18:26 Docker/Ubuntu (Debian) setup
19:37 docker image ls permission issue
20:03 docker "hello world" test
20:16 Examine downloaded image
20:27 docker container ls
20:41 ROCm/Pytorch docker container docs/setup
21:37 AMD/ROCm Docker Hub
21:50 AMD ROCm/Pytorch Container
22:01 docker pull rocm/pytorch
22:29 docker rocm/pytorch command line
22:47 docker run rocm/pytorch
23:21 prep container for openai-whisper
23:33 rocminfo command
23:46 setup ffmpeg (rocm/pytorch container)
23:58 openai-whisper setup (rocm/pytorch container)
24:09 Verify container Pytorch is using GPU
24:24 Run whisper in container
24:36 List stopped containers
25:07 Restarting prepped container
25:37 Sharing files with a container
26:51 Prep container for openai-whisper #2
27:21 Transcribe shared file within container
28:28 Examine transcribed output (container shared folder)
29:27 Creating openai-whisper image (commit container)
30:25 Testing our whisper docker image
31:32 Transcribe in whisper docker image (no model download required)
32:08 Transcribe with image in one step
33:13 Examine output from host
33:29 Docker/Whisper recap #1
35:11 Transcribe directory with docker image
36:05 Transcribe dir bash script, docker
40:56 run bash script, transcribe dir, docker
41:57 docker container transcription recap
42:26 Walkthrough #4: Linux/ROCm/Pytorch/Whisper native setup.
42:40 VS Code setup
43:30 ROCm/Pytorch native setup
44:36 ROCm GPG key setup
45:01 ROCm repo setup
45:57 Ubuntu 22.04 quick install copy/paste
46:44 Native rocm/pytorch setup
47:47 Verify Pytorch is using the GPU
48:02 Install openai-whisper, native
48:31 Transcribe clip #1, native
49:08 Adding language
49:21 Examine transcription output, native setup
49:46 Using wildcard, multiple clips at once
50:20 Walkthrough #5: Linux/ROCm/Pytorch/Whisper CPU vs GPU perf test.
51:02 Verify GPU is in use, ROCm GPU vs CPU
51:18 time transcription using GPU
52:36 Whisper ROCm GPU result
52:48 Remove ROCm, prep for CPU perf run
53:26 Setup CPU Pytorch
54:24 Verify Pytorch is using CPU
54:45 time transcription using CPU, ROCm GPU vs CPU
55:04 ROCm GPU vs CPU results
56:38 Walkthrough #6: Windows/CUDA/Pytorch/Whisper CPU vs GPU perf test.
56:54 Setup Python on Windows
57:28 Create Python environment
59:20 Activate Python environment
59:58 Install openai-whisper
01:00:31 Verify Pytorch is using CPU, not GPU
01:01:23 Initial "burn-in" transcription
01:01:44 Burn-in pre-test run
01:03:21 Examine transcription output, srt file with time stamps
01:04:08 Time CPU transcription with PowerShell Measure-Command
01:06:23 Whisper CPU result
01:06:54 Walkthrough #7: Windows/CUDA/Pytorch native setup.
01:09:19 Verify Pytorch using NVIDIA GPU
01:10:10 CUDA GPU vs CPU results
01:10:30 CPU and Python 3.11, IMPORTANT
01:12:28 Outro
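
For reference alongside the 16:38 and 47:47 chapters, here is a rough sketch of the Linux flow: check that PyTorch can see the GPU, then transcribe every media file under a directory (the paths, model, and extension list are illustrative, not the exact script from the video):

# both the ROCm and CUDA builds of PyTorch report True here when the GPU is usable
python3 -c "import torch; print(torch.cuda.is_available())"

# transcribe every matching file under ~/clips, writing transcripts to ~/transcripts
mkdir -p ~/transcripts
find ~/clips -type f \( -iname '*.mp4' -o -iname '*.mov' -o -iname '*.wav' -o -iname '*.mp3' \) \
  -exec whisper {} --model medium --output_dir ~/transcripts \;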

Errata:
1. At 33:57, the "--rm" switch should have been used but was not. See 38:06 for an example use of the "--rm" switch.
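
For context, "--rm" tells Docker to delete the container automatically when it exits, so one-off runs don't pile up in "docker container ls -a". A hedged example of that kind of one-shot rocm/pytorch run (the device/group flags follow AMD's ROCm container documentation; the volume path is illustrative):

# start an interactive rocm/pytorch container that is removed on exit;
# ~/clips on the host appears as /clips inside the container
docker run -it --rm \
  --device=/dev/kfd --device=/dev/dri \
  --group-add video \
  -v ~/clips:/clips \
  rocm/pytorch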

Tutorial source code:
git clone https://github.com/AshleyT3/tutorial-sample-code.git

Buy Me a Coffee
https://www.buymeacoffee.com/ricochettech

Subscribe to the RicochetTech email list:
https://ricochettech.net/subscribe


Source: RicochetTech
