🛠️ Step-by-Step Setup for Audio Transcription with faster-whisper
1️⃣ Install WSL and a Linux Distro
Open PowerShell as Administrator and run:
wsl --install
This installs WSL and the default Ubuntu distro. If you want a specific distro:
wsl --install -d Ubuntu-22.04
After installation, restart your computer if prompted.
2️⃣ Install Python
Inside your WSL terminal (Ubuntu), run:
sudo apt update
sudo apt install python3 python3-pip python3-venv -y
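Check the installed version (apt installs the release packaged for your distro, which may not be the newest Python):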
python3 --version
3️⃣ Create a Python Virtual Environment
Create a working directory, move into it, and create the environment:
mkdir whisper_project && cd whisper_project
python3 -m venv venv
4️⃣ Activate the Environment
source venv/bin/activate
Your prompt should now show (venv).
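If the prompt prefix is hidden or customized, the interpreter itself can confirm whether the environment is active. A minimal sketch using only the standard library; in_virtualenv is a hypothetical helper name, not part of any tool mentioned here:

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the system installation.
    return sys.prefix != sys.base_prefix


print("Virtual environment active:", in_virtualenv())
print("Interpreter:", sys.executable)
```

With the environment activated, the interpreter path should end in whisper_project/venv/bin/python.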
5️⃣ Install faster-whisper
pip install faster-whisper
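Optional: For GPU support (NVIDIA), note that faster-whisper runs on CTranslate2 rather than PyTorch, so the GPU path needs NVIDIA's cuBLAS and cuDNN libraries to be visible inside WSL; check the faster-whisper README for the versions that match your install. PyTorch itself is not required by faster-whisper. If you also want torch with CUDA for other tools:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118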
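Before switching the device to "cuda", it can help to check whether a CUDA device is actually visible. A minimal sketch, assuming the ctranslate2 package (installed as a dependency of faster-whisper) is importable; pick_device is a hypothetical helper, and the int8/float16 compute types are common choices rather than requirements:

```python
def pick_device() -> tuple[str, str]:
    """Return a (device, compute_type) pair for WhisperModel."""
    try:
        import ctranslate2  # backend used by faster-whisper
    except ImportError:
        # Assumption: without the backend we simply fall back to CPU settings.
        return "cpu", "int8"
    try:
        has_gpu = ctranslate2.get_cuda_device_count() > 0
    except Exception:
        # A broken driver or missing CUDA runtime is treated as "no GPU".
        has_gpu = False
    return ("cuda", "float16") if has_gpu else ("cpu", "int8")


device, compute_type = pick_device()
print(f"Using device={device}, compute_type={compute_type}")
```

The returned values can be passed as WhisperModel("base", device=device, compute_type=compute_type).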
6️⃣ Create a Python Script to Transcribe Audio
Create a file called transcribe.py:
nano transcribe.py
Paste this code:
from faster_whisper import WhisperModel

# Load the "base" model on the CPU; change device to "cuda" if using a GPU
model = WhisperModel("base", device="cpu")

# segments is a generator: transcription runs as the loop consumes it
segments, info = model.transcribe("input_audio.mp3", beam_size=5)
print(f"Detected language: {info.language}")

for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
Save and exit: press Ctrl+O, Enter, then Ctrl+X.
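If you would rather keep the result than read it off the terminal, the same segments can be written out as an SRT subtitle file. This is a sketch under assumptions, not part of the original script: format_timestamp, segments_to_srt, and transcribe_to_srt are hypothetical helper names, and the CPU/int8 settings are just one reasonable choice.

```python
def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text from objects that have start, end, and text attributes."""
    blocks = []
    for index, segment in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(segment.start)} --> {format_timestamp(segment.end)}\n"
            f"{segment.text.strip()}\n"
        )
    # A blank line separates consecutive subtitle entries.
    return "\n".join(blocks)


def transcribe_to_srt(audio_path: str, srt_path: str, model_size: str = "base") -> None:
    """Transcribe audio_path and write the result to srt_path."""
    # Imported here so the helpers above work without faster-whisper installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    segments, _info = model.transcribe(audio_path, beam_size=5)
    with open(srt_path, "w", encoding="utf-8") as handle:
        handle.write(segments_to_srt(segments))
```

Calling transcribe_to_srt("input_audio.mp3", "input_audio.srt") would write the subtitles next to the audio file.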
7️⃣ Copy Audio File to WSL Location
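From Windows (Command Prompt), copy your audio file into the WSL project directory, naming it input_audio.mp3 so it matches the script. Example:
copy C:\Users\<your_user>\Downloads\audio.mp3 \\wsl$\Ubuntu\home\<your-username>\whisper_project\input_audio.mp3
Replace <your_user> with your Windows username and <your-username> with your actual WSL username. If you installed a specific distro, use its name (for example, Ubuntu-22.04) in place of Ubuntu.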
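The copy can also be done from inside WSL, because Windows drives are mounted under /mnt by default (C: appears as /mnt/c). A minimal sketch of the path translation, assuming that default mount point; windows_to_wsl_path is a hypothetical helper, and the example path is illustrative only:

```python
from pathlib import PurePosixPath, PureWindowsPath


def windows_to_wsl_path(windows_path: str) -> str:
    """Map a drive-letter Windows path to its default WSL mount path."""
    path = PureWindowsPath(windows_path)
    drive = path.drive.rstrip(":").lower()
    if len(drive) != 1:
        # Relative paths and UNC shares have no single drive letter.
        raise ValueError(f"Expected a drive-letter path, got: {windows_path!r}")
    # path.parts[0] is the anchor (for example "C:\\"); the rest are folder names.
    return str(PurePosixPath("/mnt", drive, *path.parts[1:]))


print(windows_to_wsl_path(r"C:\Users\me\Downloads\audio.mp3"))
```

The printed path can then be used with cp from the whisper_project directory.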
8️⃣ Convert File Format (If Needed)
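If your file isn’t in .mp3 or .wav, convert it using ffmpeg:
sudo apt install ffmpeg -y
# Convert a .webm recording to MP3
ffmpeg -i input_audio.webm input_audio.mp3
# Extract the audio track from a video as 16 kHz mono WAV
ffmpeg -i input_video.mp4 -vn -acodec pcm_s16le -ar 16000 -ac 1 output_audio.wav
# Extract the audio track from a video as MP3 (VBR quality 2)
ffmpeg -i input_video.mp4 -vn -acodec libmp3lame -q:a 2 audio.mp3
If the output name differs from input_audio.mp3, update the filename in transcribe.py to match.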
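The WAV conversion above can also be driven from Python when several files need the same treatment. A minimal sketch, assuming ffmpeg is on the PATH; build_wav_command and convert_to_wav are hypothetical helpers, and the -y flag (overwrite the output without asking) is an addition not present in the commands above:

```python
import subprocess


def build_wav_command(source: str, target: str) -> list[str]:
    """Return the ffmpeg argument list for a 16 kHz mono PCM WAV conversion."""
    return [
        "ffmpeg", "-y",          # -y: overwrite target if it exists (assumption)
        "-i", source,
        "-vn",                   # drop any video stream
        "-acodec", "pcm_s16le",  # 16-bit PCM audio
        "-ar", "16000",          # 16 kHz sample rate
        "-ac", "1",              # mono
        target,
    ]


def convert_to_wav(source: str, target: str) -> None:
    """Run ffmpeg and raise CalledProcessError if the conversion fails."""
    subprocess.run(build_wav_command(source, target), check=True)
```

For example, convert_to_wav("input_video.mp4", "output_audio.wav") mirrors the second ffmpeg command.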
9️⃣ Transcribe the Audio
With the virtual environment still active, run the script from the whisper_project directory:
python transcribe.py
The first run downloads the base model, so it needs an internet connection. After that, you’ll see the transcription printed in your terminal. 🎧📝