🛠️ Step-by-Step Setup for Audio Transcription with faster-whisper

1️⃣ Install WSL and a Linux Distro

Open PowerShell as Administrator and run:

wsl --install

This installs WSL and the default Ubuntu distro. If you want a specific distro:

wsl --install -d Ubuntu-22.04

After installation, restart your computer if prompted.


2️⃣ Install Python and pip

Inside your WSL terminal (Ubuntu), run:

sudo apt update
sudo apt install python3 python3-pip python3-venv -y

Check the installed version:

python3 --version


3️⃣ Create a Python Virtual Environment

Create a project directory, move into it, and create the environment:

mkdir whisper_project && cd whisper_project
python3 -m venv venv


4️⃣ Activate the Environment

source venv/bin/activate

Your prompt should now show (venv).
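
If the prompt is ambiguous, activation can also be confirmed from Python itself. The sketch below relies on the standard-library behaviour that sys.prefix and sys.base_prefix differ inside a venv; the helper name in_virtualenv is only illustrative.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the venv directory while
    # sys.base_prefix still points at the system-wide Python installation.
    return sys.prefix != sys.base_prefix


print("Virtual environment active:", in_virtualenv())
```

Running this with python3 after activation should report True; if it reports False, repeat the source command above.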


5️⃣ Install faster-whisper

pip install faster-whisper

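Optional: for NVIDIA GPU support, note that faster-whisper runs on CTranslate2 rather than PyTorch, so what it needs is the NVIDIA cuBLAS and cuDNN libraries; check the faster-whisper README for the versions that match your release. If you also want PyTorch built for CUDA 11.8 (not required by faster-whisper):

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118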
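
Rather than editing the device by hand, a script can check whether a GPU is visible and fall back to the CPU otherwise. This is a minimal sketch, assuming the ctranslate2 package that is installed alongside faster-whisper; pick_device is a hypothetical helper name, and the float16/int8 pairing is a common choice, not a requirement. A visible GPU does not guarantee that the cuBLAS and cuDNN libraries are present, so treat the result as a hint.

```python
def pick_device():
    """Return a (device, compute_type) pair suitable for WhisperModel."""
    try:
        # CTranslate2 is the inference engine that faster-whisper runs on.
        import ctranslate2

        has_gpu = ctranslate2.get_cuda_device_count() > 0
    except Exception:
        # Missing package or unusable CUDA driver: fall back to the CPU.
        has_gpu = False
    # Assumption: float16 on GPU and int8 on CPU, both valid compute types.
    return ("cuda", "float16") if has_gpu else ("cpu", "int8")


device, compute_type = pick_device()
print(f"Using device={device}, compute_type={compute_type}")
```

The pair can then be passed along as WhisperModel("base", device=device, compute_type=compute_type) in the script from step 6.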


6️⃣ Create a Python Script to Transcribe Audio

Create a file called transcribe.py:

nano transcribe.py

Paste this code:

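from faster_whisper import WhisperModel

# Load the "base" model on the CPU; change device to "cuda" if using a GPU.
model = WhisperModel("base", device="cpu")

# transcribe() returns a lazy generator of segments plus metadata about the audio.
segments, info = model.transcribe("input_audio.mp3", beam_size=5)

print(f"Detected language: {info.language}")
# The transcription itself runs as this loop consumes the generator.
for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")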

Save and exit: press Ctrl+O, Enter, then Ctrl+X.
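
If you would rather keep the result as a subtitle file than read it from the terminal, the same segments can be written out in SRT format. The following is a sketch under stated assumptions: format_timestamp, segments_to_srt, and transcribe_to_srt are hypothetical helper names, and the only things taken from faster-whisper are the .start, .end, and .text attributes already used in the script above.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text from objects exposing .start, .end, and .text."""
    blocks = []
    for index, segment in enumerate(segments, start=1):
        start = format_timestamp(segment.start)
        end = format_timestamp(segment.end)
        blocks.append(f"{index}\n{start} --> {end}\n{segment.text.strip()}\n")
    # A blank line separates consecutive subtitle entries.
    return "\n".join(blocks)


def transcribe_to_srt(audio_path: str, srt_path: str) -> None:
    """Transcribe audio_path and write the subtitles to srt_path."""
    # Imported here so the helpers above work without faster-whisper installed.
    from faster_whisper import WhisperModel

    model = WhisperModel("base", device="cpu")
    segments, _info = model.transcribe(audio_path, beam_size=5)
    with open(srt_path, "w", encoding="utf-8") as handle:
        handle.write(segments_to_srt(segments))
```

Calling transcribe_to_srt("input_audio.mp3", "output.srt") at the end of the file would write the subtitles next to the audio; output.srt is just an example name.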


7️⃣ Copy Audio File to WSL Location

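From Windows, copy your audio file into your WSL project directory. Example:

copy C:\Users\<your_user>\Downloads\audio.mp3 \\wsl$\Ubuntu\home\<your-username>\whisper_project\input_audio.mp3

Replace <your_user> with your Windows username and <your-username> with your WSL username. If you installed a specific distro, use its name in place of Ubuntu (for example, Ubuntu-22.04). The file is renamed to input_audio.mp3 here because that is the name transcribe.py opens.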
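
The copy can also be done from inside WSL, because Windows drives are mounted under /mnt by default (C: appears as /mnt/c). The sketch below shows the path translation; windows_to_wsl_path is a hypothetical helper, "alice" is a placeholder username, and the default /mnt mount point is assumed. WSL also ships a wslpath utility that performs the same conversion.

```python
from pathlib import PureWindowsPath


def windows_to_wsl_path(windows_path: str) -> str:
    """Translate a drive-letter Windows path into its /mnt/<drive> form."""
    path = PureWindowsPath(windows_path)
    drive = path.drive.rstrip(":").lower()
    if len(drive) != 1:
        raise ValueError(f"Expected a drive-letter path, got: {windows_path}")
    # path.parts[0] is the anchor (for example "C:\\"); the rest are folders.
    return "/".join(["/mnt", drive, *path.parts[1:]])


# "alice" is a placeholder Windows username.
print(windows_to_wsl_path(r"C:\Users\alice\Downloads\audio.mp3"))
```

With the translated path in hand, running cp followed by that path and ~/whisper_project/input_audio.mp3 in the WSL terminal has the same effect as the Windows copy command.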


8️⃣ Convert File Format (If Needed)

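If your file isn’t an .mp3 or .wav, convert it with ffmpeg. Install it first:

sudo apt install ffmpeg -y

To convert an audio file such as .webm to .mp3:

ffmpeg -i input_audio.webm input_audio.mp3

To extract the audio track from a video as 16 kHz mono WAV:

ffmpeg -i input_video.mp4 -vn -acodec pcm_s16le -ar 16000 -ac 1 output_audio.wav

To extract the audio track from a video as .mp3 instead:

ffmpeg -i input_video.mp4 -vn -acodec libmp3lame -q:a 2 audio.mp3

If the output name differs from input_audio.mp3, update the filename in transcribe.py to match.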
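
When several files need converting, the same ffmpeg call can be driven from Python. This is a minimal sketch that mirrors the 16 kHz mono WAV flags used in this step; build_ffmpeg_command and convert_to_wav are hypothetical helper names, and the -y overwrite flag is an addition not present in the commands above.

```python
import subprocess


def build_ffmpeg_command(source: str, target: str) -> list:
    """Build an ffmpeg call that extracts 16 kHz mono WAV audio."""
    return [
        "ffmpeg",
        "-y",                    # overwrite the target if it already exists
        "-i", source,            # input file (audio or video)
        "-vn",                   # drop any video stream
        "-acodec", "pcm_s16le",  # 16-bit PCM samples
        "-ar", "16000",          # 16 kHz sample rate
        "-ac", "1",              # single (mono) channel
        target,
    ]


def convert_to_wav(source: str, target: str) -> None:
    """Run ffmpeg and raise CalledProcessError if the conversion fails."""
    subprocess.run(build_ffmpeg_command(source, target), check=True)


# Show the command that would be executed, without running ffmpeg.
print(" ".join(build_ffmpeg_command("input_video.mp4", "input_audio.wav")))
```

Passing the arguments as a list avoids shell quoting problems with filenames that contain spaces.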

9️⃣ Transcribe the Audio

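With the virtual environment still active, run the script from the whisper_project directory:

python transcribe.py

The transcription is printed in your terminal, one timestamped segment per line. On the first run, faster-whisper downloads the model before transcribing, so expect a short delay. 🎧📝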
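
To transcribe different files without editing the script each time, the filename can be read from the command line. The sketch below is one possible variant of transcribe.py, not a replacement for it; build_parser, main, and the --model and --device options are illustrative names chosen here.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Define the command-line options for the transcription script."""
    parser = argparse.ArgumentParser(description="Transcribe an audio file.")
    parser.add_argument("audio", help="path to the audio file")
    parser.add_argument("--model", default="base", help="Whisper model size")
    parser.add_argument("--device", default="cpu", choices=["cpu", "cuda"])
    return parser


def main(argv=None) -> None:
    """Parse the arguments, then transcribe and print each segment."""
    args = build_parser().parse_args(argv)
    # Imported here so argument parsing works even before installation.
    from faster_whisper import WhisperModel

    model = WhisperModel(args.model, device=args.device)
    segments, info = model.transcribe(args.audio, beam_size=5)
    print(f"Detected language: {info.language}")
    for segment in segments:
        print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
```

Saved with a call to main() as its last line, the script could then be run as python transcribe.py input_audio.mp3 --device cpu.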