Building a Music Recognizer with Python
How I built a Shazam-like tool that identifies tracks from system audio in real time.
The Problem
You're listening to a DJ set on YouTube. A track drops and you think "what is this?" By the time you pause, open Shazam on your phone, and hold it to the speaker — the track has changed.
I wanted something that runs in the background, listens to whatever my computer is playing, and automatically identifies every track. No manual intervention.
The Architecture
The system has four parts:
1. Audio Capture
Windows supports "loopback" recording through WASAPI: you can record whatever is coming out of the speakers. I use PyAudioWPatch, a fork of PyAudio that adds WASAPI loopback support.
The capture runs in a background thread, grabbing 6-second chunks of audio at regular intervals.
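The chunking half of that loop can be sketched without any audio hardware. This is a minimal sketch, not the project's actual code: `ChunkBuffer` is a hypothetical name, and the 48 kHz, stereo, 16-bit format is an assumption about what the loopback device delivers. In the real capture thread, the bytes passed to `feed` would come from reads on the PyAudioWPatch loopback stream. For simplicity it emits back-to-back chunks rather than sampling at intervals.

```python
class ChunkBuffer:
    """Accumulates raw PCM bytes and emits fixed-length chunks."""

    def __init__(self, sample_rate=48000, channels=2, sample_width=2, seconds=6):
        # Bytes in one chunk: rate * channels * bytes per sample * duration.
        self.chunk_bytes = sample_rate * channels * sample_width * seconds
        self._buffer = bytearray()

    def feed(self, data: bytes) -> list[bytes]:
        """Add captured bytes; return any chunks that are now complete."""
        self._buffer.extend(data)
        chunks = []
        while len(self._buffer) >= self.chunk_bytes:
            chunks.append(bytes(self._buffer[:self.chunk_bytes]))
            del self._buffer[:self.chunk_bytes]
        return chunks
```

With the assumed defaults, one 6-second chunk is 48000 * 2 * 2 * 6 = 1,152,000 bytes of raw PCM.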
2. Audio Fingerprinting
Each audio chunk goes through Shazam's recognition pipeline. I use the ShazamAPI Python library, which implements the same fingerprinting that the Shazam app uses:
- Convert raw audio to WAV format
- Generate audio fingerprint (spectral analysis)
- Send fingerprint to Shazam's servers
- Get back track metadata: title, artist, album, links
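The first step in that list needs nothing beyond the standard library. Here is a minimal sketch, assuming the captured chunk is 16-bit stereo PCM at 48 kHz; the function name `pcm_to_wav_bytes` and its defaults are hypothetical, not taken from the project. The fingerprinting and network steps are left to the recognition library.

```python
import io
import wave


def pcm_to_wav_bytes(pcm: bytes, sample_rate: int = 48000,
                     channels: int = 2, sample_width: int = 2) -> bytes:
    """Wrap raw PCM samples in a WAV container, entirely in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)  # bytes per sample (2 = 16-bit)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    # wave does not close a file object it did not open, so buf is still readable.
    return buf.getvalue()
```

Keeping the WAV in memory avoids writing a temp file for every 6-second chunk.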
If Shazam doesn't recognize it (common with obscure tracks), I fall back to AudD as a secondary API.
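The fallback itself is just an ordered chain of recognizers. The sketch below assumes each service is wrapped in a function that takes WAV bytes and returns a metadata dict or `None`; `recognize_with_fallback` and the `Recognizer` type are hypothetical names, and the real Shazam and AudD calls are deliberately left out.

```python
from typing import Callable, Optional

# Hypothetical signature: WAV bytes in, track metadata (or None) out.
Recognizer = Callable[[bytes], Optional[dict]]


def recognize_with_fallback(wav_bytes: bytes,
                            recognizers: list[Recognizer]) -> Optional[dict]:
    """Try each recognizer in order and return the first non-empty result."""
    for recognize in recognizers:
        try:
            result = recognize(wav_bytes)
        except Exception:
            # A failing service (timeout, rate limit) should not stop the chain.
            continue
        if result:
            return result
    return None
```

Passing the recognizers as a list keeps the order explicit: the primary service first, then the rate-limited fallback.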
3. Deduplication
When you're listening to a 7-minute track, the recognizer will identify it multiple times. I use a simple pending/confirmed system:
- First recognition → mark as "pending"
- Second recognition of same track → mark as "confirmed" and save
- Different track recognized → discard pending, start new
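This filters out most one-off false positives, since a misidentification rarely repeats on consecutive chunks, and it avoids saving the same track more than once.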
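The pending/confirmed rules above fit in a small state machine. This is a sketch under my own assumptions rather than the project's code: `Deduplicator` is a hypothetical name, tracks are compared by lowercased title and artist, and repeat recognitions of the most recently confirmed track are ignored (the rules above don't say what happens on a third hit).

```python
from typing import Optional


class Deduplicator:
    """Confirms a track only after two consecutive matching recognitions."""

    def __init__(self) -> None:
        self.pending: Optional[tuple[str, str]] = None
        self.confirmed: list[tuple[str, str]] = []

    def observe(self, title: str, artist: str) -> bool:
        """Feed one recognition; return True if it newly confirms a track."""
        key = (title.strip().lower(), artist.strip().lower())
        if self.confirmed and self.confirmed[-1] == key:
            # Assumption: the confirmed track is still playing; drop any stray pending.
            self.pending = None
            return False
        if self.pending == key:
            # Second recognition of the same track: confirm and save.
            self.confirmed.append(key)
            self.pending = None
            return True
        # First recognition, or a different track: replace whatever was pending.
        self.pending = key
        return False
```

The caller would write to storage only when `observe` returns `True`.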
4. Storage & Export
Confirmed tracks go into:
- SQLite database: fast lookups, deduplication by title+artist
- Excel file: formatted spreadsheet with links to Spotify, SoundCloud, YouTube Music
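Deduplication by title+artist maps naturally onto a SQLite uniqueness constraint. The schema below is a guess at a minimal version, not the project's actual table: the `tracks` table, its columns, and the helper names are all hypothetical, and treating titles as case-insensitive is my assumption.

```python
import sqlite3


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the (hypothetical) tracks table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS tracks (
               id         INTEGER PRIMARY KEY,
               title      TEXT NOT NULL COLLATE NOCASE,
               artist     TEXT NOT NULL COLLATE NOCASE,
               first_seen TEXT DEFAULT CURRENT_TIMESTAMP,
               UNIQUE (title, artist)
           )"""
    )
    return conn


def save_track(conn: sqlite3.Connection, title: str, artist: str) -> bool:
    """Insert a track; return False if the title+artist pair already exists."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO tracks (title, artist) VALUES (?, ?)",
        (title, artist),
    )
    conn.commit()
    return cur.rowcount == 1
```

Letting the database enforce uniqueness means the recognizer never has to query before inserting.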
The GUI
I built a minimal Tkinter GUI with a dark theme (matching my IDE aesthetic):
- Real-time display of currently playing track
- History of all recognized tracks with clickable streaming links
- System tray integration: runs quietly in the background
- Delete/dismiss controls for false positives
Set Analyzer Mode
The coolest feature: Set Analyzer. Give it a YouTube URL of a DJ set, and it:
1. Downloads the audio using yt-dlp
2. Splits it into segments
3. Runs Shazam on each segment
4. Outputs a timestamped tracklist
I've used this to create tracklists for sets that have no tracklist posted online.
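The segmenting and tracklist steps are simple enough to sketch in pure Python. Everything here is illustrative: the 30-second step, the function names, and the output format are my assumptions, and the download and recognition calls are left out. `hits` stands for the per-segment results, with `None` where nothing was recognized.

```python
from typing import Optional


def segment_starts(duration_s: int, step_s: int = 30) -> list[int]:
    """Start offsets (in seconds) of each segment to probe."""
    return list(range(0, duration_s, step_s))


def format_timestamp(seconds: int) -> str:
    """Render an offset in seconds as H:MM:SS."""
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


def build_tracklist(hits: list[tuple[int, Optional[str]]]) -> list[str]:
    """Collapse per-segment recognitions into one line per track."""
    lines = []
    last = None
    for start, track in hits:
        # Skip unrecognized segments and repeats of the track already listed.
        if track is None or track == last:
            continue
        lines.append(f"{format_timestamp(start)}  {track}")
        last = track
    return lines
```

Each track is listed at the first segment where it was recognized, so the timestamps are only accurate to within one step.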
Stats After One Month
- 545 unique tracks recognized
- ~3,000 total recognitions (before deduplication)
- Primary genres: deep house, afro house, melodic techno, progressive
- Most recognized artist: Da Africa Deep (the man is everywhere)
- Weirdest recognition: Bach's Passacaglia in C Minor during a Keinemusik set (it was a sample)
What I Learned
1. Audio APIs on Windows are painful. WASAPI loopback is the only reliable way I found to capture system audio, and the Python bindings are fragile.
2. Shazam is remarkably good at identifying tracks even with background noise, EQ changes, and DJ mixing.
3. The 80/20 rule applies. The core recognizer took a day. The GUI, edge cases, and polish took a week.
4. Build tools you actually use. This isn't a portfolio project; I run it every day. That's the best motivation.
Try It Yourself
The full source is on GitHub. You'll need a free AudD API token (300 requests/day) and Python 3.10+.