Building a Music Recognizer with Python

How I built a Shazam-like tool that identifies tracks from system audio in real time.

Python, Audio, Project

The Problem

You're listening to a DJ set on YouTube. A track drops and you think "what is this?" By the time you pause, open Shazam on your phone, and hold it to the speaker — the track has changed.

I wanted something that runs in the background, listens to whatever my computer is playing, and automatically identifies every track. No manual intervention.

The Architecture

The system has four parts:

1. Audio Capture

Windows supports "loopback" capture through WASAPI: you can record whatever is coming out of the speakers. I use PyAudioWPatch, a fork of PyAudio that adds WASAPI loopback support.

The capture runs in a background thread, grabbing 6-second chunks of audio at regular intervals.
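
The capture loop described above can be sketched without touching the audio device at all. This is a minimal illustration, not the project's actual code: `read_frames` is a hypothetical callable standing in for a loopback stream's read call, and the sample rate, channel count, and sample width are assumed values. It also emits chunks back to back, whereas the real tool may leave gaps between them.

```python
import queue
import threading

SAMPLE_RATE = 44100   # frames per second (assumed)
CHANNELS = 2          # stereo (assumed)
SAMPLE_WIDTH = 2      # bytes per sample, i.e. 16-bit PCM (assumed)
CHUNK_SECONDS = 6     # chunk length mentioned in the article


def chunk_size_bytes(seconds=CHUNK_SECONDS):
    """Number of raw PCM bytes in `seconds` of audio."""
    return seconds * SAMPLE_RATE * CHANNELS * SAMPLE_WIDTH


def capture_loop(read_frames, out_queue, stop_event, seconds=CHUNK_SECONDS):
    """Accumulate raw PCM from read_frames() and emit fixed-length chunks.

    read_frames is a hypothetical callable that returns the next block of
    raw bytes, or b"" when the source is exhausted.
    """
    target = chunk_size_bytes(seconds)
    buffer = bytearray()
    while not stop_event.is_set():
        data = read_frames()
        if not data:
            break
        buffer.extend(data)
        # Hand off every complete chunk; keep the remainder for next time.
        while len(buffer) >= target:
            out_queue.put(bytes(buffer[:target]))
            del buffer[:target]


def start_capture(read_frames, seconds=CHUNK_SECONDS):
    """Run capture_loop in a daemon thread; return (thread, queue, stop_event)."""
    out_queue = queue.Queue()
    stop_event = threading.Event()
    thread = threading.Thread(
        target=capture_loop,
        args=(read_frames, out_queue, stop_event, seconds),
        daemon=True,
    )
    thread.start()
    return thread, out_queue, stop_event
```

The recognizer side would then call `out_queue.get()` to receive each 6-second chunk, and set `stop_event` to shut the thread down.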

2. Audio Fingerprinting

Each audio chunk goes through Shazam's recognition pipeline. I use the ShazamAPI Python library, which implements the same fingerprinting that the Shazam app uses:

  • Convert raw audio to WAV format
  • Generate audio fingerprint (spectral analysis)
  • Send fingerprint to Shazam's servers
  • Get back track metadata: title, artist, album, links
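
The first step in that list, wrapping raw audio in a WAV container, needs only the standard library. Here is a small sketch; the function name `pcm_to_wav` and the default format (44.1 kHz, stereo, 16-bit) are assumptions for illustration, and the real values should come from the capture device.

```python
import io
import wave


def pcm_to_wav(pcm_bytes, sample_rate=44100, channels=2, sample_width=2):
    """Wrap raw PCM bytes in a WAV container and return the file as bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)  # bytes per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm_bytes)
    return buffer.getvalue()
```

Building the file in memory avoids writing a temporary file for every chunk.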

If Shazam doesn't recognize it (common with obscure tracks), I fall back to AudD as a secondary API.
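
The fallback logic can be kept independent of either service. The sketch below assumes each recognizer is a plain callable that takes WAV bytes and returns a metadata dict or `None`; those callables, and the `source` key, are hypothetical and do not reflect the actual ShazamAPI or AudD interfaces.

```python
def recognize_with_fallback(wav_bytes, recognizers):
    """Try each (name, recognizer) pair in order and return the first match.

    Each recognizer is a hypothetical callable: WAV bytes in, metadata dict
    out, or None when the service has no match.
    """
    for name, recognize in recognizers:
        try:
            track = recognize(wav_bytes)
        except Exception:
            # Treat a failing service like a miss and try the next one.
            continue
        if track:
            # Record which service produced the match.
            return {**track, "source": name}
    return None
```

Passing the services as an ordered list makes it easy to add a third API later or to swap the order.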

3. Deduplication

When you're listening to a 7-minute track, the recognizer will identify it multiple times. I use a simple pending/confirmed system:

  • First recognition → mark as "pending"
  • Second recognition of same track → mark as "confirmed" and save
  • Different track recognized → discard pending, start new

Requiring two matches in a row filters out most one-off false positives, and saving only on confirmation avoids duplicate entries.
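
The pending/confirmed rules fit in a small state machine. This is one possible reading of them, not the project's code: the class name `TrackConfirmer` is made up, a track is keyed by a `(title, artist)` tuple, and a repeat of the already-confirmed track is assumed to clear any pending candidate.

```python
class TrackConfirmer:
    """Confirm a track only after two consecutive recognitions."""

    def __init__(self):
        self.pending = None    # seen once, not yet saved
        self.confirmed = None  # most recently saved track

    def observe(self, track_key):
        """Feed one recognition; return the key if it was just confirmed."""
        if track_key == self.confirmed:
            # Still the same track: nothing to save, drop any stray pending.
            self.pending = None
            return None
        if track_key == self.pending:
            # Second recognition in a row: confirm and report it.
            self.pending = None
            self.confirmed = track_key
            return track_key
        # A different track: discard the old pending and start over.
        self.pending = track_key
        return None
```

The caller saves a track only when `observe` returns a key, so a 7-minute track recognized many times is written once.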

4. Storage & Export

Confirmed tracks go into:

  • SQLite database — fast lookups, deduplication by title+artist
  • Excel file — formatted spreadsheet with links to Spotify, SoundCloud, YouTube Music
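
Deduplication by title and artist can be pushed into SQLite itself with a `UNIQUE` constraint. The schema below is a guess at a minimal table, not the project's real one; the table and column names are assumptions, and the Excel export is left out.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open the database and create the (assumed) tracks table if missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS tracks (
            id INTEGER PRIMARY KEY,
            title TEXT NOT NULL,
            artist TEXT NOT NULL,
            album TEXT,
            recognized_at TEXT DEFAULT CURRENT_TIMESTAMP,
            UNIQUE (title, artist)
        )
        """
    )
    return conn


def save_track(conn, title, artist, album=None):
    """Insert a track; return True if it was new, False if already stored."""
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "INSERT OR IGNORE INTO tracks (title, artist, album) VALUES (?, ?, ?)",
            (title, artist, album),
        )
    return cursor.rowcount == 1
```

Note that this comparison is case-sensitive, so "Track A" and "track a" would count as different titles unless the values are normalized first.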

The GUI

I built a minimal Tkinter GUI with a dark theme (matching my IDE aesthetic):

  • Real-time display of currently playing track
  • History of all recognized tracks with clickable streaming links
  • System tray integration — runs quietly in the background
  • Delete/dismiss controls for false positives

Set Analyzer Mode

The coolest feature: Set Analyzer. Give it a YouTube URL of a DJ set, and it:

  1. Downloads the audio using yt-dlp
  2. Splits it into segments
  3. Runs Shazam on each segment
  4. Outputs a timestamped tracklist

I've used this to create tracklists for sets that have no tracklist posted online.
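
The segmenting and tracklist steps are mostly bookkeeping, and can be sketched without the download or recognition parts. Everything here is illustrative: the 30-second segment length is an assumption (the article does not state one), and the function names are invented.

```python
def segment_offsets(total_seconds, segment_seconds=30):
    """Start offset, in seconds, of each segment in the set."""
    return list(range(0, total_seconds, segment_seconds))


def format_timestamp(seconds):
    """Format a second count as H:MM:SS."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:d}:{minutes:02d}:{secs:02d}"


def build_tracklist(recognitions):
    """Turn (offset_seconds, track_or_None) pairs into tracklist lines.

    Pairs must be in time order. Segments with no match are skipped, and
    repeated hits for the same track collapse into its first timestamp.
    """
    lines = []
    previous = None
    for offset, track in recognitions:
        if track is None or track == previous:
            continue
        lines.append(f"{format_timestamp(offset)}  {track}")
        previous = track
    return lines
```

For example, hits for the same track at 0:00:30 and 0:01:00 produce a single line at 0:00:30.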

Stats After One Month

  • 545 unique tracks recognized
  • ~3,000 total recognitions (before deduplication)
  • Primary genres: deep house, afro house, melodic techno, progressive
  • Most recognized artist: Da Africa Deep (the man is everywhere)
  • Weirdest recognition: Bach's Passacaglia in C Minor during a Keinemusik set (it was a sample)

What I Learned

  1. Audio APIs on Windows are painful. WASAPI loopback is the only reliable way to capture system audio, and the Python bindings are fragile.
  2. Shazam is remarkably good at identifying tracks even with background noise, EQ changes, and DJ mixing.
  3. The 80/20 rule applies. The core recognizer took a day. The GUI, edge cases, and polish took a week.
  4. Build tools you actually use. This isn't a portfolio project; I run it every day. That's the best motivation.

Try It Yourself

The full source is on GitHub. You'll need a free AudD API token (300 requests/day) and Python 3.10+.
