
songsee

Generate professional-grade spectrograms, feature-panel visualizations, and audio analysis plots from audio files using the songsee CLI tool.

Introduction

The songsee skill brings high-performance audio visualization into your development or data analysis workflow. As a command-line tool, it turns raw audio files into visual representations of their spectral, rhythmic, and textural characteristics. It serves as a visual diagnostic for music information retrieval (MIR), debugging audio processing pipelines, and forensic audio analysis. WAV and MP3 are decoded natively; other formats are supported through ffmpeg.
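The split between native decoding and ffmpeg-backed decoding can be sketched as a small pre-flight check. This is only an illustration: it assumes the format can be judged from the file extension, and the helper names `needs_ffmpeg` and `can_decode` are hypothetical. How songsee itself detects formats or locates ffmpeg is not described here.

```python
import shutil
from pathlib import Path

# Formats the text describes as natively decoded. Matching on the file
# extension is an assumption made for this sketch.
NATIVE_EXTENSIONS = {".wav", ".mp3"}


def needs_ffmpeg(audio_path: str) -> bool:
    """Return True when the file is not one of the natively decoded formats."""
    return Path(audio_path).suffix.lower() not in NATIVE_EXTENSIONS


def can_decode(audio_path: str) -> bool:
    """Return True if the file is native, or if ffmpeg is found on PATH."""
    if not needs_ffmpeg(audio_path):
        return True
    return shutil.which("ffmpeg") is not None
```

A batch script could call `can_decode` on each input first and skip or report files that would need ffmpeg on a machine where it is missing.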

  • Multiple visualization types: spectrogram, mel-spectrogram, chroma, harmonic-percussive source separation (HPSS), self-similarity matrix, loudness, tempogram, MFCC, and spectral flux.

  • Configurable visual output with multiple color palettes such as magma, inferno, viridis, classic, and grayscale to suit research or presentation needs.

  • Precise time-slicing capabilities, allowing users to extract and visualize specific segments of an audio file by defining start times and durations.

  • Flexible output customization, including adjustable FFT window/hop settings, frequency range filtering, and output dimensions.

  • Robust command-line integration, supporting both file-based inputs and piped stdin streams for automated processing pipelines.

  • Native WAV and MP3 decoding, with other input formats handled automatically when ffmpeg is detected on the system.

  • Grid-style layouts when several visualization types are requested at once, for side-by-side comparison.

  • Well suited to researchers and developers working on AI audio agents, machine learning audio feature engineering, or general audio analysis tasks.

  • Install ffmpeg when processing complex or non-standard audio codecs, for broad compatibility.

  • Output can be saved as standard high-quality image formats like PNG or JPG, suitable for inclusion in reports or documentation.
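The options listed above (visualization types, palette, time slice, output file) map naturally onto a single command line. The sketch below only assembles an argument list and does not run anything. The flag names `--viz`, `--style`, `--start`, `--duration`, and `-o`, and the short type names such as `mel`, are assumptions that should be checked against `songsee --help`. The helper `build_songsee_args` is hypothetical.

```python
from typing import List, Optional, Sequence


def build_songsee_args(
    audio_path: str,
    output_path: str,
    viz: Sequence[str] = ("spectrogram",),
    style: Optional[str] = None,
    start: Optional[float] = None,
    duration: Optional[float] = None,
) -> List[str]:
    """Assemble an argv list for a songsee call.

    All flag names here are assumptions for illustration; verify them
    against the installed CLI before use.
    """
    if not viz:
        raise ValueError("at least one visualization type is required")

    # Several types are joined with commas, which is assumed to request a grid.
    args = ["songsee", audio_path, "--viz", ",".join(viz)]

    if style is not None:
        args += ["--style", style]  # palette name, e.g. magma or viridis
    if start is not None:
        args += ["--start", str(start)]  # slice start, assumed to be seconds
    if duration is not None:
        args += ["--duration", str(duration)]  # slice length, assumed seconds

    args += ["-o", output_path]  # output image path (PNG or JPG)
    return args
```

Once the flags have been verified, the resulting list can be passed to `subprocess.run(args, check=True)`. Building the list separately keeps the call easy to log and test.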

Repository Stats

Stars: 365,620
Forks: 74,934
Open Issues: 6,998
Language: TypeScript
Default Branch: main
Sync Status: Idle
Last Synced: Apr 28, 2026, 11:27 AM