Department of Electronics 2020-2021: Prof. Shilpa Achaliya

RAMRAO ADIK INSTITUTE OF TECHNOLOGY
Nerul, Navi Mumbai - 400706.
A PROJECT REPORT
On
"Python Based Audio Spectrum Analyzer"

Submitted in partial fulfillment of the requirement for the award of the degree of
BACHELOR OF ENGINEERING
IN
ELECTRONICS
SEMESTER - 4
Submitted by
Nilesh Singh 19EE1144
Karan Chauhan 19EE1143
Kunal Tandel 16EE2004
Under the Guidance of

Prof. SHILPA ACHALIYA
DEPARTMENT OF ELECTRONICS
2020-2021
Contents
1 Introduction
1.1 Features of audio spectrum analyzer
1.1.1 Spectrogram
1.1.2 Zero crossing ratio
1.1.3 Spectral centroid
1.1.4 Chroma frequencies
2 Libraries Used
2.1 Numpy
2.2 Wave
2.3 PyAudio
2.4 PyQt5
2.5 PyQtGraph
3 Source Code
3.1 audio_feature_extraction.py
3.2 record_play.py
3.3 pyqtgraph_ui.py
3.4 application.py
3.5 main_window.py
4 Screenshots
5 Conclusion
6 References
List of Figures
1 Live Power Spectrum
2 Spectrogram
3 Zero crossing ratio
4 Spectral centroid
5 Chroma frequencies
1 Introduction
A spectrum analyzer measures the magnitude of an input signal versus frequency within the full
frequency range of the instrument. Its primary use is to measure the power of the spectrum of known
and unknown signals. The input signal that most common spectrum analyzers measure is electrical.
This project is a real-time, Python-based audio spectrum analyzer. Here, an audio file is read and its
information and frame rate are processed to generate a graphical animation. First, input audio from
the microphone is read using PyAudio. The raw audio data is initially in bytes format, so it needs to
be converted into integers before the graph can be plotted. The bytes data is converted into integers
using NumPy and stored as arrays.
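As a minimal sketch of the conversion described above, the snippet below packs a few 16-bit samples
into bytes, as a stand-in for one buffer returned by a PyAudio stream, and decodes them again with
np.frombuffer. The sample values and the names samples and raw_bytes are illustrative assumptions,
and little-endian 16-bit samples are assumed.

```python
import numpy as np

# Hypothetical stand-in for one chunk read from the microphone:
# four signed 16-bit samples packed as little-endian bytes.
samples = np.array([0, 1000, -1000, 32767], dtype="<i2")
raw_bytes = samples.tobytes()

# Decode the raw bytes back into an integer array that can be plotted.
decoded = np.frombuffer(raw_bytes, dtype="<i2")

print(len(raw_bytes))  # number of bytes: two per 16-bit sample
print(decoded)
```

Each 16-bit sample occupies two bytes, so a chunk of 1024 samples arrives as 2048 bytes.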
To plot the input data in the time domain we use PyQtGraph, which gives smooth and responsive
control over the graph. To plot the spectrum of the input we first need to calculate its Fourier
transform; this is done using the Fast Fourier Transform (FFT) algorithm. NumPy has a built-in FFT
function, which we use to calculate the FFT and the power spectral density of the input signal. The
FFT converts the time-domain function into a frequency-domain function, which makes the input
signal easier to analyze. This spectrum is also graphed with PyQtGraph. The two graphs are plotted
in a single PyQt5 window.
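The FFT step described above can be sketched on a synthetic signal, without a microphone. The tone
frequency and the variable names below are assumptions made for illustration; the window and the
decibel scaling follow the same pattern as the update() method in pyqtgraph_ui.py, with a small
offset added so that silence does not produce log10(0).

```python
import numpy as np

RATE = 44100       # sampling rate in Hz (assumed, as in record_play.py)
CHUNK = 1024       # samples per analysis block
TONE_HZ = 1000.0   # hypothetical test tone

# Synthetic input standing in for one microphone chunk.
t = np.arange(CHUNK) / RATE
signal = np.sin(2 * np.pi * TONE_HZ * t)

# Window the block, take the real-input FFT and convert magnitude to dB.
windowed = signal * np.hanning(CHUNK)
magnitude = np.abs(np.fft.rfft(windowed)) / CHUNK
power_db = 20 * np.log10(magnitude + 1e-12)  # small offset avoids log10(0)

# Frequency in Hz of each FFT bin, and the bin holding the most power.
freqs = np.fft.rfftfreq(CHUNK, d=1.0 / RATE)
peak_hz = freqs[np.argmax(power_db)]
print(peak_hz)
```

The peak lands on the FFT bin nearest to the tone, so the frequency resolution is RATE / CHUNK,
roughly 43 Hz for these values.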
1.1 Features of audio spectrum analyzer

This Python application provides several useful features for analyzing audio data:
1. Spectrogram
2. Zero crossing ratio
3. Spectral centroid
4. Chroma frequencies
1.1.1 Spectrogram
A spectrogram is a visual representation of the spectrum of frequencies of sound or other signals as
they vary with time. Spectrograms are sometimes called sonographs, voiceprints or voicegrams. When
the data is stored as a 2-dimensional array, the first axis is frequency and the second axis is time.
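To illustrate the array layout described above, here is a minimal NumPy-only sketch of a short-time
Fourier transform. The project itself uses librosa.stft for this; the function name spectrogram_db,
the frame length, the hop size and the test signal are all assumptions chosen for the example.

```python
import numpy as np

def spectrogram_db(x, frame_len=256, hop=128):
    """Return a (frequency, time) array of magnitudes in dB."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    columns = []
    for i in range(n_frames):
        frame = x[i * hop : i * hop + frame_len] * window
        columns.append(np.abs(np.fft.rfft(frame)))
    # Stack so that rows are frequency bins and columns are time frames.
    mag = np.stack(columns, axis=1)
    return 20 * np.log10(mag + 1e-12)

# Hypothetical test signal: one second at 8 kHz, 500 Hz then 2000 Hz.
sr = 8000
t = np.arange(sr) / sr
x = np.where(t < 0.5, np.sin(2 * np.pi * 500 * t), np.sin(2 * np.pi * 2000 * t))

S = spectrogram_db(x)
print(S.shape)                                   # (frequency bins, time frames)
print(np.argmax(S[:, 0]), np.argmax(S[:, -1]))   # strongest bin at start and end
```

The strongest frequency bin moves upward between the first and the last column, which is the change
over time that a spectrogram plot makes visible.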
1.1.2 Zero crossing ratio
The zero crossing rate is the rate of sign changes along a signal, i.e., the rate at which the signal changes
from positive to negative or back. This feature has been used heavily in both speech recognition and
music information retrieval. It usually has higher values for highly percussive sounds like those in
metal and rock.
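The definition above can be sketched directly in NumPy. The project uses
librosa.feature.zero_crossing_rate, which reports one value per frame; the simplified function below
is an assumption for illustration and returns a single value for the whole signal.

```python
import numpy as np

def zero_crossing_rate(x):
    """Fraction of adjacent sample pairs whose signs differ."""
    signs = np.signbit(x)
    crossings = signs[1:] != signs[:-1]
    return float(np.mean(crossings))

# Hypothetical test tones: one second each at 8 kHz.
sr = 8000
t = np.arange(sr) / sr
low = np.sin(2 * np.pi * 100 * t + 0.1)    # 100 Hz tone
high = np.sin(2 * np.pi * 1000 * t + 0.1)  # 1000 Hz tone

zcr_low = zero_crossing_rate(low)
zcr_high = zero_crossing_rate(high)
print(zcr_low, zcr_high)
```

A pure tone crosses zero twice per period, so the higher tone gives a proportionally higher rate.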
1.1.3 Spectral centroid
It indicates where the "center of mass" of a sound is located and is calculated as the weighted mean of
the frequencies present in the sound. Consider two songs, one from the blues genre and the other
belonging to metal. Compared to the blues song, which is the same throughout its length, the metal
song has more frequencies towards the end. So the spectral centroid for the blues song will lie
somewhere near the middle of its spectrum, while that for the metal song will be towards its end.
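The weighted mean mentioned above can be written out in a few lines. This is a simplified sketch
that computes one centroid for a whole signal, whereas librosa.feature.spectral_centroid in the
project returns one value per frame; the function name and the test tones are assumptions.

```python
import numpy as np

def spectral_centroid(x, sr):
    """Magnitude-weighted mean frequency of the signal, in Hz."""
    magnitude = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sr)
    return float(np.sum(freqs * magnitude) / np.sum(magnitude))

# Hypothetical test signals: one second each at 8 kHz.
sr = 8000
t = np.arange(sr) / sr
low_tone = np.sin(2 * np.pi * 200 * t)
mix = low_tone + np.sin(2 * np.pi * 3000 * t)

c_low = spectral_centroid(low_tone, sr)
c_mix = spectral_centroid(mix, sr)
print(c_low, c_mix)
```

Adding an equally strong 3000 Hz component pulls the centroid from about 200 Hz to about midway
between the two tones.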
1.1.4 Chroma frequencies
Chroma features are an interesting and powerful representation of music audio in which the entire
spectrum is projected onto 12 bins representing the 12 distinct semitones (or chroma) of the musical
octave.
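The projection onto 12 bins can be sketched by assigning every FFT bin to its nearest semitone and
summing the energy per pitch class. This is a simplified illustration, not the filter bank used by
librosa.feature.chroma_stft; the function name chroma_vector, the 20 Hz cutoff and the test tone are
assumptions.

```python
import numpy as np

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def chroma_vector(x, sr):
    """Fold FFT energy into 12 pitch-class bins (C = 0, ..., B = 11)."""
    power = np.abs(np.fft.rfft(x * np.hanning(len(x)))) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sr)
    chroma = np.zeros(12)
    audible = freqs > 20.0  # skip DC and very low bins
    # MIDI note number: 69 is A4 = 440 Hz, with 12 semitones per octave.
    midi = 69 + 12 * np.log2(freqs[audible] / 440.0)
    pitch_class = np.round(midi).astype(int) % 12
    # Accumulate the power of every bin into its pitch class.
    np.add.at(chroma, pitch_class, power[audible])
    return chroma / chroma.sum()

# Hypothetical test tone: one second of A4 (440 Hz) at 8 kHz.
sr = 8000
t = np.arange(sr) / sr
a4 = np.sin(2 * np.pi * 440 * t)

chroma = chroma_vector(a4, sr)
strongest = NOTE_NAMES[int(np.argmax(chroma))]
print(strongest)
```

Because octaves are folded together, a 220 Hz or 880 Hz tone would fall into the same bin as 440 Hz.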
2 Libraries Used
2.1 Numpy
NumPy is the fundamental package for scientific computing with Python. It is used to convert the
bytes data into integer arrays and to generate the frequency-domain representation of the audio signal.
2.2 Wave
The wave module provides a convenient interface to the WAV sound format. It does not support
compression/decompression, but it does support mono/stereo.
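A short round trip shows how the wave module is used. This is a sketch under assumed values: the
tone, the temporary file name example.wav and the variable names are illustrative, while the header
fields mirror those set in record_play.py.

```python
import os
import tempfile
import wave

import numpy as np

RATE = 44100      # frames per second, as in record_play.py
CHANNELS = 1      # mono
SAMPLE_WIDTH = 2  # bytes per sample (16-bit)

# Hypothetical payload: 0.1 s of a 440 Hz tone as 16-bit integers.
t = np.arange(int(0.1 * RATE)) / RATE
samples = (0.5 * 32767 * np.sin(2 * np.pi * 440 * t)).astype("<i2")

path = os.path.join(tempfile.mkdtemp(), "example.wav")

# Write the header fields and the raw frames.
with wave.open(path, "wb") as wf:
    wf.setnchannels(CHANNELS)
    wf.setsampwidth(SAMPLE_WIDTH)
    wf.setframerate(RATE)
    wf.writeframes(samples.tobytes())

# Read the file back and recover the same integers.
with wave.open(path, "rb") as wf:
    params = (wf.getnchannels(), wf.getsampwidth(), wf.getframerate())
    restored = np.frombuffer(wf.readframes(wf.getnframes()), dtype="<i2")

print(params, len(restored))
```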
2.3 PyAudio
PyAudio provides Python bindings for PortAudio, the cross-platform audio I/O library. With PyAudio,
you can easily use Python to play and record audio on a variety of platforms.
2.4 PyQt5
PyQt5 is a comprehensive set of Python bindings for Qt v5. It is implemented as more than 35
extension modules and enables Python to be used as an alternative application development language
to C++ on all supported platforms including iOS and Android.
2.5 PyQtGraph
PyQtGraph is a pure-Python graphics and GUI library built on PyQt5/PySide and NumPy. It is
intended for use in mathematics/scientific/engineering applications. Despite being written entirely
in Python, the library is very fast due to its heavy leverage of NumPy for number crunching and Qt's
GraphicsView framework for fast display.
3 Source Code
3.1 audio_feature_extraction.py
import matplotlib.pyplot as plt
import numpy as np
import librosa
import librosa.display
import sklearn.preprocessing

# Load the recorded audio once: x holds the samples, sr the sampling rate.
x, sr = librosa.load('output.wav')


def audiowaveform():
    # Plot the waveform (amplitude against time).
    plt.style.use('ggplot')
    plt.figure(figsize=(14, 5))
    librosa.display.waveplot(x, sr=sr)


def zcr():
    # Plot the zero crossing rate of each frame.
    zcrs = librosa.feature.zero_crossing_rate(x)
    plt.figure(figsize=(14, 5))
    plt.title('Zero crossing rate')
    plt.plot(zcrs[0])


def spectrogram():
    # Short-time Fourier transform, converted to decibels for display.
    X = librosa.stft(x)
    Xdb = librosa.amplitude_to_db(abs(X))
    plt.figure(figsize=(14, 5))
    plt.title('Spectrogram')
    librosa.display.specshow(Xdb, sr=sr, x_axis='time', y_axis='log')
    plt.colorbar()


def spectralcentroid():
    spectral_centroids = librosa.feature.spectral_centroid(y=x, sr=sr)[0]
    frames = range(len(spectral_centroids))
    t = librosa.frames_to_time(frames)

    def normalize(x, axis=0):
        # Scale the centroid values to the range 0..1 for plotting.
        return sklearn.preprocessing.minmax_scale(x, axis=axis)

    plt.figure(figsize=(14, 5))
    plt.title('Spectral Centroid')
    librosa.display.waveplot(x, sr=sr, alpha=0.4)
    plt.plot(t, normalize(spectral_centroids), color='r')


def chromafrequencies():
    hop_length = 512
    chromagram = librosa.feature.chroma_stft(y=x, sr=sr, hop_length=hop_length)
    plt.figure(figsize=(14, 5))
    plt.title('Chroma frequencies')
    librosa.display.specshow(chromagram, x_axis='time', y_axis='chroma', hop_length=hop_length,
3.2 record_play.py
import pyaudio
import wave
import time

RECORD_SECONDS = 5
WAVE_OUTPUT_FILENAME = "output.wav"


def setTimer(seconds):
    global RECORD_SECONDS
    RECORD_SECONDS = seconds


def recordAudio():
    CHUNK = 1024
    FORMAT = pyaudio.paInt16
    CHANNELS = 1
    RATE = 44100
    p = pyaudio.PyAudio()
    stream = p.open(format=FORMAT,
                    channels=CHANNELS,
                    rate=RATE,
                    input=True,
                    frames_per_buffer=CHUNK)
    print("* recording")
    frames = []
    for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
        data = stream.read(CHUNK)
        frames.append(data)
    print("* done recording")
    stream.stop_stream()
    stream.close()
    p.terminate()
    wf = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
    wf.setnchannels(CHANNELS)
    wf.setsampwidth(p.get_sample_size(FORMAT))
    wf.setframerate(RATE)
    wf.writeframes(b''.join(frames))
    wf.close()


def playAudio():
    CHUNK = 1024
    wf = wave.open(WAVE_OUTPUT_FILENAME, 'rb')
    p = pyaudio.PyAudio()

    def callback(in_data, frame_count, time_info, status):
        data = wf.readframes(frame_count)
        return (data, pyaudio.paContinue)

    stream = p.open(format=p.get_format_from_width(wf.getsampwidth()),
                    channels=wf.getnchannels(),
                    rate=wf.getframerate(),
                    output=True,
                    stream_callback=callback)
    stream.start_stream()
    while stream.is_active():
        time.sleep(0.1)
    stream.stop_stream()
    stream.close()
    wf.close()
    p.terminate()
3.3 pyqtgraph_ui.py
import pyaudio
import numpy as np
from PyQt5 import QtCore, QtGui, QtWidgets
from pyqtgraph import GraphicsLayoutWidget


class Ui_MainWindow1(object):
    def setupUi(self, MainWindow):
        MainWindow.setObjectName("MainWindow")
        MainWindow.resize(1261, 741)
        self.centralwidget = QtWidgets.QWidget(MainWindow)
        self.centralwidget.setObjectName("centralwidget")
        self.horizontalLayout = QtWidgets.QHBoxLayout(self.centralwidget)
        self.horizontalLayout.setObjectName("horizontalLayout")
        self.graphicsView = GraphicsLayoutWidget(self.centralwidget)
        self.graphicsView.setObjectName("graphicsView")
        self.horizontalLayout.addWidget(self.graphicsView)
        MainWindow.setCentralWidget(self.centralwidget)
        self.actionExit = QtWidgets.QAction(MainWindow)
        self.actionExit.setObjectName("actionExit")
        self.retranslateUi(MainWindow)
        QtCore.QMetaObject.connectSlotsByName(MainWindow)
        self.audiocapture()

    # Function to capture the audio data
    def audiocapture(self):
        self.FORMAT = pyaudio.paInt16
        self.CHANNELS = 1
        self.RATE = 14100
        self.CHUNK = 1024
        self.MAX_PLOT_SIZE = self.CHUNK * 50
        # setup audio recording
        self.audio = pyaudio.PyAudio()
        self.stream = self.audio.open(format=self.FORMAT, channels=self.CHANNELS,
                                      rate=self.RATE, input=True,
                                      frames_per_buffer=self.CHUNK)
        self.win = self.graphicsView
        # create a plot for the time domain data
        self.data_plot = self.win.addPlot(title="Audio Signal Vs Time")
        self.data_plot.setXRange(0, self.MAX_PLOT_SIZE)
        self.data_plot.showGrid(True, True)
        self.data_plot.addLegend()
        self.time_curve = self.data_plot.plot(pen=(24, 215, 248), name="Time Domain Audio")
        # create a plot for the frequency domain data
        self.win.nextRow()
        self.fft_plot = self.win.addPlot(title="Power Vs Frequency Domain")
        self.fft_plot.addLegend()
        self.fft_curve = self.fft_plot.plot(pen='y', name="Power Spectrum")
        self.fft_plot.showGrid(True, True)
        self.total_data = []
        # call update() repeatedly from the Qt event loop
        self.timer = QtCore.QTimer()
        self.timer.timeout.connect(self.update)
        self.timer.start(0)

    # update chunk of audio data
    def update(self):
        # read data
        raw_data = self.stream.read(self.CHUNK)
        # convert raw bytes into integers
        data_sample = np.frombuffer(raw_data, dtype=np.int16)
        self.total_data = np.concatenate([self.total_data, data_sample])
        # remove old data
        if len(self.total_data) > self.MAX_PLOT_SIZE:
            self.total_data = self.total_data[self.CHUNK:]
        self.time_curve.setData(self.total_data)
        # calculate the FFT
        fft_data = data_sample * np.hanning(len(data_sample))
        power_spectrum = 20 * np.log10(np.abs(np.fft.rfft(fft_data)) / len(fft_data))
        self.fft_curve.setData(power_spectrum)
        self.fft_plot.enableAutoRange('xy', False)

    def retranslateUi(self, MainWindow):
        _translate = QtCore.QCoreApplication.translate
        MainWindow.setWindowTitle(_translate("MainWindow", "MainWindow"))
        self.actionExit.setText(_translate("MainWindow", "Exit"))
3.4 application.py
import sys
from PyQt5 import QtCore, QtGui, QtWidgets
from PyQt5.QtWidgets import *
from main_window import Ui_MainWindow
from record_play import recordAudio, setTimer, playAudio
from audio_feature_extraction import *
from pyqtgraph_ui import Ui_MainWindow1


class MainWindow(QMainWindow, Ui_MainWindow):

    def __init__(self, sl_value=0):
        super().__init__()
        self.setupUi(self)
        self.setWindowIcon(QtGui.QIcon('icon1.png'))
        # connect each button to its action
        self.exitButton.clicked.connect(qApp.quit)
        self.slider.valueChanged['int'].connect(self.lcd.display)
        self.recordButton.clicked.connect(recordAudio)
        self.playButton.clicked.connect(playAudio)
        self.spectrumButton.clicked.connect(self.second_win)
        self.spectrogramButton.clicked.connect(spectrogram)
        self.zcrButton.clicked.connect(zcr)
        self.spectralButton.clicked.connect(spectralcentroid)
        self.chromaButton.clicked.connect(chromafrequencies)
        self.slider.setValue(sl_value)
        self.slider.valueChanged[int].connect(self.valuechange)

    def second_win(self):
        # open the live power spectrum in a second window
        self.MainWindow = QtWidgets.QMainWindow()
        self.ui = Ui_MainWindow1()
        self.ui.setupUi(self.MainWindow)
        self.MainWindow.show()

    def valuechange(self, value):
        # use the slider value as the recording duration in seconds
        setTimer(value)

    def spectrum(self):
        print('Spectrum button clicked')


if __name__ == '__main__':
    app = QApplication(sys.argv)
    window = MainWindow()
    window.show()
    app.exec_()
3.5 main_window.py
from PyQt5 import QtCore, QtGui, QtWidgets


class Ui_MainWindow(object):
    def setupUi(self, MainWindow):
        MainWindow.setObjectName("MainWindow")
        MainWindow.resize(320, 660)
        sizePolicy = QtWidgets.QSizePolicy(QtWidgets.QSizePolicy.Fixed, QtWidgets.QSizePolicy.Fi
        sizePolicy.setHorizontalStretch(0)
        sizePolicy.setVerticalStretch(0)
        sizePolicy.setHeightForWidth(MainWindow.sizePolicy().hasHeightForWidth())
        MainWindow.setSizePolicy(sizePolicy)
        MainWindow.setMaximumSize(QtCore.QSize(320, 660))
        MainWindow.setMinimumSize(QtCore.QSize(320, 660))
        MainWindow.setStyleSheet("QMainWindow{\n"
                                 " background-color: #ffdd61\n"
                                 "\n"
                                 "}")
        self.centralwidget = QtWidgets.QWidget(MainWindow)
        self.centralwidget.setObjectName("centralwidget")
        self.lcd = QtWidgets.QLCDNumber(self.centralwidget)
        self.lcd.setGeometry(QtCore.QRect(10, 10, 301, 91))
        self.lcd.setAutoFillBackground(False)
        self.lcd.setStyleSheet("QLCDNumber{\n"
                               " background-color :#e3e3e3\n"
                               "}")
        self.lcd.setObjectName("lcd")
        self.spectrogramButton = QtWidgets.QPushButton(self.centralwidget)
        self.spectrogramButton.setGeometry(QtCore.QRect(10, 340, 301, 51))
        font = QtGui.QFont()
        font.setFamily("Montserrat")
        font.setPointSize(12)
        self.spectrogramButton.setFont(font)
        self.spectrogramButton.setStyleSheet("QPushButton{\n"
                                             " background-color: #000000;\n"
                                             " border-width: 20px;\n"
                                             " border-radius : 25px;\n"
                                             " color: #ffffff;\n"
                                             "}\n"
                                             "QPushButton:pressed{\n"
                                             " background-color: #ff5457;\n"
                                             " color: #000000;\n"
                                             " font-size: 20px;\n"
                                             "}")
        self.spectrogramButton.setObjectName("spectrogramButton")
        self.recordButton = QtWidgets.QPushButton(self.centralwidget)
        self.recordButton.setGeometry(QtCore.QRect(10, 160, 301, 51))
        self.recordButton.setFont(font)
        self.recordButton.setStyleSheet("QPushButton{\n"
                                        "}\n"
                                        " background-color: qlineargradient(x1: 0, y1: 0, x2: 0, y2: 1,\n"
                                        " stop: 0 #dadbde, stop: 1#ff5457);\n"
                                        " color: #000000;\n"
                                        "}")
        self.recordButton.setObjectName("recordButton")
        self.spectrumButton = QtWidgets.QPushButton(self.centralwidget)
        self.spectrumButton.setGeometry(QtCore.QRect(10, 280, 301, 51))
        self.spectrumButton.setFont(font)
        self.spectrumButton.setStyleSheet("QPushButton{\n"
                                          "}\n"
                                          " color: #000000;\n"
                                          "}")
        self.spectrumButton.setObjectName("spectrumButton")
        self.spectralButton = QtWidgets.QPushButton(self.centralwidget)
        self.spectralButton.setGeometry(QtCore.QRect(10, 460, 301, 51))
        self.spectralButton.setFont(font)
        self.spectralButton.setStyleSheet("QPushButton{\n"
                                          "}\n"
                                          " color: #000000;\n"
                                          "}")
        self.spectralButton.setObjectName("spectralButton")
        self.zcrButton = QtWidgets.QPushButton(self.centralwidget)
        self.zcrButton.setGeometry(QtCore.QRect(10, 400, 301, 51))
        self.zcrButton.setFont(font)
        self.zcrButton.setStyleSheet("QPushButton{\n"
                                     "}\n"
                                     " color: #000000;\n"
                                     "}")
        self.zcrButton.setObjectName("zcrButton")
        self.slider = QtWidgets.QSlider(self.centralwidget)
        self.slider.setGeometry(QtCore.QRect(10, 110, 301, 41))
        self.slider.setAutoFillBackground(False)
        self.slider.setStyleSheet("QSlider{\n"
                                  " background-color: #ffdd61\n"
                                  "}")
        self.slider.setMaximum(10)
        self.slider.setOrientation(QtCore.Qt.Horizontal)
        self.slider.setObjectName("slider")
        self.exitButton = QtWidgets.QPushButton(self.centralwidget)
        self.exitButton.setGeometry(QtCore.QRect(10, 600, 301, 51))
        self.exitButton.setFont(font)
        self.exitButton.setStyleSheet("QPushButton{\n"
                                      "}\n"
                                      " color: #f6f7fa;\n"
                                      "}")
        self.exitButton.setObjectName("exitButton")
        self.chromaButton = QtWidgets.QPushButton(self.centralwidget)
        self.chromaButton.setGeometry(QtCore.QRect(10, 520, 301, 51))
        self.chromaButton.setFont(font)
        self.chromaButton.setStyleSheet("QPushButton{\n"
                                        "}\n"
                                        " color: #000000;\n"
                                        "}")
        self.chromaButton.setObjectName("chromaButton")
        self.playButton = QtWidgets.QPushButton(self.centralwidget)
        self.playButton.setGeometry(QtCore.QRect(10, 220, 301, 51))
        self.playButton.setFont(font)
        self.playButton.setStyleSheet("QPushButton{\n"
                                      "}\n"
                                      " background-color: qlineargradient(x1: 0, y1: 0, x2: 0, y2: 1,\n"
                                      " stop: 0 #dadbde, stop: 1#ff5457);\n"
                                      " color: #000000;\n"
                                      "}")
        self.playButton.setObjectName("playButton")
        MainWindow.setCentralWidget(self.centralwidget)
        self.retranslateUi(MainWindow)
        self.slider.valueChanged['int'].connect(self.lcd.display)
        QtCore.QMetaObject.connectSlotsByName(MainWindow)

    def retranslateUi(self, MainWindow):
        _translate = QtCore.QCoreApplication.translate
        MainWindow.setWindowTitle(_translate("MainWindow", "Audio Spectrum Analyzer"))
        self.spectrogramButton.setText(_translate("MainWindow", "SPECTROGRAM"))
        self.recordButton.setText(_translate("MainWindow", "RECORD"))
        self.spectrumButton.setText(_translate("MainWindow", "LIVE POWER SPECTRUM"))
        self.spectralButton.setText(_translate("MainWindow", "SPECTRAL CENTROID"))
        self.zcrButton.setText(_translate("MainWindow", "ZERO CROSSING RATIO"))
        self.exitButton.setText(_translate("MainWindow", "EXIT"))
        self.chromaButton.setText(_translate("MainWindow", "CHROMA FREQUENCIES"))
        self.playButton.setText(_translate("MainWindow", "PLAY"))
4 Screenshots
Figure 1: Live Power Spectrum
Figure 2: Spectrogram
Figure 3: Zero crossing ratio
Figure 4: Spectral centroid
Figure 5: Chroma frequencies
5 Conclusion
The project implemented basic shape drawing, scaling, translation, window-viewport coordinate
transformations and frame refresh to generate a real-time animation of audio signal visualization.
The project successfully demonstrates the spectrum analysis of audio signals.
6 References
1. https://numpy.org/
2. https://pypi.org/project/PyQt5/
3. https://matplotlib.org/
4. https://www.scipy.org/