
RAMRAO ADIK INSTITUTE OF TECHNOLOGY

Nerul, Navi Mumbai - 400706.

A PROJECT REPORT

On

"Python Based Audio Spectrum Analyzer"


Submitted in partial fulfillment of the requirement for the award of the degree of

BACHELOR OF ENGINEERING
IN
ELECTRONICS

SEMESTER - 4

Submitted by

Nilesh Singh 19EE1144

Karan Chauhan 19EE1143

Kunal Tandel 16EE2004

Under the Guidance of


Prof. SHILPA ACHALIYA

DEPARTMENT OF ELECTRONICS

2020-2021

Contents
1 Introduction
1.1 Features of audio spectrum analyzer
1.1.1 Spectrogram
1.1.2 Zero crossing ratio
1.1.3 Spectral centroid
1.1.4 Chroma frequencies
2 Libraries Used
2.1 Numpy
2.2 Wave
2.3 PyAudio
2.4 PyQt5
2.5 PyQtGraph
3 Source Code
3.1 audio_feature_extraction.py
3.2 record_play.py
3.3 pyqtgraph_ui.py
3.4 application.py
3.5 main_window.py
4 Screenshots
5 Conclusion
6 References

List of Figures
1 Live Power Spectrum
2 Spectrogram
3 Zero crossing ratio
4 Spectral centroid
5 Chroma frequencies

1 Introduction
A spectrum analyzer measures the magnitude of an input signal versus frequency within the full
frequency range of the instrument. Its primary use is to measure the power of the spectrum of known
and unknown signals. The input signal that most common spectrum analyzers measure is electrical.

This project is a real-time, Python-based audio spectrum analyzer. An audio file is read, and its
sample data and frame rate are processed to generate a graphical animation. First, input audio
from the microphone is read using PyAudio. The raw audio data arrives as bytes and must be
converted to integers before it can be plotted. NumPy performs this conversion and stores the
samples in arrays.
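
The byte-to-integer step described above can be sketched as follows. This is a minimal illustration, assuming 16-bit little-endian samples (the paInt16 format used later in the source code); the sample values are made up for the example.

```python
import numpy as np

# Four made-up 16-bit samples, packed the way a microphone stream delivers them.
samples = np.array([0, 1000, -1000, 32767], dtype="<i2")
raw_bytes = samples.tobytes()

# Reinterpret the byte string as signed 16-bit integers.
decoded = np.frombuffer(raw_bytes, dtype="<i2")

print(len(raw_bytes))    # two bytes per sample
print(decoded.tolist())
```

Each sample occupies two bytes, so a block of 1024 samples arrives as 2048 bytes.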

To plot the input data in the time domain we use PyQtGraph, which gives smooth, responsive
control over the graph. To plot the spectrum of the input we first calculate its Fourier transform
using the Fast Fourier Transform (FFT) algorithm. NumPy has a built-in FFT function, which we
use to calculate the FFT and the power spectral density of the input signal. The FFT converts the
time-domain function into a frequency-domain function, which makes the input signal easier to
analyze. This graph is also drawn with PyQtGraph. The two graphs are plotted in a single PyQt5 window.
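
As a rough sketch of the FFT step, the block below windows one chunk of samples, takes the real FFT and converts the magnitude to decibels. The helper name power_spectrum_db, the 1 kHz test tone and the small offset added before the logarithm are assumptions made for this illustration; the sampling rate and chunk size match the values used in record_play.py.

```python
import numpy as np

RATE = 44100   # samples per second
CHUNK = 1024   # samples per block

def power_spectrum_db(block, rate):
    """Return (frequencies in Hz, magnitude in dB) for one block of samples."""
    windowed = block * np.hanning(len(block))            # reduce spectral leakage
    magnitude = np.abs(np.fft.rfft(windowed)) / len(block)
    freqs = np.fft.rfftfreq(len(block), d=1.0 / rate)    # bin index -> Hz
    return freqs, 20 * np.log10(magnitude + 1e-12)       # offset avoids log10(0)

# A synthetic 1 kHz tone stands in for microphone input.
t = np.arange(CHUNK) / RATE
tone = np.sin(2 * np.pi * 1000.0 * t)
freqs, spectrum = power_spectrum_db(tone, RATE)
peak_hz = freqs[np.argmax(spectrum)]
print(peak_hz)
```

With these values the bins are RATE / CHUNK (about 43 Hz) apart, so the reported peak lands on the bin nearest 1 kHz rather than on 1 kHz exactly.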

1.1 Features of audio spectrum analyzer


This Python application includes several useful features for analyzing audio data:
1. Spectrogram
2. Zero crossing ratio
3. Spectral centroid
4. Chroma frequencies

1.1.1 Spectrogram

A spectrogram is a visual representation of the spectrum of frequencies of sound or other signals as
they vary with time. Spectrograms are sometimes called sonographs, voiceprints or voicegrams. In
2-dimensional arrays, the first axis is frequency while the second axis is time.
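
The frequency-by-time layout can be illustrated with a small NumPy-only sketch. It is not the librosa routine used in the project; the function name, frame length, hop size and test signal are assumptions chosen for the example.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram with shape (frequency bins, time frames)."""
    window = np.hanning(frame_len)
    starts = range(0, len(signal) - frame_len + 1, hop)
    frames = np.array([signal[s:s + frame_len] * window for s in starts])
    # One FFT per frame, then transpose so that axis 0 is frequency.
    return np.abs(np.fft.rfft(frames, axis=1)).T

rate = 8000
t = np.arange(rate) / rate   # one second of samples
# 500 Hz during the first half second, 2000 Hz afterwards.
signal = np.where(t < 0.5,
                  np.sin(2 * np.pi * 500 * t),
                  np.sin(2 * np.pi * 2000 * t))

S = spectrogram(signal)
bin_hz = rate / 256          # width of one frequency bin
first_peak_hz = np.argmax(S[:, 0]) * bin_hz
last_peak_hz = np.argmax(S[:, -1]) * bin_hz
print(S.shape, first_peak_hz, last_peak_hz)
```

Reading along the second axis shows the dominant frequency moving from the first tone to the second, which is the change a spectrogram plot makes visible.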

1.1.2 Zero crossing ratio

The zero crossing rate is the rate of sign changes along a signal, i.e., the rate at which the signal changes
from positive to negative or back. This feature has been used heavily in both speech recognition and
music information retrieval. It usually has higher values for highly percussive sounds like those in
metal and rock.
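
A minimal sketch of this definition, assuming the rate is expressed as the fraction of adjacent sample pairs whose signs differ (the function name and test tones are illustrative, not taken from the project code):

```python
import numpy as np

def zero_crossing_rate(signal):
    """Fraction of adjacent sample pairs whose signs differ."""
    signs = np.signbit(signal)
    return np.mean(signs[1:] != signs[:-1])

rate = 8000
t = np.arange(rate) / rate
low = np.sin(2 * np.pi * 100 * t + 0.1)    # 100 Hz tone
high = np.sin(2 * np.pi * 1000 * t + 0.1)  # 1000 Hz tone

alternating = zero_crossing_rate(np.array([1.0, -1.0, 1.0, -1.0, 1.0]))
print(alternating, zero_crossing_rate(low), zero_crossing_rate(high))
```

The higher tone changes sign more often per sample, so its rate is larger; noisy or percussive material behaves similarly.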

1.1.3 Spectral centroid

It indicates where the "center of mass" of a sound is located and is calculated as the weighted mean of
the frequencies present in the sound. Consider two songs, one from the blues genre and the other from
metal. Compared to the blues song, which is the same throughout its length, the metal
song has more frequencies towards the end. So the spectral centroid for the blues song will lie somewhere
near the middle of its spectrum, while that for the metal song will be towards its end.
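
The weighted mean can be written out directly. The sketch below computes one centroid for a whole signal, using the FFT magnitudes as weights; the function name and the test tones are assumptions for illustration, and the project itself relies on librosa, which computes a centroid per frame.

```python
import numpy as np

def spectral_centroid(signal, rate):
    """Magnitude-weighted mean frequency in Hz (signal must not be silent)."""
    magnitude = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / rate)
    return np.sum(freqs * magnitude) / np.sum(magnitude)

rate = 8000
t = np.arange(rate) / rate
single = np.sin(2 * np.pi * 1000 * t)
mixed = np.sin(2 * np.pi * 1000 * t) + np.sin(2 * np.pi * 3000 * t)

centroid_single = spectral_centroid(single, rate)
centroid_mixed = spectral_centroid(mixed, rate)
print(round(centroid_single), round(centroid_mixed))
```

Adding an equally strong 3000 Hz component pulls the centroid from about 1000 Hz to about 2000 Hz, midway between the two tones.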

1.1.4 Chroma frequencies

Chroma features are an interesting and powerful representation of music audio in which the entire
spectrum is projected onto 12 bins representing the 12 distinct semitones (or chroma) of the musical
octave.
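
One way to picture the projection onto 12 bins is to map every FFT bin to its nearest semitone and add up the magnitudes per pitch class. The sketch below assumes equal temperament with A4 = 440 Hz and MIDI note numbering, neither of which is stated in this report, and the function names are hypothetical; librosa's chroma_stft uses a more refined filter bank.

```python
import numpy as np

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def pitch_class(freq_hz, a4_hz=440.0):
    """Index 0..11 of the nearest semitone (assumes A4 = 440 Hz tuning)."""
    midi = 69 + 12 * np.log2(freq_hz / a4_hz)
    return int(np.round(midi)) % 12

def chroma_vector(signal, rate):
    """Sum FFT magnitudes into 12 pitch-class bins."""
    magnitude = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / rate)
    chroma = np.zeros(12)
    for f, m in zip(freqs[1:], magnitude[1:]):   # skip the 0 Hz bin
        chroma[pitch_class(f)] += m
    return chroma

rate = 8000
t = np.arange(rate) / rate
chroma = chroma_vector(np.sin(2 * np.pi * 440 * t), rate)
strongest = PITCH_CLASSES[int(np.argmax(chroma))]
print(strongest)
```

Because octaves fold onto the same bin, 440 Hz and 880 Hz both contribute to the same pitch class.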

2 Libraries Used
2.1 Numpy
NumPy is the fundamental package for scientific computing with Python. It is used to convert the
bytes data into integer arrays and to generate the frequency-domain representation of the audio signal.

2.2 Wave
The wave module provides a convenient interface to the WAV sound format. It does not support
compression/decompression, but it does support mono/stereo.
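
A short round trip shows the calls involved. This sketch writes to an in-memory buffer instead of a file so that it runs anywhere; the tone, the sampling rate and the use of io.BytesIO are assumptions for the example, whereas the project writes output.wav to disk.

```python
import io
import wave
import numpy as np

rate = 8000
t = np.arange(rate) / rate
# One second of a 440 Hz tone at half of full scale, as 16-bit samples.
samples = (0.5 * 32767 * np.sin(2 * np.pi * 440 * t)).astype("<i2")

buffer = io.BytesIO()
with wave.open(buffer, "wb") as wf:
    wf.setnchannels(1)        # mono
    wf.setsampwidth(2)        # 2 bytes = 16 bits per sample
    wf.setframerate(rate)
    wf.writeframes(samples.tobytes())

buffer.seek(0)
with wave.open(buffer, "rb") as wf:
    info = (wf.getnchannels(), wf.getsampwidth(),
            wf.getframerate(), wf.getnframes())
    restored = np.frombuffer(wf.readframes(wf.getnframes()), dtype="<i2")

print(info)
```

The header stores the channel count, sample width and frame rate, which is why playback code can query them from the file instead of hard-coding them.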

2.3 PyAudio
PyAudio provides Python bindings for PortAudio, the cross-platform audio I/O library. With PyAudio,
you can easily use Python to play and record audio on a variety of platforms.

2.4 PyQt5
PyQt5 is a comprehensive set of Python bindings for Qt v5. It is implemented as more than 35
extension modules and enables Python to be used as an alternative application development language
to C++ on all supported platforms including iOS and Android.

2.5 PyQtGraph
PyQtGraph is a pure-Python graphics and GUI library built on PyQt5/PySide and NumPy. It is
intended for use in mathematics, scientific and engineering applications. Despite being written entirely
in Python, the library is very fast due to its heavy leverage of NumPy for number crunching and Qt's
GraphicsView framework for fast display.

3 Source Code
3.1 audio_feature_extraction.py
import matplotlib.pyplot as plt
import numpy as np
import librosa
import librosa.display
import sklearn.preprocessing


def load_audio():
    # Load the latest recording on demand, so that importing this module
    # does not fail when output.wav has not been recorded yet.
    return librosa.load('output.wav')


def audiowaveform():
    x, sr = load_audio()
    plt.style.use('ggplot')
    plt.figure(figsize=(14, 5))
    # Newer librosa releases replace waveplot with waveshow.
    librosa.display.waveplot(x, sr=sr)


def zcr():
    x, sr = load_audio()
    zcrs = librosa.feature.zero_crossing_rate(y=x)
    plt.figure(figsize=(14, 5))
    plt.title('Zero crossing rate')
    plt.plot(zcrs[0])


def spectrogram():
    x, sr = load_audio()
    X = librosa.stft(x)
    Xdb = librosa.amplitude_to_db(abs(X))
    plt.figure(figsize=(14, 5))
    plt.title('Spectrogram')
    librosa.display.specshow(Xdb, sr=sr, x_axis='time', y_axis='log')
    plt.colorbar()


def spectralcentroid():
    x, sr = load_audio()
    spectral_centroids = librosa.feature.spectral_centroid(y=x, sr=sr)[0]
    frames = range(len(spectral_centroids))
    t = librosa.frames_to_time(frames, sr=sr)

    def normalize(values, axis=0):
        # Scale the centroid curve to 0..1 so it can overlay the waveform.
        return sklearn.preprocessing.minmax_scale(values, axis=axis)

    plt.figure(figsize=(14, 5))
    plt.title('Spectral Centroid')
    librosa.display.waveplot(x, sr=sr, alpha=0.4)
    plt.plot(t, normalize(spectral_centroids), color='r')


def chromafrequencies():
    x, sr = load_audio()
    hop_length = 512
    chromagram = librosa.feature.chroma_stft(y=x, sr=sr, hop_length=hop_length)
    plt.figure(figsize=(14, 5))
    plt.title('Chroma frequencies')
    # Any further arguments to this call were cut off in the source listing.
    librosa.display.specshow(chromagram, x_axis='time', y_axis='chroma', hop_length=hop_length)

3.2 record_play.py
import pyaudio
import wave
import time
import sys

RECORD_SECONDS = 5
WAVE_OUTPUT_FILENAME = "output.wav"


def setTimer(seconds):
    global RECORD_SECONDS
    RECORD_SECONDS = seconds


def recordAudio():
    CHUNK = 1024
    FORMAT = pyaudio.paInt16
    CHANNELS = 1
    RATE = 44100
    global RECORD_SECONDS, WAVE_OUTPUT_FILENAME
    p = pyaudio.PyAudio()

    stream = p.open(format=FORMAT,
                    channels=CHANNELS,
                    rate=RATE,
                    input=True,
                    frames_per_buffer=CHUNK)
    print("* recording")
    frames = []
    for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
        data = stream.read(CHUNK)
        frames.append(data)
    print("* done recording")
    stream.stop_stream()
    stream.close()
    p.terminate()
    wf = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
    wf.setnchannels(CHANNELS)
    wf.setsampwidth(p.get_sample_size(FORMAT))
    wf.setframerate(RATE)
    wf.writeframes(b''.join(frames))
    wf.close()


def playAudio():
    CHUNK = 1024
    global WAVE_OUTPUT_FILENAME
    wf = wave.open(WAVE_OUTPUT_FILENAME, 'rb')
    p = pyaudio.PyAudio()

    def callback(in_data, frame_count, time_info, status):
        data = wf.readframes(frame_count)
        return (data, pyaudio.paContinue)

    stream = p.open(format=p.get_format_from_width(wf.getsampwidth()),
                    channels=wf.getnchannels(),
                    rate=wf.getframerate(),
                    output=True,
                    stream_callback=callback)
    stream.start_stream()
    while stream.is_active():
        time.sleep(0.1)

    stream.stop_stream()
    stream.close()
    wf.close()
    p.terminate()

3.3 pyqtgraph_ui.py
import pyaudio
import numpy as np
from PyQt5 import QtCore, QtGui, QtWidgets
from pyqtgraph import GraphicsLayoutWidget


class Ui_MainWindow1(object):
    def setupUi(self, MainWindow):
        MainWindow.setObjectName("MainWindow")
        MainWindow.resize(1261, 741)
        self.centralwidget = QtWidgets.QWidget(MainWindow)
        self.centralwidget.setObjectName("centralwidget")
        self.horizontalLayout = QtWidgets.QHBoxLayout(self.centralwidget)
        self.horizontalLayout.setObjectName("horizontalLayout")
        self.graphicsView = GraphicsLayoutWidget(self.centralwidget)
        self.graphicsView.setObjectName("graphicsView")
        self.horizontalLayout.addWidget(self.graphicsView)
        MainWindow.setCentralWidget(self.centralwidget)
        self.actionExit = QtWidgets.QAction(MainWindow)
        self.actionExit.setObjectName("actionExit")
        self.retranslateUi(MainWindow)
        QtCore.QMetaObject.connectSlotsByName(MainWindow)
        self.audiocapture()

    # Function to capture the audio data
    def audiocapture(self):
        self.FORMAT = pyaudio.paInt16
        self.CHANNELS = 1
        # 44100 Hz as in record_play.py (the printed value 14100 is not a standard rate)
        self.RATE = 44100
        self.CHUNK = 1024
        self.MAX_PLOT_SIZE = self.CHUNK * 50

        # set up audio recording
        self.audio = pyaudio.PyAudio()
        self.stream = self.audio.open(format=self.FORMAT, channels=self.CHANNELS,
                                      rate=self.RATE, input=True,
                                      frames_per_buffer=self.CHUNK)

        self.win = self.graphicsView

        # create a plot for the time domain data
        self.data_plot = self.win.addPlot(title="Audio Signal Vs Time")
        self.data_plot.setXRange(0, self.MAX_PLOT_SIZE)
        self.data_plot.showGrid(True, True)
        self.data_plot.addLegend()
        self.time_curve = self.data_plot.plot(pen=(24, 215, 248), name="Time Domain Audio")

        # create a plot for the frequency domain data
        self.win.nextRow()
        self.fft_plot = self.win.addPlot(title="Power Vs Frequency Domain")
        self.fft_plot.addLegend()
        self.fft_curve = self.fft_plot.plot(pen='y', name="Power Spectrum")
        self.fft_plot.showGrid(True, True)
        self.total_data = []

        # call update() whenever the event loop is idle
        self.timer = QtCore.QTimer()
        self.timer.timeout.connect(self.update)
        self.timer.start(0)

    # update with a new chunk of audio data
    def update(self):
        # read data
        raw_data = self.stream.read(self.CHUNK)

        # convert raw bytes into integers
        data_sample = np.frombuffer(raw_data, dtype=np.int16)
        self.total_data = np.concatenate([self.total_data, data_sample])

        # remove old data
        if len(self.total_data) > self.MAX_PLOT_SIZE:
            self.total_data = self.total_data[self.CHUNK:]
        self.time_curve.setData(self.total_data)

        # calculate the FFT of the windowed chunk
        fft_data = data_sample * np.hanning(len(data_sample))
        power_spectrum = 20 * np.log10(np.abs(np.fft.rfft(fft_data)) / len(fft_data))
        self.fft_curve.setData(power_spectrum)
        self.fft_plot.enableAutoRange('xy', False)

    def retranslateUi(self, MainWindow):
        _translate = QtCore.QCoreApplication.translate
        MainWindow.setWindowTitle(_translate("MainWindow", "MainWindow"))
        self.actionExit.setText(_translate("MainWindow", "Exit"))

3.4 application.py
import sys
from PyQt5 import QtCore, QtGui, QtWidgets
from PyQt5.QtWidgets import *
from main_window import Ui_MainWindow
from record_play import recordAudio, setTimer, playAudio
from audio_feature_extraction import *
from pyqtgraph_ui import Ui_MainWindow1


class MainWindow(QMainWindow, Ui_MainWindow):
    def __init__(self, sl_value=0):
        super().__init__()
        self.setupUi(self)
        self.setWindowIcon(QtGui.QIcon('icon1.png'))
        self.exitButton.clicked.connect(qApp.quit)
        self.slider.valueChanged['int'].connect(self.lcd.display)
        self.recordButton.clicked.connect(recordAudio)
        self.playButton.clicked.connect(playAudio)
        self.spectrumButton.clicked.connect(self.second_win)
        self.spectrogramButton.clicked.connect(spectrogram)
        self.zcrButton.clicked.connect(zcr)
        self.spectralButton.clicked.connect(spectralcentroid)
        self.chromaButton.clicked.connect(chromafrequencies)
        self.slider.setValue(sl_value)
        self.slider.valueChanged[int].connect(self.valuechange)
        print(sl_value)

    def second_win(self):
        # open the live power spectrum in a second window
        self.MainWindow = QtWidgets.QMainWindow()
        self.ui = Ui_MainWindow1()
        self.ui.setupUi(self.MainWindow)
        self.MainWindow.show()

    def valuechange(self, value):
        # Only update the recording length; re-running __init__ here would
        # rebuild the UI and connect every signal a second time.
        setTimer(value)

    def spectrum(self):
        print('Spectrum button clicked')


if __name__ == '__main__':
    app = QApplication(sys.argv)
    window = MainWindow()
    window.show()
    app.exec_()

3.5 main_window.py
from PyQt5 import QtCore, QtGui, QtWidgets


class Ui_MainWindow(object):
    def setupUi(self, MainWindow):
        MainWindow.setObjectName("MainWindow")
        MainWindow.resize(320, 660)
        # This line is cut off in the source; Fixed is the only QSizePolicy policy starting with "Fi".
        sizePolicy = QtWidgets.QSizePolicy(QtWidgets.QSizePolicy.Fixed, QtWidgets.QSizePolicy.Fixed)
        sizePolicy.setHorizontalStretch(0)
        sizePolicy.setVerticalStretch(0)
        sizePolicy.setHeightForWidth(MainWindow.sizePolicy().hasHeightForWidth())
        MainWindow.setSizePolicy(sizePolicy)
        MainWindow.setMaximumSize(QtCore.QSize(320, 660))
        MainWindow.setMinimumSize(QtCore.QSize(320, 660))
        MainWindow.setStyleSheet("QMainWindow{\n"
                                 " background-color: #ffdd61\n"
                                 "\n"
                                 "}")
        self.centralwidget = QtWidgets.QWidget(MainWindow)
        self.centralwidget.setObjectName("centralwidget")
        self.lcd = QtWidgets.QLCDNumber(self.centralwidget)
        self.lcd.setGeometry(QtCore.QRect(10, 10, 301, 91))
        self.lcd.setAutoFillBackground(False)
        self.lcd.setStyleSheet("QLCDNumber{\n"
                               " background-color :#e3e3e3\n"
                               "}")
        self.lcd.setObjectName("lcd")
        self.spectrogramButton = QtWidgets.QPushButton(self.centralwidget)
        self.spectrogramButton.setGeometry(QtCore.QRect(10, 340, 301, 51))
        font = QtGui.QFont()
        font.setFamily("Montserrat")
        font.setPointSize(12)
        self.spectrogramButton.setFont(font)
        self.spectrogramButton.setStyleSheet("QPushButton{\n"
                                             " background-color: #000000;\n"
                                             " border-width: 20px;\n"
                                             " border-radius : 25px;\n"
                                             " color: #ffffff;\n"
                                             "}\n"
                                             "QPushButton:pressed{\n"
                                             " background-color: #ff5457;\n"
                                             " color: #000000;\n"
                                             " font-size: 20px;\n"
                                             "}")
        self.spectrogramButton.setObjectName("spectrogramButton")
        self.recordButton = QtWidgets.QPushButton(self.centralwidget)
        self.recordButton.setGeometry(QtCore.QRect(10, 160, 301, 51))
        font = QtGui.QFont()
        font.setFamily("Montserrat")
        font.setPointSize(12)
        self.recordButton.setFont(font)
        self.recordButton.setStyleSheet("QPushButton{\n"
                                        " background-color: #000000;\n"
                                        " border-width: 20px;\n"
                                        " border-radius : 25px;\n"
                                        " color: #ffffff;\n"
                                        "}\n"
                                        "QPushButton:pressed{\n"
                                        " background-color: qlineargradient(x1: 0, y1: 0, x2: 0, y2: 1,\n"
                                        " stop: 0 #dadbde, stop: 1#ff5457);\n"
                                        " color: #000000;\n"
                                        " font-size: 20px;\n"
                                        "}")
        self.recordButton.setObjectName("recordButton")
        self.spectrumButton = QtWidgets.QPushButton(self.centralwidget)
        self.spectrumButton.setGeometry(QtCore.QRect(10, 280, 301, 51))
        font = QtGui.QFont()
        font.setFamily("Montserrat")
        font.setPointSize(12)
        self.spectrumButton.setFont(font)
        self.spectrumButton.setStyleSheet("QPushButton{\n"
                                          " background-color: #000000;\n"
                                          " border-width: 20px;\n"
                                          " border-radius : 25px;\n"
                                          " color: #ffffff;\n"
                                          "}\n"
                                          "QPushButton:pressed{\n"
                                          " background-color: #ff5457;\n"
                                          " color: #000000;\n"
                                          " font-size: 20px;\n"
                                          "}")
        self.spectrumButton.setObjectName("spectrumButton")
        self.spectralButton = QtWidgets.QPushButton(self.centralwidget)
        self.spectralButton.setGeometry(QtCore.QRect(10, 460, 301, 51))
        font = QtGui.QFont()
        font.setFamily("Montserrat")
        font.setPointSize(12)
        self.spectralButton.setFont(font)
        self.spectralButton.setStyleSheet("QPushButton{\n"
                                          " background-color: #000000;\n"
                                          " border-width: 20px;\n"
                                          " border-radius : 25px;\n"
                                          " color: #ffffff;\n"
                                          "}\n"
                                          "QPushButton:pressed{\n"
                                          " background-color: #ff5457;\n"
                                          " color: #000000;\n"
                                          " font-size: 20px;\n"
                                          "}")
        self.spectralButton.setObjectName("spectralButton")
        self.zcrButton = QtWidgets.QPushButton(self.centralwidget)
        self.zcrButton.setGeometry(QtCore.QRect(10, 400, 301, 51))
        font = QtGui.QFont()
        font.setFamily("Montserrat")
        font.setPointSize(12)
        self.zcrButton.setFont(font)
        self.zcrButton.setStyleSheet("QPushButton{\n"
                                     " background-color: #000000;\n"
                                     " border-width: 20px;\n"
                                     " border-radius : 25px;\n"
                                     " color: #ffffff;\n"
                                     "}\n"
                                     "QPushButton:pressed{\n"
                                     " background-color: #ff5457;\n"
                                     " color: #000000;\n"
                                     " font-size: 20px;\n"
                                     "}")
        self.zcrButton.setObjectName("zcrButton")
        self.slider = QtWidgets.QSlider(self.centralwidget)
        self.slider.setGeometry(QtCore.QRect(10, 110, 301, 41))
        self.slider.setAutoFillBackground(False)
        self.slider.setStyleSheet("QSlider{\n"
                                  " background-color: #ffdd61\n"
                                  "}")
        self.slider.setMaximum(10)
        self.slider.setOrientation(QtCore.Qt.Horizontal)
        self.slider.setObjectName("slider")
        self.exitButton = QtWidgets.QPushButton(self.centralwidget)
        self.exitButton.setGeometry(QtCore.QRect(10, 600, 301, 51))
        font = QtGui.QFont()
        font.setFamily("Montserrat")
        font.setPointSize(12)
        self.exitButton.setFont(font)
        self.exitButton.setStyleSheet("QPushButton{\n"
                                      " background-color: #ff5457;\n"
                                      " border-width: 20px;\n"
                                      " border-radius : 25px;\n"
                                      "}\n"
                                      "QPushButton:pressed{\n"
                                      " color: #f6f7fa;\n"
                                      " background-color: #000000;\n"
                                      " font-size: 22px;\n"
                                      "}")
        self.exitButton.setObjectName("exitButton")
        self.chromaButton = QtWidgets.QPushButton(self.centralwidget)
        self.chromaButton.setGeometry(QtCore.QRect(10, 520, 301, 51))
        font = QtGui.QFont()
        font.setFamily("Montserrat")
        font.setPointSize(12)
        self.chromaButton.setFont(font)
        self.chromaButton.setStyleSheet("QPushButton{\n"
                                        " background-color: #000000;\n"
                                        " border-width: 20px;\n"
                                        " border-radius : 25px;\n"
                                        " color: #ffffff;\n"
                                        "}\n"
                                        "QPushButton:pressed{\n"
                                        " background-color: #ff5457;\n"
                                        " color: #000000;\n"
                                        " font-size: 20px;\n"
                                        "}")
        self.chromaButton.setObjectName("chromaButton")
        self.playButton = QtWidgets.QPushButton(self.centralwidget)
        self.playButton.setGeometry(QtCore.QRect(10, 220, 301, 51))
        font = QtGui.QFont()
        font.setFamily("Montserrat")
        font.setPointSize(12)
        self.playButton.setFont(font)
        self.playButton.setStyleSheet("QPushButton{\n"
                                      " background-color: #000000;\n"
                                      " border-width: 20px;\n"
                                      " border-radius : 25px;\n"
                                      " color: #ffffff;\n"
                                      "}\n"
                                      "QPushButton:pressed{\n"
                                      " background-color: qlineargradient(x1: 0, y1: 0, x2: 0, y2: 1,\n"
                                      " stop: 0 #dadbde, stop: 1#ff5457);\n"
                                      " color: #000000;\n"
                                      " font-size: 20px;\n"
                                      "}")
        self.playButton.setObjectName("playButton")
        MainWindow.setCentralWidget(self.centralwidget)
        self.retranslateUi(MainWindow)
        self.slider.valueChanged['int'].connect(self.lcd.display)
        QtCore.QMetaObject.connectSlotsByName(MainWindow)

    def retranslateUi(self, MainWindow):
        _translate = QtCore.QCoreApplication.translate
        MainWindow.setWindowTitle(_translate("MainWindow", "Audio Spectrum Analyzer"))
        self.spectrogramButton.setText(_translate("MainWindow", "SPECTROGRAM"))
        self.recordButton.setText(_translate("MainWindow", "RECORD"))
        self.spectrumButton.setText(_translate("MainWindow", "LIVE POWER SPECTRUM"))
        self.spectralButton.setText(_translate("MainWindow", "SPECTRAL CENTROID"))
        self.zcrButton.setText(_translate("MainWindow", "ZERO CROSSING RATIO"))
        self.exitButton.setText(_translate("MainWindow", "EXIT"))
        self.chromaButton.setText(_translate("MainWindow", "CHROMA FREQUENCIES"))
        self.playButton.setText(_translate("MainWindow", "PLAY"))

4 Screenshots

Figure 1: Live Power Spectrum

Figure 2: Spectrogram

Figure 3: Zero crossing ratio

Figure 4: Spectral centroid

Figure 5: Chroma frequencies

5 Conclusion
The project implemented basic shape drawing, scaling, translation, window-viewport coordinate
transformations and frame refresh to generate a real-time animation of audio signal visualization.
The project successfully demonstrates the spectrum analysis of audio signals.

6 References
1. https://numpy.org/
2. https://pypi.org/project/PyQt5/
3. https://matplotlib.org/
4. https://www.scipy.org/
