The Difference Between scipy.io.wavfile.read() and librosa.load() in Python – Python Tutorial
To read an audio file in Python, we can use scipy.io.wavfile.read() or librosa.load(). In this tutorial, we will introduce the difference between them.
scipy.io.wavfile.read()
scipy.io.wavfile.read(filename, mmap=False)
This function will open a wav file and return the sample rate and data of this wav file.
librosa.load()
librosa.load(path, sr=22050, mono=True, offset=0.0, duration=None, dtype=np.float32, res_type='kaiser_best')
This function opens an audio file, resamples it to the sample rate sr (unless sr is None, in which case the original sample rate of the file is kept), and returns the audio data and the sample rate.
We will compare them using some examples.
scipy.io.wavfile.read() Vs librosa.load()
scipy.io.wavfile.read() cannot read a wav file at a custom sample rate, but librosa.load() can.
    from scipy.io import wavfile
    import librosa
    import numpy as np

    np.set_printoptions(threshold=np.inf)
    audio_file = './waihu/6eb2612c-fc23-4ead-b2dd-05009817f7e7.wav'

    # scipy: always reads at the sample rate stored in the file
    fs, wavdata = wavfile.read(audio_file)
    print(fs)
    print(type(wavdata))

    # librosa: sr is not passed, so the default sample rate is used
    audio, fs = librosa.load(audio_file)
    print(fs)

    # librosa: read at a custom sample rate
    audio, fs = librosa.load(audio_file, sr=4000)
    print(fs)
Running this code shows that:
- scipy.io.wavfile.read() can only read a wav file at its original sample rate.
- If sr is not passed, librosa.load() resamples the audio to its default sample rate of 22050. If sr = None, it keeps the original sample rate of the file.
- If we set sr, librosa.load() resamples the audio file to this sample rate.
- If you have many wav files with different sample rates, librosa.load() is a good choice, because it can bring them all to one common sample rate.
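To make the idea of reading at a custom sample rate more concrete, here is a minimal sketch that resamples a signal with SciPy alone. It is an approximation for illustration only: scipy.signal.resample_poly is swapped in for the resampler that librosa uses internally, the helper name resample_to is hypothetical, and a synthetic tone stands in for a real wav file.

```python
from math import gcd

import numpy as np
from scipy.signal import resample_poly


def resample_to(data, orig_sr, target_sr):
    """Resample a 1-D signal from orig_sr to target_sr (hypothetical helper)."""
    g = gcd(orig_sr, target_sr)
    # Polyphase resampling by the rational factor target_sr / orig_sr.
    return resample_poly(data, target_sr // g, orig_sr // g)


orig_sr = 8000                       # assumed original sample rate
t = np.arange(orig_sr) / orig_sr     # one second of sample times
tone = np.sin(2 * np.pi * 440 * t)   # synthetic 440 Hz tone

resampled = resample_to(tone, orig_sr, 4000)
print(len(tone), len(resampled))
```

One second of audio keeps its duration, so halving the sample rate halves the number of samples.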
    from scipy.io import wavfile
    import librosa
    import numpy as np

    np.set_printoptions(threshold=np.inf)
    audio_file = './waihu/6eb2612c-fc23-4ead-b2dd-05009817f7e7.wav'

    # scipy: print 100 raw samples
    fs, wavdata = wavfile.read(audio_file)
    print(wavdata[5000:5100])

    # librosa: print the same 100 samples
    audio, fs = librosa.load(audio_file, sr=8000)
    print(audio[5000:5100])
Run this code, you will see:
[-4261 -1797 585 1701 2108 1668 928 191 294 1228 2165 2229 1134 -127 -664 -77 1101 2242 2704 2309 1328 442 371 914 1594 1855 1493 855 660 732 632 -1586 -4957 -7701 -7927 -4847 -367 2493 1150 -2137 -4518 -3791 -1486 492 1239 1453 1512 1122 563 344 1263 2205 2379 1207 -45 -426 277 1300 1835 1960 1740 1441 994 810 902 1335 1583 1363 733 598 988 1133 -457 -4040 -7262 -8377 -5986 -1513 2121 1995 -1100 -4103 -4409 -2127 287 1418 1419 1223 950 645 325 882 2011 2640 1896 261 -648 -225 1215 2075]
[-0.1300354 -0.05484009 0.01785278 0.0519104 0.06433105 0.05090332 0.02832031 0.00582886 0.00897217 0.03747559 0.06607056 0.06802368 0.03460693 -0.00387573 -0.02026367 -0.00234985 0.03359985 0.06842041 0.08251953 0.07046509 0.04052734 0.01348877 0.01132202 0.02789307 0.04864502 0.05661011 0.04556274 0.02609253 0.0201416 0.02233887 0.01928711 -0.04840088 -0.15127563 -0.23501587 -0.24191284 -0.1479187 -0.01119995 0.07608032 0.03509521 -0.06521606 -0.13787842 -0.11569214 -0.04534912 0.01501465 0.03781128 0.04434204 0.04614258 0.03424072 0.0171814 0.01049805 0.0385437 0.06729126 0.07260132 0.03683472 -0.00137329 -0.01300049 0.00845337 0.03967285 0.05599976 0.05981445 0.05310059 0.04397583 0.03033447 0.02471924 0.02752686 0.04074097 0.04830933 0.04159546 0.02236938 0.01824951 0.03015137 0.03457642 -0.01394653 -0.12329102 -0.22161865 -0.25564575 -0.18267822 -0.0461731 0.06472778 0.06088257 -0.03356934 -0.12521362 -0.134552 -0.06491089 0.00875854 0.04327393 0.04330444 0.037323 0.0289917 0.01968384 0.00991821 0.0269165 0.06137085 0.08056641 0.05786133 0.00796509 -0.01977539 -0.00686646 0.03707886 0.06332397]
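For this file, scipy.io.wavfile.read() returns the raw integer sample values, whereas librosa.load() returns floating-point values scaled to the range -1 to +1. The two outputs are consistent with 16-bit PCM data: each librosa.load() value equals the corresponding integer divided by 32768, for example -4261 / 32768 = -0.1300354.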
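The relation between the two outputs can be checked with a few lines of NumPy. This is a small sketch that assumes the file holds 16-bit PCM samples, which is what the printed values suggest. It takes the first three integers from the scipy.io.wavfile.read() output above and scales them by 2**15.

```python
import numpy as np

# First three samples printed by scipy.io.wavfile.read() above.
pcm = np.array([-4261, -1797, 585], dtype=np.int16)

# Dividing by 2**15 = 32768 maps the int16 range onto [-1.0, 1.0).
scaled = pcm.astype(np.float32) / 32768.0
print(scaled)
```

The scaled values agree with the first three numbers printed by librosa.load(), so if you read a 16-bit file with scipy.io.wavfile.read() and need floating-point data, this division is usually enough.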
scipy.io.wavfile.write
Parameter data: a 1-D or 2-D NumPy array of either integer or float data-type.
- Writes a simple uncompressed WAV file.
- To write multiple channels, use a 2-D array of shape (Nsamples, Nchannels).
- The bits-per-sample and PCM/float will be determined by the data-type.
Note that 8-bit PCM is unsigned.
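The following sketch illustrates these notes by writing two small files and reading them back. It assumes temporary files are acceptable, and the file names and sample values are made up for the example.

```python
import os
import tempfile

import numpy as np
from scipy.io import wavfile

rate = 8000
n = 100

# Stereo float32 data of shape (Nsamples, Nchannels).
stereo = np.zeros((n, 2), dtype=np.float32)
stereo[:, 0] = 1.5     # deliberately outside [-1, +1]
stereo[:, 1] = -0.25

# 8-bit PCM is unsigned, so the midpoint (silence) is 128 rather than 0.
mono_u8 = np.full(n, 128, dtype=np.uint8)

with tempfile.TemporaryDirectory() as tmp:
    float_path = os.path.join(tmp, "stereo_float32.wav")
    u8_path = os.path.join(tmp, "mono_uint8.wav")
    wavfile.write(float_path, rate, stereo)
    wavfile.write(u8_path, rate, mono_u8)
    rate_f, data_f = wavfile.read(float_path)
    rate_u, data_u = wavfile.read(u8_path)

print(data_f.dtype, data_f.shape)
print(data_u.dtype, data_u.shape)
```

The data-type of the array passed to write() decides the format of the file, so convert the array explicitly (for example with astype(np.int16)) before writing if you need a particular bit depth.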
IBM Corporation and Microsoft Corporation, “Multimedia Programming Interface and Data Specifications 1.0”, section “Data Format of the Samples”, August 1991 http://www.tactilemedia.com/info/MCI_Control_Info.html
Create a 100Hz sine wave, sampled at 44100Hz. Write to 16-bit PCM, Mono.
    >>> from scipy.io.wavfile import write
    >>> import numpy as np
    >>> samplerate = 44100; fs = 100
    >>> t = np.linspace(0., 1., samplerate)
    >>> amplitude = np.iinfo(np.int16).max
    >>> data = amplitude * np.sin(2. * np.pi * fs * t)
    >>> write("example.wav", samplerate, data.astype(np.int16))
Input and output (scipy.io)
SciPy has many modules, classes, and functions available to read data from and write data to a variety of file formats.
MATLAB® files
loadmat(file_name[, mdict, appendmat]): Load MATLAB file.
savemat(file_name, mdict[, appendmat, ...]): Save a dictionary of names and arrays into a MATLAB-style .mat file.
whosmat: List variables inside a MATLAB file.
For low-level MATLAB reading and writing utilities, see scipy.io.matlab.
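A short round trip may help show how these three functions fit together. This is a sketch only: the variable name "a" and the file name are arbitrary choices for the example.

```python
import os
import tempfile

import numpy as np
from scipy.io import loadmat, savemat, whosmat

a = np.arange(6, dtype=np.float64).reshape(2, 3)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.mat")
    savemat(path, {"a": a})   # dictionary keys become MATLAB variable names
    listing = whosmat(path)   # list variables without loading the data
    loaded = loadmat(path)    # dictionary of variable name -> array

print(listing)
print(loaded["a"])
```

Besides the saved variables, the dictionary returned by loadmat also contains a few metadata keys whose names start with a double underscore.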
IDL® files
readsav(file_name[, idict, python_dict, ...]): Read an IDL .sav file.
Matrix Market files
mminfo: Return size and storage parameters from Matrix Market file-like 'source'.
mmread: Reads the contents of a Matrix Market file-like 'source' into a matrix.
mmwrite: Writes the sparse or dense array a to Matrix Market file-like target.
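As an illustration, a sparse matrix can be written to a Matrix Market file, inspected, and read back. The matrix values and the file name below are invented for the example.

```python
import os
import tempfile

import numpy as np
from scipy.io import mminfo, mmread, mmwrite
from scipy.sparse import coo_matrix

dense = np.array([[1.0, 0.0, 0.0],
                  [0.0, 0.0, 2.5],
                  [0.0, 3.0, 0.0]])
sparse = coo_matrix(dense)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.mtx")
    mmwrite(path, sparse)     # write in coordinate (sparse) format
    info = mminfo(path)       # rows, cols, entries, format, field, symmetry
    restored = mmread(path)   # sparse result for a coordinate file

print(info)
print(restored.toarray())
```

Calling mminfo first is a cheap way to check the size of a file before loading it in full.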
Unformatted Fortran files
FortranFile(filename[, mode, header_dtype]): A file object for unformatted sequential files from Fortran code.
FortranEOFError: Indicates that the file ended properly.
FortranFormattingError: Indicates that the file ended mid-record.
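The sketch below writes two records from Python, reads them back, and then reads once more to show FortranEOFError marking a clean end of file. It assumes the default record header type and uses an arbitrary file name. Real Fortran output may need a different header_dtype, depending on the compiler.

```python
import os
import tempfile

import numpy as np
from scipy.io import FortranEOFError, FortranFile

ints = np.array([1, 2, 3], dtype=np.int32)
floats = np.linspace(0.0, 1.0, 5)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.unf")

    # Each write_record call produces one Fortran record.
    f = FortranFile(path, "w")
    f.write_record(ints)
    f.write_record(floats)
    f.close()

    # Records must be read back in order, with matching data-types.
    f = FortranFile(path, "r")
    ints_back = f.read_ints(np.int32)
    floats_back = f.read_reals(np.float64)
    try:
        f.read_ints(np.int32)   # no third record exists
        hit_eof = False
    except FortranEOFError:
        hit_eof = True
    f.close()

print(ints_back, floats_back, hit_eof)
```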
Netcdf
netcdf_file(filename[, mode, mmap, version, ...]): A file object for NetCDF data.
netcdf_variable: A data object for netcdf files.
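A minimal round trip with netcdf_file might look like the sketch below. The dimension name, variable name, and attribute values are example choices. The file is opened with mmap=False and the data is copied, so the array stays valid after the file is closed.

```python
import os
import tempfile

import numpy as np
from scipy.io import netcdf_file

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "simple.nc")

    # Create a file with one dimension and one integer variable.
    f = netcdf_file(path, "w")
    f.history = "Created for a test"
    f.createDimension("time", 10)
    time = f.createVariable("time", "i", ("time",))
    time[:] = np.arange(10)
    time.units = "days since 2008-01-01"
    f.close()

    # Read it back; each entry of f.variables is a netcdf_variable.
    f = netcdf_file(path, "r", mmap=False)
    var = f.variables["time"]
    units = var.units
    values = var[:].copy()
    f.close()

print(units)
print(values)
```

Attribute values such as units are returned as bytes, so decode them if you need a str.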
Harwell-Boeing files
hb_write(path_or_open_file, m[, hb_info]): Write HB-format file.
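For illustration, a small sparse matrix can be written with hb_write and read back with its counterpart hb_read, which also lives in scipy.io. The identity matrix and the file name are arbitrary example choices.

```python
import os
import tempfile

import numpy as np
from scipy.io import hb_read, hb_write
from scipy.sparse import csc_matrix

# Harwell-Boeing stores sparse matrices in compressed column form.
m = csc_matrix(np.eye(3))

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.hb")
    hb_write(path, m)
    restored = hb_read(path)

print(restored.toarray())
```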
Wav sound files (scipy.io.wavfile)
write: Write a NumPy array as a WAV file.
Arff files (scipy.io.arff)
MetaData: Small container to keep useful information on an ARFF dataset.
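A MetaData object is normally obtained from scipy.io.arff.loadarff, which returns it together with the data. The sketch below parses a tiny in-memory ARFF document. The relation and attribute names are invented for the example.

```python
from io import StringIO

from scipy.io import arff

content = """
@relation foo
@attribute width numeric
@attribute height numeric
@attribute color {red,green,blue,yellow,black}
@data
5.0,3.25,blue
4.5,3.75,green
3.0,4.00,red
"""

# loadarff returns a record array and a MetaData object.
data, meta = arff.loadarff(StringIO(content))

print(meta.name)      # relation name
print(meta.names())   # attribute names
print(meta.types())   # attribute types
print(data["width"])
```

Nominal attributes such as color come back as bytes in the record array.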
scipy.io.wavfile.read
Return the sample rate (in samples/sec) and data from an LPCM WAV file.
Parameters:
filename: string or open file handle. Input WAV file.
mmap: bool, optional. Whether to read data as memory-mapped (default: False). Not compatible with some bit depths; see Notes. Only to be used on real files.
Returns:
rate: int. Sample rate of WAV file.
data: numpy array. Data read from WAV file. Data-type is determined from the file; see Notes. Data is 1-D for 1-channel WAV, or 2-D of shape (Nsamples, Nchannels) otherwise. If a file-like input without a C-like file descriptor (e.g., io.BytesIO) is passed, this will not be writeable.
Notes:
WAV files can specify arbitrary bit depth, and this function supports reading any integer PCM depth from 1 to 64 bits. Data is returned in the smallest compatible numpy int type, in left-justified format. 8-bit and lower is unsigned, while 9-bit and higher is signed.
For example, 24-bit data will be stored as int32, with the MSB of the 24-bit data stored at the MSB of the int32, and typically the least significant byte is 0x00. (However, if a file actually contains data past its specified bit depth, those bits will be read and output, too. [2])
This bit justification and sign matches WAV’s native internal format, which allows memory mapping of WAV files that use 1, 2, 4, or 8 bytes per sample (so 24-bit files cannot be memory-mapped, but 32-bit can).
IEEE float PCM in 32- or 64-bit format is supported, with or without mmap. Values exceeding [-1, +1] are not clipped.
Non-linear PCM (mu-law, A-law) is not supported.
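The left-justified layout described in these notes can be observed directly. The sketch below uses the standard-library wave module to create a 24-bit file, since scipy.io.wavfile.write chooses the format from the array data-type, and then reads it with scipy.io.wavfile.read. It assumes a SciPy version that reads 24-bit PCM. The sample values and file name are chosen for the example.

```python
import os
import tempfile
import wave

import numpy as np
from scipy.io import wavfile

# Four 24-bit samples: 1, -1, and the two extremes of the 24-bit range.
samples = [1, -1, 2**23 - 1, -2**23]
frames = b"".join(s.to_bytes(3, "little", signed=True) for s in samples)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "pcm24.wav")
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(3)      # 3 bytes per sample = 24-bit PCM
        w.setframerate(8000)
        w.writeframes(frames)
    rate, data = wavfile.read(path)

print(data.dtype)
print(data.tolist())
```

Each 24-bit value is returned shifted left by 8 bits inside an int32, so dividing by 256 recovers the original integers and dividing by 2**31 gives values in the range -1 to +1.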
[1] IBM Corporation and Microsoft Corporation, “Multimedia Programming Interface and Data Specifications 1.0”, section “Data Format of the Samples”, August 1991 http://www.tactilemedia.com/info/MCI_Control_Info.html
[2] Adobe Systems Incorporated, “Adobe Audition 3 User Guide”, section “Audio file formats: 24-bit Packed Int (type 1, 20-bit)”, 2007
    >>> from os.path import dirname, join as pjoin
    >>> from scipy.io import wavfile
    >>> import scipy.io
Get the filename for an example .wav file from the tests/data directory.
    >>> data_dir = pjoin(dirname(scipy.io.__file__), 'tests', 'data')
    >>> wav_fname = pjoin(data_dir, 'test-44100Hz-2ch-32bit-float-be.wav')
Load the .wav file contents.
    >>> samplerate, data = wavfile.read(wav_fname)
    >>> print(f"number of channels = {data.shape[1]}")
    number of channels = 2
    >>> length = data.shape[0] / samplerate
    >>> print(f"length = {length}s")
    length = 0.01s
    >>> import matplotlib.pyplot as plt
    >>> import numpy as np
    >>> time = np.linspace(0., length, data.shape[0])
    >>> plt.plot(time, data[:, 0], label="Left channel")
    >>> plt.plot(time, data[:, 1], label="Right channel")
    >>> plt.legend()
    >>> plt.xlabel("Time [s]")
    >>> plt.ylabel("Amplitude")
    >>> plt.show()