MicSpeakerCheck, This program is designed to compare the transition of a digital sound with a sound that has been through a speaker, and recorded via a microphone. Multiple sounds will be tested, and different hardwares (speakers and microphones) will also be tested.

The program works as follows; A selected sound is played through the speaker, and recorded simultaneously. The two raw traces of the two sounds are used to calculate characteristic sound features, which are then correlated to give a correlation coefficient.

These coefficients can then be plotted to form a population distribution of trials, and are compared to the population distributions of 'similar' picked motifs from the song bird.

By analysising the overlap of the distributions we should get a realistic idea if the playback from the speaker will be faithful enough at reproducing birds own song that it can be used in our BMI.

Written by Michael Barnett 5th Aug 2015



In [377]:

    
## THIS IS A GREAT STARTING CELL TO RUN TO ENSURE THAT YOU ACTUALLY HAVE THE HARDWARE APPROPRIATELY CONNECTED.
import pysoundcard
print "----------------------------------------------------------------"
print "DEFAULT API: " + str(pysoundcard.default_api())
print "----------------------------------------------------------------"
print "DEFAULT OUTPUT DEVICE: " + str(pysoundcard.default_output_device())
print "----------------------------------------------------------------"
print "DEFAULT INPUT DEVICE: " + str(pysoundcard.default_input_device())
print "----------------------------------------------------------------"

import sys
sys.version









    



----------------------------------------------------------------
DEFAULT API: {'struct_version': 1, 'name': u'MME', 'api_idx': 0, 'type': 2L, 'default_output_device_index': 3, 'device_count': 4, 'default_input_device_index': 1}
----------------------------------------------------------------
DEFAULT OUTPUT DEVICE: {'struct_version': 2, 'input_channels': 0, 'default_high_output_latency': 0.18, 'default_low_output_latency': 0.09, 'default_sample_rate': 44100.0, 'output_latency': 0.09, 'name': u'Speakers (Realtek High Definiti', 'interleaved_data': True, 'default_high_input_latency': 0.18, 'device_index': 3, 'sample_format': <type 'numpy.float32'>, 'default_low_input_latency': 0.09, 'host_api_index': 0, 'input_latency': 0.09, 'output_channels': 2}
----------------------------------------------------------------
DEFAULT INPUT DEVICE: {'struct_version': 2, 'input_channels': 2, 'default_high_output_latency': 0.18, 'default_low_output_latency': 0.09, 'default_sample_rate': 44100.0, 'output_latency': 0.09, 'name': u'Microphone (Realtek High Defini', 'interleaved_data': True, 'default_high_input_latency': 0.18, 'device_index': 1, 'sample_format': <type 'numpy.float32'>, 'default_low_input_latency': 0.09, 'host_api_index': 0, 'input_latency': 0.09, 'output_channels': 0}
----------------------------------------------------------------






    Out[377]:





'2.7.8 |Anaconda 2.1.0 (64-bit)| (default, Jul  2 2014, 15:12:11) [MSC v.1500 64 bit (AMD64)]'



In [397]:

    
## Major dependancies
import numpy as np
from pysoundcard import Stream
import math
import time

## Used for testing at the moment.
import matplotlib.pyplot as plt
%matplotlib inline

## Quickly grab a sound sample used for testing
import scipy.io.wavfile as wav #Can call with wav.read(directory) , reads out the data from our wav file, can put [1] to return data only

#Just grabs a .wav song file I have on hand. NOTE: From Priscilla's Data, Was recorded with a gain of 30.
fs , wav_data_raw = wav.read("c:\\Users\\mbar372\\Documents\\testing\\pretest.wav")


# Angus wrote this in a previous example, It Normalises the sound, When taken away the sound breaches the ability of the speaker,
# And we get clippin issues

wav_data = np.array(wav_data_raw, dtype=np.float32) # Convert array to ensure it is 32int bit type,
wav_data /= 2**15 # normalize -max_int16..max_int16 to -1..1

# Plot the data we are about to play
plt.plot(wav_data)
print np.shape(wav_data)
plt.show()

# Play the data, and record it too, return the recorded data
new_data = pass_through_hardware(wav_data) 

# Plot the recorded data, to see what it looks like.
print np.shape(new_data)
plt.plot(new_data)
plt.show()









    



(288000L,)






    












    



0.0
0.998458049887
1.99691609977
2.99537414966
3.99383219955
4.99229024943
5.99074829932
(309795L,)

We now have two datasets; the original dataset (which we could of either left raw, or normalized and then played at a set amplitude, which may prove useful for figuring out this gain issue), and the recorded dataset, which represents the same souund, recorded via the default recording device.

Notice the recording amplitude is significantly lower, and always between 0-1.0, which makes me suspect that the return numpy array values from the PySoundCard library are doing a normalisation for us.

When we normalise Priscilla's wav file, we notice it is 10 times the amplitude of our microphone recordings.

If we play the recorded data back with this multiplier, we get a sound that sounds a similar amplitude, however the recorded sound is still a tiny bit lower in amplitude than the played sound, such a tiny loss is not unreasonable as a loss in amplitude due to transfer to a microphone.

Note that the recorded sound provided with Priscilla's data plots to 0 - 30,000 before normalisation. It is likely no coincidence that the recordings were done in SAP 2011, and the gain levels used are 30.

Gain = 20 * log(base 10) ( output / input ) - This is the equation for Voltage gain, assuming the same resistances on either side of the amplifier. Gain/Volume are ambiguous words, however within audio, voltage gain is predominated, Power gain would be more for radio-type applications.

It is interesting to note , that our microphone records at ~1/10th the power of Priscilla's wav file.
If we re-arrange the gain equation to calculate the multiplier used on the input to get the output. 10 ^ ( Gain / 20 ) = 10

10 is the multiplier we see from the gain equation, AND the difference we see in recordings. It is highly likely that this is the reason for the overly large amplitude difference in our recordings.

Will need to think of another proof regarding this, ultimately we have to consider the gain of our microphone being a necessary element in the hardware transfer function.



In [403]:

    
# PLAY THE NEW SOUND, FRESHLY RECORDED VIA MICROPHONE
quick_play(new_data*10)









    



0.0
0.999909297052
1.9998185941
2.99972789116
3.99963718821
4.99954648526
5.99945578231
6.99936507937



In [402]:

    
# PLAY THE OLD SOUND, AS IT WAS PLAYED THROUGH THE SPEAKER AT TIME OF RECORDING
quick_play(wav_data)









    



0.0
0.999909297052
1.9998185941
2.99972789116
3.99963718821
4.99954648526
5.99945578231



In [369]:

    
## QUICK SPECTROGRAM COMPARE
print "WAV FILE DATA:"
plt.specgram(wav_data, Fs = 44100)
plt.show()
print "RECORDED WAV FILE DATA:"
plt.specgram(new_data, Fs = 44100)
plt.show()









    



WAV FILE DATA:






    












    



RECORDED WAV FILE DATA:



In [388]:

    
def pass_through_hardware(sound, fs = 44100):
    """
    This definition is designed to take the sound (formated as a single dimensional numpy array) as 
    a first argument, and play it through the default audio device,
    while simultaneously recording from the default recording device. 

    The second argument is the sample rate of the sound, defaults to 44100 Hz
    
    This returns the raw trace of the output/input recording.
    """
    block_length = 1024 # The size of the sound blocks to be passed to the audio streamer
    output_wav = np.array(1) #Create an empty number array, this will be appended to with the microphone inputs

    # Create the stream instance (An object representing the sound card), and start it.
    s = Stream(sample_rate=fs, block_length=block_length)
    s.start()

    #Iterate over the length of the sound to be played with the block size,
    for i in range(len(sound)/block_length):
        # Note the read() and write() functions do not return until the full acquisition or playback is complete
        # This may have led to an issue of recording happening in moments of silence where the playback is complete.
        # This is the likely reason as to why the recordings are slightly offset to the playbacks. Interestingly,
        # The recordings are AHEAD of the playback.. caputuring extra information before playback had started,
        # Additionally , the recording is slightly shorter in total, indicating the possibility of cutting off
        # and not capturing the final moments of the playback. It is likely I will need to incorpate extra lengths of
        # Recording to ensure that the entire playback is captured everytime, with overlap.
        
        # PLAYBACK the next block of sound
        block_to_write = sound[i*block_length:i*block_length+block_length]
        s.write(block_to_write)
        
        # RECORD the next block of sound. Append to new Numpy array.
        new_block = s.read(block_length)[:,0]
        output_wav = np.append(output_wav, new_block)

        
        # Prints the time every ~ second , changing block length will cause issues with this due to dtypes.
        # Only included at this time to show progress.
        if (i%(fs/block_length) == 0):
            print (i*1.00/fs)*block_length 
    
    # Record the next 0.5s of sound (This is handy for addressing the possible undershoot recording problem discussed above)
    new_block = s.read(22050)[:,0]
    output_wav = np.append(output_wav, new_block)
    
    #Stop the stream
    s.stop()
    
    # Return the microphones experiences during sound playback.
    return output_wav



In [393]:

    
def quick_play(sound, fs = 44100):
    """
    Helper routine during testing to quickly play a numpy array of data to listen to it.
    """
    block_length = 32 # The size of the sound blocks to be passed to the audio streamer
    s = Stream(sample_rate=fs, block_length=block_length)
    s.start()

    #Iterate over the length of the sound to be played with the block size,
    for i in range(len(sound)/block_length):
        # Play the next block of sound
        s.write(sound[i*block_length:i*block_length+block_length])
        # Record the next block of sound.
        # Prints the time every ~ second , changing block length will cause issues with this due to dtypes.
        # Only included at this time to show progress.
        if (i%(fs/block_length) == 0):
            print (i*1.00/fs)*block_length 
    
    #Stop the stream
    s.stop()



In [403]:



In [403]:



In [403]:



In [403]:



In [403]:



In [403]:



In [403]:



In [403]:



In [403]:



In [ ]: