How big is the smallest pitch difference between 2 consecutive tones that a human listener can detect?
1. The smallest pitch difference between 2 consecutive tones that a musician listener can detect (3.18 MELs) is smaller than the smallest pitch difference perceived by non-musicians (11.41 MELs).
2. There is no evidence that the way the tones are presented affects the minimum detectable pitch difference between 2 consecutive tones.
Initial questions:
[TIME] Can the time left between the tones affect perception?
Motivation: the hair cells need a recovery time after being exposed to sound. Common terms for this issue: TTS (Temporary Threshold Shift) or auditory fatigue. TTS is maximal at the exposure frequency of the sound.
Motivation: if a long time gap is left between the two tones, the first tone may not be remembered.
[HIGH] Does the smallest pitch difference between 2 consecutive tones change depending on whether we analyze low or high frequencies?
Observation: the perceptual MEL scale. If the MEL scale is used instead of a Hz scale for generating the sounds, then the experiments at low and at high frequencies should be comparable, because the perceptual difference due to this effect would be compensated.
For example:
Incrementing 3 Hz at 20 Hz would mean an increment of:
36.43 (23 Hz) - 31.74 (20 Hz) = 4.69 mel
But incrementing 3 Hz at 15 kHz would mean an increment of:
3505.59 (15003 Hz) - 3505.38 (15000 Hz) = 0.21 mel
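The example above can be reproduced with a short sketch. It assumes the common mel formula m = 1127.01048 · ln(1 + f/700), which is consistent with the values quoted but is not named in the text; the function names are ours.

```python
import math

def hz_to_mel(f_hz):
    """Hz to mel, assuming m = 1127.01048 * ln(1 + f / 700)."""
    return 1127.01048 * math.log(1.0 + f_hz / 700.0)

def mel_step(f_hz, delta_hz):
    """Mel increment obtained by raising f_hz by delta_hz."""
    return hz_to_mel(f_hz + delta_hz) - hz_to_mel(f_hz)

low = mel_step(20.0, 3.0)      # 3 Hz step at 20 Hz
high = mel_step(15000.0, 3.0)  # 3 Hz step at 15 kHz

print(f"3 Hz step at 20 Hz:  {low:.2f} mel")
print(f"3 Hz step at 15 kHz: {high:.2f} mel")
```

The same 3 Hz step is worth roughly twenty times more mel at 20 Hz than at 15 kHz, which is why the increments are computed in mel rather than in Hz.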
[DIRECTION] Will we get the same result if we start from equal tones and separate them as if we start from two separate tones and bring them together gradually?
Motivation: if we start from two tones that are perceived as the same, maybe we get used to perceiving them as one sound and the minimum detectable difference between the 2 tones becomes higher.
[TIMBRE] Which kind of tones? A pure sine, a piano, a guitar?
Motivation: a richer timbre could stimulate more hair cells than a pure tone. This could affect the perception of the following tone.
[INTENSITY] Could the volume be critical for the experiments?
Motivation: the equal loudness contours. As we are searching for the smallest pitch difference, this issue is not going to disturb us.
Observation: we cannot control the volume of the computer.
[DURATION] How could the reverberation time affect the two sounds? This should also depend on the separation time.
Motivation: if the reverberation time is longer than the silence between the sounds, we don't know how this overlap could affect our perception.
Initial decisions and reflections on the previous questions:
[TIME] The length of the silence separating the two tones will be determined empirically by the system designer.
[HIGH] The MEL frequency scale will be used to generate the sounds and to compute the increment. In addition, as our hearing is better at mid-range frequencies, the experiment should work around 1 kHz, for example.
[DIRECTION] As it is not known whether it is better to 1) start from two equal sounds and separate them until a difference is perceived, or 2) start from two perceptually different frequencies and bring them together until the difference is no longer perceived, half of the experiments will follow 1) and the other half will follow 2). The two kinds will be mixed.
[TIMBRE] We would like to use a MIDI piano synthesizer in order to model more realistic sounds. If the desired frequency resolution cannot be obtained with MIDI, the experiments will be done with sine tones.
[INTENSITY] We cannot control intensity. We will ask the user to adjust the volume properly.
[DURATION] We assume there is no reverberation.
[INTERFACE] A web page will host the perceptual tests. The user should enter some personal data: musical studies, age and kind of headphones.
Considering the previous hypotheses/reflections, the audio files used in the perceptual test are a concatenation of two pure tones around 1 kHz (the reference tone is therefore always 1 kHz) with the following structure: 2 seconds of a tone, 0.5 seconds of silence, 2 seconds of the modified tone and 1 second of silence. The reference tone can be either the first or the second. Note that this is the structure of one block. Several blocks are concatenated, each one increasing/decreasing the pitch difference by 2 MELs, which means that the test resolution is 2 MELs.
So, as described above, two kinds of files are generated. The ones that increase the pitch difference ("separate the pitch"), see an example: http://www.jordipons.me/percepTest/R1bseparant.wav
And the ones that decrease the pitch difference ("bring the pitch together"), see an example: http://www.jordipons.me/percepTest/R1bajuntar.wav
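The block structure described above could be synthesized along these lines. This is only a sketch, not the script used to produce the linked files: the sample rate, the amplitude, the mel formula (1127.01048 · ln(1 + f/700)) and all function names are assumptions, and the default of 11 blocks is taken from the description of the final experiments.

```python
import math
import struct
import wave

SAMPLE_RATE = 44100  # assumed; not stated in the text

def hz_to_mel(f_hz):
    return 1127.01048 * math.log(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    return 700.0 * (math.exp(m / 1127.01048) - 1.0)

def tone(freq_hz, seconds, amplitude=0.5):
    """Pure sine tone as a list of floats in [-1, 1] (no fade in/out)."""
    n = int(SAMPLE_RATE * seconds)
    return [amplitude * math.sin(2.0 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(n)]

def silence(seconds):
    return [0.0] * int(SAMPLE_RATE * seconds)

def block(ref_hz, diff_mel, reference_first=True):
    """One block: 2 s tone, 0.5 s silence, 2 s tone, 1 s silence."""
    changed_hz = mel_to_hz(hz_to_mel(ref_hz) + diff_mel)
    first, second = (ref_hz, changed_hz) if reference_first else (changed_hz, ref_hz)
    return tone(first, 2.0) + silence(0.5) + tone(second, 2.0) + silence(1.0)

def block_differences(n_blocks=11, step_mel=2.0, separating=True, sign=1):
    """Mel difference of each block: 0, 2, 4, ... (reversed when bringing together)."""
    diffs = [sign * step_mel * k for k in range(n_blocks)]
    return diffs if separating else diffs[::-1]

def progression(ref_hz=1000.0, reference_first=True, **kwargs):
    """Concatenate all blocks of one progression into a single sample list."""
    samples = []
    for diff in block_differences(**kwargs):
        samples += block(ref_hz, diff, reference_first)
    return samples

def write_wav(path, samples):
    """Write mono 16-bit PCM."""
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(SAMPLE_RATE)
        w.writeframes(b"".join(struct.pack("<h", int(s * 32767)) for s in samples))
```

For instance, `write_wav("separating.wav", progression())` would write a "separating" file and `write_wav("joining.wav", progression(separating=False))` a "bring together" one; both file names are hypothetical.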
For the final experiments 8 sounds were used, covering all the cases: the progressions can "increase" or "decrease" the difference, the reference 1 kHz tone can be the first or the second tone, and the tone that differs can have a positive or a negative difference (with respect to the reference tone).
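The 8 sounds correspond to the 2 × 2 × 2 combinations of these three binary choices. A minimal sketch enumerating them (the labels are ours, not the file names used in the test):

```python
from itertools import product

directions = ("increase", "decrease")    # progression of the difference
reference_positions = ("first", "second")  # where the 1 kHz reference sits
signs = ("positive", "negative")         # changed tone above or below the reference

conditions = [
    {"direction": d, "reference": r, "sign": s}
    for d, r, s in product(directions, reference_positions, signs)
]

for number, condition in enumerate(conditions, start=1):
    print(number, condition)
```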
So, 4 progressions of 11 blocks (in the first block both tones are the 1 kHz reference) increase the difference, one for each combination of reference position (first or second tone) and sign of the difference (positive or negative),
and 4 progressions of 11 blocks (in the last block both tones are the 1 kHz reference) decrease the difference under the same four combinations.
From the previous description, we can identify some weaknesses of this perceptual test:
Finally, some observations:
Regarding the previously presented data, the variables that are going to be studied are:
The best way to understand the procedure is to try the perceptual test, which is available online: http://www.jordipons.me/percepTest/
There, you should follow the attached instructions:
This perceptual test is meant to be BRIEF, which is why this presentation is BRIEF. For any question, just ask: idrojsnop@gmail.com.
The test sounds are composed of blocks of two concatenated tones (a constant reference tone and a changing tone). The changing tone increases or decreases with respect to the constant one. You will listen to both concatenated and you will notice that easily.
Steps to follow:
Important recommendation: when you perceive what Q.I suggests, press pause on the audio player in order to note down that second in the form.
Finally, do NOT use speakers (which is why that option is disabled in the questionnaire). Use headphones or earphones.
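The second noted down in the form can be mapped back to a pitch difference, since every block lasts 2 + 0.5 + 2 + 1 = 5.5 seconds and consecutive blocks differ by 2 MELs. The sketch below assumes that each file starts at second 0 with its first block and contains 11 blocks; the function names are hypothetical and this is not necessarily how the answers were actually processed.

```python
BLOCK_SECONDS = 2.0 + 0.5 + 2.0 + 1.0  # one block: tone, silence, tone, silence

def block_index(second, n_blocks=11):
    """Zero-based index of the block playing at the given second (clamped)."""
    index = int(second // BLOCK_SECONDS)
    return max(0, min(index, n_blocks - 1))

def difference_at(second, separating=True, step_mel=2.0, n_blocks=11):
    """Pitch difference (in mel) of the block playing at the given second."""
    k = block_index(second, n_blocks)
    if separating:
        return step_mel * k               # 0, 2, 4, ... mel
    return step_mel * (n_blocks - 1 - k)  # 20, 18, ... 0 mel

print(difference_at(12.0))                    # paused during the third block
print(difference_at(12.0, separating=False))
```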
One of the problems of using a website as the interface is that the results are unsupervised. I have to trust that the subjects paid attention during the experiment and that they used headphones. The "up/down" toggles were included to discourage users from entering random results. In the end I did not use this information, because some users complained that it was not evident enough.
To check the reliability of the experiments I finally relied on common sense and discarded one result that looked suspicious.
12 people were tested: 8 were musicians and 4 were non-musicians.
Let's divide the results into those two categories to see the first result: the smallest pitch difference between 2 consecutive tones that a musician listener can detect is smaller than the smallest pitch difference perceived by non-musicians. Note that the following tables give the difference (in MELs) between the first two consecutive tones that were perceived as equal/different:
The previous chart shows, at the top, the results for each of the musician subjects per experiment. The lower left part shows the results for the non-musicians, and the lower right part shows the mean and standard deviation over all the experiments for each subgroup. The experiments in blue decrease the difference between the two tones and the ones in yellow increase it.
Then, for better understanding, let's express the obtained smallest pitch difference between two consecutive tones in different measures:
measure | MUSICIANS | NON MUSICIANS |
MEL | 3.18 | 11.41 |
Hz | 1.97 | 7.12 |
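How a mel difference translates into Hz depends on where on the scale it is evaluated. The sketch below, assuming the common formula m = 1127.01048 · ln(1 + f/700) (not named in the text; function names are ours), compares two readings: the difference applied on top of the 1 kHz reference, and the mel figure converted as if it were an absolute value starting from 0 mel. The second reading gives about 1.98 Hz and 7.12 Hz, in line with the Hz row above, while the first gives about 4.8 Hz and 17.3 Hz; the text does not say which reading is intended.

```python
import math

def hz_to_mel(f_hz):
    return 1127.01048 * math.log(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    return 700.0 * (math.exp(m / 1127.01048) - 1.0)

def mel_difference_to_hz(diff_mel, ref_hz=1000.0):
    """Hz offset obtained by adding diff_mel to the reference tone."""
    return mel_to_hz(hz_to_mel(ref_hz) + diff_mel) - ref_hz

for group, diff_mel in (("musicians", 3.18), ("non-musicians", 11.41)):
    at_reference = mel_difference_to_hz(diff_mel)  # evaluated around 1 kHz
    from_zero = mel_to_hz(diff_mel)                # evaluated from 0 mel
    print(f"{group}: {at_reference:.2f} Hz at 1 kHz, {from_zero:.2f} Hz from 0 mel")
```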
So, observing the global mean for each subgroup, we can conclude that musicians have a much better pitch resolution. A plausible explanation is that musicians have trained this ability because they need it, whereas non-musicians do not need it in everyday life.
The second result is that there is no evidence that the way the tones are presented affects the minimum detectable pitch difference between 2 consecutive tones.
To study whether the "ascending/descending" evolution of the sounds has an effect, the mean and standard deviation over all the results of the experiments under each condition are the following:
MUSICIANS | mean | std |
Descending | 2.75 | 0.6455 |
Ascending | 3.625 | 0.59512 |

NON MUSICIANS | mean | std |
Descending | 13 | 1.77951 |
Ascending | 9.8125 | 4.68764 |
Here we cannot see evident differences between the descending and ascending ways of presenting the sounds.
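Summaries like these can be computed with the standard library. The sketch below uses made-up thresholds (they are NOT the data of this test) and the sample standard deviation (n - 1 in the denominator); which estimator was used for the tables is not stated.

```python
from statistics import mean, stdev

# Hypothetical thresholds in mel, one value per experiment (NOT the study's data).
thresholds = {
    "descending": [2.0, 4.0, 2.0, 4.0],
    "ascending": [4.0, 2.0, 4.0, 6.0],
}

# Mean and sample standard deviation per presentation condition.
summary = {name: (mean(values), stdev(values)) for name, values in thresholds.items()}

for name, (m, s) in summary.items():
    print(f"{name}: mean = {m:.3f} mel, std = {s:.3f} mel")

# Gap between the two means, to be read against the spreads printed above.
gap = abs(summary["descending"][0] - summary["ascending"][0])
print(f"difference between means: {gap:.3f} mel")
```

When the gap between the two means is of the same order as the standard deviations, as in this invented example, the tables alone do not support a difference between conditions.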
Now, to study the effect of changing the position of the reference tone (first or second), we look at the mean and standard deviation over all the results of the experiments under each condition:
MUSICIANS | mean | std |
Reference 1st tone | 3.3125 | 0.8165 |
Reference 2nd tone | 3.0625 | 1.25831 |

NON MUSICIANS | mean | std |
Reference 1st tone | 9.6875 | 4.72306 |
Reference 2nd tone | 13.125 | 1.73205 |
Again, we cannot observe really meaningful differences.
And now, if we pay attention to the height of the modified tone with respect to the reference tone (higher or lower), we observe:
MUSICIANS | mean | std |
Higher | 3.25 | 0.57735 |
Lower | 3.125 | 0.95743 |

NON MUSICIANS | mean | std |
Higher | 10.625 | 2.78014 |
Lower | 12.1875 | 4.54606 |
As in all the previous tables, we cannot say that the studied variables (the ones related to the way of presenting the sounds) are important for human perception.
Finally, we cannot conclude anything with respect to age, because most of the subjects are in the 20-30 age range.
Results
Pablo
Miquel
Jordi
Sanjeel
Conclusions
Similarities
Differences
The number of frequencies tested and the methodologies were different for each one. Some were adaptive and others fixed. Hence the user interface differed for each.
Some tested with both musicians and non-musicians.
Some used the Mel scale and a relative frequency threshold to present the test and analyse the data.
Different parameters were studied. Specific details could be delineated.