AudioMasters
 
Topic: Frequency analysis strangeness
« on: December 15, 2006, 02:38:37 AM »
AndyH Offline
Member
*****
Posts: 1461



There have been some previous discussions about CoolEdit's Frequency Analysis anomalies: what the help file says should happen doesn't seem to be quite what is happening, and just what is happening isn't exactly clear. I found something else that seems to me to be completely wrong, but there is always the possibility that I just don't understand. Basically, I observed a situation where Frequency Analysis displays a signal where there is nothing but zero value samples.

I ran some experiments on dither.
Generate 10 seconds at 32 bit/44.1kHz, of the program's A 440 preset at -10dB.
Apply the Amplify/Fade Out preset, from 0dB to -240dB,  changed from linear to logarithmic, over the ten second duration.
Resample to 16 bit under various conditions.

Pohlmann writes "With dither, the resolution of a digitization system is far below the least significant bit; theoretically, there is no limit to the low-level resolution." I wanted to see how this came out with  computer generated tones, dealing only with numbers, where the vagaries of the analogue world would not come into play, and to find out what I could actually hear with my equipment.
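Pohlmann's claim can be tried outside the editor with a few lines of Python. This is only a rough sketch under stated assumptions: one LSB is treated as one integer step, rounding is to nearest, and the dither is a plain triangular (TPDF) dither two LSBs wide. None of this is taken from Cool Edit itself, and the names `quantize`, `tpdf` and `tone_estimate` are made up for the illustration.

```python
import math
import random

def quantize(x):
    """Round to the nearest integer step (1 step = 1 LSB)."""
    return math.floor(x + 0.5)

def tpdf():
    """Triangular dither spanning -1..+1 LSB (sum of two uniform values)."""
    return random.uniform(-0.5, 0.5) + random.uniform(-0.5, 0.5)

random.seed(1)                    # fixed seed so the run is repeatable
fs, freq, n = 44100, 440.0, 44100
peak_lsb = 0.25                   # tone peak of a quarter LSB, below the last bit

tone = [peak_lsb * math.sin(2 * math.pi * freq * i / fs) for i in range(n)]
plain = [quantize(s) for s in tone]               # no dither
dithered = [quantize(s + tpdf()) for s in tone]   # dither added before rounding

def tone_estimate(samples):
    """Project onto the 440 Hz sine to estimate the surviving peak, in LSB."""
    return 2 / n * sum(y * math.sin(2 * math.pi * freq * i / fs)
                       for i, y in enumerate(samples))

undithered_peak = tone_estimate(plain)
dithered_peak = tone_estimate(dithered)
print(undithered_peak, dithered_peak)
```

Without dither every sample rounds to zero, so the estimate is zero. With dither the estimate should land near the original quarter LSB, buried in noise, which is the sense in which the resolution extends below the least significant bit.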

One of the unexpected results was that I can observe and measure the tone to much lower signal levels without dithering than with dithering, and I can also audibly distinguish it at lower levels than when I dither (more than 3 seconds further out on the fade). Of course this result is not the "pure" tone of the 32 bit space. At -75 dB and below, the sound is colored by the harmonics produced by the highly correlated quantization errors.

At 32 bit, the faded A 440 tone still extends all the way to 10 seconds, even though the logarithmic slope puts it below audibility (on my equipment) after 4 seconds. Resampled to 16 bits without dithering, the tone completely disappears before 7 seconds, leaving only zero valued samples for the last few seconds. What this post is about is that Frequency Analysis shows the tone, and its harmonics, beyond this 7 second cutoff point.
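Why the undithered tail is exactly zero can be sketched as follows. The assumptions are mine, not Cool Edit's documented behaviour: 16-bit full scale is taken as 32768 steps, and conversion rounds to the nearest step. Under those assumptions any tone whose peak is under half a step becomes all zeros. The helper name `all_zero_after_rounding` is hypothetical.

```python
import math

FULL_SCALE = 32768  # assumed number of steps for 0 dB at 16 bit

def all_zero_after_rounding(peak_db, fs=44100, freq=440.0, n=44100):
    """True if a sine with the given peak level rounds to zero everywhere."""
    peak = FULL_SCALE * 10 ** (peak_db / 20)
    return all(
        math.floor(peak * math.sin(2 * math.pi * freq * i / fs) + 0.5) == 0
        for i in range(n)
    )

# Level at which the peak equals half of one 16-bit step.
half_lsb_db = 20 * math.log10(0.5 / FULL_SCALE)

print(round(half_lsb_db, 2))
print(all_zero_after_rounding(-100.0))  # peak is about a third of a step
print(all_zero_after_rounding(-90.0))   # peak is just over one step
```

The half-step level works out to roughly -96.3dB. Exactly when a given fade crosses it depends on the shape of the fade curve, which this sketch does not try to model.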

The last non-zero value sample is just before 6.962 seconds, yet Frequency Analysis shows the 440 peak, plus many harmonics, out to 7.7 seconds. This is so whether I just place the cursor anywhere between those two times and bring up the Frequency Analysis display, or I select a half-second or so of zero valued samples and Scan them in Frequency Analysis.

Beyond 7.7 seconds, Frequency Analysis shows a flat line at -269.8dB. This is the same as a file of generated Silence. I can also observe that the sample values are zero by zooming in on the vertical scale and seeing that none ever vary from the zero line, and I can further verify this by double clicking on individual samples to get the Edit Sample Directly dialogue.

Either this is another error in the functioning of Frequency Analysis or there is some reason, of which I am unaware, that Frequency Analysis should work this way. If there is some reason this result is valid, I would like to know it so that I can better understand my observations.

I know there have been some changes in the Frequency Analysis function in later versions of the program. I would be curious to know what they show in this circumstance.
Reply #1
« on: December 15, 2006, 04:02:37 PM »
MrHope Offline
Member
*****
Posts: 53



I didn't quite understand all of your post, but it sounds similar to something that I just noticed yesterday.  I was just working with 16 bit 44.1kHz files.  The spectrum analyzer shows broadband audio data at points where the waveform display shows values approaching zero.  For example, if I make a short fade in at the beginning of a file, the spectrum view shows broadband audio data even though the original audio data was not broadband.

The spectrum view looks as if there is a broadband peak starting at silence and sloping down to the normal spectrum data.  In waveform view, it's just a normal fade in. 

I suspect that the error has to do with FFT data window sizes and their shapes.  Perhaps FFT data window ripple causes the errors. 
I say window ripple, because FFT data windows are like passband filters.  Blackman, Triangular, Hamming, Hann etc are all FFT data window types, each with certain shapes, advantages and disadvantages.  I think the disadvantages are what we are noticing.

No FFT data window is perfect because audio is continuous in nature and not normally grouped into chunks as in the FFT data windows.  Also, the edges of FFT data windows are faded out which has both positive and negative effects on the data going through them. 

As for your experiment, it seems like perhaps the FFT data window is about .7 seconds wide.  This is just an educated guess.  I bet if you experiment with different FFT settings you will get different lengths of FFT overhang.  FFT data window sizes and shapes affect their precision and bandwidth.  There are practical limits to these.

I have the feeling that the errors encountered are errors that any FFT spectrum analyzer might generate. 

Here is a PDF file that explains FFT spectrum analyzers better than I can.  A hardware FFT spectrum analyzer is discussed so it's different from software, but some of the principles are the same. 
 
http://xray.rutgers.edu/ugrad/326/SR760m_chap2.pdf

At the very least, it demonstrates just how complex they can be. 
Reply #2
« on: December 15, 2006, 07:37:18 PM »
SteveG Offline
Administrator
Member
*****
Posts: 8250



One factor that you really need to be aware of as far as FFTs are concerned is what they are doing as far as the energy spectrum is concerned. Consider the basic definitions of different types of noise, which are as follows:

  • Pink Noise (aka 1/F noise) - the power spectral density is proportional to the reciprocal of the frequency - hence the aka. This means that there is equal energy in all octaves. The level at each individual frequency falls at -3dB/octave as you go up. Because the number of Hz/octave doubles each time you go up one, there are more individual frequencies to disperse the energy in, so each displayed one has less in it.
  • White Noise has a flat power spectral density, so there is equal power in any band, at any centre frequency. Since there are more possible bands the higher we go, then the more energy there is in an octave's worth.

FFTs look at a large number of equal width bands across the spectrum, so an FFT of white noise gives a level display - which is what you would see in Audition or CE. If you look at the background of most signals, you will find that the noise is effectively pink - which is why the Audition FFT slopes down at -3dB/octave on most signals. This can easily fool you into thinking that signal and noise are present in bands when they aren't, really. Pink noise is rather more like what occurs naturally, though - which is why it's generally used as a reference in audio.


Reply #3
« on: December 16, 2006, 12:47:07 PM »
ryclark Offline
Member
*****
Posts: 270



I have been rather confused by this. I am used to looking at pink noise on a Real Time Analyser display which apparently shows pink noise with a flat response. So I was puzzled when doing some frequency response checks to see the display in Audition's Frequency Analyzer sloping down to the right.

So please Steve can you explain why we see the difference in the two types of display. You have sort of explained the CEP/Audition version of how it is done. So how does this differ from a multiband RTA?
Reply #4
« on: December 16, 2006, 02:05:54 PM »
SteveG Offline
Administrator
Member
*****
Posts: 8250



It isn't quite intuitive, I'll grant you. The foregoing was a partial copy of what's in the Acoustilyzer thread (page two) - there's more about how it works with a RTA in that.

If you break the spectrum up into, say, 1Hz bands, and then look at how many of these there are in the octave between 16 and 32Hz, you arrive (obviously!) at the answer 16. If you look at the same thing 4 octaves higher, ie between 256 and 512Hz, then the answer is 256 1Hz bands. So if you sum the energy available from a source of white noise, where each 1Hz band will have the same amount, and display it in octave bands, each band will appear to have more energy than the one below it. Now go back and look at the definition of Pink noise (which has an equal amount of total energy in each octave), and you'll see why these bands display as level in an RTA with an octave or sub-octave based display.

But, displaying white noise with a FFT display, which would (or could) represent each individual 1Hz band as a separate entity on the screen, would result in a level display, and Pink noise would slope downwards. An RTA though, would display the white noise as an increase in each octave band's level. It doesn't matter whether it's 1Hz in each band or 10Hz - each FFT band is still the same in terms of the amount of energy, but they are not the same as octave bands - and that's essentially the difference.
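The band-counting argument can be checked with a little arithmetic in Python. This is just an idealised sketch: the spectrum is cut into 1Hz bands, white noise is given the same power in every band and pink noise a power proportional to 1/f. The function names are invented for the illustration.

```python
import math

def band_power(psd, lo_hz, hi_hz):
    """Sum the power of the 1 Hz bands from lo_hz up to (not including) hi_hz."""
    return sum(psd(f) for f in range(lo_hz, hi_hz))

def white(f):
    return 1.0        # same power in every 1 Hz band

def pink(f):
    return 1.0 / f    # power proportional to 1/f

def db(ratio):
    return 10 * math.log10(ratio)

# What an octave-band RTA would add up for each kind of noise.
for lo, hi in [(16, 32), (32, 64), (64, 128), (128, 256), (256, 512)]:
    print(lo, hi,
          round(db(band_power(white, lo, hi)), 2),
          round(db(band_power(pink, lo, hi)), 2))

# Change from one octave band to the next (the RTA view).
white_step = db(band_power(white, 32, 64) / band_power(white, 16, 32))
pink_step = db(band_power(pink, 32, 64) / band_power(pink, 16, 32))
# Change in a single 1 Hz band one octave higher (the FFT view).
pink_bin_step = db(pink(512) / pink(256))
```

Summed per octave, white noise climbs by about 3dB per octave while pink stays very nearly level. Looked at band by band, as an FFT display does, white is level and pink drops by about 3dB per octave.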

Just sit down in a darkened room and think about it a bit!

Reply #5
« on: December 16, 2006, 04:31:51 PM »
ryclark Offline
Member
*****
Posts: 270



Thanks Steve, I will.
Reply #6
« on: December 20, 2006, 07:17:49 AM »
AndyH Offline
Member
*****
Posts: 1461



A shortened version of the first post:

I generate a signal at level -10dB in 32 bit, 10 seconds long.
I apply a logarithmic fade to -240dB over the 10 second time.
By 4.5  seconds from zero time, the -10dB signal level has dropped below -96dB.
I resample to 16 bit -- with no dither.
In the 16 bit result, starting right before 7 seconds, extending to 10 seconds, all sample values are 0.

I place the cursor out amongst the zero value samples, say at 7.5 seconds. Frequency Analysis is supposed to display what is at the cursor. It shows the signal.

I select around 22,000 zero value samples, say between 7.2 seconds and 7.7 seconds, open Frequency Analysis, and click on Scan. FA is supposed to show something about the selection. All selected samples have value zero. FA shows the signal.

I don't know what FA is trying to tell me, or if FA just doesn't work. I don't understand the relevance of pink noise energy distribution to this question. (But I believe I understand its relevance to what I generally see in a FA scan of music)
Reply #7
« on: December 20, 2006, 11:59:26 AM »
SteveG Offline
Administrator
Member
*****
Posts: 8250



Quote
I generate a signal at level -10dB in 32 bit, 10 seconds long.
I apply a logarithmic fade to -240dB over the 10 second time.

By 4.5  seconds from zero time, the -10dB signal level has dropped below -96dB.
I resample to 16 bit -- with no dither.
In the 16 bit result, starting right before 7 seconds, extending to 10 seconds, all sample values are 0.
I replicated this exactly in CE2000...

Quote
I place the cursor out amongst the zero value samples, say at 7.5 seconds. Frequency Analysis is supposed to display what is at the cursor. It shows the signal.
At 6.75 seconds, where the sample value falls to zero, the FA shows a straight line at a level determined by the reference value, if the sample window is small.

Quote
I select around 22,000 zero value samples, say between 7.2 seconds and 7.7 seconds, open Frequency Analysis, and click on Scan. FA is supposed to show something about the selection. All selected samples have value zero. FA shows the signal.
Not here it doesn't...

Quote
I don't know what FA is trying to tell me, or if FA just doesn't work. I don't understand the relevance of pink noise energy distribution to this question. (But I believe I understand its relevance to what I generally see in a FA scan of music)
The pink noise energy was merely an explanation of what you would expect to see with a signal present. It has no relevance to the complete absence of one. The only way that you will get signal levels displayed at a later stage in this experiment is if you have the window set so wide that spot checks include a part of the signal area (there isn't really any such thing as a spot check with a FFT). If you set the FFT size to 65536, you will appear to get signal out to just over 7.4 seconds, because  the sample window will then include the end of the signal.
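The effect of a frame that overlaps the end of the signal can be shown with a toy example in Python. It is a sketch under assumptions: a plain rectangular frame, a small frame of 256 samples, and a tone placed exactly on one DFT bin. It says nothing about which window shape or frame placement CE actually uses.

```python
import cmath
import math

N = 256            # frame length in samples (the "FFT size")
TONE_BIN = 8       # the tone sits exactly on bin 8 of an N-point frame
LAST = 1000        # tone occupies samples 0..LAST-1, then nothing but zeros

signal = [math.sin(2 * math.pi * TONE_BIN * i / N) if i < LAST else 0.0
          for i in range(2000)]

def bin_magnitude(start, k=TONE_BIN):
    """Magnitude of DFT bin k for the N-sample frame beginning at start."""
    frame = signal[start:start + N]
    return abs(sum(x * cmath.exp(-2j * math.pi * k * i / N)
                   for i, x in enumerate(frame)))

straddling = bin_magnitude(LAST - 64)   # 64 tone samples followed by 192 zeros
silent = bin_magnitude(LAST)            # zeros only

print(straddling, silent)
```

The frame that still catches the last 64 samples of tone reports energy at the tone's bin, even though three quarters of it is silence. Only the frame made up entirely of zeros reports nothing.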

Reply #8
« on: December 20, 2006, 06:50:47 PM »
AndyH Offline
Member
*****
Posts: 1461



Yes, I know that small FFT sizes don't show anything. In general, small FFT sizes show little except gross trends.

So, with larger FFT sizes, which the help file says are "more accurate," it is not possible to analyze 'small' selections? How does one know if the display is ever meaningful when one just places the cursor, without selecting a range to analyze? The help file says the FA display shows the frequency distribution at the cursor location, but apparently that is not quite true.

Larger FFT sizes display many frequency variations that are not in the graph at lower FFT sizes. With small FFT sizes one can sometimes see no evidence of a troublesome frequency component. One can hear the problem, but the display will show nothing useful. Larger FFT sizes reveal much more although, of course, the lesser detail of small FFT sizes is sometimes easier to use (when it does reveal something useful).

So FFT size 65536 "analyzes" more than 0.5 seconds of audio, regardless of what one selects? What would be very useful is some specific guidelines that will let one know when a particular FFT size is valid. e.g.:
How much audio does one have to select for any given FFT size in order for the display to be valid?
Is simply placing the cursor, without selecting a range, ever meaningful? What does it mean?
Does the validity of a selected range FA display depend on the sample rate? (i.e. is a large FFT useable with a smaller time range if the sample rate is higher (or lower)?)

This is where I find signal by placing the cursor (no selection; I did not try to discriminate finer than 0.1 second intervals).
7.7 seconds for FFT 65536
7.3 seconds for FFT 32768
7.1 seconds for FFT 16384
7.0 seconds for FFT 8192, and (sort-of) for FFT 4096 (last non-zero sample at ~ 6.962 seconds).
6.9 seconds for FFT 2048, which is among non-zero samples
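These figures can be compared with a simple guess about how the analysis frame is placed. The sketch below assumes, without any confirmation from the program, that the frame is centred on the cursor, so that signal is still seen up to half a frame past the last non-zero sample. The 6.962 second figure is the one reported in the first post, and `reach` is a made-up helper name.

```python
import math

FS = 44100            # sample rate of the test file
LAST_NONZERO = 6.962  # seconds, last non-zero sample as reported in the first post

def reach(fft_size):
    """Latest cursor time whose centred frame would still touch the signal."""
    return LAST_NONZERO + (fft_size / 2) / FS

# Cursor positions (to 0.1 s) where signal was still displayed.
observed = {65536: 7.7, 32768: 7.3, 16384: 7.1, 8192: 7.0, 4096: 7.0, 2048: 6.9}

for size, seen in observed.items():
    print(size, round(reach(size), 3), seen)
```

Rounded down to the 0.1 second steps used for the measurements, the predicted reach agrees with every entry in the list. That is at least consistent with a centred frame, though it does not prove that is how the program works.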



Reply #9
« on: December 20, 2006, 11:29:40 PM »
pwhodges Offline
Member
*****
Posts: 916


Quote
The help file says the FA display shows the frequency distribution at the cursor location, but apparently that is not quite true.
Think about it; a single sample doesn't have a frequency!  You have to look in a range for the concept to mean anything.

Paul
Reply #10
« on: December 21, 2006, 12:21:53 AM »
SteveG Offline
Administrator
Member
*****
Posts: 8250



Quote
So FFT size 65536 "analyzes" more than 0.5 seconds of audio, regardless of what one selects?
That depends on the sample rate...
Quote
What would be very useful is some specific guidelines that will let one know when a particular FFT size is valid. e.g.:
How much audio does one have to select for any given FFT size in order for the display to be valid?
How long is a piece of string?
Quote
Is simply placing the cursor, without selecting a range, ever meaningful? What does it mean?
What is the spectrum at this point, in terms of the relationships between the number of samples analysed?
Quote
Does the validity of a selected range FA display depend on the sample rate? (i.e. is a large FFT useable with a smaller time range if the sample rate is higher (or lower)?)
The only way to express this clearly is that in terms of time resolution, smaller sample sizes are more accurate, but in terms of frequency resolution, larger sample sizes are much better. How you optimise this between different sample rates is entirely up to you, but you can't simply assume that what you read at any given point is an accurate representation of what is actually happening there; this has always been the case with FFTs, and has been the cause of much merriment on my part - all these people trying to tell me that they have definitive answers to response questions based simply on an FFT reading just crack me up!
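The trade-off comes down to two divisions, sketched here in Python. The function names are illustrative, and window shape is ignored, although it also affects the practical resolution.

```python
def window_seconds(fft_size, fs):
    """Length of audio covered by one FFT frame, in seconds."""
    return fft_size / fs

def bin_width_hz(fft_size, fs):
    """Spacing between adjacent FFT bins, in Hz."""
    return fs / fft_size

for fs in (44100, 96000):
    for size in (128, 2048, 65536):
        print(fs, size,
              round(window_seconds(size, fs) * 1000, 2), "ms",
              round(bin_width_hz(size, fs), 3), "Hz")
```

At 44.1kHz a 128-point frame covers about 3ms with bins roughly 345Hz apart, while a 65536-point frame covers about 1.49 seconds with bins under 1Hz apart. Raising the sample rate shortens the time covered by a given FFT size and widens its bins.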
 

Reply #11
« on: December 21, 2006, 12:30:04 AM »
AndyH Offline
Member
*****
Posts: 1461



One sample does not, but that fact does not necessarily mean much in this case. Just what is going on is what I would like to know.

What exactly is meant, in this situation, by at the insertion point (yellow arrow cursor)? The smallest selection one can make is one sample, but a single sample can only be selected if one is zoomed in enough to see individual samples. The smallest possible selection I can make on the 37 minute file currently on my screen, when I am viewing it full screen, is just under 98,000 samples. Is the "yellow arrow cursor" actually a geometric line of zero width? If not, its position covers a varying duration that depends on file size and/or zoom level.
Reply #12
« on: December 21, 2006, 02:00:49 AM »
SteveG Offline
Administrator
Member
*****
Posts: 8250



Quote
Is the "yellow arrow cursor" actually a geometric line of zero width? If not, its position covers a varying duration that depends on file size and/or zoom level.
It depends entirely on the FFT size. If you set it to 128, then a single cursor position is 128 samples at whatever sample rate you are using. If you set it to 65536 samples at 44.1k, then it's about 1.49 seconds. What's the problem? (apart, that is, from you only very gradually realising how FFTs work....!)

Reply #13
« on: December 22, 2006, 04:13:37 PM »
hornet777 Offline
Member
*****
Posts: 86



If you haven't already, AndyH, I might suggest you procure and read through The Scientist and Engineer's Guide to Digital Signal Processing. It's a free book available online in PDF format, and will provide useful background to many of the arcane questions you post here.

As for the Frequency and/or Phase analysis windows in CEP, I don't know why it's so funky either, but  afro does. He says it's only meant to be a "ballpark" estimate, and quick graphical presentation, rather than a super-accurate precision instrument. If you want the latter, there are Agilent, Tektronix, SRS, and several other manufacturers that can supply your need.

After all has been invested in correctness, then how does it stand with truth?