360|Flex is THE conference for Flex developers and now it’s on the east coast. In case any east coasters aren’t familiar with 360|Flex, this is NOT that conference where suits go to learn cloud computing buzzwords. It’s not where you pay to have hundreds of corporate sponsors sell you their latest API. This is where Flex developers go to meet, hang out with, and learn from other Flex developers like you. I would recommend that you go ahead and click on this link to get your ticket while you can, but here are my top three reasons to see 360|Flex rock the east coast - just in case you need more convincing.
For anyone on the fence, get yourself down to San Jose for 360|Flex next week. This is where the community began and it’s one of the few places where you can harass Deepa Subramaniam about Flex Mobile, hang out in a bar next to Doug McCune (I heard he might be around), and watch Jesse Warden describe RobotLegs through D&D metaphors. Tyler and Jacob will also be there to give you the lowdown on Reflex (the future of RIA), and I might even have a surprise session up my sleeve - but you won’t know unless you sign up.
In my last post I wrote about The Math Behind Flash’s FFT Results and discussed the need to transform the default linear values returned by computeSpectrum into logarithmically spaced values. In this post I’ll be discussing the Math behind my own FrequencyAnalyzer class (available below) which does just that.
The goal I had when creating this class was to finally quantify the values used to create visualizations. That is, to know the exact center frequency and bandwidth of the frequency bands displayed. This is extremely important when creating “EQ” frequency visualizations and still very useful for creating experimental visualizations, but the information just hasn’t been available in Flash - until now.
At the core of FrequencyAnalyzer is the ability to take a frequency value and determine the closest index in the computeSpectrum results that corresponds to it. This allows it to retrieve the closest known amplitude for any given frequency. It sounds kind of complicated, but it’s actually just the inverse of the index-to-frequency calculation that I discussed in my previous post.
- frequency = i/1024 * 44100;
- i = frequency/44100 * 1024;
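As a rough illustration of that inverse mapping, here is a minimal sketch in Python (a stand-in for illustration, not the ActionScript inside FrequencyAnalyzer). The function names, the rounding to the nearest index, and the clamping to 256 values per channel are assumptions made for this example; the constants 1024 and 44100 come straight from the two formulas above.

```python
SAMPLE_SIZE = 1024    # FFT size used in the formulas above
SAMPLE_RATE = 44100   # default sample rate in Hz
NUM_RESULTS = 256     # values per channel; used here only for clamping (assumption)


def index_to_frequency(i):
    """Center frequency in Hz for result index i."""
    return i / SAMPLE_SIZE * SAMPLE_RATE


def frequency_to_index(frequency):
    """Closest result index for a frequency in Hz, clamped to the valid range."""
    i = round(frequency / SAMPLE_RATE * SAMPLE_SIZE)
    return max(0, min(NUM_RESULTS - 1, i))


if __name__ == "__main__":
    i = frequency_to_index(440)
    print(i, index_to_frequency(i))
```

With these numbers, 440 Hz lands on index 10, whose own center frequency is about 430.7 Hz, so the "closest known amplitude" is always an approximation limited by the spacing of the results.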
This makes everything else possible, but I need to know more than a single frequency’s amplitude. What I really need to know is the amplitude (or spectral density) of an entire range of frequencies. Furthermore this range needs to be defined logarithmically based on the center frequency given. Luckily, I have the math for that as well.
Since we perceive pitch on a base-2 logarithmic scale, an octave below any given frequency is frequency/2 and an octave above it is frequency*2. Octaves are a great way to define the frequency range because they are inherently logarithmic, and in fact this is what most real-world frequency displays show (1, 1/2, or 1/3 octave bands). Given the octave bandwidth, FrequencyAnalyzer finds the index for the min and max frequencies of each frequency band and performs an aggregate calculation for the amplitude value. Simply pass in an array of center frequencies (some common ones are provided as constants) and the bandwidth (in octaves) and the rest is done for you!
var amplitudes:Vector.<Number> = FrequencyAnalyzer.computeFrequencies([250, 400, 600, 800], 1); // 4-band, 1 octave
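To give a feel for what happens behind a call like that, here is a hedged sketch in Python (again a stand-in, not the class's actual ActionScript). It assumes each band is split evenly in octaves around its center, so the edges sit at center * 2^(-bandwidth/2) and center * 2^(bandwidth/2), and it uses the peak value in the band as the aggregate. The names `band_index_range` and `compute_frequencies` are hypothetical, and unlike the real method this version takes the spectrum values as an explicit argument.

```python
SAMPLE_SIZE = 1024
SAMPLE_RATE = 44100


def band_index_range(center, octaves, num_results=256):
    """Inclusive (low, high) result indices for a band around `center` Hz.

    Assumption: the band extends octaves/2 below and octaves/2 above the center.
    """
    low_hz = center * 2 ** (-octaves / 2)
    high_hz = center * 2 ** (octaves / 2)
    low = round(low_hz / SAMPLE_RATE * SAMPLE_SIZE)
    high = round(high_hz / SAMPLE_RATE * SAMPLE_SIZE)
    low = max(0, min(num_results - 1, low))
    high = max(low, min(num_results - 1, high))
    return low, high


def compute_frequencies(spectrum, centers, octaves):
    """Peak amplitude per band (peak is used as the aggregate, not the average)."""
    amplitudes = []
    for center in centers:
        low, high = band_index_range(center, octaves, len(spectrum))
        amplitudes.append(max(spectrum[low:high + 1]))
    return amplitudes


if __name__ == "__main__":
    spectrum = [0.0] * 256
    spectrum[6] = 0.5    # energy near 258 Hz
    spectrum[20] = 0.9   # energy near 861 Hz
    print(compute_frequencies(spectrum, [250, 400, 600, 800], 1))
```

Under these assumptions a 1-octave band centered on 250 Hz covers indices 4 through 8, while the same bandwidth centered on 800 Hz covers indices 13 through 26, which shows how much more raw data each higher band has to aggregate.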
It’s important to note that it doesn’t calculate the average value for frequency bands. There’s a special aggregate calculation needed which I’m still testing, but for now using the peak values seems to work quite well. Another item of note is the optional stretchFactor parameter.
As discussed in my previous post, the sample rate (affected by stretchFactor) determines the highest frequency which is measured (11,025 Hz by default). For each increment of stretchFactor you will decrease the top measurable frequency by half, but you also boost the fidelity of the lower frequency ranges. This is important if you require a larger data set (roughly 30 or more frequency bands) and aren’t interested in higher frequencies (above 5512.5 Hz for stretchFactor = 1). It’s even possible to run computeSpectrum at both sample rates or determine the best stretchFactor automatically, but I’ve had enough math for today. I’ll leave the rest up to you.
Demo: frequencyanalyzer.swf (http://files.benstucki.com/fftmath/frequencyanalyzer.swf)
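To make the stretchFactor trade-off concrete, here is a small Python sketch of the arithmetic. It assumes that each increment of stretchFactor halves the effective sample rate while the FFT size (1024) and the number of returned values per channel (256) stay the same; `stretch_summary` is a hypothetical helper, not part of FrequencyAnalyzer.

```python
BASE_SAMPLE_RATE = 44100  # Hz, at stretchFactor = 0
SAMPLE_SIZE = 1024
NUM_RESULTS = 256


def stretch_summary(stretch_factor):
    """Return (top measurable frequency, spacing between results), both in Hz.

    Assumption: the effective sample rate is 44100 / 2**stretch_factor.
    """
    effective_rate = BASE_SAMPLE_RATE / 2 ** stretch_factor
    spacing = effective_rate / SAMPLE_SIZE
    top = NUM_RESULTS * spacing
    return top, spacing


if __name__ == "__main__":
    for s in range(3):
        top, spacing = stretch_summary(s)
        print(f"stretchFactor={s}: top={top} Hz, spacing={spacing} Hz")
```

This reproduces the figures quoted above: 11,025 Hz at the default setting and 5512.5 Hz at stretchFactor = 1, with the spacing between results dropping from about 43.07 Hz to about 21.53 Hz. That halved spacing is where the extra low-frequency fidelity comes from.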
Download: FrequencyAnalyzer.zip (2.37 MB)
I’ve been playing with the HYPE framework recently, and I noticed that they use a SoundAnalyzer class to wrap the Flash Player’s native SoundMixer.computeSpectrum method. This method’s FFT mode is known to have some problems (not really, but more on that later), and as it turns out I’ve dealt with this some already. So here it is (after sitting in my drafts for longer than I care to admit), the Math behind Flash’s FFT!
First, let’s take a look at some of the raw data from the computeSpectrum method. When I originally intended to write this post I took the opportunity to capture aggregate results over a large number of frequency sweeps using an AIR application, and I’ve posted some of the data for you here.
As I alluded to earlier, the long-known problem with computeSpectrum’s FFT is that the majority of frequencies seem to be skewed toward the left side - but what they don’t mention in the docs is that it’s because the FFT results are still distributed linearly. Yep, even though the floor and ceiling values in my data were mostly useless (perhaps due to harmonics), you can see a rough pattern in the peak frequencies - an average linear spacing of about 43.07451 Hz between them!
…okay, now let me back up a second to explain that. The way that we naturally understand sound is logarithmic. In musical terms this means that an octave below the note A-440 (440 is the frequency) is A-220, but an octave above A-440 is actually A-880 (not 660!). Audibly we think of the distance between A-440 and A-220 to be the same as between A-440 and A-880, but when displayed in a linear distribution (like our FFT results) A-880 is twice the distance from A-440 as A-220 - and when you use this linear distribution to create visualizations, it doesn’t match our natural perception of the sound. So, just in case any of that got really confusing, I created a little Flash application to illustrate the issue better: FFTMath.
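The same comparison can be written out numerically. The short Python sketch below (the function names are just for this illustration) measures the distance between two notes both ways: linearly in Hz, which is how the FFT results are laid out, and in octaves, which is closer to how we hear them.

```python
import math


def linear_distance(f1, f2):
    """Distance between two frequencies in Hz."""
    return abs(f2 - f1)


def octave_distance(f1, f2):
    """Distance between two frequencies in octaves (base-2 logarithm of the ratio)."""
    return abs(math.log2(f2 / f1))


if __name__ == "__main__":
    for other in (220, 880):
        print(other, linear_distance(440, other), octave_distance(440, other))
```

From A-440, the linear distance is 220 Hz down to A-220 but 440 Hz up to A-880, while the octave distance is exactly 1 in both directions.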
It’s important to note that this isn’t a flaw in Flash’s FFT results. FFT calculations are supposed to return linear frequency distributions, but this raw data needs to be redistributed into logarithmically spaced frequency bands before being used for visualizations. Before redistributing our linear FFT results though, we’re going to need some math to explain them.
After lots of searching on my magic number and a little bit of luck, I came across this amazing blog post: http://code.compartmental.net/2007/03/21/fft-averages/ - and Eureka!
Each point i in the FFT represents a frequency band centered on the frequency i/1024 * 44100 whose bandwidth is 2/1024 * 22050 = 43.0664062 Hz…
Alright math club, here’s the good part. Based on the math from this awesome blog post and the results from my testing, it looks like Flash uses a sample size of 1024 and a default sample rate of 44100 Hz, resulting in frequency bands of (2/1024) * (44100/2) or 43.0664062 Hz. Even though an FFT with 1024 samples should return 513 valid results up to the Nyquist frequency (sampleRate/2), Flash further clips the data to 256 results for a top frequency of around 11,025 Hz - still much higher than needed for most visualizations.
- frequency = i/1024 * 44100
- bandwidth = (i==0) ? 1/1024 * 22050 : 2/1024 * 22050;
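Here is a hedged Python sketch of those two formulas, plus a small count that shows why linearly spaced results look skewed to the left. The helper names are made up for this example, and the 256-result limit is the clipped length described above.

```python
SAMPLE_SIZE = 1024
SAMPLE_RATE = 44100
NUM_RESULTS = 256


def center_frequency(i):
    """Center frequency in Hz of result index i."""
    return i / SAMPLE_SIZE * SAMPLE_RATE


def bandwidth(i):
    """Bandwidth in Hz of result index i; index 0 only gets half a band."""
    nyquist = SAMPLE_RATE / 2
    return (1 if i == 0 else 2) / SAMPLE_SIZE * nyquist


def results_in_range(low_hz, high_hz):
    """Number of results whose center frequency falls in [low_hz, high_hz)."""
    return sum(
        1 for i in range(NUM_RESULTS)
        if low_hz <= center_frequency(i) < high_hz
    )


if __name__ == "__main__":
    print(bandwidth(0), bandwidth(1))
    print(results_in_range(220, 440), results_in_range(440, 880))
```

With these values the octave from 220 Hz to 440 Hz contains only 5 results, while the octave from 440 Hz to 880 Hz contains 10. Each higher octave gets roughly twice as many results as the one below it, so the low end ends up squeezed into a handful of values.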
This information lets us understand the results we’re getting from computeSpectrum. In my next post we’ll use this information to redistribute the data and create more accurate frequency visualizations in Flash.
Download: FrequencyAnalyzer.zip (2.37 MB)