Conceptual Talk: Realistic Game Audio
Posted: Fri Feb 15, 2013 12:15 pm
Good evening folks,
During the past few days I have been thinking a lot about game audio. Specifically, I'm talking about achieving ‘phonorealism’ (I'm coining this term for the purpose; I don't know if it has been used before), i.e., the most accurate possible representation of sound in computer games. This implies that I'm not talking about a side-scroller or anything two-dimensional, but about creating a sound environment as realistic as the visual environment it accompanies. I'm fairly sure that caring about these things adds to the gaming experience.
I want to apply the same standards to sound as most people apply to graphics in a simulator. Keep in mind that although I am talking about it today, what I have in mind is actually three to ten years in the future. So, to put it correctly, I want to apply the same standards to sound as most people will apply to graphics in five years. The trade-offs should be comparable.
What I am trying to achieve is best described as a simulation of acoustics: the behavior of sound in three-dimensional space and the acoustic behavior of materials, terrain, walls, and closed rooms. Most importantly, this covers sound propagation, resonance, reflection, absorption, and reverberation.
I suppose that simulating sound waves in three dimensions at 44,100 samples per second is too expensive to do, and not even worthwhile, because we still simplify almost everywhere in computer games. So what is needed is a simplified model of sound which yields results comparable in accuracy to graphics.
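To put a rough number on "too expensive", here is a back-of-the-envelope sketch. It assumes a uniform grid solver, about ten grid cells per shortest wavelength (a common rule of thumb, not a requirement), one update of every cell per output sample, and a hypothetical 5 m x 4 m x 3 m room. The function name and all parameters are made up for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at about 20 degrees C
SAMPLE_RATE = 44_100    # output samples per second


def wave_grid_cost(room_dims_m, max_freq_hz=20_000.0, cells_per_wavelength=10):
    """Estimate (cell_count, cell_updates_per_second) for a uniform 3D grid.

    Assumption: the grid must resolve the shortest audible wavelength with
    `cells_per_wavelength` cells, and every cell is updated once per sample.
    """
    shortest_wavelength = SPEED_OF_SOUND / max_freq_hz
    cell_size = shortest_wavelength / cells_per_wavelength
    cells = 1
    for dim in room_dims_m:
        cells *= math.ceil(dim / cell_size)
    return cells, cells * SAMPLE_RATE


cells, updates_per_second = wave_grid_cost((5.0, 4.0, 3.0))
print(f"{cells:.2e} cells, {updates_per_second:.2e} cell updates per second")
```

Under these assumptions the small room already needs on the order of 10^10 cells, which supports looking for a simplified model instead of a brute-force wave simulation.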
Here is an example I recently tried in a sound editing program: I tried to simulate the sound of a car radio from the ‘POV’ (actually, the point of listening) of the driver inside the car. After a six-page derivation of the filters I would need, I was able to construct a reasonably realistic sound, incorporating sample offsets for the different travel distances, a cuboid model of the interior's resonance behavior, a speaker modeler, and lastly, reverb. So I saw that it could be done, but I am neither an expert in writing sound filters, nor do I have any experience with more general cases, like shaking your head inside the car. The result is essentially static. So I had to come up with a different approach.
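Two of the ingredients mentioned above can be sketched in a few lines: the sample offset for a given travel distance, and the resonance frequencies of a cuboid. This is not the six-page derivation, only a minimal illustration. It assumes a speed of sound of 343 m/s, ideal rigid walls (the textbook rectangular-room mode formula), and made-up interior dimensions of 2.0 m x 1.4 m x 1.1 m; all function names are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed
SAMPLE_RATE = 44_100


def delay_in_samples(distance_m, sample_rate=SAMPLE_RATE):
    """Sample offset for sound that has to travel `distance_m`."""
    return round(distance_m / SPEED_OF_SOUND * sample_rate)


def cuboid_mode_hz(dims_m, mode):
    """Resonance frequency of mode (nx, ny, nz) in a rigid-walled cuboid."""
    (lx, ly, lz), (nx, ny, nz) = dims_m, mode
    return (SPEED_OF_SOUND / 2.0) * math.sqrt(
        (nx / lx) ** 2 + (ny / ly) ** 2 + (nz / lz) ** 2
    )


def lowest_modes(dims_m, max_index=2, count=5):
    """Return the `count` lowest (frequency, mode) pairs, skipping (0, 0, 0)."""
    modes = []
    for nx in range(max_index + 1):
        for ny in range(max_index + 1):
            for nz in range(max_index + 1):
                if (nx, ny, nz) != (0, 0, 0):
                    modes.append((cuboid_mode_hz(dims_m, (nx, ny, nz)), (nx, ny, nz)))
    return sorted(modes)[:count]


car_interior = (2.0, 1.4, 1.1)  # hypothetical dimensions in meters
print(delay_in_samples(0.8), "samples from a speaker 0.8 m away")
for freq, mode in lowest_modes(car_interior):
    print(f"mode {mode}: {freq:.1f} Hz")
```

A real car interior is neither rigid nor a box, so these frequencies would only be a starting point for placing resonant filters.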
I had one basic idea: implement a system analogous to graphics rendering, in which sound sources are represented almost like light sources. Each material would get its own ‘sound shader,’ which defines that material's reflection/absorption behavior. I would create a ‘sound image’ for each output channel (be it binaural, or even a 7-channel sound system) to collect information about the distance traveled and the filters applied in the shaders, and eventually compute a mix of the different sound sources, just as the GPU computes the image for the eyes. I would use similar geometry to the visual scene, or even the very same geometry.
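One possible reading of the ‘sound shader’ idea, as a minimal sketch: each material carries per-band absorption coefficients, each propagation path lists the materials it bounces off, and "shading" a path yields a delay plus per-band gains that a mixer could later apply. Everything here is an assumption for illustration: the class and function names, the three frequency bands, the 1/r distance falloff clamped at 1 m, and the coefficient values, which are not measured data. Finding the paths from the scene geometry is left out entirely.

```python
import math
from dataclasses import dataclass

SPEED_OF_SOUND = 343.0  # m/s, assumed
SAMPLE_RATE = 44_100


@dataclass(frozen=True)
class SoundShader:
    """Hypothetical material: fraction of energy absorbed per band (low, mid, high)."""
    name: str
    absorption: tuple

    def reflect(self, band_gains):
        # Amplitude scales with the square root of the reflected energy.
        return tuple(g * math.sqrt(1.0 - a) for g, a in zip(band_gains, self.absorption))


@dataclass(frozen=True)
class SoundPath:
    """One route from a source to a listener channel."""
    length_m: float
    bounces: tuple = ()  # SoundShader instances hit along the way


def shade_path(path, bands=3):
    """Return (delay_in_samples, per_band_gains) for one path."""
    gains = (1.0,) * bands
    for shader in path.bounces:
        gains = shader.reflect(gains)
    spreading = 1.0 / max(path.length_m, 1.0)  # 1/r falloff, clamped at 1 m
    delay = round(path.length_m / SPEED_OF_SOUND * SAMPLE_RATE)
    return delay, tuple(g * spreading for g in gains)


def sound_image(paths):
    """Collect the shaded paths for one output channel, earliest arrival first."""
    return sorted(shade_path(p) for p in paths)


glass = SoundShader("glass", (0.10, 0.05, 0.02))        # illustrative values
upholstery = SoundShader("upholstery", (0.30, 0.60, 0.80))  # illustrative values
left_ear = sound_image([
    SoundPath(0.9),                        # direct sound
    SoundPath(1.6, (glass,)),              # one bounce off the windshield
    SoundPath(2.3, (glass, upholstery)),   # two bounces
])
for delay, gains in left_ear:
    print(delay, [round(g, 3) for g in gains])
```

The analogy to graphics is that `shade_path` plays the role of a fragment shader and `sound_image` the role of the framebuffer for one channel; a second channel would simply get its own list of paths.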
I encourage you to comment on this concept and to improve or add to it wherever you see room. Or even propose a totally different concept, of course.