UNIT: Jan Dušek, Audio Lead, Audio Dept.
TO: Arma 3 Users
OPSUM: Outlining the current & future goals of Arma 3's Audio Department
EVALUATION
After an intense period in the run-up to Marksmen DLC's launch, it's now the perfect time to look back upon the recently introduced audio features, explain them in a bit more detail, and provide new information about our future plans. There were also a lot of comments and questions from the community, so this OPREP will also serve as a summary reply to most of the topics which have been discussed publicly.
Before we get into the juicy technical details (which we'll try to break up into consumable chunks), let's get a sense of what our general objectives and outputs have been. With Marksmen DLC, we mainly focused on the experience of infantry fire-fights; the most significant changes were made to samples and features regarding shooting, together with some general enhancements.
OVERVIEW
To begin, we organized a shooting range sound recording session to obtain new samples and also to have fresh experience in mind for our design. Using these references, we redesigned all personal weapon samples, and introduced a brand new feature: an additional layer of dynamic sound reflection samples (aka, reflection 'tails'), mixed dynamically according to the environment. Hand in hand with that, we widened the dynamic range to provide a more authentic audio experience.
Another very significant and long-desired improvement is interior attenuation, which finally brings the experience of being inside structures much closer to reality. Other features were also implemented or enhanced, such as frequency attenuation based on distance and the supersonic crack, together with several smaller improvements like better synchronization of breathing with weapon movement.
We consider several of these new audio features to be in a 'first round' of implementation. We are planning to carry out additional enhancements in further iterations according to our own vision, based on common solutions, but also by experimenting with new ideas and considering every single bit of constructive community feedback, too. Let's take a more detailed look at the most significant improvements we've made and their possible future.
REVIEW
Generally, volume balancing is one of the biggest challenges in game sound design. Although it's definitely not possible (and not desired) to get a 1:1 'real' dynamic range into any game, we have always favoured a wider range in Arma. Recently, we decided to expand it a bit to fit reality better and, together with remixing shooting samples, rebalance it to avoid clipping / crazy limiting in edge cases.
There are also new boosting parameters for shooting (one for 1st person shooting and one for AI shooting), which, in future, should appear in config for easier customization. What's the future? From a technology standpoint, we currently have a static dynamic range, but a more splendid solution would be an advanced High Dynamic Range Mixing System, together with the possibility of static dynamic range customization, because it's clear that a wide dynamic range can be uncomfortable for some players. Although it would be very useful to have such an option, at this stage, we are only investigating the possibilities.
The attenuation of sounds produced outside of a building when you enter one has long been a desired feature. One reason it was not implemented much sooner is a bit 'historical': considering Arma gameplay mostly takes place in the open, development resources were simply focused on tasks with higher priority. But finally, it's there! Again - in a 'first version' of implementation. There are several outstanding issues we know about, and there are obviously several potential improvements to be made.
Currently, there is a time-based cross-fade of frequency attenuation (and interior controller for reflections). We hope to replace this with a distance-based transition. Another thing is, in fact, there is no 'real-time' reverb; this aspect, plus more detailed detection is something under consideration, set against, as usual, performance demands. There are many more ideas concerning this feature, but their priority is definitely lower than other, more general planned improvements.
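The difference between a time-based and a distance-based transition can be sketched as follows. This is only an illustration: the function names, the 0.5 s fade time and the 2 m fade distance are assumptions made for the example, not engine values.

```python
def clamp01(x: float) -> float:
    """Limit x to the range 0..1."""
    return max(0.0, min(1.0, x))


def interior_factor_time_based(seconds_since_entering: float,
                               fade_time_s: float = 0.5) -> float:
    """Time-based style: the factor depends only on time elapsed since entering."""
    return clamp01(seconds_since_entering / fade_time_s)


def interior_factor_distance_based(depth_inside_m: float,
                                   fade_distance_m: float = 2.0) -> float:
    """Distance-based style: the factor depends on how far inside the listener is."""
    return clamp01(depth_inside_m / fade_distance_m)
```

With the distance-based variant, a listener who stops in a doorway keeps a partly attenuated mix instead of fading all the way to the interior sound.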
The reflection 'tails' feature is a simple but powerful simulation of convolution reverb, where reflection samples designed for specific environment types are mixed dynamically. We think adding it to the game was a great improvement. Since release, we have added two optimisations: to avoid flanging, we added a tiny pitch randomisation and, to save resources, the number of reflection samples played simultaneously is restricted to the 3-4 most noticeable ones (1-2 for AI shooting). This limitation should, in future, be distance-based.
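A minimal sketch of the voice limit and pitch randomisation described above. The `Tail` type, its `loudness` field and the 2% pitch spread are hypothetical, chosen only for illustration; the engine's actual selection criteria are not given here.

```python
import random
from dataclasses import dataclass


@dataclass
class Tail:
    name: str        # hypothetical sample identifier
    loudness: float  # hypothetical perceived loudness at the listener


def pick_tails(candidates, max_voices=4, pitch_spread=0.02, rng=None):
    """Keep the loudest tails and give each a slightly different pitch factor."""
    rng = rng or random.Random()
    loudest = sorted(candidates, key=lambda t: t.loudness, reverse=True)[:max_voices]
    # A tiny random detune stops identical samples from lining up and flanging.
    return [(t.name, 1.0 + rng.uniform(-pitch_spread, pitch_spread)) for t in loudest]
```

Calling `pick_tails(tails, max_voices=2)` would correspond to the tighter limit used for AI shooting.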
One issue (according to community feedback) is that it's sometimes annoying to listen to 'overly ambitious' stereo effects (fake reflections in one channel only); consequently, all 'tail' samples will pass through one more revision. There's also an idea to have multiple reflection samples for each environment; however, rather than implement this specifically for the 'tail' feature, it could be part of more general improvements to sample randomisation. Tails are currently present for shooting only, but we plan to implement the same approach for explosions, too.
We implemented the first version of Frequency Attenuation based on distance for all sounds in the game (with some exceptions). For now, attenuation is achieved via a simple low-pass filter with a non-linear shifting cut-off frequency.
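As a rough illustration of a low-pass filter whose cut-off shifts non-linearly with distance, here is a sketch using a one-pole filter. The curve shape and every default value (20 kHz near, 500 Hz far, 2000 m range, exponent 0.5) are assumptions for the example; the filter type and curve actually used in the game are not specified here.

```python
import math


def cutoff_hz(distance_m, near_hz=20000.0, far_hz=500.0,
              max_distance_m=2000.0, curve=0.5):
    """Map distance to a cut-off frequency along a non-linear curve (assumed shape)."""
    x = max(0.0, min(1.0, distance_m / max_distance_m))
    # Interpolate on a logarithmic frequency scale; curve < 1 drops quickly at first.
    return near_hz * (far_hz / near_hz) ** (x ** curve)


def low_pass(samples, cutoff, sample_rate=48000.0):
    """One-pole low-pass filter: y[n] = y[n-1] + a * (x[n] - y[n-1])."""
    a = 1.0 - math.exp(-2.0 * math.pi * cutoff / sample_rate)
    out, y = [], 0.0
    for x in samples:
        y += a * (x - y)
        out.append(y)
    return out
```

A distant sound would then be processed as `low_pass(samples, cutoff_hz(distance_m))`, losing more high-frequency content the further away it is.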
In reality, there are many more aspects which affect frequency, from obvious things like air humidity or temperature, to obstacles, reflections or ground effect. Regardless, we consider it a great improvement, because there was nothing at all before. The current state is not final; the question is how far should we go with more advanced simulation, balancing performance penalties against gameplay / experience improvements.
Some feedback questioned why we don't use the inverse-square law (Sound pressure? Sound intensity? Loudness?). We do use it. But consider this: in reality, this part of sound attenuation is represented by geometrical divergence and atmospheric absorption of sound (which can be described as full-spectrum frequency filtering); real-world sounds do not have anything like a 'distance parameter', a radius at which volume reaches zero. Sound intensity is simply attenuated with a 1/r^2 dependency until it drops below the level of other sounds and is no longer recognizable.
Mainly because distance parameter values differ, different attenuation curves are needed for a better gameplay experience. What we really want to see in the future is for those specific volume attenuation curves to be exposed in config for better customization. Of course, there is much more to describe here, but this topic would deserve a separate article.
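The 1/r^2 relationship can be illustrated with a short calculation. This sketch covers geometrical divergence only and deliberately ignores atmospheric absorption; the function names and the 1 m reference distance are assumptions for the example.

```python
import math


def level_db(source_db_at_1m: float, distance_m: float) -> float:
    """Level at a distance when intensity falls with 1/r^2 (about -6 dB per doubling)."""
    return source_db_at_1m - 20.0 * math.log10(max(distance_m, 1.0))


def masking_distance_m(source_db_at_1m: float, ambient_db: float) -> float:
    """Distance at which the source level has dropped to the ambient level."""
    return 10.0 ** ((source_db_at_1m - ambient_db) / 20.0)
```

Under this idealised model the level never reaches zero; the sound simply sinks below whatever else is audible, which is why a fixed 'distance parameter' needs its own attenuation curve in a game.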
Sonic Crack (SC) is now positional and caused by every single bullet travelling above the speed of sound (also after ricochet or material penetration). If a bullet slows below the speed of sound during flight, the SC is simply replaced by a 'fly-by' sound. We recorded hundreds of cracks using different ammo types and various distances; the current samples are our own recordings, with a designed reflection tail. Favouring a distance-based frequency attenuation approach, we abandoned SC samples for specific distances (although the possibility will remain for backward compatibility). In future, we intend to enable SC samples based on environment type (together with the possibility to have a sample array for each definition).
Currently, we are addressing several special cases, such as whether there should be an SC when a bullet does not pass the listener. In reality, an SC is not a point, but a whole trajectory (with 1/r^2 attenuation of sound intensity), which forms a right-angled triangle (bullet-listener-SC, right angle at the listener) where the listener-bullet line lies on the shockwave cone. For our simulation, a simplified calculation is considered sufficient because of common bullet speeds; since there should be an SC even when the bullet does not pass the listener, we simulate this by creating an SC at the impact position.
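The triangle described above can be worked out from the Mach angle, sin(mu) = c / v. The sketch below assumes a straight trajectory, a constant bullet speed and a speed of sound of 343 m/s; the function name and coordinate convention are made up for the example and do not reflect the game's simplified calculation.

```python
import math


def sonic_crack_geometry(miss_distance_m, bullet_speed, sound_speed=343.0):
    """Return (crack_x, bullet_x) along the trajectory, in metres.

    The listener sits miss_distance_m away from a straight trajectory; x = 0 is
    the point of closest approach and positive x is the direction of flight.
    """
    if bullet_speed <= sound_speed:
        raise ValueError("no sonic crack for a subsonic bullet")
    mach_angle = math.asin(sound_speed / bullet_speed)
    # Point on the trajectory whose shockwave reaches the listener first.
    crack_x = -miss_distance_m * math.tan(mach_angle)
    # Where the bullet is at the moment the listener hears the crack.
    bullet_x = miss_distance_m / math.tan(mach_angle)
    return crack_x, bullet_x
```

The crack appears to come from a point the bullet passed before its closest approach. If the bullet hits something before reaching that point, no such position exists on its path, which is the case the impact-position fallback addresses.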
There is one more enhancement for shooting and explosions awaiting implementation: distant samples. We are aware that the sound of shooting heard from a distance is significantly different than the sound you can experience at close range (meaning right next to the weapon). Simple frequency filtering of close samples works, but can bring worse results than using samples recorded from a distance.
One solution is quite common: close and distant samples could be mixed together with a volume ratio based on distance. What could be very interesting, and could bring more possibilities to improve the overall soundscape, is that we are considering implementing this as a general feature for all sounds. Its implementation, of course, will depend on how much time we have to create a consistent, splendid set of samples.
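A minimal sketch of mixing close and distant samples by distance. The equal-power curve is an assumption picked for the example (the text only calls for a distance-based volume ratio), and the 50 m / 600 m thresholds are placeholders.

```python
import math


def close_distant_gains(distance_m, near_m=50.0, far_m=600.0):
    """Return (close_gain, distant_gain) using an equal-power cross-fade."""
    x = max(0.0, min(1.0, (distance_m - near_m) / (far_m - near_m)))
    return math.cos(x * math.pi / 2.0), math.sin(x * math.pi / 2.0)
```

An equal-power curve keeps the combined energy roughly constant through the transition, so the blend does not dip in loudness halfway between the two samples.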
Speed of sound simulation is currently implemented for explosions (and several other purposes); we intend to extend this feature for more sound groups and categories, such as the looped sounds of vehicle engines, to resolve some existing issues. We know, for example, that the gap between instant cut-off of the looping sound and the forthcoming explosion is unfortunate, and we hope to focus on this first.
Although it looks like a simple task, it could quickly become very difficult. Imagine you want to simulate the delay caused by the speed of sound not only for the start and stop of the sample (which is the most obvious issue), but for all controllers which affect parameters of looped sounds; furthermore, the delay is constantly changing, because the sound source is moving. Moreover, it is not only the time delay that needs to be simulated; it would also be great to have the position of the sound source 'delayed' to match the observed position of the object which caused the sound.
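To illustrate why a moving source complicates things, here is a sketch that finds both the propagation delay and the 'delayed' source position. It assumes a constant-velocity, subsonic source and 343 m/s for the speed of sound; the function name and the fixed-point iteration are choices made for the example, not a description of the engine.

```python
import math


def observed_state(source_pos, source_vel, listener_pos,
                   sound_speed=343.0, iterations=20):
    """Return (delay_s, observed_pos) for a source moving at constant velocity.

    The listener hears the sound emitted delay_s seconds ago, from the position
    the source had at that moment. Solved by fixed-point iteration on the delay.
    """
    delay = 0.0
    for _ in range(iterations):
        observed = tuple(p - v * delay for p, v in zip(source_pos, source_vel))
        delay = math.dist(observed, listener_pos) / sound_speed
    observed = tuple(p - v * delay for p, v in zip(source_pos, source_vel))
    return delay, observed
```

Every controller value of a looped sound would have to be read from that same earlier moment, and the delay itself has to be recomputed as the source moves.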
Various aspects of our audio engine deserve maintenance in the sense of unifying low-level features and their configuration. We are currently using several types of configuration of sounds with various features like random selection from arrays, looping, simple expressions and controllers usage, etc. We would like to start preparations for at least a partial unification.
This task will be very difficult (mainly because of the necessity of preserving backward compatibility), but unification is the key, which brings more possibilities to all sound categories. Another plan is to move several hard-coded parameters to config, which will speed up the tweaking/polishing process, and bring additional options for modders. As I mentioned above, it concerns boosting factors for shooting, volume attenuation curves, submix volumes for balancing whole categories, and more.
LONG-VIEW
Over the past year we've brought a lot of improvements to Arma's audio engine; we think this work has helped enrich the overall soundscape a lot, but it's definitely not over at all! Of course, there are many more ways to improve the Arma audio engine not mentioned here; however, let's pick out one exciting low-level enhancement awaiting implementation, namely: vector-based multichannel panning.
Some community feedback points out that there is still 'something flat' in Arma sound (for example, when standing beside a running helicopter). In reality, you experience a complex soundscape consisting of various sounds (engine, rotor, exhaust, etc.) but, more importantly, those specific sounds are coming from various directions. Let's take shooting as another example. In the real world, all the reflections of actual shot sounds or sonic cracks come from basically all directions, but not in Arma.
Yes, we are mixing stereo samples to simulate this effect but, unfortunately, any positional sound in Arma is played as a mono sample; it does not matter how many channels the source sound has, because a stereo sample is downmixed into mono. Well, this mono downmix is actually what should happen smoothly as the listener-source distance increases, so it will become a part of the upcoming feature. Additionally, it would be beneficial to have a mono downmix factor as a parameter.
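A smooth, distance-driven mono downmix could look something like the sketch below. The `mono_factor` parameter, the linear ramp and the 5 m / 100 m thresholds are all hypothetical, used only to show the idea.

```python
def downmix(left, right, mono_factor):
    """Blend a stereo frame toward mono: 0.0 keeps full stereo, 1.0 is fully mono."""
    mid = 0.5 * (left + right)
    return (left + (mid - left) * mono_factor,
            right + (mid - right) * mono_factor)


def mono_factor_for_distance(distance_m, full_stereo_m=5.0, full_mono_m=100.0):
    """Assumed linear ramp: stereo up close, mono beyond full_mono_m."""
    x = (distance_m - full_stereo_m) / (full_mono_m - full_stereo_m)
    return max(0.0, min(1.0, x))
```

Exposing the factor as a parameter would let each sound decide how quickly it collapses to a point source.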
We believe this will be a superior audio engine enhancement, as it affects basically every single sound category in our game. It will open the door for other possible advanced features, such as positional environmental sounds. With this step, we will move things much closer to industry standard. By the way, an existing example of this kind of enhancement was released with the Helicopters DLC: interior loops within the vehicles are now stereo, moving (L-R panning only) according to camera rotation.
Currently, we are sorting out all the ideas, community feedback, and upcoming milestones, and considering all possible improvements, in order to update our internal Audio Roadmap and bring all the new exciting enhancements to your ears as soon as possible. Of course, we'd also like to thank our community for their splendid feedback (we are reading it almost every day), with a special shout-out to (BIForum member) Megagoth, for collecting various questions and thoughts, which led to a clear set of desired improvements!
Although the audio department (and the Arma team in general) is still relatively modest in size (if not ambition!) compared to the big development beasts, we've again managed to reinforce our team. While we can't say for sure exactly when these ideas will translate into implementation, hopefully, we'll be able to provide such improvements more frequently on the road to the Arma 3 Expansion.