Ask Uncle Willy #3: July 7, 1995

Uncle Willy has received several questions about the sounds and music in
games by Williams Electronics Games, Inc.  Since this is a rather extensive
subject, Uncle Willy has decided to devote this week's issue of Ask Uncle
Willy to covering this subject in depth.  Uncle Willy apologizes for the
length of this article, but hopes this topic proves interesting.

Uncle Willy can be reached at

         uncle_willy@wms.com

All comments and questions are welcome.  Keep in mind that some questions
just do not have an answer.  Others, of a proprietary nature, do not permit
an answer.

Uncle Willy enjoys hearing from you!

Question:
         Can you tell me about the sounds and music from the games by
         Williams Electronics Games, Inc.?  How are the sounds made?  How is
         the music composed?  How does the sound system recreate sounds?
         Who are the people who compose music for Williams games?

Answer:
   Introduction
         Williams Electronics Games, Inc. designs and manufactures pinball
         games under the Williams and Bally names and coin-op video games
         under the Midway name.  Anyone who has played these games knows that
         the computing power and graphics behind them have become very
         sophisticated and lifelike.  Pinball games are equally sophisticated,
         often containing several computer-controlled playfield toys that are
         integrated into the pinball action.

         To keep the audio on par with these state-of-the-art games, Williams
         has developed a new sound system, called DCS (which stands for
         Digital Compression System).  The DCS sound board provides four
         channels of 16-bit digital audio, with independent control over the
         volume, looping and playback of each channel.  Each channel can
         dynamically play back anything from an entire piece of music to a
         short sound effect, with a typical game using one channel for music
         and the remaining three for sound effects, speech and foreground
         music such as fanfares or breaks.

         The first pinball game to use DCS was Indiana Jones.  The first
         video game to use DCS was Mortal Kombat II.

   Background
         DCS is not the first digital game sound system, but it is the first
         truly high-fidelity, digital audio system designed specifically for
         coin-op arcade pinball and video games.  A comparison to past and
         current game sound systems puts its sophistication and sound quality
         in perspective.

         The "beeps" and "blips" of early arcade games were made with analog
         circuits and simple digital tone generators.  These sounds
         disappeared from arcades in the early 1980s when chip sets became
         available to do FM synthesis.  For several years, most sound systems
         for arcade games combined a microprocessor, an FM synthesis chip
         set, and a low sample rate, low quality digitizing system.  These
         systems were very similar to many of the sound cards currently
         available for personal computers, and provided fairly complex music,
         somewhat understandable speech and sonically interesting (if not
         altogether realistic) sound effects.

         FM synthesis was in turn replaced by sample playback systems,
         implemented with custom hardware or with software running on a
         Digital Signal Processor (DSP).  Samples help to make music more
         realistic, but the systems often suffer from a lack of polyphony and
         an upper limit of a few seconds of recording time for each sample.

         One of the most recent advances in game sound is CD-ROM.  Although
         CD-ROM technology is popular in home and personal computer games,
         there are several drawbacks that limit its usefulness in fast-paced,
         multi-player arcade games.  One problem is that CD-ROM audio schemes
         are limited to a single channel of mono or stereo sound.  Arcade
         games require several independent sound channels for layering music,
         sound effects and speech.  Access time is another problem.  Delays
         as short as 20 milliseconds between action and sound can make an
         arcade game seem sluggish.  Most of the CD-ROM game systems rely on a
         separate sample playback or FM synthesis system for interactive sound
         effects.  Also, few CD-ROM systems can withstand the bumping and
         shaking excited players can inflict on arcade game cabinets.

         DCS was designed to overcome the limitations of other game sound
         systems.  There were three main design parameters:  improving sound
         quality, maintaining interactivity and streamlining sound
         development.  16-bit audio played back at roughly 32 kHz (31,250
         samples per second, to be exact) delivers near-CD sound
         quality.  The ability to instantly start, stop, loop and control the
         volume of four independent channels in response to commands from the
         game provides interactivity.  Most importantly, the process of
         developing sounds changes from programming synthesized sounds to
         producing music and sound effects in professional recording studios.

   Audio Data Compression
         The biggest limitation with most game and multimedia sound systems is
         storage.  16-bit digital audio at a rate of 31,250 samples per second
         requires about 60 kilobytes of storage per second of sampled sound.
         Overall cost of a game limits the ROM available for sounds to about
         3 megabytes.  The problem is that 3 megabytes of ROM at 60 kilobytes
         a second is only enough for about one minute of total sound.  Most
         Williams games have a total of 10-15 minutes of non-repeating sound.
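
         As a rough check on the figures above, the storage arithmetic can
         be reproduced in a few lines of Python.  This is only a sketch: it
         assumes mono audio and takes "3 megabytes" to mean 3 x 1024 x 1024
         bytes, neither of which is stated outright in the text.

```python
# Raw (uncompressed) storage needs, using the figures quoted in the text.
SAMPLE_RATE_HZ = 31_250       # samples per second
BITS_PER_SAMPLE = 16          # 16-bit audio, assumed mono
ROM_BYTES = 3 * 1024 * 1024   # "about 3 megabytes", assumed to mean 3 MiB

bits_per_second = SAMPLE_RATE_HZ * BITS_PER_SAMPLE
bytes_per_second = bits_per_second // 8
seconds_of_audio = ROM_BYTES / bytes_per_second

print(bits_per_second)             # 500000, the 500 kilobit/second raw rate
print(bytes_per_second)            # 62500, i.e. about 60 kilobytes
print(round(seconds_of_audio, 1))  # 50.3, a little under one minute
```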

         The solution to this storage problem for DCS was the development of
         a proprietary transform coder algorithm that reduces a 500 kilobit/
         second data rate by a factor of ten or more, and runs on a low cost
         DSP chip.  This algorithm is similar to the algorithms used in the
         Sony MiniDisc format, the Philips Digital Compact Cassette format
         and the emerging digital sound formats for movies such as Dolby
         Stereo Digital, Sony Dynamic Digital Sound, and Digital Theater
         Sound.

         The encoding phase takes place on a PC, starting with digital audio
         files.  The files are broken down into frames of 240 samples each.
         A frame is 7.68 milliseconds of audio, and files can range from
         one to several thousand frames in length.  Each frame is transformed
         by a 256-point Fast Fourier Transform, with simple cosine windowing
         and eight samples of overlap on each end.  The resulting spectrum is
         further broken down into 16 bands and quantized according to masking
         curves and user-controlled parameters.  The quantizing levels and
         the resulting data for each frame are entropy encoded into variable
         length packets. The packets for each file are combined with header
         blocks and stored as files that are used later to generate ROM
         images.
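
         The framing step described above can be sketched in Python as
         follows.  The frame size, overlap and transform length come from
         the text; the exact window shape (raised-cosine ramps over the 16
         samples shared by neighbouring blocks), the zero padding at the
         ends and the function names are assumptions made for illustration.
         A naive DFT stands in for the FFT, and the band splitting,
         quantizing and entropy coding stages are not attempted.

```python
import cmath
import math

FRAME = 240                   # new samples per frame (from the text)
OVERLAP = 8                   # extra samples on each end (from the text)
BLOCK = FRAME + 2 * OVERLAP   # 256, the transform size
RAMP = 2 * OVERLAP            # samples shared by neighbouring blocks

def make_window():
    """Flat-topped window with raised-cosine ramps (an assumed shape)."""
    ramp = [0.5 - 0.5 * math.cos(math.pi * (i + 0.5) / RAMP) for i in range(RAMP)]
    return ramp + [1.0] * (BLOCK - 2 * RAMP) + ramp[::-1]

def split_into_blocks(samples):
    """Yield one windowed 256-sample block per 240-sample frame."""
    window = make_window()
    # Zero padding so the first and last blocks are full length (assumed).
    padded = [0.0] * OVERLAP + list(samples) + [0.0] * BLOCK
    n_frames = math.ceil(len(samples) / FRAME)
    for k in range(n_frames):
        block = padded[k * FRAME : k * FRAME + BLOCK]
        yield [s * w for s, w in zip(block, window)]

def dft(block):
    """Naive O(n^2) DFT, standing in for the 256-point FFT."""
    n = len(block)
    return [sum(block[t] * cmath.exp(-2j * math.pi * f * t / n) for t in range(n))
            for f in range(n)]

# Three frames of a 1 kHz test tone sampled at 31,250 Hz.
signal = [math.sin(2 * math.pi * 1000 * t / 31250) for t in range(3 * FRAME)]
blocks = list(split_into_blocks(signal))
spectrum = dft(blocks[0])
print(len(blocks), len(blocks[0]), len(spectrum))   # 3 256 256
```

         With this particular window, the ramps of adjacent blocks sum to
         one, so overlapping and adding the blocks gives back the original
         samples away from the ends of the file.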

         The decompression phase takes place on the DSP chip.  Starting with
         the header, a packet of compressed data is read from ROM and
         decompressed into a frame.  This involves entropy decoding the data
         stream, and then de-quantizing the data into a frame of frequency
         domain data.  The four channels are decompressed independently and
         summed in the frequency domain.  Volume operations, such as level
         settings and cross-fades, are also performed in the frequency domain.
         The final frame is inverse transformed and clocked out serially to a
         Digital-to-Analog Converter (DAC).  Music and sound effects compress
         to an average of 50 to 70 kilobits/second, and speech reduces to 20
         to 40 kilobits/second.  The resulting audio quality exceeds that of
         other algorithms operating at similar bitrates.
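
         Summing channels and applying volume in the frequency domain works
         because the Fourier transform is linear, so a single inverse
         transform per frame can serve all four channels.  The Python
         sketch below illustrates that property on a small block.  It is
         not the DCS decoder: the block length, test tones, volume values
         and function names are all made up for the demonstration.

```python
import cmath
import math

def dft(x):
    """Naive forward DFT."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * f * t / n) for t in range(n))
            for f in range(n)]

def idft(spec):
    """Naive inverse DFT, returning the real part of each sample."""
    n = len(spec)
    return [sum(spec[f] * cmath.exp(2j * math.pi * f * t / n) for f in range(n)).real / n
            for t in range(n)]

def mix_in_frequency_domain(spectra, volumes):
    """Scale each channel's spectrum by its volume and sum bin by bin."""
    n = len(spectra[0])
    return [sum(v * s[f] for s, v in zip(spectra, volumes)) for f in range(n)]

N = 32  # small block for a quick demo; the text describes 256-point frames
channels = [[math.sin(2 * math.pi * (c + 1) * t / N) for t in range(N)]
            for c in range(4)]
volumes = [1.0, 0.5, 0.25, 0.8]   # hypothetical per-channel levels

mixed_spectrum = mix_in_frequency_domain([dft(ch) for ch in channels], volumes)
from_frequency = idft(mixed_spectrum)   # one inverse transform for all channels
from_time = [sum(v * ch[t] for ch, v in zip(channels, volumes)) for t in range(N)]

max_error = max(abs(a - b) for a, b in zip(from_frequency, from_time))
print(max_error < 1e-9)   # True
```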

         Several aspects of the DCS algorithm capitalize on the nature of
         sounds for games.  The encoding, which is more complicated than
         decoding, can be performed on a PC without any restrictions on
         processing time.  The algorithm has a variable bitrate, which means
         that the amount of artifacts and the overall quality can be fine
         tuned for each individual sound.  Since the frame size is small,
         it is possible to carefully edit sound effects and seamlessly loop
         music.
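
         The compressed bitrates quoted above can be checked against the
         ROM budget with a short Python calculation.  This is a
         back-of-the-envelope sketch only: it assumes 3 megabytes means
         3 x 1024 x 1024 bytes, ignores header overhead, and uses a
         hypothetical 30 percent share of ROM for speech.

```python
ROM_BITS = 3 * 1024 * 1024 * 8   # assumes "3 megabytes" means 3 MiB

def minutes_of_audio(rom_bits, kilobits_per_second):
    """Minutes of sound that fit in rom_bits at a constant average bitrate."""
    return rom_bits / (kilobits_per_second * 1000) / 60

def mixed_minutes(speech_share, music_kbps, speech_kbps):
    """Total minutes when speech_share of the ROM holds speech, the rest music."""
    music = (1 - speech_share) * minutes_of_audio(ROM_BITS, music_kbps)
    speech = speech_share * minutes_of_audio(ROM_BITS, speech_kbps)
    return music + speech

# Midpoints of the quoted ranges, then the low ends of the ranges.
print(round(mixed_minutes(0.30, 60, 30), 1))   # 9.1
print(round(mixed_minutes(0.30, 50, 20), 1))   # 12.2
```

         Under these assumptions the result lands near the lower end of the
         10-15 minutes of non-repeating sound mentioned earlier, compared
         with under one minute for uncompressed audio.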

   Hardware
         The DCS hardware is relatively simple.  The major components, a
         surface-mounted DSP chip, sockets for ROM, a 16-bit mono DAC,
         and an audio power amplifier, are contained on a two-layer printed
         circuit board measuring about eight inches square.  There are
         additional discrete analog and digital components making up filter,
         interface and power supply circuits.

         The DSP chip used is the ADSP-2105 by Analog Devices.  The 2105 runs
         at 40 MHz and executes over 10 million instructions per second.

         The 2105's on-chip RAM is supplemented with additional static RAM
         on the DCS board.  This memory is used primarily as temporary
         storage for the realtime decompression and inverse FFT operations
         required for each of the four playback channels.

         DCS contains a bi-directional, 8-bit interface between the sound
         board and the game host.  Playback commands from the game host
         can trigger anything from a single sound effect to a one-minute
         segment of music that loops indefinitely.  Commands from the host
         are asynchronously received and buffered and can arrive at a
         rate greater than 25 thousand per second, although a typical rate
         in game play is 10-100 per second.
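
         One common way to receive commands asynchronously and buffer them
         is a bounded first-in, first-out queue that the playback loop
         empties once per audio frame.  The Python sketch below shows that
         general idea only.  The class name, capacity, overflow policy and
         command values are hypothetical, and nothing here is taken from
         the actual DCS firmware.

```python
from collections import deque

FRAME_RATE_HZ = 31_250 / 240   # about 130 frames per second, from the text

class CommandBuffer:
    """Bounded FIFO for host commands (hypothetical, not the DCS firmware)."""

    def __init__(self, capacity=64):
        self._queue = deque()
        self._capacity = capacity
        self.dropped = 0

    def receive(self, byte):
        """Called whenever the host writes an 8-bit command."""
        if len(self._queue) >= self._capacity:
            self.dropped += 1          # assumed policy: drop the newest byte
            return False
        self._queue.append(byte & 0xFF)
        return True

    def drain(self):
        """Called once per audio frame; returns commands in arrival order."""
        pending = list(self._queue)
        self._queue.clear()
        return pending

buf = CommandBuffer(capacity=4)
for command in (0x10, 0x22, 0x3A):     # made-up command values
    buf.receive(command)
print(buf.drain())                     # [16, 34, 58]
print(buf.drain())                     # []

# At the typical peak of 100 commands per second, fewer than one
# command arrives per frame on average.
print(round(100 / FRAME_RATE_HZ, 2))   # 0.77
```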

         The sound board can also send timed data back to the game host.
         This data is useful for synchronizing animation, display effects and
         light shows with music and sound effects.

   Staff
         Several full-time composers and one freelance composer produce sound
         effects, voice overs and music for all of the pinball and video game
         projects.

         Jon Hey has been at Williams for several years.  The Jon Hey Band
         recently headlined at the Chicago Film Makers benefit and performed
         at the Bucktown Arts Festival in Chicago.  Vince Pontarelli joined
         Williams two years ago, having previously worked at Libman Music, a
         well-known Chicago jingle and post-score house, and as a writer for
         Todd Scales at Windsound.  Dave Zabriskie also joined Williams fairly
         recently, having come from Premier pinball.  In addition to a long
         list of game sound credits, Dave has written extensively for
         orchestral and choral groups, including a recent string symphony for
         the Chicago String Ensemble.  Kevin Quinn is the newest addition to
         the Williams sound department.  Before coming to Williams, Kevin
         owned and operated Concord Music, a successful commercial jingle house
         in Chicago.  Freelancer Dan Forden works out of his own project studio.
         Dan has put together and worked out of several project studios over
         the last several years in connection with bands that he has both
         played in and recorded.

   Facilities
         Each staff composer has an office and a MIDI workstation.  Supporting
         the MIDI workstations is a fully equipped recording studio built
         around a Tascam M-2516 16-channel mixer.  Four patch bays connect
         the mixer to a DS-30 DAT recorder, a CD player, an Alesis ADAT and
         the outboard gear, which includes a Lexicon LXP-15, an Ensoniq DP/4,
         and various equalizers and compressors.  The studio is divided into
         two rooms, one of which is built around an isolation booth.

         The studio is used mostly for recording vocals and voice-overs in the
         isolation booth and for creating and editing sound effects on a
         Macintosh IIci running Digidesign Sound Tools.  Sound files are
         stored on removable 45 megabyte cartridges, making it easy for each
         person to manage the files for the game projects he is working on.

   Production
         Original music is composed for each game.  Each staff composer's
         MIDI workstation consists of a Mac Quadra 630 running Mark of the
         Unicorn Performer software, a Kurzweil K2000 keyboard, one or more
         Emu Proteus modules, Roland JV-1080 sound modules, an Ensoniq DP/4
         effects processor, an Alesis Quadraverb II, an Alesis Quadraverb
         GT and an Alesis D4 drum module.  Each workstation also has its own
         MIDI interface, mixer, compressor and equalizer.  For variety, there
         is also a Roland JD-990, a Korg 01W/FD, an Ensoniq SQR, two Akai
         S1100 samplers and an Alesis ADAT digital multi-track tape machine.

         In addition to the Mac Quadra, each composer also has a '486 or
         Pentium PC.  The PCs are used for hard disk recording and editing,
         using custom hardware and software that generates files for the
         data compression system.

         All music must be uniquely arranged and recorded for a game, even
         when the music is based on a film or television score.  For example,
         in Star Trek: The Next Generation pinball, which was arranged and
         scored by Dan Forden, the music package included an arrangement
         of the show's well known theme (based on the movie theme by Jerry
         Goldsmith) as well as several original Forden compositions.

         Sound effects for games fall into three basic categories:  sound
         effects pulled from CD, field recordings and sounds taken from film
         or video soundtracks.  Williams owns several sound effects libraries.
         A portable DAT recorder is used to make field recordings.  Many
         sounds, such as those used in the Indiana Jones pinball, are taken
         from sound effects submixes of film soundtracks.  Regardless of the
         source, almost every sound effect is custom edited and processed to
         work in the context of the game.

         Speech and voice effects are also an important part of the sound
         package for any game.  Anywhere from a fourth to a third of the
         available memory space is used for speech and voice effects which
         inform and entertain the player.  This is especially important in
         games based on movies or other high profile themes.  For example,
         each major cast member of the television series Star Trek: The Next
         Generation recorded material expressly for the pinball game.  Vocal
         talent is also recorded in the isolation booth located in the
         Williams sound department studio.  This was the case with all of the
         screams, grunts and moans heard in the Mortal Kombat II and 3 video
         games.

         Because Williams sound designers now work in a recording studio using
         professional tools, as opposed to programming music and sounds in
         assembly language, the time to complete a sound package for arcade
         pinball and video games has been cut roughly in half.  More
         importantly, the sound quality has been drastically improved with the
         advent of DCS.