A New Metric to Predict Listener Envelopment Based On Spherical Microphone Array Measurements and Higher Order Ambisonic Reproductions

Open Access
Dick, David Avi
Graduate Program:
Doctor of Philosophy
Document Type:
Date of Defense:
November 17, 2017
Committee Members:
  • Michelle Celine Vigeant, Dissertation Advisor
  • Michelle Celine Vigeant, Committee Chair
  • Daniel Allen Russell, Committee Member
  • Victor Ward Sparrow, Committee Member
  • John F Doherty, Outside Member
  • Bill Rabinowitz, Special Member
  • Concert hall acoustics
  • Architectural acoustics
  • Ambisonics
  • Spherical microphone array
  • Loudspeaker array
  • Listener envelopment
The objective of this work was to create a new metric to predict listener envelopment (LEV), the sense of being surrounded by the sound field, based on 32-channel spherical microphone array measurements taken in a number of venues and a series of listening tests. A spherical microphone array was used to investigate LEV because it supports both (a) high-resolution spatial analysis of the sound field in full 3D via beamforming techniques and (b) subjective listening tests using 3D reproductions of the sound fields over a loudspeaker array via Ambisonics. This work comprises three separate studies: a first study validating the spherical microphone array measurement system, a second study investigating LEV in a 2,000-seat concert hall, and a third study in which a new metric is proposed to predict LEV based on listening tests using measurements obtained in seven additional halls.
The first study validated the spherical microphone array measurement system. Spatial room impulse response (IR) measurements were taken in a 2,500-seat auditorium to determine how room acoustic metrics measured with a spherical microphone array compare to those measured with the traditional microphone setup (an omnidirectional and figure-8 microphone pair). Measurements were obtained at six receiver locations with three repetitions each to evaluate repeatability. The metrics considered in this study were reverberation time (T30), early decay time (EDT), clarity index (C80), strength (G), lateral energy fraction (J_LF), and late lateral energy level (L_J). For the spherical array measurements, the omnidirectional (monopole) and figure-8 (dipole) patterns were extracted via spherical harmonic beamforming. The measurements were found to be consistent across both repetitions and microphone configurations. The results from this study indicate that spherical microphone arrays can be used both to measure existing LEV metrics and to develop a new metric to predict LEV.
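Two of the metrics listed above, C80 and J_LF, follow directly from an omnidirectional and a figure-8 impulse response. The block below is a minimal sketch, assuming the standard energy-ratio definitions (as in ISO 3382-1), broadband IRs that are already trimmed so that sample 0 is the direct-sound arrival, and a figure-8 pattern whose null points at the source. The function names are hypothetical and are not taken from the dissertation.

```python
import numpy as np


def energy(ir, fs, t_start, t_end=None):
    """Sum of squared samples of `ir` from t_start to t_end (seconds).

    Assumes sample 0 of `ir` is the direct-sound arrival.
    """
    i0 = int(round(t_start * fs))
    i1 = len(ir) if t_end is None else int(round(t_end * fs))
    return float(np.sum(np.square(ir[i0:i1])))


def clarity_c80(omni_ir, fs):
    """Clarity index in dB: energy in 0-80 ms over energy after 80 ms."""
    early = energy(omni_ir, fs, 0.0, 0.080)
    late = energy(omni_ir, fs, 0.080)
    return 10.0 * np.log10(early / late)


def lateral_fraction_jlf(omni_ir, fig8_ir, fs):
    """Lateral energy fraction: figure-8 energy in 5-80 ms over
    omnidirectional energy in 0-80 ms (a linear ratio, not dB)."""
    lateral = energy(fig8_ir, fs, 0.005, 0.080)
    total = energy(omni_ir, fs, 0.0, 0.080)
    return lateral / total
```

In practice these quantities are usually reported per octave band, so each IR would be band-filtered before being passed to these functions. With a spherical array, `omni_ir` and `fig8_ir` would be the monopole and dipole beamformer outputs rather than signals from two physical microphones.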
The second study examined LEV using spherical microphone array IRs obtained in a 2,000-seat concert hall at several receiver locations and hall absorption settings. The IRs were analyzed using a 3rd-order plane wave decomposition (PWD) beamformer. Additionally, the IRs were convolved with anechoic music, processed for 3rd-order Ambisonic reproduction, and presented to subjects over a 30-loudspeaker array. Instances were found in which the energy in the late sound field did not correlate with LEV ratings as well as the energy in a 70 to 100 ms time window did. Follow-up listening tests were conducted with hybrid IRs containing portions of a highly enveloping IR and a highly unenveloping IR, with crossover times ranging from 40 to 140 ms. Additional hybrid IRs were studied in which portions of the spatial IRs were collapsed into all-frontal energy, with crossover times ranging from 40 to 120 ms. The tests confirmed that much of the important LEV information exists in the early portion of these IRs.
In the third and final LEV study, spherical microphone array IRs were obtained in seven additional halls of various sizes and shapes. The IRs were used for listening tests with three types of stimuli: stimuli presented as measured, which retained level differences; stimuli equalized for level differences; and hybrid stimuli generated by combining portions of enveloping IRs and unenveloping IRs. A new metric, named mid-late spatial energy (J_S), was developed by integrating energy from a 3rd-order PWD of the room IRs as a function of frequency, azimuthal angle, elevation angle, and time, and adjusting the integration limits to maximize the correlation between the integrated energy and LEV ratings.
The difference in overall level between halls was found to be highly correlated with the perception of LEV. For level-equalized stimuli, the correlation was maximized by integrating energy from 60 ms to 400 ms while rejecting sound arriving within ±20° in azimuth of the front and within ±70° in azimuth of directly behind the listener. This new metric has a higher correlation with LEV ratings than the currently used metric, late lateral energy level (L_J).
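The time and azimuth limits reported above can be illustrated with a short sketch. It is not the dissertation's definition of J_S: it assumes the PWD output is available as an array `beams` of shape (directions, samples) with one azimuth per look direction (0° at the front, ±180° behind), weights all directions equally, treats the angular limits as inclusive, and omits the frequency and elevation limits and any normalization, none of which are given in the abstract. All names are hypothetical.

```python
import numpy as np


def wrap_deg(az_deg):
    """Wrap azimuth(s) in degrees to the interval [-180, 180)."""
    return (np.asarray(az_deg, dtype=float) + 180.0) % 360.0 - 180.0


def direction_mask(az_deg, front_half_width=20.0, rear_half_width=70.0):
    """True for look directions that are kept.

    Rejects directions within `front_half_width` of the front (0 deg)
    and within `rear_half_width` of directly behind (180 deg).
    """
    a = np.abs(wrap_deg(az_deg))
    return (a >= front_half_width) & (a <= 180.0 - rear_half_width)


def windowed_directional_energy(beams, az_deg, fs,
                                t_start=0.060, t_end=0.400):
    """Sum squared beam signals over the kept directions and time window.

    Assumes sample 0 is the direct-sound arrival and that every
    direction carries equal weight (an assumption made for this sketch).
    """
    i0 = int(round(t_start * fs))
    i1 = int(round(t_end * fs))
    keep = direction_mask(az_deg)
    return float(np.sum(np.square(beams[keep, i0:i1])))
```

With the default limits, the kept directions are those whose absolute azimuth lies between 20° and 110°, which is the side region left after the frontal and rear sectors are removed.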