Stimulus presentation

Audio Hardware

Earphones (see also Resources.Earphones)

Commercially available headphones vary in their frequency response and in their ability to isolate external sounds. For example, Apple EarPods reproduce frequencies below 100 Hz and above 10 kHz somewhat poorly, but within this range the reproduction accuracy is fairly good. If particularly low or high frequencies are needed, the experimenter should consider shipping participants headphones known to have a good frequency response in the desired range. If perfect frequency reproduction is not particularly important for the experiment, typical commercially available headphones may be sufficient. Earbuds also tend to be worse at isolating environmental noise, so unless participants are guaranteed to be in a quiet environment, in-ear or closed-back headphones may be better for ensuring that the experimental sounds are clearly audible over the participant’s environment. Finally, wireless Bluetooth headphones may receive interference from other Bluetooth devices in the area and may lose segments of the audio signal as a result, so participants should use wired headphones whenever possible.


Loudspeakers vary more than earphones in their ability to faithfully reproduce stimuli, and they also interact with the acoustic characteristics of the listening environment. Generally speaking, small speakers with a single driver cannot reproduce the full spectrum of sounds. In particular, laptop loudspeakers often reproduce low frequencies poorly and should be used with caution in most experiments. Encourage participants to avoid listening to loudspeakers in rooms with bare walls and floors, as reverberation from the room may interfere with hearing the stimuli. If the participant is expected to sit at a certain distance from the loudspeaker to control for level or speaker characteristics, one option is to send something like a yoga or floor mat along with the speaker to show exactly where the participant should sit and where the speaker stand should be set up.
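
To illustrate why seating distance matters for level, the following minimal sketch estimates how much the level changes when a participant sits closer to or farther from the loudspeaker than intended. It assumes an idealized point source in a free field (the inverse-square law), which is an assumption of this sketch and not a property of any real room; reflections will usually make the actual change smaller. The function name and the example distances are hypothetical.

```python
import math


def level_change_db(reference_distance_m: float, actual_distance_m: float) -> float:
    """Estimated level change (dB) when the listener moves from the
    reference distance to the actual distance.

    Assumption: point source in a free field (inverse-square law).
    Room reflections are ignored, so real changes are usually smaller.
    """
    if reference_distance_m <= 0 or actual_distance_m <= 0:
        raise ValueError("distances must be positive")
    return 20.0 * math.log10(reference_distance_m / actual_distance_m)


# Hypothetical example: level was set at 1.0 m, participant sits at 1.5 m.
print(round(level_change_db(1.0, 1.5), 1))  # -3.5 (dB)
```

Under these assumptions, doubling the distance lowers the level by about 6 dB, which is one reason a physical marker such as a mat can help keep presentation levels consistent.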

Sound Cards

The only way to precisely control audio levels or calibrate the frequency response of the device is to use a known combination of headphones/loudspeakers and a sound card. If the experimenter wants to provide all of the equipment necessary to complete an experiment, they can ship a whole computer/tablet with earphones or a loudspeaker to participants (e.g. PART). Alternatively, if the experimenter wants to control the auditory stimulus but allow participants to run research software on their own computer, they can ship an external sound card and earphones to the participant. In this case, the experimenter can calibrate the sound card and earphones together so that the level and frequency response of the audio hardware are known and can be precisely controlled in the experiment. The experimenter should provide easy-to-understand instructions for connecting the hardware to the participant’s computer and be available to help troubleshoot any issues the participant has. Relying solely on the participant’s computer is possible, albeit with less precise control of stimuli. One particular issue to watch for is that the standard sound drivers in Windows 10 try to shape the amplitude of sounds to avoid sudden onsets. This means Windows may ramp up the audio amplitude of a program during stimulus presentation, which is problematic for short sounds (less than approximately 1 s) and in some cases may even render short sounds inaudible. In some cases, Windows power settings may affect this behavior. Consider testing stimulus playback on a variety of computer platforms before running an experiment with participant-supplied hardware to ensure that specific platforms will not interfere with stimulus playback.
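
For the calibrated sound card and earphone combination described above, level control usually reduces to a small amount of arithmetic. The sketch below is a minimal illustration, not a calibration procedure: it assumes the playback chain is linear and that one reference measurement is available (a signal with a known digital RMS amplitude produced a known level in dB SPL). The function name and all numbers are hypothetical.

```python
def rms_for_target_level(measured_db_spl: float,
                         reference_rms: float,
                         target_db_spl: float) -> float:
    """Digital RMS amplitude expected to produce target_db_spl.

    measured_db_spl: level measured when a signal with digital RMS
        amplitude reference_rms (full scale = 1.0) was played through
        the calibrated sound card and earphones.
    Assumption: the playback chain is linear, so a change of X dB in
    digital amplitude gives a change of X dB in acoustic level.
    """
    if reference_rms <= 0:
        raise ValueError("reference_rms must be positive")
    gain_db = target_db_spl - measured_db_spl
    target_rms = reference_rms * 10.0 ** (gain_db / 20.0)
    if target_rms > 1.0:
        # An RMS above full scale cannot be played without clipping.
        raise ValueError("target level exceeds what the hardware can produce")
    return target_rms


# Hypothetical calibration: RMS 0.1 measured at 80 dB SPL; we want 65 dB SPL.
print(round(rms_for_target_level(80.0, 0.1, 65.0), 4))  # 0.0178
```

This only holds while the participant leaves the operating system and device volume at the settings used during calibration, so instructions to participants should say so explicitly.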

Audio playback

Sound file formats

Sound files can be saved in many formats. For experimental purposes, most labs use wav files, which save the exact signal that will be sent to the audio device. These files are ideal for reproducing stimuli, but they take up the most storage space and bandwidth. If a large amount of audio needs to be sent through an online server to the participant, you may want to consider some form of compression. The compression format you are likely most familiar with is mp3, which compresses sound files by removing information that is unlikely to matter for what most people can hear in music. This compression is lossy, which means that it does not perfectly preserve all of the details of the audio signal. mp3 has largely been superseded by m4a/mp4. As an example, Matlab’s audiowrite command can write mp4 files, but not mp3s. These lossy compression formats are based on assumptions about what music usually sounds like and what people are capable of hearing, so they may create unexpected artifacts when compressing psychophysical stimuli, such as quiet or bandlimited sounds. An alternative is lossless compression, such as flac. This will save some space, but whether it is worth converting audio files to flac from whatever format the lab normally uses is something the experimenter will need to decide.
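
To judge whether compression is worth considering, it can help to estimate how much data uncompressed wav stimuli will require. The following is a minimal sketch of that arithmetic for PCM audio; the function name, the default parameters, and the example stimulus set are hypothetical, and the small wav header is ignored.

```python
def wav_data_bytes(duration_s: float,
                   sample_rate_hz: int = 44100,
                   channels: int = 2,
                   bits_per_sample: int = 16) -> int:
    """Approximate size of the audio data in an uncompressed PCM wav file.

    Size = number of sample frames x channels x bytes per sample.
    The file header (a few dozen bytes) is not included.
    """
    n_frames = int(round(duration_s * sample_rate_hz))
    return n_frames * channels * (bits_per_sample // 8)


# Hypothetical experiment: 200 stereo stimuli, 2 s each, 16 bit, 44.1 kHz.
per_file = wav_data_bytes(2.0)
print(per_file)                # 352800 bytes per stimulus
print(200 * per_file / 1e6)    # 70.56 (megabytes for the whole set)
```

If the total is small relative to a participant's expected connection speed, keeping the stimuli as wav files avoids any risk of compression artifacts.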

Providing Audio Online

When listening to most media sources online, the audio is streamed, which means that portions of the signal are sent to the listener as they listen. This ensures that the listener doesn’t need to download an entire file before they can start listening, and it saves bandwidth if the listener skips the song or stops listening partway through. The downside of this approach is that playback can proceed faster than the listener receives the signal, which leads to pauses while the signal buffers. In most experiments this is undesirable. An alternative is to send the entire sound file in advance, which is what most JavaScript-based implementations, such as jsPsych, and Matlab Web Server do. These programs pause while sound files are sent to the participant and do not proceed until the participant has the entire file.

Some experiments generate audio files on the fly. This is somewhat difficult in an online format, as the website backend has to be able to create sound files (something that is possible but with limited functionality in JavaScript, and by extension jsPsych) and then send those sound files to the participant. If the sound files to be created are particularly complex, consider using Matlab Web Server, because it can handle both of these needs.

One thing to note is that web browsers try to be efficient with the files they request. If an experiment renders a sound file and sends it to the participant, then the next time a sound file with the same name is requested, the browser will assume it already has that file and play the one that was originally sent. This can be avoided by making each request for a sound file look unique, often by adding an irrelevant seed, such as the current date and time, to the request.

If stimuli are short (less than 45 seconds in duration), the audio can be generated on the fly within the browser using the AudioBuffer interface of the Web Audio API in JavaScript.
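
Returning to the caching issue described above, the idea of making each request look unique can be sketched as follows. This is a minimal illustration in Python rather than the code of any particular experiment platform; the same approach applies when building the request URL in JavaScript. The function name, the `nocache` parameter name, and the example URL are all hypothetical, and the sketch assumes the server ignores query parameters it does not recognize.

```python
import time
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def cache_busted_url(url: str, seed=None) -> str:
    """Return url with an extra, irrelevant query parameter appended.

    Because the full URL differs on every request, the browser treats
    it as a new resource instead of reusing a cached sound file.
    """
    if seed is None:
        # Current time in nanoseconds; any value that changes between
        # requests (timestamp, trial counter, random number) would do.
        seed = time.time_ns()
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("nocache", str(seed)))  # "nocache" is an arbitrary name
    return urlunsplit(parts._replace(query=urlencode(query)))


print(cache_busted_url("https://example.org/stimuli/trial.wav", seed=12345))
# https://example.org/stimuli/trial.wav?nocache=12345
```

A trial counter can be a better seed than the clock when requests need to be reproducible for debugging.
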
Code for stimulus generation and playback (via AudioBuffer) can be run, for example, within a jsPsych experiment using jsPsych’s event-related callback functions. An example of using this functionality to implement a three-alternative forced-choice modulation detection task can be found here. The psychophysical task example used this method as well.

Some experiments require audio to go to a single ear. In the lab this is readily achieved by using hardware controls to route a mono output to the left or right ear, but such control is not possible without sending participants hardware. An easy alternative is to create a mono audio signal and then combine it with a second channel of silence in a stereo file. This ensures that the audio goes to only one channel of a stereo device, so the sound plays in only one ear of a pair of headphones.
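
The single-ear approach described above can be sketched as follows. This is a minimal Python illustration using only the standard library, not a Matlab implementation; the function names, the output file name, and the example tone are hypothetical. It assumes 16-bit PCM output and input samples scaled between -1 and 1.

```python
import math
import os
import struct
import tempfile
import wave


def single_ear_frames(mono_samples, ear="left") -> bytes:
    """Interleave a mono signal with silence as 16-bit stereo frames."""
    if ear not in ("left", "right"):
        raise ValueError("ear must be 'left' or 'right'")
    frames = bytearray()
    for sample in mono_samples:
        clipped = max(-1.0, min(1.0, sample))
        value = int(round(clipped * 32767))
        # In a stereo wav file the left channel comes first in each frame.
        left, right = (value, 0) if ear == "left" else (0, value)
        frames += struct.pack("<hh", left, right)
    return bytes(frames)


def write_single_ear_wav(path, mono_samples, sample_rate_hz=44100, ear="left"):
    """Write a stereo wav file with the signal in one channel only."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(2)
        wav_file.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate_hz)
        wav_file.writeframes(single_ear_frames(mono_samples, ear))


# Hypothetical stimulus: 0.5 s, 1 kHz tone at half amplitude, left ear only.
sr = 44100
tone = [0.5 * math.sin(2 * math.pi * 1000 * n / sr) for n in range(sr // 2)]
out_path = os.path.join(tempfile.gettempdir(), "left_only.wav")
write_single_ear_wav(out_path, tone, sr, ear="left")
```

Participants can still wear headphones backwards or have channels swapped by their system, so a brief left/right check at the start of the session remains advisable.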

It may be useful to record experimental sessions. On Windows computers, one default audio input device is called ‘Stereo Mix’, which is essentially a recording of any audio the computer is playing. If you have a video or audio call open with a research participant, recording from the Stereo Mix device will capture the audio signal from the call. Note that you should have informed consent from the participant to make these recordings. These recordings can be used post hoc to determine whether the background noise in the participant’s test environment is acceptable and to re-analyze verbal responses provided during the experiment. It is also possible to record the audio the participant hears. This can be useful for determining whether stimuli are distorted or for checking that the participant was hearing the correct stimuli. To do this, have the participant share, through the video or audio call, both the browser the experiment is running in and their computer audio. That way, the audio stream the participant sends to the experimenter will contain what they were hearing, any environmental noise that their microphone picks up, and the participant’s verbal responses.
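
As one way to examine background noise in such a recording after the session, the sketch below computes the RMS level of a segment in dB relative to digital full scale (dBFS). This is only a relative measure and an assumption of this sketch: call software may apply gain control and noise suppression, so the value cannot be read as a sound pressure level and is best used to compare segments within the same recording. The function names are hypothetical, and the loader assumes 16-bit PCM wav files.

```python
import math
import struct
import wave


def load_first_channel(path):
    """Read the first channel of a 16-bit PCM wav file as floats in [-1, 1)."""
    with wave.open(path, "rb") as wav_file:
        if wav_file.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM files")
        n_channels = wav_file.getnchannels()
        raw = wav_file.readframes(wav_file.getnframes())
    values = struct.unpack("<" + "h" * (len(raw) // 2), raw)
    # Samples are interleaved, so take every n_channels-th value.
    return [v / 32768.0 for v in values[::n_channels]]


def rms_dbfs(samples) -> float:
    """RMS level of the samples in dB relative to full scale (1.0)."""
    if not samples:
        raise ValueError("no samples given")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence
    return 10.0 * math.log10(mean_square)


# Hypothetical segment with a constant magnitude of 0.1 of full scale.
print(round(rms_dbfs([0.1, -0.1, 0.1, -0.1]), 1))  # -20.0
```

Comparing the level of a silent interval with the level during the participant's speech gives a rough indication of how noisy the environment was.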


Desktop screens

Modern computer monitors are capable of high-quality display, but there is a wide variety of video processing and rendering options at the operating system, video card, and hardware levels. Critical aspects of image rendering (e.g. the colors and contrast that distinguish stimuli from one another and from backgrounds, legibility of text, absence of video artifacts) should be checked on the test hardware prior to an experiment. Rendering accuracy can be affected by screen resolution and by anti-aliasing. At low resolutions, stimuli may appear pixelated and lose some fidelity. Most modern computers default to reasonably good resolutions, but if a participant is using an older computer or their video drivers are misconfigured, their display may have a low resolution. Anti-aliasing is a rendering technique that smooths transitions between adjacent pixels to make a picture look higher fidelity. This is usually a good thing, but variability in anti-aliasing across displays could alter the clarity of images or the readability of text. Modern screens often include different processing modes that are optimized for games or movies. These modes vary from manufacturer to manufacturer, but they usually alter the throughput delay of the screen and may notably affect the color balance of images. If a known, fixed timing between auditory and visual events is desired, as is often the case in studies of audio-visual integration, it may be necessary to use the same calibrated hardware across participants. Additionally, keep in mind that video screens update at a much lower rate (usually 60 Hz) than audio devices, so there will be some variability in the timing between visual and auditory events across screens.
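
The effect of refresh rate on audio-visual timing can be made concrete with a little arithmetic. The sketch below is a simplified model, not a measurement: it assumes a visual change can only appear at the start of a screen refresh, that refreshes are perfectly regular and start at time zero, and it ignores the additional processing delay of the screen itself. The function names are hypothetical.

```python
import math


def frame_duration_ms(refresh_rate_hz: float) -> float:
    """Duration of one screen refresh in milliseconds."""
    return 1000.0 / refresh_rate_hz


def next_frame_onset_ms(requested_ms: float, refresh_rate_hz: float) -> float:
    """Earliest refresh at or after the requested onset time.

    Assumption: refreshes are regular and the first one starts at 0 ms.
    """
    n_frames = math.ceil(requested_ms * refresh_rate_hz / 1000.0)
    return n_frames * 1000.0 / refresh_rate_hz


print(round(frame_duration_ms(60), 2))           # 16.67 ms per frame
print(round(next_frame_onset_ms(110, 60), 2))    # 116.67 ms
```

In this simplified model, a visual event requested 110 ms after a sound would appear about 6.7 ms late on a 60 Hz screen, and the delay could be anywhere between zero and one frame depending on when the request falls relative to the refresh cycle.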


Participants should sit directly facing the screen. Stimuli should be designed to accommodate some back-and-forth sway of the participant’s head relative to the screen, as participants will not be completely still while performing a task. If attending to the visual stimulus is essential to the task, it may be helpful to have the experimenter monitor eye and head position through a video call during the experiment. Have the participant look at a fixation cross at the center of the screen and note the angle of the head and eyes on video, then monitor for obvious deviations from this position during stimulus presentation.

Tablet displays

Concerns about screens and placement apply to tablets, with the added concern that participants have more degrees of freedom for orienting their eyes and head relative to the display. Make sure stimuli are clearly visible even on small displays that are held far from the head, and consider what the experimenter should do if a participant accidentally drops the tablet or looks away while adjusting position.
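
One way to reason about visibility at different holding distances is to compute the visual angle a stimulus subtends. The sketch below uses the standard geometric formula for an object viewed straight on; the function name and the example sizes and distances are hypothetical, and it does not account for tilted or off-center viewing.

```python
import math


def visual_angle_deg(size_cm: float, distance_cm: float) -> float:
    """Visual angle (degrees) of an object of the given size viewed
    straight on from the given distance: 2 * atan(size / (2 * distance))."""
    if size_cm <= 0 or distance_cm <= 0:
        raise ValueError("size and distance must be positive")
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))


# Hypothetical 1 cm tall symbol on a tablet held at two distances.
print(round(visual_angle_deg(1.0, 40.0), 2))  # 1.43 degrees
print(round(visual_angle_deg(1.0, 70.0), 2))  # 0.82 degrees
```

Checking the angle at the farthest plausible holding distance gives a conservative estimate of how large stimuli need to be drawn.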

Head mounted displays

Another approach to visual stimulation is the use of commercially available head-mounted displays (HMD) intended for virtual reality (VR), such as Oculus Rift, Quest, HTC Vive, etc. Advantages of this approach include known and reproducible placement of the display and head (and possibly eye) tracking. Although the field of view is generally much narrower than natural vision, large visual displays can be simulated by tracking the head position and updating the eye-fixed display appropriately. This functionality is built into these devices and can be exploited using standard 3D game-programming techniques in development platforms such as Unity 3D and Unreal Engine. Calibration of video, tracking, and audio/video sync with HMD is beyond the scope of this article. However, although the capabilities of a specific device type and model should be assessed, modern manufacturing tends to produce units with very similar performance (as is the case for many tablet devices), so it may be reasonable to assume a standard level of performance, and field calibration of each individual unit may not be required.

Most head-mounted displays also feature some means to deliver audio stimulation, either through earphones attached to the unit or through small HMD-mounted loudspeakers. Where possible, earphones with good passive attenuation and a direct electrical (rather than acoustical) path to the ear are preferred. These may interface directly with the HMD or with a host PC, in which case audio concerns are similar to those discussed above. Bear in mind that software-based "3D audio" features may distort the binaural and spectral features of the audio in an attempt to compensate for head movements and simulate virtual audio sources. In general, commercial 3D-audio algorithms may not be well suited to research purposes, and investigators should consider whether "3D" audio is important to the goals of the study or should be disabled. Convincing (i.e. true) 3D audio can also be achieved using loudspeakers, although the HMD will interfere to some degree with the spatial acoustics, particularly at high frequencies.

In each case, consider whether the device type is specified, provided, or bring-your-own (BYO)

Many of the above issues can be handled with good precision on known devices. If participants use their own devices, consider adding quick checks at the beginning of the experiment to ensure essential details are visible. It may also help to ask participants if they use any additional video processing (e.g. accessibility options or night display modes) and to disable that processing if it interferes with the experimental stimuli.

Compatibility with clinical devices

Hearing aids

Earphones are generally not an option for aided listening. If a participant’s audiogram is known in advance, stimuli presented through earphones can be amplified to improve audibility when listening without hearing aids. The experimenter should take care to check that amplification does not produce uncomfortably loud stimuli. Participants with hearing aids that fit entirely in the ear canal may be able to use their hearing aids while listening through earphones, although this should be tested to check for undesired physical or acoustic interactions between the earphone and the hearing aid. Loudspeaker presentation is an option for aided listening, although the experimenter should take care to ensure that participants are oriented consistently relative to the loudspeaker to avoid confounding differences in hearing aid directionality. Some hearing aids also have Bluetooth streaming capabilities, which could enable a direct connection from a computer to the hearing aid. It may be helpful to obtain permission from participants to contact their audiologists for information on how the device is programmed, as various settings (noise reduction algorithms, directional microphones, compression) will alter how acoustic stimuli are processed across individuals.

Cochlear implants

As with hearing aids, headphones do not provide good aided listening for participants with cochlear implants. However, there are published studies that used circumaural headphones to present stimuli (Grantham et al., 2008 and Goupell et al., 2018). Loudspeaker presentation is an option, and some cochlear implants have direct-connection audio jacks and/or Bluetooth streaming capabilities. Implants tend to process a narrower frequency range than acoustic hearing, so at-home audio devices (e.g. laptop speakers) may have a sufficient frequency response in the range that the cochlear implant processes, but this should be experimentally verified.

Other devices

The advice for hearing aids and cochlear implants may generalize to other assistive devices (e.g. bone-anchored hearing aids, auditory brainstem implants), but it is up to the experimenter to determine whether stimuli are being heard as intended. If you have experience with remote testing using specific devices, please share your advice here.