Issues related to collection of response data
Remote testing, as understood by this Task Force, involves the collection of data ("responses") from research participants interacting with a response device such as a paper form, web survey, tablet app, or VR game. This article attempts to describe some considerations that might apply to the collection and processing of response data.
Comparison to in-lab response collection (see also Task Performance)
Assuming that appropriate hardware/software resources can be provided to the participant, the types of response data that can be collected during remote testing do not, in principle, differ from those available during in-lab testing. However, the types of response data that are most easily accessed will depend on the remote-testing platform and the types of tasks the platform is intended to implement.
Thus, a superficial but useful distinction can be made between two major types of tasks:
- survey-based tasks include a series of different question/answer (or stimulus/response) pairs
- trial-based tasks present a repeating series of similar stimulus/response pairs
For example, a typical survey might include a series of questions and question-specific response choices:
- How loud would you consider your workplace? [Scale of 0-100]
- How often do you use hearing protection at work? [Five-point Likert scale, "Never" to "Always"]
- Does your workplace provide earplugs? [Yes / No]
- List the main noise sources present in your workplace: [free response]
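Survey structures like the one above are easy to represent in data. The sketch below is a hypothetical encoding of these four questions as a list of records, one per response type, with a simple validity check; the field names and types are illustrative and not tied to any particular platform.

```python
# Hypothetical encoding of the example survey: one record per question,
# each declaring its response type and allowable answers.
survey = [
    {"id": "q1", "text": "How loud would you consider your workplace?",
     "type": "scale", "range": (0, 100)},
    {"id": "q2", "text": "How often do you use hearing protection at work?",
     "type": "likert",
     "choices": ["Never", "Rarely", "Sometimes", "Often", "Always"]},
    {"id": "q3", "text": "Does your workplace provide earplugs?",
     "type": "choice", "choices": ["Yes", "No"]},
    {"id": "q4", "text": "List the main noise sources present in your workplace:",
     "type": "free_response"},
]

def validate(question, answer):
    """Check a response against the question's declared response type."""
    if question["type"] == "scale":
        lo, hi = question["range"]
        return lo <= answer <= hi
    if question["type"] in ("likert", "choice"):
        return answer in question["choices"]
    return isinstance(answer, str)  # free response: any text is accepted
```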
A typical trial-based task, by contrast, would present the same type of trial repeatedly to assess a distribution of responses (for example, repeatedly asking which of two sounds was louder).
Note that either type of task could be administered in the lab or through remote testing. Many web-based platforms, however, are oriented primarily toward survey-based tasks, whereas trial-based tasks are more often implemented as standalone programs (e.g. MATLAB programs, PC applications, or tablet apps). Fortunately, this is unlikely to pose a problem for remote testing: most trial-based tasks can be reframed as survey-based tasks by treating each "trial" as a survey "question", although platforms vary in their support for common trial-based approaches such as randomized presentation order and adaptive presentation (using performance to select the next trial).
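As one illustration of adaptive presentation, the sketch below implements a simple 2-down/1-up staircase that uses the participant's recent performance to select the next trial's stimulus level. The function name, step size, and level convention are assumptions for the example, not features of any specific platform.

```python
# Illustrative 2-down/1-up staircase: two consecutive correct responses
# make the next trial harder (lower level); any incorrect response makes
# it easier (higher level). Step size and units are arbitrary here.
def next_level(level, recent_correct, step=2):
    """Return the stimulus level for the next trial.

    recent_correct: list of booleans for trials at the current level,
    most recent last.
    """
    if len(recent_correct) >= 2 and recent_correct[-2:] == [True, True]:
        return level - step          # two correct in a row: harder
    if recent_correct and recent_correct[-1] is False:
        return level + step          # last response incorrect: easier
    return level                     # otherwise, stay at the same level
```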
Another difference between survey- and trial-based tasks has to do with whether individual participants complete the task once (as typical for a survey) or many times (as typical for trial-based tasks). Different considerations may apply to data handling (managing one vs. many data files per participant), counter-balancing conditions across repeated trial-based runs, randomizing question order across survey versions assigned to different participants, etc. Platforms may vary in their suitability for administering a survey task once to each of many (possibly anonymous) participants versus tracking a smaller number of participants across multiple sessions of trial-based tasks.
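One common approach to per-participant randomization, sketched below, is to seed the shuffle with the participant ID so that each participant receives a reproducible question order; this simplifies data handling if a session is interrupted and resumed. The ID scheme here is hypothetical.

```python
import random

# Deterministic per-participant question order: the same participant ID
# always yields the same shuffle, while different IDs yield different orders.
def question_order(participant_id, n_questions):
    order = list(range(n_questions))
    random.Random(participant_id).shuffle(order)
    return order
```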
Types of response data that may be collected during remote testing
Most relevant to the purpose of this article are the types of response data collected during survey- vs trial-based tasks. There is no hard distinction between these, and most (all?) response types (multiple choice, rating scale, head pointing, pupil dilation) could, in principle, be used in either survey- or trial-based tasks. However, certain types of responses are commonly encountered in survey-based tasks, and these are the most widely supported across remote-testing platforms.
Note that the availability of specific response devices (buttons, sliders, etc.) and response data types may be limited by platform. For example, Gorilla (see Platforms) supports the following response types:
- Continue Button
- Response Button (optionally featuring Text, Paragraph, or Image content)
- Rating Scale/Likert
- Response Slider
- Keyboard Response (Single or Multi key press)
- Keyboard Space to Continue
- Response Text Entry (Single/Multi line / Area)
Some of these options can provide immediate response processing (e.g. the response is recorded as soon as a button is clicked), which may support some degree of timing data or even conditional/adaptive processing. Other response types (e.g. text entry) require a second step, such as "click to continue," to record the response.
Other platforms, particularly non-browser platforms such as PC or tablet apps, may offer a wider range of response types, including:
- Touch Response (one or two dimensions)
- Multi-touch / Gesture Response (e.g. swipe left or right)
- Tilt / Acceleration Measures
- Special Hardware Support
- Camera or Depth Camera
- Tracked Controllers (head-mounted display / VR touch controllers)
- Physiological Sensors (heart rate, GSR, EKG, EEG, pupillometry, eye tracking, etc.)
Platforms that provide additional support for accessing on-device sensors or hardware controllers include:
- Most PC or custom app frameworks (MATLAB, Windows, Android Studio, Xcode, Unity, etc.)
- Some commercial platforms for physiological measurement, audiology, telehealth
Speech / Audio response collection
In-lab testing of speech perception (for example) often combines open-set responding ("Repeat the word BLUE") with in-person scoring by a human observer. Some platforms allow synchronous interaction between experimenter and remote participant which can support a similar approach. However, low quality audio/video streaming, dropouts, or distortion might disrupt accurate scoring. A few approaches may be used to support open-set data collection:
- Audio or AV recording and storage of responses for later verification
- Potential challenges: defining the response window, storing/transmitting audio/AV data files, ensuring participant privacy.
- Redundant response collection or self-scoring after feedback (e.g. "Say the word, then type it", or "Check box if you were correct")
- Potential challenges: reconciling mismatched responses
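For the redundant-response approach, typed responses can be scored automatically against the target word, with mismatches flagged for later human review rather than discarded. The sketch below is entirely illustrative; the normalization rules and return labels are assumptions.

```python
import string

# Score a typed (redundant) response against the spoken target word.
# Normalization strips whitespace, case, and punctuation before comparing,
# so " blue. " matches the target "BLUE".
def score_typed_response(target, typed):
    def norm(s):
        return s.strip().lower().translate(
            str.maketrans("", "", string.punctuation))
    if norm(typed) == norm(target):
        return "correct"
    return "flag_for_review"  # reconcile mismatched responses offline
```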
Response Calibration (see also Calibration)
Although most survey-type responses should be interpretable in an absolute sense and thus require no calibration to determine their value, some continuously variable response data (for example, from touch displays or tilt/force sensors) may require psychophysical or hardware calibration. See Calibration for more details.
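A minimal form of hardware calibration for a continuous response device is a two-point linear mapping: measure the raw readings at two known reference values, then map subsequent raw readings onto the calibrated scale. The sketch below assumes such a linear device response; the reference values are arbitrary examples.

```python
# Two-point linear calibration: given raw sensor readings at two known
# reference values, return a function mapping raw readings to the
# calibrated scale.
def make_calibration(raw_lo, raw_hi, ref_lo, ref_hi):
    gain = (ref_hi - ref_lo) / (raw_hi - raw_lo)
    return lambda raw: ref_lo + (raw - raw_lo) * gain
```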