Since the 2016 Supreme Court decision in Tyson Foods, Inc. v. Bouaphakeo, plaintiffs in class action lawsuits have increasingly relied on statistical evidence to prove liability on a class-wide basis.
Consistent with this trend, reliance on statistical evidence collected through surveys has become more common in class action litigation. While survey-based evidence can offer a valid approach to address class-wide questions, for this evidence to be reliable, the collection and interpretation of the data must conform to statistical standards.
In some class action litigation, surveys are used to support claims that a class of consumers was misled by certain representations. For example, in cases brought under the Fair Debt Collection Practices Act, communications between financial institutions and borrowers may be called into question.
A survey may be undertaken to assess the impact of the communications at issue, and this survey may be used as an empirical tool to assess perceptions and attitudes for a sample of relevant consumers.
An initial step in the survey process is identifying the correct population of interest (i.e., relevant consumers) to sample. Sampling involves selecting a subset of individuals from this population of interest. One common error in the development of survey-based evidence is population misspecification, which occurs when a sample is drawn from the wrong population. For example, the relevant consumers may be homeowners, but the sample is drawn from a population of renters. This type of error may result from a failure to understand the question at issue or, when the question is well understood, from targeting the wrong group of individuals to address it.
Population misspecification is of particular concern when the population of interest cannot be surveyed directly or the perception of interest is defined based on a legal construct. For example, cases brought under the Fair Debt Collection Practices Act may claim communications between a lender and borrowers are misleading. The legal standard in these cases is the potential to deceive the “least sophisticated consumer”—a figurative individual who has some basic context, information, and understanding. In these cases, a random sample of consumers is unlikely to match the profile defined by the least sophisticated consumer. To bridge the information gap between what respondents know and what a least sophisticated consumer is assumed to know according to the applicable legal standards, background information related to the communications at issue may be provided to survey respondents. Providing respondents background information, however, may be insufficient to generate reliable survey data. The effectiveness of this approach is dependent on the accuracy and completeness of the information provided to respondents as well as the survey’s ability to educate the respondents.
Additionally, there is a risk of selection error. The goal of sampling is to select a subset of the population of interest that is representative. If only individuals with a specific profile respond to the survey, any analyses based on the survey may produce biased results, as responses will not be representative of the entire population of interest. Hence, it is important to understand whether those surveyed are representative of the entire population of interest.
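As an illustrative sketch, a simple goodness-of-fit comparison can flag when a sample's composition diverges from known population benchmarks. The counts and the homeowner/renter split below are hypothetical; in practice, benchmark shares would come from an authoritative source such as census data:

```python
# Hypothetical representativeness check: compare the composition of
# survey respondents to known population benchmark shares.
# All figures below are invented for illustration only.

def chi_square_stat(observed_counts, expected_shares):
    """Chi-square goodness-of-fit statistic comparing observed category
    counts against benchmark population shares."""
    total = sum(observed_counts)
    stat = 0.0
    for obs, share in zip(observed_counts, expected_shares):
        expected = total * share
        stat += (obs - expected) ** 2 / expected
    return stat

# 400 respondents: 300 homeowners and 100 renters, versus a (hypothetical)
# benchmark population that is 65% homeowners.
stat = chi_square_stat([300, 100], [0.65, 0.35])

# With 1 degree of freedom, values above roughly 3.84 suggest the sample's
# composition differs from the benchmark at the 5% significance level.
print(round(stat, 2))  # prints 17.58
```

A large statistic, as here, would prompt further scrutiny of whether the respondents can support conclusions about the full population of interest.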
A next step in the survey process is designing and implementing a questionnaire aimed at collecting the information of interest. To ensure the validity of survey responses, the questionnaire may include questions aimed at assessing response quality. For example, questions can be introduced to assess respondents’ level of understanding of background information and to determine whether it is reasonable to assume that respondents have a basic understanding of the relevant context. If the background information is too lengthy or complex, it may be determined that the survey design is not effective for putting the survey respondent in the situation of a consumer in the population of interest.
The reliability of survey responses also depends directly on the design of both open- and close-ended survey questions. Open-ended questions may be broad or interpreted broadly, which may increase response time and produce a wide range of views and opinions. Processing these responses can be challenging. Typically, verbatim responses are processed and coded into data that can be analyzed. The reliability of coded data, however, may be compromised if processing verbatim responses requires a subjective evaluation by an analyst. Close-ended questions, on the other hand, restrict responses to a limited number of options, which narrows the breadth of the responses to the alternatives offered in the questionnaire. This helps avoid the bias associated with a subjective evaluation of survey responses by an analyst. Bias, however, may be introduced through the choice of response options. Further, if the response options are not exhaustive, respondents who are forced to choose among limited alternatives may select an option that does not accurately reveal their perceptions.
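A minimal, hypothetical sketch of how verbatim open-ended responses might be coded into analyzable data follows. The keyword rules and code labels are invented for illustration; the fallback to analyst review marks the point where subjective judgment, and thus potential bias, can enter:

```python
# Hypothetical, simplified coding scheme for open-ended responses.
# Real coding frames are developed and validated by analysts; the
# keyword-to-code mapping below is illustrative only.

CODES = {
    "misled": "perceived_deception",
    "confusing": "confusion",
    "clear": "no_confusion",
}

def code_response(verbatim: str) -> str:
    """Map a verbatim response to a code. Responses matching no rule
    return 'uncoded' and would require analyst review, which is where
    subjective evaluation can affect the reliability of the coded data."""
    text = verbatim.lower()
    for keyword, code in CODES.items():
        if keyword in text:
            return code
    return "uncoded"

print(code_response("The letter was confusing about the amount owed"))
# prints: confusion
```

Even a mechanical scheme like this embeds analyst choices (which keywords, which codes), so the coding frame itself should be documented and validated.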
Once the survey data are collected, statistics based on the sample responses can be developed to draw conclusions about the population of interest. By construction, a sample is a subset of the population of interest and therefore subject to sampling error. Sampling error is the difference between the value of the parameter of interest in the population (for example, the population mean) and its sample analog calculated from the sample data (for example, the sample mean). The size of this error can be quantified if probability sampling methods are employed, such as simple random sampling, stratified sampling, or cluster sampling.
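The distinction between a population parameter and its sample analog can be illustrated with a short simulation. The synthetic "population" and the sample size below are assumptions made purely for demonstration:

```python
import random

# Illustrative sketch of sampling error under simple random sampling.
# The population and sample size here are synthetic, for demonstration only.

random.seed(0)
population = [random.gauss(50, 10) for _ in range(10_000)]  # synthetic responses
pop_mean = sum(population) / len(population)  # the population parameter

n = 200
sample = random.sample(population, n)  # simple random sample
sample_mean = sum(sample) / n          # the sample analog

# Sampling error: difference between the sample statistic and the
# population parameter it estimates.
error = sample_mean - pop_mean

# Under simple random sampling, the standard error of the mean can be
# estimated from the sample itself, which is what makes the likely size
# of the sampling error quantifiable.
sample_var = sum((x - sample_mean) ** 2 for x in sample) / (n - 1)
std_error = (sample_var / n) ** 0.5
```

Because the standard error shrinks with the square root of the sample size, quantifying it also informs whether a sample is large enough to support the precision a conclusion requires.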
Further, even rigorous design, implementation, and analysis of survey data are not sufficient to conclude that the results derived from a survey are reliable. The tractability and quality of the survey design define the boundaries of the information collected by the survey. These boundaries also constrain the breadth of the conclusions that can be drawn from the survey analysis. In some instances, a technically valid and sound survey simply may not provide the empirical basis to support a proposition.
In sum, the reliability of survey evidence is dependent on its conformance with rigorous statistical standards, as well as boundaries inherent in the information collected by the survey.