Huffington Post / YouGov Public Opinion Polls

Methodology

The HuffPost/YouGov poll is a collaborative effort of the Huffington Post and YouGov, which share responsibility for survey content and the costs of data collection. Each survey consists of approximately 1,000 completed interviews among U.S. adults. The sample is selected from YouGov’s opt-in online panel, which covers all 50 states plus the District of Columbia, to match the demographics and other characteristics of the adult U.S. population.

This methodology differs from a traditional telephone poll in a number of ways. Typically, telephone polls work by randomly sampling working numbers (or numbers sampled from an official list of registered voters). For polls conducted on the internet, there is no comparable mechanism for drawing a random sample of all email addresses or other online accounts. YouGov approaches this problem by recruiting a large panel of internet users who have agreed to participate in online surveys. This panel is itself not representative of the U.S. population, but samples are drawn from that panel to match a random sample of respondents drawn from the Census Bureau’s American Community Survey (more information available below).

Questions asked on the HuffPost/YouGov poll are administered as part of a daily omnibus process on YouGov’s internet panel. These surveys are conducted in English only. In most cases, the panelists are directed to various other surveys after answering the questions presented here. All results are presented with the actual question wording, the sample size, and the margin of sampling error (see below for further details) for the particular survey being reported.

The target population for the survey is adults aged 18 or older residing in the U.S. Sample respondents are drawn from YouGov’s panel, an opt-in internet panel that is recruited primarily through internet advertising. All panelists are, by necessity, at least occasional internet users.

The sample is selected to approximately match the joint distribution of age, race, gender, and education in the 2016 American Community Survey (ACS). This is a purposive, rather than random, method of selection, designed to eliminate selection bias and non-coverage of the target population in the panel from which respondents were drawn. Email invitations are sent to panelists based upon their demographics. As panelists arrive at YouGov’s website, they are assigned to take a survey so that the demographics of each sample approximately match those of the U.S. adult population, not those of either the internet population or the panel.
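As a rough illustration of the matching idea, the Python sketch below pairs each record in a target sample with the closest still-unused panelist, where closeness is simply the number of demographic fields that differ. The field names, the distance measure, and the greedy pairing strategy are all assumptions made for illustration; they are not a description of YouGov’s actual matching procedure.

```python
# Illustrative sketch only: field names, the mismatch-count distance, and the
# greedy pairing are assumptions, not YouGov's production method.
KEYS = ("age_group", "race", "gender", "education")

def mismatch_count(a, b, keys=KEYS):
    """Number of demographic fields on which two records differ."""
    return sum(a[k] != b[k] for k in keys)

def match_sample(target_rows, panel_rows, keys=KEYS):
    """For each target record, pick the closest panelist not yet used."""
    available = list(range(len(panel_rows)))
    matched = []
    for t in target_rows:
        if not available:
            break  # panel exhausted before every target record was matched
        best = min(available, key=lambda i: mismatch_count(t, panel_rows[i], keys))
        matched.append(panel_rows[best])
        available.remove(best)
    return matched

# Hypothetical target records (standing in for a draw from the ACS).
target = [
    {"age_group": "65+", "race": "Latino", "gender": "M", "education": "HS"},
    {"age_group": "30-44", "race": "White", "gender": "F", "education": "College"},
]
# Hypothetical panel in which college-educated white women are over-represented.
panel = [
    {"id": 1, "age_group": "30-44", "race": "White", "gender": "F", "education": "College"},
    {"id": 2, "age_group": "30-44", "race": "White", "gender": "F", "education": "College"},
    {"id": 3, "age_group": "65+", "race": "Latino", "gender": "M", "education": "College"},
]
matched = match_sample(target, panel)
print([p["id"] for p in matched])  # prints [3, 1]
```

The matched sample mirrors the target records rather than the panel: only one of the two similar panelists is selected, and the closest available (though imperfect) match is used for the first target record.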

Some types of people are more likely to join online panels than others. For example, a 32-year-old white female with a college degree is more likely to join the panel than a 64-year-old Latino male with a high school degree. For this reason, each individual is given a “propensity score,” which helps determine whether they will be included in a sample for a given poll. People with a high propensity score are less likely to be included in a sample, while people with a low propensity score are more likely to be included. Propensity scores for sample inclusion are estimated using a logistic regression. The predictors in the propensity score model include the demographics described above (age, race, gender, and education) plus marital status, number of children, household ownership, frequency of internet use, interest in politics, and past voting behavior. (The variables for each study are selected using cross-validation. The estimated propensity score equation is available upon request.) The matched sample is then weighted using propensity score deciles, as described in Ansolabehere and Rivers (2013).
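The decile weighting step can be sketched in Python as follows. This is a simplified illustration, not the procedure from Ansolabehere and Rivers (2013) itself: it assumes the propensity scores have already been estimated, that decile cut points are taken from the scores of a target (reference) sample, and that each respondent’s weight is the ratio of the target share to the sample share in their decile. The function and variable names are hypothetical.

```python
import bisect
import statistics

def decile_weights(sample_scores, target_scores):
    """Weight each respondent by (target share) / (sample share) of the
    propensity-score decile they fall into, scaled to a mean weight of 1."""
    # Nine cut points that split the target scores into ten groups.
    cuts = statistics.quantiles(target_scores, n=10)

    def bin_of(score):
        return bisect.bisect_right(cuts, score)  # decile index 0..9

    target_counts = [0] * 10
    for s in target_scores:
        target_counts[bin_of(s)] += 1
    sample_counts = [0] * 10
    for s in sample_scores:
        sample_counts[bin_of(s)] += 1

    n_t, n_s = len(target_scores), len(sample_scores)
    weights = []
    for s in sample_scores:
        b = bin_of(s)  # sample_counts[b] >= 1 because s itself is in bin b
        weights.append((target_counts[b] / n_t) / (sample_counts[b] / n_s))

    # Rescale so the weights average 1 (matters when some deciles are empty).
    mean_w = sum(weights) / len(weights)
    return [w / mean_w for w in weights]

# Hypothetical scores: the target is spread evenly over (0, 1), while the
# sample has three times as many high-propensity respondents as it should.
target_scores = [(i + 0.5) / 1000 for i in range(1000)]
sample_scores = [0.95] * 30 + [d / 10 + 0.05 for d in range(9) for _ in range(10)]
weights = decile_weights(sample_scores, target_scores)
print(round(weights[0], 2), round(weights[-1], 2))  # prints 0.4 1.2
```

In this toy example the over-represented high-propensity respondents are weighted down to 0.4 each, and everyone else is weighted up to 1.2.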

Because invitations are shared across multiple studies, it is impossible to identify a participation rate for individual studies. Per-survey break-off and skip rates and monthly average participation rates are available on individual survey pages. All respondents are “cookied” to discourage attempts to take the survey multiple times.

Many interpret the “margin of error,” commonly reported for public opinion polls, as accounting for all potential errors from a survey. It does not. There are many non-sampling errors, common to all surveys, that can include effects due to question wording and misreporting by respondents. In a telephone survey, which begins with a random sample of phone numbers, such errors can occur due to those not covered by the sample, those who cannot be reached and those who do not respond to the survey. With YouGov’s sampling methodology, errors can occur due to a failure to fully correct for the non-representative nature of the online panel (or, more specifically, due to variables not included among the matching and weighting variables).

YouGov reports a margin of sampling error for its surveys because, like all polls, the results are subject to random variability that is an inherent part of the sampling process. The reported margin of error is computed using the standard error formula described in Rivers and Bailey (2009). As noted by Gelman and Little (1997), this estimate is conservative, in the sense of slightly over-estimating the sampling error. Because we adjust for the impact of weighting, the reported margins of error for our surveys will be higher than those for surveys of equivalent size that fail to adjust for weighting. The margin of error for subsamples will be larger and is reported separately for subsamples based on substantially smaller sample sizes.
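To illustrate why adjusting for weighting widens the margin of error, the Python sketch below uses a widely used approximation, Kish’s design effect, to shrink the effective sample size before computing a 95% margin of error for a proportion. This is an assumption made for illustration and is not necessarily the exact formula in Rivers and Bailey (2009); the function names are hypothetical.

```python
import math

def kish_design_effect(weights):
    """Kish's approximation: n * sum(w^2) / (sum(w))^2; equals 1 for equal weights."""
    n = len(weights)
    total = sum(weights)
    return n * sum(w * w for w in weights) / (total * total)

def margin_of_error(weights, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion p, using the
    effective sample size n / design effect (an illustrative assumption)."""
    n_eff = len(weights) / kish_design_effect(weights)
    return z * math.sqrt(p * (1 - p) / n_eff)

# Two hypothetical samples of 1,000 interviews each.
equal = [1.0] * 1000                   # no weighting needed
uneven = [0.5] * 500 + [1.5] * 500     # half weighted down, half weighted up

print(round(margin_of_error(equal), 4))   # prints 0.031, about 3.1 points
print(round(margin_of_error(uneven), 4))  # prints 0.0346, about 3.5 points
```

With the same 1,000 interviews, the unevenly weighted sample has a design effect of 1.25, so its margin of error is larger than the figure an unadjusted calculation would report.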