How to Detect Bots and Duplicate Responses in Survey Panels

Guide · by Licrat

The moment a survey carries an incentive, it attracts people — and software — whose goal is the reward, not the answer. Online panels concentrate that pressure: many respondents, real money on the line, and an open door to anyone who finds the link. Bots, survey farms, and respondents entering the same survey two, five, or ten times are the predictable result.

Unlike speeders and straight-liners, who are usually genuine but lazy, this category is often adversarial. That changes how you have to think about detection.

The two problems are different

Duplicates are one respondent appearing more than once. Sometimes it's innocent: someone refreshing and re-submitting. Often it isn't: a person re-entering to collect the incentive again, or a farm running the same script repeatedly. Duplicates inflate your N with correlated answers, which is worse than random noise: confidence intervals narrow as if every response were independent, so the data looks more certain than it is.

Bots are non-human or scripted submissions. Modern ones are good: they fill required fields, respect formats, and finish in a plausible time. The crude tell-tales of blank fields and impossible timestamps only catch the lazy ones.

Detecting duplicates

The reliable approach is a fingerprint: a stable identifier derived from the attributes of a submission, so that two responses from the same source collapse to the same key even when the obvious fields differ.
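
As a minimal sketch of the idea, a fingerprint can be a hash over a few normalized attributes. The field names below (ip_address, user_agent, screen_resolution, timezone) are assumptions for illustration, not a prescribed set; use whatever stable attributes your export actually carries.

```python
import hashlib

# Hypothetical attribute names; replace with the columns your export provides.
FINGERPRINT_FIELDS = ("ip_address", "user_agent", "screen_resolution", "timezone")


def normalize(value):
    """Lower-case and collapse whitespace so trivial variations don't change the key."""
    text = "" if value is None else str(value)
    return " ".join(text.lower().split())


def fingerprint(response, fields=FINGERPRINT_FIELDS):
    """Return a stable SHA-256 hex digest over the chosen attributes of one response."""
    parts = [f"{name}={normalize(response.get(name))}" for name in fields]
    # The unit-separator character keeps adjacent fields from running together.
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()
```

Fields left out of the key, such as a name or email, can vary freely without changing the fingerprint. Shared attributes can also collide for legitimate reasons (two people in one household or office), so a match is better treated as a flag to confirm than as proof.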

You need to check for duplicates at two levels:

Within a batch (intra-batch): the same fingerprint appearing twice in the dataset you're cleaning right now.
Across batches (inter-batch): a fingerprint you've already seen in a previous wave or export.

The second one is the one people forget, and it's where serial incentive-farmers live — they come back next wave, not within the same upload.
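
The two levels can be sketched as two set operations over fingerprint strings. This assumes you persist the fingerprints from earlier waves yourself; the function names and the seen_before store are hypothetical.

```python
from collections import Counter


def find_duplicates(batch_fingerprints, seen_before):
    """Return (intra, inter) sets of duplicated fingerprints.

    batch_fingerprints: list of fingerprint strings for the batch being cleaned
    seen_before: set of fingerprints kept from previous waves or exports
    """
    counts = Counter(batch_fingerprints)
    intra = {fp for fp, n in counts.items() if n > 1}   # repeated within this batch
    inter = {fp for fp in counts if fp in seen_before}  # already seen in an earlier wave
    return intra, inter


def update_seen(seen_before, batch_fingerprints):
    """Return the store to persist for the next wave."""
    return set(seen_before) | set(batch_fingerprints)
```

Calling update_seen only after the batch has been checked keeps the current batch from matching against itself in the inter-batch test.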

Open-ended text gives you a second, independent check: near-identical or copy-pasted answers across supposedly different respondents are a strong duplication signal even when other fields have been varied to dodge detection.
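
One way to sketch that check is pairwise string similarity using Python's standard difflib. The 0.9 threshold and the 20-character minimum are illustrative assumptions; short answers such as "yes" or "none" are skipped because they match each other legitimately.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize_text(text):
    """Lower-case and collapse whitespace before comparing."""
    return " ".join(text.lower().split())


def near_duplicate_pairs(answers, threshold=0.9, min_length=20):
    """Find pairs of respondents whose open-ended answers are nearly identical.

    answers: dict mapping respondent id -> open-ended answer text
    Returns a list of (id_a, id_b, similarity) tuples.
    """
    cleaned = {rid: normalize_text(text) for rid, text in answers.items()}
    candidates = {rid: text for rid, text in cleaned.items() if len(text) >= min_length}
    pairs = []
    for (id_a, text_a), (id_b, text_b) in combinations(candidates.items(), 2):
        ratio = SequenceMatcher(None, text_a, text_b, autojunk=False).ratio()
        if ratio >= threshold:
            pairs.append((id_a, id_b, ratio))
    return pairs
```

Comparing every pair is quadratic in the number of answers, which is fine for a single batch of modest size but would need a cheaper pre-filter for very large ones.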

Detecting bots

No single signal proves a bot. The robust method is to build a profile from several weak signals and act on the combination:

Gibberish in open-ended text. Bots and farms struggle most with free text. Keyboard mashing ("asdfjkl"), irrelevant boilerplate, or copy-pasted filler in a comment box is one of the most reliable individual signals you have.
Uniform timing. Humans are irregular — they slow on hard questions and speed through easy ones. A submission where every page takes a suspiciously even amount of time often isn't a person.
Attention-check failures. A trap question with an explicit instruction ("select 'Strongly disagree' here") cleanly separates respondents who are reading from those who aren't — human or otherwise.
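
Each of the three signals can be approximated with a few lines of standard-library Python. These are deliberately crude heuristics under stated assumptions: the consonant-run rule only catches keyboard mashing (not boilerplate or pasted filler), and the 0.1 coefficient-of-variation cutoff for timing is an illustrative value, not a calibrated one.

```python
import re
import statistics

# Five or more consonants in a row is rare in English words but common in mashing.
CONSONANT_RUN = re.compile(r"[bcdfghjklmnpqrstvwxz]{5,}")


def looks_like_gibberish(text, min_share=0.5):
    """True when at least min_share of the words contain a long consonant run."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return False  # empty answers are a separate problem
    mashed = sum(1 for word in words if CONSONANT_RUN.search(word))
    return mashed / len(words) >= min_share


def has_uniform_timing(page_seconds, max_cv=0.1):
    """True when per-page times vary very little relative to their mean."""
    if len(page_seconds) < 3:
        return False  # too few pages to judge
    mean = statistics.fmean(page_seconds)
    if mean <= 0:
        return False
    return statistics.pstdev(page_seconds) / mean <= max_cv


def failed_attention_check(answer, expected="Strongly disagree"):
    """True when the trap question was not answered as instructed."""
    return answer.strip().lower() != expected.strip().lower()
```

Each function returns a single weak flag; none of them is meant to reject a response on its own.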

The principle is the one panel-quality teams rely on: confirm a suspicion with independent evidence. A fast response alone proves little; a fast response with a failed attention check and gibberish in the open-end is a clear reject.
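
A small sketch of that rule: collect the raised flags and only reject when more than one independent signal agrees. The flag names and the two-signal threshold are assumptions chosen for illustration; tune them to your own tolerance for false rejections.

```python
def verdict(flags, reject_at=2):
    """Combine weak signals into a decision plus the reasons behind it.

    flags: dict mapping signal name -> bool (True means the signal fired)
    Returns (decision, raised) where decision is "accept", "review" or "reject".
    """
    raised = sorted(name for name, hit in flags.items() if hit)
    if len(raised) >= reject_at:
        decision = "reject"   # several independent signals agree
    elif raised:
        decision = "review"   # a single signal proves little on its own
    else:
        decision = "accept"
    return decision, raised
```

Returning the list of raised flags alongside the decision keeps every rejection explainable after the fact.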

Why platform tools only get you partway

Most panel and survey platforms ship some defences — reCAPTCHA scoring, panel-side identity and location checks, ID exclusions. They help, and you should turn them on. But they have real gaps. CAPTCHA and bot scores do little against human click-farms. Identity checks live on the panel's side and don't travel with your exported data. And the rules differ from platform to platform, so a dataset stitched together from two sources gets cleaned two different ways.

What's missing is a single, consistent layer that looks at the responses themselves, the same way, wherever they came from.

Scoring it consistently

Winnow covers this category with four of its six deterministic signals — duplicate detection by fingerprint (both intra- and inter-batch), gibberish in open text, uniform timing, and attention-check failures — and combines them into one quality_score and an explicit list of flags. Because it's deterministic, the same submission always yields the same verdict; because the flags are explicit, you can see exactly why something was rejected rather than trusting a probability.

It works on your responses regardless of which platform or panel they came from, so a multi-source dataset gets one consistent standard instead of a different one per source.
