Is Cèrcol based on the Big Five?

Yes. Cèrcol measures personality using the OCEAN model (Big Five) via the IPIP public-domain item pool (Goldberg et al. 2006). The 12 team roles are derived from the AB5C circumplex (Hofstee et al. 1992) and team composition research (Bell 2007; Neuman & Wright 1999).

What makes Cèrcol different from Belbin or DISC?

Cèrcol's roles are grounded in the Big Five (OCEAN) personality model using the IPIP public-domain item pool. The scoring pipeline is fully open source and auditable. Witness Cèrcol uses forced-choice adjective selection rather than Likert scales to reduce social desirability bias in peer assessment. Unlike Belbin or DISC, all items are public domain and the entire methodology is published and citable.

Is the personality assessment free?

The New Moon Cèrcol (10 items, Big Five snapshot) and First Quarter Cèrcol (60 items, IPIP-NEO-60, 30 facets) are always free — no account required. The Full Moon Cèrcol (120 items, IPIP-NEO-120, Witness peer assessment, cognitive ability measure) requires a one-time payment.

What is Witness Cèrcol?

Witness Cèrcol is a peer personality assessment in which someone who knows you well rates you using forced-choice adjective selection: in each round they pick the best-fit and worst-fit adjective from a set covering all five OCEAN dimensions. Forced choice reduces the social desirability bias that affects standard Likert-scale peer ratings. Dimensions where your self-rating and peer ratings diverge by more than 0.8 standard deviations are flagged as potential blind spots.

How are the 12 team roles derived?

The 12 roles are derived from the AB5C circumplex (Hofstee, De Raad & Goldberg 1992), covering all six intersections of the three team balance dimensions (Presence/Extraversion × Bond/Agreeableness × Vision/Openness) at both poles. The selection of these three dimensions as requiring team-level balance is grounded in Bell (2007) and Neuman & Wright (1999). Discipline (Conscientiousness) and Depth (Neuroticism) modulate role expression but do not define team balance.

No account is required for any instrument. During assessment, no personal data is collected — only anonymous scores are logged. Data is stored on our own servers (Hetzner Online GmbH). No third-party analytics. No data is shared with or sold to third parties.

Is Cèrcol based on the Big Five (OCEAN)?

Yes. Cèrcol measures personality using the OCEAN model (Big Five) via the IPIP — the International Personality Item Pool, a public-domain collection validated in thousands of published studies. The five dimensions are Presence (Extraversion), Bond (Agreeableness), Vision (Openness), Discipline (Conscientiousness), and Depth (Neuroticism). Because the IPIP is public domain there are no licence restrictions: the full item pool and scoring logic are open and citable.

How is Cèrcol different from Belbin, DISC, or StrengthsFinder?

Three things set Cèrcol apart. First, the items come from the Big Five (OCEAN), the most replicated personality model in academic research, not a proprietary framework. Second, the item pool (IPIP) is public domain and the scoring pipeline is open source and auditable; there is no black box. Third, the Witness peer assessment uses forced-choice adjective selection instead of Likert scales, which reduces the social desirability bias that affects most 360-feedback tools. Belbin and DISC use closed, proprietary methodologies.

What are blind spots in team personality assessment?

A blind spot is a personality dimension where how you see yourself and how others see you diverge significantly — more than 0.8 standard deviations apart. Cèrcol's Witness peer assessment detects blind spots by comparing your self-report with forced-choice adjective ratings from people who know you. Blind spots are neither good nor bad: they show where your self-perception and others' experience of you don't match, which is often more actionable than the score itself.
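
The 0.8 standard deviation rule can be sketched in a few lines of Python. This is an illustration only, not Cèrcol's published scoring code: it assumes that self and peer scores are already expressed as z-scores on a common scale and that peer ratings are combined with a simple mean. The function name `flag_blind_spots` and all sample numbers are hypothetical.

```python
from statistics import mean

BLIND_SPOT_THRESHOLD = 0.8  # standard deviations, the cut-off stated in the text


def flag_blind_spots(self_z, peer_z, threshold=BLIND_SPOT_THRESHOLD):
    """Return {dimension: gap} where |self - mean(peers)| exceeds the threshold.

    self_z: {dimension: z-score} from the self-report (assumed format)
    peer_z: {dimension: [z-scores]}, one value per peer (assumed format)
    """
    flagged = {}
    for dimension, own_score in self_z.items():
        gap = own_score - mean(peer_z[dimension])
        if abs(gap) > threshold:
            flagged[dimension] = round(gap, 2)
    return flagged


# Hypothetical data: one self-report and two peer ratings per dimension.
self_scores = {"Presence": 0.4, "Bond": 1.1, "Vision": 0.2,
               "Discipline": 0.9, "Depth": -0.6}
peer_scores = {
    "Presence": [0.5, 0.1],
    "Bond": [0.1, -0.1],
    "Vision": [0.3, 0.5],
    "Discipline": [0.6, 0.8],
    "Depth": [0.5, 0.3],
}

blind_spots = flag_blind_spots(self_scores, peer_scores)
print(blind_spots)
```

A positive gap means the self-rating is higher than the peer view, and a negative gap means it is lower. In this made-up example, Bond and Depth are flagged and the other three dimensions are not.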

Social desirability bias in personality tests: how it distorts results and what to do

If you have ever taken a personality test and found yourself wondering whether to answer as you honestly are or as you would like to be, you have encountered social desirability bias firsthand. This tendency — to present oneself in a favourable light when responding to questionnaires — is one of the most well-documented problems in personality assessment, and one of the most persistent.

Understanding what social desirability bias is, how much it actually distorts personality test results, and what methodological approaches can reduce it is essential for anyone using personality data seriously.

What Social Desirability Bias Is and How It Distorts Big Five Scores

Social desirability bias is the tendency to give answers that are likely to be viewed favourably by others — or by oneself — rather than answers that accurately reflect reality.

In the context of personality assessment, it operates at two levels. The first is impression management: consciously adjusting your answers to present a better image. A job candidate who wants to appear conscientious rates themselves highly on organisation and reliability, even if this overstates their actual tendencies. The second is self-deceptive enhancement: genuinely believing a more positive version of oneself, without conscious awareness of the distortion. This second form is more insidious because it cannot be eliminated simply by telling participants to be honest.

Both forms have been studied extensively since the 1950s. The foundational work by Edwards (1957) established that the social desirability of a statement is one of the strongest predictors of endorsement rate — people agree with socially desirable statements not just because they are true, but because they are desirable. Subsequent decades of research have confirmed this finding across cultures, contexts, and assessment instruments.

Acquiescence Bias: Why People Agree With Everything on Personality Tests

Social desirability bias has a close cousin that compounds its effects in Likert-scale assessments: acquiescence bias. Acquiescence is the tendency to agree with statements regardless of content — to say "yes" more often than "no," to tick "agree" or "strongly agree" more than the content warrants.

In personality questionnaires that use Likert scales (strongly disagree → strongly agree), acquiescence systematically inflates scores on any scale whose items are mostly positively keyed. If you tend to agree with statements, you will score higher on every such dimension you are rated on. This makes profiles appear more extreme than they actually are, and it inflates apparent similarities between people who may actually differ substantially.

Acquiescence and social desirability interact: both push responses toward the upper end of the scale for positively-worded items, compounding the distortion. For an explanation of the scoring-level protections — reverse coding, negative items — that partially mitigate this, see how personality test scores are calculated.
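
To see why reverse coding helps, consider a minimal Python sketch. It assumes a 1 to 5 Likert scale and a scale score computed as the mean of its items; the function names and the four-item scale are hypothetical and are not taken from Cèrcol's scoring pipeline.

```python
SCALE_MIN, SCALE_MAX = 1, 5  # assumed 5-point Likert scale


def reverse_code(response):
    """Flip a response on the scale: 1 <-> 5, 2 <-> 4, 3 stays 3."""
    return SCALE_MIN + SCALE_MAX - response


def scale_score(responses, keys):
    """Mean item score after reverse-coding negatively keyed items.

    keys holds "+" for positively keyed items and "-" for negatively keyed ones.
    """
    adjusted = [
        response if key == "+" else reverse_code(response)
        for response, key in zip(responses, keys)
    ]
    return sum(adjusted) / len(adjusted)


# A hypothetical respondent who answers "agree" (4) to every item.
yea_sayer = [4, 4, 4, 4]

all_positive = scale_score(yea_sayer, ["+", "+", "+", "+"])
balanced = scale_score(yea_sayer, ["+", "-", "+", "-"])

print(all_positive)  # 4.0: blanket agreement inflates the score
print(balanced)      # 3.0: blanket agreement cancels out to the midpoint
```

With balanced keying, indiscriminate agreement lands on the scale midpoint instead of inflating the score. This mitigation is only partial, because it does nothing about social desirability itself.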

"Social desirability is not merely a nuisance variable — it accounts for a substantial and systematic portion of variance in self-report personality measures, particularly for dimensions perceived as socially valued."
— Paulhus, D. L. (1991). Measurement and control of response bias. In J. P. Robinson et al. (Eds.), Measures of personality and social psychological attitudes.

How Much Does Social Desirability Bias Actually Distort Big Five Scores?

The question of how much social desirability bias distorts personality scores has been studied by correlating scores on social desirability scales (instruments designed to measure the tendency to respond in socially desirable ways) with scores on standard personality measures.

The results are substantial. Correlations between social desirability and Agreeableness typically range from .30 to .50 — meaning that a significant portion of variance in Agreeableness scores reflects the desire to appear agreeable, not actual agreeableness. Conscientiousness shows similar effects, with correlations of .25–.45. Neuroticism (Depth) is inversely affected: people systematically underrate their emotional instability because admitting it is socially undesirable, producing negative correlations of similar magnitude.
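
One rough way to read these correlations is to square them: r squared is the proportion of variance two measures share. The Python sketch below applies that to the ranges quoted above. It is a simplification that treats all of the shared variance as bias, which is an assumption; some of the overlap may reflect real trait differences.

```python
def shared_variance(r):
    """Proportion of variance shared by two measures that correlate at r."""
    return r ** 2


# Correlation ranges quoted in the text above.
ranges = {
    "Agreeableness": (0.30, 0.50),
    "Conscientiousness": (0.25, 0.45),
}

for trait, (low, high) in ranges.items():
    print(f"{trait}: {shared_variance(low):.0%} to {shared_variance(high):.0%}")
```

On this reading, a correlation of .30 corresponds to about 9% shared variance and a correlation of .50 to 25%.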

These are not trivial effects. They mean that in a standard Likert-scale personality assessment, the scores you see are a mixture of the trait you are trying to measure and the person's general tendency toward self-presentation. Separating these two is difficult — and in high-stakes contexts (hiring, selection, high-visibility development programmes), the motivation to present well is highest, and the distortion is most severe. For the hiring-specific context, see personality testing in hiring: what is legal and what is ethical.

Which Big Five Dimensions Are Most Distorted by Social Desirability

Not all dimensions are equally vulnerable. The pattern is consistent across studies:

Bond (Agreeableness) and Discipline (Conscientiousness) are most inflated by social desirability. Both involve traits that are widely valued: being kind, cooperative, reliable, organised. People rate themselves higher on these dimensions not necessarily because they are higher, but because the ratings carry social implications they want to endorse.

Depth (Neuroticism) is most deflated: people systematically rate themselves as less anxious, less irritable, and less emotionally reactive than their actual experience warrants, because admitting emotional instability is socially costly.

Presence (Extraversion) shows moderate effects. Extraversion is valued in many professional contexts, producing mild inflation, but the observable nature of the dimension makes gross distortion harder to sustain.

Vision (Openness) also shows moderate effects, particularly for intellectual curiosity facets — people like to see themselves as curious and open-minded.

Relative bias magnitude across Big Five dimensions. Neuroticism is actively suppressed (people hide it); Agreeableness and Conscientiousness are inflated (people project them). Extraversion and Openness show comparatively modest distortion.

This pattern has direct implications for how to interpret DISC, 16Personalities, and other Likert-scale assessments that teams commonly use. For the specific distortions in each framework, see "DISC vs Big Five: why four styles aren't enough" and "16Personalities vs Big Five: the viral test that gets it half right".

Likert Scale vs Forced-Choice: Comparing Bias Vulnerability

Feature	Likert Scale	Forced-Choice
Response format	Rate each item 1–5 or 1–7	Choose one from each pair
Acquiescence bias	High — can agree with everything	None — choice is forced
Social desirability	High — easy to select high-valence options	Reduced — pairs matched on valence
Score type	Normative — absolute level per trait	Ipsative — relative priorities between traits
Ease of faking	High in transparent items	Lower — valence parity makes the "right answer" unclear
Cognitive demand	Low	Moderate — genuine choice required
Best use	Research, low-stakes development	Selection, high-stakes assessment, peer ratings

How Forced-Choice Design Reduces Social Desirability Bias

The most effective methodological response to social desirability bias in personality assessment is forced-choice design. Rather than rating each item independently on a scale, respondents are presented with pairs (or triples) of items and asked to choose which best describes them.

Forced-choice works because it makes social desirability harder to act on. If both items in a pair are positive — "warm and empathetic" versus "precise and thorough" — there is no obviously socially desirable answer. You are forced to reveal which of two valued traits more accurately describes you. The choice reveals relative priorities between dimensions, rather than absolute levels on each dimension in isolation.
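
As a minimal sketch of how such choices become relative scores, the Python below tallies which dimension wins each pair. It assumes that every adjective maps to exactly one dimension and that scoring is a plain count; both are simplifying assumptions for illustration, and the function name and sample rounds are hypothetical.

```python
from collections import Counter

DIMENSIONS = ["Presence", "Bond", "Vision", "Discipline", "Depth"]


def score_forced_choice(rounds):
    """Count how often each dimension was chosen across forced-choice rounds.

    Each round is (dimension_a, dimension_b, chosen_dimension).
    """
    counts = Counter({dimension: 0 for dimension in DIMENSIONS})
    for option_a, option_b, chosen in rounds:
        if chosen not in (option_a, option_b):
            raise ValueError(f"{chosen!r} was not one of the offered options")
        counts[chosen] += 1
    return dict(counts)


# Hypothetical rounds: both options in each pair are equally flattering.
rounds = [
    ("Bond", "Discipline", "Bond"),        # "warm and empathetic" chosen
    ("Presence", "Vision", "Vision"),
    ("Bond", "Vision", "Bond"),
    ("Discipline", "Presence", "Discipline"),
]

scores = score_forced_choice(rounds)
print(scores)
```

Because every round adds exactly one point somewhere, a respondent cannot score high on everything at once. That constraint is what limits blanket self-enhancement.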

The psychometric literature on forced-choice methods, reviewed by Stark et al. (2005) and more recently by Brown and Maydeu-Olivares (2011), confirms that forced-choice assessments reduce social desirability inflation substantially. For the complete technical explanation of how this works in Cèrcol's Witness instrument, see forced-choice personality assessment: why it produces more honest data.

How Cèrcol's Witness Instrument Minimises Social Desirability in Peer Ratings

Cèrcol's Witness instrument uses a forced-choice format specifically designed to reduce social desirability bias in peer ratings. Witnesses (peer assessors) are presented with pairs of personality adjectives — drawn from the AB5C circumplex, which maps adjectives onto Big Five intersections — and asked to choose which word better describes the person they are rating.

Because the Witness is rating someone else, self-presentational motives are less directly operative than in self-report. But Witnesses still have social incentives to rate the target favourably (friendship, collegiality, desire to provide positive feedback). Forced-choice format reduces this tendency by making favourability-maximisation genuinely difficult: if both options are positive, you cannot simply choose the "nicer" answer without revealing which trait you genuinely perceive in them.

The result is Witness data that reflects actual perceived personality more accurately than a generalised positive impression. For the full case for why peer data is a necessary complement to self-report, see why self-assessment alone isn't enough: peer personality feedback. Anonymity in peer ratings also matters; for the evidence, see anonymity in personality assessment: why it matters more than you think.

Honest Caveats: What Forced-Choice Design Cannot Fully Fix

Forced-choice design is not a complete solution. The primary limitation is that forced-choice data is ipsative: scores reflect relative priorities between dimensions, not absolute levels. This makes certain types of comparison — for example, comparing one person's absolute Agreeableness score to another's — methodologically complex. The research on how to handle ipsative data appropriately is ongoing, and Cèrcol's interpretation framework accounts for this.
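
The ipsative constraint is easy to see with toy numbers. In the Python sketch below, two hypothetical respondents each make ten forced choices, so each profile sums to ten whatever their personalities are. The names and scores are invented for illustration, and simple tally scoring is an assumption.

```python
def ipsative_totals(profiles):
    """Return the set of distinct score totals across respondents."""
    return {sum(profile.values()) for profile in profiles.values()}


# Hypothetical tallies from ten forced-choice rounds per respondent.
profiles = {
    "Ana": {"Presence": 1, "Bond": 4, "Vision": 2, "Discipline": 3, "Depth": 0},
    "Ben": {"Presence": 3, "Bond": 4, "Vision": 0, "Discipline": 1, "Depth": 2},
}

totals = ipsative_totals(profiles)
same_bond = profiles["Ana"]["Bond"] == profiles["Ben"]["Bond"]

print(totals)     # a single shared total: every profile sums to the same value
print(same_bond)  # True, yet this does not show equal absolute Agreeableness
```

Both respondents score 4 on Bond, but that only says Bond won four of their ten choices. It does not say the two people are equally agreeable in absolute terms, which is why between-person comparisons on ipsative scores need care.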

Additionally, forced-choice design does not eliminate motivated distortion by highly determined participants. Someone who strongly wants to present as conscientious can still systematically choose Discipline-related adjectives over alternatives. Forced-choice raises the cognitive cost of strategic responding, but it does not make it impossible. For the full research on what faking looks like in practice, see can you fake a personality test?

The honest position is that no assessment design fully eliminates response biases. What forced-choice does is reduce the most common and most impactful biases — acquiescence and social desirability — to a level where the signal-to-noise ratio of the resulting data is substantially better than with standard Likert-scale approaches.

Social Desirability Bias: Key Takeaways for Personality Test Users

Social desirability bias systematically inflates scores on valued traits (Bond, Discipline) and deflates scores on stigmatised traits (Depth) in standard Likert-scale personality assessments. Acquiescence bias compounds this by pushing all scores toward agreement. These are not minor technical issues — they substantially reduce the validity of self-report personality data, particularly in high-stakes contexts.

Forced-choice design, as used in Cèrcol's Witness instrument, addresses these biases by making it structurally difficult to simultaneously maximise social desirability across all dimensions. The result is more honest, more differentiated, and more useful personality data. For a ranked comparison of which free assessment tools handle bias best, see the best free personality tests for teams in 2026.

How Cèrcol handles social desirability bias

Social desirability bias is not a minor inconvenience: in standard Likert-scale self-reports it tends to inflate Bond and Discipline scores and deflate Depth scores. No amount of "please be honest" instructions changes the structural incentives.

Cèrcol addresses this at the instrument level, not the instruction level. The Witness peer assessment uses a forced-choice format in which adjective pairs are matched for social desirability value, making it structurally hard to present an idealised picture without making genuine personality choices. The forced-choice design is grounded in the AB5C circumplex (Hofstee, De Raad & Goldberg, 1992) and calibrated against the IPIP item bank.

The self-report Big Five assessment uses Likert scales — with reverse-coded items and scale-level protections — and is free at cercol.team. Adding Witness ratings from peers produces the multi-perspective picture that reveals where social desirability is likely distorting the self-report. Read the full scientific design to see exactly how both instruments handle bias.

References
Edwards, A. L. (1957). The social desirability variable in personality assessment and research. Dryden Press.
Paulhus, D. L. (1991). Measurement and control of response bias. In J. P. Robinson et al. (Eds.), Measures of personality and social psychological attitudes (pp. 17–59). Academic Press.

Further reading

Related articles

Anonymity in personality assessment: why it matters more than you think

How personality test scores are calculated: from items to dimensions

What reliability and validity mean in personality testing — explained plainly
