Is Cèrcol based on the Big Five?

Yes. Cèrcol measures personality using the OCEAN model (Big Five) via the IPIP public-domain item pool (Goldberg et al. 2006). The 12 team roles are derived from the AB5C circumplex (Hofstee et al. 1992) and team composition research (Bell 2007; Neuman & Wright 1999).

What makes Cèrcol different from Belbin or DISC?

Cèrcol's roles are grounded in the Big Five (OCEAN) personality model using the IPIP public-domain item pool. The scoring pipeline is fully open source and auditable. Witness Cèrcol uses forced-choice adjective selection rather than Likert scales to reduce social desirability bias in peer assessment. Unlike Belbin or DISC, all items are public domain and the entire methodology is published and citable.

Is the personality assessment free?

The New Moon Cèrcol (10 items, Big Five snapshot) and First Quarter Cèrcol (60 items, IPIP-NEO-60, 30 facets) are always free — no account required. The Full Moon Cèrcol (120 items, IPIP-NEO-120, Witness peer assessment, cognitive ability measure) requires a one-time payment.

What is Witness Cèrcol?

Witness Cèrcol is a peer personality assessment where someone who knows you well rates you using a forced-choice adjective selection method: in each round they pick the best-fit and worst-fit adjective from a set covering all five OCEAN dimensions. Forced choice reduces the social desirability bias that affects standard Likert-scale peer ratings. Dimensions where your self-rating and peer ratings diverge by more than 0.8 standard deviations are flagged as potential blind spots.

How are the 12 team roles derived?

The 12 roles are derived from the AB5C circumplex (Hofstee, de Raad & Goldberg 1992), covering all six intersections of the three team balance dimensions (Presence/Extraversion × Bond/Agreeableness × Vision/Openness) at both poles. The selection of these three dimensions as requiring team-level balance is grounded in Bell (2007) and Neuman & Wright (1999). Discipline (Conscientiousness) and Depth (Neuroticism) modulate role expression but do not define team balance.

Is my data private?

No account is required for any instrument. During assessment, no personal data is collected; only anonymous scores are logged. Data is stored on our own servers (Hetzner Online GmbH). No third-party analytics. No data is shared with or sold to third parties.

Is Cèrcol based on the Big Five (OCEAN)?

Yes. Cèrcol measures personality using the OCEAN model (Big Five) via the IPIP — the International Personality Item Pool, a public-domain collection validated in thousands of published studies. The five dimensions are Presence (Extraversion), Bond (Agreeableness), Vision (Openness), Discipline (Conscientiousness), and Depth (Neuroticism). Because the IPIP is public domain there are no licence restrictions: the full item pool and scoring logic are open and citable.

How is Cèrcol different from Belbin, DISC, or StrengthsFinder?

Three things set Cèrcol apart. First, the items come from the Big Five (OCEAN), the most replicated personality model in academic research, not a proprietary framework. Second, the item pool (IPIP) is public domain and the scoring pipeline is open source and auditable; there is no black box. Third, the Witness peer assessment uses forced-choice adjective selection instead of Likert scales, which reduces the social desirability bias that affects most 360-feedback tools. Belbin and DISC, by contrast, use closed, proprietary methodologies.

What are blind spots in team personality assessment?

A blind spot is a personality dimension where how you see yourself and how others see you diverge significantly — more than 0.8 standard deviations apart. Cèrcol's Witness peer assessment detects blind spots by comparing your self-report with forced-choice adjective ratings from people who know you. Blind spots are neither good nor bad: they show where your self-perception and others' experience of you don't match, which is often more actionable than the score itself.

Forced-choice scales: how they cut faking

Why Traditional Likert-Scale Personality Tests Have a Bias Problem

Likert-scale personality tests dominate both research and applied assessment. The Big Five Inventory, the NEO-PI-R, the IPIP scales, and hundreds of proprietary instruments all use variations of the same format: rate yourself on a set of statements from "disagree" to "agree."

The strengths of this format are real. It is intuitive for respondents, easy to score, and produces normative data — meaning scores can be compared across individuals on an absolute scale. A score of 4.2 on Conscientiousness is directly comparable across different people who took the same test.

But Likert scales have two structural weaknesses that cannot be fully addressed through careful item writing or instructions to "answer honestly."

The first is acquiescence bias: the tendency to agree rather than disagree, regardless of content. Across cultures and populations, people tend to endorse statements more often than their content warrants, because saying "agree" is the path of least resistance. On scales where most items are keyed in the same direction, this inflates trait scores regardless of the respondent's true standing.
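A small numeric sketch can make the acquiescence effect concrete. It assumes a hypothetical 1 to 5 Likert scale and models acquiescence as a fixed upward shift on every raw response. The function name, the shift size, and the item keys are illustrative assumptions, not part of any published scoring method.

```python
def score_scale(responses, reverse_keyed, shift=0.0, low=1, high=5):
    """Mean scale score after adding a constant 'agree' shift to each raw response.

    responses: ratings the person would give with no response style.
    reverse_keyed: one bool per item; True means agreement indicates LOW trait.
    shift: hypothetical acquiescence effect added to every raw rating.
    """
    total = 0.0
    for raw, is_reversed in zip(responses, reverse_keyed):
        biased = min(high, max(low, raw + shift))  # clamp to the rating range
        # Reverse-keyed items are flipped so that higher always means more trait.
        total += (low + high - biased) if is_reversed else biased
    return total / len(responses)


# The same person (midpoint answer on every item) on two hypothetical scales.
true_answers = [3, 3, 3, 3]
all_positive = [False, False, False, False]
balanced = [False, True, False, True]

unbiased = score_scale(true_answers, all_positive)
inflated = score_scale(true_answers, all_positive, shift=0.5)
cancelled = score_scale(true_answers, balanced, shift=0.5)
```

In this toy setup the shift passes straight into the score when every item is keyed the same way (3.0 becomes 3.5), and cancels when half the items are reverse-keyed. Balanced keying therefore helps with acquiescence, but it does nothing about social desirability.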

The second is social desirability bias: the tendency to endorse statements that present a favourable self-image. When the socially valued answer is obvious (and on most personality items it is), motivated self-presenters can maximise their scores on valued dimensions without any constraint. For a full explanation of how much this distorts Big Five profiles, see social desirability bias in personality tests.

These two biases combine to produce scores that are a mixture of genuine trait levels and response style — and that mixture is difficult to disentangle after the fact. For teams wondering whether their DISC or 16Personalities scores are inflated by this effect, the answer is almost certainly yes. See DISC vs Big Five: why four styles aren't enough for a broader discussion of what gets lost when measurement design does not address bias.

What Forced-Choice Personality Assessment Actually Is

Forced-choice personality assessment presents items differently. Instead of rating each statement independently, respondents are presented with pairs (or triplets) of statements or adjectives and asked to choose which one better describes them.

For example, instead of separately rating "I am talkative" and "I am thorough," a forced-choice item might present both together and ask: "Which of these words better describes you?" The respondent must choose one. They cannot simultaneously endorse both at a high level.

This simple structural change has important consequences:

Acquiescence becomes impossible: you cannot agree with both options. Every choice reveals a preference between two traits.
Social desirability is reduced: when both options are positively valenced (as in well-designed forced-choice instruments), there is no obviously "good" answer. Choosing "warm" over "precise" does not make you look better or worse — it just reveals relative priorities.
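The two consequences above can be seen in a minimal scoring sketch. It assumes the simplest possible rule (one point to the dimension of each chosen adjective) and an invented adjective-to-dimension key; neither is taken from Cèrcol's actual item set or scoring pipeline.

```python
from collections import Counter

# Illustrative adjective-to-dimension key (an assumption, not Cèrcol's actual items).
ADJECTIVE_DIMENSION = {
    "talkative": "Presence",
    "thorough": "Discipline",
    "warm": "Bond",
    "precise": "Discipline",
    "curious": "Vision",
    "sociable": "Presence",
}


def score_pairs(pairs, choices):
    """Tally one point per chosen adjective, grouped by dimension.

    pairs: list of (adjective_a, adjective_b) tuples shown to the respondent.
    choices: the adjective picked from each pair, in the same order.
    """
    tally = Counter()
    for (a, b), chosen in zip(pairs, choices):
        if chosen not in (a, b):
            raise ValueError(f"{chosen!r} was not offered in pair {(a, b)}")
        tally[ADJECTIVE_DIMENSION[chosen]] += 1
    return tally


pairs = [("talkative", "thorough"), ("warm", "precise"), ("curious", "sociable")]
choices = ["talkative", "warm", "sociable"]
scores = score_pairs(pairs, choices)
```

Because each pair hands out exactly one point, the total always equals the number of pairs. A respondent who would have agreed with everything on a Likert form has no way to express that here.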

"Forced-choice formats eliminate acquiescence responding and substantially reduce social desirability inflation by requiring respondents to allocate fixed amounts of endorsement across competing trait descriptions."
— Adapted from Stark, S., Chernyshenko, O. S., & Drasgow, F. (2005). An IRT approach to constructing and scoring pairwise preference items. Applied Psychological Measurement, 29(3), 184–201. See also Hofstee, de Raad, & Goldberg (1992), doi:10.1037/0022-3514.63.1.146

Why forced-choice works: Standard Likert scales allow respondents to rate themselves 'highly' on every trait. Forced-choice formats require trade-offs between equally desirable options — forcing respondents to reveal relative priorities rather than absolute ideals. Research shows forced-choice reduces social desirability bias by 40–60% compared to Likert scales.

Ipsative Scoring: What It Means for Big Five Score Interpretation

Forced-choice instruments produce what is called ipsative data. An ipsative score represents a person's standing on a trait relative to their own other traits — not relative to a population norm. If your profile shows high Presence and low Depth, this means you are more extraverted than you are neurotic in relative terms. It does not necessarily tell you whether you are more extraverted than the average person.
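One way to see what within-person referencing discards is to take hypothetical normative scores and subtract each person's own mean. This is a sketch of the general idea of ipsatization, not the scoring used by any particular instrument; the profiles and the function name are invented for illustration.

```python
def ipsatize(profile):
    """Express each trait relative to the person's own mean across traits."""
    person_mean = sum(profile.values()) / len(profile)
    return {trait: round(score - person_mean, 2) for trait, score in profile.items()}


# Hypothetical normative scores on a 1-5 scale.
# Person A scores 1.5 points higher than person B on every dimension.
person_a = {"Presence": 4.5, "Bond": 4.0, "Vision": 3.5, "Discipline": 4.0, "Depth": 3.0}
person_b = {"Presence": 3.0, "Bond": 2.5, "Vision": 2.0, "Discipline": 2.5, "Depth": 1.5}

relative_a = ipsatize(person_a)
relative_b = ipsatize(person_b)
```

Both people end up with the same relative profile, even though their absolute levels differ on every dimension. The shape of the profile survives; its overall level does not.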

This is a genuine limitation. Ipsative data cannot be used for all the same purposes as normative data. In particular, comparing two people's profiles directly (person A's Presence score vs person B's) is methodologically complicated with ipsative data, because both profiles are internally referenced. For a full treatment of normative vs ipsative scoring, see how personality test scores are calculated. Research on how to handle ipsative data appropriately, and on approaches that produce more normative estimates from forced-choice designs (such as IRT-based scoring), is ongoing.

Cèrcol's approach acknowledges this limitation. The Witness instrument is designed primarily to reveal relative priorities and blind spots — where is this person perceived as stronger or weaker relative to their own overall profile, and how does that compare to their self-perception? This is a valid and valuable use of ipsative data, even if absolute cross-person comparisons require additional methodological care.

The AB5C Circumplex: How Adjectives Map onto Big Five Intersections

Cèrcol's Witness instrument is grounded in the Abridged Big Five Circumplex (AB5C), developed by Hofstee, de Raad, and Goldberg (1992). The AB5C is a systematic framework for mapping personality adjectives onto the Big Five dimensions — not as pure, single-factor indicators, but as weighted combinations of two factors.

In the AB5C framework, a word like "assertive" is not simply an Extraversion adjective — it loads on both Extraversion and (low) Agreeableness. A word like "creative" loads on both Openness and (low) Conscientiousness. By mapping adjectives onto these intersections, the AB5C captures the rich, overlapping structure of personality language more accurately than a simple factor-by-factor approach. For the broader context of how personality language was systematically analysed to produce the Big Five, see history of the Big Five from Allport to Goldberg.

This matters for forced-choice design because it allows pairs to be constructed that are genuinely psychometrically informative — each choice between a pair of adjectives provides information about a respondent's position on the relevant Big Five dimensions. The pairs are not arbitrary; they are principled.

In Cèrcol's Witness instrument, adjective pairs are selected to maximise discriminative information across the five dimensions (Presence, Bond, Vision, Discipline, Depth) while keeping social desirability values as equal as possible within each pair. This ensures that choices reveal genuine personality differentiation rather than differential social desirability.
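The desirability-matching half of that selection rule can be sketched as a simple filter. The adjectives, their dimension labels, and the desirability ratings below are hypothetical, and the sketch does not model the discriminative-information criterion at all; it only keeps cross-dimension pairs whose desirability ratings are close.

```python
from itertools import combinations

# Hypothetical adjectives: (word, dimension, desirability rating on a 1-9 scale).
ADJECTIVES = [
    ("talkative", "Presence", 6.1),
    ("warm", "Bond", 7.8),
    ("precise", "Discipline", 7.5),
    ("imaginative", "Vision", 7.6),
    ("calm", "Depth", 7.4),
    ("assertive", "Presence", 6.3),
]


def matched_pairs(adjectives, max_gap=0.5):
    """Return cross-dimension pairs whose desirability ratings differ by at most max_gap."""
    pairs = []
    for (w1, d1, s1), (w2, d2, s2) in combinations(adjectives, 2):
        if d1 != d2 and abs(s1 - s2) <= max_gap:
            pairs.append((w1, w2, round(abs(s1 - s2), 2)))
    # Closest desirability match first.
    return sorted(pairs, key=lambda p: p[2])


candidates = matched_pairs(ADJECTIVES)
```

With these made-up ratings, "talkative" and "assertive" never appear in a candidate pair: each is too far in desirability from the adjectives on other dimensions, and they cannot be paired with each other because they share a dimension.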

How Cèrcol's Witness Applies Forced-Choice to Peer Assessment

The Witness instrument is the peer-assessment component of Cèrcol. Rather than asking Witnesses (peer assessors) to rate the target person on behavioural statements, Witnesses are presented with adjective pairs and asked to choose which word better describes the person they know.

The instrument is built on the IPIP tradition — the open-science alternative to commercially controlled personality instruments. All item development is transparent and documented. The source code, scoring algorithm, and psychometric documentation are available under an open-source licence at cercol.team/science.

A typical Witness session takes 8–12 minutes. The resulting profile shows the target person's scores on each of the five Cèrcol dimensions as perceived by that Witness, and the aggregate across all Witnesses provides the peer composite. This composite is then compared to the target's self-report to identify alignment and gaps. For the full rationale for why this peer layer matters, see why self-assessment alone isn't enough: peer personality feedback. The question of anonymity in peer ratings is addressed in anonymity in personality assessment: why it matters.
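The aggregation and comparison steps can be sketched as follows. The sketch assumes that self-report and Witness scores are already expressed in comparable standardized units, and it uses a plain mean for the composite; both are assumptions for illustration. The 0.8 standard deviation threshold is the blind-spot figure Cèrcol quotes; all names and numbers are otherwise hypothetical.

```python
from statistics import mean

DIMENSIONS = ["Presence", "Bond", "Vision", "Discipline", "Depth"]
BLIND_SPOT_THRESHOLD = 0.8  # standard deviations


def peer_composite(witness_profiles):
    """Average each dimension across all Witness profiles."""
    return {d: mean(p[d] for p in witness_profiles) for d in DIMENSIONS}


def blind_spots(self_report, composite, threshold=BLIND_SPOT_THRESHOLD):
    """Return dimensions where self and peer views differ by more than the threshold."""
    gaps = {d: self_report[d] - composite[d] for d in DIMENSIONS}
    return {d: round(g, 2) for d, g in gaps.items() if abs(g) > threshold}


# Hypothetical standardized scores for one person and two Witnesses.
self_report = {"Presence": 1.0, "Bond": 0.2, "Vision": 0.5, "Discipline": -0.3, "Depth": 0.0}
witnesses = [
    {"Presence": 0.0, "Bond": 0.4, "Vision": 0.6, "Discipline": -0.1, "Depth": 0.1},
    {"Presence": -0.2, "Bond": 0.0, "Vision": 0.2, "Discipline": -0.5, "Depth": 0.3},
]

composite = peer_composite(witnesses)
flagged = blind_spots(self_report, composite)
```

Here only Presence is flagged: the person rates themselves well above how the two Witnesses see them, while the other four dimensions fall within the threshold.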

Likert Scale vs Forced-Choice: Full Methodology Comparison

Dimension	Likert Scale	Forced-Choice (Cèrcol Witness)
Acquiescence bias	High structural risk	Eliminated by design
Social desirability	High, especially for valued traits	Substantially reduced (matched valence pairs)
Score interpretation	Normative (absolute level)	Ipsative (relative priorities)
Cross-person comparison	Straightforward	Requires methodological care
Faking resistance	Low for transparent items	Higher — both options typically positive
Theoretical grounding	Factor-by-factor	AB5C circumplex
Cognitive demand	Low	Moderate — genuine deliberation required

Honest Limitations of Forced-Choice Personality Assessment

Forced-choice is not a panacea. Several limitations deserve honest acknowledgement.

First, there is the ipsative scoring problem described above. While IRT-based approaches (Thurstonian IRT, as developed by Brown and Maydeu-Olivares, 2011) can recover more normative-like estimates from forced-choice data, these methods are computationally demanding and require substantial sample sizes to calibrate accurately. Simpler forced-choice scoring remains somewhat ipsative.
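For readers who want a feel for the IRT-based approach, here is a minimal sketch of the kind of response function Thurstonian IRT uses for a single pair: the probability of preferring one item over another is a normal-ogive function of the weighted difference between two latent traits. The parameter names and values are made up for illustration, and fitting such a model to real data involves far more than this one function.

```python
from math import erf, sqrt


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def prefer_first(eta_a, eta_b, loading_i, loading_k, threshold=0.0,
                 uniq_var_i=1.0, uniq_var_k=1.0):
    """Probability of choosing item i (measuring trait a) over item k (trait b).

    eta_a, eta_b: the respondent's latent trait levels.
    loading_i, loading_k: factor loadings of the two items.
    threshold: pair-specific threshold.
    uniq_var_i, uniq_var_k: unique (error) variances of the two items.
    """
    z = (-threshold + loading_i * eta_a - loading_k * eta_b) / sqrt(uniq_var_i + uniq_var_k)
    return normal_cdf(z)


# Made-up parameters: two equally discriminating items, no threshold offset.
equal_traits = prefer_first(eta_a=0.0, eta_b=0.0, loading_i=1.0, loading_k=1.0)
higher_a = prefer_first(eta_a=1.0, eta_b=0.0, loading_i=1.0, loading_k=1.0)
shifted_both = prefer_first(eta_a=2.0, eta_b=1.0, loading_i=1.0, loading_k=1.0)
```

In this sketch, `higher_a` and `shifted_both` come out identical: with equal loadings, only the difference between the two trait levels affects the choice. That is the ipsativity problem in miniature, and it gives some intuition for why recovering absolute levels depends on carefully designed item sets and large calibration samples.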

Second, forced-choice assessments are cognitively more demanding. Respondents must genuinely compare two options and decide which fits better, rather than simply rating each item independently. This can slow completion times and may be frustrating for respondents who feel that "both apply equally." The inability to say "both" is by design, but it can feel unnatural.

Third, forced-choice does not eliminate all strategic responding. A determined self-enhancer who knows which adjectives map onto which valued dimensions can still systematically choose the "right" adjectives. For the full literature on what motivated faking actually does to personality test scores, see can you fake a personality test? Forced-choice raises the cognitive cost of strategic responding, but it does not make it impossible, particularly for respondents with prior exposure to personality theory.

Despite these limitations, the weight of psychometric evidence is clear: forced-choice instruments produce less biased, more differentiating data than Likert scales in high-stakes assessment contexts. For the Witness use case — peer assessment where social incentives to rate favourably are real — forced-choice design is the methodologically superior choice. And if you are evaluating this against the broader landscape of what tools are available, see the best free personality tests for teams in 2026.

Forced-Choice vs Likert: Which Produces More Honest Big Five Data?

Forced-choice personality assessment eliminates acquiescence bias by construction and substantially reduces social desirability bias through matched-valence item pairs. The AB5C circumplex provides the theoretical grounding for psychometrically principled adjective pair selection. Cèrcol's Witness instrument applies these principles in an open-source, IPIP-grounded peer assessment tool designed to produce the most honest, most differentiating peer personality data available. The limitation — ipsative scoring — is real and acknowledged, and interpretation is designed accordingly.

Try a forced-choice Big Five assessment: Cèrcol's Witness instrument

Most personality assessments — DISC, 16Personalities, even Likert-scale Big Five tools — are vulnerable to the same structural problem: when the socially desirable answer is visible, motivated respondents (and even honest ones trying to be accurate) will skew their responses toward it. Forced-choice design is the most evidence-backed solution available.

Cèrcol's Witness peer assessment is a forced-choice instrument built on the AB5C circumplex framework and grounded in the public-domain IPIP item tradition. Witnesses choose between adjective pairs matched for social desirability — making it structurally hard to be uniformly positive about the person they are rating. The result is peer personality data that reflects how the person is genuinely experienced, not just how much the Witness likes them.

The self-assessment at cercol.team is free. Adding Witness assessments takes each peer 8–12 minutes. Read the full scientific rationale to understand how the forced-choice design and AB5C grounding work together to deliver more honest Big Five data.

References
Hofstee, W. K. B., de Raad, B., & Goldberg, L. R. (1992). Integration of the Big Five and circumplex approaches to trait structure. Journal of Personality and Social Psychology, 63(1), 146–163. doi:10.1037/0022-3514.63.1.146
Brown, A., & Maydeu-Olivares, A. (2011). Item response modeling of forced-choice questionnaires. Educational and Psychological Measurement, 71(3), 460–502.

Common questions

What is a forced-choice personality scale?

A format that makes you pick between equally desirable options instead of rating each one. It removes the chance to simply agree with everything.

Does forced choice reduce faking?

Yes. By pitting traits against each other it blunts social desirability bias and is harder to game, which is why it is used in high-stakes hiring.

What are the drawbacks of forced choice?

It can feel harder to complete, and it has historically produced scores that compare poorly across people. Modern scoring models address much, though not all, of this.

Related articles

Anonymity in personality assessment: why it matters more than you think

What reliability and validity mean in personality testing — explained plainly

Why 120 items is better than 10: the trade-off in personality test length