What the Cèrcol Witness instrument measures — and how it differs from self-report

The Cèrcol Witness peer assessment uses forced-choice pairs to reveal how colleagues experience you — capturing blind spots that self-report cannot reach.

Miquel Matoses·8 min read

Every personality instrument faces the same fundamental problem: the person most motivated to manage the impression they create is also the person answering the questions. Self-report is fast, cheap, and scalable — but it is contaminated by self-enhancement, blind spots, and the way the subject wants to be seen rather than how they actually behave with others.

Cèrcol's Witness instrument is a peer assessment designed to address exactly this gap. It uses a different methodology from the self-report assessment, captures a different slice of personality, and when combined with self-report, produces a substantially more informative picture.

How Forced-Choice Peer Assessment Eliminates Rater Bias

The Witness instrument is a forced-choice adjective-pair assessment. Rather than asking "how extraverted is this person on a scale of 1–5?", it presents the assessor with pairs of personality-relevant adjectives and asks them to choose which one better describes the person they are rating.

For example: bold vs. considerate. The assessor cannot choose "both" or "neither." They must commit to one.

This design has two important properties. First, it counters acquiescence and leniency bias: the tendency for raters (especially those high in Agreeableness) to give uniformly positive ratings on Likert scales. When every item forces a choice between two comparably positive options, the rater cannot simply endorse everything, so each response carries genuine discriminating information.

Second, the adjective pairs are derived from the AB5C (Abridged Big Five Circumplex) framework, which positions personality adjectives in a two-dimensional space defined by two Big Five factors (Hofstee, De Raad & Goldberg, 1992, DOI: 10.1037/0022-3514.63.1.146). This means each adjective pair covers not just a single dimension but a blend of dimensions — the circumplex structure allows the instrument to capture subtle profile nuances that single-dimension rating scales miss.

For a fuller discussion of this methodology, see forced-choice personality assessment: more honest data.

Why Peers See Personality the Subject Cannot

The case for peer assessment is empirically well established. In meta-analytic work, observer ratings of Big Five traits show incremental validity beyond self-report in predicting job performance, relationship quality, and leadership effectiveness (Connelly & Ones, 2010).

This incremental validity comes from two sources.

Blind spots. Self-report captures how people experience themselves from the inside. It misses how they appear from the outside — particularly in states of stress, conflict, or high social stakes, where behaviour diverges most sharply from self-image. A person may genuinely believe they are a calm, patient communicator and be genuinely unaware that colleagues experience them as dismissive when under deadline pressure. For an in-depth treatment, see blind spots in teams.

Impression management. Even in anonymous self-report, people are motivated to present themselves favourably on socially desirable dimensions (conscientiousness, agreeableness) and to understate dimensions associated with weakness (neuroticism). The forced-choice peer format makes this kind of management much harder for raters to execute, and it takes the subject's own impression management out of the data altogether, because the subject is not the one answering.

360-degree feedback in organisational settings has documented these dynamics extensively. The Witness instrument operationalises the same principle with a psychometrically tighter methodology.

What the Witness Score Adds Beyond Self-Report

The relationship between self-report and Witness scores is itself informative. There are four meaningful patterns:

High self / High Witness (agreement high): The person's self-image matches how colleagues see them. This is the most straightforward case — both sources point in the same direction, and confidence in the trait description is highest.

Low self / Low Witness (agreement low): Both sources agree that the person is low on this trait. Colleagues see little of it, and the person knows it. Confidence in the trait description is again high.

High self / Low Witness (self-enhancement or blind spot): The subject rates themselves higher on a dimension than their colleagues do. This may reflect genuine self-enhancement, or it may reflect that the behaviours associated with this trait are context-specific and colleagues see different contexts than the subject uses to evaluate themselves. This gap is usually the most productive for coaching conversations.

Low self / High Witness (self-deprecation or context effect): The subject rates themselves lower than colleagues do. This can indicate false modesty, but more often reflects that the behaviours are more visible to others than to the subject — for example, someone who consistently facilitates others in meetings but does not count this as "leadership behaviour" in their self-assessment.
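The four patterns above can be sketched as a small classifier. This is only an illustration: the 0 to 100 score scale, the midpoint of 50, and the function name are assumptions made for the example, not Cèrcol's actual scoring rules.

```python
# Hypothetical cut-off: scores are assumed to lie on a 0-100 scale, with 50
# separating "low" from "high". Neither value comes from Cèrcol.
ASSUMED_MIDPOINT = 50.0


def classify_pattern(self_score: float, witness_score: float,
                     midpoint: float = ASSUMED_MIDPOINT) -> str:
    """Map a self-report score and an aggregate Witness score to one of the four patterns."""
    self_high = self_score >= midpoint
    witness_high = witness_score >= midpoint

    if self_high and witness_high:
        return "agreement high"
    if not self_high and not witness_high:
        return "agreement low"
    if self_high:
        # High self / Low Witness
        return "self-enhancement or blind spot"
    # Low self / High Witness
    return "self-deprecation or context effect"


if __name__ == "__main__":
    print(classify_pattern(72, 35))
    print(classify_pattern(30, 68))
```

A hard cut-off is the simplest possible rule. In practice the size of the gap matters as much as its direction, so a real implementation would more plausibly report the difference between the two scores along with its uncertainty.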

The size of the gap also varies by dimension: research shows that Depth (Neuroticism) has the lowest self-other agreement and Presence (Extraversion) the highest. The full breakdown is in self-other agreement by Big Five dimension: where the gaps are biggest.

| Assessment source | What it captures | What it misses | Combined value |
|---|---|---|---|
| Self-report | Internal experience, self-concept, motivation, private behaviour | Blind spots, impression management, external impact | Provides the subjective baseline |
| Witness (peer) | Observable behaviour, external impact, social reputation | Internal states, private context, effort and intention | Provides the external perspective |
| Combined | Full profile: self-experience + social reality, gaps = development targets | Cannot fully separate trait from context | Highest predictive validity; identifies self-awareness gaps |

How Cèrcol's Anonymity Design Protects Rater Honesty

One major limitation of peer assessment in organisational settings is rater honesty. When colleagues know their ratings are identifiable, they inflate scores — particularly for direct supervisors and people they like. This social desirability effect in raters can be as large as the one in subjects.

Cèrcol's Witness design addresses this through full anonymity. The subject never sees individual Witness responses, only the aggregate, and rater identities are not shown alongside it. The system requires a minimum of three Witnesses before displaying aggregate scores, which prevents de-anonymisation by subtraction.
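The minimum-of-three rule can be sketched as a simple gate in front of the aggregate. The function name and the use of a plain mean are assumptions for illustration. Only the threshold of three comes from the description above.

```python
from statistics import mean
from typing import Optional, Sequence

MIN_WITNESSES = 3  # threshold stated in the article


def aggregate_witness_score(scores: Sequence[float]) -> Optional[float]:
    """Return the mean Witness score, or None while too few Witnesses have responded.

    Hypothetical helper: withholding the aggregate below the threshold means a
    subject can never back out one rater's answer from a one- or two-person average.
    """
    if len(scores) < MIN_WITNESSES:
        return None
    return mean(scores)


if __name__ == "__main__":
    print(aggregate_witness_score([60, 70]))      # withheld: only two Witnesses
    print(aggregate_witness_score([60, 70, 80]))  # shown: three Witnesses
```

With two raters, a subject who can guess one response can compute the other from the average. With three or more, that subtraction no longer pins down any single rater.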

The invitation flow is also designed to minimise social obligation: Witnesses are invited by the subject, but the invitation explicitly states that not responding is fine, and no follow-up reminders are sent by default. This reduces the coercive social pressure that degrades rater honesty.

How many Witnesses you need depends on how reliable you want the composite to be. The article "How many peer assessors do you need for reliable personality data?" works through the Spearman-Brown formula with concrete numbers: three Witnesses reach .62 reliability, and five reach .73.
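The Spearman-Brown figures quoted above can be reproduced with a few lines of code. The single-rater reliability of 0.35 used here is an assumption, back-solved from the stated value of .62 for three Witnesses. The article does not give the underlying figure.

```python
def spearman_brown(single_rater_reliability: float, n_raters: int) -> float:
    """Spearman-Brown prophecy formula: reliability of the average of n parallel raters.

    r_n = n * r / (1 + (n - 1) * r)
    """
    r = single_rater_reliability
    return n_raters * r / (1 + (n_raters - 1) * r)


# Assumed value, chosen so that three raters give the .62 quoted in the text.
ASSUMED_SINGLE_RATER_RELIABILITY = 0.35

if __name__ == "__main__":
    for k in (3, 5):
        reliability = spearman_brown(ASSUMED_SINGLE_RATER_RELIABILITY, k)
        print(f"{k} Witnesses: {reliability:.2f}")
```

Under that assumption the formula gives .62 for three Witnesses and .73 for five, matching the numbers in the text. Each additional rater adds less than the one before, which is why the gain from three to five is larger than any later two-rater step.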

3+ peer assessors: minimum for reliable signal
r = 0.18: average self-other agreement on personality
Neuroticism: most under-reported in self-report vs peer ratings
360° coverage: behaviour observed across multiple contexts

Limitations of the Witness Instrument: What to Watch For

The Witness instrument is not without constraints, and any interpretation should keep them clearly in view.

Small N effects. With only three to five Witnesses, a single outlier rater has a large effect on aggregate scores. Cèrcol displays confidence intervals that widen as N decreases, and users should treat very small N aggregates with appropriate scepticism.
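To see why intervals widen as N shrinks, here is a textbook 95% confidence interval for a mean based on Student's t. How Cèrcol actually computes its intervals is not stated, so the method, the function name, and the example scores are all assumptions.

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided 95% critical values of Student's t, keyed by degrees of freedom (n - 1).
T_CRIT_95 = {2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
             6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262}


def mean_with_ci(scores):
    """Return (mean, half_width) of a 95% t-interval for 3 to 10 Witness scores."""
    n = len(scores)
    df = n - 1
    if df not in T_CRIT_95:
        raise ValueError("this sketch only covers 3 to 10 scores")
    half_width = T_CRIT_95[df] * stdev(scores) / sqrt(n)
    return mean(scores), half_width


if __name__ == "__main__":
    # Both samples have the same mean (50) and the same standard deviation (10).
    for scores in ([40, 50, 60], [40, 40, 50, 60, 60]):
        centre, half = mean_with_ci(scores)
        print(f"N={len(scores)}: {centre:.1f} +/- {half:.1f}")
```

With the spread held constant, the interval for three Witnesses is roughly twice as wide as the interval for five. Both the larger t critical value and the smaller square root of N contribute to the difference.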

Relationship effects. Witnesses selected by the subject are not a random sample of all colleagues. They may be biased toward people who know the subject well, people the subject trusts, or — if the subject has not thought carefully about selection — people likely to rate them favourably. Cèrcol guides subjects to select Witnesses from different relationship contexts (peer, collaborator, direct report, manager) to mitigate this.

Snapshot vs. pattern. The Witness scores reflect colleagues' impressions at a point in time. Personality is relatively stable, but reputation can lag behaviour change significantly — a person who has recently changed how they work may receive Witness scores that reflect older patterns.

No clinical use. Like all Big Five assessments, the Witness instrument is designed for personal and professional development in non-clinical settings. It is not a diagnostic tool and should not be used as such.

For the development application — including how to use Witness data in a first-quarter conversation — see using Cèrcol for team development: a practical guide. For the underlying personality science, the IPIP item pool is the open-access repository from which Cèrcol's self-report items are drawn.

"Your self-report tells you who you think you are at work. Your Witness scores tell you something closer to who your colleagues experience you as. The distance between those two things is one of the most useful development data points available."

Cèrcol is free and open source at cercol.team.

Try the Witness peer assessment

You've read about what the Witness measures — now use it. Go to cercol.team/instruments, complete your own profile, and invite three to five colleagues to rate you as Witnesses. The result is a side-by-side comparison of your self-perception and their experience of you — the gaps are where real development conversations start. It takes under fifteen minutes to set up, costs nothing, and the data is yours entirely.
