The grading rubric

How you’re judged.

Placed has four interview modes, each with its own rubric modeled on what real interviewers actually listen for. This is the full breakdown: the signal, the pitfalls, and exactly what the AI grader checks in each one.

Mode 01 · Classic algorithmic

Coding Interview

A timed session with an AI interviewer who probes your reasoning while you code. Default 25 minutes, and you can dial it from 15 up to 45 before you start if the problem looks harder. Five criteria scored 0–10 each, graded at a top-tier SWE hiring bar — well-written but behavior-empty answers are capped at 5.

15–45 min · 25 default · 0–10 scale · 5 criteria

Problem Understanding & Clarification

0–10

"Do they actually know what we're solving?" Top candidates restate the problem, nail down inputs/outputs, and surface edge cases before writing a single line of code.

Good signal

Clarifying questions up front
Pins down input ranges, output shape, constraints, and edge cases before coding.
Restates in their own words
Plays the problem back to confirm understanding.
Surfaces assumptions explicitly
"I'm assuming the array fits in memory — let me know if not."

Bad signal

Starts coding on first read
No clarification, visible misreading of the problem later.
Silent assumptions
Builds for one shape of input without checking.
Asks only when stuck
Clarifies mid-implementation when it's already costly.

Common pitfalls

·Skipping clarification on "easy-looking" problems — that's exactly where misreads hide.
·Asking clarifying questions but not listening to the answer.

What the AI grader actually checks

9-10 = Asked sharp clarifying questions about inputs, outputs, constraints, and edge cases BEFORE coding; restated assumptions. 7-8 = Asked meaningful clarifying questions up front. 5-6 = Some clarification but incomplete; started coding with unverified assumptions. 3-4 = Minimal clarification; visible misunderstandings. 0-2 = No clarification; coded from a misreading.

Problem Solving & Algorithm Design

0–10

"Can they get to a correct, efficient answer — and do they know why it's correct?" Pattern recognition, decomposition, and complexity awareness.

Good signal

Traces through examples first
Runs a small case by hand to expose structure before coding.
Names the pattern
"This is sliding window" / "DP on subsequences" — and defends the fit.
States and improves complexity
Starts from a brute force, optimizes with justification.
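
As an illustration of starting from a brute force and optimizing with justification, here is a minimal Python sketch on a hypothetical pair-sum problem. The problem and both function names are assumptions made for illustration; they are not taken from Placed's question bank.

```python
def has_pair_brute_force(nums: list[int], target: int) -> bool:
    """Baseline: check every pair. O(n^2) time, O(1) extra space."""
    for i in range(len(nums)):
        for j in range(i + 1, len(nums)):
            if nums[i] + nums[j] == target:
                return True
    return False


def has_pair_hashed(nums: list[int], target: int) -> bool:
    """Optimized: one pass with a set of the values seen so far.

    Average O(n) time, O(n) extra space. The justification: for each
    value we only need to know whether its complement appeared earlier.
    """
    seen: set[int] = set()
    for value in nums:
        if target - value in seen:
            return True
        seen.add(value)
    return False
```

Saying both bounds out loud, and why the set removes the inner loop, is the kind of reasoning this criterion describes.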

Bad signal

Jumps to code without a plan
Debugs into the answer instead of designing it.
Ignores edge cases
Empty input, duplicates, overflow — never surfaced.
Can't justify complexity
Writes nested loops without reasoning about cost.

Common pitfalls

·Forcing a memorized pattern onto a poorly-fitting problem.
·Declaring "O(n)" without accounting for hidden work (hash resize, inner sort).
·Optimizing prematurely before a correct baseline exists.
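
The "hidden work" pitfall is easiest to see in code that has only one visible loop. The sketch below uses a hypothetical anagram-grouping task (an assumption for illustration, as are the function names): the first version looks linear but sorts inside the loop, and the second removes the sort under a stated input assumption.

```python
from collections import defaultdict


def group_anagrams_sorting(words: list[str]) -> list[list[str]]:
    """One loop over n words, but sorted() costs O(k log k) per word.

    Total: O(n * k log k) for words of length up to k, not O(n).
    """
    groups: dict[str, list[str]] = defaultdict(list)
    for word in words:
        key = "".join(sorted(word))  # hidden work: a sort on every iteration
        groups[key].append(word)
    return list(groups.values())


def group_anagrams_counting(words: list[str]) -> list[list[str]]:
    """Same grouping with a letter-count key: O(n * k) overall.

    Assumes lowercase ASCII letters only (an assumption for this sketch).
    """
    groups: dict[tuple[int, ...], list[str]] = defaultdict(list)
    for word in words:
        counts = [0] * 26
        for ch in word:
            counts[ord(ch) - ord("a")] += 1
        groups[tuple(counts)].append(word)
    return list(groups.values())
```

Either version can be the right answer in an interview; what matters is quoting the cost that includes the work inside the loop.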

What the AI grader actually checks

9-10 = Optimal approach, articulated complexity, weighed trade-offs, decomposed cleanly. 7-8 = Correct approach with sound complexity reasoning. 5-6 = Works but not optimal; complexity reasoning thin. 3-4 = Brute-force only, or approach not clearly reasoned. 0-2 = Approach is incorrect or incoherent.

Technical Communication

0–10

"I can solve it in my head too — I need to hear how you think." Interviewers hire the person they want to pair with, not the one who silently produces an answer.

Good signal

Narrates intent before typing
States the approach and why it fits before writing code.
Responds precisely to probes
Answers the interviewer's actual question, not a nearby one.
Holds the thread when stuck
Keeps narrating even when the going gets hard.

Bad signal

Heads-down silent coding
Ten minutes pass without a word — nothing to evaluate.
Monologuing
Talks at the interviewer instead of inviting collaboration.
Surprise pivots
Changes approach mid-solution without explaining why.

Common pitfalls

·Going silent the moment things get hard — that's exactly when narration matters most.
·Vague words ("it works", "something like this") instead of precise ones.
·Arguing with interviewer push-back instead of probing their concern.

What the AI grader actually checks

9-10 = Narrated intent continuously, responded precisely to probes, checked in at decision points, held the thread when stuck. 7-8 = Thinking mostly visible. 5-6 = Intermittent narration; interviewer had to probe. 3-4 = Mostly silent coding. 0-2 = No useful verbal signal; intent opaque.

Code Quality & Craft

0–10

"Would I want to review this in a PR?" The final artifact matters — variable names, structure, and the ugly cases are all signal.

Good signal

Readable names and structure
Short functions, intention-revealing names, no mystery vars.
Idiomatic
Uses the language's strengths rather than fighting them.
Handles edge cases explicitly
Null / empty / boundary inputs are addressed.

Bad signal

Copy-paste duplication
Same three lines appear four times with no extraction.
Magic numbers and cryptic vars
`i`, `j`, `tmp`, `x2` with no context.
Off-by-ones left in
Submits without tracing boundaries.
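
To make the naming signals concrete, here is a small before-and-after sketch in Python. The task (filtering session durations) and every identifier in it are hypothetical, chosen only to contrast cryptic names and a magic number with intention-revealing ones.

```python
# Before: cryptic names and a magic number.
def f(x):
    tmp = []
    for i in x:
        if i > 86400:
            tmp.append(i)
    return tmp


# After: the constant and the names carry the intent.
SECONDS_PER_DAY = 86_400


def sessions_longer_than_a_day(durations_seconds: list[int]) -> list[int]:
    """Return the durations that exceed one day, preserving input order."""
    return [
        duration
        for duration in durations_seconds
        if duration > SECONDS_PER_DAY
    ]
```

Both functions return the same result; only the second can be reviewed without asking what `x` and `86400` mean.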

Common pitfalls

·Optimizing for brevity at the cost of clarity (clever one-liners).
·Leaving debug `console.log`s or commented-out code in the final submission.
·Forgetting to return, or returning the wrong thing.

What the AI grader actually checks

9-10 = Readable, well-structured, idiomatic, no dead code, naming carries intent. 7-8 = Solid with minor stylistic slips. 5-6 = Works but messy — a reviewer would call it out. 3-4 = Quality problems hurt readability. 0-2 = Barely-working code, or scaffolding without substance.

Testing & Verification

0–10

"Show me you don't just trust your own code." The single most common missing signal in 2025–26 SWE loops. Interviewers want to watch you walk through your own solution.

Good signal

Dry-runs example inputs
Traces happy path and at least one edge case line-by-line.
Proactive edge-case hunt
Names empty / single / duplicate / overflow cases unprompted.
Debugs methodically
When something breaks, isolates cause instead of guess-and-check.
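
A proactive edge-case hunt can be as simple as naming each case and checking it. The following is a minimal Python sketch on a hypothetical leftmost-match binary search; the function and the specific cases are assumptions for illustration, not a prescribed checklist.

```python
def first_index_of(sorted_values: list[int], target: int) -> int:
    """Return the leftmost index of target in a sorted list, or -1 if absent."""
    low, high = 0, len(sorted_values)  # search the half-open range [low, high)
    while low < high:
        # Python integers do not overflow; in fixed-width languages this
        # midpoint calculation is a classic overflow case worth naming.
        mid = (low + high) // 2
        if sorted_values[mid] < target:
            low = mid + 1
        else:
            high = mid
    if low < len(sorted_values) and sorted_values[low] == target:
        return low
    return -1


# Edge cases named out loud, then checked rather than assumed.
assert first_index_of([], 5) == -1              # empty input
assert first_index_of([5], 5) == 0              # single element, present
assert first_index_of([4], 5) == -1             # single element, absent
assert first_index_of([1, 5, 5, 5, 9], 5) == 1  # duplicates: leftmost wins
assert first_index_of([1, 3, 5], 1) == 0        # lower boundary
assert first_index_of([1, 3, 5], 5) == 2        # upper boundary
assert first_index_of([1, 3, 5], 7) == -1       # larger than every element
```

In a session without a runner, the same list works as a spoken dry-run: trace `low`, `high`, and `mid` by hand for the duplicates case.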

Bad signal

Declares "done" without walking through it
Hands in code they haven't verified.
Confuses "compiles" with "correct"
No real trace, just absence of syntax errors.
Only tests happy path
Boundary cases untouched.

Common pitfalls

·Running the code but not reading what it actually output.
·Claiming correctness confidently without evidence.

What the AI grader actually checks

9-10 = Proactively traced examples, identified and handled edge cases, debugged methodically. 7-8 = Traced at least one example end-to-end, considered at least one edge case. 5-6 = Some verification, mostly prompted. 3-4 = No real testing. 0-2 = Did not verify; made confident claims without evidence.

Overall rating thresholds

Insufficient · overall < 4.0
Developing · overall 4.0–6.4
Proficient · overall 6.5–8.4
Exceptional · overall ≥ 8.5

Your final rating is the mean of your per-criterion scores (0–10), rounded to one decimal.
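
The arithmetic above can be sketched in a few lines of Python. The function name and the use of Python's built-in `round()` are assumptions; the page does not say how exact ties are rounded, so treat behavior right at a band boundary as unspecified.

```python
def overall_rating(criterion_scores: list[float]) -> tuple[float, str]:
    """Return (overall, band) from per-criterion scores on the 0-10 scale."""
    if not criterion_scores:
        raise ValueError("at least one criterion score is required")
    # Assumption: built-in round(); the rubric does not specify tie handling.
    overall = round(sum(criterion_scores) / len(criterion_scores), 1)
    if overall >= 8.5:
        band = "Exceptional"
    elif overall >= 6.5:
        band = "Proficient"
    elif overall >= 4.0:
        band = "Developing"
    else:
        band = "Insufficient"
    return overall, band
```

For example, five criterion scores of 7, 8, 6, 9, and 7 average to 7.4, which lands in Proficient. The same calculation applies to modes with six, seven, or eight criteria.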

Mode 02 · Code with AI, graded on how you use it

GenAI Coding

You solve a problem using an AI assistant. Default 25 minutes, dialable from 15 up to 45 before you start. The grader isn't just checking the output — it's scoring how you prompt, validate, push back, and take ownership. Six criteria scored 0–10, with hard caps for observable behaviors like skipping tests or submitting AI code verbatim.

15–45 min · 25 default · 0–10 scale · 6 criteria

Problem Understanding

0–10

"Did they frame the problem for the AI, or dump it and hope?" The context you provide is the ceiling on what the AI can give you back.

Good signal

Precise framing
Names inputs, outputs, constraints, edge cases in the first prompt.
Verifies AI understanding
Asks the AI to restate or confirm before producing a solution.
Shares relevant context
Provides calling code, types, or domain detail the AI needs.

Bad signal

Dumps the raw problem
Pastes the prompt verbatim with no framing.
Lets AI assume
Accepts whatever interpretation the AI lands on.
No context provided
Asks about a function with no surrounding code.

Common pitfalls

·Assuming the AI has read your mind.
·Correcting misunderstanding prompt-by-prompt instead of front-loading clarity.

What the AI grader actually checks

Context built across multiple turns counts as much as up-front framing. 9-10 = Framed the problem with precise inputs, outputs, constraints, edge cases (front-loaded OR built up coherently across turns); verified AI interpretation. 7-8 = Gave enough context to reason correctly. 5-6 = Partial context; relied on AI to fill gaps. 3-4 = Dumped the raw problem, or surfaced context too late to matter. 0-2 = No framing.

Prompt Quality

0–10

Modern teams pair with AI constantly — the question is whether you drive it or it drives you. Precise, contextual, iterative prompts separate practitioners from passengers.

Good signal

Specific, contextual prompts
Includes constraints, shape, expected behavior, relevant code.
Iterates with intent
Follow-ups target specific issues, not "try again".
Scopes the ask
Breaks big problems into smaller prompts.

Bad signal

One-shot "solve this"
Hopes for the best.
Vague follow-ups
"This is wrong, fix it" — without saying what.
No context provided
Asks about a function without sharing the caller.

Common pitfalls

·Treating the AI like a search engine rather than a collaborator.
·Not giving constraints — letting the AI invent APIs or assumptions.
·Over-deferring: ignoring your own domain knowledge.

What the AI grader actually checks

Iterative refinement across turns is a strength, not a weakness — late-turn prompts that advance the problem are not "ad hoc". 9-10 = Prompts specific, purposeful, well-contextualized; iterated with precision (front-loaded or turn-by-turn). 7-8 = Good prompts with minor gaps. 5-6 = Workable but generic or under-specified. 3-4 = One-shot or shallow. 0-2 = Almost no prompting, or prompts that don't advance the problem. (Hard cap: <3 prompts in the session → max 5.)

Output Validation

0–10

AI code looks confident even when it's wrong. Trust-but-verify is the single biggest skill separating senior from junior AI-augmented engineers.

Good signal

Actually runs the code
Executes, tests sample inputs, checks the output.
Probes edge cases
Empty, large, unicode, overflow — tested before trusted.
Questions suspicious claims
Pushes back when the AI asserts something implausible.

Bad signal

Blind copy-paste
Submits without running it.
Confuses "compiles" with "correct"
Absence of errors ≠ proof of correctness.
Skips edge cases
Happy path works — done.

Common pitfalls

·Hallucinated APIs that look real (fake signatures, wrong library names).
·Plausible-but-wrong explanations that sound authoritative.

What the AI grader actually checks

9-10 = Ran the code, tested edge cases, challenged suspicious output, probed for errors proactively. 7-8 = Ran the code, at least one meaningful check. 5-6 = Happy-path only. 3-4 = Did not test rigorously. 0-2 = Never ran or verified the code. (Hard cap: didn't run code → max 3. No edge case probed → max 6.)
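
One way to read the two hard caps is as ceilings applied on top of whatever score the anchors would otherwise give. The sketch below is an assumption about how they combine (lowest applicable ceiling wins); the function name, parameters, and the idea of a separate "raw" score are hypothetical and not confirmed by the page.

```python
def capped_output_validation_score(
    raw_score: int, ran_code: bool, probed_edge_case: bool
) -> int:
    """Apply the two Output Validation caps to a raw 0-10 score."""
    ceiling = 10
    if not ran_code:
        ceiling = min(ceiling, 3)  # "didn't run code -> max 3"
    if not probed_edge_case:
        ceiling = min(ceiling, 6)  # "No edge case probed -> max 6"
    return min(raw_score, ceiling)
```

Under this reading, a session that would otherwise score 9 drops to 6 if no edge case was probed, and to 3 if the code was never run at all.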

Code Quality & Craft

0–10

"Is the final submission something you'd be happy to ship?" Using AI doesn't absolve you of craft — graders still score the artifact.

Good signal

Readable, well-structured
Idiomatic, appropriate for the problem, no dead code.
Consistent style
AI-generated sections refactored to match your codebase.
Handles edges
Null / empty / boundary inputs addressed.

Bad signal

Patchwork styles
Different functions solve the same problem differently because the AI happened to.
Dead or commented-out code
Left in from earlier AI iterations.
Cryptic names
AI placeholders like `tmp`, `x`, `data` never renamed.

Common pitfalls

·Accepting overengineered patterns the AI reaches for by default.
·Shipping code that runs but you'd be embarrassed to review.

What the AI grader actually checks

9-10 = Readable, well-structured, idiomatic, appropriate; no dead code. 7-8 = Solid with minor style issues. 5-6 = Works but messy — a reviewer would flag readability. 3-4 = Poor structure, confusing naming. 0-2 = Barely functional or structurally broken.

Human Judgment

0–10

The AI is fluent — you're the one with taste and context. Graders look for edits, overrides, and "no, do it this way" moments that show you're steering.

Good signal

Overrides AI suggestions
Rewrites sections to match your conventions or a better approach.
Rejects bad ideas
"That's O(n²), use a hash map" — and explains why.
Adapts to real constraints
Customizes generic AI code to the problem's requirements.

Bad signal

Rubber-stamps everything
Accepts every suggestion verbatim.
Ping-pongs with the AI
Changes approach every time the AI does.
No independent edits
Final code is 100% AI output, untouched.

Common pitfalls

·Deferring on decisions that require your domain or codebase knowledge.
·Losing coherence across functions because you accepted whatever came back.

What the AI grader actually checks

9-10 = Pushed back, customized, overrode clearly-wrong decisions, made independent calls. 7-8 = Modified AI output meaningfully. 5-6 = Some edits; judgment narrow. 3-4 = Largely accepted without scrutiny. 0-2 = Pure rubber-stamping. (Hard cap: submitted AI code verbatim → max 3.)

Accountability & Understanding

0–10

"Walk me through this code" — you wrote it, even if the AI drafted it. If you can't defend it, you don't own it.

Good signal

Explains every section
Articulates why each part exists and what it does.
Engages with AI explanations
Reads the reasoning, asks clarifying questions.
Documents decisions
Leaves comments on why, not just what.

Bad signal

"The AI wrote it"
Can't explain a section when asked.
Skips the explanations
Never reads the AI's reasoning.
Mystery blocks
Entire sections the user doesn't understand.

Common pitfalls

·Treating AI output as black-box magic.
·Mistaking "it runs" for "I understand it".

What the AI grader actually checks

9-10 = Clear ownership — can explain every choice, asked clarifying questions. 7-8 = Mostly understands what they submitted. 5-6 = Surface-level; some parts are black boxes. 3-4 = Does not engage enough to claim ownership. 0-2 = Submitted code the candidate clearly does not understand. (Hard cap: submitted AI code verbatim → max 4.)

Overall rating thresholds

Insufficient · overall < 4.0
Developing · overall 4.0–6.4
Proficient · overall 6.5–8.4
Exceptional · overall ≥ 8.5

Your final rating is the mean of your per-criterion scores (0–10), rounded to one decimal.

Mode 03 · Behavioral assessment: how you think about AI

GenAI Fluency

Seven behavioral criteria scored 0–10 each. This isn't about code — it's about the judgment, ethics, and communication skills hiring managers probe when deciding if you can be trusted with AI tools at work. Hypothetical "I would" answers are capped hard.

Open-ended · 0–10 per criterion · 7 criteria

Specificity & Concreteness

0–10

"Tell me about a time you…" — the signal is a specific tool, task, project, and outcome. Vagueness reads as inexperience.

Good signal

Names the tool
"Copilot in VS Code", "Claude via the API" — not "an AI".
Names the project
Concrete artifact, timeframe, outcome.
Verifiable detail
The story could be checked against a commit history or PR.

Bad signal

Hypothetical framing
"I would use AI to…" instead of "I used AI to…"
Generic "we used AI"
No tool, no task, no outcome.
Could apply to anyone
No personal fingerprint on the story.

Common pitfalls

·Hiding behind "we" instead of saying what you did.
·Picking a tiny example when a bigger one exists.

What the AI grader actually checks

9-10 = Names specific tool, task, project, verifiable outcome. 7-8 = Mostly specific, one key detail missing. 5-6 = Generic but names at least a tool or context. 3-4 = Vague; no named artifacts. 0-2 = Entirely hypothetical. (Hard cap: no tool named → max 3. No project/artifact → max 4.)

GenAI Literacy

0–10

Can you explain what the tool actually does, and where it fails? Understanding hallucination, prompt sensitivity, and variability is table stakes.

Good signal

Knows the failure modes
Hallucination, prompt sensitivity, stale training, nondeterminism.
Uses correct terminology
Distinguishes completion from chat, RAG from fine-tuning.
Maps tools to tasks
Knows which tools suit which jobs and why.

Bad signal

Treats it as magic
No mental model for why it fails.
Conflates AI with search
Expects deterministic, factual output.
One tool for everything
Never evaluated alternatives.

Common pitfalls

·Parroting marketing copy instead of speaking from experience.
·Overclaiming deep knowledge of training dynamics.

What the AI grader actually checks

9-10 = Clear understanding, aware of hallucinations, prompt sensitivity, variability. 7-8 = Practical understanding, no deeper technical awareness. 5-6 = Treats GenAI tools as black boxes. 3-4 = Holds misconceptions. 0-2 = Conflates GenAI with deterministic software. (Hard cap: no failure mode mentioned → max 5.)

Critical Thinking & Evaluation

0–10

Do you have a review process? Proactive verification separates someone who ships AI-assisted work from someone who ships AI slop.

Good signal

Has a review ritual
Runs, tests, reads — every time, not occasionally.
Caught real errors
Specific time they caught a hallucination before it shipped.
Iterates on prompts based on quality
Treats low-quality output as feedback.

Bad signal

Gut-feel review
"If it looks right, it probably is."
Only reviews on complaint
Reactive, not proactive.
Takes at face value
No mental check between read and ship.

Common pitfalls

·Claiming to "always check" without describing how.
·Over-trusting output from a familiar tool.

What the AI grader actually checks

9-10 = Clear review process, caught errors proactively, iterates based on quality. 7-8 = Reviews but process informal. 5-6 = Occasional review, mostly gut-feel. 3-4 = Accepts with minimal scrutiny. 0-2 = Takes AI content at face value. (Hard cap: no failure mode mentioned → max 5.)

Judgment & Risk Awareness

0–10

When NOT to use AI is as important as when to. Teams need people who weigh downstream risk, not enthusiasts who reach for AI reflexively.

Good signal

Explicit decision criteria
"I use AI for X and Y, but not for Z because…"
Has said no
Specific time they chose not to use AI for a task.
Thinks downstream
Considers who else is affected by the output.

Bad signal

AI as default
Starts every task by opening the AI tool.
Reactive risk awareness
Only thinks about risk after a problem.
No holding back
Has never declined to use AI.

Common pitfalls

·Equating "no AI" with "against AI" — interviewers hear the nuance.
·Conflating risk awareness with fear.

What the AI grader actually checks

9-10 = Articulates decision criteria, has examples of choosing NOT to use GenAI. 7-8 = Aware of risks, applies judgment reactively. 5-6 = Uses GenAI broadly without strong filtering. 3-4 = Weak risk awareness. 0-2 = Views GenAI as a default solution. (Hard cap: no "chose NOT to" example → max 6.)

Responsibility & Ethics

0–10

Data privacy, bias, IP, compliance, human accountability — hiring managers want people whose ethical instincts are proactive, not policy-driven.

Good signal

Concrete responsible decisions
Redacted data before a prompt, or flagged a bias issue.
Multiple dimensions
Privacy AND IP AND bias — not just one.
Owns accountability
Doesn't hide behind "the AI decided".

Bad signal

Ethics only when prompted
Doesn't bring it up unless asked.
Delegates to policy
"Compliance handles that" — no personal framework.
Unaware of risks
No mention of privacy, bias, or IP.

Common pitfalls

·Reciting buzzwords without concrete examples.
·Treating ethics as a blocker instead of a design constraint.

What the AI grader actually checks

9-10 = Proactively considered privacy, bias, IP, compliance, or accountability with concrete examples. 7-8 = Aware, relies on org policy. 5-6 = Mentions ethics only when prompted. 3-4 = No concrete examples. 0-2 = No awareness.

Learning Agility

0–10

AI tooling changes monthly. Candidates who experiment and iterate beat candidates who plateaued on the first tool that worked.

Good signal

Deliberate experimentation
Tries new tools, new techniques, with clear hypotheses.
Learns from failures
Specific story of a failure that changed how they work.
Skills visibly evolving
Can describe what they do now that they didn't six months ago.

Bad signal

Same workflow as day one
No evolution in approach.
Passive learning
Only improves when mistakes force it.
No curiosity
Hasn't explored beyond defaults.

Common pitfalls

·Conflating tool-switching with learning.
·Learning trivia (model names) without practical change.

What the AI grader actually checks

9-10 = Deliberate experimentation, iteration, learning from failures; skills evolving. 7-8 = Has improved, mostly reactively. 5-6 = Same way they started. 3-4 = Little evidence of growth. 0-2 = No curiosity.

Communication & Influence

0–10

Can you translate AI capabilities to non-technical stakeholders, and navigate skeptics? Influence is the multiplier on every other skill here.

Good signal

Audience calibration
Explains AI differently to execs, engineers, end users.
Has shifted minds
Brought a skeptic or zealot to a balanced view.
Clear, jargon-free writing
Describes what AI did without impenetrable vocabulary.

Bad signal

One-mode communicator
Same jargon for every audience.
Can't say why
Describes what but not the reasoning.
Never navigated resistance
No examples with skeptical colleagues.

Common pitfalls

·Overloading technical detail to sound credible.
·Dismissing skeptics instead of engaging.

What the AI grader actually checks

9-10 = Calibrates message to different audiences; has influenced others. 7-8 = Communicates clearly, no significant resistance navigated. 5-6 = Can explain what, struggles with why. 3-4 = Jargon-heavy. 0-2 = Communication unclear throughout.

Automatic red flags

Hit any of these during the assessment and they’re flagged in your report regardless of score:

RF1 · Candidate has never used a GenAI tool in a real work or personal context
RF2 · Candidate shows no awareness that GenAI can produce inaccurate or misleading output
RF3 · Candidate has never considered data privacy when using GenAI tools
RF4 · Candidate describes using GenAI to generate work submitted as their own without any review
RF5 · Candidate expresses blanket refusal to use GenAI tools without a reasoned explanation

Overall rating thresholds

Insufficient · overall < 4.0
Developing · overall 4.0–6.4
Proficient · overall 6.5–8.4
Exceptional · overall ≥ 8.5

Your final rating is the mean of your per-criterion scores (0–10), rounded to one decimal.

Mode 04 · Classic SWE "tell me about a time"

Behavioral Interview

Two STAR-format questions drawn from publicly-reported behavioral rounds at Google, Meta, Amazon, Stripe, and others. Eight criteria scored 0–10, grader calibrated to the bar a real hiring manager would set. Hypothetical answers are capped at 3.

~15 min · 0–10 per criterion · 8 criteria

Specificity & Evidence

0–10

"Is this a story that actually happened?" Specific tool, artifact, person, timeframe, and verifiable outcome — the #1 thing hiring managers listen for. Hypothetical answers get capped at 3.

Good signal

Names the artifact
Specific project, tool, commit, document, meeting.
Names the people
Team composition, stakeholder role — not "someone".
Verifiable timeframe
"Q2 2024" — not "a while back".

Bad signal

"I would"
Hypothetical or aspirational framing.
No artifact named
The story could apply to any project.
No timeframe
"At some point" — impossible to ground.

Common pitfalls

·Over-sanitizing the story until no specifics remain.
·Keeping the story vague "to protect confidentiality" when a general summary would be fine.

What the AI grader actually checks

9-10 = Names specific tool, artifact, project, person, timeframe; outcome is verifiable. 7-8 = Concrete example, one detail missing. 5-6 = Generic but attempts named specifics. 3-4 = Vague; no named artifacts. 0-2 = Entirely hypothetical.

Ownership & Accountability

0–10

"Who actually did the work?" Interviewers listen for clear "I" framing — what you personally did, decided, and owned, not what the team did around you.

Good signal

Clear "I" framing
"I proposed…", "I decided…", "I shipped…" — specific to you.
Owns the outcome
Good or bad — takes responsibility without shifting blame.
Names what was at stake
Explains why the decision mattered and who it affected.

Bad signal

Heavy "we" framing
"We built…", "We decided…" — impossible to tell what you did.
Passive voice
"It was decided that…" — distances you from the call.
Deflects blame
When things went wrong, it was someone else's fault.

Common pitfalls

·Overclaiming — taking credit for things a teammate drove.
·Undersharing — hiding behind the team when you actually led.
·Confusing responsibility with authority.

What the AI grader actually checks

9-10 = Uses "I" correctly, names what they did, owns outcomes. 7-8 = Mostly owns it with occasional drift to "we". 5-6 = Heavy "we" framing. 3-4 = Attributes credit or blame away. 0-2 = Deflects entirely. (Hard cap: heavy "we" → max 4.)

Dealing with Ambiguity

0–10

"What did you do before the path was clear?" Senior engineers scope, decide, and adjust; less-experienced ones wait for someone to tell them what to do.

Good signal

Scoped the problem
Broke an unclear task into questions, assumptions, a first bet.
Decided with incomplete info
Picked a path and named the risk.
Adjusted on signal
Changed direction when new info came in.

Bad signal

Waited for direction
"I asked my manager what to do" — no independent thinking.
Oversimplified
Collapsed the ambiguity by ignoring the hard parts.
False certainty
Projected confidence they didn't have.

Common pitfalls

·Framing "I asked for clarity" as the full answer — it's the setup, not the story.
·Analysis paralysis dressed up as "being thorough".

What the AI grader actually checks

9-10 = Scoped, chose a path, communicated reasoning, adjusted on signal. 7-8 = Navigated competently but waited for clarity. 5-6 = Navigated with visible friction. 3-4 = Stalled or oversimplified. 0-2 = Collapsed under uncertainty.

Collaboration & Influence

0–10

"How did you move people who don't report to you?" Senior signal is earned influence — trust, evidence, well-framed arguments — not structural authority.

Good signal

Influenced without authority
Moved a peer team or skeptical stakeholder via reasoning.
Named the other perspective
Can articulate what the other person was worried about.
Adapted their approach
Changed how they communicated based on the audience.

Bad signal

Escalated to win
"I told my manager" — a decision by authority.
Parallel play
Worked alongside others without actually collaborating.
Worked around people
Avoided the hard conversation.

Common pitfalls

·Confusing "got buy-in" with "told everyone the plan".
·Over-crediting yourself for team-driven work.

What the AI grader actually checks

9-10 = Moved others through clarity/trust/evidence without authority. 7-8 = Earned but narrow influence. 5-6 = Parallel play. 3-4 = Influence via authority/escalation only. 0-2 = No collaboration evidence.

Conflict & Feedback

0–10

"Did you engage, or did you avoid?" Strong candidates have had real disagreements, handled them directly, and walked away with intact relationships.

Good signal

Engaged directly
Had the hard conversation.
Separated idea from person
Disagreed strongly without making it personal.
Updated when warranted
Changed their mind when the other side had a point.

Bad signal

Avoided
"We agreed to disagree" with no actual engagement.
"Won" at cost
Got their way, but burned a bridge.
No real disagreement
Story is conflict-adjacent, not contentious.

Common pitfalls

·Describing a disagreement where you were obviously right — unconvincing.
·Confusing "being heard" with "being right".

What the AI grader actually checks

9-10 = Engaged directly, separated idea from person, updated when warranted, relationship preserved. 7-8 = Slight over-accommodation or grudge. 5-6 = Avoided some of the hard conversation. 3-4 = Avoided or escalated damagingly. 0-2 = No real conflict or handled so poorly it's a red flag.

Impact & Outcomes

0–10

"What changed because of you specifically?" Activity ≠ impact. Interviewers want numbers, users, time saved — and a clear line back to your contribution.

Good signal

Concrete numbers
"Cut latency 40%", "saved 12 hours a week".
Traced back to you
Clear line between the work and the outcome.
Owned the follow-through
Stuck around to measure and iterate.

Bad signal

Activity without outcome
"We launched it" — no metric attached.
Fuzzy attribution
Big outcome, unclear what you did.
Hypothetical impact
"It would save X" — never measured.

Common pitfalls

·Vanity metrics (commits) instead of outcome metrics.
·Claiming impact for something you shipped but never measured.

What the AI grader actually checks

9-10 = Concrete, measurable outcome tied to specific contribution. 7-8 = Clear outcome, mostly clear attribution. 5-6 = Fuzzy metrics or unclear attribution. 3-4 = Activity, not outcome. 0-2 = No outcome, or hypothetical. (Hard cap: no measurable outcome → max 4.)

Self-Awareness & Growth

0–10

"What are you actually bad at?" Credible self-awareness is specific, non-performative, and paired with evidence of actual change.

Good signal

Specific real weakness
Genuine blindspot or gap.
Named what changed
Concrete behavior change or habit shift.
Evidence it stuck
Recent situation where the change showed.

Bad signal

Humblebrag weakness
"I care too much".
Generic growth
"I learned to communicate better" — no specifics.
Defensive posture
Minimizes or justifies the weakness.

Common pitfalls

·Picking a weakness that's obviously a strength.
·Naming a weakness but showing no evidence it's changed.

What the AI grader actually checks

9-10 = Names specific weakness, what they changed, how they know it stuck. 7-8 = Soft evidence. 5-6 = Surface-level or mildly defensive. 3-4 = Humblebrag or deflected. 0-2 = No self-awareness visible.

Judgment in Context

0–10

"How do you decide when it's hard?" Interviewers want to see you weigh real trade-offs, pick pragmatically, and own the reasoning — not retrofit a good outcome.

Good signal

Weighed real alternatives
Names what they considered and rejected, and why.
Pragmatic over pure
Picked the right call for the situation, not the textbook answer.
Would make the same call
Can defend, not just rationalize.

Bad signal

Single rule applied reflexively
"I always…" — with no weighing.
Post-hoc rationalization
Suspiciously clean in hindsight.
Can't articulate why
Knows what they chose, can't explain the trade-off.

Common pitfalls

·Picking the "elegant" answer when the situation called for the ugly one — and pretending it was fine.
·Describing the good outcome as evidence the decision was good.

What the AI grader actually checks

9-10 = Weighed trade-offs, named rejected alternatives, chose pragmatically. 7-8 = Thin trade-off reasoning. 5-6 = One rule without weighing alternatives. 3-4 = Poor decision, or cannot articulate why. 0-2 = No judgment visible. (Hard cap: no rejected alternative → max 5.)

Overall rating thresholds

Insufficient · overall < 4.0
Developing · overall 4.0–6.4
Proficient · overall 6.5–8.4
Exceptional · overall ≥ 8.5

Your final rating is the mean of your per-criterion scores (0–10), rounded to one decimal.

The canonical rubric text on each criterion mirrors the exact anchors used by the AI grader in production. If the grader changes, this page changes.