```mermaid
flowchart TB
E{Evaluation in<br/>Higher Education} --> A[Elements & Types]
E --> B[CBCS Evaluation]
E --> C[Computer-Based Testing]
E --> D[Innovations<br/>Authentic · Rubrics · OBE]
classDef default fill:#003366,color:#ffffff,stroke:#ffcc00,stroke-width:3px,rx:10px,ry:10px;
```
7 Evaluation Systems: Elements and Types of evaluation, Evaluation in Choice Based Credit System in Higher education, Computer based testing, Innovations in evaluation systems
7.1 The Four Examined Sub-topics
The syllabus carves this topic into four examined heads:
- Elements and types of evaluation
- Choice Based Credit System (CBCS) evaluation
- Computer-Based Testing (CBT)
- Innovations in evaluation systems
The two most frequent PYQ patterns are (a) distinguishing measurement vs assessment vs evaluation, and (b) matching type of evaluation (formative/summative/diagnostic/placement) with its purpose.
7.2 Three Layers — Measurement, Assessment, Evaluation
These three terms are routinely confused in PYQs. The classical distinction:
| Layer | What it does | Output | Example |
|---|---|---|---|
| Measurement | Assigns a number to a property | Numeric score | Raw test marks (62/100) |
| Assessment | Gathers and interprets evidence of learning | Description of performance | Performance grade, feedback note |
| Evaluation | Judges worth, makes a decision | Value judgement, decision | Pass/fail, promote, certify |
A common chain: measurement → assessment → evaluation. Measurement is quantitative; assessment can be quantitative or qualitative; evaluation is interpretive and decision-oriented.
7.2.1 The Three Domains of Educational Outcomes
| Domain | Lead theorist | Levels (highest first) | Tool |
|---|---|---|---|
| Cognitive | Bloom 1956 / Anderson-Krathwohl 2001 | Create · Evaluate · Analyse · Apply · Understand · Remember | Written test, viva |
| Affective | Krathwohl 1964 | Characterising · Organising · Valuing · Responding · Receiving | Attitude scale, observation |
| Psychomotor | Simpson 1972 / Dave 1970 / Harrow 1972 | Simpson's seven levels: Origination · Adaptation · Complex overt response · Mechanism · Guided response · Set · Perception | Performance test, checklist |
7.3 Elements of Evaluation
A complete evaluation system has four interlocking elements:
- Objectives — what the learner should know, do, value (stated in Bloom verbs).
- Learning experiences — methods, materials, activities.
- Evaluation procedures — tools, tests, rubrics.
- Feedback / reform — using results to revise.
This is the Tyler model (1949) — Ralph W. Tyler in Basic Principles of Curriculum and Instruction. Tyler is the “father of curriculum evaluation”.
7.3.1 Essential Qualities of a Good Evaluation Tool
| Quality | What it means | Worked example |
|---|---|---|
| Validity | Measures what it claims to measure | A statistics test that tests statistics — not language |
| Reliability | Consistent across raters / occasions | Two graders give same essay similar marks |
| Objectivity | Free from grader bias | MCQ, machine-marked |
| Practicability | Feasible to administer | Time, cost, infrastructure realistic |
| Discrimination | Separates high from low achievers | Item-analysis discrimination index |
| Comprehensiveness | Covers full content & all Bloom levels | Blueprint with table of specifications |
Validity types: content, construct, criterion (concurrent & predictive), face. Reliability methods: test-retest, parallel-form, split-half, Kuder-Richardson (KR-20, KR-21), Cronbach’s α.
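The internal-consistency methods listed above can be sketched in a few lines. This is a minimal illustration, assuming right/wrong (0/1) item scores; the function names and the `responses` matrix are hypothetical and not taken from any syllabus document.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a matrix of item scores.

    scores: list of rows, one row per examinee, one column per item.
    """
    k = len(scores[0])                                   # number of items
    item_var_sum = sum(pvariance(col) for col in zip(*scores))
    total_var = pvariance([sum(row) for row in scores])  # variance of total scores
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


def kr20(scores):
    """KR-20 for 0/1 items: each item's variance is p * q."""
    k = len(scores[0])
    n = len(scores)
    pq_sum = 0.0
    for col in zip(*scores):
        p = sum(col) / n          # proportion answering the item correctly
        pq_sum += p * (1 - p)
    # p * q is a population variance, so the total variance must be one too.
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - pq_sum / total_var)


# Hypothetical data: 5 examinees x 4 items (1 = correct, 0 = wrong).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

print(round(cronbach_alpha(responses), 3))
print(round(kr20(responses), 3))
```

For 0/1 items the two functions give the same value, which is why KR-20 is usually described as the special case of Cronbach's α for dichotomous items.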
7.4 Types of Evaluation — The PYQ-Heavy Section
7.4.1 By Purpose
| Type | When | What it tells | Theorist |
|---|---|---|---|
| Placement | Before instruction | Where to place the learner in a sequence | — |
| Diagnostic | Before / during instruction | Specific weaknesses, learning difficulties | — |
| Formative | During instruction | Where the learner is now — for improving learning | Michael Scriven, 1967 |
| Summative | At end of instruction | Whether the learner has achieved — for judging | Michael Scriven, 1967 |
The formative/summative distinction is Michael Scriven’s, in The Methodology of Evaluation (1967). Memory cue: formative = forming (in progress); summative = sum (at the end).
7.4.2 By Reference
| Type | Compares learner to | Use |
|---|---|---|
| Norm-Referenced Test (NRT) | Other learners in the group | Ranking, selection (NEET, JEE) |
| Criterion-Referenced Test (CRT) | A pre-defined standard | Mastery, certification (driving licence) |
Robert Glaser (1963) coined both terms.
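The difference between the two reference frames shows up clearly when the same raw score is interpreted both ways. The sketch below is illustrative only: the group scores, the learner's score and the cutoff of 70 are all hypothetical.

```python
def percentile_rank(score, group_scores):
    """Norm-referenced view: percentage of the group scoring strictly below."""
    below = sum(1 for s in group_scores if s < score)
    return 100.0 * below / len(group_scores)


def has_mastered(score, cutoff):
    """Criterion-referenced view: compare with a fixed standard only."""
    return score >= cutoff


# Hypothetical class of ten learners and one learner's raw score.
group_scores = [35, 48, 52, 60, 62, 71, 75, 80, 88, 93]
learner_score = 62

print(percentile_rank(learner_score, group_scores))  # 40.0
print(has_mastered(learner_score, cutoff=70))        # False
```

The NRT reading changes whenever the comparison group changes; the CRT reading changes only if the standard itself is revised.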
7.4.3 By Approach / Mode
- Internal (by the institution) vs External (by an outside body).
- Continuous (CCE-style) vs Terminal (one-shot exam).
- Oral (viva, interview) vs Written vs Practical.
- Subjective (essay) vs Objective (MCQ).
- Process (how the learner solves) vs Product (what the learner produces).
7.4.4 The Big Four Tests — Achievement, Aptitude, Intelligence, Personality
| Test | What it measures | Examples |
|---|---|---|
| Achievement | What has been learned | School exam, UGC NET |
| Aptitude | Potential to learn / perform | GRE, JEE Advanced |
| Intelligence | General cognitive ability | Stanford-Binet, WAIS, Raven’s |
| Personality | Traits, dispositions | MMPI, NEO-PI, 16-PF |
7.5 Choice-Based Credit System (CBCS) Evaluation
7.5.1 Concept and Origin
CBCS is a credit-based, semester-based, learner-choice framework adopted by UGC in 2015 for Indian HEIs (implementation document: Choice Based Credit System, UGC, 2015). It replaces year-end-only summative evaluation with semesterised, continuous, multi-component assessment.
- Core Courses (CC) — compulsory in the discipline.
- Elective Courses (EC) — chosen by learner; Discipline-Specific Elective (DSE), Generic Elective (GE), Skill Enhancement Course (SEC), Ability Enhancement Compulsory Course (AECC).
- Foundation Courses — value-based, language, environmental.
7.5.2 The CBCS Credit Definition
- 1 Theory credit = 1 hour of lecture per week for one semester (~15 weeks → ~15 contact hours).
- 1 Tutorial credit = 1 hour per week.
- 1 Practical credit = 2 hours per week.
- A typical UG semester carries 20–24 credits; a 3-year UG = 132–148 credits.
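The credit rules above reduce to simple arithmetic on a course's weekly lecture, tutorial and practical hours. A minimal sketch follows; the function name and the two sample courses are hypothetical.

```python
def course_credits(lecture_hrs, tutorial_hrs, practical_hrs):
    """Credits from weekly contact hours.

    1 lecture hour = 1 credit, 1 tutorial hour = 1 credit,
    2 practical hours = 1 credit.
    """
    return lecture_hrs + tutorial_hrs + practical_hrs / 2


# Hypothetical courses described by weekly hours.
print(course_credits(4, 0, 4))  # theory + lab course: 6.0
print(course_credits(5, 1, 0))  # theory + tutorial course: 6.0
```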
7.5.3 CBCS Grading — 10-Point Scale
| Letter | Grade | Grade Point |
|---|---|---|
| O | Outstanding | 10 |
| A+ | Excellent | 9 |
| A | Very Good | 8 |
| B+ | Good | 7 |
| B | Above Average | 6 |
| C | Average | 5 |
| P | Pass | 4 |
| F | Fail | 0 |
| Ab | Absent | 0 |
7.5.4 SGPA, CGPA — How to Compute
SGPA (Semester Grade Point Average) for a semester:
\[\text{SGPA} = \frac{\sum (C_i \times G_i)}{\sum C_i}\]
where \(C_i\) = credit of course \(i\) and \(G_i\) = grade point earned in course \(i\).
CGPA (Cumulative Grade Point Average) across all semesters:
\[\text{CGPA} = \frac{\sum (C_i \times S_i)}{\sum C_i}\]
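where \(S_i\) = SGPA of semester \(i\) and \(C_i\) here = total credits of semester \(i\), so each SGPA is weighted by the credits of its semester. Many universities also publish a CGPA → percentage conversion, commonly % ≈ CGPA × 10 − 7.5; the exact formula varies by university.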
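The two formulas can be checked with a short calculation. In this sketch the first semester uses the sample figures from the practice questions (3 credits at grade point 8, 4 at 9, 3 at 7); the second semester and the function names are hypothetical.

```python
def sgpa(courses):
    """Credit-weighted mean grade point for one semester.

    courses: list of (credits, grade_point) pairs.
    """
    total_credits = sum(credits for credits, _ in courses)
    return sum(credits * gp for credits, gp in courses) / total_credits


def cgpa(semesters):
    """Credit-weighted mean of the semester SGPAs.

    semesters: list of course lists, one list per semester.
    """
    weighted_sum = 0.0
    total_credits = 0
    for courses in semesters:
        sem_credits = sum(credits for credits, _ in courses)
        weighted_sum += sem_credits * sgpa(courses)
        total_credits += sem_credits
    return weighted_sum / total_credits


sem1 = [(3, 8), (4, 9), (3, 7)]    # (24 + 36 + 21) / 10
sem2 = [(4, 7), (4, 8), (2, 10)]   # hypothetical second semester

print(round(sgpa(sem1), 2))          # 8.1
print(round(cgpa([sem1, sem2]), 2))  # 8.05
```

Because both semesters carry 10 credits here, the CGPA is just the plain average of 8.1 and 8.0; with unequal credit loads the heavier semester would pull the CGPA toward its own SGPA.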
7.5.5 Internal vs External Components
Typical CBCS course evaluation splits 25/75 or 30/70 or 40/60: internal (mid-semester test, assignment, attendance, project) + external/end-semester (written exam, viva, practical).
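How much the split matters can be seen by combining the same pair of marks under each ratio. This is a rough sketch; the function name and the learner's marks (80% internal, 60% external) are hypothetical.

```python
def final_percentage(internal_pct, external_pct, internal_weight):
    """Combine internal and external percentages.

    internal_weight is the internal share out of 100 (e.g. 30 for a 30/70 split).
    """
    external_weight = 100 - internal_weight
    return (internal_pct * internal_weight + external_pct * external_weight) / 100


# Same learner, three common splits.
for weight in (25, 30, 40):
    print(weight, final_percentage(80, 60, weight))
# 25 65.0
# 30 66.0
# 40 68.0
```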
7.5.6 NEP 2020 + NCrF — Beyond CBCS
NEP 2020 introduces multi-disciplinary flexibility on top of CBCS:
- Academic Bank of Credits (ABC) — UGC 2021. Learner’s credits stored centrally; can be redeemed across HEIs.
- Multiple Entry and Multiple Exit (MEME) — Certificate (1 yr) → Diploma (2 yr) → Bachelor’s (3 yr) → Bachelor’s Honours / Research (4 yr).
- National Credit Framework (NCrF) 2022 — integrates school, HE, vocational; gives credit for learning outcome, not seat-time.
- Four-Year Undergraduate Programme (FYUP) with research option (UGC 2022 curriculum framework).
- PARAKH (Performance Assessment, Review, and Analysis of Knowledge for Holistic development) — NEP-mandated national assessment centre (2023, NCERT).
7.6 Computer-Based Testing (CBT)
7.6.1 What CBT Is
A Computer-Based Test is delivered, recorded, and (often) marked through a computer. Includes online, kiosk-based, iBT (Internet-based test), and CAT (Computer-Adaptive Test).
- Linear CBT — same items in same order for every candidate (NTA NET).
- Linear-on-the-fly Testing (LOFT) — items drawn at random from an item bank per candidate.
- Computer-Adaptive Testing (CAT) — item difficulty adjusts to learner’s running performance using Item Response Theory (IRT); the GMAT adapts item by item, while the GRE General Test adapts section by section.
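The adaptive idea can be illustrated with a toy loop that always serves the unused item closest in difficulty to the current ability estimate. This is a simplified staircase rule chosen for readability, not the IRT-based estimation that operational CATs use; the item bank, difficulty values and simulated learner are all hypothetical.

```python
def run_cat(item_bank, answer_fn, n_items=3, start=0.0, step=1.0):
    """Toy adaptive test.

    item_bank: dict mapping item id -> difficulty.
    answer_fn: callable(item_id) -> True if answered correctly.
    Returns the final ability estimate and the list of (item_id, correct).
    """
    ability = start
    remaining = dict(item_bank)
    administered = []
    for _ in range(min(n_items, len(remaining))):
        # Pick the unused item whose difficulty is nearest the estimate.
        item_id = min(remaining, key=lambda i: abs(remaining[i] - ability))
        del remaining[item_id]
        correct = answer_fn(item_id)
        # Move the estimate up after a correct answer, down otherwise,
        # with a shrinking step so the estimate settles.
        ability += step if correct else -step
        step /= 2
        administered.append((item_id, correct))
    return ability, administered


bank = {"q1": -2.0, "q2": -1.0, "q3": 0.0, "q4": 1.0,
        "q5": 2.0, "q6": 1.5, "q7": 0.5}

# Simulated learner who gets every item of difficulty <= 1.0 right.
estimate, log = run_cat(bank, lambda item_id: bank[item_id] <= 1.0)
print(estimate)  # 1.25
print(log)       # [('q3', True), ('q4', True), ('q6', False)]
```

Three items are enough to bracket this learner between difficulty 1.0 and 1.5, which is the sense in which adaptive testing shortens a test relative to a linear form.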
7.6.2 Indian CBT Landscape
- NTA (National Testing Agency, 2017) conducts JEE Main, UGC-NET, CUET-UG and CUET-PG mainly as CBTs; NEET-UG, also run by NTA, is a pen-and-paper (OMR) test.
- NCERT NAS (National Achievement Survey) has moved toward CBT.
- PARAKH is the NEP-2020 holistic-assessment centre under NCERT.
- SWAYAM proctored exams are CBTs at TCS-iON centres.
7.6.3 CBT Strengths and Limitations
| Strengths | Limitations |
|---|---|
| Fast scoring; instant feedback | Digital divide; rural connectivity |
| Item-bank security; randomised items | Item-bank construction is expensive |
| Multimedia items (video, simulation) | Limited for essay / creative tasks |
| Adaptive testing (CAT) shortens tests | Requires IRT-trained item-writers |
| Auto-proctoring (AI, webcam) | Privacy concerns; bias in proctoring AI |
| Data analytics on item performance | Hardware failure risk |
7.6.4 Item Response Theory (IRT)
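IRT underlies modern CBTs. It models the probability of a correct response as a function of learner ability and one or more item parameters; the models differ in which item parameters they include: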
- 1-PL (Rasch model) — only item difficulty.
- 2-PL — item difficulty + discrimination.
- 3-PL — adds guessing parameter (chance of correct by guess).
- 4-PL — adds upper asymptote (carelessness).
IRT lets the same ability score come from different item sets — essential for CAT and equating across forms.
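The four models are nested, so one logistic function covers all of them. The sketch below uses the conventional symbols (θ for ability, b for difficulty, a for discrimination, c for guessing, d for the upper asymptote); the symbols and sample values are not from the text above and are shown only for illustration.

```python
import math


def p_correct(theta, b, a=1.0, c=0.0, d=1.0):
    """Probability of a correct response under the 4-PL model.

    theta: learner ability        b: item difficulty
    a: discrimination (2-PL+)     c: guessing floor (3-PL+)
    d: upper asymptote (4-PL)
    With the defaults a=1, c=0, d=1 this reduces to the 1-PL form.
    """
    return c + (d - c) / (1.0 + math.exp(-a * (theta - b)))


# Ability equal to item difficulty: the logistic term is exactly one half.
print(p_correct(0.0, 0.0))                  # 1-PL: 0.5
print(p_correct(0.0, 0.0, c=0.25))          # 3-PL with a 4-option MCQ guess floor
print(p_correct(0.0, 0.0, c=0.25, d=0.9))   # 4-PL with a slip ceiling
```

Raising c lifts the floor of the curve and lowering d caps its ceiling, which is how the 3-PL and 4-PL models represent guessing and carelessness respectively.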
7.7 Innovations in Evaluation Systems
7.7.1 Authentic / Performance Assessment
Authentic assessment (Grant Wiggins, 1989) asks learners to perform real-world tasks. Examples: portfolio, exhibition, internship report, capstone project.
7.7.2 Portfolio Assessment
A curated collection of the learner’s work over time — drafts, finals, reflection. Captures growth, not snapshot. Used in NEP-2020 holistic assessment and PARAKH.
7.7.3 Rubrics
A scoring tool with criteria × performance levels in a matrix.
- Holistic rubric — single overall score; fast.
- Analytic rubric — separate score for each criterion; diagnostic.
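The contrast between the two rubric types can be shown as data. In this sketch the criteria names, the 4-level scale and the sample ratings are hypothetical.

```python
MAX_LEVEL = 4  # hypothetical scale: 1 (beginning) to 4 (exemplary)


def analytic_report(levels):
    """Analytic rubric: keep a score per criterion and flag the weakest one."""
    return {
        "per_criterion": dict(levels),
        "total": sum(levels.values()),
        "out_of": MAX_LEVEL * len(levels),
        "weakest": min(levels, key=levels.get),
    }


def holistic_report(overall_level):
    """Holistic rubric: one overall judgement, no breakdown."""
    return {"overall": overall_level, "out_of": MAX_LEVEL}


essay = {"content": 3, "structure": 4, "citations": 2}
report = analytic_report(essay)

print(report["total"], "/", report["out_of"])  # 9 / 12
print(report["weakest"])                       # citations
print(holistic_report(3))
```

The analytic report tells the learner where to improve (citations), which is the diagnostic value the holistic score gives up in exchange for speed.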
7.7.4 Open-Book Examination
Learner brings prescribed reference material. Tests application and analysis, not recall. NEP-2020 endorses open-book in HE.
7.7.5 Continuous and Comprehensive Evaluation (CCE)
CCE treats evaluation as ongoing throughout the year, covering both scholastic and co-scholastic domains. Recommended by NCF 2005 and implemented by CBSE at school level from 2009; the same principle is embedded in CBCS internal-assessment design.
7.7.6 Outcome-Based Education (OBE) and Bloom-Anderson Verb Use
OBE (William Spady, 1994) defines course outcomes (COs) and programme outcomes (POs), written with measurable Bloom verbs, then aligns curriculum and assessment to them. Used by NAAC, NBA (engineering), and the Washington Accord (engineering global mobility).
Outcomes ↔︎ Teaching-Learning ↔︎ Assessment must all align (the constructive-alignment principle of John Biggs, 1996).
7.7.7 Online Proctoring, AI in Evaluation
- AI-proctored exams — face detection, gaze tracking, browser lockdown.
- Automated essay scoring (AES) — e-Rater (ETS), IntelliMetric.
- Adaptive practice systems — Khan Academy, ALEKS.
- Learning analytics dashboards — predict at-risk learners.
- Generative-AI in feedback — ChatGPT-assisted, formative feedback.
- Digital and blockchain credentialing — tamper-evident certificates (e.g. NAD: National Academic Depository, a digital certificate store).
7.7.8 360° / Peer / Self-Assessment
Peer assessment: peers grade each other (used in MOOC essays). Self-assessment: learner judges own work against a rubric. 360° combines self + peer + teacher + external.
7.7.9 Concept Mapping and Mind Mapping
Concept mapping (Joseph Novak, 1972, based on Ausubel’s subsumption theory) is now used as an evaluation tool — to surface a learner’s mental schema. Mind mapping (Tony Buzan, 1974) is its single-centre cousin.
7.8 Common Threats to Evaluation Quality
| Threat | What goes wrong |
|---|---|
| Halo effect | One impression colours rating of unrelated traits |
| Leniency / Severity | Rater consistently too generous or too harsh |
| Central tendency | Rater clusters everyone in the middle |
| Order / sequence bias | Position of paper in pile affects mark |
| Cultural / language bias | Items disadvantage a subgroup |
Defences: double-blind grading, multiple raters, moderation, rubrics, item analysis, IRT calibration.
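Item analysis, listed among the defences, is usually run on 0/1 response data. The sketch below computes a difficulty index and a discrimination index using upper and lower groups of 27% each, a common textbook convention that is assumed here rather than stated above; the response matrix is hypothetical.

```python
def item_analysis(responses, item, fraction=0.27):
    """Difficulty and discrimination index for one item.

    responses: list of per-examinee lists of 0/1 item scores.
    item: column index of the item to analyse.
    """
    ranked = sorted(responses, key=sum, reverse=True)  # best total score first
    n = max(1, round(len(ranked) * fraction))          # size of each group
    upper, lower = ranked[:n], ranked[-n:]
    u = sum(row[item] for row in upper)                # correct in upper group
    l = sum(row[item] for row in lower)                # correct in lower group
    difficulty = (u + l) / (2 * n)      # proportion correct across both groups
    discrimination = (u - l) / n        # ranges from -1 to +1
    return difficulty, discrimination


# Hypothetical data: 10 examinees x 3 items.
data = [
    [1, 1, 1], [1, 1, 1], [1, 1, 1],
    [1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 0],
    [0, 1, 0], [0, 1, 0], [0, 0, 0],
]

print(item_analysis(data, 0))  # (0.5, 1.0)
print(item_analysis(data, 1))
```

Item 0 separates the top and bottom groups perfectly, while item 1 is answered correctly by most of the bottom group as well, so it discriminates weakly and would be a candidate for revision.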
7.9 How the Pieces Fit Together
```mermaid
flowchart LR
O[Learning Outcome<br/>3 domains · Bloom verbs] --> P[Plan<br/>Tyler's 4 elements]
P --> T[Teach<br/>Method · Support · Environment]
T --> A[Assess<br/>Formative · Summative<br/>NRT · CRT · CBT]
A --> R{Reform}
R -.-> O
R -.-> P
R -.-> T
classDef default fill:#003366,color:#ffffff,stroke:#ffcc00,stroke-width:3px,rx:10px,ry:10px;
```
Each arrow is a deliberate design decision; the dotted loop is the evaluation-into-improvement cycle that distinguishes a professional teacher from a content deliverer.
7.10 Theory Anchors at a Glance
| Person | Year | Contribution | PYQ hook |
|---|---|---|---|
| Ralph W. Tyler | 1949 | Curriculum evaluation; 4 elements | “Tyler model” |
| Benjamin S. Bloom | 1956 / 1968 | Taxonomy (cognitive); Mastery Learning | Verbs · OBE root |
| Anderson & Krathwohl | 2001 | Revised Bloom (Remember → Create) | Verb shift |
| D.R. Krathwohl | 1964 | Affective domain (Receiving → Characterising) | 5 levels |
| E.J. Simpson | 1972 | Psychomotor (Perception → Origination) | 7 levels |
| Michael Scriven | 1967 | Formative vs Summative | Distinction |
| Robert Glaser | 1963 | NRT vs CRT | Reference distinction |
| Grant Wiggins | 1989 | Authentic assessment | Performance tasks |
| John Biggs | 1996 | Constructive Alignment | OBE alignment |
| William Spady | 1994 | Outcome-Based Education (OBE) | NAAC/NBA root |
| Joseph Novak | 1972 | Concept mapping | Schema visualisation |
| Tony Buzan | 1974 | Mind mapping | Note-taking innovation |
| UGC | 2015 | CBCS framework, 10-point grade | Indian implementation |
| UGC / NTA / NEP | 2017–22 | NTA (2017), ABC (2021), NCrF (2022), PARAKH (2023) | National reforms |
7.11 Practice Questions
Which one of the following is the BROADEST in scope?
The distinction between FORMATIVE and SUMMATIVE evaluation was given by:
In a norm-referenced test (NRT), the learner's performance is compared with:
A teacher gives a short quiz at the end of each topic to identify what learners have not yet mastered. This is BEST described as:
"A test gives the same result when administered twice to the same group." This describes the test's:
Under UGC's CBCS, ONE theory credit equals:
In UGC's 10-point CBCS grade scale, the letter grade "O" stands for "Outstanding" and carries a grade point of:
Under CBCS, "AECC" stands for:
The Academic Bank of Credits (ABC), notified by UGC in 2021, primarily enables:
PARAKH, the national assessment centre proposed under NEP 2020 and set up in 2023, falls under:
In Computer-Adaptive Testing (CAT), the difficulty of the next item depends on:
The National Testing Agency (NTA), which conducts UGC-NET, JEE Main, NEET-UG and CUET, was established in:
Outcome-Based Education (OBE), used by NBA and Washington Accord, was systematised by:
"Outcomes, teaching-learning activity, and assessment must all align." This principle of constructive alignment is by:
A rubric that gives ONE overall score, rather than separate scores for criteria, is called:
"Asking learners to perform real-world tasks (portfolio, project, exhibition)" is the central idea of:
A rater forms an overall favourable impression and consequently grades all separate criteria (content, structure, citations) higher. This is:
"Father of curriculum evaluation", who in 1949 proposed a four-element model of objectives, learning experiences, evaluation, and reform, is:
Arrange the four types of evaluation in their typical temporal sequence in a course:
(i) Diagnostic (ii) Placement (iii) Summative (iv) Formative
A student earns the following in one semester: Course X (3 credits, grade point 8), Course Y (4 credits, grade point 9), Course Z (3 credits, grade point 7). The SGPA is:
7.12 Quick Recall
- Three layers: Measurement (number) → Assessment (description) → Evaluation (decision).
- Three Bloom domains: Cognitive (Bloom 1956 / Anderson-Krathwohl 2001), Affective (Krathwohl 1964 — Receiving → Characterising), Psychomotor (Simpson 1972 — Perception → Origination).
- Tyler 4 elements (1949): Objectives → Learning experiences → Evaluation → Reform. “Father of curriculum evaluation.”
- 6 qualities of a tool: Validity · Reliability · Objectivity · Practicability · Discrimination · Comprehensiveness.
- Reliability methods: test-retest, parallel-form, split-half, KR-20/21, Cronbach’s α.
- Scriven (1967): Formative (improve) vs Summative (judge).
- Glaser (1963): NRT vs CRT.
- 4 types by purpose: Placement → Diagnostic → Formative → Summative.
- CBCS (UGC 2015): 1 theory credit = 1 hr/wk; 10-point scale (O 10 · A+ 9 · A 8 · B+ 7 · B 6 · C 5 · P 4 · F 0). Course types: CC · DSE · GE · SEC · AECC.
- SGPA = Σ(Cᵢ × Gᵢ) / Σ Cᵢ.
- NEP-2020 reforms: ABC (2021), NCrF (2022), PARAKH (2023, NCERT), FYUP with research, MEME (Cert/Dip/Bach/Bach-Honours).
- NTA (2017): conducts UGC-NET, JEE Main and CUET mainly as CBTs; NEET-UG is pen-and-paper (OMR).
- CBT types: Linear · LOFT · CAT (using IRT).
- IRT models: 1-PL Rasch (difficulty) → 2-PL (+ discrimination) → 3-PL (+ guessing) → 4-PL (+ carelessness).
- OBE (Spady 1994): COs/POs, measurable Bloom verbs. Constructive Alignment (Biggs 1996): Outcomes ↔︎ Teaching ↔︎ Assessment.
- Authentic assessment (Wiggins 1989): real-world tasks.
- Rubric types: Holistic (one score) vs Analytic (per criterion).
- Rater bias: Halo, Leniency/Severity, Central tendency, Order, Cultural.
- Concept mapping (Novak 1972, based on Ausubel) vs Mind mapping (Buzan 1974).