10 Steps of Research

10.1 The Nine-Step Research Process

The research process is a logically sequenced but iterative workflow. The classical sequence has nine steps; failure to complete any step properly weakens every subsequent step. The most-repeated previous-year question (PYQ) patterns are: (a) sequence ordering (arrange the given steps in the correct order), (b) identification of the first or last step, and (c) matching activities (literature review, hypothesis, sampling) to their step.

Nine Steps of the Research Process

Identification of the research problem
Review of related literature
Formulation of objectives and hypotheses
Research design
Sampling — defining population, selecting sample
Data collection
Data analysis
Interpretation and conclusion
Reporting and publication

flowchart TB
  S1[1 Problem<br/>Identification] --> S2[2 Literature<br/>Review]
  S2 --> S3[3 Objectives &<br/>Hypotheses]
  S3 --> S4[4 Research<br/>Design]
  S4 --> S5[5 Sampling]
  S5 --> S6[6 Data<br/>Collection]
  S6 --> S7[7 Data<br/>Analysis]
  S7 --> S8[8 Interpretation<br/>& Conclusion]
  S8 --> S9[9 Reporting &<br/>Publication]
  S9 -. feedback .-> S1
    classDef default fill:#003366,color:#ffffff,stroke:#ffcc00,stroke-width:3px,rx:10px,ry:10px;

10.2 Step 1 — Identification of the Research Problem

10.2.1 What a Research Problem Is

A research problem is a clearly stated, unresolved question worth investigating. Identifying it is the first and most consequential step: a poorly defined problem cannot be rescued by a strong design later.

10.2.2 Sources of Research Problems

Where Research Problems Come From

Theory — gaps, contradictions, untested implications.
Practice — recurring practical challenges (clinical, classroom, industrial).
Literature — explicit “directions for future research”.
Replication — verifying contested findings.
Personal experience — observation, intuition, anomaly.
Policy / social need — public-interest priorities.
Funding-agency calls — DST, ICSSR, UGC, ICMR, CSIR.

10.2.3 Characteristics of a Good Research Problem

Seven Marks of a Good Problem

Researchable — answerable with available methods.
Significant — adds to theory or practice.
Original — not a duplicate.
Clear and unambiguous.
Feasible — within time, budget, ethics.
Ethical — does not harm subjects.
Specific — narrow enough to investigate deeply.

10.2.4 Defining vs Delimiting the Problem

Define vs Delimit

Defining — write the problem in a clear, testable statement.
Delimiting — set explicit boundaries (population, geography, time, variables).
Limitations — what the design cannot control (acknowledged in the final report).

A research problem is usually written as a declarative statement (“To examine the impact of …”) and, optionally, as a research question (“What is the effect of …?”).

10.3 Step 2 — Review of Related Literature

10.3.1 Purposes of a Literature Review

Seven Purposes of Literature Review

Map existing knowledge.
Identify gaps and contradictions.
Avoid duplication.
Refine the research problem.
Suggest theoretical framework.
Guide method and instrument choice.
Establish the researcher’s credibility.

10.3.2 Sources

Two Categories of Source

Source	Examples
Primary	Original research articles, theses, conference papers, lab reports
Secondary	Textbooks, review articles, meta-analyses, encyclopaedias, abstracts

10.3.3 Major Databases and Repositories

Databases the Researcher Must Know

Scopus, Web of Science — indexing services with citation metrics.
Google Scholar — broad academic search.
PubMed — biomedical.
ERIC — education.
JSTOR — humanities, social sciences.
Sci-Hub / Anna’s Archive — shadow libraries; legally contested and not recommended.
Indian repositories — Shodhganga (theses), e-ShodhSindhu (journals), e-PG Pathshala (PG content), NDLI (digital library).
Open access and OER — DOAJ (open-access journals), OpenDOAR (open-access repositories), OER Commons (open educational resources).

10.3.4 Types of Literature Review

Five Types of Literature Review

Type	What it does
Narrative / Traditional	Critical overview of selected works
Systematic	Pre-registered protocol; transparent inclusion/exclusion (e.g., PRISMA guidelines)
Meta-analysis	Statistical synthesis of quantitative findings
Scoping	Maps the breadth of a field
Integrative	Combines diverse research types, including theory and methods
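
To make "statistical synthesis" concrete, here is a minimal sketch of one common pooling approach, the fixed-effect inverse-variance method. The technique is an assumption chosen for illustration (random-effects models are also widely used), and the effect sizes and standard errors are hypothetical.

```python
import math

def pool_fixed_effect(effects, std_errors):
    """Inverse-variance (fixed-effect) pooled estimate and its standard error."""
    # Each study is weighted by 1 / SE^2, so more precise studies count more.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * d for w, d in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se

# Hypothetical effect sizes (e.g. standardised mean differences) from three studies.
effects = [0.30, 0.50, 0.40]
std_errors = [0.10, 0.20, 0.10]

pooled, pooled_se = pool_fixed_effect(effects, std_errors)
ci_low = pooled - 1.96 * pooled_se
ci_high = pooled + 1.96 * pooled_se
print(f"pooled effect = {pooled:.3f}, 95% CI = ({ci_low:.3f}, {ci_high:.3f})")
```

The least precise study (standard error 0.20) pulls the pooled value far less than the other two, which is the point of weighting by precision.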

10.3.5 Reference Management

Use a reference manager from day one: Mendeley, Zotero, EndNote, RefWorks, JabRef. Common citation styles: APA, MLA, Chicago, Harvard, Vancouver, IEEE.

10.4 Step 3 — Formulation of Objectives and Hypotheses

10.4.1 Research Objectives

SMART Research Objectives

Specific · Measurable · Achievable · Relevant · Time-bound. Distinguish general (overall) from specific objectives.

10.4.2 Hypothesis — Definition and Properties

A hypothesis is a tentative, testable statement about the relationship between two or more variables. Best definitions: Goode and Hatt — “a proposition which can be put to a test to determine its validity”; Kerlinger — “a conjectural statement of the relation between two or more variables”.

Six Qualities of a Good Hypothesis

Testable / Falsifiable — measurable variables, refutable in principle (Popper).
Specific — clear conditions and predictions.
Empirically grounded — connects to theory or prior evidence.
Parsimonious — simplest plausible explanation.
Consistent with knowledge.
States a relationship between variables.

10.4.3 Types of Hypotheses

Five Hypothesis Pairs / Types

Type	What it says
Research / Working / Alternative (H₁)	Anticipates a relationship or effect
Null (H₀)	States no relationship or effect; the default position that testing attempts to reject
Directional	Specifies direction (e.g., A > B)
Non-directional	States only that a difference exists
Statistical	Cast in inferential-statistics language

Other useful distinctions: descriptive, relational, causal; simple vs complex; a-priori vs ad-hoc.
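
The practical difference between a directional and a non-directional hypothesis shows up in how the p-value is computed. The sketch below assumes a z statistic from some test (the value 1.80 is hypothetical) and that the directional hypothesis predicts a positive effect; `p_value_from_z` is an illustrative helper, not a library function.

```python
from statistics import NormalDist

def p_value_from_z(z, directional=False):
    """p-value for a z statistic; the directional case assumes H1 predicts z > 0."""
    std_normal = NormalDist()
    if directional:
        # One-tailed: only the upper tail counts as evidence for H1.
        return 1.0 - std_normal.cdf(z)
    # Two-tailed: extreme values in either direction count.
    return 2.0 * (1.0 - std_normal.cdf(abs(z)))

z = 1.80  # hypothetical test statistic
p_one = p_value_from_z(z, directional=True)
p_two = p_value_from_z(z)
print(f"directional p = {p_one:.4f}, non-directional p = {p_two:.4f}")
```

With this value the directional test falls below 0.05 while the non-directional test does not, which is why the direction must be fixed before the data are seen.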

10.4.4 Sources of Hypotheses

Theory, prior literature, personal observation, analogy, intuition, replication, and folk wisdom that needs testing.

10.4.5 Testing a Hypothesis — Type I and Type II Error

Type I and Type II Error

	H₀ is TRUE	H₀ is FALSE
Reject H₀	Type I error (α)	Correct
Fail to reject H₀	Correct	Type II error (β)

Power = 1 − β — the probability of correctly rejecting a false H₀. Conventional α = 0.05; power ≥ 0.80.
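
As a rough illustration of how α, β and power relate, the sketch below computes the power of a two-sided one-sample z-test. The choice of test, the standardised effect size of 0.5 and the sample sizes are all assumptions made for the example.

```python
from statistics import NormalDist

def z_test_power(effect_size, n, alpha=0.05):
    """Power of a two-sided one-sample z-test for a standardised effect size."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    # Under H1 the test statistic is centred on effect_size * sqrt(n).
    shift = effect_size * n ** 0.5
    # Probability that the statistic lands in either rejection region.
    upper = 1.0 - std_normal.cdf(z_crit - shift)
    lower = std_normal.cdf(-z_crit - shift)
    return upper + lower

for n in (16, 32):
    power = z_test_power(0.5, n)
    print(f"n = {n}: power = {power:.3f}, beta = {1.0 - power:.3f}")
```

When the true effect is zero the function returns α itself, since every rejection is then a Type I error.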

10.5 Step 4 — Research Design

10.5.1 What a Design Specifies

Five Elements a Design Must Specify

Method — experimental, descriptive, historical, qualitative, quantitative, mixed.
Setting — lab, field, online, archive.
Subjects — participants, materials, units.
Procedure — sequence, timing, instructions.
Measurement — variables, scales, instruments.

10.5.2 Major Design Families

Covered in detail in Topic 8: Pre-experimental, True experimental, Quasi-experimental, Descriptive (survey, case, correlational, comparative), Historical, Qualitative (phenomenology, ethnography, grounded theory, case study, narrative), Mixed-methods (Convergent Parallel, Explanatory Sequential, Exploratory Sequential, Embedded).

10.5.3 Design Quality — Validity Layers

Four Validity Layers (Shadish, Cook & Campbell, 2002)

Validity	Question it answers
Statistical conclusion validity	Are the inferences about covariation correct?
Internal validity	Did the IV cause the DV?
Construct validity	Do the measures capture the intended construct?
External validity	Do the findings generalise beyond this study?

10.6 Step 5 — Sampling

10.6.1 Key Terms

Sampling Vocabulary

Population — the full set the researcher wishes to generalise to.
Target population — the conceptual full set.
Accessible population — the set actually reachable.
Sampling frame — the operational list of units.
Sampling unit — individual or cluster.
Sample — the subset actually studied.
Parameter vs Statistic — population value vs sample estimate.
Sampling error — random difference between sample and population value.
Non-sampling error — bias from instrument, non-response, coverage.

10.6.2 Probability vs Non-Probability Sampling

Sampling Methods at a Glance

Probability	Non-Probability
Simple Random Sampling (SRS)	Convenience
Stratified Random Sampling	Purposive / Judgemental
Systematic Sampling	Quota
Cluster Sampling	Snowball
Multi-stage Sampling	Voluntary / Self-selected
Probability-proportional-to-size (PPS)	—

Probability sampling permits statistical generalisation; non-probability sampling does not.
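
Three of the probability methods above can be sketched in a few lines. The sampling frame (100 units with a `gender` field), the helper names and the 10 % sampling fraction are hypothetical, chosen only to show how the selection rules differ.

```python
import random

def simple_random_sample(frame, n, rng):
    """Every unit has an equal chance of selection."""
    return rng.sample(frame, n)

def systematic_sample(frame, k, rng):
    """Random start within the first k units, then every k-th unit."""
    start = rng.randrange(k)
    return frame[start::k]

def stratified_sample(frame, stratum_of, fraction, rng):
    """Split the frame into strata, then sample randomly within each stratum."""
    strata = {}
    for unit in frame:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for members in strata.values():
        take = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, take))
    return sample

rng = random.Random(42)  # seeded so the example is repeatable
# Hypothetical frame: 40 units labelled "F" and 60 labelled "M".
frame = [{"id": i, "gender": "F" if i % 5 < 2 else "M"} for i in range(100)]

srs = simple_random_sample(frame, 10, rng)
systematic = systematic_sample(frame, 10, rng)
stratified = stratified_sample(frame, lambda u: u["gender"], 0.10, rng)
print(len(srs), len(systematic), len(stratified))
```

Only the stratified sample is guaranteed to mirror the 40:60 split of the frame; the other two reproduce it only on average.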

10.6.3 Sample Size

Drivers: desired confidence level (usually 95 %), margin of error (often 5 %), population variability, expected effect size, design effect (DEFF) for cluster/stratified, subgroup analyses. Cochran’s formula is a standard starting point:

\[n_0 = \frac{Z^2 \cdot p (1-p)}{e^2}\]

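where Z = 1.96 for a 95 % confidence level, p = expected proportion (use 0.5 when it is unknown, since this maximises p(1 − p) and hence the required sample size), and e = margin of error expressed as a proportion (e.g., 0.05).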
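
A short worked example of Cochran's formula, using the conventional values given above. The function name `cochran_sample_size` is illustrative, and rounding up to the next whole respondent is an assumption about how the result is used.

```python
import math

def cochran_sample_size(z=1.96, p=0.5, e=0.05):
    """Cochran's n0 = Z^2 * p * (1 - p) / e^2, rounded up to a whole unit."""
    n0 = (z ** 2) * p * (1.0 - p) / (e ** 2)
    return math.ceil(n0)

n_5_percent = cochran_sample_size()        # 95 % confidence, 5 % margin of error
n_3_percent = cochran_sample_size(e=0.03)  # tighter 3 % margin of error
print(n_5_percent, n_3_percent)  # prints: 385 1068
```

Tightening the margin of error from 5 % to 3 % nearly triples the required sample, because e enters the formula squared.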

10.7 Step 6 — Data Collection

10.7.1 Tools by Type of Data

Tools and Methods of Data Collection

Quantitative	Qualitative
Questionnaire	In-depth interview
Structured interview	Focus group discussion
Test / inventory	Participant observation
Rating scale (Likert, Thurstone, Guttman, semantic differential)	Field notes
Observation schedule	Document analysis
Existing dataset	Photo / video / audio data

10.7.2 Standardising the Instrument

Before main data collection, every instrument must be piloted, validated (content, criterion, construct) and reliability-checked (test-retest, parallel-form, split-half, KR-20/21, Cronbach’s α).
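
As an illustration of one of the reliability checks, the sketch below computes Cronbach's α from pilot data using the standard variance form α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The five respondents and three items are hypothetical.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """item_scores: one list per respondent, holding that person's k item scores."""
    k = len(item_scores[0])
    # Transpose so each tuple holds all respondents' scores on one item.
    items = list(zip(*item_scores))
    sum_item_variances = sum(variance(col) for col in items)
    total_variance = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1.0 - sum_item_variances / total_variance)

# Hypothetical pilot: 5 respondents answering 3 Likert-type items.
pilot = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [3, 3, 3],
    [1, 2, 2],
]
alpha = cronbach_alpha(pilot)
print(f"Cronbach's alpha = {alpha:.3f}")
```

The items here rise and fall together across respondents, so α comes out well above the conventional 0.70 threshold.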

10.7.3 Ethics During Collection

Informed consent · Anonymity & confidentiality · Right to withdraw · Minimal risk · IRB / IEC approval · Vulnerable groups protection. (Full treatment in Topic 12.)

10.8 Step 7 — Data Analysis

10.8.1 Preparing the Data

Data Preparation Sequence

Editing → Coding → Classification → Tabulation → Visualisation → Analysis
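
The first four stages of this sequence can be sketched on a toy dataset. The raw responses, the codebook and the two-class grouping below are hypothetical; real studies would define them in a coding manual.

```python
from collections import Counter

# Hypothetical raw answers to one survey item, as they might arrive from the field.
raw_responses = ["Agree", " agree", "DISAGREE", "", "Neutral", "agree ", None, "Disagree"]

# Editing: discard blank or missing answers and normalise spacing and case.
edited = [r.strip().lower() for r in raw_responses if r and r.strip()]

# Coding: assign a numeric code to each response category.
codebook = {"disagree": 1, "neutral": 2, "agree": 3}
coded = [codebook[r] for r in edited]

# Classification: group the codes into broader classes.
def classify(code):
    return "favourable" if code == 3 else "not favourable"

classes = [classify(c) for c in coded]

# Tabulation: count how many responses fall in each class.
table = Counter(classes)
for label, count in sorted(table.items()):
    print(f"{label}: {count}")
```

Visualisation and analysis would then start from `table` rather than from the raw answers.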

10.8.2 Choosing the Analytic Technique

Picking an Analysis

One variable, descriptive: frequency, mean, SD, percentiles, distribution shape.
Two variables, association or group comparison: Pearson’s r, Spearman’s ρ and χ² (association); t-test, ANOVA, Mann-Whitney, Wilcoxon and Kruskal-Wallis (differences between groups or paired measurements).
Predicting outcome: linear regression, logistic regression, multiple regression.
Reducing dimensions: factor analysis, PCA.
Modelling: SEM, multilevel modelling, time series, machine-learning models.
Qualitative: thematic analysis (Braun & Clarke 2006), content analysis, narrative analysis, discourse analysis, IPA, grounded theory’s constant comparison.
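
The choice between two of the association measures listed above can be illustrated with a small sketch: Pearson's r captures linear association, while Spearman's ρ is Pearson's r applied to ranks and so captures any monotonic relationship. The data are hypothetical, and the ranking helper assumes no tied values.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

def ranks(values):
    """Ranks from 1 to n; assumes there are no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(x, y):
    """Spearman's rho computed as Pearson's r on the ranks."""
    return pearson_r(ranks(x), ranks(y))

# Hypothetical data: the outcome rises with x, but not in a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
r = pearson_r(x, y)
rho = spearman_rho(x, y)
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```

Because the relationship is perfectly monotonic but curved, ρ reaches 1 while r stays below it.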

10.8.3 Software

Quantitative: SPSS, R, Python, SAS, Stata, JASP, jamovi, MATLAB. Qualitative: NVivo, ATLAS.ti, MAXQDA, Dedoose, QDA Miner.

10.9 Step 8 — Interpretation and Conclusion

10.9.1 What Interpretation Adds

Analysis produces numbers, themes, models. Interpretation explains what the findings mean in relation to theory, prior research, and the original problem.

What Interpretation Must Cover

Compare findings with hypotheses and prior literature.
Explain unexpected results.
Discuss practical and theoretical implications.
Acknowledge limitations.
Suggest directions for future research.

10.9.2 Common Inferential Errors

Correlation-causation confusion · Over-generalisation beyond the sample · Ignoring effect size · p-hacking · HARKing (hypothesising after results are known) · Confirmation bias · Survivorship bias.

10.10 Step 9 — Reporting and Publication

10.10.1 The Standard Research Report Structure (IMRaD)

IMRaD — The Universal Article Structure

Introduction → Methods → Results → and Discussion, with the Abstract on top and the Conclusion, References and Appendices at the end.

10.10.2 Outlets

Where Research Gets Published

Peer-reviewed journals (UGC-CARE, Scopus, WoS, ABDC).
Conferences — proceedings, posters.
Books and chapters — Springer, Elsevier, OUP, Routledge, Sage.
Theses — Shodhganga deposit mandatory for Indian PhDs.
Working papers and preprints — SSRN, arXiv, bioRxiv, PsyArXiv.
Public scholarship — policy briefs, blogs, newspaper op-eds.

10.10.3 Reporting Standards

Use the right reporting standard for the study type: CONSORT (trials), STROBE (observational), PRISMA (systematic reviews), COREQ / SRQR (qualitative), GRADE (evidence quality), MIAME (microarrays). Honest reporting includes declaring funding, conflicts of interest, ethics approval, data availability.

10.10.4 Citation, Plagiarism, Predatory Journals — Brief Pointers

Plagiarism, conflicts of interest, predatory journals, data fabrication are full Topic-12 (Research Ethics) territory. Important here: always cite primary sources, use a reference manager, run Turnitin / iThenticate, avoid predatory journals (Beall’s list legacy).

10.11 Practice Questions

Q 01 Sequence Easy

The FIRST step in the research process is:

A. Review of related literature
B. Identification of the research problem
C. Formulation of hypothesis
D. Data collection

Correct Option: B

Identifying the research problem is always step 1 — every other step depends on it.

Q 02 Sequence Medium

Arrange the following steps in correct order:

(i) Sampling (ii) Literature review (iii) Hypothesis formulation (iv) Problem identification

A. (iv) → (ii) → (iii) → (i)
B. (ii) → (iv) → (i) → (iii)
C. (iv) → (iii) → (ii) → (i)
D. (i) → (ii) → (iii) → (iv)

Correct Option: A

Problem → Literature → Hypothesis → Sampling.

Q 03 Problem Easy

Which is NOT a characteristic of a good research problem?

A. Researchable
B. Original
C. Vague and broad
D. Feasible

Correct Option: C

A good problem is clear and specific, not vague.

Q 04 Hypothesis Medium

"A hypothesis is a proposition which can be put to a test to determine its validity." This definition is by:

A. Kerlinger
B. Goode and Hatt
C. C.R. Kothari
D. Karl Popper

Correct Option: B

Goode and Hatt, Methods in Social Research (1952).

Q 05 Null Medium

The null hypothesis (H₀) states that:

A. There IS a difference / relationship
B. There is NO difference / relationship
C. The relationship is positive
D. The relationship is negative

Correct Option: B

H₀ is the "no effect" default; statistical inference tries to reject H₀.

Q 06 Type I / II Hard

Rejecting a TRUE null hypothesis is called:

A. Type I error (α)
B. Type II error (β)
C. Sampling error
D. Power error

Correct Option: A

Type I (α) = false positive (rejecting true H₀). Type II (β) = false negative (failing to reject false H₀). Conventionally α = 0.05, power = 1 − β ≥ 0.80.

Q 07 Lit Review Medium

A systematic review that statistically pools the quantitative findings of multiple studies is called:

A. Narrative review
B. Scoping review
C. Meta-analysis
D. Annotated bibliography

Correct Option: C

Meta-analysis = statistical synthesis of effect sizes across studies (Glass, 1976).

Q 08 Lit Review Medium

The PRISMA guidelines are used in:

A. Randomised controlled trials
B. Observational studies
C. Systematic reviews and meta-analyses
D. Qualitative case studies

Correct Option: C

PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses). CONSORT for trials; STROBE for observational; COREQ/SRQR for qualitative.

Q 09 Sampling Easy

In stratified random sampling, the population is FIRST divided into:

A. Random units
B. Homogeneous strata
C. Heterogeneous clusters
D. Quotas

Correct Option: B

Stratified = population divided into homogeneous strata (e.g., gender, region), then random sampling within each stratum. Cluster = heterogeneous clusters, random selection of clusters.

Q 10 Sampling Medium

A researcher selects every 10th name from a college register. This is:

A. Simple random sampling
B. Systematic sampling
C. Cluster sampling
D. Snowball sampling

Correct Option: B

Every k-th element from an ordered list = Systematic sampling.

Q 11 Reliability Medium

Cronbach's α primarily assesses:

A. Validity of a test
B. Internal consistency reliability of a test
C. Discriminant power of items
D. Test-retest reliability

Correct Option: B

Cronbach's α estimates internal consistency; it rises with the number of items and with the average inter-item correlation (it is not itself that average). Conventionally, α ≥ 0.70 is considered acceptable.

Q 12 Likert Medium

Likert scales typically generate data on which scale of measurement?

A. Nominal
B. Ordinal (often treated as interval)
C. Interval
D. Ratio

Correct Option: B

Likert items yield ordinal data; with multiple items, totals are commonly treated as interval for parametric analysis.

Q 13 Analysis Medium

The sequence of coding stages in grounded theory (Strauss & Corbin) is:

A. Selective → Axial → Open coding
B. Open → Axial → Selective coding
C. Axial → Open → Selective coding
D. Open → Selective → Axial coding

Correct Option: B

Strauss & Corbin coding sequence: Open → Axial → Selective (each stage refines the codes into higher-order categories).

Q 14 Interpretation Hard

"Hypothesising After the Results are Known" — fabricating a hypothesis to match the data afterwards — is called:

A. p-hacking
B. HARKing
C. Cherry-picking
D. Confirmation bias

Correct Option: B

HARKing (Hypothesising After the Results are Known) — coined by Norbert Kerr (1998). A widely-flagged research-integrity issue.

Q 15 IMRaD Easy

"IMRaD" is the standard structure of a research article. It stands for:

A. Idea → Method → Research → Display
B. Introduction → Methods → Results → and Discussion
C. Inference → Measure → Replicate → and Document
D. Issue → Method → Reasoning → and Decision

Correct Option: B

IMRaD = Introduction, Methods, Results, and Discussion (with Abstract, References, Appendices).

Q 16 Indian Repositories Medium

In India, doctoral theses are deposited in the national repository called:

A. VIDWAN
B. e-PG Pathshala
C. Shodhganga
D. e-ShodhSindhu

Correct Option: C

Shodhganga (INFLIBNET) is the national theses repository. Synopses are at Shodhgangotri; expert database at VIDWAN; PG content at e-PG Pathshala; journal consortium at e-ShodhSindhu.

Q 17 Step Function Medium

Match each activity to its research step:

(i)	Running Cronbach's α on the pilot questionnaire	(a)	Sampling
(ii)	Cochran's formula calculation	(b)	Data collection prep
(iii)	PRISMA flow diagram	(c)	Reporting
(iv)	IMRaD structure	(d)	Literature review

A. (i)-b, (ii)-a, (iii)-d, (iv)-c
B. (i)-a, (ii)-b, (iii)-c, (iv)-d
C. (i)-c, (ii)-d, (iii)-a, (iv)-b
D. (i)-d, (ii)-c, (iii)-b, (iv)-a

Correct Option: A

Cronbach's α → data-collection prep; Cochran's formula → sampling; PRISMA → literature review; IMRaD → reporting.

Q 18 Sampling Error Hard

Which of the following is a NON-SAMPLING error?

A. A random difference between sample mean and population mean
B. Bias from leading question wording in a survey
C. Variance reduced by increasing sample size
D. Confidence-interval width

Correct Option: B

Non-sampling error = instrument bias, non-response, coverage failure, data-entry mistakes. Larger samples do NOT cure non-sampling error.

Q 19 Population Medium

A numerical characteristic of a POPULATION (such as μ or σ) is called:

A. Statistic
B. Parameter
C. Estimate
D. Variable

Correct Option: B

Parameter = population value. Statistic = sample value, used to estimate the parameter.

Q 20 Sequence Hard

Arrange the FULL 9-step research process in correct order:

(i) Sampling (ii) Problem identification (iii) Hypothesis (iv) Data collection (v) Literature review (vi) Reporting (vii) Research design (viii) Interpretation (ix) Data analysis

A. (ii) → (v) → (iii) → (vii) → (i) → (iv) → (ix) → (viii) → (vi)
B. (ii) → (iii) → (v) → (vii) → (i) → (iv) → (ix) → (viii) → (vi)
C. (v) → (ii) → (iii) → (vii) → (iv) → (i) → (ix) → (viii) → (vi)
D. (ii) → (v) → (vii) → (iii) → (i) → (iv) → (ix) → (viii) → (vi)

Correct Option: A

Problem → Literature → Hypothesis → Design → Sampling → Collection → Analysis → Interpretation → Reporting.

10.12 Quick Recall

9 steps: Problem → Literature → Hypothesis → Design → Sampling → Collection → Analysis → Interpretation → Reporting.
Good research problem: Researchable, Significant, Original, Clear, Feasible, Ethical, Specific.
Lit-review purposes: map, gap, avoid duplication, refine problem, framework, method guide, credibility.
Lit-review types: Narrative · Systematic (PRISMA) · Meta-analysis (Glass 1976) · Scoping · Integrative.
Hypothesis definitions: Goode & Hatt — “proposition testable for validity”; Kerlinger — “conjectural statement of relation between two or more variables”.
Hypothesis types: Research/H₁ · Null/H₀ · Directional · Non-directional · Statistical.
Type I (α) = false positive · Type II (β) = false negative · Power = 1 − β ≥ 0.80 · α = 0.05.
4 validity layers (Shadish, Cook & Campbell 2002): Statistical conclusion · Internal · Construct · External.
Sampling vocabulary: Population · Target · Accessible · Frame · Unit · Parameter vs Statistic · Sampling error vs Non-sampling error.
Probability: SRS · Stratified · Systematic · Cluster · Multi-stage · PPS. Non-probability: Convenience · Purposive · Quota · Snowball · Voluntary.
Cochran’s formula: n₀ = Z²·p(1-p)/e²; Z=1.96 for 95% CI.
Reliability: test-retest, parallel-form, split-half, KR-20/21, Cronbach’s α ≥ 0.70.
Validity: content, construct, criterion (concurrent & predictive), face.
Quantitative software: SPSS · R · Python · SAS · Stata · JASP · jamovi. Qualitative: NVivo · ATLAS.ti · MAXQDA · Dedoose.
Inferential errors: correlation-causation · over-generalisation · p-hacking · HARKing · confirmation/survivorship bias.
IMRaD: Introduction · Methods · Results · and Discussion (plus Abstract, References, Appendices).
Reporting standards: CONSORT (trials) · STROBE (observational) · PRISMA (reviews) · COREQ/SRQR (qualitative) · GRADE (evidence).
Indian repositories: Shodhganga (theses) · Shodhgangotri (synopses) · VIDWAN (experts) · e-ShodhSindhu (journals) · e-PG Pathshala (PG content) · NDLI.