flowchart TB
S1[1 Problem<br/>Identification] --> S2[2 Literature<br/>Review]
S2 --> S3[3 Objectives &<br/>Hypotheses]
S3 --> S4[4 Research<br/>Design]
S4 --> S5[5 Sampling]
S5 --> S6[6 Data<br/>Collection]
S6 --> S7[7 Data<br/>Analysis]
S7 --> S8[8 Interpretation<br/>& Conclusion]
S8 --> S9[9 Reporting &<br/>Publication]
S9 -. feedback .-> S1
classDef default fill:#003366,color:#ffffff,stroke:#ffcc00,stroke-width:3px,rx:10px,ry:10px;
10 Steps of Research
10.1 The Nine-Step Research Process
The research process is a logically sequenced but iterative workflow: what is learned at the reporting stage feeds back into new problems. The classical sequence has nine steps, and a step done poorly weakens every step that follows. The most-repeated previous-year question (PYQ) patterns are: (a) sequence ordering (arrange the given steps in the correct order), (b) identification of the first or last step, and (c) matching activities (literature review, hypothesis, sampling) to their step.
- Identification of the research problem
- Review of related literature
- Formulation of objectives and hypotheses
- Research design
- Sampling — defining population, selecting sample
- Data collection
- Data analysis
- Interpretation and conclusion
- Reporting and publication
10.2 Step 1 — Identification of the Research Problem
10.2.1 What a Research Problem Is
A research problem is a clearly stated, unresolved question worth investigating. Identifying it is the first and most consequential step: a poorly defined problem cannot be rescued by a strong design later.
10.2.2 Sources of Research Problems
- Theory — gaps, contradictions, untested implications.
- Practice — recurring practical challenges (clinical, classroom, industrial).
- Literature — explicit “directions for future research”.
- Replication — verifying contested findings.
- Personal experience — observation, intuition, anomaly.
- Policy / social need — public-interest priorities.
- Funding-agency calls — DST, ICSSR, UGC, ICMR, CSIR.
10.2.3 Characteristics of a Good Research Problem
- Researchable — answerable with available methods.
- Significant — adds to theory or practice.
- Original — not a duplicate.
- Clear and unambiguous.
- Feasible — within time, budget, ethics.
- Ethical — does not harm subjects.
- Specific — narrow enough to investigate deeply.
10.2.4 Defining vs Delimiting the Problem
- Defining — write the problem in a clear, testable statement.
- Delimiting — set explicit boundaries (population, geography, time, variables).
- Limitations — what the design cannot control (acknowledged in the final report).
A research problem is usually written as a declarative statement (“To examine the impact of …”) and, optionally, as a research question (“What is the effect of …?”).
10.3 Step 2 — Review of Related Literature
10.3.1 Purposes of a Literature Review
- Map existing knowledge.
- Identify gaps and contradictions.
- Avoid duplication.
- Refine the research problem.
- Suggest a theoretical framework.
- Guide method and instrument choice.
- Establish the researcher’s credibility.
10.3.2 Sources
| Source | Examples |
|---|---|
| Primary | Original research articles, theses, conference papers, lab reports |
| Secondary | Textbooks, review articles, meta-analyses, encyclopaedias, abstracts |
10.3.3 Major Databases and Repositories
- Scopus, Web of Science — indexing services with citation metrics.
- Google Scholar — broad academic search.
- PubMed — biomedical.
- ERIC — education.
- JSTOR — humanities, social sciences.
- SciHub / Anna’s Archive — controversial, not recommended.
- Indian repositories — Shodhganga (theses), e-ShodhSindhu (journals), e-PG Pathshala (PG content), NDLI (digital library).
- OER — DOAJ, OpenDOAR, OER Commons.
10.3.4 Types of Literature Review
| Type | What it does |
|---|---|
| Narrative / Traditional | Critical overview of selected works |
| Systematic | Pre-registered protocol; transparent inclusion/exclusion (e.g., PRISMA guidelines) |
| Meta-analysis | Statistical synthesis of quantitative findings |
| Scoping | Maps the breadth of a field |
| Integrative | Combines diverse research types, including theory and methods |
10.3.5 Reference Management
Use a reference manager from day one: Mendeley, Zotero, EndNote, RefWorks, JabRef. Common citation styles: APA, MLA, Chicago, Harvard, Vancouver, IEEE.
10.4 Step 3 — Formulation of Objectives and Hypotheses
10.4.1 Research Objectives
Objectives should be SMART: Specific · Measurable · Achievable · Relevant · Time-bound. Distinguish the general (overall) objective from the specific objectives that break it down.
10.4.2 Hypothesis — Definition and Properties
A hypothesis is a tentative, testable statement about the relationship between two or more variables. Two standard definitions: Goode and Hatt, “a proposition which can be put to a test to determine its validity”; Kerlinger, “a conjectural statement of the relation between two or more variables”.
- Testable / Falsifiable — measurable variables, refutable in principle (Popper).
- Specific — clear conditions and predictions.
- Empirically grounded — connects to theory or prior evidence.
- Parsimonious — simplest plausible explanation.
- Consistent with knowledge.
- States a relationship between variables.
10.4.3 Types of Hypotheses
| Type | What it says |
|---|---|
| Research / Working / Alternative (H₁) | Anticipates a relationship or effect |
| Null (H₀) | States no relationship or no difference; the default position that the test tries to reject |
| Directional | Specifies direction (e.g., A > B) |
| Non-directional | States only that a difference exists |
| Statistical | Cast in inferential-statistics language |
Other useful distinctions: descriptive, relational, causal; simple vs complex; a-priori vs ad-hoc.
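As a worked illustration of the directional and non-directional forms, assume two hypothetical group means, \(\mu_A\) and \(\mu_B\) (symbols introduced here, not in the original notes). The null and the two kinds of alternative could then be written as:

\[H_0: \mu_A = \mu_B\]
\[H_1 \text{ (non-directional)}: \mu_A \neq \mu_B \qquad H_1 \text{ (directional)}: \mu_A > \mu_B\]

A non-directional alternative is normally paired with a two-tailed test, a directional one with a one-tailed test.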
10.4.4 Sources of Hypotheses
Theory, prior literature, personal observation, analogy, intuition, replication, and folk wisdom that needs testing.
10.4.5 Testing a Hypothesis — Type I and Type II Error
| | H₀ is TRUE | H₀ is FALSE |
|---|---|---|
| Reject H₀ | Type I error (α) | Correct |
| Fail to reject H₀ | Correct | Type II error (β) |
Power = 1 − β — the probability of correctly rejecting a false H₀. Conventional α = 0.05; power ≥ 0.80.
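To make the link between α, β and power concrete, here is a minimal sketch for one simple case: a one-sided, one-sample z-test with known standard deviation. The function name `z_test_power` and the inputs (an effect size of 0.5 SD, n = 30) are illustrative assumptions, not values from these notes.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(effect_size: float, n: int, alpha: float = 0.05) -> float:
    """Power of a one-sided, one-sample z-test with known SD.

    effect_size is the true mean shift in SD units (a hypothetical input).
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)  # rejection threshold under H0
    # Under H1 the test statistic is centred at effect_size * sqrt(n).
    beta = std_normal.cdf(z_crit - effect_size * sqrt(n))  # P(Type II error)
    return 1 - beta


power = z_test_power(effect_size=0.5, n=30)
print(f"alpha = 0.05, beta = {1 - power:.3f}, power = {power:.3f}")
```

Under these assumptions the power clears the conventional 0.80 threshold; shrinking `n` or `effect_size` raises β and lowers power.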
10.5 Step 4 — Research Design
10.5.1 What a Design Specifies
- Method — experimental, descriptive, historical, qualitative, quantitative, mixed.
- Setting — lab, field, online, archive.
- Subjects — participants, materials, units.
- Procedure — sequence, timing, instructions.
- Measurement — variables, scales, instruments.
10.5.2 Major Design Families
Covered in detail in Topic 8: Pre-experimental, True experimental, Quasi-experimental, Descriptive (survey, case, correlational, comparative), Historical, Qualitative (phenomenology, ethnography, grounded theory, case study, narrative), Mixed-methods (Convergent Parallel, Explanatory Sequential, Exploratory Sequential, Embedded).
10.5.3 Design Quality — Validity Layers
| Validity | Question it answers |
|---|---|
| Statistical conclusion validity | Are the inferences about covariation correct? |
| Internal validity | Did the IV cause the DV? |
| Construct validity | Do the measures capture the intended construct? |
| External validity | Do the findings generalise beyond this study? |
10.6 Step 5 — Sampling
10.6.1 Key Terms
- Population — the full set the researcher wishes to generalise to.
- Target population — the conceptual full set.
- Accessible population — the set actually reachable.
- Sampling frame — the operational list of units.
- Sampling unit — individual or cluster.
- Sample — the subset actually studied.
- Parameter vs Statistic — population value vs sample estimate.
- Sampling error — random difference between sample and population value.
- Non-sampling error — bias from instrument, non-response, coverage.
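The parameter / statistic / sampling-error vocabulary above can be sketched with a toy simulation. The population of 1,000 scores and the sample size of 100 are made-up values for illustration only.

```python
import random
from statistics import mean

random.seed(42)  # fixed seed so the sketch is repeatable

population = list(range(1, 1001))        # hypothetical population of 1,000 scores
parameter = mean(population)             # population value (mu)

sample = random.sample(population, 100)  # simple random sample, n = 100
statistic = mean(sample)                 # sample estimate (x-bar)

sampling_error = statistic - parameter   # random gap between estimate and parameter
print(f"parameter = {parameter}, statistic = {statistic}, "
      f"sampling error = {sampling_error:+.2f}")
```

Changing the seed changes the statistic and the sampling error but never the parameter, which is the distinction exam questions usually probe.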
10.6.2 Probability vs Non-Probability Sampling
| Probability | Non-Probability |
|---|---|
| Simple Random Sampling (SRS) | Convenience |
| Stratified Random Sampling | Purposive / Judgemental |
| Systematic Sampling | Quota |
| Cluster Sampling | Snowball |
| Multi-stage Sampling | Voluntary / Self-selected |
| Probability-proportional-to-size (PPS) | — |
Probability sampling permits statistical generalisation; non-probability sampling does not.
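Three of the probability designs in the table can be sketched in a few lines. The sampling frame below (100 students split 60 UG / 40 PG) and the function names are hypothetical, and the stratified version assumes proportional allocation, which is only one of several allocation rules.

```python
import random

random.seed(7)  # repeatable draws for the sketch

# Hypothetical sampling frame: 100 students, 60 in "UG" and 40 in "PG".
frame = [(f"S{i:03d}", "UG" if i <= 60 else "PG") for i in range(1, 101)]


def simple_random(frame, n):
    """SRS: every unit has an equal chance of selection."""
    return random.sample(frame, n)


def systematic(frame, n):
    """Systematic: random start, then every k-th unit."""
    k = len(frame) // n          # sampling interval
    start = random.randrange(k)  # random start within the first interval
    return frame[start::k][:n]


def stratified(frame, n):
    """Stratified: SRS within each stratum, proportional allocation."""
    strata = {}
    for unit in frame:
        strata.setdefault(unit[1], []).append(unit)
    sample = []
    for units in strata.values():
        # Rounding can make the total differ from n for awkward proportions.
        share = round(n * len(units) / len(frame))
        sample.extend(random.sample(units, share))
    return sample


srs = simple_random(frame, 10)
sys_sample = systematic(frame, 10)
strat = stratified(frame, 10)
ug_count = sum(1 for _, level in strat if level == "UG")
print(len(srs), len(sys_sample), len(strat))
print(ug_count, "UG units in the stratified sample")
```

The stratified draw always mirrors the 60:40 split of the frame, while the simple random draw only does so on average.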
10.6.3 Sample Size
Drivers: desired confidence level (usually 95 %), margin of error (often 5 %), population variability, expected effect size, the design effect (DEFF) for cluster or stratified designs, and any planned subgroup analyses. Cochran’s formula is a standard starting point:
\[n_0 = \frac{Z^2 \cdot p (1-p)}{e^2}\]
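where Z = 1.96 for a 95 % confidence level, p = expected proportion (use 0.5 when it is unknown, since this maximises p(1 − p) and so gives the largest, most conservative n₀), and e = margin of error expressed as a proportion (e.g., 0.05).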
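A quick worked example of Cochran’s formula, using the conventional inputs mentioned above. The function name `cochran_n0` is an illustrative choice, and rounding up to the next whole respondent is a common convention assumed here rather than something stated in the notes.

```python
from math import ceil


def cochran_n0(z: float = 1.96, p: float = 0.5, e: float = 0.05) -> int:
    """Cochran's initial sample size, rounded up to a whole respondent."""
    n0 = (z ** 2) * p * (1 - p) / (e ** 2)
    return ceil(n0)


print(cochran_n0())        # 95 % confidence, p = 0.5, 5 % margin -> 385
print(cochran_n0(e=0.03))  # tighter 3 % margin -> 1068
```

With Z = 1.96, p = 0.5 and e = 0.05 the raw value is 384.16, which is why 384 or 385 appears so often as the "standard" sample size.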
10.7 Step 6 — Data Collection
10.7.1 Tools by Type of Data
| Quantitative | Qualitative |
|---|---|
| Questionnaire | In-depth interview |
| Structured interview | Focus group discussion |
| Test / inventory | Participant observation |
| Rating scale (Likert, Thurstone, Guttman, semantic differential) | Field notes |
| Observation schedule | Document analysis |
| Existing dataset | Photo / video / audio data |
10.7.2 Standardising the Instrument
Before main data collection, every instrument must be piloted, validated (content, criterion, construct) and reliability-checked (test-retest, parallel-form, split-half, KR-20/21, Cronbach’s α).
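As an illustration of the reliability check, Cronbach’s α can be computed directly from pilot data with the usual formula α = k/(k − 1) · (1 − Σ item variances / variance of total scores). The pilot scores below are invented for the sketch, and `cronbach_alpha` is a hypothetical helper, not a function from any package named in these notes.

```python
from statistics import variance


def cronbach_alpha(items: list[list[float]]) -> float:
    """items[i][j] = score of respondent j on item i."""
    k = len(items)
    item_variances = sum(variance(item) for item in items)
    totals = [sum(scores) for scores in zip(*items)]  # each respondent's total
    return (k / (k - 1)) * (1 - item_variances / variance(totals))


# Hypothetical pilot data: 4 Likert items answered by 6 respondents.
pilot = [
    [4, 5, 3, 4, 2, 5],
    [4, 4, 3, 5, 2, 5],
    [3, 5, 2, 4, 1, 4],
    [5, 5, 3, 4, 2, 4],
]
alpha = cronbach_alpha(pilot)
print(f"Cronbach's alpha = {alpha:.2f}")
```

Because the four invented items rise and fall together across respondents, α comes out high; items that do not move together would pull it down.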
10.7.3 Ethics During Collection
Informed consent · Anonymity & confidentiality · Right to withdraw · Minimal risk · IRB / IEC approval · Protection of vulnerable groups. (Full treatment in Topic 12.)
10.8 Step 7 — Data Analysis
10.8.1 Preparing the Data
Editing → Coding → Classification → Tabulation → Visualisation → Analysis
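The first stages of this chain can be sketched on a single questionnaire item. The raw responses and the codebook are hypothetical; real projects would record the codebook in advance.

```python
from collections import Counter

# Hypothetical raw responses to one questionnaire item.
raw = ["Agree", "agree ", "", "Disagree", "Neutral", "AGREE", None, "disagree"]

# Editing: drop blanks and normalise spacing and case.
edited = [r.strip().lower() for r in raw if r and r.strip()]

# Coding: map each category to a numeric code (the codebook is an assumption).
codebook = {"disagree": 1, "neutral": 2, "agree": 3}
coded = [codebook[r] for r in edited]

# Classification and tabulation: count the responses in each class.
table = Counter(coded)
for label, code in codebook.items():
    print(f"{label:<9} code={code} frequency={table[code]}")
```

The resulting frequency table is what would then be charted (visualisation) and passed to the statistical analysis.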
10.8.2 Choosing the Analytic Technique
- One variable, descriptive: frequency, mean, SD, percentiles, distribution shape.
- Two variables, association or group comparison: Pearson’s r, Spearman’s ρ and χ² for association; t-test, ANOVA, Mann-Whitney, Wilcoxon and Kruskal-Wallis for comparing groups.
- Predicting outcome: linear regression, logistic regression, multiple regression.
- Reducing dimensions: factor analysis, PCA.
- Modelling: SEM, multilevel modelling, time series, machine-learning models.
- Qualitative: thematic analysis (Braun & Clarke 2006), content analysis, narrative analysis, discourse analysis, IPA, grounded theory’s constant comparison.
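To show why the choice of technique matters, the sketch below computes Pearson’s r and Spearman’s ρ on the same invented data. The helper functions are written from the textbook definitions, and the ranking helper assumes there are no tied values, which is a simplification.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """Ranks from 1 to n; assumes no tied values (a simplification)."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson_r(ranks(x), ranks(y))


# Hypothetical data: score rises with hours, but not in a straight line.
hours = [1, 2, 3, 4, 5]
score = [1, 8, 27, 64, 125]
r = pearson_r(hours, score)
rho = spearman_rho(hours, score)
print(f"Pearson r = {r:.3f}")
print(f"Spearman rho = {rho:.3f}")
```

The relationship is perfectly monotonic, so ρ reaches 1, while r stays below 1 because the trend is curved; this is the usual argument for Spearman’s ρ with ordinal or non-linear data.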
10.8.3 Software
Quantitative: SPSS, R, Python, SAS, Stata, JASP, jamovi, MATLAB. Qualitative: NVivo, ATLAS.ti, MAXQDA, Dedoose, QDA Miner.
10.9 Step 8 — Interpretation and Conclusion
10.9.1 What Interpretation Adds
Analysis produces numbers, themes, models. Interpretation explains what the findings mean in relation to theory, prior research, and the original problem.
- Compare findings with hypotheses and prior literature.
- Explain unexpected results.
- Discuss practical and theoretical implications.
- Acknowledge limitations.
- Suggest directions for future research.
10.9.2 Common Inferential Errors
Correlation-causation confusion · Over-generalisation beyond sample · Ignoring effect-size · p-hacking · HARKing (hypothesising after results are known) · Confirmation bias · Survivorship bias.
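One mechanism behind p-hacking can be shown with simple arithmetic: if a researcher runs k tests when every null hypothesis is true, the chance of at least one false positive is 1 − (1 − α)^k. The sketch assumes the tests are independent, which real analyses often are not.

```python
def familywise_error(alpha: float, k: int) -> float:
    """P(at least one Type I error) across k independent tests at level alpha."""
    return 1 - (1 - alpha) ** k


for k in (1, 5, 20):
    print(f"{k:>2} tests at alpha = 0.05 -> {familywise_error(0.05, k):.3f}")
```

Under this assumption, twenty tests at α = 0.05 give roughly a two-in-three chance of at least one spurious "significant" result, so reporting only that result misleads.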
10.10 Step 9 — Reporting and Publication
10.10.1 The Standard Research Report Structure (IMRaD)
Introduction → Methods → Results → and Discussion, with the Abstract placed before the Introduction and the Conclusion, References and Appendices after the Discussion.
10.10.2 Outlets
- Peer-reviewed journals (UGC-CARE, Scopus, WoS, ABDC).
- Conferences — proceedings, posters.
- Books and chapters — Springer, Elsevier, OUP, Routledge, Sage.
- Theses — Shodhganga deposit mandatory for Indian PhDs.
- Working papers and preprints — SSRN, arXiv, bioRxiv, PsyArXiv.
- Public scholarship — policy briefs, blogs, newspaper op-eds.
10.10.3 Reporting Standards
Use the right reporting standard for the study type: CONSORT (trials), STROBE (observational), PRISMA (systematic reviews), COREQ / SRQR (qualitative), GRADE (evidence quality), MIAME (microarrays). Honest reporting includes declaring funding, conflicts of interest, ethics approval, data availability.
10.10.4 Citation, Plagiarism, Predatory Journals — Brief Pointers
Plagiarism, conflicts of interest, predatory journals and data fabrication are covered in full in Topic 12 (Research Ethics). The essentials here: always cite primary sources, use a reference manager, run a similarity check (Turnitin / iThenticate), and avoid predatory journals (Beall’s list legacy).
10.11 Practice Questions
The FIRST step in the research process is:
Arrange the following steps in correct order:
(i) Sampling (ii) Literature review (iii) Hypothesis formulation (iv) Problem identification
Which is NOT a characteristic of a good research problem?
"A hypothesis is a proposition which can be put to a test to determine its validity." This definition is by:
The null hypothesis (H₀) states that:
Rejecting a TRUE null hypothesis is called:
A systematic review that statistically pools the quantitative findings of multiple studies is called:
The PRISMA guidelines are used in:
In stratified random sampling, the population is FIRST divided into:
A researcher selects every 10th name from a college register. This is:
Cronbach's α primarily assesses:
Likert scales typically generate data on which scale of measurement?
The sequence of qualitative data preparation in grounded theory is:
"Hypothesising After the Results are Known" — fabricating a hypothesis to match the data afterwards — is called:
"IMRaD" is the standard structure of a research article. It stands for:
In India, doctoral theses are deposited in the national repository called:
Match each activity to its research step:
| | Activity | | Research step |
|---|---|---|---|
| (i) | Running Cronbach's α on the pilot questionnaire | (a) | Sampling |
| (ii) | Cochran's formula calculation | (b) | Data collection prep |
| (iii) | PRISMA flow diagram | (c) | Reporting |
| (iv) | IMRaD structure | (d) | Literature review |
Which of the following is a NON-SAMPLING error?
A numerical characteristic of a POPULATION (such as μ or σ) is called:
Arrange the FULL 9-step research process in correct order:
(i) Sampling (ii) Problem identification (iii) Hypothesis (iv) Data collection (v) Literature review (vi) Reporting (vii) Research design (viii) Interpretation (ix) Data analysis
10.12 Quick Recall
- 9 steps: Problem → Literature → Hypothesis → Design → Sampling → Collection → Analysis → Interpretation → Reporting.
- Good research problem: Researchable, Significant, Original, Clear, Feasible, Ethical, Specific.
- Lit-review purposes: map, gap, avoid duplication, refine problem, framework, method guide, credibility.
- Lit-review types: Narrative · Systematic (PRISMA) · Meta-analysis (Glass 1976) · Scoping · Integrative.
- Hypothesis definitions: Goode & Hatt — “proposition testable for validity”; Kerlinger — “conjectural statement of relation between two or more variables”.
- Hypothesis types: Research/H₁ · Null/H₀ · Directional · Non-directional · Statistical.
- Type I (α) = false positive · Type II (β) = false negative · Power = 1 − β ≥ 0.80 · α = 0.05.
- 4 validity layers (Shadish, Cook & Campbell 2002): Statistical conclusion · Internal · Construct · External.
- Sampling vocabulary: Population · Target · Accessible · Frame · Unit · Parameter vs Statistic · Sampling error vs Non-sampling error.
- Probability: SRS · Stratified · Systematic · Cluster · Multi-stage · PPS. Non-probability: Convenience · Purposive · Quota · Snowball · Voluntary.
- Cochran’s formula: n₀ = Z²·p(1-p)/e²; Z=1.96 for 95% CI.
- Reliability: test-retest, parallel-form, split-half, KR-20/21, Cronbach’s α ≥ 0.70.
- Validity: content, construct, criterion (concurrent & predictive), face.
- Quantitative software: SPSS · R · Python · SAS · Stata · JASP · jamovi. Qualitative: NVivo · ATLAS.ti · MAXQDA · Dedoose.
- Inferential errors: correlation-causation · over-generalisation · p-hacking · HARKing · confirmation/survivorship bias.
- IMRaD: Introduction · Methods · Results · and Discussion (plus Abstract, References, Appendices).
- Reporting standards: CONSORT (trials) · STROBE (observational) · PRISMA (reviews) · COREQ/SRQR (qualitative) · GRADE (evidence).
- Indian repositories: Shodhganga (theses) · Shodhgangotri (synopses) · VIDWAN (experts) · e-ShodhSindhu (journals) · e-PG Pathshala (PG content) · NDLI.