A Systematic Framework for Research Design in Quantitative Social Science (Brief Draft) - From Phenomenon to Operationalization
Introduction: Ontological and Epistemological Foundations of Research Questions
The effectiveness of quantitative social science research depends on the precision and testability of its research questions. The most common methodological deficiency in empirical research, however, stems not from technical errors in statistical inference but from conceptual ambiguity and unmeasurable constructs at the level of the question itself.
A research question that appears clear in natural language often reveals numerous implicit ontological commitments, untested causal assumptions, and underspecified measurement models once it enters the operationalization stage of empirical research.
This paper aims to systematically articulate the front-end epistemic work in research design: how to transform a phenomenological observation into an operable, measurable, and testable scientific question. This process includes four core components: question discovery, conceptual clarification, theoretical positioning, and operationalization modeling.
1. Research Question Discovery
1.1 Sources and Typology of Research Questions
Research question generation follows specific knowledge production logic. Based on the nature of question sources, research questions can be categorized as:
(1) Literature-driven questions
- Theoretical gaps: phenomenal domains not covered by existing theoretical frameworks
- Empirical gaps: theoretical propositions lacking empirical testing
- Contradictory findings: conflicting conclusions across different studies
- Methodological limitations: systematic deficiencies in measurement or inference in existing research
(2) Phenomenon-driven questions
- Social reality observations: empirical patterns in practical domains requiring explanation
- Policy evaluation needs: causal identification problems of intervention effects
- Theoretical paradoxes: anomalous phenomena that existing theories cannot explain
(3) Method-driven questions
- New data availability: research opportunities from large-scale data or new measurement technologies
- New method applicability: application expansion of computational methods or statistical models
- Replicability issues: re-examination and robustness testing of existing research
1.2 Structural Relationship Between Theory and Research Questions
A bidirectional constructive relationship exists between research questions and theory:
(1) Deductive approach
- Deriving testable specific hypotheses from theoretical propositions
- Confirmatory research design
- Emphasizing internal validity and causal inference
(2) Inductive approach
- Extracting theoretical patterns from empirical observations
- Exploratory research design
- Emphasizing external validity and concept discovery
(3) Abductive approach
- Iterative cycling between theory and empirical evidence
- Theory-driven data exploration
- Balancing explanatory and predictive power
1.3 Types of Research Contributions
Clarifying the type of research question helps position its potential contribution:
- Theoretical contribution: Proposing new concepts, mechanisms, or causal pathways
- Empirical contribution: Providing new evidence, measurements, or facts
- Methodological contribution: Developing new research designs or estimation strategies
- Integrative contribution: Synthesizing scattered research conclusions into systematic knowledge
2. Refining Research Questions: From Vague to Precise
2.1 Formal Criteria for Good Research Questions
A researchable question must satisfy the following formal conditions:
(1) Specificity
- Clearly define research objects, key variables, and relationship types
- Avoid ambiguous concepts or overly broad expressions
(2) Answerability
- Question can in principle be answered through empirical evidence
- Viable research design and data collection plan exists
(3) Boundedness
- Question has clear spatiotemporal boundaries and scope of applicability
- Does not attempt to answer all related questions in a single study
(4) Operationalizability
- Core concepts can be transformed into observable, measurable variables
- Theoretical relationships can be converted into statistical models or computational procedures
2.2 Value Judgment Criteria for Research Questions
Beyond formal criteria, research questions must meet substantive value standards:
(1) Significance
- Theoretical significance: Advancing disciplinary knowledge boundaries
- Practical significance: Solving real-world problems or guiding policy
- Methodological significance: Demonstrating new research paradigms
(2) Feasibility
- Data availability: Whether required data exists or can be collected
- Technical feasibility: Whether existing methods can support research design
- Resource constraints: Whether time, funding, sample size are sufficient
(3) Originality
- Avoiding simple repetition of existing research
- Innovation in concepts, methods, or evidence
2.3 Iterative Process of Question Refinement
The path from a preliminary interest to a clear research question typically passes through the following stages:
Stage 1: Broad interest → e.g., “How does social media affect mental health?”
Stage 2: Theoretical positioning → e.g., “Relationship between social media use and depression”
Stage 3: Mechanism specification → e.g., “Social comparison as mediating mechanism”
Stage 4: Boundary delimitation → e.g., “Effects of specific platforms in adolescent populations”
Stage 5: Operationalized question → e.g., “Does Instagram usage frequency increase adolescent depressive symptoms through upward social comparison?”
3. From Concept to Operation: The Epistemological Architecture of Operationalization
3.1 The Essence of Operationalization: Measurement Theory
Operationalization is not simply “variable selection,” but an epistemological process involving ontological commitments and measurement theory. Its core task is:
Under explicit assumptions, construct a formalized measurement-relationship model that maps conceptual entities to observable variables.
The complete definition of operationalization can be expressed as:
$$\text{Operationalization} = f_{\text{map}}: \mathcal{C} \to \mathcal{O} \mid \mathcal{A}$$
Where:
- $\mathcal{C}$ = conceptual space
- $\mathcal{O}$ = observational space
- $f_{\text{map}}$ = mapping function
- $\mathcal{A}$ = assumption set
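To make the mapping concrete, the definition above can be sketched as a small record that forces each element of $\mathcal{C}$, $\mathcal{O}$, and $\mathcal{A}$ to be written down explicitly. This is an illustrative Python sketch; the class name, fields, and example content are ours, not a standard API:

```python
from dataclasses import dataclass

@dataclass
class Operationalization:
    """A record of f_map: C -> O | A, making every commitment explicit."""
    concept: str             # element of the conceptual space C
    indicators: list[str]    # observable variables in the observational space O
    assumptions: list[str]   # the assumption set A under which the mapping is valid

# Hypothetical example: operationalizing "depressive symptoms"
dep = Operationalization(
    concept="depressive symptoms",
    indicators=["PHQ-9 total score, self-report, 12 weeks post-treatment"],
    assumptions=[
        "PHQ-9 validly measures the construct in this population",
        "reporting error is uncorrelated with treatment assignment",
    ],
)
print(dep.concept, len(dep.assumptions))
```

Writing the assumption set down as data, rather than leaving it implicit, is the point of the exercise: every entry in `assumptions` is a claim that can later be challenged.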
3.2 Case Analysis: Dissecting a Research Question
Consider the following research question:
“Can cognitive behavioral therapy reduce depressive symptoms in adolescents?”
This question is clear at the natural language level but fundamentally unresearchable at the operational level.
Each key concept conceals ambiguities that must be resolved, specifically:
(1) Who are “adolescents”?
- Age range: 10-18 years? 12-17 years?
- Population characteristics: Clinical sample? Community sample? School sample?
- Inclusion criteria: Depression diagnosis required? Comorbidities allowed? Medication allowed?
(2) What is “cognitive behavioral therapy” (CBT)?
- Specific type: Manualized CBT? Individual or group? Online or offline?
- Treatment protocol: How many sessions? Duration? Therapist qualifications?
- Fidelity: How to ensure implementation is actually CBT and not something else?
(3) What does “therapy” mean?
- Exposure definition: Attendance? Completion? Minimum dose?
- Control group: No treatment? Waitlist? Treatment as usual?
- Time dimension: Immediate effects? 3-month follow-up?
(4) How to measure “depressive symptoms”?
- Measurement tool: PHQ-9? BDI? CDI?
- Reporter: Self-report? Clinician? Parent?
- Scale nature: Continuous score? Clinical threshold?
- Time point: Post-treatment? Change score? Trajectory?
(5) What causal structure does “reduce” imply?
- Causal estimand: Average Treatment Effect (ATE)? Local effect?
- Comparison object: Relative to what?
- Confounding control: Randomization? Matching? Covariate adjustment?
3.3 From Natural Language to Operationalized Question
After operationalization, the original question might become:
“Among 13-17 year-old adolescents diagnosed with moderate depression, compared to treatment as usual, can a 12-session manualized individual CBT program reduce PHQ-9 scores at 12 weeks?”
Now:
- Population is specified
- Intervention is defined
- Outcome is measurable
- Causal contrast is set
3.4 Formalized Operationalization Process
We can formalize the operationalization process into five steps:
Step 1: Ontology specification
- Identify all entities, actions, outcomes, contexts
Step 2: Measurement model
- Construct observable variables for each concept
- Specify measurement error and validity threats
Step 3: Action encoding
- Quantify intervention, exposure, treatment
- Define control conditions
Step 4: Structural relations
- Specify hypothesized dependency paths and causal mechanisms
- Construct Directed Acyclic Graph (DAG) or Structural Equation Model (SEM)
Step 5: Estimand definition
- Define target causal or associational quantity
- Specify identification assumptions
3.5 Hierarchical System of Operationalization: From Philosophy to Technology
The operationalization process can be understood as a multi-level epistemological system:
Level 1: Ontological foundation
What exists in our research domain?
- Entity identification: populations, interventions, outcomes, contexts
- Attribute definition: intrinsic vs. relational properties
- Relation types: causal, correlational, constitutive
- Philosophical stance: realism vs. constructivism, observable vs. latent constructs
Level 2: Measurement model
How do concepts map to observations?
Classical measurement theory
- Classical Test Theory (CTT): true score + error
- Item Response Theory (IRT): latent variable-item probability curve
- Scale development: reliability, validity
Modern measurement methods
- Multimodal measurement: behavioral, physiological, self-report, observational
- Digital phenotyping: passive sensor data
- Neuroimaging: fMRI, EEG as indicators of psychological processes
Latent variable modeling
- Factor analysis: reducing multiple observed indicators to latent constructs
- Structural Equation Modeling (SEM): measurement model + structural model
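As a minimal illustration of CTT reliability, Cronbach's alpha can be computed directly from an item-score matrix. The simulated data below (four noisy indicators of one latent trait) and the function name are ours:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents x k_items) score matrix (CTT reliability)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the sum score
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

rng = np.random.default_rng(0)
true_score = rng.normal(size=(200, 1))                # one latent trait
items = true_score + 0.5 * rng.normal(size=(200, 4))  # four noisy indicators of it
print(round(cronbach_alpha(items), 2))                # high alpha: items cohere
```

With these noise levels the theoretical alpha is about 0.94; halving the signal-to-noise ratio of the indicators would drop it sharply, which is exactly what scale development tries to detect.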
Level 3: Action encoding
How are interventions/treatments represented?
Discrete representation
- Binary: treatment vs. control
- Categorical: different treatment types
Continuous representation
- Dose-response: continuous variation in treatment intensity
- Time dimension: exposure duration
Multidimensional representation
- Component analysis: treatment composed of multiple components
- Vector representation: $t = [\text{duration}, \text{intensity}, \text{fidelity}, \text{component}_1, \ldots, \text{component}_k]$
Temporal structure
- Single time-point treatment
- Time-varying treatment: $T(t)$ describes intervention state at each moment
- Adaptive interventions: adjusted based on intermediate outcomes
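These encodings can be sketched as arrays; all numbers below are invented for illustration:

```python
import numpy as np

# Binary encoding: treated vs. control
t_binary = np.array([1, 0, 1])

# Multidimensional encoding: one row per participant,
# columns = [duration_weeks, intensity, fidelity]
t_vector = np.array([
    [12.0, 1.0, 0.9],
    [ 0.0, 0.0, 0.0],
    [12.0, 0.5, 0.8],
])

# Time-varying encoding: T(t) as a (participants x time points) 0/1 matrix
t_path = np.array([
    [0, 1, 1, 1],  # begins treatment at t = 1
    [0, 0, 0, 0],  # never treated
    [0, 0, 1, 1],  # begins at t = 2 (e.g., adaptive timing)
])
print(t_path.sum(axis=1))  # exposure duration per participant: [3 0 2]
```

The choice among these representations is itself an operationalization decision: collapsing `t_path` to `t_binary` discards dose and timing information that some research questions require.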
Level 4: Structural relations
What is the dependency structure among variables?
Causal graphs (DAG)
- Classical confounding: T ← X → Y together with T → Y, where X is a common cause of both treatment and outcome
Structural Causal Models (SCM)
- $T := f_T(X, U_T)$, $Y := f_Y(T, X, U_Y)$: each variable is a function of its direct causes and exogenous noise
Identification assumptions
- Ignorability: $\{Y(1), Y(0)\} \perp T \mid X$ (treatment is as good as random given the covariates $X$)
- Consistency: If $T=t$ then $Y = Y(t)$
- Positivity: $0 < P(T=t|X) < 1$
- SUTVA: no interference between units, and each treatment level has a single well-defined version
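Of these assumptions, positivity is the one most directly checkable in data. A rough empirical check is to stratify on X and verify that no stratum is entirely treated or untreated; the function name and simulated data below are ours:

```python
import numpy as np

def positivity_check(x, t, n_bins=5):
    """Crude positivity check: treated share within each quantile stratum of X.
    Shares of exactly 0 or 1 flag strata where treatment never/always occurs."""
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1))
    shares = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (x >= lo) & (x <= hi)
        shares.append(t[mask].mean())
    return np.array(shares)

rng = np.random.default_rng(1)
x = rng.normal(size=1000)
t = rng.binomial(1, 1 / (1 + np.exp(-x)))  # P(T=1|X) rises with X but stays in (0, 1)
shares = positivity_check(x, t)
print(np.all((shares > 0) & (shares < 1)))  # no stratum violates positivity
```

Ignorability and SUTVA, by contrast, are not checkable from observed data alone; they must be defended substantively, which is why the assumption set belongs in the operationalization record rather than the analysis appendix.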
Level 5: Estimand definition
What exactly are we estimating?
Types of causal estimands
- ATE (Average Treatment Effect): $E[Y(1) - Y(0)]$
- ATT (Average Treatment Effect on the Treated): $E[Y(1) - Y(0) \mid T=1]$
- CATE (Conditional ATE): $E[Y(1) - Y(0) \mid X=x]$
Effect quantification methods
- Absolute difference: $\Delta = \bar{Y}_{\text{treatment}} - \bar{Y}_{\text{control}}$
- Relative difference: $RR = \bar{Y}_{\text{treatment}} / \bar{Y}_{\text{control}}$
- Standardized difference: Cohen's $d = \Delta / \sigma_{\text{pooled}}$
- NNT (Number Needed to Treat): $1/(P_{\text{response},T} - P_{\text{response},C})$
Inference target
- Sample vs. population
- Causal vs. associational
- Prediction vs. explanation
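The distinction among these estimands can be made concrete with simulated potential outcomes, where both Y(1) and Y(0) are known and the effect varies with X. All parameters below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000
x = rng.normal(size=n)                     # confounder
y0 = x + rng.normal(size=n)                # potential outcome under control
y1 = y0 + 1.0 + 0.5 * x                    # CATE(x) = 1 + 0.5x, so true ATE = 1
t = rng.binomial(1, 1 / (1 + np.exp(-x)))  # treatment more likely at high X

ate = (y1 - y0).mean()                     # E[Y(1) - Y(0)]
att = (y1 - y0)[t == 1].mean()             # E[Y(1) - Y(0) | T = 1]
y_obs = np.where(t == 1, y1, y0)
naive = y_obs[t == 1].mean() - y_obs[t == 0].mean()  # confounded group contrast
print(round(ate, 2), round(att, 2), round(naive, 2))
```

Because treatment uptake rises with X and so does the effect, the ATT exceeds the ATE, and the naive observed-group contrast exceeds both: the three numbers answer three different questions, which is why the estimand must be fixed before estimation begins.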
4. After Operationalization: Choosing Research Designs
4.1 Critical Juncture: From Operationalization to Evidence Generation
After completing operationalization, researchers do not directly enter the “data analysis stage” but face the most critical decision point in research design:
How to generate credible evidence to answer the operationalized research question?
The core of this stage is not “choosing statistical methods” but selecting an evidence-generating mechanism. Different research designs represent fundamentally different:
- Data source logic
- Causal inference strategies
- Types of validity threats
- Generalization boundaries
4.2 Fundamental Divisions in Evidence Generation
First fork: Evidence agency
Active generation
- Researcher controls or manipulates key variables
- Typical: experiments, intervention studies
- Advantage: strong causal inference
- Cost: external validity, ethics, resources
Passive collection
- Utilizing already existing or naturally occurring data
- Typical: observational studies, surveys, secondary data
- Advantage: ecological validity, feasibility
- Cost: difficulty in confounding control
Second fork: Evidence temporality
Prospective
- Tracking from present to future
- Advantage: clear temporal ordering
- Typical: cohort studies, RCT
Retrospective
- Looking back from present to past
- Advantage: efficiency, low cost
- Typical: case-control, literature review
4.3 Systematic Classification of Research Designs
Design 1: Experimental and Quasi-Experimental Designs
(1) Randomized Controlled Trial (RCT)
Core features:
- Random assignment to treatment conditions
- Strongest causal inference capability
- Highest internal validity
Design variants:
- Parallel group design: participants randomly assigned to different groups
- Crossover design: participants receive different treatments sequentially
- Factorial design: testing multiple factors simultaneously (2×2, 2×3)
- Cluster randomization: randomizing groups as units
Key decisions:
How to set control conditions?
- No treatment control
- Placebo control
- Treatment as usual (TAU) control
- Waitlist control
Treatment allocation ratio?
- 1:1
- 2:1 (more participants receive treatment)
- Adaptive randomization
Blinding strategy?
- Single-blind (participants unaware)
- Double-blind (participants and researchers unaware)
- Triple-blind (plus data analysts)
Sample size and power?
- Expected effect size
- α level (typically 0.05)
- 1-β power (typically 0.80)
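Under the normal approximation, these three quantities determine the required sample size per arm for a two-sample comparison of means. A stdlib-only sketch; the function name is ours, and the formula is the standard approximation without small-sample correction:

```python
from statistics import NormalDist
import math

def n_per_group(d: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate n per arm for a two-sample mean comparison:
    n = 2 * ((z_{1 - alpha/2} + z_{power}) / d) ** 2."""
    z = NormalDist().inv_cdf
    return math.ceil(2 * ((z(1 - alpha / 2) + z(power)) / d) ** 2)

print(n_per_group(0.5))  # medium effect at the defaults -> 63 per arm
print(n_per_group(0.2))  # small effects require far larger samples
```

Textbook tables give roughly 64 per arm for d = 0.5 at α = 0.05 and power 0.80 once a small-sample correction is added; the normal approximation yields 63, close enough for planning purposes.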
(2) Quasi-Experimental Designs
When randomization is infeasible or unethical:
Difference-in-Differences (DiD)
- Difference in changes between treatment and control groups before/after treatment
- Core assumption: parallel trends
- Application: policy evaluation, natural experiments
Regression Discontinuity (RDD)
- Treatment assignment based on cutoff value of continuous variable
- Core assumption: similarity near discontinuity point
- Application: scholarship, admission policy evaluation
Instrumental Variables (IV)
- Finding variables that affect treatment but not directly outcome
- Core assumptions: relevance, exclusion, monotonicity
- Application: returns to education, medical treatment effects
Interrupted Time Series (ITS)
- Multiple time-point measurements before/after treatment
- Core: level or trend change at treatment time
- Application: policy intervention, public health
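The DiD logic can be illustrated on simulated two-period data in which parallel trends hold by construction; all parameters below are invented:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 2000
treated = rng.binomial(1, 0.5, size=n)

# Parallel trends by construction: both groups share the +2.0 time trend;
# the treated group additionally receives the true effect of +1.5.
y_pre = 5.0 + 1.0 * treated + rng.normal(size=n)          # baseline gap is allowed
y_post = y_pre + 2.0 + 1.5 * treated + rng.normal(size=n)

did = ((y_post[treated == 1].mean() - y_pre[treated == 1].mean())
       - (y_post[treated == 0].mean() - y_pre[treated == 0].mean()))
print(round(did, 2))  # recovers the true effect despite the baseline gap
```

Note what DiD does and does not require: the groups may differ in levels at baseline, but their counterfactual trends must be parallel, an assumption the data before treatment can probe but never fully verify.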
(3) Laboratory/Online Experiments
- Controlled experimental environment
- Online A/B testing (large-scale)
- Behavioral economics experiments
- Advantage: precise control, replicable
- Disadvantage: ecological validity issues
Design 2: Survey Designs
(1) Cross-sectional Survey
Features:
- Single time-point measurement
- Describing current state
- Exploring associational relationships
Key design elements:
Sampling strategy
- Probability sampling: simple random, stratified, cluster
- Non-probability sampling: convenience, quota, snowball
Survey mode
- Online questionnaire
- Telephone interview
- Face-to-face interview
- Mixed-mode
Questionnaire design
- Question types: open/closed
- Scale selection: validated instruments
- Order effect control
- Response bias detection
Inference limitations:
- Cannot establish causality: correlation ≠ causation
- Reverse causality: Y may affect X
- Third variables: Z affects both X and Y
(2) Longitudinal Survey
Panel study
- Repeated measurements of same individuals
- Controls for unobserved heterogeneity
- Can test change and causality
Cohort study
- Tracking specific groups (e.g., birth cohorts)
- Prospective causal inference
- Suitable for developmental questions
Repeated cross-sections
- Different samples at different times
- Describing population trends
- Cannot track individual change
Key advantages:
- Establishing temporal ordering
- Separating within/between effects
- Testing developmental trajectories
Design 3: Observational Studies
(1) Cohort Study
- Prospective tracking from exposure to outcome
- Can calculate incidence, relative risk
- Suitable for rare exposure studies
(2) Case-Control Study
- Retrospectively tracing exposure from outcome
- High efficiency, suitable for rare diseases
- Odds Ratio estimation
(3) Ecological Study
- Population as unit of analysis
- Using aggregated data
- Beware of ecological fallacy
Design 4: Literature Review and Meta-Analysis
This is an independent research design, not just “preliminary work.”
(1) Systematic Review
Core steps:
PICO framework
- Population: study population
- Intervention: intervention measures
- Comparison: control conditions
- Outcome: outcome indicators
Literature search strategy
- Database selection
- Keyword combinations
- Time range
Inclusion/exclusion criteria
- Study design types
- Sample characteristics
- Quality assessment
Data extraction
- Study characteristic coding
- Effect size extraction
- Risk of bias assessment
(2) Meta-Analysis
Quantitative synthesis of multiple studies:
Core tasks:
Effect size standardization
- Cohen’s d
- Odds ratio
- Correlation coefficient r
Heterogeneity testing
- I² statistic
- Q test
- τ² estimation
Model selection
- Fixed-effect model: assumes a single common true effect across studies
- Random-effects model: allows the true effect to vary between studies
Publication bias
- Funnel plot
- Egger’s test
- Trim-and-fill
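The fixed-effect pooling and the I² computation can be sketched directly from reported effects and variances; the five studies below are invented:

```python
import numpy as np

def fixed_effect_meta(effects, variances):
    """Inverse-variance fixed-effect pooling with Cochran's Q and I2 (in percent)."""
    effects, variances = np.asarray(effects), np.asarray(variances)
    w = 1.0 / variances                      # inverse-variance weights
    pooled = (w * effects).sum() / w.sum()
    q = (w * (effects - pooled) ** 2).sum()  # Cochran's Q statistic
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, q, i2

# Five invented studies reporting Cohen's d and its sampling variance
pooled, q, i2 = fixed_effect_meta(
    effects=[0.30, 0.45, 0.25, 0.50, 0.40],
    variances=[0.02, 0.03, 0.025, 0.04, 0.03],
)
print(round(pooled, 3), round(i2, 1))
```

Here I² is 0: the spread of the five effects is no larger than sampling error alone would produce, so a fixed-effect model is defensible. Substantial I² would instead motivate a random-effects model and a search for moderators.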
(3) Scoping Review
- Exploratory literature mapping
- Conceptual boundary mapping
- Suitable for emerging fields
Design 5: Text and Document Analysis
(1) Content Analysis
Classical methods:
- Manual coding
- Codebook development
- Inter-coder reliability (Cohen’s κ)
Modern methods:
- Automatic text classification (NLP)
- Supervised learning: training on labeled data
- Unsupervised learning: clustering, topic models
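Inter-coder reliability via Cohen's κ can be computed by hand; a small sketch with invented codes:

```python
import numpy as np

def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    codes_a, codes_b = np.asarray(codes_a), np.asarray(codes_b)
    p_obs = (codes_a == codes_b).mean()
    labels = np.union1d(codes_a, codes_b)
    # Chance agreement: probability both coders pick the same label independently
    p_chance = sum((codes_a == c).mean() * (codes_b == c).mean() for c in labels)
    return (p_obs - p_chance) / (1 - p_chance)

coder_1 = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg"]
coder_2 = ["pos", "pos", "neg", "pos", "pos", "neg", "neg", "neg"]
print(cohens_kappa(coder_1, coder_2))  # 0.5: raw agreement 0.75, chance 0.5
```

The example shows why raw percent agreement overstates reliability: with two balanced categories, two coders flipping coins would already agree half the time, and κ subtracts exactly that baseline.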
(2) Discourse Analysis
- Conversation analysis
- Critical discourse analysis
- Frame analysis
(3) Computational Text Analysis
Topic models:
- LDA (Latent Dirichlet Allocation)
- STM (Structural Topic Model)
Sentiment analysis:
- Dictionary methods
- Machine learning classification
- Deep learning (BERT, etc.)
Word embeddings:
- Word2Vec
- GloVe
- Contextual embeddings (transformers)
Text networks:
- Co-occurrence networks
- Semantic networks
Design 6: Secondary Data Analysis
(1) Existing Survey Data
- Large public datasets: GSS, ANES, NHANES
- Advantage: large sample, low cost
- Disadvantage: variable constraints, cannot control measurement
(2) Administrative Records
- Medical records (EMR/EHR)
- Educational data (grades, attendance)
- Government databases (census, tax)
(3) Big Data Sources
- Social media: Twitter, Reddit, Weibo
- Digital traces: search logs, browsing records
- Sensor data: wearables, IoT
- Transaction data: credit cards, e-commerce
Key challenges:
- Unknown data generation mechanisms
- Selection bias
- Non-standardized measurement
- Privacy and ethics
Design 7: Mixed Methods Designs
(1) Sequential Design
Exploratory sequential:
- Phase 1: Qualitative exploration (interviews, focus groups)
- Phase 2: Quantitative study based on qualitative findings
- Purpose: generate hypotheses → test hypotheses
Explanatory sequential:
- Phase 1: Quantitative survey/experiment
- Phase 2: Qualitative depth (explain quantitative results)
- Purpose: test hypotheses → understand mechanisms
(2) Concurrent Design
Convergent:
- Simultaneously collect qualitative and quantitative data
- Compare both types of results
- Purpose: triangulation
Embedded:
- Auxiliary method embedded in main method
- Example: interviews embedded in RCT
- Purpose: supplementary understanding
Design 8: Simulation and Computational Modeling
(1) Agent-Based Modeling (ABM)
- Individual rules → emergent patterns
- Suitable for: social processes, diffusion, collective behavior
(2) System Dynamics
- Feedback loops, stocks and flows
- Suitable for: policy simulation, macro processes
(3) Network Simulation
- Social network evolution
- Diffusion process simulation
(4) Monte Carlo Simulation
- Sensitivity analysis
- Uncertainty quantification
4.4 Decision Framework for Research Design Selection
Dimension 1: Research Question Type
| Question Type | Priority Design |
|---|---|
| Causal effect (X → Y?) | Experiment, quasi-experiment |
| Descriptive (distribution of X?) | Survey, observational study |
| Exploratory (what patterns exist?) | Mixed methods, text analysis |
| Synthetic (existing evidence?) | Systematic review, meta-analysis |
| Mechanistic (how does it happen?) | Mixed methods, longitudinal study |
Dimension 2: Feasibility Constraints
| Constraint Type | Viable Design |
|---|---|
| Ethics prohibit randomization | Quasi-experiment, observational study |
| Cannot collect new data | Secondary data, literature review |
| Limited sample size | Case study, in-depth interviews |
| Time pressure | Cross-sectional survey, secondary data |
| Abundant resources | RCT, large-scale longitudinal study |
Dimension 3: Validity Trade-offs
| Priority Validity | Design Choice |
|---|---|
| Internal validity | Laboratory RCT |
| External validity | Field quasi-experiment, large survey |
| Construct validity | Mixed methods, multiple measurements |
| Statistical conclusion validity | Large sample, experimental control |
Dimension 4: Inference Target
| Inference Type | Design Requirements |
|---|---|
| Causal inference | Experiment or strong identification strategy |
| Associational inference | Survey, observational study |
| Descriptive inference | Representative sampling |
| Predictive inference | Big data, machine learning |
4.5 Core Principles of Design Selection
Principle 1: Logical matching between design and RQ
Not every RQ deserves every design.
- Descriptive RQ + RCT = resource waste
- Causal RQ + cross-sectional survey = overclaiming
Principle 2: Assumption management
Every design has its core assumptions:
- Experiment: assumes ethics and control feasibility
- Survey: assumes measurement validity and extrapolation limits
- Literature: assumes quality of existing research
- Secondary data: assumes uncontrollable data generation process
Principle 3: Inferential humility
Must actively answer three questions:
- Under this design, can my RQ still be fully answered?
- Which part of the original RQ can my conclusion address?
- Which claims must I actively abandon?
Principle 4: Design cannot “remedy” poor operationalization
If operationalization has already failed (vague concepts, invalid measurement),
even the best research design cannot salvage the study.
The premise of design selection is: operationalization is complete and reasonable.
4.6 Key Takeaways
At the research design selection stage, one must understand:
Heterogeneity of evidence generation logic
- Experiments generate “what if we intervene” evidence
- Surveys generate “what is the current state” evidence
- Literature generates “what is known” evidence
No “best” design, only “best matched” design
- RCT is not always the gold standard
- Observational studies are more appropriate in certain contexts
- Design selection is a constrained optimization problem
Design selection commits to specific assumptions
- Each design has different validity threats
- Identification assumptions must be explicit
- Conclusion boundaries must be acknowledged
5. Literature Review as a Tool for Question Discovery
5.1 Functions of Systematic Literature Review
Literature review is not merely “background introduction” but a core tool for research question discovery:
(1) Identifying knowledge gaps
- Which questions remain unstudied?
- Which theoretical mechanisms remain untested?
- Which populations or contexts remain uncovered?
(2) Understanding contradictory findings
- Why do different studies reach different conclusions?
- Do contradictions stem from differences in conceptual definitions, measurement methods, or sample characteristics?
- Are there moderating variables or boundary conditions?
(3) Positioning your contribution
- How does your research advance existing knowledge?
- Avoiding simple repetition of existing work
- Clarifying your unique angle or incremental contribution
5.2 Systematic vs. Narrative Review
Systematic review
- Explicit inclusion/exclusion criteria
- Exhaustive literature search
- Structured information extraction
- Suitable for integrative research in mature fields
Narrative review
- Selective literature coverage
- Theory-oriented organization
- Critical interpretation and synthesis
- Suitable for conceptual mapping in emerging fields
5.3 Pathway from Literature to Questions
Excellent literature reviews should:
- Not only summarize “what has been done” but point out “what is missing”
- Not only list conclusions but analyze “why such conclusions”
- Not only describe current state but propose “what should be studied next”
6. Completeness Checklist for Research Questions
For any research question, researchers should be able to answer the following six questions:
Who exactly?
- Clear population definition and selection rules
What exactly is done?
- Precise intervention, exposure, or treatment definition
Compared to what?
- Clear counterfactual or control conditions
Measured how?
- Specific measurement tools, reporters, time points
Over what time?
- Time range, follow-up periods, causal timing
Under which assumptions?
- Identification assumptions, measurement assumptions, causal assumptions
Key principle:
If you cannot explain how a concept is measured,
then you don’t yet have a true research question.
7. Integration of Classical and Modern Research Paradigms
7.1 Characteristics of Classical Paradigm
Theory-driven hypothesis testing
- Deriving hypotheses from explicit theoretical frameworks
- Confirmatory research design
- Relying on existing constructs and measurement tools
- Emphasizing internal validity and causal inference
7.2 Characteristics of Modern Paradigm
Data-driven pattern discovery
- Exploring patterns from large-scale data
- Exploratory analysis and machine learning
- Computational methods enabling new questions
- Emphasizing prediction accuracy and generalization capability
7.3 Integration Pathway: Research Questions in Computational Social Science
Modern quantitative social science needs to find balance between two paradigms:
(1) Dialogue between theory and data
- Using theory to guide direction of exploratory analysis
- Using data to test and refine theoretical predictions
- Balancing explanatory and predictive power
(2) New questions from new methods
- Large-scale text data → discourse analysis and opinion dynamics
- Network data → social structure and diffusion processes
- Digital traces → behavioral patterns and decision mechanisms
- Computational simulation → mechanism exploration and counterfactual reasoning
(3) New evidence for classic questions
- Re-examining classic theories with new data
- Improving credibility of causal inference with new methods
- Expanding scale and complexity of problems with computational power
8. Role of Ethical Considerations in Question Formation
8.1 Upstream Nature of Research Ethics
Ethical considerations are not an “additional step” in research but should be integrated into the question discovery stage:
(1) Vulnerable population protection
- Does the research question involve children, patients, marginalized groups?
- How to ensure the research process causes no additional harm?
- How to design informed consent and privacy protection?
(2) Social consequences of research
- How might research results be used or misused?
- Could it reinforce stereotypes or stigma?
- What potential impact on policy and practice?
(3) Data justice
- Is data collection fair?
- Are there systematic biases in algorithms?
- Do research benefits reach the studied population?
8.2 Special Considerations in Clinical Psychology Research
In clinical psychology, question formation requires particular attention to:
- Treatment equity: Does control group design deprive participants of effective treatment opportunities?
- Pathologization risk: Does the research framework overly pathologize normal behavioral variation?
- Cultural sensitivity: Are concepts and measurements applicable across different cultural backgrounds?
- Long-term tracking: How to balance scientific value with participant burden in longitudinal research?
9. From Research Design to Analysis Strategy
9.1 Design Determines Boundaries of Analytical Possibilities
After selecting a research design, research enters the analysis design stage. But one must recognize:
Analysis methods cannot remedy fundamental design flaws.
The design has already determined:
- Which causal claims are possible
- Which confounders are controllable
- Which assumptions must be relied upon
- Which generalizations are reasonable
9.2 Mapping from Design to Analysis
Experimental design → Analysis strategy
- ITT analysis (Intention-to-Treat)
- PP analysis (Per-Protocol)
- CACE analysis (Complier Average Causal Effect)
- Subgroup analysis and heterogeneous treatment effects
Quasi-experimental design → Identification strategy
- DiD: parallel trends test, robustness checks
- RDD: bandwidth selection, continuity tests
- IV: weak instrument tests, overidentification tests
Survey design → Inference strategy
- Sampling weight adjustment
- Non-response bias handling
- Multilevel models (nested data)
- Structural equation models
Observational data → Confounding control
- Propensity score matching/weighting
- Doubly robust estimation
- Sensitivity analysis
- E-value assessment
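Inverse probability weighting can be illustrated on simulated data where the true propensity score is known; in real observational studies it must be estimated, and all parameters below are invented:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20_000
x = rng.normal(size=n)
p = 1 / (1 + np.exp(-x))               # true propensity P(T=1|X)
t = rng.binomial(1, p)
y = 2.0 * t + x + rng.normal(size=n)   # true ATE = 2.0, confounded through X

naive = y[t == 1].mean() - y[t == 0].mean()
# Horvitz-Thompson inverse probability weighting
ipw = (t * y / p).mean() - ((1 - t) * y / (1 - p)).mean()
print(round(naive, 2), round(ipw, 2))  # naive is biased upward; IPW recovers ~2.0
```

The weighting only removes the confounding carried by X; any confounder omitted from the propensity model leaves bias behind, which is what sensitivity analysis and E-values then quantify.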
Text data → Validity verification
- Coding reliability
- Model robustness
- Topic consistency
- Semantic validity
9.3 Critical Decisions Before Analysis
Before actual analysis, must clarify:
(1) Specify estimation target
Not: Is CBT effective?
But:
- ATE under ITT framework?
- Effect among completers?
- CATE under different baseline severity?
(2) Missing data handling
- Missing mechanism: MCAR, MAR, MNAR
- Handling strategy: deletion, imputation, full information maximum likelihood
- Sensitivity analysis: result robustness under different assumptions
(3) Multiple comparison control
- Primary vs. secondary outcomes
- Confirmatory vs. exploratory analysis
- FDR control, Bonferroni correction
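The Benjamini-Hochberg step-up procedure behind FDR control can be sketched in a few lines; the p-values below are invented:

```python
import numpy as np

def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure: boolean rejection mask
    controlling the false discovery rate at level alpha."""
    p = np.asarray(p_values)
    m = len(p)
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])  # largest i with p_(i) <= alpha * i / m
        reject[order[: k + 1]] = True     # reject every hypothesis up to that rank
    return reject

p_values = [0.001, 0.010, 0.039, 0.041, 0.27, 0.60]
print(benjamini_hochberg(p_values))
```

Here BH rejects the first two hypotheses, while Bonferroni at 0.05/6 ≈ 0.0083 would reject only the first: FDR control suits exploratory families of tests, Bonferroni a small set of confirmatory primary outcomes.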
(4) Heterogeneity handling
- Pre-specified subgroup analysis
- Exploratory heterogeneity testing
- Machine learning to identify CATE
(5) Robustness checks
- Model specification changes
- Sample restriction changes
- Measurement method changes
9.4 Bridge from Analysis to Interpretation
Analysis produces statistics, but interpretation gives them meaning:
Statistical significance ≠ Substantive importance
- Effect size: Cohen’s d, R²
- Clinical significance: MCID, NNT
- Practical relevance
Association ≠ Causation
- Are identification assumptions credible?
- Is reverse causality possible?
- How large is omitted variable bias?
Sample results ≠ Population truth
- External validity threats
- WEIRD sample problem
- Context dependency
10. Conclusion: The Epistemological Chain of Research Design
10.1 Complete Picture of Research Design
The research design framework presented in this paper is a complete epistemological chain:
Phenomenal observation → Conceptual clarification → Research question → Operationalization → Research design → Evidence generation → Analysis and conclusion
Each link in this chain is indispensable epistemic work that no later stage can replace.
10.2 Core Insights
Insight 1: Operationalization is not a technical step
Operationalization is core epistemological work that:
- Forces researchers to make implicit assumptions explicit
- Makes conceptual ambiguity visible
- Establishes bridges between theory and empirical evidence
- Determines the valid interpretation range of research conclusions
Insight 2: Design selection is not tool selection
Research designs represent fundamentally different evidence-generating logic:
- Different designs produce different types of evidence
- Different designs carry different assumption commitments
- Different designs have different validity threats
- No “best” design, only “best matched” design
Insight 3: Analysis cannot remedy design flaws
Statistical analysis sophistication cannot remedy fundamental design flaws.
- If operationalization fails, even advanced methods are futile
- If design selection is wrong, causal inference is not credible
- If measurement is invalid, results are meaningless
Insight 4: Inferential humility
Excellent researchers know:
- Which claims can be made
- Which claims are excessive
- Which boundaries must be acknowledged
- Which assumptions must be relied upon
10.3 Core Competencies of Researchers
An excellent quantitative researcher is not merely someone who masters statistical tools, but someone who can:
Transform phenomena into precise questions
- From vague observation to answerable question
- From natural language to operational definition
Transform concepts into measurable variables
- Specify ontological commitments
- Construct valid measurement models
Transform theory into testable hypotheses
- Identify causal structure
- Specify identification assumptions
Select matched evidence-generating mechanisms
- Understand design logic
- Acknowledge design limitations
Transform results into theoretical contributions
- Go beyond descriptive statistics
- Connect to broader knowledge systems
The complete path of research design can be formalized as:
$$\text{Phenomenon} \xrightarrow{\text{abstraction}} \text{Concept} \xrightarrow{\text{theory}} \text{Question} \xrightarrow{\text{operationalization}} \text{Model} \xrightarrow{\text{design}} \text{Evidence} \xrightarrow{\text{analysis}} \text{Conclusion}$$