Welcome to the World of Data!
You’ve designed your experiment, gathered your participants, and finished your research. Now what? You have a pile of "clues" (data), but they don’t tell a story yet. In this chapter, we learn how to organize, analyze, and present that data so we can actually understand what’s happening in the human mind. Think of yourself as a translator, turning messy numbers and words into clear, scientific facts.
Don't worry if you aren't a "math person"—we will break everything down step-by-step!
1. Handling Raw Data
Raw data is simply the "unprocessed" information you collect during your study before you do anything to it. It’s like the raw ingredients for a cake—you can't see the final result until you start mixing them together.
Raw Data Tables
When you are in the middle of an experiment or observation, you need a raw data recording table. This is a simple grid where you write down the scores or behaviors as they happen. Example: A table with "Participant Name" in one column and "Time taken to solve a puzzle" in the next.
Math Skills for Data
Psychologists use a few math tricks to keep things tidy:
Standard Form: This is used for very large or very small numbers. It looks like this: \( A \times 10^n \), where \( A \) is at least 1 but less than 10. For example, 500 would be \( 5 \times 10^2 \), and 0.005 would be \( 5 \times 10^{-3} \).
Significant Figures: This means rounding a number to the digits that actually matter. If a calculator gives you 12.345678, you might round it to 12.3 (3 significant figures) to make it readable.
Estimations: Sometimes, it's helpful to make a "guesstimate" before you do a big calculation to see if your final answer makes sense. If you are averaging 10, 12, and 14, the answer has to fall somewhere between 10 and 14, so if you get 50, you know you’ve made a mistake!
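If you like to check your working on a computer, the three skills above can be sketched in Python. This is only an illustration: the function names (to_standard_form, round_sig, plausible_mean) are made up for this example and are not part of any syllabus or library.

```python
from math import floor, log10

def to_standard_form(x):
    """Return (A, n) so that x = A * 10**n, with A at least 1 and below 10."""
    if x == 0:
        return 0.0, 0
    n = floor(log10(abs(x)))   # how many places the decimal point moves
    return x / 10**n, n

def round_sig(x, sig=3):
    """Round x to `sig` significant figures."""
    if x == 0:
        return 0.0
    return round(x, sig - 1 - floor(log10(abs(x))))

def plausible_mean(value, scores):
    """Estimation check: a mean can never lie outside the lowest and highest score."""
    return min(scores) <= value <= max(scores)

print(to_standard_form(500))               # (5.0, 2), i.e. 5 x 10^2
print(round_sig(12.345678, 3))             # 12.3
print(plausible_mean(50, [10, 12, 14]))    # False, so 50 must be a mistake
```

The estimation check does not tell you the right answer; it only flags answers that cannot possibly be right.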
Quick Review: Raw data is the "first draft" of your results. Use tables to stay organized from the start!
2. Levels and Types of Data
Not all data is created equal! Psychologists categorize data so they know which statistical tests to use later.
Quantitative vs. Qualitative
Quantitative Data: Data in the form of numbers (e.g., "Scores on a memory test"). It’s easy to analyze but lacks detail.
Qualitative Data: Data in the form of words (e.g., "Descriptions of how participants felt during the test"). It’s rich in detail but harder to summarize.
Primary vs. Secondary
Primary Data: Data you collected yourself for your own study.
Secondary Data: Data that already exists, collected by someone else (e.g., using government crime statistics).
The Three Levels of Data
This is a very important part of the OCR syllabus. To remember them, think of the word NOI:
1. Nominal Level: Data that consists of categories or names. You are just "counting heads." (Example: Counting how many people like cats vs. dogs).
2. Ordinal Level: Data that can be put in an order or "ranked," but the gaps between the ranks aren't equal. (Example: Finishing 1st, 2nd, and 3rd in a race. You know who is faster, but not by exactly how many seconds).
3. Interval Level: Data measured on a scale with equal, fixed intervals. (Example: Temperature in Celsius or time in seconds). This is the most precise level of data, because the size of the gap between two scores actually means something.
Key Takeaway: Identifying your level of data (Nominal, Ordinal, or Interval) is the first step in deciding which statistical test to use.
3. Descriptive Statistics
Descriptive statistics are used to summarize your data so it’s easy to understand at a glance.
Measures of Central Tendency (The Middle)
Mean: The average. Add all scores together and divide by the number of scores. (Very sensitive; one extreme score can pull it away from the typical score!)
Median: The middle score when they are all lined up in order. (Great if you have one or two weirdly high or low scores).
Mode: The most common score. (Useful for nominal data).
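To see why the mean is so sensitive, here is a small sketch using Python's built-in statistics module. The scores are invented for illustration; the last one (35) is a deliberately extreme score.

```python
from statistics import mean, median, mode

# Hypothetical memory-test scores; 35 is an extreme score (an outlier).
scores = [4, 5, 5, 6, 7, 8, 35]

average = mean(scores)     # 70 / 7 = 10
middle = median(scores)    # 4th of 7 ordered scores = 6
common = mode(scores)      # 5 appears twice

print(average, middle, common)

# Remove the extreme score and compare: the mean moves a lot, the median barely.
trimmed = scores[:-1]
print(mean(trimmed), median(trimmed))
```

With the extreme score included, the mean (10) is higher than six of the seven scores, while the median (6) still sits in the middle of the group.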
Measures of Dispersion (The Spread)
These tell us if the scores are all bunched together or spread out widely.
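Range: The difference between the highest and lowest score. (Calculation: Highest - Lowest. Some textbooks add 1 when the scores are whole numbers, to allow for rounding, so check which version your teacher uses).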
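Variance and Standard Deviation: These are more complex. Variance is the average of the squared differences between each score and the mean. Standard deviation is the square root of the variance, so it tells us roughly how far a typical score sits from the mean, in the same units as the scores. A low standard deviation means everyone scored similarly. A high standard deviation means the results were all over the place.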
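A quick sketch can show how two groups with the same mean can have very different spreads. The two groups below are invented, and two assumptions are made: the range is taken as highest minus lowest with no adjustment, and the variance and standard deviation use the "sample" versions (dividing by n - 1), which is what Python's statistics.variance and statistics.stdev do.

```python
from statistics import mean, stdev, variance

# Two hypothetical groups with the same mean (10) but different spread.
group_a = [9, 10, 10, 11]
group_b = [2, 6, 14, 18]

def data_range(scores):
    """Range as highest minus lowest (no +1 adjustment; an assumption here)."""
    return max(scores) - min(scores)

for name, scores in (("A", group_a), ("B", group_b)):
    print(
        name,
        "mean:", mean(scores),
        "range:", data_range(scores),
        "variance:", round(variance(scores), 2),
        "SD:", round(stdev(scores), 2),
    )
```

Group A's scores are bunched together, so its range and standard deviation are small; Group B's are spread out, so both are much larger even though the mean is identical.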
Visualizing Data (Graphs)
Bar Charts: Used for nominal data (categories). The bars do NOT touch.
Histograms: Used for interval data (continuous numbers). The bars DO touch.
Scatter Diagrams: Used for correlations to show the relationship between two variables.
Pie Charts: Used to show proportions or percentages of a whole.
Common Mistake: Don't let your bars touch on a Bar Chart! That's only for Histograms. Think of Bar Charts as "social distancing" for categories.
4. Inferential Statistics
This sounds scary, but it’s just about answering one question: "Is my result just a lucky fluke, or did something real happen?"
Probability and Significance
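Psychologists usually use a significance level of \( p < 0.05 \). This means that, if nothing real were going on (the null hypothesis is true), there would be a less than 5% chance of getting results this strong by accident. In everyday terms, we accept no more than a 5% risk that our finding is just a fluke!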
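To make "chance" concrete, imagine 10 participants who each either improve or get worse after a treatment. If the treatment did nothing, each participant is like a coin flip. The sketch below works out how likely a lopsided split would be under pure chance. The function name chance_probability and the numbers are hypothetical, and in an exam you would normally compare your result with a critical values table rather than calculate an exact probability like this.

```python
from math import comb

def chance_probability(n_better, n_worse):
    """Two-tailed probability of a split at least this lopsided,
    assuming each participant is a 50/50 coin flip (ties already removed)."""
    n = n_better + n_worse
    k = min(n_better, n_worse)
    # Probability of k or fewer participants falling in the smaller group
    one_tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * one_tail)

p_strong = chance_probability(9, 1)   # 9 improved, 1 got worse
p_weak = chance_probability(6, 4)     # 6 improved, 4 got worse

print(round(p_strong, 4), p_strong < 0.05)   # 0.0215 True
print(round(p_weak, 4), p_weak < 0.05)       # 0.7539 False
```

A 9 to 1 split would rarely happen by luck alone, so it counts as significant at the 5% level; a 6 to 4 split happens by luck all the time, so it does not.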
The Normal Distribution Curve
If you measure something like height in a big population, most people are average, with a few very tall and a few very short people. This creates a symmetrical, bell-shaped curve called a Normal Distribution.
If the curve leans to one side, it is Skewed. (Example: If a test was too easy, most people get high marks, creating a negative skew).
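One rough way to spot skew without drawing the curve is to compare the mean with the median: the mean gets dragged toward the long tail. The sketch below uses that rule of thumb. It is not a formal test, the function name skew_direction is made up, and the scores are invented.

```python
from statistics import mean, median

def skew_direction(scores):
    """Rule of thumb: mean below median suggests negative skew,
    mean above median suggests positive skew."""
    m, med = mean(scores), median(scores)
    if m < med:
        return "negative"
    if m > med:
        return "positive"
    return "none"

easy_test = [3, 7, 8, 9, 9, 10, 10]   # most marks high, a few low ones
hard_test = [0, 1, 1, 2, 3, 7]        # most marks low, a few high ones

print(skew_direction(easy_test))   # negative
print(skew_direction(hard_test))   # positive
```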
Choosing a Statistical Test
The syllabus requires you to know when to use specific non-parametric tests. Don't worry about the math; just know when to pick them!
1. Chi-square: Use for Nominal data and independent measures.
2. Binomial Sign Test: Use for Nominal data and repeated measures.
3. Mann-Whitney U: Use for Ordinal data and independent measures.
4. Wilcoxon Signed Ranks: Use for Ordinal data and repeated measures.
5. Spearman’s Rho: Use when looking for a correlation between two variables.
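The list above is really a lookup table, so it can be written as one. The sketch below simply encodes the five rules as given; the function name choose_test and the lowercase labels ("nominal", "independent", and so on) are made up for this example.

```python
# Lookup table built directly from the five rules in the list above.
TESTS = {
    ("nominal", "independent"): "Chi-square",
    ("nominal", "repeated"): "Binomial Sign Test",
    ("ordinal", "independent"): "Mann-Whitney U",
    ("ordinal", "repeated"): "Wilcoxon Signed Ranks",
}

def choose_test(level, design):
    """Pick a non-parametric test from the level of data and the design."""
    if design == "correlation":
        # Correlations go straight to Spearman's Rho; the level is not checked here.
        return "Spearman's Rho"
    try:
        return TESTS[(level, design)]
    except KeyError:
        raise ValueError(f"No test listed for {level} data with a {design} design")

print(choose_test("nominal", "repeated"))      # Binomial Sign Test
print(choose_test("ordinal", "independent"))   # Mann-Whitney U
```

Any combination that is not in the list raises an error rather than guessing, which mirrors what you should do in an exam: identify the level of data and the design first, then pick the test.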
Type 1 and Type 2 Errors
Type 1 Error (The "False Positive"): You claim there is a significant result, but there isn't. You were too optimistic! (More likely if your significance level is too lenient, like 10%).
Type 2 Error (The "False Negative"): You claim there is no significant result, but there actually was a real effect! You were too cautious. (More likely if your significance level is too strict, like 1%).
Did you know? A Type 1 error is like a fire alarm going off when there is no fire. A Type 2 error is a fire happening, but the alarm stays silent!
5. Methodological Issues
When analyzing data, we must check if our research is actually good "science."
Reliability: Is the research consistent? If we did it again, would we get the same result?
Check: Test-retest (doing the test again) or Inter-rater (do two observers agree?).
Validity: Is the research measuring what it claims to measure?
Check: Ecological validity (is it like real life?) or Face validity (does it look right at first glance?).
Bias: Watch out for Social Desirability (participants acting "better" than they are) and Researcher Bias (the psychologist seeing what they want to see).
Ethical Considerations: Always remember the BPS Code of Ethics. You must ensure Respect (informed consent), Competence, Responsibility (debriefing), and Integrity (avoiding deception unless necessary).
Final Tip: When you write about data, always be honest about its limitations. No study is perfect, and acknowledging that is the mark of a great psychologist!