Introduction: Making Sense of the Numbers
Welcome! You’ve planned your study, gathered your participants, and collected your data. But now what? You probably have a pile of numbers or a stack of interview transcripts. This chapter is all about how psychologists tidy up that "messy" information, analyze it to see what it means, and present it so others can understand it. Don't worry if you aren't a "maths person": we will break everything down step by step!
1. Raw Data: The Starting Point
Raw data is simply the original, unprocessed information you collect during a study. Before you can do any fancy calculations, you need to organize it.
Designing Data Recording Tables
A good recording table should be clear and logical. It usually has columns for the Independent Variable (IV) conditions and rows for each participant. Example: If you are testing whether coffee helps memory, one column would be "Coffee Group" and the other "Water Group."
Handling Numbers (Maths Skills)
Psychology involves some basic numeracy. You need to be comfortable with:
- Standard Form: A way of writing very large or small numbers. For example, \(1.2 \times 10^3\) is just another way of writing \(1,200\).
- Decimal Form: Ensure you can convert fractions (like \(1/4\)) into decimals (\(0.25\)).
- Significant Figures: This is about rounding to the most important digits. If a calculator gives you \(12.34567\), your teacher might ask for 3 significant figures, which is \(12.3\).
- Estimations: Sometimes you need to "guess-timate" a result to see if your final calculation looks right.
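If you want to check your rounding on a computer, here is a minimal Python sketch of the skills above. The helper name `round_sig` is made up for this example; it is not a built-in function.

```python
from math import floor, log10

def round_sig(x, sig):
    """Round x to `sig` significant figures (hypothetical helper)."""
    if x == 0:
        return 0.0
    # Position of the leading digit, e.g. 12.34567 -> 1, 0.004567 -> -3
    magnitude = floor(log10(abs(x)))
    return round(x, sig - 1 - magnitude)

print(round_sig(12.34567, 3))  # 12.3
print(f"{1200:.1e}")           # 1.2e+03, i.e. 1.2 x 10^3 in standard form
print(1 / 4)                   # 0.25
```

A quick estimate is still worth doing by hand: 12.34567 is "about 12", so an answer of 12.3 looks sensible, while 123 would not.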
Quick Review: Always label your tables clearly so a stranger could understand what the numbers represent!
2. Types of Data
In Psychology, we categorise data in two main ways: by its nature (what it looks like) and its source (where it came from).
Nature of Data: Quantitative vs. Qualitative
Quantitative Data: Numbers and statistics.
Example: A score of 8/10 on a test.
Strengths: Easy to compare and put into graphs.
Weakness: Lacks detail; we don't know *why* the person got that score.
Qualitative Data: Words, descriptions, and meanings.
Example: A participant describing how they felt during an experiment.
Strengths: Rich in detail and depth.
Weakness: Hard to analyze or compare between people.
Source of Data: Primary vs. Secondary
Primary Data: Data you collected yourself for your specific study.
Secondary Data: Data collected by someone else (e.g., using government statistics or another researcher's results).
Memory Aid: **P**rimary is **P**ersonal (You did it!). **S**econdary is **S**econd-hand.
3. Levels of Data (Levels of Measurement)
This is a way of "ranking" how precise your numerical data is. This is a very common exam topic!
- Nominal Level: Data is in separate categories. There is no "order."
Example: Categorizing people by their favorite color (Red, Blue, Green).
- Ordinal Level: Data that can be put into an order or rank, but the gaps between the ranks aren't equal.
Example: A race finish (1st, 2nd, 3rd). We know who was faster, but not by how many seconds.
- Interval Level: Data measured on a fixed scale with equal gaps between points. This is the most precise.
Example: Temperature in Celsius or time in seconds.
Key Takeaway: Moving from Nominal to Interval is like upgrading from a blurry photo to a High-Definition video—you get much more detail!
4. Analysis of Qualitative Data
How do we turn a long interview into something we can analyze? One common approach is to convert the qualitative data into quantitative data. This usually involves "coding": reading through the words, identifying themes, and counting how many times each theme appears.
Example: If five participants mention feeling "nervous," you can record the number "5" for the theme "Anxiety."
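The counting step can be sketched in Python. Everything here is invented for illustration: the coding scheme, the keywords, and the transcript snippets. Real coding relies on a researcher's judgement, not simple keyword matching.

```python
from collections import Counter

# Hypothetical coding scheme: each theme is linked to words that signal it.
CODING_SCHEME = {
    "Anxiety": ["nervous", "worried", "anxious"],
    "Enjoyment": ["fun", "enjoyed", "happy"],
}

# Invented transcript snippets, one per participant.
transcripts = [
    "I felt nervous before the test started.",
    "It was fun, but I was a bit worried about the time limit.",
    "I enjoyed it and felt happy afterwards.",
]

def count_themes(texts, scheme):
    """Count how many participants mention each theme at least once."""
    counts = Counter()
    for text in texts:
        lowered = text.lower()
        for theme, keywords in scheme.items():
            if any(word in lowered for word in keywords):
                counts[theme] += 1
    return counts

theme_counts = count_themes(transcripts, CODING_SCHEME)
print(theme_counts["Anxiety"], theme_counts["Enjoyment"])  # 2 2
```

The output is now quantitative data: a number for each theme that could go into a table or a bar chart.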
5. Descriptive Statistics
These are used to summarize your data so you can see the general "trend."
Measures of Central Tendency (Averages)
1. Mean: Add all scores together and divide by the number of scores.
Use when: You have interval data and no extreme "outliers."
2. Median: The middle score when all numbers are put in order.
Use when: You have ordinal data, or skewed ("wonky") data with one or two very high or very low scores.
3. Mode: The most common score.
Use when: You have nominal data (categories).
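As a quick illustration, the three averages can be worked out with Python's built-in `statistics` module. The scores and colors below are invented for this sketch; notice how the one extreme score drags the mean upwards while the median barely notices it.

```python
from statistics import mean, median, mode

# Invented test scores; 35 is a deliberate outlier.
scores = [4, 5, 5, 6, 7, 8, 35]

print(mean(scores))    # 10  (70 divided by 7 scores)
print(median(scores))  # 6   (the middle value once sorted)
print(mode(scores))    # 5   (appears twice)

# The mode also works for nominal data, where a mean makes no sense.
colors = ["Red", "Blue", "Blue", "Green"]
print(mode(colors))    # Blue
```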
Measures of Dispersion (The Spread)
These tell us if the scores are all bunched together or spread out widely.
- Range: The difference between the highest and lowest score. (Highest \(-\) Lowest). It's easy but can be misleading if there is one weirdly high score!
- Variance and Standard Deviation: These tell us how far, on average, scores lie from the mean. The standard deviation is the square root of the variance, so it is in the same units as the original scores. A high standard deviation means the scores are very spread out; a low standard deviation means the participants all scored very similarly.
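Here is a small sketch comparing the spread of two invented groups that share the same mean (5). It assumes the sample formulas (dividing by n - 1), which is what `statistics.variance` and `statistics.stdev` use; check which version your course expects, because `pvariance` and `pstdev` divide by n instead.

```python
from statistics import stdev, variance

# Invented scores: both groups have a mean of 5.
group_a = [4, 5, 5, 6]   # bunched together
group_b = [1, 3, 7, 9]   # spread out

def score_range(scores):
    """Range = highest score minus lowest score."""
    return max(scores) - min(scores)

for name, scores in (("A", group_a), ("B", group_b)):
    print(
        name,
        score_range(scores),            # A: 2, B: 8
        round(variance(scores), 2),     # sample variance
        round(stdev(scores), 2),        # sample standard deviation
    )
```

Group B comes out with the larger range, variance, and standard deviation, even though its mean is identical to Group A's. That is exactly the information a measure of central tendency cannot give you on its own.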
Did you know? Frequency tables (tally charts) are the quickest way to organize raw data before calculating these measures!
6. Visualizing Data: Graphs
Graphs help us see patterns instantly. You must know which graph fits which data:
- Bar Charts: Used for nominal data (categories). The bars do NOT touch.
- Histograms: Used for continuous data (like age or time). The bars DO touch.
- Line Graphs: Show how something changes, often over time.
- Pie Charts: Show proportions or percentages of a whole.
- Scatter Diagrams: Used for correlations to show the relationship between two variables.
Common Mistake: Forgetting to label the X-axis (horizontal) and Y-axis (vertical). Always give your graph a clear title!
7. Inferential Statistics: The "Big" Decision
Descriptive stats tell us what the data *looks* like. Inferential statistics help us decide if our results are actually "real" or just happened by luck. This is where we test our Hypothesis.
Probability and Significance
Psychologists use a significance level, usually set at 0.05 and reported as \(p < 0.05\). This means that, if the null hypothesis were true, there would be less than a 5% probability of getting results like ours by chance alone. If \(p < 0.05\), we reject the null hypothesis and accept our alternative hypothesis.
Type 1 and Type 2 Errors
- Type 1 Error: A "False Positive." You claim your results are significant, but they actually happened by chance. (The "Boy Who Cried Wolf" when there was no wolf).
- Type 2 Error: A "False Negative." You claim your results aren't significant, but there actually *was* an effect. (Missing the wolf when it was really there).
Choosing a Statistical Test
You don't need to do the maths for all of these, but you must know when to use them! The choice depends on the design of the study and the level of data.
Non-Parametric Tests (The Big 5):
1. Mann-Whitney U: Independent measures design, Ordinal data.
2. Wilcoxon Signed Ranks: Repeated measures design, Ordinal data.
3. Chi-Square: Independent measures design, Nominal data. (Note: You must be able to calculate this one!)
4. Binomial Sign: Repeated measures design, Nominal data.
5. Spearman’s Rho: Correlational design, Ordinal data.
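Since Chi-Square is the one you may need to calculate, here is a hedged sketch of the standard calculation: for each cell, work out the expected frequency (row total times column total, divided by the grand total), then add up (observed - expected)² / expected. The frequencies are invented, and the function name `chi_square` is just a label for this example. Your course may set the working out slightly differently, so follow your textbook's layout in an exam.

```python
def chi_square(observed):
    """Return the chi-square statistic for a table of observed frequencies."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected frequency if the two variables were unrelated
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - expected) ** 2 / expected
    return statistic

# Invented 2x2 table: rows are two independent groups,
# columns are two nominal categories (e.g. "recalled" / "did not recall").
observed = [
    [15, 5],
    [5, 15],
]

print(chi_square(observed))  # 10.0
```

Every expected frequency here is 10, so each cell contributes 25 / 10 = 2.5, giving 10.0 in total. With 1 degree of freedom, the critical value from standard tables at \(p = 0.05\) is 3.84, so a calculated value of 10.0 would count as significant.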
Criteria for a Parametric Test:
These are more "powerful" tests used only when:
- Data is Interval level.
- Data follows a Normal Distribution (a bell-shaped curve).
- The variance (spread) in the groups is similar.
Quick Tip: Use the mnemonic "Carrots Should Come In Mashed With Swede" to remember the tests, but check your textbook for a "deciding which test to use" flow chart—it’s a lifesaver!
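The non-parametric choices listed above can also be written as a simple lookup, which is really just a flow chart in code form. This is only a sketch: the function name `choose_test` and the exact wording of the labels are assumptions made for this example, and it covers only the five non-parametric tests named in this chapter.

```python
# Lookup built from the "Big 5" list: (design, level of data) -> test
NON_PARAMETRIC_TESTS = {
    ("independent measures", "ordinal"): "Mann-Whitney U",
    ("repeated measures", "ordinal"): "Wilcoxon Signed Ranks",
    ("independent measures", "nominal"): "Chi-Square",
    ("repeated measures", "nominal"): "Binomial Sign",
}

def choose_test(design, level):
    """Suggest a non-parametric test from the study design and level of data."""
    design = design.lower()
    level = level.lower()
    if design == "correlation":
        return "Spearman's Rho"
    try:
        return NON_PARAMETRIC_TESTS[(design, level)]
    except KeyError:
        raise ValueError(f"No test listed for {design!r} with {level!r} data")

print(choose_test("repeated measures", "ordinal"))     # Wilcoxon Signed Ranks
print(choose_test("independent measures", "nominal"))  # Chi-Square
print(choose_test("correlation", "ordinal"))           # Spearman's Rho
```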
Summary: The Data Journey
1. Collect Raw Data and organize it in a table.
2. Identify the Level of Data (Nominal, Ordinal, or Interval).
3. Calculate Descriptive Stats (Mean, Median, Mode, Range, SD).
4. Create a Graph to visualize the trend.
5. Run an Inferential Test to see if your result is significant (\(p < 0.05\)).
6. Consider the risk of a Type 1 or Type 2 error before drawing your final conclusion.