IB DP · Thinka-original Practice Paper

2025 IB DP Computer Science Practice Paper with Answers

Thinka May 2025 HL (TZ3) IB Diploma Programme-Style Mock — Computer Science

195 marks270 mins2025

An original Thinka practice paper modelled on the structure and difficulty of the May 2025 HL (TZ3) IB Diploma Programme Computer Science paper. Not affiliated with or reproduced from IB.

Paper 1 Section A

Answer all questions in the spaces provided.

8 Question · 25 marks

Question 1 · Short Answer

3.125 marks

Describe one social or ethical issue associated with the implementation of a new computerized system in an organization.

Show answer & marking scheme

Worked solution

When a new computerized system is implemented, tasks previously performed by humans might become automated. This can lead to job displacement or redundancies, which is a major ethical and social issue causing financial instability and stress for the affected employees. Companies must consider retraining or supporting these individuals.

Marking scheme

Award up to 3.125 marks:
- 1.125 marks for identifying an issue (e.g., job redundancy, data privacy, resistance to change).
- 1 mark for explaining the impact of this issue on stakeholders (e.g. stress, loss of income, privacy leakage).
- 1 mark for linking the issue to the implementation of the new system.

Question 2 · Short Answer

3.125 marks

Explain the role of the Program Counter (PC) in the fetch-execute cycle.

Show answer & marking scheme

Worked solution

During the fetch-execute cycle, the CPU needs to know which instruction to execute next. The Program Counter (PC) stores the memory address of this next instruction. Once the instruction at that address is retrieved and passed to the Memory Address Register (MAR) / Instruction Register (IR), the PC is updated (usually incremented by 1) to point to the subsequent instruction in memory, ensuring sequential execution.

Marking scheme

Award up to 3.125 marks:
- 1.125 marks for stating that it holds the memory address of the next instruction to be fetched.
- 1 mark for explaining that its value is copied/sent to the Memory Address Register (MAR).
- 1 mark for explaining that it is incremented/updated automatically to point to the following instruction.

Question 3 · Short Answer

3.125 marks

Explain the purpose of packet switching in data transmission across a network.

Show answer & marking scheme

Worked solution

Packet switching splits a file or message into smaller, manageable chunks called packets. Each packet contains payload data and a header with destination and routing info. These packets are sent independently across the network, potentially taking different paths depending on network congestion. This optimizes bandwidth usage and makes the transmission robust against node failures, as packets can be rerouted and reassembled at the destination.

Marking scheme

Award up to 3.125 marks:
- 1.125 marks for explaining that data is divided into smaller packets containing routing information (headers).
- 1 mark for explaining that packets travel independently/dynamically across different paths depending on traffic/network conditions.
- 1 mark for explaining an advantage (e.g., maximizing bandwidth utility, robust against node failure, or packet reassembly at the destination).

Question 4 · Short Answer

3.125 marks

Describe how abstraction is applied in the development of a complex software application.

Show answer & marking scheme

Worked solution

In complex software development, abstraction is used to reduce complexity by hiding low-level technical details and showing only the essential features of a component. For example, a developer can use a high-level function like 'print()' without knowing how the operating system coordinates the hardware pins to output characters. This allows programmers to focus on higher-level logic, design, and modularity without being overwhelmed by implementation details.

Marking scheme

Award up to 3.125 marks:
- 1.125 marks for defining abstraction (removing/hiding unnecessary details while keeping essential features).
- 1 mark for explaining how this benefits development (reduces complexity, allows focusing on high-level logic, modular design).
- 1 mark for providing a clear, relevant example (e.g., using libraries/APIs, high-level programming languages, or object methods).

Question 5 · Short Answer

3.125 marks

Compare a static stack (implemented using an array) and a dynamic stack (implemented using a linked list) in terms of memory usage.

Show answer & marking scheme

Worked solution

A static stack is implemented using an array, meaning a fixed block of contiguous memory must be allocated beforehand. This can lead to wasted memory if the stack is mostly empty, or stack overflow if it exceeds the limit. A dynamic stack uses a linked list, which allocates memory dynamically on the heap only when new nodes are added. However, each node in a dynamic stack requires extra memory overhead to store the reference/pointer to the next node, which a static array does not require.

Marking scheme

Award up to 3.125 marks:
- 1.125 marks for explaining the static stack memory allocation (pre-allocated, fixed size, potential waste or overflow issues).
- 1 mark for explaining the dynamic stack memory allocation (on-demand allocation, avoids fixed limits).
- 1 mark for identifying the pointer/reference overhead required for each node in a dynamic stack.

Question 6 · Short Answer

3.125 marks

Explain how a closed-loop feedback control system differs from an open-loop control system.

Show answer & marking scheme

Worked solution

In a closed-loop feedback control system, the output of the system is continuously monitored using sensors, and this data is fed back to the controller. The controller compares the current state with the desired target state and adjusts the actuators accordingly (e.g., a thermostat adjusting heating based on temperature). In contrast, an open-loop control system performs action without feedback, executing commands purely based on pre-programmed inputs or time, regardless of the output results (e.g., a simple microwave timer heating food for a set duration).

Marking scheme

Award up to 3.125 marks:
- 1.125 marks for explaining closed-loop systems (uses feedback from sensors, compares output with target, adjusts accordingly).
- 1 mark for explaining open-loop systems (runs based on time/pre-set instructions, ignores actual output/has no feedback loop).
- 1 mark for providing a comparative example or describing the continuous decision-making cycle of the closed-loop.

Question 7 · Short Answer

3.125 marks

Define the term 'polymorphism' in object-oriented programming and outline one of its advantages.

Show answer & marking scheme

Worked solution

Polymorphism is an OOP concept where a single interface or method call can have multiple forms depending on the underlying object calling it. For instance, a superclass 'Shape' may have a method 'draw()', and subclasses 'Circle' and 'Square' implement 'draw()' differently. The advantage is that code can be written to manipulate objects of the superclass 'Shape' without needing to know their specific subclass types, which makes the codebase highly reusable, cleaner, and easy to extend with new subclasses without modifying existing code.

Marking scheme

Award up to 3.125 marks:
- 1.125 marks for a clear definition of polymorphism (different classes responding to the same method call in class-specific ways / dynamic binding).
- 1 mark for outlining a benefit (e.g., code reuse, ease of extensibility, simplified code maintenance, reduced conditional statements).
- 1 mark for a simple illustrative example (e.g., animals speaking, shapes drawing, or payment methods processing).

Question 8 · Short Answer

3.125 marks

Explain the role of a prototype in the design stage of a new software system.

Show answer & marking scheme

Worked solution

A prototype is a preliminary, simplified version of a system used during the design stage. Its main roles are: to allow end-users and stakeholders to interact with the proposed design concept, providing immediate feedback; to help developers clarify ambiguous requirements; and to test technical feasibility before committing large amounts of time and resources to full development. This reduces the risk of project failure and ensures the final product meets user needs.

Marking scheme

Award up to 3.125 marks:
- 1.125 marks for defining a prototype (early, simplified, or non-functional/semi-functional model).
- 1 mark for explaining its role in gathering user feedback and refining requirements.
- 1 mark for explaining its role in reducing design risks, identifying flaws early, or proving technical feasibility.

Paper 1 Section B

Answer all structured questions and algorithms.

5 Question · 75 marks

Question 1 · Structured & Algorithmic

15 marks

An international shipping company uses parallel arrays to manage parcels: PARCEL_IDS (array of integers), PARCEL_WEIGHTS (array of reals), and DELIVERED (array of booleans). There are \( N \) parcels in total. (a) Outline the purpose of parallel arrays and state one disadvantage of using parallel arrays compared to an array of objects [3 marks]. (b) Construct a pseudocode algorithm for a method findHeavyUndelivered(limit) that receives a threshold weight limit as a parameter. The algorithm must count and output the IDs of all undelivered parcels that exceed this weight threshold. It should also calculate and return the average weight of these heavy undelivered parcels [8 marks]. (c) Trace the algorithm using these sample arrays: PARCEL_IDS = [101, 102, 103, 104], PARCEL_WEIGHTS = [12.5, 5.0, 18.2, 22.0], DELIVERED = [false, true, false, false], and limit = 10.0. State the console outputs and the final returned value [4 marks].

Show answer & marking scheme

Worked solution

(a) Parallel arrays link elements at matching indices to represent an individual record. Disadvantage: Harder to maintain, insert/delete operations require updating all arrays to maintain synchronization. (b) Pseudocode: method findHeavyUndelivered(limit) count = 0, totalWeight = 0, loop I from 0 to N-1, if DELIVERED[I] == false then, if PARCEL_WEIGHTS[I] > limit then, output PARCEL_IDS[I], count = count + 1, totalWeight = totalWeight + PARCEL_WEIGHTS[I], end if, end if, end loop, if count > 0 then, avg = totalWeight / count, return avg, else, return 0.0, end if, end method. (c) Trace: I=0: DELIVERED[0] is false, weight 12.5 > 10.0 -> Output 101, count=1, total=12.5. I=1: DELIVERED[1] is true -> skipped. I=2: DELIVERED[2] is false, weight 18.2 > 10.0 -> Output 103, count=2, total=30.7. I=3: DELIVERED[3] is false, weight 22.0 > 10.0 -> Output 104, count=3, total=52.7. Loop ends. count > 0, avg = \( 52.7 / 3 = 17.57 \). Returns 17.57.

Marking scheme

(a) [1 mark] for explaining matching indices. [2 marks] for stating and explaining a valid disadvantage (e.g. maintenance, sync issues). (b) [1 mark] initializing accumulator and count; [2 marks] correct loop boundary from 0 to N-1; [2 marks] nested conditional checks for DELIVERED and limit; [1 mark] outputting correct PARCEL_ID; [1 mark] updating accumulators; [1 mark] calculating and returning correct average weight (handling zero division). (c) [2 marks] for identifying correct outputs 101, 103, and 104. [2 marks] for calculating and identifying correct returned average of 17.57.

Question 2 · Structured & Algorithmic

15 marks

A medical clinic is transitioning from an offline desktop system to a new cloud-based electronic health record (EHR) system. (a) Distinguish between direct changeover and phased implementation methods [4 marks]. (b) Explain two problems that may arise during data migration from the legacy database to the cloud-based database [4 marks]. (c) The clinic plans to use User Acceptance Testing (UAT). Outline the importance of UAT in this development cycle and describe two methods of collecting user feedback during testing [7 marks].

Show answer & marking scheme

Worked solution

(a) Direct changeover involves turning off the old system and immediately starting the new one. It is cheap and fast but carries extreme risk of data loss or operation stoppage. Phased implementation rolls out modules or departments step-by-step. It is safer because failures only affect small parts, but it is slow and requires managing legacy-new system compatibility. (b) 1. Incompatible formats: The old system might store data in structures that do not map directly to the new cloud database, causing errors or data misalignment. 2. Data loss: Incomplete transfers or connection drops can result in missing patient records. (c) UAT is critical because it ensures the system meets actual clinician requirements and workflows, identifying usability bugs before deployment. Feedback methods: 1. Surveys/Questionnaires to capture quantitative user satisfaction. 2. Structured interviews/observational sessions to document specific user difficulties and flow bottlenecks.

Marking scheme

(a) [2 marks] for explaining direct changeover with pros/cons; [2 marks] for explaining phased implementation with pros/cons. (b) [2 marks] for explaining incompatible schemas/data structures; [2 marks] for explaining data validation/loss during transmission. (c) [3 marks] for explaining the purpose and critical importance of UAT in a clinical environment; [2 marks] for explaining survey method; [2 marks] for explaining structured observation/interview method.

Question 3 · Structured & Algorithmic

15 marks

A multinational financial firm has implemented a Virtual Private Network (VPN) for its remote employees. (a) Describe how a VPN provides a secure connection over the public Internet [4 marks]. (b) Explain the difference between symmetric and asymmetric encryption in securing network transmission [4 marks]. (c) Identify and describe the role of two protocols at different layers of the TCP/IP stack that facilitate remote data transmission [4 marks]. (d) Discuss why lossy compression is inappropriate for financial transaction records, whereas it is acceptable for VoIP communication [3 marks].

Show answer & marking scheme

Worked solution

(a) A VPN creates an encrypted tunnel between the user's remote device and the private corporate network. It uses tunneling protocols to encapsulate IP packets, encrypts the payload so unauthorized intermediaries cannot read it, and enforces authentication so only verified users gain network entry. (b) Symmetric encryption uses a single key for both encryption and decryption, which is fast and efficient but poses key distribution challenges. Asymmetric encryption uses a public key to encrypt and a private key to decrypt, allowing secure communication without sharing private keys, but it is slower and computationally intensive. (c) 1. TCP (Transport Layer): Provides reliable, ordered delivery of packets, managing flow control and error recovery. 2. IP (Internet/Network Layer): Handles routing and addressing of packets across multiple intermediate routers to their final destination. (d) Lossy compression permanently discards data. For financial records, every character is critical; losing data would corrupt transactions. For VoIP, dropping redundant sound frequencies is unnoticeable to humans and prioritizes streaming speed over perfect fidelity.

Marking scheme

(a) [1 mark] for mentioning tunneling/encapsulation; [1 mark] for encryption; [1 mark] for authentication; [1 mark] for overall data integrity. (b) [2 marks] for symmetric key definition and performance; [2 marks] for asymmetric key mechanics and secure key exchange benefits. (c) [2 marks] for identifying and explaining TCP; [2 marks] for identifying and explaining IP. (d) [1 mark] for defining lossy compression effect; [1 mark] explaining impact on financial precision; [1 mark] explaining human tolerance in real-time voice streams.

Question 4 · Structured & Algorithmic

15 marks

(a) Distinguish between a stack and a queue in terms of data access orders [2 marks]. (b) A static circular queue of capacity 5 is used to schedule print jobs. It uses 'head' (index of first item) and 'tail' (index of next free slot). Trace and show the exact array contents, head, and tail values step-by-step after these actions: 1. Initialize empty: head=0, tail=0. 2. Enqueue 'DocA'. 3. Enqueue 'DocB'. 4. Dequeue. 5. Enqueue 'DocC'. 6. Enqueue 'DocD'. 7. Enqueue 'DocE' [5 marks]. (c) Describe one disadvantage of a static array queue compared to a dynamic linked list [2 marks]. (d) Construct a pseudocode algorithm for a recursive function findSum(node) that traverses a singly-linked list of integers and returns the total sum. Each node has node.value and node.next [6 marks].

Show answer & marking scheme

Worked solution

(a) A stack is Last In First Out (LIFO), meaning the last element added is the first removed. A queue is First In First Out (FIFO), meaning the oldest element is processed first. (b) Trace: 1. Empty: head=0, tail=0, array=[nil, nil, nil, nil, nil]. 2. Enqueue 'DocA': head=0, tail=1, array=['DocA', nil, nil, nil, nil]. 3. Enqueue 'DocB': head=0, tail=2, array=['DocA', 'DocB', nil, nil, nil]. 4. Dequeue: head=1, tail=2, array=[nil, 'DocB', nil, nil, nil]. 5. Enqueue 'DocC': head=1, tail=3, array=[nil, 'DocB', 'DocC', nil, nil]. 6. Enqueue 'DocD': head=1, tail=4, array=[nil, 'DocB', 'DocC', 'DocD', nil]. 7. Enqueue 'DocE': head=1, tail=0, array=[nil, 'DocB', 'DocC', 'DocD', 'DocE'] (tail wraps around modulo 5). (c) A static array has a fixed capacity. If the queue is full, new elements cannot be added (overflow error) even if physical memory is available, whereas dynamic linked lists grow dynamically. (d) Pseudocode: function findSum(node) if node == null then return 0 else return node.value + findSum(node.next) end if end function

Marking scheme

(a) [2 marks] for distinguishing LIFO vs FIFO. (b) [1 mark] for step 2 & 3 state; [1 mark] for step 4 (dequeuing shifts head to 1); [1 mark] for steps 5 & 6; [2 marks] for step 7 showing the correct modulo tail wrap-around (tail=0, array showing 'DocE' at index 4). (c) [2 marks] for explaining fixed capacity issues (overflow/under-utilization). (d) [2 marks] for the base case (checking if node is null and returning 0); [2 marks] for the recursive step (node.value + recursive call); [2 marks] for correct syntax and flow.

Question 5 · Structured & Algorithmic

15 marks

An automated greenhouse uses a microcontroller system to maintain optimal growing conditions. (a) Explain the role of a transducer in this system [2 marks]. (b) Identify two different physical sensors and their matching actuators used in this system [4 marks]. (c) Explain the concept of a feedback loop within this system, describing how it maintains a stable temperature when external weather changes [5 marks]. (d) Discuss one advantage and one disadvantage of using a centralized processor rather than distributed processors in this environment [4 marks].

Show answer & marking scheme

Worked solution

(a) A transducer converts physical signals to electrical signals (or vice-versa). Sensors act as input transducers, converting phenomena like heat to digital signals. Actuators act as output transducers, converting electrical control commands into physical movements or heating. (b) 1. Thermistor (temperature sensor) paired with a heater/fan actuator. 2. Soil moisture sensor paired with a solenoid water valve actuator. (c) A feedback loop is a continuous cycle of sensing, processing, and acting. First, the thermistor reads the internal temperature. The microcontroller compares this reading with the target setpoint (e.g. 24C). If the reading is lower, the controller computes the error and sends a signal to turn on the heater. The cycle repeats continuously; once the sensor detects the temperature has reached 24C, the processor shuts off the heater, keeping the environment stable. (d) Advantage: Centralized systems are simpler to design, update, and manage, and they allow easy correlation between variables (e.g., coordinate heating and ventilation together). Disadvantage: Vulnerability to a single point of failure; if the central processor crashes, all subsystems (watering, lighting, temperature control) fail instantly.

Marking scheme

(a) [2 marks] for defining input/output signal conversion. (b) [2 marks] for first sensor-actuator pair; [2 marks] for second sensor-actuator pair. (c) [1 mark] for defining feedback loop; [2 marks] for explaining the sensor-input to processor comparison; [2 marks] for explaining the output action adjusting the state and completing the loop. (d) [2 marks] for explaining centralized advantages (easy data fusion, lower cost); [2 marks] for explaining centralized disadvantage (single point of failure, wire routing limits).

Paper 2 Option D (OOP)

Answer all questions from the Object-Oriented Programming Option.

4 Question · 65 marks

Question 1 · OOP Java/Pseudocode Structured

16.25 marks

A retail store inventory system utilizes object-oriented programming to manage its products. A general class Product is defined, and a specialized class PerishableProduct inherits from it.

(a) Define the term "inheritance" in the context of object-oriented programming. [2 marks]

(b) Draw a UML diagram showing the relationship between Product and PerishableProduct, including at least one attribute and one method for each class. [3 marks]

(c) Write the Java class definition for Product. It must include:
- Private attributes: id (String) and price (double).
- A constructor that initializes these attributes.
- Getter and setter methods for each attribute. [4 marks]

(d) The PerishableProduct class has an additional private attribute daysToExpiry (int). Write the Java implementation for PerishableProduct, including:
- A constructor that initializes id, price, and daysToExpiry (using super).
- A method applyDiscount() that reduces the price by 20% if daysToExpiry is less than 3, and by 10% if daysToExpiry is between 3 and 7 (inclusive). Otherwise, the price remains unchanged. [7.25 marks]

Show answer & marking scheme

Worked solution

(a) Inheritance is a mechanism in OOP where a new class (subclass or derived class) is created from an existing class (superclass or base class). The subclass inherits all public and protected fields and methods of the superclass, allowing for code reusability and hierarchical classification.

(b) UML Diagram structure:
+---------------------------------------+
| Product |
+---------------------------------------+
| - id: String |
| - price: double |
+---------------------------------------+
| + getId(): String |
| + getPrice(): double |
+---------------------------------------+
^
| (Inheritance / Extends arrow)
+---------------------------------------+
| PerishableProduct |
+---------------------------------------+
| - daysToExpiry: int |
+---------------------------------------+
| + applyDiscount(): void |
+---------------------------------------+

(c) Java implementation of Product:
```java
public class Product {
private String id;
private double price;

public Product(String id, double price) {
this.id = id;
this.price = price;
}

public String getId() {
return id;
}

public void setId(String id) {
this.id = id;
}

public double getPrice() {
return price;
}

public void setPrice(double price) {
this.price = price;
}
}
```

(d) Java implementation of PerishableProduct:
```java
public class PerishableProduct extends Product {
private int daysToExpiry;

public PerishableProduct(String id, double price, int daysToExpiry) {
super(id, price);
this.daysToExpiry = daysToExpiry;
}

public int getDaysToExpiry() {
return daysToExpiry;
}

public void setDaysToExpiry(int daysToExpiry) {
this.daysToExpiry = daysToExpiry;
}

public void applyDiscount() {
if (this.daysToExpiry < 3) {
double newPrice = this.getPrice() * 0.80;
this.setPrice(newPrice);
} else if (this.daysToExpiry >= 3 && this.daysToExpiry <= 7) {
double newPrice = this.getPrice() * 0.90;
this.setPrice(newPrice);
}
}
}
```

Marking scheme

(a) Award up to 2 marks:
- 1 mark for defining inheritance as creating a new class based on an existing one.
- 1 mark for mentioning the acquisition of properties/methods or code reusability.

(b) Award up to 3 marks:
- 1 mark for correct UML notation of inheritance (closed arrow pointing from subclass to superclass).
- 1 mark for correctly showing private attributes (- symbol) and public methods (+ symbol).
- 1 mark for including classes with correct names and the requested attributes/methods.

(c) Award up to 4 marks:
- 1 mark for correct private attribute declarations.
- 1 mark for correct constructor initialization with parameters.
- 1 mark for correct getter methods.
- 1 mark for correct setter methods.

(d) Award up to 7.25 marks:
- 1 mark for declaring class extension (extends Product).
- 1 mark for private daysToExpiry attribute.
- 1.25 marks for constructor calling super(id, price) correctly.
- 1 mark for constructor initializing daysToExpiry.
- 1.5 marks for implementing correct logic for daysToExpiry < 3 (applying 20% discount via getters/setters).
- 1.5 marks for implementing correct logic for daysToExpiry between 3 and 7 inclusive (applying 10% discount).

Question 2 · OOP Java/Pseudocode Structured

16.25 marks

A hotel booking system defines an interface Billable that contains a method calculateBill(). Two classes, StandardRoom and SuiteRoom, implement this interface.

(a) Outline the differences between an abstract class and an interface in Java, reference how variables and methods are declared in each. [3 marks]

(b) Explain how the concept of polymorphism allows a single list of Billable objects to process payments for both StandardRoom and SuiteRoom objects dynamically during runtime. [3.25 marks]

(c) Write the implementation for:
- The Billable interface. [1 mark]
- The StandardRoom class containing attributes flatRate (double) and nights (int) implementing Billable. [2.5 marks]
- The SuiteRoom class containing attributes flatRate (double), nights (int), and serviceFee (double) implementing Billable. [2.5 marks]

(d) Write a client-side method, calculateTotalRevenue(Billable[] bookings), which iterates through an array of Billable bookings and returns the sum of all room bills. [4 marks]

Show answer & marking scheme

Worked solution

(a) Differences between abstract classes and interfaces:
- An abstract class can have instance variables (non-static, non-final) and fully implemented concrete methods, whereas an interface (prior to Java 8) can only declare abstract methods and public static final constants.
- A class can implement multiple interfaces, but can extend only one abstract class (single inheritance).
- Abstract classes use the 'extends' keyword, while interfaces use the 'implements' keyword.

(b) Polymorphism allows objects of different classes that implement the same interface to be treated as instances of that interface. At runtime, Java uses dynamic binding to determine which concrete method to invoke. Thus, iterating through an array of type Billable allows calling calculateBill() on each object, and Java automatically invokes StandardRoom's implementation for standard rooms and SuiteRoom's implementation for suites, without needing explicit type casting.

(c) Code implementation:

```java
public interface Billable {
double calculateBill();
}

public class StandardRoom implements Billable {
private double flatRate;
private int nights;

public StandardRoom(double flatRate, int nights) {
this.flatRate = flatRate;
this.nights = nights;
}

public double calculateBill() {
return flatRate * nights;
}
}

public class SuiteRoom implements Billable {
private double flatRate;
private int nights;
private double serviceFee;

public SuiteRoom(double flatRate, int nights, double serviceFee) {
this.flatRate = flatRate;
this.nights = nights;
this.serviceFee = serviceFee;
}

public double calculateBill() {
return (flatRate * nights) + serviceFee;
}
}
```

(d) Code for revenue calculation:
```java
public static double calculateTotalRevenue(Billable[] bookings) {
double total = 0.0;
if (bookings == null) {
return total;
}
for (int i = 0; i < bookings.length; i++) {
if (bookings[i] != null) {
total += bookings[i].calculateBill();
}
}
return total;
}
```

Marking scheme

(a) Award up to 3 marks:
- 1 mark for explaining that a class can implement multiple interfaces but only extend one abstract class.
- 1 mark for noting that abstract classes can have state (instance variables) while interfaces typically cannot (only constants).
- 1 mark for noting that abstract classes can have non-abstract methods while interfaces focus primarily on method contracts (signatures).

(b) Award up to 3.25 marks:
- 1 mark for stating that Billable can act as a common reference type for both StandardRoom and SuiteRoom.
- 1.25 marks for explaining runtime dynamic binding (the JVM calls the specific implementation of the underlying object).
- 1 mark for highlighting that this removes the need for instance-of checking or typecasting.

(c) Award up to 6 marks:
- 1 mark for correct interface definition.
- 2.5 marks for StandardRoom class (correct implements clause, fields, constructor, and implementation of calculateBill()).
- 2.5 marks for SuiteRoom class (correct implements clause, fields including serviceFee, constructor, and implementation of calculateBill() with serviceFee added).

(d) Award up to 4 marks:
- 1 mark for correct method signature returning double and accepting Billable[].
- 1 mark for initializing total accumulation variable.
- 1 mark for safe iteration loop with null safety check.
- 1 mark for calling calculateBill() polymorphically and returning the sum.

Question 3 · OOP Java/Pseudocode Structured

16.25 marks

A software application for managing academic courses uses an array of Student objects inside a Course class.

(a) Describe the concept of encapsulation and how it prevents direct, unauthorized modification of an object's internal state. [3 marks]

(b) Contrast aggregation and composition. State which relationship exists between Course and Student, justifying your answer. [3.25 marks]

(c) The Course class contains an attribute enrolledStudents which is an array of Student objects (Student[] enrolledStudents) and an integer studentCount. Write a Java method addStudent(Student s) that adds the student s to the array if there is space available (i.e., studentCount is less than the array's capacity). The method should update studentCount and return true. If the array is full or the input is null, it should return false. [5 marks]

(d) The Student class has a method getGpa() which returns a double. Implement a method getHonorRoll() within the Course class that returns a new array containing only the Student objects with a GPA of 3.6 or higher. The returned array must contain no empty/null elements and must be sized exactly to fit the honor roll students. [5 marks]

Show answer & marking scheme

Worked solution

(a) Encapsulation is the bundling of data (attributes) and methods that operate on that data into a single unit (class) while restricting direct access to some of the object's components. It is achieved by declaring attributes as private and providing public getter and setter methods. This prevents external code from putting an object into an invalid or inconsistent state.

(b) Comparison:
- Aggregation is a 'has-a' relationship where the child object can exist independently of the parent container (e.g., if a Course is deleted, Student objects still exist in the school system).
- Composition is a stronger 'has-a' relationship where the child object's life cycle is strictly dependent on the parent (e.g., if Course is deleted, its individual weekly Schedule objects are also destroyed).
- Between Course and Student, aggregation is the correct relationship because students exist independently of any individual course they enroll in.

(c) Java method `addStudent`:
```java
public boolean addStudent(Student s) {
if (s == null) {
return false;
}
if (this.studentCount < this.enrolledStudents.length) {
this.enrolledStudents[this.studentCount] = s;
this.studentCount++;
return true;
}
return false;
}
```

(d) Java method `getHonorRoll`:
```java
public Student[] getHonorRoll() {
int count = 0;
for (int i = 0; i < this.studentCount; i++) {
if (this.enrolledStudents[i] != null && this.enrolledStudents[i].getGpa() >= 3.6) {
count++;
}
}

Student[] honorRoll = new Student[count];
int index = 0;
for (int i = 0; i < this.studentCount; i++) {
if (this.enrolledStudents[i] != null && this.enrolledStudents[i].getGpa() >= 3.6) {
honorRoll[index] = this.enrolledStudents[i];
index++;
}
}
return honorRoll;
}
```

Marking scheme

(a) Award up to 3 marks:
- 1 mark for defining encapsulation (bundling data and operations).
- 1 mark for mentioning access modifiers (private fields, public accessors/mutators).
- 1 mark for explaining validation / control over state transitions.

(b) Award up to 3.25 marks:
- 1 mark for explaining aggregation (independent lifetimes).
- 1 mark for explaining composition (dependent lifetimes).
- 1.25 marks for identifying the relationship between Course and Student as aggregation and justifying it (students exist outside of the course).

(c) Award up to 5 marks:
- 1 mark for checking null safety on input student parameter.
- 1 mark for checking boundary limit (studentCount < capacity).
- 1 mark for assigning the student to the array index correctly.
- 1 mark for incrementing studentCount.
- 1 mark for returning boolean results appropriately (true/false).

(d) Award up to 5 marks:
- 1 mark for first loop to count the number of qualifying honor roll students.
- 1 mark for declaring and instantiating the new array with the exact size needed.
- 1 mark for second loop to populate the honor roll array.
- 1 mark for checking null pointers and GPA threshold (>= 3.6) accurately.
- 1 mark for returning the populated array.

Question 4 · OOP Java/Pseudocode Structured

16.25 marks

A digital library catalog manages its physical inventory of Book objects.

(a) State the purpose of the static keyword when applied to a variable in Java. [2 marks]

(b) Using Java, show how a static integer variable nextId can be used in the Book class constructor to automatically assign a unique, incremental bookId (int) to each new Book object. Include the class variables declaration and the constructor in your code. [4.25 marks]

(c) In Java, objects are passed to methods as references by value, whereas primitives are passed strictly by value. Describe the difference in behavior when a helper method attempts to modify:
- An object's field (e.g., updating a book's availability status).
- A primitive variable (e.g., incrementing an integer counter). [4 marks]

(d) Write a recursive method searchBook(Book[] books, String targetTitle, int index) that searches for a book with a title that matches targetTitle. The search must start at the position specified by index. If the book is found, return its object reference; if not found or if index goes out of bounds, return null. Avoid using any iterative loops. [6 marks]

Show answer & marking scheme

Worked solution

(a) The static keyword means that the variable belongs to the class itself rather than to individual instances of the class. There is only one copy of a static variable shared among all instances of the class.

(b) Implementation of static counter for unique IDs:
```java
public class Book {
private static int nextId = 1000; // Shared static counter
private int bookId; // Unique per-instance ID
private String title;

public Book(String title) {
this.title = title;
this.bookId = nextId; // Assign current unique value
nextId++; // Increment for next instance
}

public int getBookId() {
return bookId;
}

public String getTitle() {
return title;
}
}
```

(c) Difference in parameter passing behavior:
- Primitives: Passed by value, meaning a copy of the value is made in the method's local activation record. Any modification made inside the method only affects the local copy and does not persist outside the method.
- Objects: Passed as a reference by value, meaning the method receives a copy of the reference pointing to the actual object in heap memory. Modifying the fields of this object via the reference will persist outside the method call, as both caller and callee reference the same memory allocation.

(d) Recursive search method:
```java
public static Book searchBook(Book[] books, String targetTitle, int index) {
// Base case 1: Array is null or index is out of bounds
if (books == null || index < 0 || index >= books.length) {
return null;
}
// Base case 2: Current element is not null and titles match
if (books[index] != null && books[index].getTitle().equals(targetTitle)) {
return books[index];
}
// Recursive step: Increment index and search remainder
return searchBook(books, targetTitle, index + 1);
}
```

Marking scheme

(a) Award up to 2 marks:
- 1 mark for stating that the static variable belongs to the class, not instance.
- 1 mark for stating that all instances share a single memory location of the variable.

(b) Award up to 4.25 marks:
- 1 mark for declaring static private variable nextId.
- 1 mark for declaring instance variable bookId.
- 1.25 marks for constructor assigning nextId to bookId.
- 1 mark for incrementing nextId inside the constructor.

(c) Award up to 4 marks:
- 1 mark for explaining primitive pass-by-value (copies the actual value, no outside effect).
- 1 mark for explaining object pass-by-reference-value (copies memory address reference).
- 1 mark for explaining that changes to object properties persist after the call.
- 1 mark for explaining that changing the reference variable itself to point to another object does not affect the caller's original reference.

(d) Award up to 6 marks:
- 1 mark for correct method signature with parameters and return type.
- 1.5 marks for base case checks (handling null array and index boundary checks correctly).
- 1.5 marks for base case check for a match (null check on element and .equals() match check).
- 1 mark for correct recursive call passing index + 1.
- 1 mark for correctly returning the result of the recursive call without using iteration (loops).

Paper 3 Case Study

Answer all questions based on the Case Study booklet.

4 Question · 30 marks

Question 1 · Short Answer

7.5 marks

With reference to the Case Study, evaluate the use of Reinforcement Learning from Human Feedback (RLHF) compared to supervised fine-tuning (SFT) for reducing "hallucinations" in a domain-specific chatbot deployed in a legal advisory context.

Show answer & marking scheme

Worked solution

In a legal advisory context, preventing 'hallucinations' (the generation of factually incorrect or fabricated legal claims) is critical.

1. **Supervised Fine-Tuning (SFT):** Under SFT, the model is trained on curated pairs of legal prompts and correct answers. While SFT helps the chatbot learn the correct domain terminology and formal style, it struggles to prevent hallucinations because the model is only taught what *to* say, not what *not* to say. It lacks a generalized mechanism to penalize confident but false assertions.

2. **Reinforcement Learning from Human Feedback (RLHF):** RLHF introduces a feedback loop. First, a reward model is trained on human pairwise comparisons (where human legal experts rank different model outputs based on accuracy and truthfulness). Second, a reinforcement learning policy (such as Proximal Policy Optimization) fine-tunes the LLM to maximize this reward. This actively penalizes hallucination tendencies and encourages the model to express uncertainty (e.g., "I do not have access to that specific statute") rather than fabricating information.

**Evaluation & Trade-offs:**
- **Accuracy and Alignment:** RLHF is significantly more effective at aligning the model with safety guidelines and factual consistency than SFT alone.
- **Subjectivity and Cost:** RLHF is highly dependent on the quality and objectivity of the human raters. In legal domains, hiring certified legal professionals to rate outputs is extremely expensive and time-consuming.
- **Reward Hacking:** RLHF can sometimes lead to 'reward hacking' where the chatbot learns to output overly cautious or generic legal disclaimers rather than helpful, accurate answers.

Marking scheme

**[1.5 Marks]** Explains the limitations of Supervised Fine-Tuning (SFT) in a legal context, noting that SFT teaches form and style but struggles with penalizing fabricated outputs.
**[2.0 Marks]** Details the mechanism of RLHF (creation of a reward model from human comparison data and policy optimization) and how it explicitly targets and penalizes hallucinations.
**[2.0 Marks]** Evaluates the specific application to a legal context, highlighting the need for extreme precision and the danger of false legal precedents.
**[2.0 Marks]** Identifies critical drawbacks/trade-offs of RLHF, including the high cost of expert legal annotators, risk of reward hacking, and potential over-conservatism (excessive disclaimers) in responses.

Question 2 · Short Answer

7.5 marks

A developer is building a Retrieval-Augmented Generation (RAG) chatbot using a vector database. Explain how the choice of document chunk size and chunk overlap affects both the retrieval accuracy and the processing overhead of the chatbot.

Show answer & marking scheme

Worked solution

To implement a RAG system, source documents must be segmented into smaller units (chunks) before being vectorized and stored in a vector database.

1. **Chunk Size:**
- *Small Chunks (e.g., 100-200 tokens):* Improve retrieval precision because the vector embeddings are highly specific to narrow topics. This reduces the risk of retrieving irrelevant background information. However, if a concept spans across multiple paragraphs, a small chunk size may miss the broader context, leading to incomplete or incorrect syntheses by the LLM.
- *Large Chunks (e.g., 1000+ tokens):* Preserve rich contextual information and long-range dependencies within the document. However, they introduce noise (irrelevant information within the same chunk), which can dilute the vector similarity score and result in less precise matches. Furthermore, feeding large chunks to the LLM increases token consumption and computational costs.

2. **Chunk Overlap:**
- *Purpose:* Overlap (e.g., 10-20% of the chunk size) ensures that sentences or concepts split at the boundary of a chunk are preserved in their entirety in at least one of the adjacent chunks. This prevents vital transition words or modifying clauses from being separated from their subject.
- *Overhead:* High overlap increases the total number of chunks stored in the vector database, leading to larger index sizes, higher memory footprint, and marginally slower query lookup times. It can also result in redundant information being returned in top-K retrieval results, wasting precious context window capacity.

Marking scheme

**[2.0 Marks]** Analyzes the impact of small vs. large chunk sizes on retrieval precision and context retention.
**[2.0 Marks]** Analyzes the impact of chunk size on computational overhead (token consumption, processing costs, and LLM attention performance).
**[2.0 Marks]** Explains the purpose of chunk overlap in preserving semantic integrity across boundaries.
**[1.5 Marks]** Discusses the operational trade-offs of chunk overlap, specifically vector database storage expansion and context window redundancy.

Question 3 · Short Answer

7.5 marks

An enterprise is planning to deploy a corporate customer-support chatbot. Evaluate the decision to self-host an open-source Large Language Model (LLM) on local GPU infrastructure versus utilizing a proprietary cloud-based LLM API. Your evaluation must address security, cost, latency, and maintenance.

Show answer & marking scheme

Worked solution

When selecting an architectural strategy, the enterprise must balance multiple competing operational parameters:

1. **Security & Data Privacy:**
- *Self-Hosting:* Highly advantageous. Sensitive customer data, intellectual property, and proprietary support logs remain entirely within the enterprise's private network security perimeter, ensuring compliance with strict regulations like GDPR or HIPAA.
- *Cloud API:* Introduces third-party data transmission risks. Even with enterprise agreements, data leaves the local network, which can be a regulatory bottleneck for highly regulated industries.

2. **Cost Structure:**
- *Self-Hosting:* Involves high Capital Expenditure (CapEx) for procurement of specialized GPU hardware (e.g., NVIDIA A100/H100) and cooling systems. Operational costs are flat and predictable regardless of query volume.
- *Cloud API:* Operational Expenditure (OpEx) model based on token usage. Low initial setup barrier, but costs scale linearly with user traffic, potentially becoming expensive at high scale.

3. **Latency & Scalability:**
- *Self-Hosting:* Latency is determined by local hardware capacity. Scaling to meet peak demand requires provisioned overcapacity, leading to idle resources during low-traffic periods.
- *Cloud API:* Typically offers highly optimized, global edge infrastructure with auto-scaling capabilities, but latency depends on external network conditions and API rate-limiting constraints.

4. **Maintenance & Development Velocity:**
- *Self-Hosting:* Demands dedicated ML engineering and DevOps resources to manage model weights, optimize quantization (e.g., FP16, INT8, AWQ), run container orchestration, and handle hardware failures.
- *Cloud API:* Minimal maintenance required. The cloud provider handles model updates, hardware maintenance, and security patches, allowing the enterprise to focus solely on application-level logic.

Marking scheme

**[2.0 Marks]** Evaluates the security and compliance trade-offs (private perimeter vs. third-party transit and privacy agreements).
**[2.0 Marks]** Analyzes the cost differences, distinguishing clearly between Capital Expenditure (CapEx) for self-hosting and Operational Expenditure (OpEx) for cloud APIs.
**[2.0 Marks]** Explains performance and scalability differences, including local hardware constraints vs. cloud provider elasticity.
**[1.5 Marks]** Explains the maintenance overhead, referencing the engineering resources needed to run and optimize open-source infrastructure versus the simplicity of API integrations.

Question 4 · Short Answer

7.5 marks

Explain the architectural and operational limitations imposed by fixed context windows (token limits) in Transformer-based chatbots, and propose two technical strategies to mitigate these limitations in long conversational sessions.

Show answer & marking scheme

Worked solution

1. **Architectural & Operational Limitations:**
- **Quadratic Complexity:** The core self-attention mechanism of the Transformer architecture calculates attention scores between every pair of tokens in a sequence. This results in computational and memory complexity of \(O(N^2)\), where \(N\) is the number of tokens. Consequently, as the context size increases, memory consumption on GPUs grows quadratically, leading to hardware bottlenecks and "Out of Memory" (OOM) errors.
- **Loss of Coherence:** Once a conversation exceeds the maximum token limit, the chatbot must drop older parts of the dialogue. This causes the model to "forget" instructions, context, or user preferences established earlier in the session.

2. **Mitigation Strategy 1: Recursive Summarization & Sliding Windows:**
- Instead of feeding the entire historical transcript to the LLM, a sliding window technique is used.
- Older exchanges are periodically processed by a background summarization LLM. This short, compressed summary is prepended to the active sliding window of recent raw messages. This preserves key facts (e.g., user name, main goal) while keeping the input prompt within the token boundary.

3. **Mitigation Strategy 2: Vector-Based Long-Term Memory (RAG for History):**
- Past conversations are saved to an external database and chunked.
- When a user inputs a new prompt, a vector similarity search is executed against the history database. Only the most semantically relevant historical context is retrieved and injected into the current context window, ignoring irrelevant intermediate messages and preserving active token space.

Marking scheme

**[2.5 Marks]** Explains the technical cause of the token limit, explicitly linking it to the quadratic complexity \(O(N^2)\) of the self-attention mechanism and the resulting GPU memory constraints.
**[2.5 Marks]** Outlines the first mitigation strategy (sliding window attention with recursive summarization) and explains how it compresses old information to fit context boundaries.
**[2.5 Marks]** Outlines the second mitigation strategy (Vector database / RAG-based historical retrieval) and explains how semantic search filters historical inputs to minimize active token consumption.

Wondering how well you actually know this?

Thinka is an AI practice app for DSE students: unlimited questions, instant auto-marking, and detailed step-by-step solutions. 100,000+ students use it to confirm they actually know it, not just think they do.

Start Practising Free See pricing

Want more questions like this? Practice unlimited on Thinka, instant answers included.

Start Practising Free