Worked solution
### Technical Analysis
- **Option 1 (Fine-tuning & On-Premise)**: Fine-tuning updates the weights of an open-source model (such as Llama 3) using university curricula. It is well suited to adapting the model's tone, terminology, and familiarity with the domain. However, a fine-tuned model is static: whatever it learned is frozen at training time, so it cannot reflect real-time changes such as a student's newly updated grades or a last-minute change to the course schedule. Hosting on-premise also requires significant local computational resources (e.g., high-end GPUs) and complex system orchestration to achieve low-latency inference.
- **Option 2 (Proprietary API & RAG)**: Retrieval-Augmented Generation (RAG) queries the university's databases at runtime, fetches the relevant documents (e.g., a specific student's transcript or a course syllabus), and injects them into the prompt's context window. This allows a proprietary cloud model (e.g., OpenAI's GPT-4) to reason over highly dynamic, real-time data without retraining. It also reduces hallucinations, because the response is grounded in the retrieved text, although it does not eliminate them. However, it introduces dependencies on network latency, API rate limits, and the model's context window capacity.
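
The retrieve-then-inject step described for RAG can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production design: the `Document` class, the sample records, and the word-overlap scoring are all hypothetical, and a real system would typically use embedding-based similarity search and then send the resulting prompt to a model.

```python
import re
from dataclasses import dataclass


@dataclass
class Document:
    doc_id: str
    text: str


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, docs: list[Document], k: int = 2) -> list[Document]:
    """Rank documents by word overlap with the query and keep the top k.

    Word overlap is a stand-in (assumption) for real similarity search.
    Documents sharing no words with the query are dropped.
    """
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(d.text)), d) for d in docs]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:k]]


def build_prompt(query: str, context: list[Document]) -> str:
    """Inject the retrieved text into the prompt ahead of the question."""
    context_block = "\n".join(f"[{d.doc_id}] {d.text}" for d in context)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context_block}\n"
        f"Question: {query}"
    )


# Hypothetical sample data standing in for the university's databases.
docs = [
    Document("syllabus-cs101", "CS101 lectures run Monday 10:00 in room B12."),
    Document("transcript-42", "Student 42 grade for CS101 is A minus."),
    Document("canteen", "The canteen opens at 08:00 on weekdays."),
]
query = "What grade did student 42 get for CS101?"
prompt = build_prompt(query, retrieve(query, docs))
print(prompt)
```

Because the context is rebuilt on every query, changing a record changes the next prompt immediately, with no retraining. The same mechanism is why prompt length, and therefore context window usage, grows with the amount of retrieved text.
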
### Financial Analysis
- **Option 1**: High Capital Expenditure (CapEx). Buying and installing local GPU clusters is expensive. Ongoing Operational Expenditure (OpEx) is also substantial: electricity, cooling, and the salaries of specialized machine learning engineers who maintain the servers and pipelines. However, there are no per-token API fees.
- **Option 2**: Low CapEx. No specialized local hardware is required. Instead, OpEx is variable and ongoing, driven by API consumption (a pay-per-token model). Although this is cheaper at first, a heavily used chatbot serving a large student cohort can generate high and unpredictable monthly API costs, especially since RAG appends large context documents to every prompt.
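
The CapEx versus OpEx trade-off above can be made concrete with a small break-even calculation. Every figure below (hardware cost, running cost, query volume, tokens per query, token price) is a made-up assumption chosen only to illustrate the arithmetic; none comes from the question or from any real price list.

```python
from typing import Optional


def on_prem_cumulative(months: int, capex: float, monthly_opex: float) -> float:
    """Up-front hardware cost plus running costs after the given number of months."""
    return capex + monthly_opex * months


def api_monthly(queries: int, tokens_per_query: int, price_per_1k_tokens: float) -> float:
    """Pay-per-token bill for one month."""
    return queries * tokens_per_query / 1000 * price_per_1k_tokens


def break_even_month(
    capex: float,
    monthly_opex: float,
    monthly_api: float,
    horizon_months: int = 60,
) -> Optional[int]:
    """First month in which cumulative API spend reaches the on-premise total.

    Returns None if that never happens within the horizon.
    """
    for month in range(1, horizon_months + 1):
        if monthly_api * month >= on_prem_cumulative(month, capex, monthly_opex):
            return month
    return None


# Hypothetical figures, for illustration only.
CAPEX = 200_000.0        # GPU cluster purchase and installation
MONTHLY_OPEX = 8_000.0   # electricity, cooling, staff share
QUERIES = 500_000        # chatbot queries per month
PRICE = 0.01             # price per 1,000 tokens

rag_bill = api_monthly(QUERIES, 3_000, PRICE)   # long prompts with retrieved context
plain_bill = api_monthly(QUERIES, 500, PRICE)   # short prompts without retrieved context

print(break_even_month(CAPEX, MONTHLY_OPEX, rag_bill))
print(break_even_month(CAPEX, MONTHLY_OPEX, plain_bill))
```

With these assumed numbers, the RAG-sized prompts cost six times as much per month as the short ones, and the API overtakes the on-premise total in month 29. The short-prompt bill stays below the on-premise running cost, so it never breaks even within the horizon. Different assumptions shift the result substantially, which is the sense in which API costs are hard to predict.
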
### Ethical & Security Analysis
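- **Option 1**: Superior data security. Student records remain inside the university's own network, which makes it easier to comply with strict student data protection laws such as FERPA (in the US) or GDPR (in the EU). No data is sent to third-party corporations, so it cannot be used to train public commercial models. The university still has to secure the system itself (e.g., access controls and patching), because on-premise hosting does not remove the risk of an internal breach.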
- **Option 2**: Significant privacy risks. Sending sensitive student data (such as grades, personal details, or disciplinary records) to an external API raises major compliance issues. Unless the university secures an enterprise-grade contract guaranteeing that data is encrypted in transit and at rest and is explicitly excluded from model training, this option presents severe liability concerns.
### Recommendation and Justification
Neither option is perfect in its pure form. Because *EduBot* must handle private student records, compliance with data privacy laws is a non-negotiable ethical and legal constraint, which heavily penalizes Option 2. At the same time, *EduBot* must provide accurate, real-time data (e.g., individual student grades and schedules), so fine-tuning alone (Option 1) is technically insufficient: the weights cannot practically be retrained every time a record changes.
Therefore, the optimal solution is a **hybrid architecture**: deploying an **open-source RAG pipeline hosted entirely on-premise (or within a secure, dedicated private university cloud)**. By using an open-source LLM hosted locally, the university retains complete control over data privacy (solving the ethical issue). By utilizing a RAG pipeline rather than pure fine-tuning, the system can dynamically retrieve real-time student records from internal databases and feed them safely to the local LLM (solving the technical limitation of static data).
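
The recommended hybrid can be sketched as follows: records are fetched from an internal store at query time and passed to a model that runs locally, so nothing leaves the university's infrastructure. This is a sketch under stated assumptions. The `RECORDS` dictionary stands in for the internal database, `local_generate` is a placeholder that merely echoes its prompt instead of calling a real open-source model, and the own-records-only check is one illustrative access rule, not a complete security design.

```python
from typing import Callable

# Hypothetical in-memory stand-in for the university's internal student database.
RECORDS: dict[str, dict[str, str]] = {
    "s001": {"CS101": "B+"},
    "s002": {"CS101": "A"},
}


def fetch_records(requester_id: str, student_id: str) -> dict[str, str]:
    """Return a student's grades, but only to that same student."""
    if requester_id != student_id:
        raise PermissionError("students may only read their own records")
    return dict(RECORDS[student_id])


def local_generate(prompt: str) -> str:
    """Placeholder for a locally hosted open-source model (assumption).

    It echoes the prompt so the example runs without any model installed.
    """
    return "LOCAL MODEL SAW:\n" + prompt


def answer(
    requester_id: str,
    student_id: str,
    question: str,
    generate: Callable[[str], str] = local_generate,
) -> str:
    """Retrieve current records, build the prompt, and call the local model."""
    records = fetch_records(requester_id, student_id)
    context = "\n".join(f"{course}: {grade}" for course, grade in records.items())
    prompt = f"Student records:\n{context}\nQuestion: {question}"
    return generate(prompt)


before = answer("s001", "s001", "What is my CS101 grade?")
RECORDS["s001"]["CS101"] = "A-"  # a grade is updated in the database
after = answer("s001", "s001", "What is my CS101 grade?")

print(before)
print(after)
```

The second call sees the updated grade without any retraining, which is the property fine-tuning alone lacks. Since `generate` is an ordinary callable, swapping the placeholder for a real local inference call would not change the rest of the pipeline.
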
Marking scheme
Marks are awarded using the following 12-mark holistic rubric:
- **Level 4 (10-12 marks)**:
  - Shows excellent understanding of fine-tuning, open-source hosting, APIs, and RAG architectures.
  - Offers a balanced, deep evaluation across all three perspectives: technical (static vs. dynamic data, latency, hardware), financial (CapEx vs. OpEx), and ethical/security (FERPA/GDPR compliance, third-party data risks).
  - Provides a highly realistic, well-justified recommendation (e.g., identifying that a hybrid approach, hosting a local RAG system, maximizes both privacy and real-time query capability).
  - Terminology is accurate, clear, and highly professional throughout.
- **Level 3 (7-9 marks)**:
  - Shows good understanding of both options and their general mechanisms.
  - Evaluates most aspects (technical, financial, and security), though one perspective may be slightly less developed than the others.
  - Provides a logical recommendation supported by relevant arguments from the analysis.
  - Uses appropriate computer science vocabulary.
- **Level 2 (4-6 marks)**:
  - Describes fine-tuning and API/RAG options but the analysis is descriptive rather than evaluative.
  - Focuses heavily on one perspective (e.g., only data privacy) while ignoring others (e.g., financial or technical issues of static data in fine-tuning).
  - Recommendation is superficial, general, or poorly justified.
- **Level 1 (1-3 marks)**:
  - Shows limited or flawed understanding of LLM architectures.
  - Fails to meaningfully compare the options or address the specific requirements of the university context.
  - Recommendation is missing, incorrect, or lacks any technical justification.