Every SEAS student hits this wall
This was the final project for CSCI 6366, Neural Networks and Deep Learning, at GWU's School of Engineering and Applied Science, built with my classmate Anurag Dhungana. We had a semester to build something real using what we'd learned, and we wanted to pick a problem we had actually felt ourselves.
The problem we picked is one every SEAS student has hit: figuring out which courses you need before you can take the one you actually want. The official GWU systems are fragmented: the course bulletin is a PDF, the schedule lives in a separate portal, and tracing a prerequisite chain means clicking through multiple pages by hand. We wanted to build something that could answer the question naturally: "What do I need to take before CSCI 6364?"
LLMs hallucinate prerequisites. Confidently.
The obvious first move was fine-tuning a language model on course data. We did that. It worked for simple questions. But it completely broke on anything that required tracing a chain. Language models don't naturally reason about structured relationships. They pattern-match. They hallucinate. They'd give you a confident wrong answer about prerequisites that didn't exist.
Course relationships aren't a text problem. They're a graph problem.
Prerequisites form a directed acyclic graph. If you want to answer multi-hop questions correctly, you need something that can traverse that structure, not just retrieve text that sounds relevant. That's what pushed us toward building a Knowledge Graph.

A knowledge graph that does the reasoning first
We scraped and structured data from two sources: the GWU CSCI and DATS course bulletin (187 courses) and the Spring 2026 course schedule (586 instances). From that, we built a Knowledge Graph using NetworkX where nodes are courses and edges are prerequisite relationships. spaCy handled the entity extraction from the bulletin text.
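To make the pipeline concrete, here's a minimal sketch of how extracted bulletin records become a NetworkX graph. The course codes and record format are illustrative, not the actual extraction output from our spaCy pipeline.

```python
import networkx as nx

# Hypothetical records as they might come out of entity extraction over
# bulletin text (course codes and schema are illustrative only).
extracted = [
    {"course": "CSCI 6364", "prereqs": ["CSCI 6212", "MATH 2184"]},
    {"course": "CSCI 6212", "prereqs": ["CSCI 1311"]},
]

G = nx.DiGraph()
for rec in extracted:
    for p in rec["prereqs"]:
        # Edge direction: prerequisite -> course that requires it
        G.add_edge(p, rec["course"])

# A prerequisite graph must stay acyclic, or chains can't be resolved.
assert nx.is_directed_acyclic_graph(G)
```

Keeping the edge direction consistent (prerequisite pointing at the course that requires it) is what makes multi-hop traversal a simple ancestor query later.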
The Knowledge Graph became the backbone of our QA system. At query time, instead of feeding the question straight to the model, we'd first traverse the graph to resolve any prerequisite chains the question touches, then pass that structured context to the language model to generate a natural language answer. The model's job changed from "figure out what the prerequisites are" to "correctly explain prerequisites that have already been retrieved."
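The traversal step can be sketched in a few lines of NetworkX: `ancestors` gives the full multi-hop prerequisite closure, and a topological sort turns it into a valid taking order. The graph below is a toy example, not the real GWU data.

```python
import networkx as nx

# Toy prerequisite DAG (edge A -> B means "A is a prerequisite of B").
G = nx.DiGraph([
    ("CSCI 1311", "CSCI 1112"),
    ("CSCI 1112", "CSCI 2461"),
    ("CSCI 2461", "CSCI 6364"),
    ("STAT 1051", "CSCI 6364"),
])

def prerequisite_chain(graph, course):
    """All direct and transitive prerequisites, in a valid taking order."""
    chain = nx.ancestors(graph, course)  # multi-hop closure in one call
    return [c for c in nx.topological_sort(graph) if c in chain]

print(prerequisite_chain(G, "CSCI 6364"))
```

This ordered list is the structured context the language model receives, which is why it no longer has to guess at chains it was never reliable about.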

Fine-tuning with LoRA
For the language model, we fine-tuned Llama 3.1 8B using LoRA adapters via Unsloth. We generated 2,828 training Q&A pairs: a mix of simple factual questions and complex multi-hop questions. LoRA meant we weren't fine-tuning all 8 billion parameters; we were adapting a small set of low-rank matrices. This made training feasible on academic resources: 14x less data and 25x faster than full fine-tuning.
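The back-of-envelope arithmetic shows why LoRA makes this tractable. The dimensions below are illustrative for one square projection matrix in an 8B-class model (hidden size 4096), and rank 16 is a common choice rather than our exact configuration.

```python
# Why LoRA shrinks the trainable parameter count for one weight matrix.
d = 4096          # hidden dimension (typical for an 8B-class model)
r = 16            # LoRA rank (a common choice; illustrative here)

full = d * d      # updating the full d x d weight matrix
lora = r * (d + d)  # low-rank factors A (d x r) and B (r x d) instead

print(full // lora)  # -> 128: ~128x fewer trainable parameters
```

That ratio, repeated across every adapted matrix, is the difference between needing a GPU cluster and training on academic hardware.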
Four attempts, four lessons
We didn't get to the Knowledge Graph approach immediately. We went through four approaches across the semester.
The first two were baseline experiments: fine-tuning without graph augmentation. These handled factual lookups reasonably well but failed on relational reasoning. Accuracy on multi-hop questions: 26%.
The third approach introduced the Knowledge Graph as a retrieval mechanism but didn't train the model with graph-augmented examples. Better, but inconsistent.
The fourth approach, training the model with KG-augmented examples so it learned to use the graph context, is where we hit 34-38% accuracy on multi-hop reasoning. That gap between 26% and 38% was the whole point of the Knowledge Graph.
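The difference between the third and fourth approaches is easiest to see in the shape of a training example. The sketch below is illustrative; the field names, course codes, and answer text are not our actual training schema.

```python
# A plain Q&A pair (approaches 1-3 trained on examples like this):
plain = {
    "question": "What do I need to take before CSCI 6364?",
    "answer": "CSCI 6212 and MATH 2184.",
}

# A KG-augmented pair (approach 4): the graph traversal result is part
# of the training input, so the model learns to ground its answer in it.
augmented = {
    "question": "What do I need to take before CSCI 6364?",
    "graph_context": [  # triples resolved by traversal before generation
        ("CSCI 6212", "PREREQ_OF", "CSCI 6364"),
        ("MATH 2184", "PREREQ_OF", "CSCI 6364"),
        ("CSCI 1311", "PREREQ_OF", "CSCI 6212"),
    ],
    "answer": (
        "Directly: CSCI 6212 and MATH 2184. CSCI 6212 itself requires "
        "CSCI 1311, so the full chain is CSCI 1311 -> CSCI 6212 -> "
        "CSCI 6364, plus MATH 2184."
    ),
}
```

Training on the augmented shape is what taught the model to treat the graph context as the source of truth instead of pattern-matching from memory.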

Designing for a system that's sometimes wrong
Building the frontend raised a design problem the model work didn't: how do you build an interface for a system that's partially correct, where the stakes of being wrong are real? A student asking about prerequisites might use the answer to decide what to register for.
The choice to use a chat interface rather than a search box was deliberate. Chat implies reasoning rather than fact retrieval, which is more honest for this system. More practically, chat lets students ask in the natural language they already use. "What do I need before 6364?" is a question a student would ask an advisor. Forcing that into a structured query form adds cognitive overhead for no benefit.
The interface distinguishes between two modes of response: simple factual lookups and complex graph traversal queries. These get different visual treatment. A factual lookup shows a compact answer. A graph traversal query shows which part of the knowledge graph was traversed and what relationships were resolved before the model generated its answer. This is communicating model confidence through design rather than burying a probability score in an API response.
For anything touching graduation requirements or prerequisite chains, the output card includes an active prompt to confirm with an academic advisor, as prominent as the answer itself, not a disclaimer tucked at the bottom.
The model wasn't the problem. The data structure was.
The language model was the same throughout; what changed was how we structured the knowledge. This project was my first time working seriously with knowledge graphs, and the thing that stuck with me is how much the data representation matters. The training data was similar in size across approaches. What changed was structure, and that alone was the difference between a system that hallucinated prerequisites and one that could trace them correctly.
And that insight extends past the model into the interface. Because the data was structured as a graph, the interface could expose graph-level concepts to users. The interactive knowledge graph visualization was only possible because the underlying data was organized that way. If course relationships had been stored as unstructured text or a flat table, there would have been nothing to visualize.
Data architecture and UX are not separate decisions. How you structure your data determines what your interface can show.



