Training an AI on Zurich Timetables: When Models Invent Trains That Do Not Exist

2025-10-09 · by minikim

I gave an AI a slice of Zurich’s Sunday train schedules and asked it to become a pocket timetable assistant. Armed with LoRA, GGUF, and a GPU that sounds like a jet engine when training, I set out to teach a machine when trains really leave Zürich HB. What happened? The model learned to sound exactly like a timetable… and then it invented trains that do not exist.

Aim

The goal of this experiment was to fine tune a small language model on Zurich train schedules and run it locally. I wanted to see if a lightweight AI could serve as a timetable assistant, answering natural questions like “When is the next train from Zürich HB to Oerlikon after 08:00 on Sunday?” Beyond the functionality, the aim was to explore the full pipeline: fine tuning with LoRA, compressing into GGUF, and serving it with Ollama.

Data

The data came from the SBB Open Data GTFS feed. To keep things manageable, I filtered to Zurich City stations and Sundays only. This subset was small, irregular, and perfect for a first iteration. It gave me a few thousand departure–arrival pairs to train on.
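As a sketch of that filtering step (illustrative rather than the exact script), the standard GTFS text files can be joined with pandas; file and column names follow the GTFS spec:

```python
# Minimal Sunday/Zurich filter over a GTFS feed (illustrative sketch).
import pandas as pd

stops = pd.read_csv("gtfs/stops.txt")
stop_times = pd.read_csv("gtfs/stop_times.txt")
trips = pd.read_csv("gtfs/trips.txt")
calendar = pd.read_csv("gtfs/calendar.txt")

# Services that run on Sundays.
sunday_services = calendar.loc[calendar["sunday"] == 1, "service_id"]
sunday_trips = trips[trips["service_id"].isin(sunday_services)]

# Zurich City stops via a crude name filter; a curated station list
# would be more precise.
zurich_stops = stops[stops["stop_name"].str.contains("Zürich", na=False)]

# Departure events at Zurich stops on Sunday trips.
subset = stop_times.merge(sunday_trips[["trip_id"]], on="trip_id")
subset = subset[subset["stop_id"].isin(zurich_stops["stop_id"])]
print(len(subset), "Sunday stop events in Zurich")
```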

Preparing Data

The data was reshaped into instruction and answer pairs. Each entry asked a question in natural language and provided a structured timetable answer.
Example
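A representative pair might look like this (question wording and times are illustrative, not rows from the actual dataset):

```json
{
  "instruction": "When is the next train from Zürich HB to Oerlikon after 08:00 on Sunday?",
  "output": "Zürich HB 08:05 -> Zürich Oerlikon 08:11"
}
```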

LoRA Fine Tuning

Base model: Microsoft Phi 2 with 2.7 billion parameters under MIT license.
LoRA configuration and training parameters:
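A minimal sketch of this kind of setup with Hugging Face peft and transformers; the hyperparameter values and target module names are assumed for illustration, not copied from the run:

```python
# Illustrative LoRA setup for Phi-2 (values are assumptions, not the
# exact configuration used in this experiment).
from transformers import AutoModelForCausalLM, TrainingArguments
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("microsoft/phi-2", torch_dtype="auto")

lora_config = LoraConfig(
    r=16,                       # adapter rank
    lora_alpha=32,
    lora_dropout=0.05,
    # Module names as in the transformers-native Phi-2 implementation.
    target_modules=["q_proj", "k_proj", "v_proj", "dense"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # reports the roughly 0.1 percent trainable share

training_args = TrainingArguments(
    output_dir="out/phi2-timetable-lora",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    fp16=True,
    logging_steps=20,
)
# A Trainer (or SFTTrainer) over the instruction-answer pairs completes
# the loop; the dataset plumbing is omitted here.
```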
Only about 2.6 million parameters (0.09 percent of the model) were updated. This efficiency is the key strength of LoRA: instead of retraining billions of weights, it adapts only a thin adapter layer. The model quickly picked up the timetable format.

GGUF Quantization

After merging the LoRA into the base model, the checkpoint was converted into GGUF. I chose Q4_K_M quantization, a 4 bit setting that balances size and performance. This reduced the model size from about 10 GB (FP16) to around 2 GB (GGUF Q4).
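The merge-then-convert step looks roughly like this; paths are illustrative, and the llama.cpp tool names match recent releases:

```python
# Fold the LoRA adapter into the base model so llama.cpp can convert
# the full checkpoint (illustrative paths).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("microsoft/phi-2", torch_dtype="auto")
model = PeftModel.from_pretrained(base, "out/phi2-timetable-lora")
merged = model.merge_and_unload()  # bakes the adapter into the base weights
merged.save_pretrained("out/phi2-timetable-merged")
AutoTokenizer.from_pretrained("microsoft/phi-2").save_pretrained(
    "out/phi2-timetable-merged"
)

# Then, with llama.cpp (script and binary names as of recent releases):
#   python convert_hf_to_gguf.py out/phi2-timetable-merged --outfile phi2-tt-f16.gguf
#   ./llama-quantize phi2-tt-f16.gguf phi2-tt-q4_k_m.gguf Q4_K_M
```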
Quantization made the model portable to different machines. While it introduced slightly more noise, the structure of the output remained intact.

Run with Ollama

With Ollama, the quantized model could be served locally, with conservative generation parameters.
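A Modelfile along these lines captures that setup; the parameter values shown are representative stand-ins rather than the exact settings:

```
# Modelfile -- illustrative values, not the exact settings used
FROM ./phi2-tt-q4_k_m.gguf

# Low temperature keeps answers close to the learned timetable format.
PARAMETER temperature 0.2
PARAMETER top_p 0.9
PARAMETER num_predict 128
```

Built and queried with:

```
ollama create timetable -f Modelfile
ollama run timetable "When is the next train from Zürich HB to Oerlikon after 08:00 on Sunday?"
```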
At first, the results were strong. The model returned rows in the expected format.
But with longer generations, it drifted into hallucination, inventing departure times that exist in no real schedule.
The model had learned the style of timetable answers, but not the reality of the schedules.

Deployment and Performance

The most striking part of this experiment is portability. The same fine tuned GGUF model that ran on Linux with an RTX 4070 Ti also ran smoothly on a MacBook Air M4 with 16 GB RAM, using Ollama and Apple’s Metal backend.
The MacBook Air run delivered interactive performance on a fanless ultraportable. The quantized model still represents a 2.7 billion parameter base, but thanks to LoRA only 2.6 million parameters were actually trained and merged. On the MacBook Air, all parameters load in 4 bit precision, making the model efficient enough to serve locally.
The output on the MacBook Air M4 matched the timetable format learned in training, showing that the style transfer worked, although hallucinations remain.
I also plan to test deployment on a Jetson Nano 8 GB. Running a 2.7B parameter model on the Nano will require llama.cpp built with CUDA for ARM and aggressive quantization (Q4 or even Q2). It will be slower than Mac or PC, but still functional. This would prove the principle: train once, run anywhere.

Observation

The model mastered the format but not the facts. It produced realistic looking timetable rows but invented times and continued generating far beyond the expected number of results. This demonstrates a core limitation of generative AI: it predicts plausible sequences, but it does not guarantee factual correctness.

Conclusion

This project confirmed that with LoRA and GGUF, it is possible to fine tune, compress, and run a model locally on consumer hardware. From RTX 4070 Ti to MacBook Air M4 and soon Jetson Nano, the same model runs seamlessly. The experiment also highlighted the importance of matching the tool to the task. LoRA is excellent at style, but timetables require factual recall, better suited to retrieval methods.
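As a sketch of what the retrieval route looks like, the same question becomes an exact lookup over the GTFS subset; this assumes the subset and stops DataFrames from the data-preparation sketch above:

```python
# Deterministic timetable lookup instead of generation (sketch).
def next_departure(subset, stops, origin: str, after: str):
    """First departure from `origin` at or after `after` ('HH:MM:SS')."""
    rows = subset.merge(stops[["stop_id", "stop_name"]], on="stop_id")
    rows = rows[(rows["stop_name"] == origin) & (rows["departure_time"] >= after)]
    return rows.sort_values("departure_time").head(1)

# e.g. next_departure(subset, stops, "Zürich HB", "08:00:00")
```

Unlike the fine tuned model, this can only return rows that actually exist in the feed.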

Next Steps

From a gaming GPU to a MacBook Air and soon a Jetson Nano, the same 2.7B-parameter brain now runs wherever I take it. It still hallucinates, but that is part of the charm: this project is not just about trains, it is about showing how far a hobbyist can go with modern AI tools. Train once, run anywhere.

Experiment Summary

Training setup: Microsoft Phi 2 (2.7 billion parameters, MIT license), LoRA fine tuning of about 2.6 million parameters (0.09 percent) on an RTX 4070 Ti.
Quantization: GGUF Q4_K_M, reducing the checkpoint from about 10 GB (FP16) to around 2 GB.
Ollama run parameters: conservative sampling settings, served locally.
Performance: interactive generation on consumer hardware.
MacBook Air M4 16 GB with Metal backend: ran the quantized model smoothly.
Jetson Nano 8 GB (planned test): llama.cpp built with CUDA for ARM, Q4 or even Q2 quantization.
Final model size: about 2 GB (GGUF Q4_K_M).
