On Automating Data Entry

This entry is all about how I helped my friend automate his data entry job without losing privacy.

Ollama is an open-source framework for running LLMs and SLMs on local computers. Because nothing leaves the machine, it spares users the privacy trade-off that comes with SaaS products (cloud-based providers of LLMs).

Weighed down by a heavy workload, my friend wanted to stop manually editing and transcribing transaction records from physical documents. One day, he asked me whether there was a way to parse scanned, OCR’ed documents and automatically transcribe them into an Excel file. Of course, we could’ve gone with hosted LLM providers like Claude, Gemini, or ChatGPT, but that means handing over sensitive information, and my friend’s not ready to be sentenced for violating NDAs and whatnot! That’s not an option for someone who handles documents with multiple transactions from multiple companies.

I first went with heuristics, such as Tesseract, to run OCR on the scanned and image copies of these tables and turn them into selectable text. It turned out well, but I didn’t know that was where the road ends when it comes to heuristics: OCR makes the text selectable, but someone still has to copy and paste it by hand. That’s still manual! I wanted to get rid of the repetitive task of copying and pasting from the PDF to an Excel spreadsheet.
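For the curious, the OCR step can be sketched roughly like this. It assumes the `tesseract` command-line tool is installed and on the PATH; the function names, the `eng` language, and page segmentation mode 6 (treat the image as one uniform block of text) are my own illustrative choices, not necessarily the exact settings I used.

```python
import subprocess


def ocr_command(image_path: str, lang: str = "eng", psm: int = 6) -> list[str]:
    """Build the Tesseract CLI call (hypothetical helper).

    Passing "stdout" as the output base makes Tesseract print the
    recognized text instead of writing it to a file.
    """
    return ["tesseract", image_path, "stdout", "-l", lang, "--psm", str(psm)]


def run_ocr(image_path: str) -> str:
    """Run Tesseract on one image and return the recognized text."""
    result = subprocess.run(
        ocr_command(image_path),
        capture_output=True,
        text=True,
        check=True,  # raise if Tesseract exits with an error
    )
    return result.stdout
```

This gives you plain text, but nothing in it says which piece of text belongs to which spreadsheet cell, and that gap is exactly where I got stuck.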

Then I remembered local SLMs. My friend owns a consumer-grade gaming GPU that can run models of 1B, 4B, and at most 8B parameters. So I researched models of those sizes and found Gemma 3 and Qwen2.5-VL, both with vision capabilities that beat most heuristic approaches, which is what I needed to make the automation work.

My workflow is this: I instruct an SLM to parse a given screenshot and extract the key information from a table as comma-separated values. This way, I can easily move it over to an Excel file. All on the local machine!
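Here is a minimal sketch of that workflow using only Python's standard library and Ollama's local REST API (`/api/generate` on the default port 11434). The model tag `gemma3:4b`, the prompt wording, the column names, and the function names are assumptions for illustration; swap in whatever model you pulled and whatever columns your tables have.

```python
import base64
import json
import urllib.request

# Ollama's default local endpoint; nothing here leaves the machine.
OLLAMA_URL = "http://localhost:11434/api/generate"

# Hypothetical prompt and column names; adjust them to your own tables.
PROMPT = (
    "Extract the table in this image as CSV. "
    "Use the header: date,description,amount. "
    "Output only the CSV rows, with no commentary."
)


def build_payload(image_bytes: bytes, model: str = "gemma3:4b") -> dict:
    """Build the JSON body for a single, non-streaming vision request."""
    return {
        "model": model,
        "prompt": PROMPT,
        # Ollama expects images as base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
        # A low temperature keeps the transcription as repeatable as possible.
        "options": {"temperature": 0},
    }


def extract_csv(image_path: str, model: str = "gemma3:4b") -> str:
    """Send one screenshot to the local model and return its raw text reply."""
    with open(image_path, "rb") as f:
        payload = build_payload(f.read(), model)
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

With Ollama running and the model pulled, `extract_csv("page1.png")` would return the model's reply as a string, which is hopefully CSV and nothing else.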

While the output of these small models is not perfect and still needs manual fixes (and perhaps tuning), my friend no longer has to map PDF text to spreadsheet cells by hand. His job now is to put each cell from the CSV file in its proper location in the Excel equivalent.
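Since the output needs checking anyway, a small script can point out which rows to look at first. The sketch below is one possible approach, not a finished tool: it strips the Markdown code fences that models sometimes wrap around their answer and separates rows with the expected number of columns from rows that need a manual look. The function name and the sample data are made up for illustration.

```python
import csv
import io

FENCE = "`" * 3  # Markdown code-fence marker that models often add


def parse_model_csv(raw: str, expected_columns: int):
    """Split model output into well-formed rows and rows to review by hand."""
    # Drop fence lines so only the CSV text itself remains.
    lines = [
        line for line in raw.strip().splitlines()
        if not line.strip().startswith(FENCE)
    ]
    good, suspect = [], []
    for row in csv.reader(io.StringIO("\n".join(lines))):
        if not row:
            continue  # skip blank lines
        if len(row) == expected_columns:
            good.append(row)
        else:
            suspect.append(row)
    return good, suspect


# Made-up sample reply with one incomplete row.
sample = "\n".join([
    FENCE + "csv",
    "date,description,amount",
    "2024-01-05,Office supplies,120.50",
    "2024-01-06,Courier",
    FENCE,
])
good_rows, suspect_rows = parse_model_csv(sample, expected_columns=3)
print(len(good_rows), "rows look fine;", len(suspect_rows), "need a manual check")
```

A column count only catches structural slips; a misread digit still looks like a perfectly valid row, so a human pass over the numbers remains necessary.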

Indeed, Nvidia is correct¹: small language models are the future. You don’t need all of humanity’s knowledge to perform tasks that help reduce repetition!

In the end, my friend was fascinated with these SLMs, and he is now interested in learning Python scripting and prompt engineering. I did save him hours, if not days, of manually entering data from these scanned documents into a spreadsheet, but he’s not saved from the curiosity I brought him! LOL

Footnotes

  1. P. Belcak, G. Heinrich, S. Diao, Y. Fu, X. Dong, S. Muralidharan, Y. C. Lin, and P. Molchanov, “Small Language Models are the Future of Agentic AI,” arXiv preprint arXiv:2506.02153, 2025. [Online]. Available: https://arxiv.org/abs/2506.02153. doi: 10.48550/arXiv.2506.02153