Denes Tornyi
About
I thrive in interdisciplinary environments, as my goal is to integrate different fields of art and technology by making connections. I believe that human development is driven more by emotion than by technology alone. I wish to believe in a society where people are not just building blocks, but individuals who actively take part in the system they form.
Born in 1996. My earliest exposure to technology was the single computer in our household, which did not have an internet connection. I spent my days playing video games and exploring the file system, clicking on every installed executable and daydreaming about a database engine that could link documents and render text, images, and videos seamlessly. During the summer of 2006, I stumbled upon the digital manual of the video game "StarCraft" written in HTML3 and was astonished by the rendering capabilities of Internet Explorer 6. Despite having no knowledge of English and no formal training, I learned HTML by dissecting the manual's source code.
Since that summer, my passion for learning has only grown stronger. In recent years, I have experimented with various technologies, including a WebRTC-powered peer-to-peer 2D video game engine written in TypeScript, an artistic video game exploring the feeling of solitude built in Godot 3, and a weekend project creating an OCR-powered screen translator using a DirectX hook, among many other ventures.
In my career, I moved from writing business logic to designing data pipelines. I am curious about the hidden structure of language models. I question whether contextual vectors might be sums of many individual features, and whether links between concepts are already inherent in the latent vector space. I am spending my time studying and thinking about these questions as I wish to contribute to research in this area.
References
2024 Mar –
Nokia Solutions and Networks Kft.
-
Architected and implemented a Go-based retrieval-augmented generation (RAG) pipeline that begins with multi-threaded scraper services responsible for synchronizing data from external APIs into a centralized PostgreSQL database. Services do not continuously watch for changes: each runs on its own schedule (depending on whether its dependencies are met). Processor services read raw data from PostgreSQL and produce Markdown documents (e.g. by converting HTML to Markdown). An IDF-based document validation layer vets the generated Markdown before it proceeds to the next stage. A chunking service partitions documents into discrete fragments optimized for embedding. Two embedding services generate vector representations: a dense embedding service leveraging Hugging Face TEI and a sparse embedding service powered by a BM25 ranking model (implemented from scratch) along with a single-pass wordification and TF-IDF dictionary-building algorithm. An indexer service exports embeddings from PostgreSQL and ingests them into Qdrant and Milvus vector databases to enable efficient retrieval. The retrieval service performs hybrid searches by fusing the two vector spaces to deliver precise, contextually relevant results. The entire solution is deployed via Docker Compose and uses HTTP/1.1 and SQL for inter-service communication.
In addition to the main project, conducted an experiment on an architecture based around PostgreSQL. The idea was to use pgrx to write extensions in Rust, reducing the HTTP/1.1 and SQL communication overhead between services and integrating everything (chunking, embedding, retrieving) into a single monolithic database process. The experiment relied on fastembed and pgvector; the latter proved slower in search than Milvus and Qdrant.
Collaborate with various departments to reduce internal chaos within Nokia and create an internal project space for sharing all research, contributing many packages. Actively mentor junior colleagues.
2023 Aug – 2024 Jan
ScoutinScience B.V.
-
Designed and implemented a matchmaking system to link Dutch companies with university graduates by processing company websites and students' theses using LLMs (second-generation Llama and first-generation Mistral models) and sentence embedding technologies (SBERT). The solution consisted of four components: a data pipeline written in Python; a .NET-based back-end; a ReactJS-based front-end; and an SQL database storing raw and processed data. Besides the main project, conducted code reviews, refined the company's CI/CD, and refactored the existing user management architecture.
2022 Mar – 2023 Jun
Nokia Solutions and Networks Kft.
-
Designed and implemented (in Go using the K8s Operator SDK) an update orchestrator and an update receiver service to handle firmware rollouts for customer-owned hardware. Developed a MinIO-based firmware storage service with synchronization between regional data centers, along with client-side tools for managing firmware entries. Mentored junior colleagues.
2018 Mar – 2021 Feb
Accedo Broadband HU Kft.
-
Contributed to the development of OTT client-side applications for WebOS, Tizen, and Web platforms (for OSN, NRK, ProSieben, Deutsche Telekom, and HBO) using JavaScript with an internally developed library and ReactJS.
Designed and implemented a real-time third-party API validator middleware between the application and the network layer; a Docker-powered CI/CD toolset for Tizen and WebOS; and an internal service running on AWS to synchronize Salesforce time tracking with the BambooHR calendar.
2014 Jul – 2015 May
Youwon Hungary Kft.
-
Contributed to the development of an online second-hand marketplace using TypeScript and AngularJS on the client side, C# and ASP.NET Web API 4 on the server side, and Entity Framework 6 for database management with Apache Lucene indexing.
Designed and implemented a telecommunication service using WebRTC and SignalR on top of the existing architecture, enabling audio/video calls, managing message history, and synchronizing ongoing conversation tabs across devices.
Languages
Hungarian native
English fluent