ML Software Engineer – Python | Go | Large Language Models | AI Serving Infrastructure | English-speaking | EU Work Permit Required
At Nebul, we’re building Europe’s sovereign AI cloud infrastructure — trusted, secure, and optimized for the future of generative AI. We’re looking for an ML Software Engineer to help design and build the next-generation serving infrastructure for LLMs. You will work at the intersection of machine learning and systems programming, making AI inference faster, more reliable, and scalable across our NVIDIA-powered platforms.
If you’re passionate about building performant AI serving systems, writing production-grade Python and Go, and want to help architect our own custom serving engine — this role is for you.
What You’ll Do
You will design, implement, and optimize serving pipelines and runtimes for large language models and generative AI workloads. Your work will focus on writing robust Python and Go code to build efficient, scalable APIs and backend systems for low-latency inference. You will contribute to the architecture of Nebul’s proprietary LLM serving engine, exploring ways to reduce memory footprint, improve throughput, and handle multi-tenant requests securely and reliably. You’ll collaborate closely with our AI research, platform, and DevOps teams to bring cutting-edge models into production at scale.
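To give a flavor of the throughput work described above, here is a minimal Python sketch of dynamic micro-batching, one common technique in LLM serving: concurrent requests are collected into small batches before hitting the model, trading a few milliseconds of latency for better accelerator utilization. All names here (`MicroBatcher`, `run_model`) are illustrative and not part of Nebul's actual engine; a real implementation would sit behind an async API layer.

```python
import threading
import queue
import time

class MicroBatcher:
    """Collect concurrent requests into micro-batches before invoking
    the model, improving throughput at a small latency cost.
    `run_model` is a stand-in for a real batched inference call."""

    def __init__(self, run_model, max_batch=8, max_wait_ms=5):
        self.run_model = run_model
        self.max_batch = max_batch
        self.max_wait = max_wait_ms / 1000.0
        self.q = queue.Queue()
        threading.Thread(target=self._loop, daemon=True).start()

    def submit(self, prompt):
        # Called from request-handler threads; blocks until the
        # batching loop has produced this request's output.
        done = threading.Event()
        slot = {"prompt": prompt, "done": done, "out": None}
        self.q.put(slot)
        done.wait()
        return slot["out"]

    def _loop(self):
        while True:
            # Block for the first request, then wait up to max_wait
            # for more requests to fill the batch.
            batch = [self.q.get()]
            deadline = time.monotonic() + self.max_wait
            while len(batch) < self.max_batch:
                timeout = deadline - time.monotonic()
                if timeout <= 0:
                    break
                try:
                    batch.append(self.q.get(timeout=timeout))
                except queue.Empty:
                    break
            outs = self.run_model([s["prompt"] for s in batch])
            for slot, out in zip(batch, outs):
                slot["out"] = out
                slot["done"].set()

# Toy "model" that uppercases prompts, batch-at-a-time.
batcher = MicroBatcher(lambda prompts: [p.upper() for p in prompts])
```

A production variant would add per-request timeouts, bounded queues for backpressure, and per-tenant fairness, which is exactly the kind of design space this role owns.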
What We’re Looking For
You have 3+ years of experience as a software engineer with strong skills in Python and Go. You understand how to work with AI/ML systems at the infrastructure level — from optimizing runtimes to working with APIs and orchestration layers. You’re a systems thinker who can translate abstract performance requirements into production-ready software. Experience with modern model serving frameworks and GPU-accelerated inference environments is highly desirable.
Bonus If You Have
- Hands-on experience optimizing LLM inference performance
- Knowledge of NVIDIA GPU architectures and CUDA optimization
- Familiarity with containerized infrastructure (Docker, Kubernetes)
- Experience developing custom inference runtimes or AI APIs
- Exposure to observability and telemetry for large-scale serving environments
Why Nebul?
- Help build Europe’s most advanced AI serving infrastructure from the ground up
- Work directly on cutting-edge generative AI workloads (LLMs, multimodal models)
- Join a mission-driven team blending AI, systems engineering, and cloud-native tooling
- Flexible hybrid setup, based near The Hague and Amsterdam
- Competitive compensation, equity options, and fast career growth opportunities
Must Haves
- Based in the Netherlands (commute to The Hague)
- Valid EU work permit (no sponsorship available)
- Fluent in English (Dutch not required)
Ready to help us design the serving infrastructure behind Europe’s sovereign AI cloud?
Apply now via Frank Poll and join Nebul in building fast, reliable, and scalable AI platforms tailored to LLM workloads.