Why Join Us?
- Collaborate with a powerhouse team of marketing professionals from top industry players such as Lensa, Picsart, Viber, AIRI, and Yandex.
- Benefit from the guidance of investors with a history of successful exits, including the sale of Looksery and AI Factory to Snap for $150M and $166M, respectively.
- Be part of a rapidly growing company with $50M ARR and 250K+ happy customers across the US and Europe.
- Engage in innovative AI-driven projects in a dynamic and fast-paced startup environment.
About the Role
As an Inference Engineer at GlamAI, you will be responsible for optimizing neural networks in real production environments — from profiling and performance analysis to leveraging existing solutions and implementing custom ones when needed. If you've actually made models faster and enjoy working at the intersection of high-level ML and low-level GPU performance, we'd love to talk.
Key Responsibilities
- Profile, benchmark, and identify performance bottlenecks in neural network inference pipelines
- Port, adapt, and optimize models for on-device inference (latency, memory, battery, thermal stability)
- Optimize server-side inference for throughput and cost efficiency
- Collaborate with ML researchers to co-design model architectures with inference efficiency in mind
Qualifications
Experience:
- Experience in deep learning inference optimization (mobile or edge)
- Hands-on with at least one of:
  - Core ML / TFLite / ONNX Runtime / TensorRT
  - Metal / Vulkan / OpenCL / OpenGL / CUDA / Triton
Technical Skills:
- Strong understanding of GPU/NPU architecture and execution model
- Solid grasp of inference optimization techniques: quantization, operator fusion, graph optimization
Benefits
- Competitive salary and leadership growth opportunities.
- Fitness benefits.
- Opportunity to work on innovative AI-based applications.
- Supportive, fast-paced startup environment with a strong engineering team.