Why Join Us?
- Collaborate with a powerhouse team of marketing professionals from top industry players such as Lensa, Picsart, Viber, AIRI, and Yandex.
- Benefit from the guidance of investors with a history of successful exits, including the sale of Looksery and AI Factory to Snap for $150M and $166M, respectively.
- Be part of a rapidly growing company with $50M ARR and 250K+ happy customers across the US and Europe.
- Engage in innovative AI-driven projects in a dynamic and fast-paced startup environment.
About the Role
As an Inference Engineer at GlamAI, you will be responsible for optimizing neural networks in real production environments — from profiling and performance analysis to leveraging existing solutions and implementing custom ones when needed. If you've actually made models faster and enjoy working at the intersection of high-level ML and low-level GPU performance, we'd love to talk.
Key Responsibilities
- Profile, benchmark, and identify performance bottlenecks in neural network inference pipelines
- Port, adapt, and optimize models for on-device inference (latency, memory, battery, thermal stability)
- Optimize server-side inference for throughput and cost efficiency
- Collaborate with ML researchers to co-design model architectures with inference efficiency in mind
Qualifications
Experience:
- Experience in deep learning inference optimization (mobile or edge)
- Hands-on with at least one of:
  - Core ML / TFLite / ONNX Runtime / TensorRT
  - Metal / Vulkan / OpenCL / OpenGL / CUDA / Triton
Technical Skills:
- Strong understanding of GPU/NPU architecture and execution model
- Solid grasp of inference optimization techniques: quantization, operator fusion, graph optimization
Benefits
- Competitive salary and leadership growth opportunities.
- Fitness benefits.
- Opportunity to work on innovative AI-based applications.
- Supportive, fast-paced startup environment with a strong engineering team.