Principal Engineer, Compute Platform
🇺🇸 Pinterest
Job Description
About Pinterest: Millions of people around the world come to our platform to find creative ideas, dream about new possibilities and plan for memories that will last a lifetime. At Pinterest, we’re on a mission to bring everyone the inspiration to create a life they love, and that starts with the people behind the product. Discover a career where you ignite innovation for millions, transform passion into growth opportunities, celebrate each other’s unique experiences and embrace the flexibility to do your best work. Creating a career you love? It’s Possible.

At Pinterest, AI isn't just a feature, it's a powerful partner that augments our creativity and amplifies our impact, and we’re looking for candidates who are excited to be a part of that. To get a complete picture of your experience and abilities, we’ll explore your foundational skills and how you collaborate with AI. Through our interview process, what matters most is that you can always explain your approach, showing us not just what you know, but how you think. You can read more about our AI interview philosophy and how we use AI in our recruiting process here.

Pinterest serves over 600 million users through sophisticated visual and social capabilities which connect inspiration, advertisement, and shopping. Compute Platform provides the underlying compute capabilities to run jobs and processes for all of the systems and workloads needed behind the scenes to create the best experience for our users and advertisers. This includes distributed processing, data systems, search, experimentation, monetization, AI/ML for ranking and recommendations, GenAI, and internal systems.

We are looking for a Principal Engineer who can lead and scale the consolidation and modernization of this infrastructure under what we call PinCompute, with an emphasis on some of the largest and most challenging stateful workloads, as well as GPU-heavy AI workloads.
The scale and scope of the effort will require designing and building around Kubernetes and solving its scaling limitations, handling stateful systems and data-intensive workloads, formalizing mechanisms to stack and bin pack workloads, working with multiple internal customers and giving them migration paths, and working through ambiguous and unforeseen situations which arise from workload requirements, production and operability requirements, and unique multi-tenancy challenges.

What you'll do:

Solving the challenges of replacing isolated pools of dedicated compute resources with a very large scale shared compute platform, shifting from machine-based designs to container-based designs.

Working with leads across various platforms, especially stateful and data platforms, to buil