About Grainger:
Grainger is a broad line distributor with operations in North America, Japan and the United Kingdom. We achieve our purpose, We Keep the World Working, by serving more than 4.5 million customers with multiple products that keep their operations running and their people safe. Grainger also delivers services and solutions, such as technical support and inventory management, to save customers time and money.
We're looking for passionate people who can move our company forward. As one of the 100 Best Companies to Work For, we have a welcoming workplace where you can build a career for yourself while fulfilling our purpose to keep the world working. We embrace new ways of thinking and recognize everyone is an individual. Find your way with Grainger today.
Position Details:
The Infrastructure Engineer specializes in managing AWS-hosted Kubernetes (K8s) platforms engineered primarily for machine learning (ML) training, experimentation, and serving. You are tasked with ensuring a robust and scalable infrastructure that supports advanced ML workloads. Additionally, you will be responsible for the implementation and management of our monitoring ecosystem using Grafana, Loki, Prometheus, and Thanos, as well as maintaining continuous deployment via GitOps best practices with ArgoCD and Flux.
You build, test, implement, configure, tune, and support the Kubernetes infrastructure in the Cloud, including server platforms, storage systems, middleware infrastructure, network, and client technologies.
You carry out the physical design, implementation, and support of major automation solutions in a multiplatform environment, make recommendations for improved usability of automated tools, and identify opportunities for increased adoption of orchestration technologies.
At Grainger, our team members have an opportunity to work in one of the largest and most complex SAP-centric, 24x7 e-commerce environments and gain knowledge and experience with many SAP and other application modules running on-prem and in the Cloud. Our Machine Learning Operations team is seeking an experienced Platform or Site Reliability Engineer to support the ML Platform.
You Will:
You Have:
Rewards and Benefits:
With benefits starting day one, Grainger is committed to your safety, health and wellbeing. Our programs provide choice and flexibility to meet our team members' individual needs. Check out some of the rewards available to you at Grainger.
DEI Statement:
We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender, gender identity or expression, or veteran status. We are proud to be an equal opportunity workplace.