Tech Lead - Data Infrastructure / Platform
| Hours | Full-time, Part-time |
|---|---|
| Location | Austin, Texas |
About the Team
This team builds and operates the data processing infrastructure that supports large-scale simulation, machine learning, and algorithm development for autonomous systems. The platform ingests, stores, and processes massive volumes of sensor data (camera, lidar, radar, and other modalities) generated by real-world operations and simulation.
The work involves petabyte-scale datasets, low-latency access patterns, and compute-intensive pipelines. The infrastructure supports teams across simulation and ML, enabling experimentation, training, and evaluation with algorithms similar to those deployed on the autonomous systems themselves.
About the Role
We are seeking a Technical Lead to own and evolve a large-scale data processing and compute platform used by Simulation and ML teams (including Perception, Prediction, and Planning). You will define the technical vision, make architectural decisions, and build a robust, scalable platform expected to grow significantly over the next year.
This role combines hands-on engineering with technical leadership. You will write production code, prototype solutions, guide system design, and help grow and mentor an infrastructure team. You will also collaborate closely with downstream users to ensure the platform meets real-world performance, reliability, and usability needs.
What You’ll Do
- Own the compute platform: Lead the design and operation of the data processing infrastructure supporting simulation and ML workloads across autonomous systems
- Scale for massive data: Architect solutions for petabyte-scale datasets and high-throughput, low-latency processing of sensor data (camera, lidar, radar)
- Build and evolve systems: Prototype, develop, and refine distributed compute solutions, validating them in real production workflows
- Lead technically and organizationally: Set direction, guide architecture, and help grow an infrastructure team while remaining hands-on
- Partner across teams: Work closely with simulation, perception, prediction, and planning teams to understand requirements and align platform capabilities
- Evaluate and integrate tools: Assess and integrate open-source and internal technologies (e.g., Spark, Ray, Beam, Airflow, Argo, Kubernetes)
- Enable adoption: Produce clear documentation, examples, and best practices to drive platform usage and developer productivity
What You’ll Need
- Strong proficiency in Python; solid C++ experience is highly desirable
- Hands-on experience with distributed systems and large-scale data processing
- Familiarity with frameworks and orchestration tools such as Apache Spark, Ray, Apache Beam, Airflow, Argo Workflows, and Kubernetes
- Experience designing and operating data platforms that support ML pipelines and large compute workloads
- Proven technical leadership in architecture, system design, and mentoring engineers
- Ability to operate at both the system-wide “big picture” level and deep in implementation details
Nice to Have
- Background in ML/AI systems or experience building ML pipelines at scale
- Experience working with autonomous vehicle or robotics sensor data
- Experience optimizing cost, performance, and reliability in large distributed computing environments