Description & Requirements
Bloomberg's internal and enterprise compute and data analytics solutions are being established to support development efforts around data-driven compute, machine learning, and business analytics! As we center our data-analytical efforts around a dataset-first approach, we work to unify our ingestion and exploration solutions around the data we wish to manage. Compute Analytics is a critical piece within the Data Analytics ecosystem, in charge of providing a fully managed data querying and exploration product. The solutions we build, using containerization and cloud-native architecture, aim to provide scalable compute, specialized hardware, and first-class support for a variety of workloads such as Trino, Superset, Spark, and Jupyter!
As the needs of distributed compute, machine learning, and data analysis advance, so do the needs of the compute solutions that underpin them. Accentuated by the widespread success of large language models and AI initiatives across Bloomberg, these solutions are poised for continued growth to accommodate the many products across Bloomberg that rely on a robust compute environment. Highlights from the Compute Analytics upcoming roadmap focus on creating a highly scaled and performant compute solution that abstracts away common requirements that appear across many use cases (like compute federation and dynamic resource management); a dataset-first approach that introduces a unified catalog product with fully managed data governance, data lineage, and an extensible data metastore; and closer integration with our ingestion solutions, which are powered by Spark and Flink.
As a member of the Trino and Catalog Engineering Team, you'll have the opportunity to make key technical decisions to keep these solutions moving forward. Our team makes extensive use of open source software (e.g., Trino, Kubernetes, Istio, Envoy, Buildpacks, Superset, Iceberg, and Jupyter) and is deeply involved in a number of communities. We collaborate widely with the industry, contribute back to open source projects, and even present at conferences. While working on this solution, the backbone for many of Bloomberg's up-and-coming products, you will have the opportunity to collaborate with engineers across the company and learn about the technology that delivers products from the news to financial instruments. If you are a software engineer who is passionate about building resilient, highly available infrastructure and seamless, usable full-stack solutions, we'd like to talk to you about an opening on our team.
In the first few months on the team, you will work on enabling resilient federated queries on the company's new managed Trino platform using open source solutions like Trino Gateway. On the team's longer-term roadmap, you will have an opportunity to work closely with the open source Iceberg community to deploy a managed lakehouse solution that integrates seamlessly with our Trino offering.
WE'LL TRUST YOU TO:
- Interact with data engineers and data scientists across the company to assess their development flow and scale requirements
- Solve complex problems such as cluster federation, compute resource management, and public cloud integration
- Build first-class observability in a cloud-native way that provides insights that our users need
- Educate users through tech talks, professional training, and documentation
- Collaborate across data engineering teams on proper use/integration of our platform
- Tinker at a low level and communicate your work at a high level
- Research, architect and drive sophisticated technical solutions, consisting of multiple technologies
- Mentor junior engineers and be a strong engineering voice who takes charge of driving part of Trino's technical vision
YOU'LL NEED TO HAVE:
- 4+ years of programming experience with at least 2 object-oriented programming languages (e.g., Go, Python, Java) and a willingness to learn more as needed
- A degree in Computer Science, Engineering, or a similar field of study, or equivalent work experience
- Experience building and scaling container-based systems using Kubernetes
- Experience with distributed data analytics frameworks (e.g., Trino, Presto, Spark)
- Ability to keep up with open source tech and trends for data analytics
- A passion for providing reliable and scalable enterprise-wide infrastructure
WE'D LOVE TO SEE:
- Experience with Kubebuilder and Kubernetes operator-based frameworks
- Experience working with platform security standards such as SPIFFE and SPIRE
- Open source involvement such as a well-curated blog, accepted contribution, or community presence
- Experience operating production systems in the public cloud (e.g., AWS, GCP, or Azure)
- Experience with continuous integration tools and technologies (e.g., Jenkins)
- Experience with Data Lakehouse Open Table Formats (e.g. Iceberg, Hudi)
Salary Range: $160,000 - $240,000 USD annually + Benefits + Bonus
The referenced salary range is based on the Company's good faith belief at the time of posting. Actual compensation may vary based on factors such as geographic location, work experience, market conditions, education/training and skill level.
We offer one of the most comprehensive and generous benefits plans available and offer a range of total rewards that may include merit increases, incentive compensation (exempt roles only), paid holidays, paid time off, medical, dental, vision, short- and long-term disability benefits, 401(k) with match, life insurance, and various wellness programs, among others. The Company does not provide benefits directly to contingent workers/contractors and interns.