SDE III | Database Reliability Engineer (PostgreSQL)
Who we are:
grofers is leading the charge in transforming India’s vast, unorganised grocery landscape through cutting-edge technology and innovation. We believe every Indian deserves the opportunity to continually improve their life – a process that often begins at home. As part of our mission of helping consumers make healthier, better choices when buying everyday products, we make a wide range of high-quality grocery and household products accessible, affordable, and available right at their doorsteps.
Built on a proprietary technology stack, the grofers platform serves as a convergence of consumers looking for everyday essentials, partner stores who serve their needs efficiently, and manufacturers looking for a channel to reach a nation of consumers. While our technology caters to the burgeoning population of urban India, it is ready and poised to serve the next 100+ million Indians who are yet to start shopping online.
We believe the ecosystem we power can transform the lives of a billion Indians significantly over the coming decade. They will have access to everyday essentials like groceries at the best value, be able to discover products that improve their health and wellbeing, and spend more meaningful time with their families – with the assurance that their essential needs are being looked after by us. On the other side of this virtuous cycle are the millions of local businesses catering to a nation’s needs, helping create more opportunities for employment, growth, and above all, a better life.
It's a $600 Billion challenge to solve, which is why we are looking at hiring smart, articulate and ambitious individuals to be a part of the team building the future at grofers. If this seems exciting to you, join us! Read more about us here.
Why you will love working with us:
- Customer love: We always put the interests of customers ahead of our own. We work hard to earn and keep their trust, and to bring them delight
- Bias for action: We dream big, take risks and have a strong bias for action. In difficult situations we make sound decisions and take thoughtful action
- Frugality: We are always looking for ways to do more with less - by creating the highest leverage possible with our time, as well as resources
- Confidence: We are tenacious and optimistic, and do not take no for an answer. Our people are quietly confident and openly humble
- Challenge status-quo: We are candid, authentic and transparent. We speak our mind, make connections that others miss and take smart risks
- Learner’s mindset: We keep learning and evolving to be able to meet our audacious goal of empowering every Indian to lead a better life
About the Infrastructure team:
The grofers technology platform comprises hundreds of microservices written using a polyglot stack and built by over fifteen different engineering teams. The infrastructural needs of these microservices and teams vary and are evolving rapidly with our growing business. From secure inter-service communication to resilience built into our microservices from the ground up, there are multiple touch points for the infrastructure platform to simplify adoption of engineering practices that enable us to deliver high quality software while not losing our agility and speed.
The Infrastructure team is responsible for building the infrastructure platform as a product that we provide to engineers at grofers, including Kubernetes, Prometheus, Jaeger, Grafana, Consul, Vault, Postgres, Kafka, Redis, etc. These products make the lives of engineers easy so that they can focus on business value and adopt DevOps practices easily.
About the role:
As a database reliability expert, you will be responsible for maintaining a healthy database infrastructure and database usage practices for optimal performance, reliability, cost, security and compliance.
You will be collaborating closely with engineers on the infrastructure team as well as product engineering teams (owners of microservices and their databases) to understand their usage of Postgres, help them optimize the database infrastructure, and recommend fixes in applications. You will be expected to build solutions in collaboration with developers to help them meet their uptime requirements. This may involve advising on schema design decisions, reviewing changes in database configuration, reviewing existing usage of databases using monitoring tools to identify performance bottlenecks, analyzing the architecture for availability and scalability, building disaster recovery plans, securing databases against unauthorized changes and access to sensitive data, and helping drive production incidents to closure.
Another responsibility for this team is to build monitoring and observability tools that help developers identify incidents caused by databases quickly and accurately, thus reducing MTTR for production incidents.
We’re looking for people who have been developers and have a strong background and interest in systems and databases. We’d love to hear from you whether you’re a seasoned database admin, or whether you’ve just learned you might like working with databases.
What you will do:
- Work on database reliability and performance aspects for all of grofers' products
- Analyze solutions and implement best practices for our main PostgreSQL database cluster and its components.
- Build tools for observability and monitoring of our database to lower the impact of production incidents
- Work with peer engineers to roll out changes to our production environment and help mitigate database-related production incidents.
- Provide on-call support to the team. Support and debug database production issues across services and levels of the stack.
- Provide database expertise to engineering teams (for example through reviews of database migrations, queries and performance optimizations). Scale database engineering as a practice in other engineering teams.
- Work on automation of database infrastructure and help engineering succeed by providing self-service tools.
- Plan the growth of grofers' database infrastructure by evaluating novel SQL as well as NoSQL solutions specific to varying business needs.
- Make monitoring and alerting fire on symptoms rather than on outages.
- Drive DevOps culture in the tech organization by working with engineering and product teams.
- Own our database ecosystem. Take charge of planning the roadmap for improving usage of databases and work with all the teams to continuously improve this ecosystem.
EXPERTISE AND QUALIFICATIONS
What you need:
- 6-10 years of software engineering experience.
- At least 2 years of Infrastructure development and operations experience, particularly with Postgres.
- Experience in maintaining internet-facing production-grade applications in cloud environments.
- A solid understanding of SQL and PL/pgSQL.
- A solid understanding of the internals of PostgreSQL.
- Strong data modeling and data structure design skills
- Some backend experience with a modern programming language (such as Python, Ruby, Golang, Java, etc.) and a web development framework (such as Rails, Django, Flask, Spring, etc.). It is important to us that you have some experience of building applications.
- Experience in solving problems and working with a team to resolve large-scale production issues.
- Experience in Unix and/or Linux system administration.
- Experience with Infrastructure-as-Code and configuration management, deployment and orchestration technologies (such as Terraform, Ansible, Puppet, Chef, Docker). We are big on Terraform and Ansible.
- Experience with cloud platforms such as AWS, Azure or GCP. We use AWS.
- Experience with setting up data pipelines, managing ingestion of batch / real-time data flow, configuring databases for analytical workloads.
- Experience setting up reliable databases, designing disaster recovery procedures, and defining RTO/RPO targets.
- Proficiency with Git or a similar version control system.
Good to have:
- Experience with a distributed datastore (such as RabbitMQ, Kafka, Redis, Elasticsearch, Cassandra, etc.)
- Experience working with data lakes and data warehouses.
- Experience with containers and container orchestration systems (such as Kubernetes, Docker Swarm, etc.) and cloud-native technologies (such as Helm, Skaffold, Draft, Telepresence, Jenkins X, etc.)
- Contributions to open source (however basic they might be).
Excited? You will be, once you visit our Engineering Blog where you can deep dive into all the cool stuff that our engineers have been working on.