Three months ago I started leading Vinted’s new SRE (Site Reliability Engineering) team. What’s new about it, you ask? Well, the previous lead left and the team was merged with two developers who work on our data warehouse - which has now become an SRE responsibility. Our head of engineering put his trust in me - a backend engineer at Vinted at the time - to lead this new team.
The first thing we got to do was set OKRs for the next quarter. Then we set to work.
Three months have passed.
I’m sharing an outline of the speech I gave at our off-site in Labanoras. We went there to reflect on the past and plan for the future.
I was worried as hell when I came to manage this team.
- The previous lead had great Linux and system administration skills, while I had little.
- My teammates were all more experienced in their respective domains than I was. This turned out to work especially well.
I am very happy with the results of our work. We have fewer people with more responsibilities, we have bigger challenges ahead than ever, and we’re doing alright.
Our first Key Result this quarter was uptime - all of our portals were up more than 99.9% of the time, except for Kleiderkreisel, which was at exactly 99.9%. This includes the scheduled maintenance in which we brought one of our Data Centers down to test emergency procedures and look for weak spots in our infrastructure.
- Vaidotas led the migration from Nutcracker - which was indeed cracking - to Mcrouter to stabilize our cache. This resulted in more stable and reliable deployments. It was a huge help to our uptime target, since we do many deploys a day and we did not want to slow our developers down.
- Paulius led the resilience and failover test of our three-Data-Center setup - which involved taking one Data Center offline. We discovered some things to fix and gained confidence in our systems.
- 99.9% uptime for Kleiderkreisel is the best result we could have had. We hit the target set for us by the business and we used every opportunity to move fast and innovate, using up the remaining 0.1% as our downtime budget. This is exactly what the excellent SRE book recommends.
- We worked more with Vinted’s backend engineers. Thanks to Tomas’ and Karolis’ mentoring and education during and after our Failboat Captain initiative, we were able to get our MySQL slow queries under control. This improved the mutual understanding of needs between us and the developers.
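To make the 0.1% budget concrete, here is a minimal sketch of the arithmetic. It assumes a 90-day quarter, and the function names are mine for illustration, not something we actually run. Under that assumption, a 99.9% target leaves roughly 129.6 minutes of downtime per quarter to spend on maintenance, tests and risky changes.

```python
def downtime_budget_minutes(uptime_target: float, period_days: int = 90) -> float:
    """Minutes of downtime an uptime target allows over a period.

    uptime_target is a fraction (0.999 means 99.9%).
    period_days defaults to 90, an assumed length for one quarter.
    """
    if not 0.0 <= uptime_target <= 1.0:
        raise ValueError("uptime_target must be a fraction between 0 and 1")
    total_minutes = period_days * 24 * 60
    return total_minutes * (1.0 - uptime_target)


def budget_remaining_minutes(
    uptime_target: float, downtime_so_far_minutes: float, period_days: int = 90
) -> float:
    """Budget left after the downtime already spent (negative means overspent)."""
    return downtime_budget_minutes(uptime_target, period_days) - downtime_so_far_minutes


if __name__ == "__main__":
    budget = downtime_budget_minutes(0.999)
    print(f"Downtime budget at 99.9% over 90 days: {budget:.1f} minutes")
```

Landing at exactly 99.9% means the remaining budget is about zero: nothing wasted, nothing overspent.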
Second Key Result - a 20% cut in infrastructure costs. We started reducing costs rather than increasing them. While we haven’t hit the exact numbers we were aiming for, our efforts amounted to a 10% cost decrease, with some costs shifted to other teams that can actually influence them.
- We started to pay closer attention to business needs when evaluating what to buy and what to build.
- We understand the bills and contracts with our partners much better, which allows for more efficient capacity planning.
- We got rid of costs that were assigned to us but that we could not do much about.
- We renegotiated contracts with our main partners and got better deals.
- We returned a bunch of expired hardware and are running our systems on fewer, more powerful servers.
Vaidotas left our team, while Karolis and Lech joined us from our Data Warehouse. In the next three months, we’ll be hiring a new colleague. I understand our responsibilities and the strengths of my team better now, which makes me comfortable hiring a new person - we are looking for a performance-oriented engineer to join us.
It’s very important to give people actual responsibilities and the means to implement solutions that meet these responsibilities. Today I’m very happy to say that I can fully trust every member of my team in their domains - they are doing their jobs better than I could.
Recently I started paying much more attention to my habits. I began by reading books and watching TED talks. I realized that there are only 3 ways to change yourself: by having an epiphany, by changing your surroundings or… by taking small steps. Today, I’m working on a few little habits. In the morning - the Maui habit, making my bed and drinking a teaspoon of fish oil. In the evening - writing one sentence in my log. I usually write 5 or more sentences. The morning habits are there to cheer me up for the upcoming day and the evening one is for reflection. This continues to help me understand myself better, see what annoys me, control my emotions and focus better. This is the best investment in myself that I have made in the last year or so.
I discovered how important belief is in our lives while reading “The Power of Habit”. A lot of people associate belief with belief in a deity. I’m talking about belief per se. I believe in Vinted - I see it as our generation of Lithuanians building a great startup like those in San Francisco. This does remind me of a complex I think we Lithuanians have around recognition - and makes me smile a little.
Reading “7 Habits of Highly Effective People” led me to name humility a more important value than pride.
“Metaphors We Live By” was the most entertaining and thought-provoking read. I learned about two important metaphors that are quite ingrained in our culture.
- Argument as War. This is seen in how people talk about a discussion - there are winners and losers, you can destroy someone’s argument. You look for a killer argument. You change strategy. This is war rhetoric, not discussion rhetoric. What does a winner win in an argument? At most, an ego boost. What does the loser lose? When I discuss, I usually learn a new fact, or that some of my assumptions are wrong, or a new way to interpret given facts. How am I a loser here? I’d like for us to see argument as proof (as is common among mathematicians) or even better - imagine argument as dance. If we discussed this way at Vinted, everything would be possible.
- Time is Money. This is a metaphor that I would like to find a replacement for. You can see I even used it a few minutes ago, when I was talking about investing in myself. A better metaphor that I am using now is Time is a Gift, but I don’t think it’s perfect. There should be a better one to live by. Let me know if you have suggestions - I’d love to discuss this :)