
Our Engineering Vision: 5 Guiding Principles of an Engineering-First Company


Index Exchange is an engineering-first company. Our first six hires were engineers, and today more than half of our employees are still focused on engineering and product.

Prioritising engineering, as well as efficiency and scale, allows us to foster an environment of creativity and innovation. Because of this, our engineers can confidently challenge the status quo, explore new ways of solving problems, and prosper in their place of work. 

As Chief Technology Officer of Index Exchange, I make preserving this environment my number one priority. When I show up to the office every day (which, for the last fourteen months, has been my desk at home), I strive to live and breathe a set of five guiding principles of an engineering-first company that help us prioritise, scale, and optimise. And I encourage my team to do the same.

In no particular order, here are the five principles that guide our engineering vision at Index Exchange:

1. Continuous Reinvention: Challenge the Status Quo and Build for the Future

Tech stacks, hardware, and engineering practices are always evolving. We’re a workplace that evolves alongside these changes, continuously reinventing to ensure that our teams are always working with the latest technologies and tools that empower them and their work. 

To scale our exchange more effectively and maximise its potential, we recently made the call to transition from Perl to Go. Concurrent processing is much easier in a language built for concurrency like Go, which allowed us to address the C10K problem (handling on the order of ten thousand simultaneous connections on a single server), something we could not do in Perl. Go has also been crucial in eliminating tech debt and improving the overall development experience, from rapid code deployment to robust test automation.

Rewriting our exchange in Go has also allowed us to build better staging environments and accommodate our immense growth in supply. In the past year alone, we’ve more than doubled the number of slot requests we process, bringing us to 210 billion requests per day and generating 100 terabytes of audit data. To put that in perspective, that’s about as much data as 12,000 4K movies!

“Previously, we would explicitly avoid introducing more parallelism to our systems to avoid the complexity that it brought; now, however, we are able to embrace it where it makes sense, and where it allows us to more efficiently exploit the high core-count CPUs in our data centre nodes. Beyond Go being a better technical fit for our problem domain, and enabling us to scale the exchange to billions of requests a day, it also enables us to scale our teams by offering our developers a modern technical stack on which to grow their own skill-sets.”
Michael Harris, Software Architect

We don’t take transformative decisions like this one lightly, but we know we need to provide our engineers with the best possible tech stack. We also know that, in order to maintain this approach, Go will likely not be the last language we code in. We’ll continue to evaluate our programming language options and make pivots as necessary, ensuring we’re able to move as quickly as we’d like.

We’re also an Agile workplace by nature, guided by a dedicated group of full-time Agile coaches who are critical to how our engineers work. Our objective isn’t to implement Agile; it’s to be Agile. Through a combination of rapid learning, continuous improvement, and team empowerment, our Agile coaches work with our teams, helping them learn, understand, and follow Agile principles. Invaluable to an organisation of our size, our Agile group stresses the importance of prioritisation, efficiency, and reflection so that we become more efficient with every project. Learn more about how we’ve embraced Agile.

2. Building Top Teams and Talent: Encourage Learning, Development, and Collaboration

Index Exchange prides itself on its transparent and connected culture, and our engineering team is no different. We prioritise providing a space for learning and failure, and encourage our engineers to learn even beyond the expectations of their day-to-day roles. We’re also laser-focused on the people who make up our engineering-first company and teams, consistently showing up for one another by listening, collaborating, and expressing genuine care.

Here are some ways we encourage this:

  • Engineers are constantly encouraged to purchase a book, take a course, or attend an event. If someone wants to attend a conference about machine learning, they can purchase a ticket and attend. We want our teams to feel empowered to focus on education, and to share key insights with the rest of the team.
  • We also have an extensive library of internal training resources available to our teams. When we switched to Go, we brought in William Kennedy of Ardan Labs, one of the world’s foremost experts on the programming language, who ran an extensive training programme for our engineers, teaching them Go’s syntax and paradigms. Our engineers also have access to a plethora of LinkedIn Learning courses on subjects including Kubernetes, Kafka, Agile, and more.
  • We hold an annual internal Hackathon, an opportunity for our teams to create, develop, and solve pressing technological issues facing our industry. The IX Hackathon spurs creativity and innovation, bringing together our engineers to solve a larger technical problem in the spirit of collaboration (while also partaking in some friendly competition). 

Investing in our people and encouraging collaboration is one of the most important things we can do to create an environment where our engineers feel supported in their learning, development, and growth. With no shortage of resources to grow their skillset and advance their careers, our teams are set up for success from day one. 

3. Intensely Understanding the Exchange: Improve Efficiencies by Tackling Big Problems

As mentioned earlier, our exchange processes an astronomical amount of data each and every day (210 billion auctions, more than 1,000 billion bid requests, and 100 terabytes of audit data), and it’s at the core of what we do. In fact, our exchange processes more transactions in an hour than Visa processes in an entire day. It is exactly for this reason that our exchange poses some of the most interesting and exciting problems for engineers to tackle in today’s tech industry.


The exchange is a living, breathing machine, and one that grows more complicated every day. Because of these complexities, we regularly inspect the technology, utilising our deep data to better understand its dynamics so that it is able to run at peak efficiency.

The scale at which our exchange operates requires atypical solutions and out-of-the-box thinkers on our team. We’re always looking for ways to drive efficiencies and ultimately optimise the dynamics of the auctions we’re running. 

Here are some of the recent complex challenges our teams have worked to solve:

  • Leveraging machine learning to optimise auction dynamics: When COVID-19 hit and the world shifted to a work-from-home environment, we immediately saw a huge increase in traffic on our exchange. To keep scaling while becoming more efficient, we moved to a supervised machine learning approach.
  • Resolving latency in our TCP connection pooling subsystem: In order to optimise the exchange, every level of the auction process is analysed, as even a minor efficiency gain can represent a sizable dollar figure when repeated billions of times a day. This extends beyond analysing our own code, and can require looking down into the network stack to diagnose kernel-level interactions between our systems and our DSP partners. When we theorised that our TCP connection pooling subsystem might be subject to slowdowns when our downstream DSP partners’ systems experienced degradations, we instrumented our system to analyse the conversations happening at the TCP/IP layer. In those packets, we discovered patterns of latency in certain DSPs’ systems related to the relative dormancy of individual TCP sockets. Using any relatively dormant socket was found to increase latency, which could manifest as slightly longer auction times or auction timeouts. By identifying the source of these cases of latency, we were able to predict and avoid them.

4. We Run Our Own Metal: Minimise Latency and Optimise Our Infrastructure

One of our most notable differentiators as an engineering-first company is that we run our own data centres, and we prioritise investing in them so that we can process transactions safely and independently. Owning and operating our own hardware helps us minimise latency while also making the most of every CPU cycle. Additionally, we are able to have even more control over our operations, data, and intellectual property.

We recently completed our first official data centre, IX1 in Toronto. A collaborative effort across multiple teams, IX1 now houses many of our key services for big data, monitoring, telemetry, and orchestration. By moving these services from a co-located data centre to our own, we gained six times the capacity, with room for more than 100 server racks, allowing us to better support the massive scale of our operations.

Running our own metal introduces new challenges, such as ensuring efficient use of our entire cluster and supporting engineers who want to scale their code, so we have invested in technologies like Kubernetes to help us make the most of all that power while improving our engineers’ experience.

By running our own metal, we’re able to optimise both our hardware and software for each other. We know our software runs best on our servers, and our servers are specifically designed to accommodate our software. As a result, our engineers can better understand the relationship between the two, bringing us closer in our quest to eliminate latency and optimise our infrastructure.

5. End-to-End Ownership: Focusing on Owning Outcomes and Team Improvement

Our engineers have complete ownership over their projects, all the way from inception to production, and beyond. It begins even before any code is written, through close collaboration and brainstorms with our Product teams so that our engineers understand not just what they are building but why. Understanding the business case of our projects is critical throughout the building, testing, staging, and production processes.

By fostering an environment that encourages ownership, our teams are better able to focus on delivering strong outcomes while identifying areas of opportunity. We constantly search for the root causes of both successes and failures to improve how our teams work. Ownership isn’t just about fixing bugs or resolving incidents in production; it’s about optimising how we run our projects as we work on them and being proactive in our approach.

One of our teams recently adopted a process called Operational Excellence in an effort to monitor code running in production. They designated one member of the team to evaluate the code’s performance in production during the course of a sprint and then present their findings to the rest of the team. The team then came together to fix any bugs and determine if any efficiency gains could be made to really own the running of their code in production. As a result of Operational Excellence, the team achieved several impactful outcomes including:

  • A reclamation of 40% of memory storage and 15% of disk storage across all of our Aerospike clusters globally
  • Improved service-level monitoring of our Aerospike connections; previously, it wasn’t possible to determine whether an increase in connections was due to increased traffic or to an inefficiently managed connection to the database
  • The identification of a broader CPU issue on the exchange due to increased monitoring on the number of database writes we were performing and the speed at which we were performing them
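
To make the connection-monitoring outcome more concrete, here is a minimal Go sketch of one way to tell the two causes apart: compare how much the connection count grew with how much traffic grew over the same period. This is a hypothetical illustration rather than production code; the `Sample` type, the `Diagnose` function, the result strings, and the 10% tolerance are all assumptions made for the example.

```go
package main

import "fmt"

// Sample is a hypothetical snapshot of one service's database usage.
type Sample struct {
	Requests    float64 // requests served per second
	Connections float64 // open database connections
}

// Diagnose compares two snapshots and reports whether growth in connections
// is in line with growth in traffic. tolerance is the fractional slack
// allowed before growth is treated as significant (for example 0.1 for 10%).
func Diagnose(before, after Sample, tolerance float64) string {
	if before.Requests <= 0 || before.Connections <= 0 || after.Requests <= 0 {
		return "insufficient data"
	}
	reqGrowth := after.Requests / before.Requests
	connGrowth := after.Connections / before.Connections

	switch {
	case connGrowth <= 1+tolerance:
		return "connections stable"
	case connGrowth <= reqGrowth*(1+tolerance):
		return "growth explained by traffic"
	default:
		return "possible connection mismanagement"
	}
}

func main() {
	before := Sample{Requests: 1000, Connections: 100}

	// Traffic doubled and connections nearly doubled.
	fmt.Println(Diagnose(before, Sample{Requests: 2000, Connections: 190}, 0.1))

	// Traffic barely moved but connections tripled.
	fmt.Println(Diagnose(before, Sample{Requests: 1050, Connections: 300}, 0.1))
}
```

Tracking the two growth rates side by side is what turns a raw connection count into a service-level signal: the same rise in connections reads very differently depending on what traffic did at the same time.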

We have since scaled Operational Excellence to all of our divisions to cement the notion of total ownership and service health. It’s via this process of ownership from inception to production that our teams are able to work effectively and collaboratively, both within and across other organisations, to drive better outcomes.

Join Our Engineering-First Company

Each of the guiding principles that make up our engineering vision is instrumental to our success as an engineering-first company. Our teams put a strong emphasis on driving innovation, continuous growth and learning, and team-wide support, and we’re always looking for like-minded individuals to join us on our journey to become the workplace of choice for engineers and associated functions.

If this sounds like the kind of environment you want to not only work but thrive in, please view our open positions. I cannot wait to see what you can bring to our team.
