
Lessons for Today’s Leaders: Don’t Wait for the Cracks

Five lessons from building and breaking data infrastructure at scale: spot the slow drift before it becomes a crisis, revisit build-vs-buy honestly as the business grows, put the right people on the right problems, design for change not stability, and treat data movement as a strategic investment not just plumbing.

Andy Guy
4/17/2026
3 min read

Since publishing my whitepaper on what went wrong with data infrastructure at Torchlight, I've had several conversations with CTOs and engineering leaders who read it and said some version of: "Yeah, that's us right now." 

Those conversations are what pushed me to write this. The whitepaper tells the Torchlight story in detail, including the technical specifics, architecture decisions and consequences. What keeps coming up in these follow-on conversations is more about leadership than it is about the tech. How do you know when your infrastructure has become a liability? How do you make the call to change course when the current setup is technically still functioning? Where are the blind spots that let these problems compound for months or years before anyone names them?

Here are five lessons I keep coming back to.

Lesson 1: Learn to Recognize the Inflection Point

Infrastructure problems are easy to ignore because the systems degrade gradually. Nobody wakes up one morning to a system that suddenly can't keep up. Instead, reports start needing caveats. Schema changes that used to take a day start taking a week. Your best engineers are spending more time monitoring than building, and it happened so slowly that nobody formally raised the issue.

That drift is the thing to watch for. By the time it's obvious, you've already been living with the problem for months. The danger is that everything still technically works. Data might still be flowing, but underneath, you're paying for that flow with engineering hours, delayed product work and trust in your numbers that's slipping away.

My advice? Look at how your engineering team actually spends its time. Not how you think they spend it, but how they really spend it. If maintenance is eating into build time, and especially if that shift has been gradual enough to fly under the radar, you've got a structural problem that won't fix itself.
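One rough way to make that drift visible is to track the share of engineering hours that goes to maintenance, quarter by quarter. The sketch below assumes you have some kind of time log to draw on. The `TIME_LOG` entries, the category names and the `min_rise` threshold are all hypothetical placeholders, not Torchlight data or a recommended benchmark.

```python
from collections import defaultdict

# Hypothetical time-log entries: (quarter, category, hours).
# Categories and numbers are illustrative only.
TIME_LOG = [
    ("2025-Q1", "build", 820), ("2025-Q1", "maintenance", 180),
    ("2025-Q2", "build", 760), ("2025-Q2", "maintenance", 240),
    ("2025-Q3", "build", 700), ("2025-Q3", "maintenance", 300),
    ("2025-Q4", "build", 610), ("2025-Q4", "maintenance", 390),
]


def maintenance_share(log):
    """Return {quarter: fraction of logged hours spent on maintenance}."""
    totals = defaultdict(float)
    maintenance = defaultdict(float)
    for quarter, category, hours in log:
        totals[quarter] += hours
        if category == "maintenance":
            maintenance[quarter] += hours
    return {q: maintenance[q] / totals[q] for q in sorted(totals)}


def is_drifting(shares, min_rise=0.05):
    """Flag a slow upward drift: the share never falls from one quarter
    to the next and rises by at least min_rise from first to last."""
    values = list(shares.values())
    never_falls = all(b >= a for a, b in zip(values, values[1:]))
    return len(values) >= 2 and never_falls and (values[-1] - values[0]) >= min_rise


shares = maintenance_share(TIME_LOG)
for quarter, share in shares.items():
    print(f"{quarter}: {share:.0%} maintenance")
print("drifting:", is_drifting(shares))
```

No single quarter in this made-up log looks alarming on its own. It is the steady climb across all four that signals a structural problem.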

Lesson 2: Be Ruthlessly Honest About Build vs. Buy

The "build it ourselves" instinct runs deep in technical founders. I'm Exhibit A. At Torchlight, we went all-in on in-house infrastructure and believed we were building something few vendors could replicate. In some ways, we were; we just didn't anticipate how much that custom system would cost to maintain as the business grew.

Here's what I wish I'd understood earlier: Build vs. buy isn't a one-time decision. Circumstances change as you scale. Custom solutions that feel lean at low volume become anchors at high volume, and the transition happens quickly. Every new data source, every schema change, every new partner requirement ripples through a homegrown system in ways that purpose-built tooling has already solved. You end up paying a compounding tax on every change, and the bill gets steeper as the system gets more complex.

The hardest version of this question is the personal one: "Am I still making the right call, or am I just defending a decision I made two years ago?" That's uncomfortable when you wrote the code yourself, but sitting with that discomfort is part of the job.

Lesson 3: Match the Right Skills to the Right Problems

At Torchlight, our software developers ended up owning the data pipelines almost by default; they were the closest thing we had to the right people for the job, so the work landed on their desks. We were paying senior developer rates for pipeline babysitting, and the work still wasn't getting done the way a dedicated data engineer would approach it. Developers write code that accomplishes tasks. Data engineers design systems that move information reliably at scale, with schema evolution and failure recovery considered from the start. Those are different disciplines, and blurring the line between them means neither job gets the attention it deserves.

The payroll mismatch is easy to quantify, while the drain on your team is harder to measure and more corrosive. When your developers are stuck monitoring dashboards and patching scripts, you're burning out the people who should be building your competitive advantage—all for work that doesn't move the business forward.

Ask yourself: What would your engineering team deliver next quarter if they got back the hours they currently spend on infrastructure maintenance? The size of that gap is the real cost, and it's almost always bigger than the salary line item suggests.

Lesson 4: Build for Change, Not for Stability

Most data systems are built around a snapshot and then optimized for that snapshot. The implicit assumption is that things will stay roughly the same. They won't. And every change to a system designed for stability carries a disproportionate cost.

At Torchlight, a single new field in our transactional database meant manual updates across scripts, pipelines and warehouse tables. A schema change could eat a week and pull in multiple people. We were designing for the business we had, and the business kept outgrowing the design.

The alternative is what I'd now call a DataDevOps mindset: treat data infrastructure like modern software teams treat application code. Version it. Automate it. Make it observable and resilient by default.
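As a minimal sketch of what that could look like for the new-field problem described above: keep the expected schema in version control and have an automated check report any difference from what the source system actually exposes. The table, column names and the `diff_schema` helper here are hypothetical illustrations, not the system we ran at Torchlight or any particular vendor's API.

```python
# Hypothetical schema for a "leads" table, kept in version control:
# column name -> type.
EXPECTED_SCHEMA = {
    "lead_id": "int",
    "email": "str",
    "created_at": "timestamp",
}


def diff_schema(expected, observed):
    """Compare two {column: type} mappings and report the differences."""
    added = {c: t for c, t in observed.items() if c not in expected}
    removed = {c: t for c, t in expected.items() if c not in observed}
    changed = {
        c: (expected[c], observed[c])
        for c in expected
        if c in observed and expected[c] != observed[c]
    }
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical schema read from the source database after someone
# added a new field upstream.
observed_schema = {
    "lead_id": "int",
    "email": "str",
    "created_at": "timestamp",
    "utm_source": "str",
}

drift = diff_schema(EXPECTED_SCHEMA, observed_schema)
if any(drift.values()):
    # In a real pipeline this might open a ticket or fail a CI check.
    print("schema drift detected:", drift)
```

The point of a check like this is that a new field shows up as an explicit, reviewable difference rather than as a surprise in a downstream report.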

When you're evaluating infrastructure—building, buying or both—the question that matters most is: "How much will it cost us to change this six months from now?" If the honest answer makes you wince, the foundation needs work.

Lesson 5: Treat Data Movement as Strategic Leverage

Most leadership teams don't spend much time talking about how data gets from one place to another. I didn't either, until the consequences forced the conversation.

At Torchlight, the speed and accuracy of our data movement directly determined whether leads could be matched and sold in time, whether buyers trusted us and whether our reporting held up under scrutiny. It touched revenue, retention and credibility all at once. 

The companies that treat data movement as a real investment rather than just an operational expense are the ones that will separate themselves over the next 18 months. Their infrastructure lets them adapt at lower cost, deliver faster and maintain trust in their numbers while they scale. And that edge gets wider over time, which is why the decision to invest in data movement now pays off disproportionately later.

The Window Is Closing

The tools and practices that would have changed Torchlight's trajectory are production-ready now. Five years ago, we didn't have great options. Today there are mature platforms and established practices around DataDevOps, schema evolution and pipeline resilience that didn't exist when we needed them. The teams that adopt these tools early will compound that advantage quarter over quarter.

Every growing company's data infrastructure has cracks. You can find them on your schedule or let them find you on theirs.

For more, check out my “Lessons in Data Movement” whitepaper.

Andy Guy is a technology leader, entrepreneur, and investor with more than 15 years of experience building and scaling data-driven businesses. He co-founded Torchlight Technology Group, growing it from a three-person startup into a 40+ employee company and leading its successful acquisition. At Torchlight, Andy served as CTO, architecting the company’s proprietary platforms for digital marketing and insurance lead generation. He has since held technology leadership roles with Intellivets and SunshineMD, and today is EASL’s Head of Product.
