The most broken part of data pipelines is the handoff, and I'm fixing that

A thing that has always felt broken to me about data pipelines is that the people building the actual logic are usually data scientists, researchers, or analysts, but once the workload gets big enough, it suddenly becomes a DevOps responsibility. And to be fair, with most existing tools, that kind of makes sense. Distributed computing requires a pretty technical background. So the workflow usually ends up being:
The handoff sucks, creates bottlenecks, and leaves builders at the mercy of DevOps. The person who understands the workload best is usually the person writing the code. But as soon as it needs hundreds or thousands of machines, now they’re dealing with clusters, containers, infra, dependency sync, storage mounts, distributed logs, and all the other headaches that come with scaling Python in the cloud.

That is a big part of why I’ve been building Burla. Burla is an open source cloud platform for Python developers. It’s just one function:

That’s the whole idea. Instead of building a pile of infrastructure just to get a pipeline running at scale, you write the logic first and scale each stage directly inside your Python code.

https://i.redd.it/ekxmil3epfrg1.gif

It scales to 10,000 CPUs in a single function call, supports GPUs and custom containers, and makes it possible to load data in parallel from cloud storage and write results back in parallel from thousands of VMs at once.

What I’ve cared most about is making it feel like you’re coding locally, even when your code is running across thousands of VMs. When you run functions with
A few other things it handles:
Running Python across a huge number of cloud VMs should be as simple as calling one function, not something that requires additional resources and a whole plan.

Burla is free and self-hostable --> github repo

And if anyone wants to try a managed instance, clicking "try it now" adds $50 in cloud credit to your account.