Make your data scientists and DevOps happy with one Tilebox

Organizations adopting Tilebox find their scientists focused on algorithms while their infrastructure teams efficiently manage resources and costs. How is this possible?



A few weeks ago, an old friend of mine, Stefan Amberger, reached out and asked if I wanted to try Tilebox. I knew he was working in the space industry, but I was a little worried because the last time I touched Earth Observation was around 2018. Would I even remember the basics?

Back in the day, as an Earth Observation team, we spent significant time managing infrastructure, writing boilerplate code and navigating complex data access patterns instead of focusing on our core mission.

Within the first few minutes I understood that Tilebox streamlines this by providing a unified platform that handles data access, distributed processing, and infrastructure management while maintaining flexibility and avoiding vendor lock-in. Scientists get to focus on algorithms and insights, while infrastructure teams efficiently manage resources and costs.

Projects that previously took us months to reach production could now be deployed in days! Lucky teams can easily scale their processing from local development to distributed cloud environments without major code changes!

Please turn back time! I still have some experiments to run! Until then, these are my thoughts. Keep in mind that there are far more unexplored features underneath!

Documentation? Who reads it? AI, of course!

Tilebox offers both human- and AI-readable documentation.

Documentation typically lives across multiple systems – wikis, PDFs, and code comments – making it difficult to find precise answers quickly. Tilebox’s documentation is designed to be AI-ready, allowing tools like Claude to provide accurate and contextual responses to your team’s questions.

Whether it’s 3 AM or during peak hours, developers can get immediate, reliable answers about implementation details, API usage or best practices. This documentation approach means no more digging through scattered resources or waiting for support tickets – just ask the question in natural language and get specific, actionable answers including relevant code examples.

The AI understands the full context of Tilebox’s architecture, ensuring responses are not just accurate but also aligned with your specific use case and implementation stage.

I mean, it is 2024! These days we usually read about the capabilities of a certain toolbox and then go straight to prompting! But the usual knowledge cutoff for commercial LLMs is about a year back, so they will not know about our new, fancy toolbox. What we do in practice is take advantage of the large context windows offered by current LLMs. However, putting the documentation into a context window is generally a bit of a pain, because said documentation has to be arranged in a certain way.

Go to https://docs.tilebox.com/ai-assistance to see how to download the documentation in the right format and how to use it with Claude. As of now (Nov 2024), ChatGPT still has some catching up to do.

Once I created a project in Claude.ai, I started to code! It is useful to be familiar with the capabilities of the Tilebox API, but that’s about it. You don’t need to remember that starting a job requires a cluster, or the exact semantics for creating, finding, or joining one. The AI will “know” it for you and produce the right steps to get your task coded!

At each step, pasting the error message gets you explanations and workarounds. It is an absolute joy to code this way! It took me less than two hours to have nodes on different networks running Tilebox tasks!
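To give a flavor of what came out of that session, here is a minimal sketch along the lines of what the AI produced for me. It assumes the Tilebox Python workflows SDK as described in the docs; the token, cluster slug and task are placeholders of my own, and the exact signatures may differ between SDK versions.

```python
# Minimal sketch of a node running Tilebox tasks.
# Assumes the Tilebox Python workflows SDK; the token, cluster slug and
# task below are placeholders -- check docs.tilebox.com for exact names.
from tilebox.workflows import Client, ExecutionContext, Task


class HelloEO(Task):
    """A trivial task, just to prove this node picks up work from the cluster."""

    message: str

    def execute(self, context: ExecutionContext) -> None:
        print(f"Hello from this node: {self.message}")


client = Client(token="YOUR_TILEBOX_API_KEY")  # placeholder token

# A task runner polls a cluster for work; start one on every node that
# should execute tasks, regardless of which network it sits in.
runner = client.runner("my-cluster-slug", tasks=[HelloEO])
runner.run_forever()
```

Start the same runner script on machines in completely different networks, and they all pull work from the same cluster.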

Last but not least, this “interactive” documentation means faster onboarding for new team members and quicker resolution of technical challenges during development.

Data access

Tilebox eliminates data access complexity with a unified data access layer.

Satellite data access has traditionally been a maze of product specifications, orbit nomenclature, and processing levels. Each data provider has its own API, authentication methods and data structures.

Automating a simple task like downloading a Sentinel-1 granule took days of reading documentation and writing boilerplate code. Tilebox eliminates this complexity with a unified data access layer.

Instead of learning the intricacies of each satellite platform, engineers can now query data using intuitive parameters like time ranges and geographical areas [coming soon!]. Behind the scenes, Tilebox handles the complexities of data provider APIs, authentication, and product specifications. When providers update their APIs or modify their product specifications, Tilebox adapts transparently – your code keeps working without changes.

Getting satellite data from cloud providers to your local machine becomes almost a two-line operation. Tilebox manages the downloading of data products automatically. This means you can spend your time working with the data rather than figuring out how to access it. Whether you’re working with Sentinel, Landsat, or commercial SAR data, the interface remains consistent and straightforward.
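To make the “almost two-line” claim concrete, here is a rough sketch of a time-range query, assuming the Tilebox Python datasets client. The dataset and collection names are examples from the open-data catalog; your account may expose different ones, and the exact API shape may vary by SDK version.

```python
# Sketch of a time-range query against an open Sentinel-1 collection.
# Assumes the Tilebox Python datasets client; dataset and collection names
# are examples and may differ from what your account exposes.
from tilebox.datasets import Client

client = Client(token="YOUR_TILEBOX_API_KEY")  # placeholder token
datasets = client.datasets()

# Pick a Sentinel-1 collection and query it by time range -- no provider
# API, authentication scheme, or product specification to learn.
collection = datasets.open_data.copernicus.sentinel1_sar.collection("S1A_IW_RAW__0S")
granules = collection.load(("2024-10-01", "2024-10-07"), show_progress=True)

print(granules)  # granule metadata for that week, as an xarray dataset
```

Swapping in a Landsat or commercial SAR collection keeps the same query pattern; only the collection name changes.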

Distributed computing

Distributed systems from in-orbit compute to on-premises without any vendor lock-in.

In traditional workflows, there is constant tension between data scientists and DevOps. DevOps engineers want to know what components are available, what their APIs are, their order of execution and their data I/O flows. Data scientists want to iterate quickly through algorithms; not much remains fixed from one run to another. They want to spend their time coding algorithms that solve the task at hand. The code is there, the algorithm is explicit, so why can’t somebody take it and distribute it over the compute? Well, this is, in my opinion, a very strong feature that Tilebox got right!

Tilebox allows your teams to work efficiently within their domains of expertise. Data scientists and researchers can focus on developing algorithms and processing pipelines, while infrastructure teams maintain operational control.

The separation between compute and deployment means each team can optimize their part of the workflow without stepping on each other’s toes.

Cost management in Earth observation processing becomes more straightforward. Infrastructure teams can implement auto-scaling policies and optimize resource utilization across different cloud providers or on-premises systems. Meanwhile, scientists can develop and test their workflows locally, then deploy them to production with minimal code changes. This means faster time-to-value for new projects, while maintaining the flexibility to add compute resources as needed.
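As a sketch of what that local-to-production hop can look like, assuming the workflows API as documented: the scientist submits a job, and whichever runners the infrastructure team attached to the target cluster pick it up. ProcessScene, the cluster slug and the scene id below are hypothetical placeholders, and the exact submit signature may differ by SDK version.

```python
# Sketch of submitting the same workflow to different environments.
# ProcessScene, the cluster slug and the scene id are hypothetical
# placeholders; the API shape follows the Tilebox workflows docs but
# may differ in detail from your SDK version.
from tilebox.workflows import Client

from my_pipeline import ProcessScene  # hypothetical task written by a scientist

client = Client(token="YOUR_TILEBOX_API_KEY")  # placeholder token
jobs = client.jobs()

# Same submission call, different cluster: point it at a dev cluster while
# iterating locally, or at whatever cluster DevOps runs in production.
job = jobs.submit(
    "process-scene",
    ProcessScene(scene_id="S1A_EXAMPLE_SCENE"),  # placeholder scene id
    cluster="prod-cluster-slug",
)
```

The scientist's task code never changes; only the cluster it is submitted to does.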

The workflow engine handles the complexity of distributed processing and data movement, allowing your data scientists to focus on extracting insights from the data.

As your operation grows, Tilebox grows with you without creating vendor dependencies. The modular architecture means you can start with the provided components for quick deployment, then gradually replace them with custom solutions as your needs evolve. Whether you’re processing gigabytes or petabytes of satellite data, the same workflows scale from development to production.

For organizations looking to maintain control of their Earth observation infrastructure while reducing operational overhead, Tilebox provides a practical middle ground. It handles the complexities of distributed satellite data processing without forcing you into rigid patterns or limiting future technical choices.

With a bit of care, Tilebox takes the data scientist’s code and runs it wherever the DevOps engineer wants it to run.

Support

Enterprise client? Five-star support, because Tilebox is a company that cares about its customers’ success!

Support isn’t an afterthought. When your teams hit a roadblock, they can access Tilebox’s engineering support directly. Instead of going through layers of ticket systems, your developers get responses from the people who built the system. This means technical discussions are focused and productive, keeping your projects moving forward.

Not on an Enterprise plan? No worries, there is a Discord channel available too!

The comprehensive examples and template workflows cover common Earth observation scenarios – from basic data access to complex distributed processing pipelines. Rather than starting from scratch, teams can adapt these tested patterns to their specific needs. This means your first production workflow can be up and running in days rather than months, while maintaining the flexibility to customize as requirements evolve.

