Category Archives: Uncategorized
Cloud-Native Dataspaces: Experiences from the German Research Data Ecosystem
Written on July 8, 2024 at 9:00 AM, by Paul Au
Data on Kubernetes and stateful applications have gained remarkable adoption across the community. But why stop there? Kubernetes and cloud-native tools can provide compelling core technologies for building sophisticated data…
Lightning Talk: My Database Runs on Kubernetes. What’s Next? Data Platforms!
Written on July 3, 2024 at 9:00 AM, by Paul Au
There’s not much doubt that databases now run well on Kubernetes: operators have matured, storage management works, and there are lots of success stories. What do you do now? Build…
Lightning Talk: Ditching Data Pipelines: why treating Data as Assets is the best thing you can do
Written on July 1, 2024 at 9:00 AM, by Paul Au
Efficient data handling traditionally involves constructing robust pipelines to process information from diverse sources. However, recent open-source tools question this approach and propose an alternative: rather than detailing data processing…
From Zero to Hero: Scaling Postgres in Kubernetes Using the Power of CloudNativePG
Written on June 27, 2024 at 9:00 AM, by Paul Au
Unleash PostgreSQL’s potential in Kubernetes with CloudNativePG, a community-driven control plane reshaping the database landscape. Join a dedicated CloudNativePG maintainer and active Postgres contributor on a captivating journey through managing…
How to Create Your Own Metadata-Driven ML Platform from Scratch
Written on June 4, 2024 at 9:00 AM, by Paul Au
This talk demonstrated integrating a data lakehouse with a metadata-driven orchestration engine, utilizing open-source tools like Presto and Kubeflow. Other key learnings included: learn about Presto and how it works…
Advanced CSI-FUSE Filesystem for AI/ML Data Management in Kubernetes
Written on May 20, 2024 at 9:00 AM, by Paul Au
AI/ML workloads, known for their data intensity, often depend on cloud storage. However, Kubernetes faces significant challenges in accessing this cloud storage data efficiently, primarily due to the lack of…
Does containerization affect the performance of databases?
Written on May 2, 2024 at 11:10 AM, by Guest Post
This post has been provided by DoK community sponsor ApeCloud and authored by Cai Songlu. The wave of database containerization is on the rise, as clearly shown in Fig. 1. Databases and…
Kubernetes as a Data Platform, DoK Panel at KubeCon
Written on April 29, 2024 at 8:00 AM, by Paul Au
Data on Kubernetes is well-positioned to become the operational default in a world where data and AI/ML applications are expected to grow. Scalability, flexibility, resilience, openness, and costs are among…
Adding Zonal Resiliency to Etsy’s Kafka Cluster
Written on April 18, 2024 at 8:00 AM, by Paul Au
Kafka is an important part of Etsy’s data ecosystem, moving data that powers a number of things like analytics, A/B testing, and search indexing. Etsy runs its Kafka cluster on…