We hold live meetups every week where different guests share their stories, wisdom and practical advice on how to overcome common issues. All meetups are recorded and put on our YouTube and Podcast channels.

Events

2020-07-21
5:00pm BST/9:00am PST
Is Kubernetes even ready for data?
Kubernetes has been a great solution for deploying application infrastructure. Trying to manage your data with the same control plane has been less than ideal. This is even more true when using distributed databases like Apache Cassandra. Once you get past storage and StatefulSets, you still have a lot to do. Let’s have a frank talk about the new opportunities to make Kubernetes ready for data.
Patrick McFadin
2020-07-28
5:00pm BST/9:00am PST
Data on k8s maturity check
Let’s talk about storage. Optoro has moved to running stateful stores on Kubernetes. It’s a challenge, but it has a lot of value. Let’s talk about how we chose to do it, and what we figured out along the way.
Zach Dunn
2020-08-04
5:00pm BST/9:00am PST
Design considerations for operationalizing Distributed SQL on Kubernetes
This talk is aimed at cloud-native developers and architects looking to deploy an operational database on Kubernetes. We will walk you through the design decisions the YugabyteDB team made when architecting the database as a service on Kubernetes. We will cover concepts related to Kubernetes volume provisioning, pod placement strategies for data resilience and high availability, and how cluster events are used to reconcile the k8s workloads during day-2 operations such as upgrades and scaling up or down.
Nikhil Chandrappa
2020-08-11
5:00pm BST/9:00am PST
The problem of stateful workloads - balance of keeping data HA vs. costs
In an engineer’s ideal world, we would have all the resources and redundancy we could possibly get for our services and the infrastructure that supports them, for sanity and, of course, HA. However, how do you balance “enough” redundancy against the actual operational costs of supporting such engineering choices, and what are some of the tough engineering decisions that need to be made? This talk focuses primarily on services run on Kubernetes (or a public cloud offering of Kubernetes), but the principles can be extended to any infrastructure environment. Key topics: capacity planning, cost management, distributed services.
Ren Lee
2020-08-18
5:00pm BST/9:00am PST
The full cycle of doing data on k8s: a case study
Scaling ACID-compliant databases in the cloud is challenging. We’ll look at a specific use case where we’re trying to scale a SaaS Odoo ERP offering on Kubernetes and build a scalable Postgres cluster as a backend service.
Dave Cook
2020-08-25
5:00pm BST/9:00am PST
"Operators, operators, operators….operators"
Operators represent a great opportunity for the data community to solve the complexities of managing data products for their customers in a way that standardizes UX and integration points. Historically, the most powerful solutions had to be niche and highly customized.
Amit Gupta
2020-09-08
5:00pm BST/9:00am PST
Appropriate workloads for databases in K8s
As more companies move to Kubernetes and cloud native as the standard for developing net-new functionality, something has to happen to the legacy workloads. Oftentimes we see a lift-and-shift mentality when moving into Kubernetes. We will talk about how that mentality can be dangerous or cause more work than expected.
Rick Vasquez
2020-09-15
5:00pm BST/9:00am PST
Geospatial Sensor Networks and Partitioning Data
We use resources like weather reports or air quality measurements to navigate the world. These resources become especially important when we are faced with extreme events like the current wildfires in the western USA. The data for the reports, predictions, and maps all start as real-time sensor networks. In this talk, I’ll present some of my research into scientific data representation on the Web and how the key mechanism is the partitioning, annotation, and naming of data representations. We’ll take a look at a few examples, including some recent work on air quality data relating to the current wildfires in the western USA. We’ll explore the central question of how geospatial sensor network data can be collected and consumed within K8s deployments.
Alex Miłowski
2020-09-22
5:00pm BST/9:00am PST
Data on Kubernetes and container attached storage - an update
Back in 2018 the CNCF published a blog we wrote called Container Attached Storage. Today - September 22nd 2020 - a new blog is appearing on their site updating Container Attached Storage. This talk borrows very heavily from that blog. What is CAS? Why would anyone use Kubernetes itself for storage? How does a microservices architecture help? Why is shared storage at the end of the road - though still used underneath CAS sometimes?
Evan Powell
2020-09-29
5:00pm BST/9:00am PST
Doing Data Wrong
In this talk, we'll look at great ways to lose data (like running databases on Kubernetes and bare metal), pain points for developers, and lessons we've learned, and we'll hold a Festivus-in-September airing-of-grievances session for those who have felt this pain.
Jeremy Tanner, David McKay
2020-10-06
5:00pm BST/9:00am PST
PostgreSQL-as-a-Service on K8s at Zalando
PostgreSQL is a powerful, open-source object-relational database system with over 30 years of active development that has earned it a strong reputation for reliability, feature robustness, and performance. However, a production-grade deployment requires many technologies that complement the database core: high availability and automated failover, backup and recovery, monitoring and alerting, centralized access control and logging, connection pooling, and so on. Although it was not initially designed for running stateful workloads, Kubernetes, with its infrastructure-as-code paradigm, CustomResourceDefinition, and Operator pattern, turned out to be extremely convenient for deploying and running PostgreSQL at scale. I will talk about a few open-source projects developed and maintained by the database team at Zalando that anybody can use to build their own PgaaS: 1. https://github.com/zalando/patroni - A tool for PostgreSQL high availability and cluster management. Integrates with the K8s API and makes PostgreSQL cloud-native. 2. https://github.com/zalando/spilo - The Docker image that packages Patroni, multiple versions of PostgreSQL, and tools for backup and recovery. 3. https://github.com/zalando/postgres-operator - Implements the Kubernetes operator pattern and orchestrates hundreds and thousands of deployments of Patroni/Spilo clusters. The aforementioned projects would never have reached their current state without the efforts of dozens of external contributors.
Alexander Kukushkin
2020-10-13
5:00pm BST/9:00am PST
Distributed Workloads on Kubernetes: Operators to the Rescue
How easily can you run distributed workloads on Kubernetes? The initial deployment of your 10-node database might be easy to set up, but day-2 operations (changing the configuration, adding and removing nodes, version upgrades, etc.) are much more complicated. We'll discuss how operators can help you manage distributed workloads, and a few operator tricks we learned while working on ECK (Elastic Cloud on Kubernetes), an operator for the Elastic stack.
Sebastien Guilloux
2020-10-20
5:00pm BST/9:00am PST
Kubernetes Cost Control
The importance of cost control while working with the cloud. K8s, data, and cost control. Hints and tips for controlling your K8s costs.
Arie van den Bos
2020-10-27
4:00pm BST/9:00am PST
Reaching limits in K8s: A case study with Ingress Controller
When talking about data, we usually think about big data and scale, and about what to do next when we reach a limit. Hitting such limits is sometimes a good problem to have. In this talk, we'll discuss our approach to this situation using the Ingress Controller.
Laurent Rouquette
2020-11-03
4:00pm BST/8:00am PST
HyperStore-C: S3 object storage managed by Kubernetes
Cloudian’s HyperStore is S3-compatible object storage software focused on the enterprise market.  In this talk, I'll discuss how and why we are working on Kubernetes-managed versions of HyperStore, including where we are now and what we're looking.
Gary Ogasawara
2020-11-10
4:00pm BST/8:00am PST
Is k8s Even Ready For Data? Round II
Data on Kubernetes Community: Is K8s even ready for data, Round II: Cassandra on OpenEBS, aka CaSS on CAS. In our inaugural DOKC meetup, Patrick McFadin, Developer Advocate at DataStax, emphasized the challenges of running Cassandra on Kubernetes, concluding at one point that “Kubernetes might not be ready for Cassandra.” Since that meeting, the use of the open-source Container Attached Storage project OpenEBS as simple, high-performance, per-workload storage for Cassandra has proliferated. The Cassandra Operator from DataStax, aka “CaSS”, has progressed as well. So, where are we now? Is CaSS on CAS working well? What is the future of collaboration between DataStax / Cassandra and MayaData / OpenEBS? Is Kubernetes now ready for Cassandra? What are the emerging technologies that might shape storage and Kubernetes in the near future?
Jeffry Molanus, Patrick McFadin
2020-11-24
4:00pm BST/8:00am PST
Towards a K8s native streaming application
Starting from a simple application that can be deployed on any machine running Docker, we will go through all the steps required to transform it into a Kubernetes-native streaming application. We will explain the theory and then apply the concepts learnt to define a recipe for running streaming applications on Kubernetes. We will focus on both cultural and technical tricks to help you successfully adopt streaming applications at scale. At the end of the talk, you will have a comprehensive view of all the platform building blocks and application requirements needed to successfully run a streaming application on Kubernetes. Spoiler: you will hear the words Apache Kafka, Kafka Streams and Strimzi several times.
Francesco Nobilia, Jeremy Frenay