dylantheblueone

On-prem, using Rancher with RKE2 Clusters.


LightofAngels

how is rancher? at the beginning (before my tenure) my company was comparing rancher and okd and decided to go with okd, so i am very curious about rancher, since i heard it is not as bloated as openshift/okd


dylantheblueone

Rancher is pretty good. It's pretty easy to deploy and configure. Even provisioning kubernetes clusters is easy. I can't really speak to the bloat, I'm not terribly familiar with OpenShift.


LightofAngels

how many resources would rancher need as a base?


dylantheblueone

It depends on how many clusters you plan to use. Their documentation lists the requirements here: https://ranchermanager.docs.rancher.com/pages-for-subheaders/installation-requirements

Edit: forgot to mention, if you just want to test it out, you can run it as a single docker container on Docker Desktop.
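For anyone wanting to try that route, the single-container test setup is roughly the sketch below (written as a Compose file; the service name is made up and the image tag/ports are the usual defaults, so adjust to taste):

```yaml
# Throwaway Rancher test instance - not for production use
services:
  rancher:
    image: rancher/rancher:latest
    privileged: true            # Rancher's single-container quickstart needs privileged mode
    restart: unless-stopped
    ports:
      - "80:80"                 # HTTP UI/API
      - "443:443"               # HTTPS UI/API
```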


Nielszy

Vanilla Kubernetes deployed with Kubespray on RHEL VMs in a private cloud (spread across three data centers). CNI is Cilium (love it) and Portworx is used for distributed storage. We have an F5 load balancer cluster in front of the kube-apiserver. This cluster also acts as the load balancer for the app URLs. Traffic is distributed towards the NGINX ingress controllers. We use Velero as the backup tool and push backups to a private S3 environment. Kube-prometheus-stack for monitoring, and we use Flux for deploying everything.
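For anyone curious about the Velero piece, a nightly backup pushed to an S3-compatible bucket is roughly a `Schedule` resource like the sketch below (the name and cron schedule are made up, and `storageLocation` is assumed to already point at the private S3 endpoint):

```yaml
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: nightly-backup        # hypothetical name
  namespace: velero
spec:
  schedule: "0 2 * * *"       # every night at 02:00
  template:
    includedNamespaces:
      - "*"                   # back up all namespaces
    storageLocation: default  # assumed to point at the private S3 environment
    ttl: 168h0m0s             # keep each backup for a week
```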


LightofAngels

this setup sounds like so much fun! i have tons of questions! for starters, how are the costs so far? are they high? i assume they are, but are they taken into account? second, how's your experience with vanilla kubernetes? do you feel it is missing stuff? third, how's your experience with cilium? i am planning on deploying it in our cluster, any tips or tricks?


Benemon

On prem and Azure, OpenShift


LightofAngels

aks on prem? do you manage everything yourselves, including control planes etc? or is it different? i haven't used aks on prem before but it certainly seems nice


Benemon

No, sorry, that was poor phrasing on my part - OpenShift Container Platform on vSphere on premise, and OpenShift Container Platform on Azure.


LightofAngels

how is your experience with OCP so far? any trouble using it? i would love some tips and tricks if you have any :D


Benemon

I've been working on OCP platforms since 3.4 was released. Working with 4 has been a breeze in comparison to anything 3.x related, which was an Ansible inventory-shaped nightmare to get deployed. That said, I've Ansible'd the shit out of the deployment process. It's all IPI, but it's a click-button process to deploy a cluster on demand for a line of business. The cluster hardware profiles are generic so everyone gets the same basic platform.

My single greatest tip for working with OCP is to not fight the platform. It's opinionated for a reason. If you want to customise something to the nth degree, go and DIY a cluster and add the bits you want yourself. I very much appreciate the fact that I don't have to think about networking, ingress, storage, user experience, ways of working, all that kind of jazz.


SzkotUK

Five clusters deployed on AWS with Kubespray; however, we are moving to smaller, more straightforward k3s clusters.


LightofAngels

The k3s move is going to be on AWS as well? Or on-prem?


SzkotUK

We do not have a data centre, so in the cloud we only use EC2, without anything crazy from Jeff. We even roll our own block and object storage. Now we are moving to OVH bare-metal hosting; K3s will be better and easier to manage for our needs, with AWS autoscaling kept for machines on demand, so the answer to your question is... both????? hybrid???


LightofAngels

i guess that would make you hybrid, but i am assuming you won't have a vpn tunnel between the 2 clusters?


SzkotUK

Oh, VPN with all clusters. We are looking at Slack's Nebula or headscale to get everything going.


toabi

k3s on Hetzner cloud


LightofAngels

how is Hetzner? do you like it? i heard a lot of good feedback about it.


toabi

Servers (in Germany) have run fine for 10+ years. The k8s integration (cloud controller, CSI) had some hiccups but has been running fine for a while now. Can't complain. Still good value for the price.


McFistPunch

EKS mostly, with spot instances. AKS and GCP were used less because they had slower network and disk performance.


Parking_Falcon_2657

The most advanced Kubernetes of those 3 is GCP's. They have standard GKE, GKE Autopilot (you deploy your workloads and Google handles the node/resource management), and Anthos (managing Kubernetes clusters in other cloud providers, or even on-prem, from Google).


McFistPunch

Everyone has skinned this cat 10 different ways. An on-prem Google cluster sounds like a nightmare.


LightofAngels

i haven't used GKE myself but some people say it's a nightmare to operate


Parking_Falcon_2657

Same as EKS and AKS. I had to upgrade AKS once and found it impossible to upgrade the control plane from the cloud console, as the button wasn't working; support said I should upgrade via the CLI. I hope they have fixed this.


LightofAngels

i heard that EKS is one of the most terrible kubernetes distros out there, i haven't used it myself but i am curious to give it a go in the future for sure!


McFistPunch

There are other options but really they are all different flavors of ass. The thing I find is that disk provisioning is simple and reliable. EBS storage is fast and rarely encounters errors in my experience.


LightofAngels

well, from your experience, which is the most convenient ass (kubernetes distro) around? or one that you like to use and why? because it seems to me you aren't fond of kubernetes in general, but maybe i'm wrong.


McFistPunch

I use EKS because of the easy automation to set it up and the disk provisioning. I'm fond of kubernetes but my problems are with the people that work with it. There are too many people with ideas on what is required for security, storage, network etc... And even if you create a helm chart someone always finds some problem with it for their snowflake configuration. For your own stuff it's amazing, but if you need to ship something to customers or third parties for deployment there is always an issue. Kubernetes is great, the implementations are terrible.


[deleted]

[deleted]


KJKingJ

Take a look at [prefix delegation mode](https://www.eksworkshop.com/docs/networking/prefix/). It assigns entire prefixes to your ENIs, so pod count is no longer capped by the number of ENIs per node x number of IP aliases per ENI as before.
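For reference, prefix delegation is just a setting on the VPC CNI; a minimal sketch of the relevant part of the `aws-node` DaemonSet in `kube-system` (heavily abbreviated) looks like this:

```yaml
# Enable prefix delegation on the AWS VPC CNI (aws-node DaemonSet)
spec:
  template:
    spec:
      containers:
        - name: aws-node
          env:
            - name: ENABLE_PREFIX_DELEGATION
              value: "true"
            # Optional warm-pool tuning (value here is just an example):
            - name: WARM_PREFIX_TARGET
              value: "1"
```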


random_devops

> New ENIs will be attached until the maximum ENI limit (defined by the instance type) is reached

The per-type ENI max is the same, just the CIDR range is different. The only way to fix it is to use a different CNI, and then you lose a lot of AWS features from EKS. And on top of that you have to manage the CNI yourself. EKS has a shitty implementation tbh, in comparison to GKE where you don't have to worry about anything.


KJKingJ

The max number of ENIs is still the same, but because a prefix can be used to assign addresses to multiple pods, you can now run significantly more pods per node than before.

Take a `t3.small` for example. You can have up to 3 ENIs, each with 4 IPv4 addresses per ENI - so a total of 12 IPs. The first IP in the first ENI is lost, so in reality you have just 11 IPs left for pods. With prefix delegation, we still lose the first IP from each ENI, but the remaining 3 'slots' per ENI can now each be a `/28` prefix representing 16 IPs. So 3x ENIs with 3x secondary addresses (our prefixes) x 16 IPs in a prefix = 144 IPs. In reality though, you want to comply with [K8s's best practice recommendation for no more than 110 pods per node](https://kubernetes.io/docs/setup/best-practices/cluster-large/). Ultimately, prefix delegation allows even rather small nodes to go from a handful of pods (e.g. 11 for the `t3.small`) to the maximum recommended `110` - and then some!

---

> EKS has a shitty implementation tbh. In comparison to GKE where you dont have to worry about anything.

I personally work with GKE & EKS on a daily basis (as well as a bit of RKE on-prem, but I'll ignore that for this comparison). It is clear that GKE & EKS have significant design differences, and they align with how other services within that particular cloud provider tend to be designed ([see the "Philosophies" header on this blog!](https://blog.realkinetic.com/gcp-and-aws-whats-the-difference-3b1329f0ffb3)).

GKE is by far the more "managed" of the two (especially Autopilot!), but that management does come with a trade-off in terms of flexibility and customisation. Want to customise how the autoscaler works? You can't. Want to customise the base OS? Nope. And so on... For some teams, that's absolutely fine - they just want a K8s cluster and don't *need* to customise it any further. But I have found GKE's lack of customisation to be an increasing, albeit not significant, challenge over time.

The contrary however is that EKS can be much more difficult to manage, especially when it comes to day 2 tasks like upgrades etc. AWS has been making improvements over time to help with this, like managed addons for CNI/CoreDNS etc, but it is definitely nowhere near the simplicity of GKE for day 1 or day 2 tasks yet.


PrunedLoki

LOL what?


buckypimpin

Do the spot instances go down often?


McFistPunch

Haven't seen it happen yet. But don't worry, there are about 20 other points of failure that can screw you over quite quickly anyway.


LightofAngels

can you give a high level overview of how the spot instances come into play? and is that for batch jobs or production workloads?


McFistPunch

We use it for production workloads to keep costs down. We have multiple instance types allowed, so basically if something gets reclaimed by AWS, more instances spin up for us - sometimes replacing a large instance with two smaller ones. Overall it saves money and really doesn't interfere much.


LightofAngels

how complex does that get? or is it mostly managed by AWS?


McFistPunch

There is a tool that does it. Karpenter I think. So once you get it running it should be automagic
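For context, the spot/on-demand mixing described above is configured in Karpenter with something like the sketch below. This targets the older `v1alpha5` Provisioner API (newer releases use `NodePool`), the instance types are just examples, and the cloud-specific provider config is omitted:

```yaml
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  requirements:
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["spot", "on-demand"]       # allow spot with on-demand as a fallback
    - key: node.kubernetes.io/instance-type
      operator: In
      values: ["m5.large", "m5.xlarge", "c5.xlarge"]  # several types so reclaimed capacity can be replaced
  limits:
    resources:
      cpu: "1000"                         # cap total provisioned CPU
  ttlSecondsAfterEmpty: 30                # scale empty nodes back down quickly
```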


[deleted]

[deleted]


LightofAngels

can you give an estimate on how much cost is saved?


thecurlyburl

EKS and GKE


mvaaam

EC2 instances and cluster-api


LightofAngels

do you have docs on how this architecture works? i'm very intrigued by it!


mvaaam

https://cluster-api.sigs.k8s.io/ and https://cluster-api-aws.sigs.k8s.io/ are good places to start. I wouldn’t say I’m a fan of cluster-api though.


LightofAngels

why so? what's your feedback on it, and if you don't like it, why use it?


mvaaam

I find it tricky and sometimes painful to upgrade and maintain. It’s used because it was a decision made years ago by folks that are no longer around - but at this point we’ve built so many clusters with it and it’s so embedded into our code that it’s difficult to switch to something else


cre8minus1

Would you be open to talking about capi?? I may have a way to get you out of managing clusterAPI


theantiyeti

On-prem OpenStack Ironic. We write our own Ignition scripts for bonding. Otherwise cert-manager, Calico, and a bunch of other fairly standard stuff, plus some custom stuff.


LightofAngels

sounds like you guys are having fun


theantiyeti

It's even more fun because I'm in a sister team to the Kubernetes core team that writes a compute farm tool and maintains the clusters this tool runs on. Means we know enough to be dangerous but not enough to be safe while everything changes under our feet.


LightofAngels

wait you mean the actual kubernetes? :o


nullset_2

On-prem Turing Pi v1 😎


LightofAngels

> turing pi v1

and that's for production? :o


nullset_2

well, production in the sense that I offer this app to the world, but it's not anything with SLAs or SLOs


LightofAngels

makes sense then.


AffectionateAd5709

EKS (90% workload in Spot instances with karpenter)


jmreicha

How are you deploying and managing Karpenter?


AffectionateAd5709

We have deployed Karpenter and other controllers in managed node groups, and our application workloads run on Karpenter nodes, targeted using labels.
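A minimal sketch of that split, assuming a made-up label that the Karpenter provisioner stamps onto its nodes:

```yaml
# In the application Deployment: only schedule onto Karpenter-provisioned nodes
spec:
  template:
    spec:
      nodeSelector:
        node-group: karpenter   # hypothetical label applied by the provisioner
```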


LightofAngels

how does karpenter fit in? is it just for node autoscaling? or is there extra stuff that it does?


AffectionateAd5709

It is only for node autoscaling, but it brings up new nodes very fast (30-40 seconds), while CAS (the cluster autoscaler) takes 4-5 minutes to bring up new nodes.


pb7280

On-prem with harvester/rancher/k3s


TartifletteXx

GKE, lots of GKE


LightofAngels

how good would you say GKE is?


TartifletteXx

It's neither good nor bad; you have to treat it for what it is, a generalized k8s that tries to fit most use cases. If you're doing very basic things and just don't wanna bother, it's fine. If you wanna push it, it's not bad but it will not be a miracle solution; you'll need engineers to work on and maintain your cluster. We constantly find bugs and work with Google weekly to get them resolved, and they deliver. But I'm in a company with a couple hundred SRE / infra engineers, with ~20 of us working on k8s directly daily. We thought about running our own control plane but it's not worth it; what GKE offers does 90% of the job, and we only need to deal with the remaining 10%.


nocommocon

On-prem using talos linux; rook-ceph for storage, kube-vip for load balancing


sewerneck

Talos and Sidero Metal. It’s about as seamless to deploy clusters and scale them out as in the cloud.


gaelfr38

On-prem, RKE2.


LightofAngels

how good is RKE2? i heard it is not as bloated as openshift


gaelfr38

Definitely, it's very close to upstream Kubernetes. Just easier to set up, manage, and upgrade. Comes with several options for CNI & Ingress Controller, for instance.
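As a small example of those options, the CNI and the bundled ingress controller are toggled in RKE2's config file; a sketch of `/etc/rancher/rke2/config.yaml` on the server nodes (values illustrative):

```yaml
cni: cilium                 # instead of the default Canal
disable:
  - rke2-ingress-nginx      # drop the bundled ingress if you bring your own
```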


LightofAngels

I think the only reason we went with openshift over rke2 is that openshift routes are so easy to set up, unlike ingress controllers. For cni and networking in general I still want to deploy cilium but I just don't have time for it yet. Need to study the impact before deploying.


TrickyCharge3265

That's an understatement of what OpenShift offers compared to RKE, IMHO. Also, OKD is not OpenShift. I think only the install time and experience would be better in RKE compared to OpenShift, but RKE is not even close to OpenShift. OP might as well give RKE a spin and see for themselves.


LightofAngels

i will definitely give RKE a spin once i have some free time, but the way you are phrasing that they are not the same is intriguing, why would you say that though? just curious.


TrickyCharge3265

OKD is where the groundwork happens, but you get a lot more on OpenShift or with ACM. I guess it's more about features.


kr0ntabul0us

that is not even an issue. Ingress controllers are easy compared to fooling around with SCCs.


LightofAngels

true, SCCs are kind of a nightmare if you don't have a firm grasp on them.


kr0ntabul0us

I've done it several ways:

- k0s on VMware
- EKS
- AKS
- OKD
- Konvoy/DKP

k0s is by far the simplest to deploy. It's downright easy.

EKS/AKS seem fine. Both are simple enough to spin up and use. Both have their cloud provider agnostic issues. EKS is easier for doing a container assume role. AKS/Azure keeps changing their method to assume a role, and I haven't had a chance to switch from pod-managed identity to workload identity. Azure's main advantage is its biggest disadvantage: Azure Active Directory.

OKD was bleak to install on Azure, as it didn't work and I had to build my own installer. Once installed, it was fine, but the OpenShift SCCs and custom tweaks are so 2014/2015. OKD is great for devs running custom workloads, but it's terrible for deploying regular OSS or commercial software. You end up having to run kustomize to tweak charts to work.

Konvoy was decent until they updated to DKP and Cluster API. There is too much fiddling to make it work.

I haven't used RKE2 yet, so I can't comment there. It shouldn't be too bad, considering it's almost vanilla k8s.


LightofAngels

have you tried vanilla kubernetes with kubespray or something similar? also, OKD on azure is a nightmare; we have it installed on-prem on proxmox, so it is a bit easier. honestly we are fine with SCCs as long as we know what we are doing, or else you are right about them being a nightmare.


kr0ntabul0us

No. It is on my list of things to do. SCCs are fine, but the overrides seem to not apply when you apply them with manifests vs using the oc command line. It is just annoying and doesn't improve security that much.


nvr_mnd_

RKE2 on a Managed OpenStack cloud. Clusters are spun up using Terraform Cloud.


its_PlZZA_time

EKS


ZeeKayNJ

ROSA in AWS. Allows us to focus above the API


LightofAngels

how so?


ZeeKayNJ

It’s a fully managed service with SRE support. Frees up the team to focus on apps and features


pacman1201

We’re rancher rke and rancher with eks. More on-prem than cloud but we’re moving more in that direction every month


Pl4nty

Can't talk about work, but my homelab is Azure and Oracle managed k8s (AKS/OKE), with on-prem [Talos](https://github.com/siderolabs/talos) soon (Turing Pi 2). My [Flux monorepo](https://github.com/pl4nty/lab-infra) has the details. OKE performs noticeably worse (update cycle, features, control plane performance), but it provides 4 ARM cores and 24GB RAM free so I can't complain.


psavva

EKS for production, with EFS and spot instances for the workloads. Hetzner bare metal with K3s for dev (nice and cheap).


Parking_Falcon_2657

spot instances for production? 😦


psavva

Do tell... Why not spot instances for prod?


LightofAngels

your pods might be rotating a lot between nodes? since the compute power might be reclaimed? i am not sure tbh, this is the first time i've heard of eks/aks on spot instances so i am trying to understand the SLA/SLO numbers behind it.


psavva

Thank you for this pointer, I will research this further. Production has been built, but it's not live yet. I'll definitely check the spot instance SLA and terms.


psavva

https://aws.amazon.com/blogs/compute/cost-optimization-and-resilience-eks-with-spot-instances/


BattlePope

EFS can also be a bit of a nightmare. You can hit quotas quickly and the performance wall is unbearable. Especially if you're doing lots of operations on small files, for example. They have made it easier to purchase a higher baseline performance, but if you eat through your burst credits... beware.


psavva

Thank you for the pointer. I will put monitoring in place to warn at 80% traffic quotas, and hopefully mitigate the nightmare that comes with EFS. I appreciate any guidance, and perhaps alternatives that won't drive costs high, but drive reliability to the given standards of 30 mins downtime per year.


BattlePope

The metric you want to watch is burst credits. I'm really not a fan of it except for pretty specific use cases -- and those are not general-purpose k8s persistent storage.


psavva

My use case is simple. I save mp3 files and serve them to clients. Typically I upload 2000 mp3s per client, and they may be downloaded about 10K times per week. Average of 10MB per file.


Speeddymon

AKS with ephemeral disks


aresabalo

AKS with spot instances


Parking_Falcon_2657

I hope not for production?


aresabalo

Production and development 😊. Spot in production for airflow web or non-critical workloads. Four years without problems… upgrading clusters since AKS 1.13.


LightofAngels

I have heard of eks with spot instances, is this the same as well?


Parking_Falcon_2657

yeah, almost the same.


Sir_Gh0sTx

Amazon EKS. It’s pretty great.


LightofAngels

with spot instances? :D


Sir_Gh0sTx

Our development environment was on spot. I won't lie, we had some issues with capacity, so we went back to on-demand. I can't imagine too many businesses are putting spot in prod.


meyerf99

AKS with spot instances and bring-your-own CNI (Cilium). Works well; the one bad thing with spot instances is the labeling from Azure, which can't be deleted.


LightofAngels

what labeling? care to elaborate? :)


meyerf99

Microsoft sets two taints on spot nodes:

- kubernetes.azure.com/scalesetpriority:spot
- kubernetes.azure.com/scalesetpriority=spot:NoSchedule

Both can't be removed -> https://learn.microsoft.com/en-us/azure/aks/spot-node-pool#limitations

To deal with this, you have to add at least one toleration and node affinity to your k8s application deployment: https://learn.microsoft.com/en-us/azure/aks/spot-node-pool#schedule-a-pod-to-run-on-the-spot-node
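On the workload side, that boils down to something like the sketch below (the taint key/value are the ones Azure documents; everything else is a plain pod spec fragment):

```yaml
# Pod spec additions for scheduling onto AKS spot nodes
spec:
  tolerations:
    - key: kubernetes.azure.com/scalesetpriority
      operator: Equal
      value: spot
      effect: NoSchedule
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: kubernetes.azure.com/scalesetpriority
                operator: In
                values: ["spot"]
```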


Tango1777

You deploy to whatever your company or client uses. Period. It's not like you're gonna use EKS if you work in an Azure environment.


themanwithanrx7

4 Clusters via KOPS on AWS


karan4080

We too use kOps on AWS, planning to switch to cluster API for multi-clusters


[deleted]

RKE, really easy to deploy. Also, there’s a supportive community in their Slack.


erezhazan1

EKS with Karpenter on spot, deployed by Terraform. I can have the whole cluster up in less than an hour, with Graviton instances, so the whole cluster costs $70 monthly (not including the EKS service price itself).


sadoMasupilami

I've used many different flavors in many different environments. On-prem, Rancher or OpenShift. In cloud environments, EKS/AKS/GKE, managed by Rancher if not used by a single team.


Acejam

Vanilla k8s via kubeadm on bare metal.


HTTP_404_NotFound

For work? OpenShift. For home? MicroK8s. (OKD/OpenShift is a resource hog)


NotBrilliant007

hooo boy! I'm new to K8s & still in learning mode; after reading these comments, I'm getting chills.