
Kubernetes Deployment and Installation


This article draws on CloudMan's 每天5分钟玩转 Docker 容器技术 (Learn Docker Container Technology in 5 Minutes a Day) and the Shangguigu (尚硅谷) video course on Bilibili.

Concepts

Before getting hands-on, we need to learn a few key Kubernetes concepts; they are the building blocks of a Kubernetes cluster.

Cluster

A Cluster is a collection of compute, storage, and network resources that Kubernetes uses to run container-based applications.

Master

The Master is the brain of the Cluster. Its main job is scheduling: deciding where each application runs. The Master runs Linux and can be a physical machine or a VM. For high availability, you can run multiple Masters.

Node

A Node's job is to run container applications. Nodes are managed by the Master: they monitor and report container status, and manage container lifecycles as directed by the Master. Nodes also run Linux and can be physical machines or VMs.

Pod

A Pod is the smallest unit of work in Kubernetes. Each Pod contains one or more containers, and the containers in a Pod are scheduled by the Master onto a Node as a single unit.

Kubernetes introduces the Pod for two main reasons:

Manageability.

Some containers are inherently tightly coupled and need to work together. The Pod provides a higher level of abstraction than a single container, wrapping them into one deployment unit. Kubernetes schedules, scales, shares resources, and manages lifecycles with the Pod as its smallest unit.

Communication and resource sharing.

All containers in a Pod share the same network namespace, i.e., the same IP address and port space, so they can talk to each other directly over localhost. They can also share storage: when Kubernetes mounts a volume into a Pod, it is effectively mounted into every container in that Pod.
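
As a minimal sketch of this sharing (the manifest below is illustrative; the names nginx-sidecar-demo and shared-data are not from the original): two containers in one Pod mount the same volume, and either can reach the other over localhost.

# Two containers in one Pod: one network namespace, one shared emptyDir volume
cat << EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: nginx-sidecar-demo
spec:
  containers:
  - name: web
    image: nginx
    volumeMounts:
    - name: shared-data
      mountPath: /usr/share/nginx/html
  - name: sidecar
    image: busybox
    command: ["sh", "-c", "echo hello from sidecar > /data/index.html; sleep 3600"]
    volumeMounts:
    - name: shared-data
      mountPath: /data
  volumes:
  - name: shared-data
    emptyDir: {}
EOF

Once it is running, kubectl exec nginx-sidecar-demo -c sidecar -- wget -qO- localhost fetches the page the web container serves, since both containers share one IP.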

Controller

Kubernetes usually does not create Pods directly; instead, it manages them through Controllers. A Controller defines a Pod's deployment characteristics, such as how many replicas to run and on what kind of Node. To cover different scenarios, Kubernetes provides several kinds of Controller, including Deployment, ReplicaSet, DaemonSet, StatefulSet, and Job; we discuss them one by one.

Deployment

Deployment is the most commonly used Controller; the online tutorial mentioned earlier, for example, deploys applications by creating Deployments. A Deployment can manage multiple replicas of a Pod and ensures the Pods run in the desired state.
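
For illustration, the declarative form of such a Deployment might look like this (a minimal sketch; nginx-deployment and the app=nginx label are hypothetical names):

# Three nginx replicas managed by one Deployment
cat << EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
EOF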

ReplicaSet

ReplicaSet implements multi-replica management for Pods. Using a Deployment automatically creates a ReplicaSet; in other words, a Deployment manages its Pod replicas through a ReplicaSet, so we usually do not use ReplicaSets directly.
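
After creating a Deployment (such as the sketch above), this ownership chain is directly visible; the hash suffixes in the ReplicaSet and Pod names are generated, so yours will differ:

# The Deployment owns a ReplicaSet, which in turn owns the Pods
kubectl get deployment,replicaset,pods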

DaemonSet

DaemonSet is for scenarios where each Node runs at most one replica of a Pod. As the name suggests, DaemonSets are typically used to run daemons.
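
The cluster built later in this article contains ready-made examples: kube-proxy and the flannel agent each run as a DaemonSet, one Pod per Node. Once the cluster is up, you can confirm this with:

# System daemons run one Pod per Node via DaemonSets
kubectl get daemonset -n kube-system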

StatefulSet

StatefulSet guarantees that each Pod replica keeps the same name over its entire lifecycle, which the other Controllers do not: with them, when a Pod fails and must be deleted and restarted, its name changes. A StatefulSet also guarantees that replicas start, update, and terminate in a fixed order.
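
A minimal sketch (the names web and nginx-headless are illustrative): a StatefulSet requires a governing headless Service, and its Pods come up in order with the stable names web-0, web-1.

# StatefulSet Pods get stable, ordered names: web-0, web-1
cat << EOF | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: nginx-headless
spec:
  clusterIP: None
  selector:
    app: web
  ports:
  - port: 80
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: nginx-headless
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: nginx
        image: nginx
EOF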

Job

Job is for applications that run to completion and then exit, whereas the Pods managed by the other Controllers typically run continuously.
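
A minimal Job sketch (the classic pi example; the name is illustrative): the Pod runs once to completion, and the Job records the success rather than restarting it.

# A run-to-completion workload: compute pi to 100 digits, then exit
cat << EOF | kubectl apply -f -
apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(100)"]
      restartPolicy: Never
EOF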

Service

A Deployment can run multiple replicas, each Pod with its own IP; so how does the outside world access these replicas?

Through the Pods' IPs?
Keep in mind that Pods are likely to be destroyed and restarted frequently, and their IPs change when that happens, so accessing them by IP is not realistic.

The answer is the Service.
A Kubernetes Service defines how the outside world accesses a specific group of Pods. A Service has its own IP and port, and it load-balances across the Pods.

In Kubernetes, the two tasks of running containers (Pods) and accessing containers (Pods) are handled by Controllers and Services respectively.
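
A minimal Service sketch matching the Deployment example above (nginx-service is a hypothetical name): the selector picks out the Pods, and the Service's ClusterIP stays fixed no matter how often those Pods are recreated.

# A stable virtual IP load-balancing across the app=nginx Pods
cat << EOF | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  ports:
  - port: 80
    targetPort: 80
EOF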

Namespace

If multiple users or teams share one Kubernetes Cluster, how do we keep the Controllers, Pods, and other resources they create separate?

The answer is the Namespace.
A Namespace logically divides one physical Cluster into multiple virtual Clusters, each of which is a Namespace. Resources in different Namespaces are completely isolated.

Kubernetes creates two Namespaces by default: default (where resources go when no namespace is specified) and kube-system (for resources created by Kubernetes itself).
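
For example (dev is an illustrative name):

# List namespaces, create one, and scope queries to it
kubectl get namespace
kubectl create namespace dev
kubectl get pods -n dev   # resources in other namespaces stay hidden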

Environment Setup

  • 3 VPS instances:
    172.16.48.133 master
    172.16.48.132 node2
    172.16.48.131 node3

Setting Up a Kubernetes Cluster (kubeadm)

Initialize the Environment

# Install packages
yum -y install wget
# Disable the firewall
systemctl stop firewalld
systemctl disable firewalld
# Disable SELinux
sed -i 's/enforcing/disabled/' /etc/selinux/config # permanent
setenforce 0 # temporary
# Disable swap
swapoff -a # temporary
sed -ri 's/.*swap.*/#&/' /etc/fstab # permanent
# Set the hostname on each machine
hostnamectl set-hostname <hostname>
# Add hosts entries
cat >> /etc/hosts << EOF
172.16.48.133 master
172.16.48.132 node2
172.16.48.131 node3
EOF
# Pass bridged IPv4 traffic to iptables chains
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system # apply (sysctl -p only loads /etc/sysctl.conf, not files under /etc/sysctl.d/)
# Synchronize time
yum -y install ntpdate
ntpdate time.windows.com
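
Before continuing, it can be worth confirming the settings took effect (a quick sanity check, not part of the original steps; run it on each node and adjust the hostname being pinged):

# Verify: SELinux permissive/disabled, swap gone, hosts entries resolving
getenforce
free -h | grep -i swap   # total should show 0
hostname
ping -c 1 node2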

Install Docker/kubeadm/kubelet on All Nodes

Install Docker

# Set up the repository
wget https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo -O /etc/yum.repos.d/docker-ce.repo
# Install the Docker Engine
# This installs the latest Docker Engine and containerd
yum -y install docker-ce
systemctl restart docker
systemctl enable docker

Configure the Aliyun Docker registry mirror

cat > /etc/docker/daemon.json << EOF
{
  "registry-mirrors": ["https://b9pmyelo.mirror.aliyuncs.com"]
}
EOF
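
daemon.json is only read when the daemon starts, so restart Docker once more for the mirror to take effect:

# Apply the new daemon.json
systemctl restart docker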

Add the Kubernetes YUM Repository

cat > /etc/yum.repos.d/kubernetes.repo << EOF 
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

Install kubeadm, kubelet, and kubectl

yum install -y kubelet-1.23.0 kubeadm-1.23.0 kubectl-1.23.0
systemctl enable kubelet

Deploy the Kubernetes Master

# Run on the master node
kubeadm init \
--apiserver-advertise-address=172.16.48.133 \
--image-repository registry.aliyuncs.com/google_containers \
--kubernetes-version v1.23.0 \
--service-cidr=10.96.0.0/12 \
--pod-network-cidr=10.10.0.0/16 \
--ignore-preflight-errors=all
# To start over after a failed attempt, reset first
kubeadm reset

# On success:
Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 172.16.48.133:6443 --token f3z0w6.irdjdjagjvou8m69 \
	--discovery-token-ca-cert-hash sha256:bbb11a57a6fbdf40d314d15f1651197ff5755d344e858c434a5eb4e1eb8f0a08

# As prompted, create the kubeconfig (and run the join command on each worker)
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# View nodes
kubectl get nodes
  • apiserver-advertise-address: the address the cluster is advertised on
  • image-repository: the default registry k8s.gcr.io is unreachable from mainland China, so point at the Aliyun mirror instead
  • kubernetes-version: the K8s version, matching the packages installed above
  • service-cidr: the cluster-internal virtual network that serves as the unified entry point for reaching Pods (the Service network)
  • pod-network-cidr: the Pod network; it must match the CNI component's YAML deployed below (see the sketch after this list)
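
One caveat on the last point: the stock kube-flannel.yml defaults its Network to 10.244.0.0/16, while we passed --pod-network-cidr=10.10.0.0/16 above. A sketch of aligning the two before applying the manifest (assuming the manifest still carries the 10.244.0.0/16 default):

# Download the flannel manifest and point its Network at our Pod CIDR
wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
sed -i 's#10.244.0.0/16#10.10.0.0/16#' kube-flannel.yml
grep -A3 net-conf.json kube-flannel.yml   # verify the change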

Join the Worker Nodes to the Cluster

kubeadm join 172.16.48.133:6443 --token f3z0w6.irdjdjagjvou8m69 \
	--discovery-token-ca-cert-hash sha256:bbb11a57a6fbdf40d314d15f1651197ff5755d344e858c434a5eb4e1eb8f0a08
# On success:
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

Check on the Master

# The network plugin is not installed yet, so every node is NotReady
[root@master ~]# kubectl get nodes
NAME     STATUS     ROLES                  AGE     VERSION
master   NotReady   control-plane,master   17h     v1.23.0
node2    NotReady   <none>                 16s     v1.23.0
node3    NotReady   <none>                 4m50s   v1.23.0

The default token is valid for 24 hours. Once it expires, it can no longer be used and a new one must be created:

kubeadm token create --print-join-command
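
To see which tokens currently exist and when they expire:

kubeadm token list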

Deploy the Network Plugin

For the Kubernetes Cluster to work, a Pod network must be installed; otherwise Pods cannot communicate with one another.

kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
# View pods
[root@master ~]# kubectl get pods -n kube-system
NAME                             READY   STATUS    RESTARTS      AGE
coredns-6d8c4cb4d-cnnlz          0/1     Running   0             17h
coredns-6d8c4cb4d-sqzww          0/1     Running   0             17h
etcd-master                      1/1     Running   0             17h
kube-apiserver-master            1/1     Running   1 (17h ago)   17h
kube-controller-manager-master   1/1     Running   4 (10h ago)   17h
kube-flannel-ds-hp8nm            1/1     Running   0             13m
kube-flannel-ds-lpkk4            1/1     Running   0             13m
kube-flannel-ds-q5zfq            1/1     Running   0             13m
kube-proxy-557zf                 1/1     Running   0             17h
kube-proxy-7dfqb                 1/1     Running   0             40m
kube-proxy-nb5m6                 1/1     Running   0             45m
kube-scheduler-master            1/1     Running   4 (10h ago)   17h
# Check node status
[root@master ~]# kubectl get nodes
NAME     STATUS   ROLES                  AGE   VERSION
master   Ready    control-plane,master   17h   v1.23.0
node2    Ready    <none>                 40m   v1.23.0
node3    Ready    <none>                 45m   v1.23.0

Test the Kubernetes Cluster

Create a pod in the Kubernetes cluster and verify that it runs correctly:

$ kubectl create deployment nginx --image=nginx
$ kubectl expose deployment nginx --port=80 --type=NodePort 
$ kubectl get pod,svc

Access URL: http://NodeIP:Port
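
The NodePort is assigned randomly from the 30000-32767 range; read it from the PORT(S) column of the svc output and test from outside (the placeholder below is not a value from the original):

# Look up the assigned NodePort, then hit any node's IP on that port
kubectl get svc nginx
curl http://172.16.48.132:<NodePort>   # e.g. PORT(S) 80:31234/TCP means <NodePort> is 31234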

Errors

Initialization Error

[root@master ~]# kubeadm init --apiserver-advertise-address 172.16.48.133 --image-repository registry.aliyuncs.com/google_containers --kubernetes-version v1.24.0 --service-cidr=10.96.0.0/12 --pod-network-cidr=10.10.0.0/16
[init] Using Kubernetes version: v1.24.0
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local master] and IPs [10.96.0.1 172.16.48.133]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [localhost master] and IPs [172.16.48.133 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost master] and IPs [172.16.48.133 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
Unfortunately, an error has occurred:
	timed out waiting for the condition

This error is likely caused by:
	- The kubelet is not running
	- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
	- 'systemctl status kubelet'
	- 'journalctl -xeu kubelet'

Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all running Kubernetes containers by using crictl:
	- 'crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
	Once you have found the failing container, you can inspect its logs with:
	- 'crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher

Adding the --ignore-preflight-errors=all flag shows the detailed errors:

[root@master ~]# kubeadm init --apiserver-advertise-address=172.16.48.133 --image-repository registry.aliyuncs.com/google_containers --kubernetes-version v1.24.0 --service-cidr=10.96.0.0/12 --pod-network-cidr=10.10.0.0/16   --ignore-preflight-errors=all
[init] Using Kubernetes version: v1.24.0
[preflight] Running pre-flight checks
	[WARNING FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists
	[WARNING FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists
	[WARNING FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists
	[WARNING FileAvailable--etc-kubernetes-manifests-etcd.yaml]: /etc/kubernetes/manifests/etcd.yaml already exists
	[WARNING CRI]: container runtime is not running: output: time="2022-05-07T20:22:31+08:00" level=fatal msg="getting status of runtime: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService"
, error: exit status 1
	[WARNING Port-10250]: Port 10250 is in use
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
	[WARNING ImagePull]: failed to pull image registry.aliyuncs.com/google_containers/kube-apiserver:v1.24.0: output: time="2022-05-07T20:22:31+08:00" level=fatal msg="pulling image: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService"
, error: exit status 1
	[WARNING ImagePull]: failed to pull image registry.aliyuncs.com/google_containers/kube-controller-manager:v1.24.0: output: time="2022-05-07T20:22:31+08:00" level=fatal msg="pulling image: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService"
, error: exit status 1
	[WARNING ImagePull]: failed to pull image registry.aliyuncs.com/google_containers/kube-scheduler:v1.24.0: output: time="2022-05-07T20:22:31+08:00" level=fatal msg="pulling image: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService"
, error: exit status 1
	[WARNING ImagePull]: failed to pull image registry.aliyuncs.com/google_containers/kube-proxy:v1.24.0: output: time="2022-05-07T20:22:31+08:00" level=fatal msg="pulling image: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService"
, error: exit status 1
	[WARNING ImagePull]: failed to pull image registry.aliyuncs.com/google_containers/pause:3.7: output: time="2022-05-07T20:22:31+08:00" level=fatal msg="pulling image: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService"
, error: exit status 1
	[WARNING ImagePull]: failed to pull image registry.aliyuncs.com/google_containers/etcd:3.5.3-0: output: time="2022-05-07T20:22:31+08:00" level=fatal msg="pulling image: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService"
, error: exit status 1
	[WARNING ImagePull]: failed to pull image registry.aliyuncs.com/google_containers/coredns:v1.8.6: output: time="2022-05-07T20:22:31+08:00" level=fatal msg="pulling image: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService"
, error: exit status 1

Investigation showed the image mirror did not serve v1.24 (v1.24 also removed dockershim, so kubeadm talks to the container runtime over CRI directly, which is what the runtime.v1alpha2 errors above indicate). Downgrade to a lower version; re-initialization then succeeds:

yum -y remove kubelet kubeadm kubectl
yum install -y kubelet-1.23.0 kubeadm-1.23.0 kubectl-1.23.0

Node Join Error

[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
error execution phase kubelet-start: error uploading crisocket: nodes "node3" not found

Add "exec-opts": ["native.cgroupdriver=systemd"] to /etc/docker/daemon.json (the kubelet expects the systemd cgroup driver, while Docker defaults to cgroupfs):

cat /etc/docker/daemon.json
{
  "registry-mirrors": ["https://b9pmyelo.mirror.aliyuncs.com"],
  "exec-opts":["native.cgroupdriver=systemd"]
}
# Restart docker
systemctl restart docker
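
After changing the cgroup driver, the half-finished join on the node usually needs to be cleaned up before retrying (a sketch of the typical recovery, not from the original log):

# On the failed node: wipe the partial join state, then join again
kubeadm reset -f
kubeadm join 172.16.48.133:6443 --token f3z0w6.irdjdjagjvou8m69 \
	--discovery-token-ca-cert-hash sha256:bbb11a57a6fbdf40d314d15f1651197ff5755d344e858c434a5eb4e1eb8f0a08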