Deploying a Highly Available Kubernetes Cluster

Category: Programming Tools · Published: 5 years ago


A Few Words Before We Start

Throughout the entire installation, avoid hard-coding IP addresses; use domain names in place of IPs wherever possible.

Many guides online build high availability with Keepalived + VIP. That approach has two drawbacks:

First, it is constrained by the user's network and does not work on SDN networks such as Alibaba Cloud VPC.

Second, although it is highly available, traffic still passes through a single point: the network I/O of every node is concentrated on one machine (the VIP holder). As the cluster gains nodes and pods, that single machine's network I/O inevitably becomes a liability.

The HA approach in this article works in any cloud SDN environment, such as Alibaba Cloud VPC, as well as in self-hosted data centers.
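Since everything below is addressed by name, name resolution has to work before anything else. If you have no internal DNS, one minimal sketch is an /etc/hosts entry on every machine. The server*.k8s.local names are this article's placeholders; the first two IPs appear later in this article's output, the third is an illustrative guess, and server.k8s.local can simply point at 127.0.0.1 because every machine runs its own local proxy:

```
# /etc/hosts (illustrative; substitute your own IPs)
192.168.0.230  server1.k8s.local
192.168.0.231  server2.k8s.local
192.168.0.232  server3.k8s.local
127.0.0.1      server.k8s.local    # the API endpoint resolves to the local Nginx proxy
```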

Overall Architecture

The VIP-based HA design most commonly seen online:

(diagram: VIP-based HA architecture)

The HA design used in this article, based on a local Nginx proxy:

(diagram: Nginx-local-proxy-based HA architecture)

Node List

(table of nodes; this article uses three machines, server1.k8s.local, server2.k8s.local and server3.k8s.local, which host both etcd and the Masters)

Why Upgrade the Kernel

The OS used throughout this article is Ubuntu 16.04 and all commands run as root. Upgrading the kernel is a hard requirement.

Lower kernel versions suffer from an extremely irritating bug that strikes intermittently; when it does, the whole OS hangs and no command can be executed.

The symptom looks like this:

kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1

This bug can be tracked in the upstream issue trackers.

Someone even published code that reproduces it: https://github.com/fho/docker-samba-loop

In my own experiments (and pitfalls), I spent roughly a month trying kernel versions 3.10, 4.4, 4.9, 4.12, 4.14 and 4.15, and every one of them reproduced the bug to some degree. Once triggered there is no remedy but a reboot (followed by praying it does not trigger again).

It made for genuinely sleepless nights, until I tried kernel 4.17. Since upgrading, the bug has not reappeared once from June up to now, so it seems reasonable to consider it fixed. I therefore strongly recommend upgrading the kernel to 4.17+.
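Given that experience, it is worth failing fast on an old kernel before going any further. A small sketch (the 4.17 floor comes from the account above) that compares version strings with `sort -V`:

```shell
#!/bin/sh
# Refuse to continue if the running kernel is older than 4.17.
required="4.17"
current="$(uname -r | cut -d- -f1)"   # e.g. "4.19.11"
# `sort -V` orders version strings numerically; if the smaller of the
# two is the requirement, the running kernel is new enough.
lowest="$(printf '%s\n%s\n' "$required" "$current" | sort -V | head -n1)"
if [ "$lowest" = "$required" ]; then
    echo "kernel $current OK (>= $required)"
else
    echo "kernel $current is too old, please upgrade to 4.17+" >&2
    exit 1
fi
```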

How Should I Size the Master Nodes?

For this question the Kubernetes project provides a configuration script (it includes recommended vCPU counts, disk sizes, pod IP ranges, service IP ranges, and so on).

Machine Environment

Upgrading the Kernel

You can find the latest released kernel versions on the official Linux kernel website.

Here we upgrade to 4.19.11, the newest kernel at the time of writing. The following files are needed for the upgrade:

linux-headers-4.19.11-041911_4.19.11-041911.201812191931_all.deb
linux-image-unsigned-4.19.11-041911-generic_4.19.11-041911.201812191931_amd64.deb
linux-modules-4.19.11-041911-generic_4.19.11-041911.201812191931_amd64.deb

Download the three kernel packages above to the local machine (here they are saved under the ~ directory), then install them with the following command.

➜ dpkg -i ~/*.deb
Selecting previously unselected package linux-headers-4.19.11-041911.
(Reading database ... 60576 files and directories currently installed.)
Preparing to unpack linux-headers-4.19.11-041911_4.19.11-041911.201812191931_all.deb ...
Unpacking linux-headers-4.19.11-041911 (4.19.11-041911.201812191931) ...
Selecting previously unselected package linux-image-unsigned-4.19.11-041911-generic.
Preparing to unpack linux-image-unsigned-4.19.11-041911-generic_4.19.11-041911.201812191931_amd64.deb ...
Unpacking linux-image-unsigned-4.19.11-041911-generic (4.19.11-041911.201812191931) ...
Selecting previously unselected package linux-modules-4.19.11-041911-generic.
Preparing to unpack linux-modules-4.19.11-041911-generic_4.19.11-041911.201812191931_amd64.deb ...
Unpacking linux-modules-4.19.11-041911-generic (4.19.11-041911.201812191931) ...
Setting up linux-headers-4.19.11-041911 (4.19.11-041911.201812191931) ...
Setting up linux-modules-4.19.11-041911-generic (4.19.11-041911.201812191931) ...
Setting up linux-image-unsigned-4.19.11-041911-generic (4.19.11-041911.201812191931) ...
I: /vmlinuz.old is now a symlink to boot/vmlinuz-4.4.0-131-generic
I: /initrd.img.old is now a symlink to boot/initrd.img-4.4.0-131-generic
I: /vmlinuz is now a symlink to boot/vmlinuz-4.19.11-041911-generic
I: /initrd.img is now a symlink to boot/initrd.img-4.19.11-041911-generic
Processing triggers for linux-image-unsigned-4.19.11-041911-generic (4.19.11-041911.201812191931) ...
/etc/kernel/postinst.d/initramfs-tools:
update-initramfs: Generating /boot/initrd.img-4.19.11-041911-generic
W: mdadm: /etc/mdadm/mdadm.conf defines no arrays.
/etc/kernel/postinst.d/x-grub-legacy-ec2:
Searching for GRUB installation directory ... found: /boot/grub
Searching for default file ... found: /boot/grub/default
Testing for an existing GRUB menu.lst file ... found: /boot/grub/menu.lst
Searching for splash image ... none found, skipping ...
Found kernel: /boot/vmlinuz-4.4.0-131-generic
Found kernel: /boot/vmlinuz-4.19.11-041911-generic
Found kernel: /boot/vmlinuz-4.4.0-131-generic
Replacing config file /run/grub/menu.lst with new version
Updating /boot/grub/menu.lst ... done

/etc/kernel/postinst.d/zz-update-grub:
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-4.19.11-041911-generic
Found initrd image: /boot/initrd.img-4.19.11-041911-generic
Found linux image: /boot/vmlinuz-4.4.0-131-generic
Found initrd image: /boot/initrd.img-4.4.0-131-generic
done

After the installation finishes, reboot the server, then verify the running kernel version:

➜ uname -a
Linux k8s 4.19.11-041911-generic #201812191931 SMP Wed Dec 19 19:33:33 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Clean up the old kernel (optional):

# list the old kernel packages
➜ dpkg --list | grep linux
# purge everything related to the old 4.4.0 kernel
➜ apt purge linux*4.4.0* -y

Enable the IPVS-related kernel modules

➜ module=(ip_vs
    ip_vs_rr
    ip_vs_wrr
    ip_vs_sh
    nf_conntrack)
for kernel_module in ${module[@]}; do
    /sbin/modinfo -F filename $kernel_module |& grep -qv ERROR && echo $kernel_module >> /etc/modules-load.d/ipvs.conf || :
done

# output like the following indicates the modules loaded successfully

➜ lsmod | grep ip_vs
ip_vs_sh               16384  0
ip_vs_wrr              16384  0
ip_vs_rr               16384  0
ip_vs                 147456  6 ip_vs_rr,ip_vs_sh,ip_vs_wrr
nf_conntrack          143360  6 xt_conntrack,nf_nat,ipt_MASQUERADE,nf_nat_ipv4,nf_conntrack_netlink,ip_vs
libcrc32c              16384  5 nf_conntrack,nf_nat,btrfs,raid456,ip_vs

Kernel Parameter Tuning

➜ cat > /etc/sysctl.conf << EOF
# https://github.com/moby/moby/issues/31208
# ipvsadm -l --timeout
# fix long-connection timeouts under IPVS mode; any value below 900 works
net.ipv4.tcp_keepalive_time = 800
net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_keepalive_probes = 10

net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
net.ipv4.neigh.default.gc_stale_time = 120
net.ipv4.conf.all.rp_filter = 0
net.ipv4.conf.default.rp_filter = 0
net.ipv4.conf.default.arp_announce = 2
net.ipv4.conf.lo.arp_announce = 2
net.ipv4.conf.all.arp_announce = 2
net.ipv4.ip_forward = 1
net.ipv4.tcp_max_tw_buckets = 5000
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_syn_backlog = 1024
net.ipv4.tcp_synack_retries = 2
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
fs.inotify.max_user_watches=89100
fs.file-max=52706963
fs.nr_open=52706963
net.bridge.bridge-nf-call-arptables = 1
vm.swappiness = 0
EOF

Disable swap and turn off the firewall

swapoff -a
sysctl -w vm.swappiness=0
sed -ri '/^[^#]*swap/s@^@#@' /etc/fstab
systemctl disable --now ufw

Install the Required Software

Configure the package mirror

We use Alibaba Cloud's apt mirror repository.

Configure it for Ubuntu 16.04:

➜ cat > /etc/apt/sources.list << EOF
deb http://mirrors.aliyun.com/ubuntu/ xenial main
deb-src http://mirrors.aliyun.com/ubuntu/ xenial main
deb http://mirrors.aliyun.com/ubuntu/ xenial-updates main
deb-src http://mirrors.aliyun.com/ubuntu/ xenial-updates main
deb http://mirrors.aliyun.com/ubuntu/ xenial universe
deb-src http://mirrors.aliyun.com/ubuntu/ xenial universe
deb http://mirrors.aliyun.com/ubuntu/ xenial-updates universe
deb-src http://mirrors.aliyun.com/ubuntu/ xenial-updates universe
deb http://mirrors.aliyun.com/ubuntu/ xenial-security main
deb-src http://mirrors.aliyun.com/ubuntu/ xenial-security main
deb http://mirrors.aliyun.com/ubuntu/ xenial-security universe
deb-src http://mirrors.aliyun.com/ubuntu/ xenial-security universe
EOF

Update the apt index and upgrade related packages

apt update && apt upgrade
apt -y install ipvsadm ipset apt-transport-https
apt -y install ca-certificates curl software-properties-common apt-transport-https

Install Docker CE

# Step 1: install the GPG key
curl -fsSL http://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo apt-key add -
# Step 2: add the repository
add-apt-repository "deb [arch=amd64] http://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable"
# Step 3: update and install Docker CE
apt-get -y update
apt-get -y install docker-ce

Configure Docker

touch /etc/docker/daemon.json
cat > /etc/docker/daemon.json <<EOF
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m",
    "max-file": "3"
  },
  "live-restore": true,
  "max-concurrent-downloads": 10,
  "max-concurrent-uploads": 10,
  "registry-mirrors": ["https://uo4pza0j.mirror.aliyuncs.com"],
  "storage-driver": "overlay2",
  "storage-opts": [
    "overlay2.override_kernel_check=true"
  ]
}
EOF
systemctl daemon-reload
systemctl restart docker

Deploy the Local Nginx Proxy

nginx.conf

The local Nginx proxy's job is to proxy access to all of the Master nodes. The nginx.conf looks like this:

mkdir -p /etc/nginx
# note: 'EOF' is quoted so the shell does not expand $remote_addr inside the heredoc
cat > /etc/nginx/nginx.conf << 'EOF'
worker_processes auto;
user root;
events {
    worker_connections 20240;
    use epoll;
}
error_log /var/log/nginx_error.log info;

stream {
    upstream kube-servers {
        hash $remote_addr consistent;
        server server1.k8s.local:6443 weight=5 max_fails=1 fail_timeout=10s;
        server server2.k8s.local:6443 weight=5 max_fails=1 fail_timeout=10s;
        server server3.k8s.local:6443 weight=5 max_fails=1 fail_timeout=10s;
    }

    server {
        listen 8443;
        proxy_connect_timeout 1s;
        proxy_timeout 3s;
        proxy_pass kube-servers;
    }
}
EOF

Start Nginx

➜ docker run --restart=always \
    -v /etc/apt/sources.list:/etc/apt/sources.list \
    -v /etc/nginx/nginx.conf:/etc/nginx/nginx.conf \
    --name kube_server_proxy \
    --net host \
    -it \
    -d \
    nginx

Note: make sure every machine in the Kubernetes cluster is running this proxy.

Deploy the etcd Cluster

Should etcd use TLS?

The point of TLS here is authentication: preventing arbitrary parties from connecting to your etcd cluster. In other words, if your etcd cluster is going to face the public internet with its ports open, I strongly recommend TLS.

If your etcd cluster runs on an internal network (for example a VPC), its ports are not exposed, and it sits behind a firewall on a secure LAN, then TLS is optional either way.

Caveats

--auto-compaction-retention

etcd stores multiple versions of its data, so historical revisions accumulate as writes come in and need periodic cleanup; by default history is never cleaned, and once the data reaches 2 GB no further writes are accepted until the history is compacted. Decide before going to production how often to compact. Compacting once an hour is recommended: it goes a long way toward keeping the cluster stable and reduces memory and disk usage.

--max-request-bytes

The maximum size of an etcd Raft message, 1.5 MB by default. Many workloads find 1.5 MB nowhere near enough when syncing data, so settle on an initial value up front. After the 1.5 MB default blocked metadata writes in our production environment, we did an emergency upgrade and raised the value to 32 MB; the official recommendation is 10 MB, so adjust for your own workload.

--quota-backend-bytes

The etcd database size limit, 2 GB by default; once the data reaches 2 GB, writes are rejected until the history is compacted. As noted above, decide the size before starting. The official recommendation is 8 GB, and that is the configuration we use here.
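For the record, the two magic numbers passed to the docker run commands below are plain MiB/GiB expansions; a quick shell check:

```shell
# --max-request-bytes=33554432 is 32 MiB; --quota-backend-bytes=8589934592 is 8 GiB.
max_request_bytes=$((32 * 1024 * 1024))
quota_backend_bytes=$((8 * 1024 * 1024 * 1024))
echo "$max_request_bytes $quota_backend_bytes"
```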

Installing etcd with Docker

Run the following on each of the machines you planned for etcd, one after another.

etcd1

mkdir -p /var/etcd
docker rm etcd1 -f
rm -rf /var/etcd
docker run --restart=always --net host -it --name etcd1 -d \
    -v /var/etcd:/var/etcd \
    -v /etc/localtime:/etc/localtime \
    registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.2.24 \
    etcd --name etcd-s1 \
    --auto-compaction-retention=1 --max-request-bytes=33554432 --quota-backend-bytes=8589934592 \
    --data-dir=/var/etcd/etcd-data \
    --listen-client-urls http://0.0.0.0:2379 \
    --listen-peer-urls http://0.0.0.0:2380 \
    --initial-advertise-peer-urls http://server1.k8s.local:2380 \
    --advertise-client-urls http://server1.k8s.local:2379,http://server1.k8s.local:2380 \
    --initial-cluster-token etcd-cluster \
    --initial-cluster "etcd-s1=http://server1.k8s.local:2380,etcd-s2=http://server2.k8s.local:2380,etcd-s3=http://server3.k8s.local:2380" \
    --initial-cluster-state new

etcd2

mkdir -p /var/etcd
docker rm etcd2 -f
rm -rf /var/etcd
docker run --restart=always --net host -it --name etcd2 -d \
    -v /var/etcd:/var/etcd \
    -v /etc/localtime:/etc/localtime \
    registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.2.24 \
    etcd --name etcd-s2 \
    --auto-compaction-retention=1 --max-request-bytes=33554432 --quota-backend-bytes=8589934592 \
    --data-dir=/var/etcd/etcd-data \
    --listen-client-urls http://0.0.0.0:2379 \
    --listen-peer-urls http://0.0.0.0:2380 \
    --initial-advertise-peer-urls http://server2.k8s.local:2380 \
    --advertise-client-urls http://server2.k8s.local:2379,http://server2.k8s.local:2380 \
    --initial-cluster-token etcd-cluster \
    --initial-cluster "etcd-s1=http://server1.k8s.local:2380,etcd-s2=http://server2.k8s.local:2380,etcd-s3=http://server3.k8s.local:2380" \
    --initial-cluster-state new

etcd3

mkdir -p /var/etcd
docker rm etcd3 -f
rm -rf /var/etcd
docker run --restart=always --net host -it --name etcd3 -d \
    -v /var/etcd:/var/etcd \
    -v /etc/localtime:/etc/localtime \
    registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.2.24 \
    etcd --name etcd-s3 \
    --auto-compaction-retention=1 --max-request-bytes=33554432 --quota-backend-bytes=8589934592 \
    --data-dir=/var/etcd/etcd-data \
    --listen-client-urls http://0.0.0.0:2379 \
    --listen-peer-urls http://0.0.0.0:2380 \
    --initial-advertise-peer-urls http://server3.k8s.local:2380 \
    --advertise-client-urls http://server3.k8s.local:2379,http://server3.k8s.local:2380 \
    --initial-cluster-token etcd-cluster \
    --initial-cluster "etcd-s1=http://server1.k8s.local:2380,etcd-s2=http://server2.k8s.local:2380,etcd-s3=http://server3.k8s.local:2380" \
    --initial-cluster-state new

Verify

➜ etcdctl member list
410feb26f4fa3c7f: name=etcd-s1 peerURLs=http://server1.k8s.local:2380 clientURLs=http://server1.k8s.local:2379,http://server1.k8s.local:2380
56fa117fc503543c: name=etcd-s3 peerURLs=http://server3.k8s.local:2380 clientURLs=http://server3.k8s.local:2379,http://server3.k8s.local:2380
bc4d900274366497: name=etcd-s2 peerURLs=http://server2.k8s.local:2380 clientURLs=http://server2.k8s.local:2379,http://server2.k8s.local:2380



➜ etcdctl cluster-health
member 410feb26f4fa3c7f is healthy: got healthy result from http://server1.k8s.local:2379
member 56fa117fc503543c is healthy: got healthy result from http://server3.k8s.local:2379
member bc4d900274366497 is healthy: got healthy result from http://server2.k8s.local:2379
cluster is healthy

Deploy the Masters

Install the base Kubernetes components

About kubeadm's auto-signed certificates expiring after one year

A Google search turned up a rather good approach.

Following that post, you pull down the specific Kubernetes version, tweak the certificate expiry (for example to 99 years out), and then compile kubeadm yourself. In my testing the build is very fast, done in just a few seconds.
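As a rough sketch of what that looks like (the file path and constant name here are from memory of the v1.13 source tree, so treat them as assumptions and verify against the code you actually check out; building requires a working Go toolchain):

```shell
# Check out the exact release you intend to run.
git clone --depth 1 -b v1.13.1 https://github.com/kubernetes/kubernetes.git
cd kubernetes
# Extend the certificate lifetime from 1 year to 99 years.
# (Assumed location of the constant in the v1.13 tree.)
sed -i 's/\(CertificateValidity = time.Hour \* 24 \* 365\)$/\1 * 99/' \
    cmd/kubeadm/app/constants/constants.go
# Build only kubeadm; the binary lands under _output/bin/, swap it in
# for the distro-installed kubeadm afterwards.
make WHAT=cmd/kubeadm
```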

Configure the package source

Again we use the Alibaba Cloud Kubernetes mirror to speed up installation of the base components.

curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -
cat > /etc/apt/sources.list.d/kubernetes.list << EOF
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF

Install kubeadm, kubelet and kubectl

If you compiled kubeadm yourself, remember to substitute your own build here.

apt-get update
apt-get install kubeadm kubelet kubectl

Set the kubelet pause image

On Ubuntu the kubelet config file lives at /etc/systemd/system/kubelet.service.d/10-kubeadm.conf.

cat > /etc/default/kubelet << EOF
KUBELET_EXTRA_ARGS="--cgroup-driver=cgroupfs --pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.1"
EOF
systemctl daemon-reload
systemctl enable kubelet && systemctl restart kubelet

kubeadm-config.yaml

The kubeadm config file format changed considerably as of 1.13.0.

You can print the default configuration in YAML form with the following commands.

View the ClusterConfiguration defaults:

# kubeadm config print-default --api-objects ClusterConfiguration

View the KubeProxyConfiguration defaults:

# kubeadm config print-default --api-objects KubeProxyConfiguration

View the KubeletConfiguration defaults:

# kubeadm config print-default --api-objects KubeletConfiguration

kubeadm-config.yaml:

# kubeadm init --config=
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
kubernetesVersion: v1.13.1
#useHyperKubeImage: true
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
apiServer:
  certSANs:
  - "server.k8s.local"
networking:
  serviceSubnet: 10.96.0.0/12
  podSubnet: 10.244.0.0/16
controlPlaneEndpoint: server.k8s.local:8443
etcd:
  external:
    endpoints:
    - http://server1.k8s.local:2379
    - http://server2.k8s.local:2379
    - http://server3.k8s.local:2379
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
ipvs:
  scheduler: rr
  syncPeriod: 10s

Pre-pull the images

➜ kubeadm config images pull --config kubeadm-config.yaml

[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.13.1

[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.13.1

[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.13.1

[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.13.1

[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.1

[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.2.6

Initialize the Master

With the images pre-pulled, this step runs quickly and smoothly.

A successful cluster initialization looks like this:

## initialize the Master

➜ kubeadm init --config kubeadm-config.yaml

[init] Using Kubernetes version: v1.13.1

[preflight] Running pre-flight checks

[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 18.09.0. Latest validated version: 18.06

[preflight] Pulling images required for setting up a Kubernetes cluster

[preflight] This might take a minute or two, depending on the speed of your internet connection

[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'

[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"

[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"

[kubelet-start] Activating the kubelet service

[certs] Using certificateDir folder "/etc/kubernetes/pki"

[certs] External etcd mode: Skipping etcd/ca certificate authority generation

[certs] External etcd mode: Skipping etcd/server certificate authority generation

[certs] External etcd mode: Skipping apiserver-etcd-client certificate authority generation

[certs] External etcd mode: Skipping etcd/peer certificate authority generation

[certs] External etcd mode: Skipping etcd/healthcheck-client certificate authority generation

[certs] Generating "ca" certificate and key

[certs] Generating "apiserver-kubelet-client" certificate and key

[certs] Generating "apiserver" certificate and key

[certs] apiserver serving cert is signed for DNS names [server1.k8s.local kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local server.k8s.local server.k8s.local] and IPs [10.96.0.1 192.168.0.230]

[certs] Generating "front-proxy-ca" certificate and key

[certs] Generating "front-proxy-client" certificate and key

[certs] Generating "sa" key and public key

[kubeconfig] Using kubeconfig folder "/etc/kubernetes"

[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address

[kubeconfig] Writing "admin.conf" kubeconfig file

[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address

[kubeconfig] Writing "kubelet.conf" kubeconfig file

[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address

[kubeconfig] Writing "controller-manager.conf" kubeconfig file

[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address

[kubeconfig] Writing "scheduler.conf" kubeconfig file

[control-plane] Using manifest folder "/etc/kubernetes/manifests"

[control-plane] Creating static Pod manifest for "kube-apiserver"

[control-plane] Creating static Pod manifest for "kube-controller-manager"

[control-plane] Creating static Pod manifest for "kube-scheduler"

[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s

[apiclient] All control plane components are healthy after 18.509223 seconds

[uploadconfig] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace

[kubelet] Creating a ConfigMap "kubelet-config-1.13" in namespace kube-system with the configuration for the kubelets in the cluster

[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "server1.k8s.local" as an annotation

[mark-control-plane] Marking the node server1.k8s.local as control-plane by adding the label "node-role.kubernetes.io/master=''"

[mark-control-plane] Marking the node server1.k8s.local as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]

[bootstrap-token] Using token: qiqcg7.8kg2v7txawdf6ojh

[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles

[bootstraptoken] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials

[bootstraptoken] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token

[bootstraptoken] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster

[bootstraptoken] creating the "cluster-info" ConfigMap in the "kube-public" namespace

[addons] Applied essential addon: CoreDNS

[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address

[addons] Applied essential addon: kube-proxy



Your Kubernetes master has initialized successfully!



To start using your cluster, you need to run the following as a regular user:



mkdir -p $HOME/.kube

sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config

sudo chown $(id -u):$(id -g) $HOME/.kube/config



You should now deploy a pod network to the cluster.

Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:

https://kubernetes.io/docs/concepts/cluster-administration/addons/



You can now join any number of machines by running the following on each node

as root:



kubeadm join server.k8s.local:8443 --token qiqcg7.8kg2v7txawdf6ojh --discovery-token-ca-cert-hash sha256:039b3de841b63309983911c890c967fa167c5be5a713fe0f9b6f5f4eda74b70a
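A side note on the join command above: the --discovery-token-ca-cert-hash value does not have to be copied out of this output. It can be recomputed from the cluster CA at any time, using the standard recipe from the kubeadm documentation:

```shell
# Prints the sha256 hash of the CA's public key, i.e. the value expected by
# --discovery-token-ca-cert-hash (prefix it with "sha256:").
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt \
  | openssl rsa -pubin -outform der 2>/dev/null \
  | openssl dgst -sha256 -hex \
  | sed 's/^.* //'
```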

Once startup completes you can find the kubeadm-generated certificates under /etc/kubernetes/pki/. The following commands display certificate details, including the expiry time.

You can clearly see the validity now extends 99 years.

➜  openssl x509 -in /etc/kubernetes/pki/ca.crt -noout -text

Certificate:

    ............

    Validity

        Not Before: Dec 25 15:55:21 2018 GMT

        Not After : Dec  1 15:55:21 2117 GMT

    Subject: CN=kubernetes

    Subject Public Key Info:

     ............

➜ openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -text

➜ openssl x509 -in /etc/kubernetes/pki/apiserver-kubelet-client.crt -noout -text

➜ openssl x509 -in /etc/kubernetes/pki/front-proxy-ca.crt -noout -text

Deploy the Remaining HA Masters

This section is largely based on the official documentation.

It is advisable to set up SSH public-key authentication between the Master nodes first.

Distribute the certificates to the other Masters:

# customizable
#mkdir -p /etc/kubernetes/pki/etcd
USER=root
CONTROL_PLANE_IPS="server2.k8s.local server3.k8s.local"
for host in ${CONTROL_PLANE_IPS}; do
    scp /etc/kubernetes/pki/ca.crt "${USER}"@$host:/etc/kubernetes/pki/
    scp /etc/kubernetes/pki/ca.key "${USER}"@$host:/etc/kubernetes/pki/
    scp /etc/kubernetes/pki/sa.key "${USER}"@$host:/etc/kubernetes/pki/
    scp /etc/kubernetes/pki/sa.pub "${USER}"@$host:/etc/kubernetes/pki/
    scp /etc/kubernetes/pki/front-proxy-ca.crt "${USER}"@$host:/etc/kubernetes/pki/
    scp /etc/kubernetes/pki/front-proxy-ca.key "${USER}"@$host:/etc/kubernetes/pki/
    scp /etc/kubernetes/admin.conf "${USER}"@$host:/etc/kubernetes/
#    scp /etc/kubernetes/pki/etcd/ca.crt "${USER}"@$host:/etc/kubernetes/pki/etcd/ca.crt
#    scp /etc/kubernetes/pki/etcd/ca.key "${USER}"@$host:/etc/kubernetes/pki/etcd/ca.key
done

Join the additional Master nodes:

kubeadm 1.13 introduces a new experimental flag, --experimental-control-plane.

Appending it to the kubeadm join command is all it takes to add another Master.

Run the following on Masters 2 and 3; a successful join produces output like this:

➜ kubeadm join server.k8s.local:8443 --token qiqcg7.8kg2v7txawdf6ojh --discovery-token-ca-cert-hash sha256:039b3de841b63309983911c890c967fa167c5be5a713fe0f9b6f5f4eda74b70a --experimental-control-plane

[preflight] Running pre-flight checks

[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 18.09.0. Latest validated version: 18.06

[discovery] Trying to connect to API Server "server.k8s.local:8443"

[discovery] Created cluster-info discovery client, requesting info from "https://server.k8s.local:8443"

[discovery] Requesting info from "https://server.k8s.local:8443" again to validate TLS against the pinned public key

[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "server.k8s.local:8443"

[discovery] Successfully established connection with API Server "server.k8s.local:8443"

[join] Reading configuration from the cluster...

[join] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'

[join] Running pre-flight checks before initializing the new control plane instance

[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 18.09.0. Latest validated version: 18.06

[certs] Generating "apiserver" certificate and key

[certs] apiserver serving cert is signed for DNS names [server2.k8s.local kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local server.k8s.local server.k8s.local] and IPs [10.96.0.1 192.168.0.231]

[certs] Generating "apiserver-kubelet-client" certificate and key

[certs] Generating "front-proxy-client" certificate and key

[certs] valid certificates and keys now exist in "/etc/kubernetes/pki"

[certs] Using the existing "sa" key

[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address

[kubeconfig] Using existing up-to-date kubeconfig file: "/etc/kubernetes/admin.conf"

[kubeconfig] Writing "controller-manager.conf" kubeconfig file

[kubeconfig] Writing "scheduler.conf" kubeconfig file

[kubelet] Downloading configuration for the kubelet from the "kubelet-config-1.13" ConfigMap in the kube-system namespace

[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"

[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"

[kubelet-start] Activating the kubelet service

[tlsbootstrap] Waiting for the kubelet to perform the TLS Bootstrap...

[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "server2.k8s.local" as an annotation

[uploadconfig] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace

[mark-control-plane] Marking the node server2.k8s.local as control-plane by adding the label "node-role.kubernetes.io/master=''"

[mark-control-plane] Marking the node server2.k8s.local as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]



This node has joined the cluster and a new control plane instance was created:
  • Certificate signing request was sent to apiserver and approval was received.
  • The Kubelet was informed of the new secure connection details.
  • Master label and taint were applied to the new node.
  • The Kubernetes control plane instances scaled up.

To start administering your cluster from this node, you need to run the following as a regular user:

mkdir -p $HOME/.kube

sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config

sudo chown $(id -u):$(id -g) $HOME/.kube/config

Run 'kubectl get nodes' to see this node join the cluster.

Check the Cluster Status

Set up kubeconfig:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

List the nodes: all three of our Master nodes are up. Their status is NotReady, which is nothing to worry about; it is simply because we have not yet installed a network plugin. That is covered in the next chapter.

➜ kubectl get node

NAME                STATUS     ROLES    AGE     VERSION

server1.k8s.local   NotReady   master   17m     v1.13.1

server2.k8s.local   NotReady   master   3m10s   v1.13.1

server3.k8s.local   NotReady   master   2m56s   v1.13.1

The coredns static pods show ContainerCreating, again because of the missing network plugin; we cover that in the next chapter as well.

➜  kubectl get pods -nkube-system

NAME                                        READY   STATUS              RESTARTS   AGE

coredns-89cc84847-2s5xq                     0/1     ContainerCreating   0          16m

coredns-89cc84847-4cbqf                     0/1     ContainerCreating   0          16m

kube-apiserver-server1.k8s.local            1/1     Running             0          16m

kube-apiserver-server2.k8s.local            1/1     Running             0          2m29s

kube-apiserver-server3.k8s.local            1/1     Running             0          2m14s

kube-controller-manager-server1.k8s.local   1/1     Running             0          16m

kube-controller-manager-server2.k8s.local   1/1     Running             0          2m29s

kube-controller-manager-server3.k8s.local   1/1     Running             0          2m14s

kube-proxy-5mgrq                            1/1     Running             0          16m

kube-proxy-b6wpc                            1/1     Running             0          2m29s

kube-proxy-j7gnq                            1/1     Running             0          2m15s

kube-scheduler-server1.k8s.local            1/1     Running             0          16m

kube-scheduler-server2.k8s.local            1/1     Running             0          2m29s

kube-scheduler-server3.k8s.local            1/1     Running             0          2m14s
