Investigating why containers other than the sandbox are recreated when kubelet is upgraded from 1.7.16 to 1.9.11


Summary: after directly replacing a 1.7.16 kubelet with the 1.9.11 binary (adjusting the startup flags accordingly), every container except the pause/sandbox container is recreated when the new kubelet starts. This article traces the cause to a change in the definition of the v1.Container type, which changes the hash computed over the container spec.

Symptom

After directly replacing the 1.7.16 kubelet binary with the 1.9.11 binary and adjusting the startup flags to match, every non-pause container, i.e. everything except the sandbox, is recreated when the 1.9.11 kubelet starts. Recreated, not merely restarted.

$ docker ps
CONTAINER ID        IMAGE                                                     COMMAND                CREATED             STATUS           NAMES
e8b4a197e171        harbor.xxxx.com/kubernetes/node-exporter            "/bin/node_exporter"   31 seconds ago      Up 30 seconds    k8s_prometheus-node-exporter_prometheus-node-exporter-hpsrw_monitoring_8cdcb7ba-17d0-11e9-9243-52540064c479_2
ed0b86b380d2        harbor.xxxx.com/google_containers/pause-amd64:3.0   "/pause"               2 minutes ago       Up 2 minutes     k8s_POD_prometheus-node-exporter-hpsrw_monitoring_8cdcb7ba-17d0-11e9-9243-52540064c479_0

The sandbox container was created 2 minutes ago, so the original sandbox is still in use, while the other container was created about 30 seconds ago: it has been recreated.

After switching to the 1.9.11 kubelet, the version shown by kubectl get node updates accordingly:

# before
10-10-66-204    Ready   <none>    1h     v1.7.16   <none>    CentOS Linux 7 (Core)   3.10.0-862.9.1.el7.x86_64   docker://17.5.0

# after
10-10-66-204    Ready   <none>    1h     v1.9.11   <none>    CentOS Linux 7 (Core)   3.10.0-862.9.1.el7.x86_64   docker://17.5.0

Switching back from 1.9.11 to 1.7.16 likewise recreates the non-pause containers.

Analysis

The odd part is that the sandbox container is not recreated, and the Pod UID is unchanged before and after the switch:

# before
$ ls /var/lib/kubelet/pods/
4ec492d2-17de-11e9-9206-52540064c479

# after
$ ls /var/lib/kubelet/pods/
4ec492d2-17de-11e9-9206-52540064c479

Next, compare the container IDs:

# before
$ ls /sys/fs/cgroup/cpu/kubepods/besteffort/pod4ec492d2-17de-11e9-9206-52540064c479/
03eba336d229e81d1155d2de3b5c85369d851a823ab32734811aaf752f6cef3d  cgroup.clone_children  cgroup.procs       cpu.cfs_quota_us  cpu.rt_runtime_us  cpu.stat      cpuacct.usage         notify_on_release
74172e19e1a03f79231f6c0ca1fb666d490be6e70c6b7c4f00add4dbf72a3a43  cgroup.event_control   cpu.cfs_period_us  cpu.rt_period_us  cpu.shares         cpuacct.stat  cpuacct.usage_percpu  tasks

# after
$ ls /sys/fs/cgroup/cpu/kubepods/besteffort/pod4ec492d2-17de-11e9-9206-52540064c479/
74172e19e1a03f79231f6c0ca1fb666d490be6e70c6b7c4f00add4dbf72a3a43  cgroup.procs       cpu.rt_period_us   cpu.stat       cpuacct.usage_percpu                                              tasks
cgroup.clone_children                                             cpu.cfs_period_us  cpu.rt_runtime_us  cpuacct.stat   f59c4812a66d65572020efab38780c1271d671330b126642653390dc8b8d29f1
cgroup.event_control                                              cpu.cfs_quota_us   cpu.shares         cpuacct.usage  notify_on_release

The ID of the container other than the sandbox has changed.

Compare the kubelet logs:

# 1.7.16
I0114 17:55:49.989510   12551 kubelet.go:1882] SyncLoop (ADD, "api"): "prometheus-node-exporter-l7vzz_monitoring(4ec492d2-17de-11e9-9206-52540064c479)"
I0114 17:55:49.990031   12551 kubelet_pods.go:1220] Generating status for "prometheus-node-exporter-l7vzz_monitoring(4ec492d2-17de-11e9-9206-52540064c479)"
I0114 17:55:49.990366   12551 status_manager.go:340] Status Manager: adding pod: "4ec492d2-17de-11e9-9206-52540064c479", with status: ('\x01', {Running [{Initialized True 0001-01-01 00:00:00 +0000 UTC 2019-01-14 17:25:18 +0800 CST  } {Ready False 0001-01-01 00:00:00 +0000 UTC 2019-01-14 17:51:22 +0800 CST ContainersNotReady containers with unready status: [prometheus-node-exporter]} {PodScheduled True 0001-01-01 00:00:00 +0000 UTC 2019-01-14 17:25:19 +0800 CST  }]   10.10.66.204 10.10.66.204 2019-01-14 17:25:18 +0800 CST [] [{prometheus-node-exporter {nil nil &ContainerStateTerminated{ExitCode:143,Signal

# 1.9.11
I0114 17:57:39.258527   12945 config.go:405] Receiving a new pod "prometheus-node-exporter-l7vzz_monitoring(4ec492d2-17de-11e9-9206-52540064c479)"
...
I0114 17:57:41.448790   12945 manager.go:970] Added container: "/kubepods/besteffort/pod4ec492d2-17de-11e9-9206-52540064c479/f59c4812a66d65572020efab38780c1271d671330b126642653390dc8b8d29f1" (aliases: [k8s_prometheus-node-exporter_prometheus-node-exporter-l7vzz_monitoring_4ec492d2-17de-11e9-9206-52540064c479_1 f59c4812a66d65572020efab38780c1271d671330b126642653390dc8b8d29f1], namespace: "docker")
I0114 17:57:42.413318   12945 kubelet.go:1857] SyncLoop (ADD, "api"): "prometheus-node-exporter-l7vzz_monitoring(4ec492d2-17de-11e9-9206-52540064c479)"
I0114 17:57:42.413510   12945 kubelet.go:1902] SyncLoop (PLEG): "prometheus-node-exporter-l7vzz_monitoring(4ec492d2-17de-11e9-9206-52540064c479)", event: &pleg.PodLifecycleEvent{ID:"4ec492d2-17de-11e9-9206-52540064c479", Type:"ContainerStarted", Data:"f59c4812a66d65572020efab38780c1271d671330b126642653390dc8b8d29f1"}
I0114 17:57:42.413599   12945 kubelet.go:1902] SyncLoop (PLEG): "prometheus-node-exporter-l7vzz_monitoring(4ec492d2-17de-11e9-9206-52540064c479)", event: &pleg.PodLifecycleEvent{ID:"4ec492d2-17de-11e9-9206-52540064c479", Type:"ContainerDied", Data:"de1f8ebdf737297a1c591978d51de56ee2c0c4f3cc956517fe826bdfdef70e0f"}

Walking through the code and the logs shows that the container hash changes after the version switch. The 1.9.11 log reads:

I0114 17:57:42.715551   12945 kuberuntime_manager.go:550] Container "prometheus-node-exporter" ({"docker" "f59c4812a66d65572020efab38780c1271d671330b126642653390dc8b8d29f1"})
                                       of pod prometheus-node-exporter-l7vzz_monitoring(4ec492d2-17de-11e9-9206-52540064c479): 
                                       Container spec hash changed (1559107639 vs 1428860573).. Container will be killed and recreated.

Further Analysis

First, the relevant code in 1.7.16:

// kubernetes/pkg/kubelet/kuberuntime/kuberuntime_manager.go: 492
func (m *kubeGenericRuntimeManager) computePodContainerChanges(pod *v1.Pod, podStatus *kubecontainer.PodStatus) podContainerSpecChanges {
	...
	expectedHash := kubecontainer.HashContainer(&container)
	containerChanged := containerStatus.Hash != expectedHash
	if containerChanged {
		message := fmt.Sprintf("Pod %q container %q hash changed (%d vs %d), it will be killed and re-created.",
			pod.Name, container.Name, containerStatus.Hash, expectedHash)
		glog.Info(message)
		changes.ContainersToStart[index] = message
		continue
	}
	...
}

// kubernetes/pkg/kubelet/container/helpers.go: 99
// HashContainer returns the hash of the container. It is used to compare
// the running container with its desired spec.
func HashContainer(container *v1.Container) uint64 {
	hash := fnv.New32a()
	hashutil.DeepHashObject(hash, *container)
	return uint64(hash.Sum32())
}

// kubernetes/pkg/util/hash/hash.go: 28
func DeepHashObject(hasher hash.Hash, objectToWrite interface{}) {
	hasher.Reset()
	printer := spew.ConfigState{
		Indent:         " ",
		SortKeys:       true,
		DisableMethods: true,
		SpewKeys:       true,
	}
	printer.Fprintf(hasher, "%#v", objectToWrite)
}

Now the same logic in 1.9.11:

// kubernetes/pkg/kubelet/kuberuntime/kuberuntime_manager.go: 522
func (m *kubeGenericRuntimeManager) computePodActions(pod *v1.Pod, podStatus *kubecontainer.PodStatus) podActions {
	...
	if expectedHash, actualHash, changed := containerChanged(&container, containerStatus); changed {
		reason = fmt.Sprintf("Container spec hash changed (%d vs %d).", actualHash, expectedHash)
		// Restart regardless of the restart policy because the container
		// spec changed.
		restart = true
	}
	...
}

// kubernetes/pkg/kubelet/kuberuntime/kuberuntime_manager.go: 423
func containerChanged(container *v1.Container, containerStatus *kubecontainer.ContainerStatus) (uint64, uint64, bool) {
	expectedHash := kubecontainer.HashContainer(container)
	return expectedHash, containerStatus.Hash, containerStatus.Hash != expectedHash
}

// kubernetes/pkg/kubelet/container/helpers.go: 97
func HashContainer(container *v1.Container) uint64 {
	hash := fnv.New32a()
	hashutil.DeepHashObject(hash, *container)
	return uint64(hash.Sum32())
}

// kubernetes/pkg/util/hash/hash.go: 28
func DeepHashObject(hasher hash.Hash, objectToWrite interface{}) {
	hasher.Reset()
	printer := spew.ConfigState{
		Indent:         " ",
		SortKeys:       true,
		DisableMethods: true,
		SpewKeys:       true,
	}
	printer.Fprintf(hasher, "%#v", objectToWrite)
}

Comparing the two versions shows that the hash algorithm itself is unchanged, so the hashes can only differ because the input changed.

The Container definition in 1.9.11:

type Container struct {
    Name string `json:"name" protobuf:"bytes,1,opt,name=name"`
    Image string `json:"image,omitempty" protobuf:"bytes,2,opt,name=image"`
    Command []string `json:"command,omitempty" protobuf:"bytes,3,rep,name=command"`
    Args []string `json:"args,omitempty" protobuf:"bytes,4,rep,name=args"`
    WorkingDir string `json:"workingDir,omitempty" protobuf:"bytes,5,opt,name=workingDir"`
    Ports []ContainerPort `json:"ports,omitempty" patchStrategy:"merge" patchMergeKey:"containerPort" protobuf:"bytes,6,rep,name=ports"`
    EnvFrom []EnvFromSource `json:"envFrom,omitempty" protobuf:"bytes,19,rep,name=envFrom"`
    Env []EnvVar `json:"env,omitempty" patchStrategy:"merge" patchMergeKey:"name" protobuf:"bytes,7,rep,name=env"`
    Resources ResourceRequirements `json:"resources,omitempty" protobuf:"bytes,8,opt,name=resources"`
    VolumeMounts []VolumeMount `json:"volumeMounts,omitempty" patchStrategy:"merge" patchMergeKey:"mountPath" protobuf:"bytes,9,rep,name=volumeMounts"`
    VolumeDevices []VolumeDevice `json:"volumeDevices,omitempty" patchStrategy:"merge" patchMergeKey:"devicePath" protobuf:"bytes,21,rep,name=volumeDevices"`
    LivenessProbe *Probe `json:"livenessProbe,omitempty" protobuf:"bytes,10,opt,name=livenessProbe"`
    ReadinessProbe *Probe `json:"readinessProbe,omitempty" protobuf:"bytes,11,opt,name=readinessProbe"`
    Lifecycle *Lifecycle `json:"lifecycle,omitempty" protobuf:"bytes,12,opt,name=lifecycle"`
    TerminationMessagePath string `json:"terminationMessagePath,omitempty" protobuf:"bytes,13,opt,name=terminationMessagePath"`
    TerminationMessagePolicy TerminationMessagePolicy `json:"terminationMessagePolicy,omitempty" protobuf:"bytes,20,opt,name=terminationMessagePolicy,casttype=TerminationMessagePolicy"`
    ImagePullPolicy PullPolicy `json:"imagePullPolicy,omitempty" protobuf:"bytes,14,opt,name=imagePullPolicy,casttype=PullPolicy"`
    SecurityContext *SecurityContext `json:"securityContext,omitempty" protobuf:"bytes,15,opt,name=securityContext"`
    Stdin bool `json:"stdin,omitempty" protobuf:"varint,16,opt,name=stdin"`
    StdinOnce bool `json:"stdinOnce,omitempty" protobuf:"varint,17,opt,name=stdinOnce"`
    TTY bool `json:"tty,omitempty" protobuf:"varint,18,opt,name=tty"`
}

The Container definition in 1.7.16:

type Container struct {
	Name string `json:"name" protobuf:"bytes,1,opt,name=name"`
	Image string `json:"image" protobuf:"bytes,2,opt,name=image"`
	Command []string `json:"command,omitempty" protobuf:"bytes,3,rep,name=command"`
	Args []string `json:"args,omitempty" protobuf:"bytes,4,rep,name=args"`
	WorkingDir string `json:"workingDir,omitempty" protobuf:"bytes,5,opt,name=workingDir"`
	Ports []ContainerPort `json:"ports,omitempty" patchStrategy:"merge" patchMergeKey:"containerPort" protobuf:"bytes,6,rep,name=ports"`
	EnvFrom []EnvFromSource `json:"envFrom,omitempty" protobuf:"bytes,19,rep,name=envFrom"`
	Env []EnvVar `json:"env,omitempty" patchStrategy:"merge" patchMergeKey:"name" protobuf:"bytes,7,rep,name=env"`
	Resources ResourceRequirements `json:"resources,omitempty" protobuf:"bytes,8,opt,name=resources"`
	VolumeMounts []VolumeMount `json:"volumeMounts,omitempty" patchStrategy:"merge" patchMergeKey:"mountPath" protobuf:"bytes,9,rep,name=volumeMounts"`
	LivenessProbe *Probe `json:"livenessProbe,omitempty" protobuf:"bytes,10,opt,name=livenessProbe"`
	ReadinessProbe *Probe `json:"readinessProbe,omitempty" protobuf:"bytes,11,opt,name=readinessProbe"`
	Lifecycle *Lifecycle `json:"lifecycle,omitempty" protobuf:"bytes,12,opt,name=lifecycle"`
	TerminationMessagePath string `json:"terminationMessagePath,omitempty" protobuf:"bytes,13,opt,name=terminationMessagePath"`
	TerminationMessagePolicy TerminationMessagePolicy `json:"terminationMessagePolicy,omitempty" protobuf:"bytes,20,opt,name=terminationMessagePolicy,casttype=TerminationMessagePolicy"`
	ImagePullPolicy PullPolicy `json:"imagePullPolicy,omitempty" protobuf:"bytes,14,opt,name=imagePullPolicy,casttype=PullPolicy"`
	SecurityContext *SecurityContext `json:"securityContext,omitempty" protobuf:"bytes,15,opt,name=securityContext"`
	Stdin bool `json:"stdin,omitempty" protobuf:"varint,16,opt,name=stdin"`
	StdinOnce bool `json:"stdinOnce,omitempty" protobuf:"varint,17,opt,name=stdinOnce"`
	TTY bool `json:"tty,omitempty" protobuf:"varint,18,opt,name=tty"`
}

A careful comparison shows that the 1.9.11 Container definition has one extra field:

VolumeDevices []VolumeDevice `json:"volumeDevices,omitempty" patchStrategy:"merge" patchMergeKey:"devicePath" protobuf:"bytes,21,rep,name=volumeDevices"`

So the Container type definition changed, and that is what changed the hash: DeepHashObject feeds spew's %#v dump of the entire struct into the FNV-32a hash, so even a field that is nil in every existing pod spec alters the hashed string.
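This is easy to reproduce outside kubelet. Below is a minimal, self-contained sketch; the trimmed containerV1/containerV2 structs are hypothetical stand-ins for the real v1.Container. Adding one struct field changes spew's %#v dump, and therefore the FNV-32a hash, even when the new field is nil in every object:

package main

import (
	"fmt"
	"hash/fnv"

	"github.com/davecgh/go-spew/spew"
)

// containerV1 is a hypothetical, trimmed stand-in for the 1.7.16 v1.Container.
type containerV1 struct {
	Name  string
	Image string
}

// containerV2 adds one field, just as 1.9.11 added VolumeDevices.
type containerV2 struct {
	Name          string
	Image         string
	VolumeDevices []string // nil for every existing pod
}

// deepHash mirrors kubelet's HashContainer + DeepHashObject pair.
func deepHash(obj interface{}) uint64 {
	hasher := fnv.New32a()
	printer := spew.ConfigState{
		Indent:         " ",
		SortKeys:       true,
		DisableMethods: true,
		SpewKeys:       true,
	}
	printer.Fprintf(hasher, "%#v", obj)
	return uint64(hasher.Sum32())
}

func main() {
	a := containerV1{Name: "prometheus-node-exporter", Image: "node-exporter"}
	b := containerV2{Name: "prometheus-node-exporter", Image: "node-exporter"}
	// The nil VolumeDevices field still appears in the %#v dump, so the
	// hashes differ even though nothing in the pod spec changed.
	fmt.Println(deepHash(a), deepHash(b), deepHash(a) == deepHash(b))
}

(In kubelet the type name stays v1.Container across versions and only the extra VolumeDevices field differs in the dump; in this toy example the two distinct struct names also contribute to the difference, but the added field alone is enough to change the hash.)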

Solution

The only way out is to patch the code. The hash computed by 1.9.11 differs from the one computed by 1.7.16 because the Container definition itself changed, so the idea is to extract the relevant subset of fields from Container into a dedicated struct and use that struct as the input to the hash function instead.
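A minimal sketch of that idea follows. It assumes the change is patched into pkg/kubelet/container/helpers.go, where fnv, hashutil and the v1 types are already imported; the containerHashFields struct and its exact field list are illustrative, not upstream code:

// containerHashFields is a stable subset of v1.Container that serves as the
// only input to the hash, so later additions to v1.Container no longer
// change the hash of containers that are already running.
type containerHashFields struct {
	Name       string
	Image      string
	Command    []string
	Args       []string
	WorkingDir string
	Env        []v1.EnvVar
	// ...extend with every field whose change should force a recreate.
}

// HashContainer replaces the original implementation shown above.
func HashContainer(container *v1.Container) uint64 {
	hash := fnv.New32a()
	hashutil.DeepHashObject(hash, containerHashFields{
		Name:       container.Name,
		Image:      container.Image,
		Command:    container.Command,
		Args:       container.Args,
		WorkingDir: container.WorkingDir,
		Env:        container.Env,
	})
	return uint64(hash.Sum32())
}

Note that deploying such a patched kubelet changes the hash input once more, so one final round of recreation is still expected; after that, fields added to v1.Container in later releases no longer affect the hash of running containers.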

This article was originally published at www.lijiaocn.com

