内容简介:将1.7.16版本的kubelet直接替换成1.9.11版本的kubelet,参数也一并调整,1.9.11版本的kubelet启动时,Sandbox以外的可以看到sandbox容器的创建时间是2分钟前,使用的还是以前的sandbox,另外一个容器的启动时间是32秒之前,被重建了。换成1.9.11版本的kubelet之后,在kubectl get node看到的版本信息随之更新:
现象
将1.7.16版本的kubelet直接替换成1.9.11版本的kubelet,参数也一并调整,1.9.11版本的kubelet启动时,Sandbox以外的 非pause
容器会被重建,是 重建
不是重启。
$ docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS NAMES e8b4a197e171 harbor.xxxx.com/kubernetes/node-exporter "/bin/node_exporter" 31 seconds ago Up 30 seconds k8s_prometheus-node-exporter_prometheus-node-exporter-hpsrw_monitoring_8cdcb7ba-17d0-11e9-9243-52540064c479_2 ed0b86b380d2 harbor.xxxx.com/google_containers/pause-amd64:3.0 "/pause" 2 minutes ago Up 2 minutes k8s_POD_prometheus-node-exporter-hpsrw_monitoring_8cdcb7ba-17d0-11e9-9243-52540064c479_0
可以看到sandbox容器的创建时间是2分钟前,使用的还是以前的sandbox,另外一个容器的启动时间是32秒之前,被重建了。
换成1.9.11版本的kubelet之后,在kubectl get node看到的版本信息随之更新:
# 之前 10-10-66-204 Ready <none> 1h v1.7.16 <none> CentOS Linux 7 (Core) 3.10.0-862.9.1.el7.x86_64 docker://17.5.0 # 之后 10-10-66-204 Ready <none> 1h v1.9.11 <none> CentOS Linux 7 (Core) 3.10.0-862.9.1.el7.x86_64 docker://17.5.0
从1.9.11换回1.7.16,也会重建非 pause
容器。
分析
比较奇怪的地方是,sandbox容器没有重建,而且重建前后Pod的ID是没有变化的:
# 之前 $ ls /var/lib/kubelet/pods/ 4ec492d2-17de-11e9-9206-52540064c479 # 之后 $ ls /var/lib/kubelet/pods/ 4ec492d2-17de-11e9-9206-52540064c479
再对比容器的ID:
# 之前 $ ls /sys/fs/cgroup/cpu/kubepods/besteffort/pod4ec492d2-17de-11e9-9206-52540064c479/ 03eba336d229e81d1155d2de3b5c85369d851a823ab32734811aaf752f6cef3d cgroup.clone_children cgroup.procs cpu.cfs_quota_us cpu.rt_runtime_us cpu.stat cpuacct.usage notify_on_release 74172e19e1a03f79231f6c0ca1fb666d490be6e70c6b7c4f00add4dbf72a3a43 cgroup.event_control cpu.cfs_period_us cpu.rt_period_us cpu.shares cpuacct.stat cpuacct.usage_percpu tasks # 之后 $ ls /sys/fs/cgroup/cpu/kubepods/besteffort/pod4ec492d2-17de-11e9-9206-52540064c479/ 74172e19e1a03f79231f6c0ca1fb666d490be6e70c6b7c4f00add4dbf72a3a43 cgroup.procs cpu.rt_period_us cpu.stat cpuacct.usage_percpu tasks cgroup.clone_children cpu.cfs_period_us cpu.rt_runtime_us cpuacct.stat f59c4812a66d65572020efab38780c1271d671330b126642653390dc8b8d29f1 cgroup.event_control cpu.cfs_quota_us cpu.shares cpuacct.usage notify_on_release
可以看到sandbox以外的容器的ID发生了变化。
对比日志:
# 1.7.16
I0114 17:55:49.989510 12551 kubelet.go:1882] SyncLoop (ADD, "api"): "prometheus-node-exporter-l7vzz_monitoring(4ec492d2-17de-11e9-9206-52540064c479)"
I0114 17:55:49.990031 12551 kubelet_pods.go:1220] Generating status for "prometheus-node-exporter-l7vzz_monitoring(4ec492d2-17de-11e9-9206-52540064c479)"
I0114 17:55:49.990366 12551 status_manager.go:340] Status Manager: adding pod: "4ec492d2-17de-11e9-9206-52540064c479", with status: ('\x01', {Running [{Initialized True 0001-01-01 00:00:00 +0000 UTC 2019-01-14 17:25:18 +0800 CST } {Ready False 0001-01-01 00:00:00 +0000 UTC 2019-01-14 17:51:22 +0800 CST ContainersNotReady containers with unready status: [prometheus-node-exporter]} {PodScheduled True 0001-01-01 00:00:00 +0000 UTC 2019-01-14 17:25:19 +0800 CST }] 10.10.66.204 10.10.66.204 2019-01-14 17:25:18 +0800 CST [] [{prometheus-node-exporter {nil nil &ContainerStateTerminated{ExitCode:143,Signal
# 1.9.11
I0114 17:57:39.258527 12945 config.go:405] Receiving a new pod "prometheus-node-exporter-l7vzz_monitoring(4ec492d2-17de-11e9-9206-52540064c479)"
...
I0114 17:57:41.448790 12945 manager.go:970] Added container: "/kubepods/besteffort/pod4ec492d2-17de-11e9-9206-52540064c479/f59c4812a66d65572020efab38780c1271d671330b126642653390dc8b8d29f1" (aliases: [k8s_prometheus-node-exporter_prometheus-node-exporter-l7vzz_monitoring_4ec492d2-17de-11e9-9206-52540064c479_1 f59c4812a66d65572020efab38780c1271d671330b126642653390dc8b8d29f1], namespace: "docker")
I0114 17:57:42.413318 12945 kubelet.go:1857] SyncLoop (ADD, "api"): "prometheus-node-exporter-l7vzz_monitoring(4ec492d2-17de-11e9-9206-52540064c479)"
I0114 17:57:42.413510 12945 kubelet.go:1902] SyncLoop (PLEG): "prometheus-node-exporter-l7vzz_monitoring(4ec492d2-17de-11e9-9206-52540064c479)", event: &pleg.PodLifecycleEvent{ID:"4ec492d2-17de-11e9-9206-52540064c479", Type:"ContainerStarted", Data:"f59c4812a66d65572020efab38780c1271d671330b126642653390dc8b8d29f1"}
I0114 17:57:42.413599 12945 kubelet.go:1902] SyncLoop (PLEG): "prometheus-node-exporter-l7vzz_monitoring(4ec492d2-17de-11e9-9206-52540064c479)", event: &pleg.PodLifecycleEvent{ID:"4ec492d2-17de-11e9-9206-52540064c479", Type:"ContainerDied", Data:"de1f8ebdf737297a1c591978d51de56ee2c0c4f3cc956517fe826bdfdef70e0f"}
后来通过梳理代码和查看日志,发现切换版本后,容器的hash发生了变化,1.9.11中的日志如下:
I0114 17:57:42.715551 12945 kuberuntime_manager.go:550] Container "prometheus-node-exporter" ({"docker" "f59c4812a66d65572020efab38780c1271d671330b126642653390dc8b8d29f1"})
of pod prometheus-node-exporter-l7vzz_monitoring(4ec492d2-17de-11e9-9206-52540064c479):
Container spec hash changed (1559107639 vs 1428860573).. Container will be killed and recreated.
继续分析
先看一下1.7.16版本中的这部分代码:
// kubernetes/pkg/kubelet/kuberuntime/kuberuntime_manager: 492
func (m *kubeGenericRuntimeManager) computePodContainerChanges(pod *v1.Pod, podStatus *kubecontainer.PodStatus) podContainerSpecChanges {
...
expectedHash := kubecontainer.HashContainer(&container)
containerChanged := containerStatus.Hash != expectedHash
if containerChanged {
message := fmt.Sprintf("Pod %q container %q hash changed (%d vs %d), it will be killed and re-created.",
pod.Name, container.Name, containerStatus.Hash, expectedHash)
glog.Info(message)
changes.ContainersToStart[index] = message
continue
}
...
}
// kubernetes/pkg/kubelet/container/helpers.go: 99
// HashContainer returns the hash of the container. It is used to compare
// the running container with its desired spec.
func HashContainer(container *v1.Container) uint64 {
hash := fnv.New32a()
hashutil.DeepHashObject(hash, *container)
return uint64(hash.Sum32())
}
// kubernetes/pkg/util/hash/hash.go: 28
func DeepHashObject(hasher hash.Hash, objectToWrite interface{}) {
hasher.Reset()
printer := spew.ConfigState{
Indent: " ",
SortKeys: true,
DisableMethods: true,
SpewKeys: true,
}
printer.Fprintf(hasher, "%#v", objectToWrite)
}
再看一下1.9.11中的这部分代码:
// kubernetes/pkg/kubelet/kuberuntime/kuberuntime_manager.go: 522
func (m *kubeGenericRuntimeManager) computePodActions(pod *v1.Pod, podStatus *kubecontainer.PodStatus) podActions {
...
if expectedHash, actualHash, changed := containerChanged(&container, containerStatus); changed {
reason = fmt.Sprintf("Container spec hash changed (%d vs %d).", actualHash, expectedHash)
// Restart regardless of the restart policy because the container
// spec changed.
restart = true
}
...
}
// kubernetes/pkg/kubelet/kuberuntime/kuberuntime_manager.go: 423
func containerChanged(container *v1.Container, containerStatus *kubecontainer.ContainerStatus) (uint64, uint64, bool) {
expectedHash := kubecontainer.HashContainer(container)
return expectedHash, containerStatus.Hash, containerStatus.Hash != expectedHash
}
// kubernetes/pkg/kubelet/container/helpers.go: 97
func HashContainer(container *v1.Container) uint64 {
hash := fnv.New32a()
hashutil.DeepHashObject(hash, *container)
return uint64(hash.Sum32())
}
// kubernetes/pkg/util/hash/hash.go: 28
func DeepHashObject(hasher hash.Hash, objectToWrite interface{}) {
hasher.Reset()
printer := spew.ConfigState{
Indent: " ",
SortKeys: true,
DisableMethods: true,
SpewKeys: true,
}
printer.Fprintf(hasher, "%#v", objectToWrite)
}
通过对比代码可以看到,哈希算法没有变化,那么导致哈希值不同的原因只能是 输入
发生了变化。
1.9.11中的Container定义:
type Container struct {
Name string `json:"name" protobuf:"bytes,1,opt,name=name"`
Image string `json:"image,omitempty" protobuf:"bytes,2,opt,name=image"`
Command []string `json:"command,omitempty" protobuf:"bytes,3,rep,name=command"`
Args []string `json:"args,omitempty" protobuf:"bytes,4,rep,name=args"`
WorkingDir string `json:"workingDir,omitempty" protobuf:"bytes,5,opt,name=workingDir"`
Ports []ContainerPort `json:"ports,omitempty" patchStrategy:"merge" patchMergeKey:"containerPort" protobuf:"bytes,6,rep,name=ports"`
EnvFrom []EnvFromSource `json:"envFrom,omitempty" protobuf:"bytes,19,rep,name=envFrom"`
Env []EnvVar `json:"env,omitempty" patchStrategy:"merge" patchMergeKey:"name" protobuf:"bytes,7,rep,name=env"`
Resources ResourceRequirements `json:"resources,omitempty" protobuf:"bytes,8,opt,name=resources"`
VolumeMounts []VolumeMount `json:"volumeMounts,omitempty" patchStrategy:"merge" patchMergeKey:"mountPath" protobuf:"bytes,9,rep,name=volumeMounts"`
VolumeDevices []VolumeDevice `json:"volumeDevices,omitempty" patchStrategy:"merge" patchMergeKey:"devicePath" protobuf:"bytes,21,rep,name=volumeDevices"`
LivenessProbe *Probe `json:"livenessProbe,omitempty" protobuf:"bytes,10,opt,name=livenessProbe"`
ReadinessProbe *Probe `json:"readinessProbe,omitempty" protobuf:"bytes,11,opt,name=readinessProbe"`
Lifecycle *Lifecycle `json:"lifecycle,omitempty" protobuf:"bytes,12,opt,name=lifecycle"`
TerminationMessagePath string `json:"terminationMessagePath,omitempty" protobuf:"bytes,13,opt,name=terminationMessagePath"`
TerminationMessagePolicy TerminationMessagePolicy `json:"terminationMessagePolicy,omitempty" protobuf:"bytes,20,opt,name=terminationMessagePolicy,casttype=TerminationMessagePolicy"`
ImagePullPolicy PullPolicy `json:"imagePullPolicy,omitempty" protobuf:"bytes,14,opt,name=imagePullPolicy,casttype=PullPolicy"`
SecurityContext *SecurityContext `json:"securityContext,omitempty" protobuf:"bytes,15,opt,name=securityContext"`
Stdin bool `json:"stdin,omitempty" protobuf:"varint,16,opt,name=stdin"`
StdinOnce bool `json:"stdinOnce,omitempty" protobuf:"varint,17,opt,name=stdinOnce"`
TTY bool `json:"tty,omitempty" protobuf:"varint,18,opt,name=tty"`
}
1.7.16中的Container定义:
type Container struct {
Name string `json:"name" protobuf:"bytes,1,opt,name=name"`
Image string `json:"image" protobuf:"bytes,2,opt,name=image"`
Command []string `json:"command,omitempty" protobuf:"bytes,3,rep,name=command"`
Args []string `json:"args,omitempty" protobuf:"bytes,4,rep,name=args"`
WorkingDir string `json:"workingDir,omitempty" protobuf:"bytes,5,opt,name=workingDir"`
Ports []ContainerPort `json:"ports,omitempty" patchStrategy:"merge" patchMergeKey:"containerPort" protobuf:"bytes,6,rep,name=ports"`
EnvFrom []EnvFromSource `json:"envFrom,omitempty" protobuf:"bytes,19,rep,name=envFrom"`
Env []EnvVar `json:"env,omitempty" patchStrategy:"merge" patchMergeKey:"name" protobuf:"bytes,7,rep,name=env"`
Resources ResourceRequirements `json:"resources,omitempty" protobuf:"bytes,8,opt,name=resources"`
VolumeMounts []VolumeMount `json:"volumeMounts,omitempty" patchStrategy:"merge" patchMergeKey:"mountPath" protobuf:"bytes,9,rep,name=volumeMounts"`
LivenessProbe *Probe `json:"livenessProbe,omitempty" protobuf:"bytes,10,opt,name=livenessProbe"`
ReadinessProbe *Probe `json:"readinessProbe,omitempty" protobuf:"bytes,11,opt,name=readinessProbe"`
Lifecycle *Lifecycle `json:"lifecycle,omitempty" protobuf:"bytes,12,opt,name=lifecycle"`
TerminationMessagePath string `json:"terminationMessagePath,omitempty" protobuf:"bytes,13,opt,name=terminationMessagePath"`
TerminationMessagePolicy TerminationMessagePolicy `json:"terminationMessagePolicy,omitempty" protobuf:"bytes,20,opt,name=terminationMessagePolicy,casttype=TerminationMessagePolicy"`
ImagePullPolicy PullPolicy `json:"imagePullPolicy,omitempty" protobuf:"bytes,14,opt,name=imagePullPolicy,casttype=PullPolicy"`
SecurityContext *SecurityContext `json:"securityContext,omitempty" protobuf:"bytes,15,opt,name=securityContext"`
Stdin bool `json:"stdin,omitempty" protobuf:"varint,16,opt,name=stdin"`
StdinOnce bool `json:"stdinOnce,omitempty" protobuf:"varint,17,opt,name=stdinOnce"`
TTY bool `json:"tty,omitempty" protobuf:"varint,18,opt,name=tty"`
}
仔细对比发现,1.9.11的Container定义中多出了一个字段:
VolumeDevices []VolumeDevice `json:"volumeDevices,omitempty" patchStrategy:"merge" patchMergeKey:"devicePath" protobuf:"bytes,21,rep,name=volumeDevices"`
因此是Container定义发生了变化,导致哈希值发生了变化。
解决方法
只能通过改代码了,1.9.11版本计算出的hash值和1.7.16版本计算出的哈希值不相同,是因为Container的定义发生了变化,那就从Container中抽取出部分字段,组成一个新的结构体作为hash算法的输入。
本文 原创首发 于网站:www.lijiaocn.com
以上就是本文的全部内容,希望对大家的学习有所帮助,也希望大家多多支持 码农网
本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们。
Design Accessible Web Sites
Jeremy Sydik / Pragmatic Bookshelf / 2007-11-05 / USD 34.95
It's not a one-browser web anymore. You need to reach audiences that use cell phones, PDAs, game consoles, or other "alternative" browsers, as well as users with disabilities. Legal requirements for a......一起来看看 《Design Accessible Web Sites》 这本书的介绍吧!