Black lives matter.
We stand in solidarity with the Black community.
Racism is unacceptable.
It conflicts with the core values of the Kubernetes project and our community does not tolerate it.
We stand in solidarity with the Black community.
Racism is unacceptable.
It conflicts with the core values of the Kubernetes project and our community does not tolerate it.
This page shows how to safely drain a node, respecting the PodDisruptionBudget you have defined.
This task assumes that you have met the following prerequisites:
kubectl drain
to remove a node from serviceYou can use kubectl drain
to safely evict all of your pods from a
node before you perform maintenance on the node (e.g. kernel upgrade,
hardware maintenance, etc.). Safe evictions allow the pod's containers
to gracefully terminate
and will respect the PodDisruptionBudgets
you have specified.
Note: By defaultkubectl drain
will ignore certain system pods on the node that cannot be killed; see the kubectl drain documentation for more details.
When kubectl drain
returns successfully, that indicates that all of
the pods (except the ones excluded as described in the previous paragraph)
have been safely evicted (respecting the desired graceful termination period,
and respecting the PodDisruptionBudget you have defined). It is then safe to
bring down the node by powering down its physical machine or, if running on a
cloud platform, deleting its virtual machine.
First, identify the name of the node you wish to drain. You can list all of the nodes in your cluster with
kubectl get nodes
Next, tell Kubernetes to drain the node:
kubectl drain <node name>
Once it returns (without giving an error), you can power down the node (or equivalently, if on a cloud platform, delete the virtual machine backing the node). If you leave the node in the cluster during the maintenance operation, you need to run
kubectl uncordon <node name>
afterwards to tell Kubernetes that it can resume scheduling new pods onto the node.
The kubectl drain
command should only be issued to a single node at a
time. However, you can run multiple kubectl drain
commands for
different nodes in parallel, in different terminals or in the
background. Multiple drain commands running concurrently will still
respect the PodDisruptionBudget
you specify.
For example, if you have a StatefulSet with three replicas and have
set a PodDisruptionBudget
for that set specifying minAvailable: 2
. kubectl drain
will only evict a pod from the StatefulSet if all
three pods are ready, and if you issue multiple drain commands in
parallel, Kubernetes will respect the PodDisruptionBudget and ensure
that only one pod is unavailable at any given time. Any drains that
would cause the number of ready replicas to fall below the specified
budget are blocked.
If you prefer not to use kubectl drain (such as to avoid calling to an external command, or to get finer control over the pod eviction process), you can also programmatically cause evictions using the eviction API.
You should first be familiar with using Kubernetes language clients.
The eviction subresource of a pod can be thought of as a kind of policy-controlled DELETE operation on the pod itself. To attempt an eviction (perhaps more REST-precisely, to attempt to create an eviction), you POST an attempted operation. Here's an example:
{
"apiVersion": "policy/v1beta1",
"kind": "Eviction",
"metadata": {
"name": "quux",
"namespace": "default"
}
}
You can attempt an eviction using curl
:
curl -v -H 'Content-type: application/json' http://127.0.0.1:8080/api/v1/namespaces/default/pods/quux/eviction -d @eviction.json
The API can respond in one of three ways:
DELETE
request to the pod's URL and you get back 200 OK
.429 Too Many Requests
. This is
typically used for generic rate limiting of any requests, but here we mean
that this request isn't allowed right now but it may be allowed later.
Currently, callers do not get any Retry-After
advice, but they may in
future versions.500 Internal Server Error
.For a given eviction request, there are two cases:
200 OK
.In some cases, an application may reach a broken state where it will never return anything other than 429 or 500. This can happen, for example, if the replacement pod created by the application's controller does not become ready, or if the last pod evicted has a very long termination grace period.
In this case, there are two potential solutions:
DELETE
the pod instead of using the eviction API.Kubernetes does not specify what the behavior should be in this case; it is up to the application owners and cluster owners to establish an agreement on behavior in these cases.