Linux中查看CPU的使用情况

CPU使用分析主要就是分析CPU的使用率,看看那些进程占用的CPU资源比较多。

top/htop/atop查看CPU使用情况

top命令不多说了,需要注意的是在top中键入H后,进入线程模式。

htop和atop是两个功能更丰富的类top工具:

yum install -y htop atop

pidstat查看进程的CPU使用率

pidstat命令可以显示每个进程的在不同CPU状态中耗费的时间的百分比(1,每秒显示一次;-p,指定进程,如果不指定,显示所有进程):

$ pidstat 1 -p 27936 Linux 3.10.0-693.11.6.el7.x86_64 (10.10.64.58) 12/04/2018 _x86_64_ (4 CPU) 05:00:59 PM UID PID %usr %system %guest %CPU CPU Command 05:01:00 PM 99 27936 0.00 0.00 0.00 0.00 0 openresty 05:01:01 PM 99 27936 0.00 0.00 0.00 0.00 0 openresty 05:01:02 PM 99 27936 0.00 0.00 0.00 0.00 0 openresty

perf top查看CPU占用高的函数

perf top显示占用CPU时间最多的函数或者指令:

$ perf top Samples: 3K of event 'cpu-clock', Event count (approx.): 903937500 Overhead Shared Object Symbol 8.69% perf [.] symbols__insert 5.33% perf [.] rb_next 3.41% [kernel] [k] _raw_spin_unlock_irqrestore 3.12% libc-2.17.so [.] __memcpy_ssse3_back 2.40% [kernel] [k] finish_task_switch 2.40% libc-2.17.so [.] __strchr_sse42 2.08% libelf-0.168.so [.] gelf_getsym ...省略后续内容...

perf report查看cpu事件占比

perf record将采样数据保存,然后用perf record查看,或者直接用下面的命令一次完成,-a查看所有cpu:

perf record -ag -- sleep 15;perf report

perf report中显示,stress进程的cpu事件占比是77%,它大量调用了随机数生成函数random():

perf-report

vmstat查看上下文切换

vmstat命令显示系统整体状态,输出如下:

$ vmstat procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu----- r b swpd free buff cache si so bi bo in cs us sy id wa st 1 0 5980 126076 648 6997448 0 0 6 27 13 1 1 1 97 0 0

可以在后面加上一个数字,每隔指定时间输出一次,例如vmstat 5,如果感觉数据排列太密集,可以加上参数-w,用宽格式显示。

vmstat命令的输出包含procsmemoryswapiosystemcpu六块内容。

每列数据的含义如下,可以在man vmstat中找到:

Procs r: The number of runnable processes (running or waiting for run time). b: The number of processes in uninterruptible sleep. Memory swpd: the amount of virtual memory used. free: the amount of idle memory. buff: the amount of memory used as buffers. cache: the amount of memory used as cache. inact: the amount of inactive memory. (-a option) active: the amount of active memory. (-a option) Swap si: Amount of memory swapped in from disk (/s). so: Amount of memory swapped to disk (/s). IO bi: Blocks received from a block device (blocks/s). bo: Blocks sent to a block device (blocks/s). System in: The number of interrupts per second, including the clock. cs: The number of context switches per second. CPU These are percentages of total CPU time. us: Time spent running non-kernel code. (user time, including nice time) sy: Time spent running kernel code. (system time) id: Time spent idle. Prior to Linux 2.5.41, this includes IO-wait time. wa: Time spent waiting for IO. Prior to Linux 2.5.41, included in idle. st: Time stolen from a virtual machine. Prior to Linux 2.6.11, unknown.

其中cs是一秒内发生的上下文切换次数。

pidstat查看特定进程的切换情况

pidstat -w可以显示每个进程的上下文切换情况:

$ pidstat -w Linux 3.10.0-693.11.6.el7.x86_64 (10.10.64.58) 12/04/2018 _x86_64_ (4 CPU) 04:04:15 PM UID PID cswch/s nvcswch/s Command 04:04:15 PM 0 1 1.75 0.00 systemd 04:04:15 PM 0 2 0.01 0.00 kthreadd 04:04:15 PM 0 3 2.54 0.00 ksoftirqd/0 ...省略后续内容...

(可以在后面上一个数字,每隔指定时间输出一次,例如pidstat -w 5

特别注意:pidstats -w显示的是进程的状态,如果要将线程一并显示出来,需要再加一个-t参数:

$ pidstat -wt Average: UID TGID TID cswch/s nvcswch/s Command Average: 0 3 - 3.92 0.00 ksoftirqd/0 Average: 0 - 3 3.92 0.00 |__ksoftirqd/0 Average: 0 9 - 45.59 0.00 rcu_sched Average: 0 - 9 45.59 0.00 |__rcu_sched Average: 0 13 - 1.96 0.00 ksoftirqd/1 Average: 0 - 13 1.96 0.00 |__ksoftirqd/1 Average: 0 17 - 0.49 0.00 migration/2 ...省略后续内容...

-w参数的作用是显示进程切换状态,每一列的含义如下(可以在man pidstat中找到):

-w Report task switching activity (kernels 2.6.23 and later only). The following values may be displayed: UID The real user identification number of the task being monitored. USER The name of the real user owning the task being monitored. PID The identification number of the task being monitored. cswch/s Total number of voluntary context switches the task made per second. A voluntary context switch occurs when a task blocks because it requires a resource that is unavailable. nvcswch/s Total number of non voluntary context switches the task made per second. A involuntary context switch takes place when a task executes for the duration of its time slice and then is forced to relinquish the processor. Command The command name of the task.

需要注意自愿切换(cswch/s,voluntary context switches)和非自愿切换(nvcswch/s,non voluntary context switches)的区别。前者是因为需要的资源没有准备好,主动让出CPU发生的切换,后者是进程分配的时间片已经用完,被调度器强制切换。

另外pidstat还有一个-u参数,可以一并输出进程和线程(加-t)的CPU使用情况:

[root@10.10.64.58 ~]# pidstat -wt -u Linux 3.10.0-693.11.6.el7.x86_64 (10.10.64.58) 12/04/2018 _x86_64_ (4 CPU) 04:21:56 PM UID TGID TID %usr %system %guest %CPU CPU Command 04:21:56 PM 0 1 - 0.02 0.01 0.00 0.03 0 systemd 04:21:56 PM 0 - 1 0.02 0.01 0.00 0.03 0 |__systemd 04:21:56 PM 0 2 - 0.00 0.00 0.00 0.00 0 kthreadd 04:21:56 PM 0 - 2 0.00 0.00 0.00 0.00 0 |__kthreadd ...省略后续内容... 04:21:56 PM UID TGID TID cswch/s nvcswch/s Command 04:21:56 PM 0 1 - 1.75 0.00 systemd 04:21:56 PM 0 - 1 1.75 0.00 |__systemd 04:21:56 PM 0 2 - 0.01 0.00 kthreadd 04:21:56 PM 0 - 2 0.01 0.00 |__kthreadd 04:21:56 PM 0 3 - 2.53 0.00 ksoftirqd/0 ...省略后续内容...

/proc/interrupts记录的中断状态

watch -d cat /proc/interrupts用可以显示中断的变化情况。

Every 2.0s: cat /proc/interrupts Tue Dec 4 16:29:25 2018 CPU0 CPU1 CPU2 CPU3 0: 153 0 0 0 IO-APIC-edge timer 1: 10 0 0 0 IO-APIC-edge i8042 6: 3 0 0 0 IO-APIC-edge floppy 8: 0 0 0 0 IO-APIC-edge rtc0 9: 0 0 0 0 IO-APIC-fasteoi acpi 10: 0 0 0 0 IO-APIC-fasteoi virtio0 11: 31 0 3 0 IO-APIC-fasteoi uhci_hcd:usb1 12: 15 0 0 0 IO-APIC-edge i8042 14: 0 0 0 0 IO-APIC-edge ata_piix 15: 0 0 0 0 IO-APIC-edge ata_piix 24: 0 0 0 0 PCI-MSI-edge virtio3-config 25: 6234 0 120283 272843 PCI-MSI-edge virtio3-req.0 26: 0 0 0 0 PCI-MSI-edge virtio1-config 27: 99 176029503 172147021 0 PCI-MSI-edge virtio1-input.0 28: 24 2721 497 4165 PCI-MSI-edge virtio1-output.0 29: 0 0 0 0 PCI-MSI-edge virtio2-config 30: 19 0 0 0 PCI-MSI-edge virtio2-virtqueues 31: 0 0 0 0 PCI-MSI-edge virtio4-config 32: 131111 0 0 0 PCI-MSI-edge virtio4-req.0 NMI: 0 0 0 0 Non-maskable interrupts LOC: 64284244 70653905 70450086 65741168 Local timer interrupts SPU: 0 0 0 0 Spurious interrupts PMI: 0 0 0 0 Performance monitoring interrupts IWI: 1242971 1424513 1360783 1296367 IRQ work interrupts RTR: 0 0 0 0 APIC ICR read retries RES: 33708963 29562936 30300354 32946303 Rescheduling interrupts CAL: 131150935 64213698 67023823 133615842 Function call interrupts TLB: 519133 541110 531432 522850 TLB shootdowns TRM: 0 0 0 0 Thermal event interrupts THR: 0 0 0 0 Threshold APIC interrupts DFR: 0 0 0 0 Deferred Error APIC interrupts MCE: 0 0 0 0 Machine check exceptions MCP: 2001 2001 2001 2001 Machine check polls ERR: 0 MIS: 0 PIN: 0 0 0 0 Posted-interrupt notification event PIW: 0 0 0 0 Posted-interrupt wakeup event

Rescheduling interrupts是重调度中断,用来唤醒空闲的CPU执行新的任务。

参考