profile-manual: Added oprofile usage section.

No re-writing at all.

(From yocto-docs rev: f42230e3665903a7603212696949241244555f8b)

Signed-off-by: Scott Rifenbark <scott.m.rifenbark@intel.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
This commit is contained in:
Scott Rifenbark 2013-01-17 12:57:16 -08:00 committed by Richard Purdie
parent fcf615546f
commit acb86de34e
1 changed files with 552 additions and 0 deletions

View File

@ -2221,6 +2221,558 @@
</section>
</section>
<section id='profile-manual-oprofile'>
<title>oprofile</title>
<para>
oprofile itself is a command-line application that runs on the
target system.
</para>
<section id='oprofile-setup'>
<title>Setup</title>
<para>
For this section, we'll assume you've already performed the
basic setup outlined in the
"<link linkend='profile-manual-general-setup'>General Setup</link>"
section.
</para>
<para>
For the the section that deals with oprofile from the command-line,
we assume you've ssh'ed to the host and will be running
oprofile on the target.
</para>
<para>
oprofileui (oprofile-viewer) is a GUI-based program that runs
on the host and interacts remotely with the target.
See the oprofileui section for the exact steps needed to
install oprofileui on the host.
</para>
</section>
<section id='oprofile-basic-usage'>
<title>Basic Usage</title>
<para>
Oprofile as configured in Yocto is a system-wide profiler
(i.e. the version in Yocto doesn't yet make use of the
perf_events interface which would allow it to profile
specific processes and workloads). It's relies on hardware
counter support in the hardware (but can fall back to a
timer-based mode), which means that it doesn't take
advantage of tracepoints or other event sources for example.
</para>
<para>
It consists of a kernel module that collects samples and a
userspace daemon that writes the sample data to disk.
</para>
<para>
The 'opcontrol' shell script is used for transparently
managing these components and starting and stopping
profiles, and the 'opreport' command is used to
display the results.
</para>
<para>
The oprofile daemon should already be running, but before
you start profiling, you may need to change some settings
and some of these settings may require the daemon not
be running. One of these settings is the path the the
vmlinux file, which you'll want to set using the --vmlinux
option if you want the kernel profiled:
<literallayout class='monospaced'>
root@crownbay:~# opcontrol --vmlinux=/boot/vmlinux-`uname -r`
The profiling daemon is currently active, so changes to the configuration
will be used the next time you restart oprofile after a --shutdown or --deinit.
</literallayout>
You can check if vmlinux file: is set using opcontrol --status:
<literallayout class='monospaced'>
root@crownbay:~# opcontrol --status
Daemon paused: pid 1334
Separate options: library
vmlinux file: none
Image filter: none
Call-graph depth: 6
</literallayout>
If it's not, you need to shutdown the daemon, add the setting
and restart the daemon:
<literallayout class='monospaced'>
root@crownbay:~# opcontrol --shutdown
Killing daemon.
root@crownbay:~# opcontrol --vmlinux=/boot/vmlinux-`uname -r`
root@crownbay:~# opcontrol --start-daemon
Using default event: CPU_CLK_UNHALTED:100000:0:1:1
Using 2.6+ OProfile kernel interface.
Reading module info.
Using log file /var/lib/oprofile/samples/oprofiled.log
Daemon started.
</literallayout>
If we get the status again we now see our updated settings:
<literallayout class='monospaced'>
root@crownbay:~# opcontrol --status
Daemon paused: pid 1649
Separate options: library
vmlinux file: /boot/vmlinux-3.4.11-yocto-standard
Image filter: none
Call-graph depth: 6
</literallayout>
We're now in a position to run a profile. For that we used
'opcontrol --start':
<literallayout class='monospaced'>
root@crownbay:~# opcontrol --start
Profiler running.
</literallayout>
In another window, run our wget workload:
<literallayout class='monospaced'>
root@crownbay:~# rm linux-2.6.19.2.tar.bz2; wget <ulink url='http://downloads.yoctoproject.org/mirror/sources/linux-2.6.19.2.tar.bz2'>http://downloads.yoctoproject.org/mirror/sources/linux-2.6.19.2.tar.bz2</ulink>; sync
Connecting to downloads.yoctoproject.org (140.211.169.59:80)
linux-2.6.19.2.tar.b 100% |*******************************| 41727k 0:00:00 ETA
</literallayout>
To stop the profile we use 'opcontrol --shudown', which not
only stops the profile but shuts down the daemon as well:
<literallayout class='monospaced'>
root@crownbay:~# opcontrol --start
Stopping profiling.
Killing daemon.
</literallayout>
Oprofile writes sample data to /var/lib/oprofile/samples,
which you can look at if you're interested in seeing how the
samples are structured. This is also interesting because
it's related to how you dive down to get further details
about specific executables in OProfile.
</para>
<para>
To see the default display output for a profile, simply type
'opreport', which will show the results using the data in
/var/lib/oprofile/samples:
<literallayout class='monospaced'>
root@crownbay:~# opreport
WARNING! The OProfile kernel driver reports sample buffer overflows.
Such overflows can result in incorrect sample attribution, invalid sample
files and other symptoms. See the oprofiled.log for details.
You should adjust your sampling frequency to eliminate (or at least minimize)
these overflows.
CPU: Intel Architectural Perfmon, speed 1.3e+06 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000
CPU_CLK_UNHALT...|
samples| %|
------------------
464365 79.8156 vmlinux-3.4.11-yocto-standard
65108 11.1908 oprofiled
CPU_CLK_UNHALT...|
samples| %|
------------------
64416 98.9372 oprofiled
692 1.0628 libc-2.16.so
36959 6.3526 no-vmlinux
4378 0.7525 busybox
CPU_CLK_UNHALT...|
samples| %|
------------------
2844 64.9612 libc-2.16.so
1337 30.5391 busybox
193 4.4084 ld-2.16.so
2 0.0457 libnss_compat-2.16.so
1 0.0228 libnsl-2.16.so
1 0.0228 libnss_files-2.16.so
4344 0.7467 bash
CPU_CLK_UNHALT...|
samples| %|
------------------
2657 61.1648 bash
1665 38.3287 libc-2.16.so
18 0.4144 ld-2.16.so
3 0.0691 libtinfo.so.5.9
1 0.0230 libdl-2.16.so
3118 0.5359 nf_conntrack
686 0.1179 matchbox-terminal
CPU_CLK_UNHALT...|
samples| %|
------------------
214 31.1953 libglib-2.0.so.0.3200.4
114 16.6181 libc-2.16.so
79 11.5160 libcairo.so.2.11200.2
78 11.3703 libgdk-x11-2.0.so.0.2400.8
51 7.4344 libpthread-2.16.so
45 6.5598 libgobject-2.0.so.0.3200.4
29 4.2274 libvte.so.9.2800.2
25 3.6443 libX11.so.6.3.0
19 2.7697 libxcb.so.1.1.0
17 2.4781 libgtk-x11-2.0.so.0.2400.8
12 1.7493 librt-2.16.so
3 0.4373 libXrender.so.1.3.0
671 0.1153 emgd
411 0.0706 nf_conntrack_ipv4
391 0.0672 iptable_nat
378 0.0650 nf_nat
263 0.0452 Xorg
CPU_CLK_UNHALT...|
samples| %|
------------------
106 40.3042 Xorg
53 20.1521 libc-2.16.so
31 11.7871 libpixman-1.so.0.27.2
26 9.8859 emgd_drv.so
16 6.0837 libemgdsrv_um.so.1.5.15.3226
11 4.1825 libEMGD2d.so.1.5.15.3226
9 3.4221 libfb.so
7 2.6616 libpthread-2.16.so
1 0.3802 libudev.so.0.9.3
1 0.3802 libdrm.so.2.4.0
1 0.3802 libextmod.so
1 0.3802 mouse_drv.so
.
.
.
9 0.0015 connmand
CPU_CLK_UNHALT...|
samples| %|
------------------
4 44.4444 libglib-2.0.so.0.3200.4
2 22.2222 libpthread-2.16.so
1 11.1111 connmand
1 11.1111 libc-2.16.so
1 11.1111 librt-2.16.so
6 0.0010 oprofile-server
CPU_CLK_UNHALT...|
samples| %|
------------------
3 50.0000 libc-2.16.so
1 16.6667 oprofile-server
1 16.6667 libpthread-2.16.so
1 16.6667 libglib-2.0.so.0.3200.4
5 8.6e-04 gconfd-2
CPU_CLK_UNHALT...|
samples| %|
------------------
2 40.0000 libdbus-1.so.3.7.2
2 40.0000 libglib-2.0.so.0.3200.4
1 20.0000 libc-2.16.so
</literallayout>
The output above shows the breakdown or samples by both
number of samples and percentage for each executable.
Within an executable, the sample counts are broken down
further into executable and shared libraries (DSOs) used
by the executable.
</para>
<para>
To get even more detailed breakdowns by function, we need to
have the full paths to the DSOs, which we can get by
using -f with opreport:
<literallayout class='monospaced'>
root@crownbay:~# opreport -f
CPU: Intel Architectural Perfmon, speed 1.3e+06 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000
CPU_CLK_UNHALT...|
samples| %|
464365 79.8156 /boot/vmlinux-3.4.11-yocto-standard
65108 11.1908 /usr/bin/oprofiled
CPU_CLK_UNHALT...|
samples| %|
------------------
64416 98.9372 /usr/bin/oprofiled
692 1.0628 /lib/libc-2.16.so
36959 6.3526 /no-vmlinux
4378 0.7525 /bin/busybox
CPU_CLK_UNHALT...|
samples| %|
------------------
2844 64.9612 /lib/libc-2.16.so
1337 30.5391 /bin/busybox
193 4.4084 /lib/ld-2.16.so
2 0.0457 /lib/libnss_compat-2.16.so
1 0.0228 /lib/libnsl-2.16.so
1 0.0228 /lib/libnss_files-2.16.so
4344 0.7467 /bin/bash
CPU_CLK_UNHALT...|
samples| %|
------------------
2657 61.1648 /bin/bash
1665 38.3287 /lib/libc-2.16.so
18 0.4144 /lib/ld-2.16.so
3 0.0691 /lib/libtinfo.so.5.9
1 0.0230 /lib/libdl-2.16.so
.
.
.
</literallayout>
Using the paths shown in the above output and the -l option to
opreport, we can see all the functions that have hits in the
profile and their sample counts and percentages. Here's a
portion of what we get for the kernel:
<literallayout class='monospaced'>
root@crownbay:~# opreport -l /boot/vmlinux-3.4.11-yocto-standard
CPU: Intel Architectural Perfmon, speed 1.3e+06 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000
samples % symbol name
233981 50.3873 intel_idle
15437 3.3243 rb_get_reader_page
14503 3.1232 ring_buffer_consume
14092 3.0347 mutex_spin_on_owner
13024 2.8047 read_hpet
8039 1.7312 sub_preempt_count
7096 1.5281 ioread32
6997 1.5068 add_preempt_count
3985 0.8582 rb_advance_reader
3488 0.7511 add_event_entry
3303 0.7113 get_parent_ip
3104 0.6684 rb_buffer_peek
2960 0.6374 op_cpu_buffer_read_entry
2614 0.5629 sync_buffer
2545 0.5481 debug_smp_processor_id
2456 0.5289 ohci_irq
2397 0.5162 memset
2349 0.5059 __copy_to_user_ll
2185 0.4705 ring_buffer_event_length
1918 0.4130 in_lock_functions
1850 0.3984 __schedule
1767 0.3805 __copy_from_user_ll_nozero
1575 0.3392 rb_event_data_length
1256 0.2705 memcpy
1233 0.2655 system_call
1213 0.2612 menu_select
</literallayout>
Notice that above we see an entry for the __copy_to_user_ll()
function that we've looked at with other profilers as well.
</para>
<para>
Here's what we get when we do the same thing for the
busybox executable:
<literallayout class='monospaced'>
CPU: Intel Architectural Perfmon, speed 1.3e+06 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000
samples % image name symbol name
349 8.4198 busybox retrieve_file_data
308 7.4306 libc-2.16.so _IO_file_xsgetn
283 6.8275 libc-2.16.so __read_nocancel
235 5.6695 libc-2.16.so syscall
233 5.6212 libc-2.16.so clearerr
215 5.1870 libc-2.16.so fread
181 4.3667 libc-2.16.so __write_nocancel
158 3.8118 libc-2.16.so __underflow
151 3.6429 libc-2.16.so _dl_addr
150 3.6188 busybox progress_meter
150 3.6188 libc-2.16.so __poll_nocancel
148 3.5706 libc-2.16.so _IO_file_underflow@@GLIBC_2.1
137 3.3052 busybox safe_poll
125 3.0157 busybox bb_progress_update
122 2.9433 libc-2.16.so __x86.get_pc_thunk.bx
95 2.2919 busybox full_write
81 1.9542 busybox safe_write
77 1.8577 busybox xwrite
72 1.7370 libc-2.16.so _IO_file_read
71 1.7129 libc-2.16.so _IO_sgetn
67 1.6164 libc-2.16.so poll
52 1.2545 libc-2.16.so _IO_switch_to_get_mode
45 1.0856 libc-2.16.so read
34 0.8203 libc-2.16.so write
32 0.7720 busybox monotonic_sec
25 0.6031 libc-2.16.so vfprintf
22 0.5308 busybox get_mono
14 0.3378 ld-2.16.so strcmp
14 0.3378 libc-2.16.so __x86.get_pc_thunk.cx
.
.
.
</literallayout>
Since we recorded the profile with a callchain depth of 6, we
should be able to see our __copy_to_user_ll() callchains in
the output, and indeed we can if we search around a bit in
the 'opreport --callgraph' output:
<literallayout class='monospaced'>
root@crownbay:~# opreport --callgraph /boot/vmlinux-3.4.11-yocto-standard
392 6.9639 vmlinux-3.4.11-yocto-standard sock_aio_read
736 13.0751 vmlinux-3.4.11-yocto-standard __generic_file_aio_write
3255 57.8255 vmlinux-3.4.11-yocto-standard inet_recvmsg
785 0.1690 vmlinux-3.4.11-yocto-standard tcp_recvmsg
1790 31.7940 vmlinux-3.4.11-yocto-standard local_bh_enable
1238 21.9893 vmlinux-3.4.11-yocto-standard __kfree_skb
992 17.6199 vmlinux-3.4.11-yocto-standard lock_sock_nested
785 13.9432 vmlinux-3.4.11-yocto-standard tcp_recvmsg [self]
525 9.3250 vmlinux-3.4.11-yocto-standard release_sock
112 1.9893 vmlinux-3.4.11-yocto-standard tcp_cleanup_rbuf
72 1.2789 vmlinux-3.4.11-yocto-standard skb_copy_datagram_iovec
170 0.0366 vmlinux-3.4.11-yocto-standard skb_copy_datagram_iovec
1491 73.3038 vmlinux-3.4.11-yocto-standard memcpy_toiovec
327 16.0767 vmlinux-3.4.11-yocto-standard skb_copy_datagram_iovec
170 8.3579 vmlinux-3.4.11-yocto-standard skb_copy_datagram_iovec [self]
20 0.9833 vmlinux-3.4.11-yocto-standard copy_to_user
2588 98.2909 vmlinux-3.4.11-yocto-standard copy_to_user
2349 0.5059 vmlinux-3.4.11-yocto-standard __copy_to_user_ll
2349 89.2138 vmlinux-3.4.11-yocto-standard __copy_to_user_ll [self]
166 6.3046 vmlinux-3.4.11-yocto-standard do_page_fault
</literallayout>
Remember that by default OProfile sessions are cumulative
i.e. if you start and stop a profiling session, then start a
new one, the new one will not erase the previous run(s) but
will build on it. If you want to restart a profile from scratch,
you need to reset:
<literallayout class='monospaced'>
root@crownbay:~# opcontrol --reset
</literallayout>
</para>
</section>
<section id='oprofileui-a-gui-for-oprofile'>
<title>OProfileUI - A GUI for OProfile</title>
<para>
Yocto also supports a graphical UI for controlling and viewing
OProfile traces, called OProfileUI. To use it, you first need
to clone the oprofileui git repo, then configure, build, and
install it:
<literallayout class='monospaced'>
[trz@empanada tmp]$ git clone git://git.yoctoproject.org/oprofileui
[trz@empanada tmp]$ cd oprofileui
[trz@empanada oprofileui]$ ./autogen.sh
[trz@empanada oprofileui]$ sudo make install
</literallayout>
OprofileUI replaces the 'opreport' functionality with a GUI,
and normally doesn't require the user to use 'opcontrol' either.
If you want to profile the kernel, however, you need to either
use the UI to specify a vmlinux or use 'opcontrol' to specify
it on the target:
</para>
<para>
First, on the target, check if vmlinux file: is set:
<literallayout class='monospaced'>
root@crownbay:~# opcontrol --status
</literallayout>
If not:
<literallayout class='monospaced'>
root@crownbay:~# opcontrol --shutdown
root@crownbay:~# opcontrol --vmlinux=/boot/vmlinux-`uname -r`
root@crownbay:~# opcontrol --start-daemon
</literallayout>
Now, start the oprofile UI on the host system:
<literallayout class='monospaced'>
[trz@empanada oprofileui]$ oprofile-viewer
</literallayout>
To run a profile on the remote system, first connect to the
remote system by pressing the 'Connect' button and supplying
the IP address and port of the remote system (the default
port is 4224).
</para>
<para>
The oprofile server should automatically be started already.
If not, the connection will fail and you either typed in the
wrong IP address and port (see below), or you need to start
the server yourself:
<literallayout class='monospaced'>
root@crownbay:~# oprofile-server
</literallayout>
Or, to specify a specific port:
<literallayout class='monospaced'>
root@crownbay:~# oprofile-server --port 8888
</literallayout>
Once connected, press the 'Start' button and then run the
wget workload on the remote system:
<literallayout class='monospaced'>
root@crownbay:~# rm linux-2.6.19.2.tar.bz2; wget <ulink url='http://downloads.yoctoproject.org/mirror/sources/linux-2.6.19.2.tar.bz2'>http://downloads.yoctoproject.org/mirror/sources/linux-2.6.19.2.tar.bz2</ulink>; sync
Connecting to downloads.yoctoproject.org (140.211.169.59:80)
linux-2.6.19.2.tar.b 100% |*******************************| 41727k 0:00:00 ETA
</literallayout>
Once the workload completes, press the 'Stop' button. At that
point the OProfile viewer will download the profile files it's
collected (this may take some time, especially if the kernel
was profiled). While it downloads the files, you should see
something like the following:
</para>
<para>
<imagedata fileref="figures/oprofileui-downloading.png" width="6in" depth="7in" align="center" scalefit="1" />
</para>
<para>
Once the profile files have been retrieved, you should see a
list of the processes that were profiled:
</para>
<para>
<imagedata fileref="figures/oprofileui-processes.png" width="6in" depth="7in" align="center" scalefit="1" />
</para>
<para>
If you select one of them, you should see all the symbols that
were hit during the profile. Selecting one of them will show a
list of callers and callees of the chosen function in two
panes below the top pane. For example, here's what we see
when we select __copy_to_user_ll():
</para>
<para>
<imagedata fileref="figures/oprofileui-copy-to-user.png" width="6in" depth="7in" align="center" scalefit="1" />
</para>
<para>
As another example, we can look at the busybox process and see
that the progress meter made a system call:
</para>
<para>
<imagedata fileref="figures/oprofileui-busybox.png" width="6in" depth="7in" align="center" scalefit="1" />
</para>
<note>
Tying It Together: oprofile does have build options to enable
use of the perf_event subsystem and benefit from the perf_event
infrastructure by adding support for something other than
system-wide profiling i.e. per-process or workload profiling,
but the version in danny doesn't yet take advantage of
those capabilities.
</note>
</section>
<section id='oprofile-documentation'>
<title>Documentation</title>
<para>
Yocto already has some information on setting up and using
OProfile and oprofileui. As this document doesn't cover
everything in detail, it may be worth taking a look at the
"<ulink url='&YOCTO_DOCS_DEV_URL;#platdev-oprofile'>Profiling with OProfile</ulink>"
section in the Yocto Project Development Manual
</para>
<para>
The OProfile manual can be found here:
<ulink url='http://oprofile.sourceforge.net/doc/index.html'>OProfile manual</ulink>
</para>
<para>
The OProfile website contains links to the above manual and
bunch of other items including an extensive set of examples:
<ulink url='http://oprofile.sourceforge.net/about/'>About OProfile</ulink>
</para>
</section>
</section>
</chapter>
<!--
vim: expandtab tw=80 ts=4