ProfilingWesnoth
Latest revision as of 04:50, 8 May 2023
Linux
When using either scons or cmake to build, there are four options available for profiling, which are listed below. For cmake use -DPROFILER=<name>; for scons use profiler=<name>.
gperftools
To use gperftools:
- Install the packages google-perftools (needed later for running google-pprof) and libgoogle-perftools-dev (needed in order to use the -lprofiler linker option).
- In a terminal, export the CPUPROFILE variable, such as export CPUPROFILE=./wesnoth-prof.
- Build any executable while setting either -DPROFILER=gperftools (cmake) or profiler=gperftools (scons).
- Run the executable and have it do any task(s) as needed to get relevant profiling information.
- Generate the human-readable profiling output using the command google-pprof <executable> <profiling info> > prof.txt for a text file, or google-pprof -gif <executable> <profiling info> > prof.gif for a viewable gif image.
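The steps above can be strung together roughly as follows. This is a minimal sketch, assuming a cmake build in a directory named build, an executable at build/wesnoth, and a profile file named wesnoth-prof; the function name profile_with_gperftools and those paths are illustrative and not part of the Wesnoth build system.

```shell
# Sketch: build with gperftools, run once, and write a text report.
# Assumptions (not from this page): build directory "build", executable
# "<build dir>/wesnoth", report written to prof.txt in the current directory.
profile_with_gperftools() {
    build_dir=${1:-build}
    profile_file=${2:-./wesnoth-prof}

    # Configure and build with the gperftools profiler linked in.
    cmake -S . -B "$build_dir" -DPROFILER=gperftools || return 1
    cmake --build "$build_dir" || return 1

    # CPUPROFILE tells gperftools where to write the collected samples.
    CPUPROFILE=$profile_file "$build_dir/wesnoth" || return 1

    # Convert the samples into a human-readable text report.
    google-pprof "$build_dir/wesnoth" "$profile_file" > prof.txt
}
```

Calling profile_with_gperftools with no arguments uses the defaults above; pass a build directory and a profile path to override them.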
Unfortunately, the output, whether graphical or text, doesn't provide any labels for what the values mean. For the text output, the columns are:
- The number of profiling samples in this function
- The percentage of profiling samples in this function
- The percentage of profiling samples in the functions printed so far
- The number of profiling samples in this function and its callees
- The percentage of profiling samples in this function and its callees
- The function name
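Because the text report has no header row, it can help to label the columns when reading individual lines. The following is a small sketch using awk; the helper name label_pprof_columns and the sample line are made up for illustration and are not real Wesnoth profiling data.

```shell
# Label the six columns of each line of pprof text output read from stdin.
label_pprof_columns() {
    awk '{
        name = $6
        for (i = 7; i <= NF; i++) name = name " " $i   # function names may contain spaces
        printf "self=%s (%s) running=%s self+callees=%s (%s) function=%s\n", $1, $2, $3, $4, $5, name
    }'
}

# Hypothetical sample line, invented purely to show the column order.
printf '%s\n' '     120  12.0%  12.0%      340  34.0% display::draw' | label_pprof_columns
```

For a real report, the same helper can be fed a few lines at a time, for example head -n 20 prof.txt | label_pprof_columns.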
For the graphical output, each square will contain:
- The namespace/class/method profiled, each of those on a separate line
- The number of profiling samples in this function
- The percentage of profiling samples in this function (in parentheses)
- The number of profiling samples in this function and its callees
- The percentage of profiling samples in this function and its callees (in parentheses)
perf
To use perf:
- Install the packages linux-tools-common and linux-tools-<kernel version>, e.g. linux-tools-5.8.0-55-generic.
- Run perf record <executable>. To do this you will need to do one of the following:
  - Run the executable as root
  - Switch to root and run echo 2 > /proc/sys/kernel/perf_event_paranoid
  - Create a new group that has rights to use perf and add your user to it, e.g.:
    cd /usr/bin
    groupadd perf_users
    chgrp perf_users perf
    chmod o-rwx perf
    setcap cap_sys_admin,cap_sys_ptrace,cap_syslog=ep perf
Once that's complete, there will be a perf.data file created, and commands like perf report can be run.
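Once permissions are sorted out, the record and report steps might look like the sketch below. It assumes the executable is ./wesnoth and that a plain-text report is wanted; the function name profile_with_perf and the output file perf-report.txt are illustrative choices, not something this page prescribes.

```shell
# Sketch: record a perf profile of an executable and dump a text report.
# Assumptions (not from this page): executable path and report file name.
profile_with_perf() {
    exe=${1:-./wesnoth}

    # Show the current restriction level, if readable; lower values are more permissive.
    if [ -r /proc/sys/kernel/perf_event_paranoid ]; then
        echo "perf_event_paranoid: $(cat /proc/sys/kernel/perf_event_paranoid)"
    fi

    # Run the executable under perf; samples are written to perf.data.
    perf record "$exe" || return 1

    # Render perf.data as plain text instead of the interactive viewer.
    perf report --stdio > perf-report.txt
}
```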
gcov
To use gcov:
- After the executable has been built and run as needed, there will be many .gcda and .gcno files in the source directory.
- Generate the human-readable information from those files with gcov ./**/*.gcno. This will generate a .gcov file for each source file.
- Open any gcov file(s) of interest.
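Note that the ** pattern only recurses into subdirectories in shells with recursive globbing (in bash it must be enabled with shopt -s globstar). A portable alternative is to let find locate the files; the helper name list_gcno_files below is hypothetical and only meant as a sketch.

```shell
# List every .gcno file below a directory (default: the current one).
list_gcno_files() {
    find "${1:-.}" -type f -name '*.gcno'
}

# Usage (requires gcov and an instrumented build that has already been run):
#   find . -type f -name '*.gcno' -exec gcov {} +
```

Checking the list first is a cheap way to confirm that the instrumented build actually produced coverage files before running gcov over all of them.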
gprof
To use gprof:
- After the executable has been built and run as needed, there will be a gmon.out file generated.
- Execute gprof <executable> gmon.out > prof.txt. This may take quite a while and create a rather large file.
- Open prof.txt.
Profile Guided Optimization (PGO)
Profile guided optimization is, to summarize it extremely briefly, a way to let the compiler generate a higher performing executable by providing it additional information (aka a profile) about how the program is used. While there is support for enabling this in scons and cmake, no executables are currently built using this for two reasons:
- It is much more work.
  - The executable must be built first with the additional instrumentation needed to generate the profiling data.
  - That executable must then be run and perform tasks that accurately mimic real world use. If your profile contains information primarily on unimportant and rarely executed tasks, or if the way the tasks are executed doesn't match what happens in the real world, then the profile generated will be used by the compiler to make optimizations that have no impact on the program's performance or may even make it slower.
  - The executable must then be built a second time, this time using the information gathered in the profile to help direct how optimizations are done.
- For the wesnoth client (wesnoth.exe) for example, there's no set of cases defining what situations would need to be run to generate a good profile. For the server executables (such as wesnothd), while it would be fairly easy to generate a good profile by simply letting an instrumented build run for a day or so, there is currently no need to do this since they're already able to perform adequately without any optimizations being enabled at all.
To use PGO for cmake:
- Set -DPGO_DATA=generate and build the executable.
- Run the executable to generate the profile information, which will be saved to the pgo_data/ directory (if using GCC) or a .profraw file (if using Clang).
- Set -DPGO_DATA=use and build the executable again.
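The three cmake steps can be sketched as a single function. This assumes a build directory named build and an executable at build/wesnoth, and it treats one plain run of the executable as the training workload; the function name pgo_build_cmake and those paths are illustrative only, and a real profile would need a representative workload as discussed above.

```shell
# Sketch: two-pass PGO build with cmake.
# Assumptions (not from this page): build directory "build", executable
# "<build dir>/wesnoth", and a single run standing in for the training workload.
pgo_build_cmake() {
    build_dir=${1:-build}

    # Pass 1: build with instrumentation that records profile data.
    cmake -S . -B "$build_dir" -DPGO_DATA=generate || return 1
    cmake --build "$build_dir" || return 1

    # Run the instrumented executable to produce the profile.
    "$build_dir/wesnoth" || return 1

    # Pass 2: rebuild, letting the compiler use the recorded profile.
    cmake -S . -B "$build_dir" -DPGO_DATA=use || return 1
    cmake --build "$build_dir"
}
```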
To use PGO for scons:
- Set pgo_data=generate and build the executable.
- Run the executable to generate the profile information, which will be saved to the pgo_data/ directory (if using GCC) or a .profraw file (if using Clang).
- Set pgo_data=use and build the executable again.