exynos-linux-stable/kernel
Daniel Borkmann 527bc99584
bpf: make jited programs visible in traces
A long-standing issue with JITed programs is that stack traces from
function tracing check whether a given address is kernel code
through {__,}kernel_text_address(), which checks for code in core
kernel, modules and dynamically allocated ftrace trampolines. But
what is still missing is BPF JITed programs (interpreted programs
are not an issue as __bpf_prog_run() will be attributed to them),
thus when a stack trace is triggered, the code walking the stack
won't see any of the JITed ones. The same for address correlation
done from user space via reading /proc/kallsyms. This is read by
tools like perf, but the latter is also useful for permanent live
tracing with eBPF itself in combination with stack maps when other
eBPF types are part of the callchain. See offwaketime example on
dumping stack from a map.

This work tries to tackle that issue by making the addresses and
symbols known to the kernel. The lookup from *kernel_text_address()
is implemented through a latched RB tree that can be read under
RCU in fast-path that is also shared for symbol/size/offset lookup
for a specific given address in kallsyms. The slow-path iteration
through all symbols in the seq file is done via an RCU list, which
holds a tiny fraction of all exported ksyms, usually below 0.1 percent.
Function symbols are exported as bpf_prog_<tag>, in order to aid
debugging and attribution. This facility is currently enabled for
root-only when bpf_jit_kallsyms is set to 1, and disabled if hardening
is active in any mode. The rationale behind this is that a lot of
systems still ship with world-readable permissions on kallsyms, so
addresses should not suddenly get exposed on them. If that situation gets
much better in future, we always have the option to change the
default on this. Likewise, unprivileged programs are not allowed
to add entries there either, but that is less of a concern as most
such program types relevant in this context are for root-only anyway.
If enabled, call graphs and stack traces will then show a correct
attribution; one example is illustrated below, where the trace is
now visible in tooling such as perf script --kallsyms=/proc/kallsyms
and friends.

Before:

  7fff8166889d bpf_clone_redirect+0x80007f0020ed (/lib/modules/4.9.0-rc8+/build/vmlinux)
         f5d80 __sendmsg_nocancel+0xffff006451f1a007 (/usr/lib64/libc-2.18.so)

After:

  7fff816688b7 bpf_clone_redirect+0x80007f002107 (/lib/modules/4.9.0-rc8+/build/vmlinux)
  7fffa0575728 bpf_prog_33c45a467c9e061a+0x8000600020fb (/lib/modules/4.9.0-rc8+/build/vmlinux)
  7fffa07ef1fc cls_bpf_classify+0x8000600020dc (/lib/modules/4.9.0-rc8+/build/vmlinux)
  7fff81678b68 tc_classify+0x80007f002078 (/lib/modules/4.9.0-rc8+/build/vmlinux)
  7fff8164d40b __netif_receive_skb_core+0x80007f0025fb (/lib/modules/4.9.0-rc8+/build/vmlinux)
  7fff8164d718 __netif_receive_skb+0x80007f002018 (/lib/modules/4.9.0-rc8+/build/vmlinux)
  7fff8164e565 process_backlog+0x80007f002095 (/lib/modules/4.9.0-rc8+/build/vmlinux)
  7fff8164dc71 net_rx_action+0x80007f002231 (/lib/modules/4.9.0-rc8+/build/vmlinux)
  7fff81767461 __softirqentry_text_start+0x80007f0020d1 (/lib/modules/4.9.0-rc8+/build/vmlinux)
  7fff817658ac do_softirq_own_stack+0x80007f00201c (/lib/modules/4.9.0-rc8+/build/vmlinux)
  7fff810a2c20 do_softirq+0x80007f002050 (/lib/modules/4.9.0-rc8+/build/vmlinux)
  7fff810a2cb5 __local_bh_enable_ip+0x80007f002085 (/lib/modules/4.9.0-rc8+/build/vmlinux)
  7fff8168d452 ip_finish_output2+0x80007f002152 (/lib/modules/4.9.0-rc8+/build/vmlinux)
  7fff8168ea3d ip_finish_output+0x80007f00217d (/lib/modules/4.9.0-rc8+/build/vmlinux)
  7fff8168f2af ip_output+0x80007f00203f (/lib/modules/4.9.0-rc8+/build/vmlinux)
  [...]
  7fff81005854 do_syscall_64+0x80007f002054 (/lib/modules/4.9.0-rc8+/build/vmlinux)
  7fff817649eb return_from_SYSCALL_64+0x80007f002000 (/lib/modules/4.9.0-rc8+/build/vmlinux)
         f5d80 __sendmsg_nocancel+0xffff01c484812007 (/usr/lib64/libc-2.18.so)
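
The user-space side of the address correlation described above can be sketched roughly as follows. This is a minimal illustration, not what perf or the kernel actually does: it parses kallsyms-style lines and maps an address to the nearest preceding symbol plus offset. The helper names (parse_kallsyms, resolve) and all addresses in SAMPLE are hypothetical and chosen only for demonstration.

```python
import bisect

def parse_kallsyms(text):
    """Parse kallsyms-style lines ("<hex addr> <type> <name> [module]")
    into a list of (address, name) pairs sorted by address."""
    symbols = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        symbols.append((int(fields[0], 16), fields[2]))
    symbols.sort()
    return symbols

def resolve(symbols, addr):
    """Return (name, offset) for the closest symbol at or below addr,
    or None if addr lies below every known symbol."""
    addrs = [a for a, _ in symbols]
    idx = bisect.bisect_right(addrs, addr) - 1
    if idx < 0:
        return None
    start, name = symbols[idx]
    return name, addr - start

# Illustrative input only: these addresses are made up, not real output.
SAMPLE = """\
ffffffffa0575000 t bpf_prog_33c45a467c9e061a\t[bpf]
ffffffffa07ef120 t cls_bpf_classify\t[cls_bpf]
ffffffff81678af0 T tc_classify
"""

if __name__ == "__main__":
    syms = parse_kallsyms(SAMPLE)
    name, off = resolve(syms, 0xffffffffa0575728)
    print(f"{name}+{off:#x}")
```

Because this sketch has no symbol sizes, an address past the end of a program would still be attributed to the nearest lower symbol; the in-kernel lookup described in the commit message also tracks size and does not have that limitation.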

Change-Id: Ied10f8ade48a833d5e222b0c7621663f9df20a5a
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-09-25 16:54:39 +03:00
bpf bpf: make jited programs visible in traces 2024-09-25 16:54:39 +03:00
configs ANDROID: add script to fetch android kernel config fragments 2017-10-03 17:19:26 +00:00
debug This is the 4.9.212 stable release 2020-01-29 10:47:55 +01:00
events BACKPORT: bpf: permit multiple bpf attachments for a single perf event 2024-09-25 16:54:38 +03:00
gcov gcov: support GCC 7.1 2017-09-02 07:07:53 +02:00
irq Merge 4.9.218 branch 'android-4.9-q' into tw10-android-4.9-q 2020-04-09 17:15:01 +03:00
livepatch ANDROID: kallsyms: increase KSYM_NAME_LEN 2023-02-21 00:16:35 +03:00
locking Merge 4.9.212 branch 'android-4.9-q' into tw10-android-4.9-q 2020-02-12 12:32:38 +02:00
power PM/Sleep: Start killing wakelocks after two minutes of idle (120s) 2023-02-21 00:10:25 +03:00
printk printk: Add sleep time offset to all timestamps 2023-02-21 00:15:19 +03:00
rcu Merge 4.9.212 branch 'android-4.9-q' into tw10-android-4.9-q 2020-02-12 12:32:38 +02:00
sched psi: eliminate kthread_worker from psi trigger scheduling mechanism 2024-09-25 16:54:35 +03:00
time timers: Add a function to start/reduce a timer 2023-04-30 19:48:24 +03:00
trace bpf: remove struct bpf_prog_type_list 2024-09-25 16:54:39 +03:00
.gitignore
acct.c kernel/acct.c: fix the acct->needcheck check in check_free_space() 2018-01-10 09:29:51 +01:00
async.c kernel/async.c: revert "async: simplify lowest_in_progress()" 2018-02-17 13:21:18 +01:00
audit.c fs: proc: backport PROC_AVC from N770F 2023-02-21 00:18:26 +03:00
audit.h
audit_fsnotify.c
audit_tree.c
audit_watch.c audit_get_nd(): don't unlock parent too early 2019-12-21 10:40:48 +01:00
auditfilter.c audit: fix error handling in audit_data_to_entry() 2020-03-11 07:53:05 +01:00
auditsc.c Merge 4.9.212 branch 'android-4.9-q' into tw10-android-4.9-q 2020-02-12 12:32:38 +02:00
backtracetest.c
bounds.c kbuild: fix kernel/bounds.c 'W=1' warning 2018-11-13 11:16:57 -08:00
capability.c import G965FXXU7DTAA OSRC 2020-02-04 13:50:09 +02:00
cfi.c ANDROID: cfi: Remove unused variable in ptr_to_check_fn 2019-03-06 18:27:16 +00:00
cgroup.c UPSTREAM: bpf: multi program support for cgroup+bpf 2024-09-25 16:54:37 +03:00
cgroup_freezer.c exynos9810: kernel/drivers: remove samsung freecess 2023-02-21 00:08:33 +03:00
cgroup_pids.c cgroup: pids: use atomic64_t for pids->limit 2019-12-21 10:42:02 +01:00
compat.c
configs.c
context_tracking.c
cpu.c cpu: Silence log spam when a CPU is brought up 2023-02-21 00:15:16 +03:00
cpu_pm.c import G965FXXU7DTAA OSRC 2020-02-04 13:50:09 +02:00
cpuset.c import G965FXXU7DTAA OSRC 2020-02-04 13:50:09 +02:00
crash_dump.c
cred.c Merge 4.9.212 branch 'android-4.9-q' into tw10-android-4.9-q 2020-02-12 12:32:38 +02:00
delayacct.c import PAGE_BOOST from N10 Lite 2023-02-21 00:19:34 +03:00
dma.c
elfcore.c kernel/elfcore.c: include proper prototypes 2019-10-17 13:42:13 -07:00
exec_domain.c
exit.c import G96xFXXU9ETF5 2023-02-21 00:10:23 +03:00
extable.c bpf: make jited programs visible in traces 2024-09-25 16:54:39 +03:00
fork.c Merge 4.9.212 branch 'android-4.9-q' into tw10-android-4.9-q 2020-02-12 12:32:38 +02:00
freezer.c
futex.c futex: Unbreak futex hashing 2020-04-02 17:20:28 +02:00
futex_compat.c
groups.c kernel: make groups_sort calling a responsibility group_info allocators 2018-01-10 09:29:52 +01:00
hung_task.c kernel: hung_task.c: disable on suspend 2019-04-20 09:07:52 +02:00
irq_work.c
jump_label.c jump_label: Invoke jump_label_test() via early_initcall() 2017-12-14 09:28:24 +01:00
kallsyms.c bpf: make jited programs visible in traces 2024-09-25 16:54:39 +03:00
kaslr.c import G965FXXU7DTAA OSRC 2020-02-04 13:50:09 +02:00
kcmp.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt
kcov.c This is the 4.9.117 stable release 2018-08-03 08:50:05 +02:00
kexec.c
kexec_core.c objtool, x86: Add several functions and files to the objtool whitelist 2018-06-05 10:28:57 +02:00
kexec_file.c
kexec_internal.h
kmod.c Import G96XFXXUCFTJ2 OSRC 2023-02-21 00:10:26 +03:00
kprobes.c kprobes: Don't call BUG_ON() if there is a kprobe in use on free list 2019-11-25 09:52:15 +01:00
ksysfs.c
kthread.c kthread, tracing: Don't expose half-written comm when creating kthreads 2018-08-03 07:55:12 +02:00
latencytop.c
Makefile kernel: rename stock config for /proc/config.gz 2023-02-21 00:11:12 +03:00
membarrier.c Fix: Disable sys_membarrier when nohz_full is enabled 2017-03-12 06:41:45 +01:00
memremap.c mm, devm_memremap_pages: kill mapping "System RAM" support 2019-01-13 10:03:51 +01:00
module-internal.h
module.c Merge 4.9.212 branch 'android-4.9-q' into tw10-android-4.9-q 2020-02-12 12:32:38 +02:00
module_signing.c
notifier.c Merge 4.9.218 branch 'android-4.9-q' into tw10-android-4.9-q 2020-04-09 17:15:01 +03:00
nsproxy.c
padata.c padata: always acquire cpu_hotplug_lock before pinst->lock 2020-04-13 10:32:53 +02:00
panic.c locking/refcounts, x86/asm: Implement fast refcount overflow protection 2023-04-30 19:49:33 +03:00
params.c
pid.c Merge 4.9.212 branch 'android-4.9-q' into tw10-android-4.9-q 2020-02-12 12:32:38 +02:00
pid_namespace.c signal/pid_namespace: Fix reboot_pid_ns to use send_sig not force_sig 2019-08-04 09:33:16 +02:00
profile.c
ptrace.c import G96XFXXUGFUG4 OSRC 2023-02-21 00:10:27 +03:00
range.c
reboot.c import G965FXXU7DTAA OSRC 2020-02-04 13:50:09 +02:00
relay.c kernel/relay.c: limit kmalloc size to KMALLOC_MAX_SIZE 2018-05-30 07:50:29 +02:00
resource.c import G965FXXU7DTAA OSRC 2020-02-04 13:50:09 +02:00
seccomp.c import G965FXXU7DTAA OSRC 2020-02-04 13:50:09 +02:00
signal.c exynos9810: kernel/drivers: remove samsung freecess 2023-02-21 00:08:33 +03:00
smp.c smp: Avoid using two cache lines for struct call_single_data 2023-04-30 19:42:50 +03:00
smpboot.c
smpboot.h
softirq.c Merge 4.9.212 branch 'android-4.9-q' into tw10-android-4.9-q 2020-02-12 12:32:38 +02:00
stacktrace.c stacktrace, lockdep: Fix address, newline ugliness 2017-02-14 15:25:42 -08:00
stop_machine.c import G965FXXU7DTAA OSRC 2020-02-04 13:50:09 +02:00
sys.c Merge 4.9.212 branch 'android-4.9-q' into tw10-android-4.9-q 2020-02-12 12:32:38 +02:00
sys_ni.c BACKPORT: signal: support CLONE_PIDFD with pidfd_send_signal 2019-09-03 13:44:59 -07:00
sysctl.c kernel: sched: sync sched_rt_boost_threshold with N770F 2023-06-08 13:36:43 +03:00
sysctl_binary.c
task_work.c
taskstats.c taskstats: add e/u/stime for TGID command 2023-04-30 19:45:59 +03:00
test_kprobes.c
torture.c
tracepoint.c tracepoint: Do not warn on ENOMEM 2018-05-09 09:50:20 +02:00
tsacct.c
ucount.c kernel/ucount.c: mark user_header with kmemleak_ignore() 2017-06-17 06:41:51 +02:00
uid16.c kernel: make groups_sort calling a responsibility group_info allocators 2018-01-10 09:29:52 +01:00
up.c smp: Avoid using two cache lines for struct call_single_data 2023-04-30 19:42:50 +03:00
user-return-notifier.c
user.c ANDROID: proc: Add /proc/uid directory 2018-04-03 11:15:30 -07:00
user_namespace.c Merge 4.9.212 branch 'android-4.9-q' into tw10-android-4.9-q 2020-02-12 12:32:38 +02:00
utsname.c
utsname_sysctl.c sys: don't hold uts_sem while accessing userspace memory 2018-09-09 20:01:24 +02:00
watchdog.c import G965FXXU7DTAA OSRC 2020-02-04 13:50:09 +02:00
watchdog_hld.c kernel/watchdog: prevent false hardlockup on overloaded system 2017-06-17 06:41:57 +02:00
workqueue.c kernel/workqueue.c: remove ifdefs over wq_power_efficient 2023-02-21 00:09:19 +03:00
workqueue_internal.h UPSTREAM: psi: fix aggregation idle shut-off 2019-03-22 14:15:01 -07:00