Commit graph

498 commits

Author SHA1 Message Date
Diep Quynh
75f9b626ee
gaming_control: Remove task kill check
In case task_struct was locked out, we can't check its cmdline
If we do, we'll get a soft lockup

Background and top-app check is enough

Signed-off-by: Diep Quynh <remilia.1505@gmail.com>
2024-09-27 17:20:00 +03:00
Diep Quynh
609163c324
drivers: Introduce brand new kernel gaming mode
How to trigger gaming mode? Just open a game that is supported in games list
How to exit? Kill the game from recents, or simply back to homescreen

What does this gaming mode do?
- It limits big cluster maximum frequency to 2,0GHz, and little cluster
  to values matching GPU frequencies as below:
+ 338MHz: 455MHz
+ 385MHz and above: 1053MHz
- As for the cluster freq limits, it overcomes heating issue while playing
  heavy games, as well as saves battery juice

Big thanks to [kerneltoast] for the following commits on his wahoo kernel
5ac1e81d3d
e13e2c4554

Gaming control's idea was based on these

Signed-off-by: Diep Quynh <remilia.1505@gmail.com>
2024-09-27 17:19:55 +03:00
Dylan Neve
2ffda2a073
exynos9810: kernel/drivers: remove samsung freecess 2023-02-21 00:08:33 +03:00
FAROVITUS
eb6fae6224 Merge 4.9.217 branch 'android-4.9-q' into tw10-android-4.9-q 2020-03-23 16:23:34 +02:00
Greg Kroah-Hartman
d2adefff96 This is the 4.9.217 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl50ed8ACgkQONu9yGCS
 aT7A8A//WF8aq34T1LLR70ESa6fJVj2dEzCQA94qvUCPHDJfBj1x00R8OGvZfKEc
 E7YPI8MPrvMmXnyxImge2UcqCJhmnis+gk1fLn+N3Yi5nfZWrxC1dnShCnCTeye1
 tA+BuhyzOXHDmLqrktBcT3TCTrIUWK8lehR8m7jlkAh7wG6Z2QmgKlaiq5fRAVE9
 Z0je4fw0HxdVrCM84BJiM1M6w3pa+AYUkSBfqvU+fgm8xfOWGUPZFEitx9O1PwaH
 m0UOS51gM5ZSzpAMDBlXYnmUgPxUTlpwqIL77zgMFzYcpKbf2rVphFRxmacWWqQ2
 MWnt6vn/eatDeJ6TFAF08APy63SUAR7BvssSYoySz8Y4m6LbL4yUfCWYkL9lhLFH
 dJL6TTjUwFw1Rd7UVzHongEr24NASCS37MiR8V/3RUiDqIYtrlaJKxhhVdTKg7mm
 xRKA+QBXkGUfwFJm2mL+3E3D7/Xgl4kNo22lp1YgamTgc50/UMB5JEf8Z3aqa/qh
 ie/4oKnmaa+SSgOxQmvqS3lWF7RYU+axvc9K0FK5CYSXnc6Jfgtq0yDB5twslj05
 HdCwNJX9KOL0NZAzQKs+FLjnhOtnRigANMG7KMm5nmS3vwsxQqIkm0b6tPhgBcd/
 k4QQm8h6C5Tdi+R39fGQOgtBK0gVVJl+n3XfA3gbmunvXhSo6IQ=
 =fFH1
 -----END PGP SIGNATURE-----

Merge 4.9.217 into android-4.9-q

Changes in 4.9.217
	NFS: Remove superfluous kmap in nfs_readdir_xdr_to_array
	phy: Revert toggling reset changes.
	net: phy: Avoid multiple suspends
	cgroup, netclassid: periodically release file_lock on classid updating
	gre: fix uninit-value in __iptunnel_pull_header
	ipv6/addrconf: call ipv6_mc_up() for non-Ethernet interface
	net: macsec: update SCI upon MAC address change.
	net: nfc: fix bounds checking bugs on "pipe"
	r8152: check disconnect status after long sleep
	bnxt_en: reinitialize IRQs when MTU is modified
	fib: add missing attribute validation for tun_id
	nl802154: add missing attribute validation
	nl802154: add missing attribute validation for dev_type
	macsec: add missing attribute validation for port
	net: fq: add missing attribute validation for orphan mask
	team: add missing attribute validation for port ifindex
	team: add missing attribute validation for array index
	nfc: add missing attribute validation for SE API
	nfc: add missing attribute validation for vendor subcommand
	ipvlan: add cond_resched_rcu() while processing muticast backlog
	ipvlan: do not add hardware address of master to its unicast filter list
	ipvlan: egress mcast packets are not exceptional
	ipvlan: do not use cond_resched_rcu() in ipvlan_process_multicast()
	ipvlan: don't deref eth hdr before checking it's set
	macvlan: add cond_resched() during multicast processing
	net: fec: validate the new settings in fec_enet_set_coalesce()
	slip: make slhc_compress() more robust against malicious packets
	bonding/alb: make sure arp header is pulled before accessing it
	cgroup: memcg: net: do not associate sock with unrelated cgroup
	net: phy: fix MDIO bus PM PHY resuming
	virtio-blk: fix hw_queue stopped on arbitrary error
	iommu/vt-d: quirk_ioat_snb_local_iommu: replace WARN_TAINT with pr_warn + add_taint
	workqueue: don't use wq_select_unbound_cpu() for bound works
	drm/amd/display: remove duplicated assignment to grph_obj_type
	cifs_atomic_open(): fix double-put on late allocation failure
	gfs2_atomic_open(): fix O_EXCL|O_CREAT handling on cold dcache
	KVM: x86: clear stale x86_emulate_ctxt->intercept value
	ARC: define __ALIGN_STR and __ALIGN symbols for ARC
	efi: Fix a race and a buffer overflow while reading efivars via sysfs
	iommu/vt-d: dmar: replace WARN_TAINT with pr_warn + add_taint
	iommu/vt-d: Fix a bug in intel_iommu_iova_to_phys() for huge page
	nl80211: add missing attribute validation for critical protocol indication
	nl80211: add missing attribute validation for beacon report scanning
	nl80211: add missing attribute validation for channel switch
	netfilter: cthelper: add missing attribute validation for cthelper
	mwifiex: Fix heap overflow in mmwifiex_process_tdls_action_frame()
	iommu/vt-d: Fix the wrong printing in RHSA parsing
	iommu/vt-d: Ignore devices with out-of-spec domain number
	ipv6: restrict IPV6_ADDRFORM operation
	efi: Add a sanity check to efivar_store_raw()
	batman-adv: Fix double free during fragment merge error
	batman-adv: Fix transmission of final, 16th fragment
	batman-adv: Initialize gw sel_class via batadv_algo
	batman-adv: Fix rx packet/bytes stats on local ARP reply
	batman-adv: Use default throughput value on cfg80211 error
	batman-adv: Accept only filled wifi station info
	batman-adv: fix TT sync flag inconsistencies
	batman-adv: Avoid spurious warnings from bat_v neigh_cmp implementation
	batman-adv: Always initialize fragment header priority
	batman-adv: Fix check of retrieved orig_gw in batadv_v_gw_is_eligible
	batman-adv: Fix lock for ogm cnt access in batadv_iv_ogm_calc_tq
	batman-adv: Fix internal interface indices types
	batman-adv: Avoid race in TT TVLV allocator helper
	batman-adv: Fix TT sync flags for intermediate TT responses
	batman-adv: prevent TT request storms by not sending inconsistent TT TLVLs
	batman-adv: Fix debugfs path for renamed hardif
	batman-adv: Fix debugfs path for renamed softif
	batman-adv: Avoid storing non-TT-sync flags on singular entries too
	batman-adv: Fix multicast TT issues with bogus ROAM flags
	batman-adv: Prevent duplicated gateway_node entry
	batman-adv: Fix duplicated OGMs on NETDEV_UP
	batman-adv: Avoid free/alloc race when handling OGM2 buffer
	batman-adv: Avoid free/alloc race when handling OGM buffer
	batman-adv: Don't schedule OGM for disabled interface
	batman-adv: update data pointers after skb_cow()
	batman-adv: Avoid probe ELP information leak
	batman-adv: Use explicit tvlv padding for ELP packets
	perf/amd/uncore: Replace manual sampling check with CAP_NO_INTERRUPT flag
	ACPI: watchdog: Allow disabling WDAT at boot
	HID: apple: Add support for recent firmware on Magic Keyboards
	HID: i2c-hid: add Trekstor Surfbook E11B to descriptor override
	cfg80211: check reg_rule for NULL in handle_channel_custom()
	net: ks8851-ml: Fix IRQ handling and locking
	mac80211: rx: avoid RCU list traversal under mutex
	signal: avoid double atomic counter increments for user accounting
	jbd2: fix data races at struct journal_head
	ARM: 8957/1: VDSO: Match ARMv8 timer in cntvct_functional()
	ARM: 8958/1: rename missed uaccess .fixup section
	mm: slub: add missing TID bump in kmem_cache_alloc_bulk()
	ipv4: ensure rcu_read_lock() in cipso_v4_error()
	Linux 4.9.217

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ia7aeed273cd7548dc8d0dfaaad8b96bedfe499b1
2020-03-20 11:01:08 +01:00
Linus Torvalds
4306259ff6 signal: avoid double atomic counter increments for user accounting
[ Upstream commit fda31c50292a5062332fa0343c084bd9f46604d9 ]

When queueing a signal, we increment both the users count of pending
signals (for RLIMIT_SIGPENDING tracking) and we increment the refcount
of the user struct itself (because we keep a reference to the user in
the signal structure in order to correctly account for it when freeing).

That turns out to be fairly expensive, because both of them are atomic
updates, and particularly under extreme signal handling pressure on big
machines, you can get a lot of cache contention on the user struct.
That can then cause horrid cacheline ping-pong when you do these
multiple accesses.

So change the reference counting to only pin the user for the _first_
pending signal, and to unpin it when the last pending signal is
dequeued.  That means that when a user sees a lot of concurrent signal
queuing - which is the only situation when this matters - the only
atomic access needed is generally the 'sigpending' count update.

This was noticed because of a particularly odd timing artifact on a
dual-socket 96C/192T Cascade Lake platform: when you get into bad
contention, on that machine for some reason seems to be much worse when
the contention happens in the upper 32-byte half of the cacheline.

As a result, the kernel test robot will-it-scale 'signal1' benchmark had
an odd performance regression simply due to random alignment of the
'struct user_struct' (and pointed to a completely unrelated and
apparently nonsensical commit for the regression).

Avoiding the double increments (and decrements on the dequeueing side,
of course) makes for much less contention and hugely improved
performance on that will-it-scale microbenchmark.

Quoting Feng Tang:

 "It makes a big difference, that the performance score is tripled! bump
  from original 17000 to 54000. Also the gap between 5.0-rc6 and
  5.0-rc6+Jiri's patch is reduced to around 2%"

[ The "2% gap" is the odd cacheline placement difference on that
  platform: under the extreme contention case, the effect of which half
  of the cacheline was hot was 5%, so with the reduced contention the
  odd timing artifact is reduced too ]

It does help in the non-contended case too, but is not nearly as
noticeable.

Reported-and-tested-by: Feng Tang <feng.tang@intel.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Huang, Ying <ying.huang@intel.com>
Cc: Philip Li <philip.li@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-03-20 09:07:57 +01:00
FAROVITUS
af1d3ae977 Merge 4.9.212 branch 'android-4.9-q' into tw10-android-4.9-q
Documentation/filesystems/fscrypt.rst
	arch/arm/common/Kconfig
	arch/arm64/boot/dts/altera/socfpga_stratix10.dtsi
	arch/arm64/boot/dts/amd/amd-seattle-soc.dtsi
	arch/arm64/boot/dts/arm/juno-clocks.dtsi
	arch/arm64/boot/dts/broadcom/ns2.dtsi
	arch/arm64/boot/dts/lg/lg1312.dtsi
	arch/arm64/boot/dts/lg/lg1313.dtsi
	arch/arm64/boot/dts/marvell/armada-37xx.dtsi
	arch/arm64/boot/dts/nvidia/tegra210-p2180.dtsi
	arch/arm64/boot/dts/nvidia/tegra210-p2597.dtsi
	arch/arm64/boot/dts/nvidia/tegra210.dtsi
	arch/arm64/boot/dts/qcom/apq8016-sbc.dtsi
	arch/arm64/boot/dts/qcom/msm8996.dtsi
	arch/arm64/configs/ranchu64_defconfig
	arch/arm64/include/asm/cpucaps.h
	arch/arm64/kernel/cpufeature.c
	arch/arm64/kernel/traps.c
	arch/arm64/mm/mmu.c
	crypto/Makefile
	crypto/ablkcipher.c
	crypto/blkcipher.c
	crypto/testmgr.h
	crypto/zstd.c
	drivers/android/binder.c
	drivers/android/binder_alloc.c
	drivers/char/random.c
	drivers/clocksource/exynos_mct.c
	drivers/dma/pl330.c
	drivers/hid/hid-sony.c
	drivers/hid/uhid.c
	drivers/hid/usbhid/hiddev.c
	drivers/i2c/i2c-core.c
	drivers/md/dm-crypt.c
	drivers/media/v4l2-core/videobuf2-v4l2.c
	drivers/mmc/host/dw_mmc.c
	drivers/net/ethernet/broadcom/tg3.c
	drivers/net/usb/r8152.c
	drivers/scsi/scsi_logging.c
	drivers/scsi/sd.c
	drivers/scsi/ufs/ufshcd-pci.c
	drivers/scsi/ufs/ufshcd-pltfrm.c
	drivers/staging/android/Kconfig
	drivers/staging/android/ion/ion.c
	drivers/staging/android/ion/ion_priv.h
	drivers/staging/android/ion/ion_system_heap.c
	drivers/staging/android/lowmemorykiller.c
	drivers/tty/serial/samsung.c
	drivers/usb/dwc3/core.c
	drivers/usb/dwc3/gadget.c
	drivers/usb/host/xhci-hub.c
	drivers/video/fbdev/core/fbmon.c
	drivers/video/fbdev/core/modedb.c
	fs/crypto/fname.c
	fs/crypto/fscrypt_private.h
	fs/crypto/keyinfo.c
	fs/ext4/ialloc.c
	fs/ext4/namei.c
	fs/ext4/xattr.c
	fs/f2fs/checkpoint.c
	fs/f2fs/data.c
	fs/f2fs/debug.c
	fs/f2fs/dir.c
	fs/f2fs/f2fs.h
	fs/f2fs/file.c
	fs/f2fs/gc.c
	fs/f2fs/inline.c
	fs/f2fs/inode.c
	fs/f2fs/namei.c
	fs/f2fs/node.c
	fs/f2fs/recovery.c
	fs/f2fs/segment.c
	fs/f2fs/segment.h
	fs/f2fs/super.c
	fs/f2fs/sysfs.c
	fs/fat/dir.c
	fs/fat/fatent.c
	fs/file.c
	fs/namespace.c
	fs/pnode.c
	fs/proc/inode.c
	fs/proc/root.c
	fs/proc/task_mmu.c
	fs/sdcardfs/dentry.c
	fs/sdcardfs/derived_perm.c
	fs/sdcardfs/file.c
	fs/sdcardfs/inode.c
	fs/sdcardfs/lookup.c
	fs/sdcardfs/main.c
	fs/sdcardfs/sdcardfs.h
	fs/sdcardfs/super.c
	include/linux/blk_types.h
	include/linux/cpuhotplug.h
	include/linux/cred.h
	include/linux/fb.h
	include/linux/power_supply.h
	include/linux/sched.h
	include/linux/zstd.h
	include/trace/events/sched.h
	include/uapi/linux/android/binder.h
	init/Kconfig
	init/main.c
	kernel/bpf/hashtab.c
	kernel/cpu.c
	kernel/cred.c
	kernel/fork.c
	kernel/locking/spinlock_debug.c
	kernel/panic.c
	kernel/printk/printk.c
	kernel/sched/Makefile
	kernel/sched/core.c
	kernel/sched/fair.c
	kernel/sched/rt.c
	kernel/sched/walt.c
	kernel/sched/walt.h
	kernel/trace/trace.c
	lib/bug.c
	lib/list_debug.c
	lib/vsprintf.c
	lib/zstd/bitstream.h
	lib/zstd/compress.c
	lib/zstd/decompress.c
	lib/zstd/fse.h
	lib/zstd/fse_compress.c
	lib/zstd/fse_decompress.c
	lib/zstd/huf_compress.c
	lib/zstd/huf_decompress.c
	lib/zstd/zstd_internal.h
	mm/debug.c
	mm/filemap.c
	mm/rmap.c
	net/core/filter.c
	net/ipv4/sysctl_net_ipv4.c
	net/ipv4/sysfs_net_ipv4.c
	net/ipv4/tcp_input.c
	net/ipv4/tcp_output.c
	net/ipv4/udp.c
	net/ipv6/netfilter/nf_conntrack_reasm.c
	net/netfilter/Kconfig
	net/netfilter/Makefile
	net/netfilter/xt_qtaguid.c
	net/netfilter/xt_qtaguid_internal.h
	net/xfrm/xfrm_policy.c
	net/xfrm/xfrm_state.c
	scripts/checkpatch.pl
	security/selinux/hooks.c
	sound/core/compress_offload.c
2020-02-12 12:32:38 +02:00
FAROVITUS
2b92eefa41 import G965FXXU7DTAA OSRC
*First release for Android (Q).

Signed-off-by: FAROVITUS <farovitus@gmail.com>
2020-02-04 13:50:09 +02:00
Greg Kroah-Hartman
7e0f964aa2 This is the 4.9.212 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl4xT1kACgkQONu9yGCS
 aT7d9Q/9FvxHEFvYlet8Bfx/uLFsVGOKwDiZ+e7eQhYqzamdl75qLbPbPqcyfeDO
 hCa87JFJCCpRlcHjSvdNk/IF9rRpA6+Wi7FijX/UjgmWNPCao6qb5vBRoH9KMgOk
 rtDnIp+klLcG+xLAnTkLgAVCm7osGMVZgwKKqhCLEKdVIT38VlP9nCipehTeMmrS
 1f8sqQhqXt0FpticFP5JVJXg3iH5bdqXNc9eCXuF1h7nudYEut8v5NFoF3w0EInZ
 iMKnLhbfR9ScOpzaWE/Hg5tJIUwFG9l/IhlDDJJk3O0oTKzs+ZM4N8+RzZ3iFy+N
 JNzspQatLKIHSbGGyGK61bu4Eq2bGdCaQMPcMmFHiDKtz7+thE7bjZOcPjzo4pjA
 3JJ9ytVUAujIXKryed5E/vNfG6W/puIY6xaM1LPJPIYGa20/0gJGiWXnsXBk6NxG
 EvkuC1nwYcIdkopYqM4OZz8n63ywya0pY5JifVW6q3zYU8sqwC9mkkAYGjT0IRb8
 llvUyaO05nIDJkY2mgv124cxxyKTsUYPc7+c/VTW7HGyzAc7MLPOpYDbXZHW9WWT
 GNGhlBJlk1UK87CRoHswuPgmx+eABXtGjmwHQJbxRmf1y7aWFNjYBKAyQCi9y/sm
 XW4LT4l4JhT3xB5SFcti1iqchD2OFk65CKCFeHkm54URXb68Qjs=
 =PuM5
 -----END PGP SIGNATURE-----

Merge 4.9.212 into android-4.9-q

Changes in 4.9.212
	xfs: Sanity check flags of Q_XQUOTARM call
	powerpc/archrandom: fix arch_get_random_seed_int()
	mt7601u: fix bbp version check in mt7601u_wait_bbp_ready
	drm/sti: do not remove the drm_bridge that was never added
	drm/virtio: fix bounds check in virtio_gpu_cmd_get_capset()
	ALSA: hda: fix unused variable warning
	IB/rxe: replace kvfree with vfree
	ALSA: usb-audio: update quirk for B&W PX to remove microphone
	staging: comedi: ni_mio_common: protect register write overflow
	pwm: lpss: Release runtime-pm reference from the driver's remove callback
	mlxsw: reg: QEEC: Add minimum shaper fields
	pcrypt: use format specifier in kobject_add
	exportfs: fix 'passing zero to ERR_PTR()' warning
	drm/dp_mst: Skip validating ports during destruction, just ref
	net: phy: Fix not to call phy_resume() if PHY is not attached
	pinctrl: sh-pfc: r8a7740: Add missing REF125CK pin to gether_gmii group
	pinctrl: sh-pfc: r8a7740: Add missing LCD0 marks to lcd0_data24_1 group
	pinctrl: sh-pfc: r8a7791: Remove bogus ctrl marks from qspi_data4_b group
	pinctrl: sh-pfc: r8a7791: Remove bogus marks from vin1_b_data18 group
	pinctrl: sh-pfc: sh73a0: Add missing TO pin to tpu4_to3 group
	pinctrl: sh-pfc: r8a7794: Remove bogus IPSR9 field
	pinctrl: sh-pfc: sh7734: Add missing IPSR11 field
	pinctrl: sh-pfc: sh7269: Add missing PCIOR0 field
	pinctrl: sh-pfc: sh7734: Remove bogus IPSR10 value
	Input: nomadik-ske-keypad - fix a loop timeout test
	clk: highbank: fix refcount leak in hb_clk_init()
	clk: qoriq: fix refcount leak in clockgen_init()
	clk: socfpga: fix refcount leak
	clk: samsung: exynos4: fix refcount leak in exynos4_get_xom()
	clk: imx6q: fix refcount leak in imx6q_clocks_init()
	clk: imx6sx: fix refcount leak in imx6sx_clocks_init()
	clk: imx7d: fix refcount leak in imx7d_clocks_init()
	clk: vf610: fix refcount leak in vf610_clocks_init()
	clk: armada-370: fix refcount leak in a370_clk_init()
	clk: kirkwood: fix refcount leak in kirkwood_clk_init()
	clk: armada-xp: fix refcount leak in axp_clk_init()
	clk: dove: fix refcount leak in dove_clk_init()
	IB/usnic: Fix out of bounds index check in query pkey
	RDMA/ocrdma: Fix out of bounds index check in query pkey
	RDMA/qedr: Fix out of bounds index check in query pkey
	arm64: dts: apq8016-sbc: Increase load on l11 for SDCARD
	drm/etnaviv: NULL vs IS_ERR() buf in etnaviv_core_dump()
	media: s5p-jpeg: Correct step and max values for V4L2_CID_JPEG_RESTART_INTERVAL
	crypto: tgr192 - fix unaligned memory access
	ASoC: imx-sgtl5000: put of nodes if finding codec fails
	IB/iser: Pass the correct number of entries for dma mapped SGL
	rtc: cmos: ignore bogus century byte
	clk: sunxi-ng: sun8i-a23: Enable PLL-MIPI LDOs when ungating it
	iwlwifi: mvm: fix A-MPDU reference assignment
	tty: ipwireless: Fix potential NULL pointer dereference
	crypto: crypto4xx - Fix wrong ppc4xx_trng_probe()/ppc4xx_trng_remove() arguments
	ARM: dts: lpc32xx: add required clocks property to keypad device node
	ARM: dts: lpc32xx: reparent keypad controller to SIC1
	ARM: dts: lpc32xx: fix ARM PrimeCell LCD controller variant
	ARM: dts: lpc32xx: fix ARM PrimeCell LCD controller clocks property
	ARM: dts: lpc32xx: phy3250: fix SD card regulator voltage
	iwlwifi: mvm: fix RSS config command
	staging: most: cdev: add missing check for cdev_add failure
	rtc: ds1672: fix unintended sign extension
	thermal: mediatek: fix register index error
	net: phy: fixed_phy: Fix fixed_phy not checking GPIO
	rtc: 88pm860x: fix unintended sign extension
	rtc: 88pm80x: fix unintended sign extension
	rtc: pm8xxx: fix unintended sign extension
	fbdev: chipsfb: remove set but not used variable 'size'
	iw_cxgb4: use tos when importing the endpoint
	iw_cxgb4: use tos when finding ipv6 routes
	pinctrl: sh-pfc: emev2: Add missing pinmux functions
	pinctrl: sh-pfc: r8a7791: Fix scifb2_data_c pin group
	pinctrl: sh-pfc: r8a7792: Fix vin1_data18_b pin group
	pinctrl: sh-pfc: sh73a0: Fix fsic_spdif pin groups
	usb: phy: twl6030-usb: fix possible use-after-free on remove
	block: don't use bio->bi_vcnt to figure out segment number
	keys: Timestamp new keys
	vfio_pci: Enable memory accesses before calling pci_map_rom
	dmaengine: mv_xor: Use correct device for DMA API
	cdc-wdm: pass return value of recover_from_urb_loss
	regulator: pv88060: Fix array out-of-bounds access
	regulator: pv88080: Fix array out-of-bounds access
	regulator: pv88090: Fix array out-of-bounds access
	net: dsa: qca8k: Enable delay for RGMII_ID mode
	drm/nouveau/bios/ramcfg: fix missing parentheses when calculating RON
	drm/nouveau/pmu: don't print reply values if exec is false
	ASoC: qcom: Fix of-node refcount unbalance in apq8016_sbc_parse_of()
	fs/nfs: Fix nfs_parse_devname to not modify it's argument
	NFS: Fix a soft lockup in the delegation recovery code
	clocksource/drivers/sun5i: Fail gracefully when clock rate is unavailable
	clocksource/drivers/exynos_mct: Fix error path in timer resources initialization
	mmc: sdhci-brcmstb: handle mmc_of_parse() errors during probe
	ARM: 8847/1: pm: fix HYP/SVC mode mismatch when MCPM is used
	ARM: 8848/1: virt: Align GIC version check with arm64 counterpart
	regulator: wm831x-dcdc: Fix list of wm831x_dcdc_ilim from mA to uA
	nios2: ksyms: Add missing symbol exports
	scsi: megaraid_sas: reduce module load time
	drivers/rapidio/rio_cm.c: fix potential oops in riocm_ch_listen()
	xen, cpu_hotplug: Prevent an out of bounds access
	net: sh_eth: fix a missing check of of_get_phy_mode
	media: ivtv: update *pos correctly in ivtv_read_pos()
	media: cx18: update *pos correctly in cx18_read_pos()
	media: wl128x: Fix an error code in fm_download_firmware()
	media: cx23885: check allocation return
	regulator: tps65086: Fix tps65086_ldoa1_ranges for selector 0xB
	jfs: fix bogus variable self-initialization
	tipc: tipc clang warning
	m68k: mac: Fix VIA timer counter accesses
	ARM: OMAP2+: Fix potentially uninitialized return value for _setup_reset()
	media: davinci-isif: avoid uninitialized variable use
	media: tw5864: Fix possible NULL pointer dereference in tw5864_handle_frame
	spi: tegra114: clear packed bit for unpacked mode
	spi: tegra114: fix for unpacked mode transfers
	soc/fsl/qe: Fix an error code in qe_pin_request()
	spi: bcm2835aux: fix driver to not allow 65535 (=-1) cs-gpios
	ehea: Fix a copy-paste err in ehea_init_port_res
	scsi: qla2xxx: Unregister chrdev if module initialization fails
	ARM: pxa: ssp: Fix "WARNING: invalid free of devm_ allocated data"
	hwmon: (w83627hf) Use request_muxed_region for Super-IO accesses
	tipc: set sysctl_tipc_rmem and named_timeout right range
	powerpc: vdso: Make vdso32 installation conditional in vdso_install
	ARM: dts: ls1021: Fix SGMII PCS link remaining down after PHY disconnect
	media: ov2659: fix unbalanced mutex_lock/unlock
	6lowpan: Off by one handling ->nexthdr
	dmaengine: axi-dmac: Don't check the number of frames for alignment
	ALSA: usb-audio: Handle the error from snd_usb_mixer_apply_create_quirk()
	packet: in recvmsg msg_name return at least sizeof sockaddr_ll
	ASoC: fix valid stream condition
	usb: gadget: fsl: fix link error against usb-gadget module
	IB/mlx5: Add missing XRC options to QP optional params mask
	iommu/vt-d: Make kernel parameter igfx_off work with vIOMMU
	net: ena: fix swapped parameters when calling ena_com_indirect_table_fill_entry
	net: ena: fix: Free napi resources when ena_up() fails
	net: ena: fix incorrect test of supported hash function
	net: ena: fix ena_com_fill_hash_function() implementation
	dmaengine: tegra210-adma: restore channel status
	l2tp: Fix possible NULL pointer dereference
	media: omap_vout: potential buffer overflow in vidioc_dqbuf()
	media: davinci/vpbe: array underflow in vpbe_enum_outputs()
	platform/x86: alienware-wmi: printing the wrong error code
	netfilter: ebtables: CONFIG_COMPAT: reject trailing data after last rule
	pwm: meson: Don't disable PWM when setting duty repeatedly
	ARM: riscpc: fix lack of keyboard interrupts after irq conversion
	kdb: do a sanity check on the cpu in kdb_per_cpu()
	backlight: lm3630a: Return 0 on success in update_status functions
	thermal: cpu_cooling: Actually trace CPU load in thermal_power_cpu_get_power
	dmaengine: tegra210-adma: Fix crash during probe
	spi: spi-fsl-spi: call spi_finalize_current_message() at the end
	crypto: ccp - fix AES CFB error exposed by new test vectors
	serial: stm32: fix transmit_chars when tx is stopped
	misc: sgi-xp: Properly initialize buf in xpc_get_rsvd_page_pa
	iommu: Use right function to get group for device
	signal/cifs: Fix cifs_put_tcp_session to call send_sig instead of force_sig
	inet: frags: call inet_frags_fini() after unregister_pernet_subsys()
	media: vivid: fix incorrect assignment operation when setting video mode
	powerpc/cacheinfo: add cacheinfo_teardown, cacheinfo_rebuild
	drm/msm/mdp5: Fix mdp5_cfg_init error return
	net: netem: fix backlog accounting for corrupted GSO frames
	net/af_iucv: always register net_device notifier
	ASoC: ti: davinci-mcasp: Fix slot mask settings when using multiple AXRs
	rtc: pcf8563: Clear event flags and disable interrupts before requesting irq
	drm/msm/a3xx: remove TPL1 regs from snapshot
	perf/ioctl: Add check for the sample_period value
	dmaengine: hsu: Revert "set HSU_CH_MTSR to memory width"
	clk: qcom: Fix -Wunused-const-variable
	iommu/amd: Make iommu_disable safer
	mfd: intel-lpss: Release IDA resources
	rxrpc: Fix uninitialized error code in rxrpc_send_data_packet()
	devres: allow const resource arguments
	RDMA/hns: Fixs hw access invalid dma memory error
	net: pasemi: fix an use-after-free in pasemi_mac_phy_init()
	scsi: libfc: fix null pointer dereference on a null lport
	libertas_tf: Use correct channel range in lbtf_geo_init
	qed: reduce maximum stack frame size
	usb: host: xhci-hub: fix extra endianness conversion
	mic: avoid statically declaring a 'struct device'.
	x86/kgbd: Use NMI_VECTOR not APIC_DM_NMI
	ALSA: aoa: onyx: always initialize register read value
	net/mlx5: Fix mlx5_ifc_query_lag_out_bits
	cifs: fix rmmod regression in cifs.ko caused by force_sig changes
	crypto: caam - free resources in case caam_rng registration failed
	ext4: set error return correctly when ext4_htree_store_dirent fails
	ASoC: es8328: Fix copy-paste error in es8328_right_line_controls
	ASoC: cs4349: Use PM ops 'cs4349_runtime_pm'
	ASoC: wm8737: Fix copy-paste error in wm8737_snd_controls
	signal: Allow cifs and drbd to receive their terminating signals
	ASoC: sun4i-i2s: RX and TX counter registers are swapped
	dmaengine: dw: platform: Switch to acpi_dma_controller_register()
	mac80211: minstrel_ht: fix per-group max throughput rate initialization
	mips: avoid explicit UB in assignment of mips_io_port_base
	ahci: Do not export local variable ahci_em_messages
	Partially revert "kfifo: fix kfifo_alloc() and kfifo_init()"
	hwmon: (lm75) Fix write operations for negative temperatures
	power: supply: Init device wakeup after device_add()
	x86, perf: Fix the dependency of the x86 insn decoder selftest
	staging: greybus: light: fix a couple double frees
	bcma: fix incorrect update of BCMA_CORE_PCI_MDIO_DATA
	iio: dac: ad5380: fix incorrect assignment to val
	ath9k: dynack: fix possible deadlock in ath_dynack_node_{de}init
	net: sonic: return NETDEV_TX_OK if failed to map buffer
	Btrfs: fix hang when loading existing inode cache off disk
	hwmon: (shtc1) fix shtc1 and shtw1 id mask
	net: sonic: replace dev_kfree_skb in sonic_send_packet
	net/rds: Fix 'ib_evt_handler_call' element in 'rds_ib_stat_names'
	iommu/amd: Wait for completion of IOTLB flush in attach_device
	net: hisilicon: Fix signedness bug in hix5hd2_dev_probe()
	net: broadcom/bcmsysport: Fix signedness in bcm_sysport_probe()
	net: stmmac: dwmac-meson8b: Fix signedness bug in probe
	of: mdio: Fix a signedness bug in of_phy_get_and_connect()
	net: ethernet: stmmac: Fix signedness bug in ipq806x_gmac_of_parse()
	nvme: retain split access workaround for capability reads
	net: stmmac: gmac4+: Not all Unicast addresses may be available
	mac80211: accept deauth frames in IBSS mode
	llc: fix another potential sk_buff leak in llc_ui_sendmsg()
	llc: fix sk_buff refcounting in llc_conn_state_process()
	net: stmmac: fix length of PTP clock's name string
	act_mirred: Fix mirred_init_module error handling
	drm/msm/dsi: Implement reset correctly
	dmaengine: imx-sdma: fix size check for sdma script_number
	net: netem: fix error path for corrupted GSO frames
	net: netem: correct the parent's backlog when corrupted packet was dropped
	net: qca_spi: Move reset_count to struct qcaspi
	afs: Fix large file support
	media: ov6650: Fix incorrect use of JPEG colorspace
	media: ov6650: Fix some format attributes not under control
	media: ov6650: Fix .get_fmt() V4L2_SUBDEV_FORMAT_TRY support
	MIPS: Loongson: Fix return value of loongson_hwmon_init
	net: neigh: use long type to store jiffies delta
	packet: fix data-race in fanout_flow_is_huge()
	dmaengine: ti: edma: fix missed failure handling
	drm/radeon: fix bad DMA from INTERRUPT_CNTL2
	arm64: dts: juno: Fix UART frequency
	IB/iser: Fix dma_nents type definition
	m68k: Call timer_interrupt() with interrupts disabled
	net: ethtool: Add back transceiver type
	net: phy: Keep reporting transceiver type
	can, slip: Protect tty->disc_data in write_wakeup and close with RCU
	firestream: fix memory leaks
	net: cxgb3_main: Add CAP_NET_ADMIN check to CHELSIO_GET_MEM
	net, ip6_tunnel: fix namespaces move
	net, ip_tunnel: fix namespaces move
	net_sched: fix datalen for ematch
	tcp_bbr: improve arithmetic division in bbr_update_bw()
	net: usb: lan78xx: Add .ndo_features_check
	gtp: make sure only SOCK_DGRAM UDP sockets are accepted
	hwmon: (adt7475) Make volt2reg return same reg as reg2volt input
	hwmon: (core) Simplify sysfs attribute name allocation
	hwmon: Deal with errors from the thermal subsystem
	hwmon: (core) Fix double-free in __hwmon_device_register()
	hwmon: (core) Do not use device managed functions for memory allocations
	Input: keyspan-remote - fix control-message timeouts
	ARM: 8950/1: ftrace/recordmcount: filter relocation types
	mmc: tegra: fix SDR50 tuning override
	mmc: sdhci: fix minimum clock rate for v3 controller
	Input: sur40 - fix interface sanity checks
	Input: gtco - fix endpoint sanity check
	Input: aiptek - fix endpoint sanity check
	Input: pegasus_notetaker - fix endpoint sanity check
	Input: sun4i-ts - add a check for devm_thermal_zone_of_sensor_register
	hwmon: (nct7802) Fix voltage limits to wrong registers
	scsi: RDMA/isert: Fix a recently introduced regression related to logout
	tracing: xen: Ordered comparison of function pointers
	do_last(): fetch directory ->i_mode and ->i_uid before it's too late
	Documentation: Document arm64 kpti control
	arm64: kpti: Whitelist Cortex-A CPUs that don't implement the CSV3 field
	coresight: etb10: Do not call smp_processor_id from preemptible
	coresight: tmc-etf: Do not call smp_processor_id from preemptible
	libertas: Fix two buffer overflows at parsing bss descriptor
	bcache: silence static checker warning
	scsi: iscsi: Avoid potential deadlock in iscsi_if_rx func
	md: Avoid namespace collision with bitmap API
	bitmap: Add bitmap_alloc(), bitmap_zalloc() and bitmap_free()
	netfilter: ipset: use bitmap infrastructure completely
	net/x25: fix nonblocking connect
	Linux 4.9.212

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I2e83a05c5f119a7467a4d6984045d45d0c06b764
2020-01-29 10:47:55 +01:00
Eric W. Biederman
a371457c19 signal: Allow cifs and drbd to receive their terminating signals
[ Upstream commit 33da8e7c814f77310250bb54a9db36a44c5de784 ]

My recent to change to only use force_sig for a synchronous events
wound up breaking signal reception cifs and drbd.  I had overlooked
the fact that by default kthreads start out with all signals set to
SIG_IGN.  So a change I thought was safe turned out to have made it
impossible for those kernel thread to catch their signals.

Reverting the work on force_sig is a bad idea because what the code
was doing was very much a misuse of force_sig.  As the way force_sig
ultimately allowed the signal to happen was to change the signal
handler to SIG_DFL.  Which after the first signal will allow userspace
to send signals to these kernel threads.  At least for
wake_ack_receiver in drbd that does not appear actively wrong.

So correct this problem by adding allow_kernel_signal that will allow
signals whose siginfo reports they were sent by the kernel through,
but will not allow userspace generated signals, and update cifs and
drbd to call allow_kernel_signal in an appropriate place so that their
thread can receive this signal.

Fixing things this way ensures that userspace won't be able to send
signals and cause problems, that it is clear which signals the
threads are expecting to receive, and it guarantees that nothing
else in the system will be affected.

This change was partly inspired by similar cifs and drbd patches that
added allow_signal.

Reported-by: ronnie sahlberg <ronniesahlberg@gmail.com>
Reported-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Tested-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Cc: Steve French <smfrench@gmail.com>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: David Laight <David.Laight@ACULAB.COM>
Fixes: 247bc9470b1e ("cifs: fix rmmod regression in cifs.ko caused by force_sig changes")
Fixes: 72abe3bcf091 ("signal/cifs: Fix cifs_put_tcp_session to call send_sig instead of force_sig")
Fixes: fee109901f39 ("signal/drbd: Use send_sig not force_sig")
Fixes: 3cf5d076fb4d ("signal: Remove task parameter from force_sig")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-29 10:24:30 +01:00
Greg Kroah-Hartman
13ff5130ff This is the 4.9.203 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl3blr8ACgkQONu9yGCS
 aT5BrQ//bhVt4Fy5NPJ/QlzpslU7XrwTYtGH0foNbUVvUeeJjiTYZj4XrRDE1VhC
 dQJNYRs8wqxG1EhNV4ydpCyLZgG+iLFW0pJc+k47WL+KJAChN0gQT0lGRkZ90GUJ
 1+FAM9/QJ5T5Z5T3sMfigp+nUDDoWgWrXl4RBxxbROmYu+xgOOa5agWxqzJDh3CU
 bjMF9heQEbpK26NyCqpaumL/nBaj86c/Qspq9PHeHwQDF3Sv+XEDgx8S94t8Zc1K
 DJojA3Nyl87FQOpdR/1UFCjge7GcdMP2xozRYGzMXqUwNCHjm6crKoM2EaGqHf2a
 HrFDUKL+tC6pE0YrPnrsaQkEFauZEB4Zm9zegwuIqxEovM3o+VkjiTi4XI4VdrZT
 t5VVE2tirlvNEAIN1BSbMqZnlE+c+8PpObABLhQtpmNAJP+PNXVeze600uissLbp
 grAU7hVvC89uzhPTbfhUyAginAAlqftDaO/YvdPlHXYCv4bLGcF1PYyEH2WfzAPp
 fpFJwyVE5tUev5WT4Xm1tAzK5EsW0aUZ69y0+RqwC1mKXTvcHQy9A0u+5ccmQTl9
 KSoR2Z/y2qvzfUAB0umAsXQohLUhZYtE4cCyj1us0WDy/0DI3HiJK5rDmJElM3DQ
 4PcGN4QtH2PZUogtPaHdwOnL9WS7kwFv+RKDdV3Mjw8bTZPiZg0=
 =TWjb
 -----END PGP SIGNATURE-----

Merge 4.9.203 into android-4.9-q

Changes in 4.9.203
	ax88172a: fix information leak on short answers
	slip: Fix memory leak in slip_open error path
	ALSA: usb-audio: Fix missing error check at mixer resolution test
	ALSA: usb-audio: not submit urb for stopped endpoint
	Input: ff-memless - kill timer in destroy()
	Input: synaptics-rmi4 - fix video buffer size
	Input: synaptics-rmi4 - clear IRQ enables for F54
	Input: synaptics-rmi4 - destroy F54 poller workqueue when removing
	IB/hfi1: Ensure full Gen3 speed in a Gen4 system
	ecryptfs_lookup_interpose(): lower_dentry->d_inode is not stable
	ecryptfs_lookup_interpose(): lower_dentry->d_parent is not stable either
	iommu/vt-d: Fix QI_DEV_IOTLB_PFSID and QI_DEV_EIOTLB_PFSID macros
	mm: memcg: switch to css_tryget() in get_mem_cgroup_from_mm()
	mm: hugetlb: switch to css_tryget() in hugetlb_cgroup_charge_cgroup()
	mmc: sdhci-of-at91: fix quirk2 overwrite
	ath10k: fix kernel panic by moving pci flush after napi_disable
	iio: dac: mcp4922: fix error handling in mcp4922_write_raw
	ALSA: pcm: signedness bug in snd_pcm_plug_alloc()
	arm64: dts: tegra210-p2180: Correct sdmmc4 vqmmc-supply
	ARM: dts: at91/trivial: Fix USART1 definition for at91sam9g45
	cfg80211: Avoid regulatory restore when COUNTRY_IE_IGNORE is set
	ALSA: seq: Do error checks at creating system ports
	ath9k: fix tx99 with monitor mode interface
	gfs2: Don't set GFS2_RDF_UPTODATE when the lvb is updated
	ASoC: dpcm: Properly initialise hw->rate_max
	MIPS: BCM47XX: Enable USB power on Netgear WNDR3400v3
	ARM: dts: exynos: Fix sound in Snow-rev5 Chromebook
	ARM: dts: exynos: Fix regulators configuration on Peach Pi/Pit Chromebooks
	i40e: use correct length for strncpy
	i40e: hold the rtnl lock on clearing interrupt scheme
	i40e: Prevent deleting MAC address from VF when set by PF
	IB/rxe: fixes for rdma read retry
	iwlwifi: mvm: avoid sending too many BARs
	ARM: dts: pxa: fix power i2c base address
	rtl8187: Fix warning generated when strncpy() destination length matches the sixe argument
	net: lan78xx: Bail out if lan78xx_get_endpoints fails
	ASoC: sgtl5000: avoid division by zero if lo_vag is zero
	ARM: dts: exynos: Disable pull control for S5M8767 PMIC
	ath10k: wmi: disable softirq's while calling ieee80211_rx
	mips: txx9: fix iounmap related issue
	ASoC: Intel: hdac_hdmi: Limit sampling rates at dai creation
	of: make PowerMac cache node search conditional on CONFIG_PPC_PMAC
	ARM: dts: omap3-gta04: give spi_lcd node a label so that we can overwrite in other DTS files
	ARM: dts: omap3-gta04: fixes for tvout / venc
	ARM: dts: omap3-gta04: tvout: enable as display1 alias
	ARM: dts: omap3-gta04: fix touchscreen tsc2007
	ARM: dts: omap3-gta04: make NAND partitions compatible with recent U-Boot
	ARM: dts: omap3-gta04: keep vpll2 always on
	dmaengine: dma-jz4780: Don't depend on MACH_JZ4780
	dmaengine: dma-jz4780: Further residue status fix
	ath9k: add back support for using active monitor interfaces for tx99
	signal: Always ignore SIGKILL and SIGSTOP sent to the global init
	signal: Properly deliver SIGILL from uprobes
	signal: Properly deliver SIGSEGV from x86 uprobes
	f2fs: fix memory leak of percpu counter in fill_super()
	scsi: sym53c8xx: fix NULL pointer dereference panic in sym_int_sir()
	ARM: imx6: register pm_power_off handler if "fsl,pmic-stby-poweroff" is set
	scsi: pm80xx: Corrected dma_unmap_sg() parameter
	scsi: pm80xx: Fixed system hang issue during kexec boot
	kprobes: Don't call BUG_ON() if there is a kprobe in use on free list
	nvmem: core: return error code instead of NULL from nvmem_device_get
	media: fix: media: pci: meye: validate offset to avoid arbitrary access
	media: dvb: fix compat ioctl translation
	ALSA: intel8x0m: Register irq handler after register initializations
	pinctrl: at91-pio4: fix has_config check in atmel_pctl_dt_subnode_to_map()
	llc: avoid blocking in llc_sap_close()
	ARM: dts: qcom: ipq4019: fix cpu0's qcom,saw2 reg value
	powerpc/vdso: Correct call frame information
	ARM: dts: socfpga: Fix I2C bus unit-address error
	pinctrl: at91: don't use the same irqchip with multiple gpiochips
	cxgb4: Fix endianness issue in t4_fwcache()
	power: supply: ab8500_fg: silence uninitialized variable warnings
	power: reset: at91-poweroff: do not procede if at91_shdwc is allocated
	power: supply: max8998-charger: Fix platform data retrieval
	component: fix loop condition to call unbind() if bind() fails
	kernfs: Fix range checks in kernfs_get_target_path
	ip_gre: fix parsing gre header in ipgre_err
	ARM: dts: rockchip: Fix erroneous SPI bus dtc warnings on rk3036
	ath9k: Fix a locking bug in ath9k_add_interface()
	s390/qeth: invoke softirqs after napi_schedule()
	PCI/ACPI: Correct error message for ASPM disabling
	serial: mxs-auart: Fix potential infinite loop
	powerpc/iommu: Avoid derefence before pointer check
	powerpc/64s/hash: Fix stab_rr off by one initialization
	powerpc/pseries: Disable CPU hotplug across migrations
	RDMA/i40iw: Fix incorrect iterator type
	libfdt: Ensure INT_MAX is defined in libfdt_env.h
	power: supply: twl4030_charger: fix charging current out-of-bounds
	power: supply: twl4030_charger: disable eoc interrupt on linear charge
	net: toshiba: fix return type of ndo_start_xmit function
	net: xilinx: fix return type of ndo_start_xmit function
	net: broadcom: fix return type of ndo_start_xmit function
	net: amd: fix return type of ndo_start_xmit function
	usb: chipidea: imx: enable OTG overcurrent in case USB subsystem is already started
	usb: chipidea: Fix otg event handler
	mlxsw: spectrum: Init shaper for TCs 8..15
	ARM: dts: am335x-evm: fix number of cpsw
	f2fs: fix to recover inode's uid/gid during POR
	ARM: dts: ux500: Correct SCU unit address
	ARM: dts: ux500: Fix LCDA clock line muxing
	ARM: dts: ste: Fix SPI controller node names
	spi: pic32: Use proper enum in dmaengine_prep_slave_rg
	cpufeature: avoid warning when compiling with clang
	ARM: dts: marvell: Fix SPI and I2C bus warnings
	bnx2x: Ignore bandwidth attention in single function mode
	net: micrel: fix return type of ndo_start_xmit function
	x86/CPU: Use correct macros for Cyrix calls
	MIPS: kexec: Relax memory restriction
	media: pci: ivtv: Fix a sleep-in-atomic-context bug in ivtv_yuv_init()
	media: au0828: Fix incorrect error messages
	media: davinci: Fix implicit enum conversion warning
	usb: gadget: uvc: configfs: Drop leaked references to config items
	usb: gadget: uvc: configfs: Prevent format changes after linking header
	phy: phy-twl4030-usb: fix denied runtime access
	usb: gadget: uvc: Factor out video USB request queueing
	usb: gadget: uvc: Only halt video streaming endpoint in bulk mode
	coresight: Fix handling of sinks
	coresight: etm4x: Configure EL2 exception level when kernel is running in HYP
	coresight: tmc: Fix byte-address alignment for RRP
	misc: kgdbts: Fix restrict error
	misc: genwqe: should return proper error value.
	vfio/pci: Fix potential memory leak in vfio_msi_cap_len
	vfio/pci: Mask buggy SR-IOV VF INTx support
	scsi: libsas: always unregister the old device if going to discover new
	ARM: dts: tegra30: fix xcvr-setup-use-fuses
	ARM: tegra: apalis_t30: fix mmc1 cmd pull-up
	ARM: dts: paz00: fix wakeup gpio keycode
	net: smsc: fix return type of ndo_start_xmit function
	EDAC: Raise the maximum number of memory controllers
	ARM: dts: realview: Fix SPI controller node names
	Bluetooth: L2CAP: Detect if remote is not able to use the whole MPS
	crypto: s5p-sss: Fix Fix argument list alignment
	crypto: fix a memory leak in rsa-kcs1pad's encryption mode
	scsi: NCR5380: Clear all unissued commands on host reset
	scsi: NCR5380: Use DRIVER_SENSE to indicate valid sense data
	scsi: NCR5380: Check for invalid reselection target
	scsi: NCR5380: Don't clear busy flag when abort fails
	scsi: NCR5380: Don't call dsprintk() following reselection interrupt
	scsi: NCR5380: Handle BUS FREE during reselection
	arm64: dts: amd: Fix SPI bus warnings
	arm64: dts: lg: Fix SPI controller node names
	ARM: dts: lpc32xx: Fix SPI controller node names
	usb: xhci-mtk: fix ISOC error when interval is zero
	fuse: use READ_ONCE on congestion_threshold and max_background
	IB/iser: Fix possible NULL deref at iser_inv_desc()
	memfd: Use radix_tree_deref_slot_protected to avoid the warning.
	slcan: Fix memory leak in error path
	net: cdc_ncm: Signedness bug in cdc_ncm_set_dgram_size()
	x86/atomic: Fix smp_mb__{before,after}_atomic()
	kprobes/x86: Prohibit probing on exception masking instructions
	uprobes/x86: Prohibit probing on MOV SS instruction
	fbdev: Ditch fb_edid_add_monspecs
	block: introduce blk_rq_is_passthrough
	libata: have ata_scsi_rw_xlat() fail invalid passthrough requests
	net: ovs: fix return type of ndo_start_xmit function
	net: xen-netback: fix return type of ndo_start_xmit function
	ARM: dts: omap5: enable OTG role for DWC3 controller
	f2fs: return correct errno in f2fs_gc
	SUNRPC: Fix priority queue fairness
	kvm: arm/arm64: Fix stage2_flush_memslot for 4 level page table
	arm64/numa: Report correct memblock range for the dummy node
	ath10k: fix vdev-start timeout on error
	ata: ahci_brcm: Allow using driver or DSL SoCs
	ath9k: fix reporting calculated new FFT upper max
	usb: gadget: udc: fotg210-udc: Fix a sleep-in-atomic-context bug in fotg210_get_status()
	nl80211: Fix a GET_KEY reply attribute
	dmaengine: ep93xx: Return proper enum in ep93xx_dma_chan_direction
	dmaengine: timb_dma: Use proper enum in td_prep_slave_sg
	mei: samples: fix a signedness bug in amt_host_if_call()
	cxgb4: Use proper enum in cxgb4_dcb_handle_fw_update
	cxgb4: Use proper enum in IEEE_FAUX_SYNC
	powerpc/pseries: Fix DTL buffer registration
	powerpc/pseries: Fix how we iterate over the DTL entries
	mtd: rawnand: sh_flctl: Use proper enum for flctl_dma_fifo0_transfer
	ixgbe: Fix crash with VFs and flow director on interface flap
	IB/mthca: Fix error return code in __mthca_init_one()
	IB/mlx4: Avoid implicit enumerated type conversion
	ACPICA: Never run _REG on system_memory and system_IO
	ata: ep93xx: Use proper enums for directions
	media: pxa_camera: Fix check for pdev->dev.of_node
	ALSA: hda/sigmatel - Disable automute for Elo VuPoint
	KVM: PPC: Book3S PR: Exiting split hack mode needs to fixup both PC and LR
	USB: serial: cypress_m8: fix interrupt-out transfer length
	mtd: physmap_of: Release resources on error
	cpu/SMT: State SMT is disabled even with nosmt and without "=force"
	brcmfmac: reduce timeout for action frame scan
	brcmfmac: fix full timeout waiting for action frame on-channel tx
	clk: samsung: Use clk_hw API for calling clk framework from clk notifiers
	i2c: brcmstb: Allow enabling the driver on DSL SoCs
	NFSv4.x: fix lock recovery during delegation recall
	dmaengine: ioat: fix prototype of ioat_enumerate_channels
	Input: st1232 - set INPUT_PROP_DIRECT property
	Input: silead - try firmware reload after unsuccessful resume
	x86/olpc: Fix build error with CONFIG_MFD_CS5535=m
	crypto: mxs-dcp - Fix SHA null hashes and output length
	crypto: mxs-dcp - Fix AES issues
	ACPI / SBS: Fix rare oops when removing modules
	iwlwifi: mvm: don't send keys when entering D3
	fbdev: sbuslib: use checked version of put_user()
	fbdev: sbuslib: integer overflow in sbusfb_ioctl_helper()
	reset: Fix potential use-after-free in __of_reset_control_get()
	bcache: recal cached_dev_sectors on detach
	s390/kasan: avoid vdso instrumentation
	proc/vmcore: Fix i386 build error of missing copy_oldmem_page_encrypted()
	backlight: lm3639: Unconditionally call led_classdev_unregister
	mfd: ti_am335x_tscadc: Keep ADC interface on if child is wakeup capable
	printk: Give error on attempt to set log buffer length to over 2G
	media: isif: fix a NULL pointer dereference bug
	GFS2: Flush the GFS2 delete workqueue before stopping the kernel threads
	media: cx231xx: fix potential sign-extension overflow on large shift
	x86/kexec: Correct KEXEC_BACKUP_SRC_END off-by-one error
	gpio: syscon: Fix possible NULL ptr usage
	spi: spidev: Fix OF tree warning logic
	ARM: 8802/1: Call syscall_trace_exit even when system call skipped
	orangefs: rate limit the client not running info message
	hwmon: (pwm-fan) Silence error on probe deferral
	hwmon: (ina3221) Fix INA3221_CONFIG_MODE macros
	misc: cxl: Fix possible null pointer dereference
	mac80211: minstrel: fix CCK rate group streams value
	spi: rockchip: initialize dma_slave_config properly
	ARM: dts: omap5: Fix dual-role mode on Super-Speed port
	arm64: uaccess: Ensure PAN is re-enabled after unhandled uaccess fault
	Linux 4.9.203

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2019-11-25 10:11:00 +01:00
Eric W. Biederman
1f7d8a28b6 signal: Always ignore SIGKILL and SIGSTOP sent to the global init
[ Upstream commit 86989c41b5ea08776c450cb759592532314a4ed6 ]

If the first process started (aka /sbin/init) receives a SIGKILL it
will panic the system if it is delivered.  Making the system unusable
and undebugable.  It isn't much better if the first process started
receives SIGSTOP.

So always ignore SIGSTOP and SIGKILL sent to init.

This is done in a separate clause in sig_task_ignored as force_sig_info
can clear SIG_UNKILLABLE and this protection should work even then.

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-25 09:52:12 +01:00
Joel Fernandes (Google)
af1070fbf2 UPSTREAM: pidfd: add polling support
This patch adds polling support to pidfd.

Android low memory killer (LMK) needs to know when a process dies once
it is sent the kill signal. It does so by checking for the existence of
/proc/pid which is both racy and slow. For example, if a PID is reused
between when LMK sends a kill signal and checks for existence of the
PID, since the wrong PID is now possibly checked for existence.
Using the polling support, LMK will be able to get notified when a process
exists in race-free and fast way, and allows the LMK to do other things
(such as by polling on other fds) while awaiting the process being killed
to die.

For notification to polling processes, we follow the same existing
mechanism in the kernel used when the parent of the task group is to be
notified of a child's death (do_notify_parent). This is precisely when the
tasks waiting on a poll of pidfd are also awakened in this patch.

We have decided to include the waitqueue in struct pid for the following
reasons:
1. The wait queue has to survive for the lifetime of the poll. Including
   it in task_struct would not be option in this case because the task can
   be reaped and destroyed before the poll returns.

2. By including the struct pid for the waitqueue means that during
   de_thread(), the new thread group leader automatically gets the new
   waitqueue/pid even though its task_struct is different.

Appropriate test cases are added in the second patch to provide coverage of
all the cases the patch is handling.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Daniel Colascione <dancol@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Tim Murray <timmurray@google.com>
Cc: Jonathan Kowalski <bl0pbl33p@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Kees Cook <keescook@chromium.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: kernel-team@android.com
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Co-developed-by: Daniel Colascione <dancol@google.com>
Signed-off-by: Daniel Colascione <dancol@google.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Christian Brauner <christian@brauner.io>

(cherry picked from commit b53b0b9d9a613c418057f6cb921c2f40a6f78c24)

Bug: 135608568
Test: test program using syscall(__NR_sys_pidfd_open,..) and poll()
Change-Id: I02f259d2875bec46b198d580edfbb067f077084e
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2019-09-03 13:45:29 -07:00
Christian Brauner
ac937bb3fb UPSTREAM: signal: improve comments
Improve the comments for pidfd_send_signal().
First, the comment still referred to a file descriptor for a process as a
"task file descriptor" which stems from way back at the beginning of the
discussion. Replace this with "pidfd" for consistency.
Second, the wording for the explanation of the arguments to the syscall
was a bit inconsistent, e.g. some used the past tense some used present
tense. Make the wording more consistent.

Signed-off-by: Christian Brauner <christian@brauner.io>

(cherry picked from commit c732327f04a3818f35fa97d07b1d64d31b691d78)

Bug: 135608568
Test: test program using syscall(__NR_sys_pidfd_open,..) and poll()
Change-Id: I06c6bdd1dddaeb8ac75a78dd21f9cdd0dc139a4c
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2019-09-03 13:45:19 -07:00
Christian Brauner
b3ae59851f BACKPORT: signal: support CLONE_PIDFD with pidfd_send_signal
Let pidfd_send_signal() use pidfds retrieved via CLONE_PIDFD.  With this
patch pidfd_send_signal() becomes independent of procfs.  This fullfils
the request made when we merged the pidfd_send_signal() patchset.  The
pidfd_send_signal() syscall is now always available allowing for it to
be used by users without procfs mounted or even users without procfs
support compiled into the kernel.

Signed-off-by: Christian Brauner <christian@brauner.io>
Co-developed-by: Jann Horn <jannh@google.com>
Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: David Howells <dhowells@redhat.com>
Cc: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
Cc: Andy Lutomirsky <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>

(cherry picked from commit 2151ad1b067275730de1b38c7257478cae47d29e)

Conflicts:
        kernel/sys_ni.c

(1. Replaced COND_SYSCALL with cond_syscall.)

Bug: 135608568
Test: test program using syscall(__NR_sys_pidfd_open,..) and poll()
Change-Id: I621fe6547397e0e68c560d7da60ef7715deb290c
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2019-09-03 13:44:59 -07:00
Christian Brauner
68defbc4c8 UPSTREAM: signal: use fdget() since we don't allow O_PATH
As stated in the original commit for pidfd_send_signal() we don't allow
to signal processes through O_PATH file descriptors since it is
semantically equivalent to a write on the pidfd.

We already correctly error out right now and return EBADF if an O_PATH
fd is passed.  This is because we use file->f_op to detect whether a
pidfd is passed and O_PATH fds have their file->f_op set to empty_fops
in do_dentry_open() and thus fail the test.

Thus, there is no regression.  It's just semantically correct to use
fdget() and return an error right from there instead of taking a
reference and returning an error later.

Signed-off-by: Christian Brauner <christian@brauner.io>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jann Horn <jann@thejh.net>
Cc: David Howells <dhowells@redhat.com>
Cc: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
Cc: Andy Lutomirsky <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

(cherry picked from commit 738a7832d21e3d911fcddab98ce260b79010b461)

Bug: 135608568
Test: test program using syscall(__NR_pidfd_send_signal,..) to send SIGKILL
Change-Id: Id52eaadf9da371fb2d9caae4df49627760de7229
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2019-09-03 13:44:25 -07:00
Jann Horn
f511d49e01 UPSTREAM: signal: don't silently convert SI_USER signals to non-current pidfd
The current sys_pidfd_send_signal() silently turns signals with explicit
SI_USER context that are sent to non-current tasks into signals with
kernel-generated siginfo.
This is unlike do_rt_sigqueueinfo(), which returns -EPERM in this case.
If a user actually wants to send a signal with kernel-provided siginfo,
they can do that with pidfd_send_signal(pidfd, sig, NULL, 0); so allowing
this case is unnecessary.

Instead of silently replacing the siginfo, just bail out with an error;
this is consistent with other interfaces and avoids special-casing behavior
based on security checks.

Fixes: 3eb39f47934f ("signal: add pidfd_send_signal() syscall")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Christian Brauner <christian@brauner.io>

(cherry picked from commit 556a888a14afe27164191955618990fb3ccc9aad)

Bug: 135608568
Test: test program using syscall(__NR_pidfd_send_signal,..) to send SIGKILL
Change-Id: I493af671b82c43bff1425ee24550d2fb9aa6d961
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2019-09-03 13:44:15 -07:00
Christian Brauner
cf9f829523 BACKPORT: signal: add pidfd_send_signal() syscall
The kill() syscall operates on process identifiers (pid). After a process
has exited its pid can be reused by another process. If a caller sends a
signal to a reused pid it will end up signaling the wrong process. This
issue has often surfaced and there has been a push to address this problem [1].

This patch uses file descriptors (fd) from proc/<pid> as stable handles on
struct pid. Even if a pid is recycled the handle will not change. The fd
can be used to send signals to the process it refers to.
Thus, the new syscall pidfd_send_signal() is introduced to solve this
problem. Instead of pids it operates on process fds (pidfd).

/* prototype and argument /*
long pidfd_send_signal(int pidfd, int sig, siginfo_t *info, unsigned int flags);

/* syscall number 424 */
The syscall number was chosen to be 424 to align with Arnd's rework in his
y2038 to minimize merge conflicts (cf. [25]).

In addition to the pidfd and signal argument it takes an additional
siginfo_t and flags argument. If the siginfo_t argument is NULL then
pidfd_send_signal() is equivalent to kill(<positive-pid>, <signal>). If it
is not NULL pidfd_send_signal() is equivalent to rt_sigqueueinfo().
The flags argument is added to allow for future extensions of this syscall.
It currently needs to be passed as 0. Failing to do so will cause EINVAL.

/* pidfd_send_signal() replaces multiple pid-based syscalls */
The pidfd_send_signal() syscall currently takes on the job of
rt_sigqueueinfo(2) and parts of the functionality of kill(2), Namely, when a
positive pid is passed to kill(2). It will however be possible to also
replace tgkill(2) and rt_tgsigqueueinfo(2) if this syscall is extended.

/* sending signals to threads (tid) and process groups (pgid) */
Specifically, the pidfd_send_signal() syscall does currently not operate on
process groups or threads. This is left for future extensions.
In order to extend the syscall to allow sending signal to threads and
process groups appropriately named flags (e.g. PIDFD_TYPE_PGID, and
PIDFD_TYPE_TID) should be added. This implies that the flags argument will
determine what is signaled and not the file descriptor itself. Put in other
words, grouping in this api is a property of the flags argument not a
property of the file descriptor (cf. [13]). Clarification for this has been
requested by Eric (cf. [19]).
When appropriate extensions through the flags argument are added then
pidfd_send_signal() can additionally replace the part of kill(2) which
operates on process groups as well as the tgkill(2) and
rt_tgsigqueueinfo(2) syscalls.
How such an extension could be implemented has been very roughly sketched
in [14], [15], and [16]. However, this should not be taken as a commitment
to a particular implementation. There might be better ways to do it.
Right now this is intentionally left out to keep this patchset as simple as
possible (cf. [4]).

/* naming */
The syscall had various names throughout iterations of this patchset:
- procfd_signal()
- procfd_send_signal()
- taskfd_send_signal()
In the last round of reviews it was pointed out that given that if the
flags argument decides the scope of the signal instead of different types
of fds it might make sense to either settle for "procfd_" or "pidfd_" as
prefix. The community was willing to accept either (cf. [17] and [18]).
Given that one developer expressed strong preference for the "pidfd_"
prefix (cf. [13]) and with other developers less opinionated about the name
we should settle for "pidfd_" to avoid further bikeshedding.

The  "_send_signal" suffix was chosen to reflect the fact that the syscall
takes on the job of multiple syscalls. It is therefore intentional that the
name is not reminiscent of neither kill(2) nor rt_sigqueueinfo(2). Not the
fomer because it might imply that pidfd_send_signal() is a replacement for
kill(2), and not the latter because it is a hassle to remember the correct
spelling - especially for non-native speakers - and because it is not
descriptive enough of what the syscall actually does. The name
"pidfd_send_signal" makes it very clear that its job is to send signals.

/* zombies */
Zombies can be signaled just as any other process. No special error will be
reported since a zombie state is an unreliable state (cf. [3]). However,
this can be added as an extension through the @flags argument if the need
ever arises.

/* cross-namespace signals */
The patch currently enforces that the signaler and signalee either are in
the same pid namespace or that the signaler's pid namespace is an ancestor
of the signalee's pid namespace. This is done for the sake of simplicity
and because it is unclear to what values certain members of struct
siginfo_t would need to be set to (cf. [5], [6]).

/* compat syscalls */
It became clear that we would like to avoid adding compat syscalls
(cf. [7]).  The compat syscall handling is now done in kernel/signal.c
itself by adding __copy_siginfo_from_user_generic() which lets us avoid
compat syscalls (cf. [8]). It should be noted that the addition of
__copy_siginfo_from_user_any() is caused by a bug in the original
implementation of rt_sigqueueinfo(2) (cf. 12).
With upcoming rework for syscall handling things might improve
significantly (cf. [11]) and __copy_siginfo_from_user_any() will not gain
any additional callers.

/* testing */
This patch was tested on x64 and x86.

/* userspace usage */
An asciinema recording for the basic functionality can be found under [9].
With this patch a process can be killed via:

 #define _GNU_SOURCE
 #include <errno.h>
 #include <fcntl.h>
 #include <signal.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <sys/stat.h>
 #include <sys/syscall.h>
 #include <sys/types.h>
 #include <unistd.h>

 static inline int do_pidfd_send_signal(int pidfd, int sig, siginfo_t *info,
                                         unsigned int flags)
 {
 #ifdef __NR_pidfd_send_signal
         return syscall(__NR_pidfd_send_signal, pidfd, sig, info, flags);
 #else
         return -ENOSYS;
 #endif
 }

 int main(int argc, char *argv[])
 {
         int fd, ret, saved_errno, sig;

         if (argc < 3)
                 exit(EXIT_FAILURE);

         fd = open(argv[1], O_DIRECTORY | O_CLOEXEC);
         if (fd < 0) {
                 printf("%s - Failed to open \"%s\"\n", strerror(errno), argv[1]);
                 exit(EXIT_FAILURE);
         }

         sig = atoi(argv[2]);

         printf("Sending signal %d to process %s\n", sig, argv[1]);
         ret = do_pidfd_send_signal(fd, sig, NULL, 0);

         saved_errno = errno;
         close(fd);
         errno = saved_errno;

         if (ret < 0) {
                 printf("%s - Failed to send signal %d to process %s\n",
                        strerror(errno), sig, argv[1]);
                 exit(EXIT_FAILURE);
         }

         exit(EXIT_SUCCESS);
 }

/* Q&A
 * Given that it seems the same questions get asked again by people who are
 * late to the party it makes sense to add a Q&A section to the commit
 * message so it's hopefully easier to avoid duplicate threads.
 *
 * For the sake of progress please consider these arguments settled unless
 * there is a new point that desperately needs to be addressed. Please make
 * sure to check the links to the threads in this commit message whether
 * this has not already been covered.
 */
Q-01: (Florian Weimer [20], Andrew Morton [21])
      What happens when the target process has exited?
A-01: Sending the signal will fail with ESRCH (cf. [22]).

Q-02:  (Andrew Morton [21])
       Is the task_struct pinned by the fd?
A-02:  No. A reference to struct pid is kept. struct pid - as far as I
       understand - was created exactly for the reason to not require to
       pin struct task_struct (cf. [22]).

Q-03: (Andrew Morton [21])
      Does the entire procfs directory remain visible? Just one entry
      within it?
A-03: The same thing that happens right now when you hold a file descriptor
      to /proc/<pid> open (cf. [22]).

Q-04: (Andrew Morton [21])
      Does the pid remain reserved?
A-04: No. This patchset guarantees a stable handle not that pids are not
      recycled (cf. [22]).

Q-05: (Andrew Morton [21])
      Do attempts to signal that fd return errors?
A-05: See {Q,A}-01.

Q-06: (Andrew Morton [22])
      Is there a cleaner way of obtaining the fd? Another syscall perhaps.
A-06: Userspace can already trivially retrieve file descriptors from procfs
      so this is something that we will need to support anyway. Hence,
      there's no immediate need to add another syscalls just to make
      pidfd_send_signal() not dependent on the presence of procfs. However,
      adding a syscalls to get such file descriptors is planned for a
      future patchset (cf. [22]).

Q-07: (Andrew Morton [21] and others)
      This fd-for-a-process sounds like a handy thing and people may well
      think up other uses for it in the future, probably unrelated to
      signals. Are the code and the interface designed to permit such
      future applications?
A-07: Yes (cf. [22]).

Q-08: (Andrew Morton [21] and others)
      Now I think about it, why a new syscall? This thing is looking
      rather like an ioctl?
A-08: This has been extensively discussed. It was agreed that a syscall is
      preferred for a variety or reasons. Here are just a few taken from
      prior threads. Syscalls are safer than ioctl()s especially when
      signaling to fds. Processes are a core kernel concept so a syscall
      seems more appropriate. The layout of the syscall with its four
      arguments would require the addition of a custom struct for the
      ioctl() thereby causing at least the same amount or even more
      complexity for userspace than a simple syscall. The new syscall will
      replace multiple other pid-based syscalls (see description above).
      The file-descriptors-for-processes concept introduced with this
      syscall will be extended with other syscalls in the future. See also
      [22], [23] and various other threads already linked in here.

Q-09: (Florian Weimer [24])
      What happens if you use the new interface with an O_PATH descriptor?
A-09:
      pidfds opened as O_PATH fds cannot be used to send signals to a
      process (cf. [2]). Signaling processes through pidfds is the
      equivalent of writing to a file. Thus, this is not an operation that
      operates "purely at the file descriptor level" as required by the
      open(2) manpage. See also [4].

/* References */
[1]:  https://lore.kernel.org/lkml/20181029221037.87724-1-dancol@google.com/
[2]:  https://lore.kernel.org/lkml/874lbtjvtd.fsf@oldenburg2.str.redhat.com/
[3]:  https://lore.kernel.org/lkml/20181204132604.aspfupwjgjx6fhva@brauner.io/
[4]:  https://lore.kernel.org/lkml/20181203180224.fkvw4kajtbvru2ku@brauner.io/
[5]:  https://lore.kernel.org/lkml/20181121213946.GA10795@mail.hallyn.com/
[6]:  https://lore.kernel.org/lkml/20181120103111.etlqp7zop34v6nv4@brauner.io/
[7]:  https://lore.kernel.org/lkml/36323361-90BD-41AF-AB5B-EE0D7BA02C21@amacapital.net/
[8]:  https://lore.kernel.org/lkml/87tvjxp8pc.fsf@xmission.com/
[9]:  https://asciinema.org/a/IQjuCHew6bnq1cr78yuMv16cy
[11]: https://lore.kernel.org/lkml/F53D6D38-3521-4C20-9034-5AF447DF62FF@amacapital.net/
[12]: https://lore.kernel.org/lkml/87zhtjn8ck.fsf@xmission.com/
[13]: https://lore.kernel.org/lkml/871s6u9z6u.fsf@xmission.com/
[14]: https://lore.kernel.org/lkml/20181206231742.xxi4ghn24z4h2qki@brauner.io/
[15]: https://lore.kernel.org/lkml/20181207003124.GA11160@mail.hallyn.com/
[16]: https://lore.kernel.org/lkml/20181207015423.4miorx43l3qhppfz@brauner.io/
[17]: https://lore.kernel.org/lkml/CAGXu5jL8PciZAXvOvCeCU3wKUEB_dU-O3q0tDw4uB_ojMvDEew@mail.gmail.com/
[18]: https://lore.kernel.org/lkml/20181206222746.GB9224@mail.hallyn.com/
[19]: https://lore.kernel.org/lkml/20181208054059.19813-1-christian@brauner.io/
[20]: https://lore.kernel.org/lkml/8736rebl9s.fsf@oldenburg.str.redhat.com/
[21]: https://lore.kernel.org/lkml/20181228152012.dbf0508c2508138efc5f2bbe@linux-foundation.org/
[22]: https://lore.kernel.org/lkml/20181228233725.722tdfgijxcssg76@brauner.io/
[23]: https://lwn.net/Articles/773459/
[24]: https://lore.kernel.org/lkml/8736rebl9s.fsf@oldenburg.str.redhat.com/
[25]: https://lore.kernel.org/lkml/CAK8P3a0ej9NcJM8wXNPbcGUyOUZYX+VLoDFdbenW3s3114oQZw@mail.gmail.com/

Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Jann Horn <jannh@google.com>
Cc: Andy Lutomirsky <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Tycho Andersen <tycho@tycho.ws>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Howells <dhowells@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Serge Hallyn <serge@hallyn.com>
Acked-by: Aleksa Sarai <cyphar@cyphar.com>

(cherry picked from commit 3eb39f47934f9d5a3027fe00d906a45fe3a15fad)

Conflicts:
        arch/x86/entry/syscalls/syscall_32.tbl - trivial manual merge
        arch/x86/entry/syscalls/syscall_64.tbl - trivial manual merge
        include/linux/proc_fs.h - trivial manual merge
        include/linux/syscalls.h - trivial manual merge
        include/uapi/asm-generic/unistd.h - trivial manual merge
        kernel/signal.c - struct kernel_siginfo does not exist in 4.9
        kernel/sys_ni.c - cond_syscall is used instead of COND_SYSCALL
        arch/x86/entry/syscalls/syscall_32.tbl
        arch/x86/entry/syscalls/syscall_64.tbl

(1. manual merges because of 4.9 differences
 2. change prepare_kill_siginfo() to use struct siginfo instead of
kernel_siginfo
 3. exclude kill() changes to avoid struct kernel_siginfo usage
 4. exclude copy_siginfo_from_user_any() to avoid struct kernel_siginfo usage
 5. use copy_from_user() instead of copy_siginfo_from_user() in copy_siginfo_from_user_any()
 6. replaced COND_SYSCALL with cond_syscall
 7. Removed __ia32_sys_pidfd_send_signal in arch/x86/entry/syscalls/syscall_32.tbl.
 8. Replaced __x64_sys_pidfd_send_signal with sys_pidfd_send_signal in arch/x86/entry/syscalls/syscall_64.tbl.)

Bug: 135608568
Test: test program using syscall(__NR_pidfd_send_signal,..) to send SIGKILL
Change-Id: I00f1c618b2e9dbafae4d4113ad4d8a1a44b6957c
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2019-09-03 13:43:55 -07:00
Zhenliang Wei
9adcdd5c5c kernel/signal.c: trace_signal_deliver when signal_group_exit
commit 98af37d624ed8c83f1953b1b6b2f6866011fc064 upstream.

In the fixes commit, removing SIGKILL from each thread signal mask and
executing "goto fatal" directly will skip the call to
"trace_signal_deliver".  At this point, the delivery tracking of the
SIGKILL signal will be inaccurate.

Therefore, we need to add trace_signal_deliver before "goto fatal" after
executing sigdelset.

Note: SEND_SIG_NOINFO matches the fact that SIGKILL doesn't have any info.

Link: http://lkml.kernel.org/r/20190425025812.91424-1-weizhenliang@huawei.com
Fixes: cf43a757fd4944 ("signal: Restore the stop PTRACE_EVENT_EXIT")
Signed-off-by: Zhenliang Wei <weizhenliang@huawei.com>
Reviewed-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Ivan Delalande <colona@arista.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Deepa Dinamani <deepa.kernel@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-11 12:22:43 +02:00
Eric W. Biederman
aa74f266c0 signal: Restore the stop PTRACE_EVENT_EXIT
commit cf43a757fd49442bc38f76088b70c2299eed2c2f upstream.

In the middle of do_exit() there is there is a call
"ptrace_event(PTRACE_EVENT_EXIT, code);" That call places the process
in TACKED_TRACED aka "(TASK_WAKEKILL | __TASK_TRACED)" and waits for
for the debugger to release the task or SIGKILL to be delivered.

Skipping past dequeue_signal when we know a fatal signal has already
been delivered resulted in SIGKILL remaining pending and
TIF_SIGPENDING remaining set.  This in turn caused the
scheduler to not sleep in PTACE_EVENT_EXIT as it figured
a fatal signal was pending.  This also caused ptrace_freeze_traced
in ptrace_check_attach to fail because it left a per thread
SIGKILL pending which is what fatal_signal_pending tests for.

This difference in signal state caused strace to report
strace: Exit of unknown pid NNNNN ignored

Therefore update the signal handling state like dequeue_signal
would when removing a per thread SIGKILL, by removing SIGKILL
from the per thread signal mask and clearing TIF_SIGPENDING.

Acked-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Ivan Delalande <colona@arista.com>
Cc: stable@vger.kernel.org
Fixes: 35634ffa1751 ("signal: Always notice exiting tasks")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-20 10:18:33 +01:00
Eric W. Biederman
181f1f0db8 signal: Better detection of synchronous signals
commit 7146db3317c67b517258cb5e1b08af387da0618b upstream.

Recently syzkaller was able to create unkillablle processes by
creating a timer that is delivered as a thread local signal on SIGHUP,
and receiving SIGHUP SA_NODEFERER.  Ultimately causing a loop failing
to deliver SIGHUP but always trying.

When the stack overflows delivery of SIGHUP fails and force_sigsegv is
called.  Unfortunately because SIGSEGV is numerically higher than
SIGHUP next_signal tries again to deliver a SIGHUP.

From a quality of implementation standpoint attempting to deliver the
timer SIGHUP signal is wrong.  We should attempt to deliver the
synchronous SIGSEGV signal we just forced.

We can make that happening in a fairly straight forward manner by
instead of just looking at the signal number we also look at the
si_code.  In particular for exceptions (aka synchronous signals) the
si_code is always greater than 0.

That still has the potential to pick up a number of asynchronous
signals as in a few cases the same si_codes that are used
for synchronous signals are also used for asynchronous signals,
and SI_KERNEL is also included in the list of possible si_codes.

Still the heuristic is much better and timer signals are definitely
excluded.  Which is enough to prevent all known ways for someone
sending a process signals fast enough to cause unexpected and
arguably incorrect behavior.

Cc: stable@vger.kernel.org
Fixes: a27341cd5f ("Prioritize synchronous signals over 'normal' signals")
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-15 08:07:37 +01:00
Eric W. Biederman
39beaea03e signal: Always notice exiting tasks
commit 35634ffa1751b6efd8cf75010b509dcb0263e29b upstream.

Recently syzkaller was able to create unkillablle processes by
creating a timer that is delivered as a thread local signal on SIGHUP,
and receiving SIGHUP SA_NODEFERER.  Ultimately causing a loop
failing to deliver SIGHUP but always trying.

Upon examination it turns out part of the problem is actually most of
the solution.  Since 2.5 signal delivery has found all fatal signals,
marked the signal group for death, and queued SIGKILL in every threads
thread queue relying on signal->group_exit_code to preserve the
information of which was the actual fatal signal.

The conversion of all fatal signals to SIGKILL results in the
synchronous signal heuristic in next_signal kicking in and preferring
SIGHUP to SIGKILL.  Which is especially problematic as all
fatal signals have already been transformed into SIGKILL.

Instead of dequeueing signals and depending upon SIGKILL to
be the first signal dequeued, first test if the signal group
has already been marked for death.  This guarantees that
nothing in the signal queue can prevent a process that needs
to exit from exiting.

Cc: stable@vger.kernel.org
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Ref: ebf5ebe31d2c ("[PATCH] signal-fixes-2.5.59-A4")
History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-15 08:07:37 +01:00
Will Deacon
1e7066a454 signal: Introduce COMPAT_SIGMINSTKSZ for use in compat_sys_sigaltstack
commit 22839869f21ab3850fbbac9b425ccc4c0023926f upstream.

The sigaltstack(2) system call fails with -ENOMEM if the new alternative
signal stack is found to be smaller than SIGMINSTKSZ. On architectures
such as arm64, where the native value for SIGMINSTKSZ is larger than
the compat value, this can result in an unexpected error being reported
to a compat task. See, for example:

  https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=904385

This patch fixes the problem by extending do_sigaltstack to take the
minimum signal stack size as an additional parameter, allowing the
native and compat system call entry code to pass in their respective
values. COMPAT_SIGMINSTKSZ is just defined as SIGMINSTKSZ if it has not
been defined by the architecture.

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Oleg Nesterov <oleg@redhat.com>
Reported-by: Steve McIntyre <steve.mcintyre@arm.com>
Tested-by: Steve McIntyre <93sam@debian.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
[signal: Fix up cherry-pick conflicts for 22839869f21a]
Signed-off-by: Steve McIntyre <93sam@debian.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-21 14:11:29 +01:00
Eric W. Biederman
ba277fec4c signal: Always deliver the kernel's SIGKILL and SIGSTOP to a pid namespace init
[ Upstream commit 3597dfe01d12f570bc739da67f857fd222a3ea66 ]

Instead of playing whack-a-mole and changing SEND_SIG_PRIV to
SEND_SIG_FORCED throughout the kernel to ensure a pid namespace init
gets signals sent by the kernel, stop allowing a pid namespace init to
ignore SIGKILL or SIGSTOP sent by the kernel.  A pid namespace init is
only supposed to be able to ignore signals sent from itself and
children with SIG_DFL.

Fixes: 921cf9f630 ("signals: protect cinit from unblocked SIG_DFL signals")
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-13 11:16:53 -08:00
zhongjiang
ec1975ac98 kernel/signal.c: avoid undefined behaviour in kill_something_info
commit 4ea77014af0d6205b05503d1c7aac6eace11d473 upstream.

When running kill(72057458746458112, 0) in userspace I hit the following
issue.

  UBSAN: Undefined behaviour in kernel/signal.c:1462:11
  negation of -2147483648 cannot be represented in type 'int':
  CPU: 226 PID: 9849 Comm: test Tainted: G    B          ---- -------   3.10.0-327.53.58.70.x86_64_ubsan+ #116
  Hardware name: Huawei Technologies Co., Ltd. RH8100 V3/BC61PBIA, BIOS BLHSV028 11/11/2014
  Call Trace:
    dump_stack+0x19/0x1b
    ubsan_epilogue+0xd/0x50
    __ubsan_handle_negate_overflow+0x109/0x14e
    SYSC_kill+0x43e/0x4d0
    SyS_kill+0xe/0x10
    system_call_fastpath+0x16/0x1b

Add code to avoid the UBSAN detection.

[akpm@linux-foundation.org: tweak comment]
Link: http://lkml.kernel.org/r/1496670008-59084-1-git-send-email-zhongjiang@huawei.com
Signed-off-by: zhongjiang <zhongjiang@huawei.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:50:18 +02:00
Waiman Long
20a30619b3 signals: avoid unnecessary taking of sighand->siglock
commit c7be96af89d4b53211862d8599b2430e8900ed92 upstream.

When running certain database workload on a high-end system with many
CPUs, it was found that spinlock contention in the sigprocmask syscalls
became a significant portion of the overall CPU cycles as shown below.

  9.30%  9.30%  905387  dataserver  /proc/kcore 0x7fff8163f4d2
  [k] _raw_spin_lock_irq
            |
            ---_raw_spin_lock_irq
               |
               |--99.34%-- __set_current_blocked
               |          sigprocmask
               |          sys_rt_sigprocmask
               |          system_call_fastpath
               |          |
               |          |--50.63%-- __swapcontext
               |          |          |
               |          |          |--99.91%-- upsleepgeneric
               |          |
               |          |--49.36%-- __setcontext
               |          |          ktskRun

Looking further into the swapcontext function in glibc, it was found that
the function always call sigprocmask() without checking if there are
changes in the signal mask.

A check was added to the __set_current_blocked() function to avoid taking
the sighand->siglock spinlock if there is no change in the signal mask.
This will prevent unneeded spinlock contention when many threads are
trying to call sigprocmask().

With this patch applied, the spinlock contention in sigprocmask() was
gone.

Link: http://lkml.kernel.org/r/1474979209-11867-1-git-send-email-Waiman.Long@hpe.com
Signed-off-by: Waiman Long <Waiman.Long@hpe.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Stas Sergeev <stsp@list.ru>
Cc: Scott J Norton <scott.norton@hpe.com>
Cc: Douglas Hatch <doug.hatch@hpe.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-22 16:57:57 +02:00
Oleg Nesterov
4d53eb4949 kernel/signal.c: remove the no longer needed SIGNAL_UNKILLABLE check in complete_signal()
commit 426915796ccaf9c2bd9bb06dc5702225957bc2e5 upstream.

complete_signal() checks SIGNAL_UNKILLABLE before it starts to destroy
the thread group, today this is wrong in many ways.

If nothing else, fatal_signal_pending() should always imply that the
whole thread group (except ->group_exit_task if it is not NULL) is
killed, this check breaks the rule.

After the previous changes we can rely on sig_task_ignored();
sig_fatal(sig) && SIGNAL_UNKILLABLE can only be true if we actually want
to kill this task and sig == SIGKILL OR it is traced and debugger can
intercept the signal.

This should hopefully fix the problem reported by Dmitry.  This
test-case

	static int init(void *arg)
	{
		for (;;)
			pause();
	}

	int main(void)
	{
		char stack[16 * 1024];

		for (;;) {
			int pid = clone(init, stack + sizeof(stack)/2,
					CLONE_NEWPID | SIGCHLD, NULL);
			assert(pid > 0);

			assert(ptrace(PTRACE_ATTACH, pid, 0, 0) == 0);
			assert(waitpid(-1, NULL, WSTOPPED) == pid);

			assert(ptrace(PTRACE_DETACH, pid, 0, SIGSTOP) == 0);
			assert(syscall(__NR_tkill, pid, SIGKILL) == 0);
			assert(pid == wait(NULL));
		}
	}

triggers the WARN_ON_ONCE(!(task->jobctl & JOBCTL_STOP_PENDING)) in
task_participate_group_stop().  do_signal_stop()->signal_group_exit()
checks SIGNAL_GROUP_EXIT and return false, but task_set_jobctl_pending()
checks fatal_signal_pending() and does not set JOBCTL_STOP_PENDING.

And his should fix the minor security problem reported by Kyle,
SECCOMP_RET_TRACE can miss fatal_signal_pending() the same way if the
task is the root of a pid namespace.

Link: http://lkml.kernel.org/r/20171103184246.GD21036@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: Kyle Huey <me@kylehuey.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Kyle Huey <me@kylehuey.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-10 09:29:53 +01:00
Oleg Nesterov
794ac8ef9b kernel/signal.c: protect the SIGNAL_UNKILLABLE tasks from !sig_kernel_only() signals
commit ac25385089f673560867eb5179228a44ade0cfc1 upstream.

Change sig_task_ignored() to drop the SIG_DFL && !sig_kernel_only()
signals even if force == T.  This simplifies the next change and this
matches the same check in get_signal() which will drop these signals
anyway.

Link: http://lkml.kernel.org/r/20171103184227.GC21036@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Tested-by: Kyle Huey <me@kylehuey.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-10 09:29:52 +01:00
Oleg Nesterov
1453b3ac6c kernel/signal.c: protect the traced SIGNAL_UNKILLABLE tasks from SIGKILL
commit 628c1bcba204052d19b686b5bac149a644cdb72e upstream.

The comment in sig_ignored() says "Tracers may want to know about even
ignored signals" but SIGKILL can not be reported to debugger and it is
just wrong to return 0 in this case: SIGKILL should only kill the
SIGNAL_UNKILLABLE task if it comes from the parent ns.

Change sig_ignored() to ignore ->ptrace if sig == SIGKILL and rely on
sig_task_ignored().

SISGTOP coming from within the namespace is not really right too but at
least debugger can intercept it, and we can't drop it here because this
will break "gdb -p 1": ptrace_attach() won't work.  Perhaps we will add
another ->ptrace check later, we will see.

Link: http://lkml.kernel.org/r/20171103184206.GB21036@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Tested-by: Kyle Huey <me@kylehuey.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-10 09:29:52 +01:00
Jamie Iles
916a05b90d signal: protect SIGNAL_UNKILLABLE from unintentional clearing.
[ Upstream commit 2d39b3cd34e6d323720d4c61bd714f5ae202c022 ]

Since commit 00cd5c37af ("ptrace: permit ptracing of /sbin/init") we
can now trace init processes.  init is initially protected with
SIGNAL_UNKILLABLE which will prevent fatal signals such as SIGSTOP, but
there are a number of paths during tracing where SIGNAL_UNKILLABLE can
be implicitly cleared.

This can result in init becoming stoppable/killable after tracing.  For
example, running:

  while true; do kill -STOP 1; done &
  strace -p 1

and then stopping strace and the kill loop will result in init being
left in state TASK_STOPPED.  Sending SIGCONT to init will resume it, but
init will now respond to future SIGSTOP signals rather than ignoring
them.

Make sure that when setting SIGNAL_STOP_CONTINUED/SIGNAL_STOP_STOPPED
that we don't clear SIGNAL_UNKILLABLE.

Link: http://lkml.kernel.org/r/20170104122017.25047-1-jamie.iles@oracle.com
Signed-off-by: Jamie Iles <jamie.iles@oracle.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-11 08:49:36 -07:00
Eric W. Biederman
f719f20abe signal: Only reschedule timers on signals timers have sent
commit 57db7e4a2d92c2d3dfbca4ef8057849b2682436b upstream.

Thomas Gleixner  wrote:
> The CRIU support added a 'feature' which allows a user space task to send
> arbitrary (kernel) signals to itself. The changelog says:
>
>   The kernel prevents sending of siginfo with positive si_code, because
>   these codes are reserved for kernel.  I think we can allow a task to
>   send such a siginfo to itself.  This operation should not be dangerous.
>
> Quite contrary to that claim, it turns out that it is outright dangerous
> for signals with info->si_code == SI_TIMER. The following code sequence in
> a user space task allows to crash the kernel:
>
>    id = timer_create(CLOCK_XXX, ..... signo = SIGX);
>    timer_set(id, ....);
>    info->si_signo = SIGX;
>    info->si_code = SI_TIMER:
>    info->_sifields._timer._tid = id;
>    info->_sifields._timer._sys_private = 2;
>    rt_[tg]sigqueueinfo(..., SIGX, info);
>    sigemptyset(&sigset);
>    sigaddset(&sigset, SIGX);
>    rt_sigtimedwait(sigset, info);
>
> For timers based on CLOCK_PROCESS_CPUTIME_ID, CLOCK_THREAD_CPUTIME_ID this
> results in a kernel crash because sigwait() dequeues the signal and the
> dequeue code observes:
>
>   info->si_code == SI_TIMER && info->_sifields._timer._sys_private != 0
>
> which triggers the following callchain:
>
>  do_schedule_next_timer() -> posix_cpu_timer_schedule() -> arm_timer()
>
> arm_timer() executes a list_add() on the timer, which is already armed via
> the timer_set() syscall. That's a double list add which corrupts the posix
> cpu timer list. As a consequence the kernel crashes on the next operation
> touching the posix cpu timer list.
>
> Posix clocks which are internally implemented based on hrtimers are not
> affected by this because hrtimer_start() can handle already armed timers
> nicely, but it's a reliable way to trigger the WARN_ON() in
> hrtimer_forward(), which complains about calling that function on an
> already armed timer.

This problem has existed since the posix timer code was merged into
2.5.63. A few releases earlier in 2.5.60 ptrace gained the ability to
inject not just a signal (which linux has supported since 1.0) but the
full siginfo of a signal.

The core problem is that the code will reschedule in response to
signals getting dequeued not just for signals the timers sent but
for other signals that happen to a si_code of SI_TIMER.

Avoid this confusion by testing to see if the queued signal was
preallocated as all timer signals are preallocated, and so far
only the timer code preallocates signals.

Move the check for if a timer needs to be rescheduled up into
collect_signal where the preallocation check must be performed,
and pass the result back to dequeue_signal where the code reschedules
timers.   This makes it clear why the code cares about preallocated
timers.

Reported-by: Thomas Gleixner <tglx@linutronix.de>
History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Reference: 66dd34ad31 ("signal: allow to send any siginfo to itself")
Reference: 1669ce53e2ff ("Add PTRACE_GETSIGINFO and PTRACE_SETSIGINFO")
Fixes: db8b50ba75f2 ("[PATCH] POSIX clocks & timers")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-29 13:00:29 +02:00
Stas Sergeev
6d94a6b32e sigaltstack: support SS_AUTODISARM for CONFIG_COMPAT
commit 441398d378f29a5ad6d0fcda07918e54e4961800 upstream.

Currently SS_AUTODISARM is not supported in compatibility mode, but does
not return -EINVAL either.  This makes dosemu built with -m32 on x86_64
to crash.  Also the kernel's sigaltstack selftest fails if compiled with
-m32.

This patch adds the needed support.

Link: http://lkml.kernel.org/r/20170205101213.8163-2-stsp@list.ru
Signed-off-by: Stas Sergeev <stsp@users.sourceforge.net>
Cc: Milosz Tanski <milosz@adfin.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Waiman Long <Waiman.Long@hpe.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: Wang Xiaoqiang <wangxq10@lzu.edu.cn>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-12 06:41:44 +01:00
Dmitry Safonov
6846351052 x86/signal: Add SA_{X32,IA32}_ABI sa_flags
Introduce new flags that defines which ABI to use on creating sigframe.
Those flags kernel will set according to sigaction syscall ABI,
which set handler for the signal being delivered.

So that will drop the dependency on TIF_IA32/TIF_X32 flags on signal deliver.
Those flags will be used only under CONFIG_COMPAT.

Similar way ARM uses sa_flags to differ in which mode deliver signal
for 26-bit applications (look at SA_THIRYTWO).

Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Cc: 0x7f454c46@gmail.com
Cc: oleg@redhat.com
Cc: linux-mm@kvack.org
Cc: gorcunov@openvz.org
Cc: xemul@virtuozzo.com
Link: http://lkml.kernel.org/r/20160905133308.28234-7-dsafonov@virtuozzo.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-14 21:28:11 +02:00
Thomas Gleixner
2b1ecc3d1a signals: Use hrtimer for sigtimedwait()
We've converted most timeout related syscalls to hrtimers, but
sigtimedwait() did not get this treatment.

Convert it so we get a reasonable accuracy and remove the
user space exposure to the timer wheel properties.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Chris Mason <clm@fb.com>
Cc: Cyril Hrubis <chrubis@suse.cz>
Cc: George Spelvin <linux@sciencehorizons.net>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160704094341.787164909@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07 10:35:07 +02:00
Wang Xiaoqiang
747800efbe kernel/signal.c: convert printk(KERN_<LEVEL> ...) to pr_<level>(...)
Use pr_<level> instead of printk(KERN_<LEVEL> ).

Signed-off-by: Wang Xiaoqiang <wangxq10@lzu.edu.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-23 17:04:14 -07:00
Andy Lutomirski
0318bc8a91 signals/sigaltstack: Report current flag bits in sigaltstack()
sigaltstack()'s reported previous state uses a somewhat odd
convention, but the concept of flag bits is new, and we can do the
flag bits sensibly.  Specifically, let's just report them directly.

This will allow saving and restoring the sigaltstack state using
sigaltstack() to work correctly.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Amanieu d'Antras <amanieu@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Shuah Khan <shuahkh@osg.samsung.com>
Cc: Stas Sergeev <stsp@list.ru>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Cc: linux-api@vger.kernel.org
Link: http://lkml.kernel.org/r/94b291ec9fd47741a9264851e316e158ded0b00d.1462296606.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-04 08:34:14 +02:00
Stas Sergeev
2a74213838 signals/sigaltstack: Implement SS_AUTODISARM flag
This patch implements the SS_AUTODISARM flag that can be OR-ed with
SS_ONSTACK when forming ss_flags.

When this flag is set, sigaltstack will be disabled when entering
the signal handler; more precisely, after saving sas to uc_stack.
When leaving the signal handler, the sigaltstack is restored by
uc_stack.

When this flag is used, it is safe to switch from sighandler with
swapcontext(). Without this flag, the subsequent signal will corrupt
the state of the switched-away sighandler.

To detect the support of this functionality, one can do:

  err = sigaltstack(SS_DISABLE | SS_AUTODISARM);
  if (err && errno == EINVAL)
	unsupported();

Signed-off-by: Stas Sergeev <stsp@list.ru>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Amanieu d'Antras <amanieu@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Heinrich Schuchardt <xypron.glpk@gmx.de>
Cc: Jason Low <jason.low2@hp.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Moore <pmoore@redhat.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Shuah Khan <shuahkh@osg.samsung.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Cc: linux-api@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/1460665206-13646-4-git-send-email-stsp@list.ru
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-03 08:37:59 +02:00
Stas Sergeev
407bc16ad1 signals/sigaltstack: Prepare to add new SS_xxx flags
This patch adds SS_FLAG_BITS - the mask that splits sigaltstack
mode values and bit-flags. Since there is no bit-flags yet, the
mask is defined to 0. The flags are added by subsequent patches.
With every new flag, the mask should have the appropriate bit cleared.

This makes sure if some flag is tried on a kernel that doesn't
support it, the -EINVAL error will be returned, because such a
flag will be treated as an invalid mode rather than the bit-flag.

That way the existence of the particular features can be probed
at run-time.

This change was suggested by Andy Lutomirski:

  https://lkml.org/lkml/2016/3/6/158

Signed-off-by: Stas Sergeev <stsp@list.ru>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Amanieu d'Antras <amanieu@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Shuah Khan <shuahkh@osg.samsung.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Cc: linux-api@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/1460665206-13646-3-git-send-email-stsp@list.ru
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-03 08:37:59 +02:00
Helge Deller
41b2715487 kernel/signal.c: add compile-time check for __ARCH_SI_PREAMBLE_SIZE
The value of __ARCH_SI_PREAMBLE_SIZE defines the size (including
padding) of the part of the struct siginfo that is before the union, and
it is then used to calculate the needed padding (SI_PAD_SIZE) to make
the size of struct siginfo equal to 128 (SI_MAX_SIZE) bytes.

Depending on the target architecture and word width it equals to either
3 or 4 times sizeof int.

Since the very beginning we had __ARCH_SI_PREAMBLE_SIZE wrong on the
parisc architecture for the 64bit kernel build.  It's even more
frustrating, because it can easily be checked at compile time if the
value was defined correctly.

This patch adds such a check for the correctness of
__ARCH_SI_PREAMBLE_SIZE in the hope that it will prevent existing and
future architectures from running into the same problem.

I refrained from replacing __ARCH_SI_PREAMBLE_SIZE by offsetof() in
copy_siginfo() in include/asm-generic/siginfo.h, because a) it doesn't
make any difference and b) it's used in the Documentation/kmemcheck.txt
example.

I ran this patch through the 0-DAY kernel test infrastructure and only
the parisc architecture triggered as expected.  That means that this
patch should be OK for all major architectures.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Dave Hansen
cd0ea35ff5 signals, pkeys: Notify userspace about protection key faults
A protection key fault is very similar to any other access error.
There must be a VMA, etc...  We even want to take the same action
(SIGSEGV) that we do with a normal access fault.

However, we do need to let userspace know that something is
different.  We do this the same way what we did with SEGV_BNDERR
with Memory Protection eXtensions (MPX): define a new SEGV code:
SEGV_PKUERR.

We add a siginfo field: si_pkey that reveals to userspace which
protection key was set on the PTE that we faulted on.  There is
no other easy way for userspace to figure this out.  They could
parse smaps but that would be a bit cruel.

We share space with in siginfo with _addr_bnd.  #BR faults from
MPX are completely separate from page faults (#PF) that trigger
from protection key violations, so we never need both at the same
time.

Note that _pkey is a 64-bit value.  The current hardware only
supports 4-bit protection keys.  We do this because there is
_plenty_ of space in _sigfault and it is possible that future
processors would support more than 4 bits of protection keys.

The x86 code to actually fill in the siginfo is in the next
patch.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Amanieu d'Antras <amanieu@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rik van Riel <riel@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Cc: linux-arch@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/20160212210212.3A9B83AC@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-18 09:32:42 +01:00
Sasha Levin
823dd3224a signals: avoid random wakeups in sigsuspend()
A random wakeup can get us out of sigsuspend() without TIF_SIGPENDING
being set.

Avoid that by making sure we were signaled, like sys_pause() does.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-05 18:10:40 -08:00
Richard Weinberger
9d8a765211 kernel/signal.c: unexport sigsuspend()
sigsuspend() is nowhere used except in signal.c itself, so we can mark it
static do not pollute the global namespace.

But this patch is more than a boring cleanup patch, it fixes a real issue
on UserModeLinux.  UML has a special console driver to display ttys using
xterm, or other terminal emulators, on the host side.  Vegard reported
that sometimes UML is unable to spawn a xterm and he's facing the
following warning:

  WARNING: CPU: 0 PID: 908 at include/linux/thread_info.h:128 sigsuspend+0xab/0xc0()

It turned out that this warning makes absolutely no sense as the UML
xterm code calls sigsuspend() on the host side, at least it tries.  But
as the kernel itself offers a sigsuspend() symbol the linker choose this
one instead of the glibc wrapper.  Interestingly this code used to work
since ever but always blocked signals on the wrong side.  Some recent
kernel change made the WARN_ON() trigger and uncovered the bug.

It is a wonderful example of how much works by chance on computers. :-)

Fixes: 68f3f16d9a ("new helper: sigsuspend()")
Signed-off-by: Richard Weinberger <richard@nod.at>
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Tested-by: Vegard Nossum <vegard.nossum@oracle.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: <stable@vger.kernel.org>	[3.5+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-20 16:17:32 -08:00
Oleg Nesterov
5fa534c987 coredump: ensure all coredumping tasks have SIGNAL_GROUP_COREDUMP
task_will_free_mem() is wrong in many ways, and in particular the
SIGNAL_GROUP_COREDUMP check is not reliable: a task can participate in the
coredumping without SIGNAL_GROUP_COREDUMP bit set.

change zap_threads() paths to always set SIGNAL_GROUP_COREDUMP even if
other CLONE_VM processes can't react to SIGKILL.  Fortunately, at least
oom-kill case if fine; it kills all tasks sharing the same mm, so it
should also kill the process which actually dumps the core.

The change in prepare_signal() is not strictly necessary, it just ensures
that the patch does not bring another subtle behavioural change.  But it
reminds us that this SIGNAL_GROUP_EXIT/COREDUMP case needs more changes.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Kyle Walker <kwalker@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Stanislav Kozina <skozina@redhat.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-06 17:50:42 -08:00
Oleg Nesterov
2e01fabe67 signals: kill block_all_signals() and unblock_all_signals()
It is hardly possible to enumerate all problems with block_all_signals()
and unblock_all_signals().  Just for example,

1. block_all_signals(SIGSTOP/etc) simply can't help if the caller is
   multithreaded. Another thread can dequeue the signal and force the
   group stop.

2. Even is the caller is single-threaded, it will "stop" anyway. It
   will not sleep, but it will spin in kernel space until SIGCONT or
   SIGKILL.

And a lot more. In short, this interface doesn't work at all, at least
the last 10+ years.

Daniel said:

  Yeah the only times I played around with the DRM_LOCK stuff was when
  old drivers accidentally deadlocked - my impression is that the entire
  DRM_LOCK thing was never really tested properly ;-) Hence I'm all for
  purging where this leaks out of the drm subsystem.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Dave Airlie <airlied@redhat.com>
Cc: Richard Weinberger <richard@nod.at>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-06 17:50:42 -08:00
Amanieu d'Antras
26135022f8 signal: fix information leak in copy_siginfo_to_user
This function may copy the si_addr_lsb, si_lower and si_upper fields to
user mode when they haven't been initialized, which can leak kernel
stack data to user mode.

Just checking the value of si_code is insufficient because the same
si_code value is shared between multiple signals.  This is solved by
checking the value of si_signo in addition to si_code.

Signed-off-by: Amanieu d'Antras <amanieu@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-08-07 04:39:40 +03:00
Amanieu d'Antras
3c00cb5e68 signal: fix information leak in copy_siginfo_from_user32
This function can leak kernel stack data when the user siginfo_t has a
positive si_code value.  The top 16 bits of si_code descibe which fields
in the siginfo_t union are active, but they are treated inconsistently
between copy_siginfo_from_user32, copy_siginfo_to_user32 and
copy_siginfo_to_user.

copy_siginfo_from_user32 is called from rt_sigqueueinfo and
rt_tgsigqueueinfo in which the user has full control overthe top 16 bits
of si_code.

This fixes the following information leaks:
x86:   8 bytes leaked when sending a signal from a 32-bit process to
       itself. This leak grows to 16 bytes if the process uses x32.
       (si_code = __SI_CHLD)
x86:   100 bytes leaked when sending a signal from a 32-bit process to
       a 64-bit process. (si_code = -1)
sparc: 4 bytes leaked when sending a signal from a 32-bit process to a
       64-bit process. (si_code = any)

parsic and s390 have similar bugs, but they are not vulnerable because
rt_[tg]sigqueueinfo have checks that prevent sending a positive si_code
to a different process.  These bugs are also fixed for consistency.

Signed-off-by: Amanieu d'Antras <amanieu@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-08-07 04:39:40 +03:00
Linus Torvalds
e22619a29f Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "The main change in this kernel is Casey's generalized LSM stacking
  work, which removes the hard-coding of Capabilities and Yama stacking,
  allowing multiple arbitrary "small" LSMs to be stacked with a default
  monolithic module (e.g.  SELinux, Smack, AppArmor).

  See
        https://lwn.net/Articles/636056/

  This will allow smaller, simpler LSMs to be incorporated into the
  mainline kernel and arbitrarily stacked by users.  Also, this is a
  useful cleanup of the LSM code in its own right"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (38 commits)
  tpm, tpm_crb: fix le64_to_cpu conversions in crb_acpi_add()
  vTPM: set virtual device before passing to ibmvtpm_reset_crq
  tpm_ibmvtpm: remove unneccessary message level.
  ima: update builtin policies
  ima: extend "mask" policy matching support
  ima: add support for new "euid" policy condition
  ima: fix ima_show_template_data_ascii()
  Smack: freeing an error pointer in smk_write_revoke_subj()
  selinux: fix setting of security labels on NFS
  selinux: Remove unused permission definitions
  selinux: enable genfscon labeling for sysfs and pstore files
  selinux: enable per-file labeling for debugfs files.
  selinux: update netlink socket classes
  signals: don't abuse __flush_signals() in selinux_bprm_committed_creds()
  selinux: Print 'sclass' as string when unrecognized netlink message occurs
  Smack: allow multiple labels in onlycap
  Smack: fix seq operations in smackfs
  ima: pass iint to ima_add_violation()
  ima: wrap event related data to the new ima_event_data structure
  integrity: add validity checks for 'path' parameter
  ...
2015-06-27 13:26:03 -07:00
Oleg Nesterov
9e7c8f8c62 signals: don't abuse __flush_signals() in selinux_bprm_committed_creds()
selinux_bprm_committed_creds()->__flush_signals() is not right, we
shouldn't clear TIF_SIGPENDING unconditionally. There can be other
reasons for signal_pending(): freezing(), JOBCTL_PENDING_MASK, and
potentially more.

Also change this code to check fatal_signal_pending() rather than
SIGNAL_GROUP_EXIT, it looks a bit better.

Now we can kill __flush_signals() before it finds another buggy user.

Note: this code looks racy, we can flush a signal which was sent after
the task SID has been updated.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-06-04 16:22:16 -04:00
Palmer Dabbelt
b76808e680 signals, sched: Change all uses of JOBCTL_* from 'int' to 'long'
c56fb6564dcd ("Fix a misaligned load inside ptrace_attach()") makes
jobctl an "unsigned long".  It makes sense to have the masks applied
to it match that type.  This is currently just a cosmetic change, but
it will prevent the mask from being unexpectedly truncated if we ever
end up with masks with more bits.

One instance of "signr" is an int, but I left this alone because the
mask ensures that it will never overflow.

Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bobby.prani@gmail.com
Cc: oleg@redhat.com
Cc: paulmck@linux.vnet.ibm.com
Cc: richard@nod.at
Cc: vdavydov@parallels.com
Link: http://lkml.kernel.org/r/1430453997-32459-4-git-send-email-palmer@dabbelt.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-08 12:04:36 +02:00
Vladimir Davydov
69828dce7a signal: remove warning about using SI_TKILL in rt_[tg]sigqueueinfo
Sending SI_TKILL from rt_[tg]sigqueueinfo was deprecated, so now we issue
a warning on the first attempt of doing it.  We use WARN_ON_ONCE, which is
not informative and, what is worse, taints the kernel, making the trinity
syscall fuzzer complain false-positively from time to time.

It does not look like we need this warning at all, because the behaviour
changed quite a long time ago (2.6.39), and if an application relies on
the old API, it gets EPERM anyway and can issue a warning by itself.

So let us zap the warning in kernel.

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-17 09:04:06 -04:00