Commit graph

570 commits

Author SHA1 Message Date
Anan Jaser
be1e7be3eb
kernel: sched: sync sched_rt_boost_threshold with N770F
* Add previously removed sys nodes to proc
2023-06-08 13:36:43 +03:00
FAROVITUS
af1d3ae977 Merge 4.9.212 branch 'android-4.9-q' into tw10-android-4.9-q
Documentation/filesystems/fscrypt.rst
	arch/arm/common/Kconfig
	arch/arm64/boot/dts/altera/socfpga_stratix10.dtsi
	arch/arm64/boot/dts/amd/amd-seattle-soc.dtsi
	arch/arm64/boot/dts/arm/juno-clocks.dtsi
	arch/arm64/boot/dts/broadcom/ns2.dtsi
	arch/arm64/boot/dts/lg/lg1312.dtsi
	arch/arm64/boot/dts/lg/lg1313.dtsi
	arch/arm64/boot/dts/marvell/armada-37xx.dtsi
	arch/arm64/boot/dts/nvidia/tegra210-p2180.dtsi
	arch/arm64/boot/dts/nvidia/tegra210-p2597.dtsi
	arch/arm64/boot/dts/nvidia/tegra210.dtsi
	arch/arm64/boot/dts/qcom/apq8016-sbc.dtsi
	arch/arm64/boot/dts/qcom/msm8996.dtsi
	arch/arm64/configs/ranchu64_defconfig
	arch/arm64/include/asm/cpucaps.h
	arch/arm64/kernel/cpufeature.c
	arch/arm64/kernel/traps.c
	arch/arm64/mm/mmu.c
	crypto/Makefile
	crypto/ablkcipher.c
	crypto/blkcipher.c
	crypto/testmgr.h
	crypto/zstd.c
	drivers/android/binder.c
	drivers/android/binder_alloc.c
	drivers/char/random.c
	drivers/clocksource/exynos_mct.c
	drivers/dma/pl330.c
	drivers/hid/hid-sony.c
	drivers/hid/uhid.c
	drivers/hid/usbhid/hiddev.c
	drivers/i2c/i2c-core.c
	drivers/md/dm-crypt.c
	drivers/media/v4l2-core/videobuf2-v4l2.c
	drivers/mmc/host/dw_mmc.c
	drivers/net/ethernet/broadcom/tg3.c
	drivers/net/usb/r8152.c
	drivers/scsi/scsi_logging.c
	drivers/scsi/sd.c
	drivers/scsi/ufs/ufshcd-pci.c
	drivers/scsi/ufs/ufshcd-pltfrm.c
	drivers/staging/android/Kconfig
	drivers/staging/android/ion/ion.c
	drivers/staging/android/ion/ion_priv.h
	drivers/staging/android/ion/ion_system_heap.c
	drivers/staging/android/lowmemorykiller.c
	drivers/tty/serial/samsung.c
	drivers/usb/dwc3/core.c
	drivers/usb/dwc3/gadget.c
	drivers/usb/host/xhci-hub.c
	drivers/video/fbdev/core/fbmon.c
	drivers/video/fbdev/core/modedb.c
	fs/crypto/fname.c
	fs/crypto/fscrypt_private.h
	fs/crypto/keyinfo.c
	fs/ext4/ialloc.c
	fs/ext4/namei.c
	fs/ext4/xattr.c
	fs/f2fs/checkpoint.c
	fs/f2fs/data.c
	fs/f2fs/debug.c
	fs/f2fs/dir.c
	fs/f2fs/f2fs.h
	fs/f2fs/file.c
	fs/f2fs/gc.c
	fs/f2fs/inline.c
	fs/f2fs/inode.c
	fs/f2fs/namei.c
	fs/f2fs/node.c
	fs/f2fs/recovery.c
	fs/f2fs/segment.c
	fs/f2fs/segment.h
	fs/f2fs/super.c
	fs/f2fs/sysfs.c
	fs/fat/dir.c
	fs/fat/fatent.c
	fs/file.c
	fs/namespace.c
	fs/pnode.c
	fs/proc/inode.c
	fs/proc/root.c
	fs/proc/task_mmu.c
	fs/sdcardfs/dentry.c
	fs/sdcardfs/derived_perm.c
	fs/sdcardfs/file.c
	fs/sdcardfs/inode.c
	fs/sdcardfs/lookup.c
	fs/sdcardfs/main.c
	fs/sdcardfs/sdcardfs.h
	fs/sdcardfs/super.c
	include/linux/blk_types.h
	include/linux/cpuhotplug.h
	include/linux/cred.h
	include/linux/fb.h
	include/linux/power_supply.h
	include/linux/sched.h
	include/linux/zstd.h
	include/trace/events/sched.h
	include/uapi/linux/android/binder.h
	init/Kconfig
	init/main.c
	kernel/bpf/hashtab.c
	kernel/cpu.c
	kernel/cred.c
	kernel/fork.c
	kernel/locking/spinlock_debug.c
	kernel/panic.c
	kernel/printk/printk.c
	kernel/sched/Makefile
	kernel/sched/core.c
	kernel/sched/fair.c
	kernel/sched/rt.c
	kernel/sched/walt.c
	kernel/sched/walt.h
	kernel/trace/trace.c
	lib/bug.c
	lib/list_debug.c
	lib/vsprintf.c
	lib/zstd/bitstream.h
	lib/zstd/compress.c
	lib/zstd/decompress.c
	lib/zstd/fse.h
	lib/zstd/fse_compress.c
	lib/zstd/fse_decompress.c
	lib/zstd/huf_compress.c
	lib/zstd/huf_decompress.c
	lib/zstd/zstd_internal.h
	mm/debug.c
	mm/filemap.c
	mm/rmap.c
	net/core/filter.c
	net/ipv4/sysctl_net_ipv4.c
	net/ipv4/sysfs_net_ipv4.c
	net/ipv4/tcp_input.c
	net/ipv4/tcp_output.c
	net/ipv4/udp.c
	net/ipv6/netfilter/nf_conntrack_reasm.c
	net/netfilter/Kconfig
	net/netfilter/Makefile
	net/netfilter/xt_qtaguid.c
	net/netfilter/xt_qtaguid_internal.h
	net/xfrm/xfrm_policy.c
	net/xfrm/xfrm_state.c
	scripts/checkpatch.pl
	security/selinux/hooks.c
	sound/core/compress_offload.c
2020-02-12 12:32:38 +02:00
FAROVITUS
2b92eefa41 import G965FXXU7DTAA OSRC
*First release for Android (Q).

Signed-off-by: FAROVITUS <farovitus@gmail.com>
2020-02-04 13:50:09 +02:00
Greg Kroah-Hartman
9759f952fd This is the 4.9.208 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl4Qh+sACgkQONu9yGCS
 aT7sUQ/+P+dg9KWYEUxPUaSrZ3N8yvFTI2GlVWVv9MMrl/TQPU4jLVoFXxpaQd+i
 4MIlgs6a/FEPJWZbsJZP7DCMcIOX3wEbaV4IdHDD67W6sK8A2kUL7Rz7kVEz1/Gi
 o3Z0aep0kScIoY7gIOJXxypreg983odXouyEiP2OMKUNkPWpJqa+kG3XCYWOMnoP
 ak9+gOGVlyVY2rgxHcAIS1IFUhM6QBmTRy78B9vRr/c4OLTwKR9S2AyY9LYX3FsA
 Y4cxXzxFdgHROXg7ev2Xebor2bnrJR6jI2/7BlNLZ7mv/GTKe3UbgLqPcmiqIJ0l
 ybiiDIqs62K/UFW187H3UEwrHzPZcA5eacKk5dX5wTPEiJIMSnhlUKUcNBNFMDEf
 YWeDgd5oxAKyEe8uDTUynQJLa4SpyGpgkWuipEjpih7VlYKXsNsGIMsKnZT30xFO
 dUe6GRaflx/fAU2X5W56o+NfND8QwCaODPfhWXnFUBNK2UP8qortIUBvqqvv+h8g
 mi+FWskrkG+UPHa6WrLoHvhbRzWZ55I4hS3cnDTJfZ+GyQViiEFIFwh+yQV7s9T6
 MWDWzAkK+gMqAIKtHlJA5h7CoFCVUWc804FpFVBT0/ew1fS1ji2kR+RPg7JaJUrU
 /1aoBlsaPI9T/W4na/d6pxJG6CsjNUZCbRIIvk++xfF4qNZuPg4=
 =brIz
 -----END PGP SIGNATURE-----

Merge 4.9.208 into android-4.9-q

Changes in 4.9.208
	btrfs: skip log replay on orphaned roots
	btrfs: do not leak reloc root if we fail to read the fs root
	btrfs: handle ENOENT in btrfs_uuid_tree_iterate
	ALSA: pcm: Avoid possible info leaks from PCM stream buffers
	ALSA: hda/ca0132 - Keep power on during processing DSP response
	ALSA: hda/ca0132 - Avoid endless loop
	drm: mst: Fix query_payload ack reply struct
	drm/bridge: analogix-anx78xx: silence -EPROBE_DEFER warnings
	iio: light: bh1750: Resolve compiler warning and make code more readable
	spi: Add call to spi_slave_abort() function when spidev driver is released
	staging: rtl8192u: fix multiple memory leaks on error path
	staging: rtl8188eu: fix possible null dereference
	rtlwifi: prevent memory leak in rtl_usb_probe
	libertas: fix a potential NULL pointer dereference
	IB/iser: bound protection_sg size by data_sg size
	media: am437x-vpfe: Setting STD to current value is not an error
	media: i2c: ov2659: fix s_stream return value
	media: i2c: ov2659: Fix missing 720p register config
	media: ov6650: Fix stored frame format not in sync with hardware
	tools/power/cpupower: Fix initializer override in hsw_ext_cstates
	usb: renesas_usbhs: add suspend event support in gadget mode
	hwrng: omap3-rom - Call clk_disable_unprepare() on exit only if not idled
	regulator: max8907: Fix the usage of uninitialized variable in max8907_regulator_probe()
	media: flexcop-usb: fix NULL-ptr deref in flexcop_usb_transfer_init()
	media: cec-funcs.h: add status_req checks
	samples: pktgen: fix proc_cmd command result check logic
	mwifiex: pcie: Fix memory leak in mwifiex_pcie_init_evt_ring
	media: ti-vpe: vpe: fix a v4l2-compliance warning about invalid pixel format
	media: ti-vpe: vpe: fix a v4l2-compliance failure about frame sequence number
	media: ti-vpe: vpe: Make sure YUYV is set as default format
	extcon: sm5502: Reset registers during initialization
	x86/mm: Use the correct function type for native_set_fixmap()
	perf test: Report failure for mmap events
	perf report: Add warning when libunwind not compiled in
	usb: usbfs: Suppress problematic bind and unbind uevents.
	iio: adc: max1027: Reset the device at probe time
	Bluetooth: hci_core: fix init for HCI_USER_CHANNEL
	x86/mce: Lower throttling MCE messages' priority to warning
	drm/gma500: fix memory disclosures due to uninitialized bytes
	rtl8xxxu: fix RTL8723BU connection failure issue after warm reboot
	x86/ioapic: Prevent inconsistent state when moving an interrupt
	arm64: psci: Reduce the waiting time for cpu_psci_cpu_kill()
	libata: Ensure ata_port probe has completed before detach
	pinctrl: sh-pfc: sh7734: Fix duplicate TCLK1_B
	Bluetooth: Fix advertising duplicated flags
	bnx2x: Fix PF-VF communication over multi-cos queues.
	spi: img-spfi: fix potential double release
	ALSA: timer: Limit max amount of slave instances
	rtlwifi: fix memory leak in rtl92c_set_fw_rsvdpagepkt()
	perf probe: Fix to find range-only function instance
	perf probe: Fix to list probe event with correct line number
	perf probe: Walk function lines in lexical blocks
	perf probe: Fix to probe an inline function which has no entry pc
	perf probe: Fix to show ranges of variables in functions without entry_pc
	perf probe: Fix to show inlined function callsite without entry_pc
	perf probe: Fix to probe a function which has no entry pc
	perf probe: Skip overlapped location on searching variables
	perf probe: Return a better scope DIE if there is no best scope
	perf probe: Fix to show calling lines of inlined functions
	perf probe: Skip end-of-sequence and non statement lines
	perf probe: Filter out instances except for inlined subroutine and subprogram
	ath10k: fix get invalid tx rate for Mesh metric
	media: pvrusb2: Fix oops on tear-down when radio support is not present
	media: si470x-i2c: add missed operations in remove
	EDAC/ghes: Fix grain calculation
	spi: pxa2xx: Add missed security checks
	ASoC: rt5677: Mark reg RT5677_PWR_ANLG2 as volatile
	s390/disassembler: don't hide instruction addresses
	parport: load lowlevel driver if ports not found
	cpufreq: Register drivers only after CPU devices have been registered
	x86/crash: Add a forward declaration of struct kimage
	iwlwifi: mvm: fix unaligned read of rx_pkt_status
	spi: tegra20-slink: add missed clk_unprepare
	mmc: tmio: Add MMC_CAP_ERASE to allow erase/discard/trim requests
	btrfs: don't prematurely free work in end_workqueue_fn()
	btrfs: don't prematurely free work in run_ordered_work()
	spi: st-ssc4: add missed pm_runtime_disable
	x86/insn: Add some Intel instructions to the opcode map
	iwlwifi: check kasprintf() return value
	fbtft: Make sure string is NULL terminated
	crypto: sun4i-ss - Fix 64-bit size_t warnings on sun4i-ss-hash.c
	crypto: vmx - Avoid weird build failures
	libtraceevent: Fix memory leakage in copy_filter_type
	net: phy: initialise phydev speed and duplex sanely
	btrfs: don't prematurely free work in reada_start_machine_worker()
	Revert "mmc: sdhci: Fix incorrect switch to HS mode"
	usb: xhci: Fix build warning seen with CONFIG_PM=n
	btrfs: don't double lock the subvol_sem for rename exchange
	btrfs: do not call synchronize_srcu() in inode_tree_del
	btrfs: return error pointer from alloc_test_extent_buffer
	btrfs: abort transaction after failed inode updates in create_subvol
	Btrfs: fix removal logic of the tree mod log that leads to use-after-free issues
	af_packet: set defaule value for tmo
	fjes: fix missed check in fjes_acpi_add
	mod_devicetable: fix PHY module format
	net: hisilicon: Fix a BUG trigered by wrong bytes_compl
	net: nfc: nci: fix a possible sleep-in-atomic-context bug in nci_uart_tty_receive()
	net: qlogic: Fix error paths in ql_alloc_large_buffers()
	net: usb: lan78xx: Fix suspend/resume PHY register access error
	sctp: fully initialize v4 addr in some functions
	net: dst: Force 4-byte alignment of dst_metrics
	usbip: Fix error path of vhci_recv_ret_submit()
	USB: EHCI: Do not return -EPIPE when hub is disconnected
	platform/x86: hp-wmi: Make buffer for HPWMI_FEATURE2_QUERY 128 bytes
	staging: comedi: gsc_hpdi: check dma_alloc_coherent() return value
	ext4: fix ext4_empty_dir() for directories with holes
	ext4: check for directory entries too close to block end
	powerpc/irq: fix stack overflow verification
	mmc: sdhci-of-esdhc: fix P2020 errata handling
	perf probe: Fix to show function entry line as probe-able
	scsi: mpt3sas: Fix clear pending bit in ioctl status
	scsi: lpfc: Fix locking on mailbox command completion
	Input: atmel_mxt_ts - disable IRQ across suspend
	iommu/tegra-smmu: Fix page tables in > 4 GiB memory
	scsi: target: compare full CHAP_A Algorithm strings
	scsi: lpfc: Fix SLI3 hba in loop mode not discovering devices
	scsi: csiostor: Don't enable IRQs too early
	powerpc/pseries: Mark accumulate_stolen_time() as notrace
	powerpc/pseries: Don't fail hash page table insert for bolted mapping
	dma-debug: add a schedule point in debug_dma_dump_mappings()
	clocksource/drivers/asm9260: Add a check for of_clk_get
	powerpc/security/book3s64: Report L1TF status in sysfs
	powerpc/book3s64/hash: Add cond_resched to avoid soft lockup warning
	jbd2: Fix statistics for the number of logged blocks
	scsi: tracing: Fix handling of TRANSFER LENGTH == 0 for READ(6) and WRITE(6)
	scsi: lpfc: Fix duplicate unreg_rpi error in port offline flow
	clk: qcom: Allow constant ratio freq tables for rcg
	irqchip/irq-bcm7038-l1: Enable parent IRQ if necessary
	irqchip: ingenic: Error out if IRQ domain creation failed
	fs/quota: handle overflows of sysctl fs.quota.* and report as unsigned long
	scsi: lpfc: fix: Coverity: lpfc_cmpl_els_rsp(): Null pointer dereferences
	scsi: ufs: fix potential bug which ends in system hang
	powerpc/pseries/cmm: Implement release() function for sysfs device
	powerpc/security: Fix wrong message when RFI Flush is disable
	scsi: atari_scsi: sun3_scsi: Set sg_tablesize to 1 instead of SG_NONE
	clk: pxa: fix one of the pxa RTC clocks
	bcache: at least try to shrink 1 node in bch_mca_scan()
	HID: Improve Windows Precision Touchpad detection.
	ext4: work around deleting a file with i_nlink == 0 safely
	scsi: pm80xx: Fix for SATA device discovery
	scsi: scsi_debug: num_tgts must be >= 0
	scsi: target: iscsi: Wait for all commands to finish before freeing a session
	gpio: mpc8xxx: Don't overwrite default irq_set_type callback
	scripts/kallsyms: fix definitely-lost memory leak
	cdrom: respect device capabilities during opening action
	perf regs: Make perf_reg_name() return "unknown" instead of NULL
	libfdt: define INT32_MAX and UINT32_MAX in libfdt_env.h
	s390/cpum_sf: Check for SDBT and SDB consistency
	ocfs2: fix passing zero to 'PTR_ERR' warning
	kernel: sysctl: make drop_caches write-only
	x86/mce: Fix possibly incorrect severity calculation on AMD
	net, sysctl: Fix compiler warning when only cBPF is present
	ALSA: hda - Downgrade error message for single-cmd fallback
	perf strbuf: Remove redundant va_end() in strbuf_addv()
	Make filldir[64]() verify the directory entry filename is valid
	filldir[64]: remove WARN_ON_ONCE() for bad directory entries
	netfilter: ebtables: compat: reject all padding in matches/watchers
	6pack,mkiss: fix possible deadlock
	netfilter: bridge: make sure to pull arp header in br_nf_forward_arp()
	net: icmp: fix data-race in cmp_global_allow()
	hrtimer: Annotate lockless access to timer->state
	tty/serial: atmel: fix out of range clock divider handling
	pinctrl: baytrail: Really serialize all register accesses
	mmc: sdhci: Update the tuning failed messages to pr_debug level
	net: ena: fix napi handler misbehavior when the napi budget is zero
	vhost/vsock: accept only packets with the right dst_cid
	tcp/dccp: fix possible race __inet_lookup_established()
	tcp: do not send empty skb from tcp_write_xmit()
	gtp: fix wrong condition in gtp_genl_dump_pdp()
	gtp: avoid zero size hashtable
	Linux 4.9.208

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2020-01-04 13:57:22 +01:00
Johannes Weiner
b231f9db5e kernel: sysctl: make drop_caches write-only
[ Upstream commit 204cb79ad42f015312a5bbd7012d09c93d9b46fb ]

Currently, the drop_caches proc file and sysctl read back the last value
written, suggesting this is somehow a stateful setting instead of a
one-time command.  Make it write-only, like e.g.  compact_memory.

While mitigating a VM problem at scale in our fleet, there was confusion
about whether writing to this file will permanently switch the kernel into
a non-caching mode.  This influences the decision making in a tense
situation, where tens of people are trying to fix tens of thousands of
affected machines: Do we need a rollback strategy?  What are the
performance implications of operating in a non-caching state for several
days?  It also caused confusion when the kernel team said we may need to
write the file several times to make sure it's effective ("But it already
reads back 3?").

Link: http://lkml.kernel.org/r/20191031221602.9375-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Chris Down <chris@chrisdown.name>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04 13:41:04 +01:00
Greg Kroah-Hartman
a0b21f86b2 This is the 4.9.183 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl0Nx/UACgkQONu9yGCS
 aT78vQ/9FibC80cVhM69HzZko3evlbEQmPZkmgjwkEqJTZ+2qczRZR6sM2Oexx9q
 tLBeRU1+3cPe9Pl/SsEZVM/rTqSrxcRXnGgyG6RMq8wBYtBWyiwRFLRJ2j5zw//s
 Wg0LXJhaqLV9FfP34PaPkXmRdBOec2nw5N0U1uoA/0VPwLFRfPTMn8lsfsmbtLWw
 JlXMlpEYosGQJArYMadMOntnCwaPtUJtd4EfLTqb+Yc23LM0RTYhtwn+XKE8cJzj
 1WMWHhTRkd+RtZIlsgw+FW1iX8pAHvrAPq0b2xOpmq5CCmErHfqjmeZ7VY6SUdAf
 RqnTDFb/yNAdhbjklzB5+o/jAl/yv1nbfOGmbJrZNVxzZ3RqKfUJNDpVQPwJlgAc
 frCJ2+ZuBVnJWcvbyaR35DgJlAxEZ26JVP17D8oywXnQLpemKW++bvMB6J3kL4qD
 nN5bmPd5jCe89Aawo66akIVN1UOnbBlzkdiXEXEzxwvKMraYcoSnoL6AmFxzjcBp
 HV/mUy0L7umBbtz1l9aXiACoGtUwZbbsfNTDeiXadbnZSSAGVYpmtf9UGhSSUrA0
 dqkWos5SWXTYWSO4+4FkuIaGIZF+LPE6Esm+PxyzXmbm5rBiVnxxcxT1X4Pd5O+C
 0y+wng0TSKHMPbwUyoGktuR7ec9P+/WhcyBwjS0EnMnBeBL+Bdw=
 =o76k
 -----END PGP SIGNATURE-----

Merge 4.9.183 into android-4.9-q

Changes in 4.9.183
	rapidio: fix a NULL pointer dereference when create_workqueue() fails
	fs/fat/file.c: issue flush after the writeback of FAT
	sysctl: return -EINVAL if val violates minmax
	ipc: prevent lockup on alloc_msg and free_msg
	ARM: prevent tracing IPI_CPU_BACKTRACE
	hugetlbfs: on restore reserve error path retain subpool reservation
	mem-hotplug: fix node spanned pages when we have a node with only ZONE_MOVABLE
	mm/cma.c: fix crash on CMA allocation if bitmap allocation fails
	mm/cma_debug.c: fix the break condition in cma_maxchunk_get()
	mm/slab.c: fix an infinite loop in leaks_show()
	kernel/sys.c: prctl: fix false positive in validate_prctl_map()
	drivers: thermal: tsens: Don't print error message on -EPROBE_DEFER
	mfd: tps65912-spi: Add missing of table registration
	mfd: intel-lpss: Set the device in reset state when init
	mfd: twl6040: Fix device init errors for ACCCTL register
	perf/x86/intel: Allow PEBS multi-entry in watermark mode
	drm/bridge: adv7511: Fix low refresh rate selection
	objtool: Don't use ignore flag for fake jumps
	pwm: meson: Use the spin-lock only to protect register modifications
	ntp: Allow TAI-UTC offset to be set to zero
	f2fs: fix to avoid panic in do_recover_data()
	f2fs: fix to clear dirty inode in error path of f2fs_iget()
	f2fs: fix to do sanity check on valid block count of segment
	configfs: fix possible use-after-free in configfs_register_group
	uml: fix a boot splat wrt use of cpu_all_mask
	watchdog: imx2_wdt: Fix set_timeout for big timeout values
	watchdog: fix compile time error of pretimeout governors
	iommu/vt-d: Set intel_iommu_gfx_mapped correctly
	ALSA: hda - Register irq handler after the chip initialization
	nvmem: core: fix read buffer in place
	fuse: retrieve: cap requested size to negotiated max_write
	nfsd: allow fh_want_write to be called twice
	x86/PCI: Fix PCI IRQ routing table memory leak
	platform/chrome: cros_ec_proto: check for NULL transfer function
	soc: mediatek: pwrap: Zero initialize rdata in pwrap_init_cipher
	clk: rockchip: Turn on "aclk_dmac1" for suspend on rk3288
	ARM: dts: imx6sx: Specify IMX6SX_CLK_IPG as "ahb" clock to SDMA
	ARM: dts: imx7d: Specify IMX7D_CLK_IPG as "ipg" clock to SDMA
	ARM: dts: imx6ul: Specify IMX6UL_CLK_IPG as "ipg" clock to SDMA
	ARM: dts: imx6sx: Specify IMX6SX_CLK_IPG as "ipg" clock to SDMA
	ARM: dts: imx6qdl: Specify IMX6QDL_CLK_IPG as "ipg" clock to SDMA
	PCI: rpadlpar: Fix leaked device_node references in add/remove paths
	platform/x86: intel_pmc_ipc: adding error handling
	PCI: rcar: Fix a potential NULL pointer dereference
	PCI: rcar: Fix 64bit MSI message address handling
	video: hgafb: fix potential NULL pointer dereference
	video: imsttfb: fix potential NULL pointer dereferences
	PCI: xilinx: Check for __get_free_pages() failure
	gpio: gpio-omap: add check for off wake capable gpios
	dmaengine: idma64: Use actual device for DMA transfers
	pwm: tiehrpwm: Update shadow register for disabling PWMs
	ARM: dts: exynos: Always enable necessary APIO_1V8 and ABB_1V8 regulators on Arndale Octa
	pwm: Fix deadlock warning when removing PWM device
	ARM: exynos: Fix undefined instruction during Exynos5422 resume
	Revert "Bluetooth: Align minimum encryption key size for LE and BR/EDR connections"
	ALSA: seq: Cover unsubscribe_port() in list_mutex
	ALSA: oxfw: allow PCM capture for Stanton SCS.1m
	libata: Extend quirks for the ST1000LM024 drives with NOLPM quirk
	mm/list_lru.c: fix memory leak in __memcg_init_list_lru_node
	fs/ocfs2: fix race in ocfs2_dentry_attach_lock()
	signal/ptrace: Don't leak unitialized kernel memory with PTRACE_PEEK_SIGINFO
	ptrace: restore smp_rmb() in __ptrace_may_access()
	media: v4l2-ioctl: clear fields in s_parm
	i2c: acorn: fix i2c warning
	bcache: fix stack corruption by PRECEDING_KEY()
	cgroup: Use css_tryget() instead of css_tryget_online() in task_get_css()
	ASoC: cs42xx8: Add regcache mask dirty
	ASoC: fsl_asrc: Fix the issue about unsupported rate
	x86/uaccess, kcov: Disable stack protector
	ALSA: seq: Protect in-kernel ioctl calls with mutex
	ALSA: seq: Fix race of get-subscription call vs port-delete ioctls
	Revert "ALSA: seq: Protect in-kernel ioctl calls with mutex"
	Drivers: misc: fix out-of-bounds access in function param_set_kgdbts_var
	scsi: lpfc: add check for loss of ndlp when sending RRQ
	arm64/mm: Inhibit huge-vmap with ptdump
	scsi: bnx2fc: fix incorrect cast to u64 on shift operation
	selftests/timers: Add missing fflush(stdout) calls
	usbnet: ipheth: fix racing condition
	KVM: x86/pmu: do not mask the value that is written to fixed PMUs
	KVM: s390: fix memory slot handling for KVM_SET_USER_MEMORY_REGION
	drm/vmwgfx: integer underflow in vmw_cmd_dx_set_shader() leading to an invalid read
	drm/vmwgfx: NULL pointer dereference from vmw_cmd_dx_view_define()
	usb: dwc2: Fix DMA cache alignment issues
	USB: Fix chipmunk-like voice when using Logitech C270 for recording audio.
	USB: usb-storage: Add new ID to ums-realtek
	USB: serial: pl2303: add Allied Telesis VT-Kit3
	USB: serial: option: add support for Simcom SIM7500/SIM7600 RNDIS mode
	USB: serial: option: add Telit 0x1260 and 0x1261 compositions
	rtc: pcf8523: don't return invalid date when battery is low
	ax25: fix inconsistent lock state in ax25_destroy_timer
	be2net: Fix number of Rx queues used for flow hashing
	ipv6: flowlabel: fl6_sock_lookup() must use atomic_inc_not_zero
	lapb: fixed leak of control-blocks.
	neigh: fix use-after-free read in pneigh_get_next
	sunhv: Fix device naming inconsistency between sunhv_console and sunhv_reg
	Revert "staging: vc04_services: prevent integer overflow in create_pagelist()"
	perf/x86/intel/ds: Fix EVENT vs. UEVENT PEBS constraints
	selftests: netfilter: missing error check when setting up veth interface
	mISDN: make sure device name is NUL terminated
	x86/CPU/AMD: Don't force the CPB cap when running under a hypervisor
	perf/ring_buffer: Fix exposing a temporarily decreased data_head
	perf/ring_buffer: Add ordering to rb->nest increment
	gpio: fix gpio-adp5588 build errors
	net: tulip: de4x5: Drop redundant MODULE_DEVICE_TABLE()
	i2c: dev: fix potential memory leak in i2cdev_ioctl_rdwr
	configfs: Fix use-after-free when accessing sd->s_dentry
	perf data: Fix 'strncat may truncate' build failure with recent gcc
	perf record: Fix s390 missing module symbol and warning for non-root users
	ia64: fix build errors by exporting paddr_to_nid()
	KVM: PPC: Book3S: Use new mutex to synchronize access to rtas token list
	KVM: PPC: Book3S HV: Don't take kvm->lock around kvm_for_each_vcpu
	net: sh_eth: fix mdio access in sh_eth_close() for R-Car Gen2 and RZ/A1 SoCs
	scsi: libcxgbi: add a check for NULL pointer in cxgbi_check_route()
	scsi: smartpqi: properly set both the DMA mask and the coherent DMA mask
	scsi: libsas: delete sas port if expander discover failed
	mlxsw: spectrum: Prevent force of 56G
	Abort file_remove_privs() for non-reg. files
	Linux 4.9.183

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2019-06-22 08:56:18 +02:00
Christian Brauner
726f69d327 sysctl: return -EINVAL if val violates minmax
[ Upstream commit e260ad01f0aa9e96b5386d5cd7184afd949dc457 ]

Currently when userspace gives us a values that overflow e.g.  file-max
and other callers of __do_proc_doulongvec_minmax() we simply ignore the
new value and leave the current value untouched.

This can be problematic as it gives the illusion that the limit has
indeed be bumped when in fact it failed.  This commit makes sure to
return EINVAL when an overflow is detected.  Please note that this is a
userspace facing change.

Link: http://lkml.kernel.org/r/20190210203943.8227-4-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Waiman Long <longman@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-06-22 08:17:11 +02:00
Greg Kroah-Hartman
25d7d0eb82 This is the 4.9.171 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlzEBhgACgkQONu9yGCS
 aT5v2w/+Mh9eJL1WjwqUfGzbel0DXZjEcDgPV3t0fwU+GeovkKVanvtK7vnODzou
 KR5m4DENfmxu+tMeaGjlzDWeHJ1D+G0tl4Z0BK9EH7Vj2CXvCll43jQE8TLsvTS6
 9ypHUKSs2lA/1HGbfcn1GZCJ55wz9YkjlkGxOb+uDOPtB6C4q0pKkYLWpHbAzaaK
 PwFCxF9zmzBQlVm/77GuAB1GPdzn5b0jJ8eG9Vlx0hxwa61orsB9Qjpj3MNtyzGS
 /pBn1jQO6bOpmiy+4v3+Baydm5Gs77hkFnB4H/TvXq9qVkYKpL3DZFJuzFQ+0x4I
 p8uytuGuXPRcmE7WVOBqYM1VZSqrvbIbqtw14yso8mGHFa5Tq7jMvjWXjGuC99bU
 Ui1Ebn5tM+S4Cyu83F6PWp6irhLUxR0Ud1a43AYKPTaI9kEHZdR71GuclpZsEPPH
 C2PgbZppntV9h6vzkehw7gmq+06qelMcNplKbVVHtyEymdj6YgwOln7IYnlAeGYt
 DrX2UIOVXFauZvBbRekryRdoxrp4enwGxc1mjG8hltoOanvyri0kUnLvLpPZKlIj
 rpZbpeIaEylEgWyeprAjjMOE98xm0cu1HqKHXKvVVe7WfZi/y4skbT3Y73Vmfm0z
 bTeJNQ0SLZwzoS82YEgpTwKh5ykMLRWJmrnKAI4GZcv6A4NRYGI=
 =LAgp
 -----END PGP SIGNATURE-----

Merge 4.9.171 into android-4.9-q

Changes in 4.9.171
	bonding: fix event handling for stacked bonds
	net: atm: Fix potential Spectre v1 vulnerabilities
	net: bridge: fix per-port af_packet sockets
	net: bridge: multicast: use rcu to access port list from br_multicast_start_querier
	net: fou: do not use guehdr after iptunnel_pull_offloads in gue_udp_recv
	tcp: tcp_grow_window() needs to respect tcp_space()
	team: set slave to promisc if team is already in promisc mode
	vhost: reject zero size iova range
	ipv4: recompile ip options in ipv4_link_failure
	ipv4: ensure rcu_read_lock() in ipv4_link_failure()
	crypto: crypto4xx - properly set IV after de- and encrypt
	mmc: sdhci: Fix data command CRC error handling
	modpost: file2alias: go back to simple devtable lookup
	modpost: file2alias: check prototype of handler
	tpm/tpm_i2c_atmel: Return -E2BIG when the transfer is incomplete
	CIFS: keep FileInfo handle live during oplock break
	KVM: x86: Don't clear EFER during SMM transitions for 32-bit vCPU
	staging: iio: ad7192: Fix ad7193 channel address
	iio/gyro/bmg160: Use millidegrees for temperature scale
	iio: ad_sigma_delta: select channel when reading register
	iio: adc: at91: disable adc channel interrupt in timeout case
	io: accel: kxcjk1013: restore the range after resume.
	staging: comedi: vmk80xx: Fix use of uninitialized semaphore
	staging: comedi: vmk80xx: Fix possible double-free of ->usb_rx_buf
	staging: comedi: ni_usb6501: Fix use of uninitialized mutex
	staging: comedi: ni_usb6501: Fix possible double-free of ->usb_rx_buf
	ALSA: core: Fix card races between register and disconnect
	Revert "scsi: fcoe: clear FC_RP_STARTED flags when receiving a LOGO"
	Revert "svm: Fix AVIC incomplete IPI emulation"
	crypto: x86/poly1305 - fix overflow during partial reduction
	arm64: futex: Restore oldval initialization to work around buggy compilers
	x86/kprobes: Verify stack frame on kretprobe
	kprobes: Mark ftrace mcount handler functions nokprobe
	kprobes: Fix error check when reusing optimized probes
	rt2x00: do not increment sequence number while re-transmitting
	mac80211: do not call driver wake_tx_queue op during reconfig
	perf/x86/amd: Add event map for AMD Family 17h
	Revert "kbuild: use -Oz instead of -Os when using clang"
	sched/fair: Limit sched_cfs_period_timer() loop to avoid hard lockup
	device_cgroup: fix RCU imbalance in error case
	mm/vmstat.c: fix /proc/vmstat format for CONFIG_DEBUG_TLBFLUSH=y CONFIG_SMP=n
	ALSA: info: Fix racy addition/deletion of nodes
	percpu: stop printing kernel addresses
	i2c-hid: properly terminate i2c_hid_dmi_desc_override_table[] array
	Revert "locking/lockdep: Add debug_locks check in __lock_downgrade()"
	kernel/sysctl.c: fix out-of-bounds access when setting file-max
	Linux 4.9.171

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2019-04-30 13:09:48 +02:00
Will Deacon
3141fcc89f kernel/sysctl.c: fix out-of-bounds access when setting file-max
commit 9002b21465fa4d829edfc94a5a441005cffaa972 upstream.

Commit 32a5ad9c2285 ("sysctl: handle overflow for file-max") hooked up
min/max values for the file-max sysctl parameter via the .extra1 and
.extra2 fields in the corresponding struct ctl_table entry.

Unfortunately, the minimum value points at the global 'zero' variable,
which is an int.  This results in a KASAN splat when accessed as a long
by proc_doulongvec_minmax on 64-bit architectures:

  | BUG: KASAN: global-out-of-bounds in __do_proc_doulongvec_minmax+0x5d8/0x6a0
  | Read of size 8 at addr ffff2000133d1c20 by task systemd/1
  |
  | CPU: 0 PID: 1 Comm: systemd Not tainted 5.1.0-rc3-00012-g40b114779944 #2
  | Hardware name: linux,dummy-virt (DT)
  | Call trace:
  |  dump_backtrace+0x0/0x228
  |  show_stack+0x14/0x20
  |  dump_stack+0xe8/0x124
  |  print_address_description+0x60/0x258
  |  kasan_report+0x140/0x1a0
  |  __asan_report_load8_noabort+0x18/0x20
  |  __do_proc_doulongvec_minmax+0x5d8/0x6a0
  |  proc_doulongvec_minmax+0x4c/0x78
  |  proc_sys_call_handler.isra.19+0x144/0x1d8
  |  proc_sys_write+0x34/0x58
  |  __vfs_write+0x54/0xe8
  |  vfs_write+0x124/0x3c0
  |  ksys_write+0xbc/0x168
  |  __arm64_sys_write+0x68/0x98
  |  el0_svc_common+0x100/0x258
  |  el0_svc_handler+0x48/0xc0
  |  el0_svc+0x8/0xc
  |
  | The buggy address belongs to the variable:
  |  zero+0x0/0x40
  |
  | Memory state around the buggy address:
  |  ffff2000133d1b00: 00 00 00 00 00 00 00 00 fa fa fa fa 04 fa fa fa
  |  ffff2000133d1b80: fa fa fa fa 04 fa fa fa fa fa fa fa 04 fa fa fa
  | >ffff2000133d1c00: fa fa fa fa 04 fa fa fa fa fa fa fa 00 00 00 00
  |                                ^
  |  ffff2000133d1c80: fa fa fa fa 00 fa fa fa fa fa fa fa 00 00 00 00
  |  ffff2000133d1d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Fix the splat by introducing a unsigned long 'zero_ul' and using that
instead.

Link: http://lkml.kernel.org/r/20190403153409.17307-1-will.deacon@arm.com
Fixes: 32a5ad9c2285 ("sysctl: handle overflow for file-max")
Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Christian Brauner <christian@brauner.io>
Cc: Kees Cook <keescook@chromium.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-27 09:34:47 +02:00
Greg Kroah-Hartman
417da48bd3 This is the 4.9.168 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlynutMACgkQONu9yGCS
 aT7lVA//QFJ8IsKQt9GCsGqVrJR/sI6HkCY9axyFNnSJSh67QGkZdOBd4W4kXDgW
 T0/WyhvYwhdDagm81sThKYsQo2WrPEuQ1KfarK9VZgnDVkmdkY+IJEg90QJUoDGg
 ucGsZs5S91cfDLC7UyfWEJLKJ9pqaPA4+jV0aHbcHiIzKCZqpBUwrZ+sa8UZXOTP
 a8BPz3PM7EjHLLGstPzN1ZP8mGsOwgR/Hy9Fy7hkX+SRor8sAIdRs5uLkH4qr52G
 mdbP42v3CtqyCXHTRaSVqXVak1U3i9IcD1zFY3JSQyUUo+QgHBzMf6ZGANudo7oM
 hI8Q9fKl095P9lYFp8zb1nH4OoFbS1P6gUlUtl9qBPWx12EUbXic9XrcEzmk8bPH
 E1uao/TZRymbadgRYiZZp4wIyG0XHWhY2aQ8AvKgMN5ddlqYfpOCY+gFuo5OPosC
 /vusNZgy4RshbLi+NNpx+5HluMJQJaU7NLs6sHud9CsmeQYx41Hqu0v+VOBuvvJ+
 iRkoPB6jw8ekJWKGQVm/eT2Qb0t8VqlPWTSWkbZEjWqeb/3dhJHlsDox1n/DuPgA
 mBPkOHPYfZKO/2uNgiFLLb5FZA9HjMbyy6l4jogUhEeQkMc4bM2h3TXafRdmca8o
 3/ElCDte0h/8uIPaqj8hprCK1/DunauQP3F4wA75XnKzU5C/FNw=
 =NJEW
 -----END PGP SIGNATURE-----

Merge 4.9.168 into android-4.9-q

Changes in 4.9.168
	arm64: debug: Don't propagate UNKNOWN FAR into si_code for debug signals
	arm64: debug: Ensure debug handlers check triggering exception level
	ext4: cleanup bh release code in ext4_ind_remove_space()
	lib/int_sqrt: optimize initial value compute
	tty/serial: atmel: Add is_half_duplex helper
	tty/serial: atmel: RS485 HD w/DMA: enable RX after TX is stopped
	mm: mempolicy: make mbind() return -EIO when MPOL_MF_STRICT is specified
	i2c: core-smbus: prevent stack corruption on read I2C_BLOCK_DATA
	CIFS: fix POSIX lock leak and invalid ptr deref
	h8300: use cc-cross-prefix instead of hardcoding h8300-unknown-linux-
	tracing: kdb: Fix ftdump to not sleep
	gpio: gpio-omap: fix level interrupt idling
	include/linux/relay.h: fix percpu annotation in struct rchan
	sysctl: handle overflow for file-max
	enic: fix build warning without CONFIG_CPUMASK_OFFSTACK
	scsi: hisi_sas: Set PHY linkrate when disconnected
	mm/cma.c: cma_declare_contiguous: correct err handling
	mm/page_ext.c: fix an imbalance with kmemleak
	mm/vmalloc.c: fix kernel BUG at mm/vmalloc.c:512!
	mm/slab.c: kmemleak no scan alien caches
	ocfs2: fix a panic problem caused by o2cb_ctl
	f2fs: do not use mutex lock in atomic context
	fs/file.c: initialize init_files.resize_wait
	cifs: use correct format characters
	dm thin: add sanity checks to thin-pool and external snapshot creation
	cifs: Fix NULL pointer dereference of devname
	jbd2: fix invalid descriptor block checksum
	fs: fix guard_bio_eod to check for real EOD errors
	tools lib traceevent: Fix buffer overflow in arg_eval
	wil6210: check null pointer in _wil_cfg80211_merge_extra_ies
	crypto: crypto4xx - add missing of_node_put after of_device_is_available
	usb: chipidea: Grab the (legacy) USB PHY by phandle first
	scsi: core: replace GFP_ATOMIC with GFP_KERNEL in scsi_scan.c
	coresight: etm4x: Add support to enable ETMv4.2
	ARM: 8840/1: use a raw_spinlock_t in unwind
	iommu/io-pgtable-arm-v7s: Only kmemleak_ignore L2 tables
	mmc: omap: fix the maximum timeout setting
	e1000e: Fix -Wformat-truncation warnings
	mlxsw: spectrum: Avoid -Wformat-truncation warnings
	IB/mlx4: Increase the timeout for CM cache
	scsi: megaraid_sas: return error when create DMA pool failed
	perf test: Fix failure of 'evsel-tp-sched' test on s390
	SoC: imx-sgtl5000: add missing put_device()
	media: sh_veu: Correct return type for mem2mem buffer helpers
	media: s5p-jpeg: Correct return type for mem2mem buffer helpers
	media: s5p-g2d: Correct return type for mem2mem buffer helpers
	media: mx2_emmaprp: Correct return type for mem2mem buffer helpers
	vfs: fix preadv64v2 and pwritev64v2 compat syscalls with offset == -1
	HID: intel-ish-hid: avoid binding wrong ishtp_cl_device
	leds: lp55xx: fix null deref on firmware load failure
	iwlwifi: pcie: fix emergency path
	ACPI / video: Refactor and fix dmi_is_desktop()
	kprobes: Prohibit probing on bsearch()
	ARM: 8833/1: Ensure that NEON code always compiles with Clang
	ALSA: PCM: check if ops are defined before suspending PCM
	usb: f_fs: Avoid crash due to out-of-scope stack ptr access
	bcache: fix input overflow to cache set sysfs file io_error_halflife
	bcache: fix input overflow to sequential_cutoff
	bcache: improve sysfs_strtoul_clamp()
	genirq: Avoid summation loops for /proc/stat
	iw_cxgb4: fix srqidx leak during connection abort
	fbdev: fbmem: fix memory access if logo is bigger than the screen
	cdrom: Fix race condition in cdrom_sysctl_register
	e1000e: fix cyclic resets at link up with active tx
	ASoC: fsl-asoc-card: fix object reference leaks in fsl_asoc_card_probe
	efi/memattr: Don't bail on zero VA if it equals the region's PA
	ARM: dts: lpc32xx: Remove leading 0x and 0s from bindings notation
	soc: qcom: gsbi: Fix error handling in gsbi_probe()
	mt7601u: bump supported EEPROM version
	ARM: avoid Cortex-A9 livelock on tight dmb loops
	tty: increase the default flip buffer limit to 2*640K
	powerpc/pseries: Perform full re-add of CPU for topology update post-migration
	media: mt9m111: set initial frame size other than 0x0
	hwrng: virtio - Avoid repeated init of completion
	soc/tegra: fuse: Fix illegal free of IO base address
	HID: intel-ish: ipc: handle PIMR before ish_wakeup also clear PISR busy_clear bit
	hpet: Fix missing '=' character in the __setup() code of hpet_mmap_enable
	dmaengine: imx-dma: fix warning comparison of distinct pointer types
	dmaengine: qcom_hidma: assign channel cookie correctly
	netfilter: physdev: relax br_netfilter dependency
	media: s5p-jpeg: Check for fmt_ver_flag when doing fmt enumeration
	regulator: act8865: Fix act8600_sudcdc_voltage_ranges setting
	drm/nouveau: Stop using drm_crtc_force_disable
	x86/build: Specify elf_i386 linker emulation explicitly for i386 objects
	selinux: do not override context on context mounts
	wlcore: Fix memory leak in case wl12xx_fetch_firmware failure
	x86/build: Mark per-CPU symbols as absolute explicitly for LLD
	dmaengine: tegra: avoid overflow of byte tracking
	drm/dp/mst: Configure no_stop_bit correctly for remote i2c xfers
	ACPI / video: Extend chassis-type detection with a "Lunch Box" check
	Linux 4.9.168

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2019-04-05 22:37:25 +02:00
Christian Brauner
6b65c268df sysctl: handle overflow for file-max
[ Upstream commit 32a5ad9c22852e6bd9e74bdec5934ef9d1480bc5 ]

Currently, when writing

  echo 18446744073709551616 > /proc/sys/fs/file-max

/proc/sys/fs/file-max will overflow and be set to 0.  That quickly
crashes the system.

This commit sets the max and min value for file-max.  The max value is
set to long int.  Any higher value cannot currently be used as the
percpu counters are long ints and not unsigned integers.

Note that the file-max value is ultimately parsed via
__do_proc_doulongvec_minmax().  This function does not report error when
min or max are exceeded.  Which means if a value largen that long int is
written userspace will not receive an error instead the old value will be
kept.  There is an argument to be made that this should be changed and
__do_proc_doulongvec_minmax() should return an error when a dedicated min
or max value are exceeded.  However this has the potential to break
userspace so let's defer this to an RFC patch.

Link: http://lkml.kernel.org/r/20190107222700.15954-3-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Waiman Long <longman@redhat.com>
[christian@brauner.io: v4]
  Link: http://lkml.kernel.org/r/20190210203943.8227-3-christian@brauner.io
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-05 22:29:06 +02:00
Greg Kroah-Hartman
72b54df9a5 This is the 4.9.165 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlyWJGwACgkQONu9yGCS
 aT6ESRAA12tEs9A/BRn3YolJ4ePUIvf5lG4EzFQLwCaPyx1OGaJInLZBWIo6PLCz
 DEfWsiE5iO1JtGNbvMah/5SXy7AxocmfFPkGRiMJZASyFflua69HaSCqoIhmqF+G
 qWfEC6YvhGFWBfmasNDbn1JhQxsXBJw6Hy5xRsW+Jdp5c2WH5Jao5d1XckZVbASo
 QJg+nqAOWJtbjdCe71uQSiLM5YEk5AjakC+A3tDGPCZlTy+X2EpBogjW0aZ8bq1n
 bVjcwhbTLO0o+yle2Y0CBUdcFw1pmx8MSmTp/gu8q7Q0fb4OrwisRCkomZQC3zdC
 Vrea9quNQnB2bdx2foDFE7yOBVU5ehLmoYUjXFxLxhqSP1pTMXHo77rMusOZbtuF
 uWjSrRHaD4U/CPn4CJWcEO+L5WHIwBKbaOcpRgzlLU9yaoWbRjL2XvezSo2m+8V5
 snger3zYy7Z5yQZQYscGvMHc7bQWAgbCtmGubshm+yrS5sVDmzHiyQ1iBGQ86qsP
 v1CYZQGYyZbRhocTH6YpRPuoJUecieaO1U8XsXPcKHHR0pXNr/yUJ1gQmrxuRDjw
 SBLMmKnrtQTOOfra1Qi77W2VxBL9rT/3BdauPRkKbIuuen4IHoE9UxN8Oxcp37yF
 mmHO5QrQllVPdkS7oFwBmfrA7nANfgXeRXjfV6hpNyunHhwNGvE=
 =oels
 -----END PGP SIGNATURE-----

Merge 4.9.165 into android-4.9

Changes in 4.9.165
	media: videobuf2-v4l2: drop WARN_ON in vb2_warn_zero_bytesused()
	9p: use inode->i_lock to protect i_size_write() under 32-bit
	9p/net: fix memory leak in p9_client_create
	ASoC: fsl_esai: fix register setting issue in RIGHT_J mode
	iio: adc: exynos-adc: Fix NULL pointer exception on unbind
	stm class: Fix an endless loop in channel allocation
	crypto: caam - fixed handling of sg list
	crypto: ahash - fix another early termination in hash walk
	gpu: ipu-v3: Fix i.MX51 CSI control registers offset
	gpu: ipu-v3: Fix CSI offsets for imx53
	s390/dasd: fix using offset into zero size array error
	ARM: OMAP2+: Variable "reg" in function omap4_dsi_mux_pads() could be uninitialized
	Input: cap11xx - switch to using set_brightness_blocking()
	Input: matrix_keypad - use flush_delayed_work()
	floppy: check_events callback should not return a negative number
	mm/gup: fix gup_pmd_range() for dax
	mm: page_alloc: fix ref bias in page_frag_alloc() for 1-byte allocs
	net: hns: Fix object reference leaks in hns_dsaf_roce_reset()
	i2c: cadence: Fix the hold bit setting
	Input: st-keyscan - fix potential zalloc NULL dereference
	clk: sunxi: A31: Fix wrong AHB gate number
	ARM: 8824/1: fix a migrating irq bug when hotplug cpu
	assoc_array: Fix shortcut creation
	scsi: libiscsi: Fix race between iscsi_xmit_task and iscsi_complete_task
	net: systemport: Fix reception of BPDUs
	pinctrl: meson: meson8b: fix the sdxc_a data 1..3 pins
	qmi_wwan: apply SET_DTR quirk to Sierra WP7607
	net: mv643xx_eth: disable clk on error path in mv643xx_eth_shared_probe()
	ASoC: topology: free created components in tplg load error
	arm64: Relax GIC version check during early boot
	net: marvell: mvneta: fix DMA debug warning
	tmpfs: fix link accounting when a tmpfile is linked in
	ARCv2: lib: memcpy: fix doing prefetchw outside of buffer
	ARC: uacces: remove lp_start, lp_end from clobber list
	phonet: fix building with clang
	mac80211_hwsim: propagate genlmsg_reply return code
	net: thunderx: make CFG_DONE message to run through generic send-ack sequence
	nfp: bpf: fix code-gen bug on BPF_ALU | BPF_XOR | BPF_K
	nfp: bpf: fix ALU32 high bits clearance bug
	net: set static variable an initial value in atl2_probe()
	tmpfs: fix uninitialized return value in shmem_link
	stm class: Prevent division by zero
	libnvdimm/label: Clear 'updating' flag after label-set update
	libnvdimm/pmem: Honor force_raw for legacy pmem regions
	libnvdimm: Fix altmap reservation size calculation
	crypto: hash - set CRYPTO_TFM_NEED_KEY if ->setkey() fails
	crypto: arm64/aes-ccm - fix logical bug in AAD MAC handling
	CIFS: Do not reset lease state to NONE on lease break
	CIFS: Fix read after write for files with read caching
	tracing: Use strncpy instead of memcpy for string keys in hist triggers
	tracing: Do not free iter->trace in fail path of tracing_open_pipe()
	ACPI / device_sysfs: Avoid OF modalias creation for removed device
	spi: ti-qspi: Fix mmap read when more than one CS in use
	spi: pxa2xx: Setup maximum supported DMA transfer length
	regulator: s2mps11: Fix steps for buck7, buck8 and LDO35
	regulator: s2mpa01: Fix step values for some LDOs
	clocksource/drivers/exynos_mct: Move one-shot check from tick clear to ISR
	clocksource/drivers/exynos_mct: Clear timer interrupt when shutdown
	s390/virtio: handle find on invalid queue gracefully
	scsi: virtio_scsi: don't send sc payload with tmfs
	scsi: sd: Optimal I/O size should be a multiple of physical block size
	scsi: target/iscsi: Avoid iscsit_release_commands_from_conn() deadlock
	fs/devpts: always delete dcache dentry-s in dput()
	splice: don't merge into linked buffers
	m68k: Add -ffreestanding to CFLAGS
	btrfs: ensure that a DUP or RAID1 block group has exactly two stripes
	Btrfs: fix corruption reading shared and compressed extents after hole punching
	crypto: pcbc - remove bogus memcpy()s with src == dest
	libertas_tf: don't set URB_ZERO_PACKET on IN USB transfer
	cpufreq: tegra124: add missing of_node_put()
	cpufreq: pxa2xx: remove incorrect __init annotation
	ext4: fix crash during online resizing
	ext2: Fix underflow in ext2_max_size()
	clk: clk-twl6040: Fix imprecise external abort for pdmclk
	clk: ingenic: Fix round_rate misbehaving with non-integer dividers
	clk: ingenic: Fix doc of ingenic_cgu_div_info
	nfit: acpi_nfit_ctl(): Check out_obj->type in the right place
	mm: hwpoison: fix thp split handing in soft_offline_in_use_page()
	mm/vmalloc: fix size check for remap_vmalloc_range_partial()
	kernel/sysctl.c: add missing range check in do_proc_dointvec_minmax_conv
	device property: Fix the length used in PROPERTY_ENTRY_STRING()
	intel_th: Don't reference unassigned outputs
	parport_pc: fix find_superio io compare code, should use equal test.
	i2c: tegra: fix maximum transfer size
	drm/i915: Relax mmap VMA check
	serial: uartps: Fix stuck ISR if RX disabled with non-empty FIFO
	serial: 8250_of: assume reg-shift of 2 for mrvl,mmp-uart
	8250: FIX Fourth port offset of Pericom PI7C9X7954 boards
	serial: 8250_pci: Fix number of ports for ACCES serial cards
	serial: 8250_pci: Have ACCES cards that use the four port Pericom PI7C9X7954 chip use the pci_pericom_setup()
	jbd2: clear dirty flag when revoking a buffer from an older transaction
	jbd2: fix compile warning when using JBUFFER_TRACE
	powerpc/32: Clear on-stack exception marker upon exception return
	powerpc/wii: properly disable use of BATs when requested.
	powerpc/powernv: Make opal log only readable by root
	powerpc/83xx: Also save/restore SPRG4-7 during suspend
	powerpc: Fix 32-bit KVM-PR lockup and host crash with MacOS guest
	powerpc/ptrace: Simplify vr_get/set() to avoid GCC warning
	ARM: s3c24xx: Fix boolean expressions in osiris_dvs_notify
	dm: fix to_sector() for 32bit
	NFS: Fix I/O request leakages
	NFS: Fix an I/O request leakage in nfs_do_recoalesce
	NFS: Don't recoalesce on error in nfs_pageio_complete_mirror()
	nfsd: fix memory corruption caused by readdir
	nfsd: fix wrong check in write_v4_end_grace()
	PM / wakeup: Rework wakeup source timer cancellation
	bcache: never writeback a discard operation
	perf intel-pt: Fix CYC timestamp calculation after OVF
	perf auxtrace: Define auxtrace record alignment
	perf intel-pt: Fix overlap calculation for padding
	perf intel-pt: Fix divide by zero when TSC is not available
	md: Fix failed allocation of md_register_thread
	rcu: Do RCU GP kthread self-wakeup from softirq and interrupt
	media: uvcvideo: Avoid NULL pointer dereference at the end of streaming
	drm/radeon/evergreen_cs: fix missing break in switch statement
	KVM: nVMX: Sign extend displacements of VMX instr's mem operands
	KVM: nVMX: Ignore limit checks on VMX instructions using flat segments
	KVM: X86: Fix residual mmio emulation request to userspace
	Linux 4.9.165

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2019-03-23 13:34:48 +01:00
Zev Weiss
45a67f153b kernel/sysctl.c: add missing range check in do_proc_dointvec_minmax_conv
commit 8cf7630b29701d364f8df4a50e4f1f5e752b2778 upstream.

This bug has apparently existed since the introduction of this function
in the pre-git era (4500e91754d3 in Thomas Gleixner's history.git,
"[NET]: Add proc_dointvec_userhz_jiffies, use it for proper handling of
neighbour sysctls.").

As a minimal fix we can simply duplicate the corresponding check in
do_proc_dointvec_conv().

Link: http://lkml.kernel.org/r/20190207123426.9202-3-zev@bewilderbeest.net
Signed-off-by: Zev Weiss <zev@bewilderbeest.net>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: <stable@vger.kernel.org>	[2.6.2+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-03-23 13:19:49 +01:00
Greg Kroah-Hartman
8817a280ad This is the 4.9.156 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlxjFC8ACgkQONu9yGCS
 aT5lqRAAtBlGUzPIwEdReR2998K5wZ5mXqtT887ByrEsHNxg51BUfQv+7gsfrUlF
 PMGEjH93D8qJhPEg13vxVljl6EDtPwXZlxB+ECslot6+aOGkhpwdp9//9QDbF15q
 gZg7PBhE2Ma+ULs7DNNt1WlgVELZaO8Y7rMiJ5IgHcLy6MA1LVckA+TRC4pYNjqu
 plSMHKaqZn7zYiSzc7SnTsQPihXUQIoBq1D4Ho0Yl/f2CAJL7qZmG7Jqzc5LL+Fq
 aHuk+YqRAMt1BgduQ6q6iGtirJOhBXNRJQB2Kd3fJ+wt//QsSUaHiY6+CgRHjmwV
 tIpCWvpUAdX67P2tMQYPO1ZqscR616a8cO7ZtWA6kODsiLQ95bRBaRo3LMG99G7x
 55PFSQtbNAD/eXmDd6OKPeTovuF1U8KylvNTtcYJOsPStIBiC/9uYlvqVnfkLDZ6
 GYnTO0xgbhxkgfNXgpVyR1vmsiCFvP3FASyPxaIf2W52brozhHGF+OwtbZcSFAzH
 ZSlee0L3ItOVY4BgoDpVqAHp/k8Uso5g46SYFburQ2Eg2CtxuZpL8B7MadEegapA
 He+7LhwLBLR8aPWzHDR0rM9m/yZ8rPeyhSI5FoE2KZ+Ov9XtGYnpRjI5y7sU6xhp
 HP7iDN+wFqSDE02PZq4FQKGiSJN8/RHGL23MOqx8F/BGAfucXlw=
 =xb2h
 -----END PGP SIGNATURE-----

Merge 4.9.156 into android-4.9

Changes in 4.9.156
	drm/bufs: Fix Spectre v1 vulnerability
	staging: iio: adc: ad7280a: handle error from __ad7280_read32()
	ASoC: Intel: mrfld: fix uninitialized variable access
	gpu: ipu-v3: image-convert: Prevent race between run and unprepare
	ath9k: dynack: use authentication messages for 'late' ack
	scsi: lpfc: Correct LCB RJT handling
	ARM: 8808/1: kexec:offline panic_smp_self_stop CPU
	dlm: Don't swamp the CPU with callbacks queued during recovery
	x86/PCI: Fix Broadcom CNB20LE unintended sign extension (redux)
	powerpc/pseries: add of_node_put() in dlpar_detach_node()
	drm/vc4: ->x_scaling[1] should never be set to VC4_SCALING_NONE
	serial: fsl_lpuart: clear parity enable bit when disable parity
	ptp: check gettime64 return code in PTP_SYS_OFFSET ioctl
	staging:iio:ad2s90: Make probe handle spi_setup failure
	staging: iio: ad7780: update voltage on read
	ARM: OMAP2+: hwmod: Fix some section annotations
	modpost: validate symbol names also in find_elf_symbol
	perf tools: Add Hygon Dhyana support
	soc/tegra: Don't leak device tree node reference
	media: mtk-vcodec: Release device nodes in mtk_vcodec_init_enc_pm()
	dmaengine: xilinx_dma: Remove __aligned attribute on zynqmp_dma_desc_ll
	iio: accel: kxcjk1013: Add KIOX010A ACPI Hardware-ID
	media: adv*/tc358743/ths8200: fill in min width/height/pixelclock
	f2fs: move dir data flush to write checkpoint process
	f2fs: fix wrong return value of f2fs_acl_create
	sunvdc: Do not spin in an infinite loop when vio_ldc_send() returns EAGAIN
	soc: bcm: brcmstb: Don't leak device tree node reference
	nfsd4: fix crash on writing v4_end_grace before nfsd startup
	Thermal: do not clear passive state during system sleep
	firmware/efi: Add NULL pointer checks in efivars API functions
	arm64: ftrace: don't adjust the LR value
	ARM: dts: mmp2: fix TWSI2
	x86/fpu: Add might_fault() to user_insn()
	media: DaVinci-VPBE: fix error handling in vpbe_initialize()
	smack: fix access permissions for keyring
	usb: hub: delay hub autosuspend if USB3 port is still link training
	timekeeping: Use proper seqcount initializer
	clk: sunxi-ng: a33: Set CLK_SET_RATE_PARENT for all audio module clocks
	iommu/amd: Fix amd_iommu=force_isolation
	ARM: dts: Fix OMAP4430 SDP Ethernet startup
	mips: bpf: fix encoding bug for mm_srlv32_op
	iommu/arm-smmu: Add support for qcom,smmu-v2 variant
	iommu/arm-smmu-v3: Use explicit mb() when moving cons pointer
	sata_rcar: fix deferred probing
	clk: imx6sl: ensure MMDC CH0 handshake is bypassed
	cpuidle: big.LITTLE: fix refcount leak
	i2c-axxia: check for error conditions first
	udf: Fix BUG on corrupted inode
	ARM: pxa: avoid section mismatch warning
	ASoC: fsl: Fix SND_SOC_EUKREA_TLV320 build error on i.MX8M
	memstick: Prevent memstick host from getting runtime suspended during card detection
	tty: serial: samsung: Properly set flags in autoCTS mode
	perf header: Fix unchecked usage of strncpy()
	perf probe: Fix unchecked usage of strncpy()
	arm64: KVM: Skip MMIO insn after emulation
	powerpc/uaccess: fix warning/error with access_ok()
	mac80211: fix radiotap vendor presence bitmap handling
	xfrm6_tunnel: Fix spi check in __xfrm6_tunnel_alloc_spi
	Bluetooth: Fix unnecessary error message for HCI request completion
	scsi: smartpqi: correct host serial num for ssa
	scsi: smartpqi: correct volume status
	cw1200: Fix concurrency use-after-free bugs in cw1200_hw_scan()
	drbd: narrow rcu_read_lock in drbd_sync_handshake
	drbd: disconnect, if the wrong UUIDs are attached on a connected peer
	drbd: skip spurious timeout (ping-timeo) when failing promote
	drbd: Avoid Clang warning about pointless switch statment
	video: clps711x-fb: release disp device node in probe()
	fbdev: fbmem: behave better with small rotated displays and many CPUs
	i40e: define proper net_device::neigh_priv_len
	igb: Fix an issue that PME is not enabled during runtime suspend
	fbdev: fbcon: Fix unregister crash when more than one framebuffer
	pinctrl: meson: meson8: fix the GPIO function for the GPIOAO pins
	pinctrl: meson: meson8b: fix the GPIO function for the GPIOAO pins
	KVM: x86: svm: report MSR_IA32_MCG_EXT_CTL as unsupported
	NFS: nfs_compare_mount_options always compare auth flavors.
	hwmon: (lm80) fix a missing check of the status of SMBus read
	hwmon: (lm80) fix a missing check of bus read in lm80 probe
	seq_buf: Make seq_buf_puts() null-terminate the buffer
	crypto: ux500 - Use proper enum in cryp_set_dma_transfer
	crypto: ux500 - Use proper enum in hash_set_dma_transfer
	MIPS: ralink: Select CONFIG_CPU_MIPSR2_IRQ_VI on MT7620/8
	cifs: check ntwrk_buf_start for NULL before dereferencing it
	um: Avoid marking pages with "changed protection"
	niu: fix missing checks of niu_pci_eeprom_read
	f2fs: fix sbi->extent_list corruption issue
	scripts/decode_stacktrace: only strip base path when a prefix of the path
	ocfs2: don't clear bh uptodate for block read
	isdn: hisax: hfc_pci: Fix a possible concurrency use-after-free bug in HFCPCI_l1hw()
	gdrom: fix a memory leak bug
	fsl/fman: Use GFP_ATOMIC in {memac,tgec}_add_hash_mac_address()
	block/swim3: Fix -EBUSY error when re-opening device after unmount
	thermal: generic-adc: Fix adc to temp interpolation
	HID: lenovo: Add checks to fix of_led_classdev_register
	kernel/hung_task.c: break RCU locks based on jiffies
	proc/sysctl: fix return error for proc_doulongvec_minmax()
	fs/epoll: drop ovflist branch prediction
	exec: load_script: don't blindly truncate shebang string
	thermal: hwmon: inline helpers when CONFIG_THERMAL_HWMON is not set
	dccp: fool proof ccid_hc_[rt]x_parse_options()
	net: dp83640: expire old TX-skb
	rxrpc: bad unlock balance in rxrpc_recvmsg
	skge: potential memory corruption in skge_get_regs()
	rds: fix refcount bug in rds_sock_addref
	net: systemport: Fix WoL with password after deep sleep
	net/mlx5e: Force CHECKSUM_UNNECESSARY for short ethernet frames
	net: dsa: slave: Don't propagate flag changes on down slave interfaces
	enic: fix checksum validation for IPv6
	ALSA: compress: Fix stop handling on compressed capture streams
	ALSA: hda - Serialize codec registrations
	fuse: call pipe_buf_release() under pipe lock
	fuse: decrement NR_WRITEBACK_TEMP on the right page
	fuse: handle zero sized retrieve correctly
	dmaengine: bcm2835: Fix interrupt race on RT
	dmaengine: bcm2835: Fix abort of transactions
	dmaengine: imx-dma: fix wrong callback invoke
	usb: phy: am335x: fix race condition in _probe
	usb: gadget: udc: net2272: Fix bitwise and boolean operations
	usb: gadget: musb: fix short isoc packets with inventra dma
	scsi: aic94xx: fix module loading
	KVM: x86: work around leak of uninitialized stack contents (CVE-2019-7222)
	kvm: fix kvm_ioctl_create_device() reference counting (CVE-2019-6974)
	KVM: nVMX: unconditionally cancel preemption timer in free_nested (CVE-2019-7221)
	perf/x86/intel/uncore: Add Node ID mask
	x86/MCE: Initialize mce.bank in the case of a fatal error in mce_no_way_out()
	perf/core: Don't WARN() for impossible ring-buffer sizes
	perf tests evsel-tp-sched: Fix bitwise operator
	serial: fix race between flush_to_ldisc and tty_open
	oom, oom_reaper: do not enqueue same task twice
	PCI: vmd: Free up IRQs on suspend path
	IB/hfi1: Add limit test for RC/UC send via loopback
	perf/x86/intel: Delay memory deallocation until x86_pmu_dead_cpu()
	ath9k: dynack: make ewma estimation faster
	ath9k: dynack: check da->enabled first in sampling routines
	Linux 4.9.156

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2019-02-12 20:27:44 +01:00
Cheng Lin
0e5c7505a9 proc/sysctl: fix return error for proc_doulongvec_minmax()
[ Upstream commit 09be178400829dddc1189b50a7888495dd26aa84 ]

If the number of input parameters is less than the total parameters, an
EINVAL error will be returned.

For example, we use proc_doulongvec_minmax to pass up to two parameters
with kern_table:

{
	.procname       = "monitor_signals",
	.data           = &monitor_sigs,
	.maxlen         = 2*sizeof(unsigned long),
	.mode           = 0644,
	.proc_handler   = proc_doulongvec_minmax,
},

Reproduce:

When passing two parameters, it's work normal.  But passing only one
parameter, an error "Invalid argument"(EINVAL) is returned.

  [root@cl150 ~]# echo 1 2 > /proc/sys/kernel/monitor_signals
  [root@cl150 ~]# cat /proc/sys/kernel/monitor_signals
  1       2
  [root@cl150 ~]# echo 3 > /proc/sys/kernel/monitor_signals
  -bash: echo: write error: Invalid argument
  [root@cl150 ~]# echo $?
  1
  [root@cl150 ~]# cat /proc/sys/kernel/monitor_signals
  3       2
  [root@cl150 ~]#

The following is the result after apply this patch.  No error is
returned when the number of input parameters is less than the total
parameters.

  [root@cl150 ~]# echo 1 2 > /proc/sys/kernel/monitor_signals
  [root@cl150 ~]# cat /proc/sys/kernel/monitor_signals
  1       2
  [root@cl150 ~]# echo 3 > /proc/sys/kernel/monitor_signals
  [root@cl150 ~]# echo $?
  0
  [root@cl150 ~]# cat /proc/sys/kernel/monitor_signals
  3       2
  [root@cl150 ~]#

There are three processing functions dealing with digital parameters,
__do_proc_dointvec/__do_proc_douintvec/__do_proc_doulongvec_minmax.

This patch deals with __do_proc_doulongvec_minmax, just as
__do_proc_dointvec does, adding a check for parameters 'left'.  In
__do_proc_douintvec, its code implementation explicitly does not support
multiple inputs.

static int __do_proc_douintvec(...){
         ...
         /*
          * Arrays are not supported, keep this simple. *Do not* add
          * support for them.
          */
         if (vleft != 1) {
                 *lenp = 0;
                 return -EINVAL;
         }
         ...
}

So, just __do_proc_doulongvec_minmax has the problem.  And most use of
proc_doulongvec_minmax/proc_doulongvec_ms_jiffies_minmax just have one
parameter.

Link: http://lkml.kernel.org/r/1544081775-15720-1-git-send-email-cheng.lin130@zte.com.cn
Signed-off-by: Cheng Lin <cheng.lin130@zte.com.cn>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-02-12 19:44:59 +01:00
Greg Kroah-Hartman
db71418d4a This is the 4.9.142 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlwCSesACgkQONu9yGCS
 aT6l4xAA1dKYI5S/l/K6VswVDAR48BZY4hAVPrCOCuhWt6RXMa6rv9Jhqvlj8DTH
 njfntSSpQCkEUHdmG1uceTVtuitXKFWyIwQXVql73cdg9u7Jxd1uJ71B7dSzyqFw
 s712+HkKyp2O64KH/LpLjwEbImtVXTAzIXhIAxLjLdOEkBV7kNZMcVTFzKqPO/vX
 OKwZw65ro2+3VZiU5cfSVx6Oerbdjusan56dQTGAxH96JCo3+5mXjBiK83fmQf0e
 52zmesn4bSfC4083bwJpAx5wy6tBBZocDRMTCXJ6vvHMFL58QvZEnIdWpe2ECT3v
 rUIxOs5nNq6yFHwfn3VM+PzYhqz7WT08RBkr2KNTpaDOIQqFqMi7H4qVo2BfUVnv
 RKEUnyXe9hO5SNcVroNmM8qwda5tubA/EGPgLt+BFw2sUEZIcQ0VWduxSJtgxerH
 aSEvyEvT9551oXLg4LHARIJwNiKDk2Iq8UkANwL6+F0ZLD0sNzDNM+gysvNII7aW
 lkY6Z/sE0oGYWylq17i+wFxJ2PeLo84KJeZ3tqrnTbTuok4cjIWSmxPszDnq+E3f
 b9dMoDnECEO1DmM+OCKBeOkxzwbl92rkn3HaQxWQNFRhGzoGK1h4ECHSHyMCkZJw
 7n17JeT4QGrxhmkmO7qS2u1yjPiJWURDd0FDd1ns6Tm4soaljvs=
 =zLVF
 -----END PGP SIGNATURE-----

Merge 4.9.142 into android-4.9

Changes in 4.9.142
	usb: core: Fix hub port connection events lost
	usb: dwc3: core: Clean up ULPI device
	usb: xhci: fix timeout for transition from RExit to U0
	MAINTAINERS: Add Sasha as a stable branch maintainer
	gpio: don't free unallocated ida on gpiochip_add_data_with_key() error path
	iwlwifi: mvm: support sta_statistics() even on older firmware
	iwlwifi: mvm: fix regulatory domain update when the firmware starts
	brcmfmac: fix reporting support for 160 MHz channels
	tools/power/cpupower: fix compilation with STATIC=true
	v9fs_dir_readdir: fix double-free on p9stat_read error
	selinux: Add __GFP_NOWARN to allocation at str_read()
	bfs: add sanity check at bfs_fill_super()
	sctp: clear the transport of some out_chunk_list chunks in sctp_assoc_rm_peer
	gfs2: Don't leave s_fs_info pointing to freed memory in init_sbd
	llc: do not use sk_eat_skb()
	mm: don't warn about large allocations for slab
	drm/ast: change resolution may cause screen blurred
	drm/ast: fixed cursor may disappear sometimes
	drm/ast: Remove existing framebuffers before loading driver
	can: dev: can_get_echo_skb(): factor out non sending code to __can_get_echo_skb()
	can: dev: __can_get_echo_skb(): replace struct can_frame by canfd_frame to access frame length
	can: dev: __can_get_echo_skb(): Don't crash the kernel if can_priv::echo_skb is accessed out of bounds
	can: dev: __can_get_echo_skb(): print error message, if trying to echo non existing skb
	IB/core: Fix for core panic
	IB/hfi1: Eliminate races in the SDMA send error path
	usb: xhci: Prevent bus suspend if a port connect change or polling state is detected
	pinctrl: meson: fix pinconf bias disable
	KVM: PPC: Move and undef TRACE_INCLUDE_PATH/FILE
	cpufreq: imx6q: add return value check for voltage scale
	rtc: pcf2127: fix a kmemleak caused in pcf2127_i2c_gather_write
	floppy: fix race condition in __floppy_read_block_0()
	powerpc/io: Fix the IO workarounds code to work with Radix
	perf/x86/intel/uncore: Add more IMC PCI IDs for KabyLake and CoffeeLake CPUs
	SUNRPC: Fix a bogus get/put in generic_key_to_expire()
	kdb: Use strscpy with destination buffer size
	powerpc/numa: Suppress "VPHN is not supported" messages
	efi/arm: Revert deferred unmap of early memmap mapping
	tmpfs: make lseek(SEEK_DATA/SEK_HOLE) return ENXIO with a negative offset
	of: add helper to lookup compatible child node
	NFC: nfcmrvl_uart: fix OF child-node lookup
	net: bcmgenet: fix OF child-node lookup
	arm64: remove no-op -p linker flag
	ath10k: fix kernel panic due to race in accessing arvif list
	Input: xpad - add product ID for Xbox One S pad
	Input: xpad - fix Xbox One rumble stopping after 2.5 secs
	Input: xpad - correctly sort vendor id's
	Input: xpad - move reporting xbox one home button to common function
	Input: xpad - simplify error condition in init_output
	Input: xpad - don't depend on endpoint order
	Input: xpad - fix stuck mode button on Xbox One S pad
	Input: xpad - restore LED state after device resume
	Input: xpad - support some quirky Xbox One pads
	Input: xpad - sort supported devices by USB ID
	Input: xpad - sync supported devices with xboxdrv
	Input: xpad - add USB IDs for Mad Catz Brawlstick and Razer Sabertooth
	Input: xpad - sync supported devices with 360Controller
	Input: xpad - sync supported devices with XBCD
	Input: xpad - constify usb_device_id
	Input: xpad - fix PowerA init quirk for some gamepad models
	Input: xpad - validate USB endpoint type during probe
	Input: xpad - add support for PDP Xbox One controllers
	Input: xpad - add PDP device id 0x02a4
	Input: xpad - fix some coding style issues
	Input: xpad - avoid using __set_bit() for capabilities
	Input: xpad - add GPD Win 2 Controller USB IDs
	Input: xpad - fix GPD Win 2 controller name
	Input: xpad - add support for Xbox1 PDP Camo series gamepad
	cw1200: Don't leak memory if krealloc failes
	mwifiex: prevent register accesses after host is sleeping
	mwifiex: report error to PCIe for suspend failure
	mwifiex: Fix NULL pointer dereference in skb_dequeue()
	mwifiex: fix p2p device doesn't find in scan problem
	scsi: ufs: fix bugs related to null pointer access and array size
	scsi: ufshcd: Fix race between clk scaling and ungate work
	scsi: ufs: fix race between clock gating and devfreq scaling work
	scsi: ufshcd: release resources if probe fails
	include/linux/pfn_t.h: force '~' to be parsed as an unary operator
	tty: wipe buffer.
	tty: wipe buffer if not echoing data
	usb: xhci: fix uninitialized completion when USB3 port got wrong status
	sched/core: Allow __sched_setscheduler() in interrupts when PI is not used
	namei: allow restricted O_CREAT of FIFOs and regular files
	lan78xx: Read MAC address from DT if present
	s390/mm: Check for valid vma before zapping in gmap_discard
	net: ieee802154: 6lowpan: fix frag reassembly
	Revert "evm: Translate user/group ids relative to s_user_ns when computing HMAC"
	ima: always measure and audit files in policy
	EVM: Add support for portable signature format
	ima: re-introduce own integrity cache lock
	ima: re-initialize iint->atomic_flags
	Linux 4.9.142

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-12-01 10:00:35 +01:00
Salvatore Mesoraca
0c41beebcd namei: allow restricted O_CREAT of FIFOs and regular files
commit 30aba6656f61ed44cba445a3c0d38b296fa9e8f5 upstream.

Disallows open of FIFOs or regular files not owned by the user in world
writable sticky directories, unless the owner is the same as that of the
directory or the file is opened without the O_CREAT flag.  The purpose
is to make data spoofing attacks harder.  This protection can be turned
on and off separately for FIFOs and regular files via sysctl, just like
the symlinks/hardlinks protection.  This patch is based on Openwall's
"HARDEN_FIFO" feature by Solar Designer.

This is a brief list of old vulnerabilities that could have been prevented
by this feature, some of them even allow for privilege escalation:

CVE-2000-1134
CVE-2007-3852
CVE-2008-0525
CVE-2009-0416
CVE-2011-4834
CVE-2015-1838
CVE-2015-7442
CVE-2016-7489

This list is not meant to be complete.  It's difficult to track down all
vulnerabilities of this kind because they were often reported without any
mention of this particular attack vector.  In fact, before
hardlinks/symlinks restrictions, fifos/regular files weren't the favorite
vehicle to exploit them.

[s.mesoraca16@gmail.com: fix bug reported by Dan Carpenter]
  Link: https://lkml.kernel.org/r/20180426081456.GA7060@mwanda
  Link: http://lkml.kernel.org/r/1524829819-11275-1-git-send-email-s.mesoraca16@gmail.com
[keescook@chromium.org: drop pr_warn_ratelimited() in favor of audit changes in the future]
[keescook@chromium.org: adjust commit subjet]
Link: http://lkml.kernel.org/r/20180416175918.GA13494@beast
Signed-off-by: Salvatore Mesoraca <s.mesoraca16@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Suggested-by: Solar Designer <solar@openwall.com>
Suggested-by: Kees Cook <keescook@chromium.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Loic <hackurx@opensec.fr>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-01 09:44:25 +01:00
Greg Kroah-Hartman
a925dfbdda This is the 4.9.125 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAluPg6sACgkQONu9yGCS
 aT4T5BAAos3xo+rmcisBVP2hRLKeDKm8sa8dx3x96o5PUthbmOkeh3piGWB0QFbF
 EQApHitfxQgeQGaqdAaEhoRh/l0iVQ7rfxsXqOud+O9G+QQa/wQ1IPGbkW/eKsS3
 SFcILOhK37ssLe7+gsvD+H1RWXXD01XgYg2Ue00S6vSnuAuTFpTb45pavhyi6AzM
 Ab7vhEreIw90mpLpQS+eSvItgPqRIbM8pWmlMbloDKKvrZr9MeTFQFkIL10p0g9d
 bL1IpRGmsTiqULNOxwPtJJcxjDbdNfMoBNraAyMa028Wo+V2BdhWsqoFWVskNNAw
 iEdIHEgUUzGIID4nRYRE1Tmd5i6LXiwAPfS5IPCmp96ngvxtZeVUVyQEBtGKCuc3
 CIbw5sOmF+IfCPjcnKyeevRa06xGhsvNaw0zs3qIgfo+3ApycqvwI7b8YeHAJH4j
 gFcP9dVIrkfhxuTekABSIPY6oqDDby1iSRRDeQFXtWXGO8s2QTpDedC8rxFOYGe2
 KSySBUnced/QBojMpUB9xZXbIq07OPvY0FvHwAtA9NNFMuzM/jF2lthNLrIEmSs7
 OeE8SpRzbMl21xPwzv6i+hozl0v56TTc1w6zlQk7eDCFVFlulZTnRWZGfCvZeRmI
 39y+teQ5bprEypy06dGi1/m2cCFzYDwxAsMdJbUpAVxMQumcpmU=
 =hTgP
 -----END PGP SIGNATURE-----

Merge 4.9.125 into android-4.9

Changes in 4.9.125
	vti6: fix PMTU caching and reporting on xmit
	xfrm: fix missing dst_release() after policy blocking lbcast and multicast
	xfrm: free skb if nlsk pointer is NULL
	mac80211: add stations tied to AP_VLANs during hw reconfig
	nl80211: Add a missing break in parse_station_flags
	drm/bridge: adv7511: Reset registers on hotplug
	scsi: libiscsi: fix possible NULL pointer dereference in case of TMF
	drm/imx: imx-ldb: disable LDB on driver bind
	drm/imx: imx-ldb: check if channel is enabled before printing warning
	usb: gadget: r8a66597: Fix two possible sleep-in-atomic-context bugs in init_controller()
	usb: gadget: r8a66597: Fix a possible sleep-in-atomic-context bugs in r8a66597_queue()
	usb/phy: fix PPC64 build errors in phy-fsl-usb.c
	tools: usb: ffs-test: Fix build on big endian systems
	usb: gadget: f_uac2: fix endianness of 'struct cntrl_*_lay3'
	bpf, ppc64: fix unexpected r0=0 exit path inside bpf_xadd
	tools/power turbostat: fix -S on UP systems
	net: caif: Add a missing rcu_read_unlock() in caif_flow_cb
	qed: Fix possible race for the link state value.
	qed: Correct Multicast API to reflect existence of 256 approximate buckets.
	atl1c: reserve min skb headroom
	net: prevent ISA drivers from building on PPC32
	can: mpc5xxx_can: check of_iomap return before use
	i2c: davinci: Avoid zero value of CLKH
	perf/x86/amd/ibs: Don't access non-started event
	media: staging: omap4iss: Include asm/cacheflush.h after generic includes
	bnx2x: Fix invalid memory access in rss hash config path.
	qmi_wwan: fix interface number for DW5821e production firmware
	net: axienet: Fix double deregister of mdio
	x86/boot: Fix if_changed build flip/flop bug
	fscache: Allow cancelled operations to be enqueued
	cachefiles: Fix refcounting bug in backing-file read monitoring
	cachefiles: Wait rather than BUG'ing on "Unexpected object collision"
	selftests/ftrace: Add snapshot and tracing_on test case
	zswap: re-check zswap_is_full() after do zswap_shrink()
	tools/power turbostat: Read extended processor family from CPUID
	Revert "MIPS: BCM47XX: Enable 74K Core ExternalSync for PCIe erratum"
	enic: handle mtu change for vf properly
	arc: [plat-eznps] fix data type errors in platform headers
	arc: fix build errors in arc/include/asm/delay.h
	arc: fix type warnings in arc/mm/cache.c
	squashfs metadata 2: electric boogaloo
	Squashfs: Compute expected length from inode size rather than block length
	drivers: net: lmc: fix case value for target abort error
	memcg: remove memcg_cgroup::id from IDR on mem_cgroup_css_alloc() failure
	scsi: fcoe: drop frames in ELS LOGO error path
	scsi: fcoe: clear FC_RP_STARTED flags when receiving a LOGO
	scsi: vmw_pvscsi: Return DID_RESET for status SAM_STAT_COMMAND_TERMINATED
	mm/memory.c: check return value of ioremap_prot
	sched/sysctl: Check user input value of sysctl_sched_time_avg
	Cipso: cipso_v4_optptr enter infinite loop
	mei: don't update offset in write
	cifs: add missing debug entries for kconfig options
	cifs: check kmalloc before use
	smb3: enumerating snapshots was leaving part of the data off end
	smb3: Do not send SMB3 SET_INFO if nothing changed
	smb3: don't request leases in symlink creation and query
	kprobes/arm64: Fix %p uses in error messages
	arm64: mm: check for upper PAGE_SHIFT bits in pfn_valid()
	s390/kvm: fix deadlock when killed by oom
	ext4: check for NUL characters in extended attribute's name
	ext4: sysfs: print ext4_super_block fields as little-endian
	ext4: reset error code in ext4_find_entry in fallback
	staging: android: ion: fix ION_IOC_{MAP,SHARE} use-after-free
	KVM: arm/arm64: Skip updating PTE entry if no change
	KVM: arm/arm64: Skip updating PMD entry if no change
	sparc: kernel/pcic: silence gcc 7.x warning in pcibios_fixup_bus()
	x86/speculation/l1tf: Fix overflow in l1tf_pfn_limit() on 32bit
	x86/speculation/l1tf: Fix off-by-one error when warning that system has too much RAM
	x86/speculation/l1tf: Suggest what to do on systems with too much RAM
	x86/process: Re-export start_thread()
	KVM: x86: SVM: Call x86_spec_ctrl_set_guest/host() with interrupts disabled
	x86/kvm/vmx: Remove duplicate l1d flush definitions
	fuse: Don't access pipe->buffers without pipe_lock()
	fuse: fix initial parallel dirops
	fuse: fix double request_end()
	fuse: fix unlocked access to processing queue
	fuse: umount should wait for all requests
	fuse: Fix oops at process_init_reply()
	fuse: Add missed unlock_page() to fuse_readpages_fill()
	udl-kms: change down_interruptible to down
	udl-kms: handle allocation failure
	udl-kms: fix crash due to uninitialized memory
	b43legacy/leds: Ensure NUL-termination of LED name string
	b43/leds: Ensure NUL-termination of LED name string
	ASoC: dpcm: don't merge format from invalid codec dai
	ASoC: sirf: Fix potential NULL pointer dereference
	pinctrl: freescale: off by one in imx1_pinconf_group_dbg_show()
	x86/irqflags: Mark native_restore_fl extern inline
	x86/spectre: Add missing family 6 check to microcode check
	x86/speculation/l1tf: Increase l1tf memory limit for Nehalem+
	x86/entry/64: Wipe KASAN stack shadow before rewind_stack_do_exit()
	s390: fix br_r1_trampoline for machines without exrl
	s390/qdio: reset old sbal_state flags
	s390/numa: move initial setup of node_to_cpumask_map
	s390/pci: fix out of bounds access during irq setup
	kprobes: Make list and blacklist root user read only
	MIPS: Correct the 64-bit DSP accumulator register size
	MIPS: lib: Provide MIPS64r6 __multi3() for GCC < 7
	scsi: sysfs: Introduce sysfs_{un,}break_active_protection()
	scsi: core: Avoid that SCSI device removal through sysfs triggers a deadlock
	iscsi target: fix session creation failure handling
	clk: rockchip: fix clk_i2sout parent selection bits on rk3399
	PM / clk: signedness bug in of_pm_clk_add_clks()
	power: generic-adc-battery: fix out-of-bounds write when copying channel properties
	power: generic-adc-battery: check for duplicate properties copied from iio channels
	cdrom: Fix info leak/OOB read in cdrom_ioctl_drive_status
	staging: android: ion: check for kref overflow
	Linux 4.9.125

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-09-05 12:38:22 +02:00
Ethan Zhao
fe0034a08b sched/sysctl: Check user input value of sysctl_sched_time_avg
commit 5ccba44ba118a5000cccc50076b0344632459779 upstream.

System will hang if user set sysctl_sched_time_avg to 0:

  [root@XXX ~]# sysctl kernel.sched_time_avg_ms=0

  Stack traceback for pid 0
  0xffff883f6406c600 0 0 1 3 R 0xffff883f6406cf50 *swapper/3
  ffff883f7ccc3ae8 0000000000000018 ffffffff810c4dd0 0000000000000000
  0000000000017800 ffff883f7ccc3d78 0000000000000003 ffff883f7ccc3bf8
  ffffffff810c4fc9 ffff883f7ccc3c08 00000000810c5043 ffff883f7ccc3c08
  Call Trace:
  <IRQ> [<ffffffff810c4dd0>] ? update_group_capacity+0x110/0x200
  [<ffffffff810c4fc9>] ? update_sd_lb_stats+0x109/0x600
  [<ffffffff810c5507>] ? find_busiest_group+0x47/0x530
  [<ffffffff810c5b84>] ? load_balance+0x194/0x900
  [<ffffffff810ad5ca>] ? update_rq_clock.part.83+0x1a/0xe0
  [<ffffffff810c6d42>] ? rebalance_domains+0x152/0x290
  [<ffffffff810c6f5c>] ? run_rebalance_domains+0xdc/0x1d0
  [<ffffffff8108a75b>] ? __do_softirq+0xfb/0x320
  [<ffffffff8108ac85>] ? irq_exit+0x125/0x130
  [<ffffffff810b3a17>] ? scheduler_ipi+0x97/0x160
  [<ffffffff81052709>] ? smp_reschedule_interrupt+0x29/0x30
  [<ffffffff8173a1be>] ? reschedule_interrupt+0x6e/0x80
   <EOI> [<ffffffff815bc83c>] ? cpuidle_enter_state+0xcc/0x230
  [<ffffffff815bc80c>] ? cpuidle_enter_state+0x9c/0x230
  [<ffffffff815bc9d7>] ? cpuidle_enter+0x17/0x20
  [<ffffffff810cd6dc>] ? cpu_startup_entry+0x38c/0x420
  [<ffffffff81053373>] ? start_secondary+0x173/0x1e0

Because divide-by-zero error happens in function:

update_group_capacity()
  update_cpu_capacity()
    scale_rt_capacity()
     {
          ...
          total = sched_avg_period() + delta;
          used = div_u64(avg, total);
          ...
     }

To fix this issue, check user input value of sysctl_sched_time_avg, keep
it unchanged when hitting invalid input, and set the minimum limit of
sysctl_sched_time_avg to 1 ms.

Reported-by: James Puthukattukaran <james.puthukattukaran@oracle.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: efault@gmx.de
Cc: ethan.kernel@gmail.com
Cc: keescook@chromium.org
Cc: mcgrof@kernel.org
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/1504504774-18253-1-git-send-email-ethan.zhao@oracle.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Steve Muckle <smuckle@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-05 09:20:04 +02:00
Rik van Riel
dbe0f61220 ANDROID: add extra free kbytes tunable
Add a userspace visible knob to tell the VM to keep an extra amount
of memory free, by increasing the gap between each zone's min and
low watermarks.

This is useful for realtime applications that call system
calls and have a bound on the number of allocations that happen
in any short time period.  In this application, extra_free_kbytes
would be left at an amount equal to or larger than than the
maximum number of allocations that happen in any burst.

It may also be useful to reduce the memory use of virtual
machines (temporarily?), in a way that does not cause memory
fragmentation like ballooning does.

[ccross]
Revived for use on old kernels where no other solution exists.
The tunable will be removed on kernels that do better at avoiding
direct reclaim.

[surenb]
Will be reverted as soon as Android framework is reworked to
use upstream-supported watermark_scale_factor instead of
extra_free_kbytes.

Bug: 86445363
Change-Id: I765a42be8e964bfd3e2886d1ca85a29d60c3bb3e
Signed-off-by: Rik van Riel<riel@redhat.com>
Signed-off-by: Colin Cross <ccross@android.com>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2018-06-04 15:47:24 +00:00
Viresh Kumar
e0907557ef cpufreq: Drop schedfreq governor
We all should be using (and improving) the schedutil governor now. Get
rid of the non-upstream governor.

Tested on Hikey.

Change-Id: I2104558b03118b0a9c5f099c23c42cd9a6c2a963
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2017-11-07 10:31:04 +05:30
Greg Kroah-Hartman
379e3b2a6d This is the 4.9.53 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlnV4tYACgkQONu9yGCS
 aT48Xg/8C8R4G3oeWMmOnEiTBk+31YjzePlE7lGOGMcaVjPd+EFWwq66ikE75c9E
 GevAkUkTMSed0OnxvruNK5Sld9m31orDaIdUqX8fs/hzGTq0e5ygvLsnb5BlUxVW
 Vwv49HGX3qqv5WP3Uv1rqThntizL/TxYbowtta0XZrvW4YXmSbyJnsFH5B2eavxN
 7JUcmylOOQyTDngM084qrDTpFGahaRRzr7W2GYwklhYgQ+yuTdtyXLRfs/eYH6Lh
 mNzDEK3siacafqOWrI5zTpQ2IWYuxxeajCJi1fH0JVDqQOLoMKCRkKXCHpd+3Qko
 gkhYsItX02nfo9lSLC3+nUQP0n+HkIC8CW6/q0tsD58ccm8atKAaeJ/b7ZS2ONNA
 2Dn2/s2zpunODrTBM/OkZRjtVrxPOtUeuCrxZZTERIe8AvudP+igSF1Bp9d1vtLb
 AoibT3G3kiEFObF4e92yFiQRnMSZj5Nb9oep+dET8tN0aAbnx7uRbwfLYGoi1TLZ
 y/PdXh/0OccGqBHRde7JrOJR8hcuvoEjg8W3su+irBm0pa1GY4W0jSg0kZI1Lt0r
 FObAjKpp02UulSfIpUBdjEPCBQvVOHyh31Z2hZG+ZsZCj6p/zuh95QtZdp27R8py
 2QiqXv71lAVkSBsBP8ip/+ZlWgX45dEIBJ4Hx1D/ytcjttLfyyw=
 =s0r9
 -----END PGP SIGNATURE-----

Merge 4.9.53 into android-4.9

Changes in 4.9.53
	cifs: release cifs root_cred after exit_cifs
	cifs: release auth_key.response for reconnect.
	fs/proc: Report eip/esp in /prod/PID/stat for coredumping
	mac80211: fix VLAN handling with TXQs
	mac80211_hwsim: Use proper TX power
	mac80211: flush hw_roc_start work before cancelling the ROC
	genirq: Make sparse_irq_lock protect what it should protect
	KVM: PPC: Book3S: Fix race and leak in kvm_vm_ioctl_create_spapr_tce()
	KVM: PPC: Book3S HV: Protect updates to spapr_tce_tables list
	tracing: Fix trace_pipe behavior for instance traces
	tracing: Erase irqsoff trace with empty write
	md/raid5: fix a race condition in stripe batch
	md/raid5: preserve STRIPE_ON_UNPLUG_LIST in break_stripe_batch_list
	scsi: scsi_transport_iscsi: fix the issue that iscsi_if_rx doesn't parse nlmsg properly
	drm/radeon: disable hard reset in hibernate for APUs
	crypto: drbg - fix freeing of resources
	crypto: talitos - Don't provide setkey for non hmac hashing algs.
	crypto: talitos - fix sha224
	crypto: talitos - fix hashing
	security/keys: properly zero out sensitive key material in big_key
	security/keys: rewrite all of big_key crypto
	KEYS: fix writing past end of user-supplied buffer in keyring_read()
	KEYS: prevent creating a different user's keyrings
	KEYS: prevent KEYCTL_READ on negative key
	powerpc/pseries: Fix parent_dn reference leak in add_dt_node()
	powerpc/tm: Flush TM only if CPU has TM feature
	powerpc/ftrace: Pass the correct stack pointer for DYNAMIC_FTRACE_WITH_REGS
	s390/mm: fix write access check in gup_huge_pmd()
	PM: core: Fix device_pm_check_callbacks()
	Fix SMB3.1.1 guest authentication to Samba
	SMB3: Warn user if trying to sign connection that authenticated as guest
	SMB: Validate negotiate (to protect against downgrade) even if signing off
	SMB3: Don't ignore O_SYNC/O_DSYNC and O_DIRECT flags
	vfs: Return -ENXIO for negative SEEK_HOLE / SEEK_DATA offsets
	nl80211: check for the required netlink attributes presence
	bsg-lib: don't free job in bsg_prepare_job
	iw_cxgb4: remove the stid on listen create failure
	iw_cxgb4: put ep reference in pass_accept_req()
	selftests/seccomp: Support glibc 2.26 siginfo_t.h
	seccomp: fix the usage of get/put_seccomp_filter() in seccomp_get_filter()
	arm64: Make sure SPsel is always set
	arm64: fault: Route pte translation faults via do_translation_fault
	KVM: VMX: extract __pi_post_block
	KVM: VMX: avoid double list add with VT-d posted interrupts
	KVM: VMX: simplify and fix vmx_vcpu_pi_load
	kvm/x86: Handle async PF in RCU read-side critical sections
	KVM: VMX: Do not BUG() on out-of-bounds guest IRQ
	kvm: nVMX: Don't allow L2 to access the hardware CR8
	xfs: validate bdev support for DAX inode flag
	etnaviv: fix gem object list corruption
	PCI: Fix race condition with driver_override
	btrfs: fix NULL pointer dereference from free_reloc_roots()
	btrfs: propagate error to btrfs_cmp_data_prepare caller
	btrfs: prevent to set invalid default subvolid
	x86/mm: Fix fault error path using unsafe vma pointer
	x86/fpu: Don't let userspace set bogus xcomp_bv
	gfs2: Fix debugfs glocks dump
	timer/sysclt: Restrict timer migration sysctl values to 0 and 1
	KVM: VMX: do not change SN bit in vmx_update_pi_irte()
	KVM: VMX: remove WARN_ON_ONCE in kvm_vcpu_trigger_posted_interrupt
	cxl: Fix driver use count
	KVM: VMX: use cmpxchg64
	video: fbdev: aty: do not leak uninitialized padding in clk to userspace
	swiotlb-xen: implement xen_swiotlb_dma_mmap callback
	Linux 4.9.53

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2017-10-05 10:37:37 +02:00
Myungho Jung
4c00015385 timer/sysclt: Restrict timer migration sysctl values to 0 and 1
commit b94bf594cf8ed67cdd0439e70fa939783471597a upstream.

timer_migration sysctl acts as a boolean switch, so the allowed values
should be restricted to 0 and 1.

Add the necessary extra fields to the sysctl table entry to enforce that.

[ tglx: Rewrote changelog ]

Signed-off-by: Myungho Jung <mhjungk@gmail.com>
Link: http://lkml.kernel.org/r/1492640690-3550-1-git-send-email-mhjungk@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Kazuhiro Hayashi <kazuhiro3.hayashi@toshiba.co.jp>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-05 09:44:04 +02:00
Dietmar Eggemann
ea5a7f2df0 ANDROID: sched: Remove sysctl_sched_is_big_little
With the new wakeup approach this sysctl is not necessary any more.

Change-Id: I52114b3c918791f6a4f9f30f50002919ccbc1a9c
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
(cherry picked from commit 885c0d503bcdf0ef4e9b46822496f16b20aa3bbd)
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
2017-08-03 15:30:22 -07:00
Greg Kroah-Hartman
598195a906 This is the 4.9.37 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAllmJ2QACgkQONu9yGCS
 aT5gQBAAyxVViSCLdZx/ft/+Qu9lXaBvyNxXdISkwUnnbPmyzXtMz/1fvMtuPxsB
 GYMQp/zG5uApUc+Up65tWhRlZEE7nnbj4dIJZyJjrVp3Ay6lM22532Hg/7CkCSs9
 jeRjI2VF0KmYxklfurbwTFBJRwrF6+2uNLgf6iMjxVMod/G4U8to6RV/R/n2kEAl
 tQDIfMF0naP6NHKcL+g0xqHtpd8NIewbywFF47FY4L/qnKqNCFSZZ6tt7rKKGpZr
 R61da975UTJj/Mk+md2wf0aO5FNZzMS/vESHyehSPR46pSGCkZkc+pz/jQlcrGDM
 0OILOmPZ1eewGsaQPGKVbFCq6IAlUCP0uYIraKLbzSC5h6iV93rkVpsFX9OhT54/
 45Y5VFpeskvtZ32ZJQGEfX1uNEp6cUhqkkgevj6BbK1zY/bkEn7zT4W6bsgHQcRh
 9SIGRQw3kXenmZcBvWf1MPhNnwe4PF/Yl6/FQS8I0CBi8iqIancoKA+n0ftTyZtw
 kU6xpcWp+b+3Cvc7tdrPYYflKDKgNJLdHsS42hDEYsBKBAc2E2xOz2jLHwcnK66n
 xAJb6VMHh8Njof1wvVVfjuow7yaZmILdwmzKJBB8TWaGXjPxLt6Vd5AIs1SiFJZB
 KWstpEndODLlsyhvvfAPDS3ft51QJ14N1Dq6VMjos6+frrgfC/Q=
 =wtCu
 -----END PGP SIGNATURE-----

Merge 4.9.37 into android-4.9

Changes in 4.9.37
	fs: add a VALID_OPEN_FLAGS
	fs: completely ignore unknown open flags
	driver core: platform: fix race condition with driver_override
	ceph: choose readdir frag based on previous readdir reply
	tracing/kprobes: Allow to create probe with a module name starting with a digit
	media: entity: Fix stream count check
	drm/virtio: don't leak bo on drm_gem_object_init failure
	usb: dwc3: replace %p with %pK
	USB: serial: cp210x: add ID for CEL EM3588 USB ZigBee stick
	Add USB quirk for HVR-950q to avoid intermittent device resets
	usb: usbip: set buffer pointers to NULL after free
	usb: Fix typo in the definition of Endpoint[out]Request
	USB: core: fix device node leak
	mac80211_hwsim: Replace bogus hrtimer clockid
	sysctl: don't print negative flag for proc_douintvec
	sysctl: report EINVAL if value is larger than UINT_MAX for proc_douintvec
	pinctrl: qcom: ipq4019: add missing pingroups for pins > 70
	pinctrl: cherryview: Add a quirk to make Acer Chromebook keyboard work again
	pinctrl: sh-pfc: r8a7794: Swap ATA signals
	pinctrl: sh-pfc: r8a7791: Fix SCIF2 pinmux data
	pinctrl: sh-pfc: r8a7791: Add missing DVC_MUTE signal
	pinctrl: sh-pfc: r8a7795: Fix hscif2_clk_b and hscif4_ctrl
	pinctrl: meson: meson8b: fix the NAND DQS pins
	pinctrl: stm32: Fix bad function call
	pinctrl: sunxi: Fix SPDIF function name for A83T
	pinctrl: cherryview: Add terminate entry for dmi_system_id tables
	pinctrl: mxs: atomically switch mux and drive strength config
	pinctrl: sh-pfc: r8a7791: Add missing HSCIF1 pinmux data
	pinctrl: sh-pfc: Update info pointer after SoC-specific init
	USB: serial: option: add two Longcheer device ids
	USB: serial: qcserial: new Sierra Wireless EM7305 device ID
	xhci: Limit USB2 port wake support for AMD Promontory hosts
	gfs2: Fix glock rhashtable rcu bug
	tpm: fix a kernel memory leak in tpm-sysfs.c
	x86/tools: Fix gcc-7 warning in relocs.c
	x86/uaccess: Optimize copy_user_enhanced_fast_string() for short strings
	ath10k: override CE5 config for QCA9377
	KEYS: Fix an error code in request_master_key()
	crypto: drbg - Fixes panic in wait_for_completion call
	RDMA/uverbs: Check port number supplied by user verbs cmds
	rt286: add Thinkpad Helix 2 to force_combo_jack_table
	Linux 4.9.37

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2017-07-12 17:07:29 +02:00
Liping Zhang
7bdacd3d9f sysctl: report EINVAL if value is larger than UINT_MAX for proc_douintvec
commit 425fffd886bae3d127a08fa6a17f2e31e24ed7ff upstream.

Currently, inputting the following command will succeed but actually the
value will be truncated:

  # echo 0x12ffffffff > /proc/sys/net/ipv4/tcp_notsent_lowat

This is not friendly to the user, so instead, we should report error
when the value is larger than UINT_MAX.

Fixes: e7d316a02f ("sysctl: handle error writing UINT_MAX to u32 fields")
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Cc: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-12 15:01:04 +02:00
Liping Zhang
3a20c57b43 sysctl: don't print negative flag for proc_douintvec
commit 5380e5644afbba9e3d229c36771134976f05c91e upstream.

I saw some very confusing sysctl output on my system:
  # cat /proc/sys/net/core/xfrm_aevent_rseqth
  -2
  # cat /proc/sys/net/core/xfrm_aevent_etime
  -10
  # cat /proc/sys/net/ipv4/tcp_notsent_lowat
  -4294967295

Because we forget to set the *negp flag in proc_douintvec, so it will
become a garbage value.

Since the value related to proc_douintvec is always an unsigned integer,
so we can set *negp to false explictily to fix this issue.

Fixes: e7d316a02f ("sysctl: handle error writing UINT_MAX to u32 fields")
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Cc: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-12 15:01:04 +02:00
Dmitry Shmidt
e37704d2d0 This is the 4.9.8 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAliVlScACgkQONu9yGCS
 aT5mpQ/+P2ZZmg1jqCGpz5+uawv1pQzYfSqlahFO71m9Jkm3OIHsFt+MScxTMBXe
 LEDuY1eKtnRj/RzV8ewNC5cwHWW54IqyzVCqg1vWVMOpVp205w5kaeFC2NpvZJCZ
 bJ8/QEj4oFI8xNwXE/5IO9qdW3pgS8R7H6FqS/h6bIaiGAItw2wlaZcY1n5IgX24
 xJxiyoXj94eKwSpWH09rwTBP8474rK/Wjikx1i5B/fUpTGVWLgcgR19vFjREUhrR
 54pgawaXAzsx//uQgNIKyW9zi6oTirbGQqI+QfPqCy8fhlCjaK8yESIjp7ucsRyR
 lvgn7ahG5533Eur74jvZEihdDBKbcidsw8aZAeEJU35XSo8KrXcQCq/IxK3tJjtd
 lXw/gK5jh4ZC4oEsqagmG9MDWXUXP38Qrk2KLwT6ydO771Ihh69r+JhucofE/0cb
 0OCIMmZBDUBlwuLOHeVWGOSJhcT5IuCnjLU669chWrJ7bnkTlveQhuvRdvJ9In7U
 ErU3WhoJDtIBNVCJYATsQT1qkJLIKQI9XgZi+1KB5jLGkGRo+cltcV5YWTpJd83A
 jzwjJWXHkDoF9n+unyQ/20gq52De+lAxDqIbkKWLKzsjx6vgNCfRtsgNHVtdzyvZ
 aExinmXV8Tm8Pztidmg40Gf4aK77V6omOIJPiC+fzqJH7RRDRj8=
 =0n80
 -----END PGP SIGNATURE-----

Merge tag 'v4.9.8' into android-4.9

This is the 4.9.8 stable release
2017-02-06 13:13:30 -08:00
Eric Dumazet
03707d6c36 sysctl: fix proc_doulongvec_ms_jiffies_minmax()
commit ff9f8a7cf935468a94d9927c68b00daae701667e upstream.

We perform the conversion between kernel jiffies and ms only when
exporting kernel value to user space.

We need to do the opposite operation when value is written by user.

Only matters when HZ != 1000

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-01 08:33:05 +01:00
Srinath Sridharan
3a73c96a28 ANDROID: sched/walt: Accounting for number of irqs pending on each core
Schedules on a core whose irq count is less than a threshold.
Improves I/O performance of EAS.

Change-Id: I08ff7dd0d22502a0106fc636b1af2e6fe9e758b5
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:59 -08:00
Srivatsa Vaddagiri
26c2154816 ANDROID: sched: Introduce Window Assisted Load Tracking (WALT)
use a window based view of time in order to track task
demand and CPU utilization in the scheduler.

Window Assisted Load Tracking (WALT) implementation credits:
 Srivatsa Vaddagiri, Steve Muckle, Syed Rameez Mustafa, Joonwoo Park,
 Pavan Kumar Kondeti, Olav Haugan

2016-03-06: Integration with EAS/refactoring by Vikram Mulukutla
            and Todd Kjos

Change-Id: I21408236836625d4e7d7de1843d20ed5ff36c708

Includes fixes for issues:

eas/walt: Use walt_ktime_clock() instead of ktime_get_ns() to avoid a
race resulting in watchdog resets
BUG: 29353986
Change-Id: Ic1820e22a136f7c7ebd6f42e15f14d470f6bbbdb

Handle walt accounting anomoly during resume

During resume, there is a corner case where on wakeup, a task's
prev_runnable_sum can go negative. This is a workaround that
fixes the condition and warns (instead of crashing).

BUG: 29464099
Change-Id: I173e7874324b31a3584435530281708145773508

Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Srinath Sridharan <srinathsr@google.com>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
[jstultz: fwdported to 4.4]
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:58 -08:00
Todd Kjos
c6a6f3bfa0 ANDROID: sched/fair: add tunable to set initial task load
The choice of initial task load upon fork has a large influence
on CPU and OPP selection when scheduler-driven DVFS is in use.
Make this tuneable by adding a new sysctl "sched_initial_task_util".

If the sched governor is not used, the default remains at SCHED_LOAD_SCALE
Otherwise, the value from the sysctl is used. This defaults to 0.

Signed-off-by: "Todd Kjos <tkjos@google.com>"
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:52 -08:00
Juri Lelli
1931b93dba ANDROID: sched/fair: add tunable to force selection at cpu granularity
EAS assumes that clusters with smaller capacity cores are more
energy-efficient. This may not be true on non-big-little devices,
so EAS can make incorrect cluster selections when finding a CPU
to wake. The "sched_is_big_little" hint can be used to cause a
cpu-based selection instead of cluster-based selection.

This change incorporates the addition of the sync hint enable patch

EAS did not honour synchronous wakeup hints, a new sysctl is
created to ask EAS to use this information when selecting a CPU.
The control is called "sched_sync_hint_enable".

Also contains:

EAS: sched/fair: for SMP bias toward idle core with capacity

For SMP devices, on wakeup bias towards idle cores that have capacity
vs busy devices that need a higher OPP

eas: favor idle cpus for boosted tasks

BUG: 29533997
BUG: 29512132
Change-Id: I0cc9a1b1b88fb52916f18bf2d25715bdc3634f9c
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
Signed-off-by: Srinath Sridharan <srinathsr@google.com>

eas/sched/fair: Favoring busy cpus with low OPPs

BUG: 29533997
BUG: 29512132
Change-Id: I9305b3239698d64278db715a2e277ea0bb4ece79

Signed-off-by: Juri Lelli <juri.lelli@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:51 -08:00
Srinath Sridharan
bf47bdd180 ANDROID: sched: EAS: take cstate into account when selecting idle core
Introduce a new sysctl for this option, 'sched_cstate_aware'.
When this is enabled, select_idle_sibling in CFS is modified to
choose the idle CPU in the sibling group which has the lowest
idle state index - idle state indexes are assumed to increase
as sleep depth and hence wakeup latency increase. In this way,
we attempt to minimise wakeup latency when an idle CPU is
required.

Signed-off-by: Srinath Sridharan <srinathsr@google.com>

Includes:
sched: EAS: fix select_idle_sibling

when sysctl_sched_cstate_aware is enabled, best_idle cpu will not be chosen
in the original flow because it will goto done directly

Bug: 30107557
Change-Id: Ie09c2e3960cafbb976f8d472747faefab3b4d6ac
Signed-off-by: martin_liu <martin_liu@htc.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:50 -08:00
Patrick Bellasi
ae71030fd5 ANDROID: sched/tune: add initial support for CGroups based boosting
To support task performance boosting, the usage of a single knob has the
advantage to be a simple solution, both from the implementation and the
usability standpoint.  However, on a real system it can be difficult to
identify a single value for the knob which fits the needs of multiple
different tasks. For example, some kernel threads and/or user-space
background services should be better managed the "standard" way while we
still want to be able to boost the performance of specific workloads.

In order to improve the flexibility of the task boosting mechanism this
patch is the first of a small series which extends the previous
implementation to introduce a "per task group" support.
This first patch introduces just the basic CGroups support, a new
"schedtune" CGroups controller is added which allows to configure
different boost value for different groups of tasks.
To keep the implementation simple but still effective for a boosting
strategy, the new controller:
  1. allows only a two layer hierarchy
  2. supports only a limited number of boost groups

A two layer hierarchy allows to place each task either:
  a) in the root control group
     thus being subject to a system-wide boosting value
  b) in a child of the root group
     thus being subject to the specific boost value defined by that
     "boost group"

The limited number of "boost groups" supported is mainly motivated by
the observation that in a real system it could be useful to have only
few classes of tasks which deserve different treatment.
For example, background vs foreground or interactive vs low-priority.
As an additional benefit, a limited number of boost groups allows also
to have a simpler implementation especially for the code required to
compute the boost value for CPUs which have runnable tasks belonging to
different boost groups.

cc: Tejun Heo <tj@kernel.org>
cc: Li Zefan <lizefan@huawei.com>
cc: Johannes Weiner <hannes@cmpxchg.org>
cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:37 -08:00
Patrick Bellasi
69fa4c768a ANDROID: sched/tune: add sysctl interface to define a boost value
The current (CFS) scheduler implementation does not allow "to boost"
tasks performance by running them at a higher OPP compared to the
minimum required to meet their workload demands.

To support tasks performance boosting the scheduler should provide a
"knob" which allows to tune how much the system is going to be optimised
for energy efficiency vs performance.

This patch is the first of a series which provides a simple interface to
define a tuning knob. One system-wide "boost" tunable is exposed via:
  /proc/sys/kernel/sched_cfs_boost
which can be configured in the range [0..100], to define a percentage
where:
  - 0%   boost requires to operate in "standard" mode by scheduling
         tasks at the minimum capacities required by the workload demand
  - 100% boost requires to push at maximum the task performances,
         "regardless" of the incurred energy consumption

A boost value in between these two boundaries is used to bias the
power/performance trade-off, the higher the boost value the more the
scheduler is biased toward performance boosting instead of energy
efficiency.

cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:35 -08:00
Linus Torvalds
abb5a14fa2 Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc vfs updates from Al Viro:
 "Assorted misc bits and pieces.

  There are several single-topic branches left after this (rename2
  series from Miklos, current_time series from Deepa Dinamani, xattr
  series from Andreas, uaccess stuff from from me) and I'd prefer to
  send those separately"

* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (39 commits)
  proc: switch auxv to use of __mem_open()
  hpfs: support FIEMAP
  cifs: get rid of unused arguments of CIFSSMBWrite()
  posix_acl: uapi header split
  posix_acl: xattr representation cleanups
  fs/aio.c: eliminate redundant loads in put_aio_ring_file
  fs/internal.h: add const to ns_dentry_operations declaration
  compat: remove compat_printk()
  fs/buffer.c: make __getblk_slow() static
  proc: unsigned file descriptors
  fs/file: more unsigned file descriptors
  fs: compat: remove redundant check of nr_segs
  cachefiles: Fix attempt to read i_blocks after deleting file [ver #2]
  cifs: don't use memcpy() to copy struct iov_iter
  get rid of separate multipage fault-in primitives
  fs: Avoid premature clearing of capabilities
  fs: Give dentry to inode_change_ok() instead of inode
  fuse: Propagate dentry down to inode_change_ok()
  ceph: Propagate dentry down to inode_change_ok()
  xfs: Propagate dentry down to inode_change_ok()
  ...
2016-10-10 13:04:49 -07:00
Linus Torvalds
14986a34e1 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull namespace updates from Eric Biederman:
 "This set of changes is a number of smaller things that have been
  overlooked in other development cycles focused on more fundamental
  change. The devpts changes are small things that were a distraction
  until we managed to kill off DEVPTS_MULTPLE_INSTANCES. There is an
  trivial regression fix to autofs for the unprivileged mount changes
  that went in last cycle. A pair of ioctls has been added by Andrey
  Vagin making it is possible to discover the relationships between
  namespaces when referring to them through file descriptors.

  The big user visible change is starting to add simple resource limits
  to catch programs that misbehave. With namespaces in general and user
  namespaces in particular allowing users to use more kinds of
  resources, it has become important to have something to limit errant
  programs. Because the purpose of these limits is to catch errant
  programs the code needs to be inexpensive to use as it always on, and
  the default limits need to be high enough that well behaved programs
  on well behaved systems don't encounter them.

  To this end, after some review I have implemented per user per user
  namespace limits, and use them to limit the number of namespaces. The
  limits being per user mean that one user can not exhause the limits of
  another user. The limits being per user namespace allow contexts where
  the limit is 0 and security conscious folks can remove from their
  threat anlysis the code used to manage namespaces (as they have
  historically done as it root only). At the same time the limits being
  per user namespace allow other parts of the system to use namespaces.

  Namespaces are increasingly being used in application sand boxing
  scenarios so an all or nothing disable for the entire system for the
  security conscious folks makes increasing use of these sandboxes
  impossible.

  There is also added a limit on the maximum number of mounts present in
  a single mount namespace. It is nontrivial to guess what a reasonable
  system wide limit on the number of mount structure in the kernel would
  be, especially as it various based on how a system is using
  containers. A limit on the number of mounts in a mount namespace
  however is much easier to understand and set. In most cases in
  practice only about 1000 mounts are used. Given that some autofs
  scenarious have the potential to be 30,000 to 50,000 mounts I have set
  the default limit for the number of mounts at 100,000 which is well
  above every known set of users but low enough that the mount hash
  tables don't degrade unreaonsably.

  These limits are a start. I expect this estabilishes a pattern that
  other limits for resources that namespaces use will follow. There has
  been interest in making inotify event limits per user per user
  namespace as well as interest expressed in making details about what
  is going on in the kernel more visible"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (28 commits)
  autofs:  Fix automounts by using current_real_cred()->uid
  mnt: Add a per mount namespace limit on the number of mounts
  netns: move {inc,dec}_net_namespaces into #ifdef
  nsfs: Simplify __ns_get_path
  tools/testing: add a test to check nsfs ioctl-s
  nsfs: add ioctl to get a parent namespace
  nsfs: add ioctl to get an owning user namespace for ns file descriptor
  kernel: add a helper to get an owning user namespace for a namespace
  devpts: Change the owner of /dev/pts/ptmx to the mounter of /dev/pts
  devpts: Remove sync_filesystems
  devpts: Make devpts_kill_sb safe if fsi is NULL
  devpts: Simplify devpts_mount by using mount_nodev
  devpts: Move the creation of /dev/pts/ptmx into fill_super
  devpts: Move parse_mount_options into fill_super
  userns: When the per user per user namespace limit is reached return ENOSPC
  userns; Document per user per user namespace limits.
  mntns: Add a limit on the number of mount namespaces.
  netns: Add a limit on the number of net namespaces
  cgroupns: Add a limit on the number of cgroup namespaces
  ipcns: Add a  limit on the number of ipc namespaces
  ...
2016-10-06 09:52:23 -07:00
Eric W. Biederman
d29216842a mnt: Add a per mount namespace limit on the number of mounts
CAI Qian <caiqian@redhat.com> pointed out that the semantics
of shared subtrees make it possible to create an exponentially
increasing number of mounts in a mount namespace.

    mkdir /tmp/1 /tmp/2
    mount --make-rshared /
    for i in $(seq 1 20) ; do mount --bind /tmp/1 /tmp/2 ; done

Will create create 2^20 or 1048576 mounts, which is a practical problem
as some people have managed to hit this by accident.

As such CVE-2016-6213 was assigned.

Ian Kent <raven@themaw.net> described the situation for autofs users
as follows:

> The number of mounts for direct mount maps is usually not very large because of
> the way they are implemented, large direct mount maps can have performance
> problems. There can be anywhere from a few (likely case a few hundred) to less
> than 10000, plus mounts that have been triggered and not yet expired.
>
> Indirect mounts have one autofs mount at the root plus the number of mounts that
> have been triggered and not yet expired.
>
> The number of autofs indirect map entries can range from a few to the common
> case of several thousand and in rare cases up to between 30000 and 50000. I've
> not heard of people with maps larger than 50000 entries.
>
> The larger the number of map entries the greater the possibility for a large
> number of active mounts so it's not hard to expect cases of a 1000 or somewhat
> more active mounts.

So I am setting the default number of mounts allowed per mount
namespace at 100,000.  This is more than enough for any use case I
know of, but small enough to quickly stop an exponential increase
in mounts.  Which should be perfect to catch misconfigurations and
malfunctioning programs.

For anyone who needs a higher limit this can be changed by writing
to the new /proc/sys/fs/mount-max sysctl.

Tested-by: CAI Qian <caiqian@redhat.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2016-09-30 12:46:48 -05:00
Arnd Bergmann
9dcfcda576 compat: remove compat_printk()
After 7e8e385aaf ("x86/compat: Remove sys32_vm86_warning"), this
function has become unused, so we can remove it as well.

Link: http://lkml.kernel.org/r/20160617142903.3070388-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2016-09-27 21:20:53 -04:00
Alexey Dobriyan
9b80a184ea fs/file: more unsigned file descriptors
Propagate unsignedness for grand total of 149 bytes:

	$ ./scripts/bloat-o-meter ../vmlinux-000 ../obj/vmlinux
	add/remove: 0/0 grow/shrink: 0/10 up/down: 0/-149 (-149)
	function                                     old     new   delta
	set_close_on_exec                             99      98      -1
	put_files_struct                             201     200      -1
	get_close_on_exec                             59      58      -1
	do_prlimit                                   498     497      -1
	do_execveat_common.isra                     1662    1661      -1
	__close_fd                                   178     173      -5
	do_dup2                                      219     204     -15
	seq_show                                     685     660     -25
	__alloc_fd                                   384     357     -27
	dup_fd                                       718     646     -72

It mostly comes from converting "unsigned int" to "long" for bit operations.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-27 18:47:38 -04:00
Subash Abhinov Kasiviswanathan
e7d316a02f sysctl: handle error writing UINT_MAX to u32 fields
We have scripts which write to certain fields on 3.18 kernels but this
seems to be failing on 4.4 kernels.  An entry which we write to here is
xfrm_aevent_rseqth which is u32.

  echo 4294967295  > /proc/sys/net/core/xfrm_aevent_rseqth

Commit 230633d109 ("kernel/sysctl.c: detect overflows when converting
to int") prevented writing to sysctl entries when integer overflow
occurs.  However, this does not apply to unsigned integers.

Heinrich suggested that we introduce a new option to handle 64 bit
limits and set min as 0 and max as UINT_MAX.  This might not work as it
leads to issues similar to __do_proc_doulongvec_minmax.  Alternatively,
we would need to change the datatype of the entry to 64 bit.

  static int __do_proc_doulongvec_minmax(void *data, struct ctl_table
  {
      i = (unsigned long *) data;   //This cast is causing to read beyond the size of data (u32)
      vleft = table->maxlen / sizeof(unsigned long); //vleft is 0 because maxlen is sizeof(u32) which is lesser than sizeof(unsigned long) on x86_64.

Introduce a new proc handler proc_douintvec.  Individual proc entries
will need to be updated to use the new handler.

[akpm@linux-foundation.org: coding-style fixes]
Fixes: 230633d109 ("kernel/sysctl.c:detect overflows when converting to int")
Link: http://lkml.kernel.org/r/1471479806-5252-1-git-send-email-subashab@codeaurora.org
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Cc: Heinrich Schuchardt <xypron.glpk@gmx.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-26 17:39:35 -07:00
Borislav Petkov
750afe7bab printk: add kernel parameter to control writes to /dev/kmsg
Add a "printk.devkmsg" kernel command line parameter which controls how
userspace writes into /dev/kmsg.  It has three options:

 * ratelimit - ratelimit logging from userspace.
 * on  - unlimited logging from userspace
 * off - logging from userspace gets ignored

The default setting is to ratelimit the messages written to it.

This changes the kernel default setting of "on" to "ratelimit" and we do
that because we want to keep userspace spamming /dev/kmsg to sane
levels.  This is especially moot when a small kernel log buffer wraps
around and messages get lost.  So the ratelimiting setting should be a
sane setting where kernel messages should have a bit higher chance of
survival from all the spamming.

It additionally does not limit logging to /dev/kmsg while the system is
booting if we haven't disabled it on the command line.

Furthermore, we can control the logging from a lower priority sysctl
interface - kernel.printk_devkmsg.

That interface will succeed only if printk.devkmsg *hasn't* been
supplied on the command line.  If it has, then printk.devkmsg is a
one-time setting which remains for the duration of the system lifetime.
This "locking" of the setting is to prevent userspace from changing the
logging on us through sysctl(2).

This patch is based on previous patches from Linus and Steven.

[bp@suse.de: fixes]
  Link: http://lkml.kernel.org/r/20160719072344.GC25563@nazgul.tnic
Link: http://lkml.kernel.org/r/20160716061745.15795-3-bp@alien8.de
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Dave Young <dyoung@redhat.com>
Cc: Franck Bui <fbui@suse.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-02 19:35:06 -04:00
Mel Gorman
a5f5f91da6 mm: convert zone_reclaim to node_reclaim
As reclaim is now per-node based, convert zone_reclaim to be
node_reclaim.  It is possible that a node will be reclaimed multiple
times if it has multiple zones but this is unavoidable without caching
all nodes traversed so far.  The documentation and interface to
userspace is the same from a configuration perspective and will will be
similar in behaviour unless the node-local allocation requests were also
limited to lower zones.

Link: http://lkml.kernel.org/r/1467970510-21195-24-git-send-email-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Daniel Bristot de Oliveira
088e9d253d rcu: sysctl: Panic on RCU Stall
It is not always easy to determine the cause of an RCU stall just by
analysing the RCU stall messages, mainly when the problem is caused
by the indirect starvation of rcu threads. For example, when preempt_rcu
is not awakened due to the starvation of a timer softirq.

We have been hard coding panic() in the RCU stall functions for
some time while testing the kernel-rt. But this is not possible in
some scenarios, like when supporting customers.

This patch implements the sysctl kernel.panic_on_rcu_stall. If
set to 1, the system will panic() when an RCU stall takes place,
enabling the capture of a vmcore. The vmcore provides a way to analyze
all kernel/tasks states, helping out to point to the culprit and the
solution for the stall.

The kernel.panic_on_rcu_stall sysctl is disabled by default.

Changes from v1:
- Fixed a typo in the git log
- The if(sysctl_panic_on_rcu_stall) panic() is in a static function
- Fixed the CONFIG_TINY_RCU compilation issue
- The var sysctl_panic_on_rcu_stall is now __read_mostly

Cc: Jonathan Corbet <corbet@lwn.net>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Tested-by: "Luis Claudio R. Goncalves" <lgoncalv@redhat.com>
Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2016-06-15 16:00:05 -07:00
Linus Torvalds
bdc6b758e4 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "Mostly tooling and PMU driver fixes, but also a number of late updates
  such as the reworking of the call-chain size limiting logic to make
  call-graph recording more robust, plus tooling side changes for the
  new 'backwards ring-buffer' extension to the perf ring-buffer"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits)
  perf record: Read from backward ring buffer
  perf record: Rename variable to make code clear
  perf record: Prevent reading invalid data in record__mmap_read
  perf evlist: Add API to pause/resume
  perf trace: Use the ptr->name beautifier as default for "filename" args
  perf trace: Use the fd->name beautifier as default for "fd" args
  perf report: Add srcline_from/to branch sort keys
  perf evsel: Record fd into perf_mmap
  perf evsel: Add overwrite attribute and check write_backward
  perf tools: Set buildid dir under symfs when --symfs is provided
  perf trace: Only auto set call-graph to "dwarf" when syscalls are being traced
  perf annotate: Sort list of recognised instructions
  perf annotate: Fix identification of ARM blt and bls instructions
  perf tools: Fix usage of max_stack sysctl
  perf callchain: Stop validating callchains by the max_stack sysctl
  perf trace: Fix exit_group() formatting
  perf top: Use machine->kptr_restrict_warned
  perf trace: Warn when trying to resolve kernel addresses with kptr_restrict=1
  perf machine: Do not bail out if not managing to read ref reloc symbol
  perf/x86/intel/p4: Trival indentation fix, remove space
  ...
2016-05-25 17:05:40 -07:00
Hugh Dickins
52b6f46bc1 mm: /proc/sys/vm/stat_refresh to force vmstat update
Provide /proc/sys/vm/stat_refresh to force an immediate update of
per-cpu into global vmstats: useful to avoid a sleep(2) or whatever
before checking counts when testing.  Originally added to work around a
bug which left counts stranded indefinitely on a cpu going idle (an
inaccuracy magnified when small below-batch numbers represent "huge"
amounts of memory), but I believe that bug is now fixed: nonetheless,
this is still a useful knob.

Its schedule_on_each_cpu() is probably too expensive just to fold into
reading /proc/meminfo itself: give this mode 0600 to prevent abuse.
Allow a write or a read to do the same: nothing to read, but "grep -h
Shmem /proc/sys/vm/stat_refresh /proc/meminfo" is convenient.  Oh, and
since global_page_state() itself is careful to disguise any underflow as
0, hack in an "Invalid argument" and pr_warn() if a counter is negative
after the refresh - this helped to fix a misaccounting of
NR_ISOLATED_FILE in my migration code.

But on recent kernels, I find that NR_ALLOC_BATCH and NR_PAGES_SCANNED
often go negative some of the time.  I have not yet worked out why, but
have no evidence that it's actually harmful.  Punt for the moment by
just ignoring the anomaly on those.

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Yang Shi <yang.shi@linaro.org>
Cc: Ning Qu <quning@gmail.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-19 19:12:14 -07:00
Arnaldo Carvalho de Melo
c85b033496 perf core: Separate accounting of contexts and real addresses in a stack trace
The perf_sample->ip_callchain->nr value includes all the entries in the
ip_callchain->ip[] array, real addresses and PERF_CONTEXT_{KERNEL,USER,etc},
while what the user expects is that what is in the kernel.perf_event_max_stack
sysctl or in the upcoming per event perf_event_attr.sample_max_stack knob be
honoured in terms of IP addresses in the stack trace.

So allocate a bunch of extra entries for contexts, and do the accounting
via perf_callchain_entry_ctx struct members.

A new sysctl, kernel.perf_event_max_contexts_per_stack is also
introduced for investigating possible bugs in the callchain
implementation by some arch.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/n/tip-3b4wnqk340c4sg4gwkfdi9yk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-16 23:11:53 -03:00
Arnaldo Carvalho de Melo
a831100aee perf core: Generalize max_stack sysctl handler
So that it can be used for other stack related knobs, such as the
upcoming one to tweak the max number of of contexts per stack sample.

In all those cases we can only change the value if there are no perf
sessions collecting stacks, so they need to grab that mutex, etc.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-8t3fk94wuzp8m2z1n4gc0s17@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-16 23:11:49 -03:00
Arnaldo Carvalho de Melo
c5dfd78eb7 perf core: Allow setting up max frame stack depth via sysctl
The default remains 127, which is good for most cases, and not even hit
most of the time, but then for some cases, as reported by Brendan, 1024+
deep frames are appearing on the radar for things like groovy, ruby.

And in some workloads putting a _lower_ cap on this may make sense. One
that is per event still needs to be put in place tho.

The new file is:

  # cat /proc/sys/kernel/perf_event_max_stack
  127

Chaging it:

  # echo 256 > /proc/sys/kernel/perf_event_max_stack
  # cat /proc/sys/kernel/perf_event_max_stack
  256

But as soon as there is some event using callchains we get:

  # echo 512 > /proc/sys/kernel/perf_event_max_stack
  -bash: echo: write error: Device or resource busy
  #

Because we only allocate the callchain percpu data structures when there
is a user, which allows for changing the max easily, its just a matter
of having no callchain users at that point.

Reported-and-Tested-by: Brendan Gregg <brendan.d.gregg@gmail.com>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/r/20160426002928.GB16708@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-27 10:20:39 -03:00