Commit graph

379 commits

Author SHA1 Message Date
Phapoom Saksri
57cbd42092 MindPatched v2.5 2025-01-25 18:19:22 +07:00
David Howells
1999644115 UPSTREAM: Make anon_inodes unconditional
Make the anon_inodes facility unconditional so that it can be used by core
VFS code.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

(cherry picked from commit dadd2299ab61fc2b55b95b7b3a8f674cdd3b69c9)

Bug: 135608568
Test: test program using syscall(__NR_sys_pidfd_open,..) and poll()
Change-Id: I2f97bda4f360d8d05bbb603de839717b3d8067ae
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2019-09-03 13:44:37 -07:00
Greg Kroah-Hartman
32e6695e35 This is the 4.9.155 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlxbDFsACgkQONu9yGCS
 aT4TmQ/+PCgz7TKTu1/Sh7ScmbSjlYFfaQ25K9NzqWDuaTOXCsvrWY4IDdbk7fxZ
 WnM8WPpTUfKaRwUVL3T32uwv/uMM8TnXN2r4dB9TH+WF/UK1ovLHQGO5zGWS9erY
 dvckRjkRuPe27KYd/sPSA7PRBiWuw1EFqcBIdAuC8OJK1knm3W95C8XVeIkVfey2
 /NFIvZELYEOCMzXwPqXbeFG553VtHVGB2GRJqEqZzqZ+JGnUCyBpCgvuwbrQ706l
 /Smvz23mmElUGzdTXr4uRPmf2g8bA3a+08/gQM4YPpZYRUMbcCVE3+KaYf3jjJPR
 Inj+cKBhF8R6JwkJayzdBJ1KbG6e6DaL2yANqNvoeYCkyP+4GNmoWKebFYklrXr/
 jn/btnJtHwSBDVufdgW6uQ039UE66Wfvvi7QrkCgTjebEyf6jU3SnB8Z7HpZZMCg
 dwgO6rHKklm6cSJTBZsIDyBd4oj19bTTvOiUcaayQXfeyjlrbWXhH6UOEBze54uT
 Ho8yjCq9r2TiO6g9mJOfkkvep1hgNVgabnOyA/papUx3my9sGkUn1Rqg1Pbm9OEF
 PV4p853t4+Qe5iBq+yXgacrCg+21q9Xg3ct6QW/c/Quz4UirR9Ayw93C4ej4rJYZ
 79/mADmkpH+A1ufP2F80UoWo0IN9oizecfYHOZ+6useUEVgIRik=
 =QoXP
 -----END PGP SIGNATURE-----

Merge 4.9.155 into android-4.9

Changes in 4.9.155
	Fix "net: ipv4: do not handle duplicate fragments as overlapping"
	fs: add the fsnotify call to vfs_iter_write
	ipv6: Consider sk_bound_dev_if when binding a socket to an address
	l2tp: copy 4 more bytes to linear part if necessary
	net/mlx4_core: Add masking for a few queries on HCA caps
	netrom: switch to sock timer API
	net/rose: fix NULL ax25_cb kernel panic
	ucc_geth: Reset BQL queue when stopping device
	net/mlx5e: Allow MAC invalidation while spoofchk is ON
	l2tp: remove l2specific_len dependency in l2tp_core
	l2tp: fix reading optional fields of L2TPv3
	ipvlan, l3mdev: fix broken l3s mode wrt local routes
	CIFS: Do not count -ENODATA as failure for query directory
	fs/dcache: Fix incorrect nr_dentry_unused accounting in shrink_dcache_sb()
	ARM: cns3xxx: Fix writing to wrong PCI config registers after alignment
	arm64: kaslr: ensure randomized quantities are clean also when kaslr is off
	arm64: hyp-stub: Forbid kprobing of the hyp-stub
	arm64: hibernate: Clean the __hyp_text to PoC after resume
	gfs2: Revert "Fix loop in gfs2_rbm_find"
	platform/x86: asus-nb-wmi: Map 0x35 to KEY_SCREENLOCK
	platform/x86: asus-nb-wmi: Drop mapping of 0x33 and 0x34 scan codes
	mmc: sdhci-iproc: handle mmc_of_parse() errors during probe
	kernel/exit.c: release ptraced tasks before zap_pid_ns_processes
	mm, oom: fix use-after-free in oom_kill_process
	mm: hwpoison: use do_send_sig_info() instead of force_sig()
	mm: migrate: don't rely on __PageMovable() of newpage after unlocking it
	cifs: Always resolve hostname before reconnecting
	drivers: core: Remove glue dirs from sysfs earlier
	fs: don't scan the inode cache before SB_BORN is set
	fanotify: fix handling of events on child sub-directory
	Linux 4.9.155

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2019-02-07 09:14:43 +01:00
Amir Goldstein
987d8ff3a2 fanotify: fix handling of events on child sub-directory
commit b469e7e47c8a075cc08bcd1e85d4365134bdcdd5 upstream.

When an event is reported on a sub-directory and the parent inode has
a mark mask with FS_EVENT_ON_CHILD|FS_ISDIR, the event will be sent to
fsnotify() even if the event type is not in the parent mark mask
(e.g. FS_OPEN).

Further more, if that event happened on a mount or a filesystem with
a mount/sb mark that does have that event type in their mask, the "on
child" event will be reported on the mount/sb mark.  That is not
desired, because user will get a duplicate event for the same action.

Note that the event reported on the victim inode is never merged with
the event reported on the parent inode, because of the check in
should_merge(): old_fsn->inode == new_fsn->inode.

Fix this by looking for a match of an actual event type (i.e. not just
FS_ISDIR) in parent's inode mark mask and by not reporting an "on child"
event to group if event type is only found on mount/sb marks.

[backport hint: The bug seems to have always been in fanotify, but this
                patch will only apply cleanly to v4.19.y]

Cc: <stable@vger.kernel.org> # v4.19
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
[amir: backport to v4.9]
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-06 17:33:30 +01:00
Greg Kroah-Hartman
320d53a9d0 This is the 4.9.96 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlre3foACgkQONu9yGCS
 aT4ygw/7BxK85/7my26TCcF3NlNfw8x9B1AuEHcjQdUFmSFip9fbolC1VVFQyzWV
 1Z6sK5LKrFU/234A4LVW+6oFVnUi8VvX5RnWNqRLj/xpPyv/L5PsG4A7VwQLZDCj
 yn6OrIkQ2czv3oio9R8bz1DFxrPsxoCdrMzV1xR/tbWQJscSRSomBSNstZ63lGQP
 FleS//hG21092lM0EDPQRfc9/xnwJFMKJoCwBrf6SUR7CK2ZdYp+w+HoeP5BSA49
 BCDdMU6wi63gTdfGFJa76VcKuFxRRic3F9VVJwOuYPCfCpgNz7GiUaGXirNCRSxD
 F0VveFMx9lOrxMCyDDEoj3lRaZ4FrhwFOF+Q0RK4/p8XH8Pr9emvt4beOeYNk3ev
 dFvCI/zA3zmDj8ppB+ZnKq9GTRSuNZ5sw4q7AyFu6fm0mSMV84ZRHgCrVbJkis3i
 VDnu6V+Nd9ZYP6rR6AegtQwFM3OReVVJG4ZpB17PZOebLdM+n4DsmqDCgvhFboVP
 WowtlR13GIn6ER7hiHfOlwv3CtTUspRrAZi73F8aEA7gdoYzGc8/9Q1fA738WiI4
 46utL18qyfjzwbwmI9dtFx1SLkrobSJ+X88PLquLYNUleoxbLx/2u4+v+gm0P4xo
 yQGPjIv+kms+rbwPR3ogMKqC2GrYYJs4uHBEPJboJWzYVoPAaEs=
 =yoJG
 -----END PGP SIGNATURE-----

Merge 4.9.96 into android-4.9

Changes in 4.9.96
	tty: make n_tty_read() always abort if hangup is in progress
	ubifs: Check ubifs_wbuf_sync() return code
	ubi: fastmap: Don't flush fastmap work on detach
	ubi: Fix error for write access
	ubi: Reject MLC NAND
	fs/reiserfs/journal.c: add missing resierfs_warning() arg
	resource: fix integer overflow at reallocation
	ipc/shm: fix use-after-free of shm file via remap_file_pages()
	mm, slab: reschedule cache_reap() on the same CPU
	usb: musb: gadget: misplaced out of bounds check
	usb: gadget: udc: core: update usb_ep_queue() documentation
	ARM: dts: at91: at91sam9g25: fix mux-mask pinctrl property
	ARM: dts: exynos: Fix IOMMU support for GScaler devices on Exynos5250
	ARM: dts: at91: sama5d4: fix pinctrl compatible string
	spi: Fix scatterlist elements size in spi_map_buf
	xen-netfront: Fix hang on device removal
	regmap: Fix reversed bounds check in regmap_raw_write()
	ACPI / video: Add quirk to force acpi-video backlight on Samsung 670Z5E
	ACPI / hotplug / PCI: Check presence of slot itself in get_slot_status()
	USB: gadget: f_midi: fixing a possible double-free in f_midi
	USB:fix USB3 devices behind USB3 hubs not resuming at hibernate thaw
	usb: dwc3: pci: Properly cleanup resource
	smb3: Fix root directory when server returns inode number of zero
	HID: i2c-hid: fix size check and type usage
	powerpc/powernv: Handle unknown OPAL errors in opal_nvram_write()
	powerpc/64: Fix smp_wmb barrier definition use use lwsync consistently
	powerpc/powernv: define a standard delay for OPAL_BUSY type retry loops
	powerpc/powernv: Fix OPAL NVRAM driver OPAL_BUSY loops
	HID: Fix hid_report_len usage
	HID: core: Fix size as type u32
	ASoC: ssm2602: Replace reg_default_raw with reg_default
	thunderbolt: Resume control channel after hibernation image is created
	irqchip/gic: Take lock when updating irq type
	random: use a tighter cap in credit_entropy_bits_safe()
	jbd2: if the journal is aborted then don't allow update of the log tail
	ext4: don't update checksum of new initialized bitmaps
	ext4: protect i_disksize update by i_data_sem in direct write path
	ext4: fail ext4_iget for root directory if unallocated
	RDMA/ucma: Don't allow setting RDMA_OPTION_IB_PATH without an RDMA device
	RDMA/rxe: Fix an out-of-bounds read
	ALSA: pcm: Fix UAF at PCM release via PCM timer access
	IB/srp: Fix srp_abort()
	IB/srp: Fix completion vector assignment algorithm
	dmaengine: at_xdmac: fix rare residue corruption
	libnvdimm, namespace: use a safe lookup for dimm device name
	nfit, address-range-scrub: fix scrub in-progress reporting
	um: Compile with modern headers
	um: Use POSIX ucontext_t instead of struct ucontext
	iommu/vt-d: Fix a potential memory leak
	mmc: jz4740: Fix race condition in IRQ mask update
	clk: mvebu: armada-38x: add support for 1866MHz variants
	clk: mvebu: armada-38x: add support for missing clocks
	clk: fix false-positive Wmaybe-uninitialized warning
	clk: bcm2835: De-assert/assert PLL reset signal when appropriate
	pwm: rcar: Fix a condition to prevent mismatch value setting to duty
	thermal: imx: Fix race condition in imx_thermal_probe()
	dt-bindings: clock: mediatek: add binding for fixed-factor clock axisel_d4
	watchdog: f71808e_wdt: Fix WD_EN register read
	vfio/pci: Virtualize Maximum Read Request Size
	ALSA: pcm: Use ERESTARTSYS instead of EINTR in OSS emulation
	ALSA: pcm: Avoid potential races between OSS ioctls and read/write
	ALSA: pcm: Return -EBUSY for OSS ioctls changing busy streams
	ALSA: pcm: Fix mutex unbalance in OSS emulation ioctls
	ALSA: pcm: Fix endless loop for XRUN recovery in OSS emulation
	ext4: don't allow r/w mounts if metadata blocks overlap the superblock
	drm/amdgpu: Add an ATPX quirk for hybrid laptop
	drm/amdgpu: Fix always_valid bos multiple LRU insertions.
	drm/amdgpu: Fix PCIe lane width calculation
	drm/rockchip: Clear all interrupts before requesting the IRQ
	drm/radeon: Fix PCIe lane width calculation
	ALSA: line6: Use correct endpoint type for midi output
	ALSA: rawmidi: Fix missing input substream checks in compat ioctls
	ALSA: hda - New VIA controller suppor no-snoop path
	random: fix crng_ready() test
	random: crng_reseed() should lock the crng instance that it is modifying
	random: add new ioctl RNDRESEEDCRNG
	HID: hidraw: Fix crash on HIDIOCGFEATURE with a destroyed device
	MIPS: uaccess: Add micromips clobbers to bzero invocation
	MIPS: memset.S: EVA & fault support for small_memset
	MIPS: memset.S: Fix return of __clear_user from Lpartial_fixup
	MIPS: memset.S: Fix clobber of v1 in last_fixup
	powerpc/eeh: Fix enabling bridge MMIO windows
	powerpc/lib: Fix off-by-one in alternate feature patching
	udf: Fix leak of UTF-16 surrogates into encoded strings
	jffs2_kill_sb(): deal with failed allocations
	hypfs_kill_super(): deal with failed allocations
	orangefs_kill_sb(): deal with allocation failures
	rpc_pipefs: fix double-dput()
	Don't leak MNT_INTERNAL away from internal mounts
	autofs: mount point create should honour passed in mode
	mm/filemap.c: fix NULL pointer in page_cache_tree_insert()
	fanotify: fix logic of events on child
	writeback: safer lock nesting
	block/mq: fix potential deadlock during cpu hotplug
	Linux 4.9.96

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-04-24 11:26:46 +02:00
Amir Goldstein
71f24a9130 fanotify: fix logic of events on child
commit 54a307ba8d3cd00a3902337ffaae28f436eeb1a4 upstream.

When event on child inodes are sent to the parent inode mark and
parent inode mark was not marked with FAN_EVENT_ON_CHILD, the event
will not be delivered to the listener process. However, if the same
process also has a mount mark, the event to the parent inode will be
delivered regadless of the mount mark mask.

This behavior is incorrect in the case where the mount mark mask does
not contain the specific event type. For example, the process adds
a mark on a directory with mask FAN_MODIFY (without FAN_EVENT_ON_CHILD)
and a mount mark with mask FAN_CLOSE_NOWRITE (without FAN_ONDIR).

A modify event on a file inside that directory (and inside that mount)
should not create a FAN_MODIFY event, because neither of the marks
requested to get that event on the file.

Fixes: 1968f5eed5 ("fanotify: use both marks when possible")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
[natechancellor: Fix small conflict due to lack of 3cd5eca8d7a2f]
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-24 09:34:18 +02:00
Greg Kroah-Hartman
e6b0c64f6f This is the 4.9.41 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlmHya0ACgkQONu9yGCS
 aT7J4BAArYY7/NDbB+rIFtIdqdk3d6PiOyfQjQB2fSrhdR/+38a3Z+AtfNoPkppd
 DQsEnXuZyJeUP5GCNAue1urp0nUuuw/Kr/7DUZvfO8wnjFjS/hfKxnRE8o/S+acN
 y/TBFO3LuWSp8ROouAnAr7EEDrLtses6/boQHiioGcb/uqI9JVutkdnOTnm5xYJQ
 bAo7dEOlK4trRRGbbMju2FHTFHO/NStos3GD2ORcqS3ibj2FyHvGEmbyyBfmmi+2
 Jf5cOvn99lszHweUTxnRw93q6RZcGtmhYoPoANzYdtoZiXkV++EaEAfQ6KPVqeno
 mLNPqmWWvgJUsd1uCaR89GXwczVJtqThvGuJMroau+ShLJWuI8mL8N/N+g9wThJr
 ormq8PTP3oea9u63hfae1hfr4KsPqQ6UTQcSY7OuT4mgchU5T/DuSyYShy3kdtx4
 bT1hzpXwD9y0JCgPVfH1o7ad996D9/MVXpELNREcvrqEGsvRbxBx7gPc4FUzMqaz
 vft6z0MX8AvgtNLm9EY5a956ixRG27vGYA7drsLPFXKxIuHQ5oh2nBCupxkrDtGK
 0FEWeV6p+DJZm4Gl+0pCYt6AqxzszNW0eVM7sKngbAIYQ77m2h50TOb0HIA0MrKu
 Nrjs68hKc/louKc1eYQZnt6bD4EVHVjbpgVP+l1PGpKo/OSnxes=
 =tb62
 -----END PGP SIGNATURE-----

Merge 4.9.41 into android-4.9

Changes in 4.9.41
	af_key: Add lock to key dump
	pstore: Make spinlock per zone instead of global
	net: reduce skb_warn_bad_offload() noise
	jfs: Don't clear SGID when inheriting ACLs
	ALSA: fm801: Initialize chip after IRQ handler is registered
	ALSA: hda - Add missing NVIDIA GPU codec IDs to patch table
	parisc: Prevent TLB speculation on flushed pages on CPUs that only support equivalent aliases
	parisc: Extend disabled preemption in copy_user_page
	parisc: Suspend lockup detectors before system halt
	powerpc/pseries: Fix of_node_put() underflow during reconfig remove
	NFS: invalidate file size when taking a lock.
	NFSv4.1: Fix a race where CB_NOTIFY_LOCK fails to wake a waiter
	crypto: authencesn - Fix digest_null crash
	KVM: PPC: Book3S HV: Enable TM before accessing TM registers
	md/raid5: add thread_group worker async_tx_issue_pending_all
	drm/vmwgfx: Fix gcc-7.1.1 warning
	drm/nouveau/disp/nv50-: bump max chans to 21
	drm/nouveau/bar/gf100: fix access to upper half of BAR2
	KVM: PPC: Book3S HV: Restore critical SPRs to host values on guest exit
	KVM: PPC: Book3S HV: Save/restore host values of debug registers
	Revert "powerpc/numa: Fix percpu allocations to be NUMA aware"
	Staging: comedi: comedi_fops: Avoid orphaned proc entry
	drm: rcar-du: Simplify and fix probe error handling
	smp/hotplug: Move unparking of percpu threads to the control CPU
	smp/hotplug: Replace BUG_ON and react useful
	nfc: Fix hangup of RC-S380* in port100_send_ack()
	nfc: fdp: fix NULL pointer dereference
	net: phy: Do not perform software reset for Generic PHY
	isdn: Fix a sleep-in-atomic bug
	isdn/i4l: fix buffer overflow
	ath10k: fix null deref on wmi-tlv when trying spectral scan
	wil6210: fix deadlock when using fw_no_recovery option
	mailbox: always wait in mbox_send_message for blocking Tx mode
	mailbox: skip complete wait event if timer expired
	mailbox: handle empty message in tx_tick
	sched/cgroup: Move sched_online_group() back into css_online() to fix crash
	RDMA/uverbs: Fix the check for port number
	ipmi/watchdog: fix watchdog timeout set on reboot
	dentry name snapshots
	v4l: s5c73m3: fix negation operator
	pstore: Allow prz to control need for locking
	pstore: Correctly initialize spinlock and flags
	pstore: Use dynamic spinlock initializer
	net: skb_needs_check() accepts CHECKSUM_NONE for tx
	device-dax: fix sysfs duplicate warnings
	x86/mce/AMD: Make the init code more robust
	r8169: add support for RTL8168 series add-on card.
	ARM: omap2+: fixing wrong strcat for Non-NULL terminated string
	dt-bindings: power/supply: Update TPS65217 properties
	dt-bindings: input: Specify the interrupt number of TPS65217 power button
	ARM: dts: am57xx-idk: Put USB2 port in peripheral mode
	ARM: dts: n900: Mark eMMC slot with no-sdio and no-sd flags
	net/mlx5: Disable RoCE on the e-switch management port under switchdev mode
	ipv6: Should use consistent conditional judgement for ip6 fragment between __ip6_append_data and ip6_finish_output
	net/mlx4_core: Use-after-free causes a resource leak in flow-steering detach
	net/mlx4: Remove BUG_ON from ICM allocation routine
	net/mlx4_core: Fix raw qp flow steering rules under SRIOV
	drm/msm: Ensure that the hardware write pointer is valid
	drm/msm: Put back the vaddr in submit_reloc()
	drm/msm: Verify that MSM_SUBMIT_BO_FLAGS are set
	vfio-pci: use 32-bit comparisons for register address for gcc-4.5
	irqchip/keystone: Fix "scheduling while atomic" on rt
	ASoC: tlv320aic3x: Mark the RESET register as volatile
	spi: dw: Make debugfs name unique between instances
	ASoC: nau8825: fix invalid configuration in Pre-Scalar of FLL
	irqchip/mxs: Enable SKIP_SET_WAKE and MASK_ON_SUSPEND
	openrisc: Add _text symbol to fix ksym build error
	dmaengine: ioatdma: Add Skylake PCI Dev ID
	dmaengine: ioatdma: workaround SKX ioatdma version
	l2tp: consider '::' as wildcard address in l2tp_ip6 socket lookup
	dmaengine: ti-dma-crossbar: Add some 'of_node_put()' in error path.
	usb: dwc3: omap: fix race of pm runtime with irq handler in probe
	ARM64: zynqmp: Fix W=1 dtc 1.4 warnings
	ARM64: zynqmp: Fix i2c node's compatible string
	perf probe: Fix to get correct modname from elf header
	ARM: s3c2410_defconfig: Fix invalid values for NF_CT_PROTO_*
	ACPI / scan: Prefer devices without _HID/_CID for _ADR matching
	usb: gadget: Fix copy/pasted error message
	Btrfs: use down_read_nested to make lockdep silent
	Btrfs: fix lockdep warning about log_mutex
	benet: stricter vxlan offloading check in be_features_check
	Btrfs: adjust outstanding_extents counter properly when dio write is split
	Xen: ARM: Zero reserved fields of xatp before making hypervisor call
	tools lib traceevent: Fix prev/next_prio for deadline tasks
	xfrm: Don't use sk_family for socket policy lookups
	perf tools: Install tools/lib/traceevent plugins with install-bin
	perf symbols: Robustify reading of build-id from sysfs
	video: fbdev: cobalt_lcdfb: Handle return NULL error from devm_ioremap
	vfio-pci: Handle error from pci_iomap
	arm64: mm: fix show_pte KERN_CONT fallout
	nvmem: imx-ocotp: Fix wrong register size
	net: usb: asix_devices: add .reset_resume for USB PM
	ASoC: fsl_ssi: set fifo watermark to more reliable value
	sh_eth: enable RX descriptor word 0 shift on SH7734
	ARCv2: IRQ: Call entry/exit functions for chained handlers in MCIP
	ALSA: usb-audio: test EP_FLAG_RUNNING at urb completion
	x86/platform/intel-mid: Rename 'spidev' to 'mrfld_spidev'
	perf/x86: Set pmu->module in Intel PMU modules
	ASoC: Intel: bytcr-rt5640: fix settings in internal clock mode
	HID: ignore Petzl USB headlamp
	scsi: fnic: Avoid sending reset to firmware when another reset is in progress
	scsi: snic: Return error code on memory allocation failure
	scsi: bfa: Increase requested firmware version to 3.2.5.1
	ASoC: Intel: Skylake: Release FW ctx in cleanup
	ASoC: dpcm: Avoid putting stream state to STOP when FE stream is paused
	Linux 4.9.41

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2017-08-07 14:21:32 -07:00
Al Viro
ad25f11ed2 dentry name snapshots
commit 49d31c2f389acfe83417083e1208422b4091cd9e upstream.

take_dentry_name_snapshot() takes a safe snapshot of dentry name;
if the name is a short one, it gets copied into caller-supplied
structure, otherwise an extra reference to external name is grabbed
(those are never modified).  In either case the pointer to stable
string is stored into the same structure.

dentry must be held by the caller of take_dentry_name_snapshot(),
but may be freely dropped afterwards - the snapshot will stay
until destroyed by release_dentry_name_snapshot().

Intended use:
	struct name_snapshot s;

	take_dentry_name_snapshot(&s, dentry);
	...
	access s.name
	...
	release_dentry_name_snapshot(&s);

Replaces fsnotify_oldname_...(), gets used in fsnotify to obtain the name
to pass down with event.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-06 18:59:43 -07:00
Greg Kroah-Hartman
951d823c04 This is the 4.9.30 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlkm3+kACgkQONu9yGCS
 aT72UhAAuNgHUXQWWaTy7Fl0bsIn5TDB5I9hrVll1EDehdqyrPt3bHec9/kClDOs
 eBov++TfYFkCbHMRYSzcxX36AXhD52LDeJ7t3D8L59YaASJv/mjTAqhXUvFlsxX+
 qNOxFJtfB7eURRMd8Onn/04F1I7h7wNh2Vu49NYZ3NYbCrMVmdUSZOL686lm20gT
 sceXEvawCtrJN3+c3g45Kbm4wrB+ymJDtUApRGbzrWm94hbjkD67wn5T0YRDHpfV
 oYjIwlZBCjW2yjSfkuLD+1I9ApmbMRUR/WYl1Zthh3FVh16ZCXAGDDhoBb+SOoor
 AFkgiikz2DcNoqmWBzQ8W7h46nZbO7Fm6Ejg/F2kSvUY/4C6S5W+T4rsHHCHhcI3
 rNGw6j+lPemlNtcIDA7JjfPjbgdtvHRAUHTMBrUpktxpMJU1L2gcbSJnNjgWTVP7
 yvnqp8nf2bOKGvv/yKNLAbt1XHOo+TCBERbGA6bXy/5QDMseMB4QnIhEjrUWt1MG
 QxgKxdFiSeONnkaAHecB3uaS7CiWSTWMCljESK6rcomXLBAmjVfjzTgrixBLuSzh
 Qi/xZTdLcMXYb0U0fK7vohWryxuN43o5Q3m93mAi1Sqm2uTAHE4ZTGm7BKh5O2uZ
 tXZ5r9+GkOqvwk6ssfOKDDMprPiMnBxZ/OxvMLjFdA4N5QTPYJE=
 =kXgS
 -----END PGP SIGNATURE-----

Merge 4.9.30 into android-4.9

Changes in 4.9.30
	usb: misc: legousbtower: Fix buffers on stack
	usb: misc: legousbtower: Fix memory leak
	USB: ene_usb6250: fix DMA to the stack
	watchdog: pcwd_usb: fix NULL-deref at probe
	char: lp: fix possible integer overflow in lp_setup()
	USB: core: replace %p with %pK
	tpm_tis_core: Choose appropriate timeout for reading burstcount
	ALSA: hda: Fix cpu lockup when stopping the cmd dmas
	ARM: tegra: paz00: Mark panel regulator as enabled on boot
	fanotify: don't expose EOPENSTALE to userspace
	tpm_tis_spi: Use single function to transfer data
	tpm_tis_spi: Abort transfer when too many wait states are signaled
	tpm_tis_spi: Check correct byte for wait state indicator
	tpm_tis_spi: Remove limitation of transfers to MAX_SPI_FRAMESIZE bytes
	tpm_tis_spi: Add small delay after last transfer
	tpm: msleep() delays - replace with usleep_range() in i2c nuvoton driver
	tpm: add sleep only for retry in i2c_nuvoton_write_status()
	tpm_crb: check for bad response size
	ASoC: cs4271: configure reset GPIO as output
	mlx5: Fix mlx5_ib_map_mr_sg mr length
	infiniband: call ipv6 route lookup via the stub interface
	dm btree: fix for dm_btree_find_lowest_key()
	dm raid: select the Kconfig option CONFIG_MD_RAID0
	dm bufio: avoid a possible ABBA deadlock
	dm bufio: check new buffer allocation watermark every 30 seconds
	dm mpath: split and rename activate_path() to prepare for its expanded use
	dm cache metadata: fail operations if fail_io mode has been established
	dm bufio: make the parameter "retain_bytes" unsigned long
	dm thin metadata: call precommit before saving the roots
	dm space map disk: fix some book keeping in the disk space map
	md: update slab_cache before releasing new stripes when stripes resizing
	md: MD_CLOSING needs to be cleared after called md_set_readonly or do_md_stop
	rtlwifi: rtl8821ae: setup 8812ae RFE according to device type
	mwifiex: MAC randomization should not be persistent
	mwifiex: pcie: fix cmd_buf use-after-free in remove/reset
	ima: accept previously set IMA_NEW_FILE
	KVM: x86: Fix load damaged SSEx MXCSR register
	KVM: x86: Fix potential preemption when get the current kvmclock timestamp
	KVM: X86: Fix read out-of-bounds vulnerability in kvm pio emulation
	x86: fix 32-bit case of __get_user_asm_u64()
	regulator: rk808: Fix RK818 LDO2
	regulator: tps65023: Fix inverted core enable logic.
	s390/kdump: Add final note
	s390/cputime: fix incorrect system time
	ath9k_htc: Add support of AirTies 1eda:2315 AR9271 device
	ath9k_htc: fix NULL-deref at probe
	drm/amdgpu: Make display watermark calculations more accurate
	drm/amdgpu: Avoid overflows/divide-by-zero in latency_watermark calculations.
	drm/amdgpu: Add missing lb_vblank_lead_lines setup to DCE-6 path.
	drm/nouveau/therm: remove ineffective workarounds for alarm bugs
	drm/nouveau/tmr: ack interrupt before processing alarms
	drm/nouveau/tmr: fix corruption of the pending list when rescheduling an alarm
	drm/nouveau/tmr: avoid processing completed alarms when adding a new one
	drm/nouveau/tmr: handle races with hw when updating the next alarm time
	gpio: omap: return error if requested debounce time is not possible
	cdc-acm: fix possible invalid access when processing notification
	ohci-pci: add qemu quirk
	cxl: Force context lock during EEH flow
	cxl: Route eeh events to all drivers in cxl_pci_error_detected()
	proc: Fix unbalanced hard link numbers
	of: fix sparse warning in of_pci_range_parser_one
	of: fix "/cpus" reference leak in of_numa_parse_cpu_nodes()
	of: fdt: add missing allocation-failure check
	ibmvscsis: Do not send aborted task response
	iio: dac: ad7303: fix channel description
	IIO: bmp280-core.c: fix error in humidity calculation
	IB/hfi1: Return an error on memory allocation failure
	IB/hfi1: Fix a subcontext memory leak
	pid_ns: Sleep in TASK_INTERRUPTIBLE in zap_pid_ns_processes
	pid_ns: Fix race between setns'ed fork() and zap_pid_ns_processes()
	USB: serial: ftdi_sio: fix setting latency for unprivileged users
	USB: serial: ftdi_sio: add Olimex ARM-USB-TINY(H) PIDs
	USB: chaoskey: fix Alea quirk on big-endian hosts
	f2fs: check entire encrypted bigname when finding a dentry
	fscrypt: avoid collisions when presenting long encrypted filenames
	libnvdimm: fix clear length of nvdimm_forget_poison()
	xhci: remove GFP_DMA flag from allocation
	usb: host: xhci-plat: propagate return value of platform_get_irq()
	xhci: apply PME_STUCK_QUIRK and MISSING_CAS quirk for Denverton
	usb: host: xhci-mem: allocate zeroed Scratchpad Buffer
	net: irda: irda-usb: fix firmware name on big-endian hosts
	usbvision: fix NULL-deref at probe
	mceusb: fix NULL-deref at probe
	ttusb2: limit messages to buffer size
	dvb-usb-dibusb-mc-common: Add MODULE_LICENSE
	usb: dwc3: gadget: Prevent losing events in event cache
	usb: musb: tusb6010_omap: Do not reset the other direction's packet size
	usb: musb: Fix trying to suspend while active for OTG configurations
	USB: iowarrior: fix info ioctl on big-endian hosts
	usb: serial: option: add Telit ME910 support
	USB: serial: qcserial: add more Lenovo EM74xx device IDs
	USB: serial: mct_u232: fix big-endian baud-rate handling
	USB: serial: io_ti: fix div-by-zero in set_termios
	USB: hub: fix SS hub-descriptor handling
	USB: hub: fix non-SS hub-descriptor handling
	ipx: call ipxitf_put() in ioctl error path
	iio: proximity: as3935: fix as3935_write
	iio: hid-sensor: Store restore poll and hysteresis on S3
	s5p-mfc: Fix race between interrupt routine and device functions
	gspca: konica: add missing endpoint sanity check
	s5p-mfc: Fix unbalanced call to clock management
	dib0700: fix NULL-deref at probe
	zr364xx: enforce minimum size when reading header
	dvb-frontends/cxd2841er: define symbol_rate_min/max in T/C fe-ops
	digitv: limit messages to buffer size
	dw2102: limit messages to buffer size
	cx231xx-audio: fix init error path
	cx231xx-audio: fix NULL-deref at probe
	cx231xx-cards: fix NULL-deref at probe
	powerpc/mm: Ensure IRQs are off in switch_mm()
	powerpc/eeh: Avoid use after free in eeh_handle_special_event()
	powerpc/book3s/mce: Move add_taint() later in virtual mode
	powerpc/pseries: Fix of_node_put() underflow during DLPAR remove
	powerpc/iommu: Do not call PageTransHuge() on tail pages
	powerpc/64e: Fix hang when debugging programs with relocated kernel
	powerpc/tm: Fix FP and VMX register corruption
	arm64: KVM: Do not use stack-protector to compile EL2 code
	arm: KVM: Do not use stack-protector to compile HYP code
	KVM: arm: plug potential guest hardware debug leakage
	ARM: 8662/1: module: split core and init PLT sections
	ARM: 8670/1: V7M: Do not corrupt vector table around v7m_invalidate_l1 call
	ARM: dts: at91: sama5d3_xplained: fix ADC vref
	ARM: dts: at91: sama5d3_xplained: not all ADC channels are available
	ARM: dts: imx6sx-sdb: Remove OPP override
	arm64: dts: hi6220: Reset the mmc hosts
	arm64: xchg: hazard against entire exchange variable
	arm64: ensure extension of smp_store_release value
	arm64: armv8_deprecated: ensure extension of addr
	arm64: uaccess: ensure extension of access_ok() addr
	arm64: documentation: document tagged pointer stack constraints
	staging: rtl8192e: rtl92e_fill_tx_desc fix write to mapped out memory.
	staging: rtl8192e: fix 2 byte alignment of register BSSIDR.
	staging: rtl8192e: rtl92e_get_eeprom_size Fix read size of EPROM_CMD.
	staging: rtl8192e: GetTs Fix invalid TID 7 warning.
	iommu/vt-d: Flush the IOTLB to get rid of the initial kdump mappings
	metag/uaccess: Fix access_ok()
	metag/uaccess: Check access_ok in strncpy_from_user
	stackprotector: Increase the per-task stack canary's random range from 32 bits to 64 bits on 64-bit platforms
	uwb: fix device quirk on big-endian hosts
	genirq: Fix chained interrupt data ordering
	nvme: unmap CMB and remove sysfs file in reset path
	MIPS: Loongson-3: Select MIPS_L1_CACHE_SHIFT_6
	osf_wait4(): fix infoleak
	um: Fix to call read_initrd after init_bootmem
	tracing/kprobes: Enforce kprobes teardown after testing
	PCI: hv: Allocate interrupt descriptors with GFP_ATOMIC
	PCI: hv: Specify CPU_AFFINITY_ALL for MSI affinity when >= 32 CPUs
	PCI: Fix pci_mmap_fits() for HAVE_PCI_RESOURCE_TO_USER platforms
	PCI: Fix another sanity check bug in /proc/pci mmap
	PCI: Only allow WC mmap on prefetchable resources
	PCI: Freeze PME scan before suspending devices
	mtd: nand: orion: fix clk handling
	mtd: nand: omap2: Fix partition creation via cmdline mtdparts
	mtd: nand: add ooblayout for old hamming layout
	drm/edid: Add 10 bpc quirk for LGD 764 panel in HP zBook 17 G2
	NFSv4: Fix a hang in OPEN related to server reboot
	NFS: Fix use after free in write error path
	NFS: Use GFP_NOIO for two allocations in writeback
	nfsd: fix undefined behavior in nfsd4_layout_verify
	nfsd: encoders mustn't use unitialized values in error cases
	drivers: char: mem: Check for address space wraparound with mmap()
	drm/i915/gvt: Disable access to stolen memory as a guest
	Linux 4.9.30

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2017-05-25 17:32:43 +02:00
Amir Goldstein
cc0f994c20 fanotify: don't expose EOPENSTALE to userspace
commit 4ff33aafd32e084f5ee7faa54ba06e95f8b1b8af upstream.

When delivering an event to userspace for a file on an NFS share,
if the file is deleted on server side before user reads the event,
user will not get the event.

If the event queue contained several events, the stale event is
quietly dropped and read() returns to user with events read so far
in the buffer.

If the event queue contains a single stale event or if the stale
event is a permission event, read() returns to user with the kernel
internal error code 518 (EOPENSTALE), which is not a POSIX error code.

Check the internal return value -EOPENSTALE in fanotify_read(), just
the same as it is checked in path_openat() and drop the event in the
cases that it is not already dropped.

This is a reproducer from Marko Rauhamaa:

Just take the example program listed under "man fanotify" ("fantest")
and follow these steps:

    ==============================================================
    NFS Server    NFS Client(1)     NFS Client(2)
    ==============================================================
    # echo foo >/nfsshare/bar.txt
                  # cat /nfsshare/bar.txt
                  foo
                                    # ./fantest /nfsshare
                                    Press enter key to terminate.
                                    Listening for events.
    # rm -f /nfsshare/bar.txt
                  # cat /nfsshare/bar.txt
                                    read: Unknown error 518
                  cat: /nfsshare/bar.txt: Operation not permitted
    ==============================================================

where NFS Client (1) and (2) are two terminal sessions on a single NFS
Client machine.

Reported-by: Marko Rauhamaa <marko.rauhamaa@f-secure.com>
Tested-by: Marko Rauhamaa <marko.rauhamaa@f-secure.com>
Cc: <linux-api@vger.kernel.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25 15:44:31 +02:00
Dmitry Shmidt
cd08287396 This is the 4.9.6 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAliJpGsACgkQONu9yGCS
 aT5S3Q/9GjL13+Bv/KXRDtCxN7bGjyKqc45OyBzPT7h9atr51PUVOXTNDhlTaHmT
 7LzLw9le1D113svQC4iJHceqJAm2SHi0MjzpiZvF7j2dDERQ/Gqrde+w51SW67VE
 bB6QdyHHsx/pLmkEVR5Q3MJZ4+PlkzbklLCkLH7b2HpPFsMLTd+dPQlBtbttd5yk
 +K7WH3z549bh+JzLQxhi0NenFLq0EPK3kIw+hAqWqj7vW9XdVMtsuQGZP2ZMmdDW
 3aCcvVH+OCjxBJWZUDTJF++EVChoppqSgIMQ2BR8jLg9g/K3DqHfy75K9HF3DOek
 rBdxDlGFMe408QpN0JOdGz0508KZZOeY/rPEJF1eqwumudcWdYv5FPo/OmSDP/o7
 +7frMed/VohXzY8IITJeSenKFb+C+GBMn5qtq48uxUBiAtbQc5sOKUaSdhLpkRIe
 diXHkkP3oOHds5XlgnPr1oy+06iwOJjWSDu1wWyscZ6eBqalroymPg1JARTkEZjP
 L1GWW0OsxViGX5bGigFoajpnskU0c6I0JjvipoNm1Ii+9Y1Z9ZfWYPcISOBMwwLG
 6UbaizY4eE580bO9soNP7QURrVYvo7KljMYtCq/O6VikaAiJ7Vmu7DjLtz0r96zB
 mA2jajeujjB0dNH5M9UyTzzyzHpw+5rgqnyTHYcyf7k+uzIaLSM=
 =Kh8d
 -----END PGP SIGNATURE-----

Merge tag 'v4.9.6' into android-4.9

This is the 4.9.6 stable release

Change-Id: I318df4b9d706d50c13fe3969d734117c25fc94bc
2017-01-31 13:55:27 -08:00
Daniel Rosenberg
bbcd0ffae5 ANDROID: vfs: Add permission2 for filesystems with per mount permissions
This allows filesystems to use their mount private data to
influence the permssions they return in permission2. It has
been separated into a new call to avoid disrupting current
permission users.

Change-Id: I9d416e3b8b6eca84ef3e336bd2af89ddd51df6ca
Signed-off-by: Daniel Rosenberg <drosen@google.com>
2017-01-31 10:47:22 -08:00
Daniel Rosenberg
973117c661 ANDROID: vfs: change d_canonical_path to take two paths
bug: 23904372
Change-Id: I4a686d64b6de37decf60019be1718e1d820193e6
Signed-off-by: Daniel Rosenberg <drosen@google.com>
2017-01-27 13:55:00 -08:00
Daniel Rosenberg
835d38b4b5 ANDROID: inotify: Fix erroneous update of bit count
Patch "vfs: add d_canonical_path for stacked filesystem support"
erroneously updated the ALL_INOTIFY_BITS count. This changes it back

Change-Id: Idb04edc736da276159d30f04c40cff9d6b1e070f
2017-01-27 13:54:54 -08:00
Daniel Rosenberg
dab6f5031d ANDROID: vfs: add d_canonical_path for stacked filesystem support
Inotify does not currently know when a filesystem
is acting as a wrapper around another fs. This means
that inotify watchers will miss any modifications to
the base file, as well as any made in a separate
stacked fs that points to the same file.
d_canonical_path solves this problem by allowing the fs
to map a dentry to a path in the lower fs. Inotify
can use it to find the appropriate place to watch to
be informed of all changes to a file.

Change-Id: I09563baffad1711a045e45c1bd0bd8713c2cc0b6
Signed-off-by: Daniel Rosenberg <drosen@google.com>
2017-01-27 13:54:51 -08:00
Jan Kara
d06485e0fc fsnotify: Fix possible use-after-free in inode iteration on umount
commit 5716863e0f8251d3360d4cbfc0e44e08007075df upstream.

fsnotify_unmount_inodes() plays complex tricks to pin next inode in the
sb->s_inodes list when iterating over all inodes. Furthermore the code has a
bug that if the current inode is the last on i_sb_list that does not have e.g.
I_FREEING set, then we leave next_i pointing to inode which may get removed
from the i_sb_list once we drop s_inode_list_lock thus resulting in
use-after-free issues (usually manifesting as infinite looping in
fsnotify_unmount_inodes()).

Fix the problem by keeping current inode pinned somewhat longer. Then we can
make the code much simpler and standard.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-09 08:32:22 +01:00
Jan Kara
ed2726406c fsnotify: clean up spinlock assertions
Use assert_spin_locked() macro instead of hand-made BUG_ON statements.

Link: http://lkml.kernel.org/r/1474537439-18919-1-git-send-email-jack@suse.cz
Signed-off-by: Jan Kara <jack@suse.cz>
Suggested-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07 18:46:26 -07:00
Jan Kara
0b1b86527d fanotify: fix possible false warning when freeing events
When freeing permission events by fsnotify_destroy_event(), the warning
WARN_ON(!list_empty(&event->list)); may falsely hit.

This is because although fanotify_get_response() saw event->response
set, there is nothing to make sure the current CPU also sees the removal
of the event from the list.  Add proper locking around the WARN_ON() to
avoid the false warning.

Link: http://lkml.kernel.org/r/1473797711-14111-7-git-send-email-jack@suse.cz
Reported-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Cc: Eric Paris <eparis@redhat.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07 18:46:26 -07:00
Jan Kara
073f65522a fanotify: use notification_lock instead of access_lock
Fanotify code has its own lock (access_lock) to protect a list of events
waiting for a response from userspace.

However this is somewhat awkward as the same list_head in the event is
protected by notification_lock if it is part of the notification queue
and by access_lock if it is part of the fanotify private queue which
makes it difficult for any reliable checks in the generic code.  So make
fanotify use the same lock - notification_lock - for protecting its
private event list.

Link: http://lkml.kernel.org/r/1473797711-14111-6-git-send-email-jack@suse.cz
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07 18:46:26 -07:00
Jan Kara
c21dbe20f6 fsnotify: convert notification_mutex to a spinlock
notification_mutex is used to protect the list of pending events.  As such
there's no reason to use a sleeping lock for it.  Convert it to a
spinlock.

[jack@suse.cz: fixed version]
  Link: http://lkml.kernel.org/r/1474031567-1831-1-git-send-email-jack@suse.cz
Link: http://lkml.kernel.org/r/1473797711-14111-5-git-send-email-jack@suse.cz
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07 18:46:26 -07:00
Jan Kara
1404ff3cc3 fsnotify: drop notification_mutex before destroying event
fsnotify_flush_notify() and fanotify_release() destroy notification
event while holding notification_mutex.

The destruction of fanotify event includes a path_put() call which may
end up calling into a filesystem to delete an inode if we happen to be
the last holders of dentry reference which happens to be the last holder
of inode reference.

That in turn may violate lock ordering for some filesystems since
notification_mutex is also acquired e. g. during write when generating
fanotify event.

Also this is the only thing that forces notification_mutex to be a
sleeping lock.  So drop notification_mutex before destroying a
notification event.

Link: http://lkml.kernel.org/r/1473797711-14111-4-git-send-email-jack@suse.cz
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Cc: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Cc: Eric Paris <eparis@redhat.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07 18:46:26 -07:00
Jan Kara
96d41019e3 fanotify: fix list corruption in fanotify_get_response()
fanotify_get_response() calls fsnotify_remove_event() when it finds that
group is being released from fanotify_release() (bypass_perm is set).

However the event it removes need not be only in the group's notification
queue but it can have already moved to access_list (userspace read the
event before closing the fanotify instance fd) which is protected by a
different lock.  Thus when fsnotify_remove_event() races with
fanotify_release() operating on access_list, the list can get corrupted.

Fix the problem by moving all the logic removing permission events from
the lists to one place - fanotify_release().

Fixes: 5838d4442b ("fanotify: fix double free of pending permission events")
Link: http://lkml.kernel.org/r/1473797711-14111-3-git-send-email-jack@suse.cz
Signed-off-by: Jan Kara <jack@suse.cz>
Reported-by: Miklos Szeredi <mszeredi@redhat.com>
Tested-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-19 15:36:17 -07:00
Jan Kara
12703dbfeb fsnotify: add a way to stop queueing events on group shutdown
Implement a function that can be called when a group is being shutdown
to stop queueing new events to the group.  Fanotify will use this.

Fixes: 5838d4442b ("fanotify: fix double free of pending permission events")
Link: http://lkml.kernel.org/r/1473797711-14111-2-git-send-email-jack@suse.cz
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-19 15:36:17 -07:00
Jan Kara
35e481761c fsnotify: avoid spurious EMFILE errors from inotify_init()
Inotify instance is destroyed when all references to it are dropped.
That not only means that the corresponding file descriptor needs to be
closed but also that all corresponding instance marks are freed (as each
mark holds a reference to the inotify instance).  However marks are
freed only after SRCU period ends which can take some time and thus if
user rapidly creates and frees inotify instances, number of existing
inotify instances can exceed max_user_instances limit although from user
point of view there is always at most one existing instance.  Thus
inotify_init() returns EMFILE error which is hard to justify from user
point of view.  This problem is exposed by LTP inotify06 testcase on
some machines.

We fix the problem by making sure all group marks are properly freed
while destroying inotify instance.  We wait for SRCU period to end in
that path anyway since we have to make sure there is no event being
added to the instance while we are tearing down the instance.  So it
takes only some plumbing to allow for marks to be destroyed in that path
as well and not from a dedicated work item.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Jan Kara <jack@suse.cz>
Reported-by: Xiaoguang Wang <wangxg.fnst@cn.fujitsu.com>
Tested-by: Xiaoguang Wang <wangxg.fnst@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-19 19:12:14 -07:00
Jeff Layton
0918f1c309 fsnotify: turn fsnotify reaper thread into a workqueue job
We don't require a dedicated thread for fsnotify cleanup.  Switch it
over to a workqueue job instead that runs on the system_unbound_wq.

In the interest of not thrashing the queued job too often when there are
a lot of marks being removed, we delay the reaper job slightly when
queueing it, to allow several to gather on the list.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Tested-by: Eryu Guan <guaneryu@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Eric Paris <eparis@parisplace.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-18 16:23:24 -08:00
Jeff Layton
13d34ac6e5 Revert "fsnotify: destroy marks with call_srcu instead of dedicated thread"
This reverts commit c510eff6be ("fsnotify: destroy marks with
call_srcu instead of dedicated thread").

Eryu reported that he was seeing some OOM kills kick in when running a
testcase that adds and removes inotify marks on a file in a tight loop.

The above commit changed the code to use call_srcu to clean up the
marks.  While that does (in principle) work, the srcu callback job is
limited to cleaning up entries in small batches and only once per jiffy.
It's easily possible to overwhelm that machinery with too many call_srcu
callbacks, and Eryu's reproduer did just that.

There's also another potential problem with using call_srcu here.  While
you can obviously sleep while holding the srcu_read_lock, the callbacks
run under local_bh_disable, so you can't sleep there.

It's possible when putting the last reference to the fsnotify_mark that
we'll end up putting a chain of references including the fsnotify_group,
uid, and associated keys.  While I don't see any obvious ways that that
could occurs, it's probably still best to avoid using call_srcu here
after all.

This patch reverts the above patch.  A later patch will take a different
approach to eliminated the dedicated thread here.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Reported-by: Eryu Guan <guaneryu@gmail.com>
Tested-by: Eryu Guan <guaneryu@gmail.com>
Cc: Jan Kara <jack@suse.com>
Cc: Eric Paris <eparis@parisplace.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-18 16:23:24 -08:00
Jeff Layton
c510eff6be fsnotify: destroy marks with call_srcu instead of dedicated thread
At the time that this code was originally written, call_srcu didn't
exist, so this thread was required to ensure that we waited for that
SRCU grace period to settle before finally freeing the object.

It does exist now however and we can much more efficiently use call_srcu
to handle this.  That also allows us to potentially use srcu_barrier to
ensure that they are all of the callbacks have run before proceeding.
In order to conserve space, we union the rcu_head with the g_list.

This will be necessary for nfsd which will allocate marks from a
dedicated slabcache.  We have to be able to ensure that all of the
objects are destroyed before destroying the cache.  That's fairly

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Cc: Eric Paris <eparis@parisplace.org>
Reviewed-by: Jan Kara <jack@suse.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-14 16:00:49 -08:00
Geliang Tang
1deaf9d197 fs/notify/inode_mark.c: use list_next_entry in fsnotify_unmount_inodes
To make the intention clearer, use list_next_entry instead of
list_entry.

Signed-off-by: Geliang Tang <geliangtang@163.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-14 16:00:49 -08:00
Dave Hansen
d30e2c05a1 inotify: actually check for invalid bits in sys_inotify_add_watch()
The comment here says that it is checking for invalid bits.  But, the mask
is *actually* checking to ensure that _any_ valid bit is set, which is
quite different.

Without this check, an unexpected bit could get set on an inotify object.
Since these bits are also interpreted by the fsnotify/dnotify code, there
is the potential for an object to be mishandled inside the kernel.  For
instance, can we be sure that setting the dnotify flag FS_DN_RENAME on an
inotify watch is harmless?

Add the actual check which was intended.  Retain the existing inotify bits
are being added to the watch.  Plus, this is existing behavior which would
be nice to preserve.

I did a quick sniff test that inotify functions and that my
'inotify-tools' package passes 'make check'.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: John McCutchan <john@johnmccutchan.com>
Cc: Robert Love <rlove@rlove.org>
Cc: Eric Paris <eparis@parisplace.org>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-05 19:34:48 -08:00
Dave Hansen
6933599697 inotify: hide internal kernel bits from fdinfo
There was a report that my patch:

    inotify: actually check for invalid bits in sys_inotify_add_watch()

broke CRIU.

The reason is that CRIU looks up raw flags in /proc/$pid/fdinfo/* to
figure out how to rebuild inotify watches and then passes those flags
directly back in to the inotify API.  One of those flags
(FS_EVENT_ON_CHILD) is set in mark->mask, but is not part of the inotify
API.  It is used inside the kernel to _implement_ inotify but it is not
and has never been part of the API.

My patch above ensured that we only allow bits which are part of the API
(IN_ALL_EVENTS).  This broke CRIU.

FS_EVENT_ON_CHILD is really internal to the kernel.  It is set _anyway_ on
all inotify marks.  So, CRIU was really just trying to set a bit that was
already set.

This patch hides that bit from fdinfo.  CRIU will not see the bit, not try
to set it, and should work as before.  We should not have been exposing
this bit in the first place, so this is a good patch independent of the
CRIU problem.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reported-by: Andrey Wagin <avagin@gmail.com>
Acked-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Eric Paris <eparis@redhat.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: John McCutchan <john@johnmccutchan.com>
Cc: Robert Love <rlove@rlove.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-05 19:34:48 -08:00
Linus Torvalds
7d9071a095 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs updates from Al Viro:
 "In this one:

   - d_move fixes (Eric Biederman)

   - UFS fixes (me; locking is mostly sane now, a bunch of bugs in error
     handling ought to be fixed)

   - switch of sb_writers to percpu rwsem (Oleg Nesterov)

   - superblock scalability (Josef Bacik and Dave Chinner)

   - swapon(2) race fix (Hugh Dickins)"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (65 commits)
  vfs: Test for and handle paths that are unreachable from their mnt_root
  dcache: Reduce the scope of i_lock in d_splice_alias
  dcache: Handle escaped paths in prepend_path
  mm: fix potential data race in SyS_swapon
  inode: don't softlockup when evicting inodes
  inode: rename i_wb_list to i_io_list
  sync: serialise per-superblock sync operations
  inode: convert inode_sb_list_lock to per-sb
  inode: add hlist_fake to avoid the inode hash lock in evict
  writeback: plug writeback at a high level
  change sb_writers to use percpu_rw_semaphore
  shift percpu_counter_destroy() into destroy_super_work()
  percpu-rwsem: kill CONFIG_PERCPU_RWSEM
  percpu-rwsem: introduce percpu_rwsem_release() and percpu_rwsem_acquire()
  percpu-rwsem: introduce percpu_down_read_trylock()
  document rwsem_release() in sb_wait_write()
  fix the broken lockdep logic in __sb_start_write()
  introduce __sb_writers_{acquired,release}() helpers
  ufs_inode_get{frag,block}(): get rid of 'phys' argument
  ufs_getfrag_block(): tidy up a bit
  ...
2015-09-05 20:34:28 -07:00
Jan Kara
4712e722f9 fsnotify: get rid of fsnotify_destroy_mark_locked()
fsnotify_destroy_mark_locked() is subtle to use because it temporarily
releases group->mark_mutex.  To avoid future problems with this
function, split it into two.

fsnotify_detach_mark() is the part that needs group->mark_mutex and
fsnotify_free_mark() is the part that must be called outside of
group->mark_mutex.  This way it's much clearer what's going on and we
also avoid some pointless acquisitions of group->mark_mutex.

Signed-off-by: Jan Kara <jack@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Jan Kara
925d1132a0 fsnotify: remove mark->free_list
Free list is used when all marks on given inode / mount should be
destroyed when inode / mount is going away.  However we can free all of
the marks without using a special list with some care.

Signed-off-by: Jan Kara <jack@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Jan Kara
3c53e51421 fsnotify: fix check in inotify fdinfo printing
A check in inotify_fdinfo() checking whether mark is valid was always
true due to a bug.  Luckily we can never get to invalidated marks since
we hold mark_mutex and invalidated marks get removed from the group list
when they are invalidated under that mutex.

Anyway fix the check to make code more future proof.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Dave Hansen
7c49b86164 fs/notify: optimize inotify/fsnotify code for unwatched files
I have a _tiny_ microbenchmark that sits in a loop and writes single
bytes to a file.  Writing one byte to a tmpfs file is around 2x slower
than reading one byte from a file, which is a _bit_ more than I expecte.
This is a dumb benchmark, but I think it's hard to deny that write() is
a hot path and we should avoid unnecessary overhead there.

I did a 'perf record' of 30-second samples of read and write.  The top
item in a diffprofile is srcu_read_lock() from fsnotify().  There are
active inotify fd's from systemd, but nothing is actually listening to
the file or its part of the filesystem.

I *think* we can avoid taking the srcu_read_lock() for the common case
where there are no actual marks on the file.  This means that there will
both be nothing to notify for *and* implies that there is no need for
clearing the ignore mask.

This patch gave a 13.1% speedup in writes/second on my test, which is an
improvement from the 10.8% that I saw with the last version.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Jan Kara <jack@suse.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric Paris <eparis@redhat.com>
Cc: John McCutchan <john@johnmccutchan.com>
Cc: Robert Love <rlove@rlove.org>
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Dave Chinner
74278da9f7 inode: convert inode_sb_list_lock to per-sb
The process of reducing contention on per-superblock inode lists
starts with moving the locking to match the per-superblock inode
list. This takes the global lock out of the picture and reduces the
contention problems to within a single filesystem. This doesn't get
rid of contention as the locks still have global CPU scope, but it
does isolate operations on different superblocks form each other.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Dave Chinner <dchinner@redhat.com>
2015-08-17 18:39:46 -04:00
Jan Kara
8f2f3eb59d fsnotify: fix oops in fsnotify_clear_marks_by_group_flags()
fsnotify_clear_marks_by_group_flags() can race with
fsnotify_destroy_marks() so that when fsnotify_destroy_mark_locked()
drops mark_mutex, a mark from the list iterated by
fsnotify_clear_marks_by_group_flags() can be freed and thus the next
entry pointer we have cached may become stale and we dereference free
memory.

Fix the problem by first moving marks to free to a special private list
and then always free the first entry in the special list.  This method
is safe even when entries from the list can disappear once we drop the
lock.

Signed-off-by: Jan Kara <jack@suse.com>
Reported-by: Ashish Sangwan <a.sangwan@samsung.com>
Reviewed-by: Ashish Sangwan <a.sangwan@samsung.com>
Cc: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-08-07 04:39:41 +03:00
Linus Torvalds
d725e66c06 Revert "fsnotify: fix oops in fsnotify_clear_marks_by_group_flags()"
This reverts commit a2673b6e04.

Kinglong Mee reports a memory leak with that patch, and Jan Kara confirms:

 "Thanks for report! You are right that my patch introduces a race
  between fsnotify kthread and fsnotify_destroy_group() which can result
  in leaking inotify event on group destruction.

  I haven't yet decided whether the right fix is not to queue events for
  dying notification group (as that is pointless anyway) or whether we
  should just fix the original problem differently...  Whenever I look
  at fsnotify code mark handling I get lost in the maze of locks, lists,
  and subtle differences between how different notification systems
  handle notification marks :( I'll think about it over night"

and after thinking about it, Jan says:

 "OK, I have looked into the code some more and I found another
  relatively simple way of fixing the original oops.  It will be IMHO
  better than trying to fixup this issue which has more potential for
  breakage.  I'll ask Linus to revert the fsnotify fix he already merged
  and send a new fix"

Reported-by: Kinglong Mee <kinglongmee@gmail.com>
Requested-by: Jan Kara <jack@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-07-21 16:06:53 -07:00
Jan Kara
a2673b6e04 fsnotify: fix oops in fsnotify_clear_marks_by_group_flags()
fsnotify_clear_marks_by_group_flags() can race with
fsnotify_destroy_marks() so when fsnotify_destroy_mark_locked() drops
mark_mutex, a mark from the list iterated by
fsnotify_clear_marks_by_group_flags() can be freed and we dereference free
memory in the loop there.

Fix the problem by keeping mark_mutex held in
fsnotify_destroy_mark_locked().  The reason why we drop that mutex is that
we need to call a ->freeing_mark() callback which may acquire mark_mutex
again.  To avoid this and similar lock inversion issues, we move the call
to ->freeing_mark() callback to the kthread destroying the mark.

Signed-off-by: Jan Kara <jack@suse.cz>
Reported-by: Ashish Sangwan <a.sangwan@samsung.com>
Suggested-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-07-17 16:39:54 -07:00
Paul Gortmaker
c013d5a458 fs/notify: don't use module_init for non-modular inotify_user code
The INOTIFY_USER option is bool, and hence this code is either
present or absent.  It will never be modular, so using
module_init as an alias for __initcall is rather misleading.

Fix this up now, so that we can relocate module_init from
init.h into module.h in the future.  If we don't do this, we'd
have to add module.h to obviously non-modular code, and that
would be a worse thing.

Note that direct use of __initcall is discouraged, vs. one
of the priority categorized subgroups.  As __initcall gets
mapped onto device_initcall, our use of fs_initcall (which
makes sense for fs code) will thus change this registration
from level 6-device to level 5-fs (i.e. slightly earlier).
However no observable impact of that small difference has
been observed during testing, or is expected.

Cc: John McCutchan <john@johnmccutchan.com>
Cc: Robert Love <rlove@rlove.org>
Cc: Eric Paris <eparis@parisplace.org>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2015-06-16 14:12:34 -04:00
Suzuki K. Poulose
b3c1030d50 fanotify: fix event filtering with FAN_ONDIR set
With FAN_ONDIR set, the user can end up getting events, which it hasn't
marked.  This was revealed with fanotify04 testcase failure on
Linux-4.0-rc1, and is a regression from 3.19, revealed with 66ba93c0d7
("fanotify: don't set FAN_ONDIR implicitly on a marks ignored mask").

   # /opt/ltp/testcases/bin/fanotify04
   [ ... ]
  fanotify04    7  TPASS  :  event generated properly for type 100000
  fanotify04    8  TFAIL  :  fanotify04.c:147: got unexpected event 30
  fanotify04    9  TPASS  :  No event as expected

The testcase sets the adds the following marks : FAN_OPEN | FAN_ONDIR for
a fanotify on a dir.  Then does an open(), followed by close() of the
directory and expects to see an event FAN_OPEN(0x20).  However, the
fanotify returns (FAN_OPEN|FAN_CLOSE_NOWRITE(0x10)).  This happens due to
the flaw in the check for event_mask in fanotify_should_send_event() which
does:

	if (event_mask & marks_mask & ~marks_ignored_mask)
		return true;

where, event_mask == (FAN_ONDIR | FAN_CLOSE_NOWRITE),
       marks_mask == (FAN_ONDIR | FAN_OPEN),
       marks_ignored_mask == 0

Fix this by masking the outgoing events to the user, as we already take
care of FAN_ONDIR and FAN_EVENT_ON_CHILD.

Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Tested-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Eric Paris <eparis@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-03-12 18:46:08 -07:00
David Howells
54f2a2f427 fanotify: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions
Fanotify probably doesn't want to watch autodirs so make it use d_can_lookup()
rather than d_is_dir() when checking a dir watch and give an error on fake
directories.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-22 11:38:42 -05:00
David Howells
e36cb0b89c VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to d_is_*(dentry)
Convert the following where appropriate:

 (1) S_ISLNK(dentry->d_inode) to d_is_symlink(dentry).

 (2) S_ISREG(dentry->d_inode) to d_is_reg(dentry).

 (3) S_ISDIR(dentry->d_inode) to d_is_dir(dentry).  This is actually more
     complicated than it appears as some calls should be converted to
     d_can_lookup() instead.  The difference is whether the directory in
     question is a real dir with a ->lookup op or whether it's a fake dir with
     a ->d_automount op.

In some circumstances, we can subsume checks for dentry->d_inode not being
NULL into this, provided we the code isn't in a filesystem that expects
d_inode to be NULL if the dirent really *is* negative (ie. if we're going to
use d_inode() rather than d_backing_inode() to get the inode pointer).

Note that the dentry type field may be set to something other than
DCACHE_MISS_TYPE when d_inode is NULL in the case of unionmount, where the VFS
manages the fall-through from a negative dentry to a lower layer.  In such a
case, the dentry type of the negative union dentry is set to the same as the
type of the lower dentry.

However, if you know d_inode is not NULL at the call site, then you can use
the d_is_xxx() functions even in a filesystem.

There is one further complication: a 0,0 chardev dentry may be labelled
DCACHE_WHITEOUT_TYPE rather than DCACHE_SPECIAL_TYPE.  Strictly, this was
intended for special directory entry types that don't have attached inodes.

The following perl+coccinelle script was used:

use strict;

my @callers;
open($fd, 'git grep -l \'S_IS[A-Z].*->d_inode\' |') ||
    die "Can't grep for S_ISDIR and co. callers";
@callers = <$fd>;
close($fd);
unless (@callers) {
    print "No matches\n";
    exit(0);
}

my @cocci = (
    '@@',
    'expression E;',
    '@@',
    '',
    '- S_ISLNK(E->d_inode->i_mode)',
    '+ d_is_symlink(E)',
    '',
    '@@',
    'expression E;',
    '@@',
    '',
    '- S_ISDIR(E->d_inode->i_mode)',
    '+ d_is_dir(E)',
    '',
    '@@',
    'expression E;',
    '@@',
    '',
    '- S_ISREG(E->d_inode->i_mode)',
    '+ d_is_reg(E)' );

my $coccifile = "tmp.sp.cocci";
open($fd, ">$coccifile") || die $coccifile;
print($fd "$_\n") || die $coccifile foreach (@cocci);
close($fd);

foreach my $file (@callers) {
    chomp $file;
    print "Processing ", $file, "\n";
    system("spatch", "--sp-file", $coccifile, $file, "--in-place", "--no-show-diff") == 0 ||
	die "spatch failed";
}

[AV: overlayfs parts skipped]

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-22 11:38:41 -05:00
Lino Sanfilippo
66ba93c0d7 fanotify: don't set FAN_ONDIR implicitly on a marks ignored mask
Currently FAN_ONDIR is always set on a mark's ignored mask when the
event mask is extended without FAN_MARK_ONDIR being set.  This may
result in events for directories being ignored unexpectedly for call
sequences like

  fanotify_mark(fd, FAN_MARK_ADD, FAN_OPEN | FAN_ONDIR , AT_FDCWD, "dir");
  fanotify_mark(fd, FAN_MARK_ADD, FAN_CLOSE, AT_FDCWD, "dir");

Also FAN_MARK_ONDIR is only honored when adding events to a mark's mask,
but not for event removal.  Fix both issues by not setting FAN_ONDIR
implicitly on the ignore mask any more.  Instead treat FAN_ONDIR as any
other event flag and require FAN_MARK_ONDIR to be set by the user for
both event mask and ignore mask.  Furthermore take FAN_MARK_ONDIR into
account when set for event removal.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-10 14:30:28 -08:00
Lino Sanfilippo
d2c1874ce6 fanotify: don't recalculate a marks mask if only the ignored mask changed
If removing bits from a mark's ignored mask, the concerning
inodes/vfsmounts mask is not affected.  So don't recalculate it.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-10 14:30:28 -08:00
Lino Sanfilippo
a118449a77 fanotify: only destroy mark when both mask and ignored_mask are cleared
In fanotify_mark_remove_from_mask() a mark is destroyed if only one of
both bitmasks (mask or ignored_mask) of a mark is cleared.  However the
other mask may still be set and contain information that should not be
lost.  So only destroy a mark if both masks are cleared.

Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-10 14:30:28 -08:00
Ingo Molnar
f49028292c Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu
Pull RCU updates from Paul E. McKenney:

  - Documentation updates.

  - Miscellaneous fixes.

  - Preemptible-RCU fixes, including fixing an old bug in the
    interaction of RCU priority boosting and CPU hotplug.

  - SRCU updates.

  - RCU CPU stall-warning updates.

  - RCU torture-test updates.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-01-21 06:12:21 +01:00
Peter Zijlstra
536ebe9ca9 sched, fanotify: Deal with nested sleeps
As per e23738a730 ("sched, inotify: Deal with nested sleeps").

fanotify_read is a wait loop with sleeps in. Wait loops rely on
task_struct::state and sleeps do too, since that's the only means of
actually sleeping. Therefore the nested sleeps destroy the wait loop
state and the wait loop breaks the sleep functions that assume
TASK_RUNNING (mutex_lock).

Fix this by using the new woken_wake_function and wait_woken() stuff,
which registers wakeups in wait and thereby allows shrinking the
task_state::state changes to the actual sleep part.

Reported-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric Paris <eparis@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Eric Paris <eparis@redhat.com>
Link: http://lkml.kernel.org/r/20141216152838.GZ3337@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-01-09 11:18:12 +01:00
Pranith Kumar
83fe27ea53 rcu: Make SRCU optional by using CONFIG_SRCU
SRCU is not necessary to be compiled by default in all cases. For tinification
efforts not compiling SRCU unless necessary is desirable.

The current patch tries to make compiling SRCU optional by introducing a new
Kconfig option CONFIG_SRCU which is selected when any of the components making
use of SRCU are selected.

If we do not select CONFIG_SRCU, srcu.o will not be compiled at all.

   text    data     bss     dec     hex filename
   2007       0       0    2007     7d7 kernel/rcu/srcu.o

Size of arch/powerpc/boot/zImage changes from

   text    data     bss     dec     hex filename
 831552   64180   23944  919676   e087c arch/powerpc/boot/zImage : before
 829504   64180   23952  917636   e0084 arch/powerpc/boot/zImage : after

so the savings are about ~2000 bytes.

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
CC: Josh Triplett <josh@joshtriplett.org>
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: resolve conflict due to removal of arch/ia64/kvm/Kconfig. ]
2015-01-06 11:04:29 -08:00
Jan Kara
37d469e767 fsnotify: remove destroy_list from fsnotify_mark
destroy_list is used to track marks which still need waiting for srcu
period end before they can be freed.  However by the time mark is added to
destroy_list it isn't in group's list of marks anymore and thus we can
reuse fsnotify_mark->g_list for queueing into destroy_list.  This saves
two pointers for each fsnotify_mark.

Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Eric Paris <eparis@redhat.com>
Cc: Heinrich Schuchardt <xypron.glpk@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-13 12:42:53 -08:00