Commit graph

365 commits

Author SHA1 Message Date
Tom Herbert
850e18a66f
proto_ops: Add locked held versions of sendmsg and sendpage
Add new proto_ops sendmsg_locked and sendpage_locked that can be
called when the socket lock is already held. Correspondingly, add
kernel_sendmsg_locked and kernel_sendpage_locked as front end
functions.

These functions will be used in zero proxy so that we can take
the socket lock in a ULP sendmsg/sendpage and then directly call the
backend transport proto_ops functions.
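
The locking pattern described above can be sketched in userspace C. This is
only an illustration, not the kernel code: struct toy_sock, toy_sendmsg(),
toy_sendmsg_locked() and toy_ulp_sendmsg() are hypothetical names, and a
pthread mutex stands in for the socket lock. The regular entry point takes
the lock itself, while an upper layer takes the lock once and calls the
_locked variant directly for several sends.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a socket: a lock plus a byte counter. */
struct toy_sock {
    pthread_mutex_t lock;
    int lock_held;      /* set while the lock is held, checked by the assert */
    size_t bytes_sent;
};

/* "Locked" variant: the caller must already hold ts->lock. */
static size_t toy_sendmsg_locked(struct toy_sock *ts, const void *buf, size_t len)
{
    (void)buf;                  /* payload is ignored in this sketch */
    assert(ts->lock_held);
    ts->bytes_sent += len;
    return len;
}

/* Regular variant: takes the lock, then defers to the locked variant. */
static size_t toy_sendmsg(struct toy_sock *ts, const void *buf, size_t len)
{
    size_t ret;

    pthread_mutex_lock(&ts->lock);
    ts->lock_held = 1;
    ret = toy_sendmsg_locked(ts, buf, len);
    ts->lock_held = 0;
    pthread_mutex_unlock(&ts->lock);
    return ret;
}

/* Upper layer: one lock acquisition covers both the header and the payload. */
static size_t toy_ulp_sendmsg(struct toy_sock *ts,
                              const void *hdr, size_t hdr_len,
                              const void *payload, size_t payload_len)
{
    size_t ret;

    pthread_mutex_lock(&ts->lock);
    ts->lock_held = 1;
    ret = toy_sendmsg_locked(ts, hdr, hdr_len);
    ret += toy_sendmsg_locked(ts, payload, payload_len);
    ts->lock_held = 0;
    pthread_mutex_unlock(&ts->lock);
    return ret;
}
```

Calling toy_sendmsg() from inside toy_ulp_sendmsg() would deadlock on a
non-recursive mutex, which is the situation the _locked variants are meant
to avoid.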

Change-Id: I4a8a6f5234486946ec2870ae22fa8ea561df3af0
Signed-off-by: Tom Herbert <tom@quantonium.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-09-25 16:54:42 +03:00
Greg Kroah-Hartman
5af9b7f917 This is the 4.9.211 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl4pSQ8ACgkQONu9yGCS
 aT7byBAA1Sx2dkXj107MHQ/XQUQjn9LeDoUtdB105780XXRJqS1L6Bsm5pvsiOvQ
 GvkKpCHCWg12iuEKJCQ1pr88XKQrik68vRUOCgMt4rh0TovM8eUz9fuvFGO+u330
 tCyW9zftkKNMYJmPSn2w6hOZcDK4wxgVjP6hkFgQJjyjFy/dbkcwFb6Vg9cfMRKc
 kkuacR9Hm7hG4V2RWD1pNsI4Nlly/oPXEmLJMplVGY+YyOAB5ne14JCevVX22bV2
 WD9EUihPCsyB41LF2FhX5jzWhyFKgt/9tyrl7VFjsEqmvvdS7S9YMD3RJ2alzQbo
 qT4Wph+xVT3JIdXuFZuAaHUfFKwnWR/6cHcMiejsv/A6B72aRaECSMSN8aCJSYit
 eV3L/LNoLaKcpdpJLKVAWSny1ZaLnYTxk0E3OilQz+ZzqRk/LDjnxQry5sem6oXt
 3HJlo2cuvd2bQ0Jd5RDnGW6N8CLx4HIMwnnxEjJmOqpUog6zSnhSbsvzpkQ2IZVs
 3pFj1eYMausbEfdLXrFuky0cLvswjcYKT6W+CcapGba6IaHDhSg5V2WkPOktwxMW
 BYnnzJptWXbbCt6de1ZwOyVpKdmmf/9hDG4egPVaCAs7/AOzE7P5+zIheiN9KqRw
 3Fz+KNFB85oztp5Ds4gnd9xYa11uEzTSm+vVCKGVfymPEKvCQfU=
 =84H5
 -----END PGP SIGNATURE-----

Merge 4.9.211 into android-4.9-q

Changes in 4.9.211
	hidraw: Return EPOLLOUT from hidraw_poll
	HID: hidraw: Fix returning EPOLLOUT from hidraw_poll
	HID: hidraw, uhid: Always report EPOLLOUT
	ethtool: reduce stack usage with clang
	fs/select: avoid clang stack usage warning
	rsi: add fix for crash during assertions
	arm64: mm: BUG on unsupported manipulations of live kernel mappings
	arm64: don't open code page table entry creation
	arm64: mm: Change page table pointer name in p[md]_set_huge()
	arm64: Enforce BBM for huge IO/VMAP mappings
	arm64: Make sure permission updates happen for pmd/pud
	cfg80211/mac80211: make ieee80211_send_layer2_update a public function
	mac80211: Do not send Layer 2 Update frame before authorization
	media: usb:zr364xx:Fix KASAN:null-ptr-deref Read in zr364xx_vidioc_querycap
	wimax: i2400: fix memory leak
	wimax: i2400: Fix memory leak in i2400m_op_rfkill_sw_toggle
	ext4: fix use-after-free race with debug_want_extra_isize
	ext4: add more paranoia checking in ext4_expand_extra_isize handling
	dccp: Fix memleak in __feat_register_sp
	rtc: mt6397: fix alarm register overwrite
	iommu: Remove device link to group on failure
	gpio: Fix error message on out-of-range GPIO in lookup table
	hsr: reset network header when supervision frame is created
	cifs: Adjust indentation in smb2_open_file
	RDMA/srpt: Report the SCSI residual to the initiator
	scsi: enclosure: Fix stale device oops with hot replug
	scsi: sd: Clear sdkp->protection_type if disk is reformatted without PI
	platform/x86: asus-wmi: Fix keyboard brightness cannot be set to 0
	iio: imu: adis16480: assign bias value only if operation succeeded
	mei: fix modalias documentation
	clk: samsung: exynos5420: Preserve CPU clocks configuration during suspend/resume
	compat_ioctl: handle SIOCOUTQNSD
	PCI/PTM: Remove spurious "d" from granularity message
	powerpc/powernv: Disable native PCIe port management
	tty: serial: imx: use the sg count from dma_map_sg
	tty: serial: pch_uart: correct usage of dma_unmap_sg
	media: exynos4-is: Fix recursive locking in isp_video_release()
	mtd: spi-nor: fix silent truncation in spi_nor_read()
	spi: atmel: fix handling of cs_change set on non-last xfer
	rtlwifi: Remove unnecessary NULL check in rtl_regd_init
	f2fs: fix potential overflow
	rtc: msm6242: Fix reading of 10-hour digit
	gpio: mpc8xxx: Add platform device to gpiochip->parent
	scsi: libcxgbi: fix NULL pointer dereference in cxgbi_device_destroy()
	rseq/selftests: Turn off timeout setting
	MIPS: Prevent link failure with kcov instrumentation
	ioat: ioat_alloc_ring() failure handling.
	hexagon: parenthesize registers in asm predicates
	hexagon: work around compiler crash
	ocfs2: call journal flush to mark journal as empty after journal recovery when mount
	dt-bindings: reset: meson8b: fix duplicate reset IDs
	clk: Don't try to enable critical clocks if prepare failed
	ALSA: seq: Fix racy access for queue timer in proc read
	Fix built-in early-load Intel microcode alignment
	block: fix an integer overflow in logical block size
	iio: buffer: align the size of scan bytes to size of the largest element
	USB: serial: simple: Add Motorola Solutions TETRA MTP3xxx and MTP85xx
	USB: serial: opticon: fix control-message timeouts
	USB: serial: suppress driver bind attributes
	USB: serial: ch341: handle unbound port at reset_resume
	USB: serial: io_edgeport: add missing active-port sanity check
	USB: serial: quatech2: handle unbound ports
	scsi: mptfusion: Fix double fetch bug in ioctl
	usb: core: hub: Improved device recognition on remote wakeup
	x86/efistub: Disable paging at mixed mode entry
	perf hists: Fix variable name's inconsistency in hists__for_each() macro
	perf report: Fix incorrectly added dimensions as switch perf data file
	mm/page-writeback.c: avoid potential division by zero in wb_min_max_ratio()
	net: stmmac: 16KB buffer must be 16 byte aligned
	net: stmmac: Enable 16KB buffer size
	USB: serial: io_edgeport: use irqsave() in USB's complete callback
	USB: serial: io_edgeport: handle unbound ports on URB completion
	USB: serial: keyspan: handle unbound ports
	scsi: fnic: use kernel's '%pM' format option to print MAC
	scsi: fnic: fix invalid stack access
	arm64: dts: agilex/stratix10: fix pmu interrupt numbers
	cfg80211: fix page refcount issue in A-MSDU decap
	netfilter: fix a use-after-free in mtype_destroy()
	netfilter: arp_tables: init netns pointer in xt_tgdtor_param struct
	batman-adv: Fix DAT candidate selection on little endian systems
	macvlan: use skb_reset_mac_header() in macvlan_queue_xmit()
	net: dsa: tag_qca: fix doubled Tx statistics
	net/wan/fsl_ucc_hdlc: fix out of bounds write on array utdm_info
	r8152: add missing endpoint sanity check
	tcp: fix marked lost packets not being retransmitted
	net: usb: lan78xx: limit size of local TSO packets
	xen/blkfront: Adjust indentation in xlvbd_alloc_gendisk
	cw1200: Fix a signedness bug in cw1200_load_firmware()
	cfg80211: check for set_wiphy_params
	reiserfs: fix handling of -EOPNOTSUPP in reiserfs_for_each_xattr
	scsi: esas2r: unlock on error in esas2r_nvram_read_direct()
	scsi: qla4xxx: fix double free bug
	scsi: bnx2i: fix potential use after free
	scsi: target: core: Fix a pr_debug() argument
	scsi: core: scsi_trace: Use get_unaligned_be*()
	perf probe: Fix wrong address verification
	regulator: ab8500: Remove SYSCLKREQ from enum ab8505_regulator_id
	Linux 4.9.211

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ifc27b3c6afdbd6a39bd7ae4e551d8bed42dc4973
2020-01-23 09:20:25 +01:00
Arnd Bergmann
fc49aa70b2 compat_ioctl: handle SIOCOUTQNSD
commit 9d7bf41fafa5b5ddd4c13eb39446b0045f0a8167 upstream.

Unlike the normal SIOCOUTQ, SIOCOUTQNSD was never handled in compat
mode. Add it to the common socket compat handler along with similar
ones.

Fixes: 2f4e1b3970 ("tcp: ioctl type SIOCOUTQNSD returns amount of data not sent")
Cc: Eric Dumazet <edumazet@google.com>
Cc: netdev@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-23 08:19:36 +01:00
Greg Kroah-Hartman
9595aa8719 This is the 4.9.190 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl1iTCkACgkQONu9yGCS
 aT7n5BAAp5eiY3ZNOOsthb4kCnWoJopEbAx4bgKfrHiWYizcPmA8MBjbwGnR0Brh
 WQnLWROg/00y7M+DgUhuPk3sLEFsoTDJlEmsp5e0UDh8ZiO0qt1S82LRE9vLUAaK
 QW92EbNO0NDOSZbTeLb7P/TVMmBBlUkm1UILfjMEHKMSn+syPJxUpbmjynmTBLE+
 2fJ6metOCJrEoiRM0mVeWlewXsy+XF2VYx5sCV2t6fx6GgofWPW3HZkQtJDaz2/R
 rj5G5f2A6HVGpwPoSvvXKc+q6cBD3g2efQtQbWu8j+VDbWsw7d5oiGGLs44xiTjo
 jC0si9+577y5c3DHo1AvryYD3CXkjpwsV6Y4nbt7j8Rd6LmNoVF3+8ghVuVCzpDE
 DSz4MDW7q4aw3o91QIwgZsljpheLnSJNv54ZFlz63ToGESMHNxuRsoTkchrf7+3Y
 htF9KZsT+xkbzc+KlhqtQ6ozRwzPY9/zUdItugnYkt2WQWRFfJdmVbupRv9C0Xv+
 0PVdj1YYc1NbbD4/R5SRe7aaj+InDMYPGW8LXjyxR9eLZlMj0ReN5dFi/4mvZ/Yu
 QegxU9TRLpSvl3s1nH8hwi+85rM0Jw+B/2swqRRRmU52A3b2fW4qaDV+q2LuD0VS
 vNRvZ1bBLx2rrYeOw40W9KQKjivj8mGTs1A9aQJZJumHuiSD6Z0=
 =zCyG
 -----END PGP SIGNATURE-----

Merge 4.9.190 into android-4.9-q

Changes in 4.9.190
	usb: usbfs: fix double-free of usb memory upon submiturb error
	usb: iowarrior: fix deadlock on disconnect
	sound: fix a memory leak bug
	x86/mm: Check for pfn instead of page in vmalloc_sync_one()
	x86/mm: Sync also unmappings in vmalloc_sync_all()
	mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()
	perf record: Fix wrong size in perf_record_mmap for last kernel module
	perf db-export: Fix thread__exec_comm()
	perf record: Fix module size on s390
	usb: yurex: Fix use-after-free in yurex_delete
	can: peak_usb: fix potential double kfree_skb()
	netfilter: nfnetlink: avoid deadlock due to synchronous request_module
	iscsi_ibft: make ISCSI_IBFT dependson ACPI instead of ISCSI_IBFT_FIND
	mac80211: don't warn about CW params when not using them
	hwmon: (nct6775) Fix register address and added missed tolerance for nct6106
	cpufreq/pasemi: fix use-after-free in pas_cpufreq_cpu_init()
	s390/qdio: add sanity checks to the fast-requeue path
	ALSA: compress: Fix regression on compressed capture streams
	ALSA: compress: Prevent bypasses of set_params
	ALSA: compress: Don't allow paritial drain operations on capture streams
	ALSA: compress: Be more restrictive about when a drain is allowed
	perf probe: Avoid calling freeing routine multiple times for same pointer
	drbd: dynamically allocate shash descriptor
	ACPI/IORT: Fix off-by-one check in iort_dev_find_its_id()
	ARM: davinci: fix sleep.S build error on ARMv4
	scsi: megaraid_sas: fix panic on loading firmware crashdump
	scsi: ibmvfc: fix WARN_ON during event pool release
	scsi: scsi_dh_alua: always use a 2 second delay before retrying RTPG
	tty/ldsem, locking/rwsem: Add missing ACQUIRE to read_failed sleep loop
	perf/core: Fix creating kernel counters for PMUs that override event->cpu
	can: peak_usb: pcan_usb_pro: Fix info-leaks to USB devices
	can: peak_usb: pcan_usb_fd: Fix info-leaks to USB devices
	hwmon: (nct7802) Fix wrong detection of in4 presence
	ALSA: firewire: fix a memory leak bug
	ALSA: hda - Don't override global PCM hw info flag
	mac80211: don't WARN on short WMM parameters from AP
	SMB3: Fix deadlock in validate negotiate hits reconnect
	smb3: send CAP_DFS capability during session setup
	mwifiex: fix 802.11n/WPA detection
	iwlwifi: don't unmap as page memory that was mapped as single
	scsi: mpt3sas: Use 63-bit DMA addressing on SAS35 HBA
	sh: kernel: hw_breakpoint: Fix missing break in switch statement
	mm/usercopy: use memory range to be accessed for wraparound check
	mm/memcontrol.c: fix use after free in mem_cgroup_iter()
	bpf: get rid of pure_initcall dependency to enable jits
	bpf: restrict access to core bpf sysctls
	bpf: add bpf_jit_limit knob to restrict unpriv allocations
	vhost-net: set packet weight of tx polling to 2 * vq size
	vhost_net: use packet weight for rx handler, too
	vhost_net: introduce vhost_exceeds_weight()
	vhost: introduce vhost_exceeds_weight()
	vhost_net: fix possible infinite loop
	vhost: scsi: add weight support
	siphash: add cryptographically secure PRF
	siphash: implement HalfSipHash1-3 for hash tables
	inet: switch IP ID generator to siphash
	netfilter: ctnetlink: don't use conntrack/expect object addresses as id
	xtensa: add missing isync to the cpu_reset TLB code
	ALSA: hda - Fix a memory leak bug
	ALSA: hda - Add a generic reboot_notify
	ALSA: hda - Let all conexant codec enter D3 when rebooting
	HID: holtek: test for sanity of intfdata
	HID: hiddev: avoid opening a disconnected device
	HID: hiddev: do cleanup in failure of opening a device
	Input: kbtab - sanity check for endpoint type
	Input: iforce - add sanity checks
	net: usb: pegasus: fix improper read if get_registers() fail
	xen/pciback: remove set but not used variable 'old_state'
	irqchip/irq-imx-gpcv2: Forward irq type to parent
	perf header: Fix divide by zero error if f_header.attr_size==0
	perf header: Fix use of unitialized value warning
	libata: zpodd: Fix small read overflow in zpodd_get_mech_type()
	scsi: hpsa: correct scsi command status issue after reset
	ata: libahci: do not complain in case of deferred probe
	kbuild: modpost: handle KBUILD_EXTRA_SYMBOLS only for external modules
	arm64/efi: fix variable 'si' set but not used
	arm64/mm: fix variable 'pud' set but not used
	IB/core: Add mitigation for Spectre V1
	IB/mad: Fix use-after-free in ib mad completion handling
	ocfs2: remove set but not used variable 'last_hash'
	staging: comedi: dt3000: Fix signed integer overflow 'divider * base'
	staging: comedi: dt3000: Fix rounding up of timer divisor
	USB: core: Fix races in character device registration and deregistraion
	usb: cdc-acm: make sure a refcount is taken early enough
	USB: CDC: fix sanity checks in CDC union parser
	USB: serial: option: add D-Link DWM-222 device ID
	USB: serial: option: Add support for ZTE MF871A
	USB: serial: option: add the BroadMobi BM818 card
	USB: serial: option: Add Motorola modem UARTs
	asm-generic: fix -Wtype-limits compiler warnings
	bpf: fix bpf_jit_limit knob for PAGE_SIZE >= 64K
	arm64: compat: Allow single-byte watchpoints on all addresses
	netfilter: conntrack: Use consistent ct id hash calculation
	Input: psmouse - fix build error of multiple definition
	iommu/amd: Move iommu_init_pci() to .init section
	bnx2x: Fix VF's VLAN reconfiguration in reload.
	net/packet: fix race in tpacket_snd()
	sctp: fix the transport error_count check
	xen/netback: Reset nr_frags before freeing skb
	net/mlx5e: Only support tx/rx pause setting for port owner
	net/mlx5e: Use flow keys dissector to parse packets for ARFS
	team: Add vlan tx offload to hw_enc_features
	bonding: Add vlan tx offload to hw_enc_features
	Linux 4.9.190

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2019-08-25 14:53:01 +02:00
Daniel Borkmann
5124abda30 bpf: get rid of pure_initcall dependency to enable jits
commit fa9dd599b4dae841924b022768354cfde9affecb upstream.

Having a pure_initcall() callback just to permanently enable BPF
JITs under CONFIG_BPF_JIT_ALWAYS_ON is unnecessary and could leave
a small race window in future where JIT is still disabled on boot.
Since we know about the setting at compilation time anyway, just
initialize it properly there. Also consolidate all the individual
bpf_jit_enable variables into a single one and move them under one
location. Moreover, don't allow for setting unspecified garbage
values on them.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
[bwh: Backported to 4.9 as dependency of commit 2e4a30983b0f
 "bpf: restrict access to core bpf sysctls":
 - Drop change in arch/mips/net/ebpf_jit.c
 - Drop change to bpf_jit_kallsyms
 - Adjust filenames, context]
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-25 10:51:40 +02:00
Eric Biggers
60679b3b05 UPSTREAM: net: socket: set sock->sk to NULL after calling proto_ops::release()
Commit 9060cb719e61 ("net: crypto set sk to NULL when af_alg_release.")
fixed a use-after-free in sockfs_setattr() when an AF_ALG socket is
closed concurrently with fchownat().  However, it ignored that many
other proto_ops::release() methods don't set sock->sk to NULL and
therefore allow the same use-after-free:

    - base_sock_release
    - bnep_sock_release
    - cmtp_sock_release
    - data_sock_release
    - dn_release
    - hci_sock_release
    - hidp_sock_release
    - iucv_sock_release
    - l2cap_sock_release
    - llcp_sock_release
    - llc_ui_release
    - rawsock_release
    - rfcomm_sock_release
    - sco_sock_release
    - svc_release
    - vcc_release
    - x25_release

Rather than fixing all these and relying on every socket type to get
this right forever, just make __sock_release() set sock->sk to NULL
itself after calling proto_ops::release().
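
A minimal single-threaded sketch of that idea, in userspace C. All names
here (struct toy_socket, toy_sock_release(), forgetful_release(),
toy_get_uid()) are hypothetical stand-ins, not kernel symbols. The point is
only that the common release path clears the pointer, so a release method
that forgets to do so no longer leaves it dangling. The sketch does not
model the locking that the real concurrent case also needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct toy_sk { int uid; };
struct toy_socket;

struct toy_ops {
    void (*release)(struct toy_socket *sock);
};

struct toy_socket {
    struct toy_sk *sk;
    const struct toy_ops *ops;
};

/* A release method that frees sk but forgets to clear sock->sk. */
static void forgetful_release(struct toy_socket *sock)
{
    free(sock->sk);
    /* sock->sk is left dangling here on purpose. */
}

/* Common release path: clears sock->sk no matter what ->release() did. */
static void toy_sock_release(struct toy_socket *sock)
{
    assert(sock != NULL);
    if (sock->ops) {
        sock->ops->release(sock);
        sock->sk = NULL;        /* the central fix */
        sock->ops = NULL;       /* makes a second call a no-op */
    }
}

/* Reader that checks for NULL; returns -1 once the socket is released. */
static int toy_get_uid(const struct toy_socket *sock)
{
    return sock->sk ? sock->sk->uid : -1;
}
```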

Reproducer that produces the KASAN splat when any of these socket types
are configured into the kernel:

    #include <pthread.h>
    #include <stdlib.h>
    #include <sys/socket.h>
    #include <unistd.h>

    pthread_t t;
    volatile int fd;

    void *close_thread(void *arg)
    {
        for (;;) {
            usleep(rand() % 100);
            close(fd);
        }
    }

    int main()
    {
        pthread_create(&t, NULL, close_thread, NULL);
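        for (;;) {
            /* Try a random address family and socket type. */
            fd = socket(rand() % 50, rand() % 11, 0);
            /* 0x1000 is AT_EMPTY_PATH: operate on fd itself. */
            fchownat(fd, "", 1000, 1000, 0x1000);
            close(fd);
        }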
    }

Fixes: 86741ec25462 ("net: core: Add a UID field to struct sock.")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

(cherry picked from commit ff7b11aa481f682e0e9711abfeb7d03f5cd612bf)
Bug: 125367761
Test: used reproducer above
Change-Id: Ied4bbca5c7eb80c201fec6e0aabc95c24acc1b59
Signed-off-by: Eric Biggers <ebiggers@google.com>
2019-03-27 22:40:58 +00:00
Greg Kroah-Hartman
a0f30ae1d6 This is the 4.9.136 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlvm/IUACgkQONu9yGCS
 aT6mrw//ctcqOR9aZYTODrVHFZ4puE2xhae5Hr+hwtcE2WSjHWuxJfVkrEuJGlIH
 4oQpUfek+eYf3yZy8Iw9WLZH1+P3evGkR0G4gBD/A4f25qCKCcHEXOAPiKgeadnC
 tj49fEkiJgO3I9vRx8yJnUvhxR/Br5CTOUMdTYsWHbCsdewzCMHWlwpJhLwV053j
 P9cCrpfJLD55HDdj/jwcn2jfooIVfYsYkut8jP0qTKI04rWEZgOrCSjahN8KHtQ5
 GgykDU7db8mmP1IhM+bhGuQReSX7myx/MGx5dS7Mli+5aUtYCMlkqylpL96NuBbe
 axFpie4nBTny6dIHXodZx59J/T1ERBws9zLzKF1oyxANHEeTiO7q+hbaw9vRLN5G
 mNWyn0KZ8T0+BWSL1pyA+oVwZkjOcMDil5Gz7Y7A9kE4xj5grrl5IevAtSD6tb9X
 zwAk5hjvaBmZVVM9NgbG2bGATPNLnv1l57TCRjsx91p9uzReg8gYxNrijIwGqGip
 HrR/HJvgfI9Df52X8JtGfs+397mXevxl1Lo56Pv1nkagkD1fvhqFLRZgd3y1MoIO
 DNjdUohw0tBorHqdpvgnZnifuwk3AcPiCMqqfCcGwkcguoM8XFhedTkTPrut5+f4
 IPK0Qh25lcT9k+GHJUvDOEzQvx4CGcG8uVj0FgiebWdlS3KZ56s=
 =0M4P
 -----END PGP SIGNATURE-----

Merge 4.9.136 into android-4.9

Also revert commit b91d532928df ("ipv6: set rt6i_protocol properly in
the route when it is installed") as it breaks the test systems.

Changes in 4.9.136
	xfrm: Validate address prefix lengths in the xfrm selector.
	xfrm6: call kfree_skb when skb is toobig
	mac80211: Always report TX status
	cfg80211: reg: Init wiphy_idx in regulatory_hint_core()
	mac80211: fix pending queue hang due to TX_DROP
	cfg80211: Address some corner cases in scan result channel updating
	mac80211: TDLS: fix skb queue/priority assignment
	ARM: 8799/1: mm: fix pci_ioremap_io() offset check
	xfrm: validate template mode
	ARM: dts: BCM63xx: Fix incorrect interrupt specifiers
	net: macb: Clean 64b dma addresses if they are not detected
	soc: fsl: qbman: qman: avoid allocating from non existing gen_pool
	soc: fsl: qe: Fix copy/paste bug in ucc_get_tdm_sync_shift()
	nl80211: Fix possible Spectre-v1 for NL80211_TXRATE_HT
	mac80211_hwsim: do not omit multicast announce of first added radio
	Bluetooth: SMP: fix crash in unpairing
	pxa168fb: prepare the clock
	qed: Avoid implicit enum conversion in qed_roce_mode_to_flavor
	qed: Avoid constant logical operation warning in qed_vf_pf_acquire
	asix: Check for supported Wake-on-LAN modes
	ax88179_178a: Check for supported Wake-on-LAN modes
	lan78xx: Check for supported Wake-on-LAN modes
	sr9800: Check for supported Wake-on-LAN modes
	r8152: Check for supported Wake-on-LAN Modes
	smsc75xx: Check for Wake-on-LAN modes
	smsc95xx: Check for Wake-on-LAN modes
	perf/ring_buffer: Prevent concurent ring buffer access
	perf/x86/intel/uncore: Fix PCI BDF address of M3UPI on SKX
	net: fec: fix rare tx timeout
	declance: Fix continuation with the adapter identification message
	net: cxgb3_main: fix a missing-check bug
	perf symbols: Fix memory corruption because of zero length symbols
	mm/memory_hotplug.c: fix overflow in test_pages_in_a_zone()
	MIPS: microMIPS: Fix decoding of swsp16 instruction
	MIPS: Handle non word sized instructions when examining frame
	scsi: aacraid: Fix typo in blink status
	f2fs: fix multiple f2fs_add_link() having same name for inline dentry
	igb: Remove superfluous reset to PHY and page 0 selection
	ACPI: sysfs: Make ACPI GPE mask kernel parameter cover all GPEs
	PCI: Disable MSI for HiSilicon Hip06/Hip07 only in Root Port mode
	i2c: bcm2835: Avoid possible NULL ptr dereference
	efi/fb: Correct PCI_STD_RESOURCE_END usage
	ipv6: set rt6i_protocol properly in the route when it is installed
	platform/x86: acer-wmi: setup accelerometer when ACPI device was found
	IB/ipoib: Do not warn if IPoIB debugfs doesn't exist
	IB/core: Fix the validations of a multicast LID in attach or detach operations
	orangefs: off by ones in xattr size checks
	rxe: Fix a sleep-in-atomic bug in post_one_send
	nvme-pci: fix CMB sysfs file removal in reset path
	net: phy: marvell: Limit 88m1101 autoneg errata to 88E1145 as well.
	net/mlx5: Fix command completion after timeout access invalid structure
	tipc: Fix tipc_sk_reinit handling of -EAGAIN
	tipc: fix a race condition of releasing subscriber object
	bnxt_en: Don't use rtnl lock to protect link change logic in workqueue.
	ath10k: fix NAPI enable/disable symmetry for AHB interface
	ARM: dts: bcm283x: Reserve first page for firmware
	btrfs: fiemap: Cache and merge fiemap extent before submit it to user
	ata: sata_rcar: Handle return value of clk_prepare_enable
	reset: hi6220: Set module license so that it can be loaded
	ASoC: Intel: Skylake: Fix to parse consecutive string tkns in manifest
	arch/sparc: increase CONFIG_NODES_SHIFT on SPARC64 to 5
	mac80211: fix TX aggregation start/stop callback race
	libata: fix error checking in in ata_parse_force_one()
	net: ethernet: stmmac: Fix altr_tse_pcs SGMII Initialization
	qlcnic: Fix tunnel offload for 82xx adapters
	x86/cpu/cyrix: Add alternative Device ID of Geode GX1 SoC
	ARM: 8677/1: boot/compressed: fix decompressor header layout for v7-M
	gpu: ipu-v3: Fix CSI selection for VDIC
	elevator: fix truncation of icq_cache_name
	net: stmmac: ensure jumbo_frm error return is correctly checked for -ve value
	Btrfs: clear EXTENT_DEFRAG bits in finish_ordered_io
	ufs: we need to sync inode before freeing it
	net/mlx5e: Fix fixpoint divide exception in mlx5e_am_stats_compare
	ip6_tunnel: Correct tos value in collect_md mode
	net/mlx5: Fix driver load error flow when firmware is stuck
	perf evsel: Fix probing of precise_ip level for default cycles event
	perf probe: Fix probe definition for inlined functions
	net/mlx5: Fix health work queue spin lock to IRQ safe
	usb: renesas_usbhs: gadget: fix spin_lock_init() for &uep->lock
	usb: renesas_usbhs: gadget: fix unused-but-set-variable warning
	usb: dwc3: omap: remove IRQ_NOAUTOEN used with shared irq
	clk: samsung: Fix m2m scaler clock on Exynos542x
	ptr_ring: fix up after recent ptr_ring changes
	staging: wilc1000: Fix problem with wrong vif index
	rds: ib: Fix missing call to rds_ib_dev_put in rds_ib_setup_qp
	iio: adc: Revert "axp288: Drop bogus AXP288_ADC_TS_PIN_CTRL register modifications"
	qed: Warn PTT usage by wrong hw-function
	ocfs2: fix deadlock caused by recursive locking in xattr
	net: cdc_ncm: GetNtbFormat endian fix
	sctp: use right member as the param of list_for_each_entry
	ALSA: hda - No loopback on ALC299 codec
	ath10k: convert warning about non-existent OTP board id to debug message
	ipv6: fix cleanup ordering for ip6_mr failure
	IB/ipoib: Fix lockdep issue found on ipoib_ib_dev_heavy_flush
	IB/rxe: put the pool on allocation failure
	nbd: only set MSG_MORE when we have more to send
	mm/frame_vector.c: release a semaphore in 'get_vaddr_frames()'
	IB/mlx5: Avoid passing an invalid QP type to firmware
	scsi: qla2xxx: Avoid double completion of abort command
	drm: bochs: Don't remove uninitialized fbdev framebuffer
	i40e: avoid NVM acquire deadlock during NVM update
	Revert "IB/ipoib: Update broadcast object if PKey value was changed in index 0"
	Btrfs: incremental send, fix invalid memory access
	drm/msm: Fix possible null dereference on failure of get_pages()
	module: fix DEBUG_SET_MODULE_RONX typo
	iio: pressure: zpa2326: Remove always-true check which confuses gcc
	l2tp: remove configurable payload offset
	macsec: fix memory leaks when skb_to_sgvec fails
	perf/core: Fix locking for children siblings group read
	cifs: Use ULL suffix for 64-bit constant
	futex: futex_wake_op, do not fail on invalid op
	ALSA: hda - Fix incorrect usage of IS_REACHABLE()
	test_bpf: Fix testing with CONFIG_BPF_JIT_ALWAYS_ON=y on other arches
	xen-netfront: Update features after registering netdev
	sparc64: Fix regression in pmdp_invalidate().
	xen-netfront: Fix mismatched rtnl_unlock
	enic: do not overwrite error code
	bonding: ratelimit failed speed/duplex update warning
	nvmet: fix space padding in serial number
	iio: buffer: fix the function signature to match implementation
	x86/paravirt: Fix some warning messages
	IB/mlx4: Fix an error handling path in 'mlx4_ib_rereg_user_mr()'
	libertas: call into generic suspend code before turning off power
	xhci: Fix USB3 NULL pointer dereference at logical disconnect.
	perf tests: Fix indexing when invoking subtests
	ARM: dts: imx53-qsb: disable 1.2GHz OPP
	rxrpc: Don't check RXRPC_CALL_TX_LAST after calling rxrpc_rotate_tx_window()
	rxrpc: Only take the rwind and mtu values from latest ACK
	net: ena: fix NULL dereference due to untimely napi initialization
	fs/fat/fatent.c: add cond_resched() to fat_count_free_clusters()
	mtd: spi-nor: Add support for is25wp series chips
	Revert "netfilter: ipv6: nf_defrag: drop skb dst before queueing"
	perf tools: Disable parallelism for 'make clean'
	bridge: do not add port to router list when receives query with source 0.0.0.0
	net: bridge: remove ipv6 zero address check in mcast queries
	ipv6: mcast: fix a use-after-free in inet6_mc_check
	ipv6/ndisc: Preserve IPv6 control buffer if protocol error handlers are called
	llc: set SOCK_RCU_FREE in llc_sap_add_socket()
	net/ipv6: Fix index counter for unicast addresses in in6_dump_addrs
	net: sched: gred: pass the right attribute to gred_change_table_def()
	net: socket: fix a missing-check bug
	net: stmmac: Fix stmmac_mdio_reset() when building stmmac as modules
	net: udp: fix handling of CHECKSUM_COMPLETE packets
	r8169: fix NAPI handling under high load
	sctp: fix race on sctp_id2asoc
	vhost: Fix Spectre V1 vulnerability
	ethtool: fix a privilege escalation bug
	bonding: fix length of actor system
	net: drop skb on failure in ip_check_defrag()
	net: fix pskb_trim_rcsum_slow() with odd trim offset
	rtnetlink: Disallow FDB configuration for non-Ethernet device
	ip6_tunnel: Fix encapsulation layout
	Revert "x86/mm: Expand static page table for fixmap space"
	crypto: shash - Fix a sleep-in-atomic bug in shash_setkey_unaligned
	ahci: don't ignore result code of ahci_reset_controller()
	gpio: mxs: Get rid of external API call
	xfs: truncate transaction does not modify the inobt
	cachefiles: fix the race between cachefiles_bury_object() and rmdir(2)
	ptp: fix Spectre v1 vulnerability
	drm/edid: Add 6 bpc quirk for BOE panel in HP Pavilion 15-n233sl
	RDMA/ucma: Fix Spectre v1 vulnerability
	IB/ucm: Fix Spectre v1 vulnerability
	cdc-acm: correct counting of UART states in serial state notification
	usb: gadget: storage: Fix Spectre v1 vulnerability
	USB: fix the usbfs flag sanitization for control transfers
	Input: elan_i2c - add ACPI ID for Lenovo IdeaPad 330-15IGM
	sched/fair: Fix throttle_list starvation with low CFS quota
	x86/percpu: Fix this_cpu_read()
	x86/time: Correct the attribute on jiffies' definition
	net: fs_enet: do not call phy_stop() in interrupts
	posix-timers: Sanitize overrun handling
	Linux 4.9.136

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-11-23 10:47:24 +01:00
Wenwen Wang
f57ef24f81 net: socket: fix a missing-check bug
[ Upstream commit b6168562c8ce2bd5a30e213021650422e08764dc ]

In ethtool_ioctl(), the ioctl command 'ethcmd' is checked through a switch
statement to see whether it is necessary to pre-process the ethtool
structure, because, as mentioned in the comment, the structure
ethtool_rxnfc is defined with padding. If yes, a user-space buffer 'rxnfc'
is allocated through compat_alloc_user_space(). One thing to note here is
that, if 'ethcmd' is ETHTOOL_GRXCLSRLALL, the size of the buffer 'rxnfc' is
partially determined by 'rule_cnt', which is actually acquired from the
user-space buffer 'compat_rxnfc', i.e., 'compat_rxnfc->rule_cnt', through
get_user(). After 'rxnfc' is allocated, the data in the original user-space
buffer 'compat_rxnfc' is then copied to 'rxnfc' through copy_in_user(),
including the 'rule_cnt' field. However, after this copy, no check is
re-enforced on 'rxnfc->rule_cnt'. So a malicious user can race to change
the value of 'compat_rxnfc->rule_cnt' between these two copies. In this
way, the attacker can bypass the previous check on 'rule_cnt' and inject
malicious data. This can cause undefined behavior in the kernel and
introduce a potential security risk.

This patch avoids the above issue by copying the value acquired by
get_user() to 'rxnfc->rule_cnt', if 'ethcmd' is ETHTOOL_GRXCLSRLALL.
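
The double-fetch pattern and its fix can be sketched in userspace C. This
is an illustration under assumptions, not the kernel code: struct
toy_rxnfc, toy_copy_rxnfc(), toy_racing_writer() and the max_rules bound
are hypothetical, a plain memcpy() stands in for copy_in_user(), and an
optional hook simulates the attacker changing the source buffer between
the two reads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily reduced version of the structure being copied. */
struct toy_rxnfc {
    uint32_t cmd;
    uint32_t rule_cnt;
};

/* Hook run between the two reads to simulate a racing writer. */
typedef void (*race_hook)(struct toy_rxnfc *user);

/* Simulated attacker: bumps rule_cnt after it has been validated. */
static void toy_racing_writer(struct toy_rxnfc *user)
{
    user->rule_cnt = 1000;
}

/* Returns 0 on success, -1 if the first-fetched rule_cnt is too large. */
static int toy_copy_rxnfc(struct toy_rxnfc *dst, struct toy_rxnfc *user,
                          uint32_t max_rules, race_hook hook)
{
    uint32_t rule_cnt;

    assert(dst != NULL && user != NULL);

    rule_cnt = user->rule_cnt;          /* first fetch, used for the check */
    if (rule_cnt > max_rules)
        return -1;

    if (hook)
        hook(user);                     /* the source may change here */

    memcpy(dst, user, sizeof(*dst));    /* second fetch copies rule_cnt again */
    dst->rule_cnt = rule_cnt;           /* fix: reuse the validated value */
    return 0;
}
```

Without the final assignment, dst->rule_cnt would carry whatever the racing
writer stored, even though the bound was checked against the earlier value.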

Signed-off-by: Wenwen Wang <wang6495@umn.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-10 07:42:58 -08:00
Cong Wang
7fa8c15e72 UPSTREAM: socket: close race condition between sock_close() and sockfs_setattr()
fchownat() doesn't even hold a refcnt on the fd until it figures out that
the fd is really needed (otherwise it is ignored), and it releases the fd
after it resolves the path. This means sock_close() could race with
sockfs_setattr(), which leads to a NULL pointer dereference
since typically we set sock->sk to NULL in ->release().

As pointed out by Al, this is unique to sockfs. So we can fix this
in socket layer by acquiring inode_lock in sock_close() and
checking against NULL in sockfs_setattr().
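
The shape of that fix can be sketched in userspace C as follows. This is
an assumption-laden illustration, not the kernel code: struct
toy_inode_sock, toy_sock_close() and toy_setattr_uid() are hypothetical
names, a pthread mutex stands in for the inode lock, and -1 stands in for
whatever error the real sockfs_setattr() returns. Both paths serialize on
the same lock, and the setattr path checks for NULL before touching sk.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct toy_sk { int uid; };

struct toy_inode_sock {
    pthread_mutex_t inode_lock;   /* stands in for the inode lock */
    struct toy_sk *sk;
};

/* Close path: drops sk while holding the lock. */
static void toy_sock_close(struct toy_inode_sock *s)
{
    assert(s != NULL);
    pthread_mutex_lock(&s->inode_lock);
    s->sk = NULL;                 /* stands in for ->release() clearing sk */
    pthread_mutex_unlock(&s->inode_lock);
}

/* Setattr path: under the same lock, only touches sk if it still exists. */
static int toy_setattr_uid(struct toy_inode_sock *s, int uid)
{
    int err = 0;

    assert(s != NULL);
    pthread_mutex_lock(&s->inode_lock);
    if (s->sk)
        s->sk->uid = uid;
    else
        err = -1;                 /* socket already closed */
    pthread_mutex_unlock(&s->inode_lock);
    return err;
}
```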

sock_release() is called in many places, but only the sock_close()
path matters here. Fortunately, this should not affect normal
sock_close() as it is only called when the last fd refcnt is gone.
It only affects sock_close() with a parallel sockfs_setattr() in
progress, which is not common.

Fixes: 86741ec25462 ("net: core: Add a UID field to struct sock.")
Reported-by: shankarapailoor <shankarapailoor@gmail.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: Lorenzo Colitti <lorenzo@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

(cherry picked from commit 6d8c50dcb029872b298eea68cc6209c866fd3e14)
Signed-off-by: Chenbo Feng <fengc@google.com>

Bug: 112220999
Test: syzkaller reproducer doesn't trigger the crash anymore
Change-Id: I586fbc3b200f8cb855017d5cd701a126a36b8172
2018-08-23 17:30:03 +00:00
Greg Kroah-Hartman
47b77b8d01 This is the 4.9.118 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAltoWckACgkQONu9yGCS
 aT4rdxAAyB4LLh5ylp8b2wEbpSWpOIRGfb1Y78VLf8T3TPsCo/46pTgOPVwpGpeJ
 O9QDBcPBEwqJVJEYW0Hf5PBj/JhVGw9uQ4JM+6Tuy1BoZmlfxmUgQz2NotSAAxUD
 b5ymy5LnMOoM+GX2IsPILsz0h54NGTlQtdjH2C6dUYx/u8uWzUwgW1eXPdc+m++7
 OSWSQ276jZs0oAYgsS5r0GBpe5C+G72dRVDD0uRKTNQEsmSdCOTX6BzaxBzll4yQ
 gaZTQre0Sgmv6cyl0rJ6JqdyNECN1i+aw3oSU75Zr+1cfaRPh+8APtN0PW6HUV47
 WO08k1/0L5HA/EOU6YI4QwNcQS8yv+H0avmsDwnXc8a2NgKpLFlV+LjAQA2jDnTJ
 CWFkLFyfkFtYM/W1Xglyo7OyA1o1BmoZVzjiPECRtW2RqVfl9hORqH4gMtxoHxy2
 maE0he/FcVp6iu9hoas2g7V7T/O6UF2ipYWG/+WZBuZY3SjojNth/MKuQ7E+qLY5
 UDBMx9CCAjYqAKN4A+aMCAfociV5vTAeQLbwc1ffa4JtqX88nDQxAp7SBP8beEWc
 CQsnCvksTdqebeDN0DWcRbSs1abjjeZcoWiifdwGVwwiE5D1RgLZxrABaNEX4XJ6
 lQNUYzMuT8D9MzEoDn0TB5mLgIvxdA5gQzwWMV30h5f3fXax1ro=
 =qE4w
 -----END PGP SIGNATURE-----

Merge 4.9.118 into android-4.9

Changes in 4.9.118
	ipv4: remove BUG_ON() from fib_compute_spec_dst
	net: ena: Fix use of uninitialized DMA address bits field
	net: fix amd-xgbe flow-control issue
	net: lan78xx: fix rx handling before first packet is send
	net: mdio-mux: bcm-iproc: fix wrong getter and setter pair
	NET: stmmac: align DMA stuff to largest cache line length
	tcp_bbr: fix bw probing to raise in-flight data for very small BDPs
	xen-netfront: wait xenbus state change when load module manually
	tcp: do not force quickack when receiving out-of-order packets
	tcp: add max_quickacks param to tcp_incr_quickack and tcp_enter_quickack_mode
	tcp: do not aggressively quick ack after ECN events
	tcp: refactor tcp_ecn_check_ce to remove sk type cast
	tcp: add one more quick ack after after ECN events
	pinctrl: intel: Read back TX buffer state
	sched/wait: Remove the lockless swait_active() check in swake_up*()
	bonding: avoid lockdep confusion in bond_get_stats()
	inet: frag: enforce memory limits earlier
	ipv4: frags: handle possible skb truesize change
	net: dsa: Do not suspend/resume closed slave_dev
	netlink: Fix spectre v1 gadget in netlink_create()
	net: stmmac: Fix WoL for PCI-based setups
	squashfs: more metadata hardening
	squashfs: more metadata hardenings
	can: ems_usb: Fix memory leak on ems_usb_disconnect()
	net: socket: fix potential spectre v1 gadget in socketcall
	virtio_balloon: fix another race between migration and ballooning
	kvm: x86: vmx: fix vpid leak
	crypto: padlock-aes - Fix Nano workaround data corruption
	drm/vc4: Reset ->{x, y}_scaling[1] when dealing with uniplanar formats
	scsi: sg: fix minor memory leak in error path
	Linux 4.9.118

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-08-06 19:09:58 +02:00
Jeremy Cline
9a492f8c71 net: socket: fix potential spectre v1 gadget in socketcall
commit c8e8cd579bb4265651df8223730105341e61a2d1 upstream.

'call' is a user-controlled value, so sanitize the array index after the
bounds check to avoid speculating past the bounds of the 'nargs' array.

Found with the help of Smatch:

net/socket.c:2508 __do_sys_socketcall() warn: potential spectre issue
'nargs' [r] (local cap)
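
The index sanitization described above can be sketched in userspace C. This is
only an illustration of the masking idea behind the kernel's
array_index_nospec() helper, not the patch itself: demo_index_mask(),
demo_nargs_lookup(), and the table contents are hypothetical names and values,
and the mask relies on arithmetic right shift of negative values, which is
implementation-defined in C (gcc and clang provide it).

```c
#include <assert.h>
#include <limits.h>

#define DEMO_NR 4UL /* hypothetical table size, not the real nargs[] size */

/*
 * Return ~0UL when index < size and 0 otherwise, computed without a
 * conditional branch on index. Assumes size <= LONG_MAX and an
 * arithmetic right shift for negative values.
 */
unsigned long demo_index_mask(unsigned long index, unsigned long size)
{
    return ~(long)(index | (size - 1UL - index)) >>
           (sizeof(long) * CHAR_BIT - 1);
}

/* Bounds check first, then clamp the index before using it. */
int demo_nargs_lookup(unsigned long call)
{
    static const unsigned char demo_nargs[DEMO_NR] = { 0, 3, 3, 2 };

    if (call >= DEMO_NR)
        return -1;

    /* An out-of-range index is forced to 0 even on a mispredicted path. */
    call &= demo_index_mask(call, DEMO_NR);
    return demo_nargs[call];
}
```

A userspace compiler is free to optimize this differently from the kernel
helper, so the sketch shows the arithmetic only, not a hardening guarantee.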

Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jeremy Cline <jcline@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-06 16:23:04 +02:00
Greg Kroah-Hartman
71f1469722 This is the 4.9.79 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlpxrs0ACgkQONu9yGCS
 aT7Yxw/+NmM+Yh70QOpW02RCFHCB+F9tnuQXNlLfEoDqlujMS/UNuVMx39gQXDaU
 7T/JOmnVtp9WQL9RLgAegSc3ayIQELzvtKjDLSo/hzxYsOmr0WlN2CVTGT7hn9JH
 IQdf8cR2r4FZ/XcxQLpSsRabwhqfeoND1TTm5LUNB1Ii05hUU6/s0k1rQguabuo5
 vi0BzSh7v/URxlLyL0m4ZVqovWOASS5/qSv7wazd4i/bSqH3g7VXLNu93iyOB8ih
 XXpeTjtfAwJ5kUXBWZPNazUzpQ7b56sQPtsvN6CrvTv8jKJ+FH+7S4d50Vgbu51X
 YBC36yypYPXunMXB9iiLYkyb8jraKr12BRLXQyl3TlNANoYjBiT/a2XmHDMA1VbL
 +ydbswbmcAvZ1fuAekVY+HIogEroWzN7FbhdUgV12nm7/4WfxpBTZW+M8Es/Stuh
 2ACT9TWopbhwRFUhFT5kyDTTnK++NsshGzUXbR9qPQzhdaqe76RPfJ6uHV69MXxP
 gE9o3NQ3fUieJO5nQj54atErX+sJ4987DnGoWrg+Ye9Svsq1oVw0K1e44VLBp08v
 iZk2lvNjUWnkDGQOhsPEYCLq6KPjXkaqV4OZVS6tGxGEZ4QQJjbnYk+kPeKjrKIA
 iP3nfaLJ4HQc2kvwEI41HEJGWyGUlhdrnDqfpxpWgGXOStGJrq0=
 =RJ2h
 -----END PGP SIGNATURE-----

Merge 4.9.79 into android-4.9

Changes in 4.9.79
	x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit kernels
	orangefs: use list_for_each_entry_safe in purge_waiting_ops
	orangefs: initialize op on loop restart in orangefs_devreq_read
	usbip: prevent vhci_hcd driver from leaking a socket pointer address
	usbip: Fix implicit fallthrough warning
	usbip: Fix potential format overflow in userspace tools
	can: af_can: can_rcv(): replace WARN_ONCE by pr_warn_once
	can: af_can: canfd_rcv(): replace WARN_ONCE by pr_warn_once
	KVM: arm/arm64: Check pagesize when allocating a hugepage at Stage 2
	Prevent timer value 0 for MWAITX
	drivers: base: cacheinfo: fix x86 with CONFIG_OF enabled
	drivers: base: cacheinfo: fix boot error message when acpi is enabled
	mm/mmap.c: do not blow on PROT_NONE MAP_FIXED holes in the stack
	hwpoison, memcg: forcibly uncharge LRU pages
	cma: fix calculation of aligned offset
	mm, page_alloc: fix potential false positive in __zone_watermark_ok
	ipc: msg, make msgrcv work with LONG_MIN
	ACPI / scan: Prefer devices without _HID/_CID for _ADR matching
	ACPICA: Namespace: fix operand cache leak
	netfilter: nfnetlink_cthelper: Add missing permission checks
	netfilter: xt_osf: Add missing permission checks
	reiserfs: fix race in prealloc discard
	reiserfs: don't preallocate blocks for extended attributes
	fs/fcntl: f_setown, avoid undefined behaviour
	scsi: libiscsi: fix shifting of DID_REQUEUE host byte
	Revert "module: Add retpoline tag to VERMAGIC"
	mm: fix 100% CPU kswapd busyloop on unreclaimable nodes
	Input: trackpoint - force 3 buttons if 0 button is reported
	orangefs: fix deadlock; do not write i_size in read_iter
	um: link vmlinux with -no-pie
	vsyscall: Fix permissions for emulate mode with KAISER/PTI
	eventpoll.h: add missing epoll event masks
	dccp: don't restart ccid2_hc_tx_rto_expire() if sk in closed state
	ipv6: Fix getsockopt() for sockets with default IPV6_AUTOFLOWLABEL
	ipv6: fix udpv6 sendmsg crash caused by too small MTU
	ipv6: ip6_make_skb() needs to clear cork.base.dst
	lan78xx: Fix failure in USB Full Speed
	net: igmp: fix source address check for IGMPv3 reports
	net: qdisc_pkt_len_init() should be more robust
	net: tcp: close sock if net namespace is exiting
	pppoe: take ->needed_headroom of lower device into account on xmit
	r8169: fix memory corruption on retrieval of hardware statistics.
	sctp: do not allow the v4 socket to bind a v4mapped v6 address
	sctp: return error if the asoc has been peeled off in sctp_wait_for_sndbuf
	tipc: fix a memory leak in tipc_nl_node_get_link()
	vmxnet3: repair memory leak
	net: Allow neigh contructor functions ability to modify the primary_key
	ipv4: Make neigh lookup keys for loopback/point-to-point devices be INADDR_ANY
	ppp: unlock all_ppp_mutex before registering device
	be2net: restore properly promisc mode after queues reconfiguration
	ip6_gre: init dev->mtu and dev->hard_header_len correctly
	gso: validate gso_type in GSO handlers
	mlxsw: spectrum_router: Don't log an error on missing neighbor
	tun: fix a memory leak for tfile->tx_array
	flow_dissector: properly cap thoff field
	perf/x86/amd/power: Do not load AMD power module on !AMD platforms
	x86/microcode/intel: Extend BDW late-loading further with LLC size check
	hrtimer: Reset hrtimer cpu base proper on CPU hotplug
	x86: bpf_jit: small optimization in emit_bpf_tail_call()
	bpf: fix bpf_tail_call() x64 JIT
	bpf: introduce BPF_JIT_ALWAYS_ON config
	bpf: arsh is not supported in 32 bit alu thus reject it
	bpf: avoid false sharing of map refcount with max_entries
	bpf: fix divides by zero
	bpf: fix 32-bit divide by zero
	bpf: reject stores into ctx via st and xadd
	nfsd: auth: Fix gid sorting when rootsquash enabled
	Linux 4.9.79

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-01-31 14:13:00 +01:00
Alexei Starovoitov
a3d6dd6a66 bpf: introduce BPF_JIT_ALWAYS_ON config
[ upstream commit 290af86629b25ffd1ed6232c4e9107da031705cb ]

The BPF interpreter has been used as part of the spectre 2 attack CVE-2017-5715.

A quote from the Google Project Zero blog:
"At this point, it would normally be necessary to locate gadgets in
the host kernel code that can be used to actually leak data by reading
from an attacker-controlled location, shifting and masking the result
appropriately and then using the result of that as offset to an
attacker-controlled address for a load. But piecing gadgets together
and figuring out which ones work in a speculation context seems annoying.
So instead, we decided to use the eBPF interpreter, which is built into
the host kernel - while there is no legitimate way to invoke it from inside
a VM, the presence of the code in the host kernel's text section is sufficient
to make it usable for the attack, just like with ordinary ROP gadgets."

To make the attacker's job harder, introduce the BPF_JIT_ALWAYS_ON config
option, which removes the interpreter from the kernel in favor of JIT-only mode.
So far eBPF JIT is supported by:
x64, arm64, arm32, sparc64, s390, powerpc64, mips64

The start of a JITed program is randomized and its code page is marked read-only.
In addition "constant blinding" can be turned on with net.core.bpf_jit_harden

v2->v3:
- move __bpf_prog_ret0 under ifdef (Daniel)

v1->v2:
- fix init order, test_bpf and cBPF (Daniel's feedback)
- fix offloaded bpf (Jakub's feedback)
- add 'return 0' dummy in case something can invoke prog->bpf_func
- retarget bpf tree. For bpf-next the patch would need one extra hunk.
  It will be sent when the trees are merged back to net-next

Considered doing:
  int bpf_jit_enable __read_mostly = BPF_EBPF_JIT_DEFAULT;
but it seems better to land the patch as-is and in bpf-next remove
bpf_jit_enable global variable from all JITs, consolidate in one place
and remove this jit_init() function.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-31 12:55:56 +01:00
Greg Kroah-Hartman
319c8e1bc7 This is the 4.9.71 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlo6KFYACgkQONu9yGCS
 aT4mzw//cSnAjc7kuTtk96GKWat1bQExyb4scmsEkArfVKhoCy0Dhyr9yr4Y6+mX
 6l2uUyQ70jhqOvinWIVuoDJoiZhtloudCe6ehmXm81xZsLacmelIC9NHGZ/vx/10
 vC4BIZZgft5JiL4OSp/XTd0t++8maK5RUwp8cCqqTDeUyqHKNjUg+moMuJdjvRf7
 4qGoBZ4lyijUU5V+WC98KSZSPncU6U1atA6k6Yvgu7oMFGembztERCx19Ka0JxA5
 mzsmAH3TIhHUSGinDpTfW9x3Cmu0Dg3H7mQ0AaEVjhAi1oKTxxp0drLCZbeJUAXX
 9QPJqr20XZWkuGX/yuy1vkcVo6kRfjaPYi1yyiFoEQ23hYXAaTJyiXC/CWv6kAkc
 MOIXqHQgfegDAC33EzVunp/ue0sBwVAhFTpaTwbUiKJ+lpZY74mV+bjk6gZdlGJM
 9TAOE66oAPNt6SM+5QC5mtO9cC03nCDIzbud5KXzdjYH8RBfIEvidxNv5qM6x8Hb
 dJn6//nQzMTYIQFHja19Sqbt0xXq2lck5DrZZ+YnXlHr5JH1DzPQfqfmu8GD094e
 H3oLDUmyBVnkI5jmgo3Xc+ZLArUMX7HhTyKSp+mXxRtGNulcbbQwaSWjEUoqYSzN
 twMQPS+NKu+ZuubztP+7gOvyofmAAfcPX6yZpTnPyFKEnjyU3Uw=
 =nCSn
 -----END PGP SIGNATURE-----

Merge 4.9.71 into android-4.9

Changes in 4.9.71
	mfd: fsl-imx25: Clean up irq settings during removal
	crypto: rsa - fix buffer overread when stripping leading zeroes
	crypto: hmac - require that the underlying hash algorithm is unkeyed
	crypto: salsa20 - fix blkcipher_walk API usage
	autofs: fix careless error in recent commit
	tracing: Allocate mask_str buffer dynamically
	USB: uas and storage: Add US_FL_BROKEN_FUA for another JMicron JMS567 ID
	USB: core: prevent malicious bNumInterfaces overflow
	usbip: fix stub_rx: get_pipe() to validate endpoint number
	usb: add helper to extract bits 12:11 of wMaxPacketSize
	usbip: fix stub_rx: harden CMD_SUBMIT path to handle malicious input
	usbip: fix stub_send_ret_submit() vulnerability to null transfer_buffer
	ceph: drop negative child dentries before try pruning inode's alias
	usb: xhci: fix TDS for MTK xHCI1.1
	Bluetooth: btusb: driver to enable the usb-wakeup feature
	xhci: Don't add a virt_dev to the devs array before it's fully allocated
	nfs: don't wait on commit in nfs_commit_inode() if there were no commit requests
	sched/rt: Do not pull from current CPU if only one CPU to pull
	eeprom: at24: change nvmem stride to 1
	dmaengine: dmatest: move callback wait queue to thread context
	ext4: fix fdatasync(2) after fallocate(2) operation
	ext4: fix crash when a directory's i_size is too small
	mac80211: Fix addition of mesh configuration element
	usb: phy: isp1301: Add OF device ID table
	KVM: nVMX: do not warn when MSR bitmap address is not backed
	usb: xhci-mtk: check hcc_params after adding primary hcd
	md-cluster: free md_cluster_info if node leave cluster
	userfaultfd: shmem: __do_fault requires VM_FAULT_NOPAGE
	userfaultfd: selftest: vm: allow to build in vm/ directory
	net: initialize msg.msg_flags in recvfrom
	bnxt_en: Ignore 0 value in autoneg supported speed from firmware.
	net: bcmgenet: correct the RBUF_OVFL_CNT and RBUF_ERR_CNT MIB values
	net: bcmgenet: correct MIB access of UniMAC RUNT counters
	net: bcmgenet: reserved phy revisions must be checked first
	net: bcmgenet: power down internal phy if open or resume fails
	net: bcmgenet: synchronize irq0 status between the isr and task
	net: bcmgenet: Power up the internal PHY before probing the MII
	rxrpc: Wake up the transmitter if Rx window size increases on the peer
	net/mlx5: Fix create autogroup prev initializer
	net/mlx5: Don't save PCI state when PCI error is detected
	iommu/io-pgtable-arm-v7s: Check for leaf entry before dereferencing it
	drm/amdgpu: fix parser init error path to avoid crash in parser fini
	NFSD: fix nfsd_minorversion(.., NFSD_AVAIL)
	NFSD: fix nfsd_reset_versions for NFSv4.
	Input: i8042 - add TUXEDO BU1406 (N24_25BU) to the nomux list
	drm/omap: fix dmabuf mmap for dma_alloc'ed buffers
	netfilter: bridge: honor frag_max_size when refragmenting
	ASoC: rsnd: fix sound route path when using SRC6/SRC9
	blk-mq: Fix tagset reinit in the presence of cpu hot-unplug
	writeback: fix memory leak in wb_queue_work()
	net: wimax/i2400m: fix NULL-deref at probe
	dmaengine: Fix array index out of bounds warning in __get_unmap_pool()
	irqchip/mvebu-odmi: Select GENERIC_MSI_IRQ_DOMAIN
	net: Resend IGMP memberships upon peer notification.
	mlxsw: reg: Fix SPVM max record count
	mlxsw: reg: Fix SPVMLR max record count
	qed: Align CIDs according to DORQ requirement
	qed: Fix mapping leak on LL2 rx flow
	qed: Fix interrupt flags on Rx LL2
	drm: amd: remove broken include path
	intel_th: pci: Add Gemini Lake support
	openrisc: fix issue handling 8 byte get_user calls
	ASoC: rcar: clear DE bit only in PDMACHCR when it stops
	scsi: hpsa: update check for logical volume status
	scsi: hpsa: limit outstanding rescans
	scsi: hpsa: do not timeout reset operations
	fjes: Fix wrong netdevice feature flags
	drm/radeon/si: add dpm quirk for Oland
	Drivers: hv: util: move waiting for release to hv_utils_transport itself
	iwlwifi: mvm: cleanup pending frames in DQA mode
	sched/deadline: Add missing update_rq_clock() in dl_task_timer()
	sched/deadline: Make sure the replenishment timer fires in the next period
	sched/deadline: Throttle a constrained deadline task activated after the deadline
	sched/deadline: Use deadline instead of period when calculating overflow
	mmc: mediatek: Fixed bug where clock frequency could be set wrong
	drm/radeon: reinstate oland workaround for sclk
	afs: Fix missing put_page()
	afs: Populate group ID from vnode status
	afs: Adjust mode bits processing
	afs: Deal with an empty callback array
	afs: Flush outstanding writes when an fd is closed
	afs: Migrate vlocation fields to 64-bit
	afs: Prevent callback expiry timer overflow
	afs: Fix the maths in afs_fs_store_data()
	afs: Invalid op ID should abort with RXGEN_OPCODE
	afs: Better abort and net error handling
	afs: Populate and use client modification time
	afs: Fix page leak in afs_write_begin()
	afs: Fix afs_kill_pages()
	afs: Fix abort on signal while waiting for call completion
	nvme-loop: fix a possible use-after-free when destroying the admin queue
	nvmet: confirm sq percpu has scheduled and switched to atomic
	nvmet-rdma: Fix a possible uninitialized variable dereference
	net/mlx4_core: Avoid delays during VF driver device shutdown
	net: mpls: Fix nexthop alive tracking on down events
	rxrpc: Ignore BUSY packets on old calls
	tty: don't panic on OOM in tty_set_ldisc()
	tty: fix data race in tty_ldisc_ref_wait()
	perf symbols: Fix symbols__fixup_end heuristic for corner cases
	efi/esrt: Cleanup bad memory map log messages
	NFSv4.1 respect server's max size in CREATE_SESSION
	btrfs: add missing memset while reading compressed inline extents
	target: Use system workqueue for ALUA transitions
	target: fix ALUA transition timeout handling
	target: fix race during implicit transition work flushes
	Revert "x86/acpi: Set persistent cpuid <-> nodeid mapping when booting"
	HID: cp2112: fix broken gpio_direction_input callback
	sfc: don't warn on successful change of MAC
	fbdev: controlfb: Add missing modes to fix out of bounds access
	video: udlfb: Fix read EDID timeout
	video: fbdev: au1200fb: Release some resources if a memory allocation fails
	video: fbdev: au1200fb: Return an error code if a memory allocation fails
	rtc: pcf8563: fix output clock rate
	ASoC: Intel: Skylake: Fix uuid_module memory leak in failure case
	dmaengine: ti-dma-crossbar: Correct am335x/am43xx mux value type
	PCI/PME: Handle invalid data when reading Root Status
	powerpc/powernv/cpufreq: Fix the frequency read by /proc/cpuinfo
	PCI: Do not allocate more buses than available in parent
	iommu/mediatek: Fix driver name
	netfilter: ipvs: Fix inappropriate output of procfs
	powerpc/opal: Fix EBUSY bug in acquiring tokens
	powerpc/ipic: Fix status get and status clear
	platform/x86: intel_punit_ipc: Fix resource ioremap warning
	target/iscsi: Fix a race condition in iscsit_add_reject_from_cmd()
	iscsi-target: fix memory leak in lio_target_tiqn_addtpg()
	target:fix condition return in core_pr_dump_initiator_port()
	target/file: Do not return error for UNMAP if length is zero
	badblocks: fix wrong return value in badblocks_set if badblocks are disabled
	iommu/amd: Limit the IOVA page range to the specified addresses
	xfs: truncate pagecache before writeback in xfs_setattr_size()
	arm-ccn: perf: Prevent module unload while PMU is in use
	crypto: tcrypt - fix buffer lengths in test_aead_speed()
	mm: Handle 0 flags in _calc_vm_trans() macro
	clk: mediatek: add the option for determining PLL source clock
	clk: imx6: refine hdmi_isfr's parent to make HDMI work on i.MX6 SoCs w/o VPU
	clk: hi6220: mark clock cs_atb_syspll as critical
	clk: tegra: Fix cclk_lp divisor register
	ppp: Destroy the mutex when cleanup
	ASoC: rsnd: rsnd_ssi_run_mods() needs to care ssi_parent_mod
	thermal/drivers/step_wise: Fix temperature regulation misbehavior
	scsi: scsi_debug: write_same: fix error report
	GFS2: Take inode off order_write list when setting jdata flag
	bcache: explicitly destroy mutex while exiting
	bcache: fix wrong cache_misses statistics
	Ib/hfi1: Return actual operational VLs in port info query
	arm64: prevent regressions in compressed kernel image size when upgrading to binutils 2.27
	btrfs: tests: Fix a memory leak in error handling path in 'run_test()'
	platform/x86: hp_accel: Add quirk for HP ProBook 440 G4
	nvme: use kref_get_unless_zero in nvme_find_get_ns
	l2tp: cleanup l2tp_tunnel_delete calls
	xfs: fix log block underflow during recovery cycle verification
	xfs: fix incorrect extent state in xfs_bmap_add_extent_unwritten_real
	RDMA/cxgb4: Declare stag as __be32
	PCI: Detach driver before procfs & sysfs teardown on device remove
	scsi: hpsa: cleanup sas_phy structures in sysfs when unloading
	scsi: hpsa: destroy sas transport properties before scsi_host
	powerpc/perf/hv-24x7: Fix incorrect comparison in memord
	soc: mediatek: pwrap: fix compiler errors
	tty fix oops when rmmod 8250
	usb: musb: da8xx: fix babble condition handling
	pinctrl: adi2: Fix Kconfig build problem
	raid5: Set R5_Expanded on parity devices as well as data.
	scsi: scsi_devinfo: Add REPORTLUN2 to EMC SYMMETRIX blacklist entry
	IB/core: Fix calculation of maximum RoCE MTU
	vt6655: Fix a possible sleep-in-atomic bug in vt6655_suspend
	rtl8188eu: Fix a possible sleep-in-atomic bug in rtw_createbss_cmd
	rtl8188eu: Fix a possible sleep-in-atomic bug in rtw_disassoc_cmd
	scsi: sd: change manage_start_stop to bool in sysfs interface
	scsi: sd: change allow_restart to bool in sysfs interface
	scsi: bfa: integer overflow in debugfs
	udf: Avoid overflow when session starts at large offset
	macvlan: Only deliver one copy of the frame to the macvlan interface
	RDMA/cma: Avoid triggering undefined behavior
	IB/ipoib: Grab rtnl lock on heavy flush when calling ndo_open/stop
	icmp: don't fail on fragment reassembly time exceeded
	ath9k: fix tx99 potential info leak
	Linux 4.9.71

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2017-12-20 10:51:15 +01:00
Alexander Potapenko
ae0ebdba96 net: initialize msg.msg_flags in recvfrom
[ Upstream commit 9f138fa609c47403374a862a08a41394be53d461 ]

KMSAN reports a use of uninitialized memory in put_cmsg() because
msg.msg_flags in recvfrom hasn't been initialized properly.
The flag values don't affect the result on this path, but it's still a
good idea to initialize them explicitly.

Signed-off-by: Alexander Potapenko <glider@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-20 10:07:18 +01:00
Tobias Klauser
b68cf9315f UPSTREAM: net: socket: Make unnecessarily global sockfs_setattr() static
Make sockfs_setattr() static as it is not used outside of net/socket.c

This fixes the following GCC warning:
net/socket.c:534:5: warning: no previous prototype for ‘sockfs_setattr’ [-Wmissing-prototypes]

Fixes: 86741ec25462 ("net: core: Add a UID field to struct sock.")
Cc: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Acked-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Fixes: android-4.9 commitID 81a159106e
       ("UPSTREAM: net: core: Add a UID field to struct sock.")
(cherry picked from commit dc647ec88e029307e60e6bf9988056605f11051a)
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
2017-04-14 18:42:27 +05:30
Todd Kjos
a6e00ae888 Merge branch 'upstream-linux-4.9.y' into android-4.9 2017-03-02 13:51:26 -08:00
Maxime Jayat
1a0e2594ef net: socket: fix recvmmsg not returning error from sock_error
[ Upstream commit e623a9e9dec29ae811d11f83d0074ba254aba374 ]

Commit 34b88a68f2 ("net: Fix use after free in the recvmmsg exit path"),
changed the exit path of recvmmsg to always return the datagrams
variable and modified the error paths to set the variable to the error
code returned by recvmsg if necessary.

However, when sock_error returned an error, that error code was
ignored and recvmmsg returned 0.

Change the error path of recvmmsg to correctly return the error code
of sock_error.

The bug was triggered by using recvmmsg on a CAN interface which was
not up. Linux 4.6 and later return 0 in this case while earlier
releases returned -ENETDOWN.
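
The corrected exit path can be sketched as follows. This is a simplified
userspace model, not the kernel function: demo_recvmmsg() and its two
parameters are hypothetical stand-ins for the pending socket error and for the
number of datagrams the receive loop would otherwise deliver.

```c
#include <assert.h>
#include <errno.h>

/*
 * Model of the single exit path: the function always returns "datagrams",
 * so a pending socket error has to be stored in that variable before
 * jumping to the exit label.
 */
int demo_recvmmsg(int pending_sock_error, int available)
{
    int datagrams = 0;

    if (pending_sock_error) {
        /* Without this assignment the caller would see 0, not the error. */
        datagrams = pending_sock_error;
        goto out;
    }

    /* Stand-in for the receive loop. */
    datagrams = available;
out:
    return datagrams;
}
```

With a pending -ENETDOWN the model returns -ENETDOWN rather than 0, which is
the behaviour the commit restores.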

Fixes: 34b88a68f2 ("net: Fix use after free in the recvmmsg exit path")
Signed-off-by: Maxime Jayat <maxime.jayat@mobile-devices.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-26 11:10:51 +01:00
Eric Biggers
1452d39a09 UPSTREAM: net: socket: don't set sk_uid to garbage value in ->setattr()
->setattr() was recently implemented for socket files to sync the socket
inode's uid to the new 'sk_uid' member of struct sock.  It does this by
copying over the ia_uid member of struct iattr.  However, ia_uid is
actually only valid when ATTR_UID is set in ia_valid, indicating that
the uid is being changed, e.g. by chown.  Other metadata operations such
as chmod or utimes leave ia_uid uninitialized.  Therefore, sk_uid could
be set to a "garbage" value from the stack.

Fix this by only copying the uid over when ATTR_UID is set.
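
A minimal sketch of that check, using stand-in types rather than the kernel's
struct iattr and struct sock. The DEMO_ATTR_* flags, both structs, and
demo_sync_uid() are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_ATTR_MODE (1u << 0) /* stand-in for a "mode is valid" flag */
#define DEMO_ATTR_UID  (1u << 1) /* stand-in for a "uid is valid" flag */

struct demo_iattr {
    unsigned int ia_valid; /* which of the fields below are meaningful */
    uint32_t ia_uid;       /* only valid when DEMO_ATTR_UID is set */
};

struct demo_sock {
    uint32_t sk_uid;
};

/* Copy the uid only when the caller marked it as valid. */
void demo_sync_uid(struct demo_sock *sk, const struct demo_iattr *attr)
{
    if (attr->ia_valid & DEMO_ATTR_UID)
        sk->sk_uid = attr->ia_uid;
}
```

A chmod-style call that sets only DEMO_ATTR_MODE then leaves sk_uid untouched,
whatever happens to be in ia_uid.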

Change-Id: I1efd83bd955325b33be3d4addccf5bac8ec803db
Fixes: 86741ec25462 ("net: core: Add a UID field to struct sock.")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Tested-by: Lorenzo Colitti <lorenzo@google.com>
Acked-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-31 10:45:51 -08:00
Lorenzo Colitti
81a159106e UPSTREAM: net: core: Add a UID field to struct sock.
Protocol sockets (struct sock) don't have UIDs, but most of the
time, they map 1:1 to userspace sockets (struct socket) which do.

Various operations such as the iptables xt_owner match need
access to the "UID of a socket", and do so by following the
backpointer to the struct socket. This involves taking
sk_callback_lock and doesn't work when there is no socket
because userspace has already called close().

Simplify this by adding a sk_uid field to struct sock whose value
matches the UID of the corresponding struct socket. The semantics
are as follows:

1. Whenever sk_socket is non-null: sk_uid is the same as the UID
   in sk_socket, i.e., matches the return value of sock_i_uid.
   Specifically, the UID is set when userspace calls socket(),
   fchown(), or accept().
2. When sk_socket is NULL, sk_uid is defined as follows:
   - For a socket that no longer has a sk_socket because
     userspace has called close(): the previous UID.
   - For a cloned socket (e.g., an incoming connection that is
     established but on which userspace has not yet called
     accept): the UID of the socket it was cloned from.
   - For a socket that has never had an sk_socket: UID 0 inside
     the user namespace corresponding to the network namespace
     the socket belongs to.

Kernel sockets created by sock_create_kern are a special case
of #1 and sk_uid is the user that created them. For kernel
sockets created at network namespace creation time, such as the
per-processor ICMP and TCP sockets, this is the user that created
the network namespace.

Change-Id: Id890c6ea724b6929cc543a474ab37ec2d9e3f815
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-27 13:55:58 -08:00
Andreas Gruenbacher
4a59015372 xattr: Fix setting security xattrs on sockfs
The IOP_XATTR flag is set on sockfs because sockfs supports getting the
"system.sockprotoname" xattr.  Since commit 6c6ef9f2, this flag is checked for
setxattr support as well.  This is wrong on sockfs because security xattr
support there is supposed to be provided by security_inode_setsecurity.  The
smack security module relies on socket labels (xattrs).

Fix this by adding a security xattr handler on sockfs that returns
-EAGAIN, and by checking for -EAGAIN in setxattr.

We cannot simply check for -EOPNOTSUPP in setxattr because there are
filesystems that neither have direct security xattr support nor support
via security_inode_setsecurity.  A more proper fix might be to move the
call to security_inode_setsecurity into sockfs, but it's not clear to me
if that is safe: we would end up calling security_inode_post_setxattr after
that as well.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-11-17 00:00:23 -05:00
Soheil Hassas Yeganeh
3023898b7d sock: fix sendmmsg for partial sendmsg
Do not send the next message in sendmmsg for partial sendmsg
invocations.

sendmmsg assumes that it can continue sending the next message
when the return value of the individual sendmsg invocations
is positive. It results in corrupting the data for TCP,
SCTP, and UNIX streams.

For example, sendmmsg([["abcd"], ["efgh"]]) can result in a stream
of "aefgh" if the first sendmsg invocation sends only the first
byte while the second sendmsg goes through.

Datagram sockets either send the entire datagram or fail, so
this patch affects only sockets of type SOCK_STREAM and
SOCK_SEQPACKET.
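
The loop behaviour described above can be modelled in userspace as a sketch.
Everything here is hypothetical scaffolding (struct demo_msg, demo_send_fn,
demo_send_budget(), demo_sendmmsg()); the point is only that the batch stops
as soon as one message is sent partially.

```c
#include <assert.h>
#include <stddef.h>

struct demo_msg {
    const char *data;
    size_t len;
};

/* Returns the number of bytes accepted, or a negative error. */
typedef long (*demo_send_fn)(const struct demo_msg *msg, void *ctx);

/* Stub transport: accepts bytes until the budget pointed to by ctx runs out. */
long demo_send_budget(const struct demo_msg *msg, void *ctx)
{
    size_t *budget = ctx;
    size_t n = msg->len < *budget ? msg->len : *budget;

    *budget -= n;
    return (long)n;
}

/* Send messages in order; stop after the first partial send. */
int demo_sendmmsg(const struct demo_msg *msgs, int count,
                  demo_send_fn send_one, void *ctx, size_t *sent)
{
    int done = 0;

    while (done < count) {
        long n = send_one(&msgs[done], ctx);
        int partial;

        if (n < 0)
            return done ? done : (int)n;

        sent[done] = (size_t)n;
        partial = (size_t)n < msgs[done].len;
        done++;

        /* Continuing here would interleave "a" with "efgh" on a stream. */
        if (partial)
            break;
    }
    return done;
}
```

With a one-byte budget and the two messages "abcd" and "efgh", the model
reports one message processed and never touches the second.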

Fixes: 228e548e60 ("net: Add sendmmsg socket system call")
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09 13:18:12 -05:00
Andreas Gruenbacher
fd50ecaddf vfs: Remove {get,set,remove}xattr inode operations
These inode operations are no longer used; remove them.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-10-07 21:48:36 -04:00
Andreas Gruenbacher
bba0bd31b1 sockfs: Get rid of getxattr iop
If we allow pseudo-filesystems created with mount_pseudo to have xattr
handlers, we can replace sockfs_getxattr with a sockfs_xattr_get handler
to use the xattr handler name parsing.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-10-06 22:17:38 -04:00
Andreas Gruenbacher
971df15bd5 sockfs: getxattr: Fail with -EOPNOTSUPP for invalid attribute names
The standard return value for unsupported attribute names is
-EOPNOTSUPP, as opposed to undefined but supported attributes
(-ENODATA).

Also, fail for attribute names like "system.sockprotonameXXX" and
simplify the code a bit.
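
As a sketch of the resulting name check: an exact string comparison rejects
both unknown names and names that merely start with the supported one.
demo_getxattr() and DEMO_XATTR_NAME are hypothetical; the protocol name is
passed in directly instead of being read from a dentry.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define DEMO_XATTR_NAME "system.sockprotoname"

/*
 * Return the value length (including the trailing NUL) for the one
 * supported attribute, -ERANGE if the buffer is too small, and
 * -EOPNOTSUPP for any other name.
 */
long demo_getxattr(const char *name, const char *proto,
                   char *buf, size_t size)
{
    size_t len;

    /* strcmp() is exact, so "system.sockprotonameXXX" does not match. */
    if (strcmp(name, DEMO_XATTR_NAME) != 0)
        return -EOPNOTSUPP;

    len = strlen(proto) + 1;
    if (buf) {
        if (len > size)
            return -ERANGE;
        memcpy(buf, proto, len);
    }
    return (long)len;
}
```

Passing a NULL buffer acts as a size query, mirroring how getxattr callers
commonly probe for the required length.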

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-10-06 22:17:38 -04:00
Deepa Dinamani
766b9f928b fs: poll/select/recvmmsg: use timespec64 for timeout events
struct timespec is not y2038 safe.  Even though timespec might be
sufficient to represent timeouts, use struct timespec64 here as the plan
is to get rid of all timespec references in the kernel.

The patch transitions the common functions: poll_select_set_timeout()
and select_estimate_accuracy() to use timespec64.  And, all the syscalls
that use these functions are transitioned in the same patch.

The restart block parameters for poll use monotonic time.  Use
timespec64 here as well to assign timeout value.  This parameter in the
restart block need not change because this only holds the monotonic
timestamp at which timeout should occur.  And, unsigned long data type
should be big enough for this timestamp.

The system call interfaces will be handled in a separate series.

Compat interfaces need not change as timespec64 is an alias to struct
timespec on a 64 bit system.

Link: http://lkml.kernel.org/r/1461947989-21926-3-git-send-email-deepa.kernel@gmail.com
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Acked-by: John Stultz <john.stultz@linaro.org>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-19 19:12:14 -07:00
Linus Torvalds
a7fd20d1c4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "Highlights:

   1) Support SPI based w5100 devices, from Akinobu Mita.

   2) Partial Segmentation Offload, from Alexander Duyck.

   3) Add GMAC4 support to stmmac driver, from Alexandre TORGUE.

   4) Allow cls_flower stats offload, from Amir Vadai.

   5) Implement bpf blinding, from Daniel Borkmann.

   6) Optimize _ASYNC_ bit twiddling on sockets, unless the socket is
      actually using FASYNC these atomics are superfluous.  From Eric
      Dumazet.

   7) Run TCP more preemptibly, also from Eric Dumazet.

   8) Support LED blinking, EEPROM dumps, and rxvlan offloading in mlx5e
      driver, from Gal Pressman.

   9) Allow creating ppp devices via rtnetlink, from Guillaume Nault.

  10) Improve BPF usage documentation, from Jesper Dangaard Brouer.

  11) Support tunneling offloads in qed, from Manish Chopra.

  12) aRFS offloading in mlx5e, from Maor Gottlieb.

  13) Add RFS and RPS support to SCTP protocol, from Marcelo Ricardo
      Leitner.

  14) Add MSG_EOR support to TCP, this allows controlling packet
      coalescing on application record boundaries for more accurate
      socket timestamp sampling.  From Martin KaFai Lau.

  15) Fix alignment of 64-bit netlink attributes across the board, from
      Nicolas Dichtel.

  16) Per-vlan stats in bridging, from Nikolay Aleksandrov.

  17) Several conversions of drivers to ethtool ksettings, from Philippe
      Reynes.

  18) Checksum neutral ILA in ipv6, from Tom Herbert.

  19) Factorize all of the various marvell dsa drivers into one, from
      Vivien Didelot

  20) Add VF support to qed driver, from Yuval Mintz"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1649 commits)
  Revert "phy dp83867: Fix compilation with CONFIG_OF_MDIO=m"
  Revert "phy dp83867: Make rgmii parameters optional"
  r8169: default to 64-bit DMA on recent PCIe chips
  phy dp83867: Make rgmii parameters optional
  phy dp83867: Fix compilation with CONFIG_OF_MDIO=m
  bpf: arm64: remove callee-save registers use for tmp registers
  asix: Fix offset calculation in asix_rx_fixup() causing slow transmissions
  switchdev: pass pointer to fib_info instead of copy
  net_sched: close another race condition in tcf_mirred_release()
  tipc: fix nametable publication field in nl compat
  drivers: net: Don't print unpopulated net_device name
  qed: add support for dcbx.
  ravb: Add missing free_irq() calls to ravb_close()
  qed: Remove a stray tab
  net: ethernet: fec-mpc52xx: use phy_ethtool_{get|set}_link_ksettings
  net: ethernet: fec-mpc52xx: use phydev from struct net_device
  bpf, doc: fix typo on bpf_asm descriptions
  stmmac: hardware TX COE doesn't work when force_thresh_dma_mode is set
  net: ethernet: fs-enet: use phy_ethtool_{get|set}_link_ksettings
  net: ethernet: fs-enet: use phydev from struct net_device
  ...
2016-05-17 16:26:30 -07:00
Soheil Hassas Yeganeh
0a2cf20c3f tcp: remove SKBTX_ACK_TSTAMP since it is redundant
The SKBTX_ACK_TSTAMP flag is set in skb_shinfo->tx_flags when
the timestamp of the TCP acknowledgement should be reported on
error queue. Since accessing skb_shinfo is likely to incur a
cache-line miss at the time of receiving the ack, the
txstamp_ack bit was added in tcp_skb_cb, which is set iff
the SKBTX_ACK_TSTAMP flag is set for an skb. This makes
SKBTX_ACK_TSTAMP flag redundant.

Remove the SKBTX_ACK_TSTAMP and instead use the txstamp_ack bit
everywhere.

Note that this frees one bit in shinfo->tx_flags.

Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Suggested-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28 16:06:10 -04:00
David S. Miller
6c61403dae Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
2016-04-14 00:39:15 -04:00
Al Viro
ce23e64013 ->getxattr(): pass dentry and inode as separate arguments
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-04-11 00:48:00 -04:00
Hannes Frederic Sowa
1e1d04e678 net: introduce lockdep_is_held and update various places to use it
The socket is locked either if we hold the slock spin_lock (for
lock_sock_fast and unlock_sock_fast) or if we own the lock (sk_lock.owned
!= 0). Check for this and at the same time verify that the current
thread/cpu is really holding the lock.

Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07 16:44:14 -04:00
Soheil Hassas Yeganeh
c14ac9451c sock: enable timestamping using control messages
Currently, SO_TIMESTAMPING can only be enabled using setsockopt.
This is very costly when users want to sample writes to gather
tx timestamps.

Add support for enabling SO_TIMESTAMPING via control messages by
using tsflags added in `struct sockcm_cookie` (added in the previous
patches in this series) to set the tx_flags of the last skb created in
a sendmsg. With this patch, the timestamp recording bits in tx_flags
of the skbuff are overridden if SO_TIMESTAMPING is passed in a cmsg.

Please note that this is only effective for overriding the timestamp
recording flags. Users should enable timestamp reporting (e.g.,
SOF_TIMESTAMPING_SOFTWARE | SOF_TIMESTAMPING_OPT_ID) using
socket options and then should ask for SOF_TIMESTAMPING_TX_*
using control messages per sendmsg to sample timestamps for each
write.

Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-04 15:50:30 -04:00
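The split described in c14ac9451c (reporting enabled once via setsockopt, recording requested per sendmsg via a control message) can be sketched from userspace roughly as follows. This is a minimal illustration, not code from the patch: the helper names enable_tstamp_reporting and prepare_tx_tstamp_msg are hypothetical, and it assumes a Linux system whose headers provide SO_TIMESTAMPING and the SOF_TIMESTAMPING_* flags.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <linux/net_tstamp.h>

/* Hypothetical helper: enable timestamp *reporting* once per socket. */
int enable_tstamp_reporting(int fd)
{
    uint32_t val = SOF_TIMESTAMPING_SOFTWARE | SOF_TIMESTAMPING_OPT_ID;

    return setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &val, sizeof(val));
}

/* Hypothetical helper: fill `msg` so that a single sendmsg() call carries
 * an SO_TIMESTAMPING control message asking for the recording flags in
 * `tx_flags` (e.g. SOF_TIMESTAMPING_TX_SOFTWARE).
 * `ctrl` must be suitably aligned for struct cmsghdr and at least
 * CMSG_SPACE(sizeof(uint32_t)) bytes long. */
void prepare_tx_tstamp_msg(struct msghdr *msg, struct iovec *iov,
                           void *ctrl, size_t ctrl_len, uint32_t tx_flags)
{
    struct cmsghdr *cmsg;

    assert(ctrl_len >= CMSG_SPACE(sizeof(uint32_t)));
    memset(ctrl, 0, ctrl_len);
    memset(msg, 0, sizeof(*msg));
    msg->msg_iov = iov;
    msg->msg_iovlen = 1;
    msg->msg_control = ctrl;
    msg->msg_controllen = CMSG_SPACE(sizeof(uint32_t));

    cmsg = CMSG_FIRSTHDR(msg);
    assert(cmsg != NULL);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SO_TIMESTAMPING;
    cmsg->cmsg_len = CMSG_LEN(sizeof(uint32_t));
    memcpy(CMSG_DATA(cmsg), &tx_flags, sizeof(tx_flags));
}
```

A caller would run enable_tstamp_reporting() once, then pass the prepared msghdr to sendmsg() only for the writes it wants to sample; the timestamps themselves are read back from the socket error queue.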
Al Viro
2da62906b1 [net] drop 'size' argument of sock_recvmsg()
all callers have it equal to msg_data_left(msg).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-28 13:57:51 -04:00
Arnaldo Carvalho de Melo
34b88a68f2 net: Fix use after free in the recvmmsg exit path
The syzkaller fuzzer hit the following use-after-free:

  Call Trace:
   [<ffffffff8175ea0e>] __asan_report_load8_noabort+0x3e/0x40 mm/kasan/report.c:295
   [<ffffffff851cc31a>] __sys_recvmmsg+0x6fa/0x7f0 net/socket.c:2261
   [<     inline     >] SYSC_recvmmsg net/socket.c:2281
   [<ffffffff851cc57f>] SyS_recvmmsg+0x16f/0x180 net/socket.c:2270
   [<ffffffff86332bb6>] entry_SYSCALL_64_fastpath+0x16/0x7a
  arch/x86/entry/entry_64.S:185

And, as Dmitry rightly assessed, that is because we can drop the
reference and then touch it when the underlying recvmsg calls return
some packets and then hit an error, which will make recvmmsg set
sock->sk->sk_err, oops, fix it.

Reported-and-Tested-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Kostya Serebryany <kcc@google.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Fixes: a2e2725541 ("net: Introduce recvmmsg socket syscall")
http://lkml.kernel.org/r/20160122211644.GC2470@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-14 12:41:49 -04:00
liping.zhang
f3c986908c net: socket: use pr_info_once to tip the obsolete usage of PF_PACKET
There is no need to use the static variable here, pr_info_once is more
concise.

Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-13 22:37:50 -04:00
Tom Herbert
f092276d85 net: Add MSG_BATCH flag
Add a new msg flag called MSG_BATCH. This flag is used in sendmsg to
indicate that more messages will follow (i.e. a batch of messages is
being sent). This is similar to MSG_MORE except that the following
messages are not merged into one packet, they are sent individually.
sendmmsg is updated so that each contained message except for the
last one is marked as MSG_BATCH.

MSG_BATCH is a performance optimization in cases where a socket
implementation can benefit by transmitting packets in a batch.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09 16:36:13 -05:00
Tom Herbert
28a94d8fb3 net: Allow MSG_EOR in each msghdr of sendmmsg
This patch allows setting MSG_EOR in each individual msghdr passed
in sendmmsg. This allows a sendmmsg to send multiple messages when
using SOCK_SEQPACKET.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09 16:36:13 -05:00
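As a rough userspace illustration of 28a94d8fb3, the sketch below sets MSG_EOR in the msg_flags of each msghdr handed to sendmmsg(), so that one system call emits several records. The helper name send_records and the 8-message cap are assumptions made for the example. MSG_BATCH from f092276d85 is applied by the kernel inside sendmmsg and is not set by the caller here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

#define MAX_RECORDS 8 /* arbitrary cap for this sketch */

/* Hypothetical helper: send `n` NUL-terminated strings as `n` separate
 * records with a single sendmmsg() call.
 * Returns the number of messages sent, or -1 on error. */
int send_records(int fd, const char *const *records, unsigned int n)
{
    struct mmsghdr msgs[MAX_RECORDS];
    struct iovec iovs[MAX_RECORDS];

    assert(n <= MAX_RECORDS);
    memset(msgs, 0, sizeof(msgs));
    for (unsigned int i = 0; i < n; i++) {
        iovs[i].iov_base = (void *)records[i];
        iovs[i].iov_len = strlen(records[i]);
        msgs[i].msg_hdr.msg_iov = &iovs[i];
        msgs[i].msg_hdr.msg_iovlen = 1;
        /* Mark the end of a record in every message of the batch. */
        msgs[i].msg_hdr.msg_flags = MSG_EOR;
    }
    return sendmmsg(fd, msgs, n, 0);
}
```

On a SOCK_SEQPACKET socket each element of `records` then arrives as its own record, so the receiver needs one recv() per record.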
Tom Herbert
f4a00aacdb net: Make sock_alloc exportable
Export it for cases where we want to create sockets by hand.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09 16:36:13 -05:00
Vladimir Davydov
5d097056c9 kmemcg: account certain kmem allocations to memcg
Mark those kmem allocations that are known to be easily triggered from
userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to
memcg.  For the list, see below:

 - threadinfo
 - task_struct
 - task_delay_info
 - pid
 - cred
 - mm_struct
 - vm_area_struct and vm_region (nommu)
 - anon_vma and anon_vma_chain
 - signal_struct
 - sighand_struct
 - fs_struct
 - files_struct
 - fdtable and fdtable->full_fds_bits
 - dentry and external_name
 - inode for all filesystems. This is the most tedious part, because
   most filesystems overwrite the alloc_inode method.

The list is far from complete, so feel free to add more objects.
Nevertheless, it should be close to "account everything" approach and
keep most workloads within bounds.  Malevolent users will be able to
breach the limit, but this was possible even with the former "account
everything" approach (simply because it did not account everything in
fact).

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Greg Thelen <gthelen@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-14 16:00:49 -08:00
Eric Dumazet
a78cb84c62 net: add scheduling point in recvmmsg/sendmmsg
Applications often have to reduce the number of datagrams
they receive or send per system call to avoid starvation problems.

Really the kernel should take care of this by using cond_resched(),
so that applications can experiment with bigger batch sizes.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-10 22:56:29 -05:00
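For context on the batch sizes that a78cb84c62 refers to, here is a minimal userspace sketch of draining several queued datagrams with one recvmmsg() call. The helper name recv_batch and the BATCH and DGRAM_MAX constants are illustrative assumptions; the batch size is the knob applications tune per system call.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

#define BATCH 8       /* datagrams requested per system call (illustrative) */
#define DGRAM_MAX 64  /* per-datagram buffer size (illustrative) */

/* Hypothetical helper: receive up to BATCH already-queued datagrams in one
 * system call without blocking. lens[i] is set to the length of datagram i.
 * Returns the number of datagrams received, or -1 on error. */
int recv_batch(int fd, char bufs[BATCH][DGRAM_MAX], unsigned int lens[BATCH])
{
    struct mmsghdr msgs[BATCH];
    struct iovec iovs[BATCH];
    int n;

    assert(bufs != NULL && lens != NULL);
    memset(msgs, 0, sizeof(msgs));
    for (int i = 0; i < BATCH; i++) {
        iovs[i].iov_base = bufs[i];
        iovs[i].iov_len = DGRAM_MAX;
        msgs[i].msg_hdr.msg_iov = &iovs[i];
        msgs[i].msg_hdr.msg_iovlen = 1;
    }
    n = recvmmsg(fd, msgs, BATCH, MSG_DONTWAIT, NULL);
    for (int i = 0; i < n; i++)
        lens[i] = msgs[i].msg_len;
    return n;
}
```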
Nicolai Stange
574aab1e02 net, socket, socket_wq: fix missing initialization of flags
Commit ceb5d58b21 ("net: fix sock_wake_async() rcu protection") from
the current 4.4 release cycle introduced a new flags member in
struct socket_wq and moved SOCKWQ_ASYNC_NOSPACE and SOCKWQ_ASYNC_WAITDATA
from struct socket's flags member into that new place.

Unfortunately, the new flags field is never initialized properly, at least
not for the struct socket_wq instance created in sock_alloc_inode().

One particular issue I encountered because of this is that my GNU Emacs
failed to draw anything on my desktop -- i.e. what I got is a transparent
window, including the title bar. Bisection led to the commit mentioned
above and further investigation by means of strace told me that Emacs
is indeed speaking to my Xorg through an O_ASYNC AF_UNIX socket. This is
reproducible 100% of the time and the fact that properly initializing the
struct socket_wq ->flags fixes the issue leads me to the conclusion that
somehow SOCKWQ_ASYNC_WAITDATA got set in the uninitialized ->flags,
preventing my Emacs from receiving any SIGIO's due to data becoming
available and it got stuck.

Make sock_alloc_inode() set the newly created struct socket_wq's ->flags
member to zero.

Fixes: ceb5d58b21 ("net: fix sock_wake_async() rcu protection")
Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-30 16:38:01 -05:00
tadeusz.struk@intel.com
130ed5d105 net: fix uninitialized variable issue
msg_iocb needs to be initialized on the recv/recvfrom path.
Otherwise afalg will wrongly interpret it as an async call.

Cc: stable@vger.kernel.org
Reported-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 15:46:48 -05:00
Eric Dumazet
ceb5d58b21 net: fix sock_wake_async() rcu protection
Dmitry provided a syzkaller (http://github.com/google/syzkaller)
triggering a fault in sock_wake_async() when async IO is requested.

Said program stressed af_unix sockets, but the issue is generic
and should be addressed in core networking stack.

The problem is that by the time sock_wake_async() is called,
we should not access the @flags field of 'struct socket',
as the inode containing this socket might be freed without
further notice, and without RCU grace period.

We already maintain an RCU protected structure, "struct socket_wq"
so moving SOCKWQ_ASYNC_NOSPACE & SOCKWQ_ASYNC_WAITDATA into it
is the safe route.

It also reduces number of cache lines needing dirtying, so might
provide a performance improvement anyway.

In followup patches, we might move the remaining flags (SOCK_NOSPACE,
SOCK_PASSCRED, SOCK_PASSSEC) to save 8 bytes, leaving 'struct socket'
mostly read and letting it be shared between cpus.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-01 15:45:05 -05:00
Eric Dumazet
9cd3e072b0 net: rename SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA
This patch is a cleanup to make following patch easier to
review.

Goal is to move SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA
from (struct socket)->flags to a (struct socket_wq)->flags
to benefit from RCU protection in sock_wake_async()

To ease backports, we rename both constants.

Two new helpers, sk_set_bit(int nr, struct sock *sk)
and sk_clear_bit(int nr, struct sock *sk), are added so that the
following patch can change their implementation.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-01 15:45:05 -05:00
Viresh Kumar
b5ffe63442 net: Drop unlikely before IS_ERR(_OR_NULL)
IS_ERR(_OR_NULL) already contains an 'unlikely' compiler hint, so there
is no need to repeat it in its callers. Drop it.

Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2015-09-29 15:15:40 +02:00
Eric W. Biederman
eeb1bd5c40 net: Add a struct net parameter to sock_create_kern
This is long overdue, and is part of cleaning up how we allocate kernel
sockets that don't reference count struct net.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-11 10:50:17 -04:00
Eric W. Biederman
140e807da1 tun: Utilize the normal socket network namespace refcounting.
There is no need for tun to do the weird network namespace refcounting.
The existing network namespace refcounting in tfile has almost exactly
the same lifetime.  So rewrite the code to use the struct sock network
namespace refcounting and remove the unnecessary hand rolled network
namespace refcounting and the unnecessary tfile->net.

This change allows the tun code to directly call sock_put bypassing
sock_release and making SOCK_EXTERNALLY_ALLOCATED unnecessary.

Remove the now unnecessary tun_release so that if anything tries to use
the sock_release code path the kernel will oops, and let us know about
the bug.

The macvtap code already uses its internal socket this way.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-11 10:50:16 -04:00
David Howells
c5ef603528 VFS: net/: d_inode() annotations
socket inodes and sunrpc filesystems - inodes owned by that code

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-15 15:06:56 -04:00
Al Viro
5d5d568975 make new_sync_{read,write}() static
All places outside of core VFS that checked ->read and ->write for being NULL or
called the methods directly are gone now, so NULL {read,write} with non-NULL
{read,write}_iter will do the right thing in all cases.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:29:40 -04:00
Al Viro
01e97e6517 new helper: msg_data_left()
convert open-coded instances

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 15:53:35 -04:00