Add new proto_ops sendmsg_locked and sendpage_locked that can be
called when the socket lock is already held. Correspondingly, add
kernel_sendmsg_locked and kernel_sendpage_locked as front end
functions.
These functions will be used in zero proxy so that we can take
the socket lock in a ULP sendmsg/sendpage and then directly call the
backend transport proto_ops functions.
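As a rough userspace illustration of the locked/unlocked split described above (not the kernel code itself), the sketch below models a socket as a hypothetical struct chan guarded by a pthread mutex. The names chan_send, chan_send_locked and upper_send are invented for this example; they only mirror the relationship between sendmsg, sendmsg_locked and a ULP caller.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a socket: a lock plus a small transmit buffer. */
struct chan {
	pthread_mutex_t lock;
	char buf[64];
	size_t used;
};

/* "_locked" variant: the caller must already hold ch->lock. */
static int chan_send_locked(struct chan *ch, const void *data, size_t len)
{
	assert(ch != NULL && data != NULL);
	if (len > sizeof(ch->buf) - ch->used)
		return -1;	/* no room left */
	memcpy(ch->buf + ch->used, data, len);
	ch->used += len;
	return (int)len;
}

/* Regular variant: takes the lock, then defers to the locked one. */
static int chan_send(struct chan *ch, const void *data, size_t len)
{
	int ret;

	pthread_mutex_lock(&ch->lock);
	ret = chan_send_locked(ch, data, len);
	pthread_mutex_unlock(&ch->lock);
	return ret;
}

/* Upper layer: takes the lock once, then sends header and payload as a unit. */
static int upper_send(struct chan *ch, const void *payload, size_t len)
{
	static const char hdr[2] = { 'H', ':' };
	int ret;

	pthread_mutex_lock(&ch->lock);
	ret = chan_send_locked(ch, hdr, sizeof(hdr));
	if (ret >= 0)
		ret = chan_send_locked(ch, payload, len);
	pthread_mutex_unlock(&ch->lock);
	return ret;
}
```

Calling chan_send() from inside upper_send() would self-deadlock on a non-recursive mutex, which is the situation the _locked entry points are meant to avoid.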
Change-Id: I4a8a6f5234486946ec2870ae22fa8ea561df3af0
Signed-off-by: Tom Herbert <tom@quantonium.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge 4.9.211 into android-4.9-q
Changes in 4.9.211
hidraw: Return EPOLLOUT from hidraw_poll
HID: hidraw: Fix returning EPOLLOUT from hidraw_poll
HID: hidraw, uhid: Always report EPOLLOUT
ethtool: reduce stack usage with clang
fs/select: avoid clang stack usage warning
rsi: add fix for crash during assertions
arm64: mm: BUG on unsupported manipulations of live kernel mappings
arm64: don't open code page table entry creation
arm64: mm: Change page table pointer name in p[md]_set_huge()
arm64: Enforce BBM for huge IO/VMAP mappings
arm64: Make sure permission updates happen for pmd/pud
cfg80211/mac80211: make ieee80211_send_layer2_update a public function
mac80211: Do not send Layer 2 Update frame before authorization
media: usb:zr364xx:Fix KASAN:null-ptr-deref Read in zr364xx_vidioc_querycap
wimax: i2400: fix memory leak
wimax: i2400: Fix memory leak in i2400m_op_rfkill_sw_toggle
ext4: fix use-after-free race with debug_want_extra_isize
ext4: add more paranoia checking in ext4_expand_extra_isize handling
dccp: Fix memleak in __feat_register_sp
rtc: mt6397: fix alarm register overwrite
iommu: Remove device link to group on failure
gpio: Fix error message on out-of-range GPIO in lookup table
hsr: reset network header when supervision frame is created
cifs: Adjust indentation in smb2_open_file
RDMA/srpt: Report the SCSI residual to the initiator
scsi: enclosure: Fix stale device oops with hot replug
scsi: sd: Clear sdkp->protection_type if disk is reformatted without PI
platform/x86: asus-wmi: Fix keyboard brightness cannot be set to 0
iio: imu: adis16480: assign bias value only if operation succeeded
mei: fix modalias documentation
clk: samsung: exynos5420: Preserve CPU clocks configuration during suspend/resume
compat_ioctl: handle SIOCOUTQNSD
PCI/PTM: Remove spurious "d" from granularity message
powerpc/powernv: Disable native PCIe port management
tty: serial: imx: use the sg count from dma_map_sg
tty: serial: pch_uart: correct usage of dma_unmap_sg
media: exynos4-is: Fix recursive locking in isp_video_release()
mtd: spi-nor: fix silent truncation in spi_nor_read()
spi: atmel: fix handling of cs_change set on non-last xfer
rtlwifi: Remove unnecessary NULL check in rtl_regd_init
f2fs: fix potential overflow
rtc: msm6242: Fix reading of 10-hour digit
gpio: mpc8xxx: Add platform device to gpiochip->parent
scsi: libcxgbi: fix NULL pointer dereference in cxgbi_device_destroy()
rseq/selftests: Turn off timeout setting
MIPS: Prevent link failure with kcov instrumentation
ioat: ioat_alloc_ring() failure handling.
hexagon: parenthesize registers in asm predicates
hexagon: work around compiler crash
ocfs2: call journal flush to mark journal as empty after journal recovery when mount
dt-bindings: reset: meson8b: fix duplicate reset IDs
clk: Don't try to enable critical clocks if prepare failed
ALSA: seq: Fix racy access for queue timer in proc read
Fix built-in early-load Intel microcode alignment
block: fix an integer overflow in logical block size
iio: buffer: align the size of scan bytes to size of the largest element
USB: serial: simple: Add Motorola Solutions TETRA MTP3xxx and MTP85xx
USB: serial: opticon: fix control-message timeouts
USB: serial: suppress driver bind attributes
USB: serial: ch341: handle unbound port at reset_resume
USB: serial: io_edgeport: add missing active-port sanity check
USB: serial: quatech2: handle unbound ports
scsi: mptfusion: Fix double fetch bug in ioctl
usb: core: hub: Improved device recognition on remote wakeup
x86/efistub: Disable paging at mixed mode entry
perf hists: Fix variable name's inconsistency in hists__for_each() macro
perf report: Fix incorrectly added dimensions as switch perf data file
mm/page-writeback.c: avoid potential division by zero in wb_min_max_ratio()
net: stmmac: 16KB buffer must be 16 byte aligned
net: stmmac: Enable 16KB buffer size
USB: serial: io_edgeport: use irqsave() in USB's complete callback
USB: serial: io_edgeport: handle unbound ports on URB completion
USB: serial: keyspan: handle unbound ports
scsi: fnic: use kernel's '%pM' format option to print MAC
scsi: fnic: fix invalid stack access
arm64: dts: agilex/stratix10: fix pmu interrupt numbers
cfg80211: fix page refcount issue in A-MSDU decap
netfilter: fix a use-after-free in mtype_destroy()
netfilter: arp_tables: init netns pointer in xt_tgdtor_param struct
batman-adv: Fix DAT candidate selection on little endian systems
macvlan: use skb_reset_mac_header() in macvlan_queue_xmit()
net: dsa: tag_qca: fix doubled Tx statistics
net/wan/fsl_ucc_hdlc: fix out of bounds write on array utdm_info
r8152: add missing endpoint sanity check
tcp: fix marked lost packets not being retransmitted
net: usb: lan78xx: limit size of local TSO packets
xen/blkfront: Adjust indentation in xlvbd_alloc_gendisk
cw1200: Fix a signedness bug in cw1200_load_firmware()
cfg80211: check for set_wiphy_params
reiserfs: fix handling of -EOPNOTSUPP in reiserfs_for_each_xattr
scsi: esas2r: unlock on error in esas2r_nvram_read_direct()
scsi: qla4xxx: fix double free bug
scsi: bnx2i: fix potential use after free
scsi: target: core: Fix a pr_debug() argument
scsi: core: scsi_trace: Use get_unaligned_be*()
perf probe: Fix wrong address verification
regulator: ab8500: Remove SYSCLKREQ from enum ab8505_regulator_id
Linux 4.9.211
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ifc27b3c6afdbd6a39bd7ae4e551d8bed42dc4973
commit 9d7bf41fafa5b5ddd4c13eb39446b0045f0a8167 upstream.
Unlike the normal SIOCOUTQ, SIOCOUTQNSD was never handled in compat
mode. Add it to the common socket compat handler along with similar
ones.
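For context, a userspace caller issues this ioctl roughly as sketched below; it is this kind of call, made from a 32-bit process on a 64-bit kernel, that was not handled in compat mode. The helper name unsent_bytes is made up for this example.

```c
#include <linux/sockios.h>	/* SIOCOUTQNSD */
#include <sys/ioctl.h>

/*
 * Return the number of bytes queued on a TCP socket but not yet sent,
 * or -1 on error (errno is set by ioctl()).
 */
static int unsent_bytes(int fd)
{
	int unsent = 0;

	if (ioctl(fd, SIOCOUTQNSD, &unsent) < 0)
		return -1;
	return unsent;
}
```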
Fixes: 2f4e1b3970 ("tcp: ioctl type SIOCOUTQNSD returns amount of data not sent")
Cc: Eric Dumazet <edumazet@google.com>
Cc: netdev@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge 4.9.190 into android-4.9-q
Changes in 4.9.190
usb: usbfs: fix double-free of usb memory upon submiturb error
usb: iowarrior: fix deadlock on disconnect
sound: fix a memory leak bug
x86/mm: Check for pfn instead of page in vmalloc_sync_one()
x86/mm: Sync also unmappings in vmalloc_sync_all()
mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()
perf record: Fix wrong size in perf_record_mmap for last kernel module
perf db-export: Fix thread__exec_comm()
perf record: Fix module size on s390
usb: yurex: Fix use-after-free in yurex_delete
can: peak_usb: fix potential double kfree_skb()
netfilter: nfnetlink: avoid deadlock due to synchronous request_module
iscsi_ibft: make ISCSI_IBFT dependson ACPI instead of ISCSI_IBFT_FIND
mac80211: don't warn about CW params when not using them
hwmon: (nct6775) Fix register address and added missed tolerance for nct6106
cpufreq/pasemi: fix use-after-free in pas_cpufreq_cpu_init()
s390/qdio: add sanity checks to the fast-requeue path
ALSA: compress: Fix regression on compressed capture streams
ALSA: compress: Prevent bypasses of set_params
ALSA: compress: Don't allow paritial drain operations on capture streams
ALSA: compress: Be more restrictive about when a drain is allowed
perf probe: Avoid calling freeing routine multiple times for same pointer
drbd: dynamically allocate shash descriptor
ACPI/IORT: Fix off-by-one check in iort_dev_find_its_id()
ARM: davinci: fix sleep.S build error on ARMv4
scsi: megaraid_sas: fix panic on loading firmware crashdump
scsi: ibmvfc: fix WARN_ON during event pool release
scsi: scsi_dh_alua: always use a 2 second delay before retrying RTPG
tty/ldsem, locking/rwsem: Add missing ACQUIRE to read_failed sleep loop
perf/core: Fix creating kernel counters for PMUs that override event->cpu
can: peak_usb: pcan_usb_pro: Fix info-leaks to USB devices
can: peak_usb: pcan_usb_fd: Fix info-leaks to USB devices
hwmon: (nct7802) Fix wrong detection of in4 presence
ALSA: firewire: fix a memory leak bug
ALSA: hda - Don't override global PCM hw info flag
mac80211: don't WARN on short WMM parameters from AP
SMB3: Fix deadlock in validate negotiate hits reconnect
smb3: send CAP_DFS capability during session setup
mwifiex: fix 802.11n/WPA detection
iwlwifi: don't unmap as page memory that was mapped as single
scsi: mpt3sas: Use 63-bit DMA addressing on SAS35 HBA
sh: kernel: hw_breakpoint: Fix missing break in switch statement
mm/usercopy: use memory range to be accessed for wraparound check
mm/memcontrol.c: fix use after free in mem_cgroup_iter()
bpf: get rid of pure_initcall dependency to enable jits
bpf: restrict access to core bpf sysctls
bpf: add bpf_jit_limit knob to restrict unpriv allocations
vhost-net: set packet weight of tx polling to 2 * vq size
vhost_net: use packet weight for rx handler, too
vhost_net: introduce vhost_exceeds_weight()
vhost: introduce vhost_exceeds_weight()
vhost_net: fix possible infinite loop
vhost: scsi: add weight support
siphash: add cryptographically secure PRF
siphash: implement HalfSipHash1-3 for hash tables
inet: switch IP ID generator to siphash
netfilter: ctnetlink: don't use conntrack/expect object addresses as id
xtensa: add missing isync to the cpu_reset TLB code
ALSA: hda - Fix a memory leak bug
ALSA: hda - Add a generic reboot_notify
ALSA: hda - Let all conexant codec enter D3 when rebooting
HID: holtek: test for sanity of intfdata
HID: hiddev: avoid opening a disconnected device
HID: hiddev: do cleanup in failure of opening a device
Input: kbtab - sanity check for endpoint type
Input: iforce - add sanity checks
net: usb: pegasus: fix improper read if get_registers() fail
xen/pciback: remove set but not used variable 'old_state'
irqchip/irq-imx-gpcv2: Forward irq type to parent
perf header: Fix divide by zero error if f_header.attr_size==0
perf header: Fix use of unitialized value warning
libata: zpodd: Fix small read overflow in zpodd_get_mech_type()
scsi: hpsa: correct scsi command status issue after reset
ata: libahci: do not complain in case of deferred probe
kbuild: modpost: handle KBUILD_EXTRA_SYMBOLS only for external modules
arm64/efi: fix variable 'si' set but not used
arm64/mm: fix variable 'pud' set but not used
IB/core: Add mitigation for Spectre V1
IB/mad: Fix use-after-free in ib mad completion handling
ocfs2: remove set but not used variable 'last_hash'
staging: comedi: dt3000: Fix signed integer overflow 'divider * base'
staging: comedi: dt3000: Fix rounding up of timer divisor
USB: core: Fix races in character device registration and deregistraion
usb: cdc-acm: make sure a refcount is taken early enough
USB: CDC: fix sanity checks in CDC union parser
USB: serial: option: add D-Link DWM-222 device ID
USB: serial: option: Add support for ZTE MF871A
USB: serial: option: add the BroadMobi BM818 card
USB: serial: option: Add Motorola modem UARTs
asm-generic: fix -Wtype-limits compiler warnings
bpf: fix bpf_jit_limit knob for PAGE_SIZE >= 64K
arm64: compat: Allow single-byte watchpoints on all addresses
netfilter: conntrack: Use consistent ct id hash calculation
Input: psmouse - fix build error of multiple definition
iommu/amd: Move iommu_init_pci() to .init section
bnx2x: Fix VF's VLAN reconfiguration in reload.
net/packet: fix race in tpacket_snd()
sctp: fix the transport error_count check
xen/netback: Reset nr_frags before freeing skb
net/mlx5e: Only support tx/rx pause setting for port owner
net/mlx5e: Use flow keys dissector to parse packets for ARFS
team: Add vlan tx offload to hw_enc_features
bonding: Add vlan tx offload to hw_enc_features
Linux 4.9.190
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
commit fa9dd599b4dae841924b022768354cfde9affecb upstream.
Having a pure_initcall() callback just to permanently enable BPF
JITs under CONFIG_BPF_JIT_ALWAYS_ON is unnecessary and could leave
a small race window in future where JIT is still disabled on boot.
Since we know about the setting at compilation time anyway, just
initialize it properly there. Also consolidate all the individual
bpf_jit_enable variables into a single one and move them under one
location. Moreover, don't allow for setting unspecified garbage
values on them.
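The idea can be sketched in plain C as follows. This is only an illustration, not the kernel code: the BPF_JIT_ALWAYS_ON macro stands in for CONFIG_BPF_JIT_ALWAYS_ON, set_jit_enable() is a hypothetical setter, and the accepted range 0..2 is an assumption based on the documented bpf_jit_enable values.

```c
#include <errno.h>

/* Compile-time choice of default and permitted range (names are hypothetical). */
#ifdef BPF_JIT_ALWAYS_ON
#define JIT_DEFAULT	1
#define JIT_MIN		1
#define JIT_MAX		1
#else
#define JIT_DEFAULT	0
#define JIT_MIN		0
#define JIT_MAX		2
#endif

/* One shared variable, initialized statically instead of from an initcall. */
static int jit_enable = JIT_DEFAULT;

/* Reject anything outside the permitted range instead of storing it. */
static int set_jit_enable(int val)
{
	if (val < JIT_MIN || val > JIT_MAX)
		return -EINVAL;
	jit_enable = val;
	return 0;
}
```

Because the default is a static initializer, there is no window during boot in which the variable holds a value other than the configured one.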
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
[bwh: Backported to 4.9 as dependency of commit 2e4a30983b0f
"bpf: restrict access to core bpf sysctls":
- Drop change in arch/mips/net/ebpf_jit.c
- Drop change to bpf_jit_kallsyms
- Adjust filenames, context]
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 9060cb719e61 ("net: crypto set sk to NULL when af_alg_release.")
fixed a use-after-free in sockfs_setattr() when an AF_ALG socket is
closed concurrently with fchownat(). However, it ignored that many
other proto_ops::release() methods don't set sock->sk to NULL and
therefore allow the same use-after-free:
- base_sock_release
- bnep_sock_release
- cmtp_sock_release
- data_sock_release
- dn_release
- hci_sock_release
- hidp_sock_release
- iucv_sock_release
- l2cap_sock_release
- llcp_sock_release
- llc_ui_release
- rawsock_release
- rfcomm_sock_release
- sco_sock_release
- svc_release
- vcc_release
- x25_release
Rather than fixing all these and relying on every socket type to get
this right forever, just make __sock_release() set sock->sk to NULL
itself after calling proto_ops::release().
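A minimal userspace sketch of that approach, using hypothetical types (struct endpoint stands in for struct socket, and its state pointer for sock->sk): the common release path clears the pointer, so a release method that forgets to do so no longer leaves a dangling reference.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct endpoint;	/* hypothetical stand-in for struct socket */

struct endpoint_ops {
	void (*release)(struct endpoint *ep);
};

struct endpoint {
	const struct endpoint_ops *ops;
	void *state;	/* stand-in for sock->sk */
};

/* A release method that frees the state but forgets to clear the pointer. */
static void sloppy_release(struct endpoint *ep)
{
	free(ep->state);
}

static const struct endpoint_ops sloppy_ops = { .release = sloppy_release };

/* Common release path: clear the pointer here so no method has to remember. */
static void endpoint_release(struct endpoint *ep)
{
	assert(ep != NULL);
	if (ep->ops) {
		ep->ops->release(ep);
		ep->state = NULL;
		ep->ops = NULL;
	}
}
```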
Reproducer that produces the KASAN splat when any of these socket types
are configured into the kernel:
#include <pthread.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <unistd.h>

pthread_t t;
volatile int fd;

void *close_thread(void *arg)
{
	for (;;) {
		usleep(rand() % 100);
		close(fd);
	}
}

int main()
{
	pthread_create(&t, NULL, close_thread, NULL);
	for (;;) {
		fd = socket(rand() % 50, rand() % 11, 0);
		/* 0x1000 is AT_EMPTY_PATH: operate on fd itself */
		fchownat(fd, "", 1000, 1000, 0x1000);
		close(fd);
	}
}
Fixes: 86741ec25462 ("net: core: Add a UID field to struct sock.")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit ff7b11aa481f682e0e9711abfeb7d03f5cd612bf)
Bug: 125367761
Test: used reproducer above
Change-Id: Ied4bbca5c7eb80c201fec6e0aabc95c24acc1b59
Signed-off-by: Eric Biggers <ebiggers@google.com>
Merge 4.9.136 into android-4.9
Also revert commit b91d532928df ("ipv6: set rt6i_protocol properly in
the route when it is installed") as it breaks the test systems.
Changes in 4.9.136
xfrm: Validate address prefix lengths in the xfrm selector.
xfrm6: call kfree_skb when skb is toobig
mac80211: Always report TX status
cfg80211: reg: Init wiphy_idx in regulatory_hint_core()
mac80211: fix pending queue hang due to TX_DROP
cfg80211: Address some corner cases in scan result channel updating
mac80211: TDLS: fix skb queue/priority assignment
ARM: 8799/1: mm: fix pci_ioremap_io() offset check
xfrm: validate template mode
ARM: dts: BCM63xx: Fix incorrect interrupt specifiers
net: macb: Clean 64b dma addresses if they are not detected
soc: fsl: qbman: qman: avoid allocating from non existing gen_pool
soc: fsl: qe: Fix copy/paste bug in ucc_get_tdm_sync_shift()
nl80211: Fix possible Spectre-v1 for NL80211_TXRATE_HT
mac80211_hwsim: do not omit multicast announce of first added radio
Bluetooth: SMP: fix crash in unpairing
pxa168fb: prepare the clock
qed: Avoid implicit enum conversion in qed_roce_mode_to_flavor
qed: Avoid constant logical operation warning in qed_vf_pf_acquire
asix: Check for supported Wake-on-LAN modes
ax88179_178a: Check for supported Wake-on-LAN modes
lan78xx: Check for supported Wake-on-LAN modes
sr9800: Check for supported Wake-on-LAN modes
r8152: Check for supported Wake-on-LAN Modes
smsc75xx: Check for Wake-on-LAN modes
smsc95xx: Check for Wake-on-LAN modes
perf/ring_buffer: Prevent concurent ring buffer access
perf/x86/intel/uncore: Fix PCI BDF address of M3UPI on SKX
net: fec: fix rare tx timeout
declance: Fix continuation with the adapter identification message
net: cxgb3_main: fix a missing-check bug
perf symbols: Fix memory corruption because of zero length symbols
mm/memory_hotplug.c: fix overflow in test_pages_in_a_zone()
MIPS: microMIPS: Fix decoding of swsp16 instruction
MIPS: Handle non word sized instructions when examining frame
scsi: aacraid: Fix typo in blink status
f2fs: fix multiple f2fs_add_link() having same name for inline dentry
igb: Remove superfluous reset to PHY and page 0 selection
ACPI: sysfs: Make ACPI GPE mask kernel parameter cover all GPEs
PCI: Disable MSI for HiSilicon Hip06/Hip07 only in Root Port mode
i2c: bcm2835: Avoid possible NULL ptr dereference
efi/fb: Correct PCI_STD_RESOURCE_END usage
ipv6: set rt6i_protocol properly in the route when it is installed
platform/x86: acer-wmi: setup accelerometer when ACPI device was found
IB/ipoib: Do not warn if IPoIB debugfs doesn't exist
IB/core: Fix the validations of a multicast LID in attach or detach operations
orangefs: off by ones in xattr size checks
rxe: Fix a sleep-in-atomic bug in post_one_send
nvme-pci: fix CMB sysfs file removal in reset path
net: phy: marvell: Limit 88m1101 autoneg errata to 88E1145 as well.
net/mlx5: Fix command completion after timeout access invalid structure
tipc: Fix tipc_sk_reinit handling of -EAGAIN
tipc: fix a race condition of releasing subscriber object
bnxt_en: Don't use rtnl lock to protect link change logic in workqueue.
ath10k: fix NAPI enable/disable symmetry for AHB interface
ARM: dts: bcm283x: Reserve first page for firmware
btrfs: fiemap: Cache and merge fiemap extent before submit it to user
ata: sata_rcar: Handle return value of clk_prepare_enable
reset: hi6220: Set module license so that it can be loaded
ASoC: Intel: Skylake: Fix to parse consecutive string tkns in manifest
arch/sparc: increase CONFIG_NODES_SHIFT on SPARC64 to 5
mac80211: fix TX aggregation start/stop callback race
libata: fix error checking in in ata_parse_force_one()
net: ethernet: stmmac: Fix altr_tse_pcs SGMII Initialization
qlcnic: Fix tunnel offload for 82xx adapters
x86/cpu/cyrix: Add alternative Device ID of Geode GX1 SoC
ARM: 8677/1: boot/compressed: fix decompressor header layout for v7-M
gpu: ipu-v3: Fix CSI selection for VDIC
elevator: fix truncation of icq_cache_name
net: stmmac: ensure jumbo_frm error return is correctly checked for -ve value
Btrfs: clear EXTENT_DEFRAG bits in finish_ordered_io
ufs: we need to sync inode before freeing it
net/mlx5e: Fix fixpoint divide exception in mlx5e_am_stats_compare
ip6_tunnel: Correct tos value in collect_md mode
net/mlx5: Fix driver load error flow when firmware is stuck
perf evsel: Fix probing of precise_ip level for default cycles event
perf probe: Fix probe definition for inlined functions
net/mlx5: Fix health work queue spin lock to IRQ safe
usb: renesas_usbhs: gadget: fix spin_lock_init() for &uep->lock
usb: renesas_usbhs: gadget: fix unused-but-set-variable warning
usb: dwc3: omap: remove IRQ_NOAUTOEN used with shared irq
clk: samsung: Fix m2m scaler clock on Exynos542x
ptr_ring: fix up after recent ptr_ring changes
staging: wilc1000: Fix problem with wrong vif index
rds: ib: Fix missing call to rds_ib_dev_put in rds_ib_setup_qp
iio: adc: Revert "axp288: Drop bogus AXP288_ADC_TS_PIN_CTRL register modifications"
qed: Warn PTT usage by wrong hw-function
ocfs2: fix deadlock caused by recursive locking in xattr
net: cdc_ncm: GetNtbFormat endian fix
sctp: use right member as the param of list_for_each_entry
ALSA: hda - No loopback on ALC299 codec
ath10k: convert warning about non-existent OTP board id to debug message
ipv6: fix cleanup ordering for ip6_mr failure
IB/ipoib: Fix lockdep issue found on ipoib_ib_dev_heavy_flush
IB/rxe: put the pool on allocation failure
nbd: only set MSG_MORE when we have more to send
mm/frame_vector.c: release a semaphore in 'get_vaddr_frames()'
IB/mlx5: Avoid passing an invalid QP type to firmware
scsi: qla2xxx: Avoid double completion of abort command
drm: bochs: Don't remove uninitialized fbdev framebuffer
i40e: avoid NVM acquire deadlock during NVM update
Revert "IB/ipoib: Update broadcast object if PKey value was changed in index 0"
Btrfs: incremental send, fix invalid memory access
drm/msm: Fix possible null dereference on failure of get_pages()
module: fix DEBUG_SET_MODULE_RONX typo
iio: pressure: zpa2326: Remove always-true check which confuses gcc
l2tp: remove configurable payload offset
macsec: fix memory leaks when skb_to_sgvec fails
perf/core: Fix locking for children siblings group read
cifs: Use ULL suffix for 64-bit constant
futex: futex_wake_op, do not fail on invalid op
ALSA: hda - Fix incorrect usage of IS_REACHABLE()
test_bpf: Fix testing with CONFIG_BPF_JIT_ALWAYS_ON=y on other arches
xen-netfront: Update features after registering netdev
sparc64: Fix regression in pmdp_invalidate().
xen-netfront: Fix mismatched rtnl_unlock
enic: do not overwrite error code
bonding: ratelimit failed speed/duplex update warning
nvmet: fix space padding in serial number
iio: buffer: fix the function signature to match implementation
x86/paravirt: Fix some warning messages
IB/mlx4: Fix an error handling path in 'mlx4_ib_rereg_user_mr()'
libertas: call into generic suspend code before turning off power
xhci: Fix USB3 NULL pointer dereference at logical disconnect.
perf tests: Fix indexing when invoking subtests
ARM: dts: imx53-qsb: disable 1.2GHz OPP
rxrpc: Don't check RXRPC_CALL_TX_LAST after calling rxrpc_rotate_tx_window()
rxrpc: Only take the rwind and mtu values from latest ACK
net: ena: fix NULL dereference due to untimely napi initialization
fs/fat/fatent.c: add cond_resched() to fat_count_free_clusters()
mtd: spi-nor: Add support for is25wp series chips
Revert "netfilter: ipv6: nf_defrag: drop skb dst before queueing"
perf tools: Disable parallelism for 'make clean'
bridge: do not add port to router list when receives query with source 0.0.0.0
net: bridge: remove ipv6 zero address check in mcast queries
ipv6: mcast: fix a use-after-free in inet6_mc_check
ipv6/ndisc: Preserve IPv6 control buffer if protocol error handlers are called
llc: set SOCK_RCU_FREE in llc_sap_add_socket()
net/ipv6: Fix index counter for unicast addresses in in6_dump_addrs
net: sched: gred: pass the right attribute to gred_change_table_def()
net: socket: fix a missing-check bug
net: stmmac: Fix stmmac_mdio_reset() when building stmmac as modules
net: udp: fix handling of CHECKSUM_COMPLETE packets
r8169: fix NAPI handling under high load
sctp: fix race on sctp_id2asoc
vhost: Fix Spectre V1 vulnerability
ethtool: fix a privilege escalation bug
bonding: fix length of actor system
net: drop skb on failure in ip_check_defrag()
net: fix pskb_trim_rcsum_slow() with odd trim offset
rtnetlink: Disallow FDB configuration for non-Ethernet device
ip6_tunnel: Fix encapsulation layout
Revert "x86/mm: Expand static page table for fixmap space"
crypto: shash - Fix a sleep-in-atomic bug in shash_setkey_unaligned
ahci: don't ignore result code of ahci_reset_controller()
gpio: mxs: Get rid of external API call
xfs: truncate transaction does not modify the inobt
cachefiles: fix the race between cachefiles_bury_object() and rmdir(2)
ptp: fix Spectre v1 vulnerability
drm/edid: Add 6 bpc quirk for BOE panel in HP Pavilion 15-n233sl
RDMA/ucma: Fix Spectre v1 vulnerability
IB/ucm: Fix Spectre v1 vulnerability
cdc-acm: correct counting of UART states in serial state notification
usb: gadget: storage: Fix Spectre v1 vulnerability
USB: fix the usbfs flag sanitization for control transfers
Input: elan_i2c - add ACPI ID for Lenovo IdeaPad 330-15IGM
sched/fair: Fix throttle_list starvation with low CFS quota
x86/percpu: Fix this_cpu_read()
x86/time: Correct the attribute on jiffies' definition
net: fs_enet: do not call phy_stop() in interrupts
posix-timers: Sanitize overrun handling
Linux 4.9.136
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
[ Upstream commit b6168562c8ce2bd5a30e213021650422e08764dc ]
In ethtool_ioctl(), the ioctl command 'ethcmd' is checked through a switch
statement to see whether it is necessary to pre-process the ethtool
structure, because, as mentioned in the comment, the structure
ethtool_rxnfc is defined with padding. If yes, a user-space buffer 'rxnfc'
is allocated through compat_alloc_user_space(). One thing to note here is
that, if 'ethcmd' is ETHTOOL_GRXCLSRLALL, the size of the buffer 'rxnfc' is
partially determined by 'rule_cnt', which is actually acquired from the
user-space buffer 'compat_rxnfc', i.e., 'compat_rxnfc->rule_cnt', through
get_user(). After 'rxnfc' is allocated, the data in the original user-space
buffer 'compat_rxnfc' is then copied to 'rxnfc' through copy_in_user(),
including the 'rule_cnt' field. However, after this copy, no check is
re-applied to 'rxnfc->rule_cnt'. A malicious user can therefore race to
change the value of 'compat_rxnfc->rule_cnt' between these two copies. In
this way, the attacker can bypass the previous check on 'rule_cnt' and
inject malicious data. This can cause undefined behavior in the kernel and
introduce a potential security risk.
This patch avoids the above issue by copying the value acquired by
get_user() to 'rxnfc->rule_cnt', if 'ethcmd' is ETHTOOL_GRXCLSRLALL.
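The double fetch and its fix can be modeled in userspace as below. This is a simplified sketch, not the kernel code: struct req is a hypothetical two-field stand-in for ethtool_rxnfc, and the racer callback simulates a second thread rewriting user memory between the two reads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the ethtool_rxnfc layout. */
struct req {
	uint32_t cmd;
	uint32_t rule_cnt;
};

/*
 * Copy a request out of "user" memory. 'racer', if non-NULL, is called
 * between the two reads to model a concurrent writer.
 */
static void fetch_req(struct req *dst, volatile struct req *user,
		      void (*racer)(volatile struct req *))
{
	uint32_t rule_cnt;

	assert(dst != NULL && user != NULL);
	rule_cnt = user->rule_cnt;	/* first fetch: the value that was checked */

	if (racer)
		racer(user);

	dst->cmd = user->cmd;		/* second fetch: the whole structure */
	dst->rule_cnt = user->rule_cnt;

	dst->rule_cnt = rule_cnt;	/* the fix: keep the checked value */
}

/* Models the attacker changing the count after it has been checked. */
static void bump_rule_cnt(volatile struct req *user)
{
	user->rule_cnt = 0xffffffffu;
}
```

Without the final assignment, dst->rule_cnt would hold whatever the racing writer stored, regardless of the value used to size the buffer.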
Signed-off-by: Wenwen Wang <wang6495@umn.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
fchownat() does not take a reference on fd until it determines that
fd is actually needed (otherwise fd is ignored), and it drops the
reference after it resolves the path. This means sock_close() could race
with sockfs_setattr(), which leads to a NULL pointer dereference
since typically we set sock->sk to NULL in ->release().
As pointed out by Al, this is unique to sockfs. So we can fix this
in socket layer by acquiring inode_lock in sock_close() and
checking against NULL in sockfs_setattr().
sock_release() is called in many places, but only the sock_close()
path matters here. Fortunately, this should not affect normal
sock_close(), as it is only called when the last fd refcnt is gone.
It only affects sock_close() with a parallel sockfs_setattr() in
progress, which is not common.
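The locking scheme can be sketched in userspace as follows, with a pthread mutex standing in for inode_lock and a hypothetical struct node standing in for the socket inode. The -ENOENT return for the already-closed case is a choice made for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a socket inode and the state it points to. */
struct node {
	pthread_mutex_t lock;	/* plays the role of inode_lock */
	int *owner_uid;		/* plays the role of sock->sk (and its uid) */
};

/* setattr path: check for NULL under the lock before touching the state. */
static int node_setattr(struct node *n, int uid)
{
	int ret = -ENOENT;

	assert(n != NULL);
	pthread_mutex_lock(&n->lock);
	if (n->owner_uid) {
		*n->owner_uid = uid;
		ret = 0;
	}
	pthread_mutex_unlock(&n->lock);
	return ret;
}

/* close path: free and clear the pointer under the same lock. */
static void node_close(struct node *n)
{
	assert(n != NULL);
	pthread_mutex_lock(&n->lock);
	free(n->owner_uid);
	n->owner_uid = NULL;
	pthread_mutex_unlock(&n->lock);
}
```

Since both paths serialize on the same lock, node_setattr() sees either the live state or NULL, never a pointer that is being freed.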
Fixes: 86741ec25462 ("net: core: Add a UID field to struct sock.")
Reported-by: shankarapailoor <shankarapailoor@gmail.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: Lorenzo Colitti <lorenzo@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 6d8c50dcb029872b298eea68cc6209c866fd3e14)
Signed-off-by: Chenbo Feng <fengc@google.com>
Bug: 112220999
Test: syzkaller reproducer doesn't trigger the crash anymore
Change-Id: I586fbc3b200f8cb855017d5cd701a126a36b8172
Merge 4.9.118 into android-4.9
Changes in 4.9.118
ipv4: remove BUG_ON() from fib_compute_spec_dst
net: ena: Fix use of uninitialized DMA address bits field
net: fix amd-xgbe flow-control issue
net: lan78xx: fix rx handling before first packet is send
net: mdio-mux: bcm-iproc: fix wrong getter and setter pair
NET: stmmac: align DMA stuff to largest cache line length
tcp_bbr: fix bw probing to raise in-flight data for very small BDPs
xen-netfront: wait xenbus state change when load module manually
tcp: do not force quickack when receiving out-of-order packets
tcp: add max_quickacks param to tcp_incr_quickack and tcp_enter_quickack_mode
tcp: do not aggressively quick ack after ECN events
tcp: refactor tcp_ecn_check_ce to remove sk type cast
tcp: add one more quick ack after after ECN events
pinctrl: intel: Read back TX buffer state
sched/wait: Remove the lockless swait_active() check in swake_up*()
bonding: avoid lockdep confusion in bond_get_stats()
inet: frag: enforce memory limits earlier
ipv4: frags: handle possible skb truesize change
net: dsa: Do not suspend/resume closed slave_dev
netlink: Fix spectre v1 gadget in netlink_create()
net: stmmac: Fix WoL for PCI-based setups
squashfs: more metadata hardening
squashfs: more metadata hardenings
can: ems_usb: Fix memory leak on ems_usb_disconnect()
net: socket: fix potential spectre v1 gadget in socketcall
virtio_balloon: fix another race between migration and ballooning
kvm: x86: vmx: fix vpid leak
crypto: padlock-aes - Fix Nano workaround data corruption
drm/vc4: Reset ->{x, y}_scaling[1] when dealing with uniplanar formats
scsi: sg: fix minor memory leak in error path
Linux 4.9.118
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
commit c8e8cd579bb4265651df8223730105341e61a2d1 upstream.
'call' is a user-controlled value, so sanitize the array index after the
bounds check to avoid speculating past the bounds of the 'nargs' array.
Found with the help of Smatch:
net/socket.c:2508 __do_sys_socketcall() warn: potential spectre issue
'nargs' [r] (local cap)
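The sanitization described above can be sketched in userspace as follows. This is a minimal illustration, not the kernel code: demo_index_nospec() is a hypothetical stand-in modeled on the generic masking idea behind the kernel's array_index_nospec(), and demo_table is made-up data. It assumes two's-complement longs, an arithmetic right shift for signed values (as gcc and clang provide), and sizes below LONG_MAX.

```c
#include <limits.h>

/* Hypothetical stand-in for the kernel helper: returns idx when
 * idx < size and 0 otherwise, using a mask instead of a branch on idx. */
static unsigned long demo_index_nospec(unsigned long idx, unsigned long size)
{
	/* The top bit of (idx | (size - 1 - idx)) is clear only when idx < size. */
	long mask = ~(long)(idx | (size - 1 - idx)) >> (sizeof(long) * CHAR_BIT - 1);

	return idx & (unsigned long)mask;
}

/* Illustrative table; the values are invented for this sketch. */
static const int demo_table[4] = { 10, 20, 30, 40 };

static int demo_lookup(unsigned long call)
{
	if (call >= 4)
		return -1;			/* architectural bounds check */
	call = demo_index_nospec(call, 4);	/* clamp on the speculative path too */
	return demo_table[call];
}
```

The bounds check stays in place; the mask only ensures that a mispredicted branch cannot index past the table.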
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jeremy Cline <jcline@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge 4.9.79 into android-4.9
Changes in 4.9.79
x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit kernels
orangefs: use list_for_each_entry_safe in purge_waiting_ops
orangefs: initialize op on loop restart in orangefs_devreq_read
usbip: prevent vhci_hcd driver from leaking a socket pointer address
usbip: Fix implicit fallthrough warning
usbip: Fix potential format overflow in userspace tools
can: af_can: can_rcv(): replace WARN_ONCE by pr_warn_once
can: af_can: canfd_rcv(): replace WARN_ONCE by pr_warn_once
KVM: arm/arm64: Check pagesize when allocating a hugepage at Stage 2
Prevent timer value 0 for MWAITX
drivers: base: cacheinfo: fix x86 with CONFIG_OF enabled
drivers: base: cacheinfo: fix boot error message when acpi is enabled
mm/mmap.c: do not blow on PROT_NONE MAP_FIXED holes in the stack
hwpoison, memcg: forcibly uncharge LRU pages
cma: fix calculation of aligned offset
mm, page_alloc: fix potential false positive in __zone_watermark_ok
ipc: msg, make msgrcv work with LONG_MIN
ACPI / scan: Prefer devices without _HID/_CID for _ADR matching
ACPICA: Namespace: fix operand cache leak
netfilter: nfnetlink_cthelper: Add missing permission checks
netfilter: xt_osf: Add missing permission checks
reiserfs: fix race in prealloc discard
reiserfs: don't preallocate blocks for extended attributes
fs/fcntl: f_setown, avoid undefined behaviour
scsi: libiscsi: fix shifting of DID_REQUEUE host byte
Revert "module: Add retpoline tag to VERMAGIC"
mm: fix 100% CPU kswapd busyloop on unreclaimable nodes
Input: trackpoint - force 3 buttons if 0 button is reported
orangefs: fix deadlock; do not write i_size in read_iter
um: link vmlinux with -no-pie
vsyscall: Fix permissions for emulate mode with KAISER/PTI
eventpoll.h: add missing epoll event masks
dccp: don't restart ccid2_hc_tx_rto_expire() if sk in closed state
ipv6: Fix getsockopt() for sockets with default IPV6_AUTOFLOWLABEL
ipv6: fix udpv6 sendmsg crash caused by too small MTU
ipv6: ip6_make_skb() needs to clear cork.base.dst
lan78xx: Fix failure in USB Full Speed
net: igmp: fix source address check for IGMPv3 reports
net: qdisc_pkt_len_init() should be more robust
net: tcp: close sock if net namespace is exiting
pppoe: take ->needed_headroom of lower device into account on xmit
r8169: fix memory corruption on retrieval of hardware statistics.
sctp: do not allow the v4 socket to bind a v4mapped v6 address
sctp: return error if the asoc has been peeled off in sctp_wait_for_sndbuf
tipc: fix a memory leak in tipc_nl_node_get_link()
vmxnet3: repair memory leak
net: Allow neigh contructor functions ability to modify the primary_key
ipv4: Make neigh lookup keys for loopback/point-to-point devices be INADDR_ANY
ppp: unlock all_ppp_mutex before registering device
be2net: restore properly promisc mode after queues reconfiguration
ip6_gre: init dev->mtu and dev->hard_header_len correctly
gso: validate gso_type in GSO handlers
mlxsw: spectrum_router: Don't log an error on missing neighbor
tun: fix a memory leak for tfile->tx_array
flow_dissector: properly cap thoff field
perf/x86/amd/power: Do not load AMD power module on !AMD platforms
x86/microcode/intel: Extend BDW late-loading further with LLC size check
hrtimer: Reset hrtimer cpu base proper on CPU hotplug
x86: bpf_jit: small optimization in emit_bpf_tail_call()
bpf: fix bpf_tail_call() x64 JIT
bpf: introduce BPF_JIT_ALWAYS_ON config
bpf: arsh is not supported in 32 bit alu thus reject it
bpf: avoid false sharing of map refcount with max_entries
bpf: fix divides by zero
bpf: fix 32-bit divide by zero
bpf: reject stores into ctx via st and xadd
nfsd: auth: Fix gid sorting when rootsquash enabled
Linux 4.9.79
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
[ upstream commit 290af86629b25ffd1ed6232c4e9107da031705cb ]
The BPF interpreter has been used as part of the spectre 2 attack CVE-2017-5715.
A quote from the Google Project Zero blog:
"At this point, it would normally be necessary to locate gadgets in
the host kernel code that can be used to actually leak data by reading
from an attacker-controlled location, shifting and masking the result
appropriately and then using the result of that as offset to an
attacker-controlled address for a load. But piecing gadgets together
and figuring out which ones work in a speculation context seems annoying.
So instead, we decided to use the eBPF interpreter, which is built into
the host kernel - while there is no legitimate way to invoke it from inside
a VM, the presence of the code in the host kernel's text section is sufficient
to make it usable for the attack, just like with ordinary ROP gadgets."
To make the attacker's job harder, introduce the BPF_JIT_ALWAYS_ON config
option, which removes the interpreter from the kernel in favor of JIT-only mode.
So far eBPF JIT is supported by:
x64, arm64, arm32, sparc64, s390, powerpc64, mips64
The start of JITed program is randomized and code page is marked as read-only.
In addition, "constant blinding" can be turned on with net.core.bpf_jit_harden.
v2->v3:
- move __bpf_prog_ret0 under ifdef (Daniel)
v1->v2:
- fix init order, test_bpf and cBPF (Daniel's feedback)
- fix offloaded bpf (Jakub's feedback)
- add 'return 0' dummy in case something can invoke prog->bpf_func
- retarget bpf tree. For bpf-next the patch would need one extra hunk.
It will be sent when the trees are merged back to net-next
Considered doing:
int bpf_jit_enable __read_mostly = BPF_EBPF_JIT_DEFAULT;
but it seems better to land the patch as-is and in bpf-next remove
bpf_jit_enable global variable from all JITs, consolidate in one place
and remove this jit_init() function.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge 4.9.71 into android-4.9
Changes in 4.9.71
mfd: fsl-imx25: Clean up irq settings during removal
crypto: rsa - fix buffer overread when stripping leading zeroes
crypto: hmac - require that the underlying hash algorithm is unkeyed
crypto: salsa20 - fix blkcipher_walk API usage
autofs: fix careless error in recent commit
tracing: Allocate mask_str buffer dynamically
USB: uas and storage: Add US_FL_BROKEN_FUA for another JMicron JMS567 ID
USB: core: prevent malicious bNumInterfaces overflow
usbip: fix stub_rx: get_pipe() to validate endpoint number
usb: add helper to extract bits 12:11 of wMaxPacketSize
usbip: fix stub_rx: harden CMD_SUBMIT path to handle malicious input
usbip: fix stub_send_ret_submit() vulnerability to null transfer_buffer
ceph: drop negative child dentries before try pruning inode's alias
usb: xhci: fix TDS for MTK xHCI1.1
Bluetooth: btusb: driver to enable the usb-wakeup feature
xhci: Don't add a virt_dev to the devs array before it's fully allocated
nfs: don't wait on commit in nfs_commit_inode() if there were no commit requests
sched/rt: Do not pull from current CPU if only one CPU to pull
eeprom: at24: change nvmem stride to 1
dmaengine: dmatest: move callback wait queue to thread context
ext4: fix fdatasync(2) after fallocate(2) operation
ext4: fix crash when a directory's i_size is too small
mac80211: Fix addition of mesh configuration element
usb: phy: isp1301: Add OF device ID table
KVM: nVMX: do not warn when MSR bitmap address is not backed
usb: xhci-mtk: check hcc_params after adding primary hcd
md-cluster: free md_cluster_info if node leave cluster
userfaultfd: shmem: __do_fault requires VM_FAULT_NOPAGE
userfaultfd: selftest: vm: allow to build in vm/ directory
net: initialize msg.msg_flags in recvfrom
bnxt_en: Ignore 0 value in autoneg supported speed from firmware.
net: bcmgenet: correct the RBUF_OVFL_CNT and RBUF_ERR_CNT MIB values
net: bcmgenet: correct MIB access of UniMAC RUNT counters
net: bcmgenet: reserved phy revisions must be checked first
net: bcmgenet: power down internal phy if open or resume fails
net: bcmgenet: synchronize irq0 status between the isr and task
net: bcmgenet: Power up the internal PHY before probing the MII
rxrpc: Wake up the transmitter if Rx window size increases on the peer
net/mlx5: Fix create autogroup prev initializer
net/mlx5: Don't save PCI state when PCI error is detected
iommu/io-pgtable-arm-v7s: Check for leaf entry before dereferencing it
drm/amdgpu: fix parser init error path to avoid crash in parser fini
NFSD: fix nfsd_minorversion(.., NFSD_AVAIL)
NFSD: fix nfsd_reset_versions for NFSv4.
Input: i8042 - add TUXEDO BU1406 (N24_25BU) to the nomux list
drm/omap: fix dmabuf mmap for dma_alloc'ed buffers
netfilter: bridge: honor frag_max_size when refragmenting
ASoC: rsnd: fix sound route path when using SRC6/SRC9
blk-mq: Fix tagset reinit in the presence of cpu hot-unplug
writeback: fix memory leak in wb_queue_work()
net: wimax/i2400m: fix NULL-deref at probe
dmaengine: Fix array index out of bounds warning in __get_unmap_pool()
irqchip/mvebu-odmi: Select GENERIC_MSI_IRQ_DOMAIN
net: Resend IGMP memberships upon peer notification.
mlxsw: reg: Fix SPVM max record count
mlxsw: reg: Fix SPVMLR max record count
qed: Align CIDs according to DORQ requirement
qed: Fix mapping leak on LL2 rx flow
qed: Fix interrupt flags on Rx LL2
drm: amd: remove broken include path
intel_th: pci: Add Gemini Lake support
openrisc: fix issue handling 8 byte get_user calls
ASoC: rcar: clear DE bit only in PDMACHCR when it stops
scsi: hpsa: update check for logical volume status
scsi: hpsa: limit outstanding rescans
scsi: hpsa: do not timeout reset operations
fjes: Fix wrong netdevice feature flags
drm/radeon/si: add dpm quirk for Oland
Drivers: hv: util: move waiting for release to hv_utils_transport itself
iwlwifi: mvm: cleanup pending frames in DQA mode
sched/deadline: Add missing update_rq_clock() in dl_task_timer()
sched/deadline: Make sure the replenishment timer fires in the next period
sched/deadline: Throttle a constrained deadline task activated after the deadline
sched/deadline: Use deadline instead of period when calculating overflow
mmc: mediatek: Fixed bug where clock frequency could be set wrong
drm/radeon: reinstate oland workaround for sclk
afs: Fix missing put_page()
afs: Populate group ID from vnode status
afs: Adjust mode bits processing
afs: Deal with an empty callback array
afs: Flush outstanding writes when an fd is closed
afs: Migrate vlocation fields to 64-bit
afs: Prevent callback expiry timer overflow
afs: Fix the maths in afs_fs_store_data()
afs: Invalid op ID should abort with RXGEN_OPCODE
afs: Better abort and net error handling
afs: Populate and use client modification time
afs: Fix page leak in afs_write_begin()
afs: Fix afs_kill_pages()
afs: Fix abort on signal while waiting for call completion
nvme-loop: fix a possible use-after-free when destroying the admin queue
nvmet: confirm sq percpu has scheduled and switched to atomic
nvmet-rdma: Fix a possible uninitialized variable dereference
net/mlx4_core: Avoid delays during VF driver device shutdown
net: mpls: Fix nexthop alive tracking on down events
rxrpc: Ignore BUSY packets on old calls
tty: don't panic on OOM in tty_set_ldisc()
tty: fix data race in tty_ldisc_ref_wait()
perf symbols: Fix symbols__fixup_end heuristic for corner cases
efi/esrt: Cleanup bad memory map log messages
NFSv4.1 respect server's max size in CREATE_SESSION
btrfs: add missing memset while reading compressed inline extents
target: Use system workqueue for ALUA transitions
target: fix ALUA transition timeout handling
target: fix race during implicit transition work flushes
Revert "x86/acpi: Set persistent cpuid <-> nodeid mapping when booting"
HID: cp2112: fix broken gpio_direction_input callback
sfc: don't warn on successful change of MAC
fbdev: controlfb: Add missing modes to fix out of bounds access
video: udlfb: Fix read EDID timeout
video: fbdev: au1200fb: Release some resources if a memory allocation fails
video: fbdev: au1200fb: Return an error code if a memory allocation fails
rtc: pcf8563: fix output clock rate
ASoC: Intel: Skylake: Fix uuid_module memory leak in failure case
dmaengine: ti-dma-crossbar: Correct am335x/am43xx mux value type
PCI/PME: Handle invalid data when reading Root Status
powerpc/powernv/cpufreq: Fix the frequency read by /proc/cpuinfo
PCI: Do not allocate more buses than available in parent
iommu/mediatek: Fix driver name
netfilter: ipvs: Fix inappropriate output of procfs
powerpc/opal: Fix EBUSY bug in acquiring tokens
powerpc/ipic: Fix status get and status clear
platform/x86: intel_punit_ipc: Fix resource ioremap warning
target/iscsi: Fix a race condition in iscsit_add_reject_from_cmd()
iscsi-target: fix memory leak in lio_target_tiqn_addtpg()
target:fix condition return in core_pr_dump_initiator_port()
target/file: Do not return error for UNMAP if length is zero
badblocks: fix wrong return value in badblocks_set if badblocks are disabled
iommu/amd: Limit the IOVA page range to the specified addresses
xfs: truncate pagecache before writeback in xfs_setattr_size()
arm-ccn: perf: Prevent module unload while PMU is in use
crypto: tcrypt - fix buffer lengths in test_aead_speed()
mm: Handle 0 flags in _calc_vm_trans() macro
clk: mediatek: add the option for determining PLL source clock
clk: imx6: refine hdmi_isfr's parent to make HDMI work on i.MX6 SoCs w/o VPU
clk: hi6220: mark clock cs_atb_syspll as critical
clk: tegra: Fix cclk_lp divisor register
ppp: Destroy the mutex when cleanup
ASoC: rsnd: rsnd_ssi_run_mods() needs to care ssi_parent_mod
thermal/drivers/step_wise: Fix temperature regulation misbehavior
scsi: scsi_debug: write_same: fix error report
GFS2: Take inode off order_write list when setting jdata flag
bcache: explicitly destroy mutex while exiting
bcache: fix wrong cache_misses statistics
Ib/hfi1: Return actual operational VLs in port info query
arm64: prevent regressions in compressed kernel image size when upgrading to binutils 2.27
btrfs: tests: Fix a memory leak in error handling path in 'run_test()'
platform/x86: hp_accel: Add quirk for HP ProBook 440 G4
nvme: use kref_get_unless_zero in nvme_find_get_ns
l2tp: cleanup l2tp_tunnel_delete calls
xfs: fix log block underflow during recovery cycle verification
xfs: fix incorrect extent state in xfs_bmap_add_extent_unwritten_real
RDMA/cxgb4: Declare stag as __be32
PCI: Detach driver before procfs & sysfs teardown on device remove
scsi: hpsa: cleanup sas_phy structures in sysfs when unloading
scsi: hpsa: destroy sas transport properties before scsi_host
powerpc/perf/hv-24x7: Fix incorrect comparison in memord
soc: mediatek: pwrap: fix compiler errors
tty fix oops when rmmod 8250
usb: musb: da8xx: fix babble condition handling
pinctrl: adi2: Fix Kconfig build problem
raid5: Set R5_Expanded on parity devices as well as data.
scsi: scsi_devinfo: Add REPORTLUN2 to EMC SYMMETRIX blacklist entry
IB/core: Fix calculation of maximum RoCE MTU
vt6655: Fix a possible sleep-in-atomic bug in vt6655_suspend
rtl8188eu: Fix a possible sleep-in-atomic bug in rtw_createbss_cmd
rtl8188eu: Fix a possible sleep-in-atomic bug in rtw_disassoc_cmd
scsi: sd: change manage_start_stop to bool in sysfs interface
scsi: sd: change allow_restart to bool in sysfs interface
scsi: bfa: integer overflow in debugfs
udf: Avoid overflow when session starts at large offset
macvlan: Only deliver one copy of the frame to the macvlan interface
RDMA/cma: Avoid triggering undefined behavior
IB/ipoib: Grab rtnl lock on heavy flush when calling ndo_open/stop
icmp: don't fail on fragment reassembly time exceeded
ath9k: fix tx99 potential info leak
Linux 4.9.71
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
[ Upstream commit 9f138fa609c47403374a862a08a41394be53d461 ]
KMSAN reports a use of uninitialized memory in put_cmsg() because
msg.msg_flags in recvfrom hasn't been initialized properly.
The flag values don't affect the result on this path, but it's still a
good idea to initialize them explicitly.
Signed-off-by: Alexander Potapenko <glider@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Make sockfs_setattr() static as it is not used outside of net/socket.c
This fixes the following GCC warning:
net/socket.c:534:5: warning: no previous prototype for ‘sockfs_setattr’ [-Wmissing-prototypes]
Fixes: 86741ec25462 ("net: core: Add a UID field to struct sock.")
Cc: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Acked-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes: android-4.9 commitID 81a159106e
("UPSTREAM: net: core: Add a UID field to struct sock.")
(cherry picked from commit dc647ec88e029307e60e6bf9988056605f11051a)
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
[ Upstream commit e623a9e9dec29ae811d11f83d0074ba254aba374 ]
Commit 34b88a68f2 ("net: Fix use after free in the recvmmsg exit path"),
changed the exit path of recvmmsg to always return the datagrams
variable and modified the error paths to set the variable to the error
code returned by recvmsg if necessary.
However, in the case where sock_error returned an error, the error code was
then ignored, and recvmmsg returned 0.
Change the error path of recvmmsg to correctly return the error code
of sock_error.
The bug was triggered by using recvmmsg on a CAN interface which was
not up. Linux 4.6 and later return 0 in this case while earlier
releases returned -ENETDOWN.
Fixes: 34b88a68f2 ("net: Fix use after free in the recvmmsg exit path")
Signed-off-by: Maxime Jayat <maxime.jayat@mobile-devices.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
->setattr() was recently implemented for socket files to sync the socket
inode's uid to the new 'sk_uid' member of struct sock. It does this by
copying over the ia_uid member of struct iattr. However, ia_uid is
actually only valid when ATTR_UID is set in ia_valid, indicating that
the uid is being changed, e.g. by chown. Other metadata operations such
as chmod or utimes leave ia_uid uninitialized. Therefore, sk_uid could
be set to a "garbage" value from the stack.
Fix this by only copying the uid over when ATTR_UID is set.
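A minimal userspace sketch of the guarded copy, under stated assumptions: the demo_* types and flag names are hypothetical stand-ins for struct iattr, struct sock, and the ATTR_* bits, and carry only the fields needed to show the idea.

```c
/* Hypothetical flag bits standing in for ATTR_MODE and ATTR_UID. */
#define DEMO_ATTR_MODE (1u << 0)
#define DEMO_ATTR_UID  (1u << 1)

struct demo_iattr {
	unsigned int ia_valid;	/* which fields below are meaningful */
	unsigned int ia_uid;	/* only valid when DEMO_ATTR_UID is set */
};

struct demo_sock {
	unsigned int sk_uid;
};

/* Copy the uid only when the caller marked it valid, so that a
 * chmod-like call cannot leak an uninitialized ia_uid into sk_uid. */
static void demo_setattr(struct demo_sock *sk, const struct demo_iattr *ia)
{
	if (ia->ia_valid & DEMO_ATTR_UID)
		sk->sk_uid = ia->ia_uid;
}
```

With this guard, a call that only sets DEMO_ATTR_MODE leaves sk_uid untouched regardless of what ia_uid happens to contain.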
Change-Id: I1efd83bd955325b33be3d4addccf5bac8ec803db
Fixes: 86741ec25462 ("net: core: Add a UID field to struct sock.")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Tested-by: Lorenzo Colitti <lorenzo@google.com>
Acked-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Protocol sockets (struct sock) don't have UIDs, but most of the
time, they map 1:1 to userspace sockets (struct socket) which do.
Various operations such as the iptables xt_owner match need
access to the "UID of a socket", and do so by following the
backpointer to the struct socket. This involves taking
sk_callback_lock and doesn't work when there is no socket
because userspace has already called close().
Simplify this by adding a sk_uid field to struct sock whose value
matches the UID of the corresponding struct socket. The semantics
are as follows:
1. Whenever sk_socket is non-null: sk_uid is the same as the UID
in sk_socket, i.e., matches the return value of sock_i_uid.
Specifically, the UID is set when userspace calls socket(),
fchown(), or accept().
2. When sk_socket is NULL, sk_uid is defined as follows:
- For a socket that no longer has a sk_socket because
userspace has called close(): the previous UID.
- For a cloned socket (e.g., an incoming connection that is
established but on which userspace has not yet called
accept): the UID of the socket it was cloned from.
- For a socket that has never had an sk_socket: UID 0 inside
the user namespace corresponding to the network namespace
the socket belongs to.
Kernel sockets created by sock_create_kern are a special case
of #1 and sk_uid is the user that created them. For kernel
sockets created at network namespace creation time, such as the
per-processor ICMP and TCP sockets, this is the user that created
the network namespace.
Change-Id: Id890c6ea724b6929cc543a474ab37ec2d9e3f815
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The IOP_XATTR flag is set on sockfs because sockfs supports getting the
"system.sockprotoname" xattr. Since commit 6c6ef9f2, this flag is checked for
setxattr support as well. This is wrong on sockfs because security xattr
support there is supposed to be provided by security_inode_setsecurity. The
smack security module relies on socket labels (xattrs).
Fix this by adding a security xattr handler on sockfs that returns
-EAGAIN, and by checking for -EAGAIN in setxattr.
We cannot simply check for -EOPNOTSUPP in setxattr because there are
filesystems that neither have direct security xattr support nor support
via security_inode_setsecurity. A more proper fix might be to move the
call to security_inode_setsecurity into sockfs, but it's not clear to me
if that is safe: we would end up calling security_inode_post_setxattr after
that as well.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Do not send the next message in sendmmsg for partial sendmsg
invocations.
sendmmsg assumes that it can continue sending the next message
when the return value of the individual sendmsg invocations
is positive. It results in corrupting the data for TCP,
SCTP, and UNIX streams.
For example, sendmmsg([["abcd"], ["efgh"]]) can result in a stream
of "aefgh" if the first sendmsg invocation sends only the first
byte while the second sendmsg goes through.
Datagram sockets either send the entire datagram or fail, so
this patch affects only sockets of type SOCK_STREAM and
SOCK_SEQPACKET.
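The corruption and the fix can be reproduced with a small userspace simulation. Everything here is hypothetical scaffolding (demo_stream, demo_send, demo_sendmmsg are not kernel names): the fake stream accepts only first_write_limit bytes on its first write, and the stop_on_partial argument selects between the old and the fixed loop behaviour.

```c
#include <stddef.h>
#include <string.h>

struct demo_msg {
	const char *data;
	size_t len;
};

/* Fake byte stream whose first write is cut short. */
struct demo_stream {
	char buf[64];
	size_t used;
	size_t first_write_limit;
	int writes;
};

/* Append up to m->len bytes; returns how many bytes were accepted. */
static size_t demo_send(struct demo_stream *s, const struct demo_msg *m)
{
	size_t n = m->len;
	size_t room = sizeof(s->buf) - 1 - s->used;

	if (s->writes == 0 && n > s->first_write_limit)
		n = s->first_write_limit;
	if (n > room)
		n = room;
	memcpy(s->buf + s->used, m->data, n);
	s->used += n;
	s->buf[s->used] = '\0';
	s->writes++;
	return n;
}

/* Returns the number of messages attempted. With stop_on_partial set,
 * the loop ends after the first short write, as the fix requires. */
static int demo_sendmmsg(struct demo_stream *s, const struct demo_msg *msgs,
			 int count, int stop_on_partial)
{
	int sent = 0;

	for (int i = 0; i < count; i++) {
		size_t n = demo_send(s, &msgs[i]);

		sent++;
		if (stop_on_partial && n < msgs[i].len)
			break;
	}
	return sent;
}
```

Sending "abcd" then "efgh" with first_write_limit set to 1 leaves "aefgh" in the stream when stop_on_partial is 0, and just "a" when it is 1, which leaves the caller free to resend the remainder in order.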
Fixes: 228e548e60 ("net: Add sendmmsg socket system call")
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These inode operations are no longer used; remove them.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
If we allow pseudo-filesystems created with mount_pseudo to have xattr
handlers, we can replace sockfs_getxattr with a sockfs_xattr_get handler
to use the xattr handler name parsing.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The standard return value for unsupported attribute names is
-EOPNOTSUPP, as opposed to undefined but supported attributes
(-ENODATA).
Also, fail for attribute names like "system.sockprotonameXXX" and
simplify the code a bit.
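As a rough illustration of the name handling (a userspace sketch, not the sockfs code): demo_xattr_lookup() is a hypothetical helper that accepts only the exact attribute name, so a name with trailing characters is rejected as unsupported.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical lookup: 0 for the one supported attribute name,
 * -EOPNOTSUPP for anything else, including longer names that merely
 * share the same prefix. */
static int demo_xattr_lookup(const char *name)
{
	if (strcmp(name, "system.sockprotoname") == 0)
		return 0;
	return -EOPNOTSUPP;
}
```

A prefix comparison with strncmp() would have accepted "system.sockprotonameXXX"; the exact comparison does not.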
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
struct timespec is not y2038 safe. Even though timespec might be
sufficient to represent timeouts, use struct timespec64 here as the plan
is to get rid of all timespec references in the kernel.
The patch transitions the common functions: poll_select_set_timeout()
and select_estimate_accuracy() to use timespec64. And, all the syscalls
that use these functions are transitioned in the same patch.
The restart block parameters for poll use monotonic time. Use
timespec64 here as well to assign the timeout value. This parameter in the
restart block need not change because it only holds the monotonic
timestamp at which the timeout should occur, and the unsigned long data type
should be big enough for this timestamp.
The system call interfaces will be handled in a separate series.
Compat interfaces need not change as timespec64 is an alias to struct
timespec on a 64 bit system.
Link: http://lkml.kernel.org/r/1461947989-21926-3-git-send-email-deepa.kernel@gmail.com
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Acked-by: John Stultz <john.stultz@linaro.org>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull networking updates from David Miller:
"Highlights:
1) Support SPI based w5100 devices, from Akinobu Mita.
2) Partial Segmentation Offload, from Alexander Duyck.
3) Add GMAC4 support to stmmac driver, from Alexandre TORGUE.
4) Allow cls_flower stats offload, from Amir Vadai.
5) Implement bpf blinding, from Daniel Borkmann.
6) Optimize _ASYNC_ bit twiddling on sockets, unless the socket is
actually using FASYNC these atomics are superfluous. From Eric
Dumazet.
7) Run TCP more preemptibly, also from Eric Dumazet.
8) Support LED blinking, EEPROM dumps, and rxvlan offloading in mlx5e
driver, from Gal Pressman.
9) Allow creating ppp devices via rtnetlink, from Guillaume Nault.
10) Improve BPF usage documentation, from Jesper Dangaard Brouer.
11) Support tunneling offloads in qed, from Manish Chopra.
12) aRFS offloading in mlx5e, from Maor Gottlieb.
13) Add RFS and RPS support to SCTP protocol, from Marcelo Ricardo
Leitner.
14) Add MSG_EOR support to TCP, this allows controlling packet
coalescing on application record boundaries for more accurate
socket timestamp sampling. From Martin KaFai Lau.
15) Fix alignment of 64-bit netlink attributes across the board, from
Nicolas Dichtel.
16) Per-vlan stats in bridging, from Nikolay Aleksandrov.
17) Several conversions of drivers to ethtool ksettings, from Philippe
Reynes.
18) Checksum neutral ILA in ipv6, from Tom Herbert.
19) Factorize all of the various marvell dsa drivers into one, from
Vivien Didelot
20) Add VF support to qed driver, from Yuval Mintz"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1649 commits)
Revert "phy dp83867: Fix compilation with CONFIG_OF_MDIO=m"
Revert "phy dp83867: Make rgmii parameters optional"
r8169: default to 64-bit DMA on recent PCIe chips
phy dp83867: Make rgmii parameters optional
phy dp83867: Fix compilation with CONFIG_OF_MDIO=m
bpf: arm64: remove callee-save registers use for tmp registers
asix: Fix offset calculation in asix_rx_fixup() causing slow transmissions
switchdev: pass pointer to fib_info instead of copy
net_sched: close another race condition in tcf_mirred_release()
tipc: fix nametable publication field in nl compat
drivers: net: Don't print unpopulated net_device name
qed: add support for dcbx.
ravb: Add missing free_irq() calls to ravb_close()
qed: Remove a stray tab
net: ethernet: fec-mpc52xx: use phy_ethtool_{get|set}_link_ksettings
net: ethernet: fec-mpc52xx: use phydev from struct net_device
bpf, doc: fix typo on bpf_asm descriptions
stmmac: hardware TX COE doesn't work when force_thresh_dma_mode is set
net: ethernet: fs-enet: use phy_ethtool_{get|set}_link_ksettings
net: ethernet: fs-enet: use phydev from struct net_device
...
The SKBTX_ACK_TSTAMP flag is set in skb_shinfo->tx_flags when
the timestamp of the TCP acknowledgement should be reported on
error queue. Since accessing skb_shinfo is likely to incur a
cache-line miss at the time of receiving the ack, the
txstamp_ack bit was added in tcp_skb_cb, which is set iff
the SKBTX_ACK_TSTAMP flag is set for an skb. This makes
SKBTX_ACK_TSTAMP flag redundant.
Remove the SKBTX_ACK_TSTAMP and instead use the txstamp_ack bit
everywhere.
Note that this frees one bit in shinfo->tx_flags.
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Suggested-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The socket is locked either if we hold the slock spin_lock (for
lock_sock_fast and unlock_sock_fast) or if we own the lock (sk_lock.owned
!= 0). Check for this and, at the same time, improve the check that the
current thread/cpu is really holding the lock.
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, SO_TIMESTAMPING can only be enabled using setsockopt.
This is very costly when users want to sample writes to gather
tx timestamps.
Add support for enabling SO_TIMESTAMPING via control messages by
using tsflags added in `struct sockcm_cookie` (added in the previous
patches in this series) to set the tx_flags of the last skb created in
a sendmsg. With this patch, the timestamp recording bits in tx_flags
of the skbuff are overridden if SO_TIMESTAMPING is passed in a cmsg.
Please note that this is only effective for overriding the timestamp
recording flags. Users should enable timestamp reporting (e.g.,
SOF_TIMESTAMPING_SOFTWARE | SOF_TIMESTAMPING_OPT_ID) using
socket options and then should ask for SOF_TIMESTAMPING_TX_*
using control messages per sendmsg to sample timestamps for each
write.
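From userspace, the per-write request could look roughly like the sketch below. attach_tx_tstamp_cmsg() is a hypothetical helper name; the cmsg level, type, and 32-bit payload follow the description above. The fallback value 37 for SO_TIMESTAMPING is an assumption that holds for the generic socket ABI but not for every architecture, and is only used when the libc headers do not define the constant.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/net_tstamp.h>

#ifndef SO_TIMESTAMPING
#define SO_TIMESTAMPING 37	/* assumption: generic ABI value */
#endif

/* Attach one SOL_SOCKET/SO_TIMESTAMPING control message carrying
 * tsflags (e.g. SOF_TIMESTAMPING_TX_SOFTWARE) to msg, using the
 * caller-provided control buffer. Returns 0, or -1 if ctrl is too small. */
static int attach_tx_tstamp_cmsg(struct msghdr *msg, void *ctrl,
				 size_t ctrl_len, uint32_t tsflags)
{
	struct cmsghdr *cmsg;

	if (ctrl_len < CMSG_SPACE(sizeof(tsflags)))
		return -1;

	memset(ctrl, 0, ctrl_len);
	msg->msg_control = ctrl;
	msg->msg_controllen = CMSG_SPACE(sizeof(tsflags));

	cmsg = CMSG_FIRSTHDR(msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SO_TIMESTAMPING;
	cmsg->cmsg_len = CMSG_LEN(sizeof(tsflags));
	memcpy(CMSG_DATA(cmsg), &tsflags, sizeof(tsflags));
	return 0;
}
```

A caller would first enable reporting once with setsockopt(), then fill in msg_iov as usual, call this helper with the desired SOF_TIMESTAMPING_TX_* bits, and pass the msghdr to sendmsg() for the writes it wants sampled.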
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The syzkaller fuzzer hit the following use-after-free:
Call Trace:
[<ffffffff8175ea0e>] __asan_report_load8_noabort+0x3e/0x40 mm/kasan/report.c:295
[<ffffffff851cc31a>] __sys_recvmmsg+0x6fa/0x7f0 net/socket.c:2261
[< inline >] SYSC_recvmmsg net/socket.c:2281
[<ffffffff851cc57f>] SyS_recvmmsg+0x16f/0x180 net/socket.c:2270
[<ffffffff86332bb6>] entry_SYSCALL_64_fastpath+0x16/0x7a
arch/x86/entry/entry_64.S:185
And, as Dmitry rightly assessed, that is because we can drop the
reference and then touch it when the underlying recvmsg calls return
some packets and then hit an error, which makes recvmmsg set
sock->sk->sk_err. Oops; fix it.
Reported-and-Tested-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Kostya Serebryany <kcc@google.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Fixes: a2e2725541 ("net: Introduce recvmmsg socket syscall")
http://lkml.kernel.org/r/20160122211644.GC2470@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no need to use the static variable here, pr_info_once is more
concise.
Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a new msg flag called MSG_BATCH. This flag is used in sendmsg to
indicate that more messages will follow (i.e. a batch of messages is
being sent). This is similar to MSG_MORE except that the following
messages are not merged into one packet; they are sent individually.
sendmmsg is updated so that each contained message except for the
last one is marked as MSG_BATCH.
MSG_BATCH is a performance optimization in cases where a socket
implementation can benefit by transmitting packets in a batch.
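The flagging rule above (every message in a sendmmsg batch except the last
carries MSG_BATCH) can be modelled in a few lines. This is a userspace
sketch, not the kernel code: batch_flags_for() is a hypothetical helper,
and EX_MSG_BATCH is a local stand-in whose value is assumed to match the
kernel's MSG_BATCH definition.

```c
#include <assert.h>

/* Local stand-in for MSG_BATCH; the value is an assumption. */
#define EX_MSG_BATCH 0x40000u

/*
 * Hypothetical helper: return the send flags for message number idx
 * (0-based) in a batch of vlen messages. Every message except the
 * last one is marked with EX_MSG_BATCH.
 */
static unsigned int batch_flags_for(unsigned int base_flags,
				    unsigned int idx, unsigned int vlen)
{
	assert(idx < vlen);

	if (idx == vlen - 1)
		return base_flags & ~EX_MSG_BATCH;	/* last: end of batch */
	return base_flags | EX_MSG_BATCH;		/* more messages follow */
}
```

With a batch of three messages, the first two would be sent with the
batch flag set and the third without it, so the transport knows when to
flush.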
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch allows setting MSG_EOR in each individual msghdr passed
in sendmmsg. This allows a sendmmsg to send multiple messages when
using SOCK_SEQPACKET.
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Export it for cases where we want to create sockets by hand.
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mark those kmem allocations that are known to be easily triggered from
userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to
memcg. For the list, see below:
- threadinfo
- task_struct
- task_delay_info
- pid
- cred
- mm_struct
- vm_area_struct and vm_region (nommu)
- anon_vma and anon_vma_chain
- signal_struct
- sighand_struct
- fs_struct
- files_struct
- fdtable and fdtable->full_fds_bits
- dentry and external_name
- inode for all filesystems. This is the most tedious part, because
most filesystems overwrite the alloc_inode method.
The list is far from complete, so feel free to add more objects.
Nevertheless, it should be close to "account everything" approach and
keep most workloads within bounds. Malevolent users will be able to
breach the limit, but this was possible even with the former "account
everything" approach (simply because it did not account everything in
fact).
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Greg Thelen <gthelen@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Applications often have to reduce the number of datagrams
they receive or send per system call to avoid starvation problems.
Really the kernel should take care of this by using cond_resched(),
so that applications can experiment with bigger batch sizes.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit ceb5d58b21 ("net: fix sock_wake_async() rcu protection") from
the current 4.4 release cycle introduced a new flags member in
struct socket_wq and moved SOCKWQ_ASYNC_NOSPACE and SOCKWQ_ASYNC_WAITDATA
from struct socket's flags member into that new place.
Unfortunately, the new flags field is never initialized properly, at least
not for the struct socket_wq instance created in sock_alloc_inode().
One particular issue I encountered because of this is that my GNU Emacs
failed to draw anything on my desktop -- i.e. what I got is a transparent
window, including the title bar. Bisection led to the commit mentioned
above and further investigation by means of strace told me that Emacs
is indeed speaking to my Xorg through an O_ASYNC AF_UNIX socket. This is
reproducible 100% of the time and the fact that properly initializing the
struct socket_wq ->flags fixes the issue leads me to the conclusion that
somehow SOCKWQ_ASYNC_WAITDATA got set in the uninitialized ->flags,
preventing my Emacs from receiving any SIGIO's due to data becoming
available and it got stuck.
Make sock_alloc_inode() set the newly created struct socket_wq's ->flags
member to zero.
Fixes: ceb5d58b21 ("net: fix sock_wake_async() rcu protection")
Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
msg_iocb needs to be initialized on the recv/recvfrom path.
Otherwise afalg will wrongly interpret it as an async call.
Cc: stable@vger.kernel.org
Reported-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry provided a syzkaller (http://github.com/google/syzkaller)
triggering a fault in sock_wake_async() when async IO is requested.
Said program stressed af_unix sockets, but the issue is generic
and should be addressed in core networking stack.
The problem is that by the time sock_wake_async() is called,
we should not access the @flags field of 'struct socket',
as the inode containing this socket might be freed without
further notice, and without RCU grace period.
We already maintain an RCU protected structure, "struct socket_wq"
so moving SOCKWQ_ASYNC_NOSPACE & SOCKWQ_ASYNC_WAITDATA into it
is the safe route.
It also reduces the number of cache lines needing dirtying, so it might
provide a performance improvement anyway.
In followup patches, we might move the remaining flags (SOCK_NOSPACE,
SOCK_PASSCRED, SOCK_PASSSEC) to save 8 bytes and let 'struct socket'
be mostly read and shared between cpus.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is a cleanup to make the following patch easier to
review.
The goal is to move SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA
from (struct socket)->flags to a (struct socket_wq)->flags
to benefit from RCU protection in sock_wake_async().
To ease backports, we rename both constants.
Two new helpers, sk_set_bit(int nr, struct sock *sk)
and sk_clear_bit(int nr, struct sock *sk), are added so that
the following patch can change their implementation.
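To illustrate why the helpers make the later move easy, here is a
simplified userspace model. All ex_* names and the two bit numbers are
hypothetical stand-ins, and C11 atomics replace the kernel's set_bit and
clear_bit; only the helper bodies would need to change when the flags
move to another structure.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical bit numbers standing in for the renamed constants. */
enum { EX_ASYNC_NOSPACE = 0, EX_ASYNC_WAITDATA = 1 };

/* Simplified stand-ins for struct socket and struct sock. */
struct ex_socket { atomic_ulong flags; };
struct ex_sock { struct ex_socket *sk_socket; };

/* Callers go through these helpers and never touch the flags word. */
static void ex_sk_set_bit(int nr, struct ex_sock *sk)
{
	assert(nr >= 0 && nr < (int)(sizeof(unsigned long) * 8));
	atomic_fetch_or(&sk->sk_socket->flags, 1UL << nr);
}

static void ex_sk_clear_bit(int nr, struct ex_sock *sk)
{
	assert(nr >= 0 && nr < (int)(sizeof(unsigned long) * 8));
	atomic_fetch_and(&sk->sk_socket->flags, ~(1UL << nr));
}
```

Because every caller uses the helpers, relocating the flags word only
requires editing these two function bodies.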
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
IS_ERR(_OR_NULL) already contains an 'unlikely' compiler hint, so there
is no need to repeat it in its callers. Drop it.
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
This is long overdue, and is part of cleaning up how we allocate kernel
sockets that don't reference count struct net.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no need for tun to do the weird network namespace refcounting.
The existing network namespace refcounting in tfile has almost exactly
the same lifetime. So rewrite the code to use the struct sock network
namespace refcounting and remove the unnecessary hand-rolled network
namespace refcounting and the unnecessary tfile->net.
This change allows the tun code to directly call sock_put, bypassing
sock_release and making SOCK_EXTERNALLY_ALLOCATED unnecessary.
Remove the now unnecessary tun_release so that if anything tries to use
the sock_release code path the kernel will oops, and let us know about
the bug.
The macvtap code already uses its internal socket this way.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
socket inodes and sunrpc filesystems - inodes owned by that code
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
All places outside of core VFS that checked ->read and ->write for being NULL or
called the methods directly are gone now, so NULL {read,write} with non-NULL
{read,write}_iter will do the right thing in all cases.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>