diff --git a/Documentation/ABI/obsolete/sysfs-block-zram b/Documentation/ABI/obsolete/sysfs-block-zram deleted file mode 100644 index 720ea92cfb2e..000000000000 --- a/Documentation/ABI/obsolete/sysfs-block-zram +++ /dev/null @@ -1,119 +0,0 @@ -What: /sys/block/zram/num_reads -Date: August 2015 -Contact: Sergey Senozhatsky -Description: - The num_reads file is read-only and specifies the number of - reads (failed or successful) done on this device. - Now accessible via zram/stat node. - -What: /sys/block/zram/num_writes -Date: August 2015 -Contact: Sergey Senozhatsky -Description: - The num_writes file is read-only and specifies the number of - writes (failed or successful) done on this device. - Now accessible via zram/stat node. - -What: /sys/block/zram/invalid_io -Date: August 2015 -Contact: Sergey Senozhatsky -Description: - The invalid_io file is read-only and specifies the number of - non-page-size-aligned I/O requests issued to this device. - Now accessible via zram/io_stat node. - -What: /sys/block/zram/failed_reads -Date: August 2015 -Contact: Sergey Senozhatsky -Description: - The failed_reads file is read-only and specifies the number of - failed reads happened on this device. - Now accessible via zram/io_stat node. - -What: /sys/block/zram/failed_writes -Date: August 2015 -Contact: Sergey Senozhatsky -Description: - The failed_writes file is read-only and specifies the number of - failed writes happened on this device. - Now accessible via zram/io_stat node. - -What: /sys/block/zram/notify_free -Date: August 2015 -Contact: Sergey Senozhatsky -Description: - The notify_free file is read-only. Depending on device usage - scenario it may account a) the number of pages freed because - of swap slot free notifications or b) the number of pages freed - because of REQ_DISCARD requests sent by bio. The former ones - are sent to a swap block device when a swap slot is freed, which - implies that this disk is being used as a swap disk. The latter - ones are sent by filesystem mounted with discard option, - whenever some data blocks are getting discarded. - Now accessible via zram/io_stat node. - -What: /sys/block/zram/zero_pages -Date: August 2015 -Contact: Sergey Senozhatsky -Description: - The zero_pages file is read-only and specifies number of zero - filled pages written to this disk. No memory is allocated for - such pages. - Now accessible via zram/mm_stat node. - -What: /sys/block/zram/orig_data_size -Date: August 2015 -Contact: Sergey Senozhatsky -Description: - The orig_data_size file is read-only and specifies uncompressed - size of data stored in this disk. This excludes zero-filled - pages (zero_pages) since no memory is allocated for them. - Unit: bytes - Now accessible via zram/mm_stat node. - -What: /sys/block/zram/compr_data_size -Date: August 2015 -Contact: Sergey Senozhatsky -Description: - The compr_data_size file is read-only and specifies compressed - size of data stored in this disk. So, compression ratio can be - calculated using orig_data_size and this statistic. - Unit: bytes - Now accessible via zram/mm_stat node. - -What: /sys/block/zram/mem_used_total -Date: August 2015 -Contact: Sergey Senozhatsky -Description: - The mem_used_total file is read-only and specifies the amount - of memory, including allocator fragmentation and metadata - overhead, allocated for this disk. So, allocator space - efficiency can be calculated using compr_data_size and this - statistic. - Unit: bytes - Now accessible via zram/mm_stat node. - -What: /sys/block/zram/mem_used_max -Date: August 2015 -Contact: Sergey Senozhatsky -Description: - The mem_used_max file is read/write and specifies the amount - of maximum memory zram have consumed to store compressed data. - For resetting the value, you should write "0". Otherwise, - you could see -EINVAL. - Unit: bytes - Downgraded to write-only node: so it's possible to set new - value only; its current value is stored in zram/mm_stat - node. - -What: /sys/block/zram/mem_limit -Date: August 2015 -Contact: Sergey Senozhatsky -Description: - The mem_limit file is read/write and specifies the maximum - amount of memory ZRAM can use to store the compressed data. - The limit could be changed in run time and "0" means disable - the limit. No limit is the initial state. Unit: bytes - Downgraded to write-only node: so it's possible to set new - value only; its current value is stored in zram/mm_stat - node. diff --git a/Documentation/ABI/testing/sysfs-block-zram b/Documentation/ABI/testing/sysfs-block-zram index 4518d30b8c2e..14b2bf2e5105 100644 --- a/Documentation/ABI/testing/sysfs-block-zram +++ b/Documentation/ABI/testing/sysfs-block-zram @@ -22,41 +22,6 @@ Description: device. The reset operation frees all the memory associated with this device. -What: /sys/block/zram/num_reads -Date: August 2010 -Contact: Nitin Gupta -Description: - The num_reads file is read-only and specifies the number of - reads (failed or successful) done on this device. - -What: /sys/block/zram/num_writes -Date: August 2010 -Contact: Nitin Gupta -Description: - The num_writes file is read-only and specifies the number of - writes (failed or successful) done on this device. - -What: /sys/block/zram/invalid_io -Date: August 2010 -Contact: Nitin Gupta -Description: - The invalid_io file is read-only and specifies the number of - non-page-size-aligned I/O requests issued to this device. - -What: /sys/block/zram/failed_reads -Date: February 2014 -Contact: Sergey Senozhatsky -Description: - The failed_reads file is read-only and specifies the number of - failed reads happened on this device. - -What: /sys/block/zram/failed_writes -Date: February 2014 -Contact: Sergey Senozhatsky -Description: - The failed_writes file is read-only and specifies the number of - failed writes happened on this device. - What: /sys/block/zram/max_comp_streams Date: February 2014 Contact: Sergey Senozhatsky @@ -73,74 +38,24 @@ Description: available and selected compression algorithms, change compression algorithm selection. -What: /sys/block/zram/notify_free -Date: August 2010 -Contact: Nitin Gupta -Description: - The notify_free file is read-only. Depending on device usage - scenario it may account a) the number of pages freed because - of swap slot free notifications or b) the number of pages freed - because of REQ_DISCARD requests sent by bio. The former ones - are sent to a swap block device when a swap slot is freed, which - implies that this disk is being used as a swap disk. The latter - ones are sent by filesystem mounted with discard option, - whenever some data blocks are getting discarded. - -What: /sys/block/zram/zero_pages -Date: August 2010 -Contact: Nitin Gupta -Description: - The zero_pages file is read-only and specifies number of zero - filled pages written to this disk. No memory is allocated for - such pages. - -What: /sys/block/zram/orig_data_size -Date: August 2010 -Contact: Nitin Gupta -Description: - The orig_data_size file is read-only and specifies uncompressed - size of data stored in this disk. This excludes zero-filled - pages (zero_pages) since no memory is allocated for them. - Unit: bytes - -What: /sys/block/zram/compr_data_size -Date: August 2010 -Contact: Nitin Gupta -Description: - The compr_data_size file is read-only and specifies compressed - size of data stored in this disk. So, compression ratio can be - calculated using orig_data_size and this statistic. - Unit: bytes - -What: /sys/block/zram/mem_used_total -Date: August 2010 -Contact: Nitin Gupta -Description: - The mem_used_total file is read-only and specifies the amount - of memory, including allocator fragmentation and metadata - overhead, allocated for this disk. So, allocator space - efficiency can be calculated using compr_data_size and this - statistic. - Unit: bytes - What: /sys/block/zram/mem_used_max Date: August 2014 Contact: Minchan Kim Description: - The mem_used_max file is read/write and specifies the amount - of maximum memory zram have consumed to store compressed data. - For resetting the value, you should write "0". Otherwise, - you could see -EINVAL. + The mem_used_max file is write-only and is used to reset + the counter of maximum memory zram have consumed to store + compressed data. For resetting the value, you should write + "0". Otherwise, you could see -EINVAL. Unit: bytes What: /sys/block/zram/mem_limit Date: August 2014 Contact: Minchan Kim Description: - The mem_limit file is read/write and specifies the maximum - amount of memory ZRAM can use to store the compressed data. The - limit could be changed in run time and "0" means disable the - limit. No limit is the initial state. Unit: bytes + The mem_limit file is write-only and specifies the maximum + amount of memory ZRAM can use to store the compressed data. + The limit could be changed in run time and "0" means disable + the limit. No limit is the initial state. Unit: bytes What: /sys/block/zram/compact Date: August 2015 @@ -175,3 +90,50 @@ Description: device's debugging info useful for kernel developers. Its format is not documented intentionally and may change anytime without any notice. + +What: /sys/block/zram/backing_dev +Date: June 2017 +Contact: Minchan Kim +Description: + The backing_dev file is read-write and set up backing + device for zram to write incompressible pages. + For using, user should enable CONFIG_ZRAM_WRITEBACK. + +What: /sys/block/zram/idle +Date: November 2018 +Contact: Minchan Kim +Description: + idle file is write-only and mark zram slot as idle. + If system has mounted debugfs, user can see which slots + are idle via /sys/kernel/debug/zram/zram/block_state + +What: /sys/block/zram/writeback +Date: November 2018 +Contact: Minchan Kim +Description: + The writeback file is write-only and trigger idle and/or + huge page writeback to backing device. + +What: /sys/block/zram/bd_stat +Date: November 2018 +Contact: Minchan Kim +Description: + The bd_stat file is read-only and represents backing device's + statistics (bd_count, bd_reads, bd_writes) in a format + similar to block layer statistics file format. + +What: /sys/block/zram/writeback_limit_enable +Date: November 2018 +Contact: Minchan Kim +Description: + The writeback_limit_enable file is read-write and specifies + eanbe of writeback_limit feature. "1" means eable the feature. + No limit "0" is the initial state. + +What: /sys/block/zram/writeback_limit +Date: November 2018 +Contact: Minchan Kim +Description: + The writeback_limit file is read-write and specifies the maximum + amount of writeback ZRAM can do. The limit could be changed + in run time. diff --git a/Documentation/ABI/testing/sysfs-bus-mei b/Documentation/ABI/testing/sysfs-bus-mei index 6bd45346ac7e..3f8701e8fa24 100644 --- a/Documentation/ABI/testing/sysfs-bus-mei +++ b/Documentation/ABI/testing/sysfs-bus-mei @@ -4,7 +4,7 @@ KernelVersion: 3.10 Contact: Samuel Ortiz linux-mei@linux.intel.com Description: Stores the same MODALIAS value emitted by uevent - Format: mei::: + Format: mei::: What: /sys/bus/mei/devices/.../name Date: May 2015 diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu index 6d75a9c00e8a..b41046b5713b 100644 --- a/Documentation/ABI/testing/sysfs-devices-system-cpu +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu @@ -356,6 +356,10 @@ What: /sys/devices/system/cpu/vulnerabilities /sys/devices/system/cpu/vulnerabilities/spectre_v1 /sys/devices/system/cpu/vulnerabilities/spectre_v2 /sys/devices/system/cpu/vulnerabilities/spec_store_bypass + /sys/devices/system/cpu/vulnerabilities/l1tf + /sys/devices/system/cpu/vulnerabilities/mds + /sys/devices/system/cpu/vulnerabilities/tsx_async_abort + /sys/devices/system/cpu/vulnerabilities/itlb_multihit Date: January 2018 Contact: Linux kernel mailing list Description: Information about CPU vulnerabilities @@ -367,3 +371,25 @@ Description: Information about CPU vulnerabilities "Not affected" CPU is not affected by the vulnerability "Vulnerable" CPU is affected and no mitigation in effect "Mitigation: $M" CPU is affected and mitigation $M is in effect + + See also: Documentation/hw-vuln/index.rst + +What: /sys/devices/system/cpu/smt + /sys/devices/system/cpu/smt/active + /sys/devices/system/cpu/smt/control +Date: June 2018 +Contact: Linux kernel mailing list +Description: Control Symetric Multi Threading (SMT) + + active: Tells whether SMT is active (enabled and siblings online) + + control: Read/write interface to control SMT. Possible + values: + + "on" SMT is enabled + "off" SMT is disabled + "forceoff" SMT is force disabled. Cannot be changed. + "notsupported" SMT is not supported by the CPU + + If control status is "forceoff" or "notsupported" writes + are rejected. diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs b/Documentation/ABI/testing/sysfs-fs-f2fs index f82da9bbb1fd..e916c1eddf1b 100644 --- a/Documentation/ABI/testing/sysfs-fs-f2fs +++ b/Documentation/ABI/testing/sysfs-fs-f2fs @@ -51,6 +51,14 @@ Description: Controls the dirty page count condition for the in-place-update policies. +What: /sys/fs/f2fs//min_seq_blocks +Date: August 2018 +Contact: "Jaegeuk Kim" +Description: + Controls the dirty page count condition for batched sequential + writes in ->writepages. + + What: /sys/fs/f2fs//min_hot_blocks Date: March 2017 Contact: "Jaegeuk Kim" @@ -78,12 +86,28 @@ Description: The unit size is one block, now only support configuring in range of [1, 512]. +What: /sys/fs/f2fs//umount_discard_timeout +Date: January 2019 +Contact: "Jaegeuk Kim" +Description: + Set timeout to issue discard commands during umount. + Default: 5 secs + What: /sys/fs/f2fs//max_victim_search Date: January 2014 Contact: "Jaegeuk Kim" Description: Controls the number of trials to find a victim segment. +What: /sys/fs/f2fs//migration_granularity +Date: October 2018 +Contact: "Chao Yu" +Description: + Controls migration granularity of garbage collection on large + section, it can let GC move partial segment{s} of one section + in one GC cycle, so that dispersing heavy overhead GC to + multiple lightweight one. + What: /sys/fs/f2fs//dir_level Date: March 2014 Contact: "Jaegeuk Kim" @@ -113,7 +137,22 @@ What: /sys/fs/f2fs//idle_interval Date: January 2016 Contact: "Jaegeuk Kim" Description: - Controls the idle timing. + Controls the idle timing for all paths other than + discard and gc path. + +What: /sys/fs/f2fs//discard_idle_interval +Date: September 2018 +Contact: "Chao Yu" +Contact: "Sahitya Tummala" +Description: + Controls the idle timing for discard path. + +What: /sys/fs/f2fs//gc_idle_interval +Date: September 2018 +Contact: "Chao Yu" +Contact: "Sahitya Tummala" +Description: + Controls the idle timing for gc path. What: /sys/fs/f2fs//iostat_enable Date: August 2017 diff --git a/Documentation/Changes b/Documentation/Changes index 22797a15dc24..76d6dc0d3227 100644 --- a/Documentation/Changes +++ b/Documentation/Changes @@ -33,7 +33,7 @@ GNU C 3.2 gcc --version GNU make 3.80 make --version binutils 2.12 ld -v util-linux 2.10o fdformat --version -module-init-tools 0.9.10 depmod -V +kmod 13 depmod -V e2fsprogs 1.41.4 e2fsck -V jfsutils 1.1.3 fsck.jfs -V reiserfsprogs 3.6.3 reiserfsck -V @@ -143,12 +143,6 @@ is not build with ``CONFIG_KALLSYMS`` and you have no way to rebuild and reproduce the Oops with that option, then you can still decode that Oops with ksymoops. -Module-Init-Tools ------------------ - -A new module loader is now in the kernel that requires ``module-init-tools`` -to use. It is backward compatible with the 2.4.x series kernels. - Mkinitrd -------- @@ -363,16 +357,17 @@ Util-linux - +Kmod +---- + +- +- + Ksymoops -------- - -Module-Init-Tools ------------------ - -- - Mkinitrd -------- diff --git a/Documentation/accounting/psi.txt b/Documentation/accounting/psi.txt new file mode 100644 index 000000000000..4fb40fe94828 --- /dev/null +++ b/Documentation/accounting/psi.txt @@ -0,0 +1,180 @@ +================================ +PSI - Pressure Stall Information +================================ + +:Date: April, 2018 +:Author: Johannes Weiner + +When CPU, memory or IO devices are contended, workloads experience +latency spikes, throughput losses, and run the risk of OOM kills. + +Without an accurate measure of such contention, users are forced to +either play it safe and under-utilize their hardware resources, or +roll the dice and frequently suffer the disruptions resulting from +excessive overcommit. + +The psi feature identifies and quantifies the disruptions caused by +such resource crunches and the time impact it has on complex workloads +or even entire systems. + +Having an accurate measure of productivity losses caused by resource +scarcity aids users in sizing workloads to hardware--or provisioning +hardware according to workload demand. + +As psi aggregates this information in realtime, systems can be managed +dynamically using techniques such as load shedding, migrating jobs to +other systems or data centers, or strategically pausing or killing low +priority or restartable batch jobs. + +This allows maximizing hardware utilization without sacrificing +workload health or risking major disruptions such as OOM kills. + +Pressure interface +================== + +Pressure information for each resource is exported through the +respective file in /proc/pressure/ -- cpu, memory, and io. + +The format for CPU is as such: + +some avg10=0.00 avg60=0.00 avg300=0.00 total=0 + +and for memory and IO: + +some avg10=0.00 avg60=0.00 avg300=0.00 total=0 +full avg10=0.00 avg60=0.00 avg300=0.00 total=0 + +The "some" line indicates the share of time in which at least some +tasks are stalled on a given resource. + +The "full" line indicates the share of time in which all non-idle +tasks are stalled on a given resource simultaneously. In this state +actual CPU cycles are going to waste, and a workload that spends +extended time in this state is considered to be thrashing. This has +severe impact on performance, and it's useful to distinguish this +situation from a state where some tasks are stalled but the CPU is +still doing productive work. As such, time spent in this subset of the +stall state is tracked separately and exported in the "full" averages. + +The ratios are tracked as recent trends over ten, sixty, and three +hundred second windows, which gives insight into short term events as +well as medium and long term trends. The total absolute stall time is +tracked and exported as well, to allow detection of latency spikes +which wouldn't necessarily make a dent in the time averages, or to +average trends over custom time frames. + +Monitoring for pressure thresholds +================================== + +Users can register triggers and use poll() to be woken up when resource +pressure exceeds certain thresholds. + +A trigger describes the maximum cumulative stall time over a specific +time window, e.g. 100ms of total stall time within any 500ms window to +generate a wakeup event. + +To register a trigger user has to open psi interface file under +/proc/pressure/ representing the resource to be monitored and write the +desired threshold and time window. The open file descriptor should be +used to wait for trigger events using select(), poll() or epoll(). +The following format is used: + +