path: root/Documentation/vm
diff options
authorPatrick McHardy <kaber@trash.net>2013-03-31 18:10:34 +0200
committerPatrick McHardy <kaber@trash.net>2013-03-31 18:10:34 +0200
commit70711d223510ba1773cfe1d7770a56141c815ff8 (patch)
tree4a71f38a3a554ddecaa31b7d8c6bc49b7d1705b4 /Documentation/vm
parentd53b4ed072d9779cdf53582c46436dec06d0961f (diff)
parent19f949f52599ba7c3f67a5897ac6be14bfcb1200 (diff)
Merge tag 'v3.8' of /home/kaber/src/repos/linux
Linux 3.8 Signed-off-by: Patrick McHardy <kaber@trash.net> Conflicts: include/linux/Kbuild include/linux/netlink.h
Diffstat (limited to 'Documentation/vm')
4 files changed, 33 insertions, 16 deletions
diff --git a/Documentation/vm/frontswap.txt b/Documentation/vm/frontswap.txt
index 37067cf455f..c71a019be60 100644
--- a/Documentation/vm/frontswap.txt
+++ b/Documentation/vm/frontswap.txt
@@ -25,7 +25,7 @@ with the specified swap device number (aka "type"). A "store" will
copy the page to transcendent memory and associate it with the type and
offset associated with the page. A "load" will copy the page, if found,
from transcendent memory into kernel memory, but will NOT remove the page
-from from transcendent memory. An "invalidate_page" will remove the page
+from transcendent memory. An "invalidate_page" will remove the page
from transcendent memory and an "invalidate_area" will remove ALL pages
associated with the swap type (e.g., like swapoff) and notify the "device"
to refuse further stores with that swap type.
@@ -99,7 +99,7 @@ server configured with a large amount of RAM... without pre-configuring
how much of the RAM is available for each of the clients!
In the virtual case, the whole point of virtualization is to statistically
-multiplex physical resources acrosst the varying demands of multiple
+multiplex physical resources across the varying demands of multiple
virtual machines. This is really hard to do with RAM and efforts to do
it well with no kernel changes have essentially failed (except in some
well-publicized special-case workloads).
@@ -193,7 +193,7 @@ faster.
or maybe swap-over-nbd/NFS)?
No. First, the existing swap subsystem doesn't allow for any kind of
-swap hierarchy. Perhaps it could be rewritten to accomodate a hierarchy,
+swap hierarchy. Perhaps it could be rewritten to accommodate a hierarchy,
but this would require fairly drastic changes. Even if it were
rewritten, the existing swap subsystem uses the block I/O layer which
assumes a swap device is fixed size and any page in it is linearly
diff --git a/Documentation/vm/hugetlbpage.txt b/Documentation/vm/hugetlbpage.txt
index f8551b3879f..4ac359b7aa1 100644
--- a/Documentation/vm/hugetlbpage.txt
+++ b/Documentation/vm/hugetlbpage.txt
@@ -299,11 +299,17 @@ map_hugetlb.c.
- * hugepage-shm: see Documentation/vm/hugepage-shm.c
+ * map_hugetlb: see tools/testing/selftests/vm/map_hugetlb.c
- * hugepage-mmap: see Documentation/vm/hugepage-mmap.c
+ * hugepage-shm: see tools/testing/selftests/vm/hugepage-shm.c
+ */
+ * hugepage-mmap: see tools/testing/selftests/vm/hugepage-mmap.c
diff --git a/Documentation/vm/transhuge.txt b/Documentation/vm/transhuge.txt
index f734bb2a78d..8785fb87d9c 100644
--- a/Documentation/vm/transhuge.txt
+++ b/Documentation/vm/transhuge.txt
@@ -116,6 +116,13 @@ echo always >/sys/kernel/mm/transparent_hugepage/defrag
echo madvise >/sys/kernel/mm/transparent_hugepage/defrag
echo never >/sys/kernel/mm/transparent_hugepage/defrag
+By default kernel tries to use huge zero page on read page fault.
+It's possible to disable huge zero page by writing 0 or enable it
+back by writing 1:
+echo 0 >/sys/kernel/mm/transparent_hugepage/khugepaged/use_zero_page
+echo 1 >/sys/kernel/mm/transparent_hugepage/khugepaged/use_zero_page
khugepaged will be automatically started when
transparent_hugepage/enabled is set to "always" or "madvise, and it'll
be automatically shutdown if it's set to "never".
@@ -197,6 +204,14 @@ thp_split is incremented every time a huge page is split into base
pages. This can happen for a variety of reasons but a common
reason is that a huge page is old and is being reclaimed.
+thp_zero_page_alloc is incremented every time a huge zero page is
+ successfully allocated. It includes allocations which where
+ dropped due race with other allocation. Note, it doesn't count
+ every map of the huge zero page, only its allocation.
+thp_zero_page_alloc_failed is incremented if kernel fails to allocate
+ huge zero page and falls back to using small pages.
As the system ages, allocating huge pages may be expensive as the
system uses memory compaction to copy data around memory to free a
huge page for use. There are some counters in /proc/vmstat to help
@@ -276,7 +291,7 @@ unaffected. libhugetlbfs will also work fine as usual.
== Graceful fallback ==
Code walking pagetables but unware about huge pmds can simply call
-split_huge_page_pmd(mm, pmd) where the pmd is the one returned by
+split_huge_page_pmd(vma, addr, pmd) where the pmd is the one returned by
pmd_offset. It's trivial to make the code transparent hugepage aware
by just grepping for "pmd_offset" and adding split_huge_page_pmd where
missing after pmd_offset returns the pmd. Thanks to the graceful
@@ -299,7 +314,7 @@ diff --git a/mm/mremap.c b/mm/mremap.c
return NULL;
pmd = pmd_offset(pud, addr);
-+ split_huge_page_pmd(mm, pmd);
++ split_huge_page_pmd(vma, addr, pmd);
if (pmd_none_or_clear_bad(pmd))
return NULL;
diff --git a/Documentation/vm/unevictable-lru.txt b/Documentation/vm/unevictable-lru.txt
index fa206cccf89..a68db7692ee 100644
--- a/Documentation/vm/unevictable-lru.txt
+++ b/Documentation/vm/unevictable-lru.txt
@@ -197,12 +197,8 @@ the pages are also "rescued" from the unevictable list in the process of
freeing them.
page_evictable() also checks for mlocked pages by testing an additional page
-flag, PG_mlocked (as wrapped by PageMlocked()). If the page is NOT mlocked,
-and a non-NULL VMA is supplied, page_evictable() will check whether the VMA is
-VM_LOCKED via is_mlocked_vma(). is_mlocked_vma() will SetPageMlocked() and
-update the appropriate statistics if the vma is VM_LOCKED. This method allows
-efficient "culling" of pages in the fault path that are being faulted in to
+flag, PG_mlocked (as wrapped by PageMlocked()), which is set when a page is
+faulted into a VM_LOCKED vma, or found in a vma being VM_LOCKED.
@@ -371,8 +367,8 @@ mlock_fixup() filters several classes of "special" VMAs:
mlock_fixup() will call make_pages_present() in the hugetlbfs VMA range to
allocate the huge pages and populate the ptes.
-3) VMAs with VM_DONTEXPAND or VM_RESERVED are generally userspace mappings of
- kernel pages, such as the VDSO page, relay channel pages, etc. These pages
+3) VMAs with VM_DONTEXPAND are generally userspace mappings of kernel pages,
+ such as the VDSO page, relay channel pages, etc. These pages
are inherently unevictable and are not managed on the LRU lists.
mlock_fixup() treats these VMAs the same as hugetlbfs VMAs. It calls
make_pages_present() to populate the ptes.
@@ -651,7 +647,7 @@ PAGE RECLAIM IN shrink_*_list()
shrink_active_list() culls any obviously unevictable pages - i.e.
-!page_evictable(page, NULL) - diverting these to the unevictable list.
+!page_evictable(page) - diverting these to the unevictable list.
However, shrink_active_list() only sees unevictable pages that made it onto the
active/inactive lru lists. Note that these pages do not have PageUnevictable
set - otherwise they would be on the unevictable list and shrink_active_list