dect
/
linux-2.6
Archived
13
0
Fork 0
Commit Graph

349529 Commits

Author SHA1 Message Date
H. Peter Anvin 686966d881 x86, doc: Add a bootloader ID for OVMF
OVMF (an implementation of UEFI based on TianoCore used in virtual
environments) now has the ability to boot Linux natively; this is used
for "qemu -kernel" and similar things in a UEFI environment.

Accordingly, assign it a bootloader ID.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
2013-02-08 09:26:57 -08:00
Kees Cook e575a86fdc x86: Do not leak kernel page mapping locations
Without this patch, it is trivial to determine kernel page
mappings by examining the error code reported to dmesg[1].
Instead, declare the entire kernel memory space as a violation
of a present page.

Additionally, since show_unhandled_signals is enabled by
default, switch branch hinting to the more realistic
expectation, and unobfuscate the setting of the PF_PROT bit to
improve readability.

[1] http://vulnfactory.org/blog/2013/02/06/a-linux-memory-trick/

Reported-by: Dan Rosenberg <dan.j.rosenberg@gmail.com>
Suggested-by: Brad Spengler <spender@grsecurity.net>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20130207174413.GA12485@www.outflux.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-02-07 19:57:44 +01:00
H. Peter Anvin bb9b1a834f Retract MCE-specific UAPI exports which are unused and shouldn't be
used.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJQ7XZeAAoJEBLB8Bhh3lVKzAAQAKJPmNM4jK3o7QixkySFOKAN
 GKDmciwBCSLDvhDU5bXSKEzhxaWq+/Wp8aTy0u28TUtIq+Dnya8+9xAMZqe56lNv
 slBNk1N2IIJBqZoul1FCY7MyBnss19nh2h/jN9PChcY9dGpgkRnKc3rkjjy4UJ10
 YibUm/wLuzNGCskt6nGO4uVSDXyv5W6PCnpOb6IW5KFywHE0ojsVwZQM5Vnakjrd
 vsyicHHaIXSnRtyDWGZSX4JxY7HDK7+v03OHNvO3FtFzNjTJ8r1qU7LVbC1yBPP6
 uFN+piBUYn6q6Gsy67H1FcufGTz0IzFwyx/akW1MkgdN4FggrLiFf0a10Fh0Dx3s
 jd2tuCv72rfOQUZAYg9kobHk1NecCZt0Wt3NWul3WjZHZC35nFgznnUiznAPhoCI
 /sBKE2VQt0CejigNTIEpaToc/kHuQy3RzmMphknROEehXdm79TqdirNNHKrDkYug
 U9SV4nsOaRFcSZDPXK52xFIEb9Uc90zSWqNblhBRXt7sBC34EqZk95jYDgOrZxHQ
 v8iNieYgOezXzIyyUAHC7dlIUbkHlBsPG+8eEoWFcvOBI3UbGvOv3abMO18yBZqr
 28An+v8pMsT9Q5sUwHFwvblD/vDaj/NIO0OP+y40Z96KnXz8Cm4mpkzBvBdVVxUs
 cmaDId47QRqdvIW2mBZZ
 =2Zr4
 -----END PGP SIGNATURE-----

Merge tag 'ras_for_3.8' into x86/urgent

Retract MCE-specific UAPI exports which are unused and shouldn't be
used.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-02-06 14:18:53 -08:00
Linus Torvalds fe547d7714 Merge branch 'fix-max-write' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm
Pull dlm fix from David Teigland:
 "Thanks to Jana who reported the problem and was able to test this fix
  so quickly."

This fixes an incorrect size check that triggered for CONFIG_COMPAT
whether the code was actually doing compat or not.  The incorrect write
size check broke userland (clvmd) when maximum resource name lengths are
used.

* 'fix-max-write' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
  dlm: check the write size from user
2013-02-05 20:50:11 +11:00
Linus Torvalds 3296944e29 Merge branch 'akpm' (Andrew's patch-bomb)
Merge mix fixes from Andrew Morton.

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (12 commits)
  drivers/rtc/rtc-pl031.c: fix the missing operation on enable
  drivers/rtc/rtc-isl1208.c: call rtc_update_irq() from the alarm irq handler
  samples/seccomp: be less stupid about cross compiling
  checkpatch: fix $Float creation of match variables
  memcg: fix typo in kmemcg cache walk macro
  mm: fix wrong comments about anon_vma lock
  MAINTAINERS: update avr32 web ressources
  mm/hugetlb: set PTE as huge in hugetlb_change_protection and remove_migration_pte
  drivers/rtc/rtc-vt8500.c: fix year field in vt8500_rtc_set_time()
  tools/vm: add .gitignore to ignore built binaries
  thp: avoid dumping huge zero page
  nilfs2: fix fix very long mount time issue
2013-02-05 20:38:59 +11:00
Haojian Zhuang e7e034e18a drivers/rtc/rtc-pl031.c: fix the missing operation on enable
The RTC control register should be enabled in the process of
initializing.

Without this patch, I failed to enable RTC in Hisilicon Hi3620 SoC.  The
register mapping section in RTC is always read as zero.  So I doubt that
ST guys may already enable this register in bootloader.  So they won't
meet this issue.

Signed-off-by: Haojian Zhuang <haojian.zhuang@linaro.org>
Cc: Srinidhi Kasagar <srinidhi.kasagar@stericsson.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-05 20:38:49 +11:00
Jan Luebbe 72fca4a4b3 drivers/rtc/rtc-isl1208.c: call rtc_update_irq() from the alarm irq handler
Previously the alarm event was not propagated into the RTC subsystem.
By adding a call to rtc_update_irq, this fixes a timeout problem with
the hwclock utility.

Signed-off-by: Jan Luebbe <jlu@pengutronix.de>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-05 20:38:49 +11:00
Arnd Bergmann 275aaa6833 samples/seccomp: be less stupid about cross compiling
The seccomp filters are currently built for the build host, not for the
machine that they are going to run on, but they are also built for with
the -m32 flag if the kernel is built for a 32 bit machine, both of which
seems rather odd.

It broke allyesconfig on my machine, which is x86-64, but building for
32 bit ARM, with this error message:

  In file included from /usr/include/stdio.h:28:0,
                   from samples/seccomp/bpf-fancy.c:15:
  /usr/include/features.h:324:26: fatal error: bits/predefs.h: No such file or directory

because there are no 32 bit libc headers installed on this machine.  We
should really be building all the samples for the target machine rather
than the build host, but since the infrastructure for that appears to be
missing right now, let's be a little bit smarter and not pass the '-m32'
flag to the HOSTCC when cross- compiling.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: James Morris <james.l.morris@oracle.com>
Acked-by: Will Drewry <wad@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-05 20:38:49 +11:00
Joe Perches 326b1ffc13 checkpatch: fix $Float creation of match variables
Commit 74349bcced ("checkpatch: add support for floating point
constants") added an unnecessary match variable that caused tests that
used a $Constant or $LvalOrFunc to have one too many matches.

This causes problems with usleep_range, min/max and other extended
tests.

Avoid using match variables in $Float.
Avoid using match variables in $Assignment too.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-05 20:38:48 +11:00
Glauber Costa 91c777d867 memcg: fix typo in kmemcg cache walk macro
The macro for_each_memcg_cache_index contains a silly yet potentially
deadly mistake.  Although the macro parameter is _idx, the loop tests
are done over i, not _idx.

This hasn't generated any problems so far, because all users use i as a
loop index.  However, while playing with an extension of the code I
ended using another loop index and the compiler was quick to complain.

Unfortunately, this is not the kind of thing that testing reveals =(

Signed-off-by: Glauber Costa <glommer@parallels.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-05 20:38:48 +11:00
Yuanhan Liu 631b0cfdbd mm: fix wrong comments about anon_vma lock
We use rwsem since commit 5a505085f0 ("mm/rmap: Convert the struct
anon_vma::mutex to an rwsem").  And most of comments are converted to
the new rwsem lock; while just 2 more missed from:

	 $ git grep 'anon_vma->mutex'

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-05 20:38:48 +11:00
Matthias Brugger 249d9d9d6b MAINTAINERS: update avr32 web ressources
Web resource http://avr32linux.org/ is no longer available.  We add the
mirror of the web page foud at http://mirror.egtvedt.no/avr32linux.org/.

Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-05 20:38:47 +11:00
Tony Lu be7517d6ab mm/hugetlb: set PTE as huge in hugetlb_change_protection and remove_migration_pte
When setting a huge PTE, besides calling pte_mkhuge(), we also need to
call arch_make_huge_pte(), which we indeed do in make_huge_pte(), but we
forget to do in hugetlb_change_protection() and remove_migration_pte().

Signed-off-by: Zhigang Lu <zlu@tilera.com>
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Hillf Danton <dhillf@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-05 20:38:47 +11:00
Tony Prisk 3bdf8cd322 drivers/rtc/rtc-vt8500.c: fix year field in vt8500_rtc_set_time()
The year field is incorrectly masked when setting the date.  If the year
is beyond 2099, the year field will be incorrectly updated in hardware.

This patch masks the year field correctly.

Signed-off-by: Edgar Toernig <froese@gmx.de>
Signed-off-by: Tony Prisk <linux@prisktech.co.nz>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-05 20:38:47 +11:00
Joonsoo Kim 1363b563a7 tools/vm: add .gitignore to ignore built binaries
There is no .gitignore in tools/vm, so 'git status' always show built
binaries.  To ignore this, add .gitignore.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-05 20:38:46 +11:00
Kirill A. Shutemov 85facf2570 thp: avoid dumping huge zero page
No reason to preserve the huge zero page in core dumps.

Reported-by: Michel Lespinasse <walken@google.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Michel Lespinasse <walken@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-05 20:38:46 +11:00
Vyacheslav Dubeyko a9bae18954 nilfs2: fix fix very long mount time issue
There exists a situation when GC can work in background alone without
any other filesystem activity during significant time.

The nilfs_clean_segments() method calls nilfs_segctor_construct() that
updates superblocks in the case of NILFS_SC_SUPER_ROOT and
THE_NILFS_DISCONTINUED flags are set.  But when GC is working alone the
nilfs_clean_segments() is called with unset THE_NILFS_DISCONTINUED flag.
As a result, the update of superblocks doesn't occurred all this time
and in the case of SPOR superblocks keep very old values of last super
root placement.

SYMPTOMS:

Trying to mount a NILFS2 volume after SPOR in such environment ends with
very long mounting time (it can achieve about several hours in some
cases).

REPRODUCING PATH:

1. It needs to use external USB HDD, disable automount and doesn't
   make any additional filesystem activity on the NILFS2 volume.

2. Generate temporary file with size about 100 - 500 GB (for example,
   dd if=/dev/zero of=<file_name> bs=1073741824 count=200).  The size of
   file defines duration of GC working.

3. Then it needs to delete file.

4. Start GC manually by means of command "nilfs-clean -p 0".  When you
   start GC by means of such way then, at the end, superblocks is updated
   by once.  So, for simulation of SPOR, it needs to wait sometime (15 -
   40 minutes) and simply switch off USB HDD manually.

5. Switch on USB HDD again and try to mount NILFS2 volume.  As a
   result, NILFS2 volume will mount during very long time.

REPRODUCIBILITY: 100%

FIX:

This patch adds checking that superblocks need to update and set
THE_NILFS_DISCONTINUED flag before nilfs_clean_segments() call.

Reported-by: Sergey Alexandrov <splavgm@gmail.com>
Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
Tested-by: Vyacheslav Dubeyko <slava@dubeyko.com>
Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Tested-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-05 20:38:46 +11:00
David Teigland d4b0bcf32b dlm: check the write size from user
Return EINVAL from write if the size is larger than
allowed.  Do this before allocating kernel memory for
the bogus size, which could lead to OOM.

Reported-by: Sasha Levin <levinsasha928@gmail.com>
Tested-by: Jana Saout <jana@saout.de>
Signed-off-by: David Teigland <teigland@redhat.com>
2013-02-04 15:31:22 -06:00
Linus Torvalds 3f4e5aacf7 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Three small fixlets"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/intel/cacheinfo: Shut up annoying warning
  x86, doc: Boot protocol 2.12 is in 3.8
  x86-64: Replace left over sti/cli in ia32 audit exit code
2013-02-05 07:59:44 +11:00
Linus Torvalds 2a6f79e8c1 Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
 "Three small fixlets"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/debug: Fix format string for 32-bit platforms
  sched: Fix warning in kernel/sched/fair.c
  sched/rt: Use root_domain of rt_rq not current processor
2013-02-05 07:58:24 +11:00
Linus Torvalds 51c1abb95f Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Three fixlets and two small (and low risk) hw-enablement changes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf: Fix event group context move
  x86/perf: Add IvyBridge EP support
  perf/x86: Fix P6 driver section warning
  arch/x86/tools/insn_sanity.c: Identify source of messages
  perf/x86: Enable Intel Lincroft/Penwell/Cloverview Atom support
2013-02-05 07:57:09 +11:00
Linus Torvalds 5dc31b5767 Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull two small RCU fixlets from Ingo Molnar.

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  rcu: Make rcu_nocb_poll an early_param instead of module_param
  rcu: Prevent soft-lockup complaints about no-CBs CPUs
2013-02-05 07:56:07 +11:00
Linus Torvalds d92dd35920 A small set of simple regression and build fixes for 3.8:
- Fix a warning introduced in ONFI NAND probe
  - Fix commandline partition parsing
  - Require BITREVERSE for DiskOnChip G3 driver
  - Fix build failure for davinci_nand as module
  - Bump NFLASH_READY_RETRIES for bcm47xxnflash
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.13 (GNU/Linux)
 
 iEYEABECAAYFAlEPq8kACgkQdwG7hYl686NrCwCdH28jbk1MKSrge2Vf4ZeFvnbM
 AvUAoMvuEYrZbfAUYN7jf0t4FVvnJw97
 =hr+/
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-20130204' of git://git.infradead.org/linux-mtd

Pull MTD fixes from David Woodhouse:
 "A small set of simple regression and build fixes for 3.8:
   - Fix a warning introduced in ONFI NAND probe
   - Fix commandline partition parsing
   - Require BITREVERSE for DiskOnChip G3 driver
   - Fix build failure for davinci_nand as module
   - Bump NFLASH_READY_RETRIES for bcm47xxnflash"

* tag 'for-linus-20130204' of git://git.infradead.org/linux-mtd:
  mtd: nand: onfi don't WARN if we are in 16 bits mode
  mtd: physmap_of: fix cmdline partition method w/o linux, mtd-name
  mtd: docg3 fix missing bitreverse lib
  mtd: davinci_nand: fix modular build with CONFIG_OF=y
  mtd: bcm47xxnflash: increase NFLASH_READY_RETRIES
2013-02-05 07:54:11 +11:00
Borislav Petkov f76e39c531 x86/intel/cacheinfo: Shut up annoying warning
I've been getting the following warning when doing randbuilds
since forever. Now it finally pissed me off just the perfect
amount so that I can fix it.

  arch/x86/kernel/cpu/intel_cacheinfo.c:489:27: warning: ‘cache_disable_0’ defined but not used [-Wunused-variable]
  arch/x86/kernel/cpu/intel_cacheinfo.c:491:27: warning: ‘cache_disable_1’ defined but not used [-Wunused-variable] arch/x86/kernel/cpu/intel_cacheinfo.c:524:27: warning: ‘subcaches’ defined but not used [-Wunused-variable]

It happens because in randconfigs where CONFIG_SYSFS is not set,
the whole sysfs-interface to L3 cache index disabling is
remaining unused and gcc correctly warns about it. Make it
optional, depending on CONFIG_SYSFS too, as is the case with
other sysfs-related machinery in this file.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Link: http://lkml.kernel.org/r/1359969195-27362-1-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-02-04 11:29:52 +01:00
Linus Torvalds 6edacf05c8 Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull powerpc update from Benjamin Herrenschmidt:
 "Just so that you don't get too bored on your Island here's a patch for
  3.8 fixing a nasty bug that affects the new 64T support that was
  merged in 3.7.  Please apply whenever you have a chance (and an
  internet connection!)"

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/mm: Fix hash computation function
2013-02-04 16:58:41 +11:00
Linus Torvalds f19637e743 Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull radeon fixes from Dave Airlie:
 "I got these late last week, the main chunks of these fix a rendering
  regression since 3.7, and the settle ones all fix the issue where we
  don't wait long enough for the memory controller to settle after
  turning it off which causes bad memory reads, they all fix real users
  bugs, and most of them are destined for stable.

  Can't remember if you had net connection on that island :-)"

I don't know if the "two tin-cans and a string" thing here on "that
island" can really be considered internet, but I guess I can pull
things.  Barely.

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/radeon: switch back to the CP ring for VM PT updates
  drm/radeon: prevent crash in the ring space allocation
  drm/radeon: Calling object_unrefer() when creating fb failure
  drm/radeon/r5xx-r7xx: wait for the MC to settle after MC blackout
  drm/radeon/evergreen+: wait for the MC to settle after MC blackout
  drm/radeon: protect against div by 0 in backend setup
  drm/radeon: fix backend map setup on 1 RB sumo boards
  drm/radeon: add quirk for RV100 board
  drm/radeon: add WAIT_UNTIL to the non-VM safe regs list for cayman/TN
  drm/radeon: fix MC blackout on evergreen+
2013-02-04 16:51:53 +11:00
Aneesh Kumar K.V eda8eebdd1 powerpc/mm: Fix hash computation function
The ASM version of hash computation function was truncating the upper bit.
Make the ASM version similar to hpt_hash function. Remove masking vsid bits.
Without this patch, we observed hang during bootup due to not satisfying page
fault request correctly. The fault handler used wrong hash values to update
the HPTE. Hence we kept looping with page fault.

hash_page(ea=000001003e260008, access=203, trap=300 ip=3fff91787134 dsisr 42000000
The computed value of hash 000000000f22f390
update: avpnv=4003e46054003e00, hash=000000000722f390, f=80000006, psize: 2 ...

BenH: The over-masking has been there for ever but only hurts with the
new 64T support introduced in 3.7

Reported-by: Mike Qiu <qiudayu@linux.vnet.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Tested-by: Mike Qiu <qiudayu@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: <stable@vger.kernel.org> [v3.7]
2013-02-04 15:15:08 +11:00
Jiri Olsa 0231bb5336 perf: Fix event group context move
When we have group with mixed events (hw/sw) we want to end up
with group leader being in hw context. So if group leader is
initialy sw event, we move all the events under hw context.

The move is done for each event by removing it from its context
and adding it back into proper one. As a part of the removal the
event is automatically disabled, which is not what we want at
this stage of creating groups.

The fix is to initialize event state after removal from sw
context.

This fix resulted from the following discussion:

  http://thread.gmane.org/gmane.linux.kernel.perf.user/1144

Reported-by: Andreas Hollmann <hollmann@in.tum.de>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vince@deater.net>
Link: http://lkml.kernel.org/r/1359714225-4231-1-git-send-email-jolsa@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-02-03 12:01:29 +01:00
Linus Torvalds 8b31849a11 Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull scsi target fixes from Nicholas Bellinger:
 "Here's the current set of v3.8-rc fixes in the target-pending.git
  queue.  Apologies in advance for these missing the -rc6 release, and
  having to be destined for -rc7 code.

  The majority of these patches are regression bugfixes specific to
  v3.8-rc code changes, namely the zero-length CDB handling breakage
  after the sense_reason_t conversion, and preventing configfs port
  linking for unconfigured devices after the recent struct
  se_subsystem_dev removal.  These is also one (the divide by zero bug
  for unconfigured devices) that is CC'ed to stable."

* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  target: Fix divide by zero bug in fabric_max_sectors for unconfigured devices
  target: Fix regression allowing unconfigured devices to fabric port link
  tcm_vhost: fix pr_err on early kick
  target: Fix zero-length READ_CAPACITY_16 regression
  target: Fix zero-length MODE_SENSE regression
  target: Fix zero-length INQUIRY additional sense code regression
2013-02-02 13:37:56 +11:00
YOSHIFUJI Hideaki 7810cc1e77 digsig: Fix memory leakage in digsig_verify_rsa()
digsig_verify_rsa() does not free kmalloc'ed buffer returned by
mpi_get_buffer().

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: James Morris <james.l.morris@oracle.com>
2013-02-01 15:59:33 +11:00
H. Peter Anvin 972f7c8322 x86, doc: Boot protocol 2.12 is in 3.8
The boot protocol 2.12 changes were pulled for 3.8, so update the
documentation accordingly.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-31 20:23:49 -08:00
H. Peter Anvin b5831174f9 Linux 3.8-rc6
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.13 (GNU/Linux)
 
 iQEcBAABAgAGBQJRCxWdAAoJEHm+PkMAQRiG3bAH/28D2NbRLIDo6QIzk3seCCh3
 A6vqDWbX671JRa0RO38DrWhqv6L/EisvjNZMVxBNaN635+3+yC7wsKziXYxA4+AL
 Ef9JISiwXykRWwrC8Q34YBWfWgeFTmaau71IRv45x2OTwiijd6cvjXhLDrOhHEGL
 rdAGJxIdBOfuASw1zpn9PxzfAtg1j/FGP03B4yy+JFhYzuDp3pvnDIysAnXefLml
 UheuBELdNf8Jx7NtW80PTa+TQtruWPC8otoiJevV4TrAWmAOctJiM1izL+VOZqqD
 I/v5aGf9mCN+cxLq06imwgEHWmP1I7yDdjjTHrr4wwIMws1vg8PNsJqG3AsPcwM=
 =dR8S
 -----END PGP SIGNATURE-----

Merge tag 'v3.8-rc6' into x86/urgent

Linux 3.8-rc6

Merged in order to add a documentation update versus new code in
upstream.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-31 20:22:57 -08:00
Linus Torvalds 88b62b915b Linux 3.8-rc6 2013-02-01 12:08:14 +11:00
Linus Torvalds cc6c954a07 A fix for stacked dm thin devices and a fix for the new dm WRITE SAME
support.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJRCn+IAAoJEK2W1qbAHj1naQEP/2eXMOslRyws7M6CcsEgpEK9
 N2L2hf6bD3xF/04ZSLbHFI6hPe9wDXSL9Vxd+DRLYTSnc0E9WYBXHmE6Eb0L0xK8
 m0Iubk/hi7mk6mnMJtpTFT5pazBTPhVz0nXOijguh5U6PW0xL+4ypXe9nrH2jtW0
 DvEHFDIPbKcqwplm8nvo/QJ5O3YNQaMifKUtpXF/JWGlCYP4vPk0dJVg9ATbscEV
 Fg3kefoJzZM09q3Uvo01wigbj+wRkpBK9+CiyW6XcE0lkOAnFmpvyYerIoAHAK37
 Rsw5J4aMPA9U8mggBEtlHBWa0q5utZafHM11lT2ZeFGCXkdn+TSWni6O7ov54xPP
 Cd7jx+uNpe/OuLT5YjbCg2IMXgJs+zIZMSeqSj3SrywE0a0EQHECWiXaFmMmrCCJ
 TgZtmp/HS1UsdoiHA3v3ZX3AaX4W+mggYp/5md9P1vHyYS9uTlgSVplhwtVgsL23
 EsDxNNxODSIFMAMnrXxAV+NBPiQRY42K22hK/RrnWew9roAQHxroIvQDmzzm5ZRL
 BqCFW3w/x2loJOZZ6NH/J8IUEoF9RhCK1tOGVjFuAVn30srt3zXb4pOzYeydykT7
 m04HaGO7rCBJI75XdVDhm6ozOvV/GhXF2fJOt4qyoX/X6M8YN5i0jfwC3rhKeuOe
 U9fyyYoQV37EWIRI9K5Q
 =qtaF
 -----END PGP SIGNATURE-----

Merge tag 'dm-3.8-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm

Pull more device-mapper fixes from Alasdair G Kergon:
 "A fix for stacked dm thin devices and a fix for the new dm WRITE SAME
  support."

* tag 'dm-3.8-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm:
  dm: fix write same requests counting
  dm thin: fix queue limits stacking
2013-02-01 12:04:22 +11:00
Dave Airlie 089c71a7c3 Merge branch 'drm-fixes-3.8' of git://people.freedesktop.org/~agd5f/linux
Alex writes:
"A few more radeon fixes for 3.8.  Mostly small stuff.  The big
change is disabling the use of the DMA ring for VM PT updates.  This
reverts back to the 3.7 behavior.  Problem is we can get huge PT
updates in certain cases that are too big for the DMA ring.  I've
got patches to use an IB for this so I can re-enable the use of the
DMA ring for VM PT updates in 3.9.  This request also includes the
patches from the last pull request I sent on Monday in case you haven't
pulled them yet."

* 'drm-fixes-3.8' of git://people.freedesktop.org/~agd5f/linux:
  drm/radeon: switch back to the CP ring for VM PT updates
  drm/radeon: prevent crash in the ring space allocation
  drm/radeon: Calling object_unrefer() when creating fb failure
  drm/radeon/r5xx-r7xx: wait for the MC to settle after MC blackout
  drm/radeon/evergreen+: wait for the MC to settle after MC blackout
  drm/radeon: protect against div by 0 in backend setup
  drm/radeon: fix backend map setup on 1 RB sumo boards
  drm/radeon: add quirk for RV100 board
  drm/radeon: add WAIT_UNTIL to the non-VM safe regs list for cayman/TN
  drm/radeon: fix MC blackout on evergreen+
2013-02-01 10:51:02 +11:00
Nicholas Bellinger 7a3cf6ca1a target: Fix divide by zero bug in fabric_max_sectors for unconfigured devices
This patch fixes a possible divide by zero bug when the fabric_max_sectors
device attribute is written and backend se_device failed to be successfully
configured -> enabled.

Go ahead and use block_size=512 within se_dev_set_fabric_max_sectors()
in the event of a target_configure_device() failure case, as no valid
dev->dev_attrib.block_size value will have been setup yet.

Cc: stable@vger.kernel.org
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2013-01-31 15:22:53 -08:00
Nicholas Bellinger faa06ab9ae target: Fix regression allowing unconfigured devices to fabric port link
This patch fixes a v3.8-rc1 regression bug where an unconfigured se_device
was incorrectly allowed to perform a fabric port-link.  This bug was
introduced in commit:

  commit 0fd97ccf45
  Author: Christoph Hellwig <hch@infradead.org>
  Date:   Mon Oct 8 00:03:19 2012 -0400

      target: kill struct se_subsystem_dev

which ended up dropping the original se_subsystem_dev->se_dev_ptr check
preventing this from happening with pre commit 0fd97ccf code.

Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2013-01-31 15:13:23 -08:00
Linus Torvalds cf5425bfcd Merge branch 'for-3.8/upstream-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
PullHID fixes from Jiri Kosina:

 - fix i2c-hid and hidraw interaction, by Benjamin Tissoires

 - a quirk to make a particular device (Formosa IR receiver) work
   properly, by Nicholas Santos

* 'for-3.8/upstream-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
  HID: i2c-hid: fix i2c_hid_output_raw_report
  HID: usbhid: quirk for Formosa IR receiver
  HID: remove x bit from sensor doc
2013-02-01 08:44:59 +11:00
Linus Torvalds bf6c8a8148 NFS client bugfixe for Linux 3.8
- Error reporting in nfs_xdev_mount incorrectly maps all errors to ENOMEM
 - Fix an NFSv4 refcounting issue
 - Fix a mount failure when the server reboots during NFSv4 trunking discovery
 - NFSv4.1 mounts may need to run the lease recovery thread.
 - Don't silently fail setattr() requests on mountpoints
 - Fix a SUNRPC socket/transport livelock and priority queue issue
 - We must handle NFS4ERR_DELAY when resetting the NFSv4.1 session.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.13 (GNU/Linux)
 
 iQIcBAABAgAGBQJRCpS4AAoJEGcL54qWCgDyqucP/2CTv5leu+X5/0PXBOykAIHg
 8oEsTEz7/4IvIxTXzuHDYirMnm/mulfGF6NrdPqfvxpAHqRfVBfLFocfNLMVhQci
 97RmBfEEGM22AToYUubML5bIxr0QllV4s9Vmyh/zGDan52y7zNNlZX+v6aLjZbJB
 Fbolihpcch6lhQEUNAzK0B0ddimDl9lazx/WTmMOD/JrwOqzA4FJC+YxBe88nfzQ
 c6sYyEptBaSirbCOlueqGpv8skB1CLpFJXguXToPXFxpWed6uoGrIwLO7MLUdFpJ
 Xw+j8cuv/wjyYJGVKjhW7kXtwK8T7+u4bT2L883R01XYXr8XfkkLON0dgG1X/unk
 80mLzCO1+qRdoDSQ4b/V4B0nScPRCJuoZpftjCi2uhKewNcxQPMZ2V5/D7pO3uyE
 NdhxByB8D86JfNrIcBcRaxfuiQsurQBDvsDNmWPBZSOmH/dmHqTQGLcIe6N94A0B
 c7KFtXrN2MzOl8S68dUpbhftObq9X0oK2oxFFLWRQoqjrFtiDLU5JilV0bEH2CMo
 gJX7CRrPoJ2wmNToKdPJRWBYDmtMMIThq3vIpCj1FFzX17r5grJpqCmp/TViu/ER
 r8rmJzni+nmHaO1NLJMTCHzLJ8soiKyBq8PZKAbipTJxy+TXI/69jnWzpIQ/pbM/
 JN+0tiCcqmUmXsO+hrCP
 =lWFV
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-3.8-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:

 - Error reporting in nfs_xdev_mount incorrectly maps all errors to
   ENOMEM

 - Fix an NFSv4 refcounting issue

 - Fix a mount failure when the server reboots during NFSv4 trunking
   discovery

 - NFSv4.1 mounts may need to run the lease recovery thread.

 - Don't silently fail setattr() requests on mountpoints

 - Fix a SUNRPC socket/transport livelock and priority queue issue

 - We must handle NFS4ERR_DELAY when resetting the NFSv4.1 session.

* tag 'nfs-for-3.8-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFSv4.1: Handle NFS4ERR_DELAY when resetting the NFSv4.1 session
  SUNRPC: When changing the queue priority, ensure that we change the owner
  NFS: Don't silently fail setattr() requests on mountpoints
  NFSv4.1: Ensure that nfs41_walk_client_list() does start lease recovery
  NFSv4: Fix NFSv4 trunking discovery
  NFSv4: Fix NFSv4 reference counting for trunked sessions
  NFS: Fix error reporting in nfs_xdev_mount
2013-02-01 08:43:52 +11:00
Linus Torvalds aeb8eede8e Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull MIPS updates from Ralf Baechle:
 "A number of fixes all across the MIPS tree.  No area is particularly
  standing out and things have cooled down quite nicely for a release."

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: Function tracer: Fix broken function tracing
  mips: Move __virt_addr_valid() to a place for MIPS 64
  MIPS: Netlogic: Fix UP compilation on XLR
  MIPS: AR71xx: Fix AR71XX_PCI_MEM_SIZE
  MIPS: AR724x: Fix AR724X_PCI_MEM_SIZE
  MIPS: Lantiq: Fix cp0_perfcount_irq mapping
  MIPS: DSP: Fix DSP mask for registers.
  MIPS: Fix build failure by adding definition of pfn_pmd().
  MIPS: Octeon: Fix warning.
  MIPS: delay.c: Check BITS_PER_LONG instead of __SIZEOF_LONG__
  MIPS: PNX833x: Fix comment.
  MIPS: Add struct p_format to union mips_instruction.
  MIPS: Export <asm/break.h>.
  MIPS: BCM47xx: Enable SSB prerequisite SSB_DRIVER_PCICORE.
  MIPS: BCM47xx: Select GPIOLIB for BCMA on bcm47xx platform
  MIPS: vpe.c: Fix null pointer dereference in print arguments.
2013-02-01 08:43:04 +11:00
Alex Deucher 3646e4209f drm/radeon: switch back to the CP ring for VM PT updates
For large VM page table updates, we can sometimes generate
more packets than there is space on the ring.  This happens
more readily with the DMA ring since it is 64K (vs 1M for the
CP).  For now, switch back to the CP.  For the next kernel,
I have a patch to utilize IBs for VM PT updates which
alleviates this problem.

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=58354

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-01-31 16:19:19 -05:00
Alex Deucher fd5d93a001 drm/radeon: prevent crash in the ring space allocation
If the requested number of DWs on the ring is larger than
the size of the ring itself, return an error.

In testing with large VM updates, we've seen crashes when we
try and allocate more space on the ring than the total size
of the ring without checking.

This prevents the crash but for large VM updates or bo moves
of very large buffers, we will need to break the transaction
down into multiple batches.  I have patches to use IBs for
the next kernel.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2013-01-31 16:14:16 -05:00
liu chuansheng f2d68cf4da drm/radeon: Calling object_unrefer() when creating fb failure
When kzalloc() failed in radeon_user_framebuffer_create(), need to
call object_unreference() to match the object_reference().

Signed-off-by: liu chuansheng <chuansheng.liu@intel.com>
Signed-off-by: xueminsu <xuemin.su@intel.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2013-01-31 16:14:15 -05:00
Alex Deucher 39dc9aabd8 drm/radeon/r5xx-r7xx: wait for the MC to settle after MC blackout
Some chips seem to need a little delay after blacking out
the MC before the requests actually stop. Stop DMAR errors
reported by Shuah Khan.

Reported-by: Shuah Khan <shuahkhan@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-01-31 16:13:42 -05:00
Michael S. Tsirkin 71f1e45aa9 tcm_vhost: fix pr_err on early kick
It's OK to get kick before backend is set or after
it is cleared, we can just ignore it.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2013-01-31 13:06:06 -08:00
Benjamin Tissoires c284979aff HID: i2c-hid: fix i2c_hid_output_raw_report
i2c_hid_output_raw_report is used by hidraw to forward set_report requests.
The current implementation of i2c_hid_set_report needs to take the
report_id as an argument. The report_id is stored in the first byte
of the buffer in argument of i2c_hid_output_raw_report.

Not removing the report_id from the given buffer adds this byte 2 times
in the command, leading to a non working command.

Reported-by: Andrew Duggan <aduggan@synaptics.com>
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2013-01-31 17:57:53 +01:00
Al Cooper 58b69401c7 MIPS: Function tracer: Fix broken function tracing
Function tracing is currently broken for all 32 bit MIPS platforms.
When tracing is enabled, the kernel immediately hangs on boot.
This is a result of commit b732d439cb
that changes the kernel/trace/Kconfig file so that is no longer
forces FRAME_POINTER when FUNCTION_TRACING is enabled.

MIPS frame pointers are generally considered to be useless because
they cannot be used to unwind the stack. Unfortunately the MIPS
function tracing code has bugs that are masked by the use of frame
pointers. This commit fixes the bugs so that MIPS frame pointers
don't need to be enabled.

The bugs are a result of the odd calling sequence used to call the trace
routine. This calling sequence is inserted into every traceable function
when the tracing CONFIG option is enabled. This sequence is generated
for 32bit MIPS platforms by the compiler via the "-pg" flag.

Part of the sequence is "addiu sp,sp,-8" in the delay slot after every
call to the trace routine "_mcount" (some legacy thing where 2 arguments
used to be pushed on the stack). The _mcount routine is expected to
adjust the sp by +8 before returning.  So when not disabled, the original
jalr and addiu will be there, so _mcount has to adjust sp.

The problem is that when tracing is disabled for a function, the
"jalr _mcount" instruction is replaced with a nop, but the
"addiu sp,sp,-8" is still executed and the stack pointer is left
trashed. When frame pointers are enabled the problem is masked
because any access to the stack is done through the frame
pointer and the stack pointer is restored from the frame pointer when
the function returns.

This patch writes two nops starting at the address of the "jalr _mcount"
instruction whenever tracing is disabled. This means that the
"addiu sp,sp.-8" will be converted to a nop along with the "jalr".  When
disabled, there will be two nops.

This is SMP safe because the first time this happens is during
ftrace_init() which is before any other processor has been started.
Subsequent calls to enable/disable tracing when other CPUs ARE running
will still be safe because the enable will only change the first nop
to a "jalr" and the disable, while writing 2 nops, will only be changing
the "jalr". This patch also stops using stop_machine() to call the
tracer enable/disable routines and calls them directly because the
routines are SMP safe.

When the kernel first boots we have to be able to handle the gcc
generated jalr, addui sequence until ftrace_init gets a chance to run
and change the sequence. At this point mcount just adjusts the stack
and returns. When ftrace_init runs, we convert the jalr/addui to nops.
Then whenever tracing is enabled we convert the first nop to a "jalr
mcount+8". The mcount+8 entry point skips the stack adjust.

[ralf@linux-mips.org: Folded in  Steven Rostedt's build fix.]

Signed-off-by: Al Cooper <alcooperx@gmail.com>
Cc: rostedt@goodmis.org
Cc: ddaney.cavm@gmail.com
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/4806/
Patchwork: https://patchwork.linux-mips.org/patch/4841/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2013-01-31 15:28:48 +01:00
Alasdair G Kergon fe7af2d3ba dm: fix write same requests counting
When processing write same requests, fix dm to send the configured
number of WRITE SAME requests to the target rather than the number of
discards, which is not always the same.

Device-mapper WRITE SAME support was introduced by commit
23508a96cd ("dm: add WRITE SAME support").

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Acked-by: Mike Snitzer <snitzer@redhat.com>
2013-01-31 14:23:36 +00:00
Steven Rostedt 196897a297 mips: Move __virt_addr_valid() to a place for MIPS 64
Commit d3ce884318 "MIPS: Fix modpost error in modules attepting to use
virt_addr_valid()" moved __virt_addr_valid() from a macro in a header
file to a function in ioremap.c. But ioremap.c is only compiled for MIPS
32, and not for MIPS 64.

When compiling for my yeeloong2, which supposedly supports hibernation,
which compiles kernel/power/snapshot.c which calls virt_addr_valid(), I
got this error:

  LD      init/built-in.o
kernel/built-in.o: In function `memory_bm_free':
snapshot.c:(.text+0x4c9c4): undefined reference to `__virt_addr_valid'
snapshot.c:(.text+0x4ca58): undefined reference to `__virt_addr_valid'
kernel/built-in.o: In function `snapshot_write_next':
(.text+0x4e44c): undefined reference to `__virt_addr_valid'
kernel/built-in.o: In function `snapshot_write_next':
(.text+0x4e890): undefined reference to `__virt_addr_valid'
make[1]: *** [vmlinux] Error 1
make: *** [sub-make] Error 2

I suspect that __virt_addr_valid() is fine for mips 64. I moved it to
mmap.c such that it gets compiled for mips 64 and 32.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/4842/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2013-01-31 15:14:59 +01:00
Mike Snitzer 0f640dca08 dm thin: fix queue limits stacking
thin_io_hints() is blindly copying the queue limits from the thin-pool
which can lead to incorrect limits being set.  The fix here simply
deletes the thin_io_hints() hook which leaves the existing stacking
infrastructure to set the limits correctly.

When a thin-pool uses an MD device for the data device a thin device
from the thin-pool must respect MD's constraints about disallowing a bio
from spanning multiple chunks.  Otherwise we can see problems.  If the raid0
chunksize is 1152K and thin-pool chunksize is 256K I see the following
md/raid0 error (with extra debug tracing added to thin_endio) when
mkfs.xfs is executed against the thin device:

md/raid0:md99: make_request bug: can't convert block across chunks or bigger than 1152k 6688 127
device-mapper: thin: bio sector=2080 err=-5 bi_size=130560 bi_rw=17 bi_vcnt=32 bi_idx=0

This extra DM debugging shows that the failing bio is spanning across
the first and second logical 1152K chunk (sector 2080 + 255 takes the
bio beyond the first chunk's boundary of sector 2304).  So the bio
splitting that DM is doing clearly isn't respecting the MD limits.

max_hw_sectors_kb is 127 for both the thin-pool and thin device
(queue_max_hw_sectors returns 255 so we'll excuse sysfs's lack of
precision).  So this explains why bi_size is 130560.

But the thin device's max_hw_sectors_kb should be 4 (PAGE_SIZE) given
that it doesn't have a .merge function (for bio_add_page to consult
indirectly via dm_merge_bvec) yet the thin-pool does sit above an MD
device that has a compulsory merge_bvec_fn.  This scenario is exactly
why DM must resort to sending single PAGE_SIZE bios to the underlying
layer. Some additional context for this is available in the header for
commit 8cbeb67a ("dm: avoid unsupported spanning of md stripe boundaries").

Long story short, the reason a thin device doesn't properly get
configured to have a max_hw_sectors_kb of 4 (PAGE_SIZE) is that
thin_io_hints() is blindly copying the queue limits from the thin-pool
device directly to the thin device's queue limits.

Fix this by eliminating thin_io_hints.  Doing so is safe because the
block layer's queue limits stacking already enables the upper level thin
device to inherit the thin-pool device's discard and minimum_io_size and
optimal_io_size limits that get set in pool_io_hints.  But avoiding the
queue limits copy allows the thin and thin-pool limits to be different
where it is important, namely max_hw_sectors_kb.

Reported-by: Daniel Browning <db@kavod.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2013-01-31 14:11:14 +00:00