Bisecting kernel errors

Today I tried to run the latest kernel from Linus on VMware Fusion. The new kernel got stuck at "Booting" and showed nothing more.

You could check the previous commits by hand, using git log and so on, but searching manually takes a lot of time. Git supports a bisect command that guides you through the process. From the manual:

git-bisect: Find by binary search the change that introduced a bug.

So first you'll start the bisection process using:

git bisect start  

Next you'll define the current version as bad:

git bisect bad  

Next, find an earlier version that works, for example the v3.18 tag. Check out this version, compile it and test it. If it is good, mark it:

git bisect good  
Bisecting: 1116 revisions left to test after this (roughly 10 steps)  
[3a647c1d7ab08145cee4b650f5e797d168846c51] Merge tag 'drivers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Bisect will check out the next commit to test, and you'll go through the compilation and testing again.
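
The "roughly 10 steps" in the output above comes from binary search: every tested commit cuts the remaining range about in half. A rough sketch of that arithmetic follows; it is only an approximation and not necessarily the exact formula git uses internally.

```shell
# Count how many halvings it takes to narrow the range down to one revision.
revisions=1116   # number taken from the bisect output above
steps=0
n=$revisions
while [ "$n" -gt 1 ]; do
    n=$((n / 2))
    steps=$((steps + 1))
done
echo "about $steps steps for $revisions revisions"
```

So even with more than a thousand candidate commits, only around ten compile-and-boot cycles are needed.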

After iterating through the whole bisection process you'll find the bad commit:

root@debian:/usr/local/src/linux# git bisect bad bd809af16e3ab1f8d55b3e2928c47c67e2a865d2  
Bisecting: 0 revisions left to test after this (roughly 0 steps)  
[f5b2831d654167d77da8afbef4d2584897b12d0c] x86: Respect PAT bit when copying pte values between large and normal pages
root@debian:/usr/local/src/linux# git bisect good f5b2831d654167d77da8afbef4d2584897b12d0c  
bd809af16e3ab1f8d55b3e2928c47c67e2a865d2 is the first bad commit  
commit bd809af16e3ab1f8d55b3e2928c47c67e2a865d2  
Author: Juergen Gross <jgross@suse.com>  
Date:   Mon Nov 3 14:02:03 2014 +0100

    x86: Enable PAT to use cache mode translation tables

    Update the translation tables from cache mode to pgprot values
    according to the PAT settings. This enables changing the cache
    attributes of a PAT index in just one place without having to change
    at the users side.

    With this change it is possible to use the same kernel with different
    PAT configurations, e.g. supporting Xen.

    Signed-off-by: Juergen Gross <jgross@suse.com>
    Reviewed-by: Toshi Kani <toshi.kani@hp.com>
    Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
    Cc: stefan.bader@canonical.com
    Cc: xen-devel@lists.xensource.com
    Cc: ville.syrjala@linux.intel.com
    Cc: david.vrabel@citrix.com
    Cc: jbeulich@suse.com
    Cc: plagnioj@jcrosoft.com
    Cc: tomi.valkeinen@ti.com
    Cc: bhelgaas@google.com
    Link: http://lkml.kernel.org/r/1415019724-4317-18-git-send-email-jgross@suse.com
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

:040000 040000 1d3924c3a82c82084633603f825bb330d841ce32 63fe9d9686871c2dc406fc49afc2cc5eddb27607 M    arch

All versions were compiled and tested using the following commands:

cp /boot/config-`uname -r`* .config  
yes ''|make oldconfig  
make localmodconfig  
make menuconfig  
make -j2  
make modules_install install  

To see the bisection history you can use the command:

git bisect log  
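
The log can also be saved to a file and fed back with git bisect replay, which is handy if a commit was marked wrongly or the session has to be resumed later. A minimal sketch in a throwaway repository; the repository contents and the log file location are made up for illustration.

```shell
# Toy demonstration of saving and replaying a bisect log.
repo=$(mktemp -d)
logfile=$(mktemp)
cd "$repo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

for i in 1 2 3 4 5 6 7 8; do
    echo "$i" > counter
    git add counter
    git commit -q -m "commit $i"
done
first=$(git rev-list --max-parents=0 HEAD)

git bisect start
git bisect bad HEAD
git bisect good "$first"
before=$(git rev-parse HEAD)      # the commit bisect picked for testing

git bisect log > "$logfile"       # save the session so far
git bisect reset                  # throw the session away

git bisect replay "$logfile"      # rebuild the session from the saved log
after=$(git rev-parse HEAD)       # bisect is back at the same candidate

git bisect reset
```

The saved file is plain text, so a wrong good or bad line can be removed in an editor before replaying.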

Now that I have the specific patch that isn't working, I can return to the latest commit (the one checked out before bisect start) and revert this specific bad commit:

git bisect reset  
git revert --no-commit bd809af16e3ab1f8d55b3e2928c47c67e2a865d2  

And yes, now the latest kernel is running, without this specific commit.
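
Note that --no-commit only applies the reverse patch to the working tree and the index; HEAD stays where it was until you commit. A small sketch of that behaviour in a throwaway repository, with made-up file names:

```shell
# Toy demonstration of `git revert --no-commit`.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

# Three commits, each adding one file; the middle one plays the bad commit.
for name in a b c; do
    echo "$name" > "$name.txt"
    git add "$name.txt"
    git commit -q -m "add $name"
done

head_before=$(git rev-parse HEAD)
bad=$(git rev-parse HEAD~1)

git revert --no-commit "$bad"

head_after=$(git rev-parse HEAD)            # unchanged: nothing was committed
staged=$(git diff --cached --name-only)     # the revert is staged in the index
echo "staged by the revert: $staged"
```

From here, git commit records the revert, while git revert --abort drops it again.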