Sunday, March 20, 2011

Energy management in Linux: kernel version

UPDATE here.

Continuing from this post I want to show how the choice of a kernel version can have an important impact on the energy consumed by a computer (in my case, a Lenovo x200s).

I've been working on battery quite often lately and I have noticed that the power consumption can vary quite a bit from kernel to kernel. This was of course a very subjective impression, as the load could vary with the number of Firefox tabs, the task I was doing or even how fast I typed.

The other day, however, after updating to kernel 2.6.38, I realized that when idling the computer barely went under 7W. I remember perfectly that "before" it could idle at around 6.0W even with the wifi on, and now the wifi was off. I decided to try an older version and went for 2.6.34, because 2.6.35 to 2.6.37 had a very nasty bug that prevented my Intel Wifi 5300 agn card from injecting packets, due to the famous -1 bug. Yes, I do audit my own wifi very often, why do you ask? ;)

So I hacked the PKGBUILD file a bit and installed a custom 2.6.34.8 kernel alongside my custom 2.6.38 one. I booted the laptop, turned the wifi off, closed Dropbox (powertop doesn't like it) and let it sit idle for a while. After a few minutes I closed the lid, having first disabled sleep-on-close, to see how turning the screen completely off affects things. You can see the results on the following graph:

Energy consumption on a Lenovo x200s, KDE 4.6.1, WiFi & Bluetooth off, SSD disk, Screen 6/15 -> off

The result was so different that I used the .config from the 2.6.34 kernel to recompile the .38, to see whether I had missed something. As you can see from the blue line, that is not the case: the .38 kernel just consistently consumes about 1W (~20%) more than the .34 version...
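How big a 1W gap looks in percent depends on the baseline it is compared against. A quick sketch of the arithmetic (the helper name is made up for illustration, and the baselines are round numbers, not readings taken from the graph):

```javascript
// Relative increase, in percent, of a power draw that grows by deltaWatts
// from baselineWatts. Hypothetical helper, for illustration only.
function relativeIncrease(baselineWatts, deltaWatts) {
    if (baselineWatts <= 0) {
        throw new RangeError("baseline must be positive");
    }
    return (deltaWatts / baselineWatts) * 100;
}

// Round baselines (assumed, not measured):
console.log(relativeIncrease(5, 1).toFixed(1)); // 20.0
console.log(relativeIncrease(6, 1).toFixed(1)); // 16.7
```

So the same 1W is roughly a fifth of the budget at 5W and roughly a sixth at 6W.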

Take these results as they are: two different kernel versions with a particular custom configuration on a particular hardware.

I am NOT saying that kernel 2.6.34 is more energy-efficient than 2.6.38 as a general rule.
I AM saying that some kernel versions are more efficient than others on some hardware - test your versions on your hardware and pick the one that works best for you.

Friday, March 11, 2011

Too hipster

Lately there have been lots of hipster jokes on the interwebs. I myself got this funny idea about a geek-hipster joke and I thought of making a small strip. After a while with Gimp, I proudly present:

[Click the image to see the original 3 scene file]

 Look, I'm an artist now! I have my own web-strip! :D

Monday, March 7, 2011

BeagleBoard-xM u-boot without serial cable - USB console

This is a note-to-self post, if you find it useful, you're welcome. If something is not clear, just ask.

Background: I got a BeagleBoard-xM but I had no serial cable and didn't want to get one (the shop is too far, internet store shipping is too slow). The thing connects via USB to the computer, and many devices implement serial over USB, so I thought "well, I'll connect the thing and as soon as it powers up I will get a /dev/ttyUSB0 to connect and interact with the bootloader/kernel". No luck. From my previous experiences with foneras I also tried an ethernet connection, in case it comes with ssh by default, but that didn't work either (the thing comes with a very small test implementation that doesn't even power up the ethernet hardware).

Ok, let's do some RTFM. Done. Looking around, it turns out that there is in fact a project for a USB console. I tried it but something was so wrong that it didn't even boot. Since people were reporting success with it, I assume that some change in the xM version makes it incompatible. And the last commit to the git repo was in mid-2009, so there was little hope that way. Next...

Short version

Turns out that the angstrom demo page contains almost everything needed. Download MLO, u-boot.bin, put them on the boot partition of the SD card as described in the wiki and jump to the boot.scr section.

Long version

Cross compile

The first problem is getting cross compiling to work. We have an x86 machine and we want ARM code, so the native gcc is not enough. There are many cross compilers and they have their differences.

For Arch, I used the package "arm-2010-arm-none-eabi 2010.09-1" from the AUR, which is this version. It fetches the i686 version so I used an Arch VM for compiling.

To use cross compiling, invoke make with the "CROSS_COMPILE=arm-none-eabi-" parameter.

WARNING1: The name may differ; for older versions it is "CROSS_COMPILE=arm-none-linux-gnueabi-". On any Linux "locate eabi | grep gcc" should solve your problem; in Arch "pacman -Ql PACKAGE_NAME | grep bin" will do the trick even better :)

WARNING2: It turns out that the 2010 version has a nasty bug - or maybe it's something with my VM system - and it doesn't use the cross-assembler by default. Try to compile something and it keeps dying:
Assembler messages:
Fatal error: Invalid -march= option: `armv5'
Of course it's not valid, since it's calling the x86 assembler. I worked around it with the following script:
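$ emacs /usr/local/bin/as
#!/bin/sh
# Wrapper around "as": if any argument asks for an ARM architecture,
# hand the whole command line over to the cross-assembler.
for i in "$@"; do
    if echo "$i" | grep -q -- "-march=arm"; then
        exec /usr/bin/arm-none-eabi-as "$@"
    fi
done
# Otherwise behave like the native assembler.
exec /usr/bin/as "$@"
Just make sure that the script is executable and that /usr/local/bin is before /usr/bin in your $PATH, and you're good to go.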

U-Boot

So, now we need u-boot to configure the USB OTG port as a serial device and listen on it. And the only project aimed at that fails so hard on the xM that it doesn't even boot. Let's start with the wiki:

Mainline U-Boot has good support for BeagleBoard (except for revision C4; see note below).
[...]
Note: For experimental U-Boot patches not ready for mainline yet, Steve's Beagle U-Boot git repository is used to test them. [This was the omap3-dev-usb version, no longer maintained, that failed so hard]
[...]
Note: For beagleboard revision C4, above sources will not work. USB EHCI does not get powered, hence devices are not detected... get a patched version of u-boot from http://gitorious.org/beagleboard-default-u-boot/beagle_uboot_revc4/ (Update on April 23 - 2010: This repository has been superseded by the U-Boot version found at http://gitorious.org/beagleboard-validation/)
Ok, so I understand that the mainline is superseded by the omap3 repository, which in turn is superseded by the beagleboard-validation repository. Very well.

Let's check out the BeagleBoard validation repository, which has the validation code.

It looks promising, since the default git branch is called "xm".
$ make CROSS_COMPILE=arm-none-eabi- mrproper
$ make CROSS_COMPILE=arm-none-eabi- omap3_beagle_config
$ make CROSS_COMPILE=arm-none-eabi-
$ cp u-boot.bin /mnt/SDCARD/
It boots, but unfortunately it fails to create a USB device. The last commit is from June 2010, so I don't expect it to be developed anymore. There are mentions of the musb device in the source code, so it must be doable somehow. So I try the newest possible branch, jason 20110303 - it doesn't even compile.
A bit less new, jason 20110302 - it works! When plugged into a computer it is detected as /dev/ttyACM0! Hurray... not so fast. After adding boot.scr (see below) and connecting with screen or minicom, it's silent. Damn, so close...
Let's go one more step back: koen/beagle-2010.12. Compiles, loads (with boot.scr), creates the device... and answers! Yoohoo! But wait... (yes, there still is a catch) the output is semi-garbage! Well, let's try some other u-boot version...

Looking at the commit messages, it turns out that the upstream version is still being developed! All the steps again, and it goes silent. I then tried the latest stable release and it almost worked: still a bit unstable, with some letters a bit off in the output, but pretty usable and functional.

boot.scr

By default the bootloader listens and speaks to the hardware serial console. To convince it to do otherwise we need to put a small boot.scr file on the sd card, just after copying u-boot.bin to it. To create the file we write the script to a text file:
$ emacs myscript.txt
setenv stdin usbtty
setenv stdout usbtty
Now we download any u-boot source and we issue a "make tools" command (no cross-compiling needed). After it finishes compiling:
$ tools/mkimage -A arm -T script -C none -d myscript.txt boot.scr
Then we copy the boot.scr file to the sd card in order to have a working usb bootloader console :D
In case you don't want to do all the stuff, here is a sample file:
$ hexdump boot.scr
0000000 0527 5619 0680 b4cc 744d b50a 0000 3100
0000010 0000 0000 0000 0000 e47f 58bb 0205 0006
0000020 0000 0000 0000 0000 0000 0000 0000 0000
*
0000040 0000 2900 0000 0000 6573 6574 766e 7320
0000050 6474 6e69 7520 6273 7474 0a79 6573 6574
0000060 766e 7320 6474 756f 2074 7375 7462 7974
0000070 000a                                   
0000071
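If you want to check what is inside that sample file, the dump can be decoded back into bytes. The sketch below (Node.js, modern JavaScript) assumes hexdump's default output format, i.e. 16-bit little-endian words with "*" standing for repeated identical lines, and the legacy u-boot image layout: a 64-byte big-endian header, then for script images a list of lengths ended by a zero word, then the script text. The function name is made up for illustration.

```javascript
// The hexdump listing from above, pasted verbatim.
const dump = `
0000000 0527 5619 0680 b4cc 744d b50a 0000 3100
0000010 0000 0000 0000 0000 e47f 58bb 0205 0006
0000020 0000 0000 0000 0000 0000 0000 0000 0000
*
0000040 0000 2900 0000 0000 6573 6574 766e 7320
0000050 6474 6e69 7520 6273 7474 0a79 6573 6574
0000060 766e 7320 6474 756f 2074 7375 7462 7974
0000070 000a
0000071
`;

// Hypothetical helper: turn default `hexdump` output back into a Buffer.
function hexdumpToBuffer(text) {
    const bytes = [];
    let lastRow = [];
    let pendingRepeat = false;
    for (const line of text.trim().split("\n")) {
        const fields = line.trim().split(/\s+/);
        if (fields[0] === "*") { pendingRepeat = true; continue; }
        const offset = parseInt(fields[0], 16);
        // A "*" means: repeat the previous row until this offset is reached.
        while (pendingRepeat && lastRow.length > 0 && bytes.length < offset) {
            bytes.push(...lastRow);
        }
        pendingRepeat = false;
        lastRow = [];
        for (const word of fields.slice(1)) {
            const value = parseInt(word, 16);
            lastRow.push(value & 0xff, value >> 8); // low byte comes first
        }
        bytes.push(...lastRow);
        // The last line holds only the total size; drop the padding byte.
        if (fields.length === 1) bytes.length = offset;
    }
    return Buffer.from(bytes);
}

const image = hexdumpToBuffer(dump);
const header = {
    magic: image.readUInt32BE(0).toString(16), // 27051956 marks a u-boot image
    dataSize: image.readUInt32BE(12),          // payload bytes after the header
    os: image[28],
    arch: image[29],
    type: image[30],
    compression: image[31],
};
// Script payload: one big-endian length, a zero terminator, then the text.
const scriptLength = image.readUInt32BE(64);
const script = image.toString("ascii", 72, 72 + scriptLength);

console.log(header);
console.log(script);
```

The second console.log prints the two setenv lines from myscript.txt, which is all the payload there is.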

Geek level: hard

Why boot.scr and not some other name?
$ emacs include/configs/omap3_beagle.h
#define CONFIG_EXTRA_ENV_SETTINGS \
        "loadaddr=0x82000000\0" \
        "usbtty=cdc_acm\0" \
        "console=ttyS2,115200n8\0" \
        "mpurate=500\0" \
        "vram=12M\0" \
        "dvimode=1024x768MR-16@60\0" \
        "defaultdisplay=dvi\0" \
        "mmcdev=0\0" \
        "mmcroot=/dev/mmcblk0p2 rw\0" \
        "mmcrootfstype=ext3 rootwait\0" \
        "nandroot=/dev/mtdblock4 rw\0" \
        "nandrootfstype=jffs2\0" \
        "mmcargs=setenv bootargs console=${console} " \
                "mpurate=${mpurate} " \
                "vram=${vram} " \
                "omapfb.mode=dvi:${dvimode} " \
                "omapfb.debug=y " \
                "omapdss.def_disp=${defaultdisplay} " \
                "root=${mmcroot} " \
                "rootfstype=${mmcrootfstype}\0" \
        "nandargs=setenv bootargs console=${console} " \
                "mpurate=${mpurate} " \
                "vram=${vram} " \
                "omapfb.mode=dvi:${dvimode} " \
                "omapfb.debug=y " \
                "omapdss.def_disp=${defaultdisplay} " \
                "root=${nandroot} " \
                "rootfstype=${nandrootfstype}\0" \
LOOK!-> "loadbootscript=fatload mmc ${mmcdev} ${loadaddr} boot.scr\0" \
        "bootscript=echo Running bootscript from mmc ...; " \
                "source ${loadaddr}\0" \
        "loaduimage=fatload mmc ${mmcdev} ${loadaddr} uImage\0" \
        "mmcboot=echo Booting from mmc ...; " \
                "run mmcargs; " \
                "bootm ${loadaddr}\0" \
        "nandboot=echo Booting from nand ...; " \
                "run nandargs; " \
                "nand read ${loadaddr} 280000 400000; " \
                "bootm ${loadaddr}\0" \

#define CONFIG_BOOTCOMMAND \
        "if mmc rescan ${mmcdev}; then " \
                "if run loadbootscript; then " \
                        "run bootscript; " \
                "else " \
                        "if run loaduimage; then " \
                                "run mmcboot; " \
                        "else run nandboot; " \
                        "fi; " \
                "fi; " \
        "else run nandboot; fi"
Btw, you can change all kinds of fun stuff there, I recommend you take a look :D
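The nested if/else in CONFIG_BOOTCOMMAND is easier to follow as a small decision function. This is only a model of the logic shown in the header, written in modern JavaScript; the function and flag names are made up, and each flag stands for "the corresponding u-boot command succeeded".

```javascript
// Model of the CONFIG_BOOTCOMMAND decision tree (hypothetical names).
function bootPath({ mmcFound, bootScrLoaded, uImageLoaded }) {
    if (!mmcFound) return "nandboot";       // "mmc rescan" failed
    if (bootScrLoaded) return "bootscript"; // boot.scr was loaded from the card
    if (uImageLoaded) return "mmcboot";     // no script, but a uImage is there
    return "nandboot";                      // nothing usable on the card
}

console.log(bootPath({ mmcFound: true, bootScrLoaded: true, uImageLoaded: true }));   // bootscript
console.log(bootPath({ mmcFound: true, bootScrLoaded: false, uImageLoaded: true }));  // mmcboot
console.log(bootPath({ mmcFound: false, bootScrLoaded: false, uImageLoaded: false })); // nandboot
```

This is why dropping a boot.scr on the card is enough: it is tried before the kernel image.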

Geek level: harder

Ok, so we have a self-made u-boot.bin and boot.scr. Why not have a MLO also? (MLO is the equivalent to grub's STAGE1 bootloader).
We grab the sources.
$ make distclean
$ make omap3530beagle_config
$ make CROSS_COMPILE=arm-none-eabi-
This will result in an x-load.bin file. It's not ready yet, it needs to be signed (AFAIU, it's just putting some size header, not real signing).
$ gcc scripts/signGP.c
$ ./a.out
And there we go! We can copy the x-load.bin.ift to the SD card as MLO, then our u-boot.bin and our boot.scr, and we are good to go!
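To illustrate what "just putting some size header" could look like, here is a sketch in Node.js. It ASSUMES the header is two 32-bit little-endian words, the image size followed by a load address; that is my reading of the tool, not something verified against every x-loader version. The function name and the load address value are assumptions too.

```javascript
// Prepend an assumed 8-byte header: image size, then load address,
// both as 32-bit little-endian words. Hypothetical helper.
function addGpHeader(image, loadAddress) {
    const header = Buffer.alloc(8);
    header.writeUInt32LE(image.length, 0);
    header.writeUInt32LE(loadAddress, 4);
    return Buffer.concat([header, image]);
}

// Stand-in for the contents of x-load.bin; 0x40200800 is an assumed address.
const fakeImage = Buffer.from([0xde, 0xad, 0xbe, 0xef]);
const signed = addGpHeader(fakeImage, 0x40200800);
console.log(signed.length); // 12
```

Use the real signGP for anything you actually boot; this only shows why no key is involved.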

Next step

Have a kernel/init that allows USB console logging. Or, suboptimally, maybe just use a distro with a default ssh daemon...

Friday, March 4, 2011

Real world JavaScript solution

This is the final look of the code to solve the nasty performance problems with the instant search.

if(!$.browser.msie) { /* SORRY, BUT IE IS *SLOW* WITH JQUERY */
    var uls = $("#metrics ul:visible"); /* Nasty hack: */
    uls.hide();                     /* 500x speedup on Chrome */
    if(text == "") {
        $(field_search).parents(".search_realm").find(".search_item").show();
    } else {
        $(field_search).parents(".search_realm").find(".search_item[id*="+text+"]").show();
        $(field_search).parents(".search_realm").find(".search_item:not([id*="+text+"])").hide();
    }
    uls.show();
} else { /* IE SPECIFIC ALGORITHM (x10 speedup on IE) */
    $( field_search ).parents(".search_realm").find(".search_item").each(function(){
        if(this.id.indexOf(text) == -1){
            $(this).hide();
        } else {
            $(this).show();
        }
    });
}


So, what happened here?
1. The horrible, horrible Chrome performance was due to a too-early rendering attempt. Hiding the containing ul makes Chrome stop trying to render after each element "reappears" and causes no flicker on the screen. The time goes from 12000ms to ~70ms for a 1549 element set.
2. IE didn't like jQuery. Well, don't make it use jQuery. Simple, huh? ;)
Now Chrome is an absolute performance champion with times 30/70, whereas IE stays in the 320's and Firefox in the 300/150's.
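The hide-the-container trick does not depend on jQuery. A minimal sketch of the same idea in modern JavaScript follows; the function name is made up, and I have not measured whether it gives the same speedup as the code above. It only touches `style.display` and `id`, so it works on real DOM elements as well as on plain stand-in objects.

```javascript
// Hide the container, toggle the items, then restore the container,
// so the browser has to lay out the list only once.
function filterItems(container, items, text) {
    const previousDisplay = container.style.display;
    container.style.display = "none";
    let shown = 0;
    for (const item of items) {
        const match = text === "" || item.id.indexOf(text) !== -1;
        item.style.display = match ? "" : "none";
        if (match) shown++;
    }
    container.style.display = previousDisplay;
    return shown; // number of items left visible
}

// Stand-in objects instead of real <ul>/<li> elements:
const list = { style: { display: "block" } };
const entries = [
    { id: "cpu_load", style: { display: "" } },
    { id: "mem_free", style: { display: "" } },
];
console.log(filterItems(list, entries, "cpu")); // 1
```

In a page you would pass the ul element and its li children instead of the stand-ins.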

Thursday, March 3, 2011

Real world JavaScript performance mess

For a change, a post that is neither a rant nor a joke. Yes, I can hear people leaving already...

Well, I wanted to show a funny fact about JavaScript performance in a real-world case. The task is quite simple. I have a ul with 1549 (!) li elements, each with a unique id. I want to show only those whose id contains a certain substring (instant search).

For this task I have two candidates: either using jQuery selectors or "manually" filtering the list. The manual option is:

$(field_search).parents(".search_realm").find(".search_item").each(function(){
    if(this.id.indexOf(text) == -1){
        $(this).hide();
    } else {
        $(this).show();
    }
});


The jQuery option is:
$(field_search).parents(".search_realm").
            find(".search_item[id*="+text+"]").show();
$(field_search).parents(".search_realm").
            find(".search_item:not([id*="+text+"])").hide();

As you can see, it's not that complicated. Of course, suggestions accepted ;)
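Timing either option takes only a few lines. A minimal sketch (modern JavaScript, made-up function name, and not necessarily the harness behind my numbers): it averages the wall-clock time of a callback over several runs. Note that it only sees the synchronous script time, not any rendering the browser defers until later.

```javascript
// Average wall-clock time of `action` over `runs` executions, in ms.
function averageTime(action, runs) {
    let total = 0;
    for (let i = 0; i < runs; i++) {
        const start = Date.now();
        action();
        total += Date.now() - start;
    }
    return total / runs;
}

// Example with a dummy workload standing in for the hide/show code:
const ms = averageTime(function () {
    const squares = [];
    for (let i = 0; i < 100000; i++) squares.push(i * i);
}, 10);
console.log(ms + " ms on average");
```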

Now let's see the results (Hide/Show) in ms, averaged over multiple runs:
Browser           Manual         jQuery       Browser version
Firefox Linux:     510/810        350/750     3.6.13
Firefox WinXP:     240/589        114/550     3.6.13
Chromium Linux:   1680/14320       35/14600   9.0.597.94
Chrome WinXP:     2340/12500       60/12600   9.0.597.107
IE7:               320/320        410/3000    7.0.5730.13

jQuery version: 1.4.4, 1.5.1 (minified)

That's right, no typos there. Weird facts:
  1. Chrome is slower than firefox in 3 out of 4 cases, 2 of them being a trainwreck case.
  2. Showing the previously hidden li elements takes significantly longer than hiding them, except in IE7 with the manual method. In Chrome, showing things back takes roughly 5-9x longer with the manual method and 200-400x (!) longer with jQuery.
  3. jQuery makes it slightly faster on Firefox, variable on Chrome and slower on IE7.
  4. Firefox on Windows is faster than on Linux. For Chrome results are mixed.
  5. Chrome is both the best and worst performer, by an order of magnitude in both cases.
  6. IE7 is capable of the fastest time to show the elements back, Chrome the slowest. Firefox is neither, but it's the best on average.
The performance fight is far from being over...

Update: In the end, after some hacks, Chrome wins the battle...