More SystemTap Thoughts
I've read the SystemTap architecture paper. Here are my initial reactions, which might be muddied by misunderstandings, misreadings, general ignorance, etc.:
- Every script execution invokes the GCC toolchain to produce a kernel module, then loads that module, executes for a time, and unloads the module. This makes implementation more tractable, because the SystemTap folks don't have to write a different back-end for every CPU, nor do they have to define a little bytecode VM for user-level to communicate with the kernel, as DTrace did. However, this compile-link-load cycle may take a user-perceptible chunk of time, especially if a script is invoked repeatedly from some other script. Worrying about this sort of incremental friction might seem like premature optimization, but when you rely for runtime performance on a big piece of software that is not optimized for runtime performance, such as the GCC compile/link cycle (which, after all, is optimized to produce fast binaries, not to produce binaries fast), you're throwing away a lot of flexibility right out of the gate.
- I'm not in love with the language. Like D, it uses C's expression syntax. However, unlike D, it doesn't use C's type syntax. This isn't just an inconvenience. D scripts can #include header files right out of a source tree, or the kernel, or wherever, and can use those types in a natural way. This can be very helpful when instrumenting a C application.
- On the other hand, when not instrumenting a C application, the architecture doesn't seem to anticipate external sources of probe points. They discuss being able to probe user-level applications, but only in terms of tracing specific program counters. This doesn't always make sense. If, e.g., the target application is a Scheme interpreter, the programmer will want to interact with his program's control flow in terms of source-level function entry/exit, rather than random program counters within the interpreter. While the core functionality of SystemTap can be extended via "tapsets," it sounds like these tapsets are stuck on the wrong side of the application being probed to do this sort of thing well. (I.e., instead of the Scheme interpreter publishing a semantic interface to its internals, the tapset has to contain enough knowledge about the Scheme interpreter to reverse engineer its current state.)
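To make that last point concrete, here is a minimal sketch in Python of what "publishing a semantic interface" could look like. Everything in it is hypothetical: ProbeRegistry, ToyInterpreter, and the probe names function-entry and function-return are made up for illustration and are not SystemTap's or DTrace's API. The toy interpreter fires a probe at each source-level function entry and exit, so a tracer never has to know anything about program counters inside the interpreter.

```python
from typing import Callable, Dict, List

# Hypothetical names throughout; this is an illustration, not a real tracing API.
ProbeHandler = Callable[[str], None]


class ProbeRegistry:
    """Probe points that the interpreter itself chooses to publish."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[ProbeHandler]] = {}

    def attach(self, probe: str, handler: ProbeHandler) -> None:
        # A tracer subscribes to a named, source-level event.
        self._handlers.setdefault(probe, []).append(handler)

    def fire(self, probe: str, function_name: str) -> None:
        # Called by the interpreter; a no-op if nobody is listening.
        for handler in self._handlers.get(probe, []):
            handler(function_name)


class ToyInterpreter:
    """Stand-in for a Scheme interpreter: it only 'calls' named functions."""

    def __init__(self, probes: ProbeRegistry) -> None:
        self.probes = probes

    def call(self, function_name: str, body: Callable[[], object]) -> object:
        # The interpreter knows where source-level calls begin and end,
        # so it reports them directly instead of being reverse engineered.
        self.probes.fire("function-entry", function_name)
        try:
            return body()
        finally:
            self.probes.fire("function-return", function_name)


def trace_calls() -> List[str]:
    """Run a nested call and return the events a tracer would see."""
    probes = ProbeRegistry()
    events: List[str] = []
    probes.attach("function-entry", lambda name: events.append("enter " + name))
    probes.attach("function-return", lambda name: events.append("leave " + name))
    interp = ToyInterpreter(probes)
    interp.call("outer", lambda: interp.call("inner", lambda: 42))
    return events
```

Calling trace_calls() yields the entry and return events for outer and inner in nesting order. The knowledge of what counts as a "function call" lives in the interpreter, which is the side of the fence I'd want it on.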
Or, of course, maybe I'm just not getting it. Perhaps some SystemTappers out there can set me straight?
1 Comment:
Sorry for not responding to your weblog article earlier. We can't monitor everywhere - feel free to post to our mailing list.
With respect to your points:
1) Indeed, relying on gcc is a speed challenge. We are working on several means to reduce the penalty (caching compiled scripts, reorganizing the generated C code to let the compiler work faster). Simple scripts are already fast (a second or two).
2) I'm not familiar with implementation details of dtrace's #include, but I suspect it's not quite what it seems (consider #ifdefs). In systemtap, we extract type information from the debugging information compiled into the executable, so to a large extent things work automatically. It's the same as in a debugger: when you evaluate a target-side expression, it does so by processing debugging information, not by parsing source code.
3) With respect to probe points such as Perl interpreter function calls, we will be able to include that soon after general user-space dynamic probing is implemented. It's a matter of cleverly composing several technologies, not one of architectural change or inadequacy.
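As an illustration of the "caching compiled scripts" idea in point 1, here is a minimal Python sketch. It is not a description of SystemTap's actual design: cache_key, cached_compile, the ".ko" file naming, and the choice to key the cache on the script text plus a kernel release string are all assumptions made for the example, and compile_fn stands in for the expensive translate-and-compile step.

```python
import hashlib
from pathlib import Path
from typing import Callable


def cache_key(script_text: str, kernel_release: str) -> str:
    """Hypothetical key: hash of the kernel release and the script source."""
    digest = hashlib.sha256()
    digest.update(kernel_release.encode("utf-8"))
    digest.update(b"\0")  # separator so the two fields cannot run together
    digest.update(script_text.encode("utf-8"))
    return digest.hexdigest()


def cached_compile(
    script_text: str,
    kernel_release: str,
    cache_dir: Path,
    compile_fn: Callable[[str], bytes],
) -> Path:
    """Return the path to a compiled module, compiling only on a cache miss.

    compile_fn is a placeholder for the slow compile/link step; it takes the
    script text and returns the bytes of the built module.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    module_path = cache_dir / (cache_key(script_text, kernel_release) + ".ko")
    if not module_path.exists():
        # Cache miss: pay the compile cost once and keep the result.
        module_path.write_bytes(compile_fn(script_text))
    return module_path
```

With something along these lines, a script that is invoked repeatedly pays the compile cost only the first time; a changed script or a different kernel produces a different key and therefore a fresh build.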