Linux Trace Toolkit Status

Last updated February 10, 2003.

During the 2002 Ottawa Linux Symposium tracing BOF, a list of desirable features for LTT was collected by Richard Moore. Since then, a lot of infrastructure work on LTT has taken place. This status report aims to track the current development efforts and the status of the various features. This first version of the status page is most certainly incomplete; please send any additions and corrections to Michel Dagenais (michel.dagenais@polymtl.ca).

Work recently performed

Tom Zanussi of IBM has implemented per-CPU lockless buffering with low-overhead, very fine-grained timestamping, and has updated the kernel patch and the trace visualizer accordingly, except for viewing multiple per-CPU traces simultaneously.
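The sketch below illustrates the general idea of lockless per-CPU event recording; the structure, field layout and helper names are invented for illustration and are not the actual LTT code. Because each CPU writes only into its own buffer, a slot can be reserved with a compare-and-swap on the buffer offset instead of taking a lock, and the event is stamped with the CPU cycle counter rather than gettimeofday().

    /* Illustrative kernel-side sketch only -- not the actual LTT code.
     * Assumes the kernel's cmpxchg() and get_cycles() primitives. */
    struct cpu_trace_buf {
            char         *data;     /* per-CPU buffer memory        */
            unsigned int  size;     /* buffer size in bytes         */
            unsigned int  offset;   /* next free byte in the buffer */
    };

    /* Reserve 'len' bytes in this CPU's buffer without taking a lock.
     * Only code running on the same CPU can race with us, so a
     * compare-and-swap on the offset is enough to serialise writers. */
    static inline void *trace_reserve(struct cpu_trace_buf *buf,
                                      unsigned int len)
    {
            unsigned int old, new;

            do {
                    old = buf->offset;
                    new = old + len;
                    if (new > buf->size)
                            return NULL;    /* buffer full: drop or switch */
            } while (cmpxchg(&buf->offset, old, new) != old);

            return buf->data + old;
    }

    /* Record an event: reserve space, stamp it with the cycle counter,
     * then copy the payload. */
    static inline void trace_raw_event(struct cpu_trace_buf *buf, int id,
                                       const void *payload, unsigned int len)
    {
            struct { unsigned long long tsc; int id; } *hdr;
            char *slot = trace_reserve(buf, sizeof(*hdr) + len);

            if (!slot)
                    return;
            hdr = (void *)slot;
            hdr->tsc = get_cycles();        /* fine-grained timestamp */
            hdr->id  = id;
            memcpy(slot + sizeof(*hdr), payload, len);
    }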

Ongoing work

RelayFS: Tom Zanussi is currently working on implementing a separate data relay filesystem (relayfs), a simple and efficient component reusable by other projects (printk, evlog, lustre...). This will remove a sizeable chunk from the current LTT, making each piece (relayfs and the relayfs-based LTT) simpler, more modular and possibly more palatable for inclusion in the standard Linux kernel.

A group in the department of Computer Engineering at Ecole Polytechnique Montreal, where Karim Yaghmour graduated, is working on a reorganization of the LTT infrastructure to define event types and insert tracing statements in the kernel without having to modify or recompile the trace visualizer.

GenEvent: XiangXiu Yang developed a simple program which parses event descriptions and generates the inline functions to record events in the kernel.
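As a purely illustrative example of this approach (the actual GenEvent description syntax and generated code differ), an event description such as

    event fs_open
            field int     fd
            field string  filename

could be turned by the generator into an inline recording function along these lines, where trace_raw_event() stands for whatever low-level recording primitive the kernel provides:

    /* Hypothetical generated code -- event layout, helper and constant
     * names are invented for illustration. */
    static inline void trace_fs_open(int fd, const char *filename)
    {
            struct {
                    int  fd;
                    char filename[256];
            } ev;

            ev.fd = fd;
            strncpy(ev.filename, filename, sizeof(ev.filename) - 1);
            ev.filename[sizeof(ev.filename) - 1] = '\0';
            trace_raw_event(EV_FS_OPEN, &ev, sizeof(ev));
    }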

LibEvent: XiangXiu Yang is working on an event reading library and API which parses event descriptions and accordingly reads traces and decodes events.
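A minimal sketch of how such a reading API might be used follows; the function and structure names are hypothetical, since the LibEvent interface is still being designed:

    /* Hypothetical trace reader -- API and header names are illustrative only. */
    #include <stdio.h>
    #include "libevent.h"    /* hypothetical header */

    int main(int argc, char **argv)
    {
            struct trace *t;
            struct event  ev;

            t = trace_open(argv[1]);      /* reads the event descriptions too */
            if (!t)
                    return 1;
            while (trace_read_event(t, &ev) == 0)    /* decode next event */
                    printf("%llu cpu%u %s\n",
                           (unsigned long long)ev.timestamp, ev.cpu, ev.name);
            trace_close(t);
            return 0;
    }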

LTTVis: Michel Dagenais is remodeling the trace visualizer to use the LibEvent API and to allow compiled and scripted plugins, which can dynamically add new custom trace analysis functions.
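For the compiled plugins, one natural mechanism is to load shared objects at run time and let each one register its analysis functions; the interface sketched below is hypothetical and only illustrates the idea, not the actual LTTVis API:

    /* Hypothetical plugin loader sketch -- not the actual LTTVis interface. */
    #include <dlfcn.h>
    #include <stdio.h>

    typedef int (*plugin_init_fn)(void);

    static int load_plugin(const char *path)
    {
            void *handle = dlopen(path, RTLD_NOW);
            plugin_init_fn init;

            if (!handle) {
                    fprintf(stderr, "cannot load %s: %s\n", path, dlerror());
                    return -1;
            }
            /* Each plugin would export an init function that registers its
             * analysis hooks with the visualizer. */
            init = (plugin_init_fn)dlsym(handle, "plugin_init");
            if (!init) {
                    fprintf(stderr, "%s has no plugin_init\n", path);
                    dlclose(handle);
                    return -1;
            }
            return init();
    }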

Planned work

LTT already interfaces with Dynamic Probes. This feature will need to be updated for the new LTT version.

The Kernel Crash Dump utilities are another very interesting complementary project. Interfacing them with RelayFS will help implement useful flight-recorder-like tracing for post-mortem analysis.

User level tracing is available in the current LTT version but requires one system call per event. With the new RelayFS based infrastructure, it would be interesting to use a shared memory buffer directly accessible from user space. Having one RelayFS channel per user would allow an extremely efficient, yet secure, user level tracing mechanism.
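A rough user-space sketch of the idea follows: the per-user channel is assumed to be exposed as a file that can be mapped into the process, so events are written directly into the shared buffer without a system call per event. The file path, buffer size and layout are hypothetical.

    /* Illustrative sketch only: file path, size and layout are invented. */
    #include <fcntl.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define BUF_SIZE (1 << 20)

    int main(void)
    {
            const char msg[] = "user event";
            size_t off = 0;
            char *buf;
            int fd;

            fd = open("/mnt/relay/mytrace", O_RDWR);   /* hypothetical path */
            if (fd < 0)
                    return 1;
            buf = mmap(NULL, BUF_SIZE, PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);
            if (buf == MAP_FAILED)
                    return 1;

            /* Events go straight into the shared buffer; only buffer-switch
             * notifications would still need a system call. */
            memcpy(buf + off, msg, sizeof(msg));
            off += sizeof(msg);

            munmap(buf, BUF_SIZE);
            close(fd);
            return 0;
    }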

Sending important events (process creations, event type/facility definitions) to a separate channel would make interactive trace browsing more efficient. Only this concise trace of important events would need to be processed in its entirety; the other, larger, gigabyte-size traces could then be accessed randomly without requiring a first preprocessing pass. A separate channel would also be required for incomplete traces, such as when tracing to a circular buffer in "flight recorder" mode: all the important events would be kept, while only the last buffers of ordinary events would remain.

Once the visualizer is able to read and display several traces, it will be interesting to produce side by side synchronized views (events from two interacting machines A and B one above the other) or even merged views (combined events from several CPUs in a single merged graph). Time differences between interacting systems will need to be estimated and somewhat compensated for.
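One simple way to estimate such a time difference, assuming both traces record the send and receive events of the same request/reply exchange and that the network delay is roughly symmetric, is the usual round-trip calculation sketched below:

    /* Estimate the clock offset of machine B relative to machine A from one
     * request/reply exchange recorded in both traces:
     *   t1 = A sends request      t2 = B receives request
     *   t3 = B sends reply        t4 = A receives reply
     * With symmetric network delay, offset ~= ((t2 - t1) + (t3 - t4)) / 2. */
    static double estimate_offset(double t1, double t2, double t3, double t4)
    {
            return ((t2 - t1) + (t3 - t4)) / 2.0;
    }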

LTT currently writes a proc file at trace start time. This file only contains minimal information about process and interrupt names. More information would be desirable for several applications (process maps, open file descriptors, content of the buffer cache). Furthermore, this information may be more conveniently gathered from within the kernel and simply written to the trace as events at start time.

New features already implemented since LTT 0.9.5

  1. Per-CPU Buffering scheme.
  2. Logging without locking.
  3. Minimal latency - minimal or no serialisation. (Lockless tracing using read_cycle_counter instead of gettimeofday.)
  4. Fine granularity time stamping - min=o(CPU cycle time), max=.05 Gb Ethernet interrupt rate. (Cycle counter being used).
  5. Random access to trace event stream. (Random access reading of events in the trace is already available in LibLTT. However, one first pass is required through the trace to find all the process creation events; the cost of this first pass may be reduced in the future if process creation events are sent to a separate much smaller trace.)

Features being worked on

  1. Simple wrapper macros for trace instrumentation. (GenEvent)
  2. Easily expandable with new trace types. (GenEvent)
  3. Multiple buffering schemes - switchable globally or selectable by trace client. (Will be simpler to obtain with RelayFS.)
  4. Global buffer scheme. (Will be simpler to obtain with RelayFS.)
  5. Per-process buffer scheme. (Will be simpler to obtain with RelayFS.)
  6. Per-NGPT thread buffer scheme. (Will be simpler to obtain with RelayFS.)
  7. Per-component buffer scheme. (Will be simpler to obtain with RelayFS.)
  8. A set of extensible and modular performance analysis post-processing programs. (LTTVis)
  9. Filtering and selection mechanisms within formatting utility. (LTTVis)
  10. Variable size event records. (GenEvent, LibEvent, LTTVis)
  11. Data reduction facilities able to logically combine traces from more than one system. (LibEvent, LTTVis)
  12. Data presentation utilities able to present data from multiple trace instances in a logically combined form. (LibEvent, LTTVis)
  13. Major/minor code means of identification/registration/assignment. (GenEvent)
  14. A flexible formatting mechanism that will cater for structures and arrays of structures with recursion. (GenEvent)

Features already planned for

  1. Init-time tracing. (To be part of RelayFS.)
  2. Updated interface for Dynamic Probes. (As soon as things stabilize.)
  3. Support "flight recorder" always on tracing with minimal resource consumption. (To be part of RelayFS and interfaced to the Kernel crash dump facilities.)
  4. Fine grained dynamic trace instrumentation for kernel space and user subsystems. (Dynamic Probes, more efficient user level tracing.)
  5. System information logged at trace start. (New special events to add.)
  6. Collection of process memory map information at trace start/restart and updates of that information at fork/exec/exit. This allows address-to-name resolution for user space.
  7. Include the facility to write system snapshots (total memory layout for kernel, drivers, and all processes) to a file. This is required for trace post-processing on a system other than the one producing the trace. Perhaps some of this is already implemented in the Kernel Crash Dump.
  8. Even more efficient tracing from user space.
  9. Better integration with tools to define static trace hooks.
  10. Better integration with tools to dynamically activate tracing statements.

Features not currently planned

  1. POSIX Tracing API compliance.
  2. Function entry/exit tracing facility. (Probably a totally orthogonal mechanism, using either Dynamic Probes hooks or static code instrumentation with the suitable GCC options for basic block instrumentation; see the compiler-hook sketch after this list.)
  3. Sampling and recording of processor performance counters (which most modern CPUs have). (These counters can be read and their values sent in traced events. Some support to collect them automatically at specific state change times and to visualize the results would be nice.)
  4. Suspend & Resume capability. (Why not simply stop the trace and start a new one later, otherwise important information like process creations while suspended must be obtained in some other way.)
  5. Per-packet send/receive event. (New event types will be easily added as needed.)
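As an illustration of the static instrumentation route mentioned in item 2 above, GCC's -finstrument-functions option inserts calls to two user-supplied hooks at every function entry and exit (it instruments whole functions rather than basic blocks); a tracer could turn those hooks into trace events. This is only a sketch of the compiler mechanism, not something LTT currently provides.

    /* Build with: gcc -finstrument-functions prog.c
     * GCC calls these hooks at every function entry and exit. */
    #include <stdio.h>

    void __cyg_profile_func_enter(void *func, void *caller)
            __attribute__((no_instrument_function));
    void __cyg_profile_func_exit(void *func, void *caller)
            __attribute__((no_instrument_function));

    void __cyg_profile_func_enter(void *func, void *caller)
    {
            fprintf(stderr, "enter %p from %p\n", func, caller);
    }

    void __cyg_profile_func_exit(void *func, void *caller)
    {
            fprintf(stderr, "exit  %p\n", func);
    }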