NetLog is an event logging mechanism for Chrome’s network stack to help debug problems and analyze performance. It enables a “capture → dump → analyze” style of workflow similar to tools like tcpdump.
Here is a typical use case:
Under the hood Chrome’s network stack has been instrumented to emit events at various interesting trace points. This event stream can be watched by observers to do interesting things with the data. Some of Chrome’s NetLog observers are:
Goals of NetLog
NetLog is intended solely for logging.
Features needing reliable network information should never be built on top of NetLog. It may be tempting to add an observer of NetLog as a quick hack to glean some internal state (since you circumvent having to create a proper interface and plumb it deep into the network stack). But that is wrong, and it will break. Instead, you should add an interface to NetworkDelegate, complete with regression tests and documentation.
Think of NetLog as a fancy way of writing:
LOG(INFO) << "Some network event; bla bla bla";
It would be silly if a feature broke simply because you moved around such a logging statement or changed its format slightly!
Similarly, we want net logging to be easy to add/edit so developers can instrument problem areas as needed.
There are a couple of terms Chrome engineers use when talking about this domain. They sound similar but have slightly different meanings!
Structure of a NetLog event
On the C++ side of things, events are represented by a net::NetLog::Entry. This in-memory representation is intended to be easily serialized to JSON.
We try not to serialize events to JSON until it is absolutely necessary (most of the time there are no consumers of the events so we don't have to). When we do need to serialize the events, this is the JSON format that we use:
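The exact fields have evolved over time, but a serialized event looks roughly like this (all values here are illustrative):

```json
{
  "time": "239459118",
  "type": 105,
  "source": { "id": 317, "type": 1 },
  "phase": 1,
  "params": { "url": "http://example.com/" }
}
```

Here "time" is a tick count serialized as a string; "type" and "phase" are numeric constants (a saved dump also carries a constants dictionary for decoding them); "source" ties the event back to the entity that emitted it; and "params" holds the optional, event-specific custom parameters described later in this document.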
How network events are emitted
net::NetLog is the interface for emitting NetLog events.
Code throughout src/net/* that needs to emit network events must be passed a pointer to a net::NetLog to send the events to.
Most commonly, this net::NetLog dependency is passed via a net::BoundNetLog parameter rather than directly as a net::NetLog*. This is a wrapper to “bind” the same source parameter to each event output to the stream. You can think of it like creating a private event stream for a single entity. Read more about this in net_log.h
Ultimately though, net::NetLog is just a boring interface and doesn’t actually do anything.
In the Chrome browser, the concrete implementation of net::NetLog used is ChromeNetLog. We configure things so that all network logging events for all browser profiles flow through a single instance of ChromeNetLog. ChromeNetLog is responsible for forwarding this event stream on to other interested parties via an observer mechanism.
As was alluded to earlier in the overview, about:net-internals is implemented as one such observer (which forwards the network events to a webapp).
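To make this flow concrete, here is a minimal standalone sketch of the pattern just described: instrumented code emits events through a "bound" wrapper into a single log, which fans them out to observers. All names here are invented for illustration; this is not Chromium's actual API.

```cpp
// Toy model of the NetLog architecture (invented names, not Chromium's API).
#include <cassert>
#include <string>
#include <vector>

struct Event {
  int source_id;     // identifies the entity whose "private stream" this is
  std::string type;  // e.g. "REQUEST_ALIVE"
};

class Observer {
 public:
  virtual ~Observer() = default;
  virtual void OnAddEntry(const Event& event) = 0;
};

// Single event sink: all emitters funnel through one instance, mirroring
// how all profiles share one ChromeNetLog in the browser.
class NetLog {
 public:
  void AddObserver(Observer* o) { observers_.push_back(o); }
  int NextSourceId() { return next_source_id_++; }
  void AddEntry(const Event& e) {
    for (Observer* o : observers_) o->OnAddEntry(e);
  }

 private:
  std::vector<Observer*> observers_;
  int next_source_id_ = 0;
};

// Analogue of net::BoundNetLog: stamps the same source id onto every event,
// so consumers can regroup the flat stream per entity.
class BoundNetLog {
 public:
  BoundNetLog(NetLog* log, int source_id) : log_(log), source_id_(source_id) {}
  void BeginEvent(const std::string& type) {
    log_->AddEntry({source_id_, type});
  }

 private:
  NetLog* log_;
  int source_id_;
};
```

An observer here plays the role of about:net-internals: it simply records (or forwards) whatever events flow through the shared log.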
How custom parameters are attached to events
Custom parameters are specified by a base::Value*. Value is used to represent a hierarchy of nodes that maps directly to JSON; it has all the expected building blocks: dictionaries, strings, numbers, and arrays. See values.h for more details.
For the sake of efficiency, you do not directly create a base::Value* when emitting events. Rather, you pass in a Callback which knows how to build the Value*, in case one is needed.
This decoupling allows deferring the creation of Values until it is really necessary. This is good: in the common case, when about:net-internals is not open, we simply don't need that data. Creating the custom Value parameters is not free, after all, since it involves copying internal state into a new Value* hierarchy.
You are guaranteed that the parameter callback will only be invoked synchronously before the return of the logging function. We use a Callback for convenience, since Bind makes it easy to piece things together without needing to define helper structures!
Here is an example of how to emit an event with custom parameters:
net_log_.BeginEvent(
    NetLog::TYPE_URL_REQUEST_START_JOB,
    base::Bind(&NetLogURLRequestStartCallback,
               &url(), &method_, load_flags_, priority_));
By the time BeginEvent returns, NetLogURLRequestStartCallback() will have been invoked if and only if the Value* parameter was needed. That is why it is safe for the callback to take pointers to url() and method_ (pointers are preferred in this case to avoid making unnecessary copies).
Here is what the bound function might look like:
Value* NetLogURLRequestStartCallback(const GURL* url,
                                     const std::string* method,
                                     int load_flags,
                                     RequestPriority priority) {
  DictionaryValue* dict = new DictionaryValue();
  dict->SetString("url", url->possibly_invalid_spec());
  dict->SetString("method", *method);
  dict->SetInteger("load_flags", load_flags);
  return dict;
}
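The deferral guarantee described above can be demonstrated with a small standalone sketch (invented names, not Chromium's API): the parameters callback is invoked synchronously inside the logging call, and only when someone is actually listening.

```cpp
// Toy sketch of deferred parameter creation (invented names, not Chromium's
// API): the emitter passes a callback that can build the parameters, and the
// log invokes it synchronously, and only if there are observers.
#include <cassert>
#include <functional>
#include <string>

class LazyNetLog {
 public:
  void set_has_observers(bool v) { has_observers_ = v; }

  // The params_callback is invoked at most once, before BeginEvent returns,
  // so it may safely capture pointers to the caller's locals and members.
  void BeginEvent(const std::string& type,
                  const std::function<std::string()>& params_callback) {
    if (has_observers_) {
      std::string params_json = params_callback();  // built only on demand
      last_entry_ = type + " " + params_json;
    }
  }

  const std::string& last_entry() const { return last_entry_; }

 private:
  bool has_observers_ = false;
  std::string last_entry_;
};
```

When no observer is attached, the callback never runs, so the cost of copying internal state into a parameter structure is avoided entirely.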
What does NetLog actually log?
The logging is subject to change, but in general we try to log whatever is useful for debugging.
Admittedly that description isn't very helpful, so let me give some examples of what we currently log:
The sorts of events emitted are application specific. NetLog does not aim to replace lower level network tools like packet captures (i.e. tcpdump). Rather it focuses on application level logic and caches which cannot possibly be known at the lower layers. There is some overlap though, since NetLog can optionally capture bytes sent/received over sockets, as well as the decrypted bytes for secure channels.
chrome://net-internals (aka about:net-internals)
net-internals is a visualizer for the NetLog event stream. It can be used both in real time and to load post-mortem NetLog dumps.
Most of the data displayed by net-internals comes from the NetLog event stream. When you load about:net-internals, it installs an observer into ChromeNetLog (inside the browser process) which serializes network events to JSON and sends them over IPC to the renderer running the about:net-internals application.
Some of the other data displayed by net-internals comes by polling other state functions. For instance the “DNS cache” listed on about:net-internals is obtained by polling the browser for the full list, rather than by an event-based mechanism.
History: the evolution of NetLog/NetInternals
version 1: LoadLog
In the beginning, there was LoadLog. This was basically a small per-request log buffer. Each URLRequest had its own LoadLog member field, which was essentially a std::vector of events and their timings. These events got accumulated directly in the browser process. In other words, it was a form of always-on passive logging.
My original use case for LoadLog was to diagnose performance problems.
At the time, internal users were complaining to me (eroman) about sporadic performance issues. But of course they would never reproduce them when I was around! I wanted a system that could dump the relevant information reliably AFTER the problem had already happened.
To capture the information I added LoadLogs to URLRequests to track the performance of core operations like DNS resolving and proxy resolution. This structure was really just a small set of event timings.
To expose the data, I created a very simple webapp at “about:net-internals”. This page was entirely generated by C++, and believe me when I say it was basic!
Since LoadLogs were attached to particular URLRequests, the visualizer worked by traversing the list of in-progress requests and printing their logs. I also kept a “graveyard” of recently finished requests in the browser, so we would have some data on requests that had just completed (which ultimately was my original objective). Since about:net-internals was a static page, you had to reload it to view new data, at the risk of losing the data currently displayed, since it was backed by the browser process and might already have been evicted from the circular graveyard buffer.
The exchange format for users to send me their logs was ad-hoc. Essentially they would save the HTML page, often by using the “save webpage as” feature, or simply copy-pasting. (Some users would even print-to-pdf and attach that).
You might be wondering about the URL... why not simply call it “about:net”? Well, at the time there was already a page running at “about:net”! So I added the suffix “-internals” to distinguish them. Moreover, at the time I just wanted to build something quick and dirty and did not want to go through any UI reviews. By calling it “-internals” I was assuring people that it was just a janky developer page that didn’t need to meet our higher standards for end-user content.
Interestingly, “about:net” got deprecated when porting Chrome to Mac/Linux (since it relied on native UI which no one wanted to port). We eventually deleted it, but by the time that happened about:net-internals was already a well-established URL, so I saw no point in changing it. Today, ChromeOS has re-appropriated the URL about:net.
version 2: NetLog
The problem with LoadLog was that the data was siloed by request. This meant my webapp couldn’t receive real-time updates; rather, it would have to do gnarly polling to see what had changed (and suck down all the data each time). Another issue that scared me was the browser-side accumulation of data done by LoadLogs. When tracing, I wanted to turn on higher logging granularity, but this came at the risk of using up lots of browser memory, which at the extreme might kill the process.
To address this, I decided to generalize the mechanism of per-request events into a single flat event stream. This gave me a nice choke point for transferring updates to the net-internals app. It also meant that the accumulation of data was done in a renderer process, so at worst a memory exhaustion would kill just that single tab.
I started doing this, and created the experimental page “about:net2” for my in-progress work. The two pages coexisted for a while, since it was just a side project for me. Eventually, when it was good enough, I renamed about:net2 to about:net-internals.
However, this new approach also created some new problems. First, having a flat event stream meant needing to rebuild the structure within each consumer of the NetLog; this is why the “source” parameter exists in events today. Since this is inefficient, I tried to keep browser-side interpretation of NetLog events to a minimum, leaving that task to the visualizer.
The second issue was how to maintain feature parity with LoadLog’s ability to do “passive logging”. To support passive logging (i.e. “always on logging”) I introduced the PassiveLogCollector observer.
PassiveLogCollector was an observer that would watch the event stream and save recent events into a circular buffer (using a variety of heuristics to try and keep the most relevant information, clustered by request). That way if something went wrong, we would have a partial log showing what happened most recently. Getting the heuristics for PassiveLogCollector good-enough was non-trivial, since it required re-building and tracking the structure for the events.
Initially PassiveLogCollector worked as promised. But it didn’t scale well over time, and has since been removed.
The problem is that over time the amount of net logging grew substantially (success!), and also the event stream became more fragmented into sources (due to more overlapping async stuff). The tension with PassiveLogCollector was wanting to collect enough useful information to be able to debug problems, while at the same time not having fragile heuristics which could break at any moment leading to browser memory bloat. Ultimately I wasn’t comfortable with the risk of bad heuristics leading to bloat, and how it complicated adding new logging, and hence removed it.
Some of the gaps left by the removal of PassiveLogCollector can be addressed by issue 117026.
version 3: The age of mmenke
The current evolution of about:net-internals is what I would call v3. It starts when Matt Menke teamed up with me to work on net-internals.
Prior to the golden age of menke, exchanging NetLogs involved copy-pasting the webpage! This was a hassle, since invariably users would paste the useless parts of the log and not the whole thing, or provide screenshots rather than text... It was downright embarrassing.
Matt added the ability to save logs to a JSON file, and to import them back again into about:net-internals, which was a huge step forward.
v3 also got custom views for Sockets, SPDY sessions, Winsock LSPs (SPIs), a slew of other functionality and increased logging sprinkled throughout the network stack.
version 4: ???
… patches welcome!