Category Archives: Firefox

Hacks all the way down

So, I did this thing… It’s a little complicated to explain, but bear with me: I used a Software Defined Radio to capture the radio transmissions from a Bravo pH esophageal monitor, wrote a browser-based decoder using the Audio Data API, and reverse-engineered the broadcast data packets. As a bonus, I hope to do something similar to catch signals from a tiny satellite I helped fund on Kickstarter, which launches this weekend on SpaceX’s CRS-3 mission to the International Space Station.

Phew. Ok, now let’s break that down. 🙂

Software Defined Radio

Software Defined Radio (SDR) is central to all of this. It’s a pretty complex field — and I am by no means an expert — but the simplified basics are not hard to understand. Traditionally, a radio is built for a specific purpose with specific hardware. It’s effectively a black box customized to convert audio/video/data to or from a particular pattern of electromagnetic radiation in some particular part of the spectrum. Each box is different; you need one for satellite TV, one for FM radio, one for WiFi, one for GPS, and so on. There are myriad variations, and devices that might seem similar can actually be completely different. You’ve probably seen news reports about police and fire departments who are responding to the same disaster, but are literally unable to talk to each other because their radio systems are different.

SDR is a radical departure from all that. You still need a piece of hardware that can tune to a relevant slice of the radio spectrum, but it becomes a general-purpose device that relies on software to do all the application-specific bits. For example, you might ask such a device to tune to 69 MHz, and capture 3 MHz of bandwidth on either side. You then feed the result to a software NTSC decoder, et voilà, you’re watching TV (analog channel 4, which occupies 66 to 72 MHz). The hardware doesn’t know anything about the contents of what it’s receiving, since it’s the software that deals with it. If your device can capture more bandwidth, you could even watch multiple channels at the same time. And since it’s just generic data being processed by software, it doesn’t need to happen in real-time. You can record a stream of RF data, and process it in different ways after the fact.

DVB-T dongle (digital TV) based on the Realtek RTL2832U and Elonics E4000 tuner.

Until recently, SDR was only possible with fairly expensive equipment, which made it a niche hobby. But in the 2010-2012 timeframe, some folks discovered that a cheap USB device intended for digital TV reception (“watch TV on your laptop!”) contained surprisingly capable hardware that could be repurposed as a general-purpose SDR. Specifically, the RTL2832U chipset and a variety of tuner chips. For $10 to $20 you could get one of these mass-produced dongles that, with the right software, let you receive and decode all kinds of interesting transmissions from roughly 50 MHz to 1800 MHz.

Here are just a few of the things possible:

There’s a whole world of analog and digital RF data being broadcast around us, which cheap Software Defined Radio hardware makes readily accessible.

The Bravo pH system

Around the time I was starting to play with SDR, I had gone to my doctor because I was experiencing some of the symptoms of acid reflux. Or, as it’s more formally known, gastroesophageal reflux disease: GERD. [Spoiler: no big deal, weight loss + antacid and I’m all good.] One of the steps in the diagnosis is monitoring the acidity level in your esophagus over time. This used to involve inserting a tube into your nose and throat and leaving it there for a few days of measurement, which was generally quite unpleasant. Now they can just attach a tiny wireless sensor inside you; it sticks there for about a week, and then gets eliminated naturally. It’s only a couple of centimeters in size, and you don’t even notice it:

Bravo pH device compared with the tip of a pencil

During the monitoring period, you carry around a receiver (which basically looks like a giant 1990s-era pager). It’s supposed to be kept within 3 feet of you at all times, or else it makes an annoying beep when it loses the sensor’s signal. It records pH measurements every few seconds, and conveniently displays the last reading.

Bravo pH receiver

When the study is over, you return the receiver, your doctor downloads the data, and you get a nice little report with graphs and numbers to help your doctor make a diagnosis.

Capturing Data

So that’s SDR and Bravo pH. Now, if you’re connecting the obvious dots like I was, you’re wondering if it might be possible to snoop on the sensor’s broadcasts to see what they contain. Indeed it is!

But the first step is finding the signal.

I wasn’t really sure where to start, but some Googling turned up a User’s Guide (doctor’s guide, really) for the system, and buried in an appendix was the info I needed:

Output & Transmission
EIRP: 17.6 μW (-47.53 dBm) at a 3-meter distance
Format: Amplitude-shift keying
Frequency: 433.92 MHz
Rate: 60 ms every 12 seconds

Bingo! All I needed to do was tune my dongle to around 433.92MHz, and look for a bursty signal repeating about every 12 seconds. It was literally as simple as that — here’s a waterfall display from the GQRX app I was using, showing two transmissions (time is the vertical axis, frequency is the horizontal):

Screenshot from GQRX showing signals

And here’s what it sounds like as AM-demodulated audio: MP3 | WAV (Sorry, WordPress doesn’t seem to support inline HTML5 audio.)

Decoding Data

Ok, so now we get signal. But what’s in it? The brief bursts are obviously too fast for our meaty ears to discern meaning, so looking at the waveform in an audio editor was the easiest way to take a first look. I used Audacity on OS X:

Ah, there it is. Digital data. There are clear hi/lo levels, but what’s actually important is the length of the pulses. After examining a few more transmissions, the basic format of the data packet is apparent:

  1. A preamble consisting of ten 500 μs hi pulses (each separated by 500 μs lo). This likely serves as a clear “beginning of message” indicator, and to establish clock speed.
  2. 48 bits (6 bytes) of data. Each bit is 1 millisecond, with a “0” indicated by 250 μs hi followed by 750 μs lo, and a “1” indicated by 666 μs hi followed by 333 μs lo.
  3. A single 500 μs hi stop bit.

(Note that I’m slightly rounding the timings to what would seem to be likely values. The actual data is imprecise due to noise and rising/falling edges, on the order of tens-of-microseconds.)
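As a rough illustration of that format, here is a minimal Python sketch that labels each hi pulse by its width. It assumes the pulse widths have already been measured in microseconds; the function names and the 80 μs tolerance are made up for this example and are not taken from the original decoder.

```python
# Nominal hi-pulse widths from the packet format above, in microseconds.
SYNC_US = 500   # preamble pulses and the stop bit
ZERO_US = 250   # "0" bit
ONE_US = 666    # "1" bit
TOLERANCE_US = 80  # assumed slack for noise and edge slop; tune for real captures


def classify_pulse(width_us):
    """Map one hi-pulse width to '0', '1', or 'p' (preamble/stop-sized pulse)."""
    for symbol, nominal in (("0", ZERO_US), ("p", SYNC_US), ("1", ONE_US)):
        if abs(width_us - nominal) <= TOLERANCE_US:
            return symbol
    return "?"  # width does not match any known symbol


def classify_packet(widths_us):
    """Turn the hi-pulse widths of one burst into a symbol string.

    A trailing sync-sized pulse is relabelled 's' (stop bit).
    """
    symbols = [classify_pulse(w) for w in widths_us]
    if symbols and symbols[-1] == "p":
        symbols[-1] = "s"
    return "".join(symbols)
```

Because the three nominal widths are at least 166 μs apart, a tolerance of 80 μs keeps the windows from overlapping.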

I decoded a few packets by hand, both for fun and to validate the format. But it quickly became tiring to scribble down data like “pppppppppp111110111111101000000100110100011101000001100101s” and then convert it to hex (fbfa04d1d065), so I decided to write a tool to do it. In the browser, of course!
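The hand conversion is easy to mechanize. A small Python sketch (the function name is hypothetical) that strips the preamble and stop markers from a scribbled symbol string and hex-encodes the 48 data bits:

```python
def symbols_to_hex(symbols):
    """Strip the preamble ('p') and stop ('s') markers and hex-encode the bits."""
    bits = symbols.strip("ps")
    if len(bits) != 48 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly 48 data bits")
    # 48 bits are 6 bytes, so pad to 12 hex digits.
    return format(int(bits, 2), "012x")


sample = "pppppppppp111110111111101000000100110100011101000001100101s"
print(symbols_to_hex(sample))  # fbfa04d1d065
```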

At the time, Firefox supported a simple low-level Audio API that was exactly what I was looking for. In a nutshell, you add a MozAudioAvailable event listener to an <audio> element, and the listener periodically gets an array of sample data as the media plays. I implemented a simple state machine to decode the data, based on a manually set threshold between the hi/lo states (and some fudge-factors to deal with the imprecision/noise previously noted). I’m sure there are more elegant and automatic ways to do this, but simple brute force was enough for my limited needs. The one annoying downside is that this API can only process in real-time(!); there’s no way to ignore playback speed and just get all the data as fast as possible. If you’d like to play around with it, here’s a live demo and the source on Github. (*cough* I’ve been so slow in finishing this post, that the Audio API I’m using has been removed from Firefox 28, in favor of the newer Web Audio API. So you’ll need an older Firefox, or just gaze upon the following screenshot.)
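The thresholding step can be sketched in a few lines. This is Python rather than the browser JavaScript of the actual tool, and it is only an approximation of the idea: it assumes the samples are already an AM-demodulated envelope, and the function name, sample rate, and threshold are placeholders.

```python
def hi_pulse_widths(samples, sample_rate, threshold):
    """Return the width in microseconds of each run of samples above threshold."""
    widths = []
    run = 0  # number of consecutive samples seen in the hi state
    for value in samples:
        if value > threshold:
            run += 1
        elif run:
            # Falling edge: the hi run just ended, so record its duration.
            widths.append(run * 1_000_000 / sample_rate)
            run = 0
    if run:  # a pulse that is still hi when the data ends
        widths.append(run * 1_000_000 / sample_rate)
    return widths
```

At a 48 kHz sample rate one sample is about 21 μs, which is where the tens-of-microseconds imprecision comes from and why some fudge factor is needed when classifying the widths.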

Ok, so now I’ve got a bunch of decoded data values to examine, such as:

fbfa04b4b79b
fbfa04b4a9a9
fbfa049b80eb
fbfa04857a07
fbfa04858af7
fbfa048483ff

What do they mean?

The first 3 bytes (0xFBFA04) are always the same, so that’s presumably a serial number or unique ID (and the manual confirms that during setup, there’s a step to ensure that the receiver is getting the expected sensor ID).

The next two bytes must be the actual pH measurements. They are usually similar to each other, and by graphing the values I can see they follow the trend of the pH values reported on the receiver (which I was writing down when capturing the transmissions). Why two values? The manual says that a measurement is made every 6 seconds but a transmission only every 12, which I assume is done to save power. The pH is roughly obtained by dividing the byte’s value by 25 — but it looks like it’s somewhat non-linear or uncalibrated, as data for the lowest pH values needs to instead be divided by 30 to match what the receiver reports.
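Putting the layout so far into code, here is a minimal Python sketch. The field names are my own labels for this example, and the divisor of 25 is just the rough empirical value mentioned above, not a calibrated conversion.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    sensor_id: bytes  # first 3 bytes, constant for a given sensor
    ph1_raw: int      # first raw pH reading
    ph2_raw: int      # second raw pH reading
    last_byte: int    # meaning not yet established at this point


def parse_packet(hex_text):
    """Split a 6-byte packet (given as 12 hex digits) into its fields."""
    raw = bytes.fromhex(hex_text)
    if len(raw) != 6:
        raise ValueError("expected 6 bytes")
    return Packet(raw[:3], raw[3], raw[4], raw[5])


def approx_ph(raw_value, divisor=25):
    """Rough pH estimate; the lowest readings seem to need a divisor nearer 30."""
    return raw_value / divisor
```

For example, the raw value 0xb4 (180) from the first packet works out to roughly pH 7.2.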

The last byte took a bit more effort to figure out. At first glance it appeared fairly random, so I assumed it was some kind of checksum. Validating medical data seems important, after all. Probably something simple to compute for an 8-bit microcontroller, so no fancy FEC or CRC magic… I fiddled around with a few guesses, but graphing the data led me to the answer:

Graph comparing pH1/pH2 with checksum

The checksum value (green) looks like a stretched, inverted, and offset version of the average pH (yellow). How about ((pH1 + pH2) ^ 0xFF) + 7? (All modulo 256, since this is likely an 8-bit microcontroller.) That’s it! It correctly generates the observed checksum for each of the 144 packets I captured.
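Here is a small Python sketch of that check, with the 8-bit wraparound made explicit by masking with 0xFF. The function names are hypothetical; the packets are the ones listed earlier.

```python
def expected_checksum(ph1, ph2):
    """Sum the two readings, invert the bits, add 7, all in 8-bit arithmetic."""
    inverted = ((ph1 + ph2) & 0xFF) ^ 0xFF
    return (inverted + 7) & 0xFF


def packet_is_valid(hex_text):
    """True if the last byte of a 6-byte packet matches the computed checksum."""
    raw = bytes.fromhex(hex_text)
    return len(raw) == 6 and expected_checksum(raw[3], raw[4]) == raw[5]


packets = ["fbfa04b4b79b", "fbfa04b4a9a9", "fbfa049b80eb",
           "fbfa04857a07", "fbfa04858af7", "fbfa048483ff"]
print(all(packet_is_valid(p) for p in packets))  # True
```

Note the explicit parentheses: in C-like languages `+` binds tighter than `^`, so the XOR has to be grouped before the 7 is added.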

So with that, I’m able to decode, interpret, and validate the data packets. Neat. It’s not directly useful for anything, but made for a fun experiment.

Afterwards, I got to wondering if there might be some further technical details buried somewhere online to help explain or confirm what I found. I’ve seen patents and FCC filings used to glean data in other cases, so I went to look…

Patent US6689056 has a number of interesting tidbits. It indicates that the microcontroller in the sensor is probably a Microchip 12C672 (a member of their PIC family, which is similar to the Atmel AVR family familiar to Arduino folks). There’s a basic description of the packet format, but the only detail I hadn’t caught was that the 3-byte header is actually a 2-byte ID and a 1-byte Message Type (I only ever saw one type). It does confirm that the last byte is a checksum, but doesn’t go into how it’s computed.

On the FCC’s website, I found a 4/25/2001 application from Medtronic for the PHZ-BRAVO100, which has a number of close-up photos and an extremely detailed Test Report. It basically confirms what I had found, with additional info on the bit timing and packet format, and also reveals that there is a “transmitter status” message type that’s sent once an hour.

KickSat

Now let’s shift from inner space to outer space, or at least low Earth orbit. Back in November 2011, a fascinating project appeared on Kickstarter: “KickSat – Your personal spacecraft in space.” Usually satellites are large vehicles that cost millions, but recently this has been made more affordable by using the small, standardized CubeSat format (a 10cm cube, weighing 1.3kg). KickSat takes this a step further, by packing a CubeSat carrier with tiny “nanosatellites” (3cm square, weighing a few grams). That brings the cost down to just $300 to sponsor a KickSat in orbit, broadcasting a custom callsign and other simple data. They don’t do much, but it helps demonstrate the concept of using a fleet of cheap, simple sensors instead of a single expensive “Cadillac” spacecraft. For example, instead of predicting space weather using reports from a handful of satellites, you might use a huge number of cheap nanosatellites monitoring a wide area.

KickSat Sprite (solar cells and antenna not installed)

As a bona fide space nerd, I jumped at the opportunity. And now, after a long wait, KickSat is poised to launch in just a few days (March 30th), onboard the SpaceX CRS-3 resupply flight to the International Space Station. Assuming all goes well, the KickSat CubeSat carrier will be deployed immediately after 2nd stage cutoff. It orbits by itself for 16 days, to ensure wide clearance from the ISS, and then deploys its 104 KickSats. Including mine, which will be broadcasting “MOZFF”. They’ll orbit for a few weeks, and then burn up as they reenter the atmosphere. (No space junk!)

Rendering of the KickSat Sprites being deployed

The project is publishing info about the satellite’s transmissions, as well as info on how to set up an inexpensive ground station using… That’s right, software defined radio (GNURadio + RTL2832U dongle). I’ve got my equipment ready, and will attempt to capture signals from KickSat while it’s in orbit. More on that after launch!

Mozilla office history

If you don’t live in the San Francisco Bay Area, you may not be aware that the culmination of a major infrastructure project is underway this holiday weekend. The Bay Bridge, one of 5 major Bay Area bridges, is in the middle of a 5 1/2 day closure as it’s transitioned over to its replacement. (The other 4 major bridges, in case you’re wondering, are the Richmond–San Rafael Bridge, the San Mateo Bridge, the Dumbarton Bridge, aaaand… hmmm… oh, right, the Golden Gate Bridge.)

The Bay Bridge was originally built in the 1930s, and after the Loma Prieta earthquake in 1989 it became clear it needed to be replaced. One of the flashbulb memories many people have of the quake — in addition to it interrupting the ’89 World Series and the collapse of the double-decker Cypress Street Viaduct — is the failure of a span on the Bay Bridge, with cars driving over it. Since then, the western half of the Bay Bridge has been retrofitted to be earthquake-safe, but the eastern span of the bridge has taken longer to completely replace. This weekend’s work is to transition the connection points, so that tomorrow people will be driving over a completely new bridge that’s been 11 years and $6.4 billion in the making!

(I’m getting to the part where Mozilla ties into this.)

Last Friday @BurritoJustice tweeted a link to a slideshow that dove into the engineering history of the Bay Bridge, complete with photos taken during the construction.

It’s some fantastic engineering porn, and I spent my lunch reading through all of it. I happened to notice that the building in the background of one of the photos looked familiar…

Mozilla’s San Francisco office, in the historic Hills Brothers Coffee building at 2 Harrison Street, is literally next-door to where the western span of the Bay Bridge lands in S.F. It makes for some really great views of the bridge from our top-floor patio:

As well as a first-row seat beneath the giant “HILLS BROS COFFEE” sign atop the building.

It’s this sign that made it easy to spot our building in the engineering history slideshow. The building was constructed in 1926, and the Bay Bridge wasn’t built until 1933-1936, so I was curious to see if the sign was visible in other contemporaneous photos. I started digging through some online resources, and got lucky right away by finding a high-res version of that picture at the Library of Congress:

I skimmed through the other 415 photos in this collection and another 1160 from UC Berkeley (So You Don’t Have To™), and found some other nice shots with the Hills Coffee sign peeking out from the background:






So there you have it. Pics of the Mozilla San Francisco office from both ends of an 80 year span of history.

On Firefox’s Password Manager

It’s been a while since I last blogged about Firefox’s password manager. No big deal, it really hasn’t fundamentally changed since I rewrote it 6 years ago for Firefox 3.0. Well, ok, the storage backend did switch to SQLite, but that’s mostly invisible to users. There’s a noteworthy change coming soon (see next post!), but I figured that would be easier to explain if I first talked about how things currently work. Which I’ve been meaning to do for a long time, ahem.

The first thing you should know is that there is no standard or specification for how login pages work! The browser isn’t involved with the login process, other than to do generic things like loading pages and running JavaScript. So we basically have to guess about what’s going on in order to fill or save a username/password, and sometimes sites can do things that break this guesswork. I’m actually surprised I don’t get questions about this more often.

So let’s talk about how two of the main functions work — filling in known logins, and saving new logins.

Filling in a known login

The overall process for filling in an existing stored login is simple and brutish.

  1. Use the chrome docloader service and nsIWebProgress to learn when we start loading a new page.
  2. Add a DOMContentLoaded event listener to learn when that page has mostly loaded.
  3. When that event fires…
    1. Check to see if there are any logins stored for this site. If not, we’re done.
    2. Loop through each form element on the page…
      1. Is there an <input type=password> in the form? If not, skip form.
      2. Check to see if any known logins match the particular conditions of this form. If not, skip form.
      3. Check to see if various other conditions prevent us from using the login in this form.
      4. Fill in the username and/or password. Great success!

Phew! But it’s the details of looking at a form where things get complex.

To start with, where do the username and password go? The password is fairly obvious, because we can look for the password-specific input type. (What if there’s more than one? Then we ignore everything after the first.) There’s no unique “username” type; instead we just look for the first regular input field before the password field. At least, that was before HTML5 added more input types. Now we also allow types that could plausibly be usernames, like <input type=email> (but not types like <input type=color>). Note that this all relies on the order of fields in the DOM: we can’t detect cases where a username is intended to go after the password (thankfully I’ve never seen anyone do this), or cases where other text inputs are inserted between the actual username field and the password (perhaps with a table or CSS to adjust visual ordering).
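To make the heuristic concrete, here is a minimal Python sketch that models a form as a list of input types in DOM order. It is an illustration, not the actual Firefox code: the function name is made up, the set of username-capable types is an assumed subset, and “first field before the password” is read as the nearest preceding one.

```python
USERNAME_TYPES = {"text", "email"}  # assumed subset; the real list may differ


def find_login_fields(field_types):
    """Return (username_index, password_index) for a form's inputs in DOM order.

    Either index may be None. Only the first password field is considered.
    """
    try:
        password_index = field_types.index("password")
    except ValueError:
        return None, None  # no password field, so this form is skipped
    username_index = None
    # Walk backwards from the password field to the nearest plausible username field.
    for i in range(password_index - 1, -1, -1):
        if field_types[i] in USERNAME_TYPES:
            username_index = i
            break
    return username_index, password_index
```

With `["text", "text", "password"]` this picks index 1 as the username, which is exactly how an unrelated text input sitting just before the password field ends up being mistaken for the username.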

And then there’s the quirks of the fields themselves. If your username is “alice”, what should happen if the username field already has “bob” filled in? (We don’t change it or fill the password.) Or, more common and depressing, what if the username field already contains “Enter your sign in name here”? In Mongolian? (We treat it like “bob”.) What if the page has <input maxlength=8> but your username is “billvonweiterheimerschmidt”? (We avoid filling anything the user couldn’t have typed.)

And then there’s the quirks of the saved logins. What if the username field already has “ALICE” instead of “alice”? (We ignore case when filling, it’s a little trickier when saving.) Is there a difference between <input name=user> and <input name=login>? (Nope, we stopped using the fieldname in Firefox 3 because it was being used inconsistently, even within a site.) What about a site that has both a username+password _and_ a separate password-like PIN? (Surprisingly, we were able to make that work! Depending on the presence of a username field, we prefer one login or the other.)
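A few of these rules fit in a short Python sketch. The function and parameter names are invented for illustration, and the real checks cover more cases than this.

```python
def can_fill(saved_username, existing_value="", maxlength=None):
    """Decide whether a saved username may be filled into a username field."""
    # A different value already in the field (placeholder text included) blocks
    # the fill; the comparison ignores case.
    if existing_value and existing_value.lower() != saved_username.lower():
        return False
    # Never fill something the user could not have typed themselves.
    if maxlength is not None and len(saved_username) > maxlength:
        return False
    return True
```

So “ALICE” in the field still matches a saved “alice”, while “bob” or a prefilled hint string leaves the form untouched.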

And then. And then and then and then. Like I said, there’s no spec, and sometimes a site’s usage can break the guesses we make.

Saving a new/changed login

In comparison, the process for saving a login is simpler.

  1. Watch for any form submissions (via a chrome earlyformsubmit observer)
  2. Given a form submission, is there a password field in it? If not, we’re done.
  3. Determine the username and password from the form, and compare with existing logins…
    • If username is new, ask if user wants to save this new login
    • If username exists but the password is different, ask if user wants to change the saved password
    • If username and password are already saved, there’s nothing else to do.
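The comparison in step 3 boils down to a three-way decision. A minimal Python sketch, assuming the stored logins for the site are available as a username-to-password mapping (the function name and return labels are hypothetical):

```python
def save_action(username, password, stored):
    """Pick what to do after a form submission.

    `stored` maps each saved username for this site to its saved password.
    """
    if username not in stored:
        return "prompt_save"    # new login: ask whether to save it
    if stored[username] != password:
        return "prompt_change"  # known user, new password: ask to update
    return "nothing"            # already saved as-is
```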

Of course, there are still a number of complicating details!

This whole process is initiated by a form submission. If a site doesn’t actually submit a form (via form.submit() or <button type=submit>), but just runs some JavaScript to process the login, the password manager won’t see it. And thus can’t offer to save a new/changed login for the user. (Note that this is easy for a site to work around — do your work in the form’s onsubmit, but return |false| to cancel the submission).

Oh, and there’s still the same question as before — how to determine which fields are the username and password? We reuse the same algorithm as when filling a page, for consistency. But there are a few wrinkles. The form might be changing a password, so there could be up to 3 relevant password fields (for the cases of: just a new password, old and new passwords, and old + new + confirm.). And the password fields could be in any order! (Surprisingly, again, this works.) The most common problem I’ve seen is an account signup page with other form fields between the username and password, such as “choose a user name, enter your shipping address, set a password”. The password manager will guess wrong and think your zipcode is your username.

Oh, and somewhere in all this I should mention how differences in URLs can prevent filling in a login (or result in saving a seemingly-duplicate login). Clearly google.com and yahoo.com logins should be separate. But we also match on protocol, so that a https://site.com login will not be filled in on http://site.com. And what about http://www.foo.com and foo.com or accounts.foo.com? (We treat them separately.) What about mail.mozilla.com and people.mozilla.com? (Also separate.) What you might not realize is that we also use the form’s action URL (i.e., where the form is submitted to), ever since bug 360493. While this prevented sending your myspace.com password to evilhacker.com, it also breaks things when a site uses slightly different URLs in various login pages, or later changes where their login forms submit to.
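As a sketch of that matching in Python: this assumes both the page URL and the form’s action URL are reduced to scheme plus host before comparing, which is my simplification for illustration. The function names are hypothetical.

```python
from urllib.parse import urlsplit


def origin(url):
    """Reduce a full URL to scheme://host[:port]."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


def login_matches(stored_origin, stored_action_origin, page_url, form_action_url):
    """A stored login applies only if both the page and the form target match."""
    return (origin(page_url) == stored_origin
            and origin(form_action_url) == stored_action_origin)
```

Under this rule a login saved for https://site.com is not offered on http://site.com, and a form that suddenly submits to a different host no longer matches either.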

Oh, bother.

All the gory details

This is one of the benefits of being Open Source. If you want to see alllll the gory details of how the Firefox password manager works, you can look at the source code. See http://mxr.mozilla.org/mozilla-central/source/toolkit/components/passwordmgr/. In particular, LoginManagerContent.jsm contains the code implementing the stuff discussed in this post, with the main entry points being onContentLoaded() and onFormSubmit().

Finally (!), I’ll mention that the Firefox password manager has some built in logging to help with debugging problems. If you find it not working in some particular case, you can enable the logging and it will often make the problem clear — or at least clearer to a developer when you file a bug!

Boom goes the dynamite

This weekend bug 758812 landed on mozilla-central. So begins a long, slow process of splitting up browser.js into smaller pieces.

“What’s the deal with browser.js, anyway?”

It’s big. Too damn big. For those unfamiliar with it, browser.js basically contains a bunch of the code for driving the UI in a Firefox window. Anything you click or see changing as you browse probably involves this code. A browser needs to do a surprising amount of stuff to work, and over time browser.js has become an eclectic collection of code. About 13K lines of code, in fact. That’s a lot, and leads to a number of problems… It’s a daunting behemoth for those new to the Mozilla code base. It’s haphazardly organized (at best), so it’s hard to find things unless you already know what you’re looking for. And looking at Mercurial’s history for the whole file doesn’t really give you a clear picture of how specific pieces evolve.

“So what are you going to do about it, punk?”

Break it up! The first step landed, which is just spinning out a few big chunks of related code into #included files (about 2K lines). As time goes on (and we see how changes work out), we’ll likely spin out even more. Some of these pieces may also end up evolving into JSMs, which has modest benefits for improving memory usage and startup time. We might even be able to share some of this code across products (Firefox, Fennec, B2G).

“And the bad news?”

Well, there really isn’t any. If you’re a Firefox user, you won’t see anything change. If you’re an add-on developer, you’re unlikely to be affected by these changes. The cleaving of browser.js is currently just a source-tree change with #includes, so the resulting browser.js that ships in Firefox is mostly the same. Future changes to move code into JSMs will have some impact, but we’ll try to keep changes small and approachable. Really, it’s only Firefox front-end developers who are likely to notice the changes: patches to certain code in browser.js may now need to patch code in browser-foo.js instead.

“I’m not seeing any cats in this post so far.”

I am never one to disappoint. Here is my cat wearing a tiara.

Unowned Reviews

I was feeling a little bit ornery today, and decided to take a look at unowned reviews in Bugzilla (aka patches with a review request “to the wind”, instead of requesting review from some specific person).

After a bit of head scratching to figure out _how_ to get a list of such bugs through the search form, I gave up and used Teh Googlez to get my search query.

When I started there were ~130 requests (in ~110 bugs) across all of Bugzilla. Surprisingly, many of the requests (maybe 2/3rds?) were in bugs that were already resolved fixed/invalid/dupe/wfm, and were thus easily cleared out — along with a comment to take action if there was some non-obvious reason to keep the patch active. (And I use the word “active” loosely as most of these patches were _years_ old.)

For the patches in bugs which were still open, I generally assigned to a reviewer I knew was active in that area (to either close the bug out, perform a review, or reassign to someone else). In a few cases, where patches were really ancient (e.g. 6+ years waiting for a review!) I cleared the review and asked that the patch be updated or bug closed. Reviews of that vintage are simply not useful, and stand in the way of driving this list of reviews to Zarro.

Currently, there are ~35 unassigned review requests (in ~25 bugs). Most of those are in Rhino/Tamarin or addons.mozilla.org, and I wasn’t sure what to do with them. I’ll ask around this week.

Progress.

Firefox flowers

I was at IKEA the other day (for the first time ever!), and juuust as I was about to make it out the door with minimal expense I saw their assortment of fake flowers, and knew what had to be done.

Adds some nice color to my desk at work. When I get bored of it I’ll probably move it to some conference room. 🙂

Make your own!

1 REKTANGEL vase.
19 SNÄRTIG flowers (5 blue, 1 yellow, 3 yellow-orange, 7 orange, 3 red).
1 red SMYCKA flower.

I had the red and black filler sitting around from some other store, but it’s easily available. (Note symbolism in Firefox being anchored in the rock of Mozilla.)

My cat, Munchkin, is not available for sale. Sorry.

(…i can haz flowers?)

Speaking of Community

In my last post I mentioned how super satisfying it is to be part of a community working together to make Firefox better.

This morning someone new dropped into #fx-team on IRC, asking about a reproducible problem and how to submit a patch. Normally #fx-team is a fairly busy channel, but since we were all at the Mozilla All-Hands last week, people were traveling and recovering… So it was 12 hours of dead silence before anyone replied. Not great, but also pretty rare. 😦

Anyway, the awesome part was that in that time this person managed to grab the source code, diagnose the problem with gdb, fix the problem, and then blog about it (check it out). That’s pretty awesome! (Oops, I already said awesome once. Still deserving of a 2nd ‘awesome’. 🙂)

The timing couldn’t have been better. At the All Hands, I went to one of the talks by Ubuntu’s Jono Bacon on Growing and Maintaining Community. It was pretty awesome (3rd time, yes I know). Obviously Ubuntu has a large community, but I hadn’t realized just how much thoughtful effort has gone into improving it… TONS of fantastic ideas and practices that I hope Mozilla can also make use of. The one downside, though, was that I came away feeling a little bummed at how much work we need to do to reach the high level of wide-ranging, effective engagement Bacon et al have developed. So it was really splendid timing to have a new community member pop out and have a good experience.

Today was a good day.