Trying to generate source maps by performing dynamic analysis on the compiler

Sometimes the source maps Babel generates aren’t super accurate. Then you get something like this:

Often this isn’t actually Babel’s fault, but rather a problem with the Babel plugin that does the transformation.

Generating source maps through dynamic analysis

The project I’m working on, FromJS, collects information about JavaScript code while it’s running. Specifically, it allows you to see how two strings relate to each other.

By running Babel while FromJS is tracing it, I can find out how the compiled code relates to the original source code.

For example, let’s compile the arrow function var square = x => x * x. I can look at the compiled code and see where the square variable was defined in the original code.

And I can do the same for the x parameter.
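To make the idea of relating two strings more concrete, here is a minimal sketch of character-level origin tracking. It is only an illustration, not how FromJS is actually implemented: the TrackedString class, its methods, and the file names input.js and compiler.js are all made up for this example.

```javascript
// Hypothetical illustration of origin tracking; not FromJS's real implementation.
class TrackedString {
  constructor(value, origins) {
    this.value = value;
    this.origins = origins; // one { file, index } entry per character
  }

  // Wrap text read from a file so every character remembers where it came from.
  static fromSource(file, text) {
    const origins = Array.from({ length: text.length }, (_, index) => ({ file, index }));
    return new TrackedString(text, origins);
  }

  // String operations carry the origin data along with the characters.
  slice(start, end) {
    return new TrackedString(this.value.slice(start, end), this.origins.slice(start, end));
  }

  concat(other) {
    return new TrackedString(this.value + other.value, this.origins.concat(other.origins));
  }
}

const input = TrackedString.fromSource("input.js", "var square = x => x * x");
const compilerText = TrackedString.fromSource("compiler.js", "function ");

// The compiler copies the identifier out of the input and appends it to its own text.
const output = compilerText.concat(input.slice(4, 10));

console.log(output.value); // "function square"
console.log(output.origins[9]); // the "s" of "square" came from input.js, index 4
```

Looking up output.origins for any character of the compiled code then answers the question "where in the original code did this come from?", which is the raw material for a source map.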

Using the source-map package on npm, I can then piece the individual mappings together into a source map.
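To show what piecing the mappings together involves, here is a dependency-free sketch that builds a version 3 source map by hand. It deliberately does not use the source-map package, which does this work for you. The helper names (encodeVlq, buildSourceMap), the file names, and the assumed Babel output are all illustrative, and lines and columns are 0-based in this sketch.

```javascript
const BASE64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Encode one signed integer as a Base64 VLQ, the number format of the "mappings" field.
function encodeVlq(value) {
  let vlq = value < 0 ? ((-value) << 1) | 1 : value << 1; // sign goes in the lowest bit
  let out = "";
  do {
    let digit = vlq & 31;
    vlq >>>= 5;
    if (vlq > 0) digit |= 32; // continuation bit: more digits follow
    out += BASE64[digit];
  } while (vlq > 0);
  return out;
}

// mappings: [{ generated: { line, column }, original: { line, column } }], all 0-based.
// Assumes a single original source file and no "names" entries.
function buildSourceMap(file, source, mappings) {
  const sorted = [...mappings].sort(
    (a, b) => a.generated.line - b.generated.line || a.generated.column - b.generated.column
  );
  const lines = [];
  let prevOrigLine = 0;
  let prevOrigCol = 0;
  for (const m of sorted) {
    while (lines.length <= m.generated.line) lines.push({ prevGenCol: 0, segments: [] });
    const line = lines[m.generated.line];
    // Every field is stored relative to the previous segment; the generated
    // column restarts at 0 on each generated line.
    line.segments.push(
      encodeVlq(m.generated.column - line.prevGenCol) +
        encodeVlq(0) + // source index delta: always 0 with a single source
        encodeVlq(m.original.line - prevOrigLine) +
        encodeVlq(m.original.column - prevOrigCol)
    );
    line.prevGenCol = m.generated.column;
    prevOrigLine = m.original.line;
    prevOrigCol = m.original.column;
  }
  return {
    version: 3,
    file,
    sources: [source],
    names: [],
    mappings: lines.map((l) => l.segments.join(",")).join(";"),
  };
}

// Original:  var square = x => x * x
// Generated (assumed output):
//   var square = function square(x) {
//     return x * x;
//   };
const sourceMap = buildSourceMap("square.compiled.js", "square.js", [
  { generated: { line: 0, column: 4 }, original: { line: 0, column: 4 } }, // square
  { generated: { line: 0, column: 29 }, original: { line: 0, column: 13 } }, // parameter x
  { generated: { line: 1, column: 9 }, original: { line: 0, column: 18 } }, // first x in x * x
]);

console.log(JSON.stringify(sourceMap));
```

With the source-map package, the equivalent step would be calling addMapping on a SourceMapGenerator for each collected mapping; note that the package expects 1-based line numbers.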

Does it work?

A little bit, maybe.

As I showed above, the mappings tend to work for literals and variable identifiers. But if you look at something like a variable declaration, you can see that the "var" string isn’t taken directly from the uncompiled source code.

Rather, "var" appears directly in the source code of the compiler. (The Babel code is in sm-test-compield.js.)

So in practice I have to discard a lot of the collected data, because the relationship it describes isn’t the relationship between the original code and the compiled code.
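The filtering step can be sketched roughly like this. The record shape (text, originFile, originIndex) and the input file name input.js are assumptions made for the example, not the actual data format FromJS produces.

```javascript
// Hypothetical record shape: one entry per traced piece of the compiled output.
const collected = [
  { text: "var", originFile: "sm-test-compield.js", originIndex: 1234 }, // made-up offset
  { text: "square", originFile: "input.js", originIndex: 4 },
];

// Keep only the records whose origin is the file being compiled. Anything that
// traces back to the compiler's own source says nothing about the original code.
function keepInputMappings(records, inputFile) {
  return records.filter((record) => record.originFile === inputFile);
}

const usable = keepInputMappings(collected, "input.js");
console.log(usable.map((record) => record.text)); // only "square" remains
```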

While stepping through certain parts of the code works well, there are other parts where the source map doesn’t work at all.

Improving Babel source maps

My original idea wasn’t to generate complete source maps from scratch. Rather, I wanted to find problems in the normal source maps generated by Babel and improve them with the data I collected.

However, I think most of the places where I can collect mapping data are already well covered by Babel. The problems with Babel source maps seem to come mostly from new code that was generated, rather than code that’s based directly on the input source code.

Overall, it was a fun experiment, but it doesn’t seem useful in practice.