Ladies and gentlemen, demork.py. My first foray into Python. Somebody else’s Python, specifically.
Demork.py takes a Mozilla history.db file (in the now-legendary “mork” format) and spits out valid XML. I hope that some of you find it useful now, and that it makes migrating away from Mozilla’s current data format easier.
Thank you, Shaver, for keeping me from starting from scratch, and thanks more generally to the Mozilla Project for their fine, fine browser. I hope this helps a little, maybe. I’ve got one possibly-misguided question, if there is any XML expertise out there: I’ve done a straight-up regex replacement of all the ampersands in the URLs in the history file with “&amp;”. This makes for valid XML, but I foresee broken URLs. Should this be taken care of automagically during the trip back from XML, or should I be replacing them with “%something” instead?
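For what it's worth, here is a minimal sketch of the round trip in question, using only the Python standard library. The URL and the `url` element name are made up for illustration and are not what demork.py actually emits. It suggests that a conforming XML parser hands back the original ampersand on the way out, so the URL should survive intact:

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# Hypothetical URL with an ampersand in its query string.
url = "http://example.com/search?q=mork&lang=en"

# escape() rewrites &, < and > as XML entities, so & becomes &amp;.
# The <url> element name is an assumption for this sketch.
xml_text = "<url>" + escape(url) + "</url>"
print(xml_text)  # <url>http://example.com/search?q=mork&amp;lang=en</url>

# Parsing the XML turns &amp; back into a plain & in the element text.
parsed = ET.fromstring(xml_text)
print(parsed.text == url)  # True
```

If that holds for whatever reads the file later, the entity replacement is the XML-side fix, and percent-encoding would be a separate, URL-level matter.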
One last word: those of you who are thinking about rolling your own undocumented, one-off data format for whatever job you happen to be doing, don’t. Just don’t. Whether you realize it or not you are playing a nasty, frathouse-grade prank on the future. You are pre-emptively saran-wrapping millions of as-yet-unbuilt toilets, and people will remember your name and hate you for it.
If anybody would like to help me test this (I only have one history file, as you might expect) I’d appreciate it. I hope you can take a minute to run it against your history file like so:
python demork.py history.db > history.xml
and then validate the XML file, either here or with the household appliance of your preference. Please let me know if it works, along with the details of any failure. Many thanks.
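If no validator is handy, a rough local check can be sketched in Python. This is an assumption-laden stand-in, not part of demork.py: the helper name `is_well_formed` is made up, and it only checks that the file is well-formed XML, not that it is valid against any schema or DTD.

```python
import sys
import xml.etree.ElementTree as ET


def is_well_formed(path):
    """Return True if the file at `path` parses as well-formed XML."""
    try:
        ET.parse(path)
    except ET.ParseError as err:
        # ParseError carries the line and column of the first problem.
        print(f"not well-formed: {err}", file=sys.stderr)
        return False
    return True
```

Calling `is_well_formed("history.xml")` on the output of the command above returns `True` or `False`, and the parser's complaint, if any, goes to stderr.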