Welcome to Software Development on Codidact!
Post History
How does one parse/decode/extract information stored by a web page in Firefox's browser local storage without the browser? So far, I've worked out that the data lives at ~/.mozilla/firefox/${profi...
#2: Post edited
- How does one parse/decode/extract information stored by a web page in Firefox's browser local storage without the browser?
- So far, I've worked out that, at `~/.mozilla/firefox/${profile}/storage/default/${site}/ls/data.sqlite`, site looks something like `https+++software.codidact.com`. Because the browser locks the database at least when the page is open, I copy it off to a temporary version.
- SQLite3 then shows two tables, `database` that mostly only describes the site, and `data` which holds what we find under Developer Tools/Storage/Local Storage/(relevant URL). Querying `SELECT value FROM data WHERE key = 'whateverKeyWeCareAbout'` gives...something.
- So far, so good.
- However, the "something" that we get from the `value` column *might* come through as plain JSON - it looks like this happens when the browser serializes a smaller object - or it might come through as something more complicated, which resembles small fragments of the expected JSON interspersed with opaque binary sequences.
- The site that interests me happens to be Open Source, so I checked to see if they maybe used some bizarre encryption, but no, they store to local storage with `localStorage.setItem(JSON.stringify(object))` from a single central utility function.
- I've seen suggestions that Firefox uses [MessagePack](https://msgpack.org/index.html) for this, but the data doesn't seem to decode that way. Another suggests that this might be the internal representation of the [structured clone algorithm](https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Structured_clone_algorithm), but I couldn't find any corroboration for this or any code or tool to get a serialized form to verify it.
- The ideal would be to run something like `sqlite3 "copied-data.sqlite" "SELECT value FROM data WHERE key = 'whateverKeyWeCareAbout'" | tool-to-get-JSON | jq .field` and use that further in a script. Though I'm open to anything that doesn't tie up the browser.
- How does one parse/decode/extract information stored by a web page in Firefox's browser local storage without the browser?
- So far, I've worked out that the data lives at `~/.mozilla/firefox/${profile}/storage/default/${site}/ls/data.sqlite`, where the site looks something like `https+++software.codidact.com`. Because the browser locks the database at least when the page is open, I copy it off to a temporary version.
- SQLite3 then shows two tables, `database` that mostly only describes the site, and `data` which holds what we find under Developer Tools/Storage/Local Storage/(relevant URL). Querying `SELECT value FROM data WHERE key = 'whateverKeyWeCareAbout'` gives...something.
- So far, so good.
- However, the "something" that we get from the `value` column *might* come through as plain JSON - it looks like this happens when the browser serializes a smaller object - or it might come through as something more complicated, which resembles small fragments of the expected JSON interspersed with opaque binary sequences. The latter case is the concern here.
- The site that interests me happens to be Open Source, so I checked to see if they maybe used some bizarre encryption, but no, they store to local storage with `localStorage.setItem(key, JSON.stringify(object))` from a single central utility function, so I can't pull the mechanism out of there.
- I've seen suggestions that Firefox uses [MessagePack](https://msgpack.org/index.html) for this storage, but the data doesn't seem to decode with that assumption. Another suggestion is that this might be the internal representation of the [structured clone algorithm](https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Structured_clone_algorithm), but I couldn't find any corroboration for this or any code or tool to get a serialized form to verify it.
- The ideal would be to run something like `sqlite3 "copied-data.sqlite" "SELECT value FROM data WHERE key = 'whateverKeyWeCareAbout'" | tool-to-get-JSON | jq .field` and use that further in a script, though I'm open to almost anything that doesn't tie up the browser.
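For reference, the steps described above (copy the database aside, query the `data` table, then check whether the value is plain JSON) can be sketched in Python using only the standard library. This is a minimal sketch under stated assumptions, not a decoder for the opaque form: the function names `read_local_storage_value` and `classify_value` are hypothetical, only the `key` and `value` columns mentioned in the question are assumed to exist, and anything that does not parse as UTF-8 JSON is simply handed back as raw bytes.

```python
import json
import shutil
import sqlite3
import tempfile
from pathlib import Path


def read_local_storage_value(db_path, key):
    """Copy the database aside and return the raw `value` for `key`, or None.

    Hypothetical helper: it assumes only the `data` table with `key` and
    `value` columns described in the question.
    """
    with tempfile.TemporaryDirectory() as tmp:
        copy = Path(tmp) / "copied-data.sqlite"
        # Work on a copy, since the browser may hold a lock on the original.
        shutil.copy2(db_path, copy)
        conn = sqlite3.connect(copy)
        try:
            row = conn.execute(
                "SELECT value FROM data WHERE key = ?", (key,)
            ).fetchone()
        finally:
            conn.close()
    return None if row is None else row[0]


def classify_value(raw):
    """Return ("json", parsed) for plain JSON, else ("opaque", raw_bytes).

    Hypothetical helper: it does not attempt to decode the opaque form,
    it only separates the two cases described in the question.
    """
    if isinstance(raw, (bytes, bytearray, memoryview)):
        data = bytes(raw)
    else:
        data = str(raw).encode("utf-8")
    try:
        return "json", json.loads(data.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return "opaque", data


# Example call (the path and key are placeholders, as in the question):
# raw = read_local_storage_value("data.sqlite", "whateverKeyWeCareAbout")
# kind, payload = classify_value(raw)
```

Values reported as `"opaque"` are exactly the ones the question is about; the sketch only makes it easy to tell from a script which keys fall into that case.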
#1: Initial revision
Extracting Firefox Local Data
How does one parse/decode/extract information stored by a web page in Firefox's browser local storage without the browser? So far, I've worked out that, at `~/.mozilla/firefox/${profile}/storage/default/${site}/ls/data.sqlite`, site looks something like `https+++software.codidact.com`. Because the browser locks the database at least when the page is open, I copy it off to a temporary version. SQLite3 then shows two tables, `database` that mostly only describes the site, and `data` which holds what we find under Developer Tools/Storage/Local Storage/(relevant URL). Querying `SELECT value FROM data WHERE key = 'whateverKeyWeCareAbout'` gives...something. So far, so good. However, the "something" that we get from the `value` column *might* come through as plain JSON - it looks like this happens when the browser serializes a smaller object - or it might come through as something more complicated, which resembles small fragments of the expected JSON interspersed with opaque binary sequences. The site that interests me happens to be Open Source, so I checked to see if they maybe used some bizarre encryption, but no, they store to local storage with `localStorage.setItem(JSON.stringify(object))` from a single central utility function. I've seen suggestions that Firefox uses [MessagePack](https://msgpack.org/index.html) for this, but the data doesn't seem to decode that way. Another suggests that this might be the internal representation of the [structured clone algorithm](https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Structured_clone_algorithm), but I couldn't find any corroboration for this or any code or tool to get a serialized form to verify it. The ideal would be to run something like `sqlite3 "copied-data.sqlite" "SELECT value FROM data WHERE key = 'whateverKeyWeCareAbout'" | tool-to-get-JSON | jq .field` and use that further in a script. Though I'm open to anything that doesn't tie up the browser.