Is it dangerous to use json.loads on untrusted data?

−0

Short answer: No, it's not dangerous.

Short of bugs in the implementation or monkey-patching, there's no reason it would or should allow executing of anything other than the JSON parsing code. This Stack Overflow answer goes into detail for the actual implementation at that time. I didn't find any CVEs related to json.loads.

As a potental denial-of-service vector, I haven't looked at the code, but there's no real reason to expect it to be super-linear in either time or space with respect to the length of the string input.

While it doesn't seem relevant here, you can sometimes have security issues where different pieces of code interpret the same input in different ways. This is usually security relevant when one of those pieces of code is doing some kind of security check. This doesn't seem to be the case here.

However, to that end, this code is likely slightly better than passing the string through untouched assuming some other part of your system processes it as JSON. This code will then ensure that you are passing valid JSON to those deeper parts of the code. (Ideally, you would also check the JSON against a schema.) Even better, would be to parse into a JSON object and then pass that around rather than a string. You can convert it to a string when you need it as a string. If you still want to pass it around as a string though, it would be better to reserialize (i.e. json.dumps) so that you know the JSON is in a consistent, well-behaved format. For example, let's say the JSON parser used elsewhere in your code does have some vulnerability when exposed to legal but carefully formatted JSON. If the input to that JSON parser is only ever the output of json.dumps, then you only need to consider the output json.dumps can produce, and not all possible legal JSON strings. In general, anything you can do to reduce the number of representations for the same value is a good thing for correctness, ease of testing, and security.

If the string is truly an opaque blob with respect to your code as well as to any 3rd party services you pass it to, then you shouldn't be touching it at all. I assume that isn't the case, because otherwise you'd have no reason to expect it to be JSON at all.

Generally speaking, you want to parse into a structured representation as soon as you can and as strictly as you can^[1]. (If this seems to conflict with Postel's Principle, often sloganized as "Be conservative in what you do, and liberal in what you accept from others", that's because it is commonly misunderstood and the misunderstood form is terrible advice. See A Patch for Postel's Robustness Principle.) I recommend the LangSec site generally for more discussion of these matters.

To be clear, "as strictly as you can" means respecting the protocol/file format. If you claim to accept JSON matching a schema, you should accept all JSON matching that schema, otherwise you aren't accepting JSON. ↩︎

posted over 3 years ago

CC BY-SA 4.0

3y ago by hkotsubo‭

Derek Elkins‭

2719 reputation 0 53 267 12

Copy Link

Raw

Markdown

History

Communities

Is it dangerous to use json.loads on untrusted data?

0 comment threads

1 answer

0 comment threads