Validate All Object Properties with JSON Schema
I'm writing a JSON schema to validate asset files for a program. The JSON I need to parse is structured so:
{
"jobs": {
"software-developer": { "job-description": "sw-dev.md:0", "pay": 80000 },
"sales-bro": { "job-description": "sales-bro", "pay": 190000 }
},
"people": {
"alice": { "name": "Alice", "profile-pic": "profile-alice.jpg:0" },
"bob": { "name": "Bob", "profile-pic": "bob.png:0" }
}
}
As you can see, each sub-object of a primary group follows a very deterministic pattern, but the top-level object keys vary based on the id of the asset being described. In this format, things like alice, bob, and sales-bro aren't being treated as object properties, but more like identifiers.
Is there a way in JSON schema to validate such a scheme? Can I create a rule that validates the values corresponding to every key within an object?
{
"$schema": "https://json-schema.org/draft-04/schema",
"$id": "https://example.com/my-assets.json",
"title": "Assets",
"type": "object",
"properties": {
"jobs": {
"type": "object",
"each-property": { // <--- dubious
"type": "object",
"properties": {
"job-description": ...,
"pay": ...
}
}
},
"people": {
// etc...
}
}
}
1 answer
If the names of the properties in the jobs and people objects are variable, you can use the patternProperties keyword instead of the properties keyword in the respective schemas for jobs and people.
As per the specification for patternProperties (https://json-schema.org/draft/2020-12/json-schema-core.html#rfc.section.10.3.2.2):
The value of "patternProperties" MUST be an object. Each property name of this object SHOULD be a valid regular expression, according to the ECMA-262 regular expression dialect. Each property value of this object MUST be a valid JSON Schema.
Validation succeeds if, for each instance name that matches any regular expressions that appear as a property name in this keyword's value, the child instance for that name successfully validates against each schema that corresponds to a matching regular expression.
This allows specifying regex patterns for instance property names instead of having to specify each and every exact instance property name.
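To make the quoted rule concrete, here is a minimal Python sketch of how a validator might apply patternProperties. It is not a real JSON Schema implementation: the helper names (check_pattern_properties, is_job) are made up for illustration, each subschema is replaced by a plain Python predicate, and Python's re module stands in for an ECMA-262 engine (the two agree for simple patterns like these). The "pay" value of sales-bro is deliberately broken so that one entry fails.

```python
import re

def check_pattern_properties(instance, pattern_properties):
    """Return the names of properties whose value fails a matching subschema.

    `pattern_properties` maps a regex to a predicate that stands in for a
    subschema (a hypothetical simplification for this sketch).
    """
    failures = []
    for name, value in instance.items():
        for pattern, subschema_ok in pattern_properties.items():
            # Patterns are not implicitly anchored, so search() is used
            # rather than fullmatch().
            if re.search(pattern, name) and not subschema_ok(value):
                failures.append(name)
                break
    return failures

def is_job(value):
    # Stand-in for the job entity schema: an object with a numeric "pay".
    return isinstance(value, dict) and isinstance(value.get("pay"), (int, float))

jobs = {
    "software-developer": {"job-description": "sw-dev.md:0", "pay": 80000},
    "sales-bro": {"job-description": "sales-bro", "pay": "a lot"},  # broken on purpose
}

failed = check_pattern_properties(jobs, {".*": is_job})
print(failed)  # ['sales-bro']
```

Note that a property whose name matches none of the patterns is simply skipped by this keyword; its value is not checked at all.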
So, how will this be used in an actual schema?
Format of the "identifier" property names does not matter
If the precise format of the "identifier" property names in jobs and people (like "software-developer" or "alice", for example) doesn't matter, the pattern matching the property names could be a simple catch-all pattern ".*" or a zero-length pattern like "" (the quotation marks are not part of the pattern).
{
"$schema": "https://json-schema.org/draft-04/schema",
"$id": "https://example.com/my-assets.json",
"title": "Assets",
"type": "object",
"properties": {
"jobs": {
"type": "object",
"patternProperties": {
".*": { ... schema for a job entity object ... }
}
...
}
(For entirely subjective readability reasons, I personally prefer ".*" over "", but feel free to use whichever you like best.)
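Both spellings work because the patterns are not implicitly anchored: a pattern only has to match somewhere in the property name. A quick way to convince yourself, sketched with Python's re module as a stand-in for an ECMA-262 engine (an assumption of this example; the two behave the same for these trivial patterns):

```python
import re

names = ["software-developer", "sales-bro", "alice", ""]

# Both the catch-all and the empty pattern find a match in every name,
# including the empty string.
matches_dot_star = all(re.search(".*", n) is not None for n in names)
matches_empty = all(re.search("", n) is not None for n in names)

# An unanchored pattern matches anywhere in the name; an anchor restricts it.
unanchored = re.search("bro", "sales-bro") is not None
anchored = re.search("^bro", "sales-bro") is not None

print(matches_dot_star, matches_empty, unanchored, anchored)
# True True True False
```

The last two checks also show why a pattern that is meant to describe the whole property name needs explicit ^ and $ anchors.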
"Identifier" property names have to adhere to a format
If the "identifier" property names in jobs and/or people have to adhere to a well-defined format to be considered valid, the regex patterns could be designed in a way to match only valid identifier property names.
To make the schema validation fail if an "identifier" property name does not match any of the defined property name patterns, the additionalProperties keyword (https://json-schema.org/draft/2020-12/json-schema-core.html#rfc.section.10.3.2.3) can be used to assign the false schema to any instance properties whose names did not match any of the regex patterns in patternProperties. The false schema will always fail validation.
As an illustrative example, let's assume the identifier property names in jobs must adhere to the following format:
- consist only of upper/lower-case Latin letters and hyphens
- start with an upper/lower-case letter
- have a maximum length of 10 characters
(I intentionally chose a length shorter than the "software-developer" identifier to demonstrate how the schema correctly fails in case a property name violates the specified format. See also the paragraph underneath the schema example below.)
These rules can, for example, be expressed by the regex pattern ^[A-Za-z][A-Za-z-]{0,9}$ (both cases are spelled out in the character classes because the ECMA-262 dialect does not accept an inline (?i) case-insensitivity flag). The following schema uses this regex pattern and thus makes sure that validation only succeeds if all identifier property names in jobs adhere to the format I just made up above:
{
"$schema": "https://json-schema.org/draft-04/schema",
"$id": "https://example.com/my-assets.json",
"title": "Assets",
"type": "object",
"properties": {
"jobs": {
"type": "object",
"patternProperties": {
"^[A-Za-z][A-Za-z-]{0,9}$": { ... schema for a job entity object ... }
},
"additionalProperties": false,
...
}
Validating the JSON data from the question with this schema will fail. That makes sense, since my made-up rules specify a maximum length of 10 characters for jobs identifier property names, and "software-developer" is obviously longer than that. Using the additionalProperties keyword with the false schema is crucial here.
If the additionalProperties keyword in my example schema were omitted or not assigned the false schema, validation would incorrectly succeed, despite some "identifier" property names violating the specified format and not matching the pattern(s) in patternProperties. Depending on the overall specification of the schema and the purpose of the validation, this could constitute a serious flaw: if no pattern matches a property name, the schema assigned to that pattern is never applied to the value of that property either. Validation then succeeds without detecting the malformed "identifier" property name, and it also leaves the value of that property completely unvalidated.
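The difference can be reproduced with a small hand-rolled Python sketch of the two keywords. Everything in it is an assumption made for illustration rather than a real validator: the function names validate_object and is_job are invented, the job entity schema is reduced to "pay must be a number", Python's re module stands in for an ECMA-262 engine, and the format rules are written with explicit character classes instead of a case-insensitivity flag. The "pay" value of software-developer is deliberately broken to show that it slips through.

```python
import re

# Letters and hyphens only, must start with a letter, at most 10 characters.
JOB_ID = r"^[A-Za-z][A-Za-z-]{0,9}$"

def validate_object(instance, pattern_properties, additional_allowed=True):
    """Sketch of patternProperties plus a boolean additionalProperties."""
    errors = []
    for name, value in instance.items():
        matching = [check for pattern, check in pattern_properties.items()
                    if re.search(pattern, name)]
        if matching:
            if not all(check(value) for check in matching):
                errors.append(f"{name}: value does not match the entity schema")
        elif not additional_allowed:
            # additionalProperties: false rejects every unmatched name.
            errors.append(f"{name}: property name not allowed")
        # Otherwise the property is accepted without any check at all.
    return errors

def is_job(value):
    # Stand-in for the job entity schema (hypothetical): "pay" must be a number.
    return isinstance(value, dict) and isinstance(value.get("pay"), (int, float))

jobs = {
    # Name is too long AND the value is invalid (broken on purpose).
    "software-developer": {"job-description": "sw-dev.md:0", "pay": "not a number"},
    "sales-bro": {"job-description": "sales-bro", "pay": 190000},
}

lenient = validate_object(jobs, {JOB_ID: is_job})  # additionalProperties omitted
strict = validate_object(jobs, {JOB_ID: is_job}, additional_allowed=False)
print(lenient)  # []
print(strict)   # ['software-developer: property name not allowed']
```

In the lenient run neither the overlong name nor the invalid "pay" value is reported, which is exactly the flaw described above.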