Validate All Object Properties with JSON Schema

+5
−0

I'm writing a JSON schema to validate asset files for a program. The JSON I need to parse is structured like this:

{
  "jobs": {
    "software-developer": { "job-description": "sw-dev.md:0", "pay": 80000 },
    "sales-bro": { "job-description": "sales-bro", "pay": 190000 }
  },
  "people": {
    "alice": { "name": "Alice", "profile-pic": "profile-alice.jpg:0" },
    "bob": { "name": "Bob", "profile-pic": "bob.png:0" }
  }
}

As you can see, each sub-object of a primary group follows a fixed pattern, but the keys within each group vary with the id of the asset being described. In this format, names like alice, bob, and sales-bro act as identifiers rather than as fixed object properties.

Is there a way in JSON schema to validate such a scheme? Can I create a rule that validates the values corresponding to every key within an object?

{
  "$schema": "https://json-schema.org/draft-04/schema",
  "$id": "https://example.com/my-assets.json",
  "title": "Assets",
  "type": "object",
  "properties": {
    "jobs": {
      "type": "object",
      "each-property": { // <--- dubious
        "type": "object",
        "properties": {
          "job-description": ...,
          "pay": ...
        }
      }
    },
    "people": {
      // etc...
    }
  }
}

5 comments

Just a rough idea from the top of my head (untested, unverified, might be bogus). You could perhaps use the "patternProperties" keyword, with which you would specify a regex pattern against which property names are being matched. Since in your case the property names in "jobs" and "people" don't matter at all wrt validation, a simple regex like .+ might perhaps do the trick (see also https://json-schema.org/draft/2020-12/json-schema-core.html#rfc.section.10.3.2.2) elgonzo‭ 8 days ago

If your identifiers have to satisfy some specification wrt format and/or allowed characters, you might construct the regex pattern(s) in a way that they match the specification, and additionally employ the "additionalProperties" keyword with the "false" schema to fail validation if any properties do not match the defined regex pattern(s). (I don't like to write this as an answer, as I have not verified anything of what I just said...) elgonzo 8 days ago

@elgonzo Would you go ahead and post this as an answer while I verify it? This looks like exactly what I was asking for. jnhyatt‭ 8 days ago

I would prefer to have this verified first. Otherwise it might turn out that my idea, while sounding good, is actually not workable. Wouldn't be the first time... ;) elgonzo 8 days ago

@elgonzo I've verified your idea. I've defined a regex that I want to match all identifiers that could be used to identify assets within the json, and from there I can specify a schema that I want asset descriptions to match. If you post this as an answer, I'll accept it. Thanks! jnhyatt‭ 8 days ago

1 answer

+2
−0

If the names of the properties in the jobs and people objects are variable, you can use the patternProperties keyword instead of the properties keyword in the respective schemas for jobs and people.

As per specification for patternProperties (https://json-schema.org/draft/2020-12/json-schema-core.html#rfc.section.10.3.2.2):

The value of "patternProperties" MUST be an object. Each property name of this object SHOULD be a valid regular expression, according to the ECMA-262 regular expression dialect. Each property value of this object MUST be a valid JSON Schema.

Validation succeeds if, for each instance name that matches any regular expressions that appear as a property name in this keyword's value, the child instance for that name successfully validates against each schema that corresponds to a matching regular expression.

This allows specifying regex patterns for instance property names instead of having to specify each and every exact instance property name.
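To make the quoted rule concrete, here is a minimal sketch of the matching logic in Python. It is only a toy model under stated assumptions: the function names are hypothetical, a "schema" is reduced to a set of required keys, and Python's re module stands in for an ECMA-262 regex engine (the two dialects differ in places). A real validator does far more.

```python
import re

def check_pattern_properties(instance, pattern_properties, validate):
    """Return (name, pattern) pairs whose value failed the matching sub-schema."""
    failures = []
    for name, value in instance.items():
        for pattern, subschema in pattern_properties.items():
            # Patterns are not implicitly anchored, hence re.search
            # rather than re.fullmatch.
            if re.search(pattern, name) and not validate(value, subschema):
                failures.append((name, pattern))
    return failures

def has_required_keys(value, required):
    """Toy stand-in for real schema validation: only checks required keys."""
    return isinstance(value, dict) and required.issubset(value.keys())

people = {
    "alice": {"name": "Alice", "profile-pic": "profile-alice.jpg:0"},
    "bob": {"name": "Bob"},  # hypothetical broken entry without "profile-pic"
}
failures = check_pattern_properties(
    people, {".*": {"name", "profile-pic"}}, has_required_keys
)
print(failures)  # [('bob', '.*')]
```

Every property name is tested against every pattern, and a value has to pass the sub-schema of each pattern its name matches.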


So, how will this be used in an actual schema?


Format of the "identifier" property names does not matter

If the precise format of the "identifier" property names in jobs and people (like "software-developer" or "alice", for example) doesn't matter, the pattern matching the property names could be a simple catch-all pattern ".*" or a zero-length pattern like "" (the quotation marks are not part of the pattern).

{
  "$schema": "https://json-schema.org/draft-04/schema",
  "$id": "https://example.com/my-assets.json",
  "title": "Assets",
  "type": "object",
  "properties": {
    "jobs": {
      "type": "object",
      "patternProperties": {
        ".*": { ... schema for a job entity object ... }
      }
  ...
}

(For entirely subjective readability reasons, I personally prefer ".*" over "", but feel free to use whatever you like most.)


"Identifier" property names have to adhere to a format

If the "identifier" property names in jobs and/or people have to adhere to a well-defined format to be considered valid, the regex patterns could be designed in a way to match only valid identifier property names.

To make the schema validation fail if an "identifier" property name is not matching any of the defined property name patterns, the additionalProperties keyword (https://json-schema.org/draft/2020-12/json-schema-core.html#rfc.section.10.3.2.3) can be used to assign the false schema to any instance properties whose names did not match any of the regex patterns in patternProperties. The false schema will always fail validation.

As an illustrative example, let's assume the identifier property names in jobs must adhere to the following format:

  • consist only of upper/lower-case Latin letters and hyphens, in any combination
  • start with an upper/lower-case letter
  • have a max length of 10 characters
    (I intentionally chose a length shorter than the "software-developer" identifier to demonstrate how the schema correctly fails in case a property name violates the specified format. See also the paragraph underneath the schema example below.)

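These rules can, for example, be expressed by the regex pattern (?i)^[a-z][a-z-]{0,9}$. (Note that the inline (?i) flag is not part of the ECMA-262 dialect, so validators built on an ECMAScript regex engine will reject it; ^[A-Za-z][A-Za-z-]{0,9}$ is an equivalent, portable spelling.) The following schema utilizes this regex pattern and thus makes sure that validation only succeeds if all identifier property names in jobs adhere to the format I just made up above: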

{
  "$schema": "https://json-schema.org/draft-04/schema",
  "$id": "https://example.com/my-assets.json",
  "title": "Assets",
  "type": "object",
  "properties": {
    "jobs": {
      "type": "object",
      "patternProperties": {
        "(?i)^[a-z][a-z-]{0,9}$": { ... schema for a job entity object ... }
      },
      "additionalProperties": false,

  ...
}
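As a quick sanity check of the made-up format itself, independent of any schema validator, the identifier rules can be tried against the job names from the question. This is a sketch only: it uses Python's re module as a stand-in regex engine, and it spells the case-insensitivity with the explicit character class [A-Za-z] instead of an inline flag, which describes the same set of names.

```python
import re

# Letters and hyphens only, must start with a letter, at most 10 characters.
IDENTIFIER = re.compile(r"^[A-Za-z][A-Za-z-]{0,9}$")

job_ids = ["software-developer", "sales-bro"]  # names taken from the question
rejected = [name for name in job_ids if not IDENTIFIER.search(name)]
print(rejected)  # ['software-developer']
```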

Validating the JSON data from the question with this schema will fail. That makes sense, since my made-up rules specify a max length of 10 characters for jobs identifier property names, and "software-developer" is obviously longer than that. Using the additionalProperties keyword with the false schema is crucial here.

If the additionalProperties keyword were omitted from my example schema, or not assigned the false schema, validation would incorrectly succeed even though some "identifier" property names violate the specified format and match none of the pattern(s) in patternProperties. Depending on the overall specification of the schema and the purpose of the validation, this could be a serious flaw: when a property name matches no pattern, the schema assigned to that pattern is not applied to the property's value either. Validation would therefore succeed without detecting the malformed "identifier" property name, and it would also leave the value of that property completely unvalidated.
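The difference can be sketched with a toy model of the two keywords in Python. Everything here is an assumption made for illustration: validate_object and is_job are hypothetical helpers, sub-schemas are modelled as plain predicates, the "pay" value of "lots" is an invented bad value, and Python's re module stands in for the validator's regex engine.

```python
import re

def validate_object(instance, pattern_properties, additional_properties=True):
    """Toy model of patternProperties + additionalProperties for one object.

    Returns a list of error strings; an empty list means "valid".
    """
    errors = []
    for name, value in instance.items():
        matching = [p for p in pattern_properties if re.search(p, name)]
        if not matching:
            # Only additionalProperties can catch a name that matches no pattern.
            if additional_properties is False:
                errors.append(f"{name}: name matches no pattern")
            continue  # without it, the value is never looked at
        for pattern in matching:
            if not pattern_properties[pattern](value):
                errors.append(f"{name}: value fails schema for {pattern}")
    return errors

def is_job(value):
    """Hypothetical stand-in for the job entity schema: 'pay' must be an int."""
    return isinstance(value, dict) and isinstance(value.get("pay"), int)

jobs = {
    # Name is too long for the pattern AND the value is malformed.
    "software-developer": {"job-description": "sw-dev.md:0", "pay": "lots"},
    "sales-bro": {"job-description": "sales-bro", "pay": 190000},
}
patterns = {r"^[A-Za-z][A-Za-z-]{0,9}$": is_job}

lenient = validate_object(jobs, patterns)
strict = validate_object(jobs, patterns, additional_properties=False)
print(lenient)  # []
print(strict)   # ['software-developer: name matches no pattern']
```

In the lenient run neither the overlong name nor the non-numeric "pay" is reported, because the entry is skipped entirely.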
