
Does using an integer have any speed/performance benefits over a string in JSON?

+9
−0

I'm working on an API that responds with data about a bunch of orders and items.

The order and item numbers are always integers (they are the order.id and item.id values, respectively).

Originally the response included each order number and each item number as a string, something like:

{
  "orders": [
    {
      "id": "12345",
      "item": [
        "123",
        "124",
        "125",
        "126"
      ]
    },
    {
      "id": "95812",
      "item": [
        "173",
        "198"
      ]
    }
  ]
}

I instructed the team to send the values as integers rather than strings, so that same response now looks like:

{
  "orders": [
    {
      "id": 12345,
      "item": [
        123,
        124,
        125,
        126
      ]
    },
    {
      "id": 95812,
      "item": [
        173,
        198
      ]
    }
  ]
}

My question is whether there is really any purpose to what I've done. We're never going to need to perform any mathematical equations on the number; essentially, the values function as strings (as far as I'm aware).

Perhaps it's counterproductive, because now an order number cannot contain anything other than digits, but that's how it works anyway, since the database uses an INTEGER type.

There's always the smaller, and therefore faster, response, since there are no " characters around the values.
In the above example (minified) it's almost 17% smaller:

  • string = 96 bytes
  • int = 80 bytes
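
The byte counts above can be reproduced with a short sketch. It assumes Python's standard `json` module and treats compact separators as the "minified" form; the names `as_strings`, `as_ints` and `minified_size` are only illustrative.

```python
import json

# The two payloads from the question: IDs as strings vs. IDs as integers.
as_strings = {
    "orders": [
        {"id": "12345", "item": ["123", "124", "125", "126"]},
        {"id": "95812", "item": ["173", "198"]},
    ]
}
as_ints = {
    "orders": [
        {"id": int(o["id"]), "item": [int(i) for i in o["item"]]}
        for o in as_strings["orders"]
    ]
}

def minified_size(payload):
    # Compact separators drop the optional whitespace (assumed "minified" form).
    return len(json.dumps(payload, separators=(",", ":")).encode("utf-8"))

string_bytes = minified_size(as_strings)
int_bytes = minified_size(as_ints)
saving = (string_bytes - int_bytes) / string_bytes
print(string_bytes, int_bytes, f"{saving:.1%}")  # 96 80 16.7%
```

The saving is exactly two bytes per value (the pair of quotes), so it shrinks in relative terms as the IDs get longer.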

0 comments

3 answers

+6
−0

Even before doing any performance testing, you have already noticed that the payload is noticeably smaller without those double quotes.

However, what I think is more important, especially when dealing with rather large applications or systems you do not own, is using the appropriate data types and minimizing the chance that something wrong ends up in that property.

Moreover, JSON is human-readable, and it is helpful for anyone reading the payload to see what type a property has (although a JSON schema can also be used for that).
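
As a rough illustration of the point about types, here is a minimal sketch using Python's standard `json` module. The helper `has_integer_ids` is hypothetical and is not part of any library or of the API in the question.

```python
import json

# Same order, once with JSON numbers and once with JSON strings.
with_numbers = json.loads('{"id": 12345, "item": [123, 124]}')
with_strings = json.loads('{"id": "12345", "item": ["123", "124"]}')

def has_integer_ids(order):
    # Hypothetical check: "id" and every entry of "item" must be a real int.
    # bool is excluded explicitly because it is a subclass of int in Python.
    values = [order["id"], *order["item"]]
    return all(isinstance(v, int) and not isinstance(v, bool) for v in values)

print(has_integer_ids(with_numbers))  # True
print(has_integer_ids(with_strings))  # False
```

With JSON numbers the consumer gets an integer straight from the parser, so a value such as "12a45" cannot slip into the property unnoticed.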

My opinion is that if performance is not critical (i.e. you do not have to squeeze out the last 5%), having more readable and robust data structures is more important, as any error here might waste time worth more than what was gained through increased performance.


3 comments

Are you saying that it's more readable with the quotes? WELZ‭ 5 months ago

@WELZ No, I think it is more readable if the numbers are written without double quotes and only strings and other types serializable to strings are using them. By "more readable" I understand here "it conveys the message better" (i.e. item is an array of integers). Btw - what JSON serializer are you using? I thought that most serializers will encode numbers and booleans without quotes by default. Alexei‭ 5 months ago

@Alexei, I agree it's more readable like this. I'm not sure what serializer is being used. This was an api being built with simple-php-router... WELZ‭ 5 months ago

+4
−0

JSON is a standard. Performance is implementation specific and dependent on what is being done with the data. The answer really depends on what your environment is.

Some languages may have slight performance benefits from using integers when allocating memory during JSON serialisation/deserialisation. String values might be allocated on the heap, whereas true int types are primitive data types and may be allocated on the stack. Allocating memory on the heap is slower than allocating it on the stack. Of course, if the JSON environment de/serialises everything on the heap, then there will be practically no difference in allocation time.

Ultimately such savings will be minute and should probably be considered premature optimisation. Deciding whether to use strings or ints will likely have far more profound effects on development time and code robustness than performance.


1 comment

It seems weird to focus on allocation times when discussing network I/O. If we can wait millions of processor cycles for a network response, we can probably afford a couple hundred cycles to allocate on the heap. meriton‭ 5 months ago

+0
−0

Let's start with this pearl at https://www.json.org/json-en.html:

A number is very much like a C or Java number, except that the octal and hexadecimal formats are not used.

That's an extremely imprecise sentence. There are no "numbers" as such in C or Java. C provides signed and unsigned short, int, long and long long numeric types, plus thin character types which allow some arithmetic. Java provides signed byte, short, int and long numeric types, their wrapper-class counterparts, and a Number superclass.

With that clarified, you should stick to strings for the sake of security, since that decreases the probability of overflowing the parser. Instead of having developers rely on a "happy times" parser that may (wrongly, or at least eventually wrongly) assume the number will fit a default numeric type, have them retrieve the number encapsulated in a string, so there is no doubt that the parser isn't overflowing, and then let them explicitly choose between:

  • Making the wrong assumption that the number will fit the numeric type.

  • Asserting that the number fits, and failing the assertion if it does not.

  • Feeding the number to a big-integer parser, provided by some third-party library (or your own code) in the case of C, or by the language in the case of Java.
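
The second and third options can be sketched as follows. This is only an illustration in Python rather than C or Java: Python's built-in `int` is already arbitrary-precision, so it plays the role of the big-integer parser, and the 64-bit bounds and the name `parse_id_checked` are assumptions made for the example.

```python
# Assumed target range: a signed 64-bit integer, as a C long long or Java long would hold.
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1

def parse_id_checked(text):
    # int() raises ValueError for non-numeric input and never overflows,
    # because Python integers are arbitrary-precision.
    value = int(text)
    # Option 2: assert that the number fits, and fail loudly if it does not.
    if not INT64_MIN <= value <= INT64_MAX:
        raise OverflowError(f"{text!r} does not fit in a signed 64-bit integer")
    return value

print(parse_id_checked("12345"))        # 12345
print(int("9223372036854775808"))       # option 3: big integer, no range limit
try:
    parse_id_checked("9223372036854775808")
except OverflowError as exc:
    print("rejected:", exc)
```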


2 comments

This answer could be improved by describing just how "overflowing the parser" could give rise to a security vulnerability, and why anybody would parse strings into numbers when (in OP's words) "we're never going to need to perform any mathematical equations on the number". meriton‭ 5 months ago

I tend to disagree. If you can verify that your developers implement and handle string-to-number conversions correctly as intended, you should also be able to verify if a json parser of your choice does number parsing correctly. If you are unable to test whether a given json parser does number conversion correctly, what would make you confident that you are able to test whether the code written by your developers parses numbers correctly? It's then "happy times" either way... elgonzo‭ 5 months ago
