Software Development

not removing the whole text since someone had mentioned it in answer. And I forgot the name that's why I used bitmap earlier. ↩︎

−0

The following users marked this post as Works for me:

User	Comment	Date
Anonymous‭	(no comment)	Oct 8, 2021 at 04:33

The question seems to show a certain level of confusion about terminology. I shall address what I believe to be the underlying question: How can I store binary data in SQLite?

SQLite supports 5 datatypes: NULL, INTEGER, REAL, TEXT, and BLOB. The natural datatype to use for binary data is BLOB. However, I assume that this is what you mean by bitmap when you say "I don't want to use bitmap".

That leaves text encodings. Note that

The value is a text string, stored using the database encoding (UTF-8, UTF-16BE or UTF-16LE)

so the most efficient text encoding to use depends on the database configuration.

If the database is using UTF-8 then the optimal encoding would be base-128 using codepoints \u0000 to \u007f. If for some reason you want to avoid certain codepoints, binary base encoding is fastest when it can work with blocks, i.e. with bases whose base-2 logarithm is just more than a fraction whose numerator is a small multiple of 8. E.g. if lg b = (8n + epsilon)/d you can encode a block of n bytes as a block of d bytes for efficiency n/d. That gives the following good bases:

Base b     lg b         Efficiency
   41    16.03 / 3         0.67
   49    56.15 / 10        0.70
   53    40.10 / 7         0.71
   57    64.16 / 11        0.72
   64    24.00 / 4         0.75
   75    56.06 / 9         0.78
   85    32.05 / 5         0.80
  102    40.03 / 6         0.83
  116    48.01 / 7         0.86
  128    56.00 / 8         0.88

So basically you want to pick the highest base which gives you enough codepoints in the range \u0000 to \u007f which you deem acceptable.

If the database is using UTF-16 then the basic principles are the same but you have the complication of working around the surrogate codepoints. I would avoid this if at all possible.

posted over 3 years ago

CC BY-SA 4.0

Peter Taylor‭

1302 reputation 7 34 141 7

Copy Link

Raw

Markdown

History

1 comment thread

I wonder about the relevance of `lg b` if encoding efficiency is expressed simply by `n/d`... (8 comments)

Communities

Comments on What's the better way to store base64 in SQLite Database?

What's the better way to store base64 in SQLite Database?

0 comment threads

1 comment thread