How can I build a string from smaller pieces?

+4
−1

Suppose I have some variables like:

>>> count = 8
>>> status = 'off'

I want to combine them with some hard-coded text, to get a single string like

'I have 8 cans of Spam®; baked beans are off'.

Simply writing the values in sequence only works for literal strings:

>>> 'one' 'two'
'onetwo'
>>> 'I have' count 'cans of Spam®; baked beans are' status
  File "<stdin>", line 1
    'I have' count 'cans of Spam®; baked beans are' status
             ^^^^^
SyntaxError: invalid syntax

(Python 3.9 and below will only highlight the "c" of count; this improvement was added in 3.10)

Using commas to separate the values gives a tuple instead of a single string:

>>> 'I have', count, 'cans of Spam®; baked beans are', status
('I have', 8, 'cans of Spam®; baked beans are', 'off')

"Adding" the strings doesn't work either (I know this is not like mathematics, but it works in some other languages!):

>>> 'I have' + count + 'cans of Spam®; baked beans are' + status
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can only concatenate str (not "int") to str

How am I meant to do it?

+9
−0

Before attempting this, make sure it makes sense in context.

In a few particular situations, it would be better to take a different approach rather than using the normal tools for composing or formatting a string.

  • If the string is for an SQL query, use the SQL library's built-in functionality for parameterized queries. Trying to build the string by hand risks a critical security failure - one that has cost, and continues to cost, real businesses huge amounts of money.
  • If the string is a URL query string - for example, to use an API endpoint - there are well-established tools for this that automatically take care of issues such as escaping & in query parameters.
  • Similarly, the standard library has a full set of tools for building and manipulating file paths.
  • If the string is only needed so that it can be displayed or written somewhere immediately, consider approaches for doing that directly instead.
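
As a rough sketch of the alternatives listed above, using only the standard library (sqlite3 stands in for whichever SQL library is actually in use, and the table name, item name and path components are made up for illustration):

```python
import sqlite3
from pathlib import PurePosixPath
from urllib.parse import urlencode

count, status = 8, 'off'

# SQL: pass the values separately and let the library substitute them.
# sqlite3 uses ? as its placeholder; other libraries may use a different style.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE pantry (item TEXT, count INTEGER)')  # hypothetical table
conn.execute('INSERT INTO pantry VALUES (?, ?)', ('Spam & eggs', count))
rows = conn.execute(
    'SELECT count FROM pantry WHERE item = ?', ('Spam & eggs',)
).fetchall()

# URL query string: urlencode escapes the & and the spaces in the value.
query = urlencode({'item': 'Spam & eggs', 'status': status})

# File paths: the / operator joins path components.
path = PurePosixPath('menu') / 'specials' / 'spam.txt'

# Immediate display: print accepts several values and converts each one.
print('I have', count, 'cans of Spam®; baked beans are', status)
```

Here query ends up as 'item=Spam+%26+eggs&status=off', and str(path) is 'menu/specials/spam.txt'.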

That said, there are many tools available for this task. Each section below opens with a minimal example of the task attempted in the question, followed by further details; there's a lot to say about each.

f-strings (since Python 3.6)

This approach is recommended for cases where it's applicable.

f'I have {count} cans of Spam®; baked beans are {status}'

"f-string" is an informal (but official!) name for the formatted string literals introduced in Python 3.6 by PEP 498. To use them, write an entire "template" string with "placeholders" (my terminology) that are surrounded by {} and contain the appropriate variable names for whatever will be inserted.

f-strings have the same quoting and escaping rules as normal strings, except for the placeholders. To use literal { and } symbols in the string, double them up:

>>> f'}}{{}}{{'
'}{}{'
>>> f'{{{count}}}' # escaped braces and placeholders mix freely.
'{8}'

Slightly more advanced placeholder usage

Placeholders can contain more complex expressions, as well as literal values:

>>> f'{"example"}' # Interpolate a normal, double-quoted string
'example'
>>> f'{1+2}'
'3'

However, in 3.6 through 3.11, backslashes can't be used anywhere inside the placeholder:

>>> f'{"\n"}'
  File "<stdin>", line 1
    f'{"\n"}'
             ^
SyntaxError: f-string expression part cannot include a backslash

and can't use the same quotes for a nested expression as for the string:

>>> f'{''}'
  File "<stdin>", line 1
    f'{''}'
           ^
SyntaxError: f-string: expecting '}'

These restrictions are lifted in 3.12.
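
For code that still has to run on 3.6 through 3.11, one possible workaround (just a sketch; the variable names are arbitrary) is to bind anything containing a backslash to a name first, and to use the other kind of quote inside the placeholder:

```python
# A backslash is fine in an ordinary string that is merely *used* by the
# placeholder, so give it a name outside the f-string.
newline = '\n'
menu = ['spam', 'eggs']
listing = f'Menu:{newline}{newline.join(menu)}'

# Inside a single-quoted f-string, use double quotes for nested strings.
nested = f'{"spam".upper()}'
```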

Custom formatting

Within each placeholder, the value that will be formatted can be followed by a conversion and a format specifier.

Conversions

Conversions are mostly not very useful. They're used to convert the value explicitly to string first. This isn't needed for making it work (as shown above), but sometimes it's useful to bypass the type's own formatting rules. There are also different ways to convert Python values to strings. For example:

>>> f'{status!s}' # Converting the string using `str`
'off'
>>> f'{status!r}' # Converting the string using `repr`
"'off'"

Obviously, converting a string to a string, using str, has no real effect. However, !r converts the string to a representation of the string - a way that it could be expressed in Python source code. (There is also !a, for converting to an ASCII representation - this is a legacy purpose that should rarely if ever be necessary.)
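
To illustrate what "bypassing the type's own formatting rules" can look like, here is a small sketch using datetime.date, which interprets format specifiers as strftime patterns (the date itself is arbitrary):

```python
from datetime import date

day = date(2024, 1, 31)

# Without a conversion, the format specifier is handled by date itself.
own_rules = f'{day:%d/%m/%Y}'

# With !s, the date is converted with str first, so the specifier is
# handled by str instead (here: right-align in a 12-character field).
as_text = f'{day!s:>12}'

# !r is handy for debugging output, since it shows the type as well.
for_debugging = f'{day!r}'
```

This gives '31/01/2024', '  2024-01-31' and 'datetime.date(2024, 1, 31)' respectively.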

Format specifiers

Format specifiers are mostly used for aligning and padding numeric values. How they work depends on the type of whatever is being formatted. There are some general rules (that are implemented by the built-in int and float), but there's too much to go over here; please see the documentation for a complete reference. Here are a few examples, though:

>>> f'{count:#b}' # binary, with a prefix (because of the #)
'0b1000'
>>> f'{count:<5}' # left-aligned within a "field"
'8    '
>>> f'{count:05}' # right-aligned (the default) and zero-padded
'00008'
>>> f'{count:e}' # scientific notation
'8.000000e+00'
>>> f'{count:{"5"}}' # numbers here can also use placeholders
'    8'
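
Since the rules depend on the type, here are a few more illustrative cases with a float and a string (the values are made up; the documentation remains the complete reference):

```python
price = 1234.5
ratio = 0.25
status = 'off'

two_places = f'{price:.2f}'   # fixed-point, two digits after the point
grouped = f'{price:,.2f}'     # the same, with a thousands separator
percent = f'{ratio:.0%}'      # multiplied by 100, with a % sign appended
centered = f'{status:^7}'     # strings can be aligned within a field too
filled = f'{status:*>6}'      # right-aligned, padded with * instead of spaces
```

The results are '1234.50', '1,234.50', '25%', '  off  ' and '***off'.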

Implementation details: the format builtin and __format__ magic method

The builtin function format (a plain function, not the method of string objects) implements the logic for formatting a single value that will replace a placeholder. With one argument, it's basically equivalent to calling str to convert the object to string:

>>> format([1, 2, 5, 'Three, sir.'])
"[1, 2, 5, 'Three, sir.']"

The second argument can be a format specifier that works just like for f-strings:

>>> format(8, '#010b')
'0b00001000'

To emulate conversions, just do that conversion to the first argument instead:

>>> f'{"®"!a}' # for reference
"'\\xae'"
>>> format(ascii('®'))
"'\\xae'"

This format function is a wrapper for the __format__ magic method, which implements a type's logic for handling format specifiers. A class can define this method to implement its own formatting rules, and is free to interpret the format specifier written in the template however it likes.

For example:

class Fancy(str):
    def __format__(self, spec):
        # The `str` call here avoids unbounded recursion.
        return f'{spec}{str(self.upper())}!{spec[::-1]}'

Which enables:

>>> meal = Fancy('spam')
>>> f'{meal:-+*}'
'-+*SPAM!*+-'

The format method (since Python 3.0, backported to 2.6)

This approach is recommended for cases where f-strings won't work. It offers compatibility with much older versions of Python, and allows storing a template for later reuse. However, it's slightly more awkward and limited.

'I have {} cans of Spam®; baked beans are {}'.format(count, status)

This uses the same basic template syntax as f-strings, but with the format method of an ordinary string instead of a special kind of literal. This method supports all the same custom formatting and escaping as the f-string approach:

>>> 'In binary, I have {:#010b} cans of Spam®.'.format(count)
'In binary, I have 0b00001000 cans of Spam®.'
>>> '{{{}}}'.format('wavy')
'{wavy}'

Advanced usage, and comparison to f-strings

The advantage over f-strings is that the "template" string can be stored and have its values filled in explicitly later, and that the placeholders can usefully be empty:

>>> template = 'I have {} cans of Spam®; baked beans are {}'
>>> template.format(count, status)
'I have 8 cans of Spam®; baked beans are off'

However, the syntax is much more limited. The values that will be formatted need to be passed in as arguments - any calculation happens at or before that point, not as part of the template.
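
A short sketch of both points, with made-up stock data: the stored template is reused for several sets of values, and any calculation is done on the arguments rather than inside the placeholders.

```python
template = 'I have {} cans of Spam®; baked beans are {}'

# Reuse: one template, several sets of values.
stock = [(8, 'off'), (3, 'on')]
reports = [template.format(count, status) for count, status in stock]

# Calculation happens in the call, before the template sees the values.
shouted = template.format(8 * 2, 'off'.upper())
```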

The placeholders can use names supplied using keyword arguments:

>>> '{x} + {x}'.format(x=1)
'1 + 1'

Local variable names will not be considered:

>>> x = 1
>>> '{x}'.format(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'x'
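
So the names have to be supplied explicitly. A minimal sketch of two ways to do that (the order dictionary is hypothetical):

```python
x = 1

# Pass the local variable as a keyword argument with the same name.
explicit = '{x} + {x}'.format(x=x)

# Or unpack a dictionary whose keys match the placeholder names.
order = {'count': 8, 'status': 'off'}
from_mapping = 'I have {count} cans of Spam®; baked beans are {status}'.format(**order)
```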

Numbered placeholders allow reusing or re-ordering the formatted values:

>>> # for example, for localization
>>> 'In English, {0} {1} {2}.'.format('a', 'dead', 'parrot')
'In English, a dead parrot.'
>>> 'En Français, {0} {2} {1}.'.format('un', 'mort', 'perroquet')
'En Français, un perroquet mort.'
>>> # or just repetition
>>> '{0}, {0}, {0}, {0}, {0}, {1} {0}, {0}'.format('spam', 'lovely')
'spam, spam, spam, spam, spam, lovely spam, spam'

The template can safely ignore extra values, regardless of numbering:

>>> '{}'.format('spam', 'eggs')
'spam'
>>> '{1}'.format('spam', 'eggs')
'eggs'
>>> '{veggie}'.format(meat='spam', veggie='baked beans')
'baked beans'

Placeholders have special, limited syntax for element/item/attribute access:

>>> # Doesn't work with negative numbers or slices.
>>> '{[0]}'.format([1, 2, 3])
'1'
>>> # Only works with keys that are strings.
>>> '{[key]}'.format({'key': 'value'})
'value'
>>> '{.imag}'.format(1+2j)
'2.0'

But they cannot use arbitrary expressions the way that f-strings do:

>>> '{x+y}'.format(x=1, y=2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'x+y'

And can give confusing error messages:

>>> # This negative index gets interpreted as a string key instead.
>>> '{[-1]}'.format([1,2,3])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: list indices must be integers or slices, not str
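
When the placeholder syntax isn't enough, a simple way around it (sketched here with arbitrary values) is to do the indexing or arithmetic in ordinary code and pass the result in:

```python
items = [1, 2, 3]

last = '{}'.format(items[-1])           # negative index, done outside
first_two = '{}'.format(items[:2])      # slice, done outside
total = '{total}'.format(total=1 + 2)   # expression, done outside
```

These produce '3', '[1, 2]' and '3'.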

The format_map method

The format_map method is a minor variation on the format method that is rarely seen. It takes a single parameter which is some kind of mapping; it works like format, except looking up keys in the mapping instead of keyword arguments.

Calling .format_map(mapping) is similar to calling .format(**mapping), but it allows the mapping to compute values on-demand when the keys are looked up, and doesn't require the mapping to know what its keys are. It's possible to define custom mappings, like so:

>>> class ExampleMapping(dict):
...     def __missing__(self, key):
...         return f'value for {key}'
... 
>>> '{ham} and {eggs}'.format(**ExampleMapping()) # doesn't work
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'ham'
>>> '{ham} and {eggs}'.format_map(ExampleMapping()) # this is needed
'value for ham and value for eggs'

Because it only uses keys, the template cannot contain any positional values:

>>> # This does not try an empty string key!
>>> '{}'.format_map(ExampleMapping()) 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Format string contains positional fields
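
One use I can imagine for this is filling in a template in stages. The following is only a sketch (KeepMissing is a made-up helper, and it discards any conversion or format specifier on the placeholders it leaves behind):

```python
class KeepMissing(dict):
    """Hypothetical mapping that puts unknown placeholders back as-is."""
    def __missing__(self, key):
        return '{' + key + '}'

template = '{count} cans; baked beans are {status}'

# First pass: only count is known, so {status} survives untouched.
partial = template.format_map(KeepMissing(count=8))

# Second pass: an ordinary format call fills in the rest.
final = partial.format(status='off')
```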

The Formatter class

This is quite advanced, and useless for almost everyone. The documentation explains what everything does, but doesn't really make the practical usage clear.

The standard library string module provides a class called Formatter which is a pure-Python implementation of the format method's logic. A call like string.Formatter().format(my_string, ...) is equivalent to my_string.format(...), with whatever positional and keyword arguments replacing the .... The purpose of this class is to provide hooks for changing how the formatting process works.

A simple (sort of...) example, where the "template" is now a plain list of comma-separated items:

import string

class CommaFormatter(string.Formatter):
    def parse(self, fstr):
        # This will also be called to look for nested placeholders within
        # format specs. Ignoring these fully requires special handling.
        if fstr is None:
            return
        field_names = [f.strip() for f in fstr.split(',')]
        leading = ''
        for fname in field_names:
            yield (leading, fname, None, 's')
            leading = ', ' # for each item except the first

Which allows:

>>> cf = CommaFormatter().format
>>> cf('meat, veggies', meat='spam', veggies='baked beans')
'spam, baked beans'
>>> cf('0,0,0,0,0,0,1,0,0,0,0,0', 'spam', 'baked beans')
'spam, spam, spam, spam, spam, spam, baked beans, spam, spam, spam, spam, spam'

And, perhaps counter-intuitively:

>>> cf('', 'spam', 'baked beans')
'spam'
>>> cf(',', 'spam', 'baked beans')
'spam, baked beans'
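
A perhaps more practical sketch overrides the get_value hook instead, so that missing keyword arguments get a fallback rather than raising KeyError. DefaultFormatter and the '???' fallback are names chosen for illustration, not anything the standard library provides:

```python
import string

class DefaultFormatter(string.Formatter):
    def get_value(self, key, args, kwargs):
        # Named placeholders arrive here as str keys; numbered or empty
        # ones arrive as int indexes and are left to the default logic.
        if isinstance(key, str):
            return kwargs.get(key, '???')
        return super().get_value(key, args, kwargs)

df = DefaultFormatter().format
named = df('{meat} and {side}', meat='spam')
mixed = df('{} and {side}', 'spam')
```

Both calls give 'spam and ???'.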

The Template class (since Python 2.4)

This approach is unpopular, and cumbersome for simple cases. It seems to be rarely used or mentioned, especially since the format method of strings also allows for simple, reusable templates. However, it's still maintained and even got new functionality added in 3.11.

from string import Template
t = Template('I have $count cans of Spam®; baked beans are $status')
t.substitute({'count': 8, 'status': 'off'})

The standard library string module provides a class called Template that provides reusable string formatting from keywords (similar to a string with braces that will have its format_map method called later). It uses a less powerful, but simpler syntax that appears inspired by Perl's string interpolation.

It's also possible to use keyword arguments to substitute into the template:

>>> t.substitute(count=8, status='off')
'I have 8 cans of Spam®; baked beans are off'

Advanced usage

The substitute method will raise an exception if a keyword is missing:

>>> t.substitute()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.11/string.py", line 121, in substitute
    return self.pattern.sub(convert, self.template)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/string.py", line 114, in convert
    return str(mapping[named])
               ~~~~~~~^^^^^^^
KeyError: 'count'

safe_substitute leaves those placeholders alone (this can hide problems):

>>> t.safe_substitute(status='off')
'I have $count cans of Spam®; baked beans are off'

Use {} to clarify the placeholder name if there need to be letters immediately after the substituted value:

>>> Template('${b}a${c}o$n').substitute(b='b',c='c',n='n')
'bacon'

To escape a literal $, double it up (I'm noticing a theme...):

>>> Template('ca$$h $cash').substitute(cash='money')
'ca$h money'

For really advanced use cases, the Template class can also be subclassed - refer to the documentation for details.
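
As a small taste of subclassing, the documented delimiter class attribute changes the character that introduces a placeholder. PercentTemplate is a made-up name for this sketch:

```python
from string import Template

class PercentTemplate(Template):
    # Placeholders now start with % instead of $.
    delimiter = '%'

t = PercentTemplate('I have %count cans of Spam®; baked beans are %status')
result = t.substitute(count=8, status='off')
```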

The % operator

This approach has many disadvantages and idiosyncrasies. It should not be used in new code without a good reason. However, it's the original way to solve the problem, and some older interfaces are designed around it. It also isn't formally deprecated.

'I have %d cans of Spam®; baked beans are %s' % (count, status)

Formatting strings using the % operator still works by substituting values into the placeholders of a template, but they look very different. This approach is meant to mimic functions like printf in C, although it supports many things that the original functions don't.

The letters used for the placeholders follow mostly the same rules as in C, which are also used for format specifiers for f-strings and the format method. However, it's usually not necessary to worry about this too much. The %s placeholder will convert the value directly to string with str, and everything supports that by default (although for some types, the result might not always be what you want).
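
The standard library's logging module is one example of an interface designed around this style: the message is a %-style template, and the values are passed as separate arguments for the logger to substitute. A minimal sketch (the logger name is arbitrary, and the StringIO handler is only there to capture the output):

```python
import io
import logging

count, status = 8, 'off'

stream = io.StringIO()
logger = logging.getLogger('spam_example')  # arbitrary name
logger.addHandler(logging.StreamHandler(stream))
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the example's output out of the root logger

# Note: no % operator here; the logger applies the arguments itself.
logger.info('I have %d cans of Spam®; baked beans are %s', count, status)
logged = stream.getvalue()
```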

To escape a literal % sign, double it up (just like with the braces before):

>>> '%%%s%%' % 'I like percentage signs'
'%I like percentage signs%'

Complications

Extra positional arguments are not ignored and cause an error:

>>> '%s' % ('one', 'two')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: not all arguments converted during string formatting

Multiple arguments have to be in a tuple - not a list:

>>> 'I have %d cans of Spam®; baked beans are %s' % [count, status]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: %d format: a real number is required, not list
>>> 'I have %s cans of Spam®; baked beans are %s' % [count, status]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: not enough arguments for format string

Parentheses are necessary around the tuple because of the order of operations:

>>> # This tries to do the formatting first, and THEN
>>> # make a tuple with the formatted string and the `status` value.
>>> 'I have %s cans of Spam®; baked beans are %s' % count, status
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: not enough arguments for format string

Single values don't need to be in a tuple:

>>> my_list = [1, 2, 3]
>>> 'This is a list: %s' % my_list
'This is a list: [1, 2, 3]'

Unless the single value is a tuple:

>>> my_tuple = (1, 2, 3)
>>> 'This is a tuple: %s' % my_tuple
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: not all arguments converted during string formatting

Putting the tuple inside another 1-tuple solves the problem:

>>> 'This is a tuple: %s' % (my_tuple,)
'This is a tuple: (1, 2, 3)'

Advanced usage

% can also use a mapping (placeholder names go in parentheses):

>>> '%(meat)s and %(side)s' % {'side': 'eggs', 'meat': 'ham'}
'ham and eggs'

This can't be mixed with positional placeholders:

>>> '%(key)s %s' % ({'key': 'value'}, 'text')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: format requires a mapping
>>> '%(key)s %s' % {'key': 'value', '': ''}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: not enough arguments for format string

Some advanced formatting options are also supported:

>>> '%08x' % count
'00000008'
>>> '%#010x' % count
'0x00000008'

However, certain format specifiers do not work (see the documentation on printf-style string formatting for the full list):

>>> f'{count:b}'
'1000'
>>> '%b' % count
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: unsupported format character 'b' (0x62) at index 1
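
If code is stuck with the % operator, one workaround (a sketch, not the only option) is to format the awkward value separately and substitute the result with %s:

```python
count = 8

# The format builtin understands 'b'; % then just inserts the text.
binary = '%s cans in binary' % format(count, 'b')

# bin gives the prefixed form directly.
prefixed = '%s' % bin(count)
```

These give '1000 cans in binary' and '0b1000'.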

String concatenation with explicit casting

This approach can be awkward to use, especially when there are multiple non-string pieces to assemble. However, it's simple and effective. Contrary to what some would expect, it doesn't ordinarily run into performance issues; although strings are immutable as far as Python code can tell, the implementation can and does "cheat".

'I have ' + str(count) + ' cans of Spam®; baked beans are ' + status

The + operator does work for putting strings together, but it requires a string for both operands (on both sides of the +). Notice that any desired spaces between parts of the text need to be accounted for carefully.

This is disallowed because it's ambiguous:

>>> '2' + 2 # "In the face of ambiguity, refuse the temptation to guess."
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can only concatenate str (not "int") to str

Explicitly converting either side of the + makes the meaning clear:

>>> int('2') + 2
4
>>> '2' + str(2)
'22'

Aside from using + to join the strings together, an already-existing sequence of strings can simply be joined together (which is usually considered its own topic):

>>> pieces = ('I have', str(count), 'cans of Spam®; baked beans are', status)
>>> ' '.join(pieces) # join with a space in between each pair
'I have 8 cans of Spam®; baked beans are off'
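
Like +, the join method only accepts strings, so non-string pieces need converting first. A small sketch that converts everything on the way in, and that also shows the common pattern of collecting pieces in a list before joining them once:

```python
pieces = ['I have', 8, 'cans of Spam®; baked beans are', 'off']

# Convert each piece with str as it is joined.
sentence = ' '.join(str(piece) for piece in pieces)

# Collect pieces in a loop, then join them in one go at the end.
parts = []
for n in range(3):
    parts.append(str(n))
digits = ''.join(parts)
```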

f-string '=' format
Michael wrote 10 months ago

Do you want to add an explainer for f'{status=}'? It's nice for logging or verbose output.

Karl Knechtel wrote 10 months ago

I think that's out of scope for this question. The purpose isn't to showcase string formatting, but rather string assembly. I did touch on format specifiers, but deep within the nested details tags, and even then saying to read the documentation to understand it properly.