How can I build a string from smaller pieces?

Suppose I have some variables like:

>>> count = 8
>>> status = 'off'

I want to combine them with some hard-coded text, to get a single string like

'I have 8 cans of Spam®; baked beans are off'.

Simply writing the values in sequence only works for literal strings:

>>> 'one' 'two'
'onetwo'
>>> 'I have' count 'cans of Spam®; baked beans are' status
  File "<stdin>", line 1
    'I have' count 'cans of Spam®; baked beans are' status
             ^^^^^
SyntaxError: invalid syntax

(Python 3.9 and below will only highlight the "c" of count; this improvement was added in 3.10)

Using commas to separate the values gives a tuple instead of a single string:

>>> 'I have', count, 'cans of Spam®; baked beans are', status
('I have', 8, 'cans of Spam®; baked beans are', 'off')

"Adding" the strings doesn't work either (I know this is not like mathematics, but it works in some other languages!):

>>> 'I have' + count + 'cans of Spam®; baked beans are' + status
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can only concatenate str (not "int") to str

How am I meant to do it?

posted almost 2 years ago

CC BY-SA 4.0

Karl Knechtel‭


2 answers

+ is string concatenation and can only be applied to strings. Non-string operands like numbers or class instances must be converted to strings using str(). That's all there really is to it, except Python has some syntactic sugar that hides this in certain situations.

When I have a list u of things (which may or may not be strings) that I want to combine into a string s, my go-to is:

s = "".join(map(str, u))

This is somewhat advanced Python syntax but it's not that complex. map applies str to every element of u, and join glues them all together with an empty string as a delimiter (so that we're not adding anything extra). This is easy to remember, easy to type, and works with everything. You can modify it in various ways as well, such as passing a list comprehension or generator expression instead of u, or a lambda instead of str.
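As a minimal sketch of this idiom (the sample data and the names parts, joined and spaced are hypothetical, chosen to match the question's sentence):

```python
# Hypothetical sample data: a mix of strings and non-strings.
parts = ['I have ', 8, ' cans of Spam®; baked beans are ', 'off']

# map(str, ...) converts each element; ''.join(...) glues the results together.
joined = ''.join(map(str, parts))

# A generator expression can replace map when each element needs extra work,
# and a different separator can go between the pieces.
spaced = ' '.join(str(p).strip() for p in parts)

print(joined)
print(spaced)
```

Both variables should end up holding the same sentence; the second form just moves the spacing out of the pieces and into the separator.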

Syntactic sugars include:

print takes any object, not just strings, and will automatically convert it to a string. It can also handle multiple arguments. Therefore print(8, " cans of spam") will work (although print(8 + " cans of spam") will not, because the concatenation must be evaluated before print is called).
String formatting also auto-converts. Therefore "{} cans of {}".format(8, "spam") will work. f"{8} cans of {meat_name}" is just syntactic sugar for this. The % operator is the same, but it's an ancient Python syntax and I consider it superseded by f-strings.
When you put strings next to each other like "hello" "world" the + is implicit. Because, well, what else could you possibly mean besides concatenation?

posted almost 2 years ago

CC BY-SA 4.0

matthewsnyder‭

Before attempting this, make sure it makes sense in context.

In a few particular situations, it would be better to take a different approach rather than using the normal tools for composing or formatting a string.

If the string is for an SQL query, use the SQL library's built-in functionality for parameterized queries. Trying to build the string by hand risks a critical security failure - one that historically has cost (and still does) real businesses huge amounts of money.
If the string is a URL query string - for example, to use an API endpoint - there are well established tools for this, that will take care of e.g. issues with escaping & in query parameters automatically.
Similarly, the standard library has a full set of tools for building and manipulating file paths.
If the string is only needed so that it can be displayed or written somewhere immediately, consider approaches for doing that directly instead.
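The first three situations above can be sketched with standard library tools. This is only an illustration under assumptions: the in-memory SQLite database, the pantry table and the example values are all hypothetical, and other database libraries use different placeholder markers.

```python
import sqlite3
from pathlib import PurePosixPath
from urllib.parse import urlencode

# SQL: let the database driver substitute the values (sqlite3 uses ? markers).
# The table and its columns are hypothetical.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE pantry (item TEXT, quantity INTEGER)')
conn.execute('INSERT INTO pantry VALUES (?, ?)', ('Spam', 8))
row = conn.execute(
    'SELECT quantity FROM pantry WHERE item = ?', ('Spam',)
).fetchone()
conn.close()

# URL query string: urlencode escapes characters such as & and spaces.
query = urlencode({'q': 'spam & eggs', 'page': 2})

# File paths: the / operator joins components with a separator.
# PurePosixPath is used here only so the result is the same on every OS.
menu_path = PurePosixPath('menus') / 'breakfast' / 'spam.txt'

print(row, query, menu_path)
```

In none of these cases is the final string assembled by hand, so the escaping rules stay the library's responsibility.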

That said, there are many tools available for this task. Each section below starts with a minimal example of the task attempted in the question, followed by further details, since there's a lot to say about each.

f-strings (since Python 3.6)

This approach is recommended for cases where it's applicable.

f'I have {count} cans of Spam®; baked beans are {status}'

"f-string" is an informal (but official!) name for the formatted string literals added in Python 3.6 and specified by PEP 498. To use them, write an entire "template" string with "placeholders" (my terminology) that are surrounded by {} and contain the appropriate variable names for whatever will be inserted.

f-strings have the same quoting and escaping rules as normal strings, except for the placeholders. To use literal { and } symbols in the string, double them up:

>>> f'}}{{}}{{'
'}{}{'
>>> f'{{{count}}}' # escaped braces and placeholders mix freely.
'{8}'

Slightly more advanced placeholder usage

Placeholders can contain more complex expressions, as well as literal values:

>>> f'{"example"}' # Interpolate a normal, double-quoted string
'example'
>>> f'{1+2}'
'3'

However, in 3.6 through 3.11, backslashes can't be used anywhere inside the placeholder:

>>> f'{"\n"}'
  File "<stdin>", line 1
    f'{"\n"}'
             ^
SyntaxError: f-string expression part cannot include a backslash

and can't use the same quotes for a nested expression as for the string:

>>> f'{''}'
  File "<stdin>", line 1
    f'{''}'
           ^
SyntaxError: f-string: expecting '}'

These restrictions are lifted in 3.12.
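On versions before 3.12, both restrictions can usually be worked around. A minimal sketch, with hypothetical names and data:

```python
items = ['spam', 'eggs', 'baked beans']

# Keep the backslash out of the placeholder by naming the value first.
newline = '\n'
menu = f'Menu:{newline}{newline.join(items)}'

# Use a different quote style for string literals nested in a placeholder.
label = f'{"spam".upper()} is {"off"}'

print(menu)
print(label)
```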

Custom formatting

Within each placeholder, the value that will be formatted can be followed by a conversion and a format specifier.

Conversions

Conversions are mostly not very useful. They're used to convert the value explicitly to string first. This isn't needed for making it work (as shown above), but sometimes it's useful to bypass the type's own formatting rules. There are also different ways to convert Python values to strings. For example:

>>> f'{status!s}' # Converting the string using `str`
'off'
>>> f'{status!r}' # Converting the string using `repr`
"'off'"

Obviously, converting a string to a string, using str, has no real effect. However, !r converts the string to a representation of the string - a way that it could be expressed in Python source code. (There is also !a, for converting to an ASCII representation - this is a legacy purpose that should rarely if ever be necessary.)
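One place where !r does seem to earn its keep is debugging output, where it helps to see whether a value is a string and whether it carries stray whitespace. A small sketch with made-up values:

```python
count_as_text = '8'
count_as_number = 8

# With the default conversion, the two values look identical.
plain = f'{count_as_text} vs {count_as_number}'

# With !r, the quotes reveal which one is a string, and trailing
# whitespace becomes visible.
debug = f'{count_as_text!r} vs {count_as_number!r}'
padded = f'{"off  "!r}'

print(plain)
print(debug)
print(padded)
```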

Format specifiers

Format specifiers are mostly used for aligning and padding numeric values. How they work depends on the type of whatever is being formatted. There are some general rules (that are implemented by the built-in int and float), but there's too much to go over here; please see the documentation for a complete reference. Here are a few examples, though:

>>> f'{count:#b}' # binary, with a prefix (because of the #)
'0b1000'
>>> f'{count:<5}' # left-aligned within a "field"
'8    '
>>> f'{count:05}' # right-aligned (the default) and zero-padded
'00008'
>>> f'{count:e}' # scientific notation
'8.000000e+00'
>>> f'{count:{"5"}}' # numbers here can also use placeholders
'    8'
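A few more specifiers that come up often, written as a script rather than an interpreter session; the variable names and values are made up for illustration:

```python
price = 3.14159
population = 1234567
ratio = 0.25
status = 'off'

two_places = f'{price:.2f}'   # fixed-point with two decimal places
grouped = f'{population:,}'   # comma as a thousands separator
percent = f'{ratio:.0%}'      # multiply by 100 and append a % sign
right = f'{status:>6}'        # strings default to left-aligned; > overrides that
centred = f'{status:^7}'      # centred within a 7-character field

print(two_places, grouped, percent, repr(right), repr(centred))
```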

Implementation details: the format builtin and __format__ magic method

The builtin function format (a plain function, not the method of string objects) implements the logic for formatting a single value that will replace a placeholder. With one argument, it's basically equivalent to calling str to convert the object to string:

>>> format([1, 2, 5, 'Three, sir.'])
"[1, 2, 5, 'Three, sir.']"

The second argument can be a format specifier that works just like for f-strings:

>>> format(8, '#010b')
'0b00001000'

To emulate conversions, just do that conversion to the first argument instead:

>>> f'{"®"!a}' # for reference
"'\\xae'"
>>> format(ascii('®'))
"'\\xae'"

This format function is a wrapper for the __format__ magic method, which implements a type's logic for handling format specifiers. A class can define this method to implement its own formatting rules, and it is free to interpret the format specifier written in the template however it likes.

For example:

class Fancy(str):
    def __format__(self, spec):
        # The `str` call here avoids unbounded recursion.
        return f'{spec}{str(self.upper())}!{spec[::-1]}'

Which enables:

>>> meal = Fancy('spam')
>>> f'{meal:-+*}'
'-+*SPAM!*+-'

The `format` method (since Python 3.0, backported to 2.6)

This approach is recommended for cases where f-strings won't work. It offers compatibility with much older versions of Python, and allows storing a template for later reuse. However, it's slightly more awkward and limited.

'I have {} cans of Spam®; baked beans are {}'.format(count, status)

This uses the same basic template syntax as f-strings, but instead uses the format method of an ordinary string. This method supports all the same custom formatting and escaping as the f-string approach:

>>> 'In binary, I have {:#010b} cans of Spam®.'.format(count)
'In binary, I have 0b00001000 cans of Spam®.'
>>> '{{{}}}'.format('wavy')
'{wavy}'

Advanced usage, and comparison to f-strings

The advantages over f-strings are that the "template" string can be stored and have its values filled in explicitly later, and that the placeholders can usefully be empty:

>>> template = 'I have {} cans of Spam®; baked beans are {}'
>>> template.format(count, status)
'I have 8 cans of Spam®; baked beans are off'

However, the syntax is much more limited. The values that will be formatted need to be passed in as arguments - any calculation happens at or before that point, not as part of the template.
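In practice, this means that a calculation an f-string could do inline has to move into the argument list. A minimal sketch, where the template text and the names opened and unopened are hypothetical:

```python
template = 'I have {} cans of Spam®; that is {:.1f} dozen'

opened, unopened = 3, 5

# Expressions such as opened + unopened cannot appear inside the template,
# so they are evaluated here, before format sees them.
message = template.format(opened + unopened, (opened + unopened) / 12)

print(message)
```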

The placeholders can use names supplied using keyword arguments:

>>> '{x} + {x}'.format(x=1)
'1 + 1'

Local variable names will not be considered:

>>> x = 1
>>> '{x}'.format(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'x'

Numbered placeholders allow reusing or re-ordering the formatted values:

>>> # for example, for localization
>>> 'In English, {0} {1} {2}.'.format('a', 'dead', 'parrot')
'In English, a dead parrot.'
>>> 'En Français, {0} {2} {1}.'.format('un', 'mort', 'perroquet')
'En Français, un perroquet mort.'
>>> # or just repetition
>>> '{0}, {0}, {0}, {0}, {0}, {1} {0}, {0}'.format('spam', 'lovely')
'spam, spam, spam, spam, spam, lovely spam, spam'

The template can safely ignore extra values, regardless of numbering:

>>> '{}'.format('spam', 'eggs')
'spam'
>>> '{1}'.format('spam', 'eggs')
'eggs'
>>> '{veggie}'.format(meat='spam', veggie='baked beans')
'baked beans'

Placeholders have special, limited syntax for element/item/attribute access:

>>> # Doesn't work with negative numbers or slices.
>>> '{[0]}'.format([1, 2, 3])
'1'
>>> # Only works with keys that are strings.
>>> '{[key]}'.format({'key': 'value'})
'value'
>>> '{.imag}'.format(1+2j)
'2.0'

But they cannot use arbitrary expressions the way that f-strings do:

>>> '{x+y}'.format(x=1, y=2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'x+y'

And can give confusing error messages:

>>> # This negative index gets interpreted as a string key instead.
>>> '{[-1]}'.format([1,2,3])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: list indices must be integers or slices, not str

The format_map method

The format_map method is a minor variation on the format method that is rarely seen. It takes a single parameter which is some kind of mapping; it works like format, except looking up keys in the mapping instead of keyword arguments.

Calling .format_map(mapping) is similar to calling .format(**mapping), but it allows the mapping to compute values on-demand when the keys are looked up, and doesn't require the mapping to know what its keys are. It's possible to define custom mappings, like so:

>>> class ExampleMapping(dict):
...     def __missing__(self, key):
...         return f'value for {key}'
... 
>>> '{ham} and {eggs}'.format(**ExampleMapping()) # doesn't work
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'ham'
>>> '{ham} and {eggs}'.format_map(ExampleMapping()) # this is needed
'value for ham and value for eggs'

Because it only uses keys, the template cannot contain any positional values:

>>> # This does not try an empty string key!
>>> '{}'.format_map(ExampleMapping()) 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Format string contains positional fields

The Formatter class

This is quite advanced, and useless for almost everyone. The documentation explains what everything does, but doesn't really make the practical usage clear.

The standard library string module provides a class called Formatter which is a pure-Python implementation of the format method's logic. A call like string.Formatter().format(my_string, ...) is equivalent to my_string.format(...), with whatever positional and keyword arguments replacing the .... The purpose of this class is to provide hooks for changing how the formatting process works.
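The equivalence can be checked directly. A small sketch with a made-up template and arguments:

```python
import string

template = '{0} and {side} ({0!r})'

# The same positional and keyword arguments are passed to both calls.
via_class = string.Formatter().format(template, 'spam', side='eggs')
via_method = template.format('spam', side='eggs')

print(via_class)
print(via_class == via_method)
```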

A simple (sort of...) example, where the "template" is now a plain list of comma-separated items:

import string

class CommaFormatter(string.Formatter):
    def parse(self, fstr):
        # This will also be called to look for nested placeholders within
        # format specs. Ignoring these fully requires special handling.
        if fstr is None:
            return
        field_names = [f.strip() for f in fstr.split(',')]
        leading = ''
        for fname in field_names:
            yield (leading, fname, None, 's')
            leading = ', ' # for each item except the first

Which allows:

>>> cf = CommaFormatter().format
>>> cf('meat, veggies', meat='spam', veggies='baked beans')
'spam, baked beans'
>>> cf('0,0,0,0,0,0,1,0,0,0,0,0', 'spam', 'baked beans')
'spam, spam, spam, spam, spam, spam, baked beans, spam, spam, spam, spam, spam'

And, perhaps counter-intuitively:

>>> cf('', 'spam', 'baked beans')
'spam'
>>> cf(',', 'spam', 'baked beans')
'spam, baked beans'

The `Template` class (since Python 2.4)

This approach is unpopular, and cumbersome for simple cases. It seems to be rarely used or mentioned, especially since the format method of strings also allows for simple, reusable templates. However, it's still maintained and even got new functionality added in 3.11.

from string import Template
t = Template('I have $count cans of Spam®; baked beans are $status')
t.substitute({'count': 8, 'status': 'off'})

The standard library string module provides a class called Template that offers reusable string formatting from keywords (similar to a string with braces that will have its format_map method called later). It uses a less powerful, but simpler syntax that appears inspired by Perl's string interpolation.

It's also possible to use keyword arguments to substitute into the template:

>>> t.substitute(count=8, status='off')
'I have 8 cans of Spam®; baked beans are off'

Advanced usage

The substitute method will raise an exception if a keyword is missing:

>>> t.substitute()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.11/string.py", line 121, in substitute
    return self.pattern.sub(convert, self.template)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/string.py", line 114, in convert
    return str(mapping[named])
               ~~~~~~~^^^^^^^
KeyError: 'count'

safe_substitute leaves those placeholders alone (this can hide problems):

>>> t.safe_substitute(status='off')
'I have $count cans of Spam®; baked beans are off'

Use {} to clarify the placeholder name if there need to be letters immediately after the substituted value:

>>> Template('${b}a${c}o$n').substitute(b='b',c='c',n='n')
'bacon'

To escape a literal $, double it up (I'm noticing a theme...):

>>> Template('ca$$h $cash').substitute(cash='money')
'ca$h money'

For really advanced use cases, the Template class can also be subclassed - refer to the documentation for details.
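As one hedged illustration of subclassing, the documented delimiter class attribute changes which character marks a placeholder. The class name PercentTemplate and the template text below are hypothetical:

```python
from string import Template

class PercentTemplate(Template):
    # Hypothetical subclass: % marks placeholders, so $ becomes ordinary text.
    delimiter = '%'

t = PercentTemplate('I have %count cans of Spam® for $5; baked beans are %status')
result = t.substitute(count=8, status='off')

print(result)
```

This can be handy when the text naturally contains many dollar signs, at the cost of needing to double up any literal % instead.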

The `%` operator

This approach has many disadvantages and idiosyncrasies. It should not be used in new code without a good reason. However, it's the original way to solve the problem, and some older interfaces are designed around it. It also isn't formally deprecated.

'I have %d cans of Spam®; baked beans are %s' % (count, status)

Formatting strings using the % operator still works by substituting values into the placeholders of a template, but they look very different. This approach is meant to mimic functions like printf in C, although it supports many things that the original functions don't.

The letters used for the placeholders follow mostly the same rules as in C, which are also used for format specifiers for f-strings and the format method. However, it's usually not necessary to worry about this too much. The %s placeholder will convert the value directly to string with str, and everything supports that by default (although for some types, the result might not always be what you want).
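A short sketch comparing a few placeholder letters; the values are made up, and %r (which converts with repr rather than str) is included for contrast:

```python
count = 8
status = 'off'

with_s = 'count=%s status=%s' % (count, status)   # str() of each value
with_r = 'count=%r status=%r' % (count, status)   # repr() of each value
with_d = '%d cans weighing %.1f kg' % (count, 2.72)  # integer and fixed-point

print(with_s)
print(with_r)
print(with_d)
```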

To escape a literal % sign, double it up (just like with the braces before):

>>> '%%%s%%' % 'I like percentage signs'
'%I like percentage signs%'

Complications

Extra positional arguments are not ignored and cause an error:

>>> '%s' % ('one', 'two')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: not all arguments converted during string formatting

Multiple arguments have to be in a tuple - not a list:

>>> 'I have %d cans of Spam®; baked beans are %s' % [count, status]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: %d format: a real number is required, not list
>>> 'I have %s cans of Spam®; baked beans are %s' % [count, status]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: not enough arguments for format string

Parentheses are necessary around the tuple because of the order of operations:

>>> # This tries to do the formatting first, and THEN
>>> # make a tuple with the formatted string and the `status` value.
>>> 'I have %s cans of Spam®; baked beans are %s' % count, status
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: not enough arguments for format string

Single values don't need to be in a tuple:

>>> my_list = [1, 2, 3]
>>> 'This is a list: %s' % my_list
'This is a list: [1, 2, 3]'

Unless the single value is a tuple:

>>> my_tuple = (1, 2, 3)
>>> 'This is a tuple: %s' % my_tuple
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: not all arguments converted during string formatting

Putting the tuple inside another 1-tuple solves the problem:

>>> 'This is a tuple: %s' % (my_tuple,)
'This is a tuple: (1, 2, 3)'

Advanced usage

% can also use a mapping (placeholder names go in parentheses):

>>> '%(meat)s and %(side)s' % {'side': 'eggs', 'meat': 'ham'}
'ham and eggs'

This can't be mixed with positional placeholders:

>>> '%(key)s %s' % ({'key': 'value'}, 'text')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: format requires a mapping
>>> '%(key)s %s' % {'key': 'value', '': ''}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: not enough arguments for format string

Some advanced formatting options are also supported:

>>> '%08x' % count
'00000008'
>>> '%#010x' % count
'0x00000008'

However, certain format specifiers do not work (see the documentation on printf-style string formatting for the full list):

>>> f'{count:b}'
'1000'
>>> '%b' % count
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: unsupported format character 'b' (0x62) at index 1

String concatenation with explicit casting

This approach can be awkward to use, especially when there are multiple, non-string pieces to assemble. However, it's simple and effective. Contrary to what some would expect, it doesn't ordinarily run into performance issues; although strings are immutable as far as Python code can tell, the implementation can and does "cheat".

'I have ' + str(count) + ' cans of Spam®; baked beans are ' + status

The + operator does work for putting strings together, but it requires a string for both operands (on both sides of the +). Notice that any desired spaces between parts of the text need to be accounted for carefully.
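The spacing issue is easy to miss, because nothing is added between the operands. A minimal sketch using the question's values:

```python
count = 8
status = 'off'

# No spaces at the joins: the pieces run together.
missing_spaces = 'I have' + str(count) + 'cans of Spam®; baked beans are' + status

# Spaces written into the literal pieces where they are wanted.
with_spaces = 'I have ' + str(count) + ' cans of Spam®; baked beans are ' + status

print(missing_spaces)
print(with_spaces)
```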

This is disallowed because it's ambiguous:

>>> '2' + 2 # "In the face of ambiguity, refuse the temptation to guess."
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can only concatenate str (not "int") to str

Explicitly converting either side of the + makes the meaning clear:

>>> int('2') + 2
4
>>> '2' + str(2)
'22'

Aside from using + to join the strings together, an already-existing sequence of strings can simply be joined together (which is usually considered its own topic):

>>> pieces = ('I have', str(count), 'cans of Spam®; baked beans are', status)
>>> ' '.join(pieces) # join with a space in between each pair
'I have 8 cans of Spam®; baked beans are off'

posted almost 2 years ago

CC BY-SA 4.0

Karl Knechtel‭
