Q&A

# Separate digits of a number in groups with different sizes

+4
−0

I have a list of 11-digit numbers stored in one single column in Excel, and I need to separate the digits according to this pattern: `2-2-1-3-3`.

Example: `00002451018` becomes `00 00 2 451 018`.

How can I do it in Python?

I have tried the following: If you notice the last column new Value is getting only from the last row. Which is ending with 416. Somehow I want inside of this to update and create a new column to contain for each row the updated pattern.

Why does this post require moderator attention?
Why should this post be closed?

Provide code, not a picture (1 comment)

+6
−0

Before we start, I'd like to be a little bit pedantic regarding `00002451018` being a number.

When we talk about numeric types/values, the zeroes at the beginning are irrelevant: `2`, `02` and `000002` all refer to the number two. The numeric value is `2`, and only the representation - the way they're formatted - differs.

That said, this is invalid Python code:

``````n = 00002451018
``````

Because it gives this error:

SyntaxError: leading zeros in decimal integer literals are not permitted; use an 0o prefix for octal integers

Therefore, if we were to be pedantic, `00002451018` is not a number in Python. When we read numbers from a file, we're actually reading characters that can be interpreted as a number. But when such conversion happens (text to number), the leading zeroes are "lost"/"ignored", and all we get is the numeric value `2451018`.

So first we need to know what kind of data you have. If you're seeing this number as `00002451018` in Excel, it can be because the cells are formatted as text, or as numbers with a custom formatting (such as "11-digit length with leading zeroes").

The thing is, when you read those cells values as numbers into your Python code, the actual value will be the integer `2451018` (if the column is text, though, the value will be the string `'00002451018'`).

Anyway, as I don't know what you actually have in your spreadsheet, let's assume it's a number (if it's text, the value will be a Python string, which can reuse the code below).

There are two approaches to this problem: extract the digits with maths, or convert the number to string and extract pieces of it with slices.

## Doing the math

To get the first two digits of a 11-digit number, we can simply divide it by 109 and discard the decimal places.

Then, to get the next two digits (third and fourth ones), we calculate the remainder of the previous division (the number divided by 109). The result will be a 9-digit number, so we just divide it by 107 to get its first two digits (which correspond to the third and fourth digits of the original number).

We repeat this process until we get all the pieces we want. At each step, the exponent of the power of 10 decreases by the quantity of digits we want.

Note that the number actually doesn't need to have 11 digits. If it has less, the result of the divisions will be zero, so this algorithm already takes care of the leading zeroes. The code is like that:

``````def get_slices(n, sizes, n_digits=11):
for size in sizes:
n_digits -= size
# get the division and the remainder
val, n = divmod(n, 10 ** n_digits)
yield f'{val:0{size}}'

n = 2451018
print(' '.join(get_slices(n, [2, 2, 1, 3, 3]))) # 00 00 2 451 018
``````

The `get_slices` function takes the number, a list of the sizes and the total number of digits to consider.

The `yield` keyword makes it a generator function, which yields one part of the number at a time. I used a f-string to properly format the value to have the correct length. This handles cases when the value is zero, or it has less digits than the required size (such as `12` when the size is 3, so it becomes `012`).

To perform the calculations I used the `divmod` built-in, which returns both the result of the division (ignoring the decimal places) and the remainder, all in one step.

Then I use `join` to build a single string with all the pieces together.
The result is the string `00 00 2 451 018`.

## String and slices

Another approach is to convert the number to a string with the leading zeroes, and then take slices of this string with the desired lengths.

``````def get_slices(n, sizes, n_digits=11):
# convert to string with the desired number of digits and leading zeroes
s = f'{n:0{n_digits}}'
prev = 0
for size in sizes:
yield s[prev : prev + size]
prev += size

n = 2451018
print(' '.join(get_slices(n, [2, 2, 1, 3, 3]))) # 00 00 2 451 018
``````

I use the slice notation to get pieces of the string, and then join those pieces together using `join`.

Of course you could also do it like this:

``````n = 2451018
s = f'{n:011}'
print(f'{s[0:2]} {s[2:4]} {s} {s[5:8]} {s[8:11]}')
``````

But I think the `get_slices` function provides a more flexible way, as you can customize the length and the size of each piece.

Regardless of the solution you choose (math or string slices), you can go further and provide a formatting function that uses the `get_slices` function internally. You could also make improvements, such as allowing a different separator:

``````def get_slices(n, sizes, n_digits=11):
# use one of the solutions above...

def format_number(n, sizes, n_digits=11, separator=' '):
return separator.join(get_slices(n, sizes, n_digits))

n = 2451018
print(format_number(n, [2, 2, 1, 3, 3])) # 00 00 2 451 018

# use a different separator
print(format_number(n, [2, 2, 1, 3, 3], separator='-')) # 00-00-2-451-018

# use different number of digits and sizes
print(format_number(n, [3, 4, 2, 2, 4], n_digits=15)) # 000 0000 02 45 1018
``````

Finally, as I said, if the value is already a string, you can simply skip the first step (convert the number to a string) and use the slice solution.

Why does this post require moderator attention?

+0
−1

I found a way how to fix it after I debug. It was only using the last used row because there was nothing to check that I want to edit only the current row. That's why I decide to use `at`

``````for index, row in priceListTest.iterrows():
#print(index,row)
def get_slices(n, sizes, n_digits=11):
for size in sizes:
n_digits -= size

val, n = divmod(n, 10 ** n_digits)
yield f'{val:0{size}}'

n = row['Парт номер']
newVar = (' '.join(get_slices(n, [2, 2, 1, 3, 3])))
priceListTest.at[index,['New Value']] = newVar

print("after changes")
print(priceListTest)``````
Why does this post require moderator attention? 