
Welcome to Software Development on Codidact!

Convert .npy files in a directory to images (.png)


I have around 20,000 .npy files in a single directory, with no subfolders:

Main_dir
   |
   |--1.npy
   |--2.npy
   |--3.npy
   |--........

The absolute file paths are stored in a list named paths.

What is the fastest and most efficient method (time- and memory-wise) to convert all of these .npy files to images (.png preferred)?

I don't mind whether the solution uses Python or Bash.


2 answers



"fastest and most efficient method (time and memory-wise)"

Time and memory trade off against each other in a simple way: the number of parallel jobs divides the total time (up to the number of cores, minus some overhead, and assuming no I/O bottleneck, e.g. on an SSD) but multiplies the peak memory used.

"stored in a list named paths"

Is that a Python list, or do you mean a newline-delimited file?

For the latter, an easy Bash way to parallelize is: xargs --max-procs=16 --max-args=1 command_to_convert_one < pathlist_file

Alternatively, to list all files recursively (just in case), you can use find Main_dir -type f -name '*.npy'.
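Combining the two, here is a runnable sketch; it uses a throwaway directory, and echo stands in for the (hypothetical) command_to_convert_one so the sketch runs anywhere:

```shell
set -eu
# Throwaway demo tree; replace with the real Main_dir.
mkdir -p Main_dir
: > Main_dir/1.npy
: > Main_dir/2.npy
# One absolute path per line, matching the pathlist_file idea above.
find "$PWD/Main_dir" -type f -name '*.npy' > pathlist_file
# Fan out one job per path, two at a time; swap echo for the real converter.
xargs --max-procs=2 --max-args=1 echo converting < pathlist_file
```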

"convert all these .npy files to image"

This is the most unclear part of your question: .npy files are data files (NumPy's binary array format, sometimes described as an alternative to CSV), not images.

There are many ways to convert them to images: plotting the data with matplotlib (or another plotting library), interpreting the values directly as image pixels, or something else entirely.
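As an illustration, here is a minimal sketch of the pixel interpretation; the random array is a stand-in for one of the .npy files:

```python
import numpy as np

# Stand-in for one loaded .npy file: a 32x32 array of floats in [0, 1).
arr = np.random.rand(32, 32)

# Interpretation 1: treat the values directly as grayscale pixels,
# rescaling to the 0-255 range that most image writers expect.
span = np.ptp(arr)  # peak-to-peak range; guard against a constant array
pixels = (255 * (arr - arr.min()) / (span if span else 1)).astype(np.uint8)

# Interpretation 2: render a plot of the data instead, e.g.:
# import matplotlib.pyplot as plt
# plt.imsave('example.png', arr)
```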

Without knowing what the data is and what you are trying to do, it is impossible to give a good answer.



A possible parallel solution using imageio:

from imageio import imwrite
from multiprocessing import Pool
from numpy import load
from pathlib import Path


def npy2png(npy_file):
    # Write a PNG with the same base name into the current working directory.
    imwrite(npy_file.stem + '.png', load(npy_file))


root_path = Path('/path/to/Main_dir')
if __name__ == '__main__':
    # One file per task; tune 'processes' to the number of cores available.
    with Pool(processes=8) as pool:
        pool.map(npy2png, root_path.glob('*.npy'))

Additional arguments can be passed to imwrite; in the case of the PNG format, execute

import imageio
imageio.help(name='png')

to know which ones are available.

