rafa-br34/ImageToSpectrogram
A Python script that transforms images into audio
ImageToSpectrogram
Table of contents
Usage
i2s.py:
About
The i2s.py utility transforms images into audio files whose spectrograms closely approximate the input image.
Concepts
The way i2s.py works is quite simple. First, the desired image is loaded as RGB and converted into a NumPy array. Then, for every pixel, the following function is applied:
The rows of the resulting array are then iterated over. Each row is upsampled using linear interpolation and used as the amplitude multiplier for the current tone.
TL;DR: every linearly upsampled row of the image is treated as an amplitude multiplier for the current tone.
For more information on how each tone is created, please read i2s.py starting at line 124.
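The row-by-row idea described above can be sketched roughly as follows. This is a minimal illustration, not the code from i2s.py: the function name `synthesize`, the linear row-to-frequency mapping (top row as the highest frequency), and the final peak normalization are all assumptions. The per-pixel function is not reproduced here, so the sketch starts from an array of brightness values that are already in the range 0 to 1.

```python
import numpy as np


def synthesize(gray, freq_min=0.0, freq_max=20000.0, duration=8.0, sample_rate=44100):
    """Turn a 2-D array of brightness values in [0, 1] into audio samples.

    Illustrative only: the row-to-frequency mapping and the normalization
    are assumptions, not taken from i2s.py.
    """
    height, width = gray.shape
    n_samples = int(duration * sample_rate)
    t = np.arange(n_samples) / sample_rate  # time of each sample, in seconds

    # Positions of the image columns and of the audio samples on a shared 0..1 axis.
    col_pos = np.linspace(0.0, 1.0, width)
    sample_pos = np.linspace(0.0, 1.0, n_samples)

    # Assumed mapping: top row is the highest frequency, bottom row the lowest.
    freqs = np.linspace(freq_max, freq_min, height)

    audio = np.zeros(n_samples)
    for row, freq in zip(gray, freqs):
        # Linear upsampling of one image row to the length of the audio.
        envelope = np.interp(sample_pos, col_pos, row)
        # The upsampled row scales the amplitude of this row's tone.
        audio += envelope * np.sin(2.0 * np.pi * freq * t)

    # Scale to the range -1..1 so the result can be written as audio.
    peak = np.max(np.abs(audio))
    return audio / peak if peak > 0 else audio


# A tiny 4x8 test image with a single bright row.
image = np.zeros((4, 8))
image[1, :] = 1.0
samples = synthesize(image, freq_min=1000.0, freq_max=4000.0,
                     duration=0.1, sample_rate=8000)
```

With these example values the second row maps to 3000 Hz, so a spectrogram of `samples` would show a single horizontal line at that frequency.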
Arguments
--input/-i:
Path to the input file. Any image format supported by the Pillow library can be used.
--output/-o:
Path to the output file. Any audio format can be used, but WAV is recommended.
Default value: result.wav
--freq-min & --freq-max:
The desired frequency range.
Default values (min/max): 0/20000
--duration/-d:
Defines how long the resulting audio file should be. Higher values allow for a higher resolution but are more computationally expensive and produce a bigger file.
Default value: 8
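To get a feel for the trade-off, the arithmetic can be sketched as below. The helper `duration_budget` is hypothetical, and the sketch assumes that the duration is given in seconds and that the output is written as 16-bit mono audio (2 bytes per sample); neither is confirmed by the description above.

```python
def duration_budget(duration, sample_rate, image_width, bytes_per_sample=2):
    """Return (total samples, samples per image column, raw audio bytes).

    Assumes duration is in seconds and the audio is mono.
    """
    total_samples = int(duration * sample_rate)
    samples_per_column = total_samples / image_width
    data_bytes = total_samples * bytes_per_sample
    return total_samples, samples_per_column, data_bytes


# Default duration and sample rate with a 512-pixel-wide image.
print(duration_budget(8, 44100, 512))  # (352800, 689.0625, 705600)
```

Doubling the duration doubles all three numbers, which is where the extra horizontal detail, computation, and file size come from.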
--sample-rate/-r:
The sample rate to use. For most purposes this shouldn't matter.
Default value: 44100
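One case where the sample rate does matter is the upper frequency limit: a sampled signal can only represent frequencies up to half the sample rate (the Nyquist frequency). The small check below is a sketch of that rule; the helper `fits_below_nyquist` is hypothetical and not part of i2s.py.

```python
def fits_below_nyquist(freq_max, sample_rate):
    """Return True if freq_max can be represented at the given sample rate."""
    return freq_max <= sample_rate / 2


print(fits_below_nyquist(20000, 44100))  # True: the Nyquist frequency is 22050 Hz
print(fits_below_nyquist(20000, 32000))  # False: the Nyquist frequency is 16000 Hz
```

The default --freq-max of 20000 fits under the default sample rate of 44100, so lowering the sample rate would also require lowering --freq-max.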
--step-y:
Defines how many steps to take before computing a single line.
--offsets:
Defines how to offset each tone, available modes:
45: Offsets each tone by 45 degrees
random: Offsets each tone by a random amount
none: No offset
Default value: 45
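As a rough illustration of what the three modes could mean in terms of phase, the sketch below builds one starting phase per tone. The function `tone_phases` is hypothetical, and reading the 45 mode as "tone k is shifted by k times 45 degrees" is an assumption; i2s.py may apply the offset differently. A likely reason for offsetting at all is that tones which all start at the same phase add up into large peaks.

```python
import numpy as np


def tone_phases(n_tones, mode="45", seed=None):
    """Return one starting phase (in radians) per tone for the given mode."""
    if mode == "45":
        # Assumed interpretation: tone k is shifted by k * 45 degrees.
        return np.deg2rad(45.0) * np.arange(n_tones)
    if mode == "random":
        # Uniformly random phase in [0, 2*pi) for every tone.
        return np.random.default_rng(seed).uniform(0.0, 2.0 * np.pi, n_tones)
    if mode == "none":
        return np.zeros(n_tones)
    raise ValueError(f"unknown offset mode: {mode!r}")


phases = tone_phases(4, mode="45")
```

Each phase would then be added inside the sine of the matching tone, for example `np.sin(2 * np.pi * freq * t + phase)`.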
Results
Rendered using Audacity
--offset random --step-y 4 -d 4








