This project develops a neural network that transforms a low-resolution digit image (28×28) from the MNIST/EMNIST dataset into a high-resolution spectrogram (1008×1008) that encodes the harmonic ...
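
As a rough illustration of the shapes involved, note that 1008 = 36 × 28, so the output is an exact integer multiple of the input size. The sketch below is only a nearest-neighbour baseline written with NumPy, not the project's network; the function name `upsample_nearest` and the random stand-in image are assumptions made for illustration.

```python
import numpy as np

LOW_RES = 28     # input digit size, from the description
HIGH_RES = 1008  # target spectrogram size, from the description


def upsample_nearest(image: np.ndarray, out_size: int = HIGH_RES) -> np.ndarray:
    """Nearest-neighbour upsampling by an integer factor.

    Hypothetical baseline for shape checking only, not the learned model.
    """
    h, w = image.shape
    if h != w or out_size % h != 0:
        raise ValueError("expected a square image whose side divides out_size")
    factor = out_size // h  # 1008 // 28 == 36
    # np.kron repeats every pixel into a factor x factor block.
    return np.kron(image, np.ones((factor, factor), dtype=image.dtype))


# Random stand-in for an MNIST/EMNIST digit (assumption: values in [0, 1)).
digit = np.random.default_rng(0).random((LOW_RES, LOW_RES)).astype(np.float32)
spec = upsample_nearest(digit)
print(spec.shape)  # (1008, 1008)
```

A learned model would replace the fixed pixel repetition with trainable layers, but the input and output shapes would stay the same as in this baseline.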
That's excellent work. However, I have some difficulties. As I am going to fine-tune only some parts of the model, I need to calculate some intermediate data. Specifically, given an audio sequence, ...