NAM LSTM experiments

Comparing my LSTM custom settings to NAM Nano models.
http://coginthemachine.ddns.net/mnt/namhtml/

4 Likes

Can you elaborate some more on that?

1 Like

I have now managed to get quite acceptable quality at only ~1/10 of the CPU that a WaveNet Standard uses on my computer. The NAM file is only 7kB and it has been added to the web page.

With this low CPU it is probably possible to use one or two NAM instances on a really cheap SoC/SBC or an ancient computer.

3 Likes

First off, the capture files used for training need to contain more varied material to learn from. Having more dynamic information and different polarities makes the models a lot better in my tests. Try these files and read the included .txt file: https://fastupload.io/A4tYVcrQlPHbtNd/file

3 Likes

Interesting. Worth a branch here? GitHub - AidaDSP/Automated-GuitarAmpModelling at next. We can set up a few tests with @itskais too and compare with our metrics!

4 Likes

If it is for AIDA-X then I have to look through the training script and see which parameters are being used and which maybe are not. While tweaking parameters I have found that the LSTM parameters are very sensitive and can make a huge difference.

1 Like

Okay, maybe just point me to / invite me to your NAM fork, so I can have a look at the differences, and I can take care of the tests on our training script.

Is that you, correct? I have implemented the necessary modifications in the training script; I just need to finalize the runtime mods to support a couple (for now) of multi-layer RNN configurations. I still need to check CPU consumption.

I also listened to your dataset. Honestly, I have had the best results with our training script by using more musical / guitar-oriented DI tracks instead of test signals. This topic would be worth discussing with you as well.

3 Likes

Some of the test signals are there to help with compression but I haven’t done much testing with that yet.

That is me, yes. The trainings I have posted about in this thread were done with NAM, though, and never seem to need 3 num_layers. AIDA-X needs more tweakable parameters to be able to train the way I am doing with NAM, e.g. drop_last and pin_memory, which should be set to false. With WaveNet, my experience is that setting them to false reduces the high frequencies, BUT with LSTM they actually make the higher frequencies “available”. The problem I had with LSTM at default settings in NAM was that it sounded like it had a very audible low-pass filter. Its strengths, though, were no aliasing-sounding artifacts AND the low CPU.

The parameters “train_burn_in” and “train_truncate” should also be tweakable. Finding the optimal values for these lowers the ESR/MSE drastically.
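For context, these two live in the LSTM model config rather than the data config. A minimal sketch of roughly where they sit, written as a Python dict (the nesting and the other values are illustrative and may differ between NAM versions, so treat it as an assumption, not as the authoritative layout):

```python
# Hypothetical excerpt of a NAM LSTM model config, expressed as a Python dict.
# Only train_burn_in and train_truncate matter here; the surrounding keys and
# example values are illustrative.
lstm_model_config = {
    "net": {
        "name": "LSTM",
        "config": {
            "num_layers": 1,         # stacked LSTM layers
            "hidden_size": 24,       # LSTM hidden-state size
            "train_burn_in": 4096,   # samples used only to warm up the state before the loss is taken
            "train_truncate": 512,   # length of the truncated-BPTT segments during training
        },
    },
}
```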

The x_test and y_test validation files are taken from the part of the training file that has the loudest amplitude and that, in my tests, has been shown to produce the most distortion and sag/compression characteristics. When I used the part that is commonly used with NAM, which is a lot quieter, the trained model did not capture the compression very well.
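If anyone wants to reproduce that kind of split, here is a minimal sketch, assuming mono input and reamped wav files that are already time-aligned; the file names, window length and segment length are placeholders, not part of any existing script:

```python
# Cut the loudest region of a time-aligned training pair into x_test/y_test.
# Assumes mono files; names and lengths are placeholders for illustration.
import numpy as np
import soundfile as sf

x, sr = sf.read("x_train.wav")   # DI / input signal
y, _ = sf.read("y_train.wav")    # reamped / output signal, same length and alignment

win = sr            # 1-second analysis windows
seg = 10 * sr       # take a 10-second validation segment

# RMS per window of the input; start the segment at the loudest window.
n_win = len(x) // win
rms = np.sqrt(np.mean(x[: n_win * win].reshape(n_win, win) ** 2, axis=1))
start = int(np.argmax(rms)) * win
start = max(0, min(start, len(x) - seg))  # keep the segment inside the file

sf.write("x_test.wav", x[start : start + seg], sr)
sf.write("y_test.wav", y[start : start + seg], sr)
```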

Adding many loudness and opposite-polarity layers seems to have made the models act and feel more like the actual devices. They feel less forgiving than many models. This is highly subjective, though, and needs an ABX test.

One interesting thing with ESR is that a WaveNet can have an ESR of 0.0005 and sound much worse to me than an LSTM with an ESR of 0.05. I think WaveNet has issues with aliasing and ringing artifacts, which can be mitigated with “pre_emph_mrstft_weight” and “pre_emph_mrstft_coef”. Overall, I believe WaveNet models EQ curves on full rigs and extreme EQ settings better than LSTM. In those difficult cases LSTM can still get 95% of the way there with a lot less CPU, though.
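For anyone comparing those numbers: ESR is just the energy of the error divided by the energy of the target, with no perceptual weighting at all, which is why clearly audible aliasing or ringing can still leave the number tiny. A minimal NumPy version:

```python
import numpy as np

def esr(target: np.ndarray, prediction: np.ndarray, eps: float = 1e-12) -> float:
    """Error-to-signal ratio: sum((y - y_hat)^2) / sum(y^2)."""
    return float(np.sum((target - prediction) ** 2) / (np.sum(target ** 2) + eps))
```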

3 Likes

http://coginthemachine.ddns.net/mnt/namhtml/namconfig.html

Here is all you need. Download the NAM Colab file and use the settings and wav files described on that page. Don’t forget to compile your AIDA-X or NAM plugin with Clang for lower CPU usage. I compiled them with -O3 -flto. On Intel and AMD, add -march=x86-64-v3 for a very big speed-up if you have a compatible CPU.

I have compiled builds at:
http://coginthemachine.ddns.net/mnt/nam/software_src_misc/
http://coginthemachine.ddns.net/mnt/aida-x/software_src_misc/

3 Likes

It seems that no one is really interested in this, or am I wrong?

It is interesting, and posting your findings is appreciated.
This forum might not be the best place, though, because there are at most 3 people here who know about those specifics and what they mean.

1 Like

I personally don’t understand what is in play here… A better way to train NAM? One that requires less CPU but sounds better?
If you have some .json files to test, it would be less confusing for me.

As spunktsch said, your work is highly appreciated and the improvement in quality looks very promising, but you have to be quite tech-savvy to put it to use, and those who are struggle to find time for experiments.
Is there a chance your findings could be integrated into the online AIDA-X training Colab? Maybe in some kind of advanced mode that gives access to these parameters?

1 Like

Yes, exactly. What is needed is the NAM Colab file and pasting in the config that I have posted.

1 Like

If I remember correctly, AIDA-X is limited to a selection of num_layers and hidden_size values, so at the very least things must be changed in the application code itself. I don’t know how AIDA-X loads NAM files, or whether it does at all.

I have some experience with Colab and neural networks, so I’ve been reading what you posted with interest. I learned about LSTMs because of what you wrote, since I was curious.

Unfortunately my knowledge is limited and I can’t contribute unless it’s about running tests on Colab and sending the results, or something like that. I have never captured an amp or trained a model either.

As the others said, I really appreciate that you’re doing this and sharing your results. Thanks a lot! :love_you_gesture:t6:

1 Like

And I’m pretty sure it doesn’t?

Nope, for me it’s just a really busy period. We are all happy when we see involvement in this part, which is freaking difficult to tame, as you may have figured out. We all made assumptions to keep the ML part as generic and simple to use as possible, but the reality is that every amplifier (device) is its own story, with some specific settings helping a lot with some devices and not with others.

The problem is not tweaking the training parameters to obtain a better result on one device, or at least that is only the beginning of the story. The problem is understanding whether this would work as a generic rule for every training. Does it make sense in the first place? I’m thinking about injecting the device type at the beginning and selecting a pool of configs based on that…

  1. Can we move the optimization of the runtime to another thread, maybe on GitHub, and continue here? On small devices we use the aidadsp-lv2 runtime, not AIDA-X. On those devices we believe we’re building very optimized binaries with all the flags, LTO and so on. See the official Buildroot recipe or the community Yocto recipe.

I believe WaveNet models eq curves on full rigs and extreme eq settings better than LSTM

Thanks for saying that. Why are we asking the neural network to implement the EQ with no help? We have sweeps and white noise at multiple levels. We could just use them to estimate the overall EQ and implement it with an FIR or IIR filter, possibly recycling layers from the engine. Do you have time to look into this with me? I can provide instructions and technical support.
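To sketch the idea (this is just one reading of the proposal, not existing project code): estimate the device’s overall magnitude response from the white-noise segment of the dataset using Welch cross-/auto-spectra, then design a fixed FIR (or fit an IIR) from it that runs outside the network. File names and parameters below are placeholders:

```python
# Estimate the device's overall magnitude response from a white-noise segment
# of the dataset, as a starting point for a fixed FIR/IIR "EQ" stage separate
# from the neural network. Assumes mono files; names are placeholders.
import numpy as np
import soundfile as sf
from scipy import signal

x, sr = sf.read("noise_di.wav")      # white-noise input segment
y, _ = sf.read("noise_reamp.wav")    # device output for the same segment

nperseg = 4096
f, pxx = signal.welch(x, fs=sr, nperseg=nperseg)    # input auto-spectrum
_, pxy = signal.csd(x, y, fs=sr, nperseg=nperseg)   # cross-spectrum

h = pxy / pxx                                        # H1 transfer-function estimate
mag_db = 20 * np.log10(np.abs(h) + 1e-12)            # overall "EQ curve" in dB

# A linear-phase FIR approximating this response, which could then be applied
# before or after a smaller neural model:
fir = signal.firwin2(1025, f, np.abs(h), fs=sr)
```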

has the loudest amplitude and in my tests has shown to produce most distortion and sag/compression characteristics

Just for this amp, or did you test across a pool of devices? I also think you are doing your tests in snapshot mode? How about testing the same thing with conditioned models? Have you checked the sensitivity to the guitar volume pot? Does the model respond correctly to guitar volume pot changes? On this I need more time to perform the tests.

Finally, I’m repeating what I was asking in another thread: we need the COCO-dataset equivalent for model sims. A pool of carefully recorded devices with no cabs, offering a wide range of devices: from OD/distortion pedals to clean amps to Cowboys from Hell territory. Then we can discuss overall score vs. single-device improvements. Otherwise we go mad, don’t you think?

The first thing we should also fix: how did NAM produce this dataset? Was it a script? A late-night intuition? Was it from a paper? Because I keep saying that with this dataset and the A-weighting filter pre-emphasis, which is the AIDA default, we get those bassy-sounding amps. If we use another dataset, the one we used for our premium models and that is IP (I’m currently busy recording a brand new one with a guitarist in the studio), the models sound just right to my ears from an EQ point of view.

4 Likes

Hi, regarding this issue, today I had time to check. So basically, what NAM calls train_burn_in and train_truncate is implemented under different names (because there are no standard names) here

It’s very easy to change them by simply passing the correct args to this script; their default values are

init_len = 200
up_fr = 1000

while NAM uses 4096 and 512 respectively. The default values in Automated-GuitarAmpModelling are the ones from the paper. Where do the NAM values come from? It would be good to experiment, if someone has time.
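For intuition, this is roughly what the two numbers control in an RNN training loop (a simplified sketch, not the actual code of either project): the first init_len / train_burn_in samples only warm up the hidden state with no gradients, and up_fr / train_truncate is how many samples are processed between gradient updates, with the hidden state carried over but the graph detached. reset_hidden() and detach_hidden() are assumed helpers on the model:

```python
import torch

def train_segment(model, x, y, loss_fn, optimizer, init_len=200, up_fr=1000):
    """Simplified truncated-BPTT loop over one (batch, time, 1) segment."""
    model.reset_hidden()                     # assumed helper: zero the LSTM state
    with torch.no_grad():                    # burn-in: warm up the state only
        model(x[:, :init_len, :])

    total = 0.0
    for start in range(init_len, x.shape[1], up_fr):
        optimizer.zero_grad()
        pred = model(x[:, start:start + up_fr, :])
        loss = loss_fn(pred, y[:, start:start + up_fr, :])
        loss.backward()
        optimizer.step()
        model.detach_hidden()                # assumed helper: cut the graph, keep the state
        total += loss.item()
    return total
```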

Regarding drop_last and pin_memory, a bit of context.

In PyTorch, the drop_last and pin_memory parameters are used in the DataLoader class. Here’s what they do:

drop_last: When set to True, this parameter drops the last incomplete batch if the dataset size is not divisible by the batch size. If set to False, it will include the last incomplete batch.
pin_memory: When set to True, this parameter enables faster data transfer to CUDA-enabled GPUs by allocating the data in page-locked memory. If set to False, it will not use page-locked memory.

But if those variables are set to false, as @modep’s experiments are reporting, then it’s like not specifying them at all, since false is the default for both. NAM sets both to true. See the sketch below.
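A minimal DataLoader example showing where the two flags go (the dataset here is just a placeholder):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder dataset: 1000 (input, target) frames of 2048 samples each.
ds = TensorDataset(torch.randn(1000, 2048), torch.randn(1000, 2048))

# drop_last=False keeps the final partial batch; pin_memory=False skips
# page-locked host memory. Both are the PyTorch defaults, so passing False
# is the same as not passing them at all.
loader = DataLoader(ds, batch_size=16, shuffle=True,
                    drop_last=False, pin_memory=False)
```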

Of course I’ve inspected NAM LSTMCore and Nano models.

1 Like