Thanks @gianfranco, I already removed the TensorFlow dependency last Friday. I am now finishing testing after rebasing the Jupyter script on the current next branch.
Generating model file: SimpleRNN_MOD-AUDIO-UG_in1_LSTM-8-1_skip0.aidax
Model file saved to: /content/Automated-GuitarAmpModelling/SimpleRNN_MOD-AUDIO-UG_in1_LSTM-8-1_skip0.aidax
Alright @LievenDV @pilal and others here, I’ve finished testing the new training script and codebase. There are lots of things to share, but before diving into the new stuff I would like to double-check with you that the basic stuff is working.
https://colab.research.google.com/github/AidaDSP/Automated-GuitarAmpModelling/blob/next/AIDA_X_Model_Trainer.ipynb
Can you help me with this? I will then merge it into aidadsp_devel.
Step 4 Model Evaluation is still WIP atm, so just do the training in Step 3 and move to Step 5 for export
Will check this afternoon (Wednesday 9/4/25)
Step 0: check deps.
The script did some installs and asked to restart, so I did.
Step 1: setup, run cell: I get an error
Checking GPU availability... GPU available!
Getting the code...
A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.0.2 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.
If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "/usr/local/lib/python3.11/dist-packages/colab_kernel_launcher.py", line 37, in <module>
ColabKernelApp.launch_instance()
File "/usr/local/lib/python3.11/dist-packages/traitlets/config/application.py", line 992, in launch_instance
app.start()
File "/usr/local/lib/python3.11/dist-packages/ipykernel/kernelapp.py", line 712, in start
self.io_loop.start()
File "/usr/local/lib/python3.11/dist-packages/tornado/platform/asyncio.py", line 205, in start
self.asyncio_loop.run_forever()
File "/usr/lib/python3.11/asyncio/base_events.py", line 608, in run_forever
self._run_once()
File "/usr/lib/python3.11/asyncio/base_events.py", line 1936, in _run_once
handle._run()
File "/usr/lib/python3.11/asyncio/events.py", line 84, in _run
self._context.run(self._callback, *self._args)
File "/usr/local/lib/python3.11/dist-packages/ipykernel/kernelbase.py", line 510, in dispatch_queue
await self.process_one()
File "/usr/local/lib/python3.11/dist-packages/ipykernel/kernelbase.py", line 499, in process_one
await dispatch(*args)
File "/usr/local/lib/python3.11/dist-packages/ipykernel/kernelbase.py", line 406, in dispatch_shell
await result
File "/usr/local/lib/python3.11/dist-packages/ipykernel/kernelbase.py", line 730, in execute_request
reply_content = await reply_content
File "/usr/local/lib/python3.11/dist-packages/ipykernel/ipkernel.py", line 383, in do_execute
res = shell.run_cell(
File "/usr/local/lib/python3.11/dist-packages/ipykernel/zmqshell.py", line 528, in run_cell
return super().run_cell(*args, **kwargs)
File "/usr/local/lib/python3.11/dist-packages/IPython/core/interactiveshell.py", line 2975, in run_cell
result = self._run_cell(
File "/usr/local/lib/python3.11/dist-packages/IPython/core/interactiveshell.py", line 3030, in _run_cell
return runner(coro)
File "/usr/local/lib/python3.11/dist-packages/IPython/core/async_helpers.py", line 78, in _pseudo_sync_runner
coro.send(None)
File "/usr/local/lib/python3.11/dist-packages/IPython/core/interactiveshell.py", line 3257, in run_cell_async
has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
File "/usr/local/lib/python3.11/dist-packages/IPython/core/interactiveshell.py", line 3473, in run_ast_nodes
if (await self.run_code(code, result, async_=asy)):
File "/usr/local/lib/python3.11/dist-packages/IPython/core/interactiveshell.py", line 3553, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-1-727379d7cf67>", line 19, in <cell line: 0>
device = torch.device("cuda")
<ipython-input-1-727379d7cf67>:19: UserWarning: Failed to initialize NumPy: _ARRAY_API not found (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:84.)
device = torch.device("cuda")
Checking for code updates...
Installing dependencies...
Mounting google drive...
Mounted at /content/drive
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-1-727379d7cf67> in <cell line: 0>()
50 os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:2"
51
---> 52 from colab_functions import wav2tensor, extract_best_esr_model, create_csv_aidax
53 from prep_wav import WavParse
54 import plotly.graph_objects as go
ImportError: cannot import name 'create_csv_aidax' from 'colab_functions' (/content/Automated-GuitarAmpModelling/colab_functions.py)
---------------------------------------------------------------------------
NOTE: If your import is failing due to a missing package, you can
manually install dependencies using either !pip or !apt.
To view examples of installing some common dependencies, click the
"Open Examples" button below.
---------------------------------------------------------------------------
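The ImportError above says that the checked-out `colab_functions.py` does not define `create_csv_aidax`. One possible explanation (an assumption, not confirmed in this thread) is a stale copy of the repository in the Colab session. A minimal sketch for listing which names a source file defines without importing it; `defined_names` and `missing_names` are hypothetical helpers, not part of the repository:

```python
import ast


def defined_names(source: str) -> set[str]:
    """Return names of top-level functions, classes and simple assignments.

    Names re-exported through `from x import y` are deliberately not counted.
    """
    names = set()
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            names.add(node.name)
        elif isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    names.add(target.id)
    return names


def missing_names(source: str, wanted: list[str]) -> list[str]:
    """Return the wanted names that the source does not define, in order."""
    available = defined_names(source)
    return [name for name in wanted if name not in available]


# Inline stand-in for an outdated colab_functions.py (illustrative only).
stale_source = (
    "def wav2tensor(path):\n    pass\n\n"
    "def extract_best_esr_model(directory):\n    pass\n"
)
wanted = ["wav2tensor", "extract_best_esr_model", "create_csv_aidax"]
print(missing_names(stale_source, wanted))  # ['create_csv_aidax']
```

In Colab the source could be read from the path shown in the error, `/content/Automated-GuitarAmpModelling/colab_functions.py`, and passed to `missing_names`.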
Ok I’m looking into it
Ok I’ve updated the PyTorch / Cuda deps so that now my Colab instance looks like this
PyTorch: 2.3.1+cu121
Cuda: 12.1
Python 3.11.11
PRETTY_NAME="Ubuntu 22.04.4 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.4 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
In the meantime my Colab instance is lagging heavily and I cannot execute the cells. I will let you know when I am able to continue the tests.
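For anyone who wants to post a comparable environment summary, here is a minimal sketch that collects the Python version, the platform string and the installed versions of a few packages without importing them. The helper names are hypothetical, and `numpy_is_2x` only reflects the advice in the NumPy warning quoted earlier in this thread (modules built against NumPy 1.x may fail under 2.x):

```python
import platform
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def numpy_is_2x(version) -> bool:
    """True if the given version string has a major version of 2 or higher."""
    return version is not None and int(version.split(".")[0]) >= 2


def environment_report() -> dict:
    """Collect interpreter, platform and package versions into one dict."""
    report = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    for package in ("torch", "torchvision", "numpy"):
        report[package] = installed_version(package)
    return report


for key, value in environment_report().items():
    print(f"{key}: {value}")
```

A package reported as `None` is simply not installed in the current environment.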
I just trained a model and everything went fine.
The workbook CAN make models
First try
- I was having issues in step one; I tried refreshing my browser etc. but they persisted.
- I tried a whole new window with a different Google user and Google Drive and successfully trained a model.
- Landed on an ESR of 0.33 though.
(High gain model of a Trivium Amp Knob Rhythm)
The character is pretty similar though… but it misses oomph in low end.
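For context on the 0.33 figure: ESR (error-to-signal ratio) is commonly defined as the energy of the difference between target and prediction divided by the energy of the target, so 0 is a perfect match and values near 1 mean the error is as large as the signal itself. A minimal pure-Python sketch of that definition follows. The training code may apply pre-emphasis filtering or other weighting before computing its loss, so this will not necessarily reproduce the notebook's numbers:

```python
def esr(target, prediction):
    """Error-to-signal ratio: sum((t - p)^2) / sum(t^2)."""
    if len(target) != len(prediction):
        raise ValueError("target and prediction must have the same length")
    error_energy = sum((t - p) ** 2 for t, p in zip(target, prediction))
    signal_energy = sum(t ** 2 for t in target)
    if signal_energy == 0:
        raise ValueError("target signal has zero energy")
    return error_energy / signal_energy


# Toy example: error energy is 0 + 1 = 1, signal energy is 1 + 4 = 5.
target = [1.0, 2.0]
prediction = [1.0, 1.0]
print(esr(target, prediction))  # 0.2
```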
Second try
- With the built-in IR this time instead of amp only.
- However, also on this second account, I can’t get the workbook to find folders and files, since the workbook won’t refresh my Google Drive folders.
Is caching causing issues? Is there another way besides the “refresh” option next to a folder?
The folders in the workbook won’t refresh and it doesn’t accept new folders on my drive; it gives an error saying the folder/files do not exist.
- Trying to use an existing folder that I used for a previous training doesn’t work, because then it says the files already exist and it won’t upload the files again.
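On the refresh question, one thing that might be worth trying is forcing a fresh Drive mount from a cell and then checking from Python whether the expected paths exist. This is only a sketch: `force_remount=True` is an option of `google.colab.drive.mount`, but whether it resolves the stale folder listing described here is an assumption. `remount_drive` and `missing_paths` are hypothetical helpers:

```python
import os


def remount_drive(mountpoint: str = "/content/drive") -> bool:
    """Force a fresh Drive mount inside Colab; return False outside Colab."""
    try:
        from google.colab import drive  # only importable in a Colab runtime
    except ImportError:
        return False
    drive.mount(mountpoint, force_remount=True)
    return True


def missing_paths(paths):
    """Return the subset of paths that do not exist on the filesystem."""
    return [path for path in paths if not os.path.exists(path)]
```

After calling `remount_drive()`, passing the dataset folder and the input/target file paths to `missing_paths` shows whether the runtime can actually see them, independently of the file browser panel.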
Third try
- Trying again an hour later with fresh browser sessions, I could continue.
The “cab included” version produced a model with ESR 0.47, so that was even worse :p.
But it succeeded in making “a model”.
Fourth try
I thought “I’m in a working session now, let’s try another”… but I couldn’t; I got the old errors in step 1 again.
Okay, first of all we need to separate training issues specific to a particular “Audio Circuit” from issues with the current next branch.
I.e. did this model, which now fails to train on next, train successfully with the old codebase? If possible, use the exact input.wav and target.wav from a previous experiment. The key is to reduce the moving parts to a minimum so that we can isolate the issue.
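To confirm that two experiments really used the same input.wav and target.wav, comparing file hashes is a simple option. A minimal sketch using SHA-256; the helper names and the demo file names are illustrative only:

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True if both files have byte-identical content."""
    return sha256_of(path_a) == sha256_of(path_b)


# Demo with throwaway files standing in for real recordings.
with tempfile.TemporaryDirectory() as tmp:
    paths = {}
    for name, payload in (("a.wav", b"same"), ("b.wav", b"same"), ("c.wav", b"other")):
        paths[name] = os.path.join(tmp, name)
        with open(paths[name], "wb") as handle:
            handle.write(payload)
    identical = same_content(paths["a.wav"], paths["b.wav"])
    different = same_content(paths["a.wav"], paths["c.wav"])
print(identical, different)  # True False
```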
The folders in the workbook won’t refresh and it doesn’t accept new folders on my drive; it gives an error saying the folder/files do not exist.
Let me check again how the files are handled in the AIDA_X_Model_Trainer.ipynb script
it says the files already exist
This should now be fixed by the latest commit below
commit b0995ecfd37f1a100fc9b79edbdc2e323485afda (HEAD -> next, origin/next, github/next)
Date: Thu Apr 10 09:25:48 2025 +0200
Avoid using shutil.copy which is unable to handle simple file overwrite scenarios
Alright:
File-overwrite fix: I’ll test it as soon as I get on a different machine.
--- Up till now, before that test (I was typing this up during your reply) ---
- Apart from issues caused elsewhere in previous tries, step 0 seems to work now. That means the initial issue is solved.
- The training part seems to work as well. The quality of the new model was off, but it works.
- Trying with previous files didn’t work, but that’s mostly because the flow only works well under certain circumstances. I can’t do two trainings one after the other.
- I need a completely new session.
- Some time needs to pass between sessions.
- Files need to be uploaded to Google Drive before trying.
- I tried with an already existing folder and files of a model that I know worked a year ago. I got an error that it already existed, so I will try again with that later today.
- I got a message that my credits were depleted; this could be due to the security policy on this work laptop.
Alright, with the commits
6017205 (HEAD -> next, origin/next, github/next) Get rid of shutil.copyfile also in Export
66a2373 Revert "Revert "Avoid loading existing model if found""
b0995ec Avoid using shutil.copy which is unable to handle simple file overwrite scenarios
15041d1 Just use DATA_DIR to copy the final exported model
the script should handle multiple experiments executed in a row. I have changed the default behaviour so that if an existing model is found, training no longer continues from it but starts from scratch every time. I’ve also fixed the annoying file copy handling, which was failing if the files already existed. Can you retry? Thanks for your patience!
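The commit messages say the copy step failed when the destination already existed. As an illustration of one way to copy a file so that an existing destination is replaced, here is a minimal sketch that writes to a temporary file and then swaps it in with `os.replace`. This is not necessarily what the commits above do, and `copy_overwrite` is a hypothetical helper:

```python
import os
import tempfile


def copy_overwrite(src, dst, chunk_size=1 << 20):
    """Copy src to dst, replacing dst if it already exists."""
    dst_dir = os.path.dirname(os.path.abspath(dst))
    os.makedirs(dst_dir, exist_ok=True)
    # Write to a temp file in the destination folder, then swap it in.
    fd, tmp_path = tempfile.mkstemp(dir=dst_dir)
    try:
        with os.fdopen(fd, "wb") as out, open(src, "rb") as inp:
            for chunk in iter(lambda: inp.read(chunk_size), b""):
                out.write(chunk)
        os.replace(tmp_path, dst)
    except BaseException:
        # Clean up the partial temp file before re-raising.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


# Demo: the destination already exists and holds old content.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "model_new.aidax")
    dst = os.path.join(tmp, "export", "model.aidax")
    os.makedirs(os.path.dirname(dst))
    with open(src, "wb") as handle:
        handle.write(b"new")
    with open(dst, "wb") as handle:
        handle.write(b"old")
    copy_overwrite(src, dst)
    with open(dst, "rb") as handle:
        result = handle.read()
print(result)  # b'new'
```

How `os.replace` behaves on a mounted Google Drive folder has not been verified here, so treat this as a local-filesystem sketch.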
Okay, with the latest commits
4. Model Evaluation
is now working again too. I also cleaned up some additional bits. Let’s also ping @spunktsch on this for additional feedback.
We can discuss the changes in a new thread and possibly troubleshoot training issues there. I just need the “ok” from this thread so that I can merge next into aidadsp_devel and move on to new stuff.
According to tests made by several people the training is now working and this issue is closed. I will merge next into aidadsp_devel branch.
A new feature that I can’t wait to share is the possibility to now run the whole thing (the Colab, aka the Jupyter script) locally:
git clone https://github.com/aidadsp/Automated-GuitarAmpModelling.git
cd Automated-GuitarAmpModelling && git checkout next && git submodule update --init --recursive
docker compose up -d
then just open this address in your browser
http://localhost:8080
Note that this requires Docker to be installed and configured on your workstation, and a GPU. There are TONS of tutorials online on how to do so, since it’s the backbone of AI training environments across several scenarios: local, cloud and CI. So it’s worth investing your precious time in this skill!!!
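Before running the commands above, a quick check that the required tools are on the PATH can save some head-scratching. A minimal sketch; checking for `nvidia-smi` assumes an NVIDIA GPU, and GPU access from inside containers typically needs additional setup (for example the NVIDIA Container Toolkit) that this check does not cover:

```python
import shutil


def check_prerequisites(tools=("git", "docker", "nvidia-smi")):
    """Map each command name to True if it is found on PATH."""
    return {tool: shutil.which(tool) is not None for tool in tools}


for tool, found in check_prerequisites().items():
    print(f"{tool}: {'found' if found else 'MISSING'}")
```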
I had never done this before and managed to do it in a short time on a Windows 10 machine
(with a bit of help from ChatGPT, which guided me through some terminology and BIOS settings to enable virtualisation)
The truth is that I have not been able to get the workbook to work. If you could give me the specific steps to at least get it working, that would help me a lot.
Hello, you need to open this link and follow it step by step.
https://colab.research.google.com/github/AidaDSP/Automated-GuitarAmpModelling/blob/aidadsp_devel/AIDA_X_Model_Trainer.ipynb
In the meantime I have merged next into aidadsp_devel. People are now confirming that the script works again, so I’m sure you will be able to make it work! If not, please provide some additional details: "it doesn’t work" is not enough for us to understand.
Well, I tried to run the workbook and I’m still getting the same error as the last few days. When I try to run the setup or step 1, I get this error:
Checking GPU availability... GPU available!
Getting the code...
Checking for code updates...
Installing dependencies...
Mounting google drive...
Mounted at /content/drive
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-2-727379d7cf67> in <cell line: 0>()
50 os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:2"
51
---> 52 from colab_functions import wav2tensor, extract_best_esr_model, create_csv_aidax
53 from prep_wav import WavParse
54 import plotly.graph_objects as go
5 frames
/usr/local/lib/python3.11/dist-packages/torch/_library/fake_impl.py in register(self, func, source)
RuntimeError: operator torchvision::nms does not exist
I have no idea what it could be and I haven’t seen anyone else here experiencing this.
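`operator torchvision::nms does not exist` is commonly reported when the installed torchvision build does not match the installed torch version. Whether that is the cause here is an assumption, not something confirmed in this thread. A minimal sketch of a pairing check on version strings; the table lists release pairs for a few torch versions and should be verified against the official compatibility table before relying on it. The helper names are hypothetical:

```python
# torch major.minor -> matching torchvision major.minor (illustrative subset).
EXPECTED_TORCHVISION = {
    "2.3": "0.18",
    "2.4": "0.19",
    "2.5": "0.20",
    "2.6": "0.21",
}


def major_minor(version: str) -> str:
    """Reduce a version such as '2.3.1+cu121' to '2.3'."""
    public = version.split("+")[0]  # drop the local build tag, e.g. '+cu121'
    return ".".join(public.split(".")[:2])


def versions_match(torch_version: str, torchvision_version: str):
    """True/False if the pair is known to match or not, None if torch is unlisted."""
    expected = EXPECTED_TORCHVISION.get(major_minor(torch_version))
    if expected is None:
        return None
    return major_minor(torchvision_version) == expected


print(versions_match("2.3.1+cu121", "0.18.1+cu121"))  # True
print(versions_match("2.3.1+cu121", "0.21.0+cu124"))  # False
```

The installed versions can be read with `pip list` in a notebook cell and passed to `versions_match`.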
I have this too…
I’ve been running training locally in a Docker image without this issue.
When I tested it online the classic way after a couple of days, I got this as well.
Deps step:
WARNING: Your environment has PyTorch 2.6.0+cu124 and CUDA 12.4. This environment is not supported.
Proceeding to install required dependencies...
Setup step:
Checking GPU availability... GPU available!
Getting the code...
Checking for code updates...
Installing dependencies...
Mounting google drive...
Mounted at /content/drive
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-2-727379d7cf67> in <cell line: 0>()
50 os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:2"
51
---> 52 from colab_functions import wav2tensor, extract_best_esr_model, create_csv_aidax
53 from prep_wav import WavParse
54 import plotly.graph_objects as go
5 frames
/usr/local/lib/python3.11/dist-packages/torch/_library/fake_impl.py in register(self, func, source)
RuntimeError: operator torchvision::nms does not exist