I give up! … validation patience limit reached!

I’m doing something stupid, I just know it.
I get the same failure every time; I haven’t gotten a “good” model through yet.
What triggers “validation patience limit reached at epoch XX” ?

<- RUN CELL (►)

This will check for GPU availability, prepare the code for you, and mount your drive.


— Checking GPU availability…
GPU unavailable, using CPU instead.
RECOMMENDED: You can enable GPU through “Runtime” → “Change runtime type” → “Hardware accelerator:” GPU → Save
Getting the code…
Checking for code updates…
Mounting google drive…
Mounted at /content/drive
Ready! You can now move to step 1: DATA

1. The Data (upload + preprocessing) :bookmark_tabs:

Step 1.1: Download the capture signal

Download the pre-crafted “capture signal” called input.wav from the provided link.

Step 1.2: Reamp your gear

Use the downloaded capture signal to reamp the gear that you want to model. Record the output and save it as “target.wav”. For a detailed demonstration of how to reamp your gear using the capture signal, refer to this video tutorial starting at 1:10 and ending at 3:44.



<- RUN CELL (►)

Step 1.3: Upload

  • In Drive, put the two audio files you would like to train with in a single folder.
    • input.wav : contains the reference (dry/DI) sound.
    • target.wav : contains the target (amped/with effects) sound.
  • Use the file browser in the left panel to find a folder with your audio, right-click “Copy Path”, paste below, and run the cell.
    • ex. /content/drive/My Drive/training-data-folder





— Input file name: /content/drive/MyDrive/input.wav
Target file name: /content/drive/MyDrive/target.wav
Input rate: 48000, length: 14523000 [samples]
Target rate: 48000, length: 14523000 [samples]
Preprocessing the training data…
Data prepared! You can now move to step 2: TRAINING

2. Model Training :man_lifting_weights:



<- RUN CELL (►)

Training usually takes around 10 minutes, but this can change depending on the duration of the training data that you provided and the model_type you choose.
Note that training doesn’t always lead to the same results. You may want to run it a couple of times and compare the results.

Choose the Model type you want to train:
Generally, the heavier the model the more accurate it is, but also the more CPU it consumes. Here’s a list of approximate CPU consumption of each model type on a MOD Dwarf:

  • Lightest: 25% CPU
  • Light: 30% CPU
  • Standard: 37% CPU
  • Heavy: 46% CPU


Some training hyperparameters (recommended: ignore these and continue with the default values):




— device = MOD-DWARF
file_name = MyDrive
unit_type = LSTM
size = 16
skip_connection = 0
36% 71/200 [21:31<37:59, 17.67s/it]
validation patience limit reached at epoch 72
36% 71/200 [21:52<39:45, 18.49s/it]
done training
testing the final model
testing the best model
finished training: MyDrive_LSTM-16
Training done! ESR after training: 0.801609218120575
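For reference, the ESR printed at the end is (as far as I know) the error-to-signal ratio: the energy of the error between target and model output, divided by the energy of the target, so 0 is a perfect match and values near 1 mean the error is about as loud as the signal itself. A minimal sketch (mine, not the notebook’s code) that also shows how a pure time shift between the files inflates it:

```python
import numpy as np

def esr(target: np.ndarray, prediction: np.ndarray) -> float:
    """Error-to-signal ratio: energy of the error divided by the
    energy of the target. 0 = perfect match."""
    error = target - prediction
    return float(np.sum(error ** 2) / np.sum(target ** 2))

# Even a "perfect" model scores badly if the signals are shifted in time.
t = np.linspace(0.0, 1.0, 48000, endpoint=False)
signal = np.sin(2 * np.pi * 220.0 * t)   # 220 Hz test tone at 48 kHz
shifted = np.roll(signal, 50)            # ~1 ms of latency

print(esr(signal, signal))   # 0.0
print(esr(signal, shifted))  # well above 1.0, despite identical content
```

So an ESR of 0.8 doesn’t necessarily mean the model learned nothing; it can simply mean the two files never lined up.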

Correct me if I’m wrong (please):

It can happen that it quits before the maximum number of iterations, but yours seems to stop quite quickly.

That seems like a high value; it would mean the model’s accuracy is off.
My guess is that the high ESR and the short run are due to too much difference between the input and the target (e.g. the algorithm attempting to match signals that are misaligned in time)?

I’m no specialist; this is speculation.
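For what it’s worth, “validation patience” usually refers to early stopping: after each epoch the trainer checks the validation loss, and once it has gone a set number of epochs in a row without improving, training aborts. A generic sketch of the mechanism (illustrative names and defaults, not the notebook’s actual code):

```python
def train_with_patience(train_step, validate, max_epochs=200, patience=20):
    """Generic early-stopping loop: stop once the validation loss has
    gone `patience` consecutive epochs without improving."""
    best_loss = float("inf")
    stale_epochs = 0
    for epoch in range(1, max_epochs + 1):
        train_step()
        val_loss = validate()
        if val_loss < best_loss:
            best_loss = val_loss   # new best: reset the counter
            stale_epochs = 0
        else:
            stale_epochs += 1
            if stale_epochs >= patience:
                print(f"validation patience limit reached at epoch {epoch}")
                break
    return best_loss

# Toy demo: the loss improves once, then plateaus, so training stops early.
losses = iter([1.0, 0.8] + [0.8] * 198)
train_with_patience(lambda: None, lambda: next(losses),
                    max_epochs=200, patience=3)
```

So hitting the limit early isn’t itself the failure; it just means the validation loss plateaued. The high ESR is the actual problem.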

I’ve been trying to model a BM Deacy - total ego trip, great sound but only one person on the planet can play that tone all night, and it is not me!
So yes there is distortion and compression but it is no more than the Rockman JSON that some kind person posted.

I had a similar problem once; I (somewhat randomly) solved it by creating a longer input file: I took the original one and appended some of my own chugging DI signal behind it. Accuracy wasn’t super great, but it was an improvement, and the process didn’t stop so early.

Perhaps you can try experimenting with the input file?

After you create target.wav, try zooming in at maximum resolution on “target” and “input” and align them by watching the two ticks at the start; after that, cut them at the same place at the start and end and export them.
Maybe your “target” has a little delay due to latency.
My first captures had the same issue.

I was aligning both “input” and “target” on the positive leading edge of the first alignment click to compensate for any latency. I even got down to choosing where on the square-wave click to align. Still, all failed. Do we know what the ideal alignment point is?
I tried the rising slopes, the plateaus, and right in between; all failed.

@mark_couling I had the same issue. I could fix it by aligning a zero crossing on both signals. Meaning: I looked for a spot where the input signal switched from negative to positive values. Then I took the target signal and shifted it forward (I made it come earlier) until its zero crossing was at the same sample as the one in the input signal.

Just for context:
My input signal was a clean bass signal and I captured a bass overdrive pedal. It’s not a 100% match, but with a bit of post-EQ it works. The ESR was 0.008 or so. I’m retraining it right now and will probably share it then.
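That zero-crossing trick can also be done in code rather than by eye. A rough NumPy sketch (my own helper names; it assumes both files are already loaded as float arrays, e.g. with soundfile):

```python
import numpy as np

def first_positive_zero_crossing(x: np.ndarray) -> int:
    """Index of the first sample where the signal goes from negative
    to non-negative (the '+' leading edge mentioned above)."""
    crossings = np.where((x[:-1] < 0) & (x[1:] >= 0))[0]
    return int(crossings[0]) + 1

def align_target_to_input(input_sig: np.ndarray, target_sig: np.ndarray):
    """Shift the target earlier (or pad it) so its first positive-going
    zero crossing lands on the same sample as the input's, then trim
    both to equal length."""
    delay = (first_positive_zero_crossing(target_sig)
             - first_positive_zero_crossing(input_sig))
    if delay > 0:       # target is late: drop its first `delay` samples
        target_sig = target_sig[delay:]
    elif delay < 0:     # target is early: pad silence at the start
        target_sig = np.concatenate([np.zeros(-delay), target_sig])
    n = min(len(input_sig), len(target_sig))
    return input_sig[:n], target_sig[:n]

# Toy demo: the "recorded" target is the dry signal 100 samples late.
t = np.linspace(0.0, 1.0, 48000, endpoint=False)
dry = np.sin(2 * np.pi * 110.0 * t)
wet = np.concatenate([np.zeros(100), dry])[:48000]
a, b = align_target_to_input(dry, wet)   # now sample-aligned
```

Afterwards write both arrays back out (e.g. with soundfile.write) and train on the aligned pair.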

Good tip! I’ll include that in the “best practices” article I’m writing.