-
-[Latest release](https://github.com/anjok07/ultimatevocalremovergui/releases/latest)
-[All releases](https://github.com/anjok07/ultimatevocalremovergui/releases)
-
-## About
-
-This application is a GUI version of the vocal remover AI created and posted by GitHub user [tsurumeso](https://github.com/tsurumeso). This version also comes with eight high-performance models trained by us. You can find tsurumeso's original command-line version [here](https://github.com/tsurumeso/vocal-remover).
-
-- **The Developers**
- - [Anjok07](https://github.com/anjok07) - Model collaborator & UVR developer.
- - [aufr33](https://github.com/aufr33) - Model collaborator & fellow UVR developer. This project wouldn't be what it is without your help. Thank you for your continued support!
- - [DilanBoskan](https://github.com/DilanBoskan) - Thank you for helping bring the GUI to life! Your contributions to this project are greatly appreciated.
- - [tsurumeso](https://github.com/tsurumeso) - The engineer who authored the original AI code. Thank you for the hard work and dedication you put into the AI code UVR is built on!
-
-## Change Log
-
-- **v4 vs. v5**
- - The v5 models significantly outperform the v4 models.
- - The extraction's aggressiveness can be adjusted using the "Aggression Setting". The default value of 10 is optimal for most tracks.
- - All v2 and v4 models have been removed.
- - Ensemble Mode added - This allows the user to get the strongest result from each model.
- - Stacked models have been entirely removed.
- - The stacked model feature has been replaced by the new aggression setting and model ensembling.
- - The NFFT, HOP_SIZE, and SR values are now set internally.
-
-- **Upcoming v5.2.0 Update**
- - MDX-NET AI engine and model support
-
-## Installation
-
-The application was made with Tkinter for cross-platform compatibility, so it should work with Windows, Mac, and Linux systems. However, this application has only been tested on Windows 10 & Linux Ubuntu.
-
-### Install Required Applications & Packages
-
-1. Download & install Python 3.9.8 [here](https://www.python.org/ftp/python/3.9.8/python-3.9.8-amd64.exe) (Windows link)
- - **Note:** Ensure the *"Add Python 3.9 to PATH"* box is checked
-2. Download the Source code zip here - https://github.com/Anjok07/ultimatevocalremovergui/archive/refs/heads/master.zip
-3. Download the models.zip here - https://github.com/Anjok07/ultimatevocalremovergui/releases/download/v5.1.0/models.zip
-4. Extract the *ultimatevocalremovergui-master* folder within ultimatevocalremovergui-master.zip wherever you wish.
-5. Extract the *models* folder within models.zip to the *ultimatevocalremovergui-master* directory.
- - **Note:** At this time, the GUI is hardcoded to run the models included in this package only.
-6. Open the command prompt from the ultimatevocalremovergui-master directory and run each of the following commands separately -
-
-```
-pip install --no-cache-dir -r requirements.txt
-```
-```
-pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html
-```
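-
-To quickly confirm that the CUDA-enabled PyTorch build installed correctly (an optional sanity check, not an additional install step), you can run the following from a `python` prompt -
-
-```
-import torch
-
-# Should print True on a working Nvidia/CUDA setup.
-# False means the CPU-only build was installed or the GPU/driver is not visible.
-print(torch.cuda.is_available())
-```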
-
-### FFmpeg
-
-FFmpeg must be installed and configured for the application to process any track that isn't a *.wav* file. Instructions for installing FFmpeg can be found on YouTube, WikiHow, Reddit, GitHub, and many other sources around the web.
-
-- **Note:** If you experience errors when attempting to process media files that are not in the *.wav* format, please ensure FFmpeg is installed & configured correctly.
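-
-If you are unsure whether FFmpeg is set up correctly, a quick way to check is to confirm the `ffmpeg` executable is visible on your PATH, for example from Python -
-
-```
-import shutil
-
-# Prints the full path to ffmpeg if it is on PATH, or None if it is not.
-# A None result means non-WAV inputs will fail to load in the GUI.
-print(shutil.which('ffmpeg'))
-```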
-
-### Running the GUI & Models
-
-- Open the file labeled *'VocalRemover.py'*.
- - It's recommended that you create a desktop shortcut to the *'VocalRemover.py'* file for easy access.
- - **Note:** If you are unable to open the *'VocalRemover.py'* file, please go to the [**troubleshooting**](https://github.com/Anjok07/ultimatevocalremovergui/tree/beta#troubleshooting) section below.
-- **Note:** All output audio files will be in the *'.wav'* format.
-
-## Option Guide
-
-### Main Checkboxes
-- **GPU Conversion** - Selecting this option ensures the GPU is used to process conversions.
- - **Note:** This option will not work if you don't have a CUDA-compatible GPU.
- - CUDA is supported on Nvidia GPUs only.
- - **Note:** CPU conversions are much slower than those processed through the GPU.
-- **Post-process** - This option can potentially identify leftover instrumental artifacts within the vocal outputs. This option may improve the separation of *some* songs.
- - **Note:** Having this option selected can adversely affect the conversion process, depending on the track. Because of this, it's only recommended as a last resort.
-- **TTA** - This option performs Test-Time-Augmentation to improve the separation quality.
- - **Note:** Having this selected will increase the time it takes to complete a conversion.
-- **Output Image** - Selecting this option will include the spectrograms in *.jpg* format for the instrumental & vocal audio outputs.
-
-### Special Checkboxes
-- **Model Test Mode** - Only selectable when using the "*Single Model*" conversion method. This option makes it easier to test the results of different models and model combinations by eliminating the hassle of manually renaming files and creating new folders when processing the same track through multiple models (see the naming sketch after this list).
- - When *'Model Test Mode'* is selected, the application will auto-generate a new folder in the *'Save to'* path you have chosen.
- - The new auto-generated folder will be named after the model(s) selected.
- - The output audio files will be saved to the auto-generated directory.
- - The filenames for the instrumental & vocal outputs will have the selected model(s) name(s) appended.
-- **Save All Outputs** - Only selectable when using the "*Ensemble Mode*" conversion method. This option will save all of the individual conversion outputs from each model within the ensemble.
- - When *'Save All Outputs'* is un-selected, the application will auto-delete all of the individual conversions generated by each model in the ensemble.
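-
-Below is a minimal sketch of how the output location and filenames are composed when *'Model Test Mode'* is on, based on the folder and file naming logic in the bundled inference code (the paths shown are illustrative) -
-
-```
-import os
-
-export_path = 'C:/Users/you/Music'                      # the chosen 'Save to' path (illustrative)
-model_path = 'models/Main Models/HP_4BAND_44100_A.pth'  # the selected model
-
-# Model Test Mode creates a sub-folder named after the selected model...
-model_folder = '/' + os.path.splitext(os.path.basename(model_path))[0]
-
-# ...and appends the same name to each output filename.
-base_name = f'{export_path}{model_folder}/1_MySong'
-instrumental = f'{base_name}_(Instrumental){model_folder.replace("/", "_")}.wav'
-print(instrumental)
-# C:/Users/you/Music/HP_4BAND_44100_A/1_MySong_(Instrumental)_HP_4BAND_44100_A.wav
-```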
-
-### Additional Options
-
-- **Window Size** - The smaller your window size, the better your conversions will be. However, a smaller window means longer conversion times and heavier resource usage.
- - Here are the selectable window size values -
- - **1024** - Low conversion quality, shortest conversion time, low resource usage
- - **512** - Average conversion quality, average conversion time, normal resource usage
- - **320** - Better conversion quality, long conversion time, high resource usage
-- **Aggression Setting** - This option sets how strong the vocal removal will be (see the sketch after this list).
- - The range is 0-100.
- - Higher values perform deeper extractions.
- - The default is 10 for instrumental & vocal models.
- - Values over 10 can result in muddy-sounding instrumentals for the non-vocal models.
-- **Default Values:**
- - **Window Size** - 512
- - **Aggression Setting** - 10 (optimal setting for all conversions)
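-
-Internally (per the bundled inference code), the Aggression Setting is scaled down to a 0.0-1.0 value before it reaches the model, while the Window Size sets how many spectrogram frames each processing slice covers. A rough sketch with the default values -
-
-```
-# GUI values (defaults shown).
-window_size = 512   # spectrogram frames per processing slice; smaller = more slices = slower
-agg = 10            # Aggression Setting, 0-100
-
-# The slider value is scaled to 0.0-1.0 before inference (the real code also attaches
-# a 'split_bin' frequency taken from the selected model's parameter file).
-aggressiveness = {'value': agg / 100}
-print(aggressiveness)  # {'value': 0.1}
-```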
-
-### Other Buttons
-
-- **Open Export Directory** - This button will open your 'save to' directory. You will find it to the right of the *'Start Conversion'* button.
-
-## Models Included
-
-All of the models included in the release were trained on large datasets containing a diverse range of music genres and different training parameters.
-
-**Please Note:** Do not change the name of the models provided! The required parameters are specified and appended to the end of the filenames.
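-
-For reference, the bundled single-model inference code derives both the parameter file and the network architecture from the model file itself, which is why renaming the models breaks them. A condensed (logically equivalent, not verbatim) sketch of that lookup -
-
-```
-import os
-
-model_path = 'models/Main Models/HP2_3BAND_44100_MSB2.pth'  # illustrative selection
-
-# Substrings in the filename select the matching parameter JSON.
-param_map = [
-    ('4BAND_44100_SN', 'lib_v5/modelparams/4band_v2_sn.json'),
-    ('4BAND_44100_B',  'lib_v5/modelparams/4band_v2.json'),
-    ('MSB2',           'lib_v5/modelparams/3band_44100_msb2.json'),
-    ('4BAND_44100',    'lib_v5/modelparams/4band_44100.json'),
-]
-model_params = next(path for key, path in param_map if key in os.path.basename(model_path))
-
-# The network architecture is picked by rounding the checkpoint size (in KB)
-# to the nearest known architecture size.
-nn_arch_sizes = [31191, 33966, 123821, 123812, 537238]
-model_size_kb = 537238  # illustrative; the real code uses math.ceil(os.stat(path).st_size / 1024)
-nn_architecture = '{}KB'.format(min(nn_arch_sizes, key=lambda x: abs(x - model_size_kb)))
-
-print(model_params, nn_architecture)  # lib_v5/modelparams/3band_44100_msb2.json 537238KB
-```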
-
-- **Model Network Types**
- - **HP2** - The model layers are much larger. However, this makes them resource-heavy.
- - **HP** - The model layers are the standard size for UVR v5.
-
-### Main Models
-
-- **HP2_3BAND_44100_MSB2.pth** - This is a strong instrumental model trained using more data and new parameters.
-- **HP2_4BAND_44100_1.pth** - This is a strong instrumental model.
-- **HP2_4BAND_44100_2.pth** - This is a fine-tuned version of the HP2_4BAND_44100_1.pth model.
-- **HP_4BAND_44100_A.pth** - This is a strong instrumental model.
-- **HP_4BAND_44100_B.pth** - This is a fine-tuned version of the HP_4BAND_44100_A.pth model.
-- **HP_KAROKEE_4BAND_44100_SN.pth** - This is a model that removes main vocals while leaving background vocals intact.
-- **HP_Vocal_4BAND_44100.pth** - This model emphasizes vocal extraction. The vocal stem will be clean, but the instrumental might sound muddy.
-- **HP_Vocal_AGG_4BAND_44100.pth** - This model also emphasizes vocal extraction and is a bit more aggressive than the previous model.
-
-## Choose Conversion Method
-
-### Single Model
-
-Run your tracks through a single model only. This is the default conversion method.
-
-- **Choose Main Model** - Here is where you choose the main model to perform a deep vocal removal.
- - The *'Model Test Mode'* option makes it easier for users to test different models on given tracks.
-
-### Ensemble Mode
-
-Ensemble Mode will run your track(s) through multiple models and combine the resulting outputs for a more robust separation. Higher-level ensembles generally produce stronger separations because they use more models (a sketch of the combining rule appears at the end of this section).
-
-- **Choose Ensemble** - Here, choose the ensemble you wish to run your track through.
- - **There are 4 ensembles you can choose from:**
- - **HP1 Models** - Level 1 Ensemble
- - **HP2 Models** - Level 2 Ensemble
- - **All HP Models** - Level 3 Ensemble
- - **Vocal Models** - Level 1 Vocal Ensemble
- - A directory named after the ensemble is auto-generated. This directory contains all of the individual outputs generated by the ensemble and is auto-deleted once the conversions are complete if the *'Save All Outputs'* option is unchecked.
- - When checked, the *'Save All Outputs'* option keeps all of the outputs generated by each model in the ensemble.
-
-- **List of models included in each ensemble:**
- - **HP1 Models**
- - HP_4BAND_44100_A
- - HP_4BAND_44100_B
- - **HP2 Models**
- - HP2_4BAND_44100_1
- - HP2_4BAND_44100_2
- - HP2_3BAND_44100_MSB2
- - **All HP Models**
- - HP_4BAND_44100_A
- - HP_4BAND_44100_B
- - HP2_4BAND_44100_1
- - HP2_4BAND_44100_2
- - HP2_3BAND_44100_MSB2
- - **Vocal Models**
- - HP_Vocal_4BAND_44100
- - HP_Vocal_AGG_4BAND_44100
-
-- **Please Note:** Ensemble Mode is very resource-heavy!
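-
-For the curious, the ensemble step combines the per-model results in the spectrogram domain: instrumental outputs are merged with a minimum-magnitude rule ('min_mag') and vocal outputs with a maximum-magnitude rule ('max_mag'). A toy NumPy illustration of the idea (not the actual implementation) -
-
-```
-import numpy as np
-
-# Toy magnitude spectrograms from two models (frequency bins x time frames).
-model_a = np.abs(np.random.randn(1025, 100))
-model_b = np.abs(np.random.randn(1025, 100))
-
-# 'min_mag': keep the quieter value per bin, which suppresses vocal bleed in instrumentals.
-instrumental_ens = np.minimum(model_a, model_b)
-
-# 'max_mag': keep the louder value per bin, so vocals one model missed are kept from the other.
-vocal_ens = np.maximum(model_a, model_b)
-```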
-
-## Other GUI Notes
-
-- The application will automatically remember your *'save to'* path across sessions until it's changed (see the settings sketch after this list).
- - **Note:** The last directory accessed within the application will also be remembered.
-- Multiple conversions are supported.
-- The ability to drag & drop audio files to convert has also been added.
-- Conversion times will significantly depend on your hardware.
- - **Note:** This application will *not* be friendly to older or budget hardware. Please proceed with caution! Please pay attention to your PC and make sure it doesn't overheat. ***We are not responsible for any hardware damage.***
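-
-Settings persist because the GUI pickles them to a small `data.pkl` file when the window is closed and reloads that file on the next start (see `save_data` / `load_data` in *VocalRemover.py*). A stripped-down sketch of that mechanism -
-
-```
-import pickle
-
-def save_data(data, path='data.pkl'):
-    # Written when the main window is closed.
-    with open(path, 'wb') as data_file:
-        pickle.dump(data, data_file)
-
-def load_data(path='data.pkl'):
-    # Read on startup; a missing or corrupt file falls back to defaults.
-    try:
-        with open(path, 'rb') as data_file:
-            return pickle.load(data_file)
-    except (FileNotFoundError, ValueError):
-        return {'exportPath': '', 'lastDir': None}
-
-save_data({'exportPath': 'C:/Users/you/Music', 'lastDir': 'C:/Users/you/Downloads'})
-print(load_data()['exportPath'])  # C:/Users/you/Music
-```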
-
-## Troubleshooting
-
-### Common Issues
-
-- This application is not compatible with 32-bit versions of Python. Please make sure your version of Python is 64-bit.
-- If FFmpeg is not installed, the application will throw an error if the user attempts to convert a non-WAV file.
-
-### Issue Reporting
-
-Please be as detailed as possible when posting a new issue. Make sure to provide any error outputs and/or screenshots/GIFs to give us a clearer understanding of the issue you are experiencing.
-
-If the *'VocalRemover.py'* file won't open *under any circumstances* and all other resources have been exhausted, please do the following -
-
-1. Open the cmd prompt from the ultimatevocalremovergui-master directory
-2. Run the following command -
-```
-python VocalRemover.py
-```
-3. Copy and paste the error output shown in the cmd prompt to the issues center on the GitHub repository.
-
-## License
-
-The **Ultimate Vocal Remover GUI** code is [MIT-licensed](LICENSE).
-
-- **Please Note:** For all third-party application developers who wish to use our models, please honor the MIT license by providing credit to UVR and its developers Anjok07, aufr33, & tsurumeso.
-
-## Contributing
-
-- For anyone interested in the ongoing development of **Ultimate Vocal Remover GUI**, please send us a pull request, and we will review it. This project is 100% open-source and free for anyone to use and/or modify as they wish.
-- Please note that we do not maintain or directly support any of tsurumeso's AI application code. We only maintain and support the **Ultimate Vocal Remover GUI** and the models provided.
-
-## References
-- [1] Takahashi et al., "Multi-scale Multi-band DenseNets for Audio Source Separation", https://arxiv.org/pdf/1706.09588.pdf
From 07455485f6bb7905b4982f435bf940866585f88f Mon Sep 17 00:00:00 2001
From: Anjok07 <68268275+Anjok07@users.noreply.github.com>
Date: Tue, 10 May 2022 19:02:26 -0500
Subject: [PATCH 03/33] Delete inference_v5_ensemble.py
---
inference_v5_ensemble.py | 605 ---------------------------------------
1 file changed, 605 deletions(-)
delete mode 100644 inference_v5_ensemble.py
diff --git a/inference_v5_ensemble.py b/inference_v5_ensemble.py
deleted file mode 100644
index 43acdd5..0000000
--- a/inference_v5_ensemble.py
+++ /dev/null
@@ -1,605 +0,0 @@
-from functools import total_ordering
-import pprint
-import argparse
-import os
-from statistics import mode
-
-import cv2
-import librosa
-import numpy as np
-import soundfile as sf
-import shutil
-from tqdm import tqdm
-
-from lib_v5 import dataset
-from lib_v5 import spec_utils
-from lib_v5.model_param_init import ModelParameters
-import torch
-
-# Command line text parsing and widget manipulation
-from collections import defaultdict
-import tkinter as tk
-import traceback # Error Message Recent Calls
-import time # Timer
-
-class VocalRemover(object):
-
- def __init__(self, data, text_widget: tk.Text):
- self.data = data
- self.text_widget = text_widget
- # self.offset = model.offset
-
-data = {
- # Paths
- 'input_paths': None,
- 'export_path': None,
- # Processing Options
- 'gpu': -1,
- 'postprocess': True,
- 'tta': True,
- 'save': True,
- 'output_image': True,
- # Models
- 'instrumentalModel': None,
- 'useModel': None,
- # Constants
- 'window_size': 512,
- 'agg': 10,
- 'ensChoose': 'HP1 Models'
-}
-
-default_window_size = data['window_size']
-default_agg = data['agg']
-
-def update_progress(progress_var, total_files, file_num, step: float = 1):
- """Calculate the progress for the progress widget in the GUI"""
- base = (100 / total_files)
- progress = base * (file_num - 1)
- progress += base * step
-
- progress_var.set(progress)
-
-def get_baseText(total_files, file_num):
- """Create the base text for the command widget"""
- text = 'File {file_num}/{total_files} '.format(file_num=file_num,
- total_files=total_files)
- return text
-
-def main(window: tk.Wm, text_widget: tk.Text, button_widget: tk.Button, progress_var: tk.Variable,
- **kwargs: dict):
-
- global args
- global nn_arch_sizes
-
- nn_arch_sizes = [
- 31191, # default
- 33966, 123821, 123812, 537238 # custom
- ]
-
- p = argparse.ArgumentParser()
- p.add_argument('--aggressiveness',type=float, default=data['agg']/100)
- p.add_argument('--high_end_process', type=str, default='mirroring')
- args = p.parse_args()
-
-
- def save_files(wav_instrument, wav_vocals):
- """Save output music files"""
- vocal_name = '(Vocals)'
- instrumental_name = '(Instrumental)'
- save_path = os.path.dirname(base_name)
-
- # Swap names if vocal model
-
- VModel="Vocal"
-
- if VModel in model_name:
- # Reverse names
- vocal_name, instrumental_name = instrumental_name, vocal_name
-
- # Save Temp File
- # For instrumental the instrumental is the temp file
- # and for vocal the instrumental is the temp file due
- # to reversement
- sf.write(f'temp.wav',
- wav_instrument, mp.param['sr'])
-
- # -Save files-
- # Instrumental
- if instrumental_name is not None:
- instrumental_path = '{save_path}/{file_name}.wav'.format(
- save_path=save_path,
- file_name = f'{os.path.basename(base_name)}_{ModelName_1}_{instrumental_name}',
- )
-
- sf.write(instrumental_path,
- wav_instrument, mp.param['sr'])
- # Vocal
- if vocal_name is not None:
- vocal_path = '{save_path}/{file_name}.wav'.format(
- save_path=save_path,
- file_name=f'{os.path.basename(base_name)}_{ModelName_1}_{vocal_name}',
- )
- sf.write(vocal_path,
- wav_vocals, mp.param['sr'])
-
- data.update(kwargs)
-
- # Update default settings
- global default_window_size
- global default_agg
- default_window_size = data['window_size']
- default_agg = data['agg']
-
- stime = time.perf_counter()
- progress_var.set(0)
- text_widget.clear()
- button_widget.configure(state=tk.DISABLED) # Disable Button
-
- # Separation Preperation
- try: #Ensemble Dictionary
- HP1_Models = [
- {
- 'model_name':'HP_4BAND_44100_A',
- 'model_params':'lib_v5/modelparams/4band_44100.json',
- 'model_location':'models/Main Models/HP_4BAND_44100_A.pth',
- 'using_archtecture': '123821KB',
- 'loop_name': 'Ensemble Mode - Model 1/2'
- },
- {
- 'model_name':'HP_4BAND_44100_B',
- 'model_params':'lib_v5/modelparams/4band_v2.json',
- 'model_location':'models/Main Models/HP_4BAND_44100_B.pth',
- 'using_archtecture': '123821KB',
- 'loop_name': 'Ensemble Mode - Model 2/2'
- }
- ]
-
- HP2_Models = [
- {
- 'model_name':'HP2_4BAND_44100_1',
- 'model_params':'lib_v5/modelparams/4band_44100.json',
- 'model_location':'models/Main Models/HP2_4BAND_44100_1.pth',
- 'using_archtecture': '537238KB',
- 'loop_name': 'Ensemble Mode - Model 1/3'
- },
- {
- 'model_name':'HP2_4BAND_44100_2',
- 'model_params':'lib_v5/modelparams/4band_44100.json',
- 'model_location':'models/Main Models/HP2_4BAND_44100_2.pth',
- 'using_archtecture': '537238KB',
- 'loop_name': 'Ensemble Mode - Model 2/3'
- },
- {
- 'model_name':'HP2_3BAND_44100_MSB2',
- 'model_params':'lib_v5/modelparams/3band_44100_msb2.json',
- 'model_location':'models/Main Models/HP2_3BAND_44100_MSB2.pth',
- 'using_archtecture': '537238KB',
- 'loop_name': 'Ensemble Mode - Model 3/3'
- }
- ]
-
- All_HP_Models = [
- {
- 'model_name':'HP_4BAND_44100_A',
- 'model_params':'lib_v5/modelparams/4band_44100.json',
- 'model_location':'models/Main Models/HP_4BAND_44100_A.pth',
- 'using_archtecture': '123821KB',
- 'loop_name': 'Ensemble Mode - Model 1/5'
- },
- {
- 'model_name':'HP_4BAND_44100_B',
- 'model_params':'lib_v5/modelparams/4band_v2.json',
- 'model_location':'models/Main Models/HP_4BAND_44100_B.pth',
- 'using_archtecture': '123821KB',
- 'loop_name': 'Ensemble Mode - Model 2/5'
- },
- {
- 'model_name':'HP2_4BAND_44100_1',
- 'model_params':'lib_v5/modelparams/4band_44100.json',
- 'model_location':'models/Main Models/HP2_4BAND_44100_1.pth',
- 'using_archtecture': '537238KB',
- 'loop_name': 'Ensemble Mode - Model 3/5'
-
- },
- {
- 'model_name':'HP2_4BAND_44100_2',
- 'model_params':'lib_v5/modelparams/4band_44100.json',
- 'model_location':'models/Main Models/HP2_4BAND_44100_2.pth',
- 'using_archtecture': '537238KB',
- 'loop_name': 'Ensemble Mode - Model 4/5'
-
- },
- {
- 'model_name':'HP2_3BAND_44100_MSB2',
- 'model_params':'lib_v5/modelparams/3band_44100_msb2.json',
- 'model_location':'models/Main Models/HP2_3BAND_44100_MSB2.pth',
- 'using_archtecture': '537238KB',
- 'loop_name': 'Ensemble Mode - Model 5/5'
- }
- ]
-
- Vocal_Models = [
- {
- 'model_name':'HP_Vocal_4BAND_44100',
- 'model_params':'lib_v5/modelparams/4band_44100.json',
- 'model_location':'models/Main Models/HP_Vocal_4BAND_44100.pth',
- 'using_archtecture': '123821KB',
- 'loop_name': 'Ensemble Mode - Model 1/2'
- },
- {
- 'model_name':'HP_Vocal_AGG_4BAND_44100',
- 'model_params':'lib_v5/modelparams/4band_44100.json',
- 'model_location':'models/Main Models/HP_Vocal_AGG_4BAND_44100.pth',
- 'using_archtecture': '123821KB',
- 'loop_name': 'Ensemble Mode - Model 2/2'
- }
- ]
-
- if data['ensChoose'] == 'HP1 Models':
- loops = HP1_Models
- ensefolder = 'HP_Models_Saved_Outputs'
- ensemode = 'HP_Models'
- if data['ensChoose'] == 'HP2 Models':
- loops = HP2_Models
- ensefolder = 'HP2_Models_Saved_Outputs'
- ensemode = 'HP2_Models'
- if data['ensChoose'] == 'All HP Models':
- loops = All_HP_Models
- ensefolder = 'All_HP_Models_Saved_Outputs'
- ensemode = 'All_HP_Models'
- if data['ensChoose'] == 'Vocal Models':
- loops = Vocal_Models
- ensefolder = 'Vocal_Models_Saved_Outputs'
- ensemode = 'Vocal_Models'
-
- #Prepare Audiofile(s)
- for file_num, music_file in enumerate(data['input_paths'], start=1):
- # -Get text and update progress-
- base_text = get_baseText(total_files=len(data['input_paths']),
- file_num=file_num)
- progress_kwargs = {'progress_var': progress_var,
- 'total_files': len(data['input_paths']),
- 'file_num': file_num}
- update_progress(**progress_kwargs,
- step=0)
-
- #Prepare to loop models
- for i, c in tqdm(enumerate(loops), disable=True, desc='Iterations..'):
-
- text_widget.write(c['loop_name'] + '\n\n')
-
- text_widget.write(base_text + 'Loading ' + c['model_name'] + '... ')
-
- arch_now = c['using_archtecture']
-
- if arch_now == '123821KB':
- from lib_v5 import nets_123821KB as nets
- elif arch_now == '537238KB':
- from lib_v5 import nets_537238KB as nets
- elif arch_now == '537227KB':
- from lib_v5 import nets_537227KB as nets
-
- def determineenseFolderName():
- """
- Determine the name that is used for the folder and appended
- to the back of the music files
- """
- enseFolderName = ''
-
- if str(ensefolder):
- enseFolderName += os.path.splitext(os.path.basename(ensefolder))[0]
-
- if enseFolderName:
- enseFolderName = '/' + enseFolderName
-
- return enseFolderName
-
- enseFolderName = determineenseFolderName()
- if enseFolderName:
- folder_path = f'{data["export_path"]}{enseFolderName}'
- if not os.path.isdir(folder_path):
- os.mkdir(folder_path)
-
- # Determine File Name
- base_name = f'{data["export_path"]}{enseFolderName}/{file_num}_{os.path.splitext(os.path.basename(music_file))[0]}'
- enseExport = f'{data["export_path"]}{enseFolderName}/'
- trackname = f'{file_num}_{os.path.splitext(os.path.basename(music_file))[0]}'
-
- ModelName_1=(c['model_name'])
-
- print('Model Parameters:', c['model_params'])
-
- mp = ModelParameters(c['model_params'])
-
- #Load model
- if os.path.isfile(c['model_location']):
- device = torch.device('cpu')
- model = nets.CascadedASPPNet(mp.param['bins'] * 2)
- model.load_state_dict(torch.load(c['model_location'],
- map_location=device))
- if torch.cuda.is_available() and data['gpu'] >= 0:
- device = torch.device('cuda:{}'.format(data['gpu']))
- model.to(device)
-
- text_widget.write('Done!\n')
-
- model_name = os.path.basename(c["model_name"])
-
- # -Go through the different steps of seperation-
- # Wave source
- text_widget.write(base_text + 'Loading wave source... ')
-
- X_wave, y_wave, X_spec_s, y_spec_s = {}, {}, {}, {}
-
- bands_n = len(mp.param['band'])
-
- for d in range(bands_n, 0, -1):
- bp = mp.param['band'][d]
-
- if d == bands_n: # high-end band
- X_wave[d], _ = librosa.load(
- music_file, bp['sr'], False, dtype=np.float32, res_type=bp['res_type'])
-
- if X_wave[d].ndim == 1:
- X_wave[d] = np.asarray([X_wave[d], X_wave[d]])
- else: # lower bands
- X_wave[d] = librosa.resample(X_wave[d+1], mp.param['band'][d+1]['sr'], bp['sr'], res_type=bp['res_type'])
-
- # Stft of wave source
-
- X_spec_s[d] = spec_utils.wave_to_spectrogram_mt(X_wave[d], bp['hl'], bp['n_fft'], mp.param['mid_side'],
- mp.param['mid_side_b2'], mp.param['reverse'])
-
- if d == bands_n and args.high_end_process != 'none':
- input_high_end_h = (bp['n_fft']//2 - bp['crop_stop']) + (mp.param['pre_filter_stop'] - mp.param['pre_filter_start'])
- input_high_end = X_spec_s[d][:, bp['n_fft']//2-input_high_end_h:bp['n_fft']//2, :]
-
- text_widget.write('Done!\n')
-
- update_progress(**progress_kwargs,
- step=0.1)
-
- text_widget.write(base_text + 'Stft of wave source... ')
- text_widget.write('Done!\n')
- text_widget.write(base_text + "Please Wait...\n")
-
- X_spec_m = spec_utils.combine_spectrograms(X_spec_s, mp)
-
- del X_wave, X_spec_s
-
- def inference(X_spec, device, model, aggressiveness):
-
- def _execute(X_mag_pad, roi_size, n_window, device, model, aggressiveness):
- model.eval()
-
- with torch.no_grad():
- preds = []
-
- iterations = [n_window]
-
- total_iterations = sum(iterations)
-
- text_widget.write(base_text + "Processing "f"{total_iterations} Slices... ")
-
- for i in tqdm(range(n_window)):
- update_progress(**progress_kwargs,
- step=(0.1 + (0.8/n_window * i)))
- start = i * roi_size
- X_mag_window = X_mag_pad[None, :, :, start:start + data['window_size']]
- X_mag_window = torch.from_numpy(X_mag_window).to(device)
-
- pred = model.predict(X_mag_window, aggressiveness)
-
- pred = pred.detach().cpu().numpy()
- preds.append(pred[0])
-
- pred = np.concatenate(preds, axis=2)
-
- text_widget.write('Done!\n')
- return pred
-
- def preprocess(X_spec):
- X_mag = np.abs(X_spec)
- X_phase = np.angle(X_spec)
-
- return X_mag, X_phase
-
- X_mag, X_phase = preprocess(X_spec)
-
- coef = X_mag.max()
- X_mag_pre = X_mag / coef
-
- n_frame = X_mag_pre.shape[2]
- pad_l, pad_r, roi_size = dataset.make_padding(n_frame,
- data['window_size'], model.offset)
- n_window = int(np.ceil(n_frame / roi_size))
-
- X_mag_pad = np.pad(
- X_mag_pre, ((0, 0), (0, 0), (pad_l, pad_r)), mode='constant')
-
- pred = _execute(X_mag_pad, roi_size, n_window,
- device, model, aggressiveness)
- pred = pred[:, :, :n_frame]
-
- if data['tta']:
- pad_l += roi_size // 2
- pad_r += roi_size // 2
- n_window += 1
-
- X_mag_pad = np.pad(
- X_mag_pre, ((0, 0), (0, 0), (pad_l, pad_r)), mode='constant')
-
- pred_tta = _execute(X_mag_pad, roi_size, n_window,
- device, model, aggressiveness)
- pred_tta = pred_tta[:, :, roi_size // 2:]
- pred_tta = pred_tta[:, :, :n_frame]
-
- return (pred + pred_tta) * 0.5 * coef, X_mag, np.exp(1.j * X_phase)
- else:
- return pred * coef, X_mag, np.exp(1.j * X_phase)
-
- aggressiveness = {'value': args.aggressiveness, 'split_bin': mp.param['band'][1]['crop_stop']}
-
- if data['tta']:
- text_widget.write(base_text + "Running Inferences (TTA)... \n")
- else:
- text_widget.write(base_text + "Running Inference... \n")
-
- pred, X_mag, X_phase = inference(X_spec_m,
- device,
- model, aggressiveness)
-
- update_progress(**progress_kwargs,
- step=0.85)
-
- # Postprocess
- if data['postprocess']:
- text_widget.write(base_text + 'Post processing... ')
- pred_inv = np.clip(X_mag - pred, 0, np.inf)
- pred = spec_utils.mask_silence(pred, pred_inv)
- text_widget.write('Done!\n')
-
- update_progress(**progress_kwargs,
- step=0.85)
-
- # Inverse stft
- text_widget.write(base_text + 'Inverse stft of instruments and vocals... ') # nopep8
- y_spec_m = pred * X_phase
- v_spec_m = X_spec_m - y_spec_m
-
- if args.high_end_process.startswith('mirroring'):
- input_high_end_ = spec_utils.mirroring(args.high_end_process, y_spec_m, input_high_end, mp)
- wav_instrument = spec_utils.cmb_spectrogram_to_wave(y_spec_m, mp, input_high_end_h, input_high_end_)
- else:
- wav_instrument = spec_utils.cmb_spectrogram_to_wave(y_spec_m, mp)
-
- if args.high_end_process.startswith('mirroring'):
- input_high_end_ = spec_utils.mirroring(args.high_end_process, v_spec_m, input_high_end, mp)
-
- wav_vocals = spec_utils.cmb_spectrogram_to_wave(v_spec_m, mp, input_high_end_h, input_high_end_)
- else:
- wav_vocals = spec_utils.cmb_spectrogram_to_wave(v_spec_m, mp)
-
- text_widget.write('Done!\n')
-
- update_progress(**progress_kwargs,
- step=0.9)
-
- # Save output music files
- text_widget.write(base_text + 'Saving Files... ')
- save_files(wav_instrument, wav_vocals)
- text_widget.write('Done!\n')
-
- # Save output image
- if data['output_image']:
- with open('{}_Instruments.jpg'.format(base_name), mode='wb') as f:
- image = spec_utils.spectrogram_to_image(y_spec_m)
- _, bin_image = cv2.imencode('.jpg', image)
- bin_image.tofile(f)
- with open('{}_Vocals.jpg'.format(base_name), mode='wb') as f:
- image = spec_utils.spectrogram_to_image(v_spec_m)
- _, bin_image = cv2.imencode('.jpg', image)
- bin_image.tofile(f)
-
- text_widget.write(base_text + 'Clearing CUDA Cache... ')
-
- torch.cuda.empty_cache()
- time.sleep(3)
-
- text_widget.write('Done!\n')
-
- text_widget.write(base_text + 'Completed Seperation!\n\n')
-
- # Emsembling Outputs
- def get_files(folder="", prefix="", suffix=""):
- return [f"{folder}{i}" for i in os.listdir(folder) if i.startswith(prefix) if i.endswith(suffix)]
-
- ensambles = [
- {
- 'algorithm':'min_mag',
- 'model_params':'lib_v5/modelparams/1band_sr44100_hl512.json',
- 'files':get_files(folder=enseExport, prefix=trackname, suffix="_(Instrumental).wav"),
- 'output':'{}_Ensembled_{}_(Instrumental)'.format(trackname, ensemode),
- 'type': 'Instrumentals'
- },
- {
- 'algorithm':'max_mag',
- 'model_params':'lib_v5/modelparams/1band_sr44100_hl512.json',
- 'files':get_files(folder=enseExport, prefix=trackname, suffix="_(Vocals).wav"),
- 'output': '{}_Ensembled_{}_(Vocals)'.format(trackname, ensemode),
- 'type': 'Vocals'
- }
- ]
-
- for i, e in tqdm(enumerate(ensambles), desc="Ensembling..."):
-
- text_widget.write(base_text + "Ensembling " + e['type'] + "... ")
-
- wave, specs = {}, {}
-
- mp = ModelParameters(e['model_params'])
-
- for i in range(len(e['files'])):
- spec = {}
-
- for d in range(len(mp.param['band']), 0, -1):
- bp = mp.param['band'][d]
-
- if d == len(mp.param['band']): # high-end band
- wave[d], _ = librosa.load(
- e['files'][i], bp['sr'], False, dtype=np.float32, res_type=bp['res_type'])
-
- if len(wave[d].shape) == 1: # mono to stereo
- wave[d] = np.array([wave[d], wave[d]])
- else: # lower bands
- wave[d] = librosa.resample(wave[d+1], mp.param['band'][d+1]['sr'], bp['sr'], res_type=bp['res_type'])
-
- spec[d] = spec_utils.wave_to_spectrogram(wave[d], bp['hl'], bp['n_fft'], mp.param['mid_side'], mp.param['mid_side_b2'], mp.param['reverse'])
-
- specs[i] = spec_utils.combine_spectrograms(spec, mp)
-
- del wave
-
- sf.write(os.path.join('{}'.format(data['export_path']),'{}.wav'.format(e['output'])),
- spec_utils.cmb_spectrogram_to_wave(spec_utils.ensembling(e['algorithm'],
- specs), mp), mp.param['sr'])
-
- if not data['save']: # Deletes all outputs if Save All Outputs: is checked
- files = e['files']
- for file in files:
- os.remove(file)
-
- text_widget.write("Done!\n")
-
- update_progress(**progress_kwargs,
- step=0.95)
- text_widget.write("\n")
-
- except Exception as e:
- traceback_text = ''.join(traceback.format_tb(e.__traceback__))
- message = f'Traceback Error: "{traceback_text}"\n{type(e).__name__}: "{e}"\nFile: {music_file}\nPlease contact the creator and attach a screenshot of this error with the file and settings that caused it!'
- tk.messagebox.showerror(master=window,
- title='Untracked Error',
- message=message)
- print(traceback_text)
- print(type(e).__name__, e)
- print(message)
- progress_var.set(0)
- button_widget.configure(state=tk.NORMAL) #Enable Button
- return
-
- if len(os.listdir(enseExport)) == 0: #Check if the folder is empty
- shutil.rmtree(folder_path) #Delete folder if empty
-
- update_progress(**progress_kwargs,
- step=1)
-
- print('Done!')
-
- os.remove('temp.wav')
-
- progress_var.set(0)
- text_widget.write(f'Conversions Completed!\n')
- text_widget.write(f'Time Elapsed: {time.strftime("%H:%M:%S", time.gmtime(int(time.perf_counter() - stime)))}') # nopep8
- torch.cuda.empty_cache()
- button_widget.configure(state=tk.NORMAL) #Enable Button
From b4a7f5f7ef21b7bf83415730034ac816616b5928 Mon Sep 17 00:00:00 2001
From: Anjok07 <68268275+Anjok07@users.noreply.github.com>
Date: Tue, 10 May 2022 19:02:32 -0500
Subject: [PATCH 04/33] Delete inference_v5.py
---
inference_v5.py | 439 ------------------------------------------------
1 file changed, 439 deletions(-)
delete mode 100644 inference_v5.py
diff --git a/inference_v5.py b/inference_v5.py
deleted file mode 100644
index 4ca9b4a..0000000
--- a/inference_v5.py
+++ /dev/null
@@ -1,439 +0,0 @@
-from functools import total_ordering
-import pprint
-import argparse
-import os
-import importlib
-from statistics import mode
-
-import cv2
-import librosa
-import math
-import numpy as np
-import soundfile as sf
-from tqdm import tqdm
-
-from lib_v5 import dataset
-from lib_v5 import spec_utils
-from lib_v5.model_param_init import ModelParameters
-import torch
-
-# Command line text parsing and widget manipulation
-from collections import defaultdict
-import tkinter as tk
-import traceback # Error Message Recent Calls
-import time # Timer
-
-class VocalRemover(object):
-
- def __init__(self, data, text_widget: tk.Text):
- self.data = data
- self.text_widget = text_widget
- self.models = defaultdict(lambda: None)
- self.devices = defaultdict(lambda: None)
- # self.offset = model.offset
-
-data = {
- # Paths
- 'input_paths': None,
- 'export_path': None,
- # Processing Options
- 'gpu': -1,
- 'postprocess': True,
- 'tta': True,
- 'output_image': True,
- # Models
- 'instrumentalModel': None,
- 'useModel': None,
- # Constants
- 'window_size': 512,
- 'agg': 10
-}
-
-default_window_size = data['window_size']
-default_agg = data['agg']
-
-def update_progress(progress_var, total_files, file_num, step: float = 1):
- """Calculate the progress for the progress widget in the GUI"""
- base = (100 / total_files)
- progress = base * (file_num - 1)
- progress += base * step
-
- progress_var.set(progress)
-
-def get_baseText(total_files, file_num):
- """Create the base text for the command widget"""
- text = 'File {file_num}/{total_files} '.format(file_num=file_num,
- total_files=total_files)
- return text
-
-def determineModelFolderName():
- """
- Determine the name that is used for the folder and appended
- to the back of the music files
- """
- modelFolderName = ''
- if not data['modelFolder']:
- # Model Test Mode not selected
- return modelFolderName
-
- # -Instrumental-
- if os.path.isfile(data['instrumentalModel']):
- modelFolderName += os.path.splitext(os.path.basename(data['instrumentalModel']))[0]
-
- if modelFolderName:
- modelFolderName = '/' + modelFolderName
-
- return modelFolderName
-
-def main(window: tk.Wm, text_widget: tk.Text, button_widget: tk.Button, progress_var: tk.Variable,
- **kwargs: dict):
-
- global args
- global model_params_d
- global nn_arch_sizes
-
- nn_arch_sizes = [
- 31191, # default
- 33966, 123821, 123812, 537238 # custom
- ]
-
- p = argparse.ArgumentParser()
- p.add_argument('--paramone', type=str, default='lib_v5/modelparams/4band_44100.json')
- p.add_argument('--paramtwo', type=str, default='lib_v5/modelparams/4band_v2.json')
- p.add_argument('--paramthree', type=str, default='lib_v5/modelparams/3band_44100_msb2.json')
- p.add_argument('--paramfour', type=str, default='lib_v5/modelparams/4band_v2_sn.json')
- p.add_argument('--aggressiveness',type=float, default=data['agg']/100)
- p.add_argument('--nn_architecture', type=str, choices= ['auto'] + list('{}KB'.format(s) for s in nn_arch_sizes), default='auto')
- p.add_argument('--high_end_process', type=str, default='mirroring')
- args = p.parse_args()
-
- def save_files(wav_instrument, wav_vocals):
- """Save output music files"""
- vocal_name = '(Vocals)'
- instrumental_name = '(Instrumental)'
- save_path = os.path.dirname(base_name)
-
- # Swap names if vocal model
-
- VModel="Vocal"
-
- if VModel in model_name:
- # Reverse names
- vocal_name, instrumental_name = instrumental_name, vocal_name
-
- # Save Temp File
- # For instrumental the instrumental is the temp file
- # and for vocal the instrumental is the temp file due
- # to reversement
- sf.write(f'temp.wav',
- wav_instrument, mp.param['sr'])
-
- appendModelFolderName = modelFolderName.replace('/', '_')
-
- # -Save files-
- # Instrumental
- if instrumental_name is not None:
- instrumental_path = '{save_path}/{file_name}.wav'.format(
- save_path=save_path,
- file_name=f'{os.path.basename(base_name)}_{instrumental_name}{appendModelFolderName}',
- )
-
- sf.write(instrumental_path,
- wav_instrument, mp.param['sr'])
- # Vocal
- if vocal_name is not None:
- vocal_path = '{save_path}/{file_name}.wav'.format(
- save_path=save_path,
- file_name=f'{os.path.basename(base_name)}_{vocal_name}{appendModelFolderName}',
- )
- sf.write(vocal_path,
- wav_vocals, mp.param['sr'])
-
- data.update(kwargs)
-
- # Update default settings
- global default_window_size
- global default_agg
- default_window_size = data['window_size']
- default_agg = data['agg']
-
- stime = time.perf_counter()
- progress_var.set(0)
- text_widget.clear()
- button_widget.configure(state=tk.DISABLED) # Disable Button
-
- vocal_remover = VocalRemover(data, text_widget)
- modelFolderName = determineModelFolderName()
- if modelFolderName:
- folder_path = f'{data["export_path"]}{modelFolderName}'
- if not os.path.isdir(folder_path):
- os.mkdir(folder_path)
-
- # Separation Preperation
- try: #Load File(s)
- for file_num, music_file in enumerate(data['input_paths'], start=1):
- # Determine File Name
- base_name = f'{data["export_path"]}{modelFolderName}/{file_num}_{os.path.splitext(os.path.basename(music_file))[0]}'
-
- model_name = os.path.basename(data[f'{data["useModel"]}Model'])
- model = vocal_remover.models[data['useModel']]
- device = vocal_remover.devices[data['useModel']]
- # -Get text and update progress-
- base_text = get_baseText(total_files=len(data['input_paths']),
- file_num=file_num)
- progress_kwargs = {'progress_var': progress_var,
- 'total_files': len(data['input_paths']),
- 'file_num': file_num}
- update_progress(**progress_kwargs,
- step=0)
-
- #Load Model
- text_widget.write(base_text + 'Loading models...')
-
-
- if 'auto' == args.nn_architecture:
- model_size = math.ceil(os.stat(data['instrumentalModel']).st_size / 1024)
- args.nn_architecture = '{}KB'.format(min(nn_arch_sizes, key=lambda x:abs(x-model_size)))
-
- nets = importlib.import_module('lib_v5.nets' + f'_{args.nn_architecture}'.replace('_{}KB'.format(nn_arch_sizes[0]), ''), package=None)
-
- ModelName=(data['instrumentalModel'])
-
- ModelParam1="4BAND_44100"
- ModelParam2="4BAND_44100_B"
- ModelParam3="MSB2"
- ModelParam4="4BAND_44100_SN"
-
- if ModelParam1 in ModelName:
- model_params_d=args.paramone
- if ModelParam2 in ModelName:
- model_params_d=args.paramtwo
- if ModelParam3 in ModelName:
- model_params_d=args.paramthree
- if ModelParam4 in ModelName:
- model_params_d=args.paramfour
-
- print('Model Parameters:', model_params_d)
-
- mp = ModelParameters(model_params_d)
-
- # -Instrumental-
- if os.path.isfile(data['instrumentalModel']):
- device = torch.device('cpu')
- model = nets.CascadedASPPNet(mp.param['bins'] * 2)
- model.load_state_dict(torch.load(data['instrumentalModel'],
- map_location=device))
- if torch.cuda.is_available() and data['gpu'] >= 0:
- device = torch.device('cuda:{}'.format(data['gpu']))
- model.to(device)
-
- vocal_remover.models['instrumental'] = model
- vocal_remover.devices['instrumental'] = device
-
- text_widget.write(' Done!\n')
-
- model_name = os.path.basename(data[f'{data["useModel"]}Model'])
-
- mp = ModelParameters(model_params_d)
-
- # -Go through the different steps of seperation-
- # Wave source
- text_widget.write(base_text + 'Loading wave source...')
-
- X_wave, y_wave, X_spec_s, y_spec_s = {}, {}, {}, {}
-
- bands_n = len(mp.param['band'])
-
- for d in range(bands_n, 0, -1):
- bp = mp.param['band'][d]
-
- if d == bands_n: # high-end band
- X_wave[d], _ = librosa.load(
- music_file, bp['sr'], False, dtype=np.float32, res_type=bp['res_type'])
-
- if X_wave[d].ndim == 1:
- X_wave[d] = np.asarray([X_wave[d], X_wave[d]])
- else: # lower bands
- X_wave[d] = librosa.resample(X_wave[d+1], mp.param['band'][d+1]['sr'], bp['sr'], res_type=bp['res_type'])
-
- # Stft of wave source
-
- X_spec_s[d] = spec_utils.wave_to_spectrogram_mt(X_wave[d], bp['hl'], bp['n_fft'], mp.param['mid_side'],
- mp.param['mid_side_b2'], mp.param['reverse'])
-
- if d == bands_n and args.high_end_process != 'none':
- input_high_end_h = (bp['n_fft']//2 - bp['crop_stop']) + (mp.param['pre_filter_stop'] - mp.param['pre_filter_start'])
- input_high_end = X_spec_s[d][:, bp['n_fft']//2-input_high_end_h:bp['n_fft']//2, :]
-
- text_widget.write('Done!\n')
-
- update_progress(**progress_kwargs,
- step=0.1)
-
- text_widget.write(base_text + 'Stft of wave source...')
-
- text_widget.write(' Done!\n')
-
- text_widget.write(base_text + "Please Wait...\n")
-
- X_spec_m = spec_utils.combine_spectrograms(X_spec_s, mp)
-
- del X_wave, X_spec_s
-
- def inference(X_spec, device, model, aggressiveness):
-
- def _execute(X_mag_pad, roi_size, n_window, device, model, aggressiveness):
- model.eval()
-
- with torch.no_grad():
- preds = []
-
- iterations = [n_window]
-
- total_iterations = sum(iterations)
-
- text_widget.write(base_text + "Processing "f"{total_iterations} Slices... ")
-
- for i in tqdm(range(n_window)):
- update_progress(**progress_kwargs,
- step=(0.1 + (0.8/n_window * i)))
- start = i * roi_size
- X_mag_window = X_mag_pad[None, :, :, start:start + data['window_size']]
- X_mag_window = torch.from_numpy(X_mag_window).to(device)
-
- pred = model.predict(X_mag_window, aggressiveness)
-
- pred = pred.detach().cpu().numpy()
- preds.append(pred[0])
-
- pred = np.concatenate(preds, axis=2)
- text_widget.write('Done!\n')
- return pred
-
- def preprocess(X_spec):
- X_mag = np.abs(X_spec)
- X_phase = np.angle(X_spec)
-
- return X_mag, X_phase
-
- X_mag, X_phase = preprocess(X_spec)
-
- coef = X_mag.max()
- X_mag_pre = X_mag / coef
-
- n_frame = X_mag_pre.shape[2]
- pad_l, pad_r, roi_size = dataset.make_padding(n_frame,
- data['window_size'], model.offset)
- n_window = int(np.ceil(n_frame / roi_size))
-
- X_mag_pad = np.pad(
- X_mag_pre, ((0, 0), (0, 0), (pad_l, pad_r)), mode='constant')
-
- pred = _execute(X_mag_pad, roi_size, n_window,
- device, model, aggressiveness)
- pred = pred[:, :, :n_frame]
-
- if data['tta']:
- pad_l += roi_size // 2
- pad_r += roi_size // 2
- n_window += 1
-
- X_mag_pad = np.pad(
- X_mag_pre, ((0, 0), (0, 0), (pad_l, pad_r)), mode='constant')
-
- pred_tta = _execute(X_mag_pad, roi_size, n_window,
- device, model, aggressiveness)
- pred_tta = pred_tta[:, :, roi_size // 2:]
- pred_tta = pred_tta[:, :, :n_frame]
-
- return (pred + pred_tta) * 0.5 * coef, X_mag, np.exp(1.j * X_phase)
- else:
- return pred * coef, X_mag, np.exp(1.j * X_phase)
-
- aggressiveness = {'value': args.aggressiveness, 'split_bin': mp.param['band'][1]['crop_stop']}
-
- if data['tta']:
- text_widget.write(base_text + "Running Inferences (TTA)...\n")
- else:
- text_widget.write(base_text + "Running Inference...\n")
-
- pred, X_mag, X_phase = inference(X_spec_m,
- device,
- model, aggressiveness)
-
- update_progress(**progress_kwargs,
- step=0.9)
- # Postprocess
- if data['postprocess']:
- text_widget.write(base_text + 'Post processing...')
- pred_inv = np.clip(X_mag - pred, 0, np.inf)
- pred = spec_utils.mask_silence(pred, pred_inv)
- text_widget.write(' Done!\n')
-
- update_progress(**progress_kwargs,
- step=0.95)
-
- # Inverse stft
- text_widget.write(base_text + 'Inverse stft of instruments and vocals...') # nopep8
- y_spec_m = pred * X_phase
- v_spec_m = X_spec_m - y_spec_m
-
- if args.high_end_process.startswith('mirroring'):
- input_high_end_ = spec_utils.mirroring(args.high_end_process, y_spec_m, input_high_end, mp)
-
- wav_instrument = spec_utils.cmb_spectrogram_to_wave(y_spec_m, mp, input_high_end_h, input_high_end_)
- else:
- wav_instrument = spec_utils.cmb_spectrogram_to_wave(y_spec_m, mp)
-
- if args.high_end_process.startswith('mirroring'):
- input_high_end_ = spec_utils.mirroring(args.high_end_process, v_spec_m, input_high_end, mp)
-
- wav_vocals = spec_utils.cmb_spectrogram_to_wave(v_spec_m, mp, input_high_end_h, input_high_end_)
- else:
- wav_vocals = spec_utils.cmb_spectrogram_to_wave(v_spec_m, mp)
-
- text_widget.write('Done!\n')
-
- update_progress(**progress_kwargs,
- step=1)
-
- # Save output music files
- text_widget.write(base_text + 'Saving Files...')
- save_files(wav_instrument, wav_vocals)
- text_widget.write(' Done!\n')
-
- update_progress(**progress_kwargs,
- step=1)
-
- # Save output image
- if data['output_image']:
- with open('{}_Instruments.jpg'.format(base_name), mode='wb') as f:
- image = spec_utils.spectrogram_to_image(y_spec_m)
- _, bin_image = cv2.imencode('.jpg', image)
- bin_image.tofile(f)
- with open('{}_Vocals.jpg'.format(base_name), mode='wb') as f:
- image = spec_utils.spectrogram_to_image(v_spec_m)
- _, bin_image = cv2.imencode('.jpg', image)
- bin_image.tofile(f)
-
- text_widget.write(base_text + 'Completed Seperation!\n\n')
- except Exception as e:
- traceback_text = ''.join(traceback.format_tb(e.__traceback__))
- message = f'Traceback Error: "{traceback_text}"\n{type(e).__name__}: "{e}"\nFile: {music_file}\nPlease contact the creator and attach a screenshot of this error with the file and settings that caused it!'
- tk.messagebox.showerror(master=window,
- title='Untracked Error',
- message=message)
- print(traceback_text)
- print(type(e).__name__, e)
- print(message)
- progress_var.set(0)
- button_widget.configure(state=tk.NORMAL) # Enable Button
- return
-
- os.remove('temp.wav')
-
- progress_var.set(0)
- text_widget.write(f'\nConversion(s) Completed!\n')
- text_widget.write(f'Time Elapsed: {time.strftime("%H:%M:%S", time.gmtime(int(time.perf_counter() - stime)))}') # nopep8
- torch.cuda.empty_cache()
- button_widget.configure(state=tk.NORMAL) # Enable Button
\ No newline at end of file
From 5cfcefdb15e682b8be5a71c496a525d4b43bab80 Mon Sep 17 00:00:00 2001
From: Anjok07 <68268275+Anjok07@users.noreply.github.com>
Date: Tue, 10 May 2022 19:02:40 -0500
Subject: [PATCH 05/33] Delete VocalRemover.py
---
VocalRemover.py | 786 ------------------------------------------------
1 file changed, 786 deletions(-)
delete mode 100644 VocalRemover.py
diff --git a/VocalRemover.py b/VocalRemover.py
deleted file mode 100644
index 9fd172b..0000000
--- a/VocalRemover.py
+++ /dev/null
@@ -1,786 +0,0 @@
-# GUI modules
-import tkinter as tk
-import tkinter.ttk as ttk
-import tkinter.messagebox
-import tkinter.filedialog
-import tkinter.font
-from tkinterdnd2 import TkinterDnD, DND_FILES # Enable Drag & Drop
-from datetime import datetime
-# Images
-from PIL import Image
-from PIL import ImageTk
-import pickle # Save Data
-# Other Modules
-import subprocess # Run python file
-# Pathfinding
-import pathlib
-import sys
-import os
-import subprocess
-from collections import defaultdict
-# Used for live text displaying
-import queue
-import threading # Run the algorithm inside a thread
-
-
-from pathlib import Path
-
-import inference_v5
-import inference_v5_ensemble
-# import win32gui, win32con
-
-# the_program_to_hide = win32gui.GetForegroundWindow()
-# win32gui.ShowWindow(the_program_to_hide , win32con.SW_HIDE)
-
-# Change the current working directory to the directory
-# this file sits in
-if getattr(sys, 'frozen', False):
- # If the application is run as a bundle, the PyInstaller bootloader
- # extends the sys module by a flag frozen=True and sets the app
- # path into variable _MEIPASS'.
- base_path = sys._MEIPASS
-else:
- base_path = os.path.dirname(os.path.abspath(__file__))
-
-os.chdir(base_path) # Change the current working directory to the base path
-
-instrumentalModels_dir = os.path.join(base_path, 'models')
-banner_path = os.path.join(base_path, 'img', 'UVR-banner.png')
-efile_path = os.path.join(base_path, 'img', 'file.png')
-DEFAULT_DATA = {
- 'exportPath': '',
- 'inputPaths': [],
- 'gpu': False,
- 'postprocess': False,
- 'tta': False,
- 'save': True,
- 'output_image': False,
- 'window_size': '512',
- 'agg': 10,
- 'modelFolder': False,
- 'modelInstrumentalLabel': '',
- 'aiModel': 'Single Model',
- 'ensChoose': 'HP1 Models',
- 'useModel': 'instrumental',
- 'lastDir': None,
-}
-
-def open_image(path: str, size: tuple = None, keep_aspect: bool = True, rotate: int = 0) -> ImageTk.PhotoImage:
- """
- Open the image on the path and apply given settings\n
- Paramaters:
- path(str):
- Absolute path of the image
- size(tuple):
- first value - width
- second value - height
- keep_aspect(bool):
- keep aspect ratio of image and resize
- to maximum possible width and height
- (maxima are given by size)
- rotate(int):
- clockwise rotation of image
- Returns(ImageTk.PhotoImage):
- Image of path
- """
- img = Image.open(path).convert(mode='RGBA')
- ratio = img.height/img.width
- img = img.rotate(angle=-rotate)
- if size is not None:
- size = (int(size[0]), int(size[1]))
- if keep_aspect:
- img = img.resize((size[0], int(size[0] * ratio)), Image.ANTIALIAS)
- else:
- img = img.resize(size, Image.ANTIALIAS)
- return ImageTk.PhotoImage(img)
-
-def save_data(data):
- """
- Saves given data as a .pkl (pickle) file
-
- Paramters:
- data(dict):
- Dictionary containing all the necessary data to save
- """
- # Open data file, create it if it does not exist
- with open('data.pkl', 'wb') as data_file:
- pickle.dump(data, data_file)
-
-def load_data() -> dict:
- """
- Loads saved pkl file and returns the stored data
-
- Returns(dict):
- Dictionary containing all the saved data
- """
- try:
- with open('data.pkl', 'rb') as data_file: # Open data file
- data = pickle.load(data_file)
-
- return data
- except (ValueError, FileNotFoundError):
- # Data File is corrupted or not found so recreate it
- save_data(data=DEFAULT_DATA)
-
- return load_data()
-
-def drop(event, accept_mode: str = 'files'):
- """
- Drag & Drop verification process
- """
- path = event.data
-
- if accept_mode == 'folder':
- path = path.replace('{', '').replace('}', '')
- if not os.path.isdir(path):
- tk.messagebox.showerror(title='Invalid Folder',
- message='Your given export path is not a valid folder!')
- return
- # Set Variables
- root.exportPath_var.set(path)
- elif accept_mode == 'files':
- # Clean path text and set path to the list of paths
- path = path.replace('{', '')
- path = path.split('} ')
- path[-1] = path[-1].replace('}', '')
- # Set Variables
- root.inputPaths = path
- root.update_inputPaths()
- else:
- # Invalid accept mode
- return
-
-class ThreadSafeConsole(tk.Text):
- """
- Text Widget which is thread safe for tkinter
- """
- def __init__(self, master, **options):
- tk.Text.__init__(self, master, **options)
- self.queue = queue.Queue()
- self.update_me()
-
- def write(self, line):
- self.queue.put(line)
-
- def clear(self):
- self.queue.put(None)
-
- def update_me(self):
- self.configure(state=tk.NORMAL)
- try:
- while 1:
- line = self.queue.get_nowait()
- if line is None:
- self.delete(1.0, tk.END)
- else:
- self.insert(tk.END, str(line))
- self.see(tk.END)
- self.update_idletasks()
- except queue.Empty:
- pass
- self.configure(state=tk.DISABLED)
- self.after(100, self.update_me)
-
-class MainWindow(TkinterDnD.Tk):
- # --Constants--
- # Layout
- IMAGE_HEIGHT = 140
- FILEPATHS_HEIGHT = 80
- OPTIONS_HEIGHT = 190
- CONVERSIONBUTTON_HEIGHT = 35
- COMMAND_HEIGHT = 200
- PROGRESS_HEIGHT = 26
- PADDING = 10
-
- COL1_ROWS = 6
- COL2_ROWS = 6
- COL3_ROWS = 6
-
- def __init__(self):
- # Run the __init__ method on the tk.Tk class
- super().__init__()
- # Calculate window height
- height = self.IMAGE_HEIGHT + self.FILEPATHS_HEIGHT + self.OPTIONS_HEIGHT
- height += self.CONVERSIONBUTTON_HEIGHT + self.COMMAND_HEIGHT + self.PROGRESS_HEIGHT
- height += self.PADDING * 5 # Padding
-
- # --Window Settings--
- self.title('Vocal Remover')
- # Set Geometry and Center Window
- self.geometry('{width}x{height}+{xpad}+{ypad}'.format(
- width=620,
- height=height,
- xpad=int(self.winfo_screenwidth()/2 - 550/2),
- ypad=int(self.winfo_screenheight()/2 - height/2 - 30)))
- self.configure(bg='#000000') # Set background color to black
- self.protocol("WM_DELETE_WINDOW", self.save_values)
- self.resizable(False, False)
- self.update()
-
- # --Variables--
- self.logo_img = open_image(path=banner_path,
- size=(self.winfo_width(), 9999))
- self.efile_img = open_image(path=efile_path,
- size=(20, 20))
- self.instrumentalLabel_to_path = defaultdict(lambda: '')
- self.lastInstrumentalModels = []
- # -Tkinter Value Holders-
- data = load_data()
- # Paths
- self.exportPath_var = tk.StringVar(value=data['exportPath'])
- self.inputPaths = data['inputPaths']
- # Processing Options
- self.gpuConversion_var = tk.BooleanVar(value=data['gpu'])
- self.postprocessing_var = tk.BooleanVar(value=data['postprocess'])
- self.tta_var = tk.BooleanVar(value=data['tta'])
- self.save_var = tk.BooleanVar(value=data['save'])
- self.outputImage_var = tk.BooleanVar(value=data['output_image'])
- # Models
- self.instrumentalModel_var = tk.StringVar(value=data['modelInstrumentalLabel'])
- # Model Test Mode
- self.modelFolder_var = tk.BooleanVar(value=data['modelFolder'])
- # Constants
- self.winSize_var = tk.StringVar(value=data['window_size'])
- self.agg_var = tk.StringVar(value=data['agg'])
- # Choose Conversion Method
- self.aiModel_var = tk.StringVar(value=data['aiModel'])
- self.last_aiModel = self.aiModel_var.get()
- # Choose Ensemble
- self.ensChoose_var = tk.StringVar(value=data['ensChoose'])
- self.last_ensChoose = self.ensChoose_var.get()
- # Other
- self.inputPathsEntry_var = tk.StringVar(value='')
- self.lastDir = data['lastDir'] # nopep8
- self.progress_var = tk.IntVar(value=0)
- # Font
- self.font = tk.font.Font(family='Microsoft JhengHei', size=9, weight='bold')
- # --Widgets--
- self.create_widgets()
- self.configure_widgets()
- self.bind_widgets()
- self.place_widgets()
- self.update_available_models()
- self.update_states()
- self.update_loop()
-
- # -Widget Methods-
- def create_widgets(self):
- """Create window widgets"""
- self.title_Label = tk.Label(master=self, bg='black',
- image=self.logo_img, compound=tk.TOP)
- self.filePaths_Frame = tk.Frame(master=self, bg='black')
- self.fill_filePaths_Frame()
-
- self.options_Frame = tk.Frame(master=self, bg='black')
- self.fill_options_Frame()
-
- self.conversion_Button = ttk.Button(master=self,
- text='Start Conversion',
- command=self.start_conversion)
- self.efile_Button = ttk.Button(master=self,
- image=self.efile_img,
- command=self.open_newModel_filedialog)
-
- self.progressbar = ttk.Progressbar(master=self,
- variable=self.progress_var)
-
- self.command_Text = ThreadSafeConsole(master=self,
- background='#a0a0a0',
- borderwidth=0,)
- self.command_Text.write(f'COMMAND LINE [{datetime.now().strftime("%Y-%m-%d %H:%M:%S")}]') # nopep8
-
- def configure_widgets(self):
- """Change widget styling and appearance"""
-
- ttk.Style().configure('TCheckbutton', background='black',
- font=self.font, foreground='white')
- ttk.Style().configure('TRadiobutton', background='black',
- font=self.font, foreground='white')
- ttk.Style().configure('T', font=self.font, foreground='white')
-
- def bind_widgets(self):
- """Bind widgets to the drag & drop mechanic"""
- self.filePaths_saveTo_Button.drop_target_register(DND_FILES)
- self.filePaths_saveTo_Entry.drop_target_register(DND_FILES)
- self.filePaths_musicFile_Button.drop_target_register(DND_FILES)
- self.filePaths_musicFile_Entry.drop_target_register(DND_FILES)