Audio-Visual Generation of Guitar Cover Videos
• Implemented generative adversarial networks (GANs) to generate video clips from audio.
• Constructed a hybrid LSTM-CNN architecture for audio-to-image cross-domain translation.
• Applied pose estimation and hand segmentation algorithms to reduce the manual effort of generating training data.
• Engineered audio features using signal processing techniques based on the short-time Fourier transform (STFT) and deep-learning-based pitch detection models.
• Utilized: Keras, TensorFlow, OpenCV, LibROSA