11-785 - The fundamentals of deep learning
This course introduces us to the fundamentals of deep learning, taught by Prof. Bhiksha Ramakrishnan.
The content starts with the basics of the human brain (how neurons fire) and swiftly moves to MLPs, followed by CNNs, RNNs, sequence-to-sequence models, attention, autoencoders, VAEs, and GANs. Each topic is covered in great depth, with assignments aimed at implementing the models from scratch and applying them to real data. There were four assignments in total, focusing on MLPs, CNNs, RNNs, and attention. Each assignment has two parts. In part 1, we were expected to code everything from scratch in NumPy, without any automatic differentiation library like PyTorch; the intention is to understand everything from fundamentals.
In part 2, we were expected to use PyTorch and compete on Kaggle by pushing the models in terms of performance. This covers the more practical aspect of making models work.
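To give a flavour of what part 1 involves, here is a minimal sketch (my own illustration, not the actual assignment code) of a linear layer and ReLU with the backward pass written by hand in NumPy, checked against a finite-difference gradient:

```python
import numpy as np

rng = np.random.default_rng(0)

class Linear:
    """A fully connected layer with a hand-written backward pass."""
    def __init__(self, d_in, d_out):
        self.W = rng.normal(0, 0.1, size=(d_in, d_out))
        self.b = np.zeros(d_out)

    def forward(self, x):
        self.x = x                     # cache input for backward
        return x @ self.W + self.b

    def backward(self, dout):
        self.dW = self.x.T @ dout      # gradient w.r.t. weights
        self.db = dout.sum(axis=0)     # gradient w.r.t. bias
        return dout @ self.W.T         # gradient w.r.t. input

class ReLU:
    def forward(self, x):
        self.mask = x > 0
        return x * self.mask

    def backward(self, dout):
        return dout * self.mask

# Forward pass, a squared-error loss, then backward by hand.
x = rng.normal(size=(4, 3))
target = rng.normal(size=(4, 2))
lin, relu = Linear(3, 2), ReLU()

out = relu.forward(lin.forward(x))
loss = 0.5 * ((out - target) ** 2).sum()
dx = lin.backward(relu.backward(out - target))

# Sanity check: compare dW[0, 0] to a central finite difference.
eps = 1e-6
w = lin.W[0, 0]
lin.W[0, 0] = w + eps
lp = 0.5 * ((relu.forward(lin.forward(x)) - target) ** 2).sum()
lin.W[0, 0] = w - eps
lm = 0.5 * ((relu.forward(lin.forward(x)) - target) ** 2).sum()
lin.W[0, 0] = w
numeric = (lp - lm) / (2 * eps)
assert abs(numeric - lin.dW[0, 0]) < 1e-6
```

The real assignments build full multi-layer networks this way, but the core discipline is the same: cache what the forward pass needs, and derive every gradient yourself.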
There are 14 timed quizzes in total, which need to be completed over the weekends, and a final project.
For the final project, we implemented and tuned models that performed question answering (QA) on the abstracts of academic computer science papers. Our baseline models were simple QA models based primarily on BERT; each model swapped BERT for one of its variants, including SciBERT (pre-trained on scientific literature) and DistilBERT (a smaller BERT obtained via knowledge distillation). Our main model was a combination of SciBERT and BiDAF. For all models, we trained and evaluated on a SQuAD-like dataset called PaperQA, which contains crowd-sourced questions and answers about CS paper abstracts.
Link to video -
The course website can be found here.