Machine Learning 101 - Workshops & More at NIT Warangal
In my final year of undergraduate studies, I wanted to start an active machine learning group at our college where like-minded people could come together to ideate, learn and publish. Such groups were not a radically new idea of mine; they had been attempted in the past but could not be sustained.
When I asked around, I was surprised to find that many people were interested in the idea and enthusiastic about joining the group. But most of them wanted to join simply to learn ML, because they lacked a sound understanding of both the theory and its applications. That’s when I realised that almost everyone is talking about ML and AI and is excited by the possibilities they offer. People also have all sorts of ideas that they want to approach using ML, be it in computer science, circuit design, biotechnology or mechanical engineering, but they lack the requisite knowledge to realise these ideas.
That’s when I decided to organize workshops to help people develop the skill set that would allow them to work on their own ML projects. Preparation for Tz (Technozion, the annual flagship technical festival organized by NITW) was also in full swing around the same time. So Ashish Rai, Protik Biswas and I came together to organize the workshops in collaboration with Tz-17 and CSEA (Computer Science and Engineering Association). Tz-17 gave us the opportunity to reach a larger audience, and we also decided to organize an ML contest as a full-fledged online event. The contest served as an incentive for people to learn and to test their skills on a practical ML problem. CSEA, meanwhile, took care of logistics for the workshops and later for the contest.
The team was formed!
As enthusiastic as I was to deliver the sessions, the hardest part came next: preparing the content for the first ever ML workshop, for a diverse audience with varying knowledge of prerequisites such as calculus, linear algebra and computer programming.
Ashish, Protik and I decided to keep the content completely separate from the programming, because many first-year students were still writing their first C++ programs. That doesn’t mean we didn’t encourage them to get their hands dirty and dive right in to understand the implementation challenges, which are as intriguing as the theory behind them, but we kept the programming exercises optional for all.
Preparation took off on other fronts as well. Posters were designed and shared across platforms. Below is the poster we came up with for our workshops.
About 360 people from different engineering specializations and courses, including B.Tech, M.Tech and MCA, registered for the first workshop, of whom 250 showed up. I was overwhelmed by the huge response we received. Delivering the content to such a large audience was a big challenge, but we managed well thanks to Ashish and Protik, who filled in whenever I missed a point, took over to suggest alternate views and helped give personalized support to the audience. The first workshop lasted about 5 hours across 2 sessions, one before lunch from 10 to 1 and the next after lunch from 2 to 4. It was a huge success, which motivated us to organize more such sessions in the future.
We organized three workshops as part of the series and covered a wide spectrum of algorithms, including linear regression, logistic regression, neural networks, support vector machines, k-means and decision trees, along with techniques and practical issues around them: L1 and L2 norms, tree pruning, class imbalance, one-hot encoding, Gaussian noise, underfitting and overfitting, bias and variance, vectorization, the effects of outliers, cross-validation, k-fold cross-validation, and batch, mini-batch and stochastic methods. We also discussed some mathematics along the way to give the formulas a solid foundation and some intuitive justification, including linear algebra, calculus, partial derivatives, optimization, and convex and concave functions. For people who wanted to work on their implementation skills, we focussed on programming exercises. We asked them to write their own logic for the algorithms and also introduced libraries that do the same. We created elaborate assignments as Jupyter notebooks and used SciPy, NumPy, scikit-learn, matplotlib and some other open-source tools.
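To give a flavour of the "write your own logic, then compare with a library" style of exercise, here is a minimal sketch. It is not one of the actual workshop assignments: the function name `fit_linear_regression_gd`, the learning rate, the iteration count and the synthetic data are all assumptions made for illustration. It fits a linear regression with vectorized batch gradient descent on the mean squared error and checks the result against NumPy's closed-form least-squares solver.

```python
import numpy as np


def fit_linear_regression_gd(X, y, lr=0.1, n_iters=2000):
    """Fit y ~ X @ w + b by batch gradient descent on mean squared error.

    Hypothetical helper for illustration; lr and n_iters are arbitrary choices.
    """
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_iters):
        residual = X @ w + b - y                      # vectorized prediction error
        grad_w = (2.0 / n_samples) * (X.T @ residual)  # dMSE/dw
        grad_b = 2.0 * residual.mean()                 # dMSE/db
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Synthetic data (an assumption): a known linear model plus Gaussian noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
true_w = np.array([3.0, -2.0])
true_b = 0.5
y = X @ true_w + true_b + rng.normal(scale=0.1, size=200)

w_gd, b_gd = fit_linear_regression_gd(X, y)

# Reference solution: least squares with an explicit column of ones for the bias.
X_aug = np.column_stack([X, np.ones(len(X))])
coef, *_ = np.linalg.lstsq(X_aug, y, rcond=None)
w_ls, b_ls = coef[:2], coef[2]

print("gradient descent:", w_gd, b_gd)
print("least squares:   ", w_ls, b_ls)
```

Both approaches should land on nearly the same weights here, which is a handy sanity check when writing an optimizer by hand.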
We organized a special coding session where we discussed 2017’s Goldman Sachs GS-Quantify ML problem statement and built several solutions around it. The session was meant to show how multiple algorithms can be used together, and to demonstrate the importance of data cleaning, pre-processing and gathering useful insights through simple statistical analysis of the data. It was a very interesting problem statement and dataset that I had encountered recently. Among a few other practical insights, we discussed how handling class imbalance and tuning hyper-parameters could affect model performance.
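As a rough illustration of the last point, the sketch below tunes a classifier on an imbalanced problem with scikit-learn. It does not use the GS-Quantify data or reproduce the solutions from the session: the synthetic dataset, the choice of logistic regression, the parameter grid and the F1 scoring metric are all assumptions picked to keep the example small.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Synthetic, imbalanced binary data (an assumption): about 90% negatives.
X, y = make_classification(
    n_samples=1000,
    n_features=10,
    n_informative=5,
    weights=[0.9, 0.1],
    random_state=0,
)

# Hyper-parameters to search: regularization strength C, and whether to
# reweight classes inversely to their frequency ("balanced") or not (None).
param_grid = {
    "C": [0.01, 0.1, 1.0, 10.0],
    "class_weight": [None, "balanced"],
}

# Stratified folds keep the class ratio roughly constant in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid,
    scoring="f1",  # F1 on the minority class; accuracy would hide the imbalance
    cv=cv,
)
search.fit(X, y)

print("best parameters:", search.best_params_)
print("best mean F1:   ", round(search.best_score_, 3))
```

Comparing the rows of `search.cv_results_` for `class_weight=None` against `"balanced"` shows how much the handling of imbalance alone moves the score.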
Next, we organized the Tz ML contest on Kaggle, which was open to the public and allowed people to put the skills they had learnt to use. We saw good participation and innovative solutions to the problems.
Later, we also organized a deep learning session where we discussed CNNs, autoencoders, RNNs and RBMs, and presented some other projects we had been working on, to give our audience an idea of the kind of research that goes on in the field. Around that time I was working on Discriminatory Scatternets for my UG thesis. You can read more about my work in this blog post.
Overall, I must say it was an awesome experience for me. I learnt a lot from these sessions and exercises, on both the technical and non-technical fronts.