Neural Networks

The following resources give an overview of Neural Networks and were helpful enough that I am recording them here for future reference.

Deep Learning in Neural Networks: An Overview –

Neural Network zoo –

Multiple links here –

This post will be updated with more links on a continuous basis.


Longreads and current reads

I have always respected longreads in journalism. Whatever their topic, all riveting longreads are the result of significant research and time. They are how I choose to spend most of my leisure time on the internet. They are like short fiction, only more informative: you always have certain takeaways and ideas at the end of each longread.

Recently, I came across an author in ‘The Atlantic’ – Conor Friedersdorf. The article that introduced me to his work (I might have previously read his work, but this time his list of articles really caught my attention) is this – Slightly more than 100 fantastic pieces of journalism.

This list contains around 100 of the best longreads that Conor came across in 2015. I was very excited to have found this and checked whether he had done it before. It seems he has, as can be seen from these:

So, there went my longest batch of articles yet into Pocket, my to-read app. Conor also runs a mailing-list subscription where he shares brilliant longreads that he comes across. I believe it is well worth subscribing to.


Meanwhile, I have finally managed to finish book 4 of the Malazan Book of the Fallen series – House of Chains. After having read the 3rd book, i.e. Memories of Ice, I thought I’d read the best book of the series yet (and possibly among all 10 books). But I was proven wrong, and how! This book was even more riveting than book 3, and it was the beginning of what was to come for the surviving Bridgeburners after Whiskeyjack’s incidents in book 3. I am currently taking a break from the Malazan series while I make my way through the ebook of the Wait But Why series on Musk.

This is the link for anyone who is interested.

This book, unlike the biography by Ashlee Vance, goes into the details of Tesla and SpaceX and why Musk is able to do what he is currently doing. It was born out of 4 blog posts on the Wait But Why blog, so the same material can be read there too. I tried reading Thomas Bernhard in between (‘Concrete’) but couldn’t continue and left it midway. That’s that.


Books etc. for April 2016

Steven Erikson, House of Chains
Galbraith, Career of Evil 
Finally finished Memories of Ice, book 3 in the Malazan Book of the Fallen series. It was the best book in the series so far; I was glad to see more written about Anomander Rake, Whiskeyjack and Quick Ben. A big book, but worth every page I read. Reading about Silverfox reminded me of Brienne: good intentions, but to no avail.
Interesting reads of March, 2016

Fermat’s last theorem – interesting observation

Today, while reading an answer related to Fermat’s Last Theorem on Quora, I came across a comment that I found interesting.

It goes like this:

Fermat was mistaken. He could not possibly have devised a simple proof. The proof that was finally devised relied heavily on a branch of maths that wasn’t even around at the time, known as modular functions.

Modular Form — from Wolfram MathWorld

There was originally an unsolved problem called the Taniyama–Shimura–Weil conjecture

Taniyama-Shimura Conjecture

which states that elliptic curves over the field of rational numbers are related to modular forms. There was also a second unsolved problem, Ribet’s theorem (earlier called the epsilon conjecture or ε-conjecture), which is a statement in number theory concerning properties of Galois representations associated with modular forms

Ribet’s Theorem — from Wolfram MathWorld

Frey Curve — from Wolfram MathWorld

In a nutshell:

Frey suggested that Frey curves cannot be modular, so if the Taniyama–Shimura conjecture were true, Frey curves couldn’t exist and Fermat’s Last Theorem would follow with b even and a ≡ −1 (mod 4). However, Frey did not actually prove that his curve was not modular. The conjecture that Frey’s curve was not modular came to be called the “epsilon conjecture,” and was quickly proved by Ribet (Ribet’s theorem) in 1986, establishing a very close link between two mathematical structures (the Taniyama–Shimura conjecture and Fermat’s Last Theorem) which had previously appeared to be completely unrelated.

Only by solving these two problems was it realised that they also served as a proof of Fermat’s Last Theorem. The proof itself is over 150 pages long and consumed seven years of Wiles’s research time.
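For context, here is a quick sketch of the objects the comment mentions (standard background, not part of the quoted comment itself). Given a hypothetical counterexample to Fermat’s Last Theorem, Frey associated an elliptic curve to it:

```latex
% Hypothetical solution to Fermat's equation, with p an odd prime:
a^p + b^p = c^p, \qquad abc \neq 0
% The associated Frey curve:
E:\; y^2 = x\,(x - a^p)\,(x + b^p)
% Taniyama--Shimura: every elliptic curve over \mathbb{Q} is modular.
% Ribet's theorem: the curve E above cannot be modular.
% Together, these contradict the existence of such (a, b, c),
% so Fermat's Last Theorem follows.
```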

I wish to verify the authenticity of the statement above. What is more interesting is to understand what route Fermat might have possibly taken to answer this.

In my view, new findings in any research area help the long-standing attempts to solve famous unsolved questions in that area. But the lack of such findings cannot be the primary reason a solution has remained elusive so far.

Why we follow cricket

The match between India and Bangladesh was the latest instance in cricket that makes you realize how lucky you are to have been following this game. In this batsman-dominated format, and in the game generally these days, there are very few occasions when you finish watching a match with such a feeling of awe.
What happened in the match was not about a team’s win or loss but about why this game continues to get this kind of support in the subcontinent. Sport itself is, as Harsha Bhogle said, one of the biggest forces of our time. There are not many things that can make an adult stop what he or she is doing at work and start watching a game, forgetting the meetings he has, the report he has to make, the calls and the texts. The only things that matter are the tension before and after each ball and the conversations with fellow peers about the game’s happenings. Thanks to technology, you don’t need these peers sitting beside you; they can be in other offices, ignoring all that should be done, again, just like you. So this is what this game gives us: those fleeting glimpses of a team rising, only to fall at the end.
We get to remember, once again, those days when we watched the game in the hostel common room, 50 or more guys crowded in front of a small non-LCD TV, all praying for their teams. We may have seen results like this happen multiple times with South Africa or Pakistan, but would you not rather watch a game where Gayle is beating the bowlers to death, or one where both teams are desperately trying to clinch victory from each other’s hands?
What this game gives you is the moment when you can thump your fist in the air, just like Federer does after scoring a point with a majestic forehand. We, sitting in our offices, are no Federers, but these games give us the opportunity to feel like one when our team wins such matches, clinching victory from almost certain defeat. In that one moment, we all feel like Federer having just hit a beautiful backhand or forehand; in that one moment, we all feel like Phelps having just beaten an opponent by a fraction of a second, lungs burning; in that one moment, we all feel like someone who defeated Usain Bolt (because we sure as hell didn’t win as comfortably as Bolt does).
That is why we follow cricket: for that momentary transformation from being a mere fan to feeling like an athlete, right in the middle of the action, winning, brilliantly. Period.

On training, validation and testing datasets

Today, as I was going through the top answers of interesting users on Cross Validated, I came across a question that struck me with its fundamental nature.

The question asks:

Cross-validation including training, validation, and testing. Why do we need three subsets?

This seemed a very interesting question. On one hand, it concentrates on the need for three datasets, and essentially on how this differs from cross-validation.

Assuming clarity on why we do cross-validation (to be discussed in detail, probably in another post), we need three subsets for the following purposes, as per the top answer at the link:

  • The training set is used to choose the optimum parameters for a given model. Note that evaluating some given set of parameters using the training set should give you an unbiased estimate of your cost function – it is the act of choosing the parameters which optimise the estimate of your cost function based on the training set that biases the estimate they provide. The parameters were chosen which perform best on the training set; hence, the apparent performance of those parameters, as evaluated on the training set, will be overly optimistic.

  • Having trained using the training set, the validation set is used to choose the best model. Again, note that evaluating any given model using the validation set should give you a representative estimate of the cost function – it is the act of choosing the model which performs best on the validation set that biases the estimate they provide. The model was chosen which performs best on the validation set; hence, the apparent performance of that model, as evaluated on the validation set, will be overly optimistic.

  • Having trained each model using the training set, and chosen the best model using the validation set, the test set tells you how good your final choice of model is. It gives you an unbiased estimate of the actual performance you will get at runtime, which is important to know for a lot of reasons. You can’t use the training set for this, because the parameters are biased towards it. And you can’t use the validation set for this, because the model itself is biased towards those. Hence, the need for a third set.
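The three-way split described above can be sketched in a few lines. This is a minimal illustration, assuming scikit-learn and NumPy are available (the post itself names no library, and the data, the 60/20/20 split and the ridge-regression candidates are my own hypothetical choices):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

# Synthetic regression data (purely illustrative)
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 5))
y = X @ rng.normal(size=5) + rng.normal(scale=0.5, size=600)

# 60% train, 20% validation, 20% test
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=0)

# Fit each candidate model's parameters on the training set only
candidates = {alpha: Ridge(alpha=alpha).fit(X_train, y_train)
              for alpha in (0.01, 1.0, 100.0)}

# Choose the model that performs best on the validation set
val_err = {a: mean_squared_error(y_val, m.predict(X_val))
           for a, m in candidates.items()}
best_alpha = min(val_err, key=val_err.get)

# The untouched test set gives the final, unbiased performance estimate
test_err = mean_squared_error(y_test, candidates[best_alpha].predict(X_test))
print(best_alpha, test_err)
```

Note that the validation error of the chosen model is optimistic precisely because that model won the comparison; only the test error, computed once at the end, is an honest estimate.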

These outline some very important ideas that anyone who works with statistical modeling should have embedded in their skull. Often, we build models and test them without knowing the complete picture. It involves using both our business acumen and our statistical acumen, for one without the other is pointless.

That is why, during assignments or exercises, I make a habit of asking myself: why are we doing this? Why? Why? Why? But stopping once I get the answer to that would not help; I should also be asking what-if questions. There can be two types of what-if questions:

  1. What-if it is not done this way?
  2. What-if I do it in an alternate way?

Now, these two questions might seem essentially the same, but they aren’t. And understanding the difference between them makes, in my view, a remarkable difference in one’s learning process.

  1. What-if it is not done this way?

This mostly applies to techniques which have a set methodology in place without much room for choice. In such cases, knowing the repercussions of not using, or of misusing, a technique can often aid us in diagnostics during the course of a project. Some unwanted result or behavior that you encounter during your project can be diagnosed by asking this question and knowing its answer(s).

  2. What-if I do it in an alternate way?

This question lets you discover possible alternatives worth learning about. It also gives you vital information about the technique you are trying to learn: namely, the role that technique plays in answering the question it deals with. If there are few or zero alternatives, that mostly implies the technique is either redundant or all-ruling (which is rarely the case). As for how this question differs from the first: it is the difference between understanding the behavior when a technique is not used and knowing the choices you have when such a situation arises. A combination of these two questions will give you a clarity and technical mastery that are, in my view, utterly essential.

The Elements of Statistical Learning

Having just finished exams for the first flex of the spring semester, I am looking at the last two months of study in the master’s program I am in. The courses I will be taking already promise to make this time a lot more interesting (not to mention the invisible, never-ending job search :p). I have recently seen, however, that a lot of machine learning can be implemented in R and does not necessarily require expertise in Python. Although I am all for learning and advancing my Python knowledge, I have come to accept that my coursework will not allow me to do so.

Coming to the machine learning part, I will be using the following book for this purpose:

  1. The Elements of Statistical Learning (2nd edition)

I have numbered this list because I plan to add more books, should I come across any good enough to be added.

I will be posting my notes, thoughts, doubts and links from the internet on this blog. I want them to serve as references for my future self. Hence, I will not cut any corners while doing so and will try to make them as rigorous as possible.

Also, as and when time permits, I will try using LaTeX.