musings on music and life

December 16, 2016

My experiences with learning “Data Science” in 2016

Filed under: Coding, Data Science — sankirnam @ 11:21 pm

Well, 2016 is drawing to a close…

This has been a weird year globally, with the deaths of many influential people (including, among others, Muhammad Ali and, most recently, J. Jayalalitha, the Chief Minister of Tamil Nadu, India), and some other strange political occurrences (Brexit and Trump getting elected). I haven’t posted here much because I have a million thoughts swirling around my mind all the time now, and finding a couple of hours of focused time to distill them into an article on a single topic is a bit challenging. Nonetheless, there is something that I want to discuss today.

Firstly, I had the sobering realization a few days ago that it has been 2 years since I finished my PhD and I have nothing concrete to show for it; I’ve been unemployed for the past two years. Well, I’ve learned some valuable things about life and other topics which I wouldn’t have been able to learn otherwise, but it has been at a rather expensive cost: progress in my career.

In any case, one of the major themes of this year (for me) was that I made major progress in learning programming! I want to share what I learned so that others who are thinking of venturing down this path can learn from my experiences.

Firstly, my motivation in learning to code resulted in me being a little unfocused; I was unemployed and seeing a lot of people around me getting hired for cushy tech jobs with great salaries. Desperation shouldn’t be your only motivation for trying something. I was also unaware of the vast variety of “coding” jobs out there, and they can be quite different; CSS is considered “coding”, but it is vastly different from doing software development in C++, for instance.

I’m all for teaching computer science principles at the grade school level; the basics of control flow are not terribly complicated – it’s just logic, after all. Understanding loops, recursion, and conditionals does not require a very advanced background in any other subject, and knowing these will take you very far later on in life. I’m a strong believer that everyone should learn to code, given the increasing automation that is threatening all industries today. Those who can code will be the last people to have their jobs automated out of existence, pure and simple.

All this being said, I started my journey down this rabbit hole with Codecademy. I highly recommend this for others who also have no formal background in programming/CS, as it eases you into the relevant concepts of the language of your choice. It’s a great place to learn the higher-level languages (such as JavaScript, Python, and Ruby), but keep in mind that the courses are introductory, and very short (they can be completed in a few hours). They’re designed to give you just enough knowledge so that you can go out and keep learning on your own or from other sources.

After Codecademy, my next stop was FreeCodeCamp. FreeCodeCamp is amazing, and I hope it goes from strength to strength over time. It is the brainchild of Quincy Larson, and it attempts to create a fairly rigorous curriculum in Full-Stack Web Development starting from scratch; as with Codecademy, no prior knowledge is required. The first lesson is literally “Hello World!”. It starts off with comprehensive coverage of the front-end (website building with HTML and CSS), and also covers responsive design using Twitter’s Bootstrap framework. It then progresses into jQuery and vanilla JavaScript, and it also has you work through some pretty difficult algorithm challenges, which reinforce your understanding of the methods and properties in JavaScript. The bonus with FreeCodeCamp is that it also has you working on projects, which can be incorporated into a portfolio so that you have something to show to prospective employers.

Web Development has the lowest barrier to entry among all the different types of programming, and that’s why places like FreeCodeCamp thrive. It was after doing it for a while that I realized webdev wasn’t for me, however; I don’t have the patience to mess around with DOM elements and get that alignment juuuuust right; if I really had to choose, I would be more comfortable doing back-end stuff.

I continued working on JavaScript and FreeCodeCamp while applying to programming bootcamps in April-May 2016, and eventually ended up taking a “Data Science” bootcamp by Logit in Hollywood. I wrote about it earlier, so there’s no need to reiterate what’s already been said. I felt like “Data Science” would be the best fit, given what I had experienced with programming thus far, and also (naively) thought it would give me the best ability to leverage my PhD.

I used the word “naively” in the previous paragraph; here’s what I learned:

  1. Getting a job after a bootcamp is all about how strong your resume is prior to the bootcamp. Now, that may not seem fair, as people want to go to bootcamps to “reset” their careers and get a fresh start, but the reality is that you really can’t learn much in just 12 weeks. And now that bootcamps are getting more popular, employers are looking for other ways to distinguish you from the hundreds or thousands of other people who are also taking bootcamps. Sure, you took a JavaScript bootcamp, but what else stands out? Do you have an advanced degree (MS/PhD) or did you go to a top university (Harvard/Stanford/MIT/Caltech/CMU etc.)? Do you have relevant prior work experience?
  2. In “Data Science”, degrees in CS, math, statistics, computational fields (e.g. computational biology), biostatistics, or physics are extremely sexy. If you have one, flaunt it as much as you can! Any other degree (including my PhD in Organic Chemistry, as I discovered) is worthless in this context. That’s because “Data Science” is a poorly defined field and a lot of employers still don’t know what they want. If you look at job descriptions, most will require knowledge of a scripting language (R or Python), Java, a lower-level language (C or C++), thorough understanding of SQL, and Bash scripting (on Linux). These are not things you can pick up in a few weeks at a bootcamp.
  3. The “Data Science” market is cooling off right now. A few years ago, there was a massive hype surrounding “Data Science”, and there were numerous articles talking about how there was a critical shortage of “Data Scientists” in the country. My experiences have shown the opposite, however – it took one of my friends in my cohort (who has a PhD in physics, one of the “sexy” subjects I mentioned above), about 4 months to land a job after the bootcamp.

So – what useful, actionable advice can I give after all this? What I can say is that if you want to learn “Data Science”, all the material is available online for free. The advantage with a bootcamp is that it gives you a roadmap of what to study, as well as connections – to your classmates, instructors, and other people who the organization is affiliated with. Out of all the courses I’ve seen and taken online regarding “Data Science”, this progression is probably the best, and most logical (feel free to leave comments if you have other suggestions):

  1. Start with Codecademy if you have 0 programming experience. If you want to get into Web Development, complete the JavaScript, HTML, CSS and related tracks, and then dive right into FreeCodeCamp. Otherwise, if you think you may want to do Data Science or want to have a broader understanding of CS fundamentals, stick with Python.
    N. B. Something to keep in mind: if you have no prior experience with programming, don’t worry about R. R is a specialized language for statistics; it is written by statisticians for statisticians, and the syntax is very challenging even for experienced programmers.
  2. Once you’ve completed Codecademy, the next course I would take is MIT’s 6.00.1x Intro CS course on EdX. I have taken this course myself and I have written about it. This course gives you a fantastic intro to the fundamentals of computer science at a fairly rigorous academic level, and it uses Python as well, so that should give you more practice with programming in vanilla Python. The follow-up course 6.00.2x is also good and covers more advanced topics including algorithms, random walks, and other topics, which should put you in a good position to learn more about “Data Science”.
  3. HarvardX’s PH526x course on EdX is a good follow-up to this sequence, since it introduces a lot of the popular Python packages for “Data Science” including numpy, matplotlib, Pandas, and others. I also just finished the course earlier this week and will put my thoughts on it in a separate post here.
  4. Microsoft DAT210x on EdX is also highly recommended, and I also wrote about it after completing the course. This course gives plenty of practice with machine learning, and will put you in a good position to learn more about any of the algorithms in the course (K-Means, KNN, SVM, Random Forest, and others).

So – after taking all of these courses, THEN you can think about joining a bootcamp to further your knowledge. I wish I had done all the above courses before I did the “Data Science” bootcamp this summer; I would have been in a better position to learn, absorb, and assimilate the material. But what’s done is done, and I’m continuing to learn Python, Machine Learning, and “Data Science” concepts at my own pace. I’m continuing to practice vanilla Python on HackerRank, and you can follow my progress on my GitHub – I’m trying to make GitHub commits on a regular basis so that it makes a favorable impression on whoever happens to stumble across it! Interestingly, some of my repositories are getting a fair bit of traffic… so, you never know!

I sincerely hope that this rather “stream-of-consciousness” post helps you, if you do decide to venture down this path!

 

October 31, 2016

Review of MITx 6.00.1x

Filed under: Coding, education — sankirnam @ 12:53 pm

I just finished the above course – I completed the last problem on the final exam and the exit survey a few minutes ago, so I figured I would write my thoughts on the course while they’re still fresh.

My impressions of the course are uniformly positive. I took the current iteration of the course (Aug – Nov 2016), and I found it to be excellent. I just finished writing an email to Prof. Grimson (the professor conducting the course), thanking him for all his efforts in preparing such high-quality materials!

Keep in mind that the title of the course is “Introduction to Computer Science and Programming using Python”, and so it is aimed to be an intro CS course of sorts. Nonetheless, it does serve as a very good introduction to the Python language, and covers fundamental CS concepts while teaching the Python language, including the various data structures (lists, tuples, and dictionaries), functions, and classes. The course isn’t intended to teach Python specifically, and so doesn’t cover a lot of the things unique to Python (such as lambda functions, list comprehensions, and other topics).

In retrospect, I wish I had taken this course before taking the “Data Science” bootcamp this summer – I would have been better prepared and would have had at least a rudimentary understanding of the CS fundamentals. Anyway, what’s done is done, and I’m glad that I was able to take this course.

The problem sets were very well crafted. They were appropriately challenging: I probably did spend around the recommended 15 hours/week on them, but they were never so difficult that I ended up throwing my computer out the window and quitting in frustration. The bonus is that I also learned how to use my computer better – since this course uses Python 3, I used Anaconda to install it (so that I could manage it alongside my existing Python 2 install). I also ended up using Spyder as my IDE of choice for the course, and I’ve come to like that a lot.

As always, if you want to take a look at the problem sets, exercises, and my solutions, I’m posting everything to my github.

Proof of completion (I blacked out my username and email to dodge spambots): [image: mitx6001x]

Anyway, onwards to the sequel course, 6.00.2x! I started this course and it’s proving to be MUCH tougher, since the barrier is no longer the Python language, but abstractly developing algorithms before implementing them in Python.

May 17, 2016

On learning to code

Filed under: Coding, Data Science, education — sankirnam @ 11:05 am

Last week, the following article was published in TechCrunch: Please don’t learn to Code. This was swiftly followed by Quincy Larson’s reply, Please do learn to code.

For those who don’t know, Quincy Larson is the founder and director of FreeCodeCamp, an online programming education website that is disrupting the traditional paradigm of teaching programming/CS. I’m going through it myself, and highly recommend it for anyone who wants to learn programming – the front-end web development curriculum is very well done, and it walks you through HTML, CSS (including responsive design with Bootstrap), jQuery, and JavaScript. Even if you do not necessarily want to go into webdev, this is a good place to start; it has you make projects to really cement your knowledge. Until I did this program, I had no idea how to make a website from scratch with HTML and CSS!

In any case, with regards to the articles I linked at the beginning, I am siding with Quincy Larson on the issue. Computers and digital devices are ubiquitous in our lives nowadays, and we spend at least 5 hours a day (a very conservative estimate) interacting with computers, whether it is in the form of desktop computers, servers, laptops, tablets, or mobile smartphones. Knowing how to use these devices is one thing, but that is the bare minimum; if you want to be truly productive in today’s society, you need to be able to get these devices to work for you, and that is where a knowledge of programming comes into the picture. In addition, with the rise of machine learning and increased automation, we’re beginning to see an increased number of jobs that were traditionally done by humans now being done by computers. This automation is beginning to seep into areas that are considered “high-skill”, such as organic synthesis. Thus, it’s like I say nowadays:

You don’t want to lose your job because someone else automates your position, right? You would rather be in a position where you automate someone else’s job. The only way to ensure that you are in the latter position is to learn programming/computer science.

The beauty of the field of programming/computer science is that it is extremely egalitarian, compared to other fields. In the programming arena, people care only about what you’ve done, what you’ve accomplished, and whether you know your stuff or not; educational pedigree is largely irrelevant. Contrast this to a field like organic chemistry, where if you do not have a degree from MIT/Caltech/Harvard/Stanford/Berkeley your resume will be swiftly thrown in the trash. This is why, in CS, it is now accepted that a GitHub profile is the new resume.

In other news, I have been applying to bootcamps for the last few weeks, in order to have something to do this summer given that the job situation in organic chemistry continues to remain abysmal. I know I have been scornful of bootcamps and “data science” in the past, but my reason for applying to these places is simple. I could learn the material on my own for free (or a significantly reduced cost), but it would take a long time – at least a year or two. If I can accelerate the process and learn everything in 12 weeks, then it is worth the extra cash, and after all, time is the most valuable asset we have in our lives. This video explains it pretty well:

After interviewing at several places, I was accepted to Codesmith, Logit Data Science, and Dev Bootcamp. I’ve decided to go with Logit Data Science simply because it makes more sense given my background; going into full-stack web development is orthogonal to my past education. There are pros and cons to all decisions; Logit is cheaper, but I’m going to be in the first cohort, so it remains to be seen how good the program is going to be. Also, given that my CS, math, and statistics backgrounds are very minimal, I’m anticipating that this is going to be extremely challenging. But sometimes, succeeding in life is all about risks and taking that first leap of faith! Codesmith is a little better established; they’ve been around for a year. I visited their campus/office a couple of weeks ago in Playa Vista, and was very impressed. The atmosphere is quite relaxed, but I did feel the “work hard, play hard” spirit there. The CEO, Will Sentance, is one of the main instructors there, and his teaching style is absolutely fantastic. He explains all the concepts thoroughly and clearly, and his enthusiasm for the subject is infectious. If you’re considering joining a full-stack bootcamp, I highly recommend Codesmith – do check them out! They are up there with Hack Reactor in terms of quality of instruction and overall experience.

April 20, 2016

ok now, this is getting a little ridiculous

Filed under: Coding, Data Science — sankirnam @ 11:26 pm

As part of my job search (which has been ongoing for the last year and a half now), I’m applying to several programming and “Data Science” bootcamps. I have posted my thoughts about “Data Science” before, but it seems the juggernaut is nigh unstoppable. During this process, I have experienced a multitude of things that I need to get down.

First off, I want to get a satisfactory answer to this question: If people with just 12 weeks of education can compete for the same jobs as computer science graduates from a university, does it mean that a CS degree is not really worth that much? On the flip side, the relative value of these skills is still pretty high – you can study chemistry for 10+ years, get a PhD, and end up unemployed (as in my case), or you can go through a bootcamp and code JavaScript and look forward to jobs with a minimum starting salary of $105,000 (so CS >>>>>>>>>>>>> chemistry, every time).

I have also heard that there are an astonishingly high number of CS graduates, even those with advanced degrees, who cannot do simple programming exercises like the “FizzBuzz” challenge or simple algorithms. So perhaps there are a large number of mediocre CS students who are getting through the university system and are unable to pass job interviews or fulfill job requirements. In chemistry, this would be like studying organic chemistry on paper but having trouble going into the lab and doing synthesis (or if you’re a theoretician, not being able to input and optimize a model system in a program like Gaussian or Spartan properly, and draw reasonable conclusions).

The other thing that I have been told by a lot of people who studied computer science formally and are now practicing computer scientists (or programmers) is that “computer science ≠ programming”. While this may be obvious to those in the field, it is not obvious to those outside, such as myself; for a long time, I was laboring under the illusion that they were the same thing. Pure computer science is more akin to math or logic, and one spends a lot of time learning about abstract concepts such as Data Structures, and it is implied that students should be able to pick up programming skills along the way. The current rise of bootcamps and websites such as FreeCodeCamp and Codecademy has decoupled a “pure” CS education from that of programming; these programs get you coding first, usually with HTML, CSS, and JavaScript, without worrying about the underlying logic or science behind the code. Interestingly enough, when I asked interviewers at bootcamps about this (whether bootcamp graduates with a shallow theoretical CS education could compete with regular CS grads for programming jobs), they mentioned that bootcamp graduates were often competitive, simply because of their ability to code better and faster.

The analogous situation in chemistry would be decoupling experimental and theoretical chemistry – e.g. doing organic synthesis without knowing anything about the theory. Is this possible? We’ll never know, because I don’t think there will ever come a time where the demand for synthetic chemists will jump that high, to obscene levels beyond the ability of universities to produce sufficient graduates. At the same time, safety is the big consideration when comparing computer science and chemistry. If you screw up in CS, nobody will get hurt, but if you screw up in the chemistry lab, the outcome can range from nothing (if you’re lucky) to killing yourself (if you’re not careful). But from an educational perspective, is it possible to teach “applied chemistry” in order to reach the masses, the same way websites like Codecademy, FreeCodeCamp, and Code School have revolutionized programming education to make it more egalitarian? Chemical concepts like equilibrium, reaction kinetics, etc. can be dry and theoretical; can you teach chemistry in a way that makes it more understandable to the masses, but at the same time maintain the “tactility” required to really understand the subject that can only be achieved through lab work? This is a challenge for the next generation of instructors, and one that we as chemists all must face as we strive to prove to upcoming generations that our subject is relevant!

In any case, back to the subject of bootcamps. One of my friends mentioned earlier today:

“honestly you becoming a vanilla webdev is a waste of your talents and training
a lot of people can do that job
not many people can do research in organic chemistry”

Formatting is messed up because I copy-pasted this from a google chat. This friend does bring up a valid point though; why am I trying to go into CS? I have addressed this before, but I still have inner conflicts where I feel like I should keep trying for a job in chemistry (due to the sunk cost fallacy). In any case, this friend is forgiven for not having an accurate knowledge of the chemistry job market – that last statement is completely inaccurate, as there is a massive glut of people who can do research in organic chemistry.

But the sudden rise of bootcamps has got me thinking – is this indicative of another bubble? There are so many coding bootcamps now all over the US, and “Data Science” bootcamps are also springing up all over the place. BTW, the next person who tells me “with a PhD in science, you should think about going into ‘data science’!” is going to get a kick in a very sensitive place. Unfortunately, as I have learned, organic chemistry is not a “quantitative” discipline, and I have been rejected from The Data Incubator, Metis, and Insight for not having the correct background. Also, the programming background required for “data science” is rather steep, and it is not something that can be easily picked up if you don’t have prior training in CS or programming, which is why I’m looking into “vanilla webdev” bootcamps, as the entry requirements are easier for me to meet with my limited coding background.

As to the title of this post, today I came across this.

I have NO idea what to make of this – it’s a prep course to help you get into a bootcamp (o_O). This is like what goes on in India today – you have prep courses to help you get into prep courses for the IIT JEE entrance exam. This has me completely flummoxed, and is another indicator of how the demand for programmers is far exceeding the supply – App Academy (the company running the prep course) is simply cashing in on this trend. Is this indicative of another imminent bubble? One can’t predict the future, but it certainly does seem that way…

August 3, 2015

learning to code…FizzBuzz

Filed under: Coding — Tags: — sankirnam @ 9:51 am

I’m back from a bit of a hiatus… while I’m still not gainfully employed, I’ve been keeping myself busy with a variety of things. Lately, I’ve been learning to code on Codecademy – I highly recommend this website for other beginners like myself since it is interactive and the lessons are planned out very well, with a gradual introduction of new concepts and periodic refreshers and reviews where necessary.

I’ve been doing the JavaScript lessons on Codecademy, and along the way I had to do the famous FizzBuzz exercise. For those who don’t know:

“[…] questions I call “FizzBuzz Questions” named after a game children often play (or are made to play) in schools in the UK. An example of a Fizz-Buzz question is the following:

Write a program that prints the numbers from 1 to 100. But for multiples of three print “Fizz” instead of the number and for the multiples of five print “Buzz”. For numbers which are multiples of both three and five print “FizzBuzz”.

Sounds pretty trivial on the surface, right? I mean, anyone can do this on pen and paper, but it takes a little bit of programming knowledge to write code that accomplishes this. The scary part?

The majority of comp sci graduates can’t. I’ve also seen self-proclaimed senior programmers take more than 10-15 minutes to write a solution.” (Source)

Wow. And these people will still be able to get jobs that pay salaries far, far beyond what competent PhD chemists make, due to the robust job growth and demand for computer scientists/programmers.

In any case, here’s my solution (for numbers 1-20):

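// Print FizzBuzz for the numbers 1 through 20.
for (let i = 1; i <= 20; i++) {
  if (i % 3 === 0 && i % 5 === 0) {
    // Check multiples of both 3 and 5 first, so that 15 prints "FizzBuzz".
    console.log("FizzBuzz");
  } else if (i % 5 === 0) {
    console.log("Buzz");
  } else if (i % 3 === 0) {
    console.log("Fizz");
  } else {
    // Any other number is printed as-is.
    console.log(i);
  }
}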
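
As a follow-up, here is a minimal sketch of the same logic wrapped in a function, so that it can cover the full 1 to 100 range from the question. The name fizzBuzz and its limit parameter are hypothetical choices for illustration, and this version builds each line by string concatenation rather than the if/else chain above:

```javascript
// Hypothetical helper (an illustration, not part of the Codecademy exercise):
// returns the FizzBuzz lines for 1..limit as an array of strings.
function fizzBuzz(limit) {
  const lines = [];
  for (let i = 1; i <= limit; i++) {
    let line = "";
    if (i % 3 === 0) line += "Fizz"; // multiples of three
    if (i % 5 === 0) line += "Buzz"; // multiples of five (15 gets both)
    // An empty string means i is a multiple of neither, so use the number.
    lines.push(line || String(i));
  }
  return lines;
}

// Print the full range that the original question asks for.
console.log(fizzBuzz(100).join("\n"));
```

Returning an array instead of logging inside the loop makes the function easy to check against a few known values (3, 5, 15) before printing anything.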
