Machine Learning: 6 hours progress

17 March 2016

For the CZ4041 Machine Learning assignment, my group chose a fairly difficult data science challenge from Kaggle. I've been approaching it with an artificial neural network (a multi-layer perceptron, to be exact), and I've switched from the PyBrain library to Keras. The raw training data is 10 GB, and after pre-processing it grew to 26 GB. Phew, luckily I have lots of space on my hard drive. Now the challenge is how to fit that data into my 12 GB of RAM. When I was using PyBrain, I naively loaded the whole 10 GB of raw data into the program and my computer froze, like in the good old days. Keras is harder to understand since there are many more settings to configure, but I've been copy-pasting code and doing trial and error on a smaller dataset before attempting the full run. So far, after 6 hours, it seems quite promising. Assuming it keeps the same speed, there are 30+ hours still to go. Hopefully it works.
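For the RAM problem, here is a minimal sketch of the general idea of reading the data in small batches instead of loading everything at once. It is not the exact code from the assignment. It assumes the pre-processed data is a headerless CSV of numbers, and the function name `iter_batches` and the file layout are made up for illustration.

```python
import csv
from itertools import islice


def iter_batches(path, batch_size):
    """Yield lists of rows (each row a list of floats), at most batch_size rows at a time.

    Only one batch is held in memory at any moment, so the file can be
    much larger than the available RAM.
    """
    with open(path, newline="") as f:
        reader = csv.reader(f)
        while True:
            # islice pulls the next batch_size rows without reading the rest of the file
            batch = [[float(v) for v in row] for row in islice(reader, batch_size)]
            if not batch:
                break  # end of file reached
            yield batch
```

Keras can train from a Python generator, so something along these lines could feed the network batch by batch, though a generator used for training usually needs to loop over the file repeatedly rather than stop after one pass.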
