Shortest Path to Productivity, Google Colaboratory

Efficiency, productivity, and collaboration are critical in scaling up machine learning. Being a machine learning practitioner also means doing a significant amount of devops and systems integration. Enter Google Colaboratory. Its stated aim is to "help disseminate machine learning education and research" using a Jupyter notebook environment that runs entirely in the cloud and is integrated with your Google Drive.

With the release of Colaboratory, machine learning education and project sharing have become as simple as sharing a document in Google Docs.

This post highlights Colaboratory's benefits to the machine learning community and how Launchpad.AI is now using the platform to help big enterprises efficiently build their own teams of data scientists, similar to our own Fellowship program.

Working Together

Colaboratory's main highlight is right there in its name: collaboration. On top of the already popular Jupyter notebook environment, Google has added sharing with instantaneous updates, which makes teams more efficient and gives a huge boost to the pair programming model. As a fellow at Launchpad.AI, I have found that open collaboration and pair programming are at the core of what gives us fast and reliable results in a real-world industry setting. Using Colaboratory, we could work together at any time and from any location!

Because everything runs in the cloud, a whole team can work from a single notebook, with no need for each member to set anything up locally (including installing libraries or loading large files). Google has also introduced the option of a GPU runtime. Each instance is given a K80 GPU for free, for up to 12 hours at a time per VM. To check whether a GPU is attached, run the following code:

import tensorflow as tf
tf.test.gpu_device_name()

If the GPU is enabled, you should see the output `/device:GPU:0`; otherwise the output is an empty string (`''`).

To enable the GPU:

Runtime > Change runtime type > Hardware Accelerator > GPU (from list) > Save
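
If you would rather get a readable message than a bare device string, the check above can be wrapped in a small helper. This is only a sketch: `describe_gpu` is a hypothetical name used for illustration, and the `ImportError` fallback assumes the cell might also be run somewhere TensorFlow is not installed.

```python
def describe_gpu(device_name):
    """Turn the string returned by tf.test.gpu_device_name() into a message."""
    if device_name:
        return 'GPU found at {}'.format(device_name)
    # An empty string means no GPU is attached to this runtime.
    return 'No GPU found; check Runtime > Change runtime type'


try:
    import tensorflow as tf
    gpu_message = describe_gpu(tf.test.gpu_device_name())
except ImportError:
    # Assumption: outside Colaboratory, TensorFlow may be missing entirely.
    gpu_message = 'TensorFlow is not installed'

print(gpu_message)
```

Putting a cell like this at the top of a shared notebook lets everyone see at a glance whether the runtime they connected to actually has a GPU.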

To verify the version of Python being used (Google Colaboratory only supports Python 2.7 and 3.6 at this time):

import sys
# str.format is used instead of an f-string so the check also runs on Python 2.7
print('Python {}.{}'.format(sys.version_info[0], sys.version_info[1]))

More quick tips for verifying specs can be found here.
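
If a notebook depends on one of the two Python versions mentioned above, the version check can be turned into a guard. The following is a minimal sketch; `SUPPORTED_VERSIONS` and `is_supported` are illustrative names, and the version list simply mirrors the one given in this post.

```python
import sys

# The versions this post lists for Colaboratory (an assumption that may age).
SUPPORTED_VERSIONS = {(2, 7), (3, 6)}


def is_supported(version_info=None):
    """Return True if the (major, minor) pair is in SUPPORTED_VERSIONS."""
    if version_info is None:
        version_info = sys.version_info
    return (version_info[0], version_info[1]) in SUPPORTED_VERSIONS


print('Python {}.{} supported: {}'.format(
    sys.version_info[0], sys.version_info[1], is_supported()))
```

A presenter could run this in the first cell and stop early if the runtime is not the one the lecture was tested on.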

Machine Learning Education

 

As mentioned, Colaboratory has served as a collaboration tool for the fellowship program. It also works well as a teaching tool, enabling our cohort to discuss implementations and go over statistical and deep learning models together. An example of this can be found here. I used Google Colaboratory to create a notebook lecture adapted from the Amazon SageMaker tutorial for Time Series Regression. The notebook was shared amongst the cohort, and we were all able to go through the implementation in real time without any confusion from missing libraries or files. With the entire lecture centralized in one notebook, all that each fellow needed was access to Colaboratory. In addition to creating a notebook within Colaboratory, you can also import any Jupyter notebook and run or share it as needed.

Highlights

  • Free to use
  • Up to 13GB RAM available
  • Free K80 GPU available for all users for 12 hours at a time
  • Idle VMs time out after 90 minutes
  • Recommended browsers: Chrome, Firefox
  • Python 2.7 or 3.6
  • Changes are visible instantaneously (similar to Docs or Sheets) when viewing or editing the same instance
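
The RAM figure in the list above can be checked from inside a notebook. The sketch below assumes the VM is a Linux machine that exposes `/proc/meminfo`; `total_ram_gb` is a hypothetical helper, and the number it reports is total memory, not the amount currently free.

```python
import os


def total_ram_gb(meminfo_text):
    """Parse the MemTotal line (reported in kB) and return the value in GB."""
    for line in meminfo_text.splitlines():
        if line.startswith('MemTotal:'):
            kilobytes = int(line.split()[1])
            return kilobytes / (1024.0 ** 2)
    return None  # no MemTotal line found


# Assumption: /proc/meminfo exists, as it does on typical Linux VMs.
if os.path.exists('/proc/meminfo'):
    with open('/proc/meminfo') as f:
        ram = total_ram_gb(f.read())
    if ram is not None:
        print('Total RAM: {:.1f} GB'.format(ram))
```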

Installations need to be done for every new VM connection. To ensure that each notebook runs properly, it is advisable to include the cells that install and load any required libraries or files, so they can be run whenever needed. Use !pip install, for example:

!pip install xgboost
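
Since the install cell has to be re-run on every new VM, it can help to skip the install when the library is already importable. The following is one possible sketch, not the only way to do it: `ensure_installed` is a hypothetical helper, and it assumes `pip` is available for the interpreter running the notebook.

```python
import importlib
import subprocess
import sys


def ensure_installed(module_name, pip_name=None):
    """Install a package with pip only if its module cannot be imported.

    Returns True if an install was attempted, False if the module was
    already available.
    """
    try:
        importlib.import_module(module_name)
        return False
    except ImportError:
        # Assumption: pip is available for the current interpreter.
        subprocess.check_call(
            [sys.executable, '-m', 'pip', 'install', pip_name or module_name])
        return True


# A standard-library module is used here so the demo never hits the network.
print(ensure_installed('json'))
# In a real notebook the call would look like: ensure_installed('xgboost')
```

The optional `pip_name` argument covers packages whose install name differs from their import name.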

Benefits

  1. Sharing one notebook saves time, since it doesn't need to run locally on every machine
  2. The presenter has all information, files, and libraries ready and tested
  3. No latency issues, missing information, or improper installations during a lesson
  4. Those in viewing mode get real-time updates on any changes being made, which can be a useful tool for demonstrating certain intuitions
  5. Information can be presented clearly with both text and visuals to keep users engaged and limit confusion
  6. In sharing mode, viewers can also leave comments that serve as questions or recommendations for the presenter without losing any time
Ryan Gaspar