Cloud ML compute vendors
August 23, 2016 — March 29, 2022
Cloud, n.
A place of terror and dismay, a mysterious digital onslaught, into which we all quietly moved.
A fictitious place where dreams are stored. Once believed to be free and nebulous, now colonized and managed by monsters. See ‘Castle in the Air’. […]
[…] other peoples’ computers
via Bryan Alexander’s Devil’s Dictionary of educational computing
See also Julia Carrie Wong and Matthew Cantor’s devil’s dictionary of Silicon Valley:
cloud, the (n) — Servers. A way to keep more of your data off your computer and in the hands of big tech, where it can be monetized in ways you don’t understand but may have agreed to when you clicked on the Terms of Service. Usually located in a city or town whose elected officials exchanged tens of millions of dollars in tax breaks for seven full-time security guard jobs.
If I want a GPU this all becomes incredibly tedious. Anyway…
General considerations: see “Is a billion-dollar worth of server lying on the ground?” tl;dr: Amazon is weirdly expensive. Why not go with, e.g., OVH? Is it strictly because they have fewer HOWTOs?
My local HPC is not a cloud provider in the modern sense but it can pretend sometimes.
Vast.ai allows everyone else who overinvested in buying GPUs during the bitcoin boom to sell their excess GPU time to make back the cash. Looks fiddly to use but also cheap.
Floydhub do deep-learning-oriented cloud stuff and come with an easy CLI.
Microsoft Azure — haven’t really tried it but presumably Microsoft are good at the computers?
Amazon has stepped up the ease of doing compute recently. It’s still over-engineered for people who aren’t building the next Instagram or whatever, and it is expensive to get your data out. See my Amazon Cloud notes.
Google Cloud might interoperate well with a bunch of Google products, such as TensorFlow, although it has weirdnesses like relying on esoteric Google APIs, so it is hard to prototype offline or with awful internet. See my Google Cloud notes.
IBM has a cloud offering (documentation) but now that my brother has left the company and can’t get me sweet inside deals, I can’t be bothered.
Turi is also in this business, I think? I’ve gotten confused by all their varied ventures and offerings over many renames and pivots. I’m sure they are perfectly lovely.
Cloudera supplies nodes that even run Python, but AFAICT they cost thousands of bucks per year; very enterprisey. Not grad-student-appropriate.
Databricks, spun off from the esteemed Apache Spark team, does automated Spark deployment. The product looks tasty, but has a savage baseline rate of USD 99/month, which is enough to rule it out for, e.g., grad students.
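To make the “is the flat fee worth it?” question concrete, here is a minimal back-of-envelope sketch. Everything in it is an assumption for illustration: the function names are made up, and the hourly and egress rates are placeholders rather than quotes from any vendor. Only the USD 99/month baseline comes from the notes above.

```python
import math


def monthly_cost(hourly_rate, hours, flat_fee=0.0, egress_gb=0.0, egress_rate=0.0):
    """Total monthly bill: flat subscription + compute time + data egress.

    All arguments are hypothetical inputs you supply yourself (USD, hours, GB).
    """
    return flat_fee + hourly_rate * hours + egress_gb * egress_rate


def break_even_hours(flat_fee, subscribed_rate, pay_as_you_go_rate):
    """Hours per month above which a flat-fee plan with a cheaper hourly
    rate beats plain pay-as-you-go. Returns math.inf if it never does."""
    saving_per_hour = pay_as_you_go_rate - subscribed_rate
    if saving_per_hour <= 0:
        return math.inf
    return flat_fee / saving_per_hour


if __name__ == "__main__":
    # Placeholder rates, NOT real vendor prices.
    flat = 99.0        # the baseline monthly fee mentioned above
    sub_rate = 0.5     # assumed hourly rate on the subscription
    payg_rate = 1.5    # assumed hourly rate without a subscription
    print("break-even hours/month:", break_even_hours(flat, sub_rate, payg_rate))
    print("10 h on subscription:", monthly_cost(sub_rate, 10, flat_fee=flat))
    print("10 h pay-as-you-go:", monthly_cost(payg_rate, 10))
```

With grad-student usage of a handful of GPU hours per month, the flat fee tends to dominate the bill, which is the point of the complaint above. Plug in real numbers from the vendors’ pricing pages before trusting any conclusion.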
1 ML services in particular
Paperspace is a node supplier specialising in GPU/machine learning ease.
Runway.ml describes itself thus:
“RunwayML is a platform for creators of all kinds to use machine learning tools in intuitive ways without any coding experience. Find resources here to start creating with RunwayML quickly.”
Best practices for implementing machine learning on Google Cloud | Cloud Architecture Center
Best Machine Learning as a Service Platforms (MLaaS) That You Want to Check as a Data Scientist
2 RONIN cloud
RONIN (ALLCAPS apparently obligatory) is
an incredibly simplistic web application that allows researchers and scientists to launch complex compute resources within minutes, without the nerding.
It seems to handle provisioning virtual machines in an especially friendly way for ML. It also seems to be frighteningly sparsely documented, especially with regard to one feature that is key for me: how do I design my own machine, with my desired data and code, to actually do a specific thing? Answer: read the RONIN BLOG (allcaps obligatory).
Pricing is mysterious and looks enterprisey, so you and I probably will not benefit from it; my current employers have a subscription though.