Why we built an ML platform for developers—not just data scientists

Category: IT Technology · Published: 6 years ago

Focusing on the people who build products

Mar 4 · 5 min read

Source: Robert Lucian Chiriac

Machine learning has, historically, been the purview of data science teams. This makes it a bit counterintuitive that we built Cortex, our open source ML infrastructure platform, primarily for software engineers.

While on the surface it seems like we chose the wrong user to emphasize, our decision reflects a fundamental shift within the machine learning ecosystem.

The rest of this article will explain this change in more detail, but the short version is that ten years ago, building a product that relied on ML (as opposed to using ML to generate a report) was only feasible for large tech companies. Now, ML has matured to the point where even solo engineers with little data science background can build machine learning applications.

In other words, there is a new group of engineers focused not on fundamental ML research, but on building products with machine learning—and they have a particular set of needs that differ from those of a researcher.

Machine learning now enables products—not just papers

Going all the way back to machine learning’s roots in the 1950s, the field has historically been research-focused, producing things like Arthur Samuel’s checkers-playing AI (1959) or IBM’s chess-playing Deep Blue (1997).

Starting around 2010, there was a renewed interest in deep learning, with major tech companies releasing breakthroughs. Projects like Google Brain, DeepMind, and OpenAI (among others) began publishing new, state-of-the-art results.

These breakthroughs manifested as features in big companies’ products:

  • Netflix’s recommendation engine
  • Gmail’s smart compose
  • Facebook’s facial recognition tags

In addition, this renewed focus on machine learning, and particularly deep learning, led to the creation of better tools and frameworks, like Google’s TensorFlow and Facebook’s PyTorch, as well as open source models and datasets, like OpenAI’s GPT-2 and ImageNet.

With better tools, open source models, and accessible data, it became possible for small teams to train models for production. As a consequence of this democratization, a wave of new products has emerged, all of which at their core are “just” ML models wrapped in software. We refer to these products as ML-native.

The emergence of ML-native software

A lot of the early examples of ML-powered products feature machine learning that improves the user experience, but isn’t necessarily core to the product. You can still write emails without Gmail’s smart compose, or watch YouTube videos without the “Recommended For You” section, for example.

ML-native products are different in that their core functionality is a model making predictions, and we’re seeing them everywhere.

Take computer vision models:

And that’s just computer vision. You could make a similar list for natural language processing models, where startups like AI Dungeon (an AI choose-your-own-adventure game) have used NLP models to create completely interactive experiences.

Source: AI Dungeon

These products rely both on the research of data science teams (though sometimes it’s just an engineer fine-tuning an open source model) and on the design of software engineers.

And designing production software around a model, it turns out, is a speciality of its own.

Production machine learning has unique challenges

To make models accessible to engineers, there needs to be an interface that turns a model into something they can use, like a predict() function that takes an input and returns a prediction from the model.
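As a rough sketch of what such an interface could look like, the example below hides a toy next-word model behind a single predict() function. The BigramModel class, its training text, and the payload format are all assumptions made for illustration; a real product would load a trained model in its place.

```python
from collections import Counter, defaultdict


class BigramModel:
    """Toy stand-in for a trained model: counts which word follows which."""

    def __init__(self, corpus: str) -> None:
        self.following = defaultdict(Counter)
        words = corpus.lower().split()
        for current, nxt in zip(words, words[1:]):
            self.following[current][nxt] += 1

    def next_word(self, word: str):
        """Return the most frequent follower of `word`, or None if unseen."""
        candidates = self.following.get(word.lower())
        if not candidates:
            return None
        return candidates.most_common(1)[0][0]


# Load the (hypothetical) model once at startup, not on every call.
MODEL = BigramModel(
    "thank you for your time thank you for your help thank you very much"
)


def predict(payload: dict) -> dict:
    """Take a JSON-like input and return the model's prediction."""
    words = payload.get("text", "").split()
    if not words:
        return {"prediction": None}
    return {"prediction": MODEL.next_word(words[-1])}
```

With this toy corpus, predict({"text": "thank you"}) returns {"prediction": "for"}. The point of the design is that the caller never touches the model object itself, only a plain function with dictionary input and output.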

One of the most popular design patterns for building this predict() function is realtime inference, in which a model is deployed as a microservice that engineers can query like any other API. For example, a smart compose-esque feature might take a user’s input text, query a prediction API, and return the predicted next word or phrase, like so:

Source: Write With Transformer

And while wrapping a model in a JSON API is fairly straightforward, scaling it is difficult.
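To illustrate the "fairly straightforward" half of that sentence, here is a minimal sketch of a prediction API. It uses Python's standard-library WSGI tooling instead of a framework like Flask, only to stay dependency-free. The /predict route, the port, the placeholder predict() logic, and the post_json() helper are all assumptions for this example.

```python
import io
import json
from wsgiref.simple_server import make_server


def predict(payload: dict) -> dict:
    """Placeholder model: returns a canned continuation for one known input."""
    text = payload.get("text", "").strip().lower()
    return {"prediction": "you" if text.endswith("thank") else None}


def json_response(start_response, status: str, data: dict):
    """Serialize `data` and send it with the given HTTP status."""
    body = json.dumps(data).encode("utf-8")
    start_response(status, [("Content-Type", "application/json"),
                            ("Content-Length", str(len(body)))])
    return [body]


def app(environ, start_response):
    """WSGI application exposing predict() as POST /predict."""
    if (environ.get("REQUEST_METHOD") != "POST"
            or environ.get("PATH_INFO") != "/predict"):
        return json_response(start_response, "404 Not Found",
                             {"error": "not found"})
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
        payload = json.loads(environ["wsgi.input"].read(length) or b"{}")
        if not isinstance(payload, dict):
            raise ValueError("payload must be a JSON object")
    except (ValueError, KeyError):
        return json_response(start_response, "400 Bad Request",
                             {"error": "invalid JSON"})
    return json_response(start_response, "200 OK", predict(payload))


def post_json(path: str, payload):
    """Call the app in-process, the way a server would for a POST request."""
    raw = json.dumps(payload).encode("utf-8")
    environ = {"REQUEST_METHOD": "POST", "PATH_INFO": path,
               "CONTENT_LENGTH": str(len(raw)), "wsgi.input": io.BytesIO(raw)}
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], json.loads(body)


def serve(host: str = "127.0.0.1", port: int = 8080) -> None:
    """Run a single-process development server (blocks until interrupted)."""
    with make_server(host, port, app) as server:
        server.serve_forever()
```

Calling post_json("/predict", {"text": "Thank"}) exercises the whole request path without opening a socket, while serve() starts a local development server. Nothing here addresses scale: a single process like this handles one request at a time, which is where the difficulties described next begin.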

First, the model has to be loaded and queried within a microservice, probably using a framework like Flask. That microservice then needs to be containerized and deployed to the cloud (e.g. a Kubernetes cluster) in order to handle scale. On top of all of that, the cluster needs to be provisioned correctly to handle challenges specific to inference workloads, like:

  • The size of models. GPT-2, OpenAI’s state-of-the-art NLP model, is over 5 GB.
  • The high compute cost of inference. Many models require GPUs to compute a single inference in under a minute.
  • The challenges of concurrency. It’s not uncommon for just a couple of inferences to completely utilize a single instance, meaning instances need to aggressively autoscale to handle traffic.

And that’s without getting into the optimizations required to minimize cost.
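The concurrency point can be made concrete with some back-of-the-envelope arithmetic. The sketch below estimates how many instances a prediction API would need, using Little's law (average requests in flight = arrival rate × latency). The function names, the default replica bounds, and the traffic numbers in the usage note are hypothetical, and this is not a description of how any particular autoscaler works.

```python
import math


def in_flight(requests_per_second: float, seconds_per_inference: float) -> int:
    """Little's law: average concurrent requests = arrival rate x latency."""
    return math.ceil(requests_per_second * seconds_per_inference)


def desired_replicas(in_flight_requests: int,
                     concurrency_per_instance: int,
                     min_replicas: int = 1,
                     max_replicas: int = 20) -> int:
    """Instances needed so that none exceeds its concurrency limit.

    The result is clamped to [min_replicas, max_replicas]; both bounds are
    illustrative defaults, not recommendations.
    """
    if concurrency_per_instance < 1:
        raise ValueError("concurrency_per_instance must be at least 1")
    needed = math.ceil(in_flight_requests / concurrency_per_instance)
    return max(min_replicas, min(max_replicas, needed))
```

For instance, at 10 requests per second and 0.5 seconds per inference, in_flight(10, 0.5) gives 5 concurrent requests. If each instance can only handle 2 inferences at once, desired_replicas(5, 2) gives 3 instances, and every further increase of 4 requests per second adds another one.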

These infrastructure challenges represent the largest remaining bottleneck preventing engineers from building products out of models.

The floodgates are inching open

You almost certainly already use ML-powered software (just look at the most used apps on your phone), but it is still a field dominated by a few massive companies.

Very quickly, however, we are seeing a generation of ML-native startups emerge on the back of improved tooling and frameworks, similar to how progress within web frameworks led to an explosion of web apps in the mid-to-late 2000s.

Infrastructure is one of the last hurdles preventing engineers from building software on top of machine learning. As the level of abstraction around ML infra rises, ML-native software should benefit from the sort of boom we saw from the democratization of web and mobile.

That, in a nutshell, is why we built ML infrastructure for developers—not just data scientists.

