The Single Biggest Reason Why AI/ML Companies Fail to Scale?

Category: IT Technology · Published: 4 years ago


And Three Things You Can Do to Avoid It Happening to You.


source: Long Ma on Unsplash

“What’s the accuracy of this machine learning (ML) model?”

“How long is the training time?”

“How much training data do you need?”

Working for a company that builds machine learning software for robotics, I hear these questions every day. Machine learning has become a shiny object that everyone wants to pursue. Over 80% of companies are looking into at least one AI project.

Users generally want to know how long it would take to onboard a new item and how well the models perform or generalize. They want a way to measure the overall cost against performance. However, answers to the above questions don’t give you a full picture. Even worse, they are misleading.

Model training is only the tip of the iceberg. What most users and AI/ML companies overlook is the massive hidden cost of acquiring appropriate datasets; cleaning, storing, aggregating, and labeling them; and building reliable data flows and infrastructure pipelines.

Based on recent research, companies spend more than 80% of their time on AI/ML projects on data preparation and engineering tasks. In other words, if you pay most of your attention to building and training models, your total engineering effort and cost could be five times your initial estimate.
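The arithmetic behind that "five times" figure can be sketched in a few lines. This is a rough illustration, not a costing model: it assumes the initial estimate covers only model building and training, and that data work takes exactly 80% of the total effort. The function name `total_effort` and the 100-hour figure are made up for the example.

```python
def total_effort(model_effort_hours: float, data_share: float = 0.8) -> float:
    """Scale a model-only estimate up to the whole project.

    Assumes model work is the remaining (1 - data_share) of total effort.
    """
    if not 0 <= data_share < 1:
        raise ValueError("data_share must be in [0, 1)")
    return model_effort_hours / (1 - data_share)


# Hypothetical estimate: 100 engineer-hours budgeted for model work only.
estimate = 100.0
total = total_effort(estimate)
print(f"total effort: {round(total, 1)} hours")
print(f"multiplier:   {round(total / estimate, 1)}x")
```

If the data share is higher than 80%, as the research suggests it often is, the multiplier grows quickly: at 90% it would be ten times the model-only estimate.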

Furthermore, machine learning blurs the line between users and software providers.

We have begun to see the emergence of AI as a service (AIaaS) and machine learning as a service (MLaaS). Models in the cloud continue to improve with more data. This is the biggest benefit of machine learning, but it is also what makes MLaaS a more challenging business than SaaS.

Models learn from training data. Without quality data, models won’t perform well. Users don’t always know the best practices for generating or annotating appropriate datasets. In fact, most of the time, they don’t.

When systems underperform, users tend to blame models. Therefore, AI/ML companies generally spend significant time and resources training and working with users to ensure data quality, which becomes a shared responsibility between AI companies and their customers.

For example, to train a model for defect inspection on the production line, a computer vision company needs to work with its client to install cameras at the right angle and position, check the resolution and frame rate, and make sure that there are enough positive and negative training samples for every scenario.
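The last check in that example, enough positive and negative samples for every scenario, is one of the easier ones to automate. Below is a minimal sketch, assuming each sample has already been tagged with a scenario name and a "positive" or "negative" label. The function name, the scenario names, and the threshold are all hypothetical.

```python
from collections import Counter


def find_underrepresented(samples, min_per_class=50):
    """Return {(scenario, label): count} for every combination below the threshold.

    `samples` is an iterable of (scenario, label) pairs, where label is
    "positive" or "negative". Missing combinations are reported with count 0.
    """
    counts = Counter(samples)
    scenarios = {scenario for scenario, _ in counts}
    gaps = {}
    for scenario in sorted(scenarios):
        for label in ("positive", "negative"):
            n = counts[(scenario, label)]  # Counter yields 0 for unseen keys
            if n < min_per_class:
                gaps[(scenario, label)] = n
    return gaps


# Tiny made-up dataset: "dent" has no negative samples at all.
samples = (
    [("scratch", "positive")] * 3
    + [("scratch", "negative")] * 1
    + [("dent", "positive")] * 2
)
for (scenario, label), n in find_underrepresented(samples, min_per_class=2).items():
    print(f"{scenario}: only {n} {label} sample(s)")
```

A report like this can be sent back to the customer before any training starts, which is cheaper than discovering the gap after a model underperforms.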

Data collection is even more time-consuming and costly with robotics or self-driving car applications because someone needs to control the robots or vehicles to perform certain actions.

Even with all the training sessions, user manuals, and guidelines, you still do not have full control over user-generated data. A machine vision camera company told me that they have engineers manually verify 100% of the incoming data to ensure data integrity.

All of this additional and often overlooked work (user training, manual checking, data cleaning, and labeling) adds significant overhead for AI companies, making MLaaS a more challenging business with a lower gross margin.

To build a more scalable AI/ML business, there are three things we can do:

Scalability is King. Identify the right use cases: ones that a significant number of customers would pay you for and that you can solve with the same model architecture. The last thing you want to do is to build and train different models for different companies without a standard product offering.

Make as many things self-serve as possible. Automate your training and data pipeline as much as possible to increase operational efficiency and reduce dependency on manual labor. Companies tend to prioritize customer-visible features over internal tools or automation, but the latter pays off quickly. Make sure that you allocate enough resources for internal process automation.

Lastly, identify and track your costs, especially the hidden ones. How much time do your engineers spend cleaning, filtering, or aggregating data? How much time do they spend ensuring annotation is done correctly by a third party? How often do they need to help customers set up their environments and gather data correctly? How much of this can be automated or outsourced?
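One lightweight way to start answering those questions is to tally logged engineering hours by category and see what share falls outside model work. The sketch below assumes engineers log time as (category, hours) entries. The category names and the numbers are invented for illustration, not real data.

```python
from collections import defaultdict


def hidden_cost_share(time_log, model_categories=frozenset({"modeling"})):
    """Sum hours per category and return (totals, share of non-model hours).

    `time_log` is an iterable of (category, hours) pairs. Every category
    not listed in `model_categories` is counted as hidden cost.
    """
    totals = defaultdict(float)
    for category, hours in time_log:
        totals[category] += hours
    total = sum(totals.values())
    if total == 0:
        return dict(totals), 0.0
    hidden = sum(h for c, h in totals.items() if c not in model_categories)
    return dict(totals), hidden / total


# Hypothetical week of logged hours for one customer project.
log = [
    ("modeling", 10),
    ("data_cleaning", 20),
    ("annotation_review", 5),
    ("customer_setup", 15),
]
totals, share = hidden_cost_share(log)
for category, hours in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{category:<18} {hours:>5.1f} h")
print(f"hidden share: {share:.0%}")
```

Tracked per customer over time, the largest categories point to what is most worth automating or outsourcing first.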

