Sentiment Analyzer with BERT (build, tune, deploy)

A brief description of how I developed a sentiment analyzer. It covers text preprocessing, model building and tuning, API and frontend creation, and containerization.

Dataset

I used the dataset published by The Stanford NLP Group. I merged two files: ‘dictionary.txt’, which contains 239,232 text fragments, and ‘sentiment_labels.txt’, which contains the sentiment scores assigned to those fragments.

Text preprocessing with regular expressions

To clean the text, I usually use a bunch of functions built on regular expressions. You can find all of them in common.py; for example, remove_nonwords is sketched below:
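The original snippet is not reproduced here, but a minimal sketch of such a function might look like this (the exact implementation in common.py may differ):

```python
import re

def remove_nonwords(text):
    # Keep only purely alphabetic tokens; drop numbers, symbols and mixed strings
    return ' '.join(re.findall(r'[A-Za-z]+', text))

print(remove_nonwords('BERT-based scoring: 10/10, would recommend!'))
# 'BERT based scoring would recommend'
```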

Similar functions were used to remove empty rows, special characters, numbers and HTML code.

After text cleaning, it's time to create the BERT embeddings. For that purpose, I used bert-as-service. It is very simple and consists of only 3 steps: download a pre-trained model, start the BERT service, and use the client to encode sentences into fixed-length vectors.
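Roughly, the three steps look like this (the model path and service options below are placeholders, not the article's exact values):

```python
# Step 1: download and unpack a pre-trained BERT model,
#         e.g. uncased_L-12_H-768_A-12 (the path below is a placeholder).
# Step 2: start the service from a terminal:
#         bert-serving-start -model_dir /tmp/uncased_L-12_H-768_A-12 -num_worker=1
# Step 3: use the client to encode sentences into 768-dimensional vectors:
from bert_serving.client import BertClient

bc = BertClient()
vectors = bc.encode(['the plot was thin but the acting saved it'])
print(vectors.shape)  # (1, 768)
```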

There are multiple parameters that can be set when running the service. For example, to define max_seq_len, I calculated the 0.9 quantile of the training data's text lengths.
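With pandas, that calculation could look like this (the file and column names are hypothetical):

```python
import pandas as pd

train = pd.read_csv('train.csv')               # hypothetical path to the cleaned train split
token_counts = train['text'].str.split().str.len()
max_seq_len = int(token_counts.quantile(0.9))  # 0.9 quantile of fragment lengths
```

The resulting value would then be passed to bert-serving-start via -max_seq_len.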

The preprocessed data takes the form of a data frame containing 768 features. For the full code, please see nlp_preprocess.py.

Model building with Keras

In this part, we build and train the model with different parameters. Let's assume we want a 5-layer neural network, as below. We will parametrize the batch size, the number of epochs, the number of nodes in the first 4 dense layers, and the rates of the 5 dropout layers.
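A sketch of such a network in Keras, with one plausible arrangement of the dropout layers (the actual architecture and activation choices may differ):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout

def build_model(n1, n2, n3, n4, d1, d2, d3, d4, d5):
    # 4 parametrized dense layers interleaved with 5 dropout layers,
    # ending in a single node for the normalized sentiment score.
    model = Sequential([
        Dropout(d1, input_shape=(768,)),   # 768 BERT features as input
        Dense(n1, activation='relu'),
        Dropout(d2),
        Dense(n2, activation='relu'),
        Dropout(d3),
        Dense(n3, activation='relu'),
        Dropout(d4),
        Dense(n4, activation='relu'),
        Dropout(d5),
        Dense(1, activation='sigmoid'),    # score normalized to [0, 1]
    ])
    model.compile(optimizer='adam', loss='mse', metrics=['mae'])
    return model
```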

Model tuning with Sacred

Now we can tune the parameters. We will use the sacred module. The key points here are:

1. Create an Experiment and add Observer

First we need to create an experiment and an observer that logs all kinds of information. It's very simple!
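For example (the MongoDB URL and database name are assumptions):

```python
from sacred import Experiment
from sacred.observers import MongoObserver

ex = Experiment('sentiment_analyzer')
# Store configs, metrics and logs in MongoDB
# (older Sacred versions use MongoObserver.create(...) instead)
ex.observers.append(MongoObserver(url='localhost:27017', db_name='sacred'))
```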

2. Define the main function

The @ex.automain decorator defines and runs the main function of the experiment when we run the Python script.
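Continuing the sketch, the main function trains the model with whatever parameters Sacred injects. Note that in the actual script the config scope from the next step has to be declared before @ex.automain:

```python
@ex.automain
def run(batch_size, epochs, n1, n2, n3, n4, d1, d2, d3, d4, d5):
    # build_model and the train/test splits are defined elsewhere in the script
    model = build_model(n1, n2, n3, n4, d1, d2, d3, d4, d5)
    model.fit(X_train, y_train, batch_size=batch_size, epochs=epochs, verbose=0)
```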

3. Add the Configuration parameters

We will define them through a Config Scope.
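A config scope is just a decorated function whose local variables become the experiment's parameters (the default values below are made up):

```python
@ex.config
def config():
    batch_size = 128
    epochs = 20
    n1, n2, n3, n4 = 512, 256, 128, 64   # nodes in the 4 dense layers
    d1 = d2 = d3 = d4 = d5 = 0.2         # rates of the 5 dropout layers
```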

4. Add metrics

In our case, I want to know the MAE and MSE. We can use the Metrics API for that.
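Inside the main function, the metrics can be logged roughly like this (given the compile() call above, evaluate() returns the MSE loss followed by the MAE):

```python
mse, mae = model.evaluate(X_test, y_test, verbose=0)
ex.log_scalar('MSE', mse)   # picked up by the MongoObserver
ex.log_scalar('MAE', mae)
```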

5. Run the experiment

Functions from the previous steps are stored in the model_experiment.py script. In order to run our experiment for a bunch of parameter combinations, we create and run run_sacred.py. For all possible permutations, the MAE and MSE will be saved in MongoDB.
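A sketch of what run_sacred.py might look like (the parameter grids below are assumptions):

```python
# run_sacred.py
from itertools import product
from model_experiment import ex

batch_sizes = [64, 128]
dropout_rates = [0.1, 0.3]

# Run the Sacred experiment once per parameter combination
for bs, d in product(batch_sizes, dropout_rates):
    ex.run(config_updates={
        'batch_size': bs,
        'd1': d, 'd2': d, 'd3': d, 'd4': d, 'd5': d,
    })
```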

The best result I got was an MAE of 9%. That means our sentiment analyzer works pretty well. We can check it with the model_inference function.
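A plausible shape for that helper, assuming the BERT client and the trained Keras model from earlier:

```python
def model_inference(sentence):
    # Encode the raw sentence with bert-as-service, then predict its score
    embedding = bc.encode([sentence])            # shape (1, 768)
    return float(model.predict(embedding)[0])
```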

Please note that the score is normalized, so outlier values can also be obtained. After the model is saved, we can build a web API!

Web API creation with Flask

Now we want to create an API that runs the scoring function and displays the returned result in the browser.

The syntax @app.route('/score', methods=['PUT']) lets Flask know that the function score should be mapped to the endpoint /score. The methods list is a keyword argument that tells Flask what kinds of HTTP requests are allowed. We'll be using PUT requests to receive sentences from a user. In the score function, we return the score in dictionary form, since it can easily be converted to a JSON string. The full code is available in api.py.
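A minimal sketch of the endpoint (the JSON field name is an assumption; see api.py for the real code):

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route('/score', methods=['PUT'])
def score():
    # Expect a JSON body such as {"text": "..."} (field name is hypothetical)
    sentence = request.get_json()['text']
    return jsonify({'score': model_inference(sentence)})

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
```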

Frontend

For the web interface, three files were created:

index.html
style.css
index.js

For the gradient, the HSV colour model was used. Saturation and Value are constant, while Hue corresponds to the score. Changing the hue in the range [0, 120] yields a smooth colour change from red through yellow to green.
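The frontend does this mapping in JavaScript; the same idea in Python, for illustration (the fixed saturation and value are assumptions):

```python
import colorsys

def score_to_color(score):
    # Map a score in [0, 1] to a hue in [0, 120] degrees (red -> yellow -> green)
    r, g, b = colorsys.hsv_to_rgb(120 * score / 360, 0.9, 0.9)
    return '#{:02x}{:02x}{:02x}'.format(int(r * 255), int(g * 255), int(b * 255))
```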

Docker containerization

The brilliance of Docker is that, once you package an application and all its dependencies into a container, you ensure it will run in any environment. It is generally recommended to separate areas of concern by using one service per container. In my small app there are 3 parts that should be combined: bert-as-service, the application and the frontend. The tool that helps you build the Docker images and run the containers together is Docker Compose.

Here are the steps we need to take to dockerize our code:

  • Create separate folders for bert-as-service, api and frontend,
  • Put the relevant files in them,
  • Add requirements.txt and a Dockerfile to each folder. The first file should list all the libraries that will be installed via a command in the second file; its format is described in the Docker documentation,
  • Create docker-compose.yaml in the directory containing the 3 folders. Define the 3 services that make up the app in this file, so they can be run together in an isolated environment (a sketch follows this list).
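A sketch of such a compose file (service names, ports and build contexts are assumptions; 5555/5556 are bert-as-service's default ports):

```yaml
version: '3'
services:
  bert:
    build: ./bert-as-service
    ports:
      - "5555:5555"
      - "5556:5556"
  api:
    build: ./api
    ports:
      - "5000:5000"
    depends_on:
      - bert
  frontend:
    build: ./frontend
    ports:
      - "8080:80"
    depends_on:
      - api
```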

Now we are ready to build and run our application! Please see the sample outputs below.

[Screenshot: sample output of the sentiment analyzer web interface]

As usual, please feel free to view the full code on my GitLab.

