Gerapy: Distributed Crawler Management Framework Based for Scrapy

栏目: IT技术 · 发布时间: 4年前

内容简介:Distributed Crawler Management Framework Based on Scrapy, Scrapyd, Scrapyd-Client, Scrapyd-API, Django and Vue.js.Documentation is available online at

Gerapy

Distributed Crawler Management Framework Based on Scrapy, Scrapyd, Scrapyd-Client, Scrapyd-API, Django and Vue.js.

Documentation

Documentation is available online at https://docs.gerapy.com/ and https://github.com/Gerapy/Docs .

Support

Gerapy is developed based on Python 3.x. Python 2.x may be supported later.

Usage

Install Gerapy by pip:

pip3 install gerapy

After the installation, you need to do these things below to run Gerapy server:

If you have installed Gerapy successfully, you can use command gerapy . If not, check the installation.

First use this command to initialize the workspace:

gerapy init

Now you will get a folder named gerapy . Also you can specify the name of your workspace by this command:

gerapy init <workspace>

Then cd to this folder, and run this command to initialize the Database:

cd gerapy
gerapy migrate

Next you need to create a superuser by this command:

gerapy createsuperuser

Then you can runserver by this command:

gerapy runserver

Then you can visit http://localhost:8000 to enjoy it. Also you can vist http://localhost:8000/admin to get the admin management backend.

If you want to run Gerapy in public, just run like this:

gerapy runserver 0.0.0.0:8000

Then it will run with public host and port 8000.

In Gerapy, You can create a configurable project and then configure and generate code of Scrapy automatically. But this module is unstable, we're trying to refine it.

Also you can drag your Scrapy Project to projects folder. Then refresh web, it will appear in the Project Index Page and comes to un-configurable, but you can edit this project through the web page.

As for deployment, you can move to Deploy Page. Firstly you need to build your project and add client in the Client Index Page, then you can deploy the project just by clicking button.

After the deployment, you can manage the job in Monitor Page.

Docker

Just run this command:

docker run -d -v ~/gerapy:/app/gerapy -p 8000:8000 germey/gerapy

Then it will run at port 8000. You can use the temp admin account (username: admin, password: admin) to login. And please change the password later for safety.

Command Usage:

docker run -d -v <workspace>:/app/gerapy -p <public_port>:<container_port> germey/gerapy

Please specify your workspace to mount Gerapy workspace by -v <workspace>:/app/gerapy and specify server port by -p <public_port>:<container_port> .

If you run Gerapy by Docker, you can visit Gerapy website such as http://localhost:8000 and enjoy it, no need to do other initialzation things.

TodoList

Communication

If you have any questions or ideas, you can send Issues or Pull Requests , your suggestions are really import for us, thanks for your contirbution.


以上就是本文的全部内容,希望本文的内容对大家的学习或者工作能带来一定的帮助,也希望大家多多支持 码农网

查看所有标签

猜你喜欢:

本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们

今日简史

今日简史

[以] 尤瓦尔·赫拉利 / 林俊宏 / 中信出版集团 / 2018-8 / 68

四年前,尤瓦尔•赫拉利的《人类简史》横空出世,颠覆了我们关于人类进化的认知。2016年,他的《未来简史》再度惊艳四座,刷新了我们对未来的想象,掀起了全球关于人工智能讨论的新思潮。现在,“简史三部曲”收官之作《今日简史》推出,将目光聚焦到当下,直面今天关乎我们每个人命运的问题和挑战。 尤瓦尔•赫拉利认为,智人之所以能够崛起成为地球的主宰者,主要原因在于其具备了虚构故事的能力。然而,在当前这样一......一起来看看 《今日简史》 这本书的介绍吧!

CSS 压缩/解压工具
CSS 压缩/解压工具

在线压缩/解压 CSS 代码

图片转BASE64编码
图片转BASE64编码

在线图片转Base64编码工具

HEX CMYK 转换工具
HEX CMYK 转换工具

HEX CMYK 互转工具