BMG: A Production Ready Relational Algebra in Ruby

栏目: IT技术 · 发布时间: 5年前

内容简介:Bmg is a relational algebra implemented as a ruby library. It implements theLike Alf, Bmg can be used to query relations in memory, from various files, SQL databases, and any data sources that can be seen as serving relations. Cross data-sources joins are

Bmg, a relational algebra (Alf's successor)!

Bmg is a relational algebra implemented as a ruby library. It implements the Relation as First-Class Citizen paradigm contributed with Alf a few years ago.

Like Alf, Bmg can be used to query relations in memory, from various files, SQL databases, and any data sources that can be seen as serving relations. Cross data-sources joins are supported, as with Alf.

Unlike Alf, Bmg does not make any core ruby extension and exposes the object-oriented syntax only (not Alf's functional one). Bmg implementation is also much simpler, and make its easier to implement user-defined relations.

Example

require 'bmg'
require 'json'

suppliers = Bmg::Relation.new([
  { sid: "S1", name: "Smith", status: 20, city: "London" },
  { sid: "S2", name: "Jones", status: 10, city: "Paris"  },
  { sid: "S3", name: "Blake", status: 30, city: "Paris"  },
  { sid: "S4", name: "Clark", status: 20, city: "London" },
  { sid: "S5", name: "Adams", status: 30, city: "Athens" }
])

by_city = suppliers
  .restrict(Predicate.neq(status: 30))
  .extend(upname: ->(t){ t[:name].upcase })
  .group([:sid, :name, :status], :suppliers_in)

puts JSON.pretty_generate(by_city)
# [{...},...]

Connecting to a SQL database

Bmg requires sequel >= 3.0 to connect to SQL databases.

require 'sqlite3'
require 'bmg'
require 'bmg/sequel'

DB = Sequel.connect("sqlite://suppliers-and-parts.db")

suppliers = Bmg.sequel(:suppliers, DB)

big_suppliers = suppliers
  .restrict(Predicate.neq(status: 30))

puts big_suppliers.to_sql
# SELECT `t1`.`sid`, `t1`.`name`, `t1`.`status`, `t1`.`city` FROM `suppliers` AS 't1' WHERE (`t1`.`status` != 30)

puts JSON.pretty_generate(big_suppliers)
# [{...},...]

How is this different from similar libraries?

  1. The libraries you probably know (Sequel, Arel, SQLAlchemy, Korma, jOOQ, etc.) do not implement a genuine relational algebra: their support for chaining relational operators is limited (yielding errors or wrong SQL queries). Bmg always allows chaining operators. If it does not, it's a bug. In other words, the following query is 100% valid:

    relation
      .restrict(...)    # aka where
      .union(...)
      .summarize(...)   # aka group by
      .restrict(...)
  2. Bmg supports in memory relations, json relations, csv relations, SQL relations and so on. It's not tight to SQL generation, and supports queries accross multiple data sources.

  3. Bmg makes a best effort to optimize queries, simplifying both generated SQL code (low-level accesses to datasources) and in-memory operations.

  4. Bmg supports various structuring operators (group, image, autowrap, autosummarize, etc.) and allows building 'non flat' relations.

How is this different from Alf?

  1. Bmg's implementation is much simpler than Alf, and uses no ruby core extention.

  2. We are confident using Bmg in production. Systematic inspection of query plans is suggested though. Alf was a bit too experimental to be used on (critical) production systems.

  3. Alf exposes a functional syntax, command line tool, restful tools and many more. Bmg is limited to the core algebra, main Relation abstraction and SQL generation.

  4. Bmg is less strict regarding conformance to relational theory, and may actually expose non relational features (such as support for null, left_join operator, etc.). Sharp tools hurt, use them with great care.

  5. Bmg does not yet implement all operators documented on try-alf.org, even if we plan to eventually support them all.

  6. Bmg has a few additional operators that prove very useful on real production use cases: prefix, suffix, autowrap, autosummarize, left_join, rxmatch, etc.

Supported operators

r.allbut([:a, :b, ...])                      # remove specified attributes
r.autowrap(split: '_')                       # structure a flat relation, split: '_' is the default
r.autosummarize([:a, :b, ...], x: :sum)      # (experimental) usual summarizers supported
r.constants(x: 12, ...)                      # add constant attributes (sometimes useful in unions)
r.extend(x: ->(t){ ... }, ...)               # add computed attributes
r.group([:a, :b, ...], :x)                   # relation-valued attribute from attributes
r.image(right, :x, [:a, :b, ...])            # relation-valued attribute from another relation
r.join(right, [:a, :b, ...])                 # natural join on a join key
r.join(right, :a => :x, :b => :y, ...)       # natural join after right reversed renaming
r.left_join(right, [:a, :b, ...], {...})     # left join with optional default right tuple
r.left_join(right, {:a => :x, ...}, {...})   # left join after right reversed renaming
r.matching(right, [:a, :b, ...])             # semi join, aka where exists
r.matching(right, :a => :x, :b => :y, ...)   # semi join, after right reversed renaming
r.not_matching(right, [:a, :b, ...])         # inverse semi join, aka where not exists
r.not_matching(right, :a => :x, ...)         # inverse semi join, after right reversed renaming
r.page([[:a, :asc], ...], 12, page_size: 10) # paging, using an explicit ordering
r.prefix(:foo_, but: [:a, ...])              # prefix kind of renaming
r.project([:a, :b, ...])                     # keep specified attributes only
r.rename(a: :x, b: :y, ...)                  # rename some attributes
r.restrict(a: "foo", b: "bar", ...)          # relational restriction, aka where
r.rxmatch([:a, :b, ...], /xxx/)              # regex match kind of restriction
r.summarize([:a, :b, ...], x: :sum)          # relational summarization
r.suffix(:_foo, but: [:a, ...])              # suffix kind of renaming
r.union(right)                               # relational union

Who is behind Bmg?

Bernard Lambeau ( bernard@klaro.cards ) is Alf & Bmg main engineer & maintainer.

Enspirit ( https://enspirit.be ) and Klaro App ( https://klaro.cards ) are both actively using and contributing to the library.

Feel free to contact us for help, ideas and/or contributions. Please use github issues and pull requests if possible if code is involved.


以上就是本文的全部内容,希望对大家的学习有所帮助,也希望大家多多支持 码农网

查看所有标签

本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们

智能Web算法(第2版)

智能Web算法(第2版)

【英】Douglas G. McIlwraith(道格拉斯 G. 麦基尔雷思)、【美】Haralambos Marmanis(哈若拉玛 玛若曼尼斯)、【美】Dmitry Babenko(德米特里•巴邦科) / 达观数据、陈运文 等 / 电子工业出版社 / 2017-7 / 69.00

机器学习一直是人工智能研究领域的重要方向,而在大数据时代,来自Web 的数据采集、挖掘、应用技术又越来越受到瞩目,并创造着巨大的价值。本书是有关Web数据挖掘和机器学习技术的一本知名的著作,第2 版进一步加入了本领域最新的研究内容和应用案例,介绍了统计学、结构建模、推荐系统、数据分类、点击预测、深度学习、效果评估、数据采集等众多方面的内容。《智能Web算法(第2版)》内容翔实、案例生动,有很高的阅......一起来看看 《智能Web算法(第2版)》 这本书的介绍吧!

RGB转16进制工具
RGB转16进制工具

RGB HEX 互转工具

在线进制转换器
在线进制转换器

各进制数互转换器

XML、JSON 在线转换
XML、JSON 在线转换

在线XML、JSON转换工具