Scripting in rust with self-interpreting source code

Category: IT Technology · Published: 4 years ago

I have a soft spot in my heart for rust and a passionate distrust (that has slowly turned into hatred) for interpreted, loosely typed languages, but it’s hard to deny the convenience of being able to bang out a bash script you can literally just write and run without having to deal with the edit-compile-run loop, let alone create a new project, worry about whether or not you’re going to check it into version control, and everything else that somehow tends to go hand-in-hand with modern strongly typed languages.

A nifty but scarcely known rust feature is that the language parser will ignore a shebang at the start of the source code file, meaning you can install an interpreter that will compile and run your rust code when you execute the .rs file – without losing the ability to compile it normally. cargo-script is one such interpreter, meaning you can cargo install cargo-script then execute your source code (after making it executable, :! chmod +x % ) with something like this:

#!/usr/bin/env cargo-script

fn main() {
    println!("Hello, world!");
}

That’s pretty cool. But it’s bogged down by the inertia of an external dependency (even if it’s on crates.io), and more importantly, needing to install an interpreter just isn’t true to the hacker spirit. Fortunately, we can do better: it’s possible to write code that is simultaneously a valid (cross-platform!) shell script and valid rust code, which we can abuse to make the code run itself!

Rust already skips a shebang on the first line of a source file (a line starting with #! that is not followed by [ ), meaning we don’t have to worry about the shebang stopping our code from being a valid, conformant rust file. But how do we inject a shell-scripted “interpreter” into the source code afterwards? Fortunately/unfortunately, # does not start a comment in rust and // does not start a comment in sh , so hiding a line from one language with that language’s comment syntax will work… but will also cause the other language, which does not recognize that syntax, to complain about invalid syntax.
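To make the shebang rule concrete, here is a minimal sketch of the decision: a first line beginning with `#!` is skipped as a shebang unless the next non-whitespace character is `[`, in which case it is parsed as an inner attribute. The helper name `is_shebang` is made up for illustration, and this is a simplified model rather than rustc's actual lexer code (which, among other things, also looks past comments when searching for the `[`).

```rust
/// Simplified model of how a leading `#!` line is classified.
/// `is_shebang` is a hypothetical helper, not part of rustc or std.
fn is_shebang(first_line: &str) -> bool {
    match first_line.strip_prefix("#!") {
        // `#![...]` is an inner attribute, so it must not be skipped.
        Some(rest) => !rest.trim_start().starts_with('['),
        // No `#!` at all: an ordinary line of source.
        None => false,
    }
}

fn main() {
    for line in ["#!/bin/sh", "#![allow()] /*", "fn main() {"] {
        println!("{:<16} shebang: {}", line, is_shebang(line));
    }
}
```

This is why a file can open with `#!/bin/sh` and still use `#![...]` attributes on the following line: only the former is treated as a shebang.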

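The trick is to start the second line with a no-op crate-level attribute, #![allow()] : it is valid rust code, and because it begins with # , sh treats the whole line as a comment. The /* that follows on the same line opens a rust block comment that hides the shell script from rustc, and the rest, as they say, is history: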

#!/bin/sh
#![allow()] /*
# rust self-compiler by M. Al-Qudsi, licensed as public domain or MIT.
# See <https://neosmart.net/blog/self-compiling-rust-code/> for info & updates.
OUT=/tmp/$(printf "%s" "$(realpath "$(which "$0")")" | md5sum | cut -d' ' -f1)
MD5=$(md5sum "$0" | cut -d' '  -f1)
(test -f ${OUT}.md5 -a ${MD5} = $(cat ${OUT}.md5) ||
(grep -Eq '^\s*(\[.*?\])*\s*fn\s*main\b*' "$0" && (rm -f ${OUT};
rustc "$0" -o ${OUT} && printf "%s" ${MD5} > ${OUT}.md5) || (rm -f ${OUT};
printf "fn main() {//%s\n}" "$(cat "$0")" | rustc - -o ${OUT} &&
printf "%s" ${MD5} > ${OUT}.md5))) && exec ${OUT} || exit $? #*/

// Wrapping your code in `fn main() { … }` is altogether optional :)
fn main() {
    println!("Hello, world!");
}

The program above is simultaneously a valid rust program and a valid shell script that should run on most *nix platforms, provided rustc, md5sum, realpath, and which are all available in $PATH.

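The self-compiling header actually does a bit more than just compile the rust source code and run the result: it caches the compiled binary under /tmp and only recompiles when the MD5 hash of the script changes, and it checks whether the script already contains an fn main() entry point, wrapping the entire file in fn main() { ... } if it does not.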
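The caching decision the header makes can be sketched in a few lines of rust. This is only an illustration under stated assumptions: the standard library has no MD5, so `DefaultHasher` stands in for `md5sum`, and the names `digest` and `needs_rebuild` are hypothetical rather than anything the header itself defines.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in for `md5sum`: std has no MD5, so this sketch uses DefaultHasher.
fn digest(contents: &str) -> String {
    let mut hasher = DefaultHasher::new();
    contents.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}

/// Rebuild when there is no cached digest or it differs from the current one.
/// `needs_rebuild` is a hypothetical name for the header's `test` comparison.
fn needs_rebuild(source: &str, cached_digest: Option<&str>) -> bool {
    cached_digest != Some(digest(source).as_str())
}

fn main() {
    let v1 = "fn main() { println!(\"Hello, world!\"); }";
    let v2 = "fn main() { println!(\"Goodbye!\"); }";

    // First run: nothing is cached yet, so the script must be compiled.
    println!("first run: rebuild = {}", needs_rebuild(v1, None));

    // Unchanged source: the stored digest matches, so rustc is skipped.
    let cached = digest(v1);
    println!("unchanged: rebuild = {}", needs_rebuild(v1, Some(cached.as_str())));

    // Edited source: the digest no longer matches, so we recompile.
    println!("edited:    rebuild = {}", needs_rebuild(v2, Some(cached.as_str())));
}
```

The header does the same thing with two files: the binary at ${OUT} and the digest it was built from at ${OUT}.md5.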

The self-compiling/self-interpreting header above has been optimized for size, absolutely at the cost of legibility. But fear not, here’s a line-by-line annotated equivalent to explain what is going on:

#!/bin/sh
#![allow()] /*

# rust self-compiler by Mahmoud Al-Qudsi, Copyright NeoSmart Technologies 2020
# See <https://neosmart.net/blog/self-compiling-rust-code/> for info & updates.
#
# This code is freely released to the public domain. In case a public domain
# license is insufficient for your legal department, this code is also licensed
# under the MIT license.

# Get an output path that is derived from the complete path to this self script.
# - `realpath` makes sure if you have two separate `script.rs` files in two
#   different directories, they get mapped to different binaries.
# - `which` makes that work even if you store this script in $PATH and execute
#   it by its filename alone.
# - `cut` is used to print only the hash and not the filename, which `md5sum`
#   always includes in its output.
OUT=/tmp/$(printf "%s" "$(realpath "$(which "$0")")" | md5sum | cut -d' ' -f1)

# Calculate hash of the current contents of the script, so we can avoid
# recompiling if it hasn't changed.
MD5=$(md5sum "$0" | cut -d' '  -f1)

# Check if we have a previously compiled output for this exact source code.
if ! { test -f "${OUT}.md5" && test "${MD5}" = "$(cat "${OUT}.md5")"; }; then
	# The script has been modified or is otherwise not cached.
	# Check if the script already contains an `fn main()` entry point.
	if grep -Eq '^\s*(\[.*?\])*\s*fn\s*main\b*' "$0"; then
		# Compile the input script as-is to the previously determined location.
		rustc "$0" -o ${OUT}
		# Save rustc's exit code so we can compare against it later.
		RUSTC_STATUS=$?
	else
		# The script does not contain an `fn main()` entry point, so add one.
		# We don't use `printf 'fn main() { %s }'` because the shebang must
		# come at the beginning of the line, and we don't use `tail` to skip
		# it because that would result in incorrect line numbers in any errors
		# reported by rustc. Instead, we just comment out the shebang but leave
		# it on the same line as `fn main() {`.
		printf "fn main() {//%s\n}" "$(cat "$0")" | rustc - -o ${OUT}
		# Save rustc's exit code so we can compare against it later.
		RUSTC_STATUS=$?
	fi

	# Check if we compiled the script OK, or exit bubbling up the return code.
	if test "${RUSTC_STATUS}" -ne 0; then
		exit ${RUSTC_STATUS}
	fi

	# Save the MD5 of the current version of the script so we can compare
	# against it next time.
	printf "%s" ${MD5} > ${OUT}.md5
fi

# Execute the compiled output. This also ends execution of the shell script,
# as it actually replaces its process with ours; see exec(3) for more on this.
exec ${OUT}

# At this point, it's OK to write raw rust code as the shell interpreter
# never gets this far. But we're actually still in the rust comment we opened
# on line 2, so close that: */

fn main() {
    println!("Hello, world!");
}
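The wrapping branch of the header (the case where no entry point is found) is easier to see outside of shell quoting. The sketch below mirrors the `printf "fn main() {//%s\n}"` step in rust; `has_main` and `wrap_in_main` are hypothetical names, and `has_main` is a deliberately cruder check than the grep expression in the header.

```rust
/// Rough stand-in for the `grep` check: does any line start with `fn main`?
/// (The header's regex is more permissive; this is a simplification.)
fn has_main(source: &str) -> bool {
    source.lines().any(|line| line.trim_start().starts_with("fn main"))
}

/// Mirrors `printf "fn main() {//%s\n}"`: the opening brace shares line 1
/// with the (now commented-out) shebang, so no line numbers shift.
fn wrap_in_main(source: &str) -> String {
    format!("fn main() {{//{}\n}}", source)
}

fn main() {
    let script = "#!/bin/sh\nprintln!(\"Hello, world!\");";
    let compiled = if has_main(script) {
        script.to_string()
    } else {
        wrap_in_main(script)
    };
    println!("{}", compiled);
}
```

Because the shebang ends up behind `//` on the first line instead of being removed, line 2 of the wrapped source is still line 2 of the original script, so any line numbers in rustc's error messages point at the right place.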
