Seemingly impossible functional programs

栏目: IT技术 · 发布时间: 4年前

内容简介：Andrej has invited me to write about certain surprising functional programs. The first program, due toA stronger one amounts to finding an example such that the predicate holds, if such an example exists, and saying that there isn’t any otherwise.I will us

Andrej has invited me to write about certain surprising functional programs. The first program, due to Ulrich Berger (1990), performs exhaustive search over the “ Cantor space ” of infinite sequences of binary digits. I have included references at the end. A weak form of exhaustive search amounts to checking whether or not a total predicate holds for all elements of the Cantor space. Thus, this amounts to universal quantification over the Cantor space. Can this possibly be done algorithmically, in finite time?

A stronger one amounts to finding an example such that the predicate holds, if such an example exists, and saying that there isn’t any otherwise.

I will use the language Haskell , but it is possible to quickly translate the programs to e.g. ML or OCaml . The source code shown here is attached as seemingly-impossible.hs .

We could use the booleans to represent binary digits, or even the integers, but I prefer to use a different type to avoid confusions:

> data Bit = Zero | One
>          deriving (Eq)

The deriving clause tells Haskell to figure out how to decide equality of bits automatically.

For the type of infinite sequences, we could use the built-in type of lazy lists for most algorithms considered here. But, in order to illustrate certain points, I will take the mathematical view and regard sequences as functions defined on the natural numbers. The next version of the definition of Haskell will have a built-in type of natural numbers. For the moment, I implement it as the type of integers:

> type Natural = Integer
> type Cantor = Natural -> Bit

The operator (#) takes a bit x and a sequence a and produces a new sequence x # a with x as the head and a as the tail (very much like the built-in operation (:) for lists):

> (#) :: Bit -> Cantor -> Cantor
> x # a = \i -> if i == 0 then x else a(i-1)

Notice that the notation \i -> ... stands for $\lambda i. \dots$.

Next, we come to the heart of the matter, the functions that perform exhaustive search over the Cantor space. The specification of the function find is that, for any total p , one should have that find p is always a total element of the Cantor space, and, moreover, if there is a in the Cantor space with p a = True , then a = find p is an example of such an a .

> forsome, forevery :: (Cantor -> Bool) -> Bool
> find :: (Cantor -> Bool) -> Cantor

Because I will have several implementations of find , I have to choose one to be able to compile and run the program. A canonical choice is the first one,

> find = find_i

but you are invited to experiment with the other ones. For the following definition of find_i to make sense, you have to take the above choice.

The function find takes a predicate on the Cantor space, and hence it will typically have a $\lambda$-expression as argument. In the following definition this is not necessary, because (\a -> p a) = p by the $\eta$ rule. But I have adopted it for the sake of clarity, as then we can read “ find(\a -> p a) ” aloud as “find a such that p(a) “:

> forsome p = p(find(\a -> p a))
> forevery p = not(forsome(\a -> not(p a)))

Notice that the function forevery (universal quantification) is obtained from the function forsome (existential quantification) via the De Morgan Law . The functionals forsome and find_i are defined by mutual recursion:

> find_i :: (Cantor -> Bool) -> Cantor
> find_i p = if forsome(\a -> p(Zero # a))
>            then Zero # find_i(\a -> p(Zero # a))
>            else One  # find_i(\a -> p(One  # a))

The intuitive idea of the algorithm find_i is clear: if there is an example starting with zero, then the result is taken to start with zero, otherwise it must start with one. Then we recursively build the tail using the same idea. What may not be clear is whether the recursion eventually produces a digit, because of the indirect recursive call via the call to forsome . A mathematical proof proceeds by induction on the modulus of uniform continuity of p , defined below.

It may be more natural to return an example only if there is one, and otherwise tell there isn’t any:

> search :: (Cantor -> Bool) -> Maybe Cantor
> search p = if forsome(\a -> p a) then Just(find(\a -> p a)) else Nothing

The Maybe type constructor is predefined by Haskell as

data Maybe a = Just a | Nothing

Type-theoretic remark: the type Maybe a corresponds to the sum type $A+1$, where the only element of $1$ is called Nothing and where Just is the insertion $A \to A+1$.

Exercise: show that both forsome and find can be defined directly from search assuming we had defined search first.

Common wisdom tells us that function types don’t have decidable equality. In fact, e.g. the function type Integer -> Integer doesn’t have decidable equality because of the Halting Problem , as is well known. However, common wisdom is not always correct, and, in fact, some other function types do have decidable equality, for example the type Cantor -> y for any type y with decidable equality, without contradicting Turing:

> equal :: Eq y => (Cantor -> y) -> (Cantor -> y) -> Bool
> equal f g = forevery(\a -> f a == g a)

This seems strange, even fishy, because the Cantor space is in some sense bigger than the integers. In a follow-up post, I’ll explain that this has to do with the fact that the Cantor space is topologically compact, but the integers are not.

Let’s run an example:

> coerce :: Bit -> Natural
> coerce Zero = 0
> coerce One = 1

> f, g, h :: Cantor -> Integer

> f a = coerce(a(7 * coerce(a 4) +  4 * (coerce(a 7)) + 4))

> g a = coerce(a(coerce(a 4) + 11 * (coerce(a 7))))

> h a = if a 7 == Zero
>       then if a 4 == Zero then coerce(a  4) else coerce(a 11)
>       else if a 4 == One  then coerce(a 15) else coerce(a  8)

Now we call the ghci interpreter:

$ ghci seemingly-impossible.lhs
   ___         ___ _
  / _  /  // __(_)
 / /_// /_/ / /  | |      GHC Interactive, version 6.6, for Haskell 98.
/ /_/ __  / /___| |      http://www.haskell.org/ghc/
____// /_/____/|_|      Type :? for help.

Loading package base ... linking ... done.
[1 of 1] Compiling Main             ( seemingly-impossible.lhs, interpreted )
Ok, modules loaded: Main.

*Main>

At this point we can evaluate expressions at the interpreter’s prompt. First I ask it to print time and space usage after each evaluation:

*Main> :set +s

On my Dell 410 laptop running at 1.73GHz, I test the following expressions:

*Main> equal f g
False
(0.10 secs, 3490296 bytes)

*Main> equal f h
True
(0.87 secs, 36048844 bytes)

*Main> equal g h
False
(0.09 secs, 3494064 bytes)

*Main> equal f f
True
(0.91 secs, 38642544 bytes)

*Main> equal g g
True
(0.15 secs, 6127796 bytes)

*Main> equal h h
True
(0.83 secs, 32787372 bytes)

By changing the implementation of find , I’ll make this faster and also will be able to run bigger examples. But let’s carry on with the current implementation for the moment.

The following was Berger’s main motivation for considering the above constructions:

> modulus :: (Cantor -> Integer) -> Natural
> modulus f = least(\n -> forevery(\a -> forevery(\b -> eq n a b --> (f a == f b))))

This is sometimes called the Fan Functional , and goes back to Brouwer (1920’s) and it is well known in the higher-type computability theory community (see Normann (2006) below). It finds the modulus of uniform continuity , defined as the least natural number $n$ such that

$\forall \alpha,\beta(\alpha =_n \beta \to f(\alpha)=f(\beta),$

where

$\alpha =_n \beta \iff \forall i 
	
What is going on here is that computable functionals are continuous,
which amounts to saying that finite amounts of the output depend only
on finite amounts of the input. But the Cantor space is compact, and
in analysis and topology there is a theorem that says that continuous
functions defined on a compact space are		
			uniformly
		
continuous. In this context, this amounts to the existence of a single
$n$ such that for all inputs it is enough to look at depth $n$ to get
the answer (which in this case is always finite, because it is an
integer). I’ll explain all this in another post. Here I will
illustrate this by running the program in some examples.	
	Notice that the Haskell definition is the same as the mathematical
one, provided we define all the other needed ingredients:
	> least :: (Natural -> Bool) -> Natural
> least p = if p 0 then 0 else 1 + least(\n -> p(n+1))

> (-->) :: Bool -> Bool -> Bool
> p --> q = not p || q

> eq :: Natural -> Cantor -> Cantor -> Bool
> eq 0 a b = True
> eq (n+1) a b = a n == b n  &&  eq n a b
	To understand the modulus functional in practice, define projections
as follows:
	> proj :: Natural -> (Cantor -> Integer)
> proj i = \a -> coerce(a i)
	Then we get:
	*Main> modulus (\a -> 45000)
0
(0.00 secs, 0 bytes)

*Main> modulus (proj 1)
2
(0.00 secs, 0 bytes)

*Main> modulus (proj 2)
3
(0.01 secs, 0 bytes)

*Main> modulus (proj 3)
4
(0.05 secs, 820144 bytes)

*Main> modulus (proj 4)
5
(0.30 secs, 5173540 bytes)

*Main> modulus (proj 5)
6
(1.69 secs, 31112400 bytes)

*Main> modulus (proj 6)
7
(9.24 secs, 171456820 bytes)
	So, intuitively, the modulus is the last index of the input that the
function uses plus one. For a constant function, like the above, the
modulus is zero, because no index is used.
	
Technical remark. The notion of modulus of uniform
continuity needed for the proof of termination of		find_i
is not literally the same as above, but a slight variant
(sometimes called the		intensional
modulus of uniform
continuity, whereas ours is referred to as the		extensional
one). But I won’t go into such mathematical
subtleties here. The main idea is that when the modulus is $0$ the
recursion terminates and one of the branches of the definition of		find_i
is followed, and a new recursion is started, to
produce the next digit of the example. When the modulus of		p
is $n+1$, the modulus of the predicate		\a -> p(Zero # a)
is $n$ or smaller, and so recursive calls are always made
with smaller moduli and hence eventually terminate.		End of
remark.
	
	
Now I’ll try to get faster implementations of		find
.
I’ll modify the original implementation in several stages.
Firstly, I will remove the mutual recursion by expanding the
definition of the function		forsome
in the definition of
the function		find_i
:	
	> find_ii p = if p(Zero # find_ii(\a -> p(Zero # a)))
>             then Zero # find_ii(\a -> p(Zero # a))
>             else One  # find_ii(\a -> p(One  # a))
	
This should have essentially the same speed.
Now notice that the branches of the conditional are the same if we
make zero and one into a parameter		h
. Hence one can “factor
out” the conditional as follows:	
	> find_iii :: (Cantor -> Bool) -> Cantor
> find_iii p = h # find_iii(\a -> p(h # a))
>        where h = if p(Zero # find_iii(\a -> p(Zero # a))) then Zero else One
	
This is (exponentially!) faster for some examples. A clue for
this is that		h
will be evaluated only if and when it is
“used” (our language is lazy). Let’s run an example, replacing the
above definition of		find
by		find = find_iii
:	
	*Main> equal f h
True
(0.00 secs, 522668 bytes)

*Main> equal (proj 1000) (proj 1000)
True
(0.00 secs, 0 bytes)

*Main> equal (proj 1000) (proj 4000)
False
(0.03 secs, 1422680 bytes)

*Main> equal (proj 1000) (proj(2^20))
False
(7.02 secs, 336290704 bytes)
	
As you can see, the bigger the projection functions we try,
the longer the comparison gets. To see how bad the first algorithm is, let’s switch back to		find = find_i
:	
	*Main> equal (proj 10) (proj 10)
True
(0.05 secs, 1529036 bytes)

*Main> equal (proj 10) (proj 15)
False
(1.61 secs, 72659036 bytes)

*Main> equal (proj 10) (proj 20)
False
(60.62 secs, 2780497676 bytes)
	The previous examples cannot be run with this algorithm unless we had
more bits available than there are atoms in the observable universe
and we were willing to wait several billion-billion years, because the
algorithm is exponential in the modulus of continuity.
	
You probably noticed that there is another obvious improvement
starting from		find_ii
:	
	> find_iv :: (Cantor -> Bool) -> Cantor
> find_iv p = let leftbranch = Zero # find_iv(\a -> p(Zero # a))
>             in if p(leftbranch)
>                then leftbranch
>                else One # find_iv(\a -> p(One # a))
	
Actually, I never thought about the performance of this algorithm or
experimented with it. Let’s see what we get (you need to replace		find = find_iv
):	
	*Main> equal (proj 10) (proj 20)
False
(0.00 secs, 522120 bytes)

*Main> equal (proj 10) (proj 200)
False
(0.04 secs, 1550824 bytes)

*Main> equal (proj 10) (proj 2000)
False
(3.71 secs, 146039744 bytes)

*Main> equal (proj 10) (proj 20000)
Interrupted.
	
Much better than		find_i
, but much worse than		find_iii
! I gave up in the last example, because it
started to slow down my edition of this post after a minute or so.	
	But there is a much better algorithm, which I now present. I won’t
attempt to explain the working of this algorithm in this post (see
my LICS’2007 paper below if you are really interested), but I include
a few remarks below:
	> find_v :: (Cantor -> Bool) -> Cantor
> find_v p = \n ->  if q n (find_v(q n)) then Zero else One
>  where q n a = p(\i -> if i < n then find_v p i else if i == n then Zero else a(i-n-1))
	
All the above algorithms, except this one, can be easily rewritten to
use lazy lists rather than functions defined on the natural
numbers. This algorithm takes advantage of the fact that to access an
element of a sequence represented as a function, it is not necessary
to scan all the preceding elements. In a perhaps mysterious way, this
algorithm implicitly figures out which entries of its argument		p
uses, and constructs only those explicitly. You can access
the other ones if you wish, but the algorithm		find_v
doesn’t
force their evaluation. One way to see that		find_v
is correct
is to show, by induction on		n
, that		find_i p n = find_v p n
, which is not too difficult, although the calculations get big
at some stages if one doesn’t carefully introduce suitable auxiliary
notation. A better way is to understand this directly, as done in the
above paper (you need to look for the product functional, which
generalizes this).	
	
Now this gets really fast (take		find = find_v
):	
	*Main> equal (proj (2^300)) (proj (2^300))
True
(0.00 secs, 522148 bytes)
*Main> equal (proj (2^300)) (proj (2^400))
False
(0.00 secs, 525064 bytes)
	
But if the functions use several of their arguments, not just one (see
example below), this isn’t so good any more. To fix this, first
rewrite the above program as follows, introducing an auxiliary
variable		b
to name the result, and replace one of the
recursive calls (there are two) to use		b
instead:	
	> find_vi :: (Cantor -> Bool) -> Cantor
> find_vi p = b
>  where b = \n -> if q n (find_vi(q n)) then Zero else One
>        q n a = p(\i -> if i < n then b i else if i == n then Zero else a(i-n-1))
	
Lazy evaluation doesn’t help here, because		b
is a
function, and in fact this makes the program slightly slower. Now,
to make it significantly faster, we apply the identity function to
the definition of		b
. Or rather an elaborate
implementation of the identity function, that stores		b
into an infinite binary tree in a breadth-first manner and then
retrieves it back (this trick implements memoization with
logarithmic overhead):	
	> find_vii :: (Cantor -> Bool) -> Cantor
> find_vii p = b
>  where b = id'(\n -> if q n (find_vii(q n)) then Zero else One)
>        q n a = p(\i -> if i < n then b i else if i == n then Zero else a(i-n-1))

> data T x = B x (T x) (T x)

> code :: (Natural -> x) -> T x
> code f = B (f 0) (code(\n -> f(2*n+1)))
>                  (code(\n -> f(2*n+2)))

> decode :: T x -> (Natural -> x)
> decode (B x l r) n |  n == 0    = x
>                    |  odd n     = decode l ((n-1) `div` 2)
>                    |  otherwise = decode r ((n-2) `div` 2)

> id' :: (Natural -> x) -> (Natural -> x)
> id' = decode.code
	
Now take		find = find_vii
, and test:	
	> f',g',h' :: Cantor -> Integer

> f' a = a'(10*a'(3^80)+100*a'(4^80)+1000*a'(5^80)) where a' i = coerce(a i)

> g' a = a'(10*a'(3^80)+100*a'(4^80)+1000*a'(6^80)) where a' i = coerce(a i)

> h' a = a' k
>     where i = if a'(5^80) == 0 then 0    else 1000
>           j = if a'(3^80) == 1 then 10+i else i
>           k = if a'(4^80) == 0 then j    else 100+j
>           a' i = coerce(a i)

*Main> equal f' g'
False
(6.75 secs, 814435692 bytes)

*Main> equal f' h'
True
(3.20 secs, 383266912 bytes)

*Main> equal g' h'
False
(6.79 secs, 813083216 bytes)

*Main> equal f' f'
True
(3.20 secs, 383384908 bytes)

*Main> equal g' g'
True
(3.31 secs, 400711084 bytes)

*Main> equal h' h'
True
(3.22 secs, 383274252 bytes)
	
Among all the above algorithms, only		find_vii
can cope
with the above examples. A more interesting example is this. Two
finite sequences $s$ and $t$ of natural numbers have the
same		set
of elements iff the two functions	
	$\bigwedge_{i 
	are equal, where the above notation indicates the pointwise
logical-and (conjunction) of the projections, and where $|s|$ is the
length of $s$. Here is an implementation of this
idea:
	> pointwiseand :: [Natural] -> (Cantor -> Bool)
> pointwiseand [] = \b -> True
> pointwiseand (n:a) = \b -> (b n == One) && pointwiseand a b

> sameelements :: [Natural] -> [Natural] -> Bool
> sameelements a b = equal (pointwiseand a) (pointwiseand b)

*Main>  sameelements
  [6^60, 5^50, 1, 5, 6, 6, 8, 9, 3, 4, 6, 2, 4,6, 1, 6^60, 7^700, 1, 1, 1, 3, 3^30]
  [1, 2, 3, 4, 5, 6, 7, 8, 9, 3^30, 5^50, 6^60, 7^70]
False
(0.14 secs, 21716248 bytes)

*Main> sameelements
  [6^60, 5^50, 1, 5, 6, 6, 8, 9, 3, 4, 6, 2, 4,6, 1, 6^60, 7^70, 1, 1, 1, 3, 3^30]
  [1, 2, 3, 4, 5, 6, 7, 8, 9, 3^30, 5^50, 6^60, 7^70]
False
(0.10 secs, 14093520 bytes)

*Main> sameelements
  [6^60, 5^50, 1, 5, 6, 6, 8, 9, 3, 4, 6, 2, 4,6, 1, 6^60, 7^70, 1, 1, 1, 3, 3^30]
  [1, 2, 3, 4, 5, 6, 8, 9, 3^30, 5^50, 6^60, 7^70]
True
(0.10 secs, 12610056 bytes)

*Main> sameelements
 [6^60, 5^50, 1, 5, 6, 6, 8, 9, 3, 4, 6, 2, 4,6, 1, 6^60, 7^70, 1, 1, 1, 3, 3^30]
 [1, 2, 3, 4, 5, 6, 8, 9, 3^30, 5^50, 6^60, 7^700]
False
(0.12 secs, 17130684 bytes)

*Main> sameelements
[6^60, 5^50, 1, 5, 6, 6, 8, 9, 3, 4, 6, 2, 4,6, 1, 6^60, 7^700, 1, 1, 1, 3, 3^30]
[1, 2, 3, 4, 5, 6, 8, 9, 3^30, 5^50, 6^60, 7^700]
True
(0.12 secs, 17604776 bytes)
	
It is natural to ask whether there are applications to program
verification. I don’t know, but		Dan Ghica
and I speculate that
there are, and we are planning to investigate this.	
	An even faster search algorithm is offered in the first comment below.

以上就是本文的全部内容，希望本文的内容对大家的学习或者工作能带来一定的帮助，也希望大家多多支持码农网

查看所有标签

猜你喜欢:

Seemingly impossible functional programs

本站部分资源来源于网络，本站转载出于传递更多信息之目的，版权归原作者或者来源机构所有，如转载稿涉及版权问题，请联系我们。

码农书籍

蚂蚁金服

由曦 / 中信出版集团股份有限公司 / 2017-4-7 / CNY 59.00

在中国，支付宝（其母公司为蚂蚁金服）是一个家喻户晓的品牌。我们在用手机扫码支付，或者用余额宝理财的时候，一定会和支付宝发生关系。但是很多人不知道，支付宝的母公司叫作“蚂蚁金服”。蚂蚁金服不仅有支付宝，还有余额宝、网商银行、芝麻信用等一系列产品和服务。成立于2004年、起始于支付宝的蚂蚁金服集团，如今已经是全球估值最高的科技金融企业。然而，在成立之初，它只是淘宝网的结算部门，员工只有区区几人，记账用......一起来看看《蚂蚁金服》这本书的介绍吧!

码农工具

Seemingly impossible functional programs

蚂蚁金服

HTML 压缩/解压工具

图片转BASE64编码

HSV CMYK 转换工具