
Left and Right Folding over an Infinite list

Source: https://www.devze.com, 2023-04-04 05:22 (reposted from the web)

I have issues with the following passage from Learn You A Haskell (Great book imo, not dissing it):

One big difference is that right folds work on infinite lists, whereas left ones don't! To put it plainly, if you take an infinite list at some point and you fold it up from the right, you'll eventually reach the beginning of the list. However, if you take an infinite list at a point and you try to fold it up from the left, you'll never reach an end!

I just don't get this. If you take an infinite list and try to fold it up from the right then you'll have to start at the point at infinity, which just isn't happening (If anyone knows of a language where you can do this do tell :p). At least, you'd have to start there according to Haskell's implementation because in Haskell foldr and foldl don't take an argument that determines where in the list they should start folding.

I would agree with the quote iff foldr and foldl took arguments that determined where in the list they should start folding, because it makes sense that if you take an infinite list and start folding right from a defined index it will eventually terminate, whereas it doesn't matter where you start with a left fold; you'll be folding towards infinity. However foldr and foldl do not take this argument, and hence the quote makes no sense. In Haskell, both a left fold and a right fold over an infinite list will not terminate.

Is my understanding correct or am I missing something?


The key here is laziness. If the function you're using for folding the list is strict, then neither a left fold nor a right fold will terminate, given an infinite list.

Prelude> foldr (+) 0 [1..]
^CInterrupted.

However, if you try folding a less strict function, you can get a terminating result.

Prelude> foldr (\x y -> x) 0 [1..]
1

You can even get a result that is an infinite data structure, so while it does in a sense not terminate, it's still able to produce a result that can be consumed lazily.

Prelude> take 10 $ foldr (:) [] [1..]
[1,2,3,4,5,6,7,8,9,10]

However, this will not work with foldl, as you will never be able to evaluate the outermost function call, lazy or not.

Prelude> foldl (flip (:)) [] [1..]
^CInterrupted.
Prelude> foldl (\x y -> y) 0 [1..]
^CInterrupted.

Note that the key difference between a left and a right fold is not the order in which the list is traversed, which is always from left to right, but rather how the resulting function applications are nested.

  • With foldr, they are nested on "the inside"

    foldr f y (x:xs) = f x (foldr f y xs)
    

    Here, the first iteration will result in the outermost application of f. Thus, f has the opportunity to be lazy so that the second argument is either not always evaluated, or it can produce some part of a data structure without forcing its second argument.

  • With foldl, they are nested on "the outside"

    foldl f y (x:xs) = foldl f (f y x) xs
    

    Here, we can't evaluate anything until we have reached the outermost application of f, which we will never reach in the case of an infinite list, regardless of whether f is strict or not.
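The two equations above can be tried out with hand-written copies of the folds. This is only a minimal sketch: the names myFoldr, myFoldl, firstViaFoldr, nestedRight and nestedLeft are hypothetical and exist purely for illustration.

```haskell
-- Hand-written copies of the two folds, so the nesting is visible.
myFoldr :: (a -> b -> b) -> b -> [a] -> b
myFoldr _ z []     = z
myFoldr f z (x:xs) = f x (myFoldr f z xs)   -- f is the outermost call

myFoldl :: (b -> a -> b) -> b -> [a] -> b
myFoldl _ z []     = z
myFoldl f z (x:xs) = myFoldl f (f z x) xs   -- myFoldl is the outermost call

-- The folding function ignores its second argument, so the recursive
-- call on the rest of the infinite list is never forced.
firstViaFoldr :: Integer
firstViaFoldr = myFoldr (\x _ -> x) 0 [1 ..]

-- On a finite list both folds terminate; only the nesting differs.
nestedRight, nestedLeft :: String
nestedRight = myFoldr (\x acc -> "(" ++ show x ++ " + " ++ acc ++ ")") "z" [1, 2, 3 :: Int]
nestedLeft  = myFoldl (\acc x -> "(" ++ acc ++ " + " ++ show x ++ ")") "z" [1, 2, 3 :: Int]
```

Loading this in GHCi, firstViaFoldr should come back as 1, while nestedRight and nestedLeft print the right-nested and left-nested shapes respectively. Replacing myFoldr with myFoldl in firstViaFoldr would loop, because myFoldl has to consume the whole list before f is ever the outermost call.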


The key phrase is "at some point".

if you take an infinite list at some point and you fold it up from the right, you'll eventually reach the beginning of the list.

So you're right, you can't possibly start at the "last" element of an infinite list. But the author's point is this: suppose you could. Just pick a point waaay far out there (for engineers, this is "close enough" to infinity) and start folding leftwards. Eventually you end up at the start of the list. The same is not true of the left fold: if you pick a point waaaay out there (and call it "close enough" to the start of the list) and start folding rightwards, you still have an infinite way to go.

So the trick is, sometimes you don't need to go to infinity. You may not need to even go waaaay out there. But you may not know how far out you need to go beforehand, in which case infinite lists are quite handy.

The simple illustration is foldr (:) [] [1..]. Let's perform the fold.

Recall that foldr f z (x:xs) = f x (foldr f z xs). On an infinite list it doesn't actually matter what z is, so I'm keeping it as z instead of [], which would clutter the illustration.

foldr (:) z (1:[2..])         ==> (:) 1 (foldr (:) z [2..])
1 : foldr (:) z (2:[3..])     ==> 1 : (:) 2 (foldr (:) z [3..])
1 : 2 : foldr (:) z (3:[4..]) ==> 1 : 2 : (:) 3 (foldr (:) z [4..])
1 : 2 : 3 : ( lazily evaluated thunk - foldr (:) z [4..] )

See how foldr, despite theoretically being a fold from the right, in this case actually cranks out individual elements of the resultant list starting at the left? So if you take 3 from this list, you can clearly see that it will be able to produce [1,2,3] and need not evaluate the fold any farther.
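The same behaviour can be checked directly. The sketch below assumes nothing beyond the Prelude; firstThree, mapViaFoldr and firstSquares are hypothetical names chosen for this example.

```haskell
-- The base case is never reached on an infinite list,
-- so it can even be undefined without causing an error.
firstThree :: [Integer]
firstThree = take 3 (foldr (:) undefined [1 ..])

-- map written as a right fold: each step emits one cons cell
-- before the rest of the fold is demanded.
mapViaFoldr :: (a -> b) -> [a] -> [b]
mapViaFoldr g = foldr (\x rest -> g x : rest) []

firstSquares :: [Integer]
firstSquares = take 5 (mapViaFoldr (^ 2) [1 ..])
```

Here firstThree evaluates to [1,2,3] and firstSquares to [1,4,9,16,25]; in both cases take stops demanding elements, so the remainder of the fold stays an unevaluated thunk.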


Remember that in Haskell you can use infinite lists because of lazy evaluation. So head [1..] is just 1, and head $ map (+1) [1..] is 2, even though [1..] is infinitely long. If you don't get that, stop and play with it for a while. If you do get that, read on...

I think part of your confusion is that foldl and foldr always start at one side or the other, hence you don't need to give a length.

foldr has a very simple definition

 foldr _ z [] = z
 foldr f z (x:xs) = f x $ foldr f z xs

Why might this terminate on infinite lists? Well, try

 dumbFunc :: a -> b -> String
 dumbFunc _ _ = "always returns the same string"
 testFold = foldr dumbFunc "" [1..]

Here we pass foldr an empty string "" as the initial value (since the value doesn't matter) and the infinite list of natural numbers. Does this terminate? Yes.

The reason it terminates is because Haskell's evaluation is equivalent to lazy term rewriting.

So

 testFold = foldr dumbFunc "" [1..]

becomes (to allow pattern matching)

 testFold = foldr dumbFunc "" (1:[2..])

which is the same as (from our definition of fold)

 testFold = dumbFunc 1 $ foldr dumbFunc "" [2..]

now by the definition of dumbFunc we can conclude

 testFold = "always returns the same string"

This is more interesting when we have functions that do something, but are sometimes lazy. For example

foldr (||) False 

is used to find out whether a list contains any True elements. We can use this to define the higher-order function any, which returns True if and only if the passed-in function is true for some element of the list

any :: (a -> Bool) -> [a] -> Bool
any f = (foldr (||) False) . (map f)

The nice thing about lazy evaluation is that this will stop when it encounters the first element e such that f e == True.
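To see the early exit on an infinite list, here is a minimal sketch. The function is renamed anyVia (a hypothetical name) so that it does not clash with the Prelude's own any, and hasBigNumber is likewise just an illustration.

```haskell
-- Same definition as above, under a name that does not clash with Prelude.any.
anyVia :: (a -> Bool) -> [a] -> Bool
anyVia p = foldr (||) False . map p

-- (||) does not inspect its right operand once the left one is True,
-- so the fold stops at the first match even though the list is infinite.
hasBigNumber :: Bool
hasBigNumber = anyVia (> 1000) [1 ..]
```

hasBigNumber evaluates to True after inspecting 1001 elements. If no element satisfies the predicate, as in anyVia (< 0) [1 ..], the fold never finds a reason to stop and does not terminate; laziness only helps when a match exists.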

On the other hand, this isn't true of foldl. Why? Well a really simple foldl looks like

foldl f z []     = z                  
foldl f z (x:xs) = foldl f (f z x) xs

Now, what would have happened if we had tried our example above with foldl?

testFold' = foldl dumbFunc "" [1..]
testFold' = foldl dumbFunc "" (1:[2..])

this now becomes:

testFold' = foldl dumbFunc (dumbFunc "" 1) [2..]

so

testFold' = foldl dumbFunc (dumbFunc (dumbFunc "" 1) 2) [3..]
testFold' = foldl dumbFunc (dumbFunc (dumbFunc (dumbFunc "" 1) 2) 3) [4..]
testFold' = foldl dumbFunc (dumbFunc (dumbFunc (dumbFunc (dumbFunc "" 1) 2) 3) 4) [5..]

and so on and so on. We can never get anywhere, because Haskell always evaluates the outermost function first (that is lazy evaluation in a nutshell).

One cool consequence of this is that you can implement foldl in terms of foldr, but a foldr built from foldl cannot work on infinite lists. In that sense foldr is the more fundamental of the two, and it is the one used to implement almost all the other higher-order list functions. You still might want to use a left fold sometimes, because it can be implemented tail recursively, which can give some performance gain (in practice this usually means the strict variant foldl' from Data.List).
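One common way to express a left fold through foldr is sketched below. The name foldlViaFoldr and the two sample values are hypothetical, and this is an illustration of the idea rather than how the Prelude actually defines foldl.

```haskell
-- foldl written in terms of foldr: the right fold builds a function
-- from accumulator to result, which is then applied to the initial value z.
foldlViaFoldr :: (b -> a -> b) -> b -> [a] -> b
foldlViaFoldr f z xs = foldr (\x k acc -> k (f acc x)) id xs z

-- Left-to-right accumulation: consing onto the accumulator reverses the list.
reversedDigits :: [Int]
reversedDigits = foldlViaFoldr (flip (:)) [] [1, 2, 3]

-- ((10 - 1) - 2) - 3, i.e. the left-nested subtraction.
difference :: Int
difference = foldlViaFoldr (-) 10 [1, 2, 3]
```

reversedDigits evaluates to [3,2,1] and difference to 4, matching what foldl gives for the same arguments. Like foldl itself, this version still has to walk the whole list before producing anything, so it does not terminate on an infinite list.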


There is a good, plain explanation on the Haskell wiki. It shows step-by-step reduction with different types of fold and accumulator functions.


Your understanding is correct. I wonder if the author is trying to talk about Haskell's lazy evaluation system (in which you can pass an infinite list to various functions not including fold, and it will only evaluate however much is needed to return the answer). But I agree with you that the author isn't doing a good job of describing anything in that paragraph, and what it says is wrong.
